MATLAB performance optimization can turn workflows that took 5-6 minutes into processes that complete in under 1 second, a speedup of roughly 500x. Yet many users struggle with plotting operations that take upwards of 10 seconds and wonder why MATLAB runs so slowly.
The answer lies in understanding what performance optimization is and applying targeted MATLAB performance tips. We’ll show you how to accelerate your MATLAB performance when working with large datasets through pre-allocation techniques and vectorization.
Understanding MATLAB Performance Issues with Large Datasets
What Is Performance Optimization
Performance optimization refers to the systematic process of identifying and eliminating computational inefficiencies in MATLAB code. Large datasets present unique challenges since they can appear as files too large to fit into available memory, files that need extended processing times, or collections of many small files. No single approach addresses all scenarios, which is why MATLAB has multiple tools to access and process large data.
The core goal involves making code execute faster while using memory more efficiently. Users working with databases that contain large volumes of data experience out-of-memory issues or slow processing. Performance optimization tackles both symptoms through targeted techniques that address specific bottlenecks in data handling, computation and visualization.
Why Is MATLAB Running So Slow
Several factors contribute to MATLAB performance degradation. Background processes that share computational resources slow MATLAB code down. Memory management also plays a critical role: by default, MATLAB can use up to 100% of RAM (not including virtual memory) to allocate memory for arrays, and a request that exceeds that limit produces an error.
The Workspace browser can create unexpected slowdowns. Simple operations like 1+1 can take minutes to complete when the workspace holds large objects of around 10GB. This occurs because the Workspace browser refreshes its list of variables, values and statistics after any code runs, and computing that summary takes considerable time for large objects, structures or cell arrays. You can resolve this issue by closing the Workspace browser or hiding its tab.
Memory allocation patterns affect speed over time. Programs that run for extended periods (10+ hours) with nested loops often slow down as they run and can become 20 times slower than their original execution rate. This happens when array variables have not been preallocated, which forces MATLAB to reallocate memory in each iteration. Reallocation gets slower as those variables grow, because it becomes harder to find ever-larger chunks of contiguous memory.
Common Performance Bottlenecks in Large Datasets
Data loading operations create the most severe performance issues. Text file parsing represents a major bottleneck, especially when using functions like textread and str2num. Profiling reveals these operations as critical hotspots that consume disproportionate execution time.
Graphics performance degrades quickly when axes contain too many objects. Checking just three visualization checkboxes in a GUI can create 37,365 graphics handles within the plot axes. The MATLAB graphics engine manages all these objects even when they are invisible, which makes zooming and panning excruciatingly slow. No graphics element should remain plotted if it is not visible, unless it needs frequent toggling on and off.
Memory issues manifest in multiple ways. Available memory restricts processing whole datasets at once when working with native ODBC connections. Arrays that resize during execution cause programs to run out of memory. Memory fragmentation worsens over time in long-running MATLAB instances on Windows systems.
Database operations present their own challenges. Selecting large volumes of data from databases into MATLAB triggers out-of-memory issues or slow processing. Working with datasets like 25GB LiDAR point clouds can exhaust memory even on machines equipped with 256GB of RAM.
Inefficient code patterns compound these problems. Global variables decrease performance, as does overloading built-in functions on standard MATLAB data classes. Placing independent operations inside loops creates redundant computations. Changing the class or array shape of existing variables takes extra processing time compared to creating new variables.
Using the MATLAB Profiler to Identify Bottlenecks
Profiling measures the time your code takes to run and identifies where MATLAB spends the most time. You need to locate the slow parts of your program before attempting any optimization.
How to Run a Profiling Session
Open the Profiler app from the Apps tab under MATLAB by clicking the Profiler icon to begin a profiling session. You can also type profile viewer in the Command Window. Both methods launch the same interface.
Enter the code you want to profile in the edit box within the Profile section once the Profiler opens. Click Run and Time to execute your code while the Profiler collects performance data. Use profile on to start tracking execution time if you want programmatic control, run your code, then execute profile viewer to display results.
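The programmatic route described above can be sketched as follows. This is a minimal example under stated assumptions: the matrix workload is a hypothetical stand-in for your own code, and the summary loop reads the FunctionTable field of the struct returned by profile('info') instead of opening the viewer.

```matlab
% Profile a small, hypothetical workload under programmatic control.
profile clear            % discard results from any earlier session
profile on               % start collecting timing data

A = rand(200);
for k = 1:20
    B = A * A';          % stand-in for the code you want to measure
    s = svd(B);
end

profile off              % stop collecting
p = profile('info');     % results as a struct

% Print up to five entries with the largest total time.
ft = p.FunctionTable;
[~, order] = sort([ft.TotalTime], 'descend');
for idx = order(1:min(5, numel(order)))
    fprintf('%-40s %8.4f s (%d calls)\n', ...
        ft(idx).FunctionName, ft(idx).TotalTime, ft(idx).NumCalls);
end
```

Running profile viewer afterwards would display the same data in the Profiler interface.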
The general process follows these steps. Run the Profiler on your code and review the profile summary results. Investigate the functions and individual lines of code that use a significant amount of time or are called most often. Save the profiling results, then implement potential performance improvements. Save your files and run clear all, then run the Profiler again and compare the results with the saved ones. Repeat this cycle until you achieve satisfactory MATLAB performance.
Reading Profiler Results
The Profile Summary displays a flame graph at the top of the results when profiling completes. Each function appears as a bar in this hierarchical visualization. User-defined functions display in blue, while MathWorks functions appear in gray. Parent functions sit lower on the graph with child functions appearing higher. The bar spanning the entire bottom labeled Profile Summary represents all code that ran.
Bar width indicates the percentage of total run time consumed by each function. You can hover over any bar to reveal the actual percentage, time values and full function name.
A function table containing five columns sits below the flame graph. Function Name lists each function called during execution. Calls shows how many times the profiled code called that function, which matters since loops execute functions over and over. Total Time displays the time spent executing that function and any other functions it calls. This represents the complete time from when the function starts until it finishes.
Self Time indicates the time spent executing only that function, excluding the child functions it calls. The Total Time Plot provides a graphical representation using a bar with a dark band. This dark band represents Self Time.
Comparing Total Time with Self Time shows whether a function is slow in its own right or mainly because of the functions it calls, which is useful at any level of the call hierarchy. Click a column heading to reorder the table by that value. Function names sort alphabetically, while numerical values sort in descending order.
Pinpointing Critical Hotspots
Sort results by Self Time and work through the program to see which parts consume most execution time. These areas deserve your optimization focus first since they produce the most improvement for the least effort.
You can click a function name to display detailed information for that function. The Profiler shows how many times parent functions called this function and the total time spent in it. A source code listing annotated with execution time for each line appears below this information, showing how many times each line executed and highlighting the slowest lines in shades of red.
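Look for functions in the flame graph or function table that use a significant amount of time or are called most often. These represent your critical hotspots. To name just one example, profiling might reveal that plot(x,y,'LineStyle','none','Marker','o','Color','b') runs much slower than plot(x(:),y(:),'LineStyle','none','Marker','o','Color','b'). When x and y are matrices, the first call creates one line object per column, while the second flattens the data into vectors and creates a single line object that draws the same markers.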
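The difference between the two plot calls can be checked by counting the handles each one returns. The sketch below assumes x and y are 200-by-50 matrices of random data (an illustrative choice) and uses an invisible figure so nothing appears on screen.

```matlab
% Count the line objects created by matrix input versus vector input.
fig = figure('Visible', 'off');
x = rand(200, 50);
y = rand(200, 50);

% Matrix input: plot creates one line object per column.
hMany = plot(x, y, 'LineStyle', 'none', 'Marker', 'o', 'Color', 'b');
nMany = numel(hMany);

cla   % clear the axes before the second variant

% Vector input: the same markers are drawn by a single line object.
hOne = plot(x(:), y(:), 'LineStyle', 'none', 'Marker', 'o', 'Color', 'b');
nOne = numel(hOne);

fprintf('matrix input: %d handles, vector input: %d handle\n', nMany, nOne);
close(fig)
```

Fewer handles means less work for the graphics engine both at creation and during later zoom and pan operations.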
Save profiling results as HTML files using the profsave command after identifying hotspots. This saves files to the profile_results subfolder in your current working folder as a default. Saving results lets you compare performance before and after implementing optimizations.
Programs meant to handle a wide range of problem sizes need profiling at several scales, since different parts of the code dominate at different sizes. Profile over a range of sizes to identify the code sections that scale least well. Remove or comment out the profile on and profile viewer commands once you finish profiling, as code runs slower while profiling is enabled.
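As a lightweight complement to full Profiler runs, a scaling check can be sketched with timeit. Note that this swaps the Profiler for a plain timing function, and that the sort call and the list of sizes are hypothetical placeholders for your own routine and data.

```matlab
% Time one operation at several input sizes to see how it scales.
sizes = [1e3 1e4 1e5];
t = zeros(size(sizes));

for k = 1:numel(sizes)
    v = rand(sizes(k), 1);
    t(k) = timeit(@() sort(v));   % typical run time of sort(v)
end

for k = 1:numel(sizes)
    fprintf('n = %7d: %.6f s\n', sizes(k), t(k));
end
```

A section whose time grows much faster than the input size is a candidate for a closer look in the Profiler at the largest scale.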
Optimizing Data Loading and I/O Operations
File I/O operations often emerge as the primary culprit when profiling reveals data loading bottlenecks. The functions you choose for reading data affect execution speed, sometimes by orders of magnitude.
Replacing textread with textscan
The textscan function offers several advantages over textread, which MathWorks no longer recommends. It is much more efficient when reading very large files or a very large number of files. It also provides more features, supports more types and formats, and offers greater flexibility when reading from arbitrary points in a file. Pairing textscan with fopen and fclose gives you more control and safer handling of files and errors.
Converting from textread to textscan requires a different approach. Where you wrote [x,y,z] = textread(filename,format) before, the new workflow opens the file first. You write fid = fopen(filename), then C = textscan(fid,format), followed by fclose(fid). Extract variables with [x,y,z] = C{:}. The output from textscan returns as a cell array instead of separate output variables.
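The conversion can be sketched end to end as follows. The temporary file and its three-column contents are assumptions made only so the example is self-contained.

```matlab
% Write a small three-column text file to read back (illustrative data).
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, '%d %f %f\n', [1 2 3; 0.5 1.5 2.5; 10 20 30]);
fclose(fid);

% textscan workflow: open, scan, close, then unpack the cell array.
fid = fopen(filename, 'r');
if fid == -1
    error('Could not open %s', filename);
end
C = textscan(fid, '%f %f %f');
fclose(fid);

[x, y, z] = C{:};   % one column vector per format specifier
delete(filename);
```

Checking the return value of fopen before scanning is the kind of error handling that the one-line textread call did not allow.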
The file pointer maintains its position between calls to textscan and allows you to read data in portions. This capability proves valuable for processing datasets too large for memory. The textscan workflow also supports reading from remote locations, and error messages provide clear guidance on adjusting syntax.
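Reading in portions might look like the sketch below, which passes a repeat count as the third argument to textscan and keeps a running sum instead of holding the whole file in memory. The file contents and the block size of 256 are illustrative assumptions.

```matlab
% Create a file with one number per line (illustrative data).
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, '%d\n', 1:1000);
fclose(fid);

% Read the file in blocks; the file pointer advances between calls.
blockSize = 256;
total = 0;
nRead = 0;
fid = fopen(filename, 'r');
while ~feof(fid)
    C = textscan(fid, '%f', blockSize);   % at most blockSize values
    block = C{1};
    total = total + sum(block);
    nRead = nRead + numel(block);
end
fclose(fid);
delete(filename);

fprintf('read %d values, sum = %d\n', nRead, total);
```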
Using sscanf Instead of str2num
The str2num function relies on eval, which creates performance problems. Code called by eval is not accelerated by the JIT engine and produces noticeable slowdowns with repetitive code like loops. This reliance on eval also makes code liable to unexpected behavior when input data contains valid commands or function calls. The function lacks support for parallel loops or the code compiler.
sscanf delivers better performance for speed-critical applications. In one reported timing, converting character arrays with sscanf completed in 0.013550 seconds. When converting cell arrays of strings to numeric values, combining sprintf and sscanf provides the fastest conversion. Instead of V = str2double(C), write V = sscanf(sprintf(' %s',C{:}),'%f',[1,Inf]). Keep in mind that sscanf stops at the first entry it cannot parse, whereas str2double returns NaN for that entry and continues, so this shortcut suits input that is known to be numeric.
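A small sketch of the sprintf and sscanf combination, with made-up input values, also shows how the two functions differ on non-numeric input.

```matlab
% Convert a cell array of numeric text in one call.
C = {'1.5', '2', '-3.25', '1e3'};
V = sscanf(sprintf(' %s', C{:}), '%f', [1, Inf]);
Vref = str2double(C);            % reference result for comparison

% Behavior on a non-numeric entry differs between the two functions.
Cbad = {'1', 'abc', '3'};
Vbad = sscanf(sprintf(' %s', Cbad{:}), '%f', [1, Inf]);  % stops at 'abc'
VbadRef = str2double(Cbad);      % NaN for 'abc', then continues
```

If the input may contain malformed entries, validating it first or falling back to str2double is the safer design.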
Loading Data in Optimal Formats
Binary files load much faster than text files. Saving MAT files with the -v6 option can produce dramatic speed improvements. This older format does not compress the data, unlike newer versions, so files are larger but load faster. In one test, training data load times dropped from 78 seconds to nearly 0 seconds after switching to the version 6 format.
ASCII text files are an inefficient way to store data and offer less flexibility than binary alternatives. Memory-mapping a binary file lets you access it with standard indexing operations. The application accesses the file on disk the same way it accesses dynamic memory, which provides faster file access and the ability to share memory between applications.
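To see whether the version 6 format helps for your own data, a comparison along these lines may be useful. The random matrix and temporary file names are assumptions, and the timings depend heavily on how compressible the data is.

```matlab
% Save the same variable in two MAT-file formats and time the loads.
data = rand(1000);
fileV6 = [tempname '.mat'];
fileV7 = [tempname '.mat'];
save(fileV6, 'data', '-v6');   % uncompressed
save(fileV7, 'data', '-v7');   % compressed

tic; S6 = load(fileV6); tV6 = toc;
tic; S7 = load(fileV7); tV7 = toc;

same = isequal(S6.data, S7.data);   % both formats hold identical values
fprintf('v6 load: %.4f s, v7 load: %.4f s\n', tV6, tV7);
delete(fileV6); delete(fileV7);
```

One caveat to weigh: only the version 7.3 format supports variables of 2 GB or larger, so -v6 is an option for smaller variables only.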
Reducing File Read Times
Iterative array growth destroys performance. An array that grows element by element requests excessive memory from the operating system. If each iteration creates a new, larger array and copies the old data, a loop that builds one million double-precision elements issues memory requests totaling approximately 4 TB: the k-th iteration requests 8k bytes, and summing 8k for k from 1 to 1,000,000 gives about 4 × 10^12 bytes. Pre-allocation eliminates this waste.
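The 4 TB figure can be reproduced with a few lines of arithmetic, assuming 8-byte doubles and a fresh allocation of the full array on every iteration.

```matlab
% Cumulative bytes requested when an array grows one element at a time.
n = 1e6;                 % final number of elements
bytesPerElement = 8;     % double precision

% Iteration k allocates k elements, so the total is 8 * (1 + 2 + ... + n).
totalBytes = bytesPerElement * n * (n + 1) / 2;

fprintf('Total requested: %.2f TB\n', totalBytes / 1e12);
```

The preallocated version requests the final 8 MB once.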
When loading multiple files, collect data in a cell array first rather than concatenate inside loops. Write Forces = cell(1, length(d)), then populate with Forces{ii} = load(d(ii).name, '-ascii') inside the loop. Combine with Combined_Forces = cat(1, Forces{:}) outside the loop. This approach avoids repeated concatenation work.
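The cell-array pattern above can be sketched as a complete script. The temporary folder and the three small ASCII files are assumptions added so the example runs on its own.

```matlab
% Create three small ASCII files in a temporary folder (illustrative data).
folder = tempname;
mkdir(folder);
for k = 1:3
    M = k * ones(4, 2);
    save(fullfile(folder, sprintf('forces_%d.txt', k)), 'M', '-ascii');
end

% Collect each file in its own cell, then concatenate once at the end.
d = dir(fullfile(folder, '*.txt'));
Forces = cell(1, length(d));
for ii = 1:length(d)
    Forces{ii} = load(fullfile(folder, d(ii).name), '-ascii');
end
Combined_Forces = cat(1, Forces{:});

rmdir(folder, 's');   % remove the temporary folder and its files
```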
Large text files cannot benefit from parallel processing in most cases. Disk controllers maintain limited command queues, and multiple processes reading different sections of a file compete for the read head. Sequential reading remains faster than seeking to different disk locations. Because parsing the text typically runs faster than the disk can deliver it, the read is limited by I/O, and adding workers does not help for plain text files.
Managing Memory for Large Datasets
Poor memory management turns efficient MATLAB code into sluggish operations prone to out-of-memory errors. Once you optimize data loading, how you allocate, store and release memory determines whether your program completes in seconds or crashes midway through processing.
Preallocating Arrays
Arrays that expand one element at a time force MATLAB to perform expensive operations behind the scenes. MATLAB creates a copy when you enlarge an array beyond its current memory location, moves it to a larger contiguous block and then deallocates the original. Two copies exist at the same time during this transition, doubling memory requirements and increasing the risk of running out of space.
The performance penalty scales poorly. A loop creating one million elements without preallocation required 0.301528 seconds. The preallocated version completed in just 0.011938 seconds. That represents a 25x speedup from a single optimization technique.
Preallocation requires estimating the array size beforehand and then creating it at full capacity using functions like zeros, ones, or NaN. Write x = zeros(1,1000000) before your loop instead of letting x grow element by element. Specify the data type for non-double arrays: write A = zeros(100,'int8') rather than A = int8(zeros(100)). The first form allocates memory once as the target type. The second wastefully creates doubles first and then converts each element.
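A side-by-side sketch of the two loop styles follows. The element count of 1e5 and the squared-index workload are illustrative assumptions, and timings will differ from the figures quoted above.

```matlab
n = 1e5;

% Growing array: MATLAB may have to reallocate as x gets longer.
tic
xGrow = [];
for k = 1:n
    xGrow(k) = k^2;   %#ok<SAGROW>
end
tGrow = toc;

% Preallocated array: memory is reserved once, before the loop.
tic
xPre = zeros(1, n);
for k = 1:n
    xPre(k) = k^2;
end
tPre = toc;

fprintf('growing: %.4f s, preallocated: %.4f s\n', tGrow, tPre);

% Typed preallocation versus convert-after-creation.
A = zeros(100, 'int8');   % allocated directly as int8
B = int8(zeros(100));     % doubles first, then converted
```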
Clearing Unused Variables
Variables occupying memory should be removed once their purpose concludes. The clear function removes variables from the workspace. clearvars accepts specific variable names for selective clearing: clearvars x y removes only those two. Use clearvars -except C D to preserve certain variables while clearing everything else.
Clearing becomes especially important when processing large datasets one after another. Write results to disk using save if generating substantial data amounts, then use clear to free that memory before continuing. Load each variable one at a time when loading MAT files containing multiple large variables, process it, clear it and then move to the next.
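Loading a MAT-file one variable at a time might look like the sketch below. The file, its two variables, and the sum used as the processing step are placeholders for real data and real work.

```matlab
% Build a MAT-file with two variables (illustrative data).
matFile = [tempname '.mat'];
a = rand(300);
b = rand(200);
save(matFile, 'a', 'b');
clear a b

% List the variables in the file without loading them.
info = whos('-file', matFile);
totals = zeros(1, numel(info));

for k = 1:numel(info)
    name = info(k).name;
    S = load(matFile, name);        % load only this variable
    totals(k) = sum(S.(name)(:));   % stand-in for real processing
    clear S                         % release memory before the next one
end
delete(matFile);
```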
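Memory release timing matters when overwriting variables. MATLAB requires temporary storage equal to the new array size before overwriting an existing variable, so the old and new arrays briefly coexist. Clear the old variable first to make space: write clear a followed by a = rand(1e5) rather than reassigning a directly. Note that rand(1e5) creates a 100,000-by-100,000 matrix of about 80 GB, so scale the size to your machine.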
Using Data Types Efficiently
Selecting appropriate data types prevents memory waste. MATLAB defaults to double (8 bytes per element), but many operations function well with smaller types. The table below shows memory requirements:
| Class | Bytes | Supported Operations |
|---|---|---|
| logical | 1 | Logical/conditional |
| int8, uint8 | 1 | Arithmetic and simple functions |
| int16, uint16 | 2 | Arithmetic and simple functions |
| single | 4 | Most math operations |
| int32, uint32 | 4 | Arithmetic and simple functions |
| double | 8 | All math operations |
Storing 1,000 small unsigned integers as uint8 saves 7 KB compared to double. Single-precision preserves all information for 24-bit data acquisition systems while cutting disk usage in half.
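The 7 KB saving can be confirmed with whos, as in this short sketch (the variable names are arbitrary).

```matlab
% Compare the memory used by 1,000 values stored as double and as uint8.
asDouble = zeros(1, 1000);            % 8 bytes per element
asUint8  = zeros(1, 1000, 'uint8');   % 1 byte per element

infoD = whos('asDouble');
infoU = whos('asUint8');
savedBytes = infoD.bytes - infoU.bytes;

fprintf('double: %d bytes, uint8: %d bytes, saved: %d bytes\n', ...
    infoD.bytes, infoU.bytes, savedBytes);
```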
Structures and cell arrays carry overhead beyond raw data storage. Each element requires a separate memory allocation with metadata. Structures with many fields containing small contents waste substantial space. Use simple numeric arrays whenever data complexity allows.
Working with Memory-Mapped Files
Memory-mapping treats file contents as arrays you can access in your workspace. MATLAB maps portions of files on disk to address ranges, enabling file access through standard indexing without explicit fread or fwrite calls.
This mechanism provides faster file access since data reads and writes use virtual memory capabilities built into the operating system. MATLAB only accesses disk when you reference specific mapped regions, reading just those parts rather than entire files.
Memory-mapping works best with binary files accessed randomly or read multiple times. Create a map with m = memmapfile('records.dat','Format','double') and then access data via m.Data as if it were a workspace array.
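A self-contained sketch of this workflow is shown below. It writes its own small binary file instead of relying on records.dat, which is assumed to already exist in the text above.

```matlab
% Write 1,000 doubles to a binary file (illustrative data).
datFile = [tempname '.dat'];
fid = fopen(datFile, 'w');
fwrite(fid, 1:1000, 'double');
fclose(fid);

% Map the file and index into it like an ordinary array.
m = memmapfile(datFile, 'Format', 'double');
firstFive = m.Data(1:5)';    % only the referenced region is read
lastValue = m.Data(end);
nValues   = numel(m.Data);

clear m            % release the mapping before deleting the file
delete(datFile);
```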
Accelerating Graphics and Visualization Performance
Graphics operations create performance bottlenecks that rival data loading and memory management issues. Datasets with millions of points often cause plotting operations to take 10+ seconds, especially when visualization code lacks optimization.
Reducing Plotted Data Points
Data reduction accelerates plotting without noticeable visual effect. The reducem function from Mapping Toolbox reduces the number of points in vector map data using the Douglas-Peucker line simplification algorithm. This method subdivides polygons until a straight line segment can replace a run of points, with no point deviating from that line by more than the specified tolerance.
Simplifying polygon and line data speeds up calculations without a noticeable effect on the displayed results. Plotting only the elements within the current axes limits eliminates wasted rendering, since zoomed-in axes exclude most data points. Filtering longitude and latitude values to the visible ranges before plotting avoids processing thousands of off-screen coordinates.
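Filtering to the visible range can be sketched with a logical mask. The random coordinates and the chosen axes limits are assumptions for illustration.

```matlab
% Random longitude/latitude points covering the globe (illustrative data).
lon = -180 + 360 * rand(1e5, 1);
lat =  -90 + 180 * rand(1e5, 1);

% Limits of the zoomed-in view.
lonLim = [10 20];
latLim = [40 50];

% Keep only the points that fall inside the visible window.
inView = lon >= lonLim(1) & lon <= lonLim(2) & ...
         lat >= latLim(1) & lat <= latLim(2);

fig = figure('Visible', 'off');
ax = axes('Parent', fig);
line(lon(inView), lat(inView), 'Parent', ax, ...
    'LineStyle', 'none', 'Marker', '.');
xlim(ax, lonLim);
ylim(ax, latLim);

fprintf('plotted %d of %d points\n', nnz(inView), numel(lon));
close(fig)
```

In an interactive tool the mask would need to be recomputed whenever the user zooms or pans.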
Deleting vs Hiding Graphics Objects
The difference between deleting and hiding graphics objects affects performance. As noted earlier, checking just three visualization checkboxes created 37,365 graphics objects within the plot axes. The graphics engine manages all of these handles even when they are invisible, so zooming and panning become painfully slow.
No graphics element should remain plotted if it is not visible, unless it needs frequent toggling on and off. Deleting objects with the delete function rather than setting their Visible property to 'off' eliminates this overhead. The trade-off is that deleted objects must be recreated when they are needed again, while hidden objects cost nothing to show but slow every axes operation for as long as they exist.
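The sketch below illustrates the difference by counting the children of an axes after hiding and after deleting. The 100 horizontal lines are an arbitrary example.

```matlab
fig = figure('Visible', 'off');
ax = axes('Parent', fig);

% Create 100 separate line objects.
h = gobjects(1, 100);
for k = 1:100
    h(k) = line([0 1], [k k], 'Parent', ax);
end

% Hiding keeps every object alive inside the axes.
set(h, 'Visible', 'off');
nAfterHide = numel(ax.Children);

% Deleting removes the objects entirely.
delete(h);
nAfterDelete = numel(ax.Children);

fprintf('after hide: %d objects, after delete: %d objects\n', ...
    nAfterHide, nAfterDelete);
close(fig)
```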
Using Low-Level Plotting Functions
Low-level functions execute faster than their high-level counterparts. Use line instead of scatter, plot, or plot3. The surface function outperforms surf. The line and patch functions provide the core plotting capabilities needed for most visualizations.
These functions bypass extra processing layers that high-level commands invoke. Rendering completes with fewer computational steps.
Combining Multiple Objects into Single Handles
Consolidating separate line segments into unified objects produces substantial speedups. Interspacing NaN values between segments allows a single line object to replace thousands of individual handles. This technique reduced approximately 7000 separate line handles into one line and improved both creation time and subsequent axes operations like zoom and pan.
The result delivered speedups of 50x to 100x. Displaying slip-rate labels in a zoomed region dropped from 33 seconds to 0.6 seconds. Residual velocity vectors decreased from 1.63 seconds to 0.02 seconds.
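The NaN-separator technique can be sketched as follows. The segment endpoints are random placeholders; only the interleaving pattern matters.

```matlab
% Endpoints for 7,000 short, unconnected segments (illustrative data).
nSeg = 7000;
x1 = rand(1, nSeg);  y1 = rand(1, nSeg);
x2 = x1 + 0.01 * randn(1, nSeg);
y2 = y1 + 0.01 * randn(1, nSeg);

% Interleave start point, end point, NaN. The NaN breaks the line,
% so consecutive segments are not joined to each other.
X = [x1; x2; nan(1, nSeg)];
Y = [y1; y2; nan(1, nSeg)];

fig = figure('Visible', 'off');
ax = axes('Parent', fig);
h = line(X(:), Y(:), 'Parent', ax, 'Color', 'b');   % one object for all

nHandles  = numel(h);
nChildren = numel(ax.Children);
close(fig)
```

The cost of this design is that all segments share one color and style, so it suits groups of segments that look alike.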
Optimizing Axes Properties Before Plotting
Setting axes properties to static values before plotting avoids runtime dynamic computation. Writing xlim([0 10]) before adding data prevents MATLAB from recalculating limits based on each new point. This eliminates expensive auto-calculation cycles where getting properties triggers updates and setting properties marks the model as needing updates.
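A minimal sketch of fixing the limits before adding data is shown below. The sine and cosine curves are placeholders, and hold on is included because a plain plot call would otherwise reset the axes properties.

```matlab
fig = figure('Visible', 'off');
ax = axes('Parent', fig);

% Fix the limits first; this switches the limit modes to manual.
xlim(ax, [0 10]);
ylim(ax, [-1 1]);
hold(ax, 'on');          % keep axes settings when plot is called

t = linspace(0, 10, 500);
plot(ax, t, sin(t));
plot(ax, t, 5 * cos(t)); % exceeds the y-limits; the axes do not rescale

xMode   = ax.XLimMode;
yLimits = ax.YLim;
close(fig)
```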
Advanced MATLAB Performance Tips
Beyond foundational techniques, advanced optimization methods address computational limitations through specialized approaches. These techniques deliver performance gains when standard optimizations reach their limits.
Vectorization Techniques
Vectorized code operates on whole arrays at once, which lets MATLAB use Single Instruction Multiple Data (SIMD) instructions that perform the same computation on multiple data elements simultaneously. Many vectorized operations run in optimized built-in libraries, including BLAS and LAPACK routines for linear algebra, that exploit SIMD capabilities. These multithreaded routines use multiple cores on machines that support parallel execution, without manual intervention. Write C = A + B instead of looping through elements. Array operations typically complete much faster than scalar iterations.
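The loop and vectorized forms of the same addition can be compared directly. The vector length of one million is an arbitrary choice, and the measured times will vary by machine.

```matlab
n = 1e6;
A = rand(n, 1);
B = rand(n, 1);

% Element-by-element loop.
tic
Cloop = zeros(n, 1);
for k = 1:n
    Cloop(k) = A(k) + B(k);
end
tLoop = toc;

% Vectorized equivalent.
tic
Cvec = A + B;
tVec = toc;

fprintf('loop: %.4f s, vectorized: %.4f s\n', tLoop, tVec);
```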
Using Parallel Computing
With Parallel Computing Toolbox, you can distribute computations across multiple workers by starting a parallel pool with parpool("Processes"). Tall arrays handle data that does not fit in memory by processing it in chunks. Functions like writeall use an open parallel pool when you set UseParallel to true. The mapreducer function switches the execution environment between the local session and cluster computing.
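A parfor loop is one common way to use such a pool. In the sketch below the per-iteration work is a deliberately simple placeholder, and the parpool call is left commented out on the assumption that the toolbox may not be installed, in which case parfor runs the iterations serially.

```matlab
% parpool("Processes");   % uncomment with Parallel Computing Toolbox

n = 200;
results = zeros(1, n);

% Iterations are independent, so they can run on different workers.
parfor k = 1:n
    results(k) = sum((1:k).^2);   % stand-in for an expensive computation
end

fprintf('results(3) = %d\n', results(3));
```

Parallel loops pay off when each iteration does enough work to outweigh the cost of sending data to the workers.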
Implementing C-MEX Functions
You create executable binaries callable from MATLAB by building MEX functions, for example with mex arrayProduct.c -R2018a. Converting critical routines to C/C++ and compiling them with MEX can improve their speed by about 5x, and total improvements can reach 10x when applied to the right operations. MEX functions work best for operations that are numerically heavy but simple in logic.
Choosing the Right Algorithm
Algorithm selection strongly affects convergence speed and memory usage in optimization solvers. The interior-point algorithm handles large sparse problems well. Use trust-region-reflective when the problem has only bounds or only linear equality constraints. Try different algorithms, since predicting which one performs best is difficult.
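Trying several algorithms can be scripted in a few lines. This sketch assumes the solver in question is fmincon from Optimization Toolbox, which the text does not name, and uses a hypothetical two-variable quadratic with bound constraints. The trust-region-reflective algorithm is left out because it requires a user-supplied gradient.

```matlab
% Hypothetical objective with its minimum at [1 2].
fun = @(x) (x(1) - 1)^2 + (x(2) - 2)^2;
x0 = [0 0];
lb = [-5 -5];
ub = [5 5];

algorithms = {'interior-point', 'sqp', 'active-set'};
xBest = zeros(numel(algorithms), 2);

for k = 1:numel(algorithms)
    opts = optimoptions('fmincon', ...
        'Algorithm', algorithms{k}, 'Display', 'off');
    tic
    xBest(k, :) = fmincon(fun, x0, [], [], [], [], lb, ub, [], opts);
    fprintf('%-16s %.4f s\n', algorithms{k}, toc);
end
```

On a problem this small the timings say little, so the comparison is worth repeating on a representative instance of the real problem.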
Conclusion
You now have the tools to transform your MATLAB performance from painfully slow to lightning fast. We’ve covered everything from profiling bottlenecks to preallocation, vectorization and advanced techniques that can deliver speedups of 500x or more.
Consistency is key. Profile your code first, identify the hotspots and apply the right optimization to suit each situation. Start with the basics like preallocation and efficient data loading, then move to advanced methods when needed.
Note that performance optimization is iterative. Keep measuring and refining your approach. These techniques in your arsenal will turn those 10-second plotting operations and memory crashes into problems of the past.


