Factor analysis is one of the most powerful statistical techniques available to researchers working with large, multi-variable datasets. If you have ever collected survey data with dozens of questions and wondered how to reduce them into a smaller set of meaningful dimensions – factor analysis is exactly the tool you need. Understanding how to run factor analysis in SPSS is an essential skill for anyone working in market research, social science, psychology, or any field that relies on structured survey instruments.
At Linkinfotech, we support research teams that regularly work with complex datasets requiring advanced analytical techniques. This step-by-step guide walks through the complete process of running factor analysis in SPSS – from data preparation to output interpretation – so your team can extract maximum value from every dataset.
What Is Factor Analysis and Why Does It Matter?
Factor analysis is a data reduction technique that identifies underlying latent variables – called factors – that explain the pattern of correlations among a set of observed variables. In practical terms, it answers the question: “Which variables in my dataset are measuring the same underlying construct?”
For example, a customer satisfaction survey with 20 rating questions might actually be measuring just four underlying dimensions – service quality, value for money, communication effectiveness, and product reliability. Factor analysis reveals these dimensions and tells you which questions belong to each one.
There are two main types:
- Exploratory Factor Analysis (EFA) – used when you do not have a pre-existing theory about how variables should group. You let the data reveal the factor structure
- Confirmatory Factor Analysis (CFA) – used when you want to test a specific theoretical model. CFA is typically conducted in structural equation modelling software rather than SPSS
This guide focuses on Exploratory Factor Analysis in SPSS, which is the most commonly used approach in survey-based market research and academic research programmes.
When Should You Use Factor Analysis?
Before running factor analysis in SPSS, confirm that your research situation meets the appropriate conditions:
- You have multiple survey items (typically 10 or more) measuring related constructs
- Your variables are continuous or ordinal with at least 5 response categories
- You have a sufficient sample size – a minimum of 100 respondents, though 200+ is recommended for reliable results. A common rule of thumb is 5–10 respondents per variable
- Your variables show meaningful intercorrelations – factor analysis cannot find structure where no correlation exists
- Your objective is to reduce dimensionality or identify latent constructs underlying observed variables
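The sample-size rules of thumb in this checklist can be turned into a quick pre-flight check. The sketch below is illustrative Python, not SPSS syntax; the function name `check_sample_size` and the example figures are assumptions made for this guide.

```python
def check_sample_size(n_respondents, n_variables):
    """Apply the sample-size rules of thumb from the checklist above."""
    ratio = n_respondents / n_variables
    return {
        "meets_minimum": n_respondents >= 100,      # minimum of 100 respondents
        "meets_recommended": n_respondents >= 200,  # 200+ recommended
        "respondents_per_variable": ratio,
        "meets_ratio": ratio >= 5,                  # lower bound of the 5-10 rule
    }


# Hypothetical study: 180 completed surveys, 20 rating items
result = check_sample_size(n_respondents=180, n_variables=20)
print(result)
```

A study like this one clears the minimum and the per-variable ratio but falls short of the recommended 200 respondents, so the resulting factor solution deserves a stability check.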
Factor analysis is particularly valuable in survey research programmes where questionnaires are long and complex – making it a core component of professional data processing and analytics workflows that transform raw survey responses into structured insight.
Step 1 – Prepare Your Data in SPSS
Before running the analysis, your data must be clean, complete, and correctly formatted.
Data Preparation Checklist
- Check for missing values: Factor analysis handles missing data poorly. Use listwise deletion (exclude cases with any missing values) or mean substitution for items with low missingness. For large datasets with complex missing patterns, multiple imputation is preferable
- Check variable scales: All variables entering the analysis should be on comparable scales. If some items are reverse-scored, recode them before analysis so that high scores consistently mean the same thing
- Screen for outliers: Extreme outlier cases distort correlations and therefore distort factor solutions. Use z-scores or Mahalanobis distance to identify and review outliers
- Verify variable types: In SPSS Variable View, confirm all factor analysis variables are set to Numeric type with Scale measurement level
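Two of the checklist items, reverse-scoring and z-score outlier screening, are simple enough to sketch outside SPSS. The Python below is a minimal illustration; the helper names, the 1–5 scale, the cut-off of 3 and the sample data are all assumptions, not SPSS defaults.

```python
from statistics import mean, stdev


def reverse_score(value, scale_min=1, scale_max=5):
    """Recode a reverse-worded item so high scores mean the same on every item."""
    return scale_max + scale_min - value


def zscore_outliers(values, threshold=3.0):
    """Return the positions of cases whose absolute z-score exceeds the threshold."""
    m, s = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs((v - m) / s) > threshold]


# Hypothetical negatively worded item on a 1-5 scale
raw_item = [1, 2, 5, 4, 3]
recoded = [reverse_score(v) for v in raw_item]

# Hypothetical scale totals with one extreme case in the last position
totals = [50, 52, 49, 51, 48, 50, 53, 47, 51, 49, 50, 52, 48, 51, 120]
flagged = zscore_outliers(totals)
print(recoded, flagged)
```

Flagged cases should be reviewed, not deleted automatically: an extreme score may be a data-entry error or a genuine respondent.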
Good data preparation is inseparable from good data management practice. A clean, well-structured dataset at this stage saves significant analytical effort later.
Step 2 – Access Factor Analysis in SPSS
Once your data is prepared, follow this navigation path in SPSS:
Analyze → Dimension Reduction → Factor
This opens the Factor Analysis dialogue box. From here, move all variables you want to include in the analysis from the left panel into the Variables box on the right.
Step 3 – Configure the Descriptives Options
Click the Descriptives button. The following options are recommended:
- Initial solution – displays initial communalities, eigenvalues, and percentage of variance explained
- Coefficients – displays the correlation matrix between all variables
- KMO and Bartlett’s test of sphericity – two critical pre-analysis checks
Understanding KMO and Bartlett’s Test
These two statistics tell you whether your data is suitable for factor analysis before you interpret any results.
Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy:
- Ranges from 0 to 1
- Values above 0.70 are acceptable
- Values above 0.80 are good
- Values above 0.90 are excellent
- Values between 0.50 and 0.70 are marginal; proceed with caution and review individual variables
- Values below 0.50 indicate the data is not suitable for factor analysis
Bartlett’s Test of Sphericity:
- Tests whether the correlation matrix is significantly different from an identity matrix
- Result should be statistically significant (p < 0.05)
- A non-significant result means variables are uncorrelated and factor analysis is inappropriate
If both tests pass, proceed with the analysis. If they fail, review your variable selection – you may have included items that do not intercorrelate sufficiently.
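To show what SPSS is computing behind these two checks, here is a minimal Python sketch that derives the overall KMO and Bartlett's chi-square statistic from a correlation matrix. It uses the standard textbook formulas and is not the SPSS implementation; the function names and the example matrix are assumptions. The p-value is left out because it needs a chi-square distribution function, which the Python standard library does not provide.

```python
import math


def invert_and_det(matrix):
    """Gauss-Jordan elimination with partial pivoting: returns (inverse, determinant)."""
    n = len(matrix)
    a = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        pivot_value = a[col][col]
        det *= pivot_value
        a[col] = [x / pivot_value for x in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a], det


def kmo_and_bartlett(corr, n_cases):
    """Overall KMO plus Bartlett's chi-square statistic and degrees of freedom."""
    p = len(corr)
    inv, det = invert_and_det(corr)
    sum_r2 = sum_partial2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                # Partial correlation of i and j, controlling for all other variables
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_partial2 += partial ** 2
    kmo = sum_r2 / (sum_r2 + sum_partial2)
    chi_square = -(n_cases - 1 - (2 * p + 5) / 6) * math.log(det)
    df = p * (p - 1) // 2
    return kmo, chi_square, df


# Hypothetical correlation matrix for three items, from 100 respondents
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
kmo, chi_square, df = kmo_and_bartlett(corr, n_cases=100)
print(round(kmo, 3), round(chi_square, 2), df)
```

KMO rises when the partial correlations are small relative to the raw correlations, that is, when the variables share variance as a group and not only in isolated pairs.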
Step 4 – Configure the Extraction Method
Click the Extraction button. The key decisions here are:
Method Selection
- Principal Components Analysis (PCA): The SPSS default. Technically not factor analysis, but widely used for data reduction. Treats all variance (shared + unique) as analysable
- Principal Axis Factoring (PAF): The preferred method for true Exploratory Factor Analysis. Analyses only shared variance (communality), making it more theoretically appropriate for identifying latent constructs
For most survey research applications, Principal Axis Factoring is recommended when the goal is to identify underlying constructs. PCA is more appropriate when the goal is purely data reduction without latent variable assumptions.
Number of Factors to Extract
SPSS offers several criteria:
- Eigenvalues greater than 1 (Kaiser criterion): The default. Retains factors with eigenvalues above 1.0. Tends to over-extract when a large number of variables is analysed
- Fixed number of factors: Specify the exact number based on theory or prior research
- Scree plot: A visual method; retain factors above the natural “elbow” in the plot
Best practice: Use the scree plot and eigenvalue criterion together, guided by theoretical expectations about how many constructs your survey was designed to measure.
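The Kaiser criterion is easy to reproduce by hand: compute the eigenvalues of the correlation matrix and count those above 1.0. The Python sketch below does this with a small cyclic Jacobi routine, chosen only to keep the example free of dependencies. It is not how SPSS computes eigenvalues, and the function name and example matrix are assumptions.

```python
import math


def symmetric_eigenvalues(matrix, max_sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off_diagonal = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off_diagonal < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Angle that zeroes the (p, q) element
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical correlation matrix: items 1-2 and items 3-4 form two clusters
corr = [[1.0, 0.6, 0.1, 0.1],
        [0.6, 1.0, 0.1, 0.1],
        [0.1, 0.1, 1.0, 0.5],
        [0.1, 0.1, 0.5, 1.0]]
eigenvalues = symmetric_eigenvalues(corr)
n_retained = sum(ev > 1.0 for ev in eigenvalues)
print([round(ev, 3) for ev in eigenvalues], n_retained)
```

Plotting the same eigenvalues against their rank gives the scree plot, so the two criteria can be checked against each other from a single list of numbers.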
Display Options
- Check Unrotated factor solution and Scree plot under Display
These outputs are essential for evaluating the preliminary factor structure before rotation is applied. The scree plot is particularly useful for visualising results that will later feed into charting services for research reports and presentations.
Step 5 – Configure the Rotation Method
Click the Rotation button. This is one of the most important decisions in the entire process.
Rotation improves the interpretability of the factor solution by redistributing variance across factors to produce a simpler, cleaner pattern of loadings.
Rotation Options
Orthogonal Rotation (factors assumed to be uncorrelated):
- Varimax – the most widely used rotation. Maximises the variance of squared loadings within each factor, producing factors with a small number of high loadings and many near-zero loadings. Best when factors are theoretically independent
Oblique Rotation (factors allowed to correlate):
- Oblimin (Direct Oblimin) – allows factors to correlate. More realistic for most psychological and social science constructs, which rarely exist in complete independence
- Promax – a computationally faster oblique rotation; suitable for large datasets
Which to use:
- If you have theoretical reasons to believe factors are independent, use Varimax
- If factors are likely to correlate (which is common in attitude and satisfaction surveys), use Oblimin or Promax
- When unsure, run both and compare interpretability
Check the Rotated solution and Loading plot(s) under Display.
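To make the idea of rotation concrete, the following Python sketch implements raw varimax using Kaiser's pairwise planar rotations. It is an illustration under stated assumptions, not the SPSS routine: SPSS applies Kaiser normalisation by default, which this sketch omits, so its numbers can differ slightly from SPSS output. The function name and the example loadings are hypothetical.

```python
import math


def varimax(loadings, max_sweeps=100, tol=1e-10):
    """Raw varimax via pairwise planar rotations (no Kaiser normalisation)."""
    rotated = [list(map(float, row)) for row in loadings]
    p, m = len(rotated), len(rotated[0])
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for j in range(m - 1):
            for k in range(j + 1, m):
                u = [row[j] ** 2 - row[k] ** 2 for row in rotated]
                v = [2 * row[j] * row[k] for row in rotated]
                a, b = sum(u), sum(v)
                c_term = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                d_term = 2 * sum(ui * vi for ui, vi in zip(u, v))
                # Angle that maximises the variance of squared loadings for this pair
                phi = 0.25 * math.atan2(d_term - 2 * a * b / p,
                                        c_term - (a * a - b * b) / p)
                cos_phi, sin_phi = math.cos(phi), math.sin(phi)
                for row in rotated:
                    x, y = row[j], row[k]
                    row[j] = x * cos_phi + y * sin_phi
                    row[k] = -x * sin_phi + y * cos_phi
                largest_angle = max(largest_angle, abs(phi))
        if largest_angle < tol:
            break
    return rotated


# Hypothetical clean two-factor structure, blurred by a 30-degree rotation
simple = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.9]]
t = math.radians(30)
unrotated = [[x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t)]
             for x, y in simple]
rotated = varimax(unrotated)
for row in rotated:
    print([round(v, 3) for v in row])
```

Rotation changes how variance is distributed across factors but leaves each variable's communality untouched, which is why the extraction communalities do not change when you switch rotation method.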
Step 6 – Configure Factor Scores (Optional)
Click the Scores button if you want SPSS to compute factor scores – new variables representing each respondent’s position on each extracted factor.
- Select Save as variables
- Choose Regression as the method (most commonly used)
Factor scores allow you to use factor analysis results in subsequent analyses – regression, clustering, or group comparisons. This is particularly useful when factor analysis feeds into broader segmentation work, connecting directly with data collection programmes where respondent-level data is retained for multi-stage analysis.
Step 7 – Configure Options
Click the Options button:
- Exclude cases listwise – recommended for clean results
- Sorted by size – organises the factor loading matrix so the highest loadings appear first within each factor, making interpretation significantly easier
- Suppress absolute values less than – set to 0.30 or 0.35. This hides small loadings from the output, making the pattern matrix much easier to read
Click Continue, then OK to run the analysis.
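The effect of the two display options is easy to picture with a small example. The Python sketch below mimics sorting by size and suppressing small coefficients on a hypothetical loading matrix; the function name, item names and values are assumptions, and the layout only approximates the SPSS table.

```python
def format_loadings(loadings, names, suppress_below=0.30):
    """Group items by their highest-loading factor, sort by size, blank small loadings."""
    def sort_key(item):
        _, row = item
        top = max(range(len(row)), key=lambda j: abs(row[j]))
        return (top, -abs(row[top]))

    lines = []
    for name, row in sorted(zip(names, loadings), key=sort_key):
        cells = [f"{v:6.2f}" if abs(v) >= suppress_below else " " * 6 for v in row]
        lines.append(f"{name:<12}" + " ".join(cells))
    return lines


# Hypothetical rotated loadings for four survey items on two factors
names = ["friendly", "price", "knowledge", "discounts"]
loadings = [[0.72, 0.10], [0.15, 0.81], [0.85, 0.22], [0.05, 0.64]]
lines = format_loadings(loadings, names)
print("\n".join(lines))
```

Suppression only hides values from the display. The small loadings are still part of the solution and still count towards each communality.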
Step 8 – Interpret the SPSS Output
SPSS generates several output tables. Here is what each one means and what to look for.
Communalities Table
Shows how much variance in each variable is explained by the extracted factors.
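- Initial – the communality estimate before extraction. With PCA this is always 1.0; with Principal Axis Factoring it is each variable’s squared multiple correlation with all the other variables
- Extraction – proportion of variance explained by the retained factors only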
Variables with extraction communalities below 0.30 are poorly represented by the factor solution and should be considered for removal. Strong communalities (above 0.50) indicate the factors are capturing the variable well.
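For an unrotated or orthogonally rotated solution, a variable's extraction communality is the sum of its squared loadings across the retained factors. The Python sketch below applies that identity and the 0.30 screening rule to hypothetical loadings. It does not hold for an oblique pattern matrix, where the factor correlations also enter the calculation.

```python
def communalities(loadings):
    """Sum of squared loadings per variable (valid for orthogonal solutions only)."""
    return [sum(v * v for v in row) for row in loadings]


# Hypothetical orthogonal loadings for three items on two factors
loadings = [[0.80, 0.10],
            [0.30, 0.20],
            [0.10, 0.75]]
h2 = communalities(loadings)
low_items = [i for i, value in enumerate(h2) if value < 0.30]
print([round(value, 4) for value in h2], low_items)
```

Here the second item shares little variance with either factor, so it would be the first candidate for removal.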
Total Variance Explained Table
Shows the eigenvalue and percentage of total variance explained by each factor.
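- Focus on the Rotation Sums of Squared Loadings columns (after rotation). With oblique rotation SPSS reports only the totals in these columns, because correlated factors share variance and their sums cannot be added to give total variance explained; in that case read the cumulative percentage from the Extraction Sums of Squared Loadings columns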
- Together, retained factors should explain at least 50–60% of total variance for the solution to be considered adequate
- Higher explained variance indicates a stronger, more coherent factor structure
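The percentages in this table follow directly from the eigenvalues: each eigenvalue divided by the number of variables, since the eigenvalues of a correlation matrix sum to the number of variables. A minimal Python sketch, using hypothetical initial eigenvalues for an eight-item scale:

```python
from itertools import accumulate


def variance_explained(eigenvalues, n_retained):
    """Percentage and cumulative percentage of variance for the retained factors."""
    total = sum(eigenvalues)  # equals the number of variables for a correlation matrix
    percent = [100 * ev / total for ev in eigenvalues[:n_retained]]
    return percent, list(accumulate(percent))


# Hypothetical initial eigenvalues for eight items
eigenvalues = [3.2, 1.8, 0.9, 0.7, 0.6, 0.4, 0.25, 0.15]
percent, cumulative = variance_explained(eigenvalues, n_retained=2)
print([round(v, 1) for v in percent], [round(v, 1) for v in cumulative])
```

In this example the two retained factors together clear the 50–60% guideline.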
Scree Plot
A line graph plotting eigenvalues against factor number. Look for the natural “elbow” – the point where the curve flattens. Retain factors above this point. This visual is frequently included in research deliverables and report presentations to communicate the factor retention rationale to non-technical stakeholders.
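Pattern Matrix (Oblique Rotation) or Rotated Factor/Component Matrix (Varimax)
This is the most important output table. It shows the factor loadings. After Varimax rotation the loadings are the correlations between each variable and each factor; SPSS labels the table Rotated Factor Matrix under Principal Axis Factoring and Rotated Component Matrix under PCA. After oblique rotation, the Pattern Matrix holds regression-style coefficients that give each factor’s unique contribution to a variable, while the separate Structure Matrix holds the variable-factor correlations.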
Interpreting loadings:
- Loadings above 0.50 are considered strong and practically significant
- Loadings between 0.30 and 0.50 are moderate
- Loadings below 0.30 are weak and typically suppressed (if you set the suppression threshold in Options)
Cross-loadings occur when a variable loads substantially on more than one factor (both loadings above 0.30). Cross-loading variables are ambiguous – they belong to multiple factors simultaneously – and should typically be removed if they cannot be theoretically justified.
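The loading thresholds and the cross-loading rule above can be applied mechanically once the matrix is exported. The Python sketch below is one way to do it; the function name and example rows are assumptions, and the cut-offs are the 0.30 and 0.50 guidelines from this section, not fixed standards.

```python
def classify_item(row, salient=0.30, strong=0.50):
    """Label an item's primary loading and flag cross-loadings above the salient cut-off."""
    magnitudes = [abs(v) for v in row]
    primary = max(magnitudes)
    if primary > strong:
        strength = "strong"
    elif primary >= salient:
        strength = "moderate"
    else:
        strength = "weak"
    n_salient = sum(m > salient for m in magnitudes)
    return {
        "primary_factor": magnitudes.index(primary) + 1,  # 1-based factor number
        "strength": strength,
        "cross_loading": n_salient > 1,
    }


# Hypothetical rows from a two-factor pattern matrix
clean = classify_item([0.72, 0.10])
ambiguous = classify_item([0.45, 0.38])
weak = classify_item([0.20, -0.15])
print(clean, ambiguous, weak)
```

A flag from a rule like this is a prompt for review. Whether a cross-loading item stays or goes remains a theoretical judgement.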
Factor Correlation Matrix (Oblique Rotation Only)
If you used oblique rotation, SPSS also produces a factor correlation matrix showing how strongly the factors relate to each other.
- Correlations below 0.30 suggest factors are sufficiently independent – orthogonal rotation (Varimax) may have been more appropriate
- Correlations above 0.30 confirm that oblique rotation was the right choice
Step 9 – Name and Validate Your Factors
Once you have identified clean, interpretable factors from the pattern matrix, name each factor based on the shared conceptual meaning of the variables that load onto it.
For example, if items about “staff friendliness,” “staff knowledge,” and “staff responsiveness” all load strongly onto Factor 1 – you might name it Staff Service Quality.
Validation Steps
- Face validity – do the variables loading on each factor make conceptual sense together?
- Cronbach’s Alpha – run reliability analysis on each factor’s items. Alpha above 0.70 indicates acceptable internal consistency
- Replication – if sample size permits, split your data and run factor analysis on each half to check stability of the solution
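Cronbach's Alpha is simple enough to verify by hand for a small factor. The Python sketch below uses the standard formula, alpha = k/(k−1) × (1 − sum of item variances / variance of the total score), on hypothetical responses from five people to three items. In SPSS the same figure comes from Analyze → Scale → Reliability Analysis.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of respondents and columns of items."""
    k = len(responses[0])
    item_variances = [variance(column) for column in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 1-5 ratings: five respondents, three items loading on one factor
responses = [[4, 5, 4],
             [3, 3, 2],
             [5, 5, 5],
             [2, 3, 2],
             [4, 4, 5]]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Run it once per factor, using only the items assigned to that factor, and make sure any reverse-worded items have been recoded first.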
Factor validation is a critical quality assurance step in any research programme. When survey instruments are designed and validated through structured survey programming processes, factor analysis at the analysis stage confirms that the instrument is measuring what it was designed to measure.
Common Problems and How to Fix Them

Even experienced analysts encounter issues when running factor analysis. Here are the most common problems and solutions:
- KMO below 0.50: Remove variables with low individual KMO values (shown on the diagonal of the Anti-image Correlation matrix, which appears when Anti-image is checked under Descriptives). Re-run until the overall KMO exceeds 0.60
- Too many factors extracted: Apply a more conservative extraction criterion. Use the scree plot rather than the eigenvalue-greater-than-1 rule exclusively
- Variables with low communalities (below 0.30): Remove these variables and re-run. They are not well represented by the factor solution
- Cross-loading variables: Remove the variable if it loads above 0.30 on more than one factor and cannot be theoretically assigned to one. Alternatively, retain it in the factor where the loading is highest and the theoretical fit is strongest
- Factors with fewer than 3 variables: Factors defined by only one or two variables are unreliable. Consider collapsing them with adjacent factors or removing the items
Reporting Factor Analysis Results
When reporting factor analysis for publication, client deliverables, or academic submission, include:
- Sample size and number of variables analysed
- KMO value and Bartlett’s test significance level
- Extraction method (e.g., Principal Axis Factoring) and rotation method (e.g., Oblimin)
- Number of factors retained and the criteria used (e.g., eigenvalues > 1 confirmed by scree plot)
- Total variance explained by retained factors
- Pattern matrix or rotated component matrix showing all loadings above the threshold
- Factor names and the items loading onto each factor
- Cronbach’s Alpha for each factor
Results from factor analysis frequently feed into broader research outputs tracked through interactive dashboard systems where factor scores are monitored across survey waves to track changes in underlying constructs over time.
Final Thoughts
Knowing how to run factor analysis in SPSS step by step is a skill that pays dividends across every type of survey-based research. From instrument validation to respondent segmentation and construct measurement, factor analysis transforms multi-item questionnaires into coherent, interpretable dimensions that support sharper insights and stronger strategic recommendations.
The process requires careful attention at every stage – data preparation, method selection, rotation choice, output interpretation, and results reporting. When each step is handled rigorously, factor analysis produces findings that are both statistically sound and practically meaningful.
At Linkinfotech, we integrate advanced analytical techniques, including factor analysis, into our end-to-end research operations capabilities – ensuring that every dataset we work with is transformed into the clearest, most actionable insight possible.
Frequently Asked Questions
What sample size is needed for factor analysis in SPSS?
A minimum of 100 respondents is generally required, but 200 or more is recommended for reliable and stable results. A common guideline is 5 to 10 respondents per variable. Larger samples produce more stable factor solutions and reduce the risk of chance correlations distorting the structure.
What is the difference between PCA and factor analysis?
Principal Components Analysis (PCA) extracts components that account for all variance in the dataset – both shared and unique variance. Factor analysis (using Principal Axis Factoring) extracts factors that account only for shared variance, making it more appropriate when the goal is to identify latent constructs. PCA is better suited for pure data reduction without theoretical interpretation.
What does the KMO statistic tell you?
The KMO (Kaiser-Meyer-Olkin) statistic measures whether the patterns of correlations in your data are compact enough for factor analysis to produce reliable factors. Values above 0.70 are acceptable, above 0.80 are good, and above 0.90 are excellent. Values below 0.50 indicate that factor analysis is not appropriate for the dataset.
How do you decide how many factors to retain?
The most reliable approach combines three methods: the Kaiser criterion (retain factors with eigenvalues above 1.0), the scree plot (retain factors above the natural elbow), and parallel analysis (compare actual eigenvalues to those from random data). Theoretical expectations about the number of constructs being measured should also inform the final decision.
What is the difference between Varimax and Oblimin rotation?
Varimax is an orthogonal rotation that assumes factors are uncorrelated. It produces clean, easy-to-interpret factor structures and is widely used. Oblimin is an oblique rotation that allows factors to correlate – which is more realistic for most attitudinal and behavioural constructs. If your factors correlate above 0.30, oblique rotation is preferred.
What is a cross-loading and why is it a problem?
A cross-loading occurs when a variable loads substantially (above 0.30) on more than one factor simultaneously. Cross-loading variables are problematic because they cannot be cleanly assigned to a single factor, which complicates interpretation and reduces the clarity of the factor solution. They should typically be removed and the analysis re-run.
How should factor analysis results be reported?
Report the KMO value, Bartlett’s test result, extraction and rotation methods, number of factors retained, total variance explained, and the full pattern or rotated component matrix with loadings above the threshold (typically 0.30 or 0.35). Include factor names and Cronbach’s Alpha for each factor’s items to demonstrate internal consistency.
Can factor analysis be used with open-ended survey responses?
Factor analysis requires numerical data and cannot be applied directly to raw open-ended text. However, once open-ended responses have been processed through structured coding – assigning numeric codes to response categories – the resulting coded variables can be included in factor analysis alongside scaled items. This integration of qualitative and quantitative data is part of advanced online panel and survey analytics workflows.
