I've created several videos on cluster analysis, although I'm definitely not an expert on the topic.
- VIDEO TUTORIAL: Two-step Cluster Analysis in SPSS
- VIDEO TUTORIAL: Two-step Cluster Analysis Scatter Plot
- VIDEO TUTORIAL: Two-step cluster analysis compare clusters simultaneously
- VIDEO TUTORIAL: Validating a two-step cluster analysis - how many clusters?
- VIDEO TUTORIAL: Hierarchical Cluster Analysis SPSS
- VIDEO TUTORIAL: Validating a Hierarchical Cluster Analysis
- VIDEO TUTORIAL: K-means cluster analysis SPSS
- VIDEO TUTORIAL: Validating K-means cluster analysis in SPSS
- VIDEO TUTORIAL: A Primer on Multiple Discriminant Analysis in SPSS
A note on selecting the right number of clusters:
In their book "Multivariate data analysis", Joseph Hair et al. (2010) state that "no standard objective selection procedure exists" (p. 514) for deciding how many clusters should be extracted. They make the same point on page 516: "No single objective procedure is available to determine the correct number of clusters; rather the researcher must evaluate alternative cluster solutions on the following considerations..." They then list four considerations:
- Avoid extremely small clusters
- Try to maximize heterogeneity between clusters
- "All clusters should be significantly different across the set of clustering variables"
- Clusters should be theoretically valid and useful
If you follow the approaches in the videos, I think the third consideration is our best argument. We use an ANOVA with Bonferroni post-hoc pairwise comparisons to assess whether all clusters are significantly different across the set of clustering variables. Here is the full citation for Hair et al.:
- Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.
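For anyone who wants to see the logic of the third consideration outside SPSS, here is a minimal Python sketch of an ANOVA followed by Bonferroni-adjusted pairwise comparisons for a single clustering variable. It is only an illustration, not the SPSS procedure itself. The data are made up, the function names (`reg_inc_beta`, `f_sf`, `anova_bonferroni`) are my own, and I'm assuming the post-hoc test uses the pooled error term from the ANOVA and multiplies each pairwise p-value by the number of comparisons (capped at 1). The p-values come from the F distribution via a textbook continued-fraction routine for the incomplete beta function, so only the standard library is needed.

```python
from itertools import combinations
from math import exp, lgamma, log
from statistics import fmean


def _beta_cf(a, b, x, max_iter=300, eps=1e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        for aa in (
            m * (b - m) * x / ((a - 1 + 2 * m) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 1 + 2 * m)),
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:  # last correction factor is close to 1
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_sf(f, d1, d2):
    """Upper-tail probability P(F > f) for an F(d1, d2) distribution."""
    return reg_inc_beta(d2 / 2, d1 / 2, d2 / (d2 + d1 * f))


def anova_bonferroni(groups):
    """groups: dict mapping cluster label -> list of scores on one variable."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = fmean(all_values)
    means = {g: fmean(values) for g, values in groups.items()}
    ss_between = sum(len(v) * (means[g] - grand_mean) ** 2 for g, v in groups.items())
    ss_within = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
    df_between, df_within = len(groups) - 1, len(all_values) - len(groups)
    mse = ss_within / df_within  # pooled error term
    f_stat = (ss_between / df_between) / mse
    p_overall = f_sf(f_stat, df_between, df_within)

    pairs = list(combinations(groups, 2))
    adjusted = {}
    for g1, g2 in pairs:
        se_sq = mse * (1 / len(groups[g1]) + 1 / len(groups[g2]))
        t_sq = (means[g1] - means[g2]) ** 2 / se_sq
        p_raw = f_sf(t_sq, 1, df_within)  # two-sided t test, since t^2 ~ F(1, df)
        adjusted[(g1, g2)] = min(1.0, p_raw * len(pairs))  # Bonferroni adjustment
    return f_stat, p_overall, adjusted


# Hypothetical scores on one clustering variable for a three-cluster solution.
clusters = {
    1: [2.1, 2.4, 1.9, 2.2],
    2: [4.8, 5.1, 5.0, 4.7],
    3: [5.2, 4.9, 5.3, 5.0],
}
f_stat, p_overall, adjusted = anova_bonferroni(clusters)
print(f"F = {f_stat:.2f}, p = {p_overall:.4g}")
for pair, p in adjusted.items():
    print(pair, f"Bonferroni-adjusted p = {p:.4g}")
```

In this made-up data the overall F test is significant, but clusters 2 and 3 should not come out as significantly different on this variable once the adjustment is applied. That is the kind of result that would make me question whether the solution has too many clusters, at least as far as this one variable is concerned.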