other:inspect3d:tutorials:analysis_of_baseball_hitters
Differences
This shows you the differences between two versions of the page.
other:inspect3d:tutorials:analysis_of_baseball_hitters [2025/01/10 20:47] – wikisysop | other:inspect3d:tutorials:analysis_of_baseball_hitters [2025/01/10 20:49] (current) – wikisysop | ||
---|---|---|---|
Line 159: | Line 159: | ||
Variance explained is the first graph displayed after running a PCA. It shows the variance explained by each principal component as well as the cumulative variance explained across components. We can see that 90.0% of the variance is explained by the first five principal components. Generally you increase the number of PCs until they explain 95% of the variance, but this data set needs seven PCs to reach that threshold, and we don't want to start overfitting or capturing too much noise, so the number of PCs is left at five. | Variance explained is the first graph displayed after running a PCA. It shows the variance explained by each principal component as well as the cumulative variance explained across components. We can see that 90.0% of the variance is explained by the first five principal components. Generally you increase the number of PCs until they explain 95% of the variance, but this data set needs seven PCs to reach that threshold, and we don't want to start overfitting or capturing too much noise, so the number of PCs is left at five. | ||
- | {{: | + | {{: |
\\ | \\ | ||
Line 166: | Line 166: | ||
When looking at the group scores, the goal is to be able to confidently classify subjects as either college or high school hitters. The graph shows that the standard errors of the two groups do not overlap for any of the five PCs, suggesting that all five PCs can differentiate between the groups. Variance explained shows that PC1 accounts for the most overall variation at 38.9%, but PC2 and PC5 show the most significant differences between the groups: their group scores do not overlap and do not cross zero. They will therefore be the focus of the remaining analysis. | When looking at the group scores, the goal is to be able to confidently classify subjects as either college or high school hitters. The graph shows that the standard errors of the two groups do not overlap for any of the five PCs, suggesting that all five PCs can differentiate between the groups. Variance explained shows that PC1 accounts for the most overall variation at 38.9%, but PC2 and PC5 show the most significant differences between the groups: their group scores do not overlap and do not cross zero. They will therefore be the focus of the remaining analysis. | ||
- | {{: | + | {{: |
\\ | \\ | ||
Line 173: | Line 173: | ||
We can take a closer look at PC2 and PC5 to see why these principal components may be showing the most significant differences. Looking at the mean signal trace of the college and high school subjects' pelvic internal/ | We can take a closer look at PC2 and PC5 to see why these principal components may be showing the most significant differences. Looking at the mean signal trace of the college and high school subjects' pelvic internal/ | ||
- | {{: | + | {{: |
\\ | \\ |
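The two checks described in the text (choosing the number of PCs from cumulative variance explained, and judging whether group scores separate the groups) can be sketched in a few lines of Python. This is only an illustration, not how Inspect3D computes them. The function names are hypothetical, and the per-PC percentages are made-up values chosen to match the figures quoted above (PC1 at 38.9%, five PCs reaching 90.0%, seven PCs needed for 95%). The real values come from the Variance Explained graph.

```python
def cumulative_variance(explained):
    """Return the running total of per-PC variance explained (in percent)."""
    total = 0.0
    running = []
    for value in explained:
        total += value
        running.append(total)
    return running


def components_needed(explained, threshold):
    """Smallest number of leading PCs whose cumulative variance reaches threshold.

    Returns None if the threshold is never reached.
    """
    for count, total in enumerate(cumulative_variance(explained), start=1):
        # Small tolerance so floating-point rounding does not hide an exact hit.
        if total >= threshold - 1e-9:
            return count
    return None


def intervals_overlap(mean_a, se_a, mean_b, se_b):
    """True if the mean +/- standard error ranges of two groups overlap."""
    return (mean_a - se_a) <= (mean_b + se_b) and (mean_b - se_b) <= (mean_a + se_a)


def crosses_zero(mean, se):
    """True if a group's mean +/- standard error range includes zero."""
    return (mean - se) <= 0.0 <= (mean + se)


# ASSUMPTION: illustrative percentages only. PC1 = 38.9 is quoted in the text;
# every other value is invented so that the totals match the quoted thresholds.
example_variance = [38.9, 20.1, 14.0, 9.5, 7.5, 3.0, 2.5, 2.0, 1.5, 1.0]

pcs_for_90 = components_needed(example_variance, 90.0)
pcs_for_95 = components_needed(example_variance, 95.0)
```

With these illustrative numbers, `pcs_for_90` is 5 and `pcs_for_95` is 7, which mirrors the trade-off in the text: stopping at five PCs gives up roughly five percentage points of variance in exchange for a smaller model. By the criteria in the text, a PC separates the groups clearly when `intervals_overlap` is false for the two groups' scores and `crosses_zero` is false for each group.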
other/inspect3d/tutorials/analysis_of_baseball_hitters.1736542052.txt.gz · Last modified: 2025/01/10 20:47 by wikisysop