On Wed, May 15, 2013 at 03:11:46PM -0400, Adam Hughes wrote:
> I noticed in the PCA class methods fit() and fit_transform(), there is a
> keyword option "y=None" that is never actually used. I was curious why
> this is, but it's not all that important.
Because methods should have the same signature across scikit-learn, whether
they are supervised or unsupervised.

> My second question is in regard to the choice to use singular value
> decomposition. I will be performing PCA on spectral data, which generally
> has a much higher feature dimension than sample dimension. For example, I
> may have 2000 features (wavelengths) but only 10 time points (columns).
> The data is not sparse, however. My question basically is: will the SVD
> still produce the same values that the brute-force computation of the
> eigenvectors of the covariance matrix would give? Are there caveats, or
> do you think it's safe to use with confidence?

Yes, this is what the SVD is.

> Additionally, I noticed the SVD option "full_matrices" is set to False. I
> realize this is an approximation to speed up the computation of the SVD,
> and in the numpy example, they used np.allclose() to verify that the
> correction is insignificant. Can I be confident that setting
> full_matrices to False is always a good idea, or are there cases where it
> may introduce error?

It's not an approximation, it's exact.

G

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
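Both points are easy to verify numerically. The sketch below (shapes chosen to match the 10-sample, 2000-feature case from the question; the data itself is random, not real spectra) checks that the SVD of the centered data gives the same eigenvalues and principal axes as eigendecomposing the scatter matrix, and that `full_matrices=False` reconstructs the data exactly rather than approximately:

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(10, 2000)          # 10 samples, 2000 features
Xc = X - X.mean(axis=0)          # center, as PCA does

# Route 1: thin SVD of the centered data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Route 2: eigendecomposition of the (unnormalized) covariance Xc.T @ Xc
evals, evecs = np.linalg.eigh(Xc.T @ Xc)
order = np.argsort(evals)[::-1]          # eigh returns ascending order
evals, evecs = evals[order], evecs[:, order]

# Squared singular values == covariance eigenvalues
assert np.allclose(S**2, evals[:len(S)])

# Leading principal axes agree up to sign (skip near-zero eigenvalues,
# whose eigenvectors are not uniquely determined)
for i in range(5):
    assert np.allclose(np.abs(Vt[i]), np.abs(evecs[:, i]))

# full_matrices=False is exact: it only drops the columns of U and rows
# of Vt that multiply zero singular values, so the product is unchanged
assert np.allclose(U @ np.diag(S) @ Vt, Xc)
```

The only thing `full_matrices=False` discards is the part of the orthogonal basis spanning the null space, which never contributes to the reconstruction.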
