On Wed, May 15, 2013 at 03:11:46PM -0400, Adam Hughes wrote:
> I noticed in the PCA class methods fit() and fit_transform(), there is a
> keyword argument "y=None" that is never actually used.  I was curious why
> this is, but it's not all that important.

Because methods should have the same signature across scikit-learn,
whether they are supervised or unsupervised.
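To make this concrete, here is a minimal sketch (made-up toy data) showing that the ignored y lets an unsupervised estimator like PCA share the fit(X, y) calling convention used everywhere else, e.g. inside a Pipeline:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.RandomState(0).rand(10, 5)  # toy data: 10 samples, 5 features

pca = PCA(n_components=2)
# y is accepted but ignored, so PCA can be called exactly like a
# supervised estimator: fit(X) and fit(X, y=None) are equivalent.
pca.fit(X, y=None)
X_t = pca.fit_transform(X)  # shape (10, 2)
```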

> My second question is in regard to the choice to use singular value
> decomposition.  I will be performing PCA on spectral data, which
> generally has a much higher feature dimension than sample dimension.
>  For example, I may have 2000 features (wavelengths) but only 10 time
> points (columns).  The data is not sparse, however.  My question
> basically is will the SVD still predict the same values that the brute
> force computation of the eigenvectors of the covariance matrix would
> give?   Are there caveats, or do you think it's safe to use with
> confidence?

Yes, this is what the SVD is: for centered data it yields exactly the same
principal components and explained variances as an eigendecomposition of the
covariance matrix. The SVD route just avoids forming the (possibly huge)
covariance matrix explicitly, which also avoids squaring the condition number.
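A quick numerical check of that equivalence, on made-up "wide" data shaped like the poster's case (few samples, many features):

```python
import numpy as np

rng = np.random.RandomState(0)
# wide data: 10 samples (time points) x 200 features (wavelengths)
X = rng.rand(10, 200)
Xc = X - X.mean(axis=0)          # center, as PCA does internally

# Route 1: PCA via SVD of the centered data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components_svd = Vt              # principal axes, one per row
var_svd = S**2 / (X.shape[0] - 1)

# Route 2: PCA via eigendecomposition of the covariance matrix
C = np.cov(Xc, rowvar=False)     # 200 x 200
evals, evecs = np.linalg.eigh(C)
order = np.argsort(evals)[::-1]  # eigh returns ascending order
evals, evecs = evals[order], evecs[:, order]

# Explained variances agree, and the leading axes span the same
# directions (eigenvectors are only defined up to sign).
k = 5
assert np.allclose(var_svd[:k], evals[:k])
for i in range(k):
    cosine = abs(components_svd[i] @ evecs[:, i])
    assert np.isclose(cosine, 1.0)
```

Note that with only 10 samples the covariance matrix has rank at most 9, so only the leading components are meaningful in either route.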

> Additionally, I noticed the SVD option "full_matrices" is set to False.  I
> realize this is an approximation to speed up the computation of the SVD,
> and in the numpy example, they used np.allclose() to verify that the
> discrepancy is insignificant.  Can I be confident that setting
> full_matrices to False is always a good idea, or are there cases where it
> may introduce error?

It's not an approximation; it's exact. full_matrices=False only drops the
singular vectors that are multiplied by zeros in the rectangular sigma
matrix, so the product U S V^T still reconstructs the input to
floating-point precision.
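You can verify this directly on a made-up matrix with the same wide shape:

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.rand(10, 200)  # 10 x 200, so at most 10 nonzero singular values

# "economy" SVD: U is 10x10, S has 10 entries, Vt is 10x200
# (the full SVD would carry a 200x200 Vt, whose extra 190 rows are
# multiplied by zeros and contribute nothing)
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# reconstruction is exact to floating-point precision
assert np.allclose(A, U @ (S[:, None] * Vt))
```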

G

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
