On Tue, Jun 12, 2012 at 1:03 AM, Justin R <justinbr...@gmail.com> wrote:
> operating system Windows 7
> matplotlib version : 1.1.0
> obtained from sourceforge
>
> the class seems to generate the same Wt matrix for every input. The
> every element of the weight matrix is either +sqrt(1/2) or -sqrt(1/2).
>
> dat1 = 4*np.random.randn(200,1) + 2
> dat2 = dat1*.25 + 1*np.random.randn(200,1)
> pcaObj1 = PCA(np.hstack((dat1,dat2)))
> print pcaObj1.Wt
>
> dat3 = 2*np.random.randn(200,1) + 2
> dat4 = dat3*2 + 3*np.random.randn(200,1)
> pcaObj2 = PCA(np.hstack((dat1,dat2)))
> print pcaObj2.Wt
>
> The output Y seems to be correct, and the projection function works.
> only the Wt matrix seems to be messed up. Am I using this class
> incorrectly, or could this be a bug?

Hi,

I wouldn't call myself a PCA expert - so don't weight my answer too heavily - but here is what I think is happening:

Looking at the code, the input data array is centered and scaled to unit variance in each dimension. The attribute .a of the class is a copy of the array that is actually sent to the SVD; note the centering/scaling. I don't have a proof of this, but intuitively I expect that the PCA axes associated with a 2-dimensional centered/scaled array will always be at 45° angles (e.g., [1,1], [-1,1], etc., which are normalized to [sqrt(1/2), sqrt(1/2)], etc.). I think one way to describe this is that after centering/scaling there are no degrees of freedom left if you only started with 2 dimensions.

So I don't think there is a bug, but it is maybe unclear what the PCA class is doing. If you go to more than 2 dimensions, you can see there is random fluctuation in Wt:

In [102]: pcaObj = PCA(np.random.randn(200,2))

In [103]: pcaObj.Wt
Out[103]:
array([[-0.70710678, -0.70710678],
       [-0.70710678,  0.70710678]])

In [104]: pcaObj = PCA(np.random.randn(200,3))

In [105]: pcaObj.Wt
Out[105]:
array([[ 0.65456366, -0.24141116, -0.7164266 ],
       [ 0.39843462,  0.91551401,  0.05553329],
       [ 0.64249223, -0.32179924,  0.69544877]])

In [106]: pcaObj = PCA(np.random.randn(200,3))

In [107]: pcaObj.Wt
Out[107]:
array([[-0.29885902, -0.67436982,  0.67521007],
       [-0.95428685,  0.21449891, -0.20815098],
       [-0.00446109, -0.70655189, -0.70764718]])

Hope that helps,
Aronne
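
The 45° intuition can be checked directly. For two standardized columns, the matrix that the SVD effectively diagonalizes is proportional to the correlation matrix [[1, r], [r, 1]], and for any r != 0 its eigenvectors are [1, 1] and [1, -1] (normalized to entries of ±sqrt(1/2)), whatever the data. The sketch below uses plain NumPy rather than the PCA class itself; the helper name standardized_pca_axes and the fixed seed are illustrative assumptions, and the centering/scaling here only approximates what the class does internally.

```python
import numpy as np


def standardized_pca_axes(data):
    """Return the principal axes (as rows) of column-standardized data.

    Hypothetical helper for illustration; not part of matplotlib.
    """
    # Center each column and scale it to unit variance, mimicking the
    # centering/scaling step described above (details are an assumption).
    a = (data - data.mean(axis=0)) / data.std(axis=0)
    # The rows of vt are the principal axes of the standardized data.
    _, _, vt = np.linalg.svd(a, full_matrices=False)
    return vt


rng = np.random.RandomState(0)  # fixed seed only for repeatability

# Two differently scaled, correlated 2-column data sets.
x1 = 4 * rng.randn(200, 1) + 2
y1 = 0.25 * x1 + rng.randn(200, 1)
wt1 = standardized_pca_axes(np.hstack((x1, y1)))

x2 = 2 * rng.randn(200, 1) + 2
y2 = 2 * x2 + 3 * rng.randn(200, 1)
wt2 = standardized_pca_axes(np.hstack((x2, y2)))

print(np.abs(wt1))
print(np.abs(wt2))
```

Both printed matrices should have every entry close to sqrt(1/2) in absolute value; only the signs can differ between data sets. The correlation r changes the singular values, not the axes.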

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users