You need to subtract the column means from X before computing (1/n)*X'*X. Without that centering step the product is the Gram matrix, not the covariance matrix (http://en.wikipedia.org/wiki/Covariance_matrix).
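
To make the difference concrete, here is a minimal sketch in plain Python (not the MLlib or Julia code from this thread). The helper names gram_matrix and covariance_matrix and the two-row sample are made up for illustration, and the 1/n normalization simply follows the formula quoted below.

```python
def gram_matrix(X):
    """Return (1/n) * X' * X with no centering (the second-moment matrix)."""
    n, d = len(X), len(X[0])
    return [[sum(row[i] * row[j] for row in X) / n for j in range(d)]
            for i in range(d)]


def covariance_matrix(X):
    """Subtract each column's mean, then apply the same (1/n) * X' * X formula."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in X]
    return gram_matrix(centered)


# Hypothetical toy data: two observations, two features.
X = [[1.0, 2.0], [3.0, 6.0]]

print("uncentered:", gram_matrix(X))
print("centered:  ", covariance_matrix(X))
```

The two matrices differ by the outer product of the column means, so their eigenvectors generally differ as well unless the data already has zero mean. Dividing by n or by n-1 only rescales the eigenvalues and does not change the principal directions.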
On Fri, Jan 9, 2015 at 6:41 PM, Upul Bandara <[email protected]> wrote:
> Hi Xiangrui,
>
> Thanks for the reply.
>
> Julia code is also using the covariance matrix:
> (1/n)*X'*X ;
>
> Thanks,
> Upul
>
> On Fri, Jan 9, 2015 at 2:11 AM, Xiangrui Meng <[email protected]> wrote:
>>
>> The Julia code is computing the SVD of the Gram matrix. PCA should be
>> applied to the covariance matrix. -Xiangrui
>>
>> On Thu, Jan 8, 2015 at 8:27 AM, Upul Bandara <[email protected]>
>> wrote:
>> > Hi All,
>> >
>> > I tried to do PCA for the Iris dataset
>> > [https://archive.ics.uci.edu/ml/datasets/Iris] using MLLib
>> > [http://spark.apache.org/docs/1.1.1/mllib-dimensionality-reduction.html].
>> > Also, PCA was calculated in Julia using following method:
>> >
>> > Sigma = (1/numRow(X))*X'*X ;
>> > [U, S, V] = svd(Sigma);
>> > Ureduced = U(:, 1:k);
>> > Z = X*Ureduced;
>> >
>> > However, I'm seeing a little difference between values given by MLLib
>> > and the method shown above.
>> >
>> > Does anyone have any idea about this difference?
>> >
>> > Additionally, I have attached two visualizations, related to two
>> > approaches.
>> >
>> > Thanks,
>> > Upul
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: [email protected]
>> > For additional commands, e-mail: [email protected]
