Re: diffrence in PCA of MLib vs H2o in R

2015-03-24 Thread Sean Owen
Those implementations are computing an SVD of the input matrix directly, and while you generally need the columns to have mean 0, you can turn that off with the options you cite. I don't think this is possible in the MLlib implementation, since it is computing the principal components by

Re: diffrence in PCA of MLib vs H2o in R

2015-03-24 Thread roni
Reza, that SVD.v matches the H2o and R prcomp (non-centered) results. Thanks, -R On Tue, Mar 24, 2015 at 11:38 AM, Sean Owen so...@cloudera.com wrote: (Oh sorry, I've only been thinking of TallSkinnySVD) On Tue, Mar 24, 2015 at 6:36 PM, Reza Zadeh r...@databricks.com wrote: If you want to do a

Re: diffrence in PCA of MLib vs H2o in R

2015-03-24 Thread Reza Zadeh
Great! On Tue, Mar 24, 2015 at 2:53 PM, roni roni.epi...@gmail.com wrote: Reza, That SVD.v matches the H2o and R prComp (non-centered) Thanks -R On Tue, Mar 24, 2015 at 11:38 AM, Sean Owen so...@cloudera.com wrote: (Oh sorry, I've only been thinking of TallSkinnySVD) On Tue, Mar 24,

Re: diffrence in PCA of MLib vs H2o in R

2015-03-24 Thread Reza Zadeh
If you want to do a nonstandard (or uncentered) PCA, you can call computeSVD on RowMatrix, and look at the resulting 'V' Matrix. That should match the output of the other two systems. Reza On Tue, Mar 24, 2015 at 3:53 AM, Sean Owen so...@cloudera.com wrote: Those implementations are computing
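
To illustrate why the centered and uncentered results differ, here is a minimal sketch in plain Python. It is not MLlib, H2o, or R code. It assumes a made-up two-column data set and uses the closed-form principal axis of a 2x2 symmetric matrix. The helper names (gram, leading_direction) are hypothetical. The leading eigenvector of X^T X on the raw data is the first right singular vector (the first column of 'V'), while the same computation on mean-centered columns gives the first standard principal component.

```python
import math

def gram(rows, center):
    """Entries (a, b, d) of the 2x2 matrix X^T X = [[a, b], [b, d]].

    With center=True the column means are subtracted first, so the result
    is proportional to the covariance matrix (same eigenvectors).
    """
    n = len(rows)
    mx = sum(r[0] for r in rows) / n if center else 0.0
    my = sum(r[1] for r in rows) / n if center else 0.0
    a = sum((r[0] - mx) ** 2 for r in rows)
    b = sum((r[0] - mx) * (r[1] - my) for r in rows)
    d = sum((r[1] - my) ** 2 for r in rows)
    return a, b, d

def leading_direction(a, b, d):
    """Unit eigenvector for the largest eigenvalue of [[a, b], [b, d]]."""
    theta = 0.5 * math.atan2(2 * b, a - d)
    return math.cos(theta), math.sin(theta)

# Made-up data: the points lie on a line of slope 1, far from the origin.
rows = [(10.0, 1.0), (11.0, 2.0), (12.0, 3.0), (13.0, 4.0)]

centered = leading_direction(*gram(rows, center=True))
uncentered = leading_direction(*gram(rows, center=False))

print("centered PCA direction:  ", centered)
print("uncentered (SVD V[:, 0]):", uncentered)
```

On this data the centered direction follows the slope of the points, while the uncentered direction points roughly from the origin toward the column means. This is why the two kinds of result only agree when centering is handled the same way on both sides. Signs of the vectors may also differ between implementations.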

Re: diffrence in PCA of MLib vs H2o in R

2015-03-24 Thread Sean Owen
(Oh sorry, I've only been thinking of TallSkinnySVD) On Tue, Mar 24, 2015 at 6:36 PM, Reza Zadeh r...@databricks.com wrote: If you want to do a nonstandard (or uncentered) PCA, you can call computeSVD on RowMatrix, and look at the resulting 'V' Matrix. That should match the output of the

diffrence in PCA of MLib vs H2o in R

2015-03-24 Thread roni
I am trying to compute PCA using computePrincipalComponents. I also computed PCA using h2o in R and R's prcomp. The answers I get from H2o and R's prcomp (non-H2o) are the same when I set the options for H2o as standardized=FALSE and for R's prcomp as center = FALSE. How do I make sure that the