Hi,

Can PCA be implemented in the SparkR-MLlib integration?

There are perhaps two separate issues:

1) Having methods in SparkRWrapper and RFormula that send the right
input types through the pipeline.
MLlib PCA operates either on a RowMatrix or on the feature vectors of an
RDD[LabeledPoint]. The labels aren't used, though in the second case it
may be useful to be able to keep the label alongside the projected
features.
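
To make the second case concrete, here is a rough sketch in plain
Python (not Spark code, and not how MLlib does it internally). The
names top_component and project_keep_labels are made up for this
example, it only finds the first component (by power iteration), and
centering the data before projecting is my own choice. The point is
just that the fit sees the features only, while each label is carried
through to the output.

```python
import math

def top_component(rows, iterations=200):
    """Leading principal direction of `rows`, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Arbitrary start vector; it must not be orthogonal to the answer.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("no variance along the start vector")
        v = [x / norm for x in w]
    return means, v

def project_keep_labels(points):
    """points: list of (label, features). Labels are ignored by the fit
    and passed through unchanged next to the projected value."""
    features = [f for _, f in points]
    means, v = top_component(features)
    return [(label, sum((f[j] - means[j]) * v[j] for j in range(len(v))))
            for label, f in points]
```

Calling project_keep_labels on a list of (label, features) pairs gives
back (label, score) pairs in the same order.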

2) Formula parsing from R.
In R syntax a formula can have no label (response variable). For
example, with prcomp:
prcomp(~ Col1 + Col2 + Col3, data = myDataFrame)
Can RFormula currently parse this type of formula?
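
For illustration, this is roughly the behaviour I have in mind, as a
small standalone Python sketch. parse_formula is a hypothetical helper,
not part of RFormula, and it only understands terms joined by "+" (no
".", "-" or ":" operators). It says nothing about what RFormula
actually does today; it just shows a parser that treats the left-hand
side as optional.

```python
def parse_formula(formula):
    """Split an R-style formula into (response, terms).

    response is None when the left-hand side is empty, as in
    "~ Col1 + Col2 + Col3".
    """
    if formula.count("~") != 1:
        raise ValueError("expected exactly one '~' in formula: %r" % formula)
    lhs, rhs = (side.strip() for side in formula.split("~"))
    terms = [t.strip() for t in rhs.split("+")]
    if not all(terms):
        raise ValueError("empty term in formula: %r" % formula)
    return (lhs or None), terms
```

With this, "~ Col1 + Col2 + Col3" yields no response and three terms,
while "y ~ a + b" yields the response "y" and two terms.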


Thanks for listening, and for any ideas.
Deb
