Hi, can PCA be implemented in the SparkR-MLlib integration?
There are perhaps two separate issues:

1) Having methods in SparkRWrapper and RFormula that send the right input types through the pipeline. MLlib PCA operates either on a RowMatrix or on the feature vectors of an RDD[LabeledPoint]. The labels aren't used, though in the second case it may be useful to be able to keep the label.

2) Formula parsing from R. In R syntax a formula can have no label (response variable), for example in prcomp:

    prcomp(~ Col1 + Col2 + Col3, data = myDataFrame)

Can RFormula currently parse this type of formula?

Thanks for listening / ideas.

Deb
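
P.S. To make the second issue concrete, here is a minimal sketch of what accepting a one-sided formula would involve. It is plain Python for illustration only, not how RFormula works; `parse_formula` is a hypothetical helper, and it assumes the right-hand side is just column names joined by `+` (no interactions, `.`, or `-` terms).

```python
def parse_formula(formula):
    """Split an R-style formula into (response, terms).

    Hypothetical helper for illustration; not part of Spark or RFormula.
    Assumes the right-hand side is '+'-separated column names only.
    """
    if "~" not in formula:
        raise ValueError("formula must contain '~'")
    lhs, rhs = formula.split("~", 1)
    # An empty left-hand side means there is no response variable (label).
    response = lhs.strip() or None
    terms = [t.strip() for t in rhs.split("+") if t.strip()]
    if not terms:
        raise ValueError("formula has no terms on the right-hand side")
    return response, terms


# One-sided formula, as used with prcomp: no label, three feature columns.
print(parse_formula("~ Col1 + Col2 + Col3"))
# Two-sided formula: the label is kept alongside the feature columns.
print(parse_formula("y ~ Col1 + Col2"))
```

With a result of this shape, a missing response could route the selected columns to the RowMatrix-style input, while a present response could be carried along as the label in the LabeledPoint case.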