I am trying to run an SVD on a DataFrame. I used the ml TF-IDF, which produced a DataFrame. For the Singular Value Decomposition I am trying to use RowMatrix, which takes an RDD of mllib.Vector, so I have to convert this DataFrame, whose features column I assumed was ml.Vector.
However, the conversion

    val convertedTermDocMatrix = MLUtils.convertMatrixColumnsFromML(termDocMatrix, "features")

fails with

    java.lang.IllegalArgumentException: requirement failed: Column features must be new Matrix type to be converted to old type but got org.apache.spark.ml.linalg.VectorUDT

So the question is: how do I perform SVD on a DataFrame? I assume that not all of the functionality of mllib has been ported to ml. I tried converting my entire project to use RDDs, but computeSVD on RowMatrix throws out-of-memory errors, and in any case I would like to stick with DataFrames. Our text corpus is around 55 GB of text data.

Ganesh

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/VectorUDT-and-ml-Vector-for-SVD-tp28038.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
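
The exception above says the features column holds ml vectors (VectorUDT), not matrices. For illustration, here is a minimal sketch of one possible route, assuming Spark 2.x, where MLUtils also provides convertVectorColumnsFromML for vector columns. The toy corpus, the object name SvdSketch, the helper topSingularValues, numFeatures = 64 and k = 2 are all hypothetical stand-ins, not part of the original project. The sketch only covers the type conversion and does not by itself address the out-of-memory errors.

```scala
import org.apache.spark.ml.feature.{HashingTF, IDF, Tokenizer}
import org.apache.spark.mllib.linalg.{Vector => OldVector}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.sql.SparkSession

object SvdSketch {

  // Builds a tiny TF-IDF DataFrame, converts its ml vector column to
  // mllib vectors, and returns the top-k singular values.
  def topSingularValues(spark: SparkSession, k: Int): OldVector = {
    import spark.implicits._

    // Toy corpus standing in for the real documents (hypothetical data).
    val docs = Seq(
      (0L, "spark makes big data simple"),
      (1L, "singular value decomposition of a term document matrix"),
      (2L, "spark ml and spark mllib use different vector types")
    ).toDF("id", "text")

    val words = new Tokenizer()
      .setInputCol("text").setOutputCol("words")
      .transform(docs)
    val tf = new HashingTF()
      .setInputCol("words").setOutputCol("rawFeatures")
      .setNumFeatures(64) // small on purpose, for the toy example only
      .transform(words)
    val termDocMatrix = new IDF()
      .setInputCol("rawFeatures").setOutputCol("features")
      .fit(tf).transform(tf)

    // "features" holds ml vectors, so use the Vector variant of the
    // converter instead of convertMatrixColumnsFromML.
    val converted = MLUtils.convertVectorColumnsFromML(termDocMatrix, "features")

    // RowMatrix expects an RDD of mllib vectors.
    val rows = converted.select("features").rdd
      .map(_.getAs[OldVector]("features"))

    // Only the singular values are needed here, so U is not computed.
    new RowMatrix(rows).computeSVD(k, computeU = false).s
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("svd-sketch").master("local[*]").getOrCreate()
    try println(s"Singular values: ${topSingularValues(spark, 2)}")
    finally spark.stop()
  }
}
```

In a real job the SparkSession would come from the application rather than local[*], and termDocMatrix would be the existing TF-IDF DataFrame rather than the toy one built here.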