Hi all, I am trying to use the new ALS implementation under org.apache.spark.ml.recommendation.ALS.
The new method to invoke for training seems to be override def fit(dataset: DataFrame, paramMap: ParamMap): ALSModel. How do I create a dataframe object from ratings data set that is on hdfs ? where as the method in the old ALS implementation under org.apache.spark.mllib.recommendation.ALS was def train( ratings: RDD[Rating], rank: Int, iterations: Int, lambda: Double, blocks: Int, seed: Long ): MatrixFactorizationModel My code to run the old ALS train method is as below: "val sc = new SparkContext(conf) val pfile = args(0) val purchase=sc.textFile(pfile) val ratings = purchase.map(_.split(',') match { case Array(user, item, rate) => Rating(user.toInt, item.toInt, rate.toInt) }) val model = ALS.train(ratings, rank, numIterations, 0.01)" Now, for the new ALS fit method, I am trying to use the below code to run, but getting a compilation error: val als = new ALS() .setRank(rank) .setRegParam(regParam) .setImplicitPrefs(implicitPrefs) .setNumUserBlocks(numUserBlocks) .setNumItemBlocks(numItemBlocks) val sc = new SparkContext(conf) val pfile = args(0) val purchase=sc.textFile(pfile) val ratings = purchase.map(_.split(',') match { case Array(user, item, rate) => Rating(user.toInt, item.toInt, rate.toInt) }) val model = als.fit(ratings.toDF()) I get an error that the method toDF() is not a member of org.apache.spark.rdd.RDD[org.apache.spark.ml.recommendation.ALS.Rating[Int]]. Appreciate the help ! Thanks, Jay -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/RDD-to-DataFrame-for-using-ALS-under-org-apache-spark-ml-recommendation-ALS-tp22083.html Sent from the Apache Spark User List mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org