I come from a Hadoop background but am completely new to Apache Spark. I'd like to do topic modeling with the LDA algorithm on some txt files. The example on the Spark website assumes that the input to LDA is a file containing word counts. Could someone help me figure out the steps to start from the actual txt documents (the raw content) and arrive at the topics?
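To make the question concrete, here is a rough sketch in plain Python (no Spark) of the preprocessing I think is needed before LDA: tokenize each document, build a vocabulary, and turn each document into a vector of word counts. The function names, the `min_length` filter, and the two sample documents are my own assumptions for illustration, not Spark APIs. As far as I understand the example's input, each output line would be one document with one count per vocabulary term.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(tokenized_docs, min_length=3):
    """Map each distinct term of at least min_length letters to a column index."""
    terms = sorted({t for doc in tokenized_docs for t in doc if len(t) >= min_length})
    return {term: index for index, term in enumerate(terms)}


def to_count_vector(tokens, vocabulary):
    """Return a dense list of term counts, one entry per vocabulary term."""
    vector = [0] * len(vocabulary)
    for term, count in Counter(tokens).items():
        if term in vocabulary:  # terms outside the vocabulary are ignored
            vector[vocabulary[term]] = count
    return vector


# Hypothetical documents standing in for the contents of the txt files.
docs = [
    "Spark runs LDA on word counts",
    "Hadoop and Spark process text files",
]

tokenized = [tokenize(d) for d in docs]
vocab = build_vocabulary(tokenized)
vectors = [to_count_vector(tokens, vocab) for tokens in tokenized]

# One line per document, counts separated by spaces.
for vector in vectors:
    print(" ".join(str(count) for count in vector))
```

What I can't work out is how to do the same thing in a distributed way in Spark, and how to map the term indices in the resulting topics back to the actual words.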