Hi there, I hope someone can clarify this for me. It seems that some of the MLlib algorithms, such as KMeans, Linear Regression and Logistic Regression, have a Streaming version that can do online machine learning. But does that mean the other MLlib algorithms, such as random forest, SVM, collaborative filtering, etc., cannot be used in Spark Streaming applications?
DStreams are essentially a sequence of RDDs. We can use the DStream.transform() and DStream.foreachRDD() operations, which give access to the RDDs in a DStream, and apply MLlib functions to them. So it looks like all MLlib algorithms should be able to run in a streaming application. Am I wrong?

Thanks in advance.
Lan
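
P.S. To make the question concrete, here is a rough, untested sketch in Scala of what I have in mind. The training file path, host, port, input line format and hyperparameters are all placeholders I made up for illustration, and I am assuming the streamed vectors have the same number of features as the training data. Note that this only scores the stream with a model trained once in batch; it does not update the model online the way the Streaming versions do.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.tree.RandomForest
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

object BatchModelOnStream {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("BatchModelOnStream")
    val ssc = new StreamingContext(conf, Seconds(10)) // 10-second batches (arbitrary)

    // Train a non-streaming MLlib model once, up front, on static data.
    // "data/training.libsvm" is a hypothetical path.
    val training = MLUtils.loadLibSVMFile(ssc.sparkContext, "data/training.libsvm")
    val model = RandomForest.trainClassifier(
      training,
      2,               // numClasses
      Map[Int, Int](), // categoricalFeaturesInfo: treat all features as continuous
      10,              // numTrees
      "auto",          // featureSubsetStrategy
      "gini",          // impurity
      4,               // maxDepth
      32)              // maxBins

    // Assumed input: one record per line, space-separated feature values.
    val features = ssc.socketTextStream("localhost", 9999)
      .map(line => Vectors.dense(line.trim.split(' ').map(_.toDouble)))

    // transform(): apply an RDD-to-RDD function to every batch,
    // here pairing each feature vector with its predicted class.
    val scored = features.transform(rdd => rdd.map(v => (v, model.predict(v))))
    scored.print()

    // foreachRDD(): run arbitrary per-batch logic on the driver,
    // here counting how many records fall into each predicted class.
    features.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        val counts = model.predict(rdd).countByValue()
        println(s"Predicted class counts for this batch: $counts")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If the model should also keep up with new data, I suppose it could be retrained inside foreachRDD on each batch (or on a window of batches), but that is a full retrain every time rather than an incremental update.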