Re: Is spark streaming +MlLib for online learning?

2015-02-18 Thread mucaho
Hi

What is the general consensus/roadmap for implementing additional online /
streamed trainable models?

Apache Spark 1.2.1 currently supports streaming linear regression and
clustering, and other streaming linear methods are planned according to
the issue tracker.
However, I cannot find any details on the issue tracker about online
training of a collaborative filter. Judging from another mailing list
discussion
http://mail-archives.us.apache.org/mod_mbox/spark-user/201501.mbox/%3ce07aa61e-eeb9-4ded-be3e-3f04003e4...@storefront.be%3E
incremental training should be possible for ALS. Are there any plans for this?

Regards
mucaho



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Is-spark-streaming-MlLib-for-online-learning-tp19701p21698.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Is spark streaming +MlLib for online learning?

2015-02-18 Thread Reza Zadeh
This feature request is already being tracked:
https://issues.apache.org/jira/browse/SPARK-4981
Aiming for 1.4
Best,
Reza


Re: Is spark streaming +MlLib for online learning?

2014-11-25 Thread Xiangrui Meng
In 1.2, we added streaming k-means:
https://github.com/apache/spark/pull/2942 . -Xiangrui
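
For readers unfamiliar with the idea, streaming k-means folds each new batch into the existing cluster centers instead of re-clustering all data. The following is a minimal pure-Python sketch of one common formulation, not the code from the pull request above. The function names, the `decay` parameter and the toy data are illustrative assumptions. In MLlib the corresponding class is `StreamingKMeans`, which exposes a decay factor for forgetting older data.

```python
def nearest(centers, point):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((c - p) ** 2 for c, p in zip(centers[k], point)),
    )


def update_centers(centers, weights, batch, decay=1.0):
    """Fold one batch into the centers; decay < 1 down-weights older data.

    Assumption of this sketch: each center keeps a running weight that is
    discounted by `decay` before the new batch is merged in.
    """
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    # Assign every point in the batch to its nearest current center.
    for point in batch:
        k = nearest(centers, point)
        counts[k] += 1
        for j, pj in enumerate(point):
            sums[k][j] += pj
    new_centers, new_weights = [], []
    for k, center in enumerate(centers):
        old = weights[k] * decay
        total = old + counts[k]
        if total == 0:
            # No history and no new points: leave this center unchanged.
            new_centers.append(list(center))
            new_weights.append(0.0)
            continue
        # Weighted average of the old center and the new points.
        new_centers.append(
            [(center[j] * old + sums[k][j]) / total for j in range(dim)]
        )
        new_weights.append(total)
    return new_centers, new_weights


# Toy stream: two batches with points near (0, 0) and (5, 5).
centers = [[0.0, 0.0], [1.0, 1.0]]
weights = [0.0, 0.0]
batches = [
    [[0.1, 0.0], [4.9, 5.0]],
    [[-0.1, 0.0], [5.1, 5.0]],
]
for batch in batches:
    centers, weights = update_centers(centers, weights, batch)
print(centers, weights)
```

With `decay=1.0` every batch counts equally, so the centers approach the running mean of all points assigned to them. With `decay=0.0` only the most recent batch matters.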




Is spark streaming +MlLib for online learning?

2014-11-24 Thread Joanne Contact
Hi Gurus,

Sorry for my naive question. I am new.

I seem to have read somewhere that Spark still does batch learning, but that
Spark Streaming could allow online learning.

I cannot find this on the website now:

http://spark.apache.org/docs/latest/streaming-programming-guide.html

I know MLlib uses incremental or iterative algorithms; I wonder whether this
also holds across batches in Spark Streaming.

So the question is: when I call MLlib linear regression, does training use a
single batch of data as the training set? If so, is the model update between
batches already taken care of? That is, will the model eventually be trained
on all data that has arrived from the beginning up to the time of scoring, or
only on data from a limited number of recent batches?


Many thanks!

J


Re: Is spark streaming +MlLib for online learning?

2014-11-24 Thread Tobias Pfeiffer
Hi,

On Tue, Nov 25, 2014 at 9:40 AM, Joanne Contact joannenetw...@gmail.com
wrote:

 I seem to have read somewhere that Spark still does batch learning, but that
 Spark Streaming could allow online learning.


Spark core doesn't do machine learning itself, but MLlib does. As far as I
know, MLlib currently supports online learning only for linear regression:
https://spark.apache.org/docs/1.1.0/mllib-linear-methods.html#streaming-linear-regression
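
To illustrate what happens between batches in an online scheme of this kind: each incoming batch updates the current weights rather than retraining from scratch, so earlier batches influence the model only through the weights carried forward. The following is a minimal pure-Python sketch of that idea, not MLlib's implementation. The function names, step size and iteration count are illustrative assumptions. In MLlib the corresponding class is `StreamingLinearRegressionWithSGD`, whose `trainOn` method is applied to a DStream of labeled points.

```python
import random


def sgd_update(weights, batch, step_size=0.1, num_iterations=50):
    """Run a few gradient-descent passes over one batch, starting from `weights`."""
    w = list(weights)
    n = len(batch)
    for _ in range(num_iterations):
        grad = [0.0] * len(w)
        for features, label in batch:
            # Prediction error for this example under the current weights.
            error = sum(wi * xi for wi, xi in zip(w, features)) - label
            for j, xj in enumerate(features):
                grad[j] += error * xj / n
        w = [wi - step_size * gj for wi, gj in zip(w, grad)]
    return w


def make_batch(rng, true_weights, size=20):
    """Generate one noiseless toy batch of (features, label) pairs."""
    batch = []
    for _ in range(size):
        features = [rng.uniform(-1, 1) for _ in true_weights]
        label = sum(t * x for t, x in zip(true_weights, features))
        batch.append((features, label))
    return batch


rng = random.Random(0)
true_weights = [2.0, -1.0]  # hypothetical target the stream is generated from
weights = [0.0, 0.0]        # initial model, as one would set before training

# Each batch refines the weights left behind by the previous batch.
for _ in range(30):
    weights = sgd_update(weights, make_batch(rng, true_weights))
print(weights)
```

In this sketch no past data is stored: the model never revisits old batches, yet it is not limited to a fixed window either, because every batch has left its mark on the weights.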

Tobias


Re: Is spark streaming +MlLib for online learning?

2014-11-24 Thread Joanne Contact
Thank you Tobias!
