Re: MLLib inside Storm : silly or not ?

2014-06-20 Thread Eustache DIEMERT
Yes, learning on a dedicated Spark cluster and predicting inside a Storm bolt is quite OK :) Thanks all for your answers. I'll post back if/when we try out this solution. E/ 2014-06-19 20:45 GMT+02:00 Shuo Xiang : > If I'm understanding correctly, you want to use MLlib for offline trainin

Re: MLLib inside Storm : silly or not ?

2014-06-19 Thread Shuo Xiang
If I'm understanding correctly, you want to use MLlib for offline training and then deploy the learned model to Storm? In this case I don't think there is any problem. However, if you are looking for online model update/training, this can be complicated and I guess quite a few algorithms in MLlib at

Re: MLLib inside Storm : silly or not ?

2014-06-19 Thread Matei Zaharia
You should be able to use many of the MLlib Model objects directly in Storm, if you save them out using Java serialization. The only one that won’t work is probably ALS, because it’s a distributed model. Otherwise, you will have to output them in your own format and write code for evaluating th
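A minimal sketch of the Java serialization round trip described above, using only the JDK. `LinearScorer` and `ModelRoundTrip` are hypothetical names invented for this illustration: `LinearScorer` stands in for a trained model and is not an MLlib class, and no Spark or Storm API is called. In practice the bytes would presumably be written on the Spark side and read once when the bolt starts up, but that wiring is an assumption and is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a trained model; NOT an MLlib class.
class LinearScorer implements Serializable {
    private static final long serialVersionUID = 1L;
    private final double[] weights;
    private final double intercept;

    LinearScorer(double[] weights, double intercept) {
        this.weights = weights.clone();
        this.intercept = intercept;
    }

    // Dot product of weights and features, plus the intercept.
    double predict(double[] features) {
        if (features.length != weights.length) {
            throw new IllegalArgumentException("feature count does not match weight count");
        }
        double score = intercept;
        for (int i = 0; i < weights.length; i++) {
            score += weights[i] * features[i];
        }
        return score;
    }
}

// Hypothetical helper showing the save/load steps with plain Java serialization.
class ModelRoundTrip {
    // Training side: turn the model object into bytes.
    static byte[] serialize(Serializable model) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(model);
        }
        return buffer.toByteArray();
    }

    // Scoring side: rebuild the model from the bytes. The model's class
    // must be on the classpath of the process that loads it.
    static LinearScorer deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (LinearScorer) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LinearScorer trained = new LinearScorer(new double[] {0.5, -1.0}, 0.25);
        byte[] bytes = serialize(trained);
        LinearScorer loaded = deserialize(bytes);
        System.out.println(loaded.predict(new double[] {2.0, 1.0}));
    }
}
```

The same pattern should apply to any model class that implements `java.io.Serializable`, provided the same class version is available where the model is loaded.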

Re: MLLib inside Storm : silly or not ?

2014-06-19 Thread Surendranauth Hiraman
I can't speak for MLlib either. But I can say the model of training in Hadoop M/R or Spark and production scoring in Storm works very well. My team has done online learning (Sofia ML library, I think) in Storm as well. I would be interested in this answer as well. -Suren On Thu, Jun 19, 2014 at

Re: MLLib inside Storm : silly or not ?

2014-06-19 Thread Eustache DIEMERT
Well, yes, VW is an appealing option, but I have only found "experimental" integrations so far. Also, early experiments suggest decision tree ensembles (RF, GBT) perform better than generalized linear models on our data. Hence the interest in MLlib :) Any other comments / suggestions welcome :) E/

Re: MLLib inside Storm : silly or not ?

2014-06-19 Thread Charles Earl
While I can't definitively speak to MLlib online learning, I'm sure you're evaluating Vowpal Wabbit, for which there have been some Storm integrations contributed. Also you might look at factorie, http://factorie.cs.umass.edu, which at least provides an online LDA. C On Thursday, June 19, 20

MLLib inside Storm : silly or not ?

2014-06-19 Thread Eustache DIEMERT
Hi Sparkers, We have a Storm cluster and are looking for a decent execution engine for machine-learned models. What I've seen from MLlib is extremely positive, but we can't just throw away our Storm-based stack. So my question is: is it feasible/recommended to train models in Spark/MLlib and execute