Re: thought experiment: use spark ML to real time prediction

DB Tsai Thu, 12 Nov 2015 20:30:07 -0800

This will bring in all of Spark's dependencies, which may break the web app.
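
As a rough illustration of the dependency-free alternative raised in the quoted discussion below (writing your own scorer), here is a minimal Python sketch. It assumes a logistic regression model whose weights and intercept were exported by hand; the JSON layout, the values, and the function names are all hypothetical and not part of any Spark API.

```python
import json
import math

# Hypothetical model export: in practice these numbers would be copied
# from a trained model's coefficients and intercept. The JSON layout
# here is an assumption for illustration, not a Spark format.
MODEL_JSON = '{"weights": [0.5, -0.25], "intercept": 0.1}'


def load_model(text):
    """Parse the exported parameters into (weights, intercept)."""
    params = json.loads(text)
    return params["weights"], params["intercept"]


def predict_probability(weights, intercept, features):
    """Score one feature vector with a logistic regression model."""
    if len(weights) != len(features):
        raise ValueError("feature vector length does not match model")
    # Linear margin followed by the logistic (sigmoid) function.
    margin = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-margin))


weights, intercept = load_model(MODEL_JSON)
probability = predict_probability(weights, intercept, [1.0, 2.0])
```

A scorer like this pulls in nothing beyond the standard library, but it has to be rewritten for every model type and feature transformation, which is the duplicated work described in the quoted thread.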



Sincerely,

DB Tsai
----------------------------------------------------------
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D

On Thu, Nov 12, 2015 at 8:15 PM, Nirmal Fernando <nir...@wso2.com> wrote:

>
>
> On Fri, Nov 13, 2015 at 2:04 AM, darren <dar...@ontrenet.com> wrote:
>
>> I agree 100%. Making the model requires large data and many cpus.
>>
>> Using it does not.
>>
>> This is a very useful side effect of ML models.
>>
>> If MLlib models can't be used outside Spark, that's a real shame.
>>
>
> Well, you can, as mentioned earlier. You don't need the Spark runtime for
> predictions: save the serialized model and deserialize it to use it. (You
> do need the Spark jars on the classpath, though.)
>
>>
>>
>>
>> -------- Original message --------
>> From: "Kothuvatiparambil, Viju" <viju.kothuvatiparam...@bankofamerica.com>
>>
>> Date: 11/12/2015 3:09 PM (GMT-05:00)
>> To: DB Tsai <dbt...@dbtsai.com>, Sean Owen <so...@cloudera.com>
>> Cc: Felix Cheung <felixcheun...@hotmail.com>, Nirmal Fernando <
>> nir...@wso2.com>, Andy Davidson <a...@santacruzintegration.com>, Adrian
>> Tanase <atan...@adobe.com>, "user @spark" <user@spark.apache.org>,
>> Xiangrui Meng <men...@gmail.com>, hol...@pigscanfly.ca
>> Subject: RE: thought experiment: use spark ML to real time prediction
>>
>> I am glad to see DB’s comments; they make me feel I am not the only one
>> facing these issues. If we were able to use MLlib to load the model in web
>> applications (outside the Spark cluster), that would solve the issue. I
>> understand Spark is mainly for processing big data in a distributed mode.
>> But there is no purpose in training a model using MLlib if we are not
>> able to use it in the applications that need to access the model.
>>
>>
>>
>> Thanks
>>
>> Viju
>>
>>
>>
>> *From:* DB Tsai [mailto:dbt...@dbtsai.com]
>> *Sent:* Thursday, November 12, 2015 11:04 AM
>> *To:* Sean Owen
>> *Cc:* Felix Cheung; Nirmal Fernando; Andy Davidson; Adrian Tanase; user
>> @spark; Xiangrui Meng; hol...@pigscanfly.ca
>> *Subject:* Re: thought experiment: use spark ML to real time prediction
>>
>>
>>
>> I think the use-case can be quite different from PMML.
>>
>>
>>
>> Having a Spark-platform-independent ML jar would empower users to do the
>> following:
>>
>>
>>
>> 1) PMML doesn't contain all the models we have in MLlib. Also, for an ML
>> pipeline trained by Spark, most of the time PMML is not expressive enough
>> to do all the transformations we have in Spark ML. As a result, if we are
>> able to serialize the entire Spark ML pipeline after training, and then
>> load it back in an app without any Spark platform for production scoring,
>> this will be very useful for production deployment of Spark ML models. The
>> only issue is that if a transformer involves a shuffle, we need to figure
>> out a way to handle it. When I chatted with Xiangrui about this, he
>> suggested that we may tag whether a transformer is shuffle-ready.
>> Currently, at Netflix, we are not able to use ML pipelines because of
>> those issues, and we have to write our own scorers for production, which
>> is a lot of duplicated work.
>>
>>
>>
>> 2) If users can use Spark's linear algebra code, like vector or matrix, in
>> their applications, this will be very useful. It can help share code
>> between the Spark training pipeline and the production deployment. Also,
>> lots of good stuff in Spark's MLlib doesn't depend on the Spark platform,
>> and people could use it in their applications without pulling in lots of
>> dependencies. In fact, in my project, I have to copy & paste code from
>> MLlib into my project to use those goodies in apps.
>>
>>
>>
>> 3) Currently, MLlib depends on GraphX, which means that in GraphX there is
>> no way to use MLlib's vector or matrix. And
>>
>
>
>
> --
>
> Thanks & regards,
> Nirmal
>
> Team Lead - WSO2 Machine Learner
> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/
>
>
>
