Thanks for letting me know about this, it looks pretty interesting.  From
reading the documentation it seems that the server must be built on a Spark
cluster, is that correct?  Is it possible to deploy it on a Java
server?  That is how we are currently running our web app.



On Tue, Nov 4, 2014 at 7:57 PM, Simon Chan <simonc...@gmail.com> wrote:

> The latest version of PredictionIO, which is now under the Apache 2 license,
> supports deploying MLlib models in production.
>
> The "engine" you build will include a few components, such as:
> - Data - includes Data Source and Data Preparator
> - Algorithm(s)
> - Serving
> I believe that you can do the feature vector creation inside the Data
> Preparator component.
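[As a rough illustration of the feature-vector creation step described above: this is a hypothetical plain-Python sketch, not PredictionIO's actual Preparator API (which is Scala code against its own TrainingData/PreparedData types). It only shows the kind of transformation involved: mapping raw text onto term-frequency vectors over a fixed vocabulary.]

```python
# Hypothetical sketch of feature-vector creation, the kind of work a
# Data Preparator component would do before the Algorithm sees the data.
from collections import Counter

def build_vocabulary(documents):
    """Assign each distinct token a stable column index."""
    vocab = {}
    for doc in documents:
        for token in doc.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def to_feature_vector(text, vocab):
    """Turn one document into a dense term-frequency vector."""
    counts = Counter(text.lower().split())
    return [counts.get(token, 0) for token in vocab]

docs = ["spark makes mllib fast", "mllib models serve predictions"]
vocab = build_vocabulary(docs)
vectors = [to_feature_vector(d, vocab) for d in docs]
```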
>
> Currently, the package comes with two templates: 1) Collaborative
> Filtering Engine Template - with MLlib ALS; 2) Classification Engine
> Template - with MLlib Naive Bayes. The latter may be useful to you. And
> you can customize the Algorithm component, too.
>
> I have just created a doc: http://docs.prediction.io/0.8.1/templates/
> Love to hear your feedback!
>
> Regards,
> Simon
>
>
>
> On Mon, Oct 27, 2014 at 11:03 AM, chirag lakhani <chirag.lakh...@gmail.com
> > wrote:
>
>> Would pipelining include model export?  I didn't see that in the
>> documentation.
>>
>> Are there ways that this is being done currently?
>>
>>
>>
>> On Mon, Oct 27, 2014 at 12:39 PM, Xiangrui Meng <men...@gmail.com> wrote:
>>
>>> We are working on the pipeline features, which would make this
>>> procedure much easier in MLlib. This is still a WIP and the main JIRA
>>> is at:
>>>
>>> https://issues.apache.org/jira/browse/SPARK-1856
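[A minimal sketch of the pipeline idea tracked in SPARK-1856: chaining feature transformers so the same preprocessing runs at training and scoring time. The class names here are illustrative only, not Spark's actual API, which was still a work in progress at the time of this thread.]

```python
# Illustrative pipeline: each stage transforms rows, and stages compose
# so one object captures the whole preprocessing chain.
class Lowercase:
    def transform(self, rows):
        return [r.lower() for r in rows]

class TokenCount:
    def transform(self, rows):
        return [len(r.split()) for r in rows]

class Pipeline:
    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        # Run every stage in order, feeding each one's output forward.
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows

pipe = Pipeline([Lowercase(), TokenCount()])
features = pipe.transform(["Hello World", "Spark MLlib Pipelines"])
```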
>>>
>>> Best,
>>> Xiangrui
>>>
>>> On Mon, Oct 27, 2014 at 8:56 AM, chirag lakhani
>>> <chirag.lakh...@gmail.com> wrote:
>>> > Hello,
>>> >
>>> > I have been prototyping a text classification model that my company
>>> > would like to eventually put into production.  Our technology stack is
>>> > currently Java-based, but we would like to be able to build our models
>>> > in Spark/MLlib and then export something like a PMML file which can be
>>> > used for model scoring in real time.
>>> >
>>> > I have been using scikit-learn, where I am able to take the training
>>> > data, convert the text data into a sparse format, and then use the
>>> > dictionary vectorizer to do one-hot encoding for the other categorical
>>> > variables.  All of those things seem to be possible in MLlib, but I am
>>> > still puzzled about how that can be packaged in such a way that the
>>> > incoming data can first be made into feature vectors and then
>>> > evaluated as well.
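[For reference, the dictionary-vectorizer one-hot encoding mentioned above can be sketched in plain Python. This is a simplified stand-in for scikit-learn's DictVectorizer, not its actual implementation: categorical key/value pairs are mapped to columns, and each record becomes a fixed-width 0/1 vector.]

```python
# Simplified stand-in for a dictionary vectorizer: one-hot encode
# categorical (key, value) pairs into fixed-width numeric vectors.
def fit_feature_index(records):
    """Map each distinct (key, value) pair to a column index."""
    index = {}
    for record in records:
        for key, value in sorted(record.items()):
            index.setdefault((key, value), len(index))
    return index

def one_hot(record, index):
    """Encode one record as a 0/1 vector over the fitted feature index."""
    vec = [0] * len(index)
    for key, value in record.items():
        col = index.get((key, value))
        if col is not None:
            vec[col] = 1
    return vec

records = [{"color": "red", "shape": "round"},
           {"color": "blue", "shape": "round"}]
index = fit_feature_index(records)
encoded = [one_hot(r, index) for r in records]
```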
>>> >
>>> > Are there any best practices for this type of thing in Spark?  I hope
>>> > this is clear, but if there is any confusion then please let me know.
>>> >
>>> > Thanks,
>>> >
>>> > Chirag
>>>
>>
>>
>
