Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-06 Thread Hollin Wilkins
Hi All - We got a number of great questions and ended up adding responses to them on the MLeap Documentation page, in the FAQ section. We're also including a "condensed" version at the bottom of this email. We appreciate the interest and the discussion

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-06 Thread Aseem Bansal
I agree with you that this is needed. There is a JIRA: https://issues.apache.org/jira/browse/SPARK-10413
On Sun, Feb 5, 2017 at 11:21 PM, Debasish Das wrote:
> Hi Aseem,
>
> Due to production deploy, we did not upgrade to 2.0 but that's critical
> item on our list.

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-05 Thread Debasish Das
Hi Aseem, Due to a production deploy, we did not upgrade to 2.0, but that's a critical item on our list. For exposing models out of PipelineModel, let me look into the ML tasks... we should add it, since a dataframe should not be a must for model scoring... many times models are scored on api or streaming

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-04 Thread Aseem Bansal
@Debasish I see that the Spark version being used in the project that you mentioned is 1.6.0. I would suggest that you take a look at some blogs related to Spark 2.0 Pipelines and Models in the new ml package. The new ml package's API as of the latest Spark 2.1.0 release has no way to call predict on single

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-04 Thread Chris Fregly
to date, i haven't seen very good performance coming from mleap. i believe ram from databricks keeps getting you guys on stage at the spark summits, but i've been unimpressed with the performance numbers - as well as your choice to reimplement your own non-standard "pmml-like" mechanism which incurs

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-04 Thread Debasish Das
Except of course LDA, ALS and neural net models... for them the model needs to be either prescored and cached on a kv store, or the matrices / graph should be kept on a kv store to access them using a REST API to serve the output... for neural net it's more fun since it's a distributed or local graph over

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-04 Thread Debasish Das
If we expose an API to access the raw models out of PipelineModel, can't we call predict directly on it from an API? Is there a task open to expose the model out of PipelineModel so that predict can be called on it... there is no dependency on spark context in the ml model... On Feb 4, 2017 9:11 AM,

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-04 Thread Aseem Bansal
- In Spark 2.0 there is a class called PipelineModel. I know that the title says pipeline, but it is actually talking about a PipelineModel trained using a Pipeline.
- Why PipelineModel instead of pipeline? Because usually there is a series of stuff that needs to be done when doing

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-04 Thread Debasish Das
I am not sure why I would use a pipeline to do scoring... the idea is to build a model, use the model ser/deser feature to put it in the row or column store of choice, and provide API access to the model... we support these primitives in github.com/Verizon/trapezium... the api has access to spark context in

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-04 Thread Aseem Bansal
Does this support Java 7? What is your timezone in case someone wanted to talk?
On Fri, Feb 3, 2017 at 10:23 PM, Hollin Wilkins wrote:
> Hey Aseem,
>
> We have built pipelines that execute several string indexers, one hot
> encoders, scaling, and a random forest or linear

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-03 Thread Hollin Wilkins
Hey Asher,
A phone call may be the best to discuss all of this. But in short:
1. It is quite easy to add custom pipelines/models to MLeap. All of our out-of-the-box transformers can serve as a good example of how to do this. We are also putting together documentation on how to do this in our docs

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-03 Thread Asher Krim
I have a bunch of questions for you Hollin: How easy is it to add support for custom pipelines/models? Are Spark mllib models supported? We currently run spark in local mode in an api service. It's not super terrible, but performance is a constant struggle. Have you benchmarked any performance

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-03 Thread Hollin Wilkins
Hey Aseem, We have built pipelines that execute several string indexers, one hot encoders, scaling, and a random forest or linear regression at the end. Execution time for the linear regression was on the order of 11 microseconds, a bit longer for random forest. This can be further optimized by

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-03 Thread Aseem Bansal
Does this support Java 7?
On Fri, Feb 3, 2017 at 5:30 PM, Aseem Bansal wrote:
> Is computational time for predictions on the order of few milliseconds (<
> 10 ms) like the old mllib library?
>
> On Thu, Feb 2, 2017 at 10:12 PM, Hollin Wilkins wrote:

Re: [ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-03 Thread Aseem Bansal
Is computational time for predictions on the order of few milliseconds (< 10 ms) like the old mllib library?
On Thu, Feb 2, 2017 at 10:12 PM, Hollin Wilkins wrote:
> Hey everyone,
>
> Some of you may have seen Mikhail and I talk at Spark/Hadoop Summits about
> MLeap and how

[ML] MLeap: Deploy Spark ML Pipelines w/o SparkContext

2017-02-02 Thread Hollin Wilkins
Hey everyone, Some of you may have seen Mikhail and I talk at Spark/Hadoop Summits about MLeap and how you can use it to build production services from your Spark-trained ML pipelines. MLeap is an open-source technology that allows Data Scientists and Engineers to deploy Spark-trained ML