Re: mllib model in production web API

vincent gromakowski Tue, 18 Oct 2016 00:17:54 -0700

Hi,
Did you try applying the model with Akka instead of Spark?
https://spark-summit.org/eu-2015/events/real-time-anomaly-detection-with-spark-ml-and-akka/


On 18 Oct 2016 5:58 AM, "Aseem Bansal" <asmbans...@gmail.com> wrote:

> @Nicolas
>
> No, ours is different. We required predictions within a 10 ms time frame,
> so we needed much lower latency than that.
>
> Every algorithm has some parameters, correct? We took the parameters from
> the ml package's model and used them to create the mllib package's model.
> The mllib model's prediction time was much faster than the ml package's
> transformation. So essentially: use Spark's distributed machine learning
> library to train the model, save it to S3, load it from S3 in a different
> system, and then convert it into the vector-based API model for the actual
> predictions.
>
> There were obviously some transformations involved, but we didn't use
> Pipeline for those transformations. Instead, we rewrote them for the
> vector-based API. I know it's not perfect, but using the transformations
> within the pipeline would have made us dependent on Spark's distributed
> API, and we didn't see how we would reach our latency requirements. It
> would have been much simpler and more DRY if the PipelineModel had a
> non-distributed predict method based on vectors.
>
> As you can guess, it is very much model-specific and more work. If we
> decide to use another type of model, we will have to add conversion and
> transformation code for that too. If only Spark exposed a prediction
> method as fast as the old machine learning package's.
>
> On Sat, Oct 15, 2016 at 8:42 PM, Nicolas Long <nicolasl...@gmail.com>
> wrote:
>
>> Hi Sean and Aseem,
>>
>> thanks both. One simple thing that sped things up greatly was to load
>> our SQL data (effectively one record) directly and then convert it to a
>> DataFrame, rather than using Spark to load it. Sounds stupid, but this took
>> us from > 5 seconds to ~1 second on a very small instance.
>>
>> Aseem: can you explain your solution a bit more? I'm not sure I
>> understand it. At the moment we load our models from S3
>> (RandomForestClassificationModel.load(..) ) and then store that in an
>> object property so that it persists across requests - this is in Scala. Is
>> this essentially what you mean?
>>
>> On 12 October 2016 at 10:52, Aseem Bansal <asmbans...@gmail.com> wrote:
>>
>>> Hi
>>>
>>> We faced a similar issue. Our solution was to load the model, convert it
>>> to an mllib model, cache that, and then use it instead of the ml
>>> model.
>>>
>>> On Tue, Oct 11, 2016 at 10:22 PM, Sean Owen <so...@cloudera.com> wrote:
>>>
>>>> I don't believe it will ever scale to spin up a whole distributed job
>>>> to serve one request. You could possibly look at the bits in mllib-local.
>>>> You might do well to export as something like PMML, either with Spark's
>>>> export or JPMML, and then load it into a web container and score it
>>>> without Spark (possibly also with JPMML, OpenScoring).
>>>>
>>>>
>>>> On Tue, Oct 11, 2016, 17:53 Nicolas Long <nicolasl...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> so I have a model which has been stored in S3. And I have a Scala
>>>>> webapp which for certain requests loads the model and transforms submitted
>>>>> data against it.
>>>>>
>>>>> I'm not sure how to run this quickly on a single instance though. At
>>>>> the moment Spark is being bundled up with the web app in an uberjar (sbt
>>>>> assembly).
>>>>>
>>>>> But the process is quite slow. I'm aiming for responses < 1 sec so
>>>>> that the webapp can respond quickly to requests. When I look at the
>>>>> Spark UI I see:
>>>>>
>>>>> Summary Metrics for 1 Completed Tasks
>>>>>
>>>>> Metric                     Min      25th percentile  Median   75th percentile  Max
>>>>> Duration                   94 ms    94 ms            94 ms    94 ms            94 ms
>>>>> Scheduler Delay            0 ms     0 ms             0 ms     0 ms             0 ms
>>>>> Task Deserialization Time  3 s      3 s              3 s      3 s              3 s
>>>>> GC Time                    2 s      2 s              2 s      2 s              2 s
>>>>> Result Serialization Time  0 ms     0 ms             0 ms     0 ms             0 ms
>>>>> Getting Result Time        0 ms     0 ms             0 ms     0 ms             0 ms
>>>>> Peak Execution Memory      0.0 B    0.0 B            0.0 B    0.0 B            0.0 B
>>>>>
>>>>> I don't really understand why deserialization and GC should take so
>>>>> long when the models are already loaded. Is this evidence I am doing
>>>>> something wrong? And where can I get a better understanding of how Spark
>>>>> works under the hood here, and how best to do a standalone/bundled jar
>>>>> deployment?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Nic
>>>>>
>>>>
>>>
>>
>
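
The load-once pattern Nicolas describes in the thread (load the model from S3
and keep it in an object property so it persists across requests) might look
roughly like the sketch below. LocalModel, ModelCache, loadModel and loadCount
are hypothetical names, and the loader returns fixed numbers instead of
reading from S3, so this only illustrates the caching, not the Spark calls.

```scala
// Hypothetical stand-in for a trained model. In the thread this would be
// something like a RandomForestClassificationModel loaded from S3.
final case class LocalModel(weights: Array[Double], intercept: Double)

object ModelCache {
  // Counts how often the expensive load runs (for illustration only).
  @volatile var loadCount: Int = 0

  // Hypothetical loader: replace the body with the real load-from-S3 call.
  private def loadModel(): LocalModel = {
    loadCount += 1
    LocalModel(Array(0.5, -0.25), 0.1)
  }

  // A lazy val is initialised at most once, thread-safely, on first access,
  // so every later request reuses the same in-memory model.
  lazy val model: LocalModel = loadModel()
}
```

Request handlers would then read ModelCache.model instead of loading the
model themselves. Note that this only removes the repeated load; it does not
remove the per-request cost of running a Spark job to score the data.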

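Aseem's approach (take the trained parameters and score a plain feature vector
without any distributed job) can be sketched in a similar way. The thread does
not say which algorithm was used, so binary logistic regression is assumed
here purely for illustration, and LocalLogisticModel and its fields are
hypothetical names rather than Spark classes.

```scala
// Assumption: a binary logistic regression model, chosen only as an example.
// The coefficients and intercept would be copied out of the trained model.
final case class LocalLogisticModel(
    coefficients: Array[Double],
    intercept: Double,
    threshold: Double = 0.5) {

  // Probability of the positive class for a single feature vector.
  def probability(features: Array[Double]): Double = {
    require(features.length == coefficients.length,
      "feature/coefficient length mismatch")
    var margin = intercept
    var i = 0
    while (i < features.length) {
      margin += coefficients(i) * features(i)
      i += 1
    }
    1.0 / (1.0 + math.exp(-margin))
  }

  // Class label (1.0 or 0.0), computed in-process with no Spark job.
  def predict(features: Array[Double]): Double =
    if (probability(features) > threshold) 1.0 else 0.0
}
```

As Aseem notes, the cost of this style is that any feature transformations
have to be rewritten for the plain-vector path, and each new model type needs
its own conversion code.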