If you're trying to score a single example by way of an RDD or
Dataset, then no, it will never be that fast. It's a whole distributed
operation, and while you might manage low latency for one job at a
time, consider what will happen when hundreds of them are running at
once. It's just huge overkill for scoring a single example (but
perfectly fine for higher-latency, high-throughput batch operations).

However, if you're scoring a Vector locally, I can't imagine it's that
slow. It does some linear algebra, but it's not that complicated. Even
something unoptimized should be fast.
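
To give a rough sense of how little work local scoring involves, here
is a minimal sketch of multinomial Naive Bayes prediction in plain
Python. It is not Spark's implementation. The function name `predict`
and the toy parameters are hypothetical, and it assumes the log priors
and log likelihoods have already been extracted from a trained model.

```python
import math


def predict(log_priors, log_likelihoods, features):
    """Return the index of the class with the highest log-posterior score.

    log_priors:      one log prior per class
    log_likelihoods: one row per class, one log likelihood per feature
    features:        feature counts for the single example being scored
    """
    scores = [
        # score = log prior + dot(log likelihoods, features)
        prior + sum(w * x for w, x in zip(row, features))
        for prior, row in zip(log_priors, log_likelihoods)
    ]
    # argmax over the class scores
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy model: 2 classes, 3 features.
log_priors = [math.log(0.5), math.log(0.5)]
log_likelihoods = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],
    [math.log(0.1), math.log(0.2), math.log(0.7)],
]

label = predict(log_priors, log_likelihoods, [3.0, 1.0, 0.0])
```

That is one dot product per class followed by an argmax, with no job
scheduling or data movement involved.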

On Thu, Sep 1, 2016 at 1:37 PM, Aseem Bansal <asmbans...@gmail.com> wrote:
> Hi
>
> Currently trying to use NaiveBayes to make predictions. But facing issues
> that doing the predictions takes order of few seconds. I tried with other
> model examples shipped with Spark but they also ran in minimum of 500 ms
> when I used Scala API. With
>
> Has anyone used spark ML to do predictions for a single row under 20 ms?
>
> I am not doing premature optimization. The use case is that we are doing
> real time predictions and we need results 20ms. Maximum 30ms. This is a hard
> limit for our use case.
