Here's a high-level overview of Spark's ML Pipelines from around the time the
API came out: https://www.youtube.com/watch?v=OednhGRp938.

But reading your description, you might be able to build a basic version of
this without ML.  Spark has broadcast variables
<http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables>
that would allow you to put flagged IP ranges into an array and make that
array available on every node.  Then you can use filters to detect users
who've logged in from a flagged IP range.
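
To make that concrete, here is a rough sketch of just the range check in plain
Java, so it runs without a cluster. Everything in it is made up for
illustration: the class and method names, the "user,ip" record layout, and the
sample addresses (taken from the reserved documentation ranges). It also only
handles IPv4. In a real job I'd expect the ranges array to be wrapped with
JavaSparkContext.broadcast(...) and the predicate passed to JavaRDD.filter(...),
as noted in the comments, but treat that wiring as an assumption about your
setup rather than a tested recipe.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: names and record format are illustrative only.
class FlaggedIpFilter {

    // Convert a dotted-quad IPv4 address such as "10.0.0.1" to a long
    // so that ranges can be compared numerically.
    static long toLong(String ip) {
        String[] parts = ip.trim().split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Bad octet in: " + ip);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // Each row of ranges is {start, end}, both inclusive.
    // In Spark this array would be the value of a broadcast variable,
    // e.g. Broadcast<long[][]> b = sc.broadcast(ranges); then b.value().
    static boolean isFlagged(String ip, long[][] ranges) {
        long addr = toLong(ip);
        for (long[] range : ranges) {
            if (addr >= range[0] && addr <= range[1]) {
                return true;
            }
        }
        return false;
    }

    // Keep only the "user,ip" records whose IP falls in a flagged range.
    // With a JavaRDD<String> the same predicate would go into rdd.filter(...).
    static List<String> flaggedLogins(List<String> logins, long[][] ranges) {
        return logins.stream()
                .filter(line -> isFlagged(line.split(",")[1], ranges))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        long[][] ranges = {
            {toLong("203.0.113.0"), toLong("203.0.113.255")},
            {toLong("198.51.100.10"), toLong("198.51.100.20")}
        };
        List<String> logins = Arrays.asList(
            "alice,192.0.2.15",
            "bob,203.0.113.77",
            "carol,198.51.100.21");
        // prints [bob,203.0.113.77]
        System.out.println(flaggedLogins(logins, ranges));
    }
}
```

A linear scan is fine for a short list of ranges; with many ranges you would
probably want to sort them and binary-search instead.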

Jon Gregg

On Thu, Feb 23, 2017 at 9:19 PM, Mina Aslani <aslanim...@gmail.com> wrote:

> Hi,
>
> I am going to start working on anomaly detection using Spark MLlib. Please
> note that I have not used Spark so far.
>
> I would like to read some data and, if a user logs in from an unusual IP
> address, consider it an anomaly, similar to what
> Apple/Google do.
>
> My preferred programming language is Java.
>
> I am wondering if you can point me to books/workshops that cover which
> ML algorithm to use and how to implement it. I would like to know about
> the Spark supervised/unsupervised options and the suggested algorithm.
>
> I would really appreciate it if you could share your thoughts/experience/insight with me.
>
> Best regards,
> Mina
>
