Here's a high-level overview of Spark's ML Pipelines from around the time they came out: https://www.youtube.com/watch?v=OednhGRp938
But reading your description, you might be able to build a basic version of this without ML. Spark has broadcast variables <http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables> that would allow you to put flagged IP ranges into an array and make that array available on every node. Then you can use filters to detect users who've logged in from a flagged IP range.

Jon Gregg

On Thu, Feb 23, 2017 at 9:19 PM, Mina Aslani <aslanim...@gmail.com> wrote:

> Hi,
>
> I am going to start working on anomaly detection using Spark MLlib. Please
> note that I have not used Spark so far.
>
> I would like to read some data and, if a user logged in from a different IP
> address which is not common, consider it an anomaly, similar to what
> Apple/Google does.
>
> My preferred programming language is Java.
>
> I am wondering if you can let me know about books/workshops which guide me
> on the ML algorithm to use and how to implement it. I would like to know about
> the Spark supervised/unsupervised options and the suggested algorithm.
>
> I would really appreciate it if you shared your thoughts/experience/insight with me.
>
> Best regards,
> Mina
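
As a rough illustration of the broadcast-variable suggestion in the reply above, here is a minimal plain-Java sketch of the check that such a filter would run. It is only a sketch under stated assumptions: the class name FlaggedIpCheck, the IpRange type, the helper methods, and the sample ranges are all hypothetical and not from the thread. It assumes IPv4 addresses in dotted-quad form and does not start Spark itself. The comments indicate where Spark's `JavaSparkContext.broadcast(...)` and `JavaRDD.filter(...)` would come in.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: none of these names come from the original thread.
final class FlaggedIpCheck {

    // Inclusive IPv4 range. Serializable so that it could be shipped to
    // executors inside a Spark broadcast variable.
    static final class IpRange implements Serializable {
        final long start;
        final long end;

        IpRange(String startIp, String endIp) {
            this.start = ipToLong(startIp);
            this.end = ipToLong(endIp);
        }

        boolean contains(long ip) {
            return ip >= start && ip <= end;
        }
    }

    // Converts a dotted-quad IPv4 string to its numeric value (assumes IPv4 only).
    static long ipToLong(String ip) {
        String[] parts = ip.trim().split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Octet out of range in: " + ip);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // True if the address falls inside any flagged range.
    static boolean isFlagged(String ip, List<IpRange> flaggedRanges) {
        long numeric = ipToLong(ip);
        for (IpRange range : flaggedRanges) {
            if (range.contains(numeric)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up sample data using documentation-only address blocks.
        List<IpRange> flagged = Arrays.asList(
                new IpRange("203.0.113.0", "203.0.113.255"),
                new IpRange("198.51.100.10", "198.51.100.20"));

        // In a Spark job, 'flagged' would be wrapped with
        // JavaSparkContext.broadcast(flagged), and a JavaRDD of login records
        // would be narrowed with filter(...), reading the list via
        // Broadcast.value() inside the filter function.
        System.out.println(isFlagged("203.0.113.42", flagged));
        System.out.println(isFlagged("192.0.2.1", flagged));
    }
}
```

A linear scan is fine for a small list of ranges. For a large list, sorting the ranges and using a binary search would likely be a better fit.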