Re: What are factors need to Be considered when upgrading to Spark 2.1.0 from Spark 1.6.0

2017-09-23 Thread vaquar khan
http://spark.apache.org/docs/latest/sql-programming-guide.html#migration-guide Regards, Vaquar khan On Fri, Sep 22, 2017 at 4:41 PM, Gokula Krishnan D wrote: > Thanks for the reply. Forgot to mention that, our Batch ETL Jobs are in > Core-Spark. > > > On Sep 22, 2017, at

Re: Apache Spark - MLLib challenges

2017-09-23 Thread vaquar khan
MLlib is the old RDD-based API. Since Apache Spark 2, the Dataset-based APIs are recommended for good performance, which is why ML was introduced. ML contains the new API built around Datasets and ML Pipelines; mllib is slowly being deprecated (this has already happened in the case of linear regression). MLlib currently

Re: Apache Spark - MLLib challenges

2017-09-23 Thread Koert Kuipers
our main challenge has been the lack of support for missing values generally On Sat, Sep 23, 2017 at 3:41 AM, Irfan Kabli wrote: > Dear All, > > We are looking to position MLLib in our organisation for machine learning > tasks and are keen to understand if there are

Re: Apache Spark - MLLib challenges

2017-09-23 Thread Aseem Bansal
This is something I wrote specifically about the challenges that we faced when taking Spark ML models to production http://www.tothenew.com/blog/when-you-take-your-machine-learning-models-to-production-for-real-time-predictions/ On Sat, Sep 23, 2017 at 1:33 PM, Jörn Franke

Re: Amazon Elastic Cache + Spark Streaming

2017-09-23 Thread Saravanan Nagarajan
Sure, thanks! On Fri, Sep 22, 2017 at 5:36 PM, ayan guha wrote: > AWS Elastic Cache supports Memcached and Redis. Spark has a Redis connector > which I believe you can use to > connect to Elastic Cache. > > On Sat, Sep 23, 2017 at

Re: Apache Spark - MLLib challenges

2017-09-23 Thread Jörn Franke
As far as I know there is currently no in-memory encryption in Spark. There are some research projects to create secure in-memory enclaves based on Intel SGX, but there is still a lot to do in terms of performance and security objectives. The more interesting question is why would you need this

Apache Spark - MLLib challenges

2017-09-23 Thread Irfan Kabli
Dear All, We are looking to position MLLib in our organisation for machine learning tasks and are keen to understand if there are any challenges that you might have seen with MLLib in production. We will be going with the pure open-source approach here, rather than using one of the Hadoop