Re: Spark Meetup Istanbul

2015-06-25 Thread Paco Nathan
Hi Ayan, Yes, there is -- quite active. Check the Spark global events listing to see about meetups and other Spark-related talks in Melbourne: https://docs.google.com/spreadsheets/d/1HKb_uwpQOOtBihRH8nBhgOHrsuy1nsGNlKwG32_qA3Y/edit#gid=0 ...and many other locations :) Paco On Thu, Jun 25, 2015

Re: JAVA for SPARK certification

2015-05-11 Thread Paco Nathan
Note that O'Reilly Media has test prep materials in development. The exam does include questions in Scala, Python, Java, and SQL -- and frankly a number of the questions are about comparing or identifying equivalent Spark techniques between two of those different languages. The questions do not

Re: Spark Trainings/ Professional certifications

2015-01-07 Thread Paco Nathan
online test of Spark certification managed through Kryterion. Could you please give me the link about it? Thanks a lot in advance. Cheers Gen On Wed, Jan 7, 2015 at 6:18 PM, Paco Nathan cet...@gmail.com wrote: Hi Saurabh, In your area, Big Data Partnership provides Spark training: http

Re: Spark Trainings/ Professional certifications

2015-01-07 Thread Paco Nathan
Hi Saurabh, In your area, Big Data Partnership provides Spark training: http://www.bigdatapartnership.com/ As Sean mentioned, there is a certification program via a partnership between O'Reilly Media and Databricks: http://www.oreilly.com/go/sparkcert That is offered in two ways, in-person at

Re: Clustering text data with MLlib

2014-12-29 Thread Paco Nathan
Jatin, One approach for determining K would be to sample the data set and run PCA. Then evaluate how many of the resulting eigenvalue/eigenvector pairs to use before you reach diminishing returns on cumulative error. That number provides a reasonably good value for K to use in KMeans. With
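
A minimal sketch of the "diminishing returns" step described above, in plain Python. It assumes the eigenvalues from PCA on the sample are already available as a list of non-negative numbers. The function name choose_k and the 0.90 cumulative-variance threshold are hypothetical choices for illustration, not something stated in the original message; other cutoff rules would work too.

```python
def choose_k(eigenvalues, threshold=0.90):
    """Return the smallest number of principal components whose
    cumulative share of total variance reaches `threshold`.

    `eigenvalues` is assumed to come from PCA on a sample of the data;
    `threshold` is an illustrative cutoff, not a recommended value.
    """
    if not eigenvalues:
        raise ValueError("eigenvalues must be non-empty")
    # Largest eigenvalues first, so each step adds the most variance possible.
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive number")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues, for illustration only.
sample_eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]
k = choose_k(sample_eigenvalues, threshold=0.85)
print(k)
```

The resulting k could then be tried as the number of clusters for KMeans, with nearby values compared as a sanity check.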