RE: Scala Vs Python

2016-09-02 Thread Santoshakhilesh
I have seen a talk by Brian Clapper at NE Scala 2016: RDDs, DataFrames and Datasets @ Apache Spark - NE Scala 2016. At 15:00 there is a slide comparing the aggregation of 10 million integer pairs using RDDs and DataFrames with different language bindings such as Scala, Python and R. As per

RE: Scala Vs Python

2016-08-31 Thread Santoshakhilesh
Hi, I would prefer Scala if you are starting afresh, considering ease of use, features, performance and support. You will find numerous examples and support for Scala, which might not be true for any other language. I had personally developed the first version of my App using

RE: Cumulative Sum function using Dataset API

2016-08-09 Thread Santoshakhilesh
You could check the following link: http://stackoverflow.com/questions/35154267/how-to-compute-cumulative-sum-using-spark From: Jon Barksdale [mailto:jon.barksd...@gmail.com] Sent: 09 August 2016 08:21 To: ayan guha Cc: user Subject: Re: Cumulative Sum function using Dataset API I don't think that

RE: GraphX Java API

2016-06-05 Thread Santoshakhilesh
in Scala and it turned out much simpler to develop in Scala due to some of its powerful features such as lambdas, map, filter, etc., which were not available to me in Java 7. Regards, Santosh Akhilesh From: Sonal Goyal [mailto:sonalgoy...@gmail.com] Sent: 01 June 2016 00:56 To: Santoshakhilesh Cc: Kumar

RE: GraphX Java API

2016-05-31 Thread Santoshakhilesh
From: Kumar, Abhishek (US - Bengaluru) [mailto:abhishekkuma...@deloitte.com] Sent: 30 May 2016 13:24 To: Santoshakhilesh; user@spark.apache.org Cc: Golatkar, Jayesh (US - Bengaluru); Soni, Akhil Dharamprakash (US - Bengaluru); Matta, Rishul (US - Bengaluru); Aich, Risha (US - Bengaluru); Kumar

RE: GraphX Java API

2016-05-27 Thread Santoshakhilesh
GraphX APIs are available only in Scala. If you need to use GraphX, you need to switch to Scala. From: Kumar, Abhishek (US - Bengaluru) [mailto:abhishekkuma...@deloitte.com] Sent: 27 May 2016 19:59 To: user@spark.apache.org Subject: GraphX Java API Hi, We are trying to consume the Java API for

How to setup a long running spark streaming job with continuous window refresh

2016-01-21 Thread Santoshakhilesh
Hi, I have the following scenario in my project: 1. I will continue to get a stream of data from a source. 2. I need to calculate the mean and variance for a key every minute. 3. After the minute is over, I should start afresh, computing the values for the new minute. Example: 10:00:00 computation and
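
The scenario in this message (per-key mean and variance, recomputed from scratch each minute) can be sketched in plain Python as a tumbling one-minute window. This is only an illustration of the windowing logic, not the poster's code and not a Spark job: the event shape `(timestamp_seconds, key, value)`, the names `RunningStats` and `per_minute_stats`, and the choice of population variance are all assumptions. In Spark Streaming the same idea would roughly correspond to a window whose length and slide interval are both one minute.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # one-minute tumbling window, as in the scenario


class RunningStats:
    """Welford's online algorithm for the mean and population variance."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self):
        # Population variance (assumption); 0.0 for an empty window.
        return self.m2 / self.count if self.count else 0.0


def per_minute_stats(events):
    """Group (timestamp_seconds, key, value) events into one-minute windows.

    Each (window_start, key) pair gets its own fresh accumulator, so the
    statistics restart automatically when a new minute begins.
    """
    windows = defaultdict(RunningStats)
    for ts, key, value in events:
        window_start = ts - (ts % WINDOW_SECONDS)
        windows[(window_start, key)].add(value)
    return {k: (s.count, s.mean, s.variance) for k, s in windows.items()}


if __name__ == "__main__":
    # Hypothetical sample data: timestamps are seconds since some epoch.
    sample = [(0, "a", 1.0), (30, "a", 3.0), (59, "a", 5.0), (60, "a", 10.0)]
    for (start, key), (n, mean, var) in sorted(per_minute_stats(sample).items()):
        print(start, key, n, mean, var)
```

Keying the accumulators by window start means no explicit reset step is needed: values arriving at or after second 60 land in a new accumulator and never mix with the previous minute.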