
Re: SparkSQL parallelism

2016-02-11 Thread Rishi Mishra

I am not sure why all 3 nodes should run the query. If you have not specified any partitioning, there should be only one JDBCRDD partition, and the entire dataset should reside in it.

On Fri, Feb 12, 2016 at 10:15 AM, Madabhattula Rajesh Kumar <mrajaf...@gmail.com> wrote:
> Hi,
>
> I have a spark cluster with One
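To illustrate the point in the reply above: the Spark JDBC data source reads a table as a single partition unless partitioning options are supplied. The following is only a sketch, assuming a spark-shell session where `sc` already exists and an Oracle JDBC driver is on the classpath. The URL, table name, column name, credentials, and bounds are all hypothetical placeholders and are not taken from the original message.

```scala
import org.apache.spark.sql.SQLContext

// Assumes spark-shell, where `sc` (SparkContext) is already defined.
val sqlContext = new SQLContext(sc)

// All connection values below are hypothetical placeholders.
val employees = sqlContext.read.format("jdbc").options(Map(
  "url"             -> "jdbc:oracle:thin:@//dbhost:1521/service", // hypothetical
  "dbtable"         -> "EMPLOYEES",                               // hypothetical
  "user"            -> "scott",                                   // hypothetical
  "password"        -> "secret",                                  // hypothetical
  // The four options below ask Spark to split the read into ranges
  // of a numeric column, one query per partition.
  "partitionColumn" -> "EMPLOYEE_ID",                             // hypothetical numeric column
  "lowerBound"      -> "1",
  "upperBound"      -> "100000",
  "numPartitions"   -> "3"
)).load()

// Inspect how many partitions the resulting DataFrame has.
println(employees.rdd.partitions.length)
```

Note that `lowerBound` and `upperBound` only decide how the ranges are cut; they do not filter rows, so values outside the bounds still land in the first or last partition. Without the four partitioning options, the same read should produce a single partition, which matches the behaviour described in the reply.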

SparkSQL parallelism

2016-02-11 Thread Madabhattula Rajesh Kumar

Hi,

I have a Spark cluster with one master and 3 worker nodes. I have written the code below to fetch records from Oracle using Spark SQL:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val employees = sqlContext.read.format("jdbc").options( Map("url" ->