I am not sure why all 3 nodes would run the query. If you have not specified any partitioning options, the JDBCRDD should have only one partition, and the whole dataset should reside in it.
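
One way to check this is to print the partition count of the DataFrame's underlying RDD. The following is only a rough sketch, assuming the Spark 1.x SQLContext API used in the question and an Oracle JDBC driver on the classpath. The connection values are the placeholders from the question. The object name, the numeric column "id", and its bounds are hypothetical and only illustrate what specifying partitions would look like.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object JdbcPartitionCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("jdbc-partition-check"))
    val sqlContext = new SQLContext(sc)

    // Same options as in the question; url, user and password are placeholders.
    val baseOptions = Map(
      "url" -> "jdbc:oracle:thin:@xxxx:1525:SID",
      "dbtable" -> "(select * from employee where name like '%18%')",
      "user" -> "username",
      "password" -> "password")

    // No partitioning options given: the read should be planned as one partition,
    // so a single task (on a single worker) fetches the rows per action.
    val employees = sqlContext.read.format("jdbc").options(baseOptions).load()
    println(s"partitions without options: ${employees.rdd.partitions.length}")

    // Assumption: "id" is a numeric column of employee and the bounds are made up.
    // With these options Spark splits the read into numPartitions range queries.
    val partitioned = sqlContext.read.format("jdbc").options(baseOptions ++ Map(
      "partitionColumn" -> "id",
      "lowerBound" -> "1",
      "upperBound" -> "30000",
      "numPartitions" -> "3")).load()
    println(s"partitions with options: ${partitioned.rdd.partitions.length}")

    sc.stop()
  }
}
```

If the first count is 1, the data is fetched by one task per action, so repeated queries seen on the Oracle side would have to come from something other than partitioning.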

On Fri, Feb 12, 2016 at 10:15 AM, Madabhattula Rajesh Kumar <mrajaf...@gmail.com> wrote:

> Hi,
>
> I have a spark cluster with One Master and 3 worker nodes. I have written
> a below code to fetch the records from oracle using sparkSQL
>
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> val employees = sqlContext.read.format("jdbc").options(
>   Map("url" -> "jdbc:oracle:thin:@xxxx:1525:SID",
>       "dbtable" -> "(select * from employee where name like '%18%')",
>       "user" -> "username",
>       "password" -> "password")).load
>
> I have a submitted this job to spark cluster using spark-submit command.
>
> *Looks like, All 3 workers are executing same query and fetching same
> data. It means, it is making 3 jdbc calls to oracle.*
> *How to make this code to make a single jdbc call to oracle(In case of
> more than one worker) ?*
>
> Please help me to resolve this use case
>
> Regards,
> Rajesh

--
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra