Re: spark session jdbc performance

Srinivasa Reddy Tatiredidgari Tue, 24 Oct 2017 23:31:39 -0700

Hi, is the subquery a user-defined SQL statement or a table name in the DB? If it is 
user-defined SQL, make sure your partition column is in the main SELECT clause.
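
To illustrate the point about the partition column, here is a minimal sketch. 
It assumes a hypothetical table "orders" with columns "id" and "amount"; the 
object and method names are made up for the example and are not part of Spark. 
The inner query selects "id" itself, so the range predicates Spark adds on the 
partition column can refer to it. The query is wrapped in parentheses and given 
an alias so it can be passed as "dbtable".

```scala
// Hypothetical helper, not part of Spark: builds the JDBC read options
// for a partitioned read over a user-defined query.
object JdbcPartitionSketch {

  // Wrap a user-defined query so it can be used as the "dbtable" value.
  // The inner query must select the partition column itself.
  def asDbTable(innerSql: String, alias: String = "t"): String =
    s"($innerSql) $alias"

  // Collect the partitioning options as strings, as the JDBC source expects.
  def partitionedOptions(
      dbtable: String,
      partitionColumn: String,
      lowerBound: Long,
      upperBound: Long,
      numPartitions: Int): Map[String, String] =
    Map(
      "dbtable"         -> dbtable,
      "partitionColumn" -> partitionColumn,
      "lowerBound"      -> lowerBound.toString,
      "upperBound"      -> upperBound.toString,
      "numPartitions"   -> numPartitions.toString
    )
}

// Assumed table and column names ("orders", "id", "amount") are illustrative only.
val dbtable = JdbcPartitionSketch.asDbTable("SELECT id, amount FROM orders")
val opts = JdbcPartitionSketch.partitionedOptions(dbtable, "id", 1L, 500000L, 30)
```

The resulting map could then be passed with .options(opts) on 
spark_session.read.format("jdbc"), alongside the driver, url, user and password 
options, before calling .load().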


  On Wed, Oct 25, 2017 at 3:25, Naveen Madhire<vmadh...@umail.iu.edu> wrote:   

Hi,

 

I am trying to fetch data from an Oracle DB using a subquery and am experiencing a 
lot of performance issues.

 

Below is the query I am using,

 

Using Spark 2.0.2

 

val df = spark_session.read.format("jdbc")
  .option("driver", "oracle.jdbc.OracleDriver")
  .option("url", jdbc_url)
  .option("user", user)
  .option("password", pwd)
  .option("dbtable", "subquery")
  .option("partitionColumn", "id")  // primary key column, uniformly distributed
  .option("lowerBound", "1")
  .option("upperBound", "500000")
  .option("numPartitions", 30)
  .load()

 

The above query is configured to use 30 partitions, but in the Spark UI I see 
it using only 1 partition to run the query.

 

Can anyone tell me if I am missing anything, or whether I need to do anything else to 
tune the performance of the query?

 Thanks
