Hi,


I am trying to fetch data from an Oracle database using a subquery and am running into performance issues.



Below is the query I am using:



Using Spark 2.0.2



val df = spark_session.read.format("jdbc")
  .option("driver", "oracle.jdbc.OracleDriver")
  .option("url", jdbc_url)
  .option("user", user)
  .option("password", pwd)
  .option("dbtable", "subquery")
  .option("partitionColumn", "id")   // primary key column, uniformly distributed
  .option("lowerBound", "1")
  .option("upperBound", "500000")
  .option("numPartitions", 30)
  .load()
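
For reference, the same read written with the explicit jdbc(...) overload (using the same placeholder subquery, partition column, and bounds as above) would look something like this:

import java.util.Properties

val props = new Properties()
props.setProperty("user", user)
props.setProperty("password", pwd)
props.setProperty("driver", "oracle.jdbc.OracleDriver")

// Explicit partitioning parameters: column, lower bound, upper bound, number of partitions
val df2 = spark_session.read.jdbc(
  jdbc_url,
  "subquery",   // same placeholder subquery as above
  "id",         // partition column
  1L,           // lower bound
  500000L,      // upper bound
  30,           // number of partitions
  props)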



The read above is configured with 30 partitions, but when I look at the Spark UI, only 1 partition is used to run the query.
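
The partition count can also be checked directly on the DataFrame, for example:

// Quick check of how many partitions the JDBC read actually produced
println(df.rdd.getNumPartitions)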



Can anyone tell me if I am missing anything, or whether I need to do anything else to tune the performance of this query?

Thanks
