Which version of Spark are you using? I get correct results using JdbcRDD. In fact there is a test suite precisely for this (JdbcRDDSuite); I changed it according to your input and still got correct results from that test suite.
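For reference, here is a minimal sketch of how JdbcRDD splits the [lowerBound, upperBound] range across partitions (my reconstruction of the `getPartitions` logic, not the actual Spark source): each partition gets an inclusive, non-overlapping range, and the range's start and end are bound to the two `?` placeholders that the `sql` string is expected to contain.

```scala
// Sketch of JdbcRDD's range splitting, mirroring what JdbcRDD.getPartitions
// does as I read it. Each partition gets an inclusive, non-overlapping
// [start, end] range; start and end are substituted into the two '?'
// placeholders of the SQL string.
object PartitionBounds {
  def bounds(lowerBound: Long, upperBound: Long, numPartitions: Int): Seq[(Long, Long)] = {
    // Number of distinct values in the inclusive range, e.g. 0..100 -> 101.
    val length = BigInt(1) + upperBound - lowerBound
    (0 until numPartitions).map { i =>
      val start = lowerBound + ((i * length) / numPartitions).toLong
      val end   = lowerBound + (((i + 1) * length) / numPartitions).toLong - 1
      (start, end)
    }
  }

  def main(args: Array[String]): Unit = {
    // 0..100 split two ways gives (0,49) and (50,100) -- no overlap,
    // so the summed row count should not change with numPartitions.
    println(PartitionBounds.bounds(0, 100, 2))
    println(PartitionBounds.bounds(0, 100, 3))
  }
}
```

Since these ranges never overlap, a count that grows with numPartitions (100, then 151, then 201) usually means the SQL string does not contain the two `?` placeholders (e.g. `WHERE id >= ? AND id <= ?`), so every partition runs an unbounded or partially bounded query and re-reads overlapping rows.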
On Wed, Sep 23, 2015 at 11:00 AM, satish chandra j <jsatishchan...@gmail.com> wrote:
> HI All,
>
> The JdbcRDD constructor has the following parameters
> (https://spark.apache.org/docs/1.2.0/api/java/org/apache/spark/rdd/JdbcRDD.html):
>
> JdbcRDD(SparkContext sc,
>         scala.Function0<java.sql.Connection> getConnection,
>         String sql,
>         long lowerBound,
>         long upperBound,
>         int numPartitions,
>         scala.Function1<java.sql.ResultSet,T> mapRow,
>         scala.reflect.ClassTag<T> evidence$1)
>
> where lowerBound refers to the lower boundary of the entire data,
> upperBound refers to the upper boundary of the entire data, and
> numPartitions refers to the number of partitions.
>
> The Oracle DB source table from which JdbcRDD is fetching data has more
> than 500 records, but the results are confusing when I try several
> executions changing only the "numPartitions" parameter:
>
> lowerBound, upperBound, numPartitions : output count
> 0, 100, 1 : 100
> 0, 100, 2 : 151
> 0, 100, 3 : 201
>
> Please help me understand why the output count is 151 if numPartitions
> is 2, and 201 if numPartitions is 3.
>
> Regards,
>
> Satish Chandra