[ https://issues.apache.org/jira/browse/SPARK-24226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marcelo Vanzin updated SPARK-24226:
-----------------------------------
    Priority: Major  (was: Blocker)

> while reading data from oracle 12c from spark and using the numofpartition more than 1 is not returning the exact count
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24226
>                 URL: https://issues.apache.org/jira/browse/SPARK-24226
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Chandan
>            Priority: Major
>
> Reading data from Oracle over JDBC using the Spark SQL context, as below.
>
>   val query = s"""(select col1,col2,rownum from schematic.tablename) A"""
>   val df = sparkcontextInstance.sqlcontext.read.format("jdbc")
>     .option("url", urlstring)
>     .option("dbtable", query)
>     .option("user", username)
>     .option("password", password)
>     .option("numPartitions", 20)
>     .option("partitionColumn", "rownum")
>     .option("lowerBound", 1)
>     .option("upperBound", 3000000)
>     .option("fetchsize", 1500)
>     .load()
>
> df.count() returns only 150000, i.e. upperBound / numPartitions.
> The table has 3 million records.
> The table does not have any numeric column, so rownum was taken as the
> partition column.
> The above code is returning the data frame count.
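
The reported count of 150000 matches what a range-partitioned JDBC read would return if only the first partition produced rows. The sketch below is a simplified model, not Spark's actual code, of how Spark 2.x appears to turn partitionColumn, lowerBound, upperBound and numPartitions into one WHERE clause per partition. The names PartitionPredicateSketch and predicates are hypothetical and exist only for illustration.

```scala
// Simplified sketch modeled on Spark 2.x's JDBC range partitioning.
// PartitionPredicateSketch and predicates are hypothetical names, not Spark APIs.
object PartitionPredicateSketch {

  // Builds one WHERE-clause string per partition for a numeric partition column.
  def predicates(column: String, lower: Long, upper: Long, numPartitions: Int): Seq[String] = {
    require(numPartitions > 1, "sketch only covers the multi-partition case")
    require(upper - lower >= numPartitions, "sketch assumes the range is at least numPartitions wide")

    // Integer division on each bound, as in the modeled logic.
    val stride = upper / numPartitions - lower / numPartitions

    (0 until numPartitions).map { i =>
      val lo = lower + i * stride
      val hi = lo + stride
      if (i == 0) s"$column < $hi or $column is null"    // first partition: open below
      else if (i == numPartitions - 1) s"$column >= $lo" // last partition: open above
      else s"$column >= $lo AND $column < $hi"
    }
  }

  def main(args: Array[String]): Unit = {
    // Values taken from the report: lowerBound 1, upperBound 3000000, 20 partitions.
    val ps = predicates("rownum", 1L, 3000000L, 20)
    ps.take(2).foreach(println)
    // rownum < 150001 or rownum is null
    // rownum >= 150001 AND rownum < 300001
    println(ps.last)
    // rownum >= 2850001
  }
}
```

If Oracle resolves the unaliased rownum in these outer predicates as its ROWNUM pseudocolumn rather than as the subquery's column, every predicate of the form rownum >= N with N > 1 matches no rows, because ROWNUM is assigned only as rows pass the filter. Only the first predicate (rownum < 150001) would then return data, which is consistent with the 150000 rows reported. A possible workaround, stated here as an assumption and not verified against the reporter's setup, is to alias the column in the subquery (for example rownum as rn) and use that alias as partitionColumn.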