Suresh Thalamati created SPARK-13167:
----------------------------------------

             Summary: JDBC data source does not include null value partition columns rows in the result.
                 Key: SPARK-13167
                 URL: https://issues.apache.org/jira/browse/SPARK-13167
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.6.0, 2.0.0
            Reporter: Suresh Thalamati


Reading from a JDBC data source using a nullable partition column can return 
an incorrect number of rows if any rows have a null value in the partition 
column.

{code}
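import java.util.Properties

// Partition on "theid" with lowerBound = 0, upperBound = 4, numPartitions = 3.
val emp = sqlContext.read.jdbc(
  "jdbc:h2:mem:testdb0;user=testUser;password=testPass",
  "TEST.EMP", "theid", 0, 4, 3, new Properties)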
emp.count()
{code}

The jdbc read call above sets up partitions of the following form. None of the 
partition predicates includes a null check.

{code}
JDBCPartition(THEID < 1,0),
JDBCPartition(THEID >= 1 AND THEID < 2,1),
JDBCPartition(THEID >= 2,2)
{code}

Rows with null values in the partition column are not included in the results 
because none of the partition predicates includes an IS NULL condition, and a 
comparison such as THEID < 1 never evaluates to true for a null value.
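
To make the gap concrete, here is a minimal sketch of how range predicates like 
the ones above can be derived, and how a null check could be attached to the 
first partition. The helper partitionPredicates and its includeNulls flag are 
hypothetical names used only for illustration. This is a simplified stand-in, 
not Spark's internal code, and adding the null check to the first partition is 
just one possible way to cover the missing rows.

```scala
// Hypothetical helper, NOT Spark's internal implementation: builds one WHERE
// clause per partition from the bounds passed to the jdbc read call.
def partitionPredicates(
    column: String,
    lower: Long,
    upper: Long,
    numPartitions: Int,
    includeNulls: Boolean): Seq[String] = {
  require(numPartitions > 1, "this sketch assumes at least two partitions")
  // Width of each range; integer division, so 0..4 over 3 partitions gives 1.
  val stride = upper / numPartitions - lower / numPartitions
  (0 until numPartitions).map { i =>
    val lo = lower + i * stride
    val hi = lo + stride
    if (i == 0) {
      // First partition is open-ended below; optionally let it pick up nulls.
      val base = s"$column < $hi"
      if (includeNulls) s"$base OR $column IS NULL" else base
    } else if (i == numPartitions - 1) {
      // Last partition is open-ended above.
      s"$column >= $lo"
    } else {
      s"$column >= $lo AND $column < $hi"
    }
  }
}

// Same bounds as the example in this report.
val without = partitionPredicates("THEID", 0, 4, 3, includeNulls = false)
val withNulls = partitionPredicates("THEID", 0, 4, 3, includeNulls = true)
```

With includeNulls = false the sketch reproduces the three predicates listed 
above, none of which can match a null THEID. With includeNulls = true only the 
first predicate changes, so every null row falls into exactly one partition.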




