[ 
https://issues.apache.org/jira/browse/SPARK-32632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-32632.
--------------------------------------
    Resolution: Invalid

> Bad partitioning in spark jdbc method with parameter lowerBound and upperBound
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-32632
>                 URL: https://issues.apache.org/jira/browse/SPARK-32632
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Liu Dinghua
>            Priority: Major
>
> When I use the jdbc methed
> {code:java}
> def jdbc( url: String, table: String, columnName: String, lowerBound: Long, 
> upperBound: Long, numPartitions: Int, connectionProperties: Properties)
> {code}
>  
>   I am confused by the partitions generated by this method,  for  rows of the 
> first partition aren't limited by the lowerBound and the ones of the last 
> partition are not limited by the upperBound. 
>   
>  For example, I use the method  as follow:
>   
> {code:java}
> val data = spark.read.jdbc(url, table, "id", 2, 5, 3,buildProperties()) 
> .selectExpr("id","appkey","funnel_name")
> data.show(100, false)  
> {code}
>  
> The result partitions info is :
>  20/08/05 16:58:59 INFO JDBCRelation: Number of partitions: 3, WHERE clauses 
> of these partitions: `id` < 3 or `id` is null, `id` >= 3 AND `id` < 4, `id` 
> >= 4
> The returned data is:
> ||id|| appkey||funnel_name||
> |0|yanshi|test001|
> |1|yanshi|test002|
> |2|yanshi|test003|
> |3|xingkong|test_funnel|
> |4|xingkong|test_funnel2|
> |5|xingkong|test_funnel3|
> |6|donews|test_funnel4|
> |7|donews|test_funnel|
> |8|donews|test_funnel2|
> |9|dami|test_funnel3|
> |13|dami|test_funnel4|
> |15|xiaoai|test_funnel6|
>  
> Normally, the clause of the first partition should be " `id` >=2 and `id` < 3 
> "  because the lowerBound is 2, and the clause of the last partition should 
> be " `id` >= 4 and `id` < 5 ",  but the facts are not.
>  
>  
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to