[ 
https://issues.apache.org/jira/browse/SPARK-22398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235708#comment-16235708
 ] 

Marco Gaido commented on SPARK-22398:
-------------------------------------

[~hyukjin.kwon] I think there are two separate points here:
 1) partition values with leading 0s are inferred as integers (I think this 
behavior is wrong, but it can be worked around by disabling partition column 
type inference, i.e. spark.sql.sources.partitionColumnTypeInference.enabled=false)
 2) IN type coercion with literals behaves differently from type coercion 
elsewhere (for example, in an equality comparison).
Given the title of this JIRA, I thought the best option was to track 1 here 
and open a new JIRA with a relevant title for 2.
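
To make the two points concrete, here is a minimal sketch in plain Scala. It is 
a toy model only: the object and helper names are hypothetical and this is not 
Spark's actual inference or coercion code. It just shows how "01" loses its 
leading zero once it is inferred as an integer, and how comparing as integers 
versus comparing as strings can disagree for the same value. The repro's 
results are consistent with = comparing as integers and IN comparing as 
strings, but that mapping is an assumption here.

```scala
import scala.util.Try

// Toy model only: hypothetical helpers, not Spark's implementation.
object LeadingZeroSketch {

  // Point 1: with type inference on, a directory value that parses as an
  // integer is kept as an Int, so the leading zero is lost.
  def inferPartitionValue(raw: String, typeInference: Boolean): Any =
    if (typeInference) Try[Any](raw.toInt).getOrElse(raw) else raw

  // Point 2, one possible coercion: cast the string literal to the column's
  // integer type and compare numbers.
  def matchesAsInt(column: Int, literal: String): Boolean =
    Try(literal.toInt).toOption.contains(column)

  // Point 2, another possible coercion: cast the column to a string and
  // compare text.
  def matchesAsString(column: Int, literal: String): Boolean =
    column.toString == literal

  def main(args: Array[String]): Unit = {
    val inferred = inferPartitionValue("01", typeInference = true)
    val kept = inferPartitionValue("01", typeInference = false)
    println(s"inferred=$inferred kept=$kept") // inferred=1 kept=01
    println(matchesAsInt(1, "01"))            // true
    println(matchesAsString(1, "01"))         // false
  }
}
```

With inference disabled the value stays the string "01", so both kinds of 
comparison against the literal '01' would agree.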


> Partition directories with leading 0s cause wrong results
> ---------------------------------------------------------
>
>                 Key: SPARK-22398
>                 URL: https://issues.apache.org/jira/browse/SPARK-22398
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Bogdan Raducanu
>            Priority: Major
>
> Repro case:
> {code}
> spark.range(8).selectExpr("'0' || cast(id as string) as id", "id as b").write.mode("overwrite").partitionBy("id").parquet("/tmp/bug1")
> spark.read.parquet("/tmp/bug1").where("id in ('01')").show
> +---+---+
> |  b| id|
> +---+---+
> +---+---+
> spark.read.parquet("/tmp/bug1").where("id = '01'").show
> +---+---+
> |  b| id|
> +---+---+
> |  1|  1|
> +---+---+
> {code}
> I think there is special handling of this case somewhere for equality (=) 
> that is not applied to IN.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

