[ https://issues.apache.org/jira/browse/SPARK-22398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235708#comment-16235708 ]
Marco Gaido commented on SPARK-22398:
-------------------------------------

[~hyukjin.kwon] I think there are two points here:
1) partition values with leading 0s are interpreted as integers (I think this is wrong behavior, but it can be fixed by disabling typeInference)
2) IN type coercion with literals behaves differently from type coercion in other parts.
Given the title of the JIRA, I thought the best option was to track 1 here and open a new JIRA with a relevant title for 2.

> Partition directories with leading 0s cause wrong results
> ---------------------------------------------------------
>
>                 Key: SPARK-22398
>                 URL: https://issues.apache.org/jira/browse/SPARK-22398
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Bogdan Raducanu
>            Priority: Major
>
> Repro case:
> {code}
> spark.range(8).selectExpr("'0' || cast(id as string) as id", "id as b").write.mode("overwrite").partitionBy("id").parquet("/tmp/bug1")
> spark.read.parquet("/tmp/bug1").where("id in ('01')").show
> +---+---+
> |  b| id|
> +---+---+
> +---+---+
> spark.read.parquet("/tmp/bug1").where("id = '01'").show
> +---+---+
> |  b| id|
> +---+---+
> |  1|  1|
> +---+---+
> {code}
> I think somewhere there is some special handling of this case for equals but not the same for IN.
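
On point 1, a minimal sketch of the type-inference workaround, assuming a spark-shell session where `spark` is the active SparkSession and /tmp/bug1 was written by the repro case in the issue description. The setting used is spark.sql.sources.partitionColumnTypeInference.enabled. When it is false, Spark is documented to use string type for partition columns, so the leading 0 should be kept. No output is shown because the result on 2.3.0 is not confirmed in this thread.

```scala
// Assumption: run inside spark-shell, where `spark` is already defined,
// after the repro case has written the data to /tmp/bug1.

// Turn off partition column type inference so that values taken from
// directory names such as id=01 are read as strings, not integers.
spark.conf.set("spark.sql.sources.partitionColumnTypeInference.enabled", "false")

val df = spark.read.parquet("/tmp/bug1")

// Check which type Spark assigned to the partition column `id`.
df.printSchema()

// Re-run both predicates from the repro to compare IN against equals.
df.where("id in ('01')").show()
df.where("id = '01'").show()
```

This only sidesteps point 1. The difference between IN and equals type coercion (point 2) is a separate question.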