[jira] [Assigned] (SPARK-34314) Wrong discovered partition value
[ https://issues.apache.org/jira/browse/SPARK-34314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan reassigned SPARK-34314:
-----------------------------------

    Assignee: Maxim Gekk

> Wrong discovered partition value
> --------------------------------
>
>                 Key: SPARK-34314
>                 URL: https://issues.apache.org/jira/browse/SPARK-34314
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.8, 3.0.2, 3.1.0, 3.2.0
>            Reporter: Maxim Gekk
>            Assignee: Maxim Gekk
>            Priority: Major
>
> The example below illustrates the issue:
> {code:scala}
> val df = Seq((0, "AA"), (1, "-0")).toDF("id", "part")
> df.write
>   .partitionBy("part")
>   .format("parquet")
>   .save(path)
> val readback = spark.read.parquet(path)
> readback.printSchema()
> readback.show(false)
> {code}
> Spark writes the partition values as strings, *"-0"* and "AA":
> {code}
> /private/var/folders/p3/dfs6mf655d7fnjrsjvldh0tcgn/T/spark-e09eae99-7ecf-4ab2-b99b-f63f8dea658d
> ├── _SUCCESS
> ├── part=-0
> │   └── part-1-02144398-2896-4d21-9628-a8743d098cb4.c000.snappy.parquet
> └── part=AA
>     └── part-0-02144398-2896-4d21-9628-a8743d098cb4.c000.snappy.parquet
> {code}
> But when Spark reads the data back, it transforms "-0" into "0":
> {code}
> root
>  |-- id: integer (nullable = true)
>  |-- part: string (nullable = true)
> +---+----+
> |id |part|
> +---+----+
> |0  |AA  |
> |1  |0   |
> +---+----+
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34314) Wrong discovered partition value
[ https://issues.apache.org/jira/browse/SPARK-34314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-34314:
------------------------------------

    Assignee: Apache Spark

> Wrong discovered partition value
> --------------------------------
>
>                 Key: SPARK-34314
>                 URL: https://issues.apache.org/jira/browse/SPARK-34314
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Maxim Gekk
>            Assignee: Apache Spark
>            Priority: Major
[jira] [Assigned] (SPARK-34314) Wrong discovered partition value
[ https://issues.apache.org/jira/browse/SPARK-34314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-34314:
------------------------------------

    Assignee:     (was: Apache Spark)

> Wrong discovered partition value
> --------------------------------
>
>                 Key: SPARK-34314
>                 URL: https://issues.apache.org/jira/browse/SPARK-34314
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Maxim Gekk
>            Priority: Major
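
One plausible way to read the behaviour in the report: each partition directory value is first parsed as the narrowest type that fits, and when the per-directory types conflict the already-parsed values are widened to string, so "-0" becomes the integer 0 and then the string "0". The sketch below models that reading in plain Scala. It is an illustration under that assumption only, not Spark's implementation; `PartitionInferenceSketch`, `infer`, `widenParsed` and `widenRaw` are hypothetical names introduced here.

```scala
// Hypothetical model of partition-value inference; NOT Spark's actual code.
object PartitionInferenceSketch {
  sealed trait Inferred
  final case class IntValue(value: Int) extends Inferred
  final case class StringValue(value: String) extends Inferred

  // Try the narrowest type first: an integer if the text parses, else a string.
  def infer(raw: String): Inferred =
    scala.util.Try(raw.toInt).toOption.map(IntValue(_)).getOrElse(StringValue(raw))

  // Lossy widening: convert the already-parsed value back to a string.
  // "-0" was parsed to the integer 0, so it comes back as "0".
  def widenParsed(raws: Seq[String]): Seq[String] =
    raws.map(infer).map {
      case IntValue(v)    => v.toString
      case StringValue(s) => s
    }

  // Lossless widening: when the inferred types conflict, keep the raw
  // directory text instead of the parsed value.
  def widenRaw(raws: Seq[String]): Seq[String] = {
    val inferred = raws.map(infer)
    val allInts = inferred.forall(_.isInstanceOf[IntValue])
    if (allInts) inferred.collect { case IntValue(v) => v.toString } else raws
  }

  def main(args: Array[String]): Unit = {
    val raws = Seq("AA", "-0") // the two partition directories from the report
    println(widenParsed(raws)) // List(AA, 0)
    println(widenRaw(raws))    // List(AA, -0)
  }
}
```

With the two directory values from the report, the lossy variant reproduces the "0" seen in the `show` output, while the raw-text variant preserves "-0". Whether a given Spark version follows either path should be checked against its source.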