[ https://issues.apache.org/jira/browse/SPARK-25722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun updated SPARK-25722:
----------------------------------
    Description:
Among built-in data sources, `avro` and `orc` don't allow `backtick` in column names. We had better be consistent if possible.
 * Option 1: Support a backtick character
 * Option 2: Disallow a backtick character (This may be considered as a regression at TEXT/CSV/JSON/Parquet)

So, Option 1 is better.

*TEXT*, *CSV*, *JSON*, *PARQUET*
{code:java}
Seq("text", "csv", "json", "parquet").foreach { format =>
  Seq("1").toDF("`").write.mode("overwrite").format(format).save("/tmp/t")
}{code}
*AVRO*
{code:java}
scala> Seq("1").toDF("`").write.mode("overwrite").format("avro").save("/tmp/t")
org.apache.avro.SchemaParseException: Illegal initial character: `{code}
*ORC (native)*
{code:java}
scala> Seq("1").toDF("`").write.mode("overwrite").format("orc").save("/tmp/t")
java.lang.IllegalArgumentException: Unmatched quote at 'struct<^```:string>'{code}
*ORC (hive)*
{code:java}
scala> Seq("1").toDF("`").write.mode("overwrite").format("orc").save("/tmp/t")
java.lang.IllegalArgumentException: Error: name expected at the position 7 of 'struct<`:string>' but '`' is found.{code}

  was:
Among built-in data sources, `avro` and `orc` don't allow `backtick` in column names. We had better be consistent if possible.
 * Option 1: Support a backtick character
 * Option 2: Disallow a backtick character (This may be considered as a regression at TEXT/CSV/JSON/Parquet)

So, Option 1 is better.

*TEXT*, *CSV*, *JSON*, *PARQUET*
{code:java}
Seq("text", "csv", "json", "parquet").foreach { format =>
  Seq("1").toDF("`").write.mode("overwrite").format(format).save("/tmp/t")
}{code}
*AVRO*
{code:java}
scala> Seq("1").toDF("`").write.mode("overwrite").format("avro").save("/tmp/t")
org.apache.avro.SchemaParseException: Illegal initial character: `{code}
*ORC*
{code:java}
scala> Seq("1").toDF("`").write.mode("overwrite").format("orc").save("/tmp/t")
java.lang.IllegalArgumentException: Unmatched quote at 'struct<^```:string>'{code}


> Support a backtick character in column names
> --------------------------------------------
>
>                 Key: SPARK-25722
>                 URL: https://issues.apache.org/jira/browse/SPARK-25722
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Dongjoon Hyun
>            Priority: Minor
>
> Among built-in data sources, `avro` and `orc` don't allow `backtick` in column names. We had better be consistent if possible.
>  * Option 1: Support a backtick character
>  * Option 2: Disallow a backtick character (This may be considered as a regression at TEXT/CSV/JSON/Parquet)
>
> So, Option 1 is better.
>
> *TEXT*, *CSV*, *JSON*, *PARQUET*
> {code:java}
> Seq("text", "csv", "json", "parquet").foreach { format =>
>   Seq("1").toDF("`").write.mode("overwrite").format(format).save("/tmp/t")
> }{code}
> *AVRO*
> {code:java}
> scala> Seq("1").toDF("`").write.mode("overwrite").format("avro").save("/tmp/t")
> org.apache.avro.SchemaParseException: Illegal initial character: `{code}
> *ORC (native)*
> {code:java}
> scala> Seq("1").toDF("`").write.mode("overwrite").format("orc").save("/tmp/t")
> java.lang.IllegalArgumentException: Unmatched quote at 'struct<^```:string>'{code}
> *ORC (hive)*
> {code:java}
> scala> Seq("1").toDF("`").write.mode("overwrite").format("orc").save("/tmp/t")
> java.lang.IllegalArgumentException: Error: name expected at the position 7 of 'struct<`:string>' but '`' is found.{code}
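
Until a backtick is supported by every built-in source, one possible workaround is to rename offending columns before writing. The sketch below is only an illustration and is not taken from the issue or from Spark: `ColumnNameCheck`, `isValidAvroName`, and `sanitize` are hypothetical names. It assumes the Avro naming rule (a name starts with `[A-Za-z_]` and continues with `[A-Za-z0-9_]`), which appears to be what the `Illegal initial character` error above reflects. It makes no claim about the exact rules ORC applies.

```scala
// Hypothetical helper for illustration; not part of Spark or Avro.
object ColumnNameCheck {
  // Assumed rule, following the Avro specification for names:
  // first character in [A-Za-z_], remaining characters in [A-Za-z0-9_].
  private val AvroName = "[A-Za-z_][A-Za-z0-9_]*".r

  // True if the whole column name satisfies the assumed Avro rule.
  def isValidAvroName(name: String): Boolean =
    AvroName.pattern.matcher(name).matches()

  // ASCII letters, ASCII digits, and underscore are kept as they are.
  private def isAllowed(c: Char): Boolean =
    (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
      (c >= '0' && c <= '9') || c == '_'

  // Replace every disallowed character (such as a backtick) with '_',
  // and prefix '_' when the result is empty or starts with a digit.
  def sanitize(name: String): String = {
    val replaced = name.map(c => if (isAllowed(c)) c else '_')
    if (replaced.isEmpty || replaced.head.isDigit) "_" + replaced else replaced
  }
}
```

With a DataFrame `df` in scope, the renaming could be applied as `df.toDF(df.columns.map(ColumnNameCheck.sanitize): _*)` before calling `write`. Note that distinct names can collapse to the same sanitized name (for example "`" and "_"), so a real implementation would also need to check for collisions.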