[ https://issues.apache.org/jira/browse/SPARK-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496646#comment-14496646 ]
Yin Huai commented on SPARK-4521:
---------------------------------

https://github.com/apache/spark/pull/5263 is for SPARK-6607. [~lian cheng] Should we resolve this one?

> Parquet fails to read columns with spaces in the name
> -----------------------------------------------------
>
>                 Key: SPARK-4521
>                 URL: https://issues.apache.org/jira/browse/SPARK-4521
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.2.0
>            Reporter: Michael Armbrust
>
> I think this is actually a bug in parquet, but it would be good to track it here as well. To reproduce:
> {code}
> jsonRDD(sparkContext.parallelize("""{"number of clusters": 1}"""::Nil)).saveAsParquetFile("test")
> parquetFile("test").collect()
> {code}
> {code}
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 8.0 failed 1 times, most recent failure: Lost task 0.0 in stage 8.0 (TID 13, localhost): java.lang.IllegalArgumentException: field ended by ';': expected ';' but got 'of' at line 1: optional int32 number of
>     at parquet.schema.MessageTypeParser.check(MessageTypeParser.java:209)
>     at parquet.schema.MessageTypeParser.addPrimitiveType(MessageTypeParser.java:182)
>     at parquet.schema.MessageTypeParser.addType(MessageTypeParser.java:108)
>     at parquet.schema.MessageTypeParser.addGroupTypeFields(MessageTypeParser.java:96)
>     at parquet.schema.MessageTypeParser.parse(MessageTypeParser.java:89)
>     at parquet.schema.MessageTypeParser.parseMessageType(MessageTypeParser.java:79)
>     at parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:189)
>     at parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:138)
>     at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:135)
>     at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:107)
> {code}