[ https://issues.apache.org/jira/browse/SPARK-20593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Herman van Hovell closed SPARK-20593.
-------------------------------------
    Resolution: Not A Problem

> Writing Parquet: Cannot build an empty group
> --------------------------------------------
>
>                 Key: SPARK-20593
>                 URL: https://issues.apache.org/jira/browse/SPARK-20593
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Core, Spark Shell
>    Affects Versions: 2.1.1
>         Environment: I use Apache Spark 2.1.1 (used 2.1.0 and it was the same, switched today). Tested only on Mac
>            Reporter: Viktor Khristenko
>            Priority: Minor
>
> Hi,
> This is my first ticket, and I apologize if I'm doing certain things improperly.
> I have a dataset:
> {noformat}
> root
>  |-- muons: array (nullable = true)
>  |    |-- element: struct (containsNull = true)
>  |    |    |-- reco::Candidate: struct (nullable = true)
>  |    |    |-- qx3_: integer (nullable = true)
>  |    |    |-- pt_: float (nullable = true)
>  |    |    |-- eta_: float (nullable = true)
>  |    |    |-- phi_: float (nullable = true)
>  |    |    |-- mass_: float (nullable = true)
>  |    |    |-- vertex_: struct (nullable = true)
>  |    |    |    |-- fCoordinates: struct (nullable = true)
>  |    |    |    |    |-- fX: float (nullable = true)
>  |    |    |    |    |-- fY: float (nullable = true)
>  |    |    |    |    |-- fZ: float (nullable = true)
>  |    |    |-- pdgId_: integer (nullable = true)
>  |    |    |-- status_: integer (nullable = true)
>  |    |    |-- cachePolarFixed_: struct (nullable = true)
>  |    |    |-- cacheCartesianFixed_: struct (nullable = true)
> {noformat}
> As you can see, there are 3 empty structs in this schema (reco::Candidate, cachePolarFixed_ and cacheCartesianFixed_). I am 100% sure that I can read and manipulate this dataset without problems.
> However, when I try writing it to disk as Parquet, I get the following exception:
> ds.write.format("parquet").save(outputPathName):
> java.lang.IllegalStateException: Cannot build an empty group
>       at org.apache.parquet.Preconditions.checkState(Preconditions.java:91)
>       at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:622)
>       at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:497)
>       at org.apache.parquet.schema.Types$Builder.named(Types.java:286)
>       at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:535)
>       at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:321)
>       at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:534)
>       at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:533)
> So, basically, I would like to understand whether this is a bug or intended behavior. I also assume that it's related to the empty structs. Any help would be really appreciated!
> I quickly created a stripped-down version, and that one works without any issues.
> For reference, here is a link to the original question on SO [1].
> VK
> [1] http://stackoverflow.com/questions/43767358/apache-spark-parquet-cannot-build-an-empty-group
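The reporter suspects that the three empty structs are what trips the Parquet writer. A minimal sketch of one possible workaround, computing a schema with the empty structs removed, might look like the following. It is plain Python over nested dicts that are assumed to mirror Spark's JSON schema form (for example, what PySpark's `df.schema.jsonValue()` returns). The helpers `prune_empty_structs`, `field` and `struct` are hypothetical names introduced here, not part of Spark, and the sample schema is a shortened version of the one in the ticket. Selecting the matching columns from the actual dataset is not shown.

```python
from typing import Any, Optional


def prune_empty_structs(dtype: Any) -> Optional[Any]:
    """Return a copy of `dtype` without empty structs, or None if nothing is left.

    `dtype` is either a primitive type name (a string such as "float") or a
    dict in the assumed JSON schema layout ("struct" with "fields", "array"
    with "elementType").
    """
    if isinstance(dtype, str):  # primitive type, nothing to prune
        return dtype
    kind = dtype.get("type")
    if kind == "struct":
        kept = []
        for f in dtype["fields"]:
            pruned = prune_empty_structs(f["type"])
            if pruned is not None:  # drop fields whose type became empty
                kept.append({**f, "type": pruned})
        return {"type": "struct", "fields": kept} if kept else None
    if kind == "array":
        element = prune_empty_structs(dtype["elementType"])
        return None if element is None else {**dtype, "elementType": element}
    return dtype  # maps and other types are left untouched in this sketch


# Hypothetical helpers for building the sample schema below.
def field(name: str, dtype: Any) -> dict:
    return {"name": name, "type": dtype, "nullable": True, "metadata": {}}


def struct(*fields: dict) -> dict:
    return {"type": "struct", "fields": list(fields)}


# Shortened version of the schema from the ticket, including its empty structs.
muon = struct(
    field("reco::Candidate", struct()),
    field("pt_", "float"),
    field("vertex_", struct(field("fCoordinates", struct(field("fX", "float"))))),
    field("cachePolarFixed_", struct()),
    field("cacheCartesianFixed_", struct()),
)
schema = struct(
    field("muons", {"type": "array", "elementType": muon, "containsNull": True})
)

pruned = prune_empty_structs(schema)
kept_names = [f["name"] for f in pruned["fields"][0]["type"]["elementType"]["fields"]]
print(kept_names)  # ['pt_', 'vertex_']
```

The function rebuilds the schema bottom-up, so a struct that contains only empty structs is itself dropped. Whether dropping these fields is acceptable depends on whether downstream readers rely on their presence.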