[jira] [Updated] (SPARK-20593) Writing Parquet: Cannot build an empty group
[ https://issues.apache.org/jira/browse/SPARK-20593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viktor Khristenko updated SPARK-20593:
--------------------------------------
    Environment: I use Apache Spark 2.1.1 (used 2.1.0 and it was the same, switched today). Tested only on Mac  (was: I use Apache Spark 2.1.1 (used 2.1.0 and it was the same, switched today). Tested only Mac)

> Writing Parquet: Cannot build an empty group
> --------------------------------------------
>
>                 Key: SPARK-20593
>                 URL: https://issues.apache.org/jira/browse/SPARK-20593
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Core, Spark Shell
>    Affects Versions: 2.1.1
>         Environment: I use Apache Spark 2.1.1 (used 2.1.0 and it was the same, switched today). Tested only on Mac
>            Reporter: Viktor Khristenko
>            Priority: Minor
>
> Hi,
> This is my first ticket, and I apologize if I'm doing anything improperly.
> I have a dataset:
> {noformat}
> root
>  |-- muons: array (nullable = true)
>  |    |-- element: struct (containsNull = true)
>  |    |    |-- reco::Candidate: struct (nullable = true)
>  |    |    |-- qx3_: integer (nullable = true)
>  |    |    |-- pt_: float (nullable = true)
>  |    |    |-- eta_: float (nullable = true)
>  |    |    |-- phi_: float (nullable = true)
>  |    |    |-- mass_: float (nullable = true)
>  |    |    |-- vertex_: struct (nullable = true)
>  |    |    |    |-- fCoordinates: struct (nullable = true)
>  |    |    |    |    |-- fX: float (nullable = true)
>  |    |    |    |    |-- fY: float (nullable = true)
>  |    |    |    |    |-- fZ: float (nullable = true)
>  |    |    |-- pdgId_: integer (nullable = true)
>  |    |    |-- status_: integer (nullable = true)
>  |    |    |-- cachePolarFixed_: struct (nullable = true)
>  |    |    |-- cacheCartesianFixed_: struct (nullable = true)
> {noformat}
> As you can see, there are 3 empty structs in this schema (reco::Candidate, cachePolarFixed_ and cacheCartesianFixed_). I am certain that I can read and manipulate this dataset without any problems.
> However, when I try writing it to disk as Parquet, I get the following exception:
> ds.write.format("parquet").save(outputPathName):
> java.lang.IllegalStateException: Cannot build an empty group
>     at org.apache.parquet.Preconditions.checkState(Preconditions.java:91)
>     at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:622)
>     at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:497)
>     at org.apache.parquet.schema.Types$Builder.named(Types.java:286)
>     at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:535)
>     at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:321)
>     at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:534)
>     at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:533)
> So, basically, I would like to understand whether this is a bug or intended behavior. I also assume that it's related to the empty structs. Any help would be really appreciated!
> I've quickly created a stripped-down version, and that one works without any issues!
> For reference, here is a link to the original question on SO [1].
> VK
> [1] http://stackoverflow.com/questions/43767358/apache-spark-parquet-cannot-build-an-empty-group

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
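
The reporter's suspicion about the empty structs can be checked directly against the schema. The following is a minimal sketch and not part of the original ticket: `EmptyStructs`, `find` and `exampleSchema` are hypothetical names, and the example schema is a cut-down stand-in for the `muons` schema in the description. It assumes Spark SQL is on the classpath (for example inside `spark-shell`); it only walks the schema and does not write anything.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper (not from the ticket): lists every struct with no fields.
object EmptyStructs {

  /** Returns the dotted path of every field-less struct reachable from `dt`. */
  def find(dt: DataType, path: String = "root"): Seq[String] = dt match {
    // A struct without fields is what this sketch is looking for.
    case s: StructType if s.fields.isEmpty => Seq(path)
    // Otherwise descend into each field of the struct.
    case s: StructType =>
      s.fields.toSeq.flatMap(f => find(f.dataType, s"$path.${f.name}"))
    // Arrays and maps can hide structs in their element/key/value types.
    case a: ArrayType => find(a.elementType, s"$path.element")
    case m: MapType =>
      find(m.keyType, s"$path.key") ++ find(m.valueType, s"$path.value")
    // Atomic types cannot contain structs.
    case _ => Seq.empty
  }

  // Assumed, reduced version of the reported schema: three empty structs
  // next to one ordinary float field.
  val exampleSchema: StructType = new StructType()
    .add("muons", ArrayType(
      new StructType()
        .add("reco::Candidate", new StructType())
        .add("pt_", FloatType)
        .add("cachePolarFixed_", new StructType())
        .add("cacheCartesianFixed_", new StructType())))

  def main(args: Array[String]): Unit =
    find(exampleSchema).foreach(println)
}
```

In `spark-shell`, `EmptyStructs.find(ds.schema).foreach(println)` would list the corresponding fields of the real dataset. If the empty structs do turn out to be the cause, those fields could then be dropped, or given a placeholder field, before calling `ds.write.format("parquet")`.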
[jira] [Updated] (SPARK-20593) Writing Parquet: Cannot build an empty group
[ https://issues.apache.org/jira/browse/SPARK-20593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viktor Khristenko updated SPARK-20593:
--
Description:

Hi,

This is my first ticket and I apologize if I'm doing certain things in an improper way.

I have a dataset:

root
* muons: array (nullable = true)
  * element: struct (containsNull = true)
    * reco::Candidate: struct (nullable = true)
    * qx3_: integer (nullable = true)
    * pt_: float (nullable = true)
    * eta_: float (nullable = true)
    * phi_: float (nullable = true)
    * mass_: float (nullable = true)
    * vertex_: struct (nullable = true)
      * fCoordinates: struct (nullable = true)
        * fX: float (nullable = true)
        * fY: float (nullable = true)
        * fZ: float (nullable = true)
    * pdgId_: integer (nullable = true)
    * status_: integer (nullable = true)
    * cachePolarFixed_: struct (nullable = true)
    * cacheCartesianFixed_: struct (nullable = true)

As you can see, there are 3 empty structs in this schema. I know for certain that I can read and manipulate the dataset. However, when I try writing it to disk as Parquet, I get the following exception:

ds.write.format("parquet").save(outputPathName):
java.lang.IllegalStateException: Cannot build an empty group
  at org.apache.parquet.Preconditions.checkState(Preconditions.java:91)
  at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:622)
  at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:497)
  at org.apache.parquet.schema.Types$Builder.named(Types.java:286)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:535)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:321)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:534)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:533)

So, basically I would like to understand whether this is a bug or intended behavior. I also assume that it's related to the empty structs. Any help would be really appreciated! I've quickly created a stripped-down version, and that one works without any issues!

For reference, I put a link to the original question on SO[1]

VK

[1] http://stackoverflow.com/questions/43767358/apache-spark-parquet-cannot-build-an-empty-group

was:

Hi,

This is my first ticket and I apologize if I'm doing certain things in an improper way.

I have a dataset:

root
muons: array (nullable = true)
  element: struct (containsNull = true)
    reco::Candidate: struct (nullable = true)
    qx3_: integer (nullable = true)
    pt_: float (nullable = true)
    eta_: float (nullable = true)
    phi_: float (nullable = true)
    mass_: float (nullable = true)
    vertex_: struct (nullable = true)
      fCoordinates: struct (nullable = true)
        fX: float (nullable = true)
        fY: float (nullable = true)
        fZ: float (nullable = true)
    pdgId_: integer (nullable = true)
    status_: integer (nullable = true)
    cachePolarFixed_: struct (nullable = true)
    cacheCartesianFixed_: struct (nullable = true)

As you can see, there are 3 empty structs in this schema. I know for certain that I can read and manipulate the dataset. However, when I try writing it to disk as Parquet, I get the following exception:

ds.write.format("parquet").save(outputPathName):
java.lang.IllegalStateException: Cannot build an empty group
  at org.apache.parquet.Preconditions.checkState(Preconditions.java:91)
  at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:622)
  at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:497)
  at org.apache.parquet.schema.Types$Builder.named(Types.java:286)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:535)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:321)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:534)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:533)

So, basically I would like to understand whether this is a bug or intended behavior. I also assume that it's related to the empty structs. Any help would be really appreciated! I've quickly created a stripped-down version, and that one works without any issues!

For reference, I put a link to the original question on SO[1]

VK

[1] http://stackoverflow.com/questions/43767358/apache-spark-parquet-cannot-build-an-empty-group

> Writing Parquet: Cannot build an empty group
[jira] [Updated] (SPARK-20593) Writing Parquet: Cannot build an empty group
[ https://issues.apache.org/jira/browse/SPARK-20593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viktor Khristenko updated SPARK-20593:
--
Description:

Hi,

This is my first ticket and I apologize if I'm doing certain things in an improper way.

I have a dataset:

root
muons: array (nullable = true)
  element: struct (containsNull = true)
    reco::Candidate: struct (nullable = true)
    qx3_: integer (nullable = true)
    pt_: float (nullable = true)
    eta_: float (nullable = true)
    phi_: float (nullable = true)
    mass_: float (nullable = true)
    vertex_: struct (nullable = true)
      fCoordinates: struct (nullable = true)
        fX: float (nullable = true)
        fY: float (nullable = true)
        fZ: float (nullable = true)
    pdgId_: integer (nullable = true)
    status_: integer (nullable = true)
    cachePolarFixed_: struct (nullable = true)
    cacheCartesianFixed_: struct (nullable = true)

As you can see, there are 3 empty structs in this schema. I know for certain that I can read and manipulate the dataset. However, when I try writing it to disk as Parquet, I get the following exception:

ds.write.format("parquet").save(outputPathName):
java.lang.IllegalStateException: Cannot build an empty group
  at org.apache.parquet.Preconditions.checkState(Preconditions.java:91)
  at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:622)
  at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:497)
  at org.apache.parquet.schema.Types$Builder.named(Types.java:286)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:535)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:321)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:534)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:533)

So, basically I would like to understand whether this is a bug or intended behavior. I also assume that it's related to the empty structs. Any help would be really appreciated! I've quickly created a stripped-down version, and that one works without any issues!

For reference, I put a link to the original question on SO[1]

VK

[1] http://stackoverflow.com/questions/43767358/apache-spark-parquet-cannot-build-an-empty-group

was:

Hi,

This is my first ticket and I apologize if I'm doing certain things in an improper way.

I have a dataset:

root
|-- muons: array (nullable = true)
||-- element: struct (containsNull = true)
|||-- reco::Candidate: struct (nullable = true)
|||-- qx3_: integer (nullable = true)
|||-- pt_: float (nullable = true)
|||-- eta_: float (nullable = true)
|||-- phi_: float (nullable = true)
|||-- mass_: float (nullable = true)
|||-- vertex_: struct (nullable = true)
||||-- fCoordinates: struct (nullable = true)
|||||-- fX: float (nullable = true)
|||||-- fY: float (nullable = true)
|||||-- fZ: float (nullable = true)
|||-- pdgId_: integer (nullable = true)
|||-- status_: integer (nullable = true)
|||-- cachePolarFixed_: struct (nullable = true)
|||-- cacheCartesianFixed_: struct (nullable = true)

As you can see, there are 3 empty structs in this schema. I know for certain that I can read and manipulate the dataset. However, when I try writing it to disk as Parquet, I get the following exception:

ds.write.format("parquet").save(outputPathName):
java.lang.IllegalStateException: Cannot build an empty group
  at org.apache.parquet.Preconditions.checkState(Preconditions.java:91)
  at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:622)
  at org.apache.parquet.schema.Types$BaseGroupBuilder.build(Types.java:497)
  at org.apache.parquet.schema.Types$Builder.named(Types.java:286)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:535)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:321)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:534)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convertField$1.apply(ParquetSchemaConverter.scala:533)

So, basically I would like to understand whether this is a bug or intended behavior. I also assume that it's related to the empty structs. Any help would be really appreciated! I've quickly created a stripped-down version, and that one works without any issues!

For reference, I put a link to the original question on SO[1]

VK

[1] http://stackoverflow.com/qu