xuFabius opened a new issue #754: InvalidSchemaException URL: https://github.com/apache/incubator-hudi/issues/754 When I run ./run_hoodie_app.sh. The program ended with InvalidSchemaException. ` Exception in thread "main" java.lang.ExceptionInInitializerError at org.apache.spark.sql.execution.datasources.parquet.ParquetWriteSupport$$anonfun$setSchema$2.apply(ParquetWriteSupport.scala:444) at org.apache.spark.sql.execution.datasources.parquet.ParquetWriteSupport$$anonfun$setSchema$2.apply(ParquetWriteSupport.scala:444) at scala.collection.immutable.List.foreach(List.scala:381) at org.apache.spark.sql.execution.datasources.parquet.ParquetWriteSupport$.setSchema(ParquetWriteSupport.scala:444) at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat.buildReaderWithPartitionValues(ParquetFileFormat.scala:314) at org.apache.spark.sql.execution.FileSourceScanExec.inputRDD$lzycompute(DataSourceScanExec.scala:294) at org.apache.spark.sql.execution.FileSourceScanExec.inputRDD(DataSourceScanExec.scala:290) at org.apache.spark.sql.execution.FileSourceScanExec.inputRDDs(DataSourceScanExec.scala:312) at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:610) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247) at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:337) at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38) at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3278) at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2727) at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2727) at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258) at org.apache.spark.sql.Dataset.collect(Dataset.scala:2727) at HoodieJavaApp.run(HoodieJavaApp.java:185) at HoodieJavaApp.main(HoodieJavaApp.java:95) Caused by: org.apache.parquet.schema.InvalidSchemaException: A group type can not be empty. Parquet does not support empty group without leaves. Empty group: spark_schema at org.apache.parquet.schema.GroupType.<init>(GroupType.java:92) at org.apache.parquet.schema.GroupType.<init>(GroupType.java:48) at org.apache.parquet.schema.MessageType.<init>(MessageType.java:50) at org.apache.parquet.schema.Types$MessageTypeBuilder.named(Types.java:1256) at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$.<init>(ParquetSchemaConverter.scala:567) at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$.<clinit>(ParquetSchemaConverter.scala) ... 27 more ` In the pom.xml, parquet.version=1.8.2, spark.version=2.3.3, avro.version=1.8.2
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services