wzx140 commented on code in PR #6745: URL: https://github.com/apache/hudi/pull/6745#discussion_r982426959
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/HoodieSparkParquetStreamWriter.java:
##########
@@ -40,6 +40,7 @@ public class HoodieSparkParquetStreamWriter implements HoodieSparkFileWriter, Au
   public HoodieSparkParquetStreamWriter(FSDataOutputStream outputStream, HoodieRowParquetConfig parquetConfig) throws IOException {
     this.writeSupport = parquetConfig.getWriteSupport();
+    this.writeSupport.enableLegacyFormat();

Review Comment:
   If we use HoodieAvroRecord to write the parquet log, it is written through AvroWriteSupport with `parquet.avro.write-old-list-structure` set to true. To stay compatible, HoodieSparkRecord should also write the parquet log in the legacy format, so that HoodieAvroRecord can read a parquet log that was written by HoodieSparkRecord.

   ```
   // Standard mode:
   //
   //   <list-repetition> group <name> (LIST) {
   //     repeated group list {
   //                    ^~~~  repeatedGroupName
   //       <element-repetition> <element-type> element;
   //                                           ^~~~~~~  elementFieldName
   //     }
   //   }

   // Legacy mode, with non-nullable elements:
   //
   //   <list-repetition> group <name> (LIST) {
   //     repeated <element-type> array;
   //                             ^~~~~  repeatedFieldName
   //   }
   ```
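
To make the difference between the two layouts in the comment above concrete, here is a minimal sketch that only renders the two schema shapes as text. It does not call Hudi, Spark, or Parquet; the class `ListLayoutSketch`, its methods, and the field name `tags` are hypothetical and exist only for illustration.

```java
// Illustrative only: renders the two list layouts quoted in the review comment.
// ListLayoutSketch and its methods are hypothetical, not part of Hudi or Parquet.
final class ListLayoutSketch {

    // Standard mode: three levels, a repeated group "list" wrapping an "element" field.
    static String standardLayout(String listRepetition, String name,
                                 String elementRepetition, String elementType) {
        return listRepetition + " group " + name + " (LIST) {\n"
             + "  repeated group list {\n"
             + "    " + elementRepetition + " " + elementType + " element;\n"
             + "  }\n"
             + "}";
    }

    // Legacy mode with non-nullable elements: two levels, a repeated "array" field.
    static String legacyLayout(String listRepetition, String name, String elementType) {
        return listRepetition + " group " + name + " (LIST) {\n"
             + "  repeated " + elementType + " array;\n"
             + "}";
    }

    public static void main(String[] args) {
        // The same logical column produces two different physical schemas.
        System.out.println(standardLayout("optional", "tags", "required", "binary"));
        System.out.println();
        System.out.println(legacyLayout("optional", "tags", "binary"));
    }
}
```

The two writers have to agree on one of these shapes for the same field, which is why the change enables the legacy format on the Spark write path rather than leaving each writer on its own default.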