wzx140 commented on code in PR #6745:
URL: https://github.com/apache/hudi/pull/6745#discussion_r982426959


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/HoodieSparkParquetStreamWriter.java:
##########
@@ -40,6 +40,7 @@ public class HoodieSparkParquetStreamWriter implements HoodieSparkFileWriter, Au
   public HoodieSparkParquetStreamWriter(FSDataOutputStream outputStream,
       HoodieRowParquetConfig parquetConfig) throws IOException {
     this.writeSupport = parquetConfig.getWriteSupport();
+    this.writeSupport.enableLegacyFormat();

Review Comment:
   If we use HoodieAvroRecord to write the parquet log, it is written with 
AvroWriteSupport#"parquet.avro.write-old-list-structure"=true. To stay 
compatible, HoodieSparkRecord should also write the parquet log in the legacy 
format, so that HoodieAvroRecord can read a parquet log that was written by 
HoodieSparkRecord.
   
   ```
           // Standard mode:
           //
           //   <list-repetition> group <name> (LIST) {
           //     repeated group list {
           //                    ^~~~  repeatedGroupName
           //       <element-repetition> <element-type> element;
           //                                           ^~~~~~~  elementFieldName
           //     }
           //   }
   
           // Legacy mode, with non-nullable elements:
           //
           //   <list-repetition> group <name> (LIST) {
           //     repeated <element-type> array;
           //                             ^~~~~  repeatedFieldName
           //   }
   ```
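
   To make the difference between the two layouts concrete, here is a minimal 
sketch that renders both shapes for the same list field. `ListLayoutSketch` and 
its methods are hypothetical helpers written only for illustration; they are 
not part of Hudi, Spark, or Parquet, and they only build schema text rather 
than writing any file.

   ```java
   // Illustrative sketch only: ListLayoutSketch and its methods are hypothetical
   // helpers, not part of Hudi, Spark, or Parquet. They only format schema text.
   final class ListLayoutSketch {

       // Standard mode: the elements sit inside a repeated group named "list",
       // and each element is a field named "element".
       static String standardList(String listRepetition, String name,
                                  String elementRepetition, String elementType) {
           return listRepetition + " group " + name + " (LIST) {\n"
                + "  repeated group list {\n"
                + "    " + elementRepetition + " " + elementType + " element;\n"
                + "  }\n"
                + "}";
       }

       // Legacy mode with non-nullable elements: the element type is itself the
       // repeated field, named "array", with no intermediate group.
       static String legacyList(String listRepetition, String name,
                                String elementType) {
           return listRepetition + " group " + name + " (LIST) {\n"
                + "  repeated " + elementType + " array;\n"
                + "}";
       }

       public static void main(String[] args) {
           // Same logical field, two different physical layouts.
           System.out.println(standardList("optional", "tags", "required", "binary"));
           System.out.println(legacyList("optional", "tags", "binary"));
       }
   }
   ```

   The two results differ in nesting depth and field names for the same 
logical field, which is why the writer and the reader need to agree on one 
layout.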



