GitHub user parisni edited a comment on the discussion: what's the point of parquet.avro.write-old-list-structure=false for hudi->iceberg
Hi Tim,

I ran some experiments and can confirm that the Iceberg Spark reader fails on old-style lists. The Iceberg Athena reader, on the other hand, does not care whether a list is old or new style. I can also confirm that setting `spark.hadoop.parquet.avro.write-old-list-structure=false` when writing with Hudi fixes the Iceberg Spark reader.

I was curious whether a single table can mix files written with `parquet.avro.write-old-list-structure=true` and files written with `false`. It cannot when using the Avro payload merger: the upsert fails to read the Parquet files. Interestingly, `org.apache.hudi.HoodieSparkRecordMerger` uses new-style lists by default, and it does not honour `spark.hadoop.parquet.avro.write-old-list-structure=true` either.

My conclusions are:

- `parquet.avro.write-old-list-structure=false` is mandatory when using the Avro payload merger, and the whole table then needs to be rewritten
- with `org.apache.hudi.HoodieSparkRecordMerger`, everything works as is
- `bulk_insert` always produces new-style lists too

@the-other-tim-brown can you confirm?

GitHub link: https://github.com/apache/incubator-xtable/discussions/770#discussioncomment-15479738
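
For reference, the setting discussed in the comment above could be passed to a Spark job roughly as follows. This is only a minimal sketch: the configuration key is the one quoted in the comment, while the helper names `list_structure_conf` and `as_spark_submit_args` are hypothetical and exist only for illustration.

```python
# Sketch only: the key is quoted from the comment above; the helper
# functions are hypothetical and not part of Hudi, Iceberg, or Spark.
OLD_LIST_KEY = "spark.hadoop.parquet.avro.write-old-list-structure"


def list_structure_conf(old_style: bool = False) -> dict:
    """Return the Spark conf entry that selects old-style or new-style lists."""
    return {OLD_LIST_KEY: "true" if old_style else "false"}


def as_spark_submit_args(conf: dict) -> list:
    """Turn a conf dict into `--conf key=value` arguments for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the arguments requesting new-style lists for the Hudi writer job.
    print(" ".join(as_spark_submit_args(list_structure_conf(old_style=False))))
```

With PySpark, the same key/value pair can instead be set on the session builder via `SparkSession.builder.config(key, value)`. As noted in the comment, the value only seems to matter for the Avro payload merger path.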
