[ https://issues.apache.org/jira/browse/HUDI-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747076#comment-17747076 ]
Aditya Goenka commented on HUDI-6589:
-------------------------------------

After further investigation and debugging, the fix for this Avro-Parquet compatibility issue is to set the Spark configuration parameter spark.hadoop.parquet.avro.write-old-list-structure to false, which allows arrays with null elements.

This parameter controls how parquet-avro writes Avro arrays to Parquet. It defaults to true, in which case arrays are written with the legacy two-level list structure, which cannot represent null elements. Setting it to false switches the writer to the three-level list structure, which supports null elements, so such arrays are handled correctly during the write.

This was not a Hudi issue. I was able to insert the record you pasted just by setting:

    --conf 'spark.hadoop.parquet.avro.write-old-list-structure=false'

> Upsert failing for array type if value given [null]
> ---------------------------------------------------
>
>                 Key: HUDI-6589
>                 URL: https://issues.apache.org/jira/browse/HUDI-6589
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Aditya Goenka
>            Priority: Critical
>             Fix For: 0.15.0
>
>
> Hudi Upserts are failing when data in a nested field is [null],
> Details in GitHub issue (see last comment) -
> [https://github.com/apache/hudi/issues/9141]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
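For anyone applying the setting from the comment above programmatically, here is a minimal sketch in Python. It is only an illustration, not something taken from the ticket: the helper names (null_safe_array_conf, spark_submit_command, build_session), the application name, and the script path my_hudi_job.py are all hypothetical. The only detail taken from the comment is the configuration key and its value. build_session assumes pyspark is installed; the other helpers use only the standard library.

```python
import shlex

# Hadoop-side key read by parquet-avro. The "spark.hadoop." prefix tells Spark
# to copy the property into the Hadoop Configuration used by the writer.
PARQUET_KEY = "parquet.avro.write-old-list-structure"
SPARK_KEY = "spark.hadoop." + PARQUET_KEY


def null_safe_array_conf():
    """Return the Spark conf entry that turns off the old list structure."""
    return {SPARK_KEY: "false"}


def spark_submit_command(app, extra_conf=None):
    """Build a spark-submit argument list with the setting applied.

    `app` is a hypothetical application path, used for illustration only.
    """
    conf = null_safe_array_conf()
    conf.update(extra_conf or {})
    args = ["spark-submit"]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


def build_session(app_name="hudi-null-array-demo"):
    """Create a SparkSession with the setting applied (requires pyspark)."""
    # Imported lazily so the helpers above work without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in null_safe_array_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


if __name__ == "__main__":
    # Print the equivalent command line for a hypothetical job script.
    print(shlex.join(spark_submit_command("my_hudi_job.py")))
```

The setting has to be in place before the session is created, which is why the sketch passes it either on the spark-submit command line or through the session builder rather than changing it on a running session.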