[ 
https://issues.apache.org/jira/browse/HUDI-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747076#comment-17747076
 ] 

Aditya Goenka commented on HUDI-6589:
-------------------------------------

After further investigation and debugging, I found that to resolve the 
Avro-Parquet compatibility issue and allow arrays with null elements, you 
need to set the Spark configuration parameter 
spark.hadoop.parquet.avro.write-old-list-structure to false.

This configuration parameter controls how parquet-avro writes Avro arrays to 
Parquet. By default it is true, so arrays are written with the legacy 
two-level list structure, which cannot represent null elements; writing an 
array such as [null] therefore fails. Setting 
spark.hadoop.parquet.avro.write-old-list-structure to false makes the writer 
use the standard three-level list structure, which supports null elements, so 
such arrays are handled correctly during the write.

This is not a Hudi issue. I was able to insert the record you pasted just by 
setting --conf 'spark.hadoop.parquet.avro.write-old-list-structure=false'.
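
For anyone applying this from code rather than the command line, here is a 
minimal sketch of both routes. Only the configuration key and its value come 
from this thread; the names PARQUET_LIST_CONF, spark_submit_conf_args, 
build_session and the application name are hypothetical, and build_session 
assumes pyspark is installed.

```python
import shlex

# The key and value come from the comment above; the dict name is illustrative.
PARQUET_LIST_CONF = {
    "spark.hadoop.parquet.avro.write-old-list-structure": "false",
}


def spark_submit_conf_args(conf):
    """Render a dict of Spark settings as spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


def build_session(app_name="hudi-null-array-upsert"):
    """Create a SparkSession with the setting applied (assumes pyspark)."""
    # Imported lazily so the helper above works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in PARQUET_LIST_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


if __name__ == "__main__":
    # Print the equivalent spark-submit prefix for copy and paste.
    cmd = ["spark-submit", *spark_submit_conf_args(PARQUET_LIST_CONF)]
    print(shlex.join(cmd))
```

The setting has to be in place before the Spark session is created, since it 
is passed through to the Hadoop configuration used by the Parquet writer.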

> Upsert failing for array type if value given [null]
> ---------------------------------------------------
>
>                 Key: HUDI-6589
>                 URL: https://issues.apache.org/jira/browse/HUDI-6589
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Aditya Goenka
>            Priority: Critical
>             Fix For: 0.15.0
>
>
> Hudi upserts fail when the data in a nested field is [null].
> Details are in the GitHub issue (see the last comment): 
> [https://github.com/apache/hudi/issues/9141]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
