umehrot2 commented on issue #3841: URL: https://github.com/apache/hudi/issues/3841#issuecomment-955116742
Removing from `blocked-on-user` and marking as a `release blocker`. This has been reported in Slack as well: https://apache-hudi.slack.com/archives/C4D716NPQ/p1635490536147600.

Although I haven't tried reproducing it, my hunch is that this is caused by an underlying Parquet issue: https://github.com/apache/parquet-mr/pull/560. So, after the initial write, when you later perform an upsert, it fails while trying to read the original file because of the above issue.

Based on the Slack conversation, it seems to be reproducible when simply using a `list of structs`, which should be a fairly simple use case.

```
 |-- item_pairs: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- additional_attributes: string (nullable = true)
 |    |    |-- mapping_state: string (nullable = true)
 |    |    |-- to_item_version: long (nullable = true)
 |    |    |-- to_item_attributes: string (nullable = true)
 |    |    |-- to_region_id: string (nullable = true)
 |    |    |-- to_marketplace_id: string (nullable = true)
 |    |    |-- to_item_id: string (nullable = true)
 |    |    |-- to_website_id: string (nullable = true)
```

If this is true, then we might have to upgrade the Parquet version to 1.11 or 1.12 to get this fix. I will try to reproduce this on my side next week and file a JIRA as needed.

cc @vinothchandar
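
For anyone who wants to check whether their own table has the same shape as the `item_pairs` schema above, here is a minimal sketch that walks a Spark schema in its JSON form and lists every array-of-struct column. It assumes the input is what `df.schema.json()` returns in Spark; the helper names `find_struct_arrays` and `_walk_type` are made up for illustration and are not part of Hudi or Spark. The sample schema is a trimmed copy of the one above, with only two of the struct fields kept.

```python
import json


def find_struct_arrays(schema, prefix=""):
    """Return dotted paths of every array<struct> field in a Spark schema dict."""
    paths = []
    for field in schema.get("fields", []):
        paths.extend(_walk_type(field["type"], prefix + field["name"]))
    return paths


def _walk_type(dtype, path):
    # Primitive types are plain strings such as "string" or "long".
    if not isinstance(dtype, dict):
        return []
    kind = dtype.get("type")
    if kind == "struct":
        return find_struct_arrays(dtype, path + ".")
    if kind == "array":
        element = dtype["elementType"]
        found = []
        if isinstance(element, dict) and element.get("type") == "struct":
            found.append(path)
        # Keep descending in case the element itself nests more arrays.
        return found + _walk_type(element, path + ".element")
    if kind == "map":
        return _walk_type(dtype["valueType"], path + ".value")
    return []


# Trimmed version of the schema from this comment (assumed JSON layout).
SCHEMA_JSON = """
{"type": "struct", "fields": [
  {"name": "item_pairs", "nullable": true, "metadata": {},
   "type": {"type": "array", "containsNull": true,
            "elementType": {"type": "struct", "fields": [
              {"name": "mapping_state", "type": "string",
               "nullable": true, "metadata": {}},
              {"name": "to_item_version", "type": "long",
               "nullable": true, "metadata": {}}
            ]}}}
]}
"""

print(find_struct_arrays(json.loads(SCHEMA_JSON)))  # ['item_pairs']
```

A non-empty result only says the table contains the pattern suspected here. It does not confirm that the upsert failure comes from the Parquet issue linked above.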