umehrot2 commented on issue #3841:
URL: https://github.com/apache/hudi/issues/3841#issuecomment-955116742


   Removing from `blocked-on-user` and marking as a `release blocker`. This has 
been reported in Slack as well: 
https://apache-hudi.slack.com/archives/C4D716NPQ/p1635490536147600. I haven't 
tried reproducing it yet, but my hunch is that this is caused by an 
underlying Parquet issue, https://github.com/apache/parquet-mr/pull/560. So, 
after the initial write, when you later perform an upsert, it fails while trying 
to read the original file because of the above issue. Based on the Slack 
conversation, it seems to be reproducible simply by using a `list of structs`, 
which should be a fairly simple use case.
   
   ```
    |-- item_pairs: array (nullable = true)
    |    |-- element: struct (containsNull = true)
    |    |    |-- additional_attributes: string (nullable = true)
    |    |    |-- mapping_state: string (nullable = true)
    |    |    |-- to_item_version: long (nullable = true)
    |    |    |-- to_item_attributes: string (nullable = true)
    |    |    |-- to_region_id: string (nullable = true)
    |    |    |-- to_marketplace_id: string (nullable = true)
    |    |    |-- to_item_id: string (nullable = true)
    |    |    |-- to_website_id: string (nullable = true)
   ```
   
   If this is true, then we might have to upgrade the Parquet version to 1.11 
or 1.12 to get this fix. I will try to reproduce this on my side next week and 
file a Jira as needed. cc @vinothchandar 
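
For anyone who wants to try this before then, here is a rough PySpark sketch of the write-then-upsert sequence described above. It is untested against the failing setup, so treat it as a starting point rather than a confirmed reproduction. The `id`, `ts`, and `part` columns, the table name, and the helper names are all assumptions made for illustration; only `item_pairs` (with a subset of its struct fields) comes from the reported schema. It also assumes a Spark session started with the Hudi bundle on the classpath.

```python
# Hypothetical repro sketch: initial write followed by an upsert on a table
# that contains a list-of-structs column. Column names other than
# `item_pairs` are assumptions, not taken from the original report.

SCHEMA_DDL = (
    "id string, ts long, part string, "
    "item_pairs array<struct<mapping_state:string,"
    "to_item_version:long,to_item_id:string>>"
)


def hudi_write_options(table_name, operation):
    """Build the Hudi datasource options for a single write."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": "id",        # assumed key column
        "hoodie.datasource.write.precombine.field": "ts",       # assumed ordering column
        "hoodie.datasource.write.partitionpath.field": "part",  # assumed partition column
        "hoodie.datasource.write.operation": operation,
    }


def sample_rows(ts):
    """One record whose `item_pairs` value is a list of two structs."""
    pairs = [("ACTIVE", ts, "item-a"), ("INACTIVE", ts, "item-b")]
    return [("1", ts, "p0", pairs)]


def run_repro(spark, base_path, table_name="item_pairs_repro"):
    """Write once, then upsert the same key so Hudi must read the original file."""
    first = spark.createDataFrame(sample_rows(1), SCHEMA_DDL)
    (first.write.format("hudi")
        .options(**hudi_write_options(table_name, "insert"))
        .mode("overwrite")
        .save(base_path))

    # Same record key with a newer `ts`: the upsert has to merge with the
    # Parquet file produced by the first write.
    second = spark.createDataFrame(sample_rows(2), SCHEMA_DDL)
    (second.write.format("hudi")
        .options(**hudi_write_options(table_name, "upsert"))
        .mode("append")
        .save(base_path))
```

Calling `run_repro(spark, "/tmp/hudi_item_pairs_repro")` from a `pyspark` shell should exercise the path in question. If the hunch is right, the second write is where the failure would show up.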

