Re: [I] [SUPPORT]Data loss occurs when using bulkinsert [hudi]

via GitHub Sun, 22 Oct 2023 03:08:12 -0700


blackcheckren commented on issue #9748:
URL: https://github.com/apache/hudi/issues/9748#issuecomment-1774052145


   @ad1happy2go Sorry for the late reply. I read the tables from MaxCompute into 
memory, sorted them by primary key, and wrote them to the Hudi table. I then 
read the Hudi table back from the file system and compared it with the original 
table. I could not find anything abnormal at the data level. I printed the 
records that exist in the original table but are missing from the Hudi table, 
and for each of them I also looked up the record in the original table whose 
primary key is one less. That data looked normal as well, which is confusing 
to me.
   I wonder whether the number of null values in the timestamp field is the 
cause, because in the data above I observe only 1 null value, whereas there is 
an even number of null values above and below it.
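
   The comparison described above can be sketched roughly as follows. This is 
a minimal illustration in plain Python over in-memory rows, not the actual job 
and not a Hudi or MaxCompute API. The function names and the column names `id` 
and `ts` are hypothetical, and the sketch assumes an integer primary key.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def find_missing(source_rows: List[Row], hudi_rows: List[Row], key: str = "id") -> List[Row]:
    """Return source rows whose primary key is absent from the Hudi read-back."""
    hudi_keys = {row[key] for row in hudi_rows}
    missing = [row for row in source_rows if row[key] not in hudi_keys]
    return sorted(missing, key=lambda row: row[key])


def previous_row(source_rows: List[Row], missing_row: Row, key: str = "id") -> Optional[Row]:
    """Return the source row whose primary key is one less, or None if there is none."""
    by_key = {row[key]: row for row in source_rows}
    return by_key.get(missing_row[key] - 1)


def count_null(rows: List[Row], field: str = "ts") -> int:
    """Count rows whose value for `field` is null (None or absent)."""
    return sum(1 for row in rows if row.get(field) is None)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    source = [
        {"id": 1, "ts": "2023-10-01 00:00:00"},
        {"id": 2, "ts": None},
        {"id": 3, "ts": "2023-10-03 00:00:00"},
    ]
    hudi = [row for row in source if row["id"] != 3]

    for row in find_missing(source, hudi):
        print("missing:", row, "previous:", previous_row(source, row))
    print("null timestamps in source:", count_null(source))
```

   Comparing the null counts of the source and of the read-back table is one 
way to check whether the lost records correlate with null timestamps.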


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
