Re: [I] Data Loss During Load Testing with METADATA Enabled and Autoscale Flink [hudi]

via GitHub Thu, 13 Feb 2025 01:36:17 -0800


maheshguptags commented on issue #12738:
URL: https://github.com/apache/hudi/issues/12738#issuecomment-2656023027


   Hi
   
   > do you mean that after the checkpoint fails, records ingested after that 
will be lost? 
   
   @cshuo Let's assume 10 million records are ingested into the job. While 
processing these records, if the job manager triggers the creation of a new 
Task Manager (TM) due to auto-scaling, or if a TM is manually removed (to test 
the scenario without auto-scaling), a checkpoint failure could occur, causing 
all the previously ingested data (the 10 million records) to be discarded.
   
   If new data (e.g., 1 million records) is ingested after the checkpoint 
failure, that new data is processed and written to Hudi successfully, 
provided the next checkpoint succeeds.
   
   To summarize:
   
   Ingest 10M records → checkpoint failure (due to TM change) → all data discarded
   Ingest 1M new records → checkpoint success → successfully ingested into 
Hudi (only 1M).
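   
   As a rough illustration of that sequence, here is a toy model of the symptom as I observe it. This is not Hudi or Flink code, and every name in it (`ToySink`, `ingest`, `checkpoint`, `run_scenario`) is hypothetical. It only assumes that records are buffered until a checkpoint and become visible when that checkpoint succeeds.

```python
# Toy model of the reported behaviour; NOT Hudi or Flink code.
# All names here (ToySink, ingest, checkpoint, run_scenario) are hypothetical.


class ToySink:
    def __init__(self):
        self.buffered = 0   # records received since the last checkpoint
        self.committed = 0  # records visible in the table

    def ingest(self, n):
        self.buffered += n

    def checkpoint(self, succeeded):
        if succeeded:
            # A successful checkpoint commits everything buffered so far.
            self.committed += self.buffered
        # Observed symptom: after a failed checkpoint the buffered records
        # are dropped instead of being carried into the next commit.
        self.buffered = 0


def run_scenario():
    sink = ToySink()
    sink.ingest(10_000_000)
    sink.checkpoint(succeeded=False)  # TM added or removed during processing
    sink.ingest(1_000_000)
    sink.checkpoint(succeeded=True)
    return sink.committed


if __name__ == "__main__":
    print(run_scenario())
```

   In this model only the 1M records from the second batch end up committed, which matches what I see in the table.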
   
   Thanks 
   Mahesh 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
