cshuo commented on issue #18424:
URL: https://github.com/apache/hudi/issues/18424#issuecomment-4160950068

   Nice analysis! 
   
   > Yes this is true. Basically disruptorQueue.close() closes disruptor thread 
after the disruptor buffer is fully consumed. Then the GCS upload thread finds 
that "writer thread is not alive" so it exits as well. Then the following 
step to flush parquet buffer to GCS failed. The proposed solution is actually 
not killing the disruptor thread. Is there a reason we want to kill and re-init 
that thread? If we kill that, the GCS upload thread will also exit, then we 
need to re-init 2 threads.
   
   The reason for closing the disruptor queue is to flush all the records to 
the consumer; see this discussion: 
https://github.com/apache/hudi/pull/17864#discussion_r2696411681
   
   Have you investigated how to flush all data in the Disruptor queue without 
closing it? If that is possible, the disruptor queue could actually be reused 
across checkpoints.
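
   To make the question concrete, here is a minimal sketch of a "flush without 
close" barrier. It uses only the JDK, not Hudi's queue or the LMAX Disruptor 
API; `FlushableQueue`, `publish`, `flush` and `isConsumerAlive` are hypothetical 
names chosen for illustration. The idea is to count published and consumed 
records and let `flush` wait until the two counters meet, so the consumer thread 
(and anything that checks whether it is alive) survives the checkpoint.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;
import java.util.function.Consumer;

/** Hypothetical stand-in for a queue whose consumer thread outlives each flush. */
final class FlushableQueue<T> implements AutoCloseable {
    private final BlockingQueue<T> buffer = new LinkedBlockingQueue<>();
    private final AtomicLong published = new AtomicLong(); // records handed to the queue
    private final AtomicLong consumed = new AtomicLong();  // records fully handled
    private final Thread consumerThread;
    private volatile boolean running = true;

    FlushableQueue(Consumer<T> handler) {
        consumerThread = new Thread(() -> {
            // Keep draining until close() is requested and the buffer is empty.
            while (running || !buffer.isEmpty()) {
                try {
                    T record = buffer.poll(10, TimeUnit.MILLISECONDS);
                    if (record != null) {
                        handler.accept(record);
                        consumed.incrementAndGet(); // count only after the handler returns
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }, "queue-consumer");
        consumerThread.setDaemon(true);
        consumerThread.start();
    }

    void publish(T record) {
        published.incrementAndGet();
        buffer.add(record);
    }

    /** Waits until everything published before this call is handled; the consumer keeps running. */
    void flush(long timeoutMillis) throws TimeoutException {
        long target = published.get();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (consumed.get() < target) {
            if (System.nanoTime() - deadline > 0) {
                throw new TimeoutException("consumer did not catch up to sequence " + target);
            }
            LockSupport.parkNanos(100_000L); // brief back-off instead of a hot spin
        }
    }

    boolean isConsumerAlive() {
        return consumerThread.isAlive();
    }

    /** Only called at final shutdown, not per checkpoint. */
    @Override
    public void close() throws InterruptedException {
        running = false;
        consumerThread.join();
    }
}
```

   In this sketch a checkpoint would call `flush(...)` and then flush the 
downstream buffer, while `close()` is reserved for task shutdown. Whether the 
real queue can offer an equivalent barrier (for example by comparing the 
producer cursor with the consumer's sequence) is an assumption that would need 
checking against the actual implementation.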


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.