cshuo commented on issue #18424: URL: https://github.com/apache/hudi/issues/18424#issuecomment-4160950068
Nice analysis!

> Yes this is true. Basically, disruptorQueue.close() stops the disruptor thread once the disruptor buffer is fully consumed. The GCS upload thread then finds that the "writer thread is not alive" and exits as well, so the following step that flushes the parquet buffer to GCS fails. The proposed solution is simply not to kill the disruptor thread. Is there a reason we want to kill and re-init that thread? If we kill it, the GCS upload thread also exits, and then we need to re-init 2 threads.

The disruptor queue is closed in order to flush all buffered records to the consumer; see this discussion: https://github.com/apache/hudi/pull/17864#discussion_r2696411681

Have you investigated how to flush all data in the Disruptor queue without closing it? If that is possible, the disruptor queue could be reused across checkpoints.
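To make the "flush without closing" question concrete, here is a minimal sketch of one possible approach: the producer enqueues a barrier marker and waits until the single consumer thread reaches it, at which point every earlier record has been handed to the sink. This is only an illustration built on `java.util.concurrent`; it does not use the LMAX Disruptor or Hudi's queue classes, and `FlushableQueue`, `flush()`, `put()`, and `isConsumerAlive()` are hypothetical names. Whether the same idea maps cleanly onto the Disruptor-based queue would still need to be checked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.function.Consumer;

/** Hypothetical stand-in for a reusable write buffer; NOT Hudi's or LMAX Disruptor's API. */
final class FlushableQueue<T> {
    /** Marker the consumer acknowledges once everything queued before it is processed. */
    private static final class Barrier {
        final CountDownLatch reached = new CountDownLatch(1);
    }

    private static final Object POISON = new Object(); // tells the consumer to stop

    private final BlockingQueue<Object> queue;
    private final Consumer<T> sink;
    private final Thread consumer;

    FlushableQueue(int capacity, Consumer<T> sink) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.sink = sink;
        this.consumer = new Thread(this::drainLoop, "queue-consumer");
        this.consumer.setDaemon(true);
        this.consumer.start();
    }

    /** Enqueue one record; blocks while the buffer is full. */
    void put(T record) {
        enqueue(record);
    }

    /** Block until every record enqueued before this call has reached the sink. */
    void flush() {
        Barrier barrier = new Barrier();
        enqueue(barrier);
        try {
            barrier.reached.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while flushing", e);
        }
    }

    /** Stop the consumer thread for good (end of job, not end of checkpoint). */
    void close() {
        enqueue(POISON);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isConsumerAlive() {
        return consumer.isAlive();
    }

    private void enqueue(Object item) {
        try {
            queue.put(item);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while enqueueing", e);
        }
    }

    // Single consumer, FIFO order: reaching a Barrier implies all earlier records were written.
    // Error handling is omitted: if the sink throws, this thread dies and flush() would hang.
    private void drainLoop() {
        try {
            while (true) {
                Object next = queue.take();
                if (next == POISON) {
                    return;
                } else if (next instanceof Barrier) {
                    ((Barrier) next).reached.countDown();
                } else {
                    @SuppressWarnings("unchecked")
                    T record = (T) next;
                    sink.accept(record);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        List<String> written = new ArrayList<>();
        FlushableQueue<String> q = new FlushableQueue<>(8, written::add);
        q.put("a");
        q.put("b");
        q.flush();                 // "checkpoint 1": buffer drained, thread kept
        q.put("c");
        q.flush();                 // "checkpoint 2": same consumer thread reused
        System.out.println(written + " alive=" + q.isConsumerAlive());
        q.close();
    }
}
```

With this shape, a downstream thread that checks whether the writer thread is alive would keep seeing it alive between checkpoints, so neither thread would need to be re-initialized. The cost is that `flush()` depends on the consumer making progress, so a real implementation would need a timeout or failure propagation.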
