cshuo commented on PR #18083:
URL: https://github.com/apache/hudi/pull/18083#issuecomment-3949854378

   @prashantwason thanks for the detailed explanation. 
   
   > Would you like me to add some basic benchmarking to quantify the latency 
improvements? Or do you have specific concerns about the approach that I should 
address first?
   
   Regarding append writes with buffer sorting, the repository already has two 
implementations: `AppendWriteFunctionWithBIMBufferSort` and 
`AppendWriteFunctionWithDisruptorBufferSort`. Both perform sorting and flushing 
asynchronously, aiming to mitigate spikes during the flush process. 
   
   This PR adopts a continuous sorting approach, which can indeed alleviate 
sorting spikes. However, the flush process involves not only sorting but also 
file-writing overhead, and continuous sorting spreads the sorting cost across 
every record write. I’d therefore like to understand how much performance gain 
this PR brings compared to the existing functions.
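   
   To make the trade-off concrete, here is a minimal sketch, not Hudi code. 
`SortCostSketch`, `SortOnFlushBuffer`, `ContinuousSortBuffer` and 
`CountingComparator` are hypothetical names for illustration only. The sketch 
counts comparator calls to show where the sorting work lands: entirely in 
`flush()` for a sort-on-flush buffer, and entirely in `add()` for a continuously 
sorted one. File-writing cost is not modeled, and the sorted insert here uses a 
plain `ArrayList`, which is an assumption rather than what the PR does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortCostSketch {

    /** Comparator that counts calls, so we can see when sorting work happens. */
    static final class CountingComparator implements Comparator<Integer> {
        long count = 0;

        @Override
        public int compare(Integer a, Integer b) {
            count++;
            return Integer.compare(a, b);
        }
    }

    /** Hypothetical buffer: appends unsorted, sorts everything at flush time. */
    static final class SortOnFlushBuffer {
        private final List<Integer> records = new ArrayList<>();
        private final Comparator<Integer> cmp;

        SortOnFlushBuffer(Comparator<Integer> cmp) {
            this.cmp = cmp;
        }

        void add(int record) {
            records.add(record); // no comparisons on the write path
        }

        List<Integer> flush() {
            records.sort(cmp); // all comparisons happen here
            List<Integer> out = new ArrayList<>(records);
            records.clear();
            return out;
        }
    }

    /** Hypothetical buffer: keeps records ordered on every add. */
    static final class ContinuousSortBuffer {
        private final List<Integer> records = new ArrayList<>();
        private final Comparator<Integer> cmp;

        ContinuousSortBuffer(Comparator<Integer> cmp) {
            this.cmp = cmp;
        }

        void add(int record) {
            Integer key = record;
            // binarySearch returns -(insertionPoint) - 1 when the key is absent
            int pos = Collections.binarySearch(records, key, cmp);
            int insertAt = pos >= 0 ? pos : -(pos + 1);
            records.add(insertAt, key); // comparisons are paid per record
        }

        List<Integer> flush() {
            List<Integer> out = new ArrayList<>(records); // already ordered
            records.clear();
            return out;
        }
    }

    public static void main(String[] args) {
        int[] input = {42, 7, 19, 3, 88, 7, 56, 21};
        CountingComparator flushCmp = new CountingComparator();
        CountingComparator contCmp = new CountingComparator();
        SortOnFlushBuffer onFlush = new SortOnFlushBuffer(flushCmp);
        ContinuousSortBuffer continuous = new ContinuousSortBuffer(contCmp);

        for (int r : input) {
            onFlush.add(r);
            continuous.add(r);
        }
        System.out.println("comparisons during add:   sort-on-flush="
                + flushCmp.count + ", continuous=" + contCmp.count);

        onFlush.flush();
        continuous.flush();
        System.out.println("comparisons after flush:  sort-on-flush="
                + flushCmp.count + ", continuous=" + contCmp.count);
    }
}
```

   Both buffers emit the same ordered output; only the phase that pays the 
comparison cost differs, which is why a benchmark against the existing 
functions would help.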
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

