TheodoreLx opened a new pull request, #3479:
URL: https://github.com/apache/celeborn/pull/3479

   ### What changes were proposed in this pull request?
   
   Adds a configuration option that makes the worker copy the body buffer of a 
PushData message into a newly allocated, exactly sized buffer before writing it. 
The copied buffer has 100% internal space utilization, which significantly 
improves the overall utilization of NettyMemory.
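
The copy step can be illustrated with a minimal sketch. It uses `java.nio.ByteBuffer` from the standard library instead of Netty's `ByteBuf` so that it runs on its own; the class and method names (`BufferCompaction`, `compact`, `utilization`) and the 64 KiB / 1 KiB sizes are hypothetical and not taken from this patch.

```java
import java.nio.ByteBuffer;

// Illustrative only: names and sizes are assumptions, not Celeborn code.
final class BufferCompaction {

    /** Copies only the readable bytes of src into a new buffer sized exactly to fit. */
    static ByteBuffer compact(ByteBuffer src) {
        ByteBuffer exact = ByteBuffer.allocate(src.remaining());
        // duplicate() shares content but has its own position, so src is left untouched.
        exact.put(src.duplicate());
        exact.flip();
        return exact;
    }

    /** Fraction of the buffer's capacity that holds readable data. */
    static double utilization(ByteBuffer buf) {
        return (double) buf.remaining() / buf.capacity();
    }

    public static void main(String[] args) {
        // Simulate a receive buffer that was sized far larger than the data read into it.
        ByteBuffer oversized = ByteBuffer.allocate(64 * 1024);
        oversized.put(new byte[1024]);
        oversized.flip();

        ByteBuffer exact = compact(oversized);
        System.out.println("before copy: " + utilization(oversized));
        System.out.println("after copy:  " + utilization(exact));
    }
}
```

With Netty's `ByteBuf`, the same idea would presumably mean allocating a buffer of `readableBytes()` size, copying the body into it, and releasing the original so its oversized backing memory can be reclaimed. The trade-off is one extra memory copy per push in exchange for lower retained memory.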
   
   ### Why are the changes needed?
   In the worker, Netty uses AdaptiveRecvByteBufAllocator to decide how large a 
buffer to allocate in advance when reading data from the socket. However, 
in certain network environments, the buffer size predicted and allocated by 
AdaptiveRecvByteBufAllocator can differ significantly from the amount of data 
actually read from the socket. A large buffer may be allocated while only a 
small amount of data is read into it, which leads to very low overall memory 
utilization in the worker. A clear symptom is that NettyMemory is large while 
DiskBuffer is very small. As a result, the worker may receive only a small 
amount of data yet quickly enter the Pause state because of excessive 
NettyMemory usage.
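
To make the effect concrete, here is a rough back-of-the-envelope model. It assumes every push retains one buffer of a fixed size until a pause threshold is reached, and it ignores memory freed by flushing to disk. The class name `PauseModel`, the method `pushesUntilPause`, and all sizes are hypothetical and chosen only for illustration.

```java
// Illustrative only: a simplified model, not the worker's actual memory accounting.
final class PauseModel {

    /** Number of pushes needed before retained memory reaches the threshold (ceiling division). */
    static long pushesUntilPause(long pauseThresholdBytes, long retainedBytesPerPush) {
        return (pauseThresholdBytes + retainedBytesPerPush - 1) / retainedBytesPerPush;
    }

    public static void main(String[] args) {
        long threshold = 1L << 20;   // assumed 1 MiB pause threshold
        long allocated = 64L << 10;  // assumed 64 KiB buffer allocated per read
        long payload = 1L << 10;     // assumed 1 KiB of data actually read

        long withoutCopy = pushesUntilPause(threshold, allocated);
        long withCopy = pushesUntilPause(threshold, payload);

        System.out.println("pushes before pause, oversized buffers retained: " + withoutCopy);
        System.out.println("payload buffered at that point (bytes): " + withoutCopy * payload);
        System.out.println("pushes before pause, exactly sized copies retained: " + withCopy);
        System.out.println("payload buffered at that point (bytes): " + withCopy * payload);
    }
}
```

Under these assumed numbers, retaining the oversized buffers reaches the threshold after far fewer pushes, and with far less real data buffered, than retaining exactly sized copies.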
   
   ### Does this PR introduce _any_ user-facing change?
   
   no
   
   ### How was this patch tested?
   
   cluster test
   
   ### Performance Test
   <img width="1697" height="700" alt="image" 
src="https://github.com/user-attachments/assets/56495d08-6da7-4d43-8e8a-da87a33ccf90"
 />
   
   


