TheodoreLx opened a new pull request, #3479: URL: https://github.com/apache/celeborn/pull/3479
### What changes were proposed in this pull request?

Adds a configuration item that makes the worker copy the body buffer of a PushData message into a newly allocated buffer before writing. The new buffer is sized to the data it holds, so its internal space is 100% utilized, which significantly improves the overall utilization of NettyMemory.

### Why are the changes needed?

In the worker, Netty uses AdaptiveRecvByteBufAllocator to decide in advance how large a buffer to allocate when reading data from the socket. In certain network environments, the buffer size predicted and allocated by AdaptiveRecvByteBufAllocator can differ significantly from the amount of data actually read from the socket. A large buffer is allocated but only a small amount of data is read into it, which leads to very low overall memory utilization in the worker.

A clear symptom is that NettyMemory is large while DiskBuffer is very small. The worker may have received only a small amount of data, yet it quickly enters the Pause state because of excessive NettyMemory usage.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Cluster test.

### Performance Test

<img width="1697" height="700" alt="image" src="https://github.com/user-attachments/assets/56495d08-6da7-4d43-8e8a-da87a33ccf90" />
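
To illustrate the idea, here is a minimal sketch. It is not the code in this patch: it uses `java.nio.ByteBuffer` as a stand-in for Netty's buffer types, and the class and method names (`CompactBufferSketch`, `utilization`, `compact`) are hypothetical. It shows how copying only the readable bytes into an exactly sized buffer raises utilization from a small fraction to 100%, so the oversized original can be released.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; the real change operates on Netty buffers in the worker.
public class CompactBufferSketch {

    /** Fraction of the buffer's capacity that is occupied by readable bytes. */
    public static double utilization(ByteBuffer buf) {
        return buf.capacity() == 0 ? 1.0 : (double) buf.remaining() / buf.capacity();
    }

    /** Copies only the readable bytes into a new buffer sized exactly to hold them. */
    public static ByteBuffer compact(ByteBuffer src) {
        ByteBuffer exact = ByteBuffer.allocate(src.remaining());
        exact.put(src.duplicate()); // duplicate() leaves src's position and limit untouched
        exact.flip();               // switch the copy from writing to reading
        return exact;
    }

    public static void main(String[] args) {
        // Simulate an over-allocated receive buffer: 64 KiB reserved, 1 KiB actually read.
        ByteBuffer received = ByteBuffer.allocate(64 * 1024);
        received.put(new byte[1024]);
        received.flip();

        ByteBuffer compacted = compact(received);

        System.out.println("utilization before copy: " + utilization(received));
        System.out.println("utilization after copy:  " + utilization(compacted));
    }
}
```

The trade-off is one extra memory copy per pushed body in exchange for holding only as much memory as the data needs, which is presumably why the behavior sits behind a configuration item.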
