fxli4 opened a new issue, #15732: URL: https://github.com/apache/dubbo/issues/15732
### Pre-check

- [x] I am sure that all the content I provide is in English.

### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/dubbo/issues?q=is%3Aissue) and found no similar issues.

### Apache Dubbo Component

Java SDK (apache/dubbo)

### Dubbo Version

Dubbo 3.3.6, JDK 21

### Steps to reproduce this issue

**Root Cause:** The issue occurs in the BatchExecutorQueue implementation, where:

- Multiple worker threads produce messages into a ConcurrentLinkedQueue.
- A single Netty EventLoop thread consumes messages by copying all elements from the ConcurrentLinkedQueue into a LinkedList in one operation.

**Problem:** In extreme scenarios, when multiple worker threads generate large volumes of messages simultaneously, the bulk copy operation causes:

1. Long polling/blocking: the entire copy process blocks the EventLoop thread.
2. Availability impact: extended blocking reduces system responsiveness.
3. Network I/O disruption: because the copy runs on Netty's EventLoop thread, the blocking prevents timely reading of incoming messages from other services, creating a cascading failure effect.

**SourceCode:** [BatchExecutorQueue.run()](https://github.com/apache/dubbo/blob/dea02069082226b4677579474db4d156dbb8fb0e/dubbo-common/src/main/java/org/apache/dubbo/common/BatchExecutorQueue.java#L53)

<img width="576" height="867" alt="Image" src="https://github.com/user-attachments/assets/fa665eb1-a501-4c82-8d65-af3f3566c7a4" />

### What you expected to happen

**Impact:**

- EventLoop thread starvation
- Delayed message processing
- Potential service unavailability during high load
- Impaired ability to handle incoming requests from other services

The core issue is that the synchronous bulk copy never yields the EventLoop thread, which violates the non-blocking principle that an EventLoop depends on.
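To make the unbounded-copy problem concrete, here is a minimal sketch of a bounded drain: each pass moves at most a fixed number of elements out of the shared queue, so a single pass stays short even while producers keep adding. The class `BoundedDrainSketch`, the method `drainUpTo`, and the `maxItems` parameter are hypothetical names chosen for illustration; this is neither Dubbo's `BatchExecutorQueue` code nor the fix proposed in this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical illustration only; not taken from Dubbo.
final class BoundedDrainSketch {

    /**
     * Moves at most maxItems elements from the queue into a new list.
     * The bound guarantees the loop ends even if producers add
     * elements faster than this thread can poll them.
     */
    static <T> List<T> drainUpTo(Queue<T> queue, int maxItems) {
        if (maxItems <= 0) {
            throw new IllegalArgumentException("maxItems must be positive");
        }
        List<T> batch = new ArrayList<>();
        T item;
        while (batch.size() < maxItems && (item = queue.poll()) != null) {
            batch.add(item);
        }
        return batch;
    }

    public static void main(String[] args) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < 10; i++) {
            queue.add(i);
        }
        List<Integer> first = drainUpTo(queue, 4);
        // prints "[0, 1, 2, 3] remaining=6"
        System.out.println(first + " remaining=" + queue.size());
    }
}
```

Under this assumption, the consumer would process one batch, and if the queue is still non-empty, resubmit itself to the executor instead of looping, which gives other EventLoop tasks (such as socket reads) a chance to run between passes.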
### Anything else

I ran a performance test to compare three implementation strategies for [BatchExecutorQueue.run()](https://github.com/apache/dubbo/blob/dea02069082226b4677579474db4d156dbb8fb0e/dubbo-common/src/main/java/org/apache/dubbo/common/BatchExecutorQueue.java#L53).

1. Old implementation

   <img width="406" height="1010" alt="Image" src="https://github.com/user-attachments/assets/ab4aa053-dbca-4fbe-a88c-b3ec342db955" />

2. Simple implementation

   <img width="434" height="830" alt="Image" src="https://github.com/user-attachments/assets/af2230db-9e35-4590-b366-8c65d256eabc" />

3. Two-queue swap implementation

   <img width="643" height="1040" alt="Image" src="https://github.com/user-attachments/assets/3bbbe035-54b3-4d22-984d-af48df830f65" />

**Test Case**

<img width="1224" height="2045" alt="Image" src="https://github.com/user-attachments/assets/2d8cab97-2751-4423-9e73-5024ea2e3aa6" />

**Test Results**

Each test flushes 20000000 messages. The columns use the log's own labels (`old`, `simple`, `new`): "first second" is the logged "message total at first second", and "spend" is the logged time to flush all messages, with the unit as printed. "long loop" marks the runs where the log printed "old implement happened long loop".

| Test | old: first second | old: spend | simple: first second | simple: spend | new: first second | new: spend |
|---|---|---|---|---|---|---|
| 1 | 191046 | 8634s | 5685992 | 3255s | 6114666 | 3065s |
| 2 | 5674259 | 3683s | 6138832 | 3248s | 6770830 | 2601s |
| 3 | 0 (long loop) | 7799s | 2525975 | 5145s | 7497753 | 2492s |
| 4 | 4634112 | 4768s | 3782242 | 4839s | 8016594 | 2528s |
| 5 | 0 (long loop) | 8214s | 3491755 | 4466s | 7299851 | 2924s |
| 6 | 0 (long loop) | 8783s | 2149255 | 4661s | 6844237 | 3036s |
| 7 | 0 (long loop) | 5546s | 2256523 | 5691s | 8699966 | 2311s |
| 8 | 5715617 | 4799s | 2465812 | 5467s | 7249972 | 2795s |
| 9 | 4161615 | 6156s | 2383010 | 5085s | 7212382 | 2835s |
| 10 | 6002999 | 5510s | 5626736 | 3273s | 7298384 | 2796s |

Averages over the 10 tests (logged as "message total average"):

- old: 6428.25s
- simple: 4523.875s
- new: 2750.875s

### Are you willing to submit a pull request to fix on your own?

- [x] Yes I am willing to submit a pull request on my own!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
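A note on the averages reported in the test results: they do not equal the plain mean of the ten `spend` values (for `old` the plain mean would be 6389.2), but they are reproduced exactly by dropping the single smallest and the single largest value per implementation and averaging the remaining eight. This is an inference from the numbers, not something the issue states. The sketch below checks it; `TrimmedMeanCheck` and `trimmedMean` are hypothetical names, and the arrays are the `spend` values copied from the log.

```java
import java.util.Arrays;

// Hypothetical helper for checking the reported averages.
final class TrimmedMeanCheck {

    /** Mean of the values after removing one minimum and one maximum. */
    static double trimmedMean(long[] values) {
        if (values.length < 3) {
            throw new IllegalArgumentException("need at least 3 values");
        }
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        long sum = 0;
        // Skip index 0 (minimum) and the last index (maximum).
        for (int i = 1; i < sorted.length - 1; i++) {
            sum += sorted[i];
        }
        return (double) sum / (sorted.length - 2);
    }

    public static void main(String[] args) {
        long[] oldSpend = {8634, 3683, 7799, 4768, 8214, 8783, 5546, 4799, 6156, 5510};
        long[] simpleSpend = {3255, 3248, 5145, 4839, 4466, 4661, 5691, 5467, 5085, 3273};
        long[] newSpend = {3065, 2601, 2492, 2528, 2924, 3036, 2311, 2795, 2835, 2796};

        System.out.println(trimmedMean(oldSpend));    // prints 6428.25
        System.out.println(trimmedMean(simpleSpend)); // prints 4523.875
        System.out.println(trimmedMean(newSpend));    // prints 2750.875
    }
}
```

On these figures, the `new` implementation's trimmed average is roughly 43% of the `old` one (2750.875 / 6428.25).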
