fxli4 opened a new issue, #15732:
URL: https://github.com/apache/dubbo/issues/15732

   ### Pre-check
   
   - [x] I am sure that all the content I provide is in English.
   
   
   ### Search before asking
   
   - [x] I have searched the 
[issues](https://github.com/apache/dubbo/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Apache Dubbo Component
   
   Java SDK (apache/dubbo)
   
   ### Dubbo Version
   
   Dubbo 3.3.6 JDK21
   
   ### Steps to reproduce this issue
   
   **Root Cause:**
   
   The issue occurs in a BatchExecutorQueue implementation where:
   
   - Multiple worker threads produce messages into a ConcurrentLinkedQueue
   
   - A single netty EventLoop thread consumes messages by copying all elements 
from ConcurrentLinkedQueue to a LinkedList in one operation
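
To make the pattern concrete, here is a minimal sketch of the drain-everything consumer described above. It is not the Dubbo source: the class name `DrainAllQueue` and its methods are hypothetical, and the `BatchExecutorQueue.run()` code linked in this issue remains the authoritative version.

```java
import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

// Hypothetical illustration of the pattern, not the Dubbo implementation.
final class DrainAllQueue<T> {
    // Worker threads add to this queue concurrently.
    private final Queue<T> queue = new ConcurrentLinkedQueue<>();

    void enqueue(T message) {
        queue.add(message);
    }

    // Intended to run on the single consumer thread (the EventLoop in this issue).
    // The copy loop has no upper bound: it ends only when poll() observes an
    // empty queue, so steady producers can keep the consumer copying for a long time.
    int drainAll(Consumer<T> handler) {
        Queue<T> snapshot = new LinkedList<>();
        T item;
        while ((item = queue.poll()) != null) {
            snapshot.add(item);
        }
        // Only after the copy finishes are the messages actually handled.
        int handled = 0;
        while ((item = snapshot.poll()) != null) {
            handler.accept(item);
            handled++;
        }
        return handled;
    }
}
```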
   
   **Problem:**
   In extreme cases, when multiple worker threads produce large volumes of 
messages simultaneously, the bulk copy operation causes:
   
   1. Prolonged blocking: the EventLoop thread is occupied for the entire copy
   
   2. Availability impact: Extended blocking reduces system responsiveness
   
   3. Network I/O disruption: Since this uses Netty's EventLoop thread, the 
blocking prevents timely reading of incoming messages from other services, 
creating a cascading failure effect
   
   
   **SourceCode:** 
[BatchExecutorQueue.run()](https://github.com/apache/dubbo/blob/dea02069082226b4677579474db4d156dbb8fb0e/dubbo-common/src/main/java/org/apache/dubbo/common/BatchExecutorQueue.java#L53)
   
   <img width="576" height="867" alt="Image" 
src="https://github.com/user-attachments/assets/fa665eb1-a501-4c82-8d65-af3f3566c7a4"
 />
   
   
   
   
   ### What you expected to happen
   
   **Impact:**
   
   - EventLoop thread starvation
   
   - Delayed message processing
   
   - Potential service unavailability during high load
   
   - Impaired ability to handle incoming requests from other services
   
   The core issue is the synchronous bulk copy operation that doesn't yield the 
EventLoop thread, violating the non-blocking principle essential for proper 
EventLoop performance.
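
One way to keep the consumer from monopolizing the EventLoop is to cap the work done per invocation and let the caller reschedule when items remain. The sketch below illustrates that idea only. `BoundedDrainQueue`, `maxPerRun`, and the boolean return convention are assumptions made for illustration; they are not taken from the screenshots in this issue or from Dubbo.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

// Hypothetical sketch of a bounded drain, not the Dubbo implementation.
final class BoundedDrainQueue<T> {
    private final Queue<T> queue = new ConcurrentLinkedQueue<>();
    private final int maxPerRun;

    BoundedDrainQueue(int maxPerRun) {
        if (maxPerRun <= 0) {
            throw new IllegalArgumentException("maxPerRun must be positive");
        }
        this.maxPerRun = maxPerRun;
    }

    void enqueue(T message) {
        queue.add(message);
    }

    // Handles at most maxPerRun items, then returns so the calling thread
    // can do other work. Returns true if items may remain, in which case
    // the caller should schedule another run.
    boolean drainSome(Consumer<T> handler) {
        for (int i = 0; i < maxPerRun; i++) {
            T item = queue.poll();
            if (item == null) {
                return false; // queue was empty at this point
            }
            handler.accept(item);
        }
        return !queue.isEmpty();
    }
}
```

With this shape, the time spent per run is bounded by `maxPerRun` regardless of how fast producers enqueue. On an executor-backed event loop, the caller would resubmit the drain task whenever `drainSome` returns true.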
   
   ### Anything else
   
   I conducted a comprehensive performance test to evaluate and compare three 
distinct implementation strategies for 
[BatchExecutorQueue.run()](https://github.com/apache/dubbo/blob/dea02069082226b4677579474db4d156dbb8fb0e/dubbo-common/src/main/java/org/apache/dubbo/common/BatchExecutorQueue.java#L53).
 
   
   1. Old Implementation
   <img width="406" height="1010" alt="Image" 
src="https://github.com/user-attachments/assets/ab4aa053-dbca-4fbe-a88c-b3ec342db955"
 />
   
   2. Simple Implementation
   <img width="434" height="830" alt="Image" 
src="https://github.com/user-attachments/assets/af2230db-9e35-4590-b366-8c65d256eabc"
 />
   
   3. Two-queue swap Implementation
   <img width="643" height="1040" alt="Image" 
src="https://github.com/user-attachments/assets/3bbbe035-54b3-4d22-984d-af48df830f65"
 />
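
Since the three implementations are only available as screenshots, here is a rough sketch of what a two-queue swap can look like in general. It is one possible reading of the idea, not a transcription of the screenshot: `SwapQueue`, `swapAndDrain`, and the use of a lock with two `ArrayDeque` buffers are all assumptions, and the implementation proposed in this issue may differ.

```java
import java.util.ArrayDeque;
import java.util.function.Consumer;

// Hypothetical double-buffer sketch, not the implementation from the screenshot.
final class SwapQueue<T> {
    private final Object lock = new Object();
    // Producers append here while holding the lock.
    private ArrayDeque<T> writeSide = new ArrayDeque<>();
    // Spare buffer; the consumer drains it outside the lock after a swap.
    private ArrayDeque<T> readSide = new ArrayDeque<>();

    void enqueue(T message) {
        synchronized (lock) {
            writeSide.add(message);
        }
    }

    // Intended to be called from a single consumer thread.
    // The swap is O(1): no element is copied, only the two references change.
    int swapAndDrain(Consumer<T> handler) {
        ArrayDeque<T> batch;
        synchronized (lock) {
            batch = writeSide;
            writeSide = readSide; // empty buffer left over from the previous run
            readSide = batch;
        }
        // Producers now write to the other buffer, so this batch cannot grow.
        int handled = 0;
        T item;
        while ((item = batch.poll()) != null) {
            handler.accept(item);
            handled++;
        }
        return handled;
    }
}
```

The point of the swap is that the batch size is fixed at the moment of the exchange, so messages enqueued during processing wait for the next run instead of extending the current one.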
   
   
   
   **Test Case**
   
   <img width="1224" height="2045" alt="Image" 
src="https://github.com/user-attachments/assets/2d8cab97-2751-4423-9e73-5024ea2e3aa6"
 />
   
   **Test Results**
   ---------------- current 1 test start ---------------
   old implement message total at first second: 191046
   old implement message test ended, flush 20000000 message spend: 8634s
   simple implement message total at first second: 5685992
   simple implement message test ended, flush 20000000 message spend: 3255s
   new implement message total at first second: 6114666
   new implement message test ended, flush 20000000 message spend: 3065s
   ---------------- current 1 test end ---------------
   
   ---------------- current 2 test start ---------------
   old implement message total at first second: 5674259
   old implement message test ended, flush 20000000 message spend: 3683s
   simple implement message total at first second: 6138832
   simple implement message test ended, flush 20000000 message spend: 3248s
   new implement message total at first second: 6770830
   new implement message test ended, flush 20000000 message spend: 2601s
   ---------------- current 2 test end ---------------
   
   ---------------- current 3 test start ---------------
   old implement happened long loop
   old implement message total at first second: 0
   old implement message test ended, flush 20000000 message spend: 7799s
   simple implement message total at first second: 2525975
   simple implement message test ended, flush 20000000 message spend: 5145s
   new implement message total at first second: 7497753
   new implement message test ended, flush 20000000 message spend: 2492s
   ---------------- current 3 test end ---------------
   
   ---------------- current 4 test start ---------------
   old implement message total at first second: 4634112
   old implement message test ended, flush 20000000 message spend: 4768s
   simple implement message total at first second: 3782242
   simple implement message test ended, flush 20000000 message spend: 4839s
   new implement message total at first second: 8016594
   new implement message test ended, flush 20000000 message spend: 2528s
   ---------------- current 4 test end ---------------
   
   ---------------- current 5 test start ---------------
   old implement happened long loop
   old implement message total at first second: 0
   old implement message test ended, flush 20000000 message spend: 8214s
   simple implement message total at first second: 3491755
   simple implement message test ended, flush 20000000 message spend: 4466s
   new implement message total at first second: 7299851
   new implement message test ended, flush 20000000 message spend: 2924s
   ---------------- current 5 test end ---------------
   
   ---------------- current 6 test start ---------------
   old implement happened long loop
   old implement message total at first second: 0
   old implement message test ended, flush 20000000 message spend: 8783s
   simple implement message total at first second: 2149255
   simple implement message test ended, flush 20000000 message spend: 4661s
   new implement message total at first second: 6844237
   new implement message test ended, flush 20000000 message spend: 3036s
   ---------------- current 6 test end ---------------
   
   ---------------- current 7 test start ---------------
   old implement happened long loop
   old implement message total at first second: 0
   old implement message test ended, flush 20000000 message spend: 5546s
   simple implement message total at first second: 2256523
   simple implement message test ended, flush 20000000 message spend: 5691s
   new implement message total at first second: 8699966
   new implement message test ended, flush 20000000 message spend: 2311s
   ---------------- current 7 test end ---------------
   
   ---------------- current 8 test start ---------------
   old implement message total at first second: 5715617
   old implement message test ended, flush 20000000 message spend: 4799s
   simple implement message total at first second: 2465812
   simple implement message test ended, flush 20000000 message spend: 5467s
   new implement message total at first second: 7249972
   new implement message test ended, flush 20000000 message spend: 2795s
   ---------------- current 8 test end ---------------
   
   ---------------- current 9 test start ---------------
   old implement message total at first second: 4161615
   old implement message test ended, flush 20000000 message spend: 6156s
   simple implement message total at first second: 2383010
   simple implement message test ended, flush 20000000 message spend: 5085s
   new implement message total at first second: 7212382
   new implement message test ended, flush 20000000 message spend: 2835s
   ---------------- current 9 test end ---------------
   
   ---------------- current 10 test start ---------------
   old implement message total at first second: 6002999
   old implement message test ended, flush 20000000 message spend: 5510s
   simple implement message total at first second: 5626736
   simple implement message test ended, flush 20000000 message spend: 3273s
   new implement message total at first second: 7298384
   new implement message test ended, flush 20000000 message spend: 2796s
   ---------------- current 10 test end ---------------
   
   message total average for old's 10 test: 6428.25s
   message total average for simple's 10 test: 4523.875s
   message total average for new's 10 test: 2750.875s
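
A note on how the averages relate to the ten runs: they are not plain means. The reported figures match a trimmed mean that drops the fastest and the slowest run of each implementation and averages the remaining eight. The sketch below recomputes them from the flush times printed above. The class and method names are hypothetical, and the original test harness may arrive at the same numbers differently.

```java
import java.util.Arrays;

// Hypothetical helper that reproduces the reported averages from the logged flush times.
final class TrimmedMean {
    // Mean of the values after discarding one minimum and one maximum.
    static double trimmedMean(long[] values) {
        if (values.length < 3) {
            throw new IllegalArgumentException("need at least 3 values");
        }
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        long sum = 0;
        for (int i = 1; i < sorted.length - 1; i++) {
            sum += sorted[i];
        }
        return (double) sum / (sorted.length - 2);
    }

    public static void main(String[] args) {
        // Flush times for tests 1 to 10, copied from the log above.
        long[] oldImpl = {8634, 3683, 7799, 4768, 8214, 8783, 5546, 4799, 6156, 5510};
        long[] simpleImpl = {3255, 3248, 5145, 4839, 4466, 4661, 5691, 5467, 5085, 3273};
        long[] newImpl = {3065, 2601, 2492, 2528, 2924, 3036, 2311, 2795, 2835, 2796};
        System.out.println(trimmedMean(oldImpl));    // 6428.25
        System.out.println(trimmedMean(simpleImpl)); // 4523.875
        System.out.println(trimmedMean(newImpl));    // 2750.875
    }
}
```

For comparison, the plain means over all ten runs are 6389.2, 4513.0, and 2738.3, so the ranking of the three implementations is the same either way.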
   
   ### Are you willing to submit a pull request to fix on your own?
   
   - [x] Yes I am willing to submit a pull request on my own!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   



