wu-sheng opened a new pull request, #13710:
URL: https://github.com/apache/skywalking/pull/13710

   ### Add QueueUsageBenchmark to validate BatchQueueStats metrics under realistic backpressure
   
   - [x] Tests(including UT, IT, E2E) are added to verify the new feature.
   
   #### What
   
   Add `QueueUsageBenchmark`, a separate benchmark that validates `BatchQueueStats` usage metrics
   under realistic backpressure by adding a simulated per-item CPU cost (busy-spin) in consumers.
   Also add usage sampling to the existing `BatchQueueBenchmark`.
   
   Unlike the existing `BatchQueueBenchmark`, which uses no-op consumers (so the queue stays near 0%),
   this benchmark creates genuine backpressure: the queue fills up and the usage percentage
   becomes meaningful, which validates the `metrics_aggregation_queue_used_percentage` SO11Y metric.
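
For readers unfamiliar with the technique, a simulated per-item CPU cost can be sketched as a busy-spin loop on `System.nanoTime()`. This is a minimal illustration, not the code in this PR: the class `BusySpin` and the method `spinFor` are hypothetical names, and the real benchmark may calibrate or implement the spin differently.

```java
/** Hypothetical helper (not from the PR): burns roughly costNanos of CPU per item. */
final class BusySpin {
    private BusySpin() {
    }

    /**
     * Spins on the calling thread until at least costNanos have elapsed.
     * Returns the elapsed time actually measured, in nanoseconds.
     */
    static long spinFor(long costNanos) {
        final long start = System.nanoTime();
        long elapsed;
        while ((elapsed = System.nanoTime() - start) < costNanos) {
            // Hint to the CPU that this is a spin loop (available since JDK 9).
            Thread.onSpinWait();
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Simulate the 1 μs/item cost used in the heavy-load scenarios.
        long elapsed = spinFor(1_000);
        System.out.println("spun for " + elapsed + " ns");
    }
}
```

A spin keeps the consumer thread on-CPU, unlike `Thread.sleep`, whose granularity is far coarser than the 200 ns to 1 μs costs used here. The actual time burned is always at least the requested cost and may overshoot by the resolution of `nanoTime`.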
   
   #### Test scenarios
   
   | Test | Consumer cost | Strategy | Types |
   |------|--------------|----------|-------|
   | `usageUnderLightLoad` | 200 ns/item | IF_POSSIBLE | 500 |
   | `usageUnderMediumLoad` | 500 ns/item | IF_POSSIBLE | 500 |
   | `usageUnderHeavyLoad` | 1 μs/item | IF_POSSIBLE | 500 |
   | `usageUnderHeavyLoadBlocking` | 1 μs/item | BLOCKING | 500 |
   | `usageUnderMediumLoad1000Types` | 500 ns/item | IF_POSSIBLE | 1000 |
   | `usageUnderHeavyLoadBlocking1000Types` | 1 μs/item | BLOCKING | 1000 |
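
The two values in the Strategy column differ in what a producer does when a partition is full: `IF_POSSIBLE` sheds the item, while `BLOCKING` makes the producer wait. The sketch below illustrates that difference with the JDK's `ArrayBlockingQueue` (`offer` versus `put`) as an analogy only. It does not use the `BatchQueue` API, and the class and method names are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Analogy only: JDK queue calls standing in for the two buffer strategies. */
final class StrategySketch {
    private StrategySketch() {
    }

    /** Drop-when-full producer: returns how many items were rejected. */
    static int produceIfPossible(BlockingQueue<Integer> queue, int items) {
        int dropped = 0;
        for (int i = 0; i < items; i++) {
            if (!queue.offer(i)) { // offer() returns false immediately when full
                dropped++;
            }
        }
        return dropped;
    }

    /** Blocking producer with a draining consumer: returns how many items were consumed. */
    static int produceBlocking(int capacity, int items) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int[] consumed = {0};
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    queue.take();
                    consumed[0]++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            for (int i = 0; i < items; i++) {
                queue.put(i); // put() waits for free space instead of dropping
            }
            consumer.join(); // join() makes the consumer's count visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return consumed[0];
    }

    public static void main(String[] args) {
        // No consumer: a queue of 4 accepts 4 of 10 items and rejects the rest.
        int dropped = produceIfPossible(new ArrayBlockingQueue<>(4), 10);
        System.out.println("IF_POSSIBLE-like dropped: " + dropped); // prints 6
        System.out.println("BLOCKING-like consumed:   " + produceBlocking(4, 10)); // prints 10
    }
}
```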
   
   #### Benchmark results (8 threads, 32 producers, JDK 25)
   
   ```
   === Queue Usage Benchmark: light-load ===
     Types:           500     Partitions: 350     BufferSize: 50000
     Strategy:        IF_POSSIBLE     Consumer cost: 200 ns/item
     Produced:        76,838,592      Consumed:      62,417,115
     Consume rate:    12,483,423 items/sec
     Drop rate:       18.77%
     Total usage:     min=45.6%, avg=83.1%, max=100.0%
     Top partition:   avg=100.0%, max=100.0%
   
   === Queue Usage Benchmark: medium-load ===
     Types:           500     Partitions: 350     BufferSize: 50000
     Strategy:        IF_POSSIBLE     Consumer cost: 500 ns/item
     Produced:        58,351,493      Consumed:      38,301,507
     Consume rate:    7,643,486 items/sec
     Drop rate:       34.36%
     Total usage:     min=65.2%, avg=88.1%, max=100.0%
     Top partition:   avg=100.0%, max=100.0%
   
   === Queue Usage Benchmark: heavy-load ===
     Types:           500     Partitions: 350     BufferSize: 50000
     Strategy:        IF_POSSIBLE     Consumer cost: 1000 ns/item
     Produced:        46,145,961      Consumed:      24,945,951
     Consume rate:    4,989,190 items/sec
     Drop rate:       45.94%
     Total usage:     min=42.1%, avg=91.4%, max=100.0%
     Top partition:   avg=98.7%, max=100.0%
   
   === Queue Usage Benchmark: heavy-load-blocking ===
     Types:           500     Partitions: 350     BufferSize: 50000
     Strategy:        BLOCKING        Consumer cost: 1000 ns/item
     Produced:        38,724,600      Consumed:      31,502,928
     Consume rate:    6,095,768 items/sec
     Drop rate:       18.65%
     Total usage:     min=30.7%, avg=40.3%, max=52.8%
     Top partition:   avg=95.2%, max=100.0%
   
   === Queue Usage Benchmark: medium-load-1000t ===
     Types:           1000    Partitions: 600     BufferSize: 50000
     Strategy:        IF_POSSIBLE     Consumer cost: 500 ns/item
     Produced:        72,106,253      Consumed:      38,796,897
     Consume rate:    7,759,379 items/sec
     Drop rate:       46.19%
     Total usage:     min=62.0%, avg=92.0%, max=100.0%
     Top partition:   avg=100.0%, max=100.0%
   
   === Queue Usage Benchmark: heavy-load-blocking-1000t ===
     Types:           1000    Partitions: 600     BufferSize: 50000
     Strategy:        BLOCKING        Consumer cost: 1000 ns/item
     Produced:        51,913,900      Consumed:      32,526,678
     Consume rate:    5,788,695 items/sec
     Drop rate:       37.34%
     Total usage:     min=22.2%, avg=55.0%, max=67.1%
     Top partition:   avg=94.6%, max=100.0%
   ```
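
The reported drop rates are numerically consistent with `(produced - consumed) / produced`. That relationship is inferred from the figures above, not from the benchmark source, so the helper below (hypothetical name `DropRate.percent`) is only a way to cross-check the output.

```java
import java.util.Locale;

/** Hypothetical cross-check of the "Drop rate" lines in the benchmark output. */
final class DropRate {
    private DropRate() {
    }

    /** Share of produced items that were not consumed, as a percentage. */
    static double percent(long produced, long consumed) {
        if (produced <= 0) {
            return 0.0; // nothing produced, nothing to drop
        }
        return 100.0 * (produced - consumed) / produced;
    }

    public static void main(String[] args) {
        // light-load: Produced 76,838,592, Consumed 62,417,115
        System.out.println(String.format(Locale.ROOT, "%.2f%%",
            percent(76_838_592L, 62_417_115L))); // prints 18.77%
        // heavy-load-blocking: Produced 38,724,600, Consumed 31,502,928
        System.out.println(String.format(Locale.ROOT, "%.2f%%",
            percent(38_724_600L, 31_502_928L))); // prints 18.65%
    }
}
```

If this is indeed how the figure is computed, a nonzero value under `BLOCKING` would count items not yet consumed when the measurement was taken rather than items that were discarded.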
   
   #### Key observations
   
   - **Usage metrics validated**: `BatchQueueStats.totalUsedPercentage()` 
correctly reports 40-92% avg under backpressure, confirming the SO11Y 
`metrics_aggregation_queue_used_percentage` metric works as expected.
   - **IF_POSSIBLE drops visible**: With 500 types, drop rates rise with consumer cost (about 19% at 200 ns/item to about 46% at 1 μs/item), confirming the strategy sheds load as intended.
   - **BLOCKING backpressure**: Lower total usage (40-55% avg) because 
producers block when partitions fill, throttling the produce rate.
   - **Top partition saturation**: Individual partitions hit 100% even when 
total usage is moderate, showing the value of per-partition monitoring via 
`topN()`.
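
The last observation follows from simple arithmetic: a total is averaged over all partitions, so one saturated partition can hide behind many idle ones. The sketch below assumes equal-capacity partitions and uses made-up occupancy numbers. `UsageSketch`, `totalPercent`, and `topPercent` are hypothetical names and do not reproduce how `BatchQueueStats` computes its values.

```java
/** Hypothetical illustration of total versus per-partition usage. */
final class UsageSketch {
    private UsageSketch() {
    }

    /** Used slots across all partitions, as a percentage of total capacity. */
    static double totalPercent(int[] usedPerPartition, int bufferSize) {
        long used = 0;
        for (int u : usedPerPartition) {
            used += u;
        }
        long capacity = (long) bufferSize * usedPerPartition.length;
        return 100.0 * used / capacity;
    }

    /** Usage of the fullest single partition, as a percentage of its capacity. */
    static double topPercent(int[] usedPerPartition, int bufferSize) {
        int max = 0;
        for (int u : usedPerPartition) {
            max = Math.max(max, u);
        }
        return 100.0 * max / bufferSize;
    }

    public static void main(String[] args) {
        // Four partitions of 100 slots each; only the first one is full.
        int[] used = {100, 20, 30, 10};
        System.out.println("total = " + totalPercent(used, 100) + "%"); // prints total = 40.0%
        System.out.println("top   = " + topPercent(used, 100) + "%");   // prints top   = 100.0%
    }
}
```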

