xiangfu0 commented on PR #17247:
URL: https://github.com/apache/pinot/pull/17247#issuecomment-3817847209

   > Please run a benchmark to quantify the performance overhead of enabling tracking
   
   I did one round of the perf benchmark and saw no significant change:
   
   # Distinct MSQE Tracking Overhead
   
   ## Run summary
   
   - Date: 2026-01-16 14:45:53 
   - Benchmark: org.apache.pinot.perf.BenchmarkDistinctQueriesMSQE
   - Dataset rows per segment: 1500000 (2 segments)
   - Scenarios: EXP(0.001), EXP(0.5), EXP(0.999)
   - JMH: 1.37
   - JDK: OpenJDK 17.0.15
   - Warmup: 5 x 1s, Measurement: 10 x 1s, Forks: 3
   
   ## Command
   
   ```bash
   JAVA_HOME=/opt/homebrew/Cellar/openjdk@17/17.0.15/libexec/openjdk.jdk/Contents/Home \
   PATH=/opt/homebrew/Cellar/openjdk@17/17.0.15/libexec/openjdk.jdk/Contents/Home/bin:$PATH \
   JAVA_TOOL_OPTIONS="--add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED" \
   java -jar pinot-perf/target/benchmarks.jar BenchmarkDistinctQueriesMSQE \
     -p _scenario='EXP(0.001),EXP(0.5),EXP(0.999)' -p _trackingMode=enabled,disabled -wi 5 -i 10 -f 3 \
     -rf json -rff /tmp/jmh-distinct-msqe.json
   ```
   
   ## Results (avg ms/op)
   
   | Scenario | Query | Enabled (ms ± err) | Disabled (ms ± err) | Delta (ms) | Delta (%) | Within combined err? |
   |---|---|---:|---:|---:|---:|:---:|
   | EXP(0.001) | SELECT DISTINCT INT_COL FROM MyTable LIMIT 100000 | 4.401 ± 0.313 | 4.505 ± 0.372 | -0.104 | -2.3% | yes |
   | EXP(0.001) | SELECT DISTINCT INT_COL FROM MyTable ORDER BY INT_COL DESC LIMIT 100000 | 5.897 ± 0.419 | 5.730 ± 0.331 | 0.167 | 2.9% | yes |
   | EXP(0.001) | SELECT DISTINCT INT_COL, LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 100000 | 115.513 ± 6.820 | 109.707 ± 2.066 | 5.806 | 5.3% | yes |
   | EXP(0.001) | SELECT DISTINCT LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 1000 | 2.513 ± 0.315 | 2.396 ± 0.356 | 0.117 | 4.9% | yes |
   | EXP(0.001) | SELECT DISTINCT RAW_STRING_COL FROM MyTable LIMIT 100000 | 99.181 ± 1.644 | 97.187 ± 1.633 | 1.995 | 2.1% | yes |
   | EXP(0.001) | SELECT DISTINCT RAW_STRING_COL FROM MyTable WHERE LOW_CARDINALITY_STRING_COL = 'value1' LIMIT 100000 | 32.749 ± 0.623 | 33.785 ± 1.160 | -1.036 | -3.1% | yes |
   | EXP(0.5) | SELECT DISTINCT INT_COL FROM MyTable LIMIT 100000 | 2.546 ± 0.316 | 2.562 ± 0.365 | -0.015 | -0.6% | yes |
   | EXP(0.5) | SELECT DISTINCT INT_COL FROM MyTable ORDER BY INT_COL DESC LIMIT 100000 | 2.527 ± 0.408 | 2.483 ± 0.363 | 0.045 | 1.8% | yes |
   | EXP(0.5) | SELECT DISTINCT INT_COL, LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 100000 | 42.006 ± 0.785 | 44.234 ± 3.330 | -2.229 | -5.0% | yes |
   | EXP(0.5) | SELECT DISTINCT LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 1000 | 2.458 ± 0.348 | 2.584 ± 0.261 | -0.126 | -4.9% | yes |
   | EXP(0.5) | SELECT DISTINCT RAW_STRING_COL FROM MyTable LIMIT 100000 | 69.407 ± 2.116 | 70.811 ± 6.542 | -1.404 | -2.0% | yes |
   | EXP(0.5) | SELECT DISTINCT RAW_STRING_COL FROM MyTable WHERE LOW_CARDINALITY_STRING_COL = 'value1' LIMIT 100000 | 17.463 ± 0.472 | 17.431 ± 0.384 | 0.033 | 0.2% | yes |
   | EXP(0.999) | SELECT DISTINCT INT_COL FROM MyTable LIMIT 100000 | 2.407 ± 0.402 | 2.434 ± 0.312 | -0.027 | -1.1% | yes |
   | EXP(0.999) | SELECT DISTINCT INT_COL FROM MyTable ORDER BY INT_COL DESC LIMIT 100000 | 2.446 ± 0.426 | 2.690 ± 0.425 | -0.243 | -9.0% | yes |
   | EXP(0.999) | SELECT DISTINCT INT_COL, LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 100000 | 41.789 ± 0.519 | 43.100 ± 1.493 | -1.311 | -3.0% | yes |
   | EXP(0.999) | SELECT DISTINCT LOW_CARDINALITY_STRING_COL FROM MyTable LIMIT 1000 | 2.534 ± 0.360 | 2.474 ± 0.353 | 0.060 | 2.4% | yes |
   | EXP(0.999) | SELECT DISTINCT RAW_STRING_COL FROM MyTable LIMIT 100000 | 71.486 ± 2.025 | 70.870 ± 1.141 | 0.616 | 0.9% | yes |
   | EXP(0.999) | SELECT DISTINCT RAW_STRING_COL FROM MyTable WHERE LOW_CARDINALITY_STRING_COL = 'value1' LIMIT 100000 | 18.834 ± 0.500 | 18.970 ± 0.660 | -0.137 | -0.7% | yes |
   
   ## Overhead evaluation
   
   - Positive delta means tracking enabled is slower (overhead).
   - Negative delta means tracking enabled is faster.
   - For all scenario/query pairs, deltas remain within the combined 99.9% CI error bounds, so no significant difference is established.
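For readers who want to re-derive the Delta and "Within combined err?" columns, here is a minimal sketch of the arithmetic. It assumes that the percentage is taken relative to the disabled baseline (which matches the reported figures) and that "combined error" means the sum of the two reported error terms; the comment does not spell out either definition. The class `OverheadCheck` and its method names are hypothetical and are not part of Pinot or JMH.

```java
import java.util.Locale;

/** Hypothetical helper for re-deriving the table columns; not part of Pinot or JMH. */
public class OverheadCheck {

    /** Enabled-minus-disabled delta in ms; a positive value means tracking is slower. */
    public static double deltaMs(double enabledMs, double disabledMs) {
        return enabledMs - disabledMs;
    }

    /** Delta as a percentage of the disabled (baseline) score. */
    public static double deltaPercent(double enabledMs, double disabledMs) {
        return 100.0 * (enabledMs - disabledMs) / disabledMs;
    }

    /**
     * Assumption: "combined error" is the sum of the two reported error terms.
     * Returns true when the absolute delta does not exceed that sum.
     */
    public static boolean withinCombinedError(double enabledMs, double enabledErr,
                                              double disabledMs, double disabledErr) {
        return Math.abs(enabledMs - disabledMs) <= enabledErr + disabledErr;
    }

    public static void main(String[] args) {
        // Values copied from the EXP(0.001) row for
        // SELECT DISTINCT INT_COL, LOW_CARDINALITY_STRING_COL ... LIMIT 100000.
        double enabled = 115.513;
        double enabledErr = 6.820;
        double disabled = 109.707;
        double disabledErr = 2.066;

        System.out.printf(Locale.ROOT, "delta=%.3f ms (%.1f%%), within=%b%n",
                deltaMs(enabled, disabled),
                deltaPercent(enabled, disabled),
                withinCombinedError(enabled, enabledErr, disabled, disabledErr));
        // prints: delta=5.806 ms (5.3%), within=true
    }
}
```

Overlapping error intervals only show that this run cannot distinguish the two modes; they do not prove the overhead is zero.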
   
   ## Notes
   
   - JMH reported lingering Netty/async threads after completion; forks were force-terminated after the shutdown timeout.
   - The run logged warnings about direct reserved memory; consider running with more direct memory if noise persists.
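When following up on the direct-memory warning, it can help to see how much direct buffer memory a JVM is actually holding. The sketch below reads the standard `BufferPoolMXBean` for the "direct" pool; the class name `DirectMemoryProbe` is hypothetical and not part of the benchmark. It only covers buffers allocated through `java.nio`, so memory that a library allocates by other means may not appear in this figure.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

/** Hypothetical probe for java.nio direct buffer usage; not part of Pinot or JMH. */
public class DirectMemoryProbe {

    /** Bytes currently used by the JVM's "direct" buffer pool, or -1 if the pool is not found. */
    public static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // Allocate 1 MiB off-heap so the change in the pool is visible.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);
        long after = directBytesUsed();

        System.out.println("direct bytes before: " + before);
        System.out.println("direct bytes after:  " + after);

        // Touch the buffer so it stays reachable until after the second reading.
        buffer.put(0, (byte) 1);
    }
}
```

The limit itself is controlled by the HotSpot flag `-XX:MaxDirectMemorySize`. One way to pass it to the forked benchmark JVMs would be JMH's `-jvmArgsAppend` option, though the comment does not say which value, if any, was used for this run.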
   

