clintropolis commented on issue #8578: parallel broker merges on fork join pool
URL: https://github.com/apache/incubator-druid/pull/8578#issuecomment-548140327

### ParallelMergeCombiningSequenceBenchmark and ParallelMergeCombiningSequenceJmhThreadsBenchmark

These are the new benchmarks I have added, which I have been using to illustrate some things so that we can feel confident about merging this PR and making it the default behavior.

**My primary goal with this PR and approach to doing broker merges in parallel is to:**
* be better when we can (low concurrency/underutilized processors)
* be not much worse when we can't (high concurrency/overutilized processors)

so I have been mainly focusing on the performance increase to be had in the best case, and on trying to trace out the behavior in the worst case so we can see that it is not _dramatically_ worse than what is currently in master.

For each of these sets of observations, I decided to try different configurations in order to determine the best set of defaults, based on the earlier questions about whether 10ms or 100ms task run time would be better, wall-time or CPU time, etc. The `strategy` parameter of the benchmark encodes these configuration options. Parallel merges take the form `parallelism-{number of parallel tasks}-{target task time}ms-{batch size}-{initial task yield size}`, and `combiningMergeSequence-same-thread` is the reference: the same-thread analog of what `ParallelMergeCombiningSequence` is doing. Additionally, `inputSequenceType` specifies the behavior of the input sequences being fed into the `strategy`, covering sequences that are totally non-blocking, non-blocking after some amount of delay, and occasionally blocking after some delay.

The benchmarks are split into two because I wanted to look at two slightly different measurements of the worst case.
The first is provided by `ParallelMergeCombiningSequenceBenchmark`, which measures the time to complete a concurrently executed group of queries: it uses an executor thread pool to run some number of concurrent sequences and collects the average time for the entire group of threads to complete. The second focuses more on the individual thread experience, uses the JMH `@Threads` annotation, and is provided by `ParallelMergeCombiningSequenceJmhThreadsBenchmark`. This measurement is expected to be lower than the concurrency performance measurement provided by `ParallelMergeCombiningSequenceBenchmark` because it is not driven entirely by the _slowest_ thread; rather, the slower threads are averaged across all of the threads.

If you would like to run these yourself:

```
java -server -cp benchmarks/target/benchmarks.jar org.apache.druid.benchmark.sequences.ParallelMergeCombiningSequenceBenchmark
```

and

```
java -server -cp benchmarks/target/benchmarks.jar org.apache.druid.benchmark.sequences.ParallelMergeCombiningSequenceJmhThreadsBenchmark
```

Just be sure to adjust the configuration to **not include all of the available parameters** unless you have multiple days of patience for them to complete.
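
To make the `strategy` naming scheme described above easier to read in result tables, here is a minimal sketch of decoding one of the parallel strategy names into its four components. The class name `StrategyNameParser`, its nested holder type, and the example string `parallelism-4-100ms-128-1024` are made up for illustration; they are not taken from the benchmark code, and the values are not a recommended configuration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the PR: decodes a strategy name of the form
// parallelism-{number of parallel tasks}-{target task time}ms-{batch size}-{initial task yield size}
final class StrategyNameParser {

    // Illustrative holder for the four values encoded in a parallel strategy name.
    static final class ParallelStrategy {
        final int parallelism;
        final int targetTaskTimeMillis;
        final int batchSize;
        final int initialYieldSize;

        ParallelStrategy(int parallelism, int targetTaskTimeMillis, int batchSize, int initialYieldSize) {
            this.parallelism = parallelism;
            this.targetTaskTimeMillis = targetTaskTimeMillis;
            this.batchSize = batchSize;
            this.initialYieldSize = initialYieldSize;
        }
    }

    private static final Pattern PARALLEL =
        Pattern.compile("parallelism-(\\d+)-(\\d+)ms-(\\d+)-(\\d+)");

    // Returns the decoded values, or null for names that do not follow the
    // parallel pattern (such as the same-thread reference strategy).
    static ParallelStrategy parse(String strategy) {
        Matcher m = PARALLEL.matcher(strategy);
        if (!m.matches()) {
            return null;
        }
        return new ParallelStrategy(
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3)),
            Integer.parseInt(m.group(4))
        );
    }

    public static void main(String[] args) {
        // Made-up example value, chosen only to show the decoding.
        ParallelStrategy s = parse("parallelism-4-100ms-128-1024");
        System.out.println("parallel tasks: " + s.parallelism);
        System.out.println("target task time (ms): " + s.targetTaskTimeMillis);
        System.out.println("batch size: " + s.batchSize);
        System.out.println("initial yield size: " + s.initialYieldSize);

        // The reference strategy carries no tuning values, so it does not match.
        System.out.println(parse("combiningMergeSequence-same-thread") == null);
    }
}
```

Returning `null` for the reference strategy keeps the sketch short; a real helper would more likely use `Optional` or a dedicated type for the same-thread case.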
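
The reason the per-thread measurement is expected to come out lower than the group measurement can be shown with a little arithmetic. The sketch below is a simplification and not how either benchmark computes its score: it assumes a single round of concurrent queries with made-up per-thread completion times, and compares the time for the whole group to finish (the slowest thread) with the average across threads.

```java
import java.util.Arrays;

// Illustrative only: contrasts a "group completes" time with a per-thread average.
final class GroupVersusPerThreadTiming {

    // Time until every thread in the group has finished: set by the slowest thread.
    static long groupCompletionMillis(long[] perThreadMillis) {
        return Arrays.stream(perThreadMillis).max().orElse(0L);
    }

    // Average time experienced by an individual thread.
    static double meanPerThreadMillis(long[] perThreadMillis) {
        return Arrays.stream(perThreadMillis).average().orElse(0.0);
    }

    public static void main(String[] args) {
        // Made-up completion times in milliseconds; one thread is a straggler.
        long[] perThreadMillis = {100, 110, 120, 400};

        System.out.println("group completion: " + groupCompletionMillis(perThreadMillis)); // 400
        System.out.println("per-thread mean: " + meanPerThreadMillis(perThreadMillis));    // 182.5
    }
}
```

With one straggler among four threads, the group time is the straggler's 400 ms while the per-thread mean is 182.5 ms, which is the kind of gap to expect between the two benchmarks under contention.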