clintropolis opened a new pull request #8089: add CachingClusteredClient 
benchmark, refactor some stuff
URL: https://github.com/apache/incubator-druid/pull/8089
 
 
   ### Description
   
   This PR adds a benchmark for `CachingClusteredClient` and some refactoring 
of the query processing pipeline to provide the foundation for testing 
approaches to parallel broker merges.
   
   Benchmarks can be run with a command like the following:
   
   ```
   java -Ddruid.benchmark.cacheDir=./tmp/benches/ \
     -jar benchmarks/target/benchmarks.jar \
     org.apache.druid.benchmark.query.CachingClusteredClientBenchmark
   ```
   
   Substitute the benchmark cache directory as appropriate.
   
   #### Background
   I'm having a go at parallel broker merges, making another attempt to achieve 
the goals of #5913 and #6629, eventually planning to attempt the `ForkJoinPool` 
in `asyncMode` approach suggested by @leventov in [this 
thread](https://github.com/apache/incubator-druid/pull/6629#discussion_r241089247).
 Before that, in order to untangle things a bit, I've taken the benchmarks from 
#6629 (credit to @jihoonson) and updated/simplified them to take advantage of 
some of the changes to `SegmentGenerator` from #6794, to allow a persistent 
cache for the generated benchmark segments for much faster benchmarking. I've 
also extracted some of the useful refactorings and got a bit more adventurous. 
This should help isolate these supporting changes from any future PR which adds 
parallel merging, reducing review overhead.
   
   #### Refactoring
   
   ##### `CombiningFunction<T>`
   Added `CombiningFunction<T>`, a new `@FunctionalInterface` that replaces 
`BinaryFn<Type1, Type2, OutType>`, since every actual usage had the form 
`BinaryFn<T, T, T>` and was used strictly for merging sequences, iterators, 
iterables, etc.
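
To illustrate the shape of such an interface, here is a minimal, self-contained sketch. It is not the code from this PR: the method name `apply` and the helper `combineSorted` are assumptions made for the example, and `java.util.Comparator` is used for ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the interface described above. The method name
// `apply` is an assumption, not copied from the PR.
@FunctionalInterface
interface CombiningFunction<T> {
  T apply(T left, T right);
}

final class CombiningFunctionSketch {
  // Collapses each run of adjacent elements that compare equal into one element.
  // The input is assumed to be sorted by `ordering` already.
  static <T> List<T> combineSorted(List<T> sorted, Comparator<T> ordering, CombiningFunction<T> fn) {
    List<T> out = new ArrayList<>();
    for (T next : sorted) {
      int last = out.size() - 1;
      if (last >= 0 && ordering.compare(out.get(last), next) == 0) {
        out.set(last, fn.apply(out.get(last), next));
      } else {
        out.add(next);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Each row is {timestamp, count}; rows with the same timestamp are summed.
    List<long[]> rows = Arrays.asList(new long[]{1, 2}, new long[]{1, 3}, new long[]{2, 5});
    Comparator<long[]> byTime = Comparator.comparingLong(r -> r[0]);
    CombiningFunction<long[]> sum = (a, b) -> new long[]{a[0], a[1] + b[1]};
    for (long[] row : combineSorted(rows, byTime, sum)) {
      System.out.println(row[0] + " -> " + row[1]);
    }
  }
}
```

A single type parameter keeps the lambda at the call site short, which is the main ergonomic gain over a three-parameter `BinaryFn<T, T, T>`.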
   
   ##### `QueryToolChest` and `ResultMergeQueryRunner`
   In order to split out the mechanisms useful during merge from the merge 
implementation, `QueryToolChest` now has two additional methods:
   
   ```
   CombiningFunction<ResultType> createMergeFn(Query<ResultType> query)
   ```
   and
   ```
   Ordering<ResultType> createOrderingFn(Query<ResultType> query)
   ```
   
   For group-by queries, `GroupByStrategy` also has these method signatures, 
since `GroupByQueryQueryToolChest` delegates both to the strategy.
   
   These methods are passed into a refactored, non-abstract 
`ResultMergeQueryRunner` as function generators: given a `Query`, they produce 
a `CombiningFunction` and an `Ordering`, respectively.
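
The "function generator" idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `ResultMergeQueryRunner`: `SketchQuery`, `MergeRunnerSketch`, and `MergeRunnerDemo` are hypothetical names, `Comparator` stands in for Guava's `Ordering`, `BinaryOperator` stands in for `CombiningFunction`, and the sketch sorts a list itself rather than wrapping a base runner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical minimal query type; the real Query interface is much larger.
interface SketchQuery<T> {
  boolean isDescending();
}

// A concrete (non-abstract) merge runner configured with two generators,
// each of which turns a query into the function needed for that query.
final class MergeRunnerSketch<T> {
  private final Function<SketchQuery<T>, Comparator<T>> orderingGenerator;
  private final Function<SketchQuery<T>, BinaryOperator<T>> mergeFnGenerator;

  MergeRunnerSketch(
      Function<SketchQuery<T>, Comparator<T>> orderingGenerator,
      Function<SketchQuery<T>, BinaryOperator<T>> mergeFnGenerator
  ) {
    this.orderingGenerator = orderingGenerator;
    this.mergeFnGenerator = mergeFnGenerator;
  }

  // Sorts with the query-specific ordering, then combines neighbors that compare equal.
  List<T> run(SketchQuery<T> query, List<T> results) {
    Comparator<T> ordering = orderingGenerator.apply(query);
    BinaryOperator<T> mergeFn = mergeFnGenerator.apply(query);
    List<T> sorted = new ArrayList<>(results);
    sorted.sort(ordering);
    List<T> merged = new ArrayList<>();
    for (T next : sorted) {
      int last = merged.size() - 1;
      if (last >= 0 && ordering.compare(merged.get(last), next) == 0) {
        merged.set(last, mergeFn.apply(merged.get(last), next));
      } else {
        merged.add(next);
      }
    }
    return merged;
  }
}

final class MergeRunnerDemo {
  // Rows are {timestamp, count}; the ordering depends on the query.
  static List<long[]> demo() {
    MergeRunnerSketch<long[]> runner = new MergeRunnerSketch<long[]>(
        q -> {
          Comparator<long[]> byTime = Comparator.comparingLong(r -> r[0]);
          return q.isDescending() ? byTime.reversed() : byTime;
        },
        q -> (a, b) -> new long[]{a[0], a[1] + b[1]}
    );
    SketchQuery<long[]> descending = () -> true;
    return runner.run(descending, Arrays.asList(new long[]{1, 2}, new long[]{2, 5}, new long[]{1, 3}));
  }

  public static void main(String[] args) {
    for (long[] row : demo()) {
      System.out.println(row[0] + " -> " + row[1]);
    }
  }
}
```

Passing generators instead of subclassing means the same runner class can serve every query type, with the per-query behavior supplied by the tool chest.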
   
   ##### `ConnectionCountServerSelectorStrategy` is now 
`WeightedServerSelectorStrategy`
   I did not refactor `QueryableDruidServer` in quite the same manner as #6629, 
but I did still modify `QueryableDruidServer` and `QueryRunner` to add a 
`getWeight` method, as suggested by @drcrallen in [this comment 
thread](https://github.com/apache/incubator-druid/pull/6629#discussion_r240789022)
 to make the selector strategy more generic, instead of hard-casting 
`QueryRunner` to `DirectDruidClient` to get the number of connections.
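
As a rough sketch of why a `getWeight` method removes the need for a cast: the selector only has to know that each candidate reports a weight. Everything below except the method name `getWeight` is hypothetical (`Weighted`, `SketchServer`, `WeightedSelectorSketch`), and it assumes that a lower weight, such as fewer open connections, is preferred.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.Optional;

// Hypothetical abstraction: anything that can report its current load.
interface Weighted {
  int getWeight();
}

// Hypothetical server type used only for this example.
final class SketchServer implements Weighted {
  final String name;
  private final int openConnections;

  SketchServer(String name, int openConnections) {
    this.name = name;
    this.openConnections = openConnections;
  }

  @Override
  public int getWeight() {
    return openConnections;
  }
}

final class WeightedSelectorSketch {
  // Picks the candidate with the lowest weight; no cast to a concrete client type is needed.
  static <S extends Weighted> Optional<S> pick(Collection<S> servers) {
    return servers.stream().min(Comparator.comparingInt((S s) -> s.getWeight()));
  }

  public static void main(String[] args) {
    Optional<SketchServer> chosen = pick(Arrays.asList(
        new SketchServer("historical-a", 3),
        new SketchServer("historical-b", 1)
    ));
    chosen.ifPresent(s -> System.out.println(s.name));
  }
}
```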
   
   #### Removed
   `OrderedMergingIterator`, `OrderedMergingSequence`, and 
`SortingMergeIterator` have been removed, since they were used only by their 
own tests.
   
   <hr>
   
   This PR has:
   - [ ] been self-reviewed.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   
