The problem here is that our cluster has enough nodes that a small datasource 
can reasonably have one segment or fewer per historical node. In such a 
scenario there would be a large number of cache requests (one per server) that 
would have been better batched at the beginning.
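
To make the request-count concern concrete, here is a rough sketch in Python. It is not Druid code: `CountingCache`, `fetch_per_server`, and `fetch_batched` are hypothetical names made up for illustration, and the cache is just an in-memory dict that counts round trips.

```python
from typing import Dict, Iterable, List


class CountingCache:
    """Hypothetical in-memory stand-in for a remote cache; counts round trips."""

    def __init__(self, data: Dict[str, bytes]) -> None:
        self._data = data
        self.round_trips = 0

    def get_bulk(self, keys: Iterable[str]) -> Dict[str, bytes]:
        # One round trip per call, no matter how many keys are requested.
        self.round_trips += 1
        return {k: self._data[k] for k in keys if k in self._data}


def fetch_per_server(
    cache: CountingCache, segments_by_server: Dict[str, List[str]]
) -> Dict[str, bytes]:
    # One cache request per server, even if a server holds a single segment.
    results: Dict[str, bytes] = {}
    for keys in segments_by_server.values():
        results.update(cache.get_bulk(keys))
    return results


def fetch_batched(
    cache: CountingCache, segments_by_server: Dict[str, List[str]]
) -> Dict[str, bytes]:
    # Collect every segment key first, then issue a single bulk request.
    all_keys = [k for keys in segments_by_server.values() for k in keys]
    return cache.get_bulk(all_keys)
```

With 100 servers holding one segment each, the per-server path makes 100 cache round trips while the batched path makes one, for the same results.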

Basically, I expect such an approach to increase load on the cache system, 
because cache results could no longer be fetched in batches. That is a 
significant change in workflow compared to the implementation in `/master`, 
where the cached results are fetched in bulk first, with a limit on the number 
of results that can be cached at the broker per call.

As a bit of context: limiting the number of results per batch call at the 
broker level lets us skip even trying to fetch, say, 1M results when we know 
our cache system can probably return only 10k results within the given 
timeout.
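
As an illustration of that limit, a minimal sketch in Python follows. The function `fetch_with_bulk_limit` and its parameters `bulk_get` and `bulk_limit` are hypothetical names, not the actual broker configuration or API; the sketch only assumes that an oversized batch is skipped entirely rather than attempted.

```python
from typing import Callable, Dict, List


def fetch_with_bulk_limit(
    keys: List[str],
    bulk_get: Callable[[List[str]], Dict[str, bytes]],
    bulk_limit: int,
) -> Dict[str, bytes]:
    """Fetch cached results in one batch, unless the batch exceeds bulk_limit."""
    if len(keys) > bulk_limit:
        # The cache is unlikely to return this many results before the
        # timeout, so do not issue the request at all.
        return {}
    return bulk_get(keys)
```

For example, with `bulk_limit=10_000` a request for 1M keys returns an empty result without touching the cache, while a request for 5k keys goes through as a single bulk call.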

[ Full content available at: 
https://github.com/apache/incubator-druid/pull/5913 ]
This message was relayed via gitbox.apache.org for [email protected]