We're starting work on adding backup requests
<http://static.googleusercontent.com/media/research.google.com/en/us/people/jeff/Berkeley-Latency-Mar2012.pdf>
to the ShardHandler. Roughly something like:

1. Send requests to 100 shards.
2. Wait for results from 75 to come back.
3. Wait for either a) the other 25 to come back or b) 20% more time to
elapse.
4. If any shards have still not returned results, send a second request to
a different server for each of the missing shards.
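
To make the steps above concrete, here is a rough Java sketch of the flow
using only java.util.concurrent, not any Solr classes. The class
BackupRequestSketch, the query method, and its parameters are all made up
for illustration. I'm also assuming "20% more time" means 20% of the time
it took to reach the 75% mark, which may not be what we end up with.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.Function;

/** Hypothetical sketch of the backup-request flow; none of this is Solr API. */
public class BackupRequestSketch {

    // Wraps one shard call so the result carries the shard name with it.
    private static Callable<Map.Entry<String, String>> call(
            String shard, Function<String, String> server) {
        return () -> new AbstractMap.SimpleImmutableEntry<>(shard, server.apply(shard));
    }

    public static Map<String, String> query(
            List<String> shards,
            Function<String, String> primary,   // stands in for the first server
            Function<String, String> backup,    // stands in for a different server
            double quorum,                      // e.g. 0.75
            double extraFraction,               // e.g. 0.20 (assumed: of elapsed time)
            Executor executor) throws InterruptedException, ExecutionException {

        CompletionService<Map.Entry<String, String>> cs =
                new ExecutorCompletionService<>(executor);
        Map<String, String> results = new HashMap<>();
        long start = System.nanoTime();

        // Step 1: send a request to every shard.
        for (String shard : shards) {
            cs.submit(call(shard, primary));
        }

        // Step 2: block until the quorum of shards has answered.
        int needed = (int) Math.ceil(shards.size() * quorum);
        while (results.size() < needed) {
            Map.Entry<String, String> e = cs.take().get();
            results.putIfAbsent(e.getKey(), e.getValue());
        }

        // Step 3: wait for the stragglers, but only for the extra grace period.
        long elapsed = System.nanoTime() - start;
        long deadline = System.nanoTime() + (long) (elapsed * extraFraction);
        while (results.size() < shards.size()) {
            long remaining = deadline - System.nanoTime();
            Future<Map.Entry<String, String>> f =
                    remaining > 0 ? cs.poll(remaining, TimeUnit.NANOSECONDS) : null;
            if (f == null) {
                break; // grace period is over
            }
            Map.Entry<String, String> e = f.get();
            results.putIfAbsent(e.getKey(), e.getValue());
        }

        // Step 4: send a backup request for each shard that is still missing.
        for (String shard : shards) {
            if (!results.containsKey(shard)) {
                cs.submit(call(shard, backup));
            }
        }
        // Whichever of primary or backup answers first for a shard wins.
        while (results.size() < shards.size()) {
            Map.Entry<String, String> e = cs.take().get();
            results.putIfAbsent(e.getKey(), e.getValue());
        }
        return results;
    }
}
```

The sketch does not cancel the slow primary once the backup has answered,
and any shard error simply propagates as an ExecutionException. A real
version would need to decide on both.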

I want to be sure I understand the ShardHandler contract correctly before
getting started. My understanding is:

--ShardHandler#takeXXX methods
<https://github.com/apache/lucene-solr/blob/dff38c2051ba26f928687139218bbc43e9004ebe/solr/core/src/java/org/apache/solr/handler/component/ShardHandler.java#L25:L26>
can be called after several different ShardRequests have been submitted
<https://github.com/apache/lucene-solr/blob/dff38c2051ba26f928687139218bbc43e9004ebe/solr/core/src/java/org/apache/solr/handler/component/ShardHandler.java#L24>
.
--ShardHandler#takeXXX is then called in a loop, each call returning a
ShardResponse from the last shard to return for a given ShardRequest.
--When ShardHandler#takeXXX returns null, the SearchHandler
<https://github.com/apache/lucene-solr/blob/dff38c2051ba26f928687139218bbc43e9004ebe/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java#L277:L367>
proceeds
<https://github.com/apache/lucene-solr/blob/dff38c2051ba26f928687139218bbc43e9004ebe/solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java#L333>
.

For example, the flow could look like:

shardHandler.submit(slowGroupingRequest, "shard1", groupingParams);
shardHandler.submit(slowGroupingRequest, "shard2", groupingParams);
shardHandler.submit(fastFacetRefinementRequest, "shard1", facetParams);
shardHandler.submit(fastFacetRefinementRequest, "shard2", facetParams);
shardHandler.takeCompletedOrError(); // returns fastFacetRefinementRequest with responses
shardHandler.takeCompletedOrError(); // returns slowGroupingRequest with responses
shardHandler.takeCompletedOrError(); // returns null, SearchHandler exits take loop

Does that seem like a correct understanding of the
SearchHandler->ShardHandler interaction?

If so, it seems that to make backup requests work we'd need to fan out
individual ShardRequests independently, each with its own completion
service and pending queue. Does that sound right?
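
As a rough illustration of what "its own completion service and pending
queue" could look like, here is a minimal Java sketch. RequestFanout is a
made-up class, not anything in Solr. It assumes one instance per
ShardRequest and a single calling thread, and it represents each shard
response as a plain String.

```java
import java.util.*;
import java.util.concurrent.*;

/** Hypothetical: one instance per ShardRequest; not part of Solr. */
public class RequestFanout {
    private final CompletionService<String> completion;
    // Futures still outstanding for this request only (not thread-safe).
    private final Set<Future<String>> pending = new HashSet<>();

    public RequestFanout(Executor executor) {
        this.completion = new ExecutorCompletionService<>(executor);
    }

    /** Sends one shard call and tracks it as pending for this request. */
    public void submit(Callable<String> shardCall) {
        pending.add(completion.submit(shardCall));
    }

    /** Returns the next finished response, or null once nothing is pending. */
    public String take() throws InterruptedException, ExecutionException {
        if (pending.isEmpty()) {
            return null;
        }
        Future<String> done = completion.take();
        pending.remove(done);
        return done.get();
    }

    public int pendingCount() {
        return pending.size();
    }
}
```

With something like this, draining the facet refinement request would
never consume a grouping response, and a backup for a missing shard could
be submitted to the same instance as the original request.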

Thanks!

--Gregg
