[ 
https://issues.apache.org/jira/browse/SOLR-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748128#action_12748128
 ] 

Martijn van Groningen commented on SOLR-1143:
---------------------------------------------

Sorry for my confusing comment. I meant to say that takeOrError() does return 
immediately when an exception occurs. To avoid further confusion, I will sketch a 
situation, based on my current understanding of the code, to show that 
takeOrError() should not be used when returning partial results.

In each stage, a number of requests may be sent to the shards, and a number of 
responses may be returned from the shards for further processing.
Let's say we have three shards and, in a certain stage, we send a shard request 
to all three. If the first response contains an error, the current behaviour 
is to return that response immediately, without adding the two other responses 
(which did return without an error). Because of this, the so-called partial 
result might contain less data than it could, or even nothing. Therefore I think 
take() should be used there. I think takeOrError() is only suitable when 
partial results are not being used.

{code:java}
ShardResponse takeCompletedOrError() {
    while (pending.size() > 0) {
      try {
        Future<ShardResponse> future = completionService.take();
        pending.remove(future);
        ShardResponse rsp = future.get();
        // now we return and, if there are more pending results, we lose them
        if (rsp.getException() != null) return rsp;
        ...............
        rsp.getShardRequest().responses.add(rsp);
        if (rsp.getShardRequest().responses.size()
            == rsp.getShardRequest().actualShards.length) {
          return rsp;
        }
      } catch (InterruptedException e) {
      ......
    }
    return null;
  }
{code}

Again, this is what I understand from the code. What do you think about this? 
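
For illustration, the take()-style behaviour argued for above could look 
roughly like the following. This is only a sketch under assumptions: Response, 
takeAll and demo are hypothetical names, not Solr classes or methods, and the 
real ShardResponse handling involves more than is shown here. The point is 
simply that every pending future is drained, so a failed shard is recorded 
instead of cutting the loop short.

```java
import java.net.ConnectException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PartialResultsSketch {

    /** Hypothetical stand-in for a shard response; exception == null means success. */
    static final class Response {
        final String shard;
        final Exception exception;

        Response(String shard, Exception exception) {
            this.shard = shard;
            this.exception = exception;
        }
    }

    /** Drains every pending future, so one failed shard does not discard the others. */
    static List<Response> takeAll(CompletionService<Response> service,
                                  Set<Future<Response>> pending)
            throws InterruptedException {
        List<Response> responses = new ArrayList<>();
        while (!pending.isEmpty()) {
            Future<Response> future = service.take(); // blocks until the next task completes
            pending.remove(future);
            try {
                responses.add(future.get()); // kept even when it carries an exception
            } catch (ExecutionException e) {
                // The task itself threw; record that as a failed response too.
                responses.add(new Response("unknown", e));
            }
        }
        return responses;
    }

    /** Simulates three shards, the first of which reports an error. */
    static List<Response> demo() {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        try {
            CompletionService<Response> service = new ExecutorCompletionService<>(executor);
            Set<Future<Response>> pending = new HashSet<>();
            pending.add(service.submit(
                () -> new Response("shard1", new ConnectException("refused"))));
            pending.add(service.submit(() -> new Response("shard2", null)));
            pending.add(service.submit(() -> new Response("shard3", null)));
            return takeAll(service, pending);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        for (Response r : demo()) {
            System.out.println(r.shard
                + (r.exception == null ? " ok" : " failed: " + r.exception.getMessage()));
        }
    }
}
```

With three simulated shards and one error, the caller still receives all three 
responses and can decide to build a partial result from the two good ones. The 
order in which they arrive depends on which task completes first.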

I also did some more thinking about how to improve the handling of shard 
failures. Currently, if a shard fails in an early stage of the distributed 
search, we keep sending requests to that shard, although we noticed in a 
previous stage that it was not responding. Do you think it is a good idea to 
mark a shard as failed, so that the currently running search stops using it? 
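
As a rough illustration of the idea of marking failed shards, a per-search 
tracker could look something like the following. FailedShardTracker and its 
methods are hypothetical names, not existing Solr code, and the sketch assumes 
the tracker is created for a single search and used from one coordinating 
thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical per-search record of shards that failed in an earlier stage. */
class FailedShardTracker {
    private final Set<String> failed = new HashSet<>();

    /** Remembers that a shard did not respond during this search. */
    void markFailed(String shard) {
        failed.add(shard);
    }

    boolean isFailed(String shard) {
        return failed.contains(shard);
    }

    /** Returns the shards that later stages should still send requests to. */
    List<String> activeShards(List<String> shards) {
        List<String> active = new ArrayList<>();
        for (String shard : shards) {
            if (!failed.contains(shard)) {
                active.add(shard);
            }
        }
        return active;
    }

    public static void main(String[] args) {
        FailedShardTracker tracker = new FailedShardTracker();
        List<String> shards = Arrays.asList("shard1", "shard2", "shard3");

        // Stage 1: shard2 refuses the connection, so it is marked as failed.
        tracker.markFailed("shard2");

        // Stage 2: only the remaining shards are contacted.
        System.out.println(tracker.activeShards(shards)); // prints [shard1, shard3]
    }
}
```

Because the tracker lives only as long as one search, a shard that comes back 
up would be tried again on the next search.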

> Return partial results when a connection to a shard is refused
> --------------------------------------------------------------
>
>                 Key: SOLR-1143
>                 URL: https://issues.apache.org/jira/browse/SOLR-1143
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Nicolas Dessaigne
>            Assignee: Grant Ingersoll
>             Fix For: 1.4
>
>         Attachments: SOLR-1143-2.patch, SOLR-1143-3.patch, SOLR-1143.patch
>
>
> If any shard is down in a distributed search, a ConnectException is thrown.
> Here's a little patch that changes this behaviour: if we can't connect to a 
> shard (ConnectException), we get partial results from the active shards. As 
> for the TimeOut parameter (https://issues.apache.org/jira/browse/SOLR-502), we 
> set the parameter "partialResults" to true.
> This patch also addresses a problem expressed on the mailing list about a year 
> ago 
> (http://www.nabble.com/partialResults,-distributed-search---SOLR-502-td19002610.html)
> We have a use case that needs this behaviour and we would like to know your 
> thoughts about it. Should it be the default behaviour for 
> distributed search?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
