[ https://issues.apache.org/jira/browse/SOLR-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853168#action_12853168 ]
Peter Sturge commented on SOLR-1143:
------------------------------------

This is a cool patch - yes, very useful. I've found a couple of issues with it, though:

1. When going through the 'waiting for shard replies' loop, no exception is thrown on shard failure, so the next block after the loop can throw a NullPointerException in {{SearchComponent.handleResponses()}} for any SearchComponent that checks shard responses. This might not always happen, but it certainly happens in FacetComponent when date_facets are turned on.
2. There's a bit of code that sets {{partialResults=true}} if there's at least one failure, but it doesn't set it to false if everything's ok. For the patch to operate, this parameter must already be present and true, otherwise the patch is essentially 'disabled' anyway (the problem of using the same parameter as both input and result).

I've made some modifications to the patch for these and a couple of other things:

1. FacetComponent modified to check for a null shard response. Perhaps it would be better to check this in SearchHandler.handleResponses(), but then no SearchComponents would be contacted about failed shards, even if they don't care that a shard has failed (is that a good thing?).
2. Added a new CommonParams parameter called FAILED_SHARDS. {{partialResults}} is now only an input parameter to enable the feature. (Note: {{partialResults}} is referenced in RequestHandlerBase, but not by the patch - is this an existing parameter that is used for something else?! If so, perhaps the name should be changed to something like {{allowPartialResults}} to avoid backward-compatibility and other potential conflicts.) The output parameter that goes in the response header is now: {{failedShards=shard0;shard1;shardn}}. If everything succeeds, there is no failedShards entry in the response header; otherwise, a list of failed shards is given. This is very useful for alerting someone/something that a server/network needs attention (e.g. a health checker thread could run empty distributed searches solely for the purpose of checking status).
3. Changed the detection of a shard request error to be any Exception, rather than just ConnectException. This way, any failure is caught and can be acted on.

Possible TODO: it might be nice to include a short message (Exception class name?) in the FAILED_SHARDS parameter about what failed (e.g. ConnectException, IOException, etc.). If you like this idea, please say so, and I'll include it - i.e. something like:

{{failedShards=myshard:8983/solr/core0|ConnectException;myothershard:8983/solr/core0|IOException}}

I'm currently testing these changes in our internal build. In the meantime, any comments are greatly appreciated. If there are no objections, I'll add a patch update when the dev test run is complete.

> Return partial results when a connection to a shard is refused
> --------------------------------------------------------------
>
>                 Key: SOLR-1143
>                 URL: https://issues.apache.org/jira/browse/SOLR-1143
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Nicolas Dessaigne
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1143-2.patch, SOLR-1143-3.patch, SOLR-1143.patch
>
>
> If any shard is down in a distributed search, a ConnectException is thrown.
> Here's a little patch that changes this behaviour: if we can't connect to a
> shard (ConnectException), we get partial results from the active shards. As
> with the TimeOut parameter (https://issues.apache.org/jira/browse/SOLR-502), we
> set the parameter "partialResults" to true.
> This patch also addresses a problem expressed in the mailing list about a year
> ago
> (http://www.nabble.com/partialResults,-distributed-search---SOLR-502-td19002610.html)
> We have a use case that needs this behaviour and we would like to know your
> thoughts about such a behaviour. Should it be the default behaviour for
> distributed search?

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
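
For illustration only, here is a rough sketch of the idea in the comment above: treat any Exception from a shard as a failure, keep the responses from the shards that did answer, and report the failures as a {{failedShards}} value in the proposed {{shard|ExceptionClass;...}} form. This is not the attached patch and does not use Solr's classes; the class name, the {{ShardCall}} interface and both method names are hypothetical, made up for this example.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Solr or of the SOLR-1143 patch.
final class FailedShardsSketch {

    // Stand-in for "send a request to one shard" (assumed interface, not a Solr type).
    interface ShardCall {
        Object call() throws Exception;
    }

    // Queries every shard. Any Exception (not just ConnectException) marks the shard
    // as failed; the exception's simple class name is recorded under the shard address.
    static List<Object> queryAll(Map<String, ShardCall> shards, Map<String, String> failures) {
        List<Object> responses = new ArrayList<>();
        for (Map.Entry<String, ShardCall> shard : shards.entrySet()) {
            try {
                responses.add(shard.getValue().call());
            } catch (Exception ex) {
                failures.put(shard.getKey(), ex.getClass().getSimpleName());
            }
        }
        return responses;
    }

    // Builds the value proposed in the comment: shard|ExceptionClass;shard|ExceptionClass
    static String formatFailedShards(Map<String, String> failures) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> failure : failures.entrySet()) {
            if (sb.length() > 0) {
                sb.append(';');
            }
            sb.append(failure.getKey()).append('|').append(failure.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<String, ShardCall> shards = new LinkedHashMap<>();
        shards.put("myshard:8983/solr/core0", () -> { throw new ConnectException("refused"); });
        shards.put("goodshard:8983/solr/core0", () -> "partial results");
        shards.put("myothershard:8983/solr/core0", () -> { throw new IOException("read failed"); });

        Map<String, String> failures = new LinkedHashMap<>();
        List<Object> responses = queryAll(shards, failures);

        System.out.println("responses=" + responses.size());
        if (!failures.isEmpty()) {
            // Only emitted when something failed, mirroring the behaviour described above.
            System.out.println("failedShards=" + formatFailedShards(failures));
        }
    }
}
```

In this sketch the caller only adds the header entry when the failure map is non-empty, so a fully successful request carries no failedShards value, and components that read shard responses never see a null entry for a failed shard.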