[ 
https://issues.apache.org/jira/browse/SOLR-12708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16632585#comment-16632585
 ] 

Varun Thacker commented on SOLR-12708:
--------------------------------------

{quote}I have to be honest and admit that I copied the full block from 
{{CreateShardCmd.java}}. I think the code is doing the right thing there. In 
both branches of the {{if}} the code checks if the main {{results}} has 
success/failure node already, and creates if necessary. Then adds the 
corresponding {{addResult}} field into the main one. The only difference is 
that the failure recalled before the {{if}} block.
{quote}
I think the usage of this code block is correct in CreateShardCmd but not where 
we are using it in the patch. Here's why : CreateShardCmd is one core admin API 
call . So the response is either a success or failure. Hence the if-else block 
covers it.

In this patch, there are multiple add-replica calls who's response we are 
acknowledging. So there can be replicas that came back with success and some 
that failed. If there is a failure we will just add the failure response back 
to the results and not the success ones .

This way we process requests and responses are very complicated for some reason 
and we should improve it in general . But do you see what I am seeing here?

 

 

> Async collection actions should not hide failures
> -------------------------------------------------
>
>                 Key: SOLR-12708
>                 URL: https://issues.apache.org/jira/browse/SOLR-12708
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Admin UI, Backup/Restore
>    Affects Versions: 7.4
>            Reporter: Mano Kovacs
>            Assignee: Varun Thacker
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Async collection API may hide failures compared to sync version. 
> [OverseerCollectionMessageHandler::processResponses|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/cloud/api/collections/OverseerCollectionMessageHandler.java#L744]
>  structures errors differently in the response, that hides failures from most 
> evaluators. RestoreCmd did not receive, nor handle async addReplica issues.
> Sample create collection sync and async result with invalid solrconfig.xml:
> {noformat}
> {
> "responseHeader":{
> "status":0,
> "QTime":32104},
> "failure":{
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://localhost:8983/solr: Error CREATEing SolrCore 
> 'name4_shard1_replica_n1': Unable to create core [name4_shard1_replica_n1] 
> Caused by: The content of elements must consist of well-formed character data 
> or markup.",
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://localhost:8983/solr: Error CREATEing SolrCore 
> 'name4_shard2_replica_n2': Unable to create core [name4_shard2_replica_n2] 
> Caused by: The content of elements must consist of well-formed character data 
> or markup.",
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://localhost:8983/solr: Error CREATEing SolrCore 
> 'name4_shard1_replica_n2': Unable to create core [name4_shard1_replica_n2] 
> Caused by: The content of elements must consist of well-formed character data 
> or markup.",
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://localhost:8983/solr: Error CREATEing SolrCore 
> 'name4_shard2_replica_n1': Unable to create core [name4_shard2_replica_n1] 
> Caused by: The content of elements must consist of well-formed character data 
> or markup."}
> }
> {noformat}
> vs async:
> {noformat}
> {
> "responseHeader":{
> "status":0,
> "QTime":3},
> "success":{
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":12}},
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":3}},
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":11}},
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":12}}},
> "myTaskId2709146382836":{
> "responseHeader":{
> "status":0,
> "QTime":1},
> "STATUS":"failed",
> "Response":"Error CREATEing SolrCore 'name_shard2_replica_n2': Unable to 
> create core [name_shard2_replica_n2] Caused by: The content of elements must 
> consist of well-formed character data or markup."},
> "status":{
> "state":"completed",
> "msg":"found [myTaskId] in completed tasks"}}
> {noformat}
> Proposing adding failure node to the results, keeping backward compatible but 
> correct result.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to