[ 
https://issues.apache.org/jira/browse/SOLR-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated SOLR-12291:
------------------------------------
    Attachment:     (was: SOLR-12291.patch)

> Async prematurely reports completed status that causes severe shard loss
> ------------------------------------------------------------------------
>
>                 Key: SOLR-12291
>                 URL: https://issues.apache.org/jira/browse/SOLR-12291
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Backup/Restore, SolrCloud
>            Reporter: Varun Thacker
>            Assignee: Mikhail Khludnev
>            Priority: Major
>         Attachments: SOLR-12291.patch, SOLR-12291.patch, SOLR-12291.patch, 
> SOLR-12291.patch, SOLR-12291.patch, SOLR-12291.patch, SOLR-12291.patch, 
> SOLR-122911.patch
>
>
> The OverseerCollectionMessageHandler sliceCmd assumes only one replica exists 
> on one node
> When multiple replicas of a slice are on the same node we only track one 
> replica's async request. This happens because the async requestMap's key is 
> "node_name"
> I discovered this when [~alabax] shared some logs of a restore issue, where 
> the second replica got added before the first replica had completed it's 
> restorecore action.
> While looking at the logs I noticed that the overseer never called 
> REQUESTSTATUS for the restorecore action , almost as if it had missed 
> tracking that particular async request.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to