[ 
https://issues.apache.org/jira/browse/SOLR-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381495#comment-15381495
 ] 

Pushkar Raste commented on SOLR-9310:
-------------------------------------

I was able to figure out a few more things about what is going on. The existing 
{{PeerSyncTest}} does not change the core's state from {{ACTIVE}} to recovering, 
so the following condition (block) in 
{{DistributedUpdateLogProcessor.versionAdd()}} is never executed:
{code}
if (ulog.getState() != UpdateLog.State.ACTIVE && (cmd.getFlags() & UpdateCommand.REPLAY) == 0) {
  // we're not in an active state, and this update isn't from a replay, so buffer it.
  cmd.setFlags(cmd.getFlags() | UpdateCommand.BUFFERING);
  ulog.add(cmd);
  return true;
}
{code}
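
To make the branch condition above easier to reason about, here is a minimal standalone sketch of the same decision. It is not Solr code: the class {{BufferingSketch}}, the flag values, and the {{State}} enum are all hypothetical stand-ins, and the real {{UpdateCommand}} and {{UpdateLog.State}} definitions may differ.

```java
// Standalone model of the buffering decision; none of this is Solr code.
final class BufferingSketch {
    // Hypothetical flag bits for illustration; the real UpdateCommand values may differ.
    static final int FLAG_BUFFERING = 0x01;
    static final int FLAG_REPLAY = 0x02;

    // Hypothetical stand-in for UpdateLog.State.
    enum State { ACTIVE, BUFFERING, APPLYING_BUFFERED, REPLAYING }

    // Same shape as the quoted condition: buffer when the log is not ACTIVE
    // and the update does not carry the REPLAY flag.
    static boolean shouldBuffer(State state, int flags) {
        return state != State.ACTIVE && (flags & FLAG_REPLAY) == 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldBuffer(State.ACTIVE, 0));              // prints false
        System.out.println(shouldBuffer(State.BUFFERING, 0));           // prints true
        System.out.println(shouldBuffer(State.BUFFERING, FLAG_REPLAY)); // prints false
    }
}
```

In this model, an update arriving while the state is anything other than {{ACTIVE}} is buffered unless it is marked as a replay, which is the case the existing test never reaches.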

However, the test I have attached does go through the code above, which 
results in the updates that came from log replay being buffered. As a result, 
the {{compareFingerPrint()}} check at the end of {{PeerSync.handleUpdates()}} 
fails whenever PeerSync is triggered while the core is not in the {{ACTIVE}} 
state. I am not entirely sure why a core would be in the {{ACTIVE}} state if 
PeerSync was triggered (which is what happens in {{PeerSyncTest}}). 

I think the {{compareFingerPrint}} check should be moved out of the 
{{PeerSync}} class and into {{RecoveryStrategy}}, so that it runs after the 
buffered updates are applied. It might also be a good idea to move the commit 
to after log replay, as the current commit seems to result in a NOOP.
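
The ordering problem can be illustrated with a toy model. This is only a sketch under stated assumptions: {{ToyReplica}} and its methods are hypothetical, the fingerprint is a simple hash over indexed versions rather than Solr's {{IndexFingerprint}}, and the real recovery flow involves more steps than shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model of a replica; these types and methods are hypothetical, not Solr classes.
final class ToyReplica {
    private final TreeSet<Long> indexedVersions = new TreeSet<>();
    private final List<Long> bufferedVersions = new ArrayList<>();
    private boolean active;

    ToyReplica(boolean active) {
        this.active = active;
    }

    // While the replica is not active, updates are parked in the buffer
    // instead of being indexed (the branch quoted in the comment).
    void add(long version) {
        if (active) {
            indexedVersions.add(version);
        } else {
            bufferedVersions.add(version);
        }
    }

    // Applies everything that was buffered, then marks the replica active.
    void applyBufferedUpdates() {
        indexedVersions.addAll(bufferedVersions);
        bufferedVersions.clear();
        active = true;
    }

    // Stand-in fingerprint: a hash over indexed versions only.
    // Buffered versions are invisible to it.
    long fingerprint() {
        long h = 0;
        for (long v : indexedVersions) {
            h = 31 * h + v;
        }
        return h;
    }

    public static void main(String[] args) {
        ToyReplica leader = new ToyReplica(true);
        ToyReplica recovering = new ToyReplica(false);
        for (long v = 1; v <= 3; v++) {
            leader.add(v);
            recovering.add(v);
        }
        // Comparing before the buffer is applied: the fingerprints differ.
        System.out.println(leader.fingerprint() == recovering.fingerprint()); // prints false
        recovering.applyBufferedUpdates();
        // Comparing after the buffer is applied: the fingerprints match.
        System.out.println(leader.fingerprint() == recovering.fingerprint()); // prints true
    }
}
```

In this model the comparison only succeeds once the buffered updates have been applied, which is the motivation for running the check from {{RecoveryStrategy}} at that point.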

> PeerSync fails on a node restart due to IndexFingerPrint mismatch
> -----------------------------------------------------------------
>
>                 Key: SOLR-9310
>                 URL: https://issues.apache.org/jira/browse/SOLR-9310
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Pushkar Raste
>         Attachments: PeerSyncReplicationTest.patch
>
>
> I found that Peer Sync fails if a node restarts and documents were indexed 
> while the node was down. The IndexFingerPrint check fails after the 
> recovering node applies updates. 
> This happens only when the node restarts, not when the node merely misses 
> updates for a reason other than being down.
> Please check the attached patch for the test.
