[ 
https://issues.apache.org/jira/browse/SOLR-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15795803#comment-15795803
 ] 

ASF subversion and git services commented on SOLR-9915:
-------------------------------------------------------

Commit 122fa6cf64a56dd5ab5aff84f7f5c9a1305bde4e in lucene-solr's branch 
refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=122fa6c ]

SOLR-9915: PeerSync alreadyInSync check is not backwards compatible and results 
in full replication during a rolling restart


> PeerSync alreadyInSync check is not backwards compatible
> --------------------------------------------------------
>
>                 Key: SOLR-9915
>                 URL: https://issues.apache.org/jira/browse/SOLR-9915
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: replication (java)
>    Affects Versions: 6.3
>            Reporter: Tim Owen
>            Assignee: Noble Paul
>            Priority: Blocker
>         Attachments: SOLR-9915.patch
>
>
> The fingerprint check added to PeerSync in SOLR-9446 works fine when all
> servers are running 6.3, but it makes a rolling upgrade from e.g. 6.2.1 to
> 6.3 hard: the 6.3 server asks a 6.2.1 server for a fingerprint and then hits
> an NPE, because the older server doesn't return the expected field in its
> response.
> This leads to the PeerSync completely failing, and results in a full index 
> replication from scratch, copying all index files over the network. We 
> noticed this happening when we tried to do a rolling upgrade on one of our 
> 6.2.1 clusters to 6.3. Unfortunately this amount of replication was hammering 
> our disks and network, so we had to do a full shutdown, upgrade all to 6.3 
> and restart, which was not ideal for a production cluster.
> The attached patch should behave more gracefully in this situation, as it 
> will typically return false for alreadyInSync() and then carry on doing the 
> normal re-sync based on versions.
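
For illustration only, the behaviour described above (treat a missing
fingerprint as "not in sync" rather than failing) could be sketched roughly as
follows. This is not the attached patch and not Solr's actual code: the class
name, the method signature and the "fingerprint" response key are all
assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the fingerprint comparison; not Solr's PeerSync class.
class PeerSyncSketch {

    // Assumed response key; the real field name is not given in this message.
    static final String FINGERPRINT_KEY = "fingerprint";

    /**
     * Returns true only when the peer reported a fingerprint equal to ours.
     * A missing response or missing field (for example from an older peer)
     * yields false instead of a NullPointerException, so the caller can fall
     * back to the normal version-based re-sync.
     */
    static boolean alreadyInSync(Map<String, Object> peerResponse, Long ourFingerprint) {
        if (peerResponse == null || ourFingerprint == null) {
            return false;
        }
        Object theirs = peerResponse.get(FINGERPRINT_KEY);
        if (theirs == null) {
            return false; // older peer: field absent, so we cannot claim to be in sync
        }
        return ourFingerprint.equals(theirs);
    }

    public static void main(String[] args) {
        Map<String, Object> oldPeer = new HashMap<>(); // no fingerprint field at all
        Map<String, Object> newPeer = new HashMap<>();
        newPeer.put(FINGERPRINT_KEY, 42L);

        System.out.println(alreadyInSync(oldPeer, 42L)); // false
        System.out.println(alreadyInSync(newPeer, 42L)); // true
        System.out.println(alreadyInSync(newPeer, 7L));  // false
    }
}
```

The point of the sketch is that a false result is the safe default: the caller
simply proceeds with the version-based sync instead of aborting and triggering
a full index replication.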



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

