[ 
https://issues.apache.org/jira/browse/SOLR-7932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702773#comment-14702773
 ] 

Ramkumar Aiyengar commented on SOLR-7932:
-----------------------------------------

Thanks for the comments [~ysee...@gmail.com]. With master-slave replication, 
yes, this is less of a problem (you still have to deal with clock skew though).

There are two places where the index time is used:

 - To check whether the two timestamps are equal, in order to skip replication. 
Unless I am mistaken, the timestamp check is not useful for detecting index 
re-creation in this case.
 - To check if full index replication should be forced. I see the use here 
(though I don't see an easy way you can do this in a cloud without stopping the 
entire cloud, blowing away the index on one but not all replicas, and making 
sure that replica comes up first).

I am more concerned really about the first case, as you can lose data if you 
are unlucky. Do you agree that the timestamp check can be removed there?
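
To make the first case concrete, here is a minimal sketch of how an equality 
check on wall times can report "in sync" for two different commits. CommitInfo 
and both method names are hypothetical, made up for illustration; they are not 
Solr or Lucene APIs, and the generation-aware variant is only an assumed 
alternative, not what the attached patch does.

```java
public class SyncCheckSketch {

    /** Hypothetical stand-in for a commit point; not a Solr or Lucene class. */
    static final class CommitInfo {
        final long timestampMillis; // wall time stamped by the committing node's own clock
        final long generation;      // commit generation on that node

        CommitInfo(long timestampMillis, long generation) {
            this.timestampMillis = timestampMillis;
            this.generation = generation;
        }
    }

    /** Same shape as the timestamp equality check quoted in the issue. */
    static boolean inSyncByTimestamp(CommitInfo local, CommitInfo remote) {
        return local.timestampMillis == remote.timestampMillis;
    }

    /** Assumed stricter variant, for illustration only: generations must match as well. */
    static boolean inSyncByTimestampAndGeneration(CommitInfo local, CommitInfo remote) {
        return local.timestampMillis == remote.timestampMillis
                && local.generation == remote.generation;
    }

    public static void main(String[] args) {
        // Local clock runs 5 ms fast: a commit at true time 1000 is stamped 1005.
        CommitInfo local = new CommitInfo(1000L + 5L, 7L);
        // Remote clock is accurate: a later, different commit at true time 1005.
        CommitInfo remote = new CommitInfo(1005L, 9L);

        System.out.println("timestamp only: " + inSyncByTimestamp(local, remote));
        System.out.println("timestamp and generation: "
                + inSyncByTimestampAndGeneration(local, remote));
    }
}
```

With these made-up values the timestamp-only check returns true even though 
the two commits are different, so replication would be skipped.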

For the second, probably the index creation time is a better thing to check 
against than the last commit time, as it is less subject to skew? I don't know 
if Lucene even provides a way to know when the index was initially created, 
though. And this could be tackled as a separate issue.

> Solr replication relies on timestamps to sync across machines
> -------------------------------------------------------------
>
>                 Key: SOLR-7932
>                 URL: https://issues.apache.org/jira/browse/SOLR-7932
>             Project: Solr
>          Issue Type: Bug
>          Components: replication (java)
>            Reporter: Ramkumar Aiyengar
>         Attachments: SOLR-7932.patch
>
>
> Spinning this off from SOLR-7859: I noticed there that the wall time recorded 
> as commit data on a commit is used to check if replication needs to be done. 
> In IndexFetcher, there is this code:
> {code}
>       if (!forceReplication
>           && IndexDeletionPolicyWrapper.getCommitTimestamp(commit) == latestVersion) {
>         //master and slave are already in sync just return
>         LOG.info("Slave in sync with master.");
>         successfulInstall = true;
>         return true;
>       }
> {code}
> It appears that we are comparing wall times across machines to check if we 
> are in sync, which could go wrong.
> Once a decision is made to replicate, we do seem to use generations instead, 
> except that the place below checks both generations and timestamps to see if 
> a full copy is needed:
> {code}
>       // if the generation of master is older than that of the slave, it
>       // means they are not compatible to be copied
>       // then a new index directory to be created and all the files need to
>       // be copied
>       boolean isFullCopyNeeded = IndexDeletionPolicyWrapper
>           .getCommitTimestamp(commit) >= latestVersion
>           || commit.getGeneration() >= latestGeneration || forceReplication;
> {code}


