[ https://issues.apache.org/jira/browse/SOLR-8263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024675#comment-15024675 ]

Renaud Delbru commented on SOLR-8263:
-------------------------------------

[~shalinmangar] Yes, you understood the sequence correctly. To be more precise, here is how it works:
1) The tlog files of the leader are downloaded into a temporary directory.
2) After the files have been downloaded properly, a write lock is acquired by the IndexFetcher. The original tlog directory is renamed as a backup directory, and the temporary directory is renamed as the active tlog directory.
3) The update log is reset with the new active log directory. During this reset, the recovery info is used to read the backup buffered tlog file, and every buffered operation is copied to the new buffered tlog.
4) The write lock is released, and the recovery operation continues and applies the buffered updates.

Indeed, the buffered tlog can contain operations that duplicate those in the replica update log. During the recovery operation, the replica might receive from the leader some operations that will be buffered, but that are also present in one of the tlogs downloaded from the leader. Apart from the disk space used by these duplicate operations and the additional network transfer, there is no harm, as these duplicate operations will be ignored by the peer cluster.

We could improve the tlog recovery operation to de-duplicate the buffered tlog while copying the buffered updates. We could check the version of the latest operation in the downloaded tlog, and skip operations from the buffered tlog if their version is lower than the latest known one. It should be a relatively small patch. I can try to work on that in the next few days and submit something, if that's fine with you and [~erickerickson]?
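
To make the de-duplication idea concrete, here is a minimal sketch, written in Python for brevity rather than as Solr code. The names `Op`, `latest_version` and `copy_buffered` are hypothetical and do not correspond to anything in the update log API. It assumes that every operation carries a numeric version, that a larger magnitude means a newer operation (deletes are assumed to carry negated versions, hence the `abs`), and that the downloaded tlog contains every operation up to its highest version. Under those assumptions an operation whose version equals the latest known one is also a duplicate, so the sketch skips it too.

```python
from typing import Iterable, Iterator, NamedTuple


class Op(NamedTuple):
    """Hypothetical stand-in for one tlog entry."""
    version: int  # assumed: larger magnitude means newer; negative for deletes
    doc_id: str


def latest_version(downloaded: Iterable[Op]) -> int:
    """Highest version magnitude in the downloaded tlog, or 0 if it is empty."""
    return max((abs(op.version) for op in downloaded), default=0)


def copy_buffered(buffered: Iterable[Op], latest_known: int) -> Iterator[Op]:
    """Yield only the buffered operations newer than the latest known version."""
    for op in buffered:
        if abs(op.version) > latest_known:
            yield op


# Operations 101 and 102 were buffered during recovery but also arrived
# in the tlog downloaded from the leader; only 103 is new.
downloaded = [Op(100, "a"), Op(101, "b"), Op(-102, "a")]
buffered = [Op(101, "b"), Op(-102, "a"), Op(103, "c")]
new_buffered = list(copy_buffered(buffered, latest_version(downloaded)))
```

If the downloaded tlog is empty, `latest_version` returns 0 and every buffered operation is copied unchanged, which matches the current behaviour.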

> Tlog replication could interfere with the replay of buffered updates
> --------------------------------------------------------------------
>
>                 Key: SOLR-8263
>                 URL: https://issues.apache.org/jira/browse/SOLR-8263
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Renaud Delbru
>            Assignee: Erick Erickson
>         Attachments: SOLR-8263-trunk-1.patch, SOLR-8263-trunk-2.patch
>
>
> The current implementation of the tlog replication might interfere with the
> replay of the buffered updates. The current tlog replication works as follows:
> 1) Fetch the tlog files from the master.
> 2) Reset the update log before switching the tlog directory.
> 3) Switch the tlog directory and re-initialise the update log with the new
> directory.
> Currently there is no logic to keep "buffered updates" while resetting and
> reinitializing the update log.