[ https://issues.apache.org/jira/browse/SOLR-8263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044802#comment-15044802 ]
Renaud Delbru commented on SOLR-8263:
-------------------------------------

While the patch SOLR-8263-trunk-3, which added the dedup logic for the buffered updates, seemed straightforward, it introduced an issue that could lead to loss of documents. The dedup logic used the version of the last operation from the tlog files transferred from the master as its starting point. However, these tlog files were not in sync with the index commit point; they were likely ahead of it (i.e., they contained operations that occurred after the index commit point). The starting point of the dedup logic was therefore ahead of the index commit point, and the dedup logic dropped all operations that occurred between the index commit point and the time the tlog files were transferred from the master.

To solve this, we had to modify the ReplicationHandler to filter out tlog files that are not associated with a given commit point. To find the tlog files associated with an index commit point, we fetch the max version of the index commit using VersionInfo.getMaxVersionFromIndex and use this version number to discard tlog files. Each tlog file name encodes the version of its starting operation (this was originally used for seeking more efficiently across multiple tlog files), and we use this starting version to discard tlogs that were created after the commit point (i.e., if starting version > max version).

The new patch committed by Erick includes this approach.
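The filtering rule described above (discard a tlog if its starting version > max version of the commit point) can be sketched roughly as follows. This is only an illustration and not the code from the patch: the class TlogFilterSketch, its method names, and the file name layout "tlog.<id>.<startVersion>" are assumptions made for the example, and the max version is passed in as a plain long rather than being obtained from VersionInfo.getMaxVersionFromIndex. It also assumes positive version numbers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: keeps only the tlog files that belong to a commit point.
class TlogFilterSketch {

    // Assumed name layout: "tlog.<id>.<startVersion>"; the text after the
    // last dot is parsed as the version of the file's first operation.
    static long startVersion(String fileName) {
        int lastDot = fileName.lastIndexOf('.');
        return Long.parseLong(fileName.substring(lastDot + 1));
    }

    // Drops every tlog whose starting version is greater than the max
    // version found in the index commit, i.e. tlogs created after it.
    static List<String> filterByCommitPoint(List<String> tlogNames, long maxVersionInIndex) {
        List<String> kept = new ArrayList<>();
        for (String name : tlogNames) {
            if (startVersion(name) <= maxVersionInIndex) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tlogs = Arrays.asList(
                "tlog.0000000000000000001.100",
                "tlog.0000000000000000002.200",
                "tlog.0000000000000000003.300");
        // With a commit point whose max version is 250, the third file
        // started after the commit and is filtered out.
        System.out.println(filterByCommitPoint(tlogs, 250L));
    }
}
```

With such a filter, the last operation in the transferred tlogs can no longer start after the index commit point, so the starting point of the dedup logic should not skip buffered updates that began after the commit.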
> Tlog replication could interfere with the replay of buffered updates
> --------------------------------------------------------------------
>
>                 Key: SOLR-8263
>                 URL: https://issues.apache.org/jira/browse/SOLR-8263
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Renaud Delbru
>            Assignee: Erick Erickson
>             Fix For: 5.5, 6.0
>
>         Attachments: SOLR-6273-plus-8263-5x.patch, SOLR-8263-trunk-1.patch,
>                      SOLR-8263-trunk-2.patch, SOLR-8263-trunk-3.patch, SOLR-8263.patch
>
>
> The current implementation of the tlog replication might interfere with the
> replay of the buffered updates. The current tlog replication works as follows:
> 1) Fetch the tlog files from the master
> 2) Reset the update log before switching the tlog directory
> 3) Switch the tlog directory and re-initialise the update log with the new
> directory.
> Currently there is no logic to keep "buffered updates" while resetting and
> reinitializing the update log.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)