I agree that using a single tlog for 2 purposes is confusing. Perhaps a separate tlog used purely for buffering during recovery/peer-sync etc. would be clearer? You said "another separate file", but I imagine we can reuse our tlog code with another instance pointed at another directory.
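
To make the idea concrete, here is a rough sketch of how two log instances might interact. Everything in it is hypothetical: the class and method names are made up for illustration and are not Solr's UpdateLog API, the "logs" are plain in-memory lists rather than files, and it assumes buffered updates are applied to the index at the moment they are appended to the main tlog.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Solr.
final class BufferingLogSketch {
    // Main tlog: every entry here has also been applied to the index.
    private final List<String> tlog = new ArrayList<>();
    // Separate log instance: entries received while buffering, not yet applied.
    private final List<String> bufferTlog = new ArrayList<>();
    // Stand-in for the index.
    private final List<String> index = new ArrayList<>();
    private boolean buffering = false;

    void startBuffering() {
        buffering = true;
    }

    // An update fetched from a peer during peer-sync: logged and applied right away.
    void addFromPeerSync(String update) {
        tlog.add(update);
        index.add(update);
    }

    // An incoming update: while buffering, it only goes to the buffer log.
    void addIncoming(String update) {
        if (buffering) {
            bufferTlog.add(update);
        } else {
            tlog.add(update);
            index.add(update);
        }
    }

    // After a successful peer-sync: append buffered updates to the main tlog
    // in arrival order, apply them (an assumption of this sketch), and reset.
    void finishBuffering() {
        for (String update : bufferTlog) {
            tlog.add(update);
            index.add(update);
        }
        bufferTlog.clear();
        buffering = false;
    }

    List<String> tlog() {
        return new ArrayList<>(tlog);
    }

    List<String> bufferTlog() {
        return new ArrayList<>(bufferTlog);
    }

    List<String> index() {
        return new ArrayList<>(index);
    }
}
```

With the interleaving from the quoted example (old1, new1, new2, old2, new3, old3), the main tlog never mixes applied and unapplied entries: it holds only the old updates until finishBuffering() appends the new ones after them.
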
FWIW I actually question if buffering should happen at all due to the complexity it brings (e.g. SOLR-8030) vs blocking then failing if blocked for too long... but I guess that ship has sailed.

On Tue, Dec 27, 2016 at 5:52 PM Đạt Cao Mạnh <[email protected]> wrote:

> Currently, we write buffering logs to current tlog and not apply that
> updates to index. Then we rely on replay log to apply that updates to
> index. But at the same time there are some updates also write to current
> tlog and applied to the index.
>
> For example, during peersync, if new updates come to replica we will end
> up with this tlog
> tlog : old1, new1, new2, old2, new3, old3
> old updates belong to peersync, and these updates are applied to the index.
> new updates belong to buffering updates, and these updates are not applied
> to the index.
>
> But writing all the updates to same current tlog make code base very
> complex. Should we write buffering updates to another temporary file?
>
> By doing this, it will help our code base simpler. It also makes replica
> recovery for SOLR-9835 more easier. Because after peersync success we can
> copy new updates from temporary file to current tlog, for example
> tlog : old1, old2, old3
> temporary tlog : new1, new2, new3
> -->
> tlog : old1, old2, old3, new1, new2, new3
>
> Note that in SOLR-9835 we can not rely on fingerprint for peersync
> because updates are not applied to replicas.
>
--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
