[ https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15778369#comment-15778369 ]
Cao Manh Dat edited comment on SOLR-9835 at 12/26/16 1:50 PM:
--------------------------------------------------------------

[~yo...@apache.org] [~ysee...@gmail.com]: Here is a scenario for the problem that I encountered today:
- A replica (let's call it rep1) is in recovering mode, so its ulog is in the buffering state.
- rep1 receives an update (containing doc1). rep1 writes the update to its tlog without updating ulog.map for real-time-get.
- rep1 replays the buffered updates. rep1 writes doc1 to its index and updates ulog.map for real-time-get (but in this case ulog.map will point doc1 -> position = -1, because we don't write an updateCommand with the REPLAY flag to the tlog).
- A client calls real-time-get for doc1.
- rep1 will always open a new real-time-searcher in this case, because ulog.map returns position = -1 for doc1.

I just wonder why we do that currently. Why don't we just write the update to both the tlog and ulog.map, so that we don't have to open a new real-time-searcher in this case?
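
To make the scenario above concrete, here is a minimal toy model in Java. It is not Solr's UpdateLog code: the class ToyUpdateLog, its fields, and the trackBufferedPositions switch are all hypothetical names invented for illustration. The sketch only assumes what the comment describes: buffered updates go to the tlog without a map entry, replay records position = -1, and a real-time-get that sees -1 has to open a new searcher. The switch models the question asked in the comment (record the tlog position at buffering time); it is not a validated fix.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the scenario. NOT Solr's UpdateLog; every name here is hypothetical. */
class ToyUpdateLog {
    enum State { ACTIVE, BUFFERING }

    static final long NO_POSITION = -1L;

    State state = State.ACTIVE;
    final List<String[]> tlog = new ArrayList<>();      // append-only {id, value} records
    final Map<String, Long> map = new HashMap<>();      // id -> tlog position (stand-in for ulog.map)
    final Map<String, String> index = new HashMap<>();  // stand-in for the Lucene index
    int realtimeSearcherOpens = 0;                      // counts the costly reopen
    private int bufferStart = 0;
    private final boolean trackBufferedPositions;       // assumption: models the question in the comment

    ToyUpdateLog(boolean trackBufferedPositions) {
        this.trackBufferedPositions = trackBufferedPositions;
    }

    void startBuffering() {
        state = State.BUFFERING;
        bufferStart = tlog.size();
    }

    void add(String id, String value) {
        long pos = tlog.size();
        tlog.add(new String[] {id, value});
        if (state == State.BUFFERING) {
            // Buffered update: written to the tlog only, not applied to the index.
            if (trackBufferedPositions) {
                map.put(id, pos);
            }
            return;
        }
        index.put(id, value);
        map.put(id, pos);
    }

    void replayBuffered() {
        for (int pos = bufferStart; pos < tlog.size(); pos++) {
            String[] rec = tlog.get(pos);
            index.put(rec[0], rec[1]);
            if (!trackBufferedPositions) {
                // Replayed commands are not written to the tlog again,
                // so the map has no usable position to record.
                map.put(rec[0], NO_POSITION);
            }
        }
        state = State.ACTIVE;
    }

    String realTimeGet(String id) {
        Long pos = map.get(id);
        if (pos == null) {
            return index.get(id);               // not tracked: read the index as-is
        }
        if (pos >= 0) {
            return tlog.get(pos.intValue())[1]; // served straight from the tlog
        }
        realtimeSearcherOpens++;                // position = -1: must reopen a searcher
        return index.get(id);
    }

    public static void main(String[] args) {
        for (boolean track : new boolean[] {false, true}) {
            ToyUpdateLog rep1 = new ToyUpdateLog(track);
            rep1.startBuffering();
            rep1.add("doc1", "v1");
            rep1.replayBuffered();
            rep1.realTimeGet("doc1");
            rep1.realTimeGet("doc1");
            System.out.println("trackBufferedPositions=" + track
                + " searcher opens=" + rep1.realtimeSearcherOpens);
        }
    }
}
```

In this toy model each real-time-get for doc1 triggers a reopen when the map holds -1, and none when the buffered position was recorded. Whether recording the position at buffering time is safe in the real UpdateLog is exactly the open question of the comment.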
> Create another replication mode for SolrCloud
> ---------------------------------------------
>
>                 Key: SOLR-9835
>                 URL: https://issues.apache.org/jira/browse/SOLR-9835
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Cao Manh Dat
>            Assignee: Shalin Shekhar Mangar
>         Attachments: SOLR-9835.patch, SOLR-9835.patch
>
>
> The current replication mechanism of SolrCloud is called state machine:
> replicas start in the same initial state, and each input is distributed
> across the replicas so that all replicas end up in the same next state.
> But this type of replication has some drawbacks:
> - The commit (which is costly) has to run on all replicas.
> - Slow recovery, because if a replica misses more than N updates during its
> downtime, the replica has to download the entire index from its leader.
> So we create another replication mode for SolrCloud called state transfer,
> which acts like master/slave replication. Basically:
> - The leader distributes the update to the other replicas, but only the
> leader applies the update to IW; the other replicas just store the update in
> the UpdateLog (acting like replication).
> - Replicas frequently poll the latest segments from the leader.
> Pros:
> - Lightweight for indexing, because only the leader runs commits and applies
> updates.
> - Very fast recovery: replicas just have to download the missing segments.
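
The state transfer mode described in the issue can be sketched with a small toy model in Java. This is only an illustration under stated assumptions, not the attached patch: StateTransferSketch, Leader, Replica, and pollSegments are hypothetical names, a "segment" is modeled as an immutable list of documents, and polling is a plain method call rather than a scheduled task.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of state transfer replication. All names are hypothetical. */
class StateTransferSketch {

    static class Leader {
        final List<String> pending = new ArrayList<>();        // docs applied to the IW stand-in, not yet committed
        final List<List<String>> segments = new ArrayList<>(); // committed segments
        int commits = 0;

        // The leader applies the update itself and forwards it to every replica.
        void add(String doc, List<Replica> replicas) {
            pending.add(doc);
            for (Replica r : replicas) {
                r.receive(doc);
            }
        }

        // Only the leader pays the cost of a commit.
        void commit() {
            if (pending.isEmpty()) {
                return;
            }
            segments.add(new ArrayList<>(pending));
            pending.clear();
            commits++;
        }
    }

    static class Replica {
        final List<String> updateLog = new ArrayList<>();      // updates are stored, not indexed
        final List<List<String>> segments = new ArrayList<>(); // segments copied from the leader
        final int commits = 0;                                  // a replica never commits in this model

        void receive(String doc) {
            updateLog.add(doc);
        }

        // Copy only the segments this replica does not have yet; returns how many were fetched.
        int pollSegments(Leader leader) {
            int fetched = 0;
            for (int i = segments.size(); i < leader.segments.size(); i++) {
                segments.add(leader.segments.get(i));
                fetched++;
            }
            return fetched;
        }
    }

    public static void main(String[] args) {
        Leader leader = new Leader();
        Replica rep1 = new Replica();
        List<Replica> replicas = new ArrayList<>();
        replicas.add(rep1);

        leader.add("doc1", replicas);
        leader.add("doc2", replicas);
        leader.commit();
        System.out.println("rep1 fetched " + rep1.pollSegments(leader) + " segment(s)");

        // A replica that was down simply downloads the segments it is missing.
        Replica rep2 = new Replica();
        leader.add("doc3", replicas);
        leader.commit();
        System.out.println("rep2 fetched " + rep2.pollSegments(leader) + " segment(s)");
        System.out.println("leader commits=" + leader.commits + ", rep1 commits=" + rep1.commits);
    }
}
```

The sketch shows the two claimed advantages in miniature: the commit counter only ever advances on the leader, and a replica that missed updates catches up by fetching the missing segments instead of the whole index.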