[
https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15778369#comment-15778369
]
Cao Manh Dat commented on SOLR-9835:
------------------------------------
[[email protected]][[email protected]] : Here are scenario for the problem
that I encountered today
- an replica ( let's call it rep1 ) is on recovering mode -> its ulog will be
on buffering state.
- rep1 receives an update ( contain doc1 ), rep1 will write the update to its
tlog without updating ulog.map for real-time-get
- rep1 replay buffered updates, rep1 will write doc1 to its index, and update
ulog.map for real-time-get ( but in this case, ulog.map will point doc1 ->
position = -1 because we don't write updateCommand with REPLAY flag to tlog )
- client call real-time-get for doc1
- rep1 will always open a real-time-searcher for this case
I just wonder why we do that currently? Why don't we just write the update to
tlog and ulog.map so we don't have to open a new real-time-searcher for this
case?
> Create another replication mode for SolrCloud
> ---------------------------------------------
>
> Key: SOLR-9835
> URL: https://issues.apache.org/jira/browse/SOLR-9835
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Cao Manh Dat
> Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-9835.patch, SOLR-9835.patch
>
>
> The current replication mechanism of SolrCloud is called state machine, which
> replicas start in same initial state and for each input, the input is
> distributed across replicas so all replicas will end up with same next state.
> But this type of replication have some drawbacks
> - The commit (which costly) have to run on all replicas
> - Slow recovery, because if replica miss more than N updates on its down
> time, the replica have to download entire index from its leader.
> So we create create another replication mode for SolrCloud called state
> transfer, which acts like master/slave replication. In basically
> - Leader distribute the update to other replicas, but the leader only apply
> the update to IW, other replicas just store the update to UpdateLog (act like
> replication).
> - Replicas frequently polling the latest segments from leader.
> Pros:
> - Lightweight for indexing, because only leader are running the commit,
> updates.
> - Very fast recovery, replicas just have to download the missing segments.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]