[
https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15815930#comment-15815930
]
Tomás Fernández Löbbe commented on SOLR-9835:
---------------------------------------------
Great idea! I just took a quick look at the patch to understand this better. I
have a couple of questions/comments. I know this is work in progress, so feel
free to disregard any of my comments if you are already working on them:
{code}
onlyLeaderIndexes = zkStateReader.getClusterState().getCollection(collection).getLiveReplicas() == 1;
{code}
Maybe add a method to DocCollection like {{isOnlyLeaderIndexes()}} (or choose
another name)? I understand why you did this, but this check is repeated many
times in the patch, so it could be centralized in one place for now.
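As a rough sketch of that suggestion (not the actual DocCollection code): the class below is a hypothetical stand-in, and it assumes {{getLiveReplicas()}} returns a nullable Integer, which the patch may or may not do.

```java
// Hypothetical stand-in for DocCollection; only the part relevant here.
final class CollectionStateSketch {
    // Assumption: null means the liveReplicas parameter was never set.
    private final Integer liveReplicas;

    CollectionStateSketch(Integer liveReplicas) {
        this.liveReplicas = liveReplicas;
    }

    Integer getLiveReplicas() {
        return liveReplicas;
    }

    /** True only when the collection was created with liveReplicas=1. */
    boolean isOnlyLeaderIndexes() {
        return liveReplicas != null && liveReplicas == 1;
    }
}
```

Callers would then write {{collection.isOnlyLeaderIndexes()}} instead of repeating the comparison, and the null handling lives in a single place.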
{code}
private Map<String, ReplicateFromLeader> replicateFromLeaders = new HashMap<>();
{code}
Does this need to be synchronized?
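If the map can be touched from more than one thread, one option is a {{ConcurrentHashMap}} with atomic register/remove operations. This is only a sketch under that assumption; {{Replicator}} and the method names are hypothetical placeholders, not the patch's {{ReplicateFromLeader}} API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ReplicateFromLeaderRegistry {
    // Hypothetical placeholder for the patch's ReplicateFromLeader.
    static final class Replicator {
        final String coreName;

        Replicator(String coreName) {
            this.coreName = coreName;
        }
    }

    // ConcurrentHashMap makes individual operations thread-safe without external locking.
    private final Map<String, Replicator> replicateFromLeaders = new ConcurrentHashMap<>();

    /** Returns the existing replicator for the core, or atomically creates one. */
    Replicator start(String coreName) {
        return replicateFromLeaders.computeIfAbsent(coreName, Replicator::new);
    }

    /** Removes and returns the replicator for the core, or null if none was registered. */
    Replicator stop(String coreName) {
        return replicateFromLeaders.remove(coreName);
    }
}
```

If the map is only ever accessed while holding some other lock, a plain {{HashMap}} is fine and this change is unnecessary.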
{code}
- private final String masterUrl;
+ private String masterUrl;
{code}
Should {{masterUrl}} now be volatile, given that it is no longer final?
{code}
+ public static boolean waitForInSyncWithLeader(SolrCore core, Replica
leaderReplica) throws InterruptedException {
+ if (waitForReplicasInSync == null) return true;
+
+ Pair<Boolean,Integer> pair = parseValue(waitForReplicasInSync);
+ boolean enabled = pair.first();
+ if (!enabled) return true;
+
+ Thread.sleep(1000);
+ HttpSolrClient leaderClient = new
HttpSolrClient.Builder(leaderReplica.getCoreUrl()).build();
+ long leaderVersion = -1;
+ String localVersion = null;
+ try {
+ for (int i = 0; i < pair.second(); i++) {
+ if (core.isClosed()) return true;
+ ModifiableSolrParams params = new ModifiableSolrParams();
+ params.set(CommonParams.QT, ReplicationHandler.PATH);
+ params.set(COMMAND, CMD_DETAILS);
+
+ NamedList<Object> response = leaderClient.request(new
QueryRequest(params));
+ leaderVersion = (long)
((NamedList)response.get("details")).get("indexVersion");
+
+ localVersion =
core.getDeletionPolicy().getLatestCommit().getUserData().get(SolrIndexWriter.COMMIT_TIME_MSEC_KEY);
+ if (localVersion == null && leaderVersion == 0) return true;
+
+ if (localVersion != null && Long.parseLong(localVersion) ==
leaderVersion) {
+ return true;
+ } else {
+ Thread.sleep(500);
+ }
+ }
+
+ } catch (Exception e) {
+ log.error("Exception when wait for replicas in sync with master");
+ } finally {
+ try {
+ if (leaderClient != null) leaderClient.close();
+ } catch (IOException e) {
+ e.printStackTrace();
+ }
+ }
+
+ return false;
+ }
{code}
In many cases in the tests the leader will change before the replication
happens, right? Does it make sense to discover the leader inside of the loop?
Also, is there a way to remove that Thread.sleep(1000) at the beginning? This
code will be called very frequently in tests.
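To illustrate both points, here is a minimal, Solr-independent sketch of the loop shape I have in mind: the leader is looked up again on every attempt, and the first check happens immediately, with a sleep only between attempts. The class, method, and parameter names are all hypothetical; in the patch the lookups would go through the cluster state and the replication handler.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;
import java.util.function.ToLongFunction;

final class InSyncWaiter {
    /**
     * Polls until the local version matches the current leader's version.
     *
     * @param leaderUrlLookup      resolves the current leader (called on every attempt)
     * @param leaderVersionFetcher fetches the index version from the given leader
     * @param localVersion         reads the local index version
     * @return true if the versions matched within maxAttempts, false otherwise
     */
    static boolean waitForInSync(Supplier<String> leaderUrlLookup,
                                 ToLongFunction<String> leaderVersionFetcher,
                                 LongSupplier localVersion,
                                 int maxAttempts,
                                 long sleepMillis) throws InterruptedException {
        for (int i = 0; i < maxAttempts; i++) {
            // Re-resolve the leader each time, since it may have changed.
            String leaderUrl = leaderUrlLookup.get();
            if (leaderVersionFetcher.applyAsLong(leaderUrl) == localVersion.getAsLong()) {
                return true;
            }
            // Sleep only between attempts: no up-front delay, none after the last try.
            if (i < maxAttempts - 1) {
                Thread.sleep(sleepMillis);
            }
        }
        return false;
    }
}
```

With this shape, a replica that is already in sync returns on the first iteration without sleeping at all, which should matter for tests that call this frequently.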
> Create another replication mode for SolrCloud
> ---------------------------------------------
>
> Key: SOLR-9835
> URL: https://issues.apache.org/jira/browse/SOLR-9835
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Cao Manh Dat
> Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch,
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch
>
>
> The current replication mechanism of SolrCloud is a state machine approach:
> replicas start in the same initial state, and each input is distributed
> across the replicas so that all replicas end up in the same next state.
> This type of replication has some drawbacks:
> - The commit (which is costly) has to run on all replicas.
> - Recovery is slow, because if a replica misses more than N updates during
> its downtime, it has to download the entire index from its leader.
> So we create another replication mode for SolrCloud called state
> transfer, which acts like master/slave replication. Basically:
> - The leader distributes each update to the other replicas, but only the
> leader applies the update to the IW; the other replicas just store the
> update in the UpdateLog (act like replication).
> - Replicas frequently poll the leader for the latest segments.
> Pros:
> - Lightweight for indexing, because only the leader runs commits and
> applies updates.
> - Very fast recovery: replicas just have to download the missing segments.
> To use this new replication mode, a new collection must be created with an
> additional parameter {{liveReplicas=1}}:
> {code}
> http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&liveReplicas=1
> {code}