[ 
https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15815930#comment-15815930
 ] 

Tomás Fernández Löbbe commented on SOLR-9835:
---------------------------------------------

Great idea! I just took a quick look at the patch to understand this better. I 
have a couple of questions/comments. I know this is work in progress, so feel 
free to disregard any of my comments if you are already working on them:

{code}
onlyLeaderIndexes = zkStateReader.getClusterState().getCollection(collection).getLiveReplicas() == 1;
{code}
Maybe add a method to DocCollection like {{isOnlyLeaderIndexes()}} (or choose 
another name)? I understand why you did this, but this code is repeated many 
times, so it may be worth improving even at this stage.
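
One possible shape for such a helper, as a minimal sketch. The class below is a simplified stand-in, not Solr's real DocCollection: the {{liveReplicas}} field, its {{Integer}} type, the constructor, and the method name are assumptions made only for illustration.

```java
import java.util.Objects;

/** Simplified stand-in for DocCollection (hypothetical; not the real Solr class). */
final class DocCollectionSketch {
  private final String name;
  // Assumption: mirrors the liveReplicas collection parameter; null when it was not set.
  private final Integer liveReplicas;

  DocCollectionSketch(String name, Integer liveReplicas) {
    this.name = Objects.requireNonNull(name);
    this.liveReplicas = liveReplicas;
  }

  String getName() {
    return name;
  }

  Integer getLiveReplicas() {
    return liveReplicas;
  }

  /** One place for the check that the patch currently repeats at each call site. */
  boolean isOnlyLeaderIndexes() {
    return liveReplicas != null && liveReplicas == 1;
  }
}
```

Call sites would then read {{getCollection(collection).isOnlyLeaderIndexes()}} instead of comparing against {{1}} inline, so a later change to how the mode is detected touches a single method.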

{code}
private Map<String, ReplicateFromLeader> replicateFromLeaders = new HashMap<>();
{code}
Does this need to be synchronized?
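
If the map can be reached from more than one thread (I have not verified that it can), one option is a {{ConcurrentHashMap}} rather than external synchronization. A minimal sketch follows; the {{ReplicateFromLeader}} class here is only a placeholder so the example compiles, and the {{start}}/{{stop}} method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ReplicateFromLeaderRegistry {

  /** Placeholder for the patch's ReplicateFromLeader; it only carries a core name here. */
  static final class ReplicateFromLeader {
    final String coreName;

    ReplicateFromLeader(String coreName) {
      this.coreName = coreName;
    }
  }

  // ConcurrentHashMap makes individual get/put/remove calls safe without extra locking.
  private final Map<String, ReplicateFromLeader> replicateFromLeaders = new ConcurrentHashMap<>();

  /** Registers a replicator for the core at most once, even if callers race. */
  ReplicateFromLeader start(String coreName) {
    return replicateFromLeaders.computeIfAbsent(coreName, ReplicateFromLeader::new);
  }

  /** Removes and returns the replicator for the core, or null if none was registered. */
  ReplicateFromLeader stop(String coreName) {
    return replicateFromLeaders.remove(coreName);
  }
}
```

{{computeIfAbsent}} avoids the check-then-put race that a plain {{HashMap}} would have if two threads started replication for the same core at once.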

{code}
-  private final String masterUrl;
+  private String masterUrl;
{code}
Should {{masterUrl}} now be volatile?
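
To illustrate what I mean, assuming the field is now written by one thread and read by another (the class and accessor names below are hypothetical, not taken from the patch):

```java
final class FetcherSketch {
  // volatile: a write made by one thread is visible to subsequent reads from other threads.
  private volatile String masterUrl;

  FetcherSketch(String masterUrl) {
    this.masterUrl = masterUrl;
  }

  /** May be called from a different thread than the one that reads the URL. */
  void setMasterUrl(String masterUrl) {
    this.masterUrl = masterUrl;
  }

  String getMasterUrl() {
    return masterUrl;
  }
}
```

If the field is only ever assigned before the object is shared between threads, this is unnecessary.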

{code}
+  public static boolean waitForInSyncWithLeader(SolrCore core, Replica leaderReplica) throws InterruptedException {
+    if (waitForReplicasInSync == null) return true;
+
+    Pair<Boolean,Integer> pair = parseValue(waitForReplicasInSync);
+    boolean enabled = pair.first();
+    if (!enabled) return true;
+
+    Thread.sleep(1000);
+    HttpSolrClient leaderClient = new HttpSolrClient.Builder(leaderReplica.getCoreUrl()).build();
+    long leaderVersion = -1;
+    String localVersion = null;
+    try {
+      for (int i = 0; i < pair.second(); i++) {
+        if (core.isClosed()) return true;
+        ModifiableSolrParams params = new ModifiableSolrParams();
+        params.set(CommonParams.QT, ReplicationHandler.PATH);
+        params.set(COMMAND, CMD_DETAILS);
+
+        NamedList<Object> response = leaderClient.request(new QueryRequest(params));
+        leaderVersion = (long) ((NamedList)response.get("details")).get("indexVersion");
+
+        localVersion = core.getDeletionPolicy().getLatestCommit().getUserData().get(SolrIndexWriter.COMMIT_TIME_MSEC_KEY);
+        if (localVersion == null && leaderVersion == 0) return true;
+
+        if (localVersion != null && Long.parseLong(localVersion) == leaderVersion) {
+          return true;
+        } else {
+          Thread.sleep(500);
+        }
+      }
+
+    } catch (Exception e) {
+      log.error("Exception when wait for replicas in sync with master");
+    } finally {
+      try {
+        if (leaderClient != null) leaderClient.close();
+      } catch (IOException e) {
+        e.printStackTrace();
+      }
+    }
+
+    return false;
+  }

{code}
In many cases in the tests, the leader will change before the replication 
happens, right? Would it make sense to discover the leader inside the loop? 
Also, is there a way to remove the {{Thread.sleep(1000)}} at the beginning? This 
code will be called very frequently in tests.
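
Roughly what I have in mind, as a sketch rather than a drop-in replacement: the leader is looked up on every attempt, and the only sleeps are the ones between attempts. The three functional parameters are hypothetical stand-ins for the ZooKeeper leader lookup, the replication-details request, and the local commit version; none of these names come from the patch.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;
import java.util.function.ToLongFunction;

final class InSyncWaiter {

  /**
   * Polls until the local index version matches the current leader's version.
   * Returns false if no match was seen within maxAttempts or if the thread was interrupted.
   */
  static boolean waitForInSync(Supplier<String> leaderUrlLookup,
                               ToLongFunction<String> leaderVersionFetcher,
                               LongSupplier localVersion,
                               int maxAttempts,
                               long pauseMillis) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      // Look the leader up on every attempt so a leader change is picked up.
      String leaderUrl = leaderUrlLookup.get();
      if (leaderUrl != null
          && leaderVersionFetcher.applyAsLong(leaderUrl) == localVersion.getAsLong()) {
        return true;
      }
      // No up-front sleep: pause only between attempts, and not after the last one.
      if (attempt < maxAttempts - 1) {
        try {
          Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt for the caller
          return false;
        }
      }
    }
    return false;
  }
}
```

With this shape, the first check happens immediately, so tests where the replica is already in sync pay no fixed delay.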

> Create another replication mode for SolrCloud
> ---------------------------------------------
>
>                 Key: SOLR-9835
>                 URL: https://issues.apache.org/jira/browse/SOLR-9835
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Cao Manh Dat
>            Assignee: Shalin Shekhar Mangar
>         Attachments: SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch
>
>
> The current replication mechanism of SolrCloud is called state machine 
> replication: replicas start in the same initial state, and each input is 
> distributed across the replicas so that all of them end up in the same next 
> state. This type of replication has some drawbacks:
> - The commit (which is costly) has to run on all replicas.
> - Slow recovery, because if a replica misses more than N updates during its 
> downtime, it has to download the entire index from its leader.
> So we create another replication mode for SolrCloud called state transfer, 
> which acts like master/slave replication. Basically:
> - The leader distributes the update to the other replicas, but only the 
> leader applies the update to the IW; the other replicas just store the update 
> in the UpdateLog (acting like replication).
> - Replicas frequently poll the latest segments from the leader.
> Pros:
> - Lightweight for indexing, because only the leader runs the commits and 
> updates.
> - Very fast recovery: replicas just have to download the missing segments.
> To use this new replication mode, a new collection must be created with an 
> additional parameter {{liveReplicas=1}}
> {code}
> http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&liveReplicas=1
> {code}



