[ https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cao Manh Dat updated SOLR-9835:
-------------------------------
    Description: 
The current replication mechanism of SolrCloud is state-machine replication: 
replicas start from the same initial state, and each input is distributed across 
the replicas so that all of them end up in the same next state.

But this type of replication has some drawbacks:
- The commit (which is costly) has to run on every replica.
- Slow recovery: if a replica misses more than N updates during its downtime, it 
has to download the entire index from its leader.

So we are creating another replication mode for SolrCloud, called state 
transfer, which acts like master/slave replication. Basically (see the sketch 
below):
- The leader distributes each update to the other replicas, but only the leader 
applies the update to its IndexWriter; the other replicas just store the update 
in their UpdateLog (acting like old-style replication).
- Replicas frequently poll the latest segments from the leader.
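
To make the intended flow concrete, here is a minimal, hypothetical Java sketch 
of the per-replica update path. The class and method structure is illustrative 
only (it is not the actual patch); {{AddUpdateCommand}}, {{UpdateLog}} and 
Lucene's {{IndexWriter}} are the real classes involved.
{code}
import java.io.IOException;

import org.apache.lucene.index.IndexWriter;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.UpdateLog;

// Illustrative sketch only: in the proposed "state transfer" mode, every replica
// records the update in its UpdateLog, but only the leader touches the IndexWriter
// (and therefore only the leader pays for commits).
public class StateTransferUpdateSketch {

  void processAdd(AddUpdateCommand cmd, boolean isLeader,
                  IndexWriter writer, UpdateLog updateLog) throws IOException {
    if (isLeader) {
      // Only the leader applies the update to its local index.
      writer.addDocument(cmd.getLuceneDocument());
    }
    // All replicas (leader included) keep the update in the UpdateLog,
    // so it remains available for real-time get and replay.
    updateLog.add(cmd);
  }
}
{code}
Non-leader replicas then rely on periodically fetching the latest segments from 
the leader, rather than indexing each update themselves.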

Pros:
- Lightweight indexing, because only the leader runs commits and applies updates.
- Very fast recovery: replicas only have to download the missing segments.

From a CAP point of view, this ticket tries to promise end users a distributed 
system with:
- Partition tolerance
- Weak consistency for normal queries: the cluster can serve stale data. This 
happens between the leader finishing a commit and a slave fetching the latest 
segments; the window is at most {{pollInterval + time to fetch the latest 
segments}}.
- Consistency for RTG: as long as we *do not use DBQs* (delete-by-query), 
replicas stay consistent with the leader, just like in the original SolrCloud 
mode (see the example below).
- Weak availability: just like the original SolrCloud mode, if a leader goes 
down, clients must wait until a new leader is elected.
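
As a rough illustration of that consistency difference, here is a small SolrJ 
example using only standard client calls ({{newCollection}} and the document id 
{{doc1}} are placeholders): a normal search on a replica may be stale by up to 
the window above, while real-time get is answered from the replica's UpdateLog.
{code}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrDocument;

public class RtgVsSearchExample {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client = new HttpSolrClient.Builder(
        "http://localhost:8983/solr/newCollection").build()) {

      // Normal search: on a replica this can lag the leader by up to
      // pollInterval + the time needed to fetch the latest segments.
      client.query(new SolrQuery("id:doc1"));

      // Real-time get: served from the UpdateLog, so (as long as no DBQs
      // are used) it reflects the latest update the replica has accepted.
      SolrDocument latest = client.getById("doc1");
      System.out.println(latest);
    }
  }
}
{code}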

To use this new replication mode, a new collection must be created with the 
additional parameter {{realtimeReplicas=1}}:
{code}
http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&realtimeReplicas=1
{code}
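
For completeness, the same call can be issued from SolrJ. The sketch below 
simply mirrors the URL above via a generic Collections API request; whether the 
patch also adds a dedicated {{realtimeReplicas}} setter to 
{{CollectionAdminRequest}} is not assumed here.
{code}
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class CreateRealtimeReplicaCollection {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client = new HttpSolrClient.Builder(
        "http://localhost:8983/solr").build()) {

      ModifiableSolrParams params = new ModifiableSolrParams();
      params.set("action", "CREATE");
      params.set("name", "newCollection");
      params.set("numShards", 2);
      params.set("replicationFactor", 1);
      params.set("realtimeReplicas", 1);  // the new parameter introduced by this ticket

      // Same request as the HTTP URL above, sent through SolrJ.
      client.request(new GenericSolrRequest(
          SolrRequest.METHOD.GET, "/admin/collections", params));
    }
  }
}
{code}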


> Create another replication mode for SolrCloud
> ---------------------------------------------
>
>                 Key: SOLR-9835
>                 URL: https://issues.apache.org/jira/browse/SOLR-9835
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Cao Manh Dat
>            Assignee: Shalin Shekhar Mangar
>         Attachments: SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch
>


