[jira] [Commented] (SOLR-14648) Creating TLOG with pure multiple PULL replica, leading to 0 doc count

Erick Erickson (Jira) Tue, 14 Jul 2020 07:28:11 -0700


    [ https://issues.apache.org/jira/browse/SOLR-14648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157424#comment-17157424 ]


Erick Erickson commented on SOLR-14648:
---------------------------------------

First, I agree that nuking all the existing PULL replica indexes is A Bad Thing 
in the scenario you point out.

That said, I don't think automatically picking a PULL replica to grab the 
index from is a good solution, since there's no guarantee that any PULL 
replica is up to date. Consider this scenario:

PULL replica1 is offline.

The TLOG and other PULL replicas get lots of updates. For some unfathomable 
reason, everything except replica1 is taken down. Now a new TLOG replica is 
created and it gets the index from replica1, which isn't up to date at all.

This is just the most egregious scenario. In a more realistic scenario, the 
PULL replicas may or may not have gotten the latest index changes when the TLOG 
replica goes down, so how would it be possible to choose among them? In fact 
there's no guarantee at all that _any_ of the PULL replicas remaining have the 
latest updates.
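
To make the scenario above concrete, here is a toy simulation. It is only a 
sketch and is not Solr code: the Replica class, the index_version counter and 
the replicate() helper are hypothetical names made up for illustration, and it 
assumes replica1 comes back online only after everything else has been taken 
down.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    kind: str                # "TLOG" or "PULL"
    index_version: int = 0   # hypothetical stand-in for how current the index is
    online: bool = True


def replicate(leader, followers):
    """Copy the leader's index to every follower that is currently online."""
    for follower in followers:
        if follower.online:
            follower.index_version = leader.index_version


leader = Replica("tlog1", "TLOG")
pulls = [Replica("replica1", "PULL"), Replica("replica2", "PULL")]

pulls[0].online = False        # PULL replica1 is offline
leader.index_version = 50      # the leader receives lots of updates
replicate(leader, pulls)       # only replica2 sees them

# Everything except replica1 is taken down; replica1 comes back.
leader.online = False
pulls[1].online = False
pulls[0].online = True

# A new TLOG replica that picks the "best" surviving PULL replica has only
# replica1 to choose from, and replica1 never saw the updates.
survivors = [r for r in pulls if r.online]
best = max(survivors, key=lambda r: r.index_version)
lost_updates = leader.index_version - best.index_version
print(best.name, best.index_version, lost_updates)
```

The point is that "most recent among the survivors" says nothing about how 
far behind the last leader the chosen replica is.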

I'm thinking of something along the lines of making creation of a new TLOG 
replica fail in your scenario or, more generally, fail whenever there are no 
active TLOG replicas (leaders) in the existing collection. We'd need a way to 
fix this, perhaps a way to "promote" a PULL replica to a TLOG replica. There'd 
still be the possibility of losing documents, but just like FORCELEADER we can 
document this and let people decide whether it's worth the risk or they should 
just reindex everything.
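
The kind of guard described above might look roughly like the following 
sketch. It is not existing Solr behavior. check_add_replica, 
NoLeaderCandidateError and the accept_data_loss flag are hypothetical names, 
and the shard dictionary is assumed to have the per-shard shape that 
CLUSTERSTATUS reports (a "replicas" map whose entries carry "state" and 
"type"). Treating a missing "type" as NRT and letting an empty shard through 
are also assumptions of this sketch.

```python
LEADER_ELIGIBLE = {"NRT", "TLOG"}   # PULL replicas can never become leader


class NoLeaderCandidateError(RuntimeError):
    """Adding this replica could replace every existing PULL index."""


def has_active_leader_candidate(shard):
    """True if any active replica in the shard is of a leader-eligible type."""
    return any(
        r.get("state") == "active" and r.get("type", "NRT") in LEADER_ELIGIBLE
        for r in shard.get("replicas", {}).values()
    )


def check_add_replica(shard, new_type, accept_data_loss=False):
    """Raise unless adding a replica of new_type to this shard looks safe.

    accept_data_loss is a hypothetical explicit opt-in, in the spirit of
    FORCELEADER: the user has read the warning and accepts the risk.
    """
    replicas = shard.get("replicas", {})
    if new_type not in LEADER_ELIGIBLE or not replicas:
        return  # a PULL replica, or a brand-new shard with nothing to lose
    if has_active_leader_candidate(shard) or accept_data_loss:
        return
    raise NoLeaderCandidateError(
        "shard has only PULL replicas; a new %s replica would become leader "
        "with an empty index" % new_type
    )


# Usage: a shard left with only PULL replicas rejects a new TLOG replica.
only_pulls = {"replicas": {
    "core_node3": {"state": "active", "type": "PULL"},
    "core_node5": {"state": "active", "type": "PULL"},
}}
try:
    check_add_replica(only_pulls, "TLOG")
except NoLeaderCandidateError as err:
    print("refused:", err)
```

Failing by default and requiring an explicit opt-in keeps the decision about 
possible document loss with the user rather than making it silently.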

We should not risk data inconsistency without somehow making sure that the 
users understand the risk.

> Creating TLOG with pure multiple PULL replica, leading to 0 doc count
> ---------------------------------------------------------------------
>
>                 Key: SOLR-14648
>                 URL: https://issues.apache.org/jira/browse/SOLR-14648
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 8.3.1
>            Reporter: Sayan Das
>            Priority: Major
>
> With only PULL replicas present, whenever we create a new TLOG replica as 
> leader, a fresh replication happens, flushing the older indexes from the 
> existing PULL replicas.
> Steps to replicate:
>  # Create 1 NRT or 1 TLOG replica as leader with multiple PULL replicas
>  # Index a few documents and let them replicate to all the replicas
>  # Delete all the TLOG/NRT replicas, leaving only the PULL replicas
>  # Create a new TLOG/NRT replica as leader; once recovery completes, it 
> replaces all the older indexes
> Ideally, it should have replicated from whichever PULL replica has the 
> latest index, and only after that should the TLOG/NRT replica be registered 
> as leader.



