[ 
https://issues.apache.org/jira/browse/SOLR-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Kroiss updated SOLR-17306:
--------------------------------
    Attachment: solr-replication-test.txt

> Solr Repeater or Slave loses data after restart when replication is not 
> enabled on leader
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-17306
>                 URL: https://issues.apache.org/jira/browse/SOLR-17306
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 9.2, 9.3, 9.4, 9.6, 9.5.0
>            Reporter: Peter Kroiss
>            Priority: Major
>         Attachments: solr-replication-test.txt
>
>
> We are testing Solr 9.6.2 in a leader-repeater-follower configuration.
> During periods of heavy writes to the leader, we disable replication on it
> to save bandwidth.
> If the repeater restarts for some reason while replication is disabled on
> the leader, the repeater loses all of its documents.
> The documents are deleted, but the indexVersion and generation properties
> are set to the leader's values, so the repeater or follower does not recover
> when replication is re-enabled on the leader.
> It recovers only when there are commits on the leader after replication has
> been re-enabled.
> Log:
> 2024-05-22 06:18:42.186 INFO  (qtp16373883-27-null-23) [c: s: r: x:mycore 
> t:null-23] o.a.s.c.S.Request webapp=/solr path=/replication 
> params={wt=json&command=details} status=0 QTime=10
> 2024-05-22 06:18:46.195 INFO  (indexFetcher-43-thread-1) [c: s: r: x:mycore 
> t:] o.a.s.h.IndexFetcher Leader's generation: 0
> 2024-05-22 06:18:46.195 INFO  (indexFetcher-43-thread-1) [c: s: r: x:mycore 
> t:] o.a.s.h.IndexFetcher Leader's version: 0
> 2024-05-22 06:18:46.195 INFO  (indexFetcher-43-thread-1) [c: s: r: x:mycore 
> t:] o.a.s.h.IndexFetcher Follower's generation: 2913
> 2024-05-22 06:18:46.195 INFO  (indexFetcher-43-thread-1) [c: s: r: x:mycore 
> t:] o.a.s.h.IndexFetcher Follower's version: 1716300697144
> 2024-05-22 06:18:46.195 INFO  (indexFetcher-43-thread-1) [c: s: r: x:mycore 
> t:] o.a.s.h.IndexFetcher New index in Leader. Deleting mine...
>  
> --> There is no new index on the leader; it is only closed for replication.
>  
>  
> We think the problem is in IndexFetcher. Adding a forceReplication check to
> the condition will probably fix it:
> old: if (IndexDeletionPolicyWrapper.getCommitTimestamp(commit) != 0L) {
> new: if (forceReplication &&
>          IndexDeletionPolicyWrapper.getCommitTimestamp(commit) != 0L) {
>  
>  
>  
>  
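
The difference between the two conditions quoted above can be illustrated with a small standalone sketch. It is not Solr code: the class, enum, and method names are hypothetical, and it models only the situation shown in the log, where the leader reports version 0 while the follower holds a non-empty index. It also assumes that periodic polling runs with forceReplication set to false, which the report does not state.

```java
// Hypothetical, simplified model of the check discussed in the report.
// None of these names are Solr APIs; only the two boolean conditions
// mirror the "old" and "new" lines quoted in the description.
class ReplicationDecisionSketch {

  enum Action { DELETE_LOCAL_INDEX, NOTHING_TO_DO }

  // "old" condition: any non-zero local commit timestamp leads to deletion
  // once the leader has reported version 0.
  static Action oldCheck(long localCommitTimestamp) {
    return localCommitTimestamp != 0L
        ? Action.DELETE_LOCAL_INDEX
        : Action.NOTHING_TO_DO;
  }

  // "new" condition proposed in the report: deletion additionally requires
  // that replication was explicitly forced.
  static Action proposedCheck(boolean forceReplication, long localCommitTimestamp) {
    return (forceReplication && localCommitTimestamp != 0L)
        ? Action.DELETE_LOCAL_INDEX
        : Action.NOTHING_TO_DO;
  }

  public static void main(String[] args) {
    // Follower's version from the log, used here as a stand-in for a
    // non-zero commit timestamp (an assumption of this sketch).
    long followerVersion = 1716300697144L;

    System.out.println("old:      " + oldCheck(followerVersion));
    System.out.println("proposed: " + proposedCheck(false, followerVersion));
  }
}
```

Under these assumptions, the old check decides to delete the local index during a regular poll, while the proposed check leaves it in place unless replication is forced.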
> While investigating the problem, we also found an inconsistency in the
> details request. The response contains two leader fragments. When the leader
> is closed for replication, the property leader.replicationEnabled is set to
> true, while the property follower.leaderDetails.leader.replicationEnabled
> is correct.
>  
> Example:
> curl -s "https://solr9-repeater:8983/solr/mycore/replication?wt=json&command=details" |
>   jq '.details | {
>     indexSize: .indexSize,
>     indexVersion: .indexVersion,
>     generation: .generation,
>     indexPath: .indexPath,
>     leader: {
>       replicableVersion: .leader.replicableVersion,
>       replicableGeneration: .leader.replicableGeneration,
>       replicationEnabled: .leader.replicationEnabled
>     },
>     follower: {
>       leaderDetails: {
>         indexSize: .follower.leaderDetails.indexSize,
>         generation: .follower.leaderDetails.generation,
>         indexVersion: .follower.leaderDetails.indexVersion,
>         indexPath: .follower.leaderDetails.indexPath,
>         leader: {
>           replicableVersion: .follower.leaderDetails.leader.replicableVersion,
>           replicableGeneration: .follower.leaderDetails.leader.replicableGeneration,
>           replicationEnabled: .follower.leaderDetails.leader.replicationEnabled
>         }
>       }
>     }
>   }'
>  
> {
>   "indexSize": "10.34 GB",
>   "indexVersion": 1716358708159,
>   "generation": 2913,
>   "indexPath": "/var/solr/data/mycore/data/index.20240522061946262",
>   "leader": {
>     "replicableVersion": 1716358708159,
>     "replicableGeneration": 2913,
>     "replicationEnabled": "true"
>   },
>   "follower": {
>     "leaderDetails": {
>       "indexSize": "10.34 GB",
>       "generation": 2913,
>       "indexVersion": 1716358708159,
>       "indexPath": "/var/solr/data/mycore/data/restore.20240508131046932",
>       "leader": {
>         "replicableVersion": 1716358708159,
>         "replicableGeneration": 2913,
>         "replicationEnabled": "false"
>       }
>     }
>   }
> }
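
The mismatch between the two replicationEnabled values in the output above could be detected automatically with something like the following sketch. The class and method names are hypothetical, and the regular expression is a shortcut that assumes the flags appear as quoted "true"/"false" strings in the same order as in the filtered output of the report (top-level leader first, follower.leaderDetails.leader second). A real check would more likely use a proper JSON parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts every "replicationEnabled" flag from the
// filtered details output and reports whether the two values disagree.
class ReplicationFlagCheck {

  private static final Pattern FLAG =
      Pattern.compile("\"replicationEnabled\"\\s*:\\s*\"(true|false)\"");

  // Returns the flags in the order in which they appear in the text.
  static List<Boolean> extractFlags(String json) {
    List<Boolean> flags = new ArrayList<>();
    Matcher m = FLAG.matcher(json);
    while (m.find()) {
      flags.add(Boolean.parseBoolean(m.group(1)));
    }
    return flags;
  }

  // True only if exactly two flags were found and they differ.
  static boolean flagsDisagree(String json) {
    List<Boolean> flags = extractFlags(json);
    return flags.size() == 2 && !flags.get(0).equals(flags.get(1));
  }

  public static void main(String[] args) {
    // Trimmed-down version of the output shown in the report.
    String sample =
        "{ \"leader\": { \"replicationEnabled\": \"true\" },"
            + " \"follower\": { \"leaderDetails\": {"
            + " \"leader\": { \"replicationEnabled\": \"false\" } } } }";

    System.out.println("flags:    " + extractFlags(sample));
    System.out.println("disagree: " + flagsDisagree(sample));
  }
}
```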



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
