[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264277#comment-14264277
 ] 

Forest Soup edited comment on SOLR-6359 at 1/5/15 7:27 AM:
-----------------------------------------------------------

Thanks. But could the following case occur?
After a snapshot recovery of core A completes, its tlog is still out of date:
it gains no new records from the recovery, and it is not cleared. Suppose the
just-recovered core A then takes the leader role and another core (core C)
tries to recover from it. Since A's tlog contains only the old entries and not
the newest ones, will core C do a PeerSync with only the old records and miss
the newest ones?

Also, I think a snapshot recovery happens because the two cores differ too
much, which means the tlog gap is too large as well. So the out-of-date tlog
is no longer needed for PeerSync.

Our testing shows that snapshot recovery does not clean the tlog. The steps
were:
1, Core A and core B are 2 replicas of a shard.
2, Core A goes down and core B takes the leader role. B takes some updates and
records them in its tlog.
3, After A comes back up, it recovers from B, and if the difference is too
large, A does a snapshot pull recovery. No other updates come in during the
snapshot pull recovery. After it finishes, the tlog of A has not been updated:
it still does NOT contain any of the most recent updates from B.
So the tlog is still out of date, although the index of A is already up to date.
4, Core A goes down again and core B remains the leader. B takes some more
updates and records them in its tlog.
5, After A comes back up again, it recovers from B. But it finds that its tlog
is still too old, so it does a snapshot recovery again, which is not necessary.
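
To make the steps above concrete, here is a toy Python model of the scenario.
It is not Solr code: the Core class, the can_peer_sync rule (the two tlogs must
share at least one version) and the update counts are my own simplifying
assumptions, and the real PeerSync comparison is more involved. The only number
taken from the issue is the hardcoded 100-record limit.

```python
from dataclasses import dataclass, field

NUM_RECORDS_TO_KEEP = 100  # hardcoded record limit quoted in the issue description


@dataclass
class Core:
    """Toy replica: an index (set of versions) plus a bounded tlog."""
    name: str
    index: set = field(default_factory=set)
    tlog: list = field(default_factory=list)

    def apply_update(self, version):
        self.index.add(version)
        self.tlog.append(version)
        # Keep only the newest NUM_RECORDS_TO_KEEP entries.
        del self.tlog[:-NUM_RECORDS_TO_KEEP]


def can_peer_sync(replica, leader):
    # Assumed rule (a simplification): the tlogs must share at least one version.
    return bool(set(replica.tlog) & set(leader.tlog))


def recover(replica, leader):
    if can_peer_sync(replica, leader):
        for version in leader.tlog:
            if version not in replica.index:
                replica.apply_update(version)
        return "peersync"
    # Snapshot pull: the index is copied but the tlog is left untouched,
    # which is the behavior observed in our testing.
    replica.index = set(leader.index)
    return "snapshot"


def run_scenario():
    a, b = Core("A"), Core("B")
    for v in range(1, 51):        # step 1: both replicas in sync
        a.apply_update(v)
        b.apply_update(v)
    for v in range(51, 251):      # step 2: A is down, B takes 200 updates
        b.apply_update(v)
    first = recover(a, b)         # step 3: first recovery of A
    for v in range(251, 261):     # step 4: A is down again, B takes 10 updates
        b.apply_update(v)
    gap = len(b.index - a.index)  # how far A's index is actually behind
    second = recover(a, b)        # step 5: second recovery of A
    return first, gap, second


if __name__ == "__main__":
    print(run_scenario())
```

In this model run_scenario() returns ('snapshot', 10, 'snapshot'): the second
recovery is again a full snapshot even though A's index is only 10 updates
behind B, because A's tlog was never refreshed by the first snapshot.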

Do you agree? Thanks!


> Allow customization of the number of records and logs kept by UpdateLog
> -----------------------------------------------------------------------
>
>                 Key: SOLR-6359
>                 URL: https://issues.apache.org/jira/browse/SOLR-6359
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Ramkumar Aiyengar
>            Assignee: Mark Miller
>            Priority: Minor
>             Fix For: 5.0, Trunk
>
>
> Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
> and the hardcoded numbers (100 records, 10 logs) can be quite low (especially 
> the records) in a heavy indexing setup, leading to full recovery even if Solr 
> was just stopped and restarted.
> These values should be customizable (even if only present as expert options).


