[ 
https://issues.apache.org/jira/browse/SOLR-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519457#comment-14519457
 ] 

Timothy Potter commented on SOLR-7332:
--------------------------------------

I haven't been able to reproduce this despite many stress tests on EC2, and 
it's starting to get expensive ;-)

bq. Were there any recoveries or change of leaders during the run?

There definitely could have been some recoveries, but I'm not sure. I'm now 
taking a snapshot of the cluster state before I run my tests so I can compare 
it with the state afterwards in case I do reproduce this. Yesterday I pushed 
it very hard with 48 reducers from Hadoop, which led to a network issue 
between the leader and the replica, and the leader put the replica into 
recovery (see SOLR-7483). However, the replica eventually recovered and was 
in sync with the leader at the end, which is goodness.

bq. No... 

Thanks for confirming. I was thinking that maybe it had something to do with 
this patch resetting the max after replaying the tlog:

From UpdateLog:
{code}
@@ -1247,6 +1269,12 @@
         // change the state while updates are still blocked to prevent races
         state = State.ACTIVE;
         if (finishing) {
+
+          // after replay, update the max from the index
+          log.info("Re-computing max version from index after log re-play.");
+          maxVersionFromIndex = null;
+          getMaxVersionFromIndex();
+
           versionInfo.unblockUpdates();
         }
{code}

But since updates are blocked while this happens, it seems like the right thing 
to do.
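
To make the reasoning concrete, here is a minimal sketch of the pattern in the diff: the cached max is dropped and recomputed only while updates are still blocked, so no update can observe a stale value in between. This is a simplified, hypothetical model and not Solr's UpdateLog or VersionInfo code; the class name, the read/write lock used to model blocking, and the LongSupplier standing in for the index read are all assumptions made for illustration.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongSupplier;

// Simplified, hypothetical model of the patched step; not Solr's actual classes.
class MaxVersionCacheSketch {
    // Assumption: updates would hold the read lock, blocking them takes the write lock.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    // Hypothetical stand-in for whatever reads the max version from the index.
    private final LongSupplier indexMaxReader;
    // Cached value; null means "recompute on next access".
    private Long maxVersionFromIndex;

    MaxVersionCacheSketch(LongSupplier indexMaxReader) {
        this.indexMaxReader = indexMaxReader;
    }

    void blockUpdates() { lock.writeLock().lock(); }

    void unblockUpdates() { lock.writeLock().unlock(); }

    synchronized long getMaxVersionFromIndex() {
        if (maxVersionFromIndex == null) {
            maxVersionFromIndex = indexMaxReader.getAsLong();
        }
        return maxVersionFromIndex;
    }

    // Mirrors the step in the diff: drop the cached max and recompute it,
    // but only while the caller still has updates blocked.
    synchronized void recomputeAfterReplay() {
        if (!lock.isWriteLockedByCurrentThread()) {
            throw new IllegalStateException("updates must be blocked during recompute");
        }
        maxVersionFromIndex = null;
        getMaxVersionFromIndex();
    }

    public static void main(String[] args) {
        long[] indexMax = {100L};
        MaxVersionCacheSketch cache = new MaxVersionCacheSketch(() -> indexMax[0]);
        System.out.println(cache.getMaxVersionFromIndex()); // cached before replay

        indexMax[0] = 250L; // log replay wrote newer versions to the index
        cache.blockUpdates();
        try {
            cache.recomputeAfterReplay();
        } finally {
            cache.unblockUpdates();
        }
        System.out.println(cache.getMaxVersionFromIndex()); // refreshed value
    }
}
```

In this model the cached value stays stale until the recompute runs, which is why doing the reset before unblocking updates looks safe.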

I'm going to run this a few more times using the same setup as when it first 
occurred. After that, I think we should commit this to trunk and see how it 
behaves for a few days, as the performance improvement is a big win.

> Seed version buckets with max version from index
> ------------------------------------------------
>
>                 Key: SOLR-7332
>                 URL: https://issues.apache.org/jira/browse/SOLR-7332
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>         Attachments: SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch, 
> SOLR-7332.patch, SOLR-7332.patch
>
>
> See the full discussion between Yonik and me in SOLR-6816.
> The TL;DR of that discussion is that we should initialize highest for each 
> version bucket to the MAX value of the {{_version_}} field in the index as 
> early as possible, such as after the first soft or hard commit. This will 
> ensure that bulk adds of docs that don't yet exist avoid an unnecessary 
> lookup for a non-existent document in the index.
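
The effect described in the issue summary can be sketched as follows. This is a minimal illustration under stated assumptions, not Solr's update processor: it models a single bucket, treats a highest value of 0 as "nothing seen yet", and uses an in-memory map as a hypothetical stand-in for the index. The class name, the lookups counter, and the accept method are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical single-bucket model of the seeding idea; not Solr's actual code.
class SeededVersionBucketSketch {
    // Highest version seen by this bucket; 0 is assumed to mean "unknown".
    private long highest;
    // Counts how often we had to fall back to a per-document lookup.
    int lookups = 0;
    // In-memory stand-in for the index: document id -> last indexed version.
    private final Map<String, Long> index = new HashMap<>();

    // seed would be the max version found in the index, or 0 if not seeded.
    SeededVersionBucketSketch(long seed) {
        this.highest = seed;
    }

    // Returns true if the update is applied, false if it is dropped as stale.
    boolean accept(String id, long version) {
        if (highest != 0 && version > highest) {
            // Newer than anything this bucket knows about: no lookup needed.
            highest = version;
            index.put(id, version);
            return true;
        }
        // Otherwise check the specific document's last version.
        lookups++;
        Long last = index.get(id);
        if (last != null && last >= version) {
            return false; // repeat or reordered update
        }
        highest = Math.max(highest, version);
        index.put(id, version);
        return true;
    }

    public static void main(String[] args) {
        SeededVersionBucketSketch unseeded = new SeededVersionBucketSketch(0L);
        unseeded.accept("doc1", 101L); // highest unknown, so a lookup happens
        System.out.println("unseeded lookups: " + unseeded.lookups);

        SeededVersionBucketSketch seeded = new SeededVersionBucketSketch(100L);
        seeded.accept("doc1", 101L); // newer than the seed, lookup skipped
        System.out.println("seeded lookups: " + seeded.lookups);
    }
}
```

With an unseeded bucket the first add of a new document pays for a lookup that cannot find anything; with the seed in place, the same add takes the fast path.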



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

