[ https://issues.apache.org/jira/browse/SOLR-7936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson resolved SOLR-7936.
----------------------------------
    Resolution: Cannot Reproduce

Can't get this to fail now.

> Bogus failure when deleting collections.
> ----------------------------------------
>
>                 Key: SOLR-7936
>                 URL: https://issues.apache.org/jira/browse/SOLR-7936
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>
> When looking at the CDCR test failures, we began to wonder whether the 
> problem was
> 1> the CDCR code itself
> 2> the test framework
> 3> Solr
> Some of the failures seem to be "impossible", assuming collection 
> creation/deletion works OK.
> So I wrote a little program to exercise collection creation/deletion outside 
> the test framework by just adding and deleting the same collection over and 
> over again. It started failing regularly in 
> OverseerCollectionMessageHandler.deleteCollection: at about line 780, it 
> would throw the "Could not fully remove collection" exception:
> {code}
>       TimeOut timeout = new TimeOut(30, TimeUnit.SECONDS);
>       boolean removed = false;
>       while (! timeout.hasTimedOut()) {
>         Thread.sleep(100);
>         // WORKS SO FAR IF UNCOMMENTED zkStateReader.updateClusterState();
>         removed = !zkStateReader.getClusterState().hasCollection(collection);
>         if (removed) {
>           Thread.sleep(500); // just a bit of time so it's more likely other
>                              // readers see on return
>           break;
>         }
>       }
>       if (!removed) {
>         throw new SolrException(ErrorCode.SERVER_ERROR,
>             "Could not fully remove collection: " + collection);
>       }
> {code}
> However, the collection really is gone from clusterstate. When I uncomment 
> the updateClusterState() call above, it no longer seems to fail. Is the fix 
> as simple as adding the updateClusterState() call?
> Without the update in place, it failed within 20 reps very regularly. So far, 
> with the update in place we're at 132 and counting. Any comments?
> If this runs 1,000 times tonight, I'll check it in if there are no 
> objections. I don't know what it means for CDCR yet though.
> I'm also suspicious of the 500ms sleep. Anyone have a clue what that's in 
> there for?
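
The "little program" mentioned in the description is not attached to the issue. What follows is only a rough sketch of what such a create/delete loop might look like, not the original code. The class name CollectionChurn, the CollectionOps interface, and the helpers httpOps and call are all hypothetical. The HTTP variant assumes a SolrCloud node reachable at a base URL such as http://localhost:8983/solr, a single config set already uploaded, and that the Collections API signals failure with a non-200 status.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class CollectionChurn {

    /** Hypothetical abstraction over "create" and "delete"; this is not a Solr class. */
    interface CollectionOps {
        void create(String name) throws Exception;
        void delete(String name) throws Exception;
    }

    /**
     * Creates and deletes the same collection up to maxReps times.
     * Returns the number of complete create/delete cycles before the first failure.
     */
    static int run(CollectionOps ops, String name, int maxReps) {
        for (int rep = 0; rep < maxReps; rep++) {
            try {
                ops.create(name);
                ops.delete(name);
            } catch (Exception e) {
                System.out.println("Failed on rep " + (rep + 1) + ": " + e.getMessage());
                return rep;
            }
        }
        return maxReps;
    }

    /** Assumed implementation that drives the Collections API over plain HTTP. */
    static CollectionOps httpOps(final String baseUrl) {
        return new CollectionOps() {
            public void create(String name) throws IOException {
                // Assumes a simple name that needs no URL encoding.
                call(baseUrl + "/admin/collections?action=CREATE&name=" + name
                        + "&numShards=1&replicationFactor=1");
            }
            public void delete(String name) throws IOException {
                call(baseUrl + "/admin/collections?action=DELETE&name=" + name);
            }
        };
    }

    /** Issues a GET and treats any status other than 200 as a failure. */
    static void call(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            int status = conn.getResponseCode();
            if (status != 200) {
                throw new IOException("HTTP " + status + " from " + url);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("usage: java CollectionChurn <solr-base-url> [reps]");
            return;
        }
        int reps = args.length > 1 ? Integer.parseInt(args[1]) : 1000;
        int completed = run(httpOps(args[0]), "churn_test", reps);
        System.out.println("Completed " + completed + " of " + reps + " create/delete cycles");
    }
}
```

Keeping the loop separate from the transport means the same harness could be pointed at SolrJ or at a stub, which helps when trying to tell a Solr problem apart from a test-framework problem.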



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
