[ https://issues.apache.org/jira/browse/SOLR-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15319716#comment-15319716 ]
ASF GitHub Bot commented on SOLR-9191:
--------------------------------------
Github user dragonsinth commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/41#discussion_r66172805
--- Diff: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java ---
@@ -466,6 +466,8 @@ private void markTaskComplete(String id, String asyncId)
log.warn("Could not find and remove async call [" + asyncId + "] from the running map.");
}
}
+
+ workQueue.remove(head);
--- End diff --
@markrmiller can you think of any reason not to do this? I don't
understand why removing items from the queue currently takes an extra
iteration. I think my fix unmasked a latent problem that DeleteStatusTest
exposes; to get that test to pass I have to eagerly remove completed items
from the work queue, which seems correct to me. I'm not sure why we'd want
to wait for a loop-around to `cleanUpWorkQueue()`.
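
The trade-off in the comment above can be sketched with a toy model. This is
only an illustration under stated assumptions: the class EagerRemovalSketch,
its fields, and the `eager` flag are hypothetical and do not mirror the real
OverseerTaskProcessor; only the method names markTaskComplete and
cleanUpWorkQueue are borrowed from the discussion.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy bookkeeping model; hypothetical, not the real OverseerTaskProcessor. */
final class EagerRemovalSketch {
    // Items the processor has pulled from the queue and not yet removed.
    final Set<String> workQueue = new LinkedHashSet<>();
    // Items that finished but are waiting for the next cleanup pass.
    final List<String> completed = new ArrayList<>();
    // Assumption for illustration: toggles the behavior proposed in the diff.
    private final boolean eager;

    EagerRemovalSketch(boolean eager) {
        this.eager = eager;
    }

    void markTaskComplete(String id) {
        completed.add(id);
        if (eager) {
            // Eager variant: drop the item as soon as it completes.
            workQueue.remove(id);
        }
    }

    /** Deferred variant relies on this running at the top of the next loop. */
    void cleanUpWorkQueue() {
        workQueue.removeAll(completed);
        completed.clear();
    }

    public static void main(String[] args) {
        EagerRemovalSketch deferred = new EagerRemovalSketch(false);
        deferred.workQueue.add("task-1");
        deferred.markTaskComplete("task-1");
        System.out.println(deferred.workQueue); // prints [task-1]
        deferred.cleanUpWorkQueue();
        System.out.println(deferred.workQueue); // prints []

        EagerRemovalSketch eagerly = new EagerRemovalSketch(true);
        eagerly.workQueue.add("task-1");
        eagerly.markTaskComplete("task-1");
        System.out.println(eagerly.workQueue);  // prints []
    }
}
```

In the deferred variant a completed item stays visible in the work queue until
the next cleanup pass; whether that window is what DeleteStatusTest trips over
is the open question in the comment, not something this sketch establishes.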
> OverseerTaskQueue.peekTopN() fatally flawed
> -------------------------------------------
>
> Key: SOLR-9191
> URL: https://issues.apache.org/jira/browse/SOLR-9191
> Project: Solr
> Issue Type: Bug
> Affects Versions: 5.4, 5.4.1, 5.5, 5.5.1, 6.0, 6.0.1
> Reporter: Scott Blum
> Assignee: Scott Blum
> Priority: Blocker
> Fix For: 5.6, 6.1, 5.5.2, 6.0.2, 6.2
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> We rewrote DistributedQueue in SOLR-6760 to optimize its obvious use case as
> a FIFO. But in doing so, we broke the assumptions in
> OverseerTaskQueue.peekTopN().
> OverseerTaskQueue.peekTopN() filters out items you're already working on;
> it's trying to peek for new items in the queue beyond what you already know
> about. But DistributedQueue (being designed as a FIFO) doesn't know about
> the filtering; as long as it has any items in memory, it just keeps
> returning those over and over without ever pulling new data from ZK. This is
> true even if the watcher has fired and marked the state as dirty. So
> OverseerTaskQueue gets into a state where it can never read new items in ZK
> because DQ keeps returning the same items that OverseerTaskQueue has already
> marked as in-progress.
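
The failure mode described in the issue can be reproduced with a small toy
model. This is a sketch under stated assumptions: CachedFifoSketch and all of
its members are hypothetical stand-ins, not the Solr DistributedQueue or
OverseerTaskQueue classes, and peekRefreshing shows one possible repair rather
than the patch that was committed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Toy model of a cached FIFO; hypothetical names, not the Solr classes. */
final class CachedFifoSketch {
    // Stands in for the child nodes stored in ZK.
    private final TreeSet<String> backingStore = new TreeSet<>();
    // Stands in for the queue's in-memory view of those children.
    private final TreeSet<String> knownChildren = new TreeSet<>();
    // Stands in for the flag a watcher would set when ZK changes.
    private boolean dirty = true;

    void offer(String id) {
        backingStore.add(id);
        dirty = true;
    }

    /** Flawed: refetches only when the cache is empty and ignores the dirty flag. */
    List<String> peekFlawed(int n, Predicate<String> accept) {
        if (knownChildren.isEmpty()) {
            refresh();
        }
        return firstAccepted(n, accept);
    }

    /** One possible repair (an assumption, not the committed patch): refetch when dirty. */
    List<String> peekRefreshing(int n, Predicate<String> accept) {
        if (dirty || knownChildren.isEmpty()) {
            refresh();
        }
        return firstAccepted(n, accept);
    }

    private void refresh() {
        knownChildren.clear();
        knownChildren.addAll(backingStore);
        dirty = false;
    }

    // Returns up to n cached ids, in order, that pass the caller's filter.
    private List<String> firstAccepted(int n, Predicate<String> accept) {
        List<String> out = new ArrayList<>();
        for (String id : knownChildren) {
            if (out.size() >= n) {
                break;
            }
            if (accept.test(id)) {
                out.add(id);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        CachedFifoSketch q = new CachedFifoSketch();
        q.offer("qn-1");
        q.offer("qn-2");
        List<String> inProgress = new ArrayList<>(q.peekFlawed(10, id -> true));
        q.offer("qn-3"); // arrives after the cache was filled
        System.out.println(q.peekFlawed(10, id -> !inProgress.contains(id)));     // prints []
        System.out.println(q.peekRefreshing(10, id -> !inProgress.contains(id))); // prints [qn-3]
    }
}
```

Because the cache still holds qn-1 and qn-2, the flawed peek never refetches,
and the caller's filter rejects everything it returns, so qn-3 stays invisible.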