[ https://issues.apache.org/jira/browse/SOLR-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15319717#comment-15319717 ]
ASF GitHub Bot commented on SOLR-9191:
--------------------------------------

Github user dragonsinth commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/41#discussion_r66172870

    --- Diff: solr/core/src/test/org/apache/solr/cloud/DistributedQueueTest.java ---
    @@ -137,6 +136,49 @@ public void testDistributedQueueBlocking() throws Exception {
         assertNull(dq.poll());
       }
     
    +  @Test
    +  public void testPeekElements() throws Exception {
    +    String dqZNode = "/distqueue/test";
    +    byte[] data = "hello world".getBytes(UTF8);
    +
    +    DistributedQueue dq = makeDistributedQueue(dqZNode);
    +
    +    // Populate with data.
    +    dq.offer(data);
    +    dq.offer(data);
    +    dq.offer(data);
    +
    +    // Should be able to get 0, 1, 2, or 3 instantly
    +    for (int i = 0; i <= 3; ++i) {
    +      assertEquals(i, dq.peekElements(i, 0, child -> true).size());
    +    }
    +
    +    // Asking for more should return only 3.
    +    assertEquals(3, dq.peekElements(4, 0, child -> true).size());
    +
    +    // If we filter everything out, we should block for the full time.
    +    long start = System.nanoTime();
    +    assertEquals(0, dq.peekElements(4, 1000, child -> false).size());
    +    assertTrue(System.nanoTime() - start >= TimeUnit.MILLISECONDS.toNanos(500));
    +
    +    // If someone adds a new matching element while we're waiting, we should return immediately.
    +    executor.submit(() -> {
    +      try {
    +        Thread.sleep(500);
    +        dq.offer(data);
    +      } catch (Exception e) {
    +        // ignore
    +      }
    +    });
    +    start = System.nanoTime();
    +    assertEquals(1, dq.peekElements(4, 2000, child -> {
    +      // The 4th element in the queue will end with a "3".
    +      return child.endsWith("3");
    +    }).size());
    +    assertTrue(System.nanoTime() - start < TimeUnit.MILLISECONDS.toNanos(1000));
    +    assertTrue(System.nanoTime() - start >= TimeUnit.MILLISECONDS.toNanos(250));
    +  }
    +
    --- End diff --

    @markrmiller the new test you suggested

> OverseerTaskQueue.peekTopN() fatally flawed
> -------------------------------------------
>
>                 Key: SOLR-9191
>                 URL: https://issues.apache.org/jira/browse/SOLR-9191
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.4, 5.4.1, 5.5, 5.5.1, 6.0, 6.0.1
>            Reporter: Scott Blum
>            Assignee: Scott Blum
>            Priority: Blocker
>             Fix For: 5.6, 6.1, 5.5.2, 6.0.2, 6.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We rewrote DistributedQueue in SOLR-6760, to optimize its obvious use case as
> a FIFO. But in doing so, we broke the assumptions in
> OverseerTaskQueue.peekTopN().
> OverseerTaskQueue.peekTopN() involves filtering out items you're already
> working on; it's trying to peek for new items in the queue beyond what you
> already know about. But DistributedQueue (being designed as a FIFO) doesn't
> know about the filtering; as long as it has any items in-memory it just keeps
> returning those over and over without ever pulling new data from ZK. This is
> true even if the watcher has fired and marked the state as dirty. So
> OverseerTaskQueue gets into a state where it can never read new items in ZK
> because DQ keeps returning the same items that it has marked as in-progress.
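
The failure mode described in the issue can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Solr's DistributedQueue: the class CachedFifoSketch, its fields, and the methods peekFlawed and peekRefreshing are all hypothetical names, a TreeSet stands in for the children stored in ZK, and honoring the dirty flag before filtering is just one possible remedy, not necessarily what the pull request does.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Toy in-memory model of a cached FIFO; hypothetical, not Solr's DistributedQueue. */
class CachedFifoSketch {
  // Stand-in for the queue children stored in ZooKeeper (assumption).
  private final TreeSet<String> backingStore = new TreeSet<>();
  // Local cache of children, updated only by refresh().
  private final TreeSet<String> knownChildren = new TreeSet<>();
  // Set whenever the backing store changes, as a fired watcher would do.
  private boolean dirty = false;
  private int nextId = 0;

  /** Adds a new sequentially named child and marks the cache as stale. */
  void offer() {
    backingStore.add(String.format("qn-%010d", nextId++));
    dirty = true;
  }

  /** Replaces the cache with the current contents of the backing store. */
  private void refresh() {
    knownChildren.clear();
    knownChildren.addAll(backingStore);
    dirty = false;
  }

  /** Returns up to max cached children accepted by the filter, in order. */
  private List<String> filterCache(int max, Predicate<String> accept) {
    List<String> result = new ArrayList<>();
    for (String child : knownChildren) {
      if (result.size() >= max) {
        break;
      }
      if (accept.test(child)) {
        result.add(child);
      }
    }
    return result;
  }

  /** Flawed: refetches only when the cache is empty, ignoring the dirty flag. */
  List<String> peekFlawed(int max, Predicate<String> accept) {
    if (knownChildren.isEmpty()) {
      refresh();
    }
    return filterCache(max, accept);
  }

  /** One possible remedy (assumption): refetch whenever the cache is stale. */
  List<String> peekRefreshing(int max, Predicate<String> accept) {
    if (dirty || knownChildren.isEmpty()) {
      refresh();
    }
    return filterCache(max, accept);
  }

  public static void main(String[] args) {
    CachedFifoSketch queue = new CachedFifoSketch();
    queue.offer();
    queue.offer();

    // The consumer picks up both items and treats them as in progress.
    Set<String> inProgress = new HashSet<>(queue.peekFlawed(10, child -> true));

    // A third item arrives while the first two are still being worked on.
    queue.offer();

    // The stale cache holds only in-progress items, so nothing new is seen.
    System.out.println("flawed:     " + queue.peekFlawed(10, c -> !inProgress.contains(c)));
    // Refreshing first makes the third item visible.
    System.out.println("refreshing: " + queue.peekRefreshing(10, c -> !inProgress.contains(c)));
  }
}
```

In this model the flawed peek keeps returning an empty result for as long as the two in-progress items stay in the cache, which mirrors the stuck state described above. The real queue additionally blocks for a wait time and reacts to watcher events, which this sketch leaves out.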