[ https://issues.apache.org/jira/browse/SOLR-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680909#comment-14680909 ]
Scott Blum commented on SOLR-6760:
----------------------------------

One big question I have is how to handle session loss and connection loss. I worry that there is a hole in my patch: what happens if we lose the ZK connection and then regain it? I have a feeling we'd end up in deadlock. I think I need to add a general ConnectionStateListener so that I can mark the queue as having no watcher when we lose the connection and, when the connection comes back, tickle the notEmpty condition so that all blocked threads loop and ultimately refetch ZK state.

> New optimized DistributedQueue implementation for overseer
> ----------------------------------------------------------
>
>                 Key: SOLR-6760
>                 URL: https://issues.apache.org/jira/browse/SOLR-6760
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>         Attachments: SOLR-6760.patch
>
>
> Currently the DQ works as follows
> * read all items in the directory
> * sort them all
> * take the head and return it and discard everything else
> * rinse and repeat
> This works well when we have only a handful of items in the queue. If the
> number of items in the queue is much larger (in the tens of thousands), this
> is counterproductive.
> As the overseer queue is a multiple-producer, single-consumer queue, we can
> read the items in bulk and, before processing each item, just do a
> zk.exists(itemname); if all is well, we don't need to do the fetch-all +
> sort again.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
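
To make the connection-loss idea in the comment above concrete, here is a minimal sketch in plain Java. It is not the SOLR-6760 patch: the class name WatchedQueueSketch, the method names, and the Supplier that stands in for "fetch children from ZK and set a watcher" are all hypothetical, and a real version would be driven by ZooKeeper watcher and connection events rather than direct method calls. It only illustrates the proposed mechanism: clear the "watcher is set" flag on connection loss, and signal notEmpty on reconnect so blocked threads loop and refetch.

```java
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical sketch of the idea in the comment; not the actual patch. */
class WatchedQueueSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    // Cached child names, sorted so the head of the queue is first().
    private final TreeSet<String> knownChildren = new TreeSet<>();
    // True while we believe a live watcher will report changes. Guarded by lock.
    private boolean watcherSet = false;
    // Assumption: stands in for "ZK getChildren + register a child watcher".
    private final Supplier<List<String>> fetchChildren;

    WatchedQueueSketch(Supplier<List<String>> fetchChildren) {
        this.fetchChildren = fetchChildren;
    }

    /** Would be called by a connection state listener when the connection drops. */
    void onConnectionLost() {
        lock.lock();
        try {
            watcherSet = false; // the watcher can no longer be trusted
        } finally {
            lock.unlock();
        }
    }

    /** Would be called on reconnect, and also when the child watcher fires. */
    void onReconnectedOrChanged() {
        lock.lock();
        try {
            watcherSet = false;
            notEmpty.signalAll(); // wake blocked threads so they loop and refetch
        } finally {
            lock.unlock();
        }
    }

    /** Returns the head item name, or null if none appears within the timeout. */
    String peek(long timeoutMs) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (true) {
                if (!watcherSet) {
                    // No trusted watcher: refetch state. A real implementation
                    // might release the lock around the remote call.
                    knownChildren.clear();
                    knownChildren.addAll(fetchChildren.get());
                    watcherSet = true;
                }
                if (!knownChildren.isEmpty()) {
                    return knownChildren.first();
                }
                if (nanos <= 0) {
                    return null;
                }
                try {
                    nanos = notEmpty.awaitNanos(nanos);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        } finally {
            lock.unlock();
        }
    }
}
```

Without the reconnect signal, a thread blocked in peek with no timeout would wait on a watcher event that was lost together with the connection, which is the deadlock the comment worries about. Folding "reconnected" and "children changed" into one method is a simplification made for this sketch.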