Scott Blum created SOLR-11423:
---------------------------------

             Summary: Overseer queue needs a hard cap (maximum size) that clients respect
                 Key: SOLR-11423
                 URL: https://issues.apache.org/jira/browse/SOLR-11423
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
          Components: SolrCloud
            Reporter: Scott Blum
            Assignee: Scott Blum


When Solr gets into a pathological GC-thrashing state, it can fill the overseer 
queue with thousands of queued state changes.  Many of these end up being 
duplicate up/down state updates.  Our production cluster has reached 100k 
queued items many times, and at that point there is nothing useful you can do 
except manually purge the queue in ZK.  Recently, it hit 3 million queued 
items, at which point our entire ZK cluster exploded.

I propose a hard cap.  Any client trying to enqueue an item when the queue is 
full would get an exception.  I was thinking maybe 10,000 items would be a 
reasonable limit.  Thoughts?
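
To make the proposal concrete, here is a minimal sketch of the capped-enqueue 
behavior.  It is an assumption-laden illustration, not Solr's actual queue 
code: CappedQueue and QueueFullException are hypothetical names, and a plain 
in-memory deque stands in for the ZK-backed queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical exception a client would see when the queue is at its cap. */
class QueueFullException extends RuntimeException {
    QueueFullException(String message) {
        super(message);
    }
}

/** In-memory stand-in for the overseer queue (assumption: not the real Solr class). */
class CappedQueue {
    private final int maxSize;
    private final Deque<byte[]> items = new ArrayDeque<>();

    CappedQueue(int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
    }

    /** Enqueue an item, or throw if the hard cap has been reached. */
    synchronized void offer(byte[] data) {
        if (items.size() >= maxSize) {
            throw new QueueFullException("queue is full (" + maxSize + " items)");
        }
        items.addLast(data);
    }

    /** Remove and return the oldest item, or null if the queue is empty. */
    synchronized byte[] poll() {
        return items.pollFirst();
    }

    synchronized int size() {
        return items.size();
    }
}
```

A client would construct this with the proposed limit, e.g. 
new CappedQueue(10_000), and treat QueueFullException as a signal to back off 
or drop the update.  In a ZK-backed queue the size check and the node creation 
would likely not be atomic, so the cap there would probably be approximate 
rather than exact.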



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

