[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905224#comment-15905224 ]
Shawn Heisey commented on SOLR-5872:
------------------------------------

I don't know how this issue escaped my attention, especially since it's been around a few years.

[~mewmewball] mentioned early on in this issue that each state change results in four ZK writes. When I opened SOLR-7191, I found that when any collection changed state, something was sent to the overseer queue for *every* collection. If I remember right, this happens even when adding a new collection, which seems completely insane to me.

When the number of collections gets large enough, Solr has a tendency to run into ZOOKEEPER-1162, because entries can be added to the overseer queue at a much faster rate than the overseer can process them. During my testing on SOLR-7191 with version 5, Solr generated an overseer queue with 850,000 entries in it, resulting in a ZK packet size of 14 megabytes.

I am not at all familiar with how SolrCloud's zookeeper code works. Exploring that rabbit hole will take a pretty major time investment. I've been reluctant to spend that time. Other people *do* understand it, so I mostly just bounce ideas off of those people and ask questions.

bq. I'll take a look but the problem we were seeing was in Zookeeper cluster not in solr

I don't see anything in your comment on 2017/03/01 that describes a problem with ZK. It sounds like problems with Solr using ZK. The overseer is a Solr component, it's not in ZK. If SOLR-10130 is occurring on your system, then an upgrade to 6.4.2 will help.

> Eliminate overseer queue
> -------------------------
>
>                 Key: SOLR-5872
>                 URL: https://issues.apache.org/jira/browse/SOLR-5872
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>
> The overseer queue is one of the busiest points in the entire system.
> The raison d'être of the queue is to:
> * Provide batching of operations for the main clusterstate.json so that state updates are minimized
> * Avoid race conditions and ensure order
> Now, as we move the individual collection states out of the main clusterstate.json, the batching is not useful anymore.
> Race conditions can easily be solved by using a compare and set in ZooKeeper.
> The proposed solution is that, whenever an operation needs to be performed on the clusterstate, the same thread (and of course the same JVM) does the following:
> # read the fresh state and version of the zk node
> # construct the new state
> # perform a compare and set
> # if the compare and set fails, go to step 1
> This should be limited to operations performed on external collections, because batching would still be required for the others.
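
The four-step loop proposed in the issue description can be sketched as follows. This is only an illustration under stated assumptions, not Solr's implementation: `VersionedNode`, `BadVersionError`, `update_state` and `mark_active` are hypothetical names, and the node is an in-memory stand-in that mimics the one ZooKeeper behaviour the proposal relies on, namely that a write naming an expected version is rejected if the node has changed since it was read.

```python
import json
import threading


class BadVersionError(Exception):
    """Raised when a conditional write names a stale version."""


class VersionedNode:
    """In-memory stand-in for a single znode (hypothetical, for illustration only)."""

    def __init__(self, data: bytes) -> None:
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def get(self):
        """Return (data, version) as one consistent snapshot."""
        with self._lock:
            return self._data, self._version

    def set(self, data: bytes, expected_version: int) -> None:
        """Write only if the node is still at expected_version."""
        with self._lock:
            if expected_version != self._version:
                raise BadVersionError(expected_version)
            self._data = data
            self._version += 1


def update_state(node, mutate, max_attempts=1000):
    """Apply mutate(state) -> new_state using a compare-and-set retry loop."""
    for _ in range(max_attempts):
        raw, version = node.get()              # step 1: read fresh state and version
        new_state = mutate(json.loads(raw))    # step 2: construct the new state
        try:
            # step 3: compare and set against the version read in step 1
            node.set(json.dumps(new_state).encode(), version)
            return new_state
        except BadVersionError:
            continue                           # step 4: lost the race, go to step 1
    raise RuntimeError("gave up after repeated version conflicts")


def mark_active(replica):
    """Build a mutation that marks one (hypothetical) replica as active."""
    def mutate(state):
        updated = dict(state)
        updated[replica] = "active"
        return updated
    return mutate


# Eight writers update the same node concurrently, with no queue in between.
node = VersionedNode(b"{}")
threads = [
    threading.Thread(target=update_state, args=(node, mark_active(f"replica{i}")))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_state, final_version = node.get()
```

Because each writer re-reads the state after a conflict, no update is lost even though nothing serialises the writers up front. The cost is wasted work under heavy contention on one node, which is consistent with the description's point that this approach suits per-collection state rather than the shared clusterstate.json.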