[jira] [Comment Edited] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890021#comment-15890021 ] albert vico oton edited comment on SOLR-5872 at 3/9/17 11:51 AM:

Hello, we are currently trying to deploy around 200 collections and SolrCloud can't handle it: it dies from the propagation of update_status messages every time we try to add a new collection. Each collection has 3 replicas, and the sizes are not very large. Also, I do not see why collection A should be aware of collection B's state. But back to the topic: the Overseer node dies since it cannot handle all the stress from the flood of messages. IMHO we have a single point of failure in a distributed system here, which is not recommended. Since the current behavior would still be useful for big, fat shards, my suggestion would be to make it optional, so that people like us, who need a more distributed approach, can make use of SolrCloud. Right now it is impossible to.
And I'm not talking about "thousands" of collections; with as few as 100 we are seeing very bad performance.

> Eliminate overseer queue
> ------------------------
>
> Key: SOLR-5872
> URL: https://issues.apache.org/jira/browse/SOLR-5872
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Reporter: Noble Paul
> Assignee: Noble Paul
>
> The overseer queue is one of the busiest points in the entire system. The raison d'être of the queue is to:
> * provide batching of operations for the main clusterstate.json so that state updates are minimized
> * avoid race conditions and ensure order
> Now, as we move the individual collection states out of the main clusterstate.json, the batching is not useful anymore. Race conditions can easily be solved by using a compare-and-set in ZooKeeper. The proposed solution: whenever an operation is required to be performed on the cluster state, the same thread (and of course the same JVM) should:
> # read the fresh state and version of the ZK node
> # construct the new state
> # perform a compare-and-set
> # if the compare-and-set fails, go to step 1
> This should be limited to operations performed on external collections, because batching would still be required for the others.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
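The four-step compare-and-set loop in the description can be sketched as below. This is an illustrative simulation, not Solr code: the `VersionedNode` class is a hypothetical in-memory stand-in for a versioned ZooKeeper znode (its failing `setData` plays the role of ZooKeeper's `BadVersionException`), and all names here are made up for the example.

```java
import java.util.function.UnaryOperator;

// Hypothetical in-memory stand-in for a versioned ZooKeeper znode.
class VersionedNode {
    private String data;
    private int version = 0;

    VersionedNode(String data) { this.data = data; }

    synchronized String getData(int[] versionOut) {
        versionOut[0] = version;
        return data;
    }

    // Compare-and-set: succeeds only if the caller read the latest version.
    synchronized boolean setData(String newData, int expectedVersion) {
        if (expectedVersion != version) return false; // stale read, like BadVersionException
        data = newData;
        version++;
        return true;
    }
}

public class CasLoopDemo {
    // The steps from the issue description: read the fresh state and version,
    // construct the new state, compare-and-set; on failure, go back to step 1.
    static void update(VersionedNode node, UnaryOperator<String> transform) {
        while (true) {
            int[] v = new int[1];
            String current = node.getData(v);           // step 1
            String proposed = transform.apply(current); // step 2
            if (node.setData(proposed, v[0])) return;   // step 3; else step 4: retry
        }
    }

    public static void main(String[] args) throws InterruptedException {
        VersionedNode state = new VersionedNode("0");
        Runnable incrementer = () -> {
            for (int i = 0; i < 1000; i++) {
                update(state, s -> Integer.toString(Integer.parseInt(s) + 1));
            }
        };
        Thread a = new Thread(incrementer), b = new Thread(incrementer);
        a.start(); b.start();
        a.join(); b.join();
        int[] v = new int[1];
        System.out.println(state.getData(v)); // prints 2000: no lost updates despite the race
    }
}
```

Against a real ZooKeeper, step 1 would be `getData` returning a `Stat`, and step 3 a `setData` with `stat.getVersion()`, retrying on `KeeperException.BadVersionException`.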
[jira] [Comment Edited] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14698435#comment-14698435 ] Ramkumar Aiyengar edited comment on SOLR-5872 at 8/15/15 8:29 PM:

Though I haven't done serious experiments on this yet, I see the lack of batching in stateFormat=2 as a potential blocker to its adoption. We need benchmarks on a single collection with lots of cores (at least 1000) to see how it behaves with stateFormat=1, stateFormat=2, and this new approach. Keep in mind that hundreds of cores might change state at the same time; that's the real benefit of batching. I fear that without a batching approach, the system might choke on the contention at that point. My point here is that stateFormat=2 not doing batching isn't a convincing enough argument to eliminate the overseer queue; maybe the effort should instead be directed towards getting batching for stateFormat=2, if that's more useful.
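To make the batching trade-off discussed above concrete, here is a hypothetical model (not Solr code; the class and method names are made up). What batching saves is the number of writes to ZooKeeper, not the number of state changes: when hundreds of cores change state at once, the queue consumer can drain many messages, fold them into one cluster-state update, and write once.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class BatchingDemo {
    // Without batching: one ZooKeeper write per state message.
    static int writesUnbatched(int messages) {
        return messages;
    }

    // With batching: drain up to batchSize queued messages, fold them into a
    // single state update, then perform one write for the whole batch.
    static int writesBatched(int messages, int batchSize) {
        Queue<Integer> queue = new ArrayDeque<>();
        for (int i = 0; i < messages; i++) queue.add(i);
        int writes = 0;
        while (!queue.isEmpty()) {
            for (int i = 0; i < batchSize && !queue.isEmpty(); i++) queue.poll();
            writes++;
        }
        return writes;
    }

    public static void main(String[] args) {
        // 1000 cores changing state at once, as in the benchmark scenario above.
        System.out.println(writesUnbatched(1000));   // prints 1000
        System.out.println(writesBatched(1000, 100)); // prints 10
    }
}
```

A per-collection CAS loop without batching sits at the unbatched end of this model, which is the contention concern raised in the comment.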
[jira] [Comment Edited] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=13939422#comment-13939422 ] Noble Paul edited comment on SOLR-5872 at 3/18/14 4:04 PM:

bq. That is also how I first implemented the clusterstate

Can you shed some light on what the ZK schema for your initial implementation was? If all the nodes of a given slice are under one ZK directory, one watch on the parent should be fine, right?
[jira] [Comment Edited] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=13938670#comment-13938670 ] Jessica Cheng edited comment on SOLR-5872 at 3/18/14 1:44 AM:

{quote}For further discussion around the change, there should be background if you search the archives.{quote}
If you wouldn't mind terribly, will you please paste links to a few relevant threads in the archive? (Sorry, I'm not familiar with all the keywords and archives, etc., yet.)

{quote}There is a strong argument to be made that we should first investigate the performance issues with the current strategy. ZooKeeper is pretty fast - these state updates are tiny and batched. It seems like we should be able to do a lot better without throwing out code that has been getting hardened for a long time now.{quote}
I see where your hesitation is now, and I can definitely agree. It sounds like there are a few points to investigate in the current system before we attempt to change anything:
- Why is the Overseer so slow at updating cluster state? What's causing the build-up of queue messages during a restart?
- What can we do to solve the problem of the Overseer being killed on every instance restart in a rolling bounce?
- How much is actually batched? My gut is that for external collections batching won't be of much benefit (except for the super-large collection case that Yonik mentioned), but I agree that if the current system can be hardened to work even for those, then the simplicity of one code path should be preferred over ultra-optimizing for a non-issue (assuming the first two points above can be fixed).
[jira] [Comment Edited] (SOLR-5872) Eliminate overseer queue
[ https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=13938796#comment-13938796 ] Noble Paul edited comment on SOLR-5872 at 3/18/14 4:28 AM:

bq. I think if we decide to split out the clusterstate.json per collection, that is the direction we should take

Yes, that is the plan; we would probably switch to that from 5.0 or so. But the challenge is to offer a smoother migration path. Till then we need a name to differentiate the two modes.
* Initially, users would be able to switch to the new mode when creating a collection (an opt-in); SOLR-5473 does that
* Offer an API to migrate to the new format: SOLR-5756
* Make it the default format (from, say, 5.0)
* Deprecate the old format