[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13579165#comment-13579165 ]

Markus Jelsma commented on SOLR-4165:

Mark, I'm not sure this issue is entirely resolved. If I run a stress test against a cluster and restart a node, the entire cluster still gets blocked. SOLR-3655 did improve things a lot: now the cluster only gets blocked when a node stops. When the node starts up again, the stress test continues without being interrupted.

Queries blocked when stopping and starting a node

Key: SOLR-4165
URL: https://issues.apache.org/jira/browse/SOLR-4165
Project: Solr
Issue Type: Bug
Components: search, SolrCloud
Affects Versions: 5.0
Environment: 5.0-SNAPSHOT 1366361:1420056M - markus - 2012-12-11 11:52:06
Reporter: Markus Jelsma
Assignee: Mark Miller
Priority: Critical
Fix For: 4.2, 5.0

Our 10 node test cluster (10 shards, 20 cores) blocks incoming queries briefly when a node is stopped gracefully and again blocks queries for at least a few seconds when the node is started again.

We're using siege to send roughly 10 queries per second to a pair of load balancers. Those load balancers ping (admin/ping) each node every few hundred milliseconds. The ping queries continue to operate normally while the requests to our main request handler are blocked. A manual request directly to a live Solr node is also blocked for the same duration.

There are no errors logged. But it is clear that the entire cluster blocks queries as soon as the starting node is reading its config from ZooKeeper, likely even slightly earlier.

The blocking time when stopping a node varies between 1 and 5 seconds. The blocking time when starting a node varies between 10 and 30 seconds. The blocked queries come rushing in again after a queue of ping requests is served. The ping request sets the main request handler via the qt parameter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
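
The symptom in the report (a steady query stream that goes silent and then completes in a burst) can be located in a load-test log by looking for gaps between request completion times. The following is only a minimal sketch under assumptions: it presumes completion timestamps in seconds have already been extracted from the load tool's output, and `find_stalls` and its 1-second default threshold are hypothetical names and values, not part of siege or Solr.

```python
from typing import List, Tuple


def find_stalls(completions: List[float], threshold: float = 1.0) -> List[Tuple[float, float]]:
    """Return (start, end) pairs where no request completed for more than `threshold` seconds.

    `completions` holds request completion times in seconds (assumed input format).
    """
    ordered = sorted(completions)
    stalls = []
    # Compare each completion time with the next one and keep the long gaps.
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > threshold:
            stalls.append((prev, cur))
    return stalls
```

At roughly 10 queries per second, consecutive completions should be about 0.1 seconds apart, so a gap of several seconds would correspond to the blocking window described in the report.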
[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13579187#comment-13579187 ]

Mark Miller commented on SOLR-4165:

I guess reopening and renaming this is the best move.
[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574814#comment-13574814 ]

Mark Miller commented on SOLR-4165:

Yeah, resolving as a duplicate - I'll solve this in SOLR-3655.
[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554530#comment-13554530 ]

Steve Rowe commented on SOLR-4165:

Mark, can this be resolved as a duplicate of SOLR-3655?

Queries blocked when stopping and starting a node

Key: SOLR-4165
URL: https://issues.apache.org/jira/browse/SOLR-4165
Project: Solr
Issue Type: Bug
Components: search, SolrCloud
Affects Versions: 5.0
Environment: 5.0-SNAPSHOT 1366361:1420056M - markus - 2012-12-11 11:52:06
Reporter: Markus Jelsma
Assignee: Mark Miller
Priority: Critical
Fix For: 4.1, 5.0
[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551878#comment-13551878 ]

Markus Jelsma commented on SOLR-4165:

Hi Mark, this is for standard stops. On shutdown the cluster can stall very briefly, a matter of 1 or 2 seconds at most in our case. On start up the problem is more serious.
[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551461#comment-13551461 ]

Mark Miller commented on SOLR-4165:

Hey Markus - how were you stopping the node? Standard stop or kill? A standard stop should pull the node out of live nodes pretty darn quickly...
[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543723#comment-13543723 ]

Markus Jelsma commented on SOLR-4165:

SOLR-3655 sounds like what I describe. Seems I opened a duplicate. Thanks!

Queries blocked when stopping and starting a node

Key: SOLR-4165
URL: https://issues.apache.org/jira/browse/SOLR-4165
Project: Solr
Issue Type: Bug
Components: search, SolrCloud
Affects Versions: 5.0
Environment: 5.0-SNAPSHOT 1366361:1420056M - markus - 2012-12-11 11:52:06
Reporter: Markus Jelsma
Priority: Critical
Fix For: 5.0
[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542994#comment-13542994 ]

Markus Jelsma commented on SOLR-4165:

Anyone here to test whether this issue applies to 4.x as well?
[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543010#comment-13543010 ]

Mark Miller commented on SOLR-4165:

4x and 5x are pretty much in alignment these days. Are you still seeing this? Very strange...
[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543021#comment-13543021 ]

Markus Jelsma commented on SOLR-4165:

Yes. Query time is consistent until a node starts. A few seconds after start up all other nodes stop responding for a significant period (10-30 seconds). When that time has passed, the nodes suddenly send the response again.
[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543022#comment-13543022 ]

Markus Jelsma commented on SOLR-4165:

We're also seeing the restarted node as ACTIVE immediately after start up in the cloud view, but its schema and index have not been loaded yet; only after everything is initialized does the state become RECOVERING. Is it possible it's active too early, so the other nodes query it but do not receive a reply until it's fully initialized?
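
The hypothesis in this comment is about how other nodes decide which replicas to query. A minimal sketch of such a check follows, assuming a simplified in-memory view of the cluster state. The `Replica` class, `queryable_replicas`, and the field names are hypothetical illustrations and are not Solr's API; whether Solr's routing at this revision behaves this way is exactly what the thread leaves open.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Replica:
    core: str   # hypothetical core name
    node: str   # hypothetical node name
    state: str  # published state, e.g. "active", "recovering", "down"


def queryable_replicas(replicas: Iterable[Replica], live_nodes: Set[str]) -> List[Replica]:
    """Keep replicas that are published as active AND hosted on a node listed as live."""
    return [r for r in replicas if r.state == "active" and r.node in live_nodes]
```

Under this sketch, a restarting node that reappears among the live nodes while its previous "active" state is still published would pass the filter before its schema and index are loaded, which would be consistent with the stall described in the comment.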
[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543028#comment-13543028 ]

Mark Miller commented on SOLR-4165:

Probably - good thought. Take a look at SOLR-3655 by the way. I'll try and think on this some...