Connection imbalance leads to overloaded ZK instances
-----------------------------------------------------
Key: ZOOKEEPER-856
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-856
Project: Zookeeper
Issue Type: Bug
Reporter: Travis Crawford
We've experienced a number of issues lately where "ruok" requests took
upwards of 10 seconds to return and ZooKeeper instances were extremely
sluggish. A sluggish instance has to be restarted before it becomes
responsive again.
I believe the cause is that connections are very imbalanced: certain
instances hold many thousands of connections, while other instances are
largely idle.
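To make the imbalance measurable, here is a minimal Python sketch that counts
client connections per instance using the four-letter-word interface. It
assumes the "cons" (or "stat") reply lists one client per line in the form
" /host:port[...]"; that format can differ between ZooKeeper versions, so the
pattern may need adjusting. The helper names and the host names in the usage
note are hypothetical.

```python
import re
import socket

# Assumed client-line shape: " /10.0.0.5:51234[1](queued=0,recved=12,sent=12)".
# This is an assumption about the reply format, not a guaranteed contract.
CLIENT_LINE = re.compile(r"^\s*/\S+:\d+\[")


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a four-letter command such as "ruok" or "cons"; return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def count_clients(reply):
    """Count the lines in a reply that look like client connections."""
    return sum(1 for line in reply.splitlines() if CLIENT_LINE.match(line))


def imbalance(counts):
    """Ratio of the busiest instance to the mean; 1.0 is perfectly balanced."""
    total = sum(counts.values())
    if total == 0:
        return 1.0
    mean = total / len(counts)
    return max(counts.values()) / mean
```

For example, with hypothetical hosts zk1..zk3, one could build
counts = {h: count_clients(four_letter_word(h, 2181, "cons")) for h in hosts}
and alert when imbalance(counts) grows well above 1.0.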
A potential solution is for clients to periodically disconnect and reconnect,
which would rebalance connections over time. This seems fine because a client
that reconnects within its session timeout keeps its session, and therefore
ephemeral nodes and watches should not be affected.
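The rebalancing effect of periodic reconnects can be illustrated with a toy
simulation. This is only a sketch of the idea, not ZooKeeper client code: it
assumes every client starts on the same instance (the worst case), that each
client reconnects with a fixed probability per tick, and that a reconnecting
client picks an instance uniformly at random. All names and parameters are
hypothetical.

```python
import random


def simulate(num_servers, num_clients, reconnect_prob, ticks, seed=0):
    """Return per-server connection counts after `ticks` rounds of reconnects."""
    rng = random.Random(seed)
    # Worst-case starting point: every client is connected to server 0.
    assignment = [0] * num_clients
    for _ in range(ticks):
        for i in range(num_clients):
            # Each client independently decides whether to reconnect this tick.
            if rng.random() < reconnect_prob:
                # Assumption: a reconnecting client picks a server uniformly
                # at random (it may land on the same server again).
                assignment[i] = rng.randrange(num_servers)
    loads = [0] * num_servers
    for server in assignment:
        loads[server] += 1
    return loads


if __name__ == "__main__":
    print("no reconnects:  ", simulate(3, 3000, 0.0, 50))
    print("with reconnects:", simulate(3, 3000, 0.1, 50))
```

Under these assumptions the load drifts toward an even split as more clients
have reconnected at least once; a lower reconnect probability trades slower
rebalancing for less connection churn.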