Hi,

During the initial startup of Storm 2.4.0, the znodes were created under
storm.zookeeper.root as expected, in a ZooKeeper 3.7.1 cluster. But when
there is a connection error between a Storm supervisor/worker and the
ZooKeeper leader for a few seconds, I see that znode requests from Storm
are sent to ZooKeeper without the storm.zookeeper.root prefix. As a result,
a /supervisors znode appears directly under /, and some of the supervisors
are listed in it.

Because of this, some of the supervisors are no longer listed in the Storm
UI. In this scenario the supervisors also get blacklisted, and even after
they are removed from the blacklist after 30 minutes, the Storm UI still
doesn't display them. I am not sure what other issues this incorrect znode
hierarchy might be causing in the Storm 2.4.0 cluster.

Has anyone seen an issue like this? If so, what is the solution? Although
I was able to find this znode issue by enabling Storm debug logging, I have
not been able to find the class that might be causing it, because some of
the Storm classes related to ZooKeeper storage have no debug logging.

Setting storm.zookeeper.root to / in storm.yaml looks like a temporary
workaround, because Storm then uses / both during initial setup and after
an intermittent ZooKeeper error. But since intermittent errors between
Storm and ZooKeeper are probably a common scenario, I am wondering whether
there is a better solution, whether something is not set up properly, or
whether this is a bug in a Storm class related to ZooKeeper storage. Any
input will be helpful. Thanks for your time.
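
For reference, the workaround amounts to a single line in storm.yaml. This
is only a minimal sketch: the commented-out line shows what I believe is
the stock default (/storm), purely for contrast, and is not necessarily the
value my cluster was using.

```yaml
# storm.yaml (fragment)

# Assumed stock default, shown only for contrast:
# storm.zookeeper.root: "/storm"

# Temporary workaround: keep Storm's znodes directly under /, so the paths
# used at startup and after a reconnect are the same.
storm.zookeeper.root: "/"
```

With this setting, /supervisors and the other Storm znodes live at the top
level of the ensemble, so it only seems reasonable when the ZooKeeper
cluster is not shared with other applications.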

Thillai
