zk-shell has a command to look up the session ID and ip:port that own an
ephemeral znode:

https://github.com/rgs1/zk_shell/blob/master/zk_shell/shell.py#L2734

Example:

$ pip3 install zk-shell
$ zk-shell your-server:2181
(CONNECTED [your-server:2181]) /test> create foo bar ephemeral=true
(CONNECTED [your-server:2181]) /test> ephemeral_endpoint foo your-server
0x1504f372ab5cd077 10.1.1.14:54376 your-server:2181

Note: in the last argument to the ephemeral_endpoint command, you need to
replace your-server with the full list of servers (e.g.
server1:2181,server2:2181,...).
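
If installing zk-shell is not an option, the same idea can be approximated by
hand: take the ephemeralOwner value from the znode's stat, then look for that
session ID in the output of the "cons" four-letter command on each server.
The Python below is only a rough sketch of that approach, not zk-shell's
actual implementation. The shape of a "cons" line (sid=0x... inside the
parentheses) is an assumption based on 3.4.x output, and the names
four_letter, find_owner and lookup are made up for illustration.

```python
import re
import socket

# Assumed shape of one `cons` line on ZooKeeper 3.4.x, for example:
#  /10.1.1.14:54376[1](queued=0,recved=12,sent=12,sid=0x1504f372ab5cd077,...)
CONS_LINE = re.compile(
    r"/(?P<ip>[^\s\[]+):(?P<port>\d+)\[\d+\]\((?P<attrs>.*)\)"
)
SID = re.compile(r"\bsid=(0x[0-9a-fA-F]+)")


def four_letter(host, port, cmd, timeout=5.0):
    """Send a four-letter command to one server and return the reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the socket when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", "replace")


def find_owner(cons_output, ephemeral_owner):
    """Return 'ip:port' of the connection whose session ID matches, else None."""
    for line in cons_output.splitlines():
        match = CONS_LINE.search(line)
        if not match:
            continue
        sid = SID.search(match.group("attrs"))
        # Connections without a session (e.g. the four-letter command
        # connection itself) carry no sid attribute and are skipped.
        if sid and int(sid.group(1), 16) == ephemeral_owner:
            return "{}:{}".format(match.group("ip"), match.group("port"))
    return None


def lookup(servers, ephemeral_owner):
    """Ask every (host, port) in servers; return (client, server) or None."""
    for host, port in servers:
        client = find_owner(four_letter(host, port, "cons"), ephemeral_owner)
        if client is not None:
            return client, "{}:{}".format(host, port)
    return None
```

For example, lookup([("server1", 2181), ("server2", 2181)], 0x1504f372ab5cd077)
would query each server in turn. A result of None means no listed server
reported a live connection for that session, which is the case worth
investigating for an orphaned ephemeral znode.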


-rgs

On Fri, Oct 30, 2020 at 12:30 PM Paul Summermatter <[email protected]>
wrote:

> RE: ZooKeeper 3.4.6
>
> All,
>
>         I'm trying to troubleshoot a problem and could use some guidance
> from the experts on ZK administration. I have a cluster of applications
> that share work and that create ephemeral nodes representing the work in ZK
> expressly so that, if one application fails, the ephemeral nodes should be
> deleted, and the other apps should be able to pick up the work that is now
> not being completed by the failed instance.
>
>         Yesterday evening, one application instance suffered from some
> severe memory pressure and had to run multiple stop-the-world GC cycles.
> The pauses appear to have triggered a SessionExpiredException in
> org.apache.zookeeper.ClientCnxn$SendThread.run (I correlated multiple
> "Pause Full" statements in the GC logs with the ZK session timeout in the
> application logs). After the timeout, the connection was re-established in
> under 1,000ms, but the ephemeral nodes remained in ZooKeeper, leaving them
> as orphans. We've seen this behavior before and have had to delete the
> nodes manually using the zkCli.sh utility.
>
>         In an attempt to troubleshoot this issue, I'm trying to correlate
> the ephemeral owner that is listed on a node when you run the 'get' command
> with the ID of an active session. Basically, I'm trying to understand
> whether ZK thinks there is still an active session associated with the
> ephemeral node in the hopes that that might lead to an explanation for why
> the ZK server didn't seem to recognize the session timeout sensed on the
> client that triggered a new connection and would explain why the ephemeral
> nodes were not deleted as they should have been when the connection dropped.
>
>         I've tried the various four letter commands on the server to see
> if any of them output anything that looks like the ephemeral owner ID
> without any success. Any suggestions/guidance would be greatly appreciated.
> Note, right now, upgrading is not an option, but I'm certainly open to that
> if there are known issues with ephemeral nodes in 3.4 that are addressed in
> newer versions.
>
> Regards,
> Paul
