Hi Andrew,

Not sure how much this helps, but in version 1.x, topology state was stored
under the following znodes:

/$storm-znode/storms
/$storm-znode/assignments
/$storm-znode/blobstore


Deleting all references to the old topologies (with rmr or deleteall,
depending on the ZooKeeper version), followed by a rolling restart of
Nimbus, should suffice.
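
In case it is useful, here is a rough sketch of how you could work out which
paths to remove before touching anything. It is not Storm or ZooKeeper code:
stale_paths is a hypothetical helper, the child listings are assumed to have
been collected by hand (for example with ls in zkCli), and the "<id>-" prefix
match for blob keys is inferred from the key names in your log. The layout is
the 1.x one above, so please verify it against your 2.2.0 cluster first.

```python
from typing import Dict, Iterable, List

# Subtrees named above (Storm 1.x layout; verify before relying on it for 2.x).
STATE_SUBTREES = ("storms", "assignments", "blobstore")


def stale_paths(root: str,
                children: Dict[str, Iterable[str]],
                stale_ids: Iterable[str]) -> List[str]:
    """Return znode paths under `root` that belong to one of `stale_ids`.

    `children` maps each subtree name to the child names listed under it.
    A child matches when it equals a stale topology id or starts with
    "<id>-" (blob keys such as "<id>-stormjar.jar").
    """
    stale = list(stale_ids)
    found = []
    for subtree in STATE_SUBTREES:
        for child in sorted(children.get(subtree, ())):
            if any(child == tid or child.startswith(tid + "-") for tid in stale):
                found.append(f"{root.rstrip('/')}/{subtree}/{child}")
    return found


if __name__ == "__main__":
    # Hypothetical listing; "/storm" and the topology-A-25 id are placeholders.
    listing = {
        "storms": ["topology-A-25-1635000000", "topology-A-10-1632769783"],
        "assignments": ["topology-A-25-1635000000"],
        "blobstore": ["topology-A-10-1632769783-stormjar.jar"],
    }
    for path in stale_paths("/storm", listing, ["topology-A-10-1632769783"]):
        print(path)
```

The function only reports paths; the actual deletion would still be done by
hand in zkCli, so you can review the list before removing anything.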

On Mon, Oct 25, 2021, 18:49 Andrew Neilson <arsneil...@gmail.com> wrote:

> Hi,
>
> We're running a v2.2.0 cluster with two nimbus hosts and recently noticed
> storm-nimbus on the leader is effectively in a restart loop.
>
> When I look at nimbus.log on that host it is full of log entries related
> to old versions of topologies we're running. There are two types of
> exceptions I am seeing:
>
> 1. get blob meta exception:
>
> For *topology-A*, for example, we're currently on topology-A-25:
>
> 2021-10-25 13:39:51.064 o.a.s.d.n.Nimbus pool-29-thread-62 [WARN]
> Exception when getting heartbeat timeout.
> 2021-10-25 13:39:51.075 o.a.s.d.n.Nimbus pool-29-thread-16 [WARN] get blob
> meta exception.
> org.apache.storm.utils.WrappedKeyNotFoundException:
> topology-A-5-1633368551-stormjar.jar
>
> For *topology-B*, we're on topology-B-24:
>
> 2021-10-25 13:38:51.106 o.a.s.d.n.Nimbus pool-29-thread-21 [WARN] get blob
> meta exception.
> org.apache.storm.utils.WrappedKeyNotFoundException:
> topology-B-11-1632770137-stormcode.ser
>
> 2. Send HB exception:
>
> 2021-10-25 13:39:51.745 o.a.s.d.n.Nimbus pool-29-thread-36 [WARN]
> Exception when getting heartbeat timeout.
> 2021-10-25 13:39:51.760 o.a.s.d.n.Nimbus pool-29-thread-37 [WARN] Send HB
> exception. (topology id='topology-A-10-1632769783')
> org.apache.storm.utils.WrappedNotAliveException: topology-A-10-1632769783
>
> This seems isolated to two versions of "topology-A" and one version of
> "topology-B".
>
> I'm not seeing references to these topology versions in Zookeeper. Does
> anyone know how to safely clear out this old state? If not, any suggestions
> on how to debug this? Further, is this related to any known bug?
>
> Thanks,
> Andrew
>
