Hi Congxian,

I think I have figured out the issue. It's related to the checkpoint directory
collision issue you responded to in the other thread. We reproduced this bug on
1.6.1 after unchaining the operators.

There are two stateful operators in the chain: one is a
CoBroadcastWithKeyedOperator, the other is a StreamMapper. The
CoBroadcastWithKeyedOperator creates timer states in RocksDB; the StreamMapper
doesn’t. Because of the checkpoint directory collision bug, we always end up
saving the states for the CoBroadcastWithKeyedOperator.

After breaking these two operators apart, they both try to restore from the same
set of saved states. When the StreamMapper opens the RocksDB files, it doesn’t
know about any of the column families in there, including the ones for the timer
states. Hence the error.

--
Ning
