Hi All! We are encountering an error on a larger stateful job (around 1 TB+ of state) when restoring from a RocksDB checkpoint. The taskmanagers keep crashing with a segfault coming from the RocksDB native logic, and it seems to be related to the FlinkCompactionFilter mechanism.
The gist with the full error report: https://gist.github.com/gyfora/f307aa570d324d063e0ade9810f8bb25

The core part is here:

V  [libjvm.so+0x79478f]  Exceptions:: (Thread*, char const*, int, oopDesc*)+0x15f
V  [libjvm.so+0x960a68]  jni_Throw+0x88
C  [librocksdbjni-linux64.so+0x222aa1]  JavaListElementFilter::NextUnexpiredOffset(rocksdb::Slice const&, long, long) const+0x121
C  [librocksdbjni-linux64.so+0x6486c1]  rocksdb::flink::FlinkCompactionFilter::ListDecide(rocksdb::Slice const&, std::string*) const+0x81
C  [librocksdbjni-linux64.so+0x648bea]  rocksdb::flink::FlinkCompactionFilter::FilterV2(int, rocksdb::Slice const&, rocksdb::CompactionFilter::ValueType, rocksdb::Slice const&, std::string*, std::string*) const+0x14a

Has anyone encountered a similar issue before?

Thanks,
Gyula
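For readers less familiar with the mechanism: the FlinkCompactionFilter in the frames above is the native RocksDB hook that Flink uses for state TTL cleanup during compaction, and the JavaListElementFilter frame suggests list state with TTL enabled. A minimal sketch of the kind of configuration that activates this code path (the state name, TTL, and entry count are illustrative assumptions, not taken from our job):

```java
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.time.Time;

public class TtlConfigSketch {

    // Sketch only: state configured like this is cleaned up by the native
    // FlinkCompactionFilter during RocksDB compactions, which is the code
    // path visible in the stack trace above.
    static ListStateDescriptor<String> buildDescriptor() {
        StateTtlConfig ttl = StateTtlConfig
            .newBuilder(Time.days(7))             // hypothetical TTL
            .cleanupInRocksdbCompactFilter(1000L) // drop expired entries in compaction,
                                                  // re-querying the current time every
                                                  // 1000 processed entries
            .build();

        ListStateDescriptor<String> desc =
            new ListStateDescriptor<>("my-list-state", String.class); // hypothetical name
        desc.enableTimeToLive(ttl);
        return desc;
    }
}
```

With list state, the filter inspects the serialized list element-by-element to find the first unexpired offset, which matches the NextUnexpiredOffset frame in the crash.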