Hi!

I'm working on a join setup that falls back to fuzzy matching when the
client does not send enough parameters to join by a foreign key.  There are
a few ways I can store the state, and I'm curious about best practices
around this.  I'm using RocksDB as the state backend.

I was reading the code for IntervalJoin
<https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/co/IntervalJoinOperator.java>
and was a little shocked by the implementation.  It feels designed for very
short join intervals.

I read this set of pages
<https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html>
but I'm looking for something one level deeper.  For example, what are the
performance characteristics of the different state CRUD operations with
RocksDB?  I could also create an extra MapState to act as an index.  When
is that worth it?
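
To make the index idea concrete, here is a rough sketch in plain Java.
HashMap stands in for Flink's MapState so the snippet runs without a Flink
dependency; the class, field, and method names are made up for illustration
and are not Flink APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the two HashMaps stand in for two MapState instances.
class IndexedJoinState {
    // Primary state: record id (the foreign key) -> payload.
    private final Map<String, String> recordsById = new HashMap<>();
    // Secondary index: fuzzy-match attribute -> ids of records carrying it.
    private final Map<String, Set<String>> idsByAttribute = new HashMap<>();

    // Every write updates both the primary state and the index.
    void put(String id, String attribute, String payload) {
        recordsById.put(id, payload);
        idsByAttribute.computeIfAbsent(attribute, k -> new HashSet<>()).add(id);
    }

    // Exact join path: the client sent the foreign key.
    String lookupById(String id) {
        return recordsById.get(id);
    }

    // Fallback path: resolve candidates through the index instead of
    // scanning every record in the primary state.
    List<String> lookupByAttribute(String attribute) {
        List<String> matches = new ArrayList<>();
        Set<String> ids =
            idsByAttribute.getOrDefault(attribute, Collections.<String>emptySet());
        for (String id : ids) {
            String payload = recordsById.get(id);
            if (payload != null) {
                matches.add(payload);
            }
        }
        return matches;
    }

    // Removal has to clean up the index entry as well.
    void remove(String id, String attribute) {
        recordsById.remove(id);
        Set<String> ids = idsByAttribute.get(attribute);
        if (ids != null) {
            ids.remove(id);
            if (ids.isEmpty()) {
                idsByAttribute.remove(attribute);
            }
        }
    }
}
```

Every write and delete now touches two entries instead of one, so my guess
is that the index only pays off when lookups by the attribute are frequent
relative to writes.  That trade-off is what I'd like to understand for
RocksDB.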
