Thanks Gordon and Seth!

On Wed, Mar 10, 2021, 21:55 Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
> Hi Dan,
>
> For a deeper dive into state backends and how they manage state, or
> performance-critical aspects such as state serialization and choosing
> appropriate state structures, I highly recommend starting from this webinar
> done by my colleague Seth Weismann:
> https://www.youtube.com/watch?v=9GF8Hwqzwnk.
>
> Cheers,
> Gordon
>
> On Wed, Mar 10, 2021 at 1:58 AM Dan Hill <quietgol...@gmail.com> wrote:
>
>> Hi!
>>
>> I'm working on a join setup that does fuzzy matching in case the client
>> does not send enough parameters to join by a foreign key. There are a few
>> ways I can store the state. I'm curious about best practices around this.
>> I'm using RocksDB as the state storage.
>>
>> I was reading the code for IntervalJoin
>> <https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/co/IntervalJoinOperator.java>
>> and was a little shocked by the implementation. It feels designed for very
>> short join intervals.
>>
>> I read this set of pages
>> <https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html>
>> but I'm looking for one level deeper. E.g. what are the performance
>> characteristics of different types of state CRUD operations with RocksDB?
>> E.g. I could create extra MapState to act as an index. When is this worth
>> it?