Hi,

I saw the thread "static/dynamic lookups in flink streaming" being discussed in the Apache Flink User Mailing List archive, and then I saw this FLIP: https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API. I know we haven't made much progress on this topic, but I still wanted to put forward my problem statement around it.

I am also looking for a dynamic lookup in Flink operators. I want to pre-fetch data from various sources (DB, filesystem, Cassandra, etc.) into memory, and I have to make sure the in-memory lookup table is refreshed periodically, with the refresh period being a configurable parameter.

This is what a map operator would look like with the lookup:

-> Load in-memory lookup
-> Start refresh timer
-> Start stream processing
-> Call lookup
-> Use lookup result in stream processing
-> Timer elapses
-> Reload lookup data source into in-memory table
-> Continue processing
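To make the flow above concrete, here is a rough sketch of the lookup part in plain Java, with no Flink dependencies. Everything in it is hypothetical and only for illustration: the class name RefreshingLookup, the loader supplier (which stands in for the actual DB / filesystem / Cassandra read), and the choice of a ScheduledExecutorService as the refresh timer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper: an in-memory lookup table reloaded on a fixed, configurable period. */
final class RefreshingLookup<K, V> implements AutoCloseable {
    // Assumption: the loader reads the whole data source and returns it as a map.
    private final Supplier<Map<K, V>> loader;
    // Replaced as a whole on each refresh, so readers never see a half-loaded table.
    private volatile Map<K, V> table;
    private final ScheduledExecutorService timer;

    RefreshingLookup(Supplier<Map<K, V>> loader, long period, TimeUnit unit) {
        this.loader = loader;
        // Initial load happens before any record is processed.
        this.table = snapshot(loader.get());
        this.timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "lookup-refresh");
            t.setDaemon(true); // do not keep the JVM alive just for refreshes
            return t;
        });
        // First refresh fires one period after the initial load.
        timer.scheduleAtFixedRate(this::refresh, period, period, unit);
    }

    /** Reloads the data source; keeps serving the previous table if the reload fails. */
    void refresh() {
        try {
            table = snapshot(loader.get());
        } catch (RuntimeException e) {
            // Swallowed on purpose: an uncaught exception would cancel all future refreshes.
        }
    }

    /** Returns the value for the key from the most recently loaded table, or null. */
    V get(K key) {
        return table.get(key);
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }

    private static <K, V> Map<K, V> snapshot(Map<K, V> source) {
        return Collections.unmodifiableMap(new HashMap<>(source));
    }
}
```

My assumption is that the map operator would be a RichMapFunction that creates this object in open(), calls get() inside map(), and calls close() in close(). Done that way, each parallel subtask would hold its own table and its own refresh thread.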

My concerns about this approach are:

1) Possibly storing the same copy of the data in every task slot's memory or state backend (RocksDB in my case).
2) Having a dedicated refresh thread for each subtask instance (so possibly every Task Manager running multiple refresh threads).

Am I thinking in the right direction, or am I missing something very obvious? It is confusing.

Any leads are much appreciated. Thanks in advance.

Cheers,
Chirag