Re: Flink streaming. Broadcast reference data map across nodes
Hi, what Ufuk said is valid. In addition, you can make your function a RichFunction and load the static data in the open() method. In the future, you might be able to handle this use case with a feature called side inputs that we're currently working on: https://docs.google.com/document/d/1hIgxi2Zchww_5fWUHLoYiXwSBXjv-M5eOv-MKQYN3m4/edit Best, Aljoscha On Tue, 21 Feb 2017 at 15:50 Ufuk Celebiwrote: > On Tue, Feb 21, 2017 at 2:35 PM, Vadim Vararu > wrote: > > Basically, i have a big dictionary of reference data that has to be > > accessible from all the nodes (in order to do some joins of log line with > > reference line). > > If the dictionary is small you can make it part of the closures that > are send to the task managers. Just make it part of your function. > > If it is large, I'm not sure what the best way is to do it is right > now. I've CC'd Aljoscha who can probably help... >
Re: Flink streaming. Broadcast reference data map across nodes
On Tue, Feb 21, 2017 at 2:35 PM, Vadim Vararuwrote: > Basically, i have a big dictionary of reference data that has to be > accessible from all the nodes (in order to do some joins of log line with > reference line). If the dictionary is small you can make it part of the closures that are send to the task managers. Just make it part of your function. If it is large, I'm not sure what the best way is to do it is right now. I've CC'd Aljoscha who can probably help...
Flink streaming. Broadcast reference data map across nodes
Hi all, I would like to do something similar to Spark's broadcast mechanism. Basically, i have a big dictionary of reference data that has to be accessible from all the nodes (in order to do some joins of log line with reference line). I did not find yet a way to do it. Any ideas?