Re: Flink streaming. Broadcast reference data map across nodes

2017-02-26 Thread Aljoscha Krettek
Hi,
what Ufuk said is valid. In addition, you can make your function a
RichFunction and load the static data in the open() method.

In the future, you might be able to handle this use case with a feature
called side inputs that we're currently working on:
https://docs.google.com/document/d/1hIgxi2Zchww_5fWUHLoYiXwSBXjv-M5eOv-MKQYN3m4/edit

Best,
Aljoscha

On Tue, 21 Feb 2017 at 15:50 Ufuk Celebi  wrote:

> On Tue, Feb 21, 2017 at 2:35 PM, Vadim Vararu 
> wrote:
> > Basically, i have a big dictionary of reference data that has to be
> > accessible from all the nodes (in order to do some joins of log line with
> > reference line).
>
> If the dictionary is small you can make it part of the closures that
> are send to the task managers. Just make it part of your function.
>
> If it is large, I'm not sure what the best way is to do it is right
> now. I've CC'd Aljoscha who can probably help...
>


Re: Flink streaming. Broadcast reference data map across nodes

2017-02-21 Thread Ufuk Celebi
On Tue, Feb 21, 2017 at 2:35 PM, Vadim Vararu  wrote:
> Basically, i have a big dictionary of reference data that has to be
> accessible from all the nodes (in order to do some joins of log line with
> reference line).

If the dictionary is small you can make it part of the closures that
are send to the task managers. Just make it part of your function.

If it is large, I'm not sure what the best way is to do it is right
now. I've CC'd Aljoscha who can probably help...


Flink streaming. Broadcast reference data map across nodes

2017-02-21 Thread Vadim Vararu

Hi all,


I would like to do something similar to Spark's broadcast mechanism.

Basically, i have a big dictionary of reference data that has to be 
accessible from all the nodes (in order to do some joins of log line 
with reference line).


I did not find yet a way to do it.


Any ideas?