Hi Marke,

Q1: From your description of the problem, the "Broadcast State Pattern" seems
to be a suitable choice. It fits if you want to keep the same state on all
parallel instances which process stream[1], and to update/store that state
the same way on each instance using each element of stream[2].

Q2: Apart from simple synchronous queries to the database upon getting each
element of stream[2], you might benefit from using async IO (1). E.g. you
could put it before broadcasting stream[2] and broadcast the database
response.
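
To illustrate why asynchronous lookups can help, here is a minimal sketch in
plain Java. It is not Flink's API: in a real job the async IO operator from (1)
would play this role. All names (AsyncLookupSketch, FAKE_DB, blockingLookup,
lookupAll) are hypothetical, and the "database" is just an in-memory map with a
simulated delay. The point is only that all requests are started before any of
them is awaited, so the waiting times overlap instead of adding up.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class AsyncLookupSketch {

    // Hypothetical stand-in for the database: key -> additional data.
    static final Map<String, String> FAKE_DB = new HashMap<>();
    static {
        FAKE_DB.put("a", "data-for-a");
        FAKE_DB.put("b", "data-for-b");
    }

    // Hypothetical blocking lookup; a real one would call a database client.
    static String blockingLookup(String key) {
        try {
            Thread.sleep(50); // simulated network round trip
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return FAKE_DB.getOrDefault(key, "unknown");
    }

    // Starts every lookup first, then collects the results in input order.
    static List<String> lookupAll(List<String> keys) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // collect() is a terminal operation, so all futures are
            // submitted before the first join() below blocks.
            List<CompletableFuture<String>> pending = keys.stream()
                .map(k -> CompletableFuture.supplyAsync(() -> blockingLookup(k), pool))
                .collect(Collectors.toList());
            return pending.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupAll(Arrays.asList("a", "b", "c")));
    }
}
```

With a synchronous query per element, three lookups would wait roughly three
round trips in sequence; here they wait in parallel, up to the pool size.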

Best,
Andrey

(1)
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/operators/asyncio.html

On Thu, Jan 24, 2019 at 1:10 PM Marke Builder <marke.buil...@gmail.com>
wrote:

> Hi,
>
> I have a question regarding the "Broadcast State Pattern".
> My job consume two streams (kafka, rabbitmq), on one of the streams come a
> lot of data and continuously[1]. On the other  very few and rarely[2]. I'm
> using the Broadcast State pattern, because the stream[2] are updating data
> which are required for stream[1].
>
> Q1: Is the Broadcast State Pattern the right way?
>
> As I mentioned above, the stream[2] provide data and "say" read additional
> data from a database.
>
> Q2: What is the best(the most efficient) way to request a database from
> the processElement(...) function?
>
> Many Thanks!
> Marke
>
