A related example I've found:

https://ngsi-ld-tutorials.readthedocs.io/en/latest/big-data-flink.html#feedback-loop-persisting-context-data

Salva

On 2023/06/09 07:28:17 Salva Alcántara wrote:
> I want to implement a job that feeds sink messages back into a source,
> effectively introducing a cycle. For simplicity, consider the following
> Kafka-based enrichment scenario:
>
> - A source for the "facts"
> - A source for the "dimensions"
> - A sink for the "dimensions" (this creates a loop)
> - A co-process operator that joins the two streams to do the enrichment
> (see the sketch after this list)
> - Finally, another sink for the final "enriched facts"
>
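> To make the topology concrete, here is a minimal sketch of the wiring I
> have in mind (Flink's KafkaSource/KafkaSink builders; Fact, Dim, Enriched,
> the schemas, brokers and the two sinks are placeholders, imports omitted):
>
>     StreamExecutionEnvironment env =
>         StreamExecutionEnvironment.getExecutionEnvironment();
>     env.enableCheckpointing(60_000); // needed for transactional sinks
>
>     KafkaSource<Fact> factSource = KafkaSource.<Fact>builder()
>         .setBootstrapServers(brokers)
>         .setTopics("facts")
>         .setValueOnlyDeserializer(new FactSchema())
>         .build();
>
>     KafkaSource<Dim> dimSource = KafkaSource.<Dim>builder()
>         .setBootstrapServers(brokers)
>         .setTopics("dimensions")
>         .setValueOnlyDeserializer(new DimSchema())
>         // only see committed records, since this same job writes to
>         // "dimensions" transactionally (see the sink further below)
>         .setProperty("isolation.level", "read_committed")
>         .build();
>
>     SingleOutputStreamOperator<Enriched> enriched = env
>         .fromSource(factSource, WatermarkStrategy.noWatermarks(), "facts")
>         .connect(env.fromSource(dimSource, WatermarkStrategy.noWatermarks(),
>             "dimensions"))
>         .keyBy(Fact::getDimKey, Dim::getKey)
>         .process(new EnrichFn()); // sketched below
>
>     enriched.sinkTo(enrichedFactsSink); // sink for the enriched facts
>     enriched.getSideOutput(EnrichFn.DIM_UPDATES)
>         .sinkTo(dimSink);               // sink that closes the loop
>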
> The thing is that, based on the "facts", the data in the dimensions table
> might change, and in that case I want to update both the internal state of
> the join operator (where the table is materialized) and emit the update to
> the sink (because another service needs that information, which in turn
> will potentially result in new facts coming in).
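>
> The joiner would then look roughly like this, where enrich,
> updatesDimension and newDimension stand in for my domain logic:
>
>     public class EnrichFn
>         extends KeyedCoProcessFunction<String, Fact, Dim, Enriched> {
>
>       public static final OutputTag<Dim> DIM_UPDATES =
>           new OutputTag<Dim>("dim-updates") {};
>
>       private transient ValueState<Dim> dimState;
>
>       @Override
>       public void open(Configuration parameters) {
>         dimState = getRuntimeContext().getState(
>             new ValueStateDescriptor<>("dim", Dim.class));
>       }
>
>       @Override
>       public void processElement1(Fact fact, Context ctx,
>           Collector<Enriched> out) throws Exception {
>         Enriched result = enrich(fact, dimState.value());
>         out.collect(result);
>         // A fact may change its dimension: update the materialized
>         // state and emit the new version to keep the topic in sync.
>         if (result.updatesDimension()) {
>           dimState.update(result.newDimension());
>           ctx.output(DIM_UPDATES, result.newDimension());
>         }
>       }
>
>       @Override
>       public void processElement2(Dim dim, Context ctx,
>           Collector<Enriched> out) throws Exception {
>         dimState.update(dim); // materialize the dimensions table
>       }
>     }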
>
> So, to be clear, the feedback loop is not an internal one within the Flink
> job pipeline, but an external one, and hence the graph topology continues
> to be a DAG.
>
> I was just wondering how common this use case is and what precautions, if
> any, need to be taken. In other libraries/frameworks, this kind of loop can
> be problematic, e.g.:
>
> - https://github.com/lovoo/goka/issues/95
>
> I guess that to keep the local state of the joiner in sync with the
> dimensions Kafka topic (which is both an input and an output here), I
> would need to enable exactly-once guarantees at the sink level, e.g., by
> following this recipe (sketch below):
>
> - https://docs.immerok.cloud/docs/how-to-guides/development/exactly-once-with-apache-kafka-and-apache-flink/
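>
> i.e., something along these lines for the dimensions sink (the
> transactional id prefix and DimSchema are placeholders):
>
>     KafkaSink<Dim> dimSink = KafkaSink.<Dim>builder()
>         .setBootstrapServers(brokers)
>         .setRecordSerializer(KafkaRecordSerializationSchema.<Dim>builder()
>             .setTopic("dimensions")
>             .setValueSerializationSchema(new DimSchema())
>             .build())
>         .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
>         // must be unique among all jobs writing to this cluster
>         .setTransactionalIdPrefix("dim-feedback")
>         .build();
>
> If I understand the recipe correctly, the transactions only commit on
> checkpoints, so with read_committed on the dimension source the feedback
> latency would be bounded below by the checkpoint interval.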
>
> Thanks in advance!
>
> Salva
>
