End-to-end exactly-once semantics in simple streaming app

2019-03-19 Thread Patrick Fial
Hello, I am working on a streaming application with apache flink, which shall provide end-to-end exactly-once delivery guarantees. The application is roughly built like this: environment.addSource(consumer) .map(… idempotent transformations ...) .map(new DatabaseFunction) .map(… idempoten

Re: End-to-end exactly-once semantics in simple streaming app

2019-03-19 Thread Andrey Zagrebin
Hi Patrick, One approach, I would try, is to use Flink state and sync it with database in initializeState and CheckpointListener.notifyCheckpointComplete. Basically issue only idempotent updates to database but only when the last checkpoint is securely taken and records before it are not processed

Re: End-to-end exactly-once semantics in simple streaming app

2019-03-20 Thread Patrick Fial
Hi Andrey, thanks for your feedback. I am not sure if I understand 100% correctly, but using the flink state to store my stuff (in addition to the oracle database) is not an option, because to my knowledge flink state does not allow arbitrary lookup queries, which I need to do, however. Also, g

Re: End-to-end exactly-once semantics in simple streaming app

2019-03-21 Thread Kostas Kloudas
Hi Patrick, In order for you DB records to be up-to-date and correct, I think that you would have to implement a 2-phase-commit sink. Now for querying multiple keys, why not doing the following: Let's assume for a single result record, you want to join data from K1, K2, K3. You can have a functio

Re: End-to-end exactly-once semantics in simple streaming app

2019-03-31 Thread Patrick Fial
Hi, thanks for your reply and sorry for the late response. The problem is, I am unsure how I should implement the two-phase-commit pattern, because my JDBC connection is within a .map()/.flatMap() operator, and it is NOT a data sink. As written in my original question, my stream setup is a sim

Re: End-to-end exactly-once semantics in simple streaming app

2019-04-08 Thread Fabian Hueske
Hi Patrick, In general, you could also implement the 2PC logic in a regular operator. It does not have to be a sink. You would need to add the logic of TwoPhaseCommitSinkFunction to your operator. However, the TwoPhaseCommitSinkFunction does not work well with JDBC. The problem is that you would n

Re: End-to-end exactly-once semantics in simple streaming app

2019-04-08 Thread Piotr Nowojski
Hi, Regarding the JDBC and Two-Phase commit (2PC) protocol. As Fabian mentioned it is not supported by the JDBC standard out of the box. With some workarounds I guess you could make it work by for example following one of the ideas: 1. Write records using JDBC with at-least-once semantics, by f