Re: CDC using Query

2022-02-11 Thread Martijn Visser
Hi Mohan,

I don't know the specifics about the single Kafka Connect worker.

The Flink CDC connector is NOT a Kafka Connect connector. As explained before,
there is no Kafka involved when using this connector. As is also mentioned
in the same readme, it does indeed provide exactly-once processing.

Best regards,

Martijn


Re: CDC using Query

2022-02-11 Thread mohan radhakrishnan
Hello,
  Ok. I may not have understood the answer to my previous
question.
When I listen to https://www.youtube.com/watch?v=IOZ2Um6e430 at 20:14 he
starts to talk about this.
Is he talking about a single Kafka Connect worker or a cluster? He
mentions that it is 'at-least-once'.
So is Flink's version an improvement? Does Flink's Kafka Connector in a
Connect cluster guarantee 'exactly-once'?
Please bear with me.

This will have other consequences too, as our MQ may need an MQ connector
(probably from Flink or Confluent).
Different connectors may have different guarantees.

Thanks.

> 3. Delivering to kafka from flink is not exactly once. Is that right ?
>
>
> No, both Flink CDC Connector and Flink Kafka Connector provide exactly
> once implementation.
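The at-least-once vs exactly-once distinction being asked about here can be sketched in a few lines of plain Python. This is a toy model, not the Flink or Kafka API, and every name in it is invented for the illustration: at-least-once resends after a crash that happens before an acknowledgement, so duplicates can appear; exactly-once additionally tracks committed offsets so replays are dropped.

```python
# Toy model of delivery guarantees (plain Python, not the Flink/Kafka API;
# all names are invented for this sketch).

def deliver_at_least_once(records, crash_before_ack=2):
    """Resend after a simulated crash: nothing is lost, but one record repeats."""
    out = []
    sent = 0
    i = 0
    while i < len(records):
        out.append((i, records[i]))      # send (offset, record)
        sent += 1
        if sent == crash_before_ack:     # crash before the ack arrives
            sent = -1                    # restart: the un-acked record
            continue                     # at offset i is sent again
        i += 1                           # ack received, move on
    return out

def dedupe_to_exactly_once(deliveries):
    """Transactional sink: commit each offset once, drop replays."""
    committed = set()
    out = []
    for offset, rec in deliveries:
        if offset not in committed:
            committed.add(offset)
            out.append(rec)
    return out

at_least = deliver_at_least_once(["a", "b", "c"])   # "b" is delivered twice
exactly = dedupe_to_exactly_once(at_least)          # duplicates are dropped
```

The point of the sketch: the transport is the same in both cases; "exactly-once" comes from the sink committing offsets transactionally, which is roughly what Flink's transactional Kafka sink adds on top of at-least-once delivery.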

Re: CDC using Query

2022-02-11 Thread Martijn Visser
Hi,

The readme on the Flink CDC connectors [1] says that Oracle Database
versions 11, 12, and 19 are supported with Oracle Driver 19.3.0.0.

Best regards,

Martijn

[1] https://github.com/ververica/flink-cdc-connectors/blob/master/README.md


Re: CDC using Query

2022-02-10 Thread mohan radhakrishnan
Thanks. I looked at it. Our primary DBs are Oracle and MySQL. The Flink CDC
Connector uses Debezium, I think. So Ververica doesn't have a Flink CDC
connector for Oracle?



Re: CDC using Query

2022-02-07 Thread Leonard Xu
Hello, mohan

> 1. Does flink have any support to track any missed source Jdbc CDC records ? 

The Flink CDC Connector provides exactly-once semantics, which means it won't
miss records. Tip: the Flink JDBC Connector only scans the database once and
cannot continuously read a CDC stream.

> 2. What is the equivalent of Kafka consumer groups ?

Different databases have different CDC mechanisms: for MySQL/MariaDB it's the
serverId used to mark a replica, for PostgreSQL it's the slot name.


> 3. Delivering to kafka from flink is not exactly once. Is that right ?

No, both the Flink CDC Connector and the Flink Kafka Connector provide an
exactly-once implementation.

BTW, if your destination is Elasticsearch, the quick start demo[1] may help you.

Best,
Leonard

[1] 
https://ververica.github.io/flink-cdc-connectors/master/content/quickstart/mysql-postgres-tutorial.html
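The two points above (a JDBC source scans once, a CDC source keeps consuming from a tracked position) can be made concrete with a toy sketch in plain Python. This is not the Flink API; every name is invented for the illustration. The position the reader remembers stands in, very loosely, for what a MySQL server-id / binlog offset or a PostgreSQL replication slot provides.

```python
# Toy contrast (plain Python, not the Flink API; names invented for the
# sketch): a JDBC-style source scans once; a CDC-style reader tracks its
# position in the changelog and keeps consuming new changes.

def jdbc_scan(table):
    """One bounded snapshot; changes made afterwards are never seen."""
    return list(table)

class CdcReader:
    def __init__(self):
        self.position = 0            # loosely: a binlog offset / slot position

    def poll(self, changelog):
        """Return all events since the last poll and advance the position."""
        events = changelog[self.position:]
        self.position = len(changelog)
        return events

table = ["row1", "row2"]
changelog = ["+row1", "+row2"]

snapshot = jdbc_scan(table)          # one snapshot, and that's all
reader = CdcReader()
first = reader.poll(changelog)       # everything in the changelog so far
changelog.append("-row1")            # a new change arrives later
second = reader.poll(changelog)      # only the new event is returned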





Re: CDC using Query

2022-02-06 Thread mohan radhakrishnan
Hello,
 I have some specific questions. I'd appreciate some pointers.
1. Does Flink have any support to track any missed source JDBC CDC records?
2. What is the equivalent of Kafka consumer groups?
3. Delivering to Kafka from Flink is not exactly-once. Is that right?

Thanks



Re: CDC using Query

2022-02-04 Thread mohan radhakrishnan
Hello,
   So the JDBC source connector is Kafka's, and the transformation is
done by Flink (Flink SQL)? But that connector can miss records, I thought.
I started looking at Flink for this and other use cases.
Can I see the alternative to Spring Cloud Stream (Kafka Streams)? Since I
am learning Flink, Kafka Streams' changelog topics, exactly-once
delivery and DLQs seemed good for our critical push notifications.

We also needed an Elasticsearch sink.

Thanks



Re: CDC using Query

2022-02-04 Thread Dawid Wysakowicz

Hi Mohan,

I don't know much about Kafka Connect, so I will not talk about its
features and differences to Flink. Flink on its own does not have the
capability to read a CDC stream directly from a DB. However, there is the
flink-cdc-connectors[1] project, which embeds the standalone Debezium
engine inside Flink's source and can process the DB changelog with all
the processing guarantees that Flink provides.

As for the idea of processing further with Kafka Streams: why not
process the data with Flink? What do you miss in Flink?


Best,

Dawid

[1] https://github.com/ververica/flink-cdc-connectors
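A rough sketch of the checkpointing idea behind those processing guarantees, in plain Python (not Flink's actual API; the whole setup is invented for illustration): the source offset and the sink output are committed together at each checkpoint, so after a crash the job rolls back to the last checkpoint and resumes, neither losing nor double-committing a changelog entry.

```python
# Toy checkpoint/replay sketch (plain Python, not Flink's API; everything
# here is invented for the illustration). Offset and output are committed
# atomically at each checkpoint; a crash rolls both back together.

def process(changelog, checkpoint_every=2, crash_after=3):
    """Process a changelog, surviving one simulated crash without loss."""
    committed_offset = 0             # state saved at the last checkpoint
    committed_output = []
    offset = committed_offset
    output = list(committed_output)
    crashed = False
    while offset < len(changelog):
        output.append(changelog[offset])    # process one event
        offset += 1
        if not crashed and offset == crash_after:
            # crash: uncommitted work is lost; restore checkpointed state
            offset = committed_offset
            output = list(committed_output)
            crashed = True
            continue
        if offset % checkpoint_every == 0:  # checkpoint: commit atomically
            committed_offset = offset
            committed_output = list(output)
    return output

result = process(["a", "b", "c", "d", "e"])  # each event committed exactly once
```

The design point the sketch tries to show: replay alone gives at-least-once; it is committing the sink output together with the source offset (what Flink's two-phase-commit sinks do) that upgrades it to exactly-once.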



CDC using Query

2022-02-04 Thread mohan radhakrishnan
Hi,
     When I was looking for CDC, I realized Flink uses a Kafka connector to
stream data into Flink. The idea is to send it forward to Kafka and consume it
using Kafka Streams.

Are there source DLQs or additional mechanisms to detect failures to read
from the DB?

We don't want to use Debezium, and our CDC is based on queries.

What mechanisms does Flink have that a Kafka Connect worker does not?
Kafka Connect workers can go down and source data can be lost.

Does the idea to send it forward to Kafka and consume it using Kafka
Streams make sense? Can the checkpointing feature of Flink help? I plan
to use Kafka Streams for 'Exactly-once Delivery' and changelog topics.

Could you point out relevant material to read?

Thanks,
Mohan
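Since the thread's subject is query-based CDC, a minimal sketch in plain Python (names invented for the illustration) shows the classic way polling on an updated_at watermark can miss rows: a row committed late with an older timestamp never satisfies the `updated_at > last_seen` predicate. This is the failure mode that log-based CDC (Debezium reading the binlog) avoids.

```python
# Toy sketch of query-based CDC (plain Python; names invented): poll for
# rows newer than a watermark. A row committed late with an older
# timestamp is never picked up -- the gap log-based CDC avoids.

def poll(rows, last_seen):
    """Return rows past the watermark, plus the advanced watermark."""
    new = [r for r in rows if r["updated_at"] > last_seen]
    high = max([last_seen] + [r["updated_at"] for r in new])
    return new, high

rows = [{"id": 1, "updated_at": 10}, {"id": 2, "updated_at": 20}]

batch1, watermark = poll(rows, 0)            # picks up ids 1 and 2
rows.append({"id": 3, "updated_at": 15})     # late commit, older timestamp
batch2, watermark = poll(rows, watermark)    # id 3 is silently missed
```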