Hm, cool.  Thanks, Gwen and Guozhang.

Loose coupling (especially with regard to the number of instances
running), batch inserts, and exactly-once are very convincing.  Dynamic
schema is interesting / scary, but I'd need a dynamic app on the other
side, which I don't have. :-)

I'll plod along with KS-foreach until the JDBC sink connector is
available, but I'd definitely pick it up and give it a try once it's
released.
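
In case it's useful to anyone else, the interim approach looks roughly
like this.  This is a minimal sketch, not my actual code: the topic,
table, and column names are made up, and it assumes a single stream
thread (the default), since the shared PreparedStatement isn't safe
across threads.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class CountsToPostgres {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "counts-to-postgres");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "secret");
        PreparedStatement upsert = conn.prepareStatement(
                "INSERT INTO word_counts (word, cnt) VALUES (?, ?) "
                + "ON CONFLICT (word) DO UPDATE SET cnt = EXCLUDED.cnt");

        KStreamBuilder builder = new KStreamBuilder();
        KStream<String, Long> counts =
                builder.stream(Serdes.String(), Serdes.Long(), "word-counts");

        // Upsert every record; ON CONFLICT keeps replays from creating
        // duplicate rows under at-least-once delivery.
        counts.foreach((word, count) -> {
            try {
                upsert.setString(1, word);
                upsert.setLong(2, count);
                upsert.executeUpdate();
            } catch (SQLException e) {
                // ForeachAction can't throw checked exceptions
                throw new RuntimeException(e);
            }
        });

        new KafkaStreams(builder, new StreamsConfig(props)).start();
    }
}

Connection management and error handling are hand-waved here, which is
part of why the connector appeals to me.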

Thanks,

Mathieu


On Thu, Jul 21, 2016 at 7:07 PM, Gwen Shapira <g...@confluent.io> wrote:

> In addition, our soon-to-be-released JDBC sink connector uses the
> Connect framework to do things that are kind of annoying to do
> yourself (example config sketched below):
> * convert data types
> * create tables if needed, and add columns to tables if needed, based
> on the data in Kafka
> * support both insert and upsert
> * configurable batch inserts
> * exactly-once from Kafka to the DB (using upserts)
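>
> To give a flavor of the above, a standalone sink instance would be
> configured with a properties file along these lines (a sketch only;
> the exact property names may shift before we release):
>
>   name=postgres-sink
>   connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
>   # sink parallelism, independent of how many producing instances you run
>   tasks.max=2
>   topics=word-counts
>   connection.url=jdbc:postgresql://localhost/mydb
>   # "insert" or "upsert"
>   insert.mode=upsert
>   # build the table's primary key from the Kafka record key
>   pk.mode=record_key
>   pk.fields=word
>   # create the table, and later add columns, as needed
>   auto.create=true
>   auto.evolve=true
>   # records per batched insert
>   batch.size=3000
>
> The exactly-once piece falls out of upsert mode: re-delivering the
> same Kafka records after a failure just rewrites the same rows.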
>
> We'll notify you when we open the repository. Just a bit of cleanup left :)
>
>
> On Thu, Jul 21, 2016 at 1:45 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > Hi Mathieu,
> >
> > I'm cc'ing Ewen, who can answer your question as well, but here are my
> > two cents:
> >
> > 1. One benefit of piping the end result from KS to KC, rather than
> > using .foreach() in KS directly, is loose coupling between data
> > processing and data copying. For example, with the foreach approach the
> > number of JDBC connections is tied to the number of your KS instances,
> > while in practice you may want a different number; with KC, the number
> > of sink tasks is configured independently.
> >
> > 2. We are working on end-to-end exactly-once semantics right now, which
> > is a big project involving both KS and KC. From the KS point of view,
> > any logic inside the foreach() call is a "black box", and any side
> > effects it causes are not covered by KS's exactly-once semantics. For
> > example, if your app crashes after the JDBC write but before the offset
> > is committed, the record will be re-processed and the foreach will fire
> > again, and KS has no way to know whether that side effect is safe to
> > repeat. KC, on the other hand, has full knowledge of the connector, and
> > hence can achieve exactly-once for copying data to your RDBMS as well.
> >
> >
> > Guozhang
> >
> > On Thu, Jul 21, 2016 at 6:49 AM, Mathieu Fenniak <
> > mathieu.fenn...@replicon.com> wrote:
> >
> >> Hello again, Kafka users,
> >>
> >> My end goal is to get stream-processed data into a PostgreSQL database.
> >>
> >> I really like the architecture that Kafka Streams takes; it's "just" a
> >> library, so I can build a normal Java application around it and deal
> >> with configuration and orchestration myself.  To persist my data, it's
> >> easy to add a .foreach() to the end of my topology and upsert data into
> >> my DB with JDBC.
> >>
> >> I'm interpreting, based upon the docs, that the recommended approach
> >> would be to send my final data back to a Kafka topic and use Connect
> >> with a sink to persist that data.  That seems really interesting, but
> >> it's another complex moving part that I could do without.
> >>
> >> What advantages does Kafka Connect provide that I would be missing out
> >> on by persisting my data directly from my Kafka Streams application?
> >>
> >> Thanks,
> >>
> >> Mathieu
> >>
> >
> >
> >
> > --
> > -- Guozhang
>
