Hm, cool. Thanks, Gwen and Guozhang. Loose coupling (especially with regard to the number of instances running), batch inserts, and exactly-once are very convincing. Dynamic schema is interesting / scary, but I'd need a dynamic app on the other side, which I don't have. :-)
I'll plod along with KS-foreach until the JDBC sink connector is
available, but would definitely pick it up and give it a try once it's
released.

Thanks,
Mathieu

On Thu, Jul 21, 2016 at 7:07 PM, Gwen Shapira <g...@confluent.io> wrote:
> In addition, our soon-to-be-released JDBC sink connector uses the
> Connect framework to do things that are kind of annoying to do
> yourself:
> * convert data types
> * create tables if needed, and add columns to tables if needed, based
>   on the data in Kafka
> * support for both insert and upsert
> * configurable batch inserts
> * exactly-once from Kafka to the DB (using upserts)
>
> We'll notify you when we open the repository. Just a bit of cleanup left :)
>
> On Thu, Jul 21, 2016 at 1:45 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > Hi Mathieu,
> >
> > I'm cc'ing Ewen to answer your question as well, but here are my two
> > cents:
> >
> > 1. One benefit of piping the end result from KS to KC, rather than
> > using .foreach() in KS directly, is loose coupling between data
> > processing and data copying. With the foreach approach, the number
> > of JDBC connections is tied to the number of your KS instances,
> > while in practice you may want a different number.
> >
> > 2. We are working on end-to-end exactly-once semantics right now,
> > which is a big project involving both KS and KC. From the KS point
> > of view, any logic inside the foreach() call is a "black box", and
> > any side effects it produces are outside the scope of KS's
> > exactly-once semantics; KC, by contrast, has full knowledge of the
> > connector and can therefore achieve exactly-once when copying data
> > to your RDBMS.
> >
> > Guozhang
> >
> > On Thu, Jul 21, 2016 at 6:49 AM, Mathieu Fenniak <
> > mathieu.fenn...@replicon.com> wrote:
> >
> >> Hello again, Kafka users,
> >>
> >> My end goal is to get stream-processed data into a PostgreSQL
> >> database.
> >>
> >> I really like the architecture that Kafka Streams takes; it's
> >> "just" a library, so I can build a normal Java application around
> >> it and handle configuration and orchestration myself. To persist my
> >> data, it's easy to add a .foreach() to the end of my topology and
> >> upsert data into my DB with JDBC.
> >>
> >> My reading of the docs is that the recommended approach is to send
> >> my final data back to a Kafka topic and use Connect with a sink to
> >> persist it. That seems really interesting, but it's another complex
> >> moving part that I could do without.
> >>
> >> What advantages does Kafka Connect provide that I would be missing
> >> out on by persisting my data directly from my Kafka Streams
> >> application?
> >>
> >> Thanks,
> >>
> >> Mathieu
> >
> > --
> > -- Guozhang
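For concreteness, here is a minimal sketch of the KS-foreach approach
discussed in the thread. It is an illustration, not the thread's actual
code: the topic, table, column, and connection details are invented, it
is written against a newer Streams API than the 0.10-era one above, and
a real version would need batching, retries, and one connection per
stream thread.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.util.Properties;

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;

    public class ForeachJdbcSink {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "foreach-jdbc-sink");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            // Hypothetical PostgreSQL target; one shared statement is only
            // safe here because Streams defaults to a single stream thread.
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/mydb", "user", "password");
            PreparedStatement upsert = conn.prepareStatement(
                    "INSERT INTO results (id, value) VALUES (?, ?) " +
                    "ON CONFLICT (id) DO UPDATE SET value = EXCLUDED.value");

            StreamsBuilder builder = new StreamsBuilder();
            // Persist each record with an upsert, one statement per record --
            // exactly the naive coupling Guozhang's point 1 warns about.
            builder.<String, String>stream("processed-results")
                   .foreach((key, value) -> {
                       try {
                           upsert.setString(1, key);
                           upsert.setString(2, value);
                           upsert.executeUpdate();
                       } catch (Exception e) {
                           throw new RuntimeException(e);
                       }
                   });

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
        }
    }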
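For comparison, a sketch of what a sink configuration covering Gwen's
feature list might look like. The connector had not been released when
this thread was written; the property names below are taken from the
Confluent JDBC sink connector as it later shipped, and the connection
and topic values are invented, so treat the whole file as an assumption.

    # Hypothetical connector config for a JDBC sink into PostgreSQL.
    name=postgres-results-sink
    connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
    tasks.max=1
    topics=processed-results
    connection.url=jdbc:postgresql://localhost/mydb
    connection.user=user
    connection.password=password
    # Upsert mode gives the idempotent writes behind the exactly-once claim.
    insert.mode=upsert
    pk.mode=record_key
    pk.fields=id
    # Create missing tables and add missing columns from the record schema.
    auto.create=true
    auto.evolve=true
    # Configurable batch inserts, as mentioned above.
    batch.size=3000

Note how the knobs line up with Gwen's list: insert.mode covers
insert/upsert, auto.create and auto.evolve cover table and column
creation, and batch.size covers batching, all without any JDBC code in
the Streams application itself.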