So Kafka is a fine solution as part of an event-sourced story. It's not a simple solution, but it fits.

Kafka can store data for a long time, and you shouldn't discount that; however, it might not be a good fit as the primary long-term data store if we're talking about keeping raw events for years and years. I think someone in another thread mentioned moving these raw events to S3 or some other object store, which is probably a good idea. Kafka can store the "head" of the raw event stream, as well as point-in-time "snapshots" of it, for your applications to consume. If a raw event stream ends up super big, it may be impractical for an application to consume it from the beginning of time to recreate its datastore anyway; snapshots/aggregate snapshots help with this. Years and years of raw events may also be useful for batch analytics (Spark/Hadoop).
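To make the "consume from the beginning of time" part concrete: with most Kafka clients, a consumer group with no committed offsets can start at offset 0, replay the whole retained stream into a fresh projection, and then keep consuming live events. A minimal sketch using the kafkajs Node.js client (an assumption -- any client with an equivalent "start from earliest" option works; the topic name and projection logic are hypothetical):

import { Kafka } from "kafkajs";

// Hypothetical projection writer; in practice this would INSERT/UPDATE
// rows in the derived store (PostgreSQL, Elasticsearch, etc.).
async function applyToProjection(event: unknown): Promise<void> {
  console.log("applying", event);
}

const kafka = new Kafka({ clientId: "projector", brokers: ["localhost:9092"] });
const consumer = kafka.consumer({ groupId: "projection-rebuild-v2" });

async function rebuildProjection(): Promise<void> {
  await consumer.connect();
  // fromBeginning: a group with no committed offsets starts at offset 0,
  // replays the retained history, then keeps receiving new events.
  await consumer.subscribe({ topic: "raw-events", fromBeginning: true });
  await consumer.run({
    eachMessage: async ({ message }) => {
      if (message.value) {
        await applyToProjection(JSON.parse(message.value.toString()));
      }
    },
  });
}

rebuildProjection().catch(console.error);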
In terms of an event store, I quite like the idea of using raw events to generate the projections/materialized views of domain objects AND using events within those domain objects to generate new raw events. This can be done nicely with Kafka feeding a DDD aggregate root --> traditional database --> Kafka. You can use CDC-style solutions to capture the database journal log and publish events that way. Things like Kafka Connect become really handy for a solution like this, as do Martin Kleppmann's Bottled Water project and our Debezium project: http://debezium.io ... I've got a blog+video demo of all of this coming soon!
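For the CDC leg of that loop, Debezium runs as a Kafka Connect connector that you register through Connect's REST API. A sketch of what that registration can look like from Node.js -- the hostnames, credentials and database names are placeholders, the config keys follow Debezium's MySQL connector tutorial and vary by connector and version, and a runtime with the fetch API is assumed:

// Hypothetical connector registration against a local Kafka Connect worker.
const connectorConfig = {
  name: "inventory-connector",
  config: {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "localhost",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.server.id": "184054", // unique id for the replication client
    "database.server.name": "dbserver1", // logical name, used as topic prefix
    "database.whitelist": "inventory", // only capture changes from this database
    "database.history.kafka.bootstrap.servers": "localhost:9092",
    "database.history.kafka.topic": "dbhistory.inventory",
  },
};

async function registerConnector(): Promise<void> {
  // Kafka Connect exposes a REST API; POST /connectors creates a connector.
  const res = await fetch("http://localhost:8083/connectors", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(connectorConfig),
  });
  console.log("connect replied:", res.status, await res.text());
}

registerConnector().catch(console.error);

Once the connector is running, each committed row change in the captured database shows up as an event on a topic named after the table (e.g. dbserver1.inventory.orders), ready for downstream consumers.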
On Wed, May 18, 2016 at 1:15 AM, Olivier Lalonde <olalo...@gmail.com> wrote:

> Dave:
>
> Thanks for the suggestion, Bookkeeper looks interesting. I assume you
> meant "more high-level interface"?
>
> Chris:
>
> Thanks Chris, I actually came across a few of your talks and articles
> when researching this style of architecture (along with stuff by Greg
> Young and Martin Kleppmann) and am really excited to see you pop into
> this thread.
>
> I understand Kafka is fairly low-level for what I really need. I'm not
> really convinced I will need random access to the event store though;
> that seems to belong to a specific style of event sourcing architecture
> (with roots stemming from DDD?) where the application state is all kept
> in memory.
>
> My idea was to use PostgreSQL to "cache" my state and run queries from
> it (and possibly other types of databases in the future as needed). To
> build the PostgreSQL database or add new derived state "caches" (Greg
> Young calls those "projections" I believe), I would just replay the
> event stream from event 0 with some glue code to handle the new
> event => database logic. It keeps things simple, though I suppose it
> could get slow once my event stream gets really big (I could add
> snapshots as a concept eventually).
>
> I read a bit about eventuate.io but it wasn't clear to me whether it is
> an application framework or a server (open source? commercial?).
>
> One thing that doesn't help is that we're primarily a Node.js shop, and
> all the information, tooling, examples and terminology are heavily
> geared towards JVM developers. I found this great talk by Stefan Kutko,
> https://www.youtube.com/watch?v=X_VHWQa1k0k, which describes a Node.js
> based event sourced architecture, but it's a bit short on
> implementation details.
>
> Abhaya:
>
> Also curious to hear what others have to say about this. What I'm
> trying to achieve is a single unified event store from which all other
> data stores are derived, but I'm still unsure whether Kafka is the
> right choice.
>
> I also believe that a lot of people use Kafka primarily as a message
> queue and keep track of CRUD-style state in a more traditional ACID
> database. One thing I wonder about: let's say someone updates their
> shopping cart. You can do a SQL transaction that updates your cart
> state and then publish a message to Kafka for further processing, but
> what if your process crashes in between? Is there a way to do
> transactions that encompass both your SQL database and the act of
> publishing a message to Kafka?
>
> Thanks again everyone for educating me :)
>
> Oli
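On the crash-between-SQL-commit-and-Kafka-publish question above: a common answer is the "transactional outbox" pattern -- write the state change and the outgoing event to the database in one ACID transaction, then let a CDC tool (such as the Debezium setup described earlier) or a small poller publish the outbox rows to Kafka. A minimal sketch using the node-postgres (pg) client; the table and column names are hypothetical:

import { Pool } from "pg";

const pool = new Pool({ connectionString: "postgres://localhost/shop" });

// Update the cart and record the event in the same ACID transaction.
// A CDC connector or poller later copies rows from the outbox table to
// Kafka, so a crash between the two writes can no longer lose the event.
async function addItemToCart(cartId: string, item: string): Promise<void> {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    await client.query(
      "UPDATE carts SET items = items || $2::jsonb WHERE id = $1",
      [cartId, JSON.stringify([item])]
    );
    await client.query(
      "INSERT INTO outbox (aggregate_id, type, payload) VALUES ($1, $2, $3)",
      [cartId, "ItemAddedToCart", JSON.stringify({ cartId, item })]
    );
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}

Since the event commits atomically with the cart update, the failure mode shifts from "event lost" to "event possibly published more than once", so downstream consumers should be idempotent.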
> On Tue, May 17, 2016 at 11:26 PM, Abhaya P <abhaya...@gmail.com> wrote:
>
> > A curious question:
> > How does a database + Kafka broker solution work?
> > Would the consumer(s) load the database and then overlay the events?
> > Or is it that the events are overlaid on the database and the database
> > is queried?
> > I am curious how a complete solution would look conceptually with
> > "database + Kafka".
> >
> > Thanks,
> > Abhaya
> >
> > On Tue, May 17, 2016 at 10:46 PM, Chris Richardson <
> > ch...@chrisrichardson.net> wrote:
> >
> > > Oli,
> > >
> > > Kafka is only a partial solution.
> > > As I describe here (
> > > http://www.slideshare.net/chris.e.richardson/hacksummit-2016-eventdriven-microservices-events-on-the-outside-on-the-inside-and-at-the-core/57
> > > ) an event store is a hybrid of a database and a message broker.
> > > It is a database because it provides an API for inserting events for
> > > an entity and retrieving them by the entity's primary key.
> > > It is a message broker because it provides an API for subscribing to
> > > events.
> > > Kafka clearly satisfies the latter but not the former.
> > >
> > > Just my two cents.
> > >
> > > Chris
> > >
> > > --
> > > Microservices application platform http://eventuate.io
> > >
> > > On Tue, May 17, 2016 at 12:18 AM, Olivier Lalonde <olalo...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I am considering adopting an "event sourcing" architecture for a
> > > > system I am developing, and Kafka seems like a good choice of store
> > > > for events.
> > > >
> > > > For those who aren't aware, this architecture style consists in
> > > > storing all state changes of the system as an ordered log of events
> > > > and building derivative views as needed for easier querying (using
> > > > a SQL database for example). Those views must be completely derived
> > > > from the event log alone so that the log effectively becomes a
> > > > "single source of truth".
> > > >
> > > > I was wondering if anyone else is using Kafka for that purpose and,
> > > > more specifically:
> > > >
> > > > 1) Can Kafka store messages permanently?
> > > >
> > > > 2) Let's say I throw away my derived view and want to re-build it
> > > > from scratch. Is it possible to consume messages from a topic from
> > > > its very first message and, once it has caught up, listen for new
> > > > messages like it normally would?
> > > >
> > > > 3) Does it support transactions? Let's say I want to push 3 messages
> > > > atomically but the producer process crashes after sending only 2
> > > > messages. Is it possible to "roll back" the first 2 messages (i.e.
> > > > "all or nothing" semantics)?
> > > >
> > > > 4) Does it support request/response style semantics, or can they be
> > > > simulated? My system's primary interface with the outside world is
> > > > an HTTP API, so it would be nice if I could publish an event and
> > > > wait for all the internal services which need to process the event
> > > > to be "done" processing before returning a response.
> > > >
> > > > PS: I'm a Node.js/Go developer so when possible please avoid
> > > > Java-centric terminology.
> > > >
> > > > Thanks!
> > > >
> > > > - Oli
> > > >
> > > > --
> > > > - Oli
> > > >
> > > > Olivier Lalonde
> > > > http://www.syskall.com <-- connect with me!
>
> --
> - Oli
>
> Olivier Lalonde
> http://www.syskall.com <-- connect with me!

--
Christian Posta
twitter: @christianposta
http://www.christianposta.com/blog
http://fabric8.io
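To make the database-plus-message-broker hybrid Chris describes above concrete, here is a sketch of the surface such an event store exposes (all names illustrative, not any particular product's API):

// The two halves of the hybrid: Kafka gives you the subscribe() side;
// the per-entity append/read side is what a dedicated event store adds.
interface DomainEvent {
  entityId: string; // primary key of the entity the event belongs to
  type: string;
  payload: unknown;
}

interface EventStore {
  // "Database" half: append events for an entity, read them back by key.
  appendEvents(entityId: string, events: DomainEvent[]): Promise<void>;
  readEvents(entityId: string): Promise<DomainEvent[]>;

  // "Message broker" half: be notified as new events are committed.
  subscribe(handler: (event: DomainEvent) => void): () => void; // returns an unsubscribe function
}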