So Kafka is a fine solution as part of an event-sourced story. It's not a simple solution, but it fits.

Kafka can store data for a long time, and you shouldn't discount that; however, it might not be a good fit as the primary long-term data store if we're talking about keeping raw events for years and years. I think someone in another thread mentioned moving these raw events to S3 or some other object store, which is probably a good idea. Kafka can store the "head" of the raw event stream, as well as point-in-time "snapshots" of it, for your applications to consume. If a raw event stream ends up super big, it may be impractical for an application to consume it from the beginning of time to recreate its datastore anyway; snapshots/aggregate snapshots help with this. Years and years of raw events may also be useful for batch analytics (Spark/Hadoop).
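To make the "consume from the beginning of time" part concrete: with most Kafka clients, a consumer group with no committed offsets can start at offset 0, replay the whole retained stream into a fresh projection, and then keep consuming live events. A minimal sketch using the kafkajs Node.js client (an assumption -- any client with an equivalent "start from earliest" option works; the topic name and projection logic are hypothetical):

import { Kafka } from "kafkajs";

// Hypothetical projection writer; in practice this would INSERT/UPDATE
// rows in the derived store (PostgreSQL, Elasticsearch, etc.).
async function applyToProjection(event: unknown): Promise<void> {
  console.log("applying", event);
}

const kafka = new Kafka({ clientId: "projector", brokers: ["localhost:9092"] });
const consumer = kafka.consumer({ groupId: "projection-rebuild-v2" });

async function rebuildProjection(): Promise<void> {
  await consumer.connect();
  // fromBeginning: a group with no committed offsets starts at offset 0,
  // replays the retained history, then keeps receiving new events.
  await consumer.subscribe({ topic: "raw-events", fromBeginning: true });
  await consumer.run({
    eachMessage: async ({ message }) => {
      if (message.value) {
        await applyToProjection(JSON.parse(message.value.toString()));
      }
    },
  });
}

rebuildProjection().catch(console.error);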
In terms of an event store, I quite like the idea of using raw events to generate the projections/materialized views of domain objects AND using events within those domain objects to generate new raw events. This can be done nicely with Kafka feeding a DDD aggregate root --> traditional database --> Kafka. You can use CDC-style solutions to capture the database journal log and publish events that way. Things like Kafka Connect become really handy for a solution like this, as do Martin Kleppmann's Bottled Water project and our Debezium project: http://debezium.io ... I've got a blog+video demo of all of this coming soon!
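For the CDC leg of that loop, Debezium runs as a Kafka Connect connector that you register through Connect's REST API. A sketch of what that registration can look like from Node.js -- the hostnames, credentials and database names are placeholders, the config keys follow Debezium's MySQL connector tutorial and vary by connector and version, and a runtime with the fetch API is assumed:

// Hypothetical connector registration against a local Kafka Connect worker.
const connectorConfig = {
  name: "inventory-connector",
  config: {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "localhost",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.server.id": "184054", // unique id for the replication client
    "database.server.name": "dbserver1", // logical name, used as topic prefix
    "database.whitelist": "inventory", // only capture changes from this database
    "database.history.kafka.bootstrap.servers": "localhost:9092",
    "database.history.kafka.topic": "dbhistory.inventory",
  },
};

async function registerConnector(): Promise<void> {
  // Kafka Connect exposes a REST API; POST /connectors creates a connector.
  const res = await fetch("http://localhost:8083/connectors", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(connectorConfig),
  });
  console.log("connect replied:", res.status, await res.text());
}

registerConnector().catch(console.error);

Once the connector is running, each committed row change in the captured database shows up as an event on a topic named after the table (e.g. dbserver1.inventory.orders), ready for downstream consumers.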
On Wed, May 18, 2016 at 1:15 AM, Olivier Lalonde <olalo...@gmail.com> wrote:

> Dave:
>
> Thanks for the suggestion, Bookkeeper looks interesting. I assume you
> meant "more high-level interface"?
>
> Chris:
>
> Thanks Chris, I actually came across a few of your talks and articles
> when researching this style of architecture (along with stuff by Greg
> Young and Martin Kleppmann) and am really excited to see you pop into
> this thread.
>
> I understand Kafka is fairly low-level for what I really need. I'm not
> really convinced I will need random access to the event store though;
> that seems to belong to a specific style of event sourcing architecture
> (with roots stemming from DDD?) where the application state is all kept
> in memory.
>
> My idea was to use PostgreSQL to "cache" my state and run queries from
> it (and possibly other types of databases in the future as needed). To
> build the PostgreSQL database or add new derived state "caches" (Greg
> Young calls those "projections" I believe), I would just replay the
> event stream from event 0 with some glue code to handle the new
> event => database logic. It keeps things simple, though I suppose it
> could get slow once my event stream gets really big (I could add
> snapshots as a concept eventually).
>
> I read a bit about eventuate.io but it wasn't clear to me whether it is
> an application framework or a server (open source? commercial?).
>
> One thing that doesn't help is that we're primarily a Node.js shop, and
> all the information, tooling, examples and terminology are heavily
> geared towards JVM developers. I found this great talk by Stefan Kutko,
> https://www.youtube.com/watch?v=X_VHWQa1k0k, which describes a Node.js
> based event sourced architecture, but it's a bit short on
> implementation details.
>
> Abhaya:
>
> Also curious to hear what others have to say about this. What I'm
> trying to achieve is a single unified event store from which all other
> data stores are derived, but I'm still unsure whether Kafka is the
> right choice.
>
> I also believe that a lot of people use Kafka primarily as a message
> queue and keep track of CRUD-style state in a more traditional ACID
> database. One thing I wonder about: let's say someone updates their
> shopping cart. You can do a SQL transaction that updates your cart
> state and then publish a message to Kafka for further processing, but
> what if your process crashes in between? Is there a way to do
> transactions that encompass both your SQL database and the act of
> publishing a message to Kafka?
>
> Thanks again everyone for educating me :)
>
> Oli
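On the crash-between-SQL-commit-and-Kafka-publish question above: a common answer is the "transactional outbox" pattern -- write the state change and the outgoing event to the database in one ACID transaction, then let a CDC tool (such as the Debezium setup described earlier) or a small poller publish the outbox rows to Kafka. A minimal sketch using the node-postgres (pg) client; the table and column names are hypothetical:

import { Pool } from "pg";

const pool = new Pool({ connectionString: "postgres://localhost/shop" });

// Update the cart and record the event in the same ACID transaction.
// A CDC connector or poller later copies rows from the outbox table to
// Kafka, so a crash between the two writes can no longer lose the event.
async function addItemToCart(cartId: string, item: string): Promise<void> {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    await client.query(
      "UPDATE carts SET items = items || $2::jsonb WHERE id = $1",
      [cartId, JSON.stringify([item])]
    );
    await client.query(
      "INSERT INTO outbox (aggregate_id, type, payload) VALUES ($1, $2, $3)",
      [cartId, "ItemAddedToCart", JSON.stringify({ cartId, item })]
    );
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}

Since the event commits atomically with the cart update, the failure mode shifts from "event lost" to "event possibly published more than once", so downstream consumers should be idempotent.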
> On Tue, May 17, 2016 at 11:26 PM, Abhaya P <abhaya...@gmail.com> wrote:
>
> > A curious question:
> > How does a database + Kafka broker solution work?
> > Would the consumer(s) load the database and then overlay the events?
> > Or is it that the events are overlaid on the database and the database
> > is queried?
> > I am curious how a complete solution would look conceptually with
> > "database + Kafka".
> >
> > Thanks,
> > Abhaya
> >
> > On Tue, May 17, 2016 at 10:46 PM, Chris Richardson <
> > ch...@chrisrichardson.net> wrote:
> >
> > > Oli,
> > >
> > > Kafka is only a partial solution.
> > > As I describe here (
> > > http://www.slideshare.net/chris.e.richardson/hacksummit-2016-eventdriven-microservices-events-on-the-outside-on-the-inside-and-at-the-core/57
> > > ) an event store is a hybrid of a database and a message broker.
> > > It is a database because it provides an API for inserting events for
> > > an entity and retrieving them by the entity's primary key.
> > > It is a message broker because it provides an API for subscribing to
> > > events.
> > > Kafka clearly satisfies the latter but not the former.
> > >
> > > Just my two cents.
> > >
> > > Chris
> > >
> > > --
> > > Microservices application platform http://eventuate.io
> > >
> > > On Tue, May 17, 2016 at 12:18 AM, Olivier Lalonde <olalo...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I am considering adopting an "event sourcing" architecture for a
> > > > system I am developing, and Kafka seems like a good choice of store
> > > > for events.
> > > >
> > > > For those who aren't aware, this architecture style consists in
> > > > storing all state changes of the system as an ordered log of events
> > > > and building derivative views as needed for easier querying (using
> > > > a SQL database for example). Those views must be completely derived
> > > > from the event log alone so that the log effectively becomes a
> > > > "single source of truth".
> > > >
> > > > I was wondering if anyone else is using Kafka for that purpose and,
> > > > more specifically:
> > > >
> > > > 1) Can Kafka store messages permanently?
> > > >
> > > > 2) Let's say I throw away my derived view and want to re-build it
> > > > from scratch. Is it possible to consume messages from a topic from
> > > > its very first message and, once it has caught up, listen for new
> > > > messages like it normally would?
> > > >
> > > > 3) Does it support transactions? Let's say I want to push 3 messages
> > > > atomically but the producer process crashes after sending only 2
> > > > messages. Is it possible to "roll back" the first 2 messages (i.e.
> > > > "all or nothing" semantics)?
> > > >
> > > > 4) Does it support request/response style semantics, or can they be
> > > > simulated? My system's primary interface with the outside world is
> > > > an HTTP API, so it would be nice if I could publish an event and
> > > > wait for all the internal services which need to process the event
> > > > to be "done" processing before returning a response.
> > > >
> > > > PS: I'm a Node.js/Go developer so when possible please avoid
> > > > Java-centric terminology.
> > > >
> > > > Thanks!
> > > >
> > > > - Oli
> > > >
> > > > --
> > > > - Oli
> > > >
> > > > Olivier Lalonde
> > > > http://www.syskall.com <-- connect with me!
>
> --
> - Oli
>
> Olivier Lalonde
> http://www.syskall.com <-- connect with me!

--
Christian Posta
twitter: @christianposta
http://www.christianposta.com/blog
http://fabric8.io
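To make the database-plus-message-broker hybrid Chris describes above concrete, here is a sketch of the surface such an event store exposes (all names illustrative, not any particular product's API):

// The two halves of the hybrid: Kafka gives you the subscribe() side;
// the per-entity append/read side is what a dedicated event store adds.
interface DomainEvent {
  entityId: string; // primary key of the entity the event belongs to
  type: string;
  payload: unknown;
}

interface EventStore {
  // "Database" half: append events for an entity, read them back by key.
  appendEvents(entityId: string, events: DomainEvent[]): Promise<void>;
  readEvents(entityId: string): Promise<DomainEvent[]>;

  // "Message broker" half: be notified as new events are committed.
  subscribe(handler: (event: DomainEvent) => void): () => void; // returns an unsubscribe function
}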