Hi Tom,

There is, indeed, a problem with replication when the leader changes for a
partition.
Hence, I think the best approach would be to have Kafka emit events in case
of:

- a partition leader change
- an offset/segment file about to be cleaned up

This still leaves a lot of work for the ops people to actually implement the
cold storage, but it at least makes it possible. From there on, it's all about
tooling.
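To illustrate why the events matter: about the only hook ops have from the
outside today is watching the partition's log directory itself, roughly like
the sketch below (the path is an example, and this is only my rough idea,
nothing Kafka ships). By the time a delete shows up there, it is already too
late to copy the segment anywhere.

import java.io.IOException;
import java.nio.file.*;

public class SegmentDeleteWatcher {
    // Watch one partition's log directory and report when segment files
    // disappear. This is roughly the best signal available from the outside
    // today, and it arrives too late to archive the data - hence the
    // proposed broker events.
    public static void main(String[] args) throws IOException, InterruptedException {
        Path partitionDir = Path.of("/var/kafka-logs/events-0"); // example log dir + partition
        WatchService watcher = FileSystems.getDefault().newWatchService();
        partitionDir.register(watcher, StandardWatchEventKinds.ENTRY_DELETE);
        while (true) {
            WatchKey key = watcher.take();
            for (WatchEvent<?> event : key.pollEvents()) {
                String name = event.context().toString();
                if (name.endsWith(".log")) {
                    System.out.println("segment removed: " + name);
                }
            }
            key.reset();
        }
    }
}

With a "segment about to be cleaned up" event, the same kind of tool could
copy the file to cold storage first and only then let the broker delete it.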
I've got electricity back, so I'm going to prepare a JIRA.
–  
Best regards,

Radek Gruchalski

ra...@gruchalski.com
de.linkedin.com/in/radgruchalski

Confidentiality:
This communication is intended for the above-named person and may be 
confidential and/or legally privileged.
If it has come to you in error you must take no action based on it, nor must 
you copy or show it to anyone; please delete/destroy and inform the sender 
immediately.

On May 18, 2016 at 3:34:30 PM, Tom Crayford (tcrayf...@heroku.com) wrote:

The issue I have with that log.cleanup.policy approach is how it interacts
with replication. Different replicas often have different sets of segment
files. There are effectively two options then:
1. All replicas upload all their segments. This has issues with storage
space, but is easy to implement. It's hard to reason about from a user's
perspective though - what happens when the replicas for a partition are
reassigned? Do we upload under some namespace by broker id? That'd mean
reassigning replicas changes where things get stored in S3.
2. Only the leader uploads its segments. This presents a totally different
issue, as different brokers have different sets of segment files. To recap
for a second, segment files contain a certain range of offsets, e.g. 0-100000
might be in one file, then 100001-200000 in another. However, there's no
consistency of offset overlap between replicas. This means that Kafka is now
in charge of tracking which offset it has deleted up to (in some replicated
storage), and it has to truncate files before sending them to the archive
command. Alternatively, another option is to allow multiple overlaps in
storage. That presents confusing behaviour for users when the leader
changes, as there will now be overlaps in storage. It's also worth thinking
about how a cold storage proposal works with unclean leader election.
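(To make the segment layout concrete: each segment file's name encodes the
base offset of its first message, so a tiny sketch like the one below - the
log directory path is made up - shows one broker's view of a partition. Run
it against two replicas of the same partition and the boundaries generally
won't line up.)

import java.io.IOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class SegmentLayout {
    // Print the base offset of every segment in one partition's log
    // directory. Segment files are named after the offset of their first
    // message, e.g. 00000000000000100001.log starts at offset 100001.
    public static void main(String[] args) throws IOException {
        Path partitionDir = Path.of("/var/kafka-logs/events-0"); // example path
        try (Stream<Path> files = Files.list(partitionDir)) {
            files.map(p -> p.getFileName().toString())
                 .filter(name -> name.endsWith(".log"))
                 .sorted()
                 .forEach(name -> System.out.println(
                         name + " -> first offset "
                         + Long.parseLong(name.substring(0, name.length() - 4))));
        }
    }
}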

Another point I take issue with is that now we're using Kafka's on-disk
storage format as a long-term API for the cold storage of data.
Clients *have* to integrate against the broker code for reading on-disk
storage (which only works for JVM clients right now, furthering the already
huge amount of work it takes to support non-JVM client libraries). Right
now that code *and* the storage format aren't public at all and are subject
to change (aside from normal backwards compatibility concerns with on-disk
storage). With the current "a separate consumer archives to cold storage"
approach, the consumer gets to specify whatever format it likes and control
batching in a way that makes sense for your cold storage system of choice
(some may prefer 1GB batches, some may prefer 100MB batches, etc.). Doing
this in the archiving command would be possible, but it puts additional
strain on the broker machine and complicates the archiving command
significantly.
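To be concrete about how small the consumer approach is, here's a rough
sketch (the topic name, group id and output path are made up, and a real
archiver would upload to S3 and keep one batch per topic-partition). The
point is that the batch size and the on-disk format are entirely the
consumer's choice:

import java.io.IOException;
import java.nio.file.*;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.*;

public class ArchivingConsumer {
    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "archiver");          // example group id
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");   // commit only after a batch is durable
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");

        long batchBytes = 100L * 1024 * 1024;   // ~100 MB per object; pick whatever the store likes
        StringBuilder batch = new StringBuilder();
        long firstOffset = -1;

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events"));    // example topic
            while (true) {
                for (ConsumerRecord<String, String> r : consumer.poll(Duration.ofSeconds(1))) {
                    if (firstOffset < 0) firstOffset = r.offset();
                    batch.append(r.offset()).append('\t').append(r.value()).append('\n');
                }
                if (batch.length() >= batchBytes) {  // rough size check (chars ~ bytes for ASCII)
                    // Stand-in for the S3 put; key the object by the first offset it contains.
                    Path out = Path.of("/mnt/archive/events-" + firstOffset + ".tsv");
                    Files.createDirectories(out.getParent());
                    Files.writeString(out, batch.toString());
                    consumer.commitSync();           // only record progress once the upload succeeded
                    batch.setLength(0);
                    firstOffset = -1;
                }
            }
        }
    }
}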

Furthermore, that rough cold storage proposal, as outlined, still requires
clients to do large amounts of custom work to consume from the beginning of
time. I think avoiding much of that custom work was much of the point of
the "why can't Kafka do infinite retention?" question. I can imagine a nicer
design in which Kafka, upon receiving a request for an earlier offset than
the beginning of its log, could *also* restore the segments from cold
storage one by one, but that seems extremely tricky to implement, even if
much nicer in practice.

Overall, any cold storage proposal seems like a drastic change and  
complication of the role of Kafka in a system. This makes it harder to  
implement and harder to run in production. It also makes it more confusing  
to use as a customer (even if you have an ops team or a PaaS offering  
running it for you), because now you're using a long term data store, not  
an ephemeral streaming system.  

I think Kafka is a great fit in an event-sourced architecture, but not as
the long-term cold storage, just the messaging layer. For long-term,
low-volume event storage, a traditional database often works very well
anyway (the control plane that powers Heroku Kafka and Heroku Postgres does
something like this with Postgres, and it's worked extremely well for the
8+ years the control plane has been running). For high volume, a consumer
archiving to S3 (or other long-term storage) is relatively simple to
implement (or you can use Secor, which is an off-the-shelf drop-in for most
folks).

On Wed, May 18, 2016 at 12:19 PM, Christian Posta <christian.po...@gmail.com>
wrote:

> So Kafka is a fine solution as part of an event-sourced story. It's not a  
> simple solution, but it fits.  
>  
> Kafka can store data for a long time and you shouldn't discount this;
> however, using it as the primary long-term data store might not be a good
> fit if we're talking about storing raw events for years and years. I think
> someone in another thread mentioned moving these raw events to S3 or some
> object store, which is probably a good idea. Kafka can store the "head" of
> the raw event stream as well as "snapshots" in time of the raw event stream
> that your applications can consume. If a raw event stream ends up super
> big, it may be impractical for an application to consume it from the
> beginning of time to recreate its datastore anyway; thus
> snapshots/aggregate snapshots help with this. Years and years of raw events
> may be useful for batch analytics (Spark/Hadoop) as well.
>  
> In terms of an event store, I quite like the idea of using raw events to  
> generate the projections/materialized views of domain objects AND using  
> events within those domain objects to generate new raw events. This can be  
> done nicely with Kafka feeding a DDD aggregate root --> traditional  
> database --> Kafka. You can use CDC style solutions to capture the database  
> journal log and publish events that way. Things like Kafka Connect become  
> really handy for a solution like this as well as Martin K's Bottled Water  
> project and our Debezium project: http://debezium.io ... I've got a
> blog+video demo of all of this coming soon!
>  
> On Wed, May 18, 2016 at 1:15 AM, Olivier Lalonde <olalo...@gmail.com>  
> wrote:  
>  
> > Dave:  
> >  
> > Thanks for the suggestion, BookKeeper looks interesting. I assume you
> > meant "more high-level interface"?
> >  
> > Chris:  
> >  
> > Thanks Chris, I actually came across a few of your talks and articles
> > when researching this style of architecture (along with stuff by Greg
> > Young and Martin Kleppmann) and I'm really excited to see you pop up in
> > this thread.
> >  
> > I understand Kafka is fairly low level for what I really need. I'm not
> > really convinced I will need random access to the event store, though;
> > that seems to be a specific style of event sourcing architecture (with
> > roots stemming from DDD?) where the application state is all kept in
> > memory?
> >  
> > My idea was to use PostgreSQL to "cache" my state and do queries from it
> > (and possibly other types of databases in the future as needed). To build
> > the PostgreSQL database or add new derived state "caches" (Greg Young
> > calls those "projections", I believe), I would just replay the event
> > stream from event 0 with some glue code to handle the new event =>
> > database logic. It keeps things simple, though I suppose it could be slow
> > when my event stream gets really big (I could add snapshots as a concept
> > eventually).
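> > Roughly what I have in mind, sketched with the Java client only because
> > it's the reference one (the topic name and the SQL glue are placeholders):
> >
> > import java.time.Duration;
> > import java.util.Collections;
> > import java.util.Properties;
> > import org.apache.kafka.clients.consumer.*;
> >
> > public class ProjectionRebuilder {
> >     public static void main(String[] args) {
> >         Properties props = new Properties();
> >         props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
> >         props.put(ConsumerConfig.GROUP_ID_CONFIG, "cart-projection-v2"); // fresh group id => no stored offsets
> >         props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");  // so it starts at event 0
> >         props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
> >                   "org.apache.kafka.common.serialization.StringDeserializer");
> >         props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
> >                   "org.apache.kafka.common.serialization.StringDeserializer");
> >
> >         try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
> >             consumer.subscribe(Collections.singletonList("events"));    // example topic
> >             while (true) {                               // replays history, then keeps tailing
> >                 for (ConsumerRecord<String, String> event : consumer.poll(Duration.ofSeconds(1))) {
> >                     applyToPostgres(event.key(), event.value());
> >                 }
> >             }
> >         }
> >     }
> >
> >     static void applyToPostgres(String key, String json) {
> >         // the "event => database" glue goes here, e.g. one UPSERT per event type
> >     }
> > }
> >
> > (A fresh group id plus auto.offset.reset=earliest makes it replay from the
> > start; once caught up, the same loop just keeps consuming new events.)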
> >  
> > I read a bit about eventuate.io but it wasn't clear to me whether this
> > was an application framework or a server (open source? commercial?).
> >  
> > One thing that doesn't help is that we're primarily a Node.js shop and
> > all the information, tooling, examples and terminology are heavily geared
> > towards JVM developers. I found this great talk by Stefan Kutko
> > https://www.youtube.com/watch?v=X_VHWQa1k0k which describes a Node.js
> > based event sourced architecture, but it's a bit short on implementation
> > details.
> >  
> > Abhaya:  
> >  
> > Also curious to hear what others have to say about this. What I'm trying
> > to achieve is to have a single unified event store from which all other
> > data stores are derived, but I'm still unsure whether Kafka is the right
> > choice. I also believe that a lot of people use Kafka primarily as a
> > message queue and keep track of CRUD-style state in a more traditional
> > ACID database. One thing I wonder about: let's say someone updates their
> > shopping cart. You can do a SQL transaction that updates your cart state
> > and then publish a message to Kafka for further processing, but what if
> > your process crashes in between? Is there a way to do transactions that
> > encompass both your SQL database and the act of publishing a message to
> > Kafka?
> >  
> > Thanks again everyone for educating me :)  
> >  
> > Oli  
> >  
> > On Tue, May 17, 2016 at 11:26 PM, Abhaya P <abhaya...@gmail.com> wrote:  
> >  
> > > A curious question:  
> > > How does a database + Kafka broker solution work?
> > > Would the consumer(s) load the database and then overlay the events?
> > > Or is it that the events are overlaid on the database and the database
> > > is queried?
> > > I am curious what a complete solution would look like with "database +
> > > Kafka" conceptually.
> > >  
> > > Thanks,  
> > > Abhaya  
> > >  
> > >  
> > > On Tue, May 17, 2016 at 10:46 PM, Chris Richardson <  
> > > ch...@chrisrichardson.net> wrote:  
> > >  
> > > > Oli,  
> > > >  
> > > > Kafka is only a partial solution.  
> > > > As I describe here (
> > > > http://www.slideshare.net/chris.e.richardson/hacksummit-2016-eventdriven-microservices-events-on-the-outside-on-the-inside-and-at-the-core/57
> > > > )
> > > > an event store is a hybrid of a database and a message broker.  
> > > > It is a database because it provides an API for inserting events for
> > > > an entity and retrieving them by the entity's primary key.
> > > > It is a message broker because it provides an API for subscribing to  
> > > > events.  
> > > > Kafka clearly satisfies the latter but not the former.  
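> > > > In interface terms, the hybrid is something like this hypothetical
> > > > sketch (the names are made up, not Eventuate's API):
> > > >
> > > > import java.util.List;
> > > > import java.util.function.Consumer;
> > > >
> > > > record Event(String type, String payload) {}
> > > >
> > > > interface EventStore {
> > > >     // database side: write and read events by the entity's primary key
> > > >     void appendEvents(String entityId, List<Event> events);
> > > >     List<Event> findEvents(String entityId);
> > > >     // message broker side: subscribe to the stream of events
> > > >     void subscribe(Consumer<Event> handler);
> > > > }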
> > > >  
> > > > Just my two cents.  
> > > >  
> > > > Chris  
> > > >  
> > > > --  
> > > > Microservices application platform http://eventuate.io  
> > > >  
> > > > On Tue, May 17, 2016 at 12:18 AM, Olivier Lalonde <olalo...@gmail.com>
> > > > wrote:  
> > > >  
> > > > > Hi all,  
> > > > >  
> > > > > I am considering adopting an "event sourcing" architecture for a
> > > > > system I am developing and Kafka seems like a good choice of store
> > > > > for events.
> > > > >  
> > > > > For those who aren't aware, this architecture style consists of
> > > > > storing all state changes of the system as an ordered log of events
> > > > > and building derivative views as needed for easier querying (using a
> > > > > SQL database, for example). Those views must be completely derived
> > > > > from the event log alone, so that the log effectively becomes a
> > > > > "single source of truth".
> > > > >  
> > > > > I was wondering if anyone else is using Kafka for that purpose and,
> > > > > more specifically:
> > > > >  
> > > > > 1) Can Kafka store messages permanently?  
> > > > >  
> > > > > 2) Let's say I throw away my derived view and want to re-build it
> > > > > from scratch; is it possible to consume messages from a topic from
> > > > > its very first message and, once it has caught up, listen for new
> > > > > messages like it would normally do?
> > > > >  
> > > > > 3) Does it support transactions? Let's say I want to push 3
> > > > > messages atomically but the producer process crashes after sending
> > > > > only 2 messages; is it possible to "roll back" the first 2 messages
> > > > > (i.e. "all or nothing" semantics)?
> > > > >  
> > > > > 4) Does it support request/response style semantics, or can they
> > > > > be simulated? My system's primary interface with the outside world
> > > > > is an HTTP API, so it would be nice if I could publish an event and
> > > > > wait for all the internal services which need to process the event
> > > > > to be "done" processing before returning a response.
> > > > >  
> > > > > PS: I'm a Node.js/Go developer, so when possible please avoid
> > > > > Java-centric terminology.
> > > > >  
> > > > > Thanks!  
> > > > >  
> > > > > - Oli  
> > > > >  
> > > > > --  
> > > > > - Oli  
> > > > >  
> > > > > Olivier Lalonde  
> > > > > http://www.syskall.com <-- connect with me!  
> > > > >  
> > > >  
> > >  
> >  
> >  
> >  
> > --  
> > - Oli  
> >  
> > Olivier Lalonde  
> > http://www.syskall.com <-- connect with me!  
> >  
>  
>  
>  
> --  
> *Christian Posta*  
> twitter: @christianposta  
> http://www.christianposta.com/blog  
> http://fabric8.io  
>  
