Say I am doing this, a scenario that I just came up with that demonstrates #2.
Someone signs up on a website, and you have to:

1. create the user profile
2. send the confirmation email
3. resize the avatar

Once a person registers on the website, I write a message to Kafka. Now I have 3 different things to process (1, 2, 3). If I get to #2 and then the server loses power, then when I replay I will end up sending the confirmation email twice.

Sure, in this case it's not that big of a deal, but just pretend it is. What should be done?

I guess I have to keep track of state per step in ZK, right? I mean, that's the only way, so I guess I am answering my own question, but I was hoping for people with real-life experience to chime in.

I could write 3 messages to Kafka, but maybe order is important :)


On Mon, Dec 9, 2013 at 3:31 PM, Philip O'Toole <phi...@loggly.com> wrote:

> We use Zookeeper, as is standard with Kafka.
>
> Our systems are idempotent, so we only store offsets when the message is
> fully processed. If this means we occasionally replay a message due to some
> corner-case, or simply a restart, it doesn't matter.
>
> Philip
>
>
> On Mon, Dec 9, 2013 at 12:28 PM, S Ahmed <sahmed1...@gmail.com> wrote:
>
> > I was hoping people could comment on how they handle the following
> > scenarios:
> >
> > 1. Storing the last successfully processed messageId/Offset. Are people
> > using mysql, redis, etc.? What are the tradeoffs here?
> >
> > 2. How do you handle recovering from an error while processing a given
> > event?
> >
> > There are various scenarios for #2, like:
> > 1. Do you mark the start of processing a message somewhere, and then update
> > the status to complete and THEN update the last message processed for #1?
> > 2. Do you only mark the status as complete, and not the start of processing
> > it? I guess this depends on whether there are intermediate steps and
> > processing the entire message again would result in some duplicated work, right?
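
To make the "track state per step" idea from the signup scenario concrete, here is a rough sketch of how it might look. Everything in it is an assumption for illustration: the `completed` set is an in-memory stand-in for whatever durable store you pick (ZK, MySQL, Redis), the step functions only append to a list instead of doing real work, and offset 42 and the user "alice" are made up. No Kafka client is involved; `process` is just what a consumer would call per message.

```python
from typing import Callable, List, Set, Tuple

# Stand-in for a durable store such as ZooKeeper, MySQL, or Redis (assumption).
completed: Set[Tuple[int, str]] = set()
# Side effects are recorded here so the replay behaviour can be inspected.
log: List[str] = []


def create_profile(user: str) -> None:
    log.append(f"profile:{user}")


def send_confirmation(user: str) -> None:
    log.append(f"email:{user}")


def resize_avatar(user: str) -> None:
    log.append(f"avatar:{user}")


# Steps run in a fixed order, so ordering is preserved within one message.
STEPS: List[Tuple[str, Callable[[str], None]]] = [
    ("create_profile", create_profile),
    ("send_confirmation", send_confirmation),
    ("resize_avatar", resize_avatar),
]


def process(offset: int, user: str, crash_after: str = "") -> None:
    """Run every step not yet marked complete for this offset."""
    for name, step in STEPS:
        if (offset, name) in completed:
            continue  # finished before the crash; skip on replay
        step(user)
        completed.add((offset, name))  # checkpoint after the side effect
        if name == crash_after:
            raise RuntimeError("simulated power loss")


def run_demo() -> List[str]:
    """Process offset 42, crash after the email step, then replay."""
    completed.clear()
    log.clear()
    try:
        process(42, "alice", crash_after="send_confirmation")
    except RuntimeError:
        pass  # the consumer restarts and replays the same message
    process(42, "alice")
    return list(log)
```

Calling `run_demo()` replays the message after the simulated crash, and the email step is skipped the second time because it was already checkpointed. Note that this only narrows the window: a crash between `step(user)` and the checkpoint write would still repeat that one step, so each step either has to be idempotent itself or the checkpoint has to be written atomically with the side effect.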