log purge - only processed records after certain period

Jaroslav Libák Tue, 30 May 2017 07:47:32 -0700

Hello

I'm thinking about using Kafka for a messaging use case in which records
are entity change events, e.g. "orderStateChange". There can be multiple
consumers of the events, and I do not want to lose any Kafka log records
until they have been processed by all consumers (e.g. because some consumers
are temporarily not working). I would rather block producers than lose
records (blocking forces the problem to be fixed so that the records get
processed).


I read the Kafka documentation, and the above of course doesn't work out of
the box: a Kafka topic doesn't know about its consumers, and log purging is
either size based or period based, unless we use compaction.

But it seems the above scenario could be implemented using log compaction.
Each record key would be something like "eventType/entityId/someUniqueHash",
so that records never get compacted on their own. Topics would have no size
or period based purging. A dedicated consumer would discover the current
indexes (offsets) per partition of all other consumers and consume up to
the lowest common index, producing a new (key, null) record for each record
it reads, which according to the documentation means log compaction will
delete the records with that key (so both the original and the null value
record get deleted). Unfortunately this means consumers would see
(key, null) records that they would have to ignore.
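
A minimal sketch of the bookkeeping such a purging consumer would need,
assuming the committed offsets of every consumer group have already been
fetched somehow. The names lowest_common_offsets and keys_to_tombstone and
the plain dict/tuple shapes are hypothetical, not Kafka client types; the
sketch only decides which keys are safe to tombstone and does not talk to
a broker.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical shape: committed[group][partition] -> committed offset,
# i.e. the offset of the next record that group will read.
Committed = Dict[str, Dict[int, int]]
# Hypothetical record shape: (partition, offset, key, value).
Record = Tuple[int, int, str, Optional[str]]


def lowest_common_offsets(committed: Committed,
                          partitions: Iterable[int]) -> Dict[int, int]:
    """Per partition, the smallest committed offset across all groups.

    A group with no commit for a partition counts as offset 0, so nothing
    in that partition is treated as safe to purge yet.
    """
    return {
        p: min((offsets.get(p, 0) for offsets in committed.values()),
               default=0)
        for p in partitions
    }


def keys_to_tombstone(records: Iterable[Record],
                      safe_offsets: Dict[int, int]) -> List[str]:
    """Keys of records that every group has already moved past."""
    keys = []
    for partition, offset, key, value in records:
        if value is None:
            continue  # already a (key, null) record, nothing to do
        if offset < safe_offsets.get(partition, 0):
            keys.append(key)
    return keys
```

For each returned key the purger would then produce a (key, null) record
to the same topic. Treating a missing commit as offset 0 is a deliberately
conservative choice: a consumer group that has never committed blocks all
purging for that partition.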

It is not clear to me how Kafka handles the situation when disk space is
running low. Does it keep filling the disk until the OS returns a file
write error? I want to block producers before that happens. I haven't found
any limit that would make Kafka refuse new records for a topic.
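
Lacking such a limit in Kafka itself, one workaround would be a guard
outside Kafka that producers consult before sending. The sketch below is
only an assumption about how that could look: free_ratio and may_produce
are hypothetical helpers, the 10% threshold is arbitrary, and the disk that
matters is the one holding the broker's log directory, so the measurement
would have to run on the broker host or come from monitoring.

```python
import shutil


def free_ratio(path: str) -> float:
    """Fraction of the filesystem containing `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def may_produce(free: float, min_free: float = 0.10) -> bool:
    """True while the free fraction is at or above the threshold.

    Producers would stop sending (block) while this returns False.
    """
    return free >= min_free
```

A producer wrapper could poll may_produce(free_ratio(log_dir)) and wait
while it is False, where log_dir stands for the broker's data directory.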

Given the above problems, it seems it would be best not to use Kafka. One
alternative is RabbitMQ, but Kafka has the advantage of speed (a single
topic per event type instead of a single topic per consumer, so less IO)
and of keeping messages persisted even after they are consumed (listeners
do not need to be known at the time the event is produced).

Jaroslav
