Hello,

I'm working on a team that is starting to use Kafka as a distributed
transaction log for a set of in-memory databases which can be replicated
across nodes.  We decided to use Kafka instead of BookKeeper for a variety
of reasons, but there are a couple of spots where Kafka is not a perfect fit.

The biggest issue facing us is deleting old transactions from the log after
checkpointing the database.  We can't use any of the built-in size-based or
time-based deletion mechanisms efficiently, because we could get ourselves
into a dangerous state where we're deleting transactions that haven't been
checkpointed yet.  The current approach we're looking at is rolling a new
topic each time we checkpoint, and deleting the old topic once all replicas
have consumed everything in it.
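
To make the "delete the old topic once all replicas have consumed
everything in it" condition concrete, here is a rough sketch of the
bookkeeping it needs.  It is plain Java with no Kafka calls;
RolledTopicTracker and its methods are names made up for this example, and
how replicas report their progress is left open.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: decides when a rolled (pre-checkpoint) topic can be deleted. */
class RolledTopicTracker {
    // Last offset written to the old topic before rolling to a new one.
    private final long finalOffset;
    // Highest offset each replica has reported consuming from the old topic.
    private final Map<String, Long> consumedByReplica = new HashMap<>();

    RolledTopicTracker(long finalOffset, Iterable<String> replicaIds) {
        this.finalOffset = finalOffset;
        for (String id : replicaIds) {
            consumedByReplica.put(id, -1L); // -1 means nothing consumed yet
        }
    }

    /** Record progress reported by a replica; progress never moves backwards. */
    void recordConsumed(String replicaId, long offset) {
        consumedByReplica.merge(replicaId, offset, Math::max);
    }

    /** True only when every known replica has consumed through the final offset. */
    boolean safeToDelete() {
        for (long consumed : consumedByReplica.values()) {
            if (consumed < finalOffset) {
                return false;
            }
        }
        return true;
    }
}
```

The actual topic deletion would only be issued after safeToDelete()
returns true.  Note that this sketch treats an empty replica set as safe to
delete, which a real implementation would probably want to guard against.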

Another idea we came up with is using a pluggable compaction policy; we
would set the message key to the offset or transaction id, and the policy
would delete all messages with a key smaller than the id of the latest
checkpoint.
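
As a sketch of what such a policy could look like: the CompactionPolicy
interface below is hypothetical (it is not an existing Kafka API and is not
necessarily the shape of the hook in the branch mentioned in this message),
and it assumes keys have already been decoded to a long.

```java
/** Hypothetical hook; this interface is not part of Kafka. */
interface CompactionPolicy {
    /** Return true to keep the message with this key, false to let the cleaner drop it. */
    boolean retain(long key);
}

/** Keeps only messages whose key (offset or transaction id) is at or after the checkpoint. */
class CheckpointCompactionPolicy implements CompactionPolicy {
    private final long checkpointId;

    CheckpointCompactionPolicy(long checkpointId) {
        this.checkpointId = checkpointId;
    }

    @Override
    public boolean retain(long key) {
        // Keys smaller than the checkpoint id are covered by the checkpoint
        // and can be removed; everything else must stay in the log.
        return key >= checkpointId;
    }
}
```
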
I took a stab at implementing the hook in Kafka for pluggable compaction
policies at
https://github.com/apache/kafka/compare/trunk...bill-warshaw:pluggable_compaction_policy
(rough implementation), and it seems fairly straightforward.  One problem
we run into is that the custom policy class can only access information
that is defined in the configuration, and the configuration doesn't allow
custom key-value pairs; if we wanted to pass it information dynamically,
we'd have to use some hack like calling ZooKeeper from within the class.
To get around this, my best idea is to add the ability to specify arbitrary
key-value pairs in the configuration, which our client could use to pass
information to the custom policy.  Does this set off any alarm bells for
you guys?  If so, are there other approaches we could take that come to
mind?
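
To illustrate the arbitrary key-value idea, a policy could read the
checkpoint id from whatever map of settings it is handed.  This is only a
sketch: the class, the configure method, and the key name
"compaction.policy.checkpoint.id" are all invented for the example, and
Kafka does not currently accept such a key.

```java
import java.util.Map;

/** Hypothetical policy that takes its checkpoint id from custom config entries. */
class ConfigurableCheckpointPolicy {
    // Made-up key name; not a real Kafka configuration property.
    static final String CHECKPOINT_KEY = "compaction.policy.checkpoint.id";

    // Until a checkpoint id is supplied, retain everything.
    private long checkpointId = Long.MIN_VALUE;

    /** Called whenever the (hypothetical) custom key-value pairs change. */
    void configure(Map<String, ?> configs) {
        Object raw = configs.get(CHECKPOINT_KEY);
        if (raw != null) {
            checkpointId = Long.parseLong(raw.toString());
        }
    }

    /** Keep messages whose key is at or after the configured checkpoint id. */
    boolean retain(long key) {
        return key >= checkpointId;
    }
}
```

Defaulting to "retain everything" means a missing or not-yet-delivered
setting can never cause uncheckpointed transactions to be deleted.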


Thanks for your time,
Bill Warshaw
