Daniel Schierbeck created KAFKA-1827:
----------------------------------------
Summary: Optimistic Locking when Producing Messages
Key: KAFKA-1827
URL: https://issues.apache.org/jira/browse/KAFKA-1827
Project: Kafka
Issue Type: Improvement
Reporter: Daniel Schierbeck
(I wasn't able to post to the ML, so I'm adding an issue instead. I hope that's
okay.)
I'm trying to design a system that uses Kafka as its primary data store by
persisting immutable events into a topic and keeping a secondary index in
another data store. The secondary index would store the "entities". Each event
would pertain to some "entity", e.g. a user, and those entities are stored in
an easily queryable way.
Kafka seems well suited for this, but there's one thing I'm having problems
with. I cannot guarantee that only one process writes events about an entity,
which makes the design vulnerable to integrity issues.
For example, say that a user can have multiple email addresses assigned, and
the EmailAddressRemoved event is published when the user removes one. There's
an integrity constraint, though: every user MUST have at least one email
address. As far as I can see, there's no way to stop two separate processes
from each looking up the user entity, seeing that two email addresses are
assigned, and publishing an EmailAddressRemoved event. The end result would
violate the constraint.
If I'm wrong in saying that this isn't possible I'd love some feedback!
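To make the race concrete, here's a minimal sketch in Python. It's a toy model, not Kafka code: a plain list stands in for the topic, and the event tuples, the user key and the current_emails helper are all made up for illustration.

```python
# Toy model of the race: a plain list stands in for the topic, and each
# "process" validates against a snapshot it read before writing.
log = [
    ("user-1", "EmailAddressAdded", "a@example.com"),
    ("user-1", "EmailAddressAdded", "b@example.com"),
]


def current_emails(events, user):
    """Replay the events for one user and return the set of addresses."""
    emails = set()
    for key, kind, address in events:
        if key != user:
            continue
        if kind == "EmailAddressAdded":
            emails.add(address)
        elif kind == "EmailAddressRemoved":
            emails.discard(address)
    return emails


# Both processes read the entity before either one writes.
seen_by_p1 = current_emails(log, "user-1")
seen_by_p2 = current_emails(log, "user-1")

# Each check passes in isolation: two addresses, so removing one looks safe.
if len(seen_by_p1) > 1:
    log.append(("user-1", "EmailAddressRemoved", "a@example.com"))
if len(seen_by_p2) > 1:
    log.append(("user-1", "EmailAddressRemoved", "b@example.com"))

# Replaying the log now leaves the user with no addresses at all.
remaining = current_emails(log, "user-1")
```

Neither process did anything wrong based on what it saw; the problem is that nothing ties the write to the state that was read.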
My current thinking is that Kafka could relatively easily support this kind of
application with a small additional API. Kafka already has the abstract notion
of entities through its key-based retention policy. If the produce API were
modified to accept an optional integer OffsetConstraint, the following
algorithm could determine whether the request should proceed:
1. For every key seen, keep track of the offset of the latest message
referencing the key.
2. When an OffsetConstraint is specified in the produce API call, compare that
value with the latest offset for the message key.
2.1. If they're identical, allow the operation to continue.
2.2. If they're not identical, fail with some OptimisticLockingFailure.
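The steps above could be sketched roughly like this in Python. This is an in-memory toy, not a proposal for the actual broker code: KeyedLog, append and the offset_constraint parameter are hypothetical names, and I'm assuming that a constraint on a key that has never been seen always fails, which the algorithm above doesn't specify.

```python
class OptimisticLockingFailure(Exception):
    """Raised when the caller's expected offset is stale (step 2.2)."""


class KeyedLog:
    """In-memory stand-in for a single partition (hypothetical, not a Kafka API)."""

    def __init__(self):
        self.messages = []       # list index doubles as the offset
        self.latest_offset = {}  # step 1: key -> offset of its latest message

    def append(self, key, value, offset_constraint=None):
        # Step 2: only check when the caller supplied a constraint.
        if offset_constraint is not None:
            latest = self.latest_offset.get(key)
            # Assumption: an unseen key has no latest offset, so any
            # constraint on it fails.
            if latest != offset_constraint:
                raise OptimisticLockingFailure(
                    f"expected offset {offset_constraint} for key {key!r}, "
                    f"latest is {latest}"
                )
        # Step 2.1: the constraint held (or was absent), so append.
        offset = len(self.messages)
        self.messages.append((key, value))
        self.latest_offset[key] = offset
        return offset


log = KeyedLog()
log.append("user-1", "EmailAddressAdded a@example.com")
seen = log.append("user-1", "EmailAddressAdded b@example.com")

# Two processes both read the entity as of offset `seen`.
accepted = log.append(
    "user-1", "EmailAddressRemoved a@example.com", offset_constraint=seen
)
try:
    log.append(
        "user-1", "EmailAddressRemoved b@example.com", offset_constraint=seen
    )
    rejected = False
except OptimisticLockingFailure:
    rejected = True
```

The second writer gets the failure, can re-read the entity, and will then see that only one address is left. In a real broker the check and the append would have to happen atomically per partition for this to work.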
Would such a feature be completely out of scope for Kafka?