Here's a simple example of how to take advantage of this behavior:
Suppose you're sending document updates through Kafka. If you use the
document ID as the message key and the default hash partitioner, the
updates for a given document will land on the same partition and reach
the consumer in order.
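
To make the document ID idea concrete, here is a rough sketch in plain Python that needs no broker. It only simulates key-based partitioning: CRC32 stands in for the hash Kafka's default partitioner actually uses (murmur2), and NUM_PARTITIONS, partition_for and route are made-up names for illustration, not Kafka APIs.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count for the topic


def partition_for(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition. CRC32 is a stand-in for Kafka's murmur2."""
    return zlib.crc32(key) % num_partitions


def route(updates):
    """Append each (doc_id, payload) update to the log of its partition."""
    partitions = defaultdict(list)
    for doc_id, payload in updates:
        partitions[partition_for(doc_id.encode("utf-8"))].append((doc_id, payload))
    return partitions


# Updates for doc-1 are interleaved with updates for another document.
updates = [("doc-1", "v1"), ("doc-2", "v1"), ("doc-1", "v2"), ("doc-1", "v3")]
logs = route(updates)

# Every doc-1 update sits in one partition log, in the order it was sent.
doc1_partition = partition_for(b"doc-1")
doc1_versions = [payload for doc_id, payload in logs[doc1_partition] if doc_id == "doc-1"]
print(doc1_versions)  # ['v1', 'v2', 'v3']
```

The same key always hashes to the same partition, so the per-partition order is also the per-document order. Updates for different documents may still interleave across partitions.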
Another idea: if a set of messages arrives over a single TCP connection, route
them to a partition based on that connection.
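
One way this could look, as a sketch only: derive a stable identifier from the connection's peer address and map it to a partition. The function name partition_for_connection is hypothetical, the (host, port) tuple is assumed to be what socket.getpeername() returns for an IPv4 socket, and CRC32 is just a convenient deterministic hash, not anything Kafka-specific.

```python
import zlib


def partition_for_connection(peer, num_partitions):
    """Pick a partition from a TCP peer address given as a (host, port) tuple."""
    conn_id = f"{peer[0]}:{peer[1]}".encode("utf-8")
    # A deterministic hash keeps every message from this connection together.
    return zlib.crc32(conn_id) % num_partitions


# Hypothetical peer address and partition count.
peer = ("10.0.0.5", 40312)
partition = partition_for_connection(peer, 6)
```

You could then hand that partition number to the producer explicitly, or simply use the connection identifier as the message key and let the partitioner do the mapping.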
To be honest, these approaches, while they work, may not scale when the message
rate is high. If at all possible, try to think of a way to remove this
requirement from your
I understand that there is no guarantee per se that a message won't be a
duplicate (it's the consumer's job to deal with that), but when it comes to
message order, is Kafka built in such a way that it is impossible to get
messages in the wrong order?
Certain use cases might not be sensitive to