Hi All,

I am working on Kafka 0.9 output operator and one of the requirement is to
implement Exactly Once Output operator. Here is the one possible idea,
please give your feedback or suggest new design.

-------------------------------------------------------------------------------------------------------------------------

Use *Key* to store meta information which is used during recovery.

Operator users will use *Value* to store their key-value pair and implement
the Kafka partitioning accordingly.

Format of the *Key* is as specified below:



Key = 1. OperatorName#ApexPartitionId#WindowId#Message#MessageId ( During
message write )

         2. OperatorName#ApexPartitionId#WindowId#CheckPoint ( During end
Window )

During End window, checkpoint marker is written to all the Kafka partitions
of the topic.

Every message is given a message id, counter-reset every window, and then
written to Kafka.

During recovery, Kafka partitions are read until the last checkpoint
message from this operator is reached and the partially written window is
constructed.

--------------------------------------------------------------------------------

Note: Existing Kafka exactly once operator, ( Kafka 0.8 ) also needs to be
re-written.

Thanks

Reply via email to