Hi Philip,
I have been scanning through
https://cwiki.apache.org/confluence/display/KAFKA/Consumer+threading+refactor+design
and KIP-848 and from this I understand that the kafka consumer API will
not change.
Perhaps the refactoring and/or KIP-848 is a good opportunity to improve
the API somewhat. In this email I explain why and also give a rough idea
what that could look like.
In the current API, the rebalance listener callback gives the user a
chance to commit all work in progress before a partition is actually
revoked and assigned to another consumer.
While the callback is doing all this, the main user thread is not able
to process new incoming data. So the rebalance listener affects latency
and throughput for non-revoked partitions during a rebalance.
In addition, I feel that doing a lot of stuff /in/ a callback is always
quite awkward. Better only use it to trigger some processing elsewhere.
Therefore, I would like to propose a new API that does not have these
problems and is easy to use (and I hope still easy to implement). In my
ideal world, poll is the only method that you need. Lets call it poll2
(to do: come up with a less crappy name). Poll2 returns more than just
the polled records, it will also contain newly assigned partitions,
partitions that will be revoked during the next call to poll2,
partitions that were lost, and perhaps it will even contain the offsets
committed so far.
The most important idea here is that partitions are not revoked
immediately, but in the next call to poll2.
With this API, a user can commit offsets at their own pace during a
rebalance. Optionally, for the case that processing of data from the
to-be-revoked partition is stil ongoing, we allow the user to postpone
the actual revocation in the next poll, so that polling can continue for
other partitions.
Since we are no longer blocking the main user thread, partitions that
are not revoked can be processed at full speed.
Removal of the rebalance listener also makes the API safer; there is no
more need for the thread-id check (nor KIP-944) because, concurrent
invocations are simply no longer needed. (Of course, if backward
compatibility is a goal, not all of these things can be done.)
Curious to your thoughts and kind regards,
Erik.
--
Erik van Oosten
e.vanoos...@grons.nl
https://day-to-day-stuff.blogspot.com