Jun Rao created KAFKA-18641:
-------------------------------
Summary: AsyncKafkaConsumer could lose records with auto offset commit
Key: KAFKA-18641
URL: https://issues.apache.org/jira/browse/KAFKA-18641
Project: Kafka
Issue Type: Bug
Components: consumer
Affects Versions: 4.0.0
Reporter: Jun Rao

In the new AsyncKafkaConsumer, the application thread keeps updating the
auto commit timer through PollEvent. In the consumer network thread, once the
timer has expired, an offset commit request is generated with the current
offset position in subscriptions. However, at this point, the records before
that offset may have only just been polled from FetchBuffer and not yet been
consumed by the application. If the application dies immediately, those records
may never be consumed, since their offset could already have been committed.
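
The interleaving described above can be illustrated with a small,
self-contained Java model. This is a sketch only and does not use any Kafka
classes: AsyncAutoCommitSketch and its methods are hypothetical names, the two
threads are replaced by explicit method calls in a fixed order, and offsets are
assumed to start at 0 and to be processed in order.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the race; not Kafka code. All names are hypothetical. */
class AsyncAutoCommitSketch {
    private long position = 0;   // next offset to fetch, as tracked in "subscriptions"
    private long committed = 0;  // last offset accepted by the simulated coordinator
    private final List<Long> processed = new ArrayList<>();

    /** Application thread: drain n records from the fetch buffer, advancing the position. */
    List<Long> poll(int n) {
        List<Long> batch = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            batch.add(position++);
        }
        return batch;
    }

    /** Network thread: the auto commit timer expired, so commit the current position. */
    void autoCommitTimerExpired() {
        committed = position;
    }

    /** Application code finishing its work on one record. */
    void process(long offset) {
        processed.add(offset);
    }

    long committed() {
        return committed;
    }

    int processedCount() {
        return processed.size();
    }

    /** Records a restarted consumer would skip: committed but never processed. */
    long lostRecords() {
        return committed - processed.size();
    }

    public static void main(String[] args) {
        AsyncAutoCommitSketch c = new AsyncAutoCommitSketch();
        List<Long> batch = c.poll(5);   // offsets 0..4 are handed to the application
        c.autoCommitTimerExpired();     // background commit fires before any processing
        c.process(batch.get(0));        // application handles one record, then "crashes"
        System.out.println("committed=" + c.committed()
                + " processed=" + c.processedCount()
                + " lost=" + c.lostRecords());
    }
}
```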

The ClassicKafkaConsumer doesn't seem to have this problem. In each poll()
call, before fetching new records, it first calls ConsumerCoordinator.poll(),
which generates an OffsetCommitRequest with the current offset position in
subscriptions. Since this is done in the same application thread, it guarantees
that all records returned in the previous poll() have been processed. The
problem exists in AsyncKafkaConsumer because polling new records and
committing offsets are done in separate threads.
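
For contrast, the same-thread ordering attributed to the classic consumer can
be modelled the same way. Again this is only a sketch under stated assumptions:
ClassicAutoCommitSketch and its methods are hypothetical names, not Kafka
classes, and the model assumes the application finishes processing a batch
before it calls poll() again. Because the commit happens at the start of
poll(), it can only cover records returned by earlier calls.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of same-thread auto commit; not Kafka code. All names are hypothetical. */
class ClassicAutoCommitSketch {
    private long position = 0;           // next offset to fetch
    private long committed = 0;          // last offset accepted by the simulated coordinator
    private boolean timerExpired = false;
    private int processed = 0;

    /** Marks the auto commit interval as elapsed; nothing is committed yet. */
    void expireAutoCommitTimer() {
        timerExpired = true;
    }

    /** poll(): commit first (covers only previously returned records), then fetch. */
    List<Long> poll(int n) {
        if (timerExpired) {
            committed = position;
            timerExpired = false;
        }
        List<Long> batch = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            batch.add(position++);
        }
        return batch;
    }

    /** Application code finishing its work on one record. */
    void process(long offset) {
        processed++;
    }

    long committed() {
        return committed;
    }

    int processedCount() {
        return processed;
    }

    public static void main(String[] args) {
        ClassicAutoCommitSketch c = new ClassicAutoCommitSketch();
        List<Long> first = c.poll(5);   // offsets 0..4 are handed to the application
        c.expireAutoCommitTimer();      // timer expires while the batch is being processed
        c.process(first.get(0));
        // A crash here loses nothing: no commit has been sent for offsets 0..4.
        System.out.println("mid-batch committed=" + c.committed());
        for (int i = 1; i < first.size(); i++) {
            c.process(first.get(i));
        }
        c.poll(5);                      // the next poll() commits offset 5 before fetching
        System.out.println("after next poll committed=" + c.committed()
                + " processed=" + c.processedCount());
    }
}
```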

The same problem also exists in AsyncKafkaConsumer for async offset commit and
for auto offset commit during rebalance.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)