GitHub user tashoyan opened a pull request:
https://github.com/apache/kafka/pull/3051
KAFKA-5235: GetOffsetShell: new KafkaConsumer API, support for multiple
topics, minimized number of requests to server
This PR addresses two improvements:
[KAFKA-5235 GetOffsetShell: retrieve offsets for all given topics and
partitions with single request to the
broker](https://issues.apache.org/jira/browse/KAFKA-5235)
[KAFKA-5234 GetOffsetShell: retrieve offsets for multiple topics with
single request](https://issues.apache.org/jira/browse/KAFKA-5234)
1. The previous implementation used SimpleConsumer to get offsets and the old
Producer API to get topic/partition metadata. It determined the leader broker
for each partition and then requested the offsets from that leader. In total,
it made as many requests to the brokers as there were partitions (plus a
request to ZooKeeper for metadata).
The new implementation, `kafka-get-offsets.sh`, uses the KafkaConsumer API. It
makes at most two requests to the broker: 1) to query the existing topics and
partitions, and 2) to fetch all requested offsets. It also correctly handles
non-existent topics and partitions requested by the user:
> kafka-get-offsets.sh --bootstrap-servers vm:9092 --topics AAA,ZZZ --partitions 0,1
> AAA:0:7
> AAA:1:Partition not found
> ZZZ:0:Topic not found
2. Previously, the user could get offsets for only one topic. Now the user can
get offsets for many topics at once:
`kafka-get-offsets.sh --bootstrap-servers vm:9092 --topics AAA,ZZZ`
Moreover, the user can now retrieve offsets for _all_ topics; this is the
default when no topics are specified:
`kafka-get-offsets.sh --bootstrap-servers vm:9092`
Thanks to this feature, there is no longer any need to retrieve the list of
topics with `kafka-topics.sh` first.
When no topics are specified, the new `kafka-get-offsets.sh` tool takes into
account only user-level topics and ignores Kafka-internal topics (such as the
consumer offsets topic). This behavior can be changed with a dedicated command
line argument:
`kafka-get-offsets.sh --bootstrap-servers vm:9092 --include-internal-topics`
3. The new `kafka-get-offsets.sh` tool is consistent with the other console
tools with respect to command line argument names. In addition, it allows
passing arbitrary settings to KafkaConsumer via the `--consumer-property`
argument.
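The behaviour described in items 1 and 2 can be modelled with a short Python sketch. This is not the PR's Scala code: the function names `select_topics`, `split_existing`, and `render` are hypothetical, and plain dictionaries stand in for the two broker requests (metadata, then one batched offset lookup). `__consumer_offsets` is Kafka's consumer offsets topic; treating it as the only internal topic is a simplification made for this sketch.

```python
from typing import Dict, Iterable, List, Set, Tuple

TopicPartition = Tuple[str, int]

# Assumption: only the consumer offsets topic is treated as internal here.
INTERNAL_TOPICS = frozenset({"__consumer_offsets"})


def select_topics(all_topics: Iterable[str], requested: Iterable[str],
                  include_internal: bool = False) -> List[str]:
    """Explicitly requested topics win; otherwise default to all user-level topics."""
    requested = list(requested)
    if requested:
        return requested
    return sorted(t for t in all_topics
                  if include_internal or t not in INTERNAL_TOPICS)


def split_existing(metadata: Dict[str, Set[int]],
                   requested: Iterable[TopicPartition]) -> List[TopicPartition]:
    """Keep only pairs that exist, so one batched offset request covers them all."""
    return [(t, p) for t, p in requested if p in metadata.get(t, set())]


def render(metadata: Dict[str, Set[int]],
           offsets: Dict[TopicPartition, int],
           requested: Iterable[TopicPartition]) -> List[str]:
    """Format one `topic:partition:value` line per requested pair."""
    lines = []
    for topic, partition in requested:
        if topic not in metadata:
            value = "Topic not found"
        elif partition not in metadata[topic]:
            value = "Partition not found"
        else:
            value = str(offsets[(topic, partition)])
        lines.append(f"{topic}:{partition}:{value}")
    return lines


# Request 1 (modelled): topic and partition metadata known to the broker.
metadata = {"AAA": {0}}
requested = [("AAA", 0), ("AAA", 1), ("ZZZ", 0)]

# Request 2 (modelled): a single batched lookup for the pairs that exist.
to_fetch = split_existing(metadata, requested)
offsets = {pair: 7 for pair in to_fetch}  # stand-in for the broker's reply

for line in render(metadata, offsets, requested):
    print(line)
```

With the stand-in data above, the three printed lines correspond to the three lines of the example output in item 1. Separating `split_existing` from `render` keeps the missing topics and partitions out of the offset request while still reporting them to the user.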
I hope `kafka-get-offsets.sh` is now easier to use and performs better.
@junrao I suppose you may want to review.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tashoyan/kafka trunk
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/3051.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3051
----
commit ec16d064aac4dfba164aeefaae3950db7b2e35af
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-01T19:52:26Z
Add a testable method that retrieves offsets: getOffsets(). Cover it with
tests.
commit df60a30da3a5c3d0d95c85df2c5eb32a6eeae107
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-01T19:55:13Z
Fix some trivial warnings
commit 5aa9639666d32f7e036cee4cb42ac9b7223def2e
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-01T21:07:35Z
Switch the implementation to getOffsets().
commit 3d772b8fac0c45cbe7631064a57361b7928b9bc2
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-01T22:01:19Z
Add a test for a replicated partition
commit 15e8b1a83471919c40d56f32ac858a38b7ad7b31
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-10T21:42:33Z
Add the implementation based on new KafkaConsumer. Switch tests to this new
implementation. Now it works for replicated topics.
commit 1cdfce266c217b080410235c671f2764068dc96c
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-11T20:33:34Z
Implement support for non-existing topics and partitions
commit 7a4bdc4deed9987f348ad76c027bf891d7ef3257
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-11T21:08:26Z
Refactor: rename the method. Add documentation.
commit d473adb73434b7d4347cf62fbf29e73615ea8a82
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T07:39:13Z
Refine the doc
commit a21bfab73bac67d2441472f8f73d7a9c956dcc5c
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T09:11:19Z
Add more tests for non-existing partitions. Refactor tests.
commit 0b8e6f8e02802a5e3fb40156efa656bb91f2211d
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T09:26:34Z
Add a test for a non-existing topic.
commit 6a65574ac28f86712894e25d6edff1051bbc50ac
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T10:42:08Z
Add the ability to retrieve offsets for all partition of all topics. Add
the ability to exclude internal topics.
commit f23c85305868b2f4f0a00c318cb3a2d0786b467b
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T14:54:55Z
Switch the tool to new implementation
commit 5eb477fcd6de0eccf51a89f35b9898f60eb21106
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T18:01:38Z
Update the test according to changes in the contract
commit c1b4b979101b01d29c473a839b823f7d7db7fd5c
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T18:01:54Z
Remove old implementation
commit 81cc383c3d0640b07926a5ead0a292cdca622029
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T22:42:49Z
Update the implementation to support independently empty topic set and
empty partition set
commit afaed0f91e01a367dc8cb9aebeb79511884313eb
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T22:53:21Z
Remove redundant information from error messages, topic and partition are
available in the key. Remove unused constant.
commit 79eb356f781dcbbdc238d74a1fc69f725382954e
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T23:02:25Z
Update test expectations after error messages were changed
commit 36f00072d8fed7bc0d629c92d13022bdcb709cf9
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T23:10:04Z
Add some TODO
commit e2b467c882a89bd31747e80557761c3dbd774584
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T13:55:12Z
Add an utility to install on my machine - remove it later
commit d76672a2611ae7167bc91036fd48121abbc79975
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T13:56:23Z
These TODOs are done
commit 05a9e04a1f0c440e7cc640773a357ffb4e1f507d
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T14:14:56Z
Add an utility to install on my machine - remove it later
commit 4c92895f7b083c4f76f48b11a83ccdccbf62d835
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T14:15:53Z
Formatting
commit 5393d043c1c7e69682b48d4ab17ade72d489dd5f
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T18:14:22Z
Make command line args consistent and aligned with other Kafka console
tools. Deprecate old args and display warnings.
commit 8e47485d7a67ceda1f8949df882d719feb9de099
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T18:23:39Z
Add shell wrapper for GetOffsetShell
commit e0cf259ee9c6239d0eaaaf6647f82413f144d8ee
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T18:36:04Z
Add debug output
commit 54086e19a9fcff38fbb16250a65b5687b2fc700d
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T19:14:38Z
Pass extra consumer properties specified in command line to the consumer
commit 10b0a84c05ccf43ac5d1eae213c10dfbaaa134a5
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T19:27:40Z
Refactor: rename tests
commit e7c9d197691a3754f6c96476c05937b2860d33d4
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T21:26:36Z
Cosmetic
commit c2dbfa78900c1c401342dafb95afbcd88dc9ae07
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T22:06:15Z
Implement getting offsets for a timestamp
commit 1cb3d12d286fe03e8ae8baf980dc385d19bbbf17
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T22:35:11Z
Fix bug: report non-existing partition when should report non-existing
topic. Update tests to distinguish the two error handling paths.
----