[
https://issues.apache.org/jira/browse/KAFKA-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009814#comment-16009814
]
ASF GitHub Bot commented on KAFKA-5235:
---------------------------------------
GitHub user tashoyan opened a pull request:
https://github.com/apache/kafka/pull/3051
KAFKA-5235: GetOffsetShell: new KafkaConsumer API, support for multiple
topics, minimized number of requests to server
This PR addresses two improvements:
[KAFKA-5235 GetOffsetShell: retrieve offsets for all given topics and
partitions with single request to the
broker](https://issues.apache.org/jira/browse/KAFKA-5235)
[KAFKA-5234 GetOffsetShell: retrieve offsets for multiple topics with
single request](https://issues.apache.org/jira/browse/KAFKA-5234)
1. Previous implementation used SimpleConsumer to get offsets and old
Producer API to get topic/partition metadata. Previous implementation
determined a leader broker for each partition and then requested the leader for
offsets. In total, it did as many requests to the broker as the number of
partitions (plus a request to Zookeeper for metadata).
New implementation `kafka-get-offsets.sh` uses KafkaConsumer API. It makes
at most two requests to the broker: 1) to query existing topics and partitions,
2) to grab all requested offsets. New implementation correctly handles
non-existing topics and partitions asked by user:
> kafka-get-offsets.sh --bootstrap-servers vm:9092 --topics AAA,ZZZ
--partitions 0,1
> AAA:0:7
> AAA:1:Partition not found
> ZZZ:0:Topic not found
2. Previously, user could get offsets for one topic only. Now user can get
offsets for many topics at once:
`kafka-get-offsets.sh --bootstrap-servers vm:9092 --topics AAA,ZZZ`
Moreover, now user is able to retrieve offsets for _all_ topics - this is
the default when no topics specified:
`kafka-get-offsets.sh --bootstrap-servers vm:9092`
Thanks to this feature, there is no need anymore to retrieve all topics by
means of `kafka-topics.sh`.
When no topics specified, the new `kafka-get-offsets.sh` tool takes into
account only user-level topics and ignores Kafka-internal topics (i.e. consumer
offsets). This behavior can be altered via a special command line argument:
`kafka-get-offsets.sh --bootstrap-servers vm:9092 --include-internal-topics`
3. New `kafka-get-offsets.sh` tool is consistent with other console tools
with respect to command line argument names. In addition,
`kafka-get-offsets.sh` tool gives the possibility to pass an arbitrary setting
to KafkaConsumer via `--consumer-property` argument.
I hope, now `kafka-get-offsets.sh` is easier in use and gives performance
improvement.
@junrao I suppose you may want to review.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tashoyan/kafka trunk
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/3051.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3051
----
commit ec16d064aac4dfba164aeefaae3950db7b2e35af
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-01T19:52:26Z
Add a testable method that retrieves offsets: getOffsets(). Cover it with
tests.
commit df60a30da3a5c3d0d95c85df2c5eb32a6eeae107
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-01T19:55:13Z
Fix some trivial warnings
commit 5aa9639666d32f7e036cee4cb42ac9b7223def2e
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-01T21:07:35Z
Switch the implementation to getOffsets().
commit 3d772b8fac0c45cbe7631064a57361b7928b9bc2
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-01T22:01:19Z
Add a test for a replicated partition
commit 15e8b1a83471919c40d56f32ac858a38b7ad7b31
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-10T21:42:33Z
Add the implementation based on new KafkaConsumer. Switch tests to this new
implementation. Now it works for replicated topics.
commit 1cdfce266c217b080410235c671f2764068dc96c
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-11T20:33:34Z
Implement support for non-existing topics and partitions
commit 7a4bdc4deed9987f348ad76c027bf891d7ef3257
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-11T21:08:26Z
Refactor: rename the method. Add documentation.
commit d473adb73434b7d4347cf62fbf29e73615ea8a82
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T07:39:13Z
Refine the doc
commit a21bfab73bac67d2441472f8f73d7a9c956dcc5c
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T09:11:19Z
Add more tests for non-existing partitions. Refactor tests.
commit 0b8e6f8e02802a5e3fb40156efa656bb91f2211d
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T09:26:34Z
Add a test for a non-existing topic.
commit 6a65574ac28f86712894e25d6edff1051bbc50ac
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T10:42:08Z
Add the ability to retrieve offsets for all partition of all topics. Add
the ability to exclude internal topics.
commit f23c85305868b2f4f0a00c318cb3a2d0786b467b
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T14:54:55Z
Switch the tool to new implementation
commit 5eb477fcd6de0eccf51a89f35b9898f60eb21106
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T18:01:38Z
Update the test according to changes in the contract
commit c1b4b979101b01d29c473a839b823f7d7db7fd5c
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T18:01:54Z
Remove old implementation
commit 81cc383c3d0640b07926a5ead0a292cdca622029
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T22:42:49Z
Update the implementation to support independently empty topic set and
empty partition set
commit afaed0f91e01a367dc8cb9aebeb79511884313eb
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T22:53:21Z
Remove redundant information from error messages, topic and partition are
available in the key. Remove unused constant.
commit 79eb356f781dcbbdc238d74a1fc69f725382954e
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T23:02:25Z
Update test expectations after error messages were changed
commit 36f00072d8fed7bc0d629c92d13022bdcb709cf9
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-12T23:10:04Z
Add some TODO
commit e2b467c882a89bd31747e80557761c3dbd774584
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T13:55:12Z
Add an utility to install on my machine - remove it later
commit d76672a2611ae7167bc91036fd48121abbc79975
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T13:56:23Z
These TODOs are done
commit 05a9e04a1f0c440e7cc640773a357ffb4e1f507d
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T14:14:56Z
Add an utility to install on my machine - remove it later
commit 4c92895f7b083c4f76f48b11a83ccdccbf62d835
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T14:15:53Z
Formatting
commit 5393d043c1c7e69682b48d4ab17ade72d489dd5f
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T18:14:22Z
Make command line args consistent and aligned with other Kafka console
tools. Deprecate old args and display warnings.
commit 8e47485d7a67ceda1f8949df882d719feb9de099
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T18:23:39Z
Add shell wrapper for GetOffsetShell
commit e0cf259ee9c6239d0eaaaf6647f82413f144d8ee
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T18:36:04Z
Add debug output
commit 54086e19a9fcff38fbb16250a65b5687b2fc700d
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T19:14:38Z
Pass extra consumer properties specified in command line to the consumer
commit 10b0a84c05ccf43ac5d1eae213c10dfbaaa134a5
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T19:27:40Z
Refactor: rename tests
commit e7c9d197691a3754f6c96476c05937b2860d33d4
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T21:26:36Z
Cosmetic
commit c2dbfa78900c1c401342dafb95afbcd88dc9ae07
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T22:06:15Z
Implement getting offsets for a timestamp
commit 1cb3d12d286fe03e8ae8baf980dc385d19bbbf17
Author: Arseniy Tashoyan <[email protected]>
Date: 2017-05-13T22:35:11Z
Fix bug: report non-existing partition when should report non-existing
topic. Update tests to distinguish the two error handling paths.
----
> GetOffsetShell: retrieve offsets for all given topics and partitions with
> single request to the broker
> ------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-5235
> URL: https://issues.apache.org/jira/browse/KAFKA-5235
> Project: Kafka
> Issue Type: Improvement
> Components: tools
> Reporter: Arseniy Tashoyan
> Labels: tool
> Fix For: 0.11.0.0
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> GetOffsetShell is implemented on old SimpleConsumer. It needs Zookeeper to
> retrieve metadata about topics and partitions. At present, GetOffsetShell
> does the following:
> - get metadata from Zookeeper
> - iterate over partitions
> - for each partition, connect to its leader broker and request offsets
> Instead, GetOffsetShell can use new KafkaConsumer and retrieve offsets by
> means of endOffsets(), beginningOffsets() and offsetsForTimes() methods. One
> request is sufficient for all topics and partitions.
> As far as GetOffsetShell is re-implemented with new KafkaConsumer API, it
> will not depend on obsolete API: SimpleConsumer, old producer API.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)