Re: Review Request 25995: Patch for KAFKA-1650
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25995/ --- (Updated Dec. 8, 2014, 9:36 a.m.) Review request for kafka. Bugs: KAFKA-1650 and KAKFA-1650 https://issues.apache.org/jira/browse/KAFKA-1650 https://issues.apache.org/jira/browse/KAKFA-1650 Repository: kafka Description (updated) --- Addressed Guozhang's comments. Addressed Guozhang's comments commit before switch to trunk commit before rebase Rebased on trunk, Addressed Guozhang's comments. Addressed Guozhang's comments on MaxInFlightRequests Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into mirrormaker-redesign Incorporated Guozhang's comments Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into mirrormaker-redesign Merged KAFKA-345 into this patch. Incorporated Joel and Jun's comments. Added consumer rebalance listener to mirror maker, will test it later. Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into mirrormaker-redesign Conflicts: core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala core/src/test/scala/unit/kafka/consumer/ZookeeperConsumerConnectorTest.scala added custom config for consumer rebalance listener Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into mirrormaker-redesign Add configurable consumer rebalance listener Incorporated Guozhang's comments Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into mirrormaker-redesign Incorporated Guozhang's comments. Addressed Guozhang's comment. numMessageUnacked should be decremented no matter the send was successful or not. Addressed Jun's comments. Incorporated Jun's comments Incorporated Jun's comments and rebased on trunk Diffs (updated) - core/src/main/scala/kafka/consumer/ConsumerConnector.scala 62c0686e816d2888772d5a911becf625eedee397 core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala e991d2187d03241f639eeaf6769fb59c8c99664c core/src/main/scala/kafka/javaapi/consumer/ZookeeperConsumerConnector.scala 9baad34a9793e5067d11289ece2154ba87b388af core/src/main/scala/kafka/tools/MirrorMaker.scala b06ff6000183b257005b5ac3ccc7ba8976f1ab8d Diff: https://reviews.apache.org/r/25995/diff/ Testing --- Thanks, Jiangjie Qin
[jira] [Updated] (KAFKA-1650) Mirror Maker could lose data on unclean shutdown.
[ https://issues.apache.org/jira/browse/KAFKA-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin updated KAFKA-1650: Attachment: KAFKA-1650_2014-12-08_01:36:01.patch Mirror Maker could lose data on unclean shutdown. - Key: KAFKA-1650 URL: https://issues.apache.org/jira/browse/KAFKA-1650 Project: Kafka Issue Type: Improvement Reporter: Jiangjie Qin Assignee: Jiangjie Qin Attachments: KAFKA-1650.patch, KAFKA-1650_2014-10-06_10:17:46.patch, KAFKA-1650_2014-11-12_09:51:30.patch, KAFKA-1650_2014-11-17_18:44:37.patch, KAFKA-1650_2014-11-20_12:00:16.patch, KAFKA-1650_2014-11-24_08:15:17.patch, KAFKA-1650_2014-12-03_15:02:31.patch, KAFKA-1650_2014-12-03_19:02:13.patch, KAFKA-1650_2014-12-04_11:59:07.patch, KAFKA-1650_2014-12-06_18:58:57.patch, KAFKA-1650_2014-12-08_01:36:01.patch Currently if mirror maker got shutdown uncleanly, the data in the data channel and buffer could potentially be lost. With the new producer's callback, this issue could be solved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
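The fix described above relies on the new producer's send callback. A minimal sketch of such a callback, assuming a shared in-flight counter like the numMessageUnacked mentioned in the review board discussion (the class and field names here are illustrative, not the actual MirrorMaker patch):

    import java.util.concurrent.atomic.AtomicInteger;
    import org.apache.kafka.clients.producer.Callback;
    import org.apache.kafka.clients.producer.RecordMetadata;

    // Illustrative sketch only: track in-flight sends so a clean shutdown can wait
    // until every produced message has been acknowledged before committing offsets.
    public class UnackedTrackingCallback implements Callback {
        private final AtomicInteger numMessageUnacked;  // assumed shared counter, incremented before each send

        public UnackedTrackingCallback(AtomicInteger numMessageUnacked) {
            this.numMessageUnacked = numMessageUnacked;
        }

        @Override
        public void onCompletion(RecordMetadata metadata, Exception exception) {
            // Decrement whether or not the send succeeded (as discussed in the review),
            // otherwise a failed send would leave the counter stuck above zero forever.
            numMessageUnacked.decrementAndGet();
            if (exception != null) {
                // A real mirror maker would log the failure and decide whether to abort or retry.
            }
        }
    }

On shutdown the mirror maker would stop consuming, wait for the counter to drain to zero, and only then commit consumer offsets, so no consumed-but-unproduced message is silently dropped.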
[jira] [Commented] (KAFKA-1650) Mirror Maker could lose data on unclean shutdown.
[ https://issues.apache.org/jira/browse/KAFKA-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237679#comment-14237679 ] Jiangjie Qin commented on KAFKA-1650: - Updated reviewboard https://reviews.apache.org/r/25995/diff/ against branch origin/trunk Mirror Maker could lose data on unclean shutdown. - Key: KAFKA-1650 URL: https://issues.apache.org/jira/browse/KAFKA-1650 Project: Kafka Issue Type: Improvement Reporter: Jiangjie Qin Assignee: Jiangjie Qin Attachments: KAFKA-1650.patch, KAFKA-1650_2014-10-06_10:17:46.patch, KAFKA-1650_2014-11-12_09:51:30.patch, KAFKA-1650_2014-11-17_18:44:37.patch, KAFKA-1650_2014-11-20_12:00:16.patch, KAFKA-1650_2014-11-24_08:15:17.patch, KAFKA-1650_2014-12-03_15:02:31.patch, KAFKA-1650_2014-12-03_19:02:13.patch, KAFKA-1650_2014-12-04_11:59:07.patch, KAFKA-1650_2014-12-06_18:58:57.patch, KAFKA-1650_2014-12-08_01:36:01.patch Currently if mirror maker got shutdown uncleanly, the data in the data channel and buffer could potentially be lost. With the new producer's callback, this issue could be solved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28643: Patch for KAFKA-1802
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28643/ --- (Updated Dec. 8, 2014, 10:56 a.m.) Review request for kafka. Bugs: KAFKA-1802 https://issues.apache.org/jira/browse/KAFKA-1802 Repository: kafka Description (updated) --- KAFKA-1802 - Add a new type of request for the discovery of the controller KAFKA-1802 - UpdateMetadataRequest is not sent on startup, so brokers do not cache cluster info Diffs (updated) - clients/src/main/java/org/apache/kafka/common/protocol/ApiKeys.java 109fc965e09b2ed186a073351bd037ac8af20a4c clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 7517b879866fc5dad5f8d8ad30636da8bbe7784a clients/src/main/java/org/apache/kafka/common/requests/AdminRequest.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/requests/AdminResponse.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/requests/ClusterMetadataRequest.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/requests/ClusterMetadataResponse.java PRE-CREATION clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java df37fc6d8f0db0b8192a948426af603be3444da4 core/src/main/scala/kafka/api/AdminRequest.scala PRE-CREATION core/src/main/scala/kafka/api/AdminResponse.scala PRE-CREATION core/src/main/scala/kafka/api/ClusterMetadataRequest.scala PRE-CREATION core/src/main/scala/kafka/api/ClusterMetadataResponse.scala PRE-CREATION core/src/main/scala/kafka/api/RequestKeys.scala c24c0345feedc7b9e2e9f40af11bfa1b8d328c43 core/src/main/scala/kafka/api/admin/request/args/ParseException.scala PRE-CREATION core/src/main/scala/kafka/api/admin/request/args/TopicCommandArguments.scala PRE-CREATION core/src/main/scala/kafka/common/AdminRequestFailedException.scala PRE-CREATION core/src/main/scala/kafka/common/ErrorMapping.scala eedc2f5f21dd8755fba891998456351622e17047 core/src/main/scala/kafka/controller/ControllerChannelManager.scala eb492f00449744bc8d63f55b393e2a1659d38454 core/src/main/scala/kafka/controller/KafkaController.scala 66df6d2fbdbdd556da6bea0df84f93e0472c8fbf core/src/main/scala/kafka/server/KafkaApis.scala 2a1c0326b6e6966d8b8254bd6a1cb83ad98a3b80 core/src/main/scala/kafka/server/MetadataCache.scala bf81a1ab88c14be8697b441eedbeb28fa0112643 core/src/test/scala/unit/kafka/api/AdminRequestTest.scala PRE-CREATION core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala cd16ced5465d098be7a60498326b2a98c248f343 Diff: https://reviews.apache.org/r/28643/diff/ Testing --- Thanks, Andrii Biletskyi
[jira] [Commented] (KAFKA-1802) Add a new type of request for the discovery of the controller
[ https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237754#comment-14237754 ] Andrii Biletskyi commented on KAFKA-1802: - Updated reviewboard https://reviews.apache.org/r/28643/diff/ against branch origin/trunk Add a new type of request for the discovery of the controller - Key: KAFKA-1802 URL: https://issues.apache.org/jira/browse/KAFKA-1802 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 Attachments: KAFKA-1802.patch, KAFKA-1802_2014-12-08_12:56:03.patch The goal here is like meta data discovery is for producer so CLI can find which broker it should send the rest of its admin requests too. Any broker can respond to this specific AdminMeta RQ/RP but only the controller broker should be responding to Admin message otherwise that broker should respond to any admin message with the response for what the controller is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
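The behaviour described above (any broker can answer the discovery request, only the controller executes admin commands, and a non-controller points the client at the controller) can be summarized with a small sketch; every class and method name below is illustrative and not taken from the actual patch:

    // Illustrative sketch of the redirect behaviour described in the ticket.
    class ControllerAwareAdminHandler {
        private final int thisBrokerId;
        private final int controllerId;   // every broker learns the current controller from cluster metadata

        ControllerAwareAdminHandler(int thisBrokerId, int controllerId) {
            this.thisBrokerId = thisBrokerId;
            this.controllerId = controllerId;
        }

        // Returns either the result of executing the admin command or a redirect to the controller.
        String handleAdminMessage(String adminCommand) {
            if (thisBrokerId == controllerId) {
                return "executed: " + adminCommand;   // only the controller actually executes it
            }
            return "not the controller; send admin requests to broker " + controllerId;
        }
    }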
[jira] [Updated] (KAFKA-1802) Add a new type of request for the discovery of the controller
[ https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi updated KAFKA-1802: Attachment: KAFKA-1802_2014-12-08_12:56:03.patch Add a new type of request for the discovery of the controller - Key: KAFKA-1802 URL: https://issues.apache.org/jira/browse/KAFKA-1802 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 Attachments: KAFKA-1802.patch, KAFKA-1802_2014-12-08_12:56:03.patch The goal here is like meta data discovery is for producer so CLI can find which broker it should send the rest of its admin requests too. Any broker can respond to this specific AdminMeta RQ/RP but only the controller broker should be responding to Admin message otherwise that broker should respond to any admin message with the response for what the controller is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28481: Patch for KAFKA-1792
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28481/ --- (Updated Dec. 8, 2014, 11:43 a.m.) Review request for kafka. Bugs: KAFKA-1792 https://issues.apache.org/jira/browse/KAFKA-1792 Repository: kafka Description (updated) --- KAFKA-1792: CR KAFKA-1792: CR2 Diffs (updated) - core/src/main/scala/kafka/admin/AdminUtils.scala 28b12c7b89a56c113b665fbde1b95f873f8624a3 core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala 979992b68af3723cd229845faff81c641123bb88 core/src/test/scala/unit/kafka/admin/AdminTest.scala e28979827110dfbbb92fe5b152e7f1cc973de400 topics.json ff011ed381e781b9a177036001d44dca3eac586f Diff: https://reviews.apache.org/r/28481/diff/ Testing --- Thanks, Dmitry Pekar
[jira] [Updated] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
[ https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Pekar updated KAFKA-1792: Attachment: KAFKA-1792_2014-12-08_13:42:43.patch change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments - Key: KAFKA-1792 URL: https://issues.apache.org/jira/browse/KAFKA-1792 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, KAFKA-1792_2014-12-08_13:42:43.patch Current implementation produces fair replica distribution between specified list of brokers. Unfortunately, it doesn't take into account current replica assignment. So if we have, for instance, 3 brokers id=[0..2] and are going to add fourth broker id=3, generate will create an assignment config which will redistribute replicas fairly across brokers [0..3] in the same way as those partitions were created from scratch. It will not take into consideration current replica assignment and accordingly will not try to minimize number of replica moves between brokers. As proposed by [~charmalloc] this should be improved. New output of improved --generate algorithm should suite following requirements: - fairness of replica distribution - every broker will have R or R+1 replicas assigned; - minimum of reassignments - number of replica moves between brokers will be minimal; Example. Consider following replica distribution per brokers [0..3] (we just added brokers 2 and 3): - broker - 0, 1, 2, 3 - replicas - 7, 6, 0, 0 The new algorithm will produce following assignment: - broker - 0, 1, 2, 3 - replicas - 4, 3, 3, 3 - moves - -3, -3, +3, +3 It will be fair and number of moves will be 6, which is minimal for specified initial distribution. The scope of this issue is: - design an algorithm matching the above requirements; - implement this algorithm and unit tests; - test it manually using different initial assignments; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
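The fairness/minimal-move requirement can be checked against the ticket's own numbers. The sketch below is only a counting illustration of the stated requirements (every broker ends with R or R+1 replicas, and the number of moves is minimal), not the algorithm in the attached patch, which also has to decide which concrete partition replicas move:

    import java.util.Arrays;

    // Illustrative only: a greedy pass that moves replicas from over-filled to under-filled
    // brokers until each holds R or R+1, reproducing the ticket's example (7,6,0,0 -> 4,3,3,3).
    public class FairMoveExample {
        public static void main(String[] args) {
            int[] replicas = {7, 6, 0, 0};
            int total = Arrays.stream(replicas).sum();
            int base = total / replicas.length;      // R
            int extra = total % replicas.length;     // number of brokers that get R+1
            int[] target = new int[replicas.length];
            for (int i = 0; i < target.length; i++)
                target[i] = base + (i < extra ? 1 : 0);

            int moves = 0;
            for (int from = 0; from < replicas.length; from++) {
                while (replicas[from] > target[from]) {
                    for (int to = 0; to < replicas.length && replicas[from] > target[from]; to++) {
                        if (replicas[to] < target[to]) {
                            replicas[from]--; replicas[to]++; moves++;
                        }
                    }
                }
            }
            System.out.println(Arrays.toString(replicas) + ", moves = " + moves); // [4, 3, 3, 3], moves = 6
        }
    }

A real implementation must additionally choose which partitions to move and keep replicas of the same partition on distinct brokers; this only illustrates why 6 moves is the minimum for the example distribution.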
[jira] [Commented] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
[ https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237786#comment-14237786 ] Dmitry Pekar commented on KAFKA-1792: - Updated reviewboard https://reviews.apache.org/r/28481/diff/ against branch origin/trunk change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments - Key: KAFKA-1792 URL: https://issues.apache.org/jira/browse/KAFKA-1792 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, KAFKA-1792_2014-12-08_13:42:43.patch Current implementation produces fair replica distribution between specified list of brokers. Unfortunately, it doesn't take into account current replica assignment. So if we have, for instance, 3 brokers id=[0..2] and are going to add fourth broker id=3, generate will create an assignment config which will redistribute replicas fairly across brokers [0..3] in the same way as those partitions were created from scratch. It will not take into consideration current replica assignment and accordingly will not try to minimize number of replica moves between brokers. As proposed by [~charmalloc] this should be improved. New output of improved --generate algorithm should suite following requirements: - fairness of replica distribution - every broker will have R or R+1 replicas assigned; - minimum of reassignments - number of replica moves between brokers will be minimal; Example. Consider following replica distribution per brokers [0..3] (we just added brokers 2 and 3): - broker - 0, 1, 2, 3 - replicas - 7, 6, 0, 0 The new algorithm will produce following assignment: - broker - 0, 1, 2, 3 - replicas - 4, 3, 3, 3 - moves - -3, -3, +3, +3 It will be fair and number of moves will be 6, which is minimal for specified initial distribution. The scope of this issue is: - design an algorithm matching the above requirements; - implement this algorithm and unit tests; - test it manually using different initial assignments; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
[ https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237796#comment-14237796 ] Dmitry Pekar commented on KAFKA-1792: - [~nehanarkhede] Thank you for your comments. I've updated and added a patch for fixed items. I can't agree with you about unit test for the algorithm. If it contains a bug or could be improved in future, than we would not be able to guarantee it correctness after the fix/improvement if not having the unit-test. The above unit test, IMHO, already contains those scenarios, but may be I've missed some important scenario. Could you please review it also? change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments - Key: KAFKA-1792 URL: https://issues.apache.org/jira/browse/KAFKA-1792 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, KAFKA-1792_2014-12-08_13:42:43.patch Current implementation produces fair replica distribution between specified list of brokers. Unfortunately, it doesn't take into account current replica assignment. So if we have, for instance, 3 brokers id=[0..2] and are going to add fourth broker id=3, generate will create an assignment config which will redistribute replicas fairly across brokers [0..3] in the same way as those partitions were created from scratch. It will not take into consideration current replica assignment and accordingly will not try to minimize number of replica moves between brokers. As proposed by [~charmalloc] this should be improved. New output of improved --generate algorithm should suite following requirements: - fairness of replica distribution - every broker will have R or R+1 replicas assigned; - minimum of reassignments - number of replica moves between brokers will be minimal; Example. Consider following replica distribution per brokers [0..3] (we just added brokers 2 and 3): - broker - 0, 1, 2, 3 - replicas - 7, 6, 0, 0 The new algorithm will produce following assignment: - broker - 0, 1, 2, 3 - replicas - 4, 3, 3, 3 - moves - -3, -3, +3, +3 It will be fair and number of moves will be 6, which is minimal for specified initial distribution. The scope of this issue is: - design an algorithm matching the above requirements; - implement this algorithm and unit tests; - test it manually using different initial assignments; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
[ https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237796#comment-14237796 ] Dmitry Pekar edited comment on KAFKA-1792 at 12/8/14 11:48 AM: --- [~nehanarkhede] Thank you for your comments. I've updated rb and added a patch for fixed items. I can't agree with you about unit test for the algorithm. If it contains a bug or could be improved in future, than we would not be able to guarantee it correctness after the fix/improvement if not having the unit-test. The above unit test, IMHO, already contains those scenarios, but may be I've missed some important scenario. Could you please review it also? was (Author: dmitry pekar): [~nehanarkhede] Thank you for your comments. I've updated and added a patch for fixed items. I can't agree with you about unit test for the algorithm. If it contains a bug or could be improved in future, than we would not be able to guarantee it correctness after the fix/improvement if not having the unit-test. The above unit test, IMHO, already contains those scenarios, but may be I've missed some important scenario. Could you please review it also? change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments - Key: KAFKA-1792 URL: https://issues.apache.org/jira/browse/KAFKA-1792 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, KAFKA-1792_2014-12-08_13:42:43.patch Current implementation produces fair replica distribution between specified list of brokers. Unfortunately, it doesn't take into account current replica assignment. So if we have, for instance, 3 brokers id=[0..2] and are going to add fourth broker id=3, generate will create an assignment config which will redistribute replicas fairly across brokers [0..3] in the same way as those partitions were created from scratch. It will not take into consideration current replica assignment and accordingly will not try to minimize number of replica moves between brokers. As proposed by [~charmalloc] this should be improved. New output of improved --generate algorithm should suite following requirements: - fairness of replica distribution - every broker will have R or R+1 replicas assigned; - minimum of reassignments - number of replica moves between brokers will be minimal; Example. Consider following replica distribution per brokers [0..3] (we just added brokers 2 and 3): - broker - 0, 1, 2, 3 - replicas - 7, 6, 0, 0 The new algorithm will produce following assignment: - broker - 0, 1, 2, 3 - replicas - 4, 3, 3, 3 - moves - -3, -3, +3, +3 It will be fair and number of moves will be 6, which is minimal for specified initial distribution. The scope of this issue is: - design an algorithm matching the above requirements; - implement this algorithm and unit tests; - test it manually using different initial assignments; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
[ https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237796#comment-14237796 ] Dmitry Pekar edited comment on KAFKA-1792 at 12/8/14 11:49 AM: --- [~nehanarkhede] Thank you for your comments. I've updated rb and added a patch for fixed items. I can't agree with you about unit test for the algorithm. If the algorithm contains a bug or could be improved in future, than we would not be able to verify it correctness after the fix/improvement if not having the unit-test. The above unit test, IMHO, already contains those scenarios, but may be I've missed some important scenario. Could you please review it also? was (Author: dmitry pekar): [~nehanarkhede] Thank you for your comments. I've updated rb and added a patch for fixed items. I can't agree with you about unit test for the algorithm. If it contains a bug or could be improved in future, than we would not be able to guarantee it correctness after the fix/improvement if not having the unit-test. The above unit test, IMHO, already contains those scenarios, but may be I've missed some important scenario. Could you please review it also? change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments - Key: KAFKA-1792 URL: https://issues.apache.org/jira/browse/KAFKA-1792 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3 Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, KAFKA-1792_2014-12-08_13:42:43.patch Current implementation produces fair replica distribution between specified list of brokers. Unfortunately, it doesn't take into account current replica assignment. So if we have, for instance, 3 brokers id=[0..2] and are going to add fourth broker id=3, generate will create an assignment config which will redistribute replicas fairly across brokers [0..3] in the same way as those partitions were created from scratch. It will not take into consideration current replica assignment and accordingly will not try to minimize number of replica moves between brokers. As proposed by [~charmalloc] this should be improved. New output of improved --generate algorithm should suite following requirements: - fairness of replica distribution - every broker will have R or R+1 replicas assigned; - minimum of reassignments - number of replica moves between brokers will be minimal; Example. Consider following replica distribution per brokers [0..3] (we just added brokers 2 and 3): - broker - 0, 1, 2, 3 - replicas - 7, 6, 0, 0 The new algorithm will produce following assignment: - broker - 0, 1, 2, 3 - replicas - 4, 3, 3, 3 - moves - -3, -3, +3, +3 It will be fair and number of moves will be 6, which is minimal for specified initial distribution. The scope of this issue is: - design an algorithm matching the above requirements; - implement this algorithm and unit tests; - test it manually using different initial assignments; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1694) kafka command line and centralized operations
[ https://issues.apache.org/jira/browse/KAFKA-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi updated KAFKA-1694: Attachment: (was: KAFKA-1772_1802_1775_1774.patch) kafka command line and centralized operations - Key: KAFKA-1694 URL: https://issues.apache.org/jira/browse/KAFKA-1694 Project: Kafka Issue Type: Bug Reporter: Joe Stein Priority: Critical Fix For: 0.8.3 https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1694) kafka command line and centralized operations
[ https://issues.apache.org/jira/browse/KAFKA-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi updated KAFKA-1694: Attachment: KAFKA-1772_1802_1775_1774_v2.patch kafka command line and centralized operations - Key: KAFKA-1694 URL: https://issues.apache.org/jira/browse/KAFKA-1694 Project: Kafka Issue Type: Bug Reporter: Joe Stein Priority: Critical Fix For: 0.8.3 Attachments: KAFKA-1772_1802_1775_1774_v2.patch https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1694) kafka command line and centralized operations
[ https://issues.apache.org/jira/browse/KAFKA-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237283#comment-14237283 ] Andrii Biletskyi edited comment on KAFKA-1694 at 12/8/14 12:40 PM: --- I've added a single patch that covers all currently implemented functionality (Admin message + basis shell functionality with TopicCommand) to receive some initial feedback. To start the Shell please follow the instructions under https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+Tool+Installation was (Author: abiletskyi): I've added a single patch that covers all currently implemented functionality (Admin message + basis shell functionality with TopicCommand) to receive some initial feedback. Patch is created against trunk, commit 7e9368b. To get this working: 1) apply patch 2) build kafka: ./gradlew releaseTarGz_2_10_4 3) start somewhere kafka from build release (archive is in ./core/build/distributions) 4.1) To start interactive shell: #sudo bin/kafka.sh --shell --broker host:port #kafka help Or: 4.2) Call TopicCommand right from kafka.sh. E.g.: #sudo bin/kafka.sh --list-topics --broker host:port kafka command line and centralized operations - Key: KAFKA-1694 URL: https://issues.apache.org/jira/browse/KAFKA-1694 Project: Kafka Issue Type: Bug Reporter: Joe Stein Priority: Critical Fix For: 0.8.3 Attachments: KAFKA-1772_1802_1775_1774_v2.patch https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1774) REPL and Shell Client for Admin Message RQ/RP
[ https://issues.apache.org/jira/browse/KAFKA-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237951#comment-14237951 ] Andrii Biletskyi commented on KAFKA-1774: - Patch is available under parent ticket - KAFKA-1694 REPL and Shell Client for Admin Message RQ/RP - Key: KAFKA-1774 URL: https://issues.apache.org/jira/browse/KAFKA-1774 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 We should have a REPL we can work in and execute the commands with the arguments. With this we can do: ./kafka.sh --shell kafkaattach cluster -b localhost:9092; kafkadescribe topic sampleTopicNameForExample; the command line version can work like it does now so folks don't have to re-write all of their tooling. kafka.sh --topics --everything the same like kafka-topics.sh is kafka.sh --reassign --everything the same like kafka-reassign-partitions.sh is -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1772) Add an Admin message type for request response
[ https://issues.apache.org/jira/browse/KAFKA-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi updated KAFKA-1772: Attachment: (was: KAFKA-1772_2014-12-02_16:23:26.patch) Add an Admin message type for request response -- Key: KAFKA-1772 URL: https://issues.apache.org/jira/browse/KAFKA-1772 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 - utility int8 - command int8 - format int8 - args variable length bytes utility 0 - Broker 1 - Topic 2 - Replication 3 - Controller 4 - Consumer 5 - Producer Command 0 - Create 1 - Alter 3 - Delete 4 - List 5 - Audit format 0 - JSON args e.g. (which would equate to the data structure values == 2,1,0) meta-store: { {zookeeper:localhost:12913/kafka} }args: { partitions: [ {topic: topic1, partition: 0}, {topic: topic1, partition: 1}, {topic: topic1, partition: 2}, {topic: topic2, partition: 0}, {topic: topic2, partition: 1}, ] } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
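The description above fixes three int8 header fields (utility, command, format) followed by variable-length args. A rough sketch of encoding such a message follows; the int32 length prefix for the args bytes is an assumption made only to keep the example self-contained, since the ticket just says "variable length bytes":

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    // Sketch of how the admin message layout described above might sit on the wire;
    // it mirrors the ticket's field list, not committed code.
    public class AdminMessageExample {
        public static ByteBuffer encode(byte utility, byte command, byte format, byte[] args) {
            ByteBuffer buf = ByteBuffer.allocate(3 + 4 + args.length);
            buf.put(utility);          // e.g. 2 = Replication
            buf.put(command);          // e.g. 1 = Alter
            buf.put(format);           // 0 = JSON
            buf.putInt(args.length);   // assumed length prefix for the variable-length args
            buf.put(args);
            buf.flip();
            return buf;
        }

        public static void main(String[] args) {
            String json = "{\"partitions\": [{\"topic\": \"topic1\", \"partition\": 0}]}";
            ByteBuffer msg = encode((byte) 2, (byte) 1, (byte) 0, json.getBytes(StandardCharsets.UTF_8));
            System.out.println("encoded " + msg.remaining() + " bytes");
        }
    }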
[jira] [Comment Edited] (KAFKA-1772) Add an Admin message type for request response
[ https://issues.apache.org/jira/browse/KAFKA-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216456#comment-14216456 ] Andrii Biletskyi edited comment on KAFKA-1772 at 12/8/14 3:12 PM: -- Patch is available under parent ticket - KAFKA-1694 was (Author: abiletskyi): Created reviewboard https://reviews.apache.org/r/28175/diff/ against branch origin/trunk Add an Admin message type for request response -- Key: KAFKA-1772 URL: https://issues.apache.org/jira/browse/KAFKA-1772 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 - utility int8 - command int8 - format int8 - args variable length bytes utility 0 - Broker 1 - Topic 2 - Replication 3 - Controller 4 - Consumer 5 - Producer Command 0 - Create 1 - Alter 3 - Delete 4 - List 5 - Audit format 0 - JSON args e.g. (which would equate to the data structure values == 2,1,0) meta-store: { {zookeeper:localhost:12913/kafka} }args: { partitions: [ {topic: topic1, partition: 0}, {topic: topic1, partition: 1}, {topic: topic1, partition: 2}, {topic: topic2, partition: 0}, {topic: topic2, partition: 1}, ] } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1772) Add an Admin message type for request response
[ https://issues.apache.org/jira/browse/KAFKA-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi updated KAFKA-1772: Attachment: (was: KAFKA-1772.patch) Add an Admin message type for request response -- Key: KAFKA-1772 URL: https://issues.apache.org/jira/browse/KAFKA-1772 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 - utility int8 - command int8 - format int8 - args variable length bytes utility 0 - Broker 1 - Topic 2 - Replication 3 - Controller 4 - Consumer 5 - Producer Command 0 - Create 1 - Alter 3 - Delete 4 - List 5 - Audit format 0 - JSON args e.g. (which would equate to the data structure values == 2,1,0) meta-store: { {zookeeper:localhost:12913/kafka} }args: { partitions: [ {topic: topic1, partition: 0}, {topic: topic1, partition: 1}, {topic: topic1, partition: 2}, {topic: topic2, partition: 0}, {topic: topic2, partition: 1}, ] } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (KAFKA-1802) Add a new type of request for the discovery of the controller
[ https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi updated KAFKA-1802: Comment: was deleted (was: Updated reviewboard https://reviews.apache.org/r/28643/diff/ against branch origin/trunk) Add a new type of request for the discovery of the controller - Key: KAFKA-1802 URL: https://issues.apache.org/jira/browse/KAFKA-1802 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 Attachments: KAFKA-1802.patch, KAFKA-1802_2014-12-08_12:56:03.patch The goal here is like meta data discovery is for producer so CLI can find which broker it should send the rest of its admin requests too. Any broker can respond to this specific AdminMeta RQ/RP but only the controller broker should be responding to Admin message otherwise that broker should respond to any admin message with the response for what the controller is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (KAFKA-1802) Add a new type of request for the discovery of the controller
[ https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi updated KAFKA-1802: Comment: was deleted (was: Created reviewboard https://reviews.apache.org/r/28643/diff/ against branch origin/trunk [patch created on top of patch for KAFKA-1772]) Add a new type of request for the discovery of the controller - Key: KAFKA-1802 URL: https://issues.apache.org/jira/browse/KAFKA-1802 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 Attachments: KAFKA-1802.patch, KAFKA-1802_2014-12-08_12:56:03.patch The goal here is like meta data discovery is for producer so CLI can find which broker it should send the rest of its admin requests too. Any broker can respond to this specific AdminMeta RQ/RP but only the controller broker should be responding to Admin message otherwise that broker should respond to any admin message with the response for what the controller is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KAFKA-1774) REPL and Shell Client for Admin Message RQ/RP
[ https://issues.apache.org/jira/browse/KAFKA-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi resolved KAFKA-1774. - Resolution: Implemented REPL and Shell Client for Admin Message RQ/RP - Key: KAFKA-1774 URL: https://issues.apache.org/jira/browse/KAFKA-1774 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 We should have a REPL we can work in and execute the commands with the arguments. With this we can do: ./kafka.sh --shell kafkaattach cluster -b localhost:9092; kafkadescribe topic sampleTopicNameForExample; the command line version can work like it does now so folks don't have to re-write all of their tooling. kafka.sh --topics --everything the same like kafka-topics.sh is kafka.sh --reassign --everything the same like kafka-reassign-partitions.sh is -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1802) Add a new type of request for the discovery of the controller
[ https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi updated KAFKA-1802: Attachment: (was: KAFKA-1802.patch) Add a new type of request for the discovery of the controller - Key: KAFKA-1802 URL: https://issues.apache.org/jira/browse/KAFKA-1802 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 The goal here is like meta data discovery is for producer so CLI can find which broker it should send the rest of its admin requests too. Any broker can respond to this specific AdminMeta RQ/RP but only the controller broker should be responding to Admin message otherwise that broker should respond to any admin message with the response for what the controller is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1802) Add a new type of request for the discovery of the controller
[ https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237954#comment-14237954 ] Andrii Biletskyi commented on KAFKA-1802: - Patch is available under parent ticket - KAFKA-1694 Add a new type of request for the discovery of the controller - Key: KAFKA-1802 URL: https://issues.apache.org/jira/browse/KAFKA-1802 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 The goal here is like meta data discovery is for producer so CLI can find which broker it should send the rest of its admin requests too. Any broker can respond to this specific AdminMeta RQ/RP but only the controller broker should be responding to Admin message otherwise that broker should respond to any admin message with the response for what the controller is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KAFKA-1775) Re-factor TopicCommand into thew handerAdminMessage call
[ https://issues.apache.org/jira/browse/KAFKA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi resolved KAFKA-1775. - Resolution: Implemented Re-factor TopicCommand into thew handerAdminMessage call - Key: KAFKA-1775 URL: https://issues.apache.org/jira/browse/KAFKA-1775 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 kafka-topic.sh should become kafka --topic --everything else the same from the CLI perspective so we need to have the calls from the byte lalery get fed into that same code (few changes as possible called from the handleAdmin call after deducing what Utility[1] it is operating for I think we should not remove the existing kafka-topic.sh and preserve the existing functionality (with as little code duplication as possible) until 0.9 (and there we can remove it after folks have used it for a release or two and feedback and the rest)[2] [1] https://issues.apache.org/jira/browse/KAFKA-1772 [2] https://issues.apache.org/jira/browse/KAFKA-1776 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1802) Add a new type of request for the discovery of the controller
[ https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi updated KAFKA-1802: Attachment: (was: KAFKA-1802_2014-12-08_12:56:03.patch) Add a new type of request for the discovery of the controller - Key: KAFKA-1802 URL: https://issues.apache.org/jira/browse/KAFKA-1802 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 The goal here is like meta data discovery is for producer so CLI can find which broker it should send the rest of its admin requests too. Any broker can respond to this specific AdminMeta RQ/RP but only the controller broker should be responding to Admin message otherwise that broker should respond to any admin message with the response for what the controller is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (KAFKA-1774) REPL and Shell Client for Admin Message RQ/RP
[ https://issues.apache.org/jira/browse/KAFKA-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi reopened KAFKA-1774: - REPL and Shell Client for Admin Message RQ/RP - Key: KAFKA-1774 URL: https://issues.apache.org/jira/browse/KAFKA-1774 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 We should have a REPL we can work in and execute the commands with the arguments. With this we can do: ./kafka.sh --shell kafkaattach cluster -b localhost:9092; kafkadescribe topic sampleTopicNameForExample; the command line version can work like it does now so folks don't have to re-write all of their tooling. kafka.sh --topics --everything the same like kafka-topics.sh is kafka.sh --reassign --everything the same like kafka-reassign-partitions.sh is -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (KAFKA-1775) Re-factor TopicCommand into thew handerAdminMessage call
[ https://issues.apache.org/jira/browse/KAFKA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi reopened KAFKA-1775: - Re-factor TopicCommand into thew handerAdminMessage call - Key: KAFKA-1775 URL: https://issues.apache.org/jira/browse/KAFKA-1775 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 kafka-topic.sh should become kafka --topic --everything else the same from the CLI perspective so we need to have the calls from the byte lalery get fed into that same code (few changes as possible called from the handleAdmin call after deducing what Utility[1] it is operating for I think we should not remove the existing kafka-topic.sh and preserve the existing functionality (with as little code duplication as possible) until 0.9 (and there we can remove it after folks have used it for a release or two and feedback and the rest)[2] [1] https://issues.apache.org/jira/browse/KAFKA-1772 [2] https://issues.apache.org/jira/browse/KAFKA-1776 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1775) Re-factor TopicCommand into thew handerAdminMessage call
[ https://issues.apache.org/jira/browse/KAFKA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237960#comment-14237960 ] Andrii Biletskyi commented on KAFKA-1775: - Patch is available under parent ticket - KAFKA-1694 Re-factor TopicCommand into thew handerAdminMessage call - Key: KAFKA-1775 URL: https://issues.apache.org/jira/browse/KAFKA-1775 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 kafka-topic.sh should become kafka --topic --everything else the same from the CLI perspective so we need to have the calls from the byte lalery get fed into that same code (few changes as possible called from the handleAdmin call after deducing what Utility[1] it is operating for I think we should not remove the existing kafka-topic.sh and preserve the existing functionality (with as little code duplication as possible) until 0.9 (and there we can remove it after folks have used it for a release or two and feedback and the rest)[2] [1] https://issues.apache.org/jira/browse/KAFKA-1772 [2] https://issues.apache.org/jira/browse/KAFKA-1776 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1806) broker can still expose uncommitted data to a consumer
[ https://issues.apache.org/jira/browse/KAFKA-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238136#comment-14238136 ] lokesh Birla commented on KAFKA-1806: - is there any update on this? broker can still expose uncommitted data to a consumer -- Key: KAFKA-1806 URL: https://issues.apache.org/jira/browse/KAFKA-1806 Project: Kafka Issue Type: Bug Components: consumer Affects Versions: 0.8.1.1 Reporter: lokesh Birla Assignee: Neha Narkhede Although following issue: https://issues.apache.org/jira/browse/KAFKA-727 is marked fixed but I still see this issue in 0.8.1.1. I am able to reproducer the issue consistently. [2014-08-18 06:43:58,356] ERROR [KafkaApi-1] Error when processing fetch request for partition [mmetopic4,2] offset 1940029 from consumer with correlation id 21 (kafka.server.Kaf kaApis) java.lang.IllegalArgumentException: Attempt to read with a maximum offset (1818353) less than the start offset (1940029). at kafka.log.LogSegment.read(LogSegment.scala:136) at kafka.log.Log.read(Log.scala:386) at kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSet(KafkaApis.scala:530) at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:476) at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:471) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233) at scala.collection.immutable.Map$Map1.foreach(Map.scala:119) at scala.collection.TraversableLike$class.map(TraversableLike.scala:233) at scala.collection.immutable.Map$Map1.map(Map.scala:107) at kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSets(KafkaApis.scala:471) at kafka.server.KafkaApis$FetchRequestPurgatory.expire(KafkaApis.scala:783) at kafka.server.KafkaApis$FetchRequestPurgatory.expire(KafkaApis.scala:765) at kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.scala:216) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists
Jeff Holoman created KAFKA-1810: --- Summary: Add IP Filtering / Whitelists-Blacklists Key: KAFKA-1810 URL: https://issues.apache.org/jira/browse/KAFKA-1810 Project: Kafka Issue Type: New Feature Components: core, network Reporter: Jeff Holoman Assignee: Jun Rao Priority: Minor While longer-term goals of security in Kafka are on the roadmap there exists some value for the ability to restrict connection to Kafka brokers based on IP address. This is not intended as a replacement for security but more of a precaution against misconfiguration and to provide some level of control to Kafka administrators about who is reading/writing to their cluster. 1) In some organizations software administration vs o/s systems administration and network administration is disjointed and not well choreographed. Providing software administrators the ability to configure their platform relatively independently (after initial configuration) from Systems administrators is desirable. 2) Configuration and deployment is sometimes error prone and there are situations when test environments could erroneously read/write to production environments 3) An additional precaution against reading sensitive data is typically welcomed in most large enterprise deployments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
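As a rough illustration of the proposed feature (not an actual patch), a broker-side check on new connections might look like the sketch below; the configuration format, CIDR support, and where the check hooks into the socket server are all open design questions:

    import java.net.InetAddress;
    import java.net.UnknownHostException;
    import java.util.HashSet;
    import java.util.Set;

    // Hypothetical sketch: accept or reject a connection based on configured address lists.
    public class IpFilter {
        private final Set<InetAddress> allowed;
        private final Set<InetAddress> denied;

        public IpFilter(Set<InetAddress> allowed, Set<InetAddress> denied) {
            this.allowed = allowed;
            this.denied = denied;
        }

        public boolean permits(InetAddress remote) {
            if (denied.contains(remote)) return false;
            return allowed.isEmpty() || allowed.contains(remote);  // empty allow list = allow everyone
        }

        public static void main(String[] args) throws UnknownHostException {
            Set<InetAddress> allow = new HashSet<>();
            allow.add(InetAddress.getByName("10.0.0.5"));
            IpFilter filter = new IpFilter(allow, new HashSet<>());
            System.out.println(filter.permits(InetAddress.getByName("10.0.0.5")));    // true
            System.out.println(filter.permits(InetAddress.getByName("192.168.1.9"))); // false
        }
    }

A production version would likely support CIDR ranges and dynamic reloading of the lists rather than fixed sets of exact addresses.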
[jira] [Assigned] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists
[ https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Holoman reassigned KAFKA-1810: --- Assignee: Jeff Holoman (was: Jun Rao) Add IP Filtering / Whitelists-Blacklists - Key: KAFKA-1810 URL: https://issues.apache.org/jira/browse/KAFKA-1810 Project: Kafka Issue Type: New Feature Components: core, network Reporter: Jeff Holoman Assignee: Jeff Holoman Priority: Minor Fix For: 0.8.3 While longer-term goals of security in Kafka are on the roadmap there exists some value for the ability to restrict connection to Kafka brokers based on IP address. This is not intended as a replacement for security but more of a precaution against misconfiguration and to provide some level of control to Kafka administrators about who is reading/writing to their cluster. 1) In some organizations software administration vs o/s systems administration and network administration is disjointed and not well choreographed. Providing software administrators the ability to configure their platform relatively independently (after initial configuration) from Systems administrators is desirable. 2) Configuration and deployment is sometimes error prone and there are situations when test environments could erroneously read/write to production environments 3) An additional precaution against reading sensitive data is typically welcomed in most large enterprise deployments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists
[ https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Holoman updated KAFKA-1810: Fix Version/s: 0.8.3 Add IP Filtering / Whitelists-Blacklists - Key: KAFKA-1810 URL: https://issues.apache.org/jira/browse/KAFKA-1810 Project: Kafka Issue Type: New Feature Components: core, network Reporter: Jeff Holoman Assignee: Jun Rao Priority: Minor Fix For: 0.8.3 While longer-term goals of security in Kafka are on the roadmap there exists some value for the ability to restrict connection to Kafka brokers based on IP address. This is not intended as a replacement for security but more of a precaution against misconfiguration and to provide some level of control to Kafka administrators about who is reading/writing to their cluster. 1) In some organizations software administration vs o/s systems administration and network administration is disjointed and not well choreographed. Providing software administrators the ability to configure their platform relatively independently (after initial configuration) from Systems administrators is desirable. 2) Configuration and deployment is sometimes error prone and there are situations when test environments could erroneously read/write to production environments 3) An additional precaution against reading sensitive data is typically welcomed in most large enterprise deployments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists
[ https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238179#comment-14238179 ] Jeff Holoman commented on KAFKA-1810: - There was a discussion in KAFKA-1512 regarding the capability for limiting connections by setting max connections to 0. This gives a cleaner way to perform similar functionality Add IP Filtering / Whitelists-Blacklists - Key: KAFKA-1810 URL: https://issues.apache.org/jira/browse/KAFKA-1810 Project: Kafka Issue Type: New Feature Components: core, network Reporter: Jeff Holoman Assignee: Jeff Holoman Priority: Minor Fix For: 0.8.3 While longer-term goals of security in Kafka are on the roadmap there exists some value for the ability to restrict connection to Kafka brokers based on IP address. This is not intended as a replacement for security but more of a precaution against misconfiguration and to provide some level of control to Kafka administrators about who is reading/writing to their cluster. 1) In some organizations software administration vs o/s systems administration and network administration is disjointed and not well choreographed. Providing software administrators the ability to configure their platform relatively independently (after initial configuration) from Systems administrators is desirable. 2) Configuration and deployment is sometimes error prone and there are situations when test environments could erroneously read/write to production environments 3) An additional precaution against reading sensitive data is typically welcomed in most large enterprise deployments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [DISCUSSION] adding the serializer api back to the new java producer
Ok, based on all the feedback that we have heard, I plan to do the following.

1. Keep the generic api in KAFKA-1797.
2. Add a new constructor in Producer/Consumer that takes the key and the value serializer instance.
3. Have KAFKA-1797 reviewed and checked into 0.8.2 and trunk.

This will make it easy for people to reuse common serializers while at the same time allowing people to use the byte array api if they choose to do so. I plan to make those changes in the next couple of days unless someone strongly objects.

Thanks,

Jun

On Fri, Dec 5, 2014 at 5:46 PM, Jiangjie Qin j...@linkedin.com.invalid wrote:

Hi Jun,

Thanks for pointing this out. Yes, putting serialization/deserialization code into the record does lose some flexibility. Thinking about it some more, no matter what we do to bind the producer and serializer/deserializer, we can always do the same thing on the record, i.e. we can also have some constructor like ProducerRecord(Serializer<K, V>, Deserializer<K, V>). The downside of this is that we could potentially have a serializer/deserializer instance for each record (that's actually the very reason I proposed putting the code in the record). This problem could be addressed by using either a singleton class or a factory for the serializer/deserializer library. But it might be a little bit complicated, and we are not able to enforce that on external libraries either. So it only seems to make sense if we really want to:

1. Have a single simple producer interface, AND
2. Use a single producer to send all types of messages.

I'm not sure these requirements are strong enough to make us take on the complexity of a singleton/factory class serializer/deserializer library.

Thanks.

Jiangjie (Becket) Qin

On 12/5/14, 3:16 PM, Jun Rao j...@confluent.io wrote:

Jiangjie,

The issue with adding the serializer in ProducerRecord is that you need to implement all combinations of serializers for key and value. So, instead of just implementing int and string serializers, you will have to implement all 4 combinations.

Adding a new producer constructor like Producer<K, V>(KeySerializer<K>, ValueSerializer<V>, Properties properties) can be useful.

Thanks,

Jun

On Thu, Dec 4, 2014 at 10:33 AM, Jiangjie Qin j...@linkedin.com.invalid wrote:

I'm just thinking that instead of binding serialization with the producer, another option is to bind the serializer/deserializer with ProducerRecord/ConsumerRecord (please see the detailed proposal below). The arguments for this option are:

A. A single producer could send different message types. There are several use cases in LinkedIn for a per-record serializer:
- In Samza, there are some in-stream order-sensitive control messages that have a different deserializer from other messages.
- There are use cases which need support for sending both Avro messages and raw bytes.
- Some use cases need to deserialize some Avro messages into generic records and other messages into specific records.

B. In the current proposal, the serializer/deserializer is instantiated according to config. Compared with that, binding the serializer to ProducerRecord and ConsumerRecord is less error prone.

This option includes the following changes:

A. Add serializer and deserializer interfaces to replace the serializer instance from config.

    public interface Serializer<K, V> {
        public byte[] serializeKey(K key);
        public byte[] serializeValue(V value);
    }

    public interface Deserializer<K, V> {
        public K deserializeKey(byte[] key);
        public V deserializeValue(byte[] value);
    }

B. Make ProducerRecord and ConsumerRecord abstract classes implementing Serializer<K, V> and Deserializer<K, V> respectively.

    public abstract class ProducerRecord<K, V> implements Serializer<K, V> { ... }
    public abstract class ConsumerRecord<K, V> implements Deserializer<K, V> { ... }

C. Instead of instantiating the serializer/deserializer from config, let concrete ProducerRecord/ConsumerRecord classes extend the abstract class and override the serialize/deserialize methods.

    public class AvroProducerRecord extends ProducerRecord<String, GenericRecord> {
        ...
        @Override
        public byte[] serializeKey(String key) { ... }
        @Override
        public byte[] serializeValue(GenericRecord value);
    }

    public class AvroConsumerRecord extends ConsumerRecord<String, GenericRecord> {
        ...
        @Override
        public K deserializeKey(byte[] key) { ... }
        @Override
controller conflict
Hello, We are using Kafka 0.8.1.1 and hit this bug - 1029, /controller ephemeral node conflict. Is it supposed to be fixed, or does it have something to do with 1451? Thanks.
[jira] [Commented] (KAFKA-1044) change log4j to slf4j
[ https://issues.apache.org/jira/browse/KAFKA-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238611#comment-14238611 ] Jon Barksdale commented on KAFKA-1044: -- As a workaround, you can exclude log4j from kafka and just use org.slf4j:log4j-over-slf4j instead. That feeds most log4j calls into slf4j, and then logback or another slf4j binding can be used. change log4j to slf4j -- Key: KAFKA-1044 URL: https://issues.apache.org/jira/browse/KAFKA-1044 Project: Kafka Issue Type: Bug Components: log Affects Versions: 0.8.0 Reporter: sjk Assignee: Jay Kreps Fix For: 0.9.0 Can you change log4j to slf4j? In my project I use logback, and it conflicts with log4j. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1797) add the serializer/deserializer api to the new java client
[ https://issues.apache.org/jira/browse/KAFKA-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede updated KAFKA-1797: - Reviewer: Neha Narkhede add the serializer/deserializer api to the new java client -- Key: KAFKA-1797 URL: https://issues.apache.org/jira/browse/KAFKA-1797 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8.2 Reporter: Jun Rao Assignee: Jun Rao Attachments: kafka-1797.patch Currently, the new java clients take a byte array for both the key and the value. While this api is simple, it pushes the serialization/deserialization logic into the application. This makes it hard to reason about what type of data flows through Kafka and also makes it hard to share an implementation of the serializer/deserializer. For example, to support Avro, the serialization logic could be quite involved since it might need to register the Avro schema in some remote registry and maintain a schema cache locally, etc. Without a serialization api, it's impossible to share such an implementation so that people can easily reuse. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
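To make the motivation concrete, here is a self-contained sketch of the kind of pluggable serializer the ticket argues for; the interface shape is illustrative and not necessarily the one committed under KAFKA-1797:

    import java.nio.charset.StandardCharsets;
    import java.util.Map;

    // Self-contained sketch: a serializer interface the client instantiates, so implementations
    // (String, Avro with a schema registry, ...) can be shared instead of every application
    // hand-rolling its own byte[] conversion.
    interface Serializer<T> {
        void configure(Map<String, ?> configs, boolean isKey);
        byte[] serialize(String topic, T data);
        void close();
    }

    class StringSerializer implements Serializer<String> {
        @Override public void configure(Map<String, ?> configs, boolean isKey) { /* no-op */ }
        @Override public byte[] serialize(String topic, String data) {
            return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
        }
        @Override public void close() { /* nothing to release */ }
    }

With such an interface, an Avro implementation can encapsulate schema-registry lookups and local schema caching, and applications simply configure or pass in the shared implementation.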
[jira] [Commented] (KAFKA-1784) Implement a ConsumerOffsetClient library
[ https://issues.apache.org/jira/browse/KAFKA-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238723#comment-14238723 ] Neha Narkhede commented on KAFKA-1784: -- [~mgharat] Thanks for the patch. Can you add a little more context on the intent and usage of this library? Is it meant for admin usage only, or do you intend to refactor the existing KafkaApis to use this new library? Implement a ConsumerOffsetClient library Key: KAFKA-1784 URL: https://issues.apache.org/jira/browse/KAFKA-1784 Project: Kafka Issue Type: New Feature Reporter: Joel Koshy Assignee: Mayuresh Gharat Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1784.patch I think it would be useful to provide an offset client library. It would make the documentation a lot simpler. Right now it is non-trivial to commit/fetch offsets to/from kafka. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1784) Implement a ConsumerOffsetClient library
[ https://issues.apache.org/jira/browse/KAFKA-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede updated KAFKA-1784: - Reviewer: Neha Narkhede Assigning to myself for review since [~jjkoshy] is on vacation. Implement a ConsumerOffsetClient library Key: KAFKA-1784 URL: https://issues.apache.org/jira/browse/KAFKA-1784 Project: Kafka Issue Type: New Feature Reporter: Joel Koshy Assignee: Mayuresh Gharat Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1784.patch I think it would be useful to provide an offset client library. It would make the documentation a lot simpler. Right now it is non-trivial to commit/fetch offsets to/from kafka. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1784) Implement a ConsumerOffsetClient library
[ https://issues.apache.org/jira/browse/KAFKA-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238736#comment-14238736 ] Mayuresh Gharat commented on KAFKA-1784: Yes. We are planning to do that. Actually the patch for KAFKA-1013 has those. Currently this patch can be used for admin usage and then we will get back to KAFKA-1013 to make it work with KafkaApis and Consumer. Implement a ConsumerOffsetClient library Key: KAFKA-1784 URL: https://issues.apache.org/jira/browse/KAFKA-1784 Project: Kafka Issue Type: New Feature Reporter: Joel Koshy Assignee: Mayuresh Gharat Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1784.patch I think it would be useful to provide an offset client library. It would make the documentation a lot simpler. Right now it is non-trivial to commit/fetch offsets to/from kafka. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [DISCUSSION] adding the serializer api back to the new java producer
Thank you Jay. I agree with the issue that you point out w.r.t. paired serializers. I also think having mixed serialization types is rare. To get the current behavior, one can simply use a ByteArraySerializer. This is best understood by talking with many customers, and you seem to have done that. I am convinced about the change. For the rest who gave -1 or 0 for this proposal, do the answers for the three points (updated) below seem reasonable? Are these explanations convincing? 1. Can we keep the serialization semantics outside the Producer interface and have simple bytes in / bytes out for the interface (this is what we have today)? The point for this is to keep the interface simple and usage easy to understand. The point against this is that it gets hard to share common usage patterns around serialization/message validation in the future. 2. Can we create a wrapper producer that does the serialization and have different variants of it for different data formats? The point for this is again to keep the main API clean. The point against this is that it duplicates the API, increases the surface area and creates redundancy for a minor addition. 3. Do we need to support different data types per record? The current interface (bytes in/bytes out) lets you instantiate one producer and use it to send multiple data formats. There seem to be some valid use cases for this. Mixed serialization types are rare based on interactions with customers. To get the current behavior, one can simply use a ByteArraySerializer. On 12/5/14 5:00 PM, Jay Kreps j...@confluent.io wrote: Hey Sriram, Thanks! I think this is a very helpful summary. Let me try to address your point about passing in the serde at send time. I think the first objection is really to the paired key/value serializer interfaces. This leads to kind of a weird combinatorial thing where you would have an avro/avro serializer, a string/avro serializer, a pb/pb serializer, and a string/pb serializer, and so on. But your proposal would work as well with separate serializers for key and value. I think the downside is just the one you call out--that this is a corner case and you end up with two versions of all the apis to support it. This also makes the serializer api more annoying to implement. I think the alternative solution to this case and any other we can give people is just configuring ByteArraySerializer, which gives you basically the api that you have now with byte arrays. If this is incredibly common then this would be a silly solution, but I guess the belief is that these cases are rare and a really well implemented avro or json serializer should be 100% of what most people need. In practice the cases that actually mix serialization types in a single stream are pretty rare, I think, just because the consumer then has the problem of guessing how to deserialize, so most of these will end up with at least some marker or schema id or whatever that tells you how to read the data. Arguably, this mixed serialization with a marker is itself a serializer type and should have a serializer of its own... -Jay On Fri, Dec 5, 2014 at 3:48 PM, Sriram Subramanian srsubraman...@linkedin.com.invalid wrote: This thread has diverged multiple times now and it would be worth summarizing them. There seem to be the following points of discussion - 1. Can we keep the serialization semantics outside the Producer interface and have simple bytes in / bytes out for the interface (this is what we have today)? 
The point for this is to keep the interface simple and usage easy to understand. The point against this is that it gets hard to share common usage patterns around serialization/message validation in the future. 2. Can we create a wrapper producer that does the serialization and have different variants of it for different data formats? The point for this is again to keep the main API clean. The point against this is that it duplicates the API, increases the surface area and creates redundancy for a minor addition. 3. Do we need to support different data types per record? The current interface (bytes in/bytes out) lets you instantiate one producer and use it to send multiple data formats. There seem to be some valid use cases for this. I have still not seen a strong argument for not having this functionality. Can someone provide their views on why we don't need this support that is possible with the current API? One possible approach for the per-record serialization would be to define public interface SerDe<K, V> { public byte[] serializeKey(); public K deserializeKey(); public byte[] serializeValue(); public V deserializeValue(); } This would be used by both the Producer and the Consumer. The send APIs can then be public Future<RecordMetadata> send(ProducerRecord<K, V> record); public Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback); public Future<RecordMetadata> send(ProducerRecord<K, V>
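As a reading aid, here is a small, self-contained sketch of the per-record SerDe idea from point 3 above. The method signatures take explicit arguments (unlike the rough interface in the email), and every name is an illustrative placeholder rather than an agreed API.

import java.nio.charset.StandardCharsets;

// Sketch of a per-record serde: the caller pairs each record with the serde that knows
// how to turn it into bytes, so one bytes-oriented producer can carry mixed formats.
interface SerDe<K, V> {
    byte[] serializeKey(K key);
    K deserializeKey(byte[] key);
    byte[] serializeValue(V value);
    V deserializeValue(byte[] value);
}

class StringSerDe implements SerDe<String, String> {
    public byte[] serializeKey(String key) { return key.getBytes(StandardCharsets.UTF_8); }
    public String deserializeKey(byte[] key) { return new String(key, StandardCharsets.UTF_8); }
    public byte[] serializeValue(String value) { return value.getBytes(StandardCharsets.UTF_8); }
    public String deserializeValue(byte[] value) { return new String(value, StandardCharsets.UTF_8); }
}

class SerDeDemo {
    public static void main(String[] args) {
        SerDe<String, String> serde = new StringSerDe();
        byte[] keyBytes = serde.serializeKey("user-42");
        byte[] valueBytes = serde.serializeValue("hello");
        // A hypothetical send(record, serde) overload would do exactly this conversion per
        // call, which is how one producer instance could mix Avro, JSON, raw bytes, etc.
        System.out.println(serde.deserializeKey(keyBytes) + " -> " + serde.deserializeValue(valueBytes));
    }
}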
[jira] [Created] (KAFKA-1811) ensuring registered broker host:port is unique
Jun Rao created KAFKA-1811: -- Summary: ensuring registered broker host:port is unique Key: KAFKA-1811 URL: https://issues.apache.org/jira/browse/KAFKA-1811 Project: Kafka Issue Type: Improvement Reporter: Jun Rao Currently, we expect each of the registered brokers to have a unique host:port pair. However, we don't enforce that, which causes various weird problems. It would be useful to ensure this during broker registration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-1812) Allow IpV6 in configuration with parseCsvMap
Jeff Holoman created KAFKA-1812: --- Summary: Allow IpV6 in configuration with parseCsvMap Key: KAFKA-1812 URL: https://issues.apache.org/jira/browse/KAFKA-1812 Project: Kafka Issue Type: Bug Reporter: Jeff Holoman Assignee: Jeff Holoman Priority: Minor Fix For: 0.8.3 The current implementation of parseCsvMap in Utils expects k:v,k:v. This modifies that function to accept a string with multiple : characters and split on the last occurrence per pair. This limitation is noted in the Reviewboard comments for KAFKA-1512 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
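A rough illustration of the parsing change the ticket describes: split each comma-separated pair on the last ':' so an IPv6 address in the key keeps its colons. This is a sketch of the idea only, not the actual Utils.parseCsvMap implementation.

import java.util.HashMap;
import java.util.Map;

final class CsvMapSketch {
    // Parses "k:v,k:v" where the key itself may contain ':' (e.g. an IPv6 literal);
    // only the last ':' in each pair separates key from value.
    static Map<String, String> parseCsvMap(String str) {
        Map<String, String> map = new HashMap<>();
        if (str == null || str.trim().isEmpty()) return map;
        for (String pair : str.split(",")) {
            int idx = pair.lastIndexOf(':');
            map.put(pair.substring(0, idx).trim(), pair.substring(idx + 1).trim());
        }
        return map;
    }

    public static void main(String[] args) {
        // The IPv6 key "::1" survives intact; the value is whatever follows the last ':'.
        System.out.println(parseCsvMap("::1:100, 192.168.1.5:200"));
        // e.g. {::1=100, 192.168.1.5=200} (map ordering may vary)
    }
}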
[jira] [Updated] (KAFKA-1812) Allow IpV6 in configuration with parseCsvMap
[ https://issues.apache.org/jira/browse/KAFKA-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede updated KAFKA-1812: - Labels: newbie (was: ) Allow IpV6 in configuration with parseCsvMap - Key: KAFKA-1812 URL: https://issues.apache.org/jira/browse/KAFKA-1812 Project: Kafka Issue Type: Bug Reporter: Jeff Holoman Assignee: Jeff Holoman Priority: Minor Labels: newbie Fix For: 0.8.3 The current implementation of parseCsvMap in Utils expects k:v,k:v. This modifies that function to accept a string with multiple : characters and split on the last occurrence per pair. This limitation is noted in the Reviewboard comments for KAFKA-1512 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1811) ensuring registered broker host:port is unique
[ https://issues.apache.org/jira/browse/KAFKA-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede updated KAFKA-1811: - Labels: newbie (was: ) ensuring registered broker host:port is unique -- Key: KAFKA-1811 URL: https://issues.apache.org/jira/browse/KAFKA-1811 Project: Kafka Issue Type: Improvement Reporter: Jun Rao Labels: newbie Currently, we expect each of the registered brokers to have a unique host:port pair. However, we don't enforce that, which causes various weird problems. It would be useful to ensure this during broker registration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists
[ https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238985#comment-14238985 ] Neha Narkhede commented on KAFKA-1810: -- [~jholoman] I'm not sure I understood what you are proposing. Can you be more specific about the changes you are proposing? Add IP Filtering / Whitelists-Blacklists - Key: KAFKA-1810 URL: https://issues.apache.org/jira/browse/KAFKA-1810 Project: Kafka Issue Type: New Feature Components: core, network Reporter: Jeff Holoman Assignee: Jeff Holoman Priority: Minor Fix For: 0.8.3 While longer-term goals of security in Kafka are on the roadmap, there is some value in the ability to restrict connections to Kafka brokers based on IP address. This is not intended as a replacement for security but more of a precaution against misconfiguration and to provide some level of control to Kafka administrators about who is reading/writing to their cluster. 1) In some organizations, software administration vs. o/s systems administration and network administration is disjointed and not well choreographed. Providing software administrators the ability to configure their platform relatively independently (after initial configuration) from systems administrators is desirable. 2) Configuration and deployment are sometimes error-prone, and there are situations when test environments could erroneously read/write to production environments. 3) An additional precaution against reading sensitive data is typically welcomed in most large enterprise deployments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1806) broker can still expose uncommitted data to a consumer
[ https://issues.apache.org/jira/browse/KAFKA-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14238987#comment-14238987 ] Neha Narkhede commented on KAFKA-1806: -- [~lokeshbirla] Can you please provide the steps to reproduce this issue? broker can still expose uncommitted data to a consumer -- Key: KAFKA-1806 URL: https://issues.apache.org/jira/browse/KAFKA-1806 Project: Kafka Issue Type: Bug Components: consumer Affects Versions: 0.8.1.1 Reporter: lokesh Birla Assignee: Neha Narkhede Although the following issue: https://issues.apache.org/jira/browse/KAFKA-727 is marked fixed, I still see this issue in 0.8.1.1. I am able to reproduce the issue consistently. [2014-08-18 06:43:58,356] ERROR [KafkaApi-1] Error when processing fetch request for partition [mmetopic4,2] offset 1940029 from consumer with correlation id 21 (kafka.server.KafkaApis) java.lang.IllegalArgumentException: Attempt to read with a maximum offset (1818353) less than the start offset (1940029). at kafka.log.LogSegment.read(LogSegment.scala:136) at kafka.log.Log.read(Log.scala:386) at kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSet(KafkaApis.scala:530) at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:476) at kafka.server.KafkaApis$$anonfun$kafka$server$KafkaApis$$readMessageSets$1.apply(KafkaApis.scala:471) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233) at scala.collection.immutable.Map$Map1.foreach(Map.scala:119) at scala.collection.TraversableLike$class.map(TraversableLike.scala:233) at scala.collection.immutable.Map$Map1.map(Map.scala:107) at kafka.server.KafkaApis.kafka$server$KafkaApis$$readMessageSets(KafkaApis.scala:471) at kafka.server.KafkaApis$FetchRequestPurgatory.expire(KafkaApis.scala:783) at kafka.server.KafkaApis$FetchRequestPurgatory.expire(KafkaApis.scala:765) at kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.scala:216) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: controller conflict
It's hard to say without any details. But you are free to try 0.8.2-beta as both of those JIRAs should be fixed in this version. On Mon, Dec 8, 2014 at 2:29 PM, Kane Kim kane.ist...@gmail.com wrote: Hello, We are using kafka 0.8.1.1 and hit this bug - 1029, /controller ephemeral node conflict. Is it supposed to be fixed or has something to do with 1451? Thanks. -- Thanks, Neha
[jira] [Commented] (KAFKA-1810) Add IP Filtering / Whitelists-Blacklists
[ https://issues.apache.org/jira/browse/KAFKA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14239020#comment-14239020 ] Jeff Holoman commented on KAFKA-1810: - [~nehanarkhede] Sure, no problem. I had a request to provide the ability to specify a range of IP addresses to either include or exclude. I was thinking the easiest way would be to specify IP addresses in CIDR notation and include them in server.properties, such as 192.168.2.0/24:allow, 192.168.1.0/16:deny. This would allow an administrator to accept/deny connections based on IP ranges. Does that clarify? Add IP Filtering / Whitelists-Blacklists - Key: KAFKA-1810 URL: https://issues.apache.org/jira/browse/KAFKA-1810 Project: Kafka Issue Type: New Feature Components: core, network Reporter: Jeff Holoman Assignee: Jeff Holoman Priority: Minor Fix For: 0.8.3 While longer-term goals of security in Kafka are on the roadmap, there is some value in the ability to restrict connections to Kafka brokers based on IP address. This is not intended as a replacement for security but more of a precaution against misconfiguration and to provide some level of control to Kafka administrators about who is reading/writing to their cluster. 1) In some organizations, software administration vs. o/s systems administration and network administration is disjointed and not well choreographed. Providing software administrators the ability to configure their platform relatively independently (after initial configuration) from systems administrators is desirable. 2) Configuration and deployment are sometimes error-prone, and there are situations when test environments could erroneously read/write to production environments. 3) An additional precaution against reading sensitive data is typically welcomed in most large enterprise deployments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
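A rough sketch of the CIDR allow/deny check described in the comment above. The config format ("192.168.2.0/24:allow, 192.168.1.0/16:deny") comes from the proposal; the class and method names are illustrative only, and this is not an actual Kafka implementation (IPv4 only, no rule-list evaluation).

import java.net.InetAddress;
import java.net.UnknownHostException;

final class CidrRule {
    final int network;     // network address as an int
    final int mask;        // e.g. /24 -> 0xFFFFFF00
    final boolean allow;

    CidrRule(String cidr, boolean allow) throws UnknownHostException {
        String[] parts = cidr.split("/");
        this.network = toInt(InetAddress.getByName(parts[0]));
        int prefix = Integer.parseInt(parts[1]);
        this.mask = prefix == 0 ? 0 : -1 << (32 - prefix);
        this.allow = allow;
    }

    boolean matches(InetAddress addr) {
        return (toInt(addr) & mask) == (network & mask);
    }

    private static int toInt(InetAddress addr) {
        byte[] b = addr.getAddress();          // assumes a 4-byte IPv4 address in this sketch
        return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16) | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
    }

    public static void main(String[] args) throws UnknownHostException {
        CidrRule allowRule = new CidrRule("192.168.2.0/24", true);
        System.out.println(allowRule.matches(InetAddress.getByName("192.168.2.17"))); // true
        System.out.println(allowRule.matches(InetAddress.getByName("10.0.0.1")));     // false
    }
}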
[jira] [Commented] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost
[ https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14239063#comment-14239063 ] Bhavesh Mistry commented on KAFKA-1642: --- [~stevenz3wu], 0.8.2 is very well tested and has worked well under heavy load. This bug is rare and only happens when the broker or network has issues. We have been producing about 7 to 10 TB per day using this new producer, so 0.8.2 is very safe to use in production. It has survived the peak traffic of the year on a large e-commerce site. So I am fairly confident that the new Java API indeed does true round-robin and is much faster than the Scala-based API. [~ewencp], I will verify the patch by the end of this Friday, but do let me know your understanding based on my last comment. Thanks, Bhavesh [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost --- Key: KAFKA-1642 URL: https://issues.apache.org/jira/browse/KAFKA-1642 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Blocker Fix For: 0.8.2 Attachments: 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, KAFKA-1642.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, KAFKA-1642_2014-10-23_16:19:41.patch I see my CPU spike to 100% when the network connection is lost for a while. It seems the network IO threads are very busy logging the following error message. Is this expected behavior? 2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: java.lang.IllegalStateException: No entry found for node -2 at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110) at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99) at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394) at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115) at java.lang.Thread.run(Thread.java:744) Thanks, Bhavesh -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost
[ https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14239063#comment-14239063 ] Bhavesh Mistry edited comment on KAFKA-1642 at 12/9/14 6:53 AM: [~stevenz3wu], 0.8.2 is very well tested and has worked well under heavy load. This bug is rare and only happens when the broker or network has issues. We have been producing about 7 to 10 TB per day using this new producer, so 0.8.2 is very safe to use in production. It has survived the peak traffic of the year on a large e-commerce site. So I am fairly confident that the new Java API indeed does true round-robin and is much faster than the Scala-based API. [~ewencp], I will verify the patch by the end of this Friday, but do let me know your understanding based on my last comment. The goal is to put this issue to rest and cover all the use cases. Thanks, Bhavesh was (Author: bmis13): [~stevenz3wu], 0.8.2 is very well tested and has worked well under heavy load. This bug is rare and only happens when the broker or network has issues. We have been producing about 7 to 10 TB per day using this new producer, so 0.8.2 is very safe to use in production. It has survived the peak traffic of the year on a large e-commerce site. So I am fairly confident that the new Java API indeed does true round-robin and is much faster than the Scala-based API. [~ewencp], I will verify the patch by the end of this Friday, but do let me know your understanding based on my last comment. Thanks, Bhavesh [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost --- Key: KAFKA-1642 URL: https://issues.apache.org/jira/browse/KAFKA-1642 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Blocker Fix For: 0.8.2 Attachments: 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, KAFKA-1642.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, KAFKA-1642_2014-10-23_16:19:41.patch I see my CPU spike to 100% when the network connection is lost for a while. It seems the network IO threads are very busy logging the following error message. Is this expected behavior? 2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: java.lang.IllegalStateException: No entry found for node -2 at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110) at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99) at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394) at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115) at java.lang.Thread.run(Thread.java:744) Thanks, Bhavesh -- This message was sent by Atlassian JIRA (v6.3.4#6332)