[jira] [Commented] (KAFKA-1772) Add an Admin message type for request response
[ https://issues.apache.org/jira/browse/KAFKA-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226259#comment-14226259 ] Andrii Biletskyi commented on KAFKA-1772:
---
[~junrao]: Thanks for your feedback! My vision:
1. Yes, it looks like the utility+command split fits only the topic command flawlessly. Still, I like the idea of separating request types at the utility level - it gives some structure to the large number of commands. I.e. limit the possible utilities to those mentioned in the ticket, but do not regulate which commands are shared among utilities.
2. Not sure about that; it is still being discussed whether we should use JSON for this (because of the third-party JSON lib dependency). Maybe we can also represent args in a simple byte format, as all current requests do.
3. Since all commands are really subtypes of AdminRequest, we can't have a specific response for each command. Currently AdminResponse is just an outcome string (and optionally an error code). So if a mutating command succeeds, an empty response is returned; for a list/describe command, the description is returned in the outcome string.
4. No final decision here. It's proposed to make all commands async on the broker side and leave it to the client to block, executing verify (or some other method) to check that the command has completed. When commands are called from the CLI, this logic will of course be plugged into the CLI code, so the user (if he wants) can experience such commands as blocking.

Add an Admin message type for request response
--
Key: KAFKA-1772
URL: https://issues.apache.org/jira/browse/KAFKA-1772
Project: Kafka
Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
Fix For: 0.8.3
Attachments: KAFKA-1772.patch

- utility int8
- command int8
- format int8
- args variable length bytes

utility
0 - Broker
1 - Topic
2 - Replication
3 - Controller
4 - Consumer
5 - Producer

Command
0 - Create
1 - Alter
3 - Delete
4 - List
5 - Audit

format
0 - JSON

args e.g. (which would equate to the data structure values == 2,1,0)

{code}
meta-store: {
  {zookeeper: localhost:12913/kafka}
}
args: {
  partitions: [
    {topic: topic1, partition: 0},
    {topic: topic1, partition: 1},
    {topic: topic1, partition: 2},
    {topic: topic2, partition: 0},
    {topic: topic2, partition: 1},
  ]
}
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
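The proposed fixed header (utility, command, and format as int8, followed by variable-length args) can be sketched with a plain ByteBuffer. This is an illustrative encoder only - `AdminRequestSketch` and the length-prefix convention are assumptions for the example, not the actual Kafka request classes:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical encoder for the proposed admin request header:
// utility int8, command int8, format int8, then length-prefixed args bytes.
public class AdminRequestSketch {
    public static ByteBuffer encode(byte utility, byte command, byte format, String args) {
        byte[] argBytes = args.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(3 + 4 + argBytes.length);
        buf.put(utility);            // e.g. 2 = Replication
        buf.put(command);            // e.g. 1 = Alter
        buf.put(format);             // 0 = JSON
        buf.putInt(argBytes.length); // assumed length prefix for the variable-length args
        buf.put(argBytes);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        // The ticket's (2,1,0) example: Replication utility, Alter command, JSON format.
        ByteBuffer buf = encode((byte) 2, (byte) 1, (byte) 0, "{}");
        System.out.println("encoded " + buf.remaining() + " bytes");
    }
}
```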
[jira] [Commented] (KAFKA-1774) REPL and Shell Client for Admin Message RQ/RP
[ https://issues.apache.org/jira/browse/KAFKA-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226355#comment-14226355 ] Andrii Biletskyi commented on KAFKA-1774:
---
[~junrao]: thanks for noting the dependency aspect. I think now it's even better to have all admin cmd code placed under a separate project, ./tools. It seems reasonable that users will be interested in a thin, simple jar to run some admin commands through the CLI, without pulling in all the ./core classes and stuff.

REPL and Shell Client for Admin Message RQ/RP
--
Key: KAFKA-1774
URL: https://issues.apache.org/jira/browse/KAFKA-1774
Project: Kafka
Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
Fix For: 0.8.3

We should have a REPL we can work in and execute the commands with the arguments. With this we can do:

./kafka.sh --shell
kafka>attach cluster -b localhost:9092;
kafka>describe topic sampleTopicNameForExample;

The command line version can work like it does now so folks don't have to re-write all of their tooling:

kafka.sh --topics --everything the same like kafka-topics.sh is
kafka.sh --reassign --everything the same like kafka-reassign-partitions.sh is

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
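The shell interaction above can be sketched as a minimal read-eval-print loop. The `attach`/`describe` command names follow the ticket's examples, but the parsing and dispatch here are purely illustrative assumptions, not Kafka code:

```java
import java.util.Scanner;

// Minimal sketch of the proposed admin shell loop. Real command handling
// would translate each line into an AdminRequest; here dispatch() just
// echoes what it parsed so the loop structure is visible.
public class AdminShellSketch {
    public static String dispatch(String line) {
        String[] parts = line.trim().replaceAll(";$", "").split("\\s+");
        switch (parts[0]) {
            case "attach":   return "attached to " + parts[parts.length - 1];
            case "describe": return "describe " + parts[1] + " " + parts[2];
            default:         return "unknown command: " + parts[0];
        }
    }

    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        System.out.print("kafka>");
        while (in.hasNextLine()) {
            System.out.println(dispatch(in.nextLine()));
            System.out.print("kafka>");
        }
    }
}
```

Keeping the loop this thin is what makes the proposed standalone ./tools jar attractive: the shell needs only a network client, not the ./core classes.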
[jira] [Updated] (KAFKA-1801) Remove non-functional variable definition in log4j.properties
[ https://issues.apache.org/jira/browse/KAFKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raman Gupta updated KAFKA-1801:
---
Fix Version/s: 0.8.1
Status: Patch Available (was: Open)

In log4j.properties, a property kafka.logs.dir was defined. However, modifying this property has no effect because log4j.properties always uses the System property set in bin/kafka-run-class.sh before the locally set one.

Remove non-functional variable definition in log4j.properties
--
Key: KAFKA-1801
URL: https://issues.apache.org/jira/browse/KAFKA-1801
Project: Kafka
Issue Type: Improvement
Components: log
Affects Versions: 0.8.2
Reporter: Raman Gupta
Assignee: Jay Kreps
Priority: Trivial
Labels: easyfix, patch
Fix For: 0.8.1
Original Estimate: 5m
Remaining Estimate: 5m

In log4j.properties, a property kafka.logs.dir is defined. However, modifying this property has no effect because log4j will always use the system property defined in kafka-run-class.sh before using the locally defined property in log4j.properties. Therefore, it's probably less confusing to simply remove this property from here. See http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PropertyConfigurator.html for the property search order (system property first, locally defined property second). An alternative solution: remove the system property from kafka-run-class.sh and keep the one here.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
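The search order described above (JVM system property first, locally defined property second) can be demonstrated with a small sketch. The `resolve()` helper is illustrative, not log4j's actual implementation, but it mirrors the documented precedence:

```java
import java.util.Properties;

// Demonstrates why editing kafka.logs.dir in log4j.properties has no
// effect: a JVM system property (set by bin/kafka-run-class.sh via -D)
// wins over the value in the properties file.
public class PropertyPrecedence {
    public static String resolve(String key, Properties fileProps) {
        String fromSystem = System.getProperty(key); // checked first
        if (fromSystem != null) return fromSystem;
        return fileProps.getProperty(key);           // local value is only a fallback
    }

    public static void main(String[] args) {
        Properties fileProps = new Properties();
        fileProps.setProperty("kafka.logs.dir", "logs");            // value in log4j.properties
        System.setProperty("kafka.logs.dir", "/var/log/kafka");     // set on the command line
        System.out.println(resolve("kafka.logs.dir", fileProps));   // the system property wins
    }
}
```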
[jira] [Updated] (KAFKA-1796) Sanity check partition command line tools
[ https://issues.apache.org/jira/browse/KAFKA-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-1796:
---
Description: We need to sanity check that the input JSON has valid values before triggering the admin process. For example, we have seen a scenario where the JSON input for the partition reassignment tool had partition replica info of {broker-1, broker-1, broker-2}, and it was still accepted in ZK, eventually leading to an under-replicated partition count, etc. This is partly because we use a Map rather than a Set when reading the JSON input for this case; but in general we need to make sure that input parameters like the JSON are valid before writing them to ZK.
(was: We need to sanity check the input json has the valid values (for example, the replica list does not have duplicate broker ids, etc) before triggering the admin process.)

Sanity check partition command line tools
--
Key: KAFKA-1796
URL: https://issues.apache.org/jira/browse/KAFKA-1796
Project: Kafka
Issue Type: Bug
Reporter: Guozhang Wang
Labels: newbie
Fix For: 0.8.3

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
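The check described above can be sketched as follows. `ReassignmentSanityCheck` is a hypothetical helper, not the actual tool code; the point is that collecting the replica list into a Set (rather than a Map keyed by broker) makes a duplicate broker id visible instead of silently collapsing it:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sanity check run before any reassignment JSON is written
// to ZK: reject empty replica lists and lists with duplicate broker ids
// (the {broker-1, broker-1, broker-2} case from the description).
public class ReassignmentSanityCheck {
    public static void validateReplicas(String topic, int partition, List<Integer> replicas) {
        if (replicas.isEmpty())
            throw new IllegalArgumentException(
                "Empty replica list for " + topic + "-" + partition);
        Set<Integer> unique = new HashSet<>(replicas);
        if (unique.size() != replicas.size())
            throw new IllegalArgumentException(
                "Duplicate broker id in replica list for " + topic + "-" + partition + ": " + replicas);
    }

    public static void main(String[] args) {
        validateReplicas("topic1", 0, Arrays.asList(0, 1, 2)); // valid: passes silently
        System.out.println("validation passed");
    }
}
```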
[jira] [Commented] (KAFKA-1796) Sanity check partition command line tools
[ https://issues.apache.org/jira/browse/KAFKA-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226440#comment-14226440 ] Guozhang Wang commented on KAFKA-1796:
---
Updated the description; this may also be related to [~joestein]'s proposal for admin requests.

Sanity check partition command line tools
--
Key: KAFKA-1796
URL: https://issues.apache.org/jira/browse/KAFKA-1796
Project: Kafka
Issue Type: Bug
Reporter: Guozhang Wang
Labels: newbie
Fix For: 0.8.3

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (KAFKA-1801) Remove non-functional variable definition in log4j.properties
[ https://issues.apache.org/jira/browse/KAFKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raman Gupta updated KAFKA-1801:
---
Comment: was deleted (was: In log4j.properties, a property kafka.logs.dir was defined. However, modifying this property has no effect because log4j.properties always uses the System property set in bin/kafka-run-class.sh before the locally set one.)

Remove non-functional variable definition in log4j.properties
--
Key: KAFKA-1801
URL: https://issues.apache.org/jira/browse/KAFKA-1801
Project: Kafka
Issue Type: Improvement
Components: log
Affects Versions: 0.8.2
Reporter: Raman Gupta
Assignee: Jay Kreps
Priority: Trivial
Labels: easyfix, patch
Fix For: 0.8.2
Original Estimate: 5m
Remaining Estimate: 5m

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1801) Remove non-functional variable definition in log4j.properties
[ https://issues.apache.org/jira/browse/KAFKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raman Gupta updated KAFKA-1801:
---
Fix Version/s: 0.8.2 (was: 0.8.1)
Status: Patch Available (was: Open)

Remove non-functional variable definition in log4j.properties
--
Key: KAFKA-1801
URL: https://issues.apache.org/jira/browse/KAFKA-1801
Project: Kafka
Issue Type: Improvement
Components: log
Affects Versions: 0.8.2
Reporter: Raman Gupta
Assignee: Jay Kreps
Priority: Trivial
Labels: easyfix, patch
Fix For: 0.8.2
Original Estimate: 5m
Remaining Estimate: 5m

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1801) Remove non-functional variable definition in log4j.properties
[ https://issues.apache.org/jira/browse/KAFKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raman Gupta updated KAFKA-1801:
---
Status: Open (was: Patch Available)

Remove non-functional variable definition in log4j.properties
--
Key: KAFKA-1801
URL: https://issues.apache.org/jira/browse/KAFKA-1801
Project: Kafka
Issue Type: Improvement
Components: log
Affects Versions: 0.8.2
Reporter: Raman Gupta
Assignee: Jay Kreps
Priority: Trivial
Labels: easyfix, patch
Fix For: 0.8.1
Original Estimate: 5m
Remaining Estimate: 5m

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost
[ https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226751#comment-14226751 ] Bhavesh Mistry edited comment on KAFKA-1642 at 11/26/14 8:08 PM:
---
[~ewencp], Even with the following parameters set to long values, the state of the system is still impacted; it does not matter what reconnect.backoff.ms and retry.backoff.ms are set to. Once the node state is removed, the timeout is set to 0. Please see the following logs.

# 15 minutes
reconnect.backoff.ms=90
retry.backoff.ms=90

{code}
2014-11-26 11:01:27.898 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:02:27.903 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:03:27.903 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:04:27.903 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:05:27.904 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:06:27.905 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:07:27.906 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:08:27.908 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:09:27.908 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:10:27.909 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:11:27.909 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:12:27.910 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:13:27.911 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:14:27.912 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:15:27.914 Kafka Drop message topic=.rawlog org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms.
2014-11-26 11:00:27.613 [kafka-producer-network-thread | heartbeat] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread:
2014-11-26 11:00:27.613 [kafka-producer-network-thread | rawlog] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread:
java.lang.IllegalStateException: No entry found for node -1
	at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:131)
	at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:120)
	at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:407)
	at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:393)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:187)
	at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
	at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
	at java.lang.Thread.run(Thread.java:744)
java.lang.IllegalStateException: No entry found for node -3
	at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:131)
	at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:120)
	at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:407)
	at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:393)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:187)
	at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
	at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
	at java.lang.Thread.run(Thread.java:744)
2014-11-26 11:00:27.613 [kafka-producer-network-thread | heartbeat] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread:
2014-11-26 11:00:27.613 [kafka-producer-network-thread | error] ERROR
{code}
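The logs above show the reconnect delay collapsing to 0 once a node's connection-state entry is gone ("No entry found for node -1"), which produces the tight retry loop. A defensive sketch of a reconnect gate that honors the configured backoff even for nodes with no recorded state - illustrative only, not the actual NetworkClient fix:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative reconnect gate: a connection attempt to any node,
// including one whose state entry was removed, is allowed only if at
// least reconnectBackoffMs has passed since the last attempt.
public class ReconnectGate {
    private final long reconnectBackoffMs;
    private final Map<Integer, Long> lastAttemptMs = new HashMap<>();

    public ReconnectGate(long reconnectBackoffMs) {
        this.reconnectBackoffMs = reconnectBackoffMs;
    }

    public boolean canConnect(int nodeId, long nowMs) {
        Long last = lastAttemptMs.get(nodeId);
        if (last != null && nowMs - last < reconnectBackoffMs)
            return false;                 // still backing off, do not spin
        lastAttemptMs.put(nodeId, nowMs); // record this attempt
        return true;
    }

    public static void main(String[] args) {
        ReconnectGate gate = new ReconnectGate(1000);
        System.out.println(gate.canConnect(-1, 0));    // first attempt allowed
        System.out.println(gate.canConnect(-1, 500));  // within backoff, denied
    }
}
```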
[jira] [Updated] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics
[ https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-1800:
---
Attachment: KAFKA-1800.patch

KafkaException was not recorded at the per-topic metrics
--
Key: KAFKA-1800
URL: https://issues.apache.org/jira/browse/KAFKA-1800
Project: Kafka
Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
Fix For: 0.9.0
Attachments: KAFKA-1800.patch

When a KafkaException is thrown from the producer.send() call, it is not recorded in the per-topic record-error-rate, only in the global error-rate.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics
[ https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-1800:
---
Status: Patch Available (was: Open)

KafkaException was not recorded at the per-topic metrics
--
Key: KAFKA-1800
URL: https://issues.apache.org/jira/browse/KAFKA-1800
Project: Kafka
Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
Fix For: 0.9.0
Attachments: KAFKA-1800.patch

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 28479: Fix KAFKA-1800
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28479/
---

Review request for kafka.

Bugs: KAFKA-1800
    https://issues.apache.org/jira/browse/KAFKA-1800

Repository: kafka

Description
---
1. Add the logic for recording the per-topic record error rate when KafkaProducer handles thrown KafkaExceptions;
2. Move the metrics registration out of the send-request path, since in some corner cases the request may never be sent and hence the per-topic metrics would never be registered.

Diffs
---
clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 32f444ebbd27892275af7a0947b86a6b8317a374
clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 84a7a07269c51ccc22ebb4ff9797292d07ba778e

Diff: https://reviews.apache.org/r/28479/diff/

Testing
---

Thanks,
Guozhang Wang
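The logic described in the review request can be sketched with plain counters. The map stands in for Kafka's real Sensor/metrics machinery, and `ErrorMetricsSketch` is a hypothetical name, but it shows the two points of the patch: record the failure on both the per-topic and global counters, and register the per-topic counter lazily so it exists even when the request was never sent:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative per-topic + global error accounting for a producer send
// failure, including failures that happen before the request is sent.
public class ErrorMetricsSketch {
    private final Map<String, Integer> perTopicErrors = new ConcurrentHashMap<>();
    private int globalErrors = 0;

    public void recordError(String topic) {
        perTopicErrors.merge(topic, 1, Integer::sum); // registers the topic on first error
        globalErrors++;                               // the global rate was already recorded pre-patch
    }

    public int errors(String topic) { return perTopicErrors.getOrDefault(topic, 0); }
    public int globalErrors() { return globalErrors; }

    public static void main(String[] args) {
        ErrorMetricsSketch metrics = new ErrorMetricsSketch();
        metrics.recordError("rawlog");
        System.out.println("rawlog errors: " + metrics.errors("rawlog"));
    }
}
```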
Review Request 28481: Patch for KAFKA-1792
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28481/
---

Review request for kafka.

Bugs: KAFKA-1792
    https://issues.apache.org/jira/browse/KAFKA-1792

Repository: kafka

Description
---
KAFKA-1792: change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

Diffs
---
core/src/main/scala/kafka/admin/AdminUtils.scala 28b12c7b89a56c113b665fbde1b95f873f8624a3
core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala 979992b68af3723cd229845faff81c641123bb88
core/src/test/scala/unit/kafka/admin/AdminTest.scala e28979827110dfbbb92fe5b152e7f1cc973de400
topics.json ff011ed381e781b9a177036001d44dca3eac586f

Diff: https://reviews.apache.org/r/28481/diff/

Testing
---

Thanks,
Dmitry Pekar
[jira] [Commented] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
[ https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226816#comment-14226816 ] Dmitry Pekar commented on KAFKA-1792:
---
Created reviewboard https://reviews.apache.org/r/28481/diff/ against branch origin/trunk

change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
--
Key: KAFKA-1792
URL: https://issues.apache.org/jira/browse/KAFKA-1792
Project: Kafka
Issue Type: Sub-task
Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
Fix For: 0.8.3
Attachments: KAFKA-1792.patch

The current implementation produces a fair replica distribution across the specified list of brokers. Unfortunately, it doesn't take the current replica assignment into account. So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth broker id=3, --generate will create an assignment config that redistributes replicas fairly across brokers [0..3] as if those partitions were being created from scratch. It will not consider the current replica assignment and accordingly will not try to minimize the number of replica moves between brokers. As proposed by [~charmalloc] this should be improved. The output of the improved --generate algorithm should satisfy the following requirements:
- fairness of replica distribution - every broker will have R or R+1 replicas assigned;
- minimum of reassignments - the number of replica moves between brokers will be minimal.

Example. Consider the following replica distribution per broker [0..3] (we just added brokers 2 and 3):
- broker - 0, 1, 2, 3
- replicas - 7, 6, 0, 0

The new algorithm will produce the following assignment:
- broker - 0, 1, 2, 3
- replicas - 4, 3, 3, 3
- moves - -3, -3, +3, +3

It will be fair, and the number of moves will be 6, which is minimal for the specified initial distribution.

The scope of this issue is:
- design an algorithm matching the above requirements;
- implement this algorithm and unit tests;
- test it manually using different initial assignments.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
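The fairness target in the ticket (every broker holds R or R+1 replicas) and the minimal move count follow directly from the total replica count. This sketch reproduces the example's numbers; it is not the patch's algorithm, which must also decide which specific replicas move:

```java
import java.util.Arrays;

// Per-broker target counts for a fair distribution: each broker gets
// base = total/brokers replicas, and the first (total % brokers) brokers
// get one extra. The minimal number of moves is the total surplus on
// over-full brokers (each surplus replica moves exactly once).
public class FairAssignmentSketch {
    public static int[] targetCounts(int totalReplicas, int brokers) {
        int base = totalReplicas / brokers, extra = totalReplicas % brokers;
        int[] target = new int[brokers];
        for (int i = 0; i < brokers; i++)
            target[i] = base + (i < extra ? 1 : 0);
        return target;
    }

    public static int minimalMoves(int[] current, int[] target) {
        int moves = 0;
        for (int i = 0; i < current.length; i++)
            moves += Math.max(0, current[i] - target[i]); // replicas leaving over-full brokers
        return moves;
    }

    public static void main(String[] args) {
        // The ticket's example: 13 replicas on brokers [7, 6, 0, 0].
        int[] current = {7, 6, 0, 0};
        int[] target = targetCounts(13, 4);
        System.out.println(Arrays.toString(target) + ", moves = " + minimalMoves(current, target));
    }
}
```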
[jira] [Updated] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
[ https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Pekar updated KAFKA-1792:
---
Status: Patch Available (was: In Progress)

change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
--
Key: KAFKA-1792
URL: https://issues.apache.org/jira/browse/KAFKA-1792
Project: Kafka
Issue Type: Sub-task
Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
Fix For: 0.8.3
Attachments: KAFKA-1792.patch

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
[ https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Pekar updated KAFKA-1792:
---
Attachment: KAFKA-1792.patch

change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
--
Key: KAFKA-1792
URL: https://issues.apache.org/jira/browse/KAFKA-1792
Project: Kafka
Issue Type: Sub-task
Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
Fix For: 0.8.3
Attachments: KAFKA-1792.patch

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani reassigned KAFKA-1461: - Assignee: Sriharsha Chintalapani (was: nicu marasoiu) Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: Sriharsha Chintalapani Labels: newbie++ Fix For: 0.8.3 The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception, leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps somewhat.
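The requested back-off typically takes an exponential form: instead of retrying the fetch immediately after every error, the thread sleeps for a delay that doubles per consecutive failure up to a cap and resets on success. A hypothetical Python sketch of the loop shape (illustrative names, not Kafka's actual fetcher code):

```python
import time

def backoff_ms(failures, base_ms=100, max_ms=30_000):
    """Delay (ms) before the next fetch after `failures` consecutive errors."""
    if failures == 0:
        return 0                                    # healthy: fetch immediately
    return min(base_ms * 2 ** (failures - 1), max_ms)

def run_fetcher(do_fetch, attempts, sleep=time.sleep):
    """Drive `do_fetch` for a fixed number of attempts, sleeping between
    errors instead of spinning in a tight loop."""
    failures = 0
    for _ in range(attempts):
        sleep(backoff_ms(failures) / 1000.0)
        try:
            do_fetch()
            failures = 0            # success resets the back-off
        except OSError:             # e.g. connection refused
            failures += 1
```

Capping the delay (here at 30 s) keeps a long outage from growing the sleep unboundedly, while resetting on success restores full fetch throughput as soon as the leader is reachable again.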
[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior
[ https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226840#comment-14226840 ] Sriharsha Chintalapani commented on KAFKA-1461: --- [~charmalloc] [~nmarasoiu] I can take this. I am looking at this code for another JIRA. Replica fetcher thread does not implement any back-off behavior --- Key: KAFKA-1461 URL: https://issues.apache.org/jira/browse/KAFKA-1461 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.1.1 Reporter: Sam Meder Assignee: nicu marasoiu Labels: newbie++ Fix For: 0.8.3 The current replica fetcher thread will retry in a tight loop if any error occurs during the fetch call. For example, we've seen cases where the fetch continuously throws a connection refused exception, leading to several replica fetcher threads that spin in a pretty tight loop. To a much lesser degree this is also an issue in the consumer fetcher thread, although the fact that erroring partitions are removed so a leader can be re-discovered helps somewhat.
[jira] [Commented] (KAFKA-1273) Brokers should make sure replica.fetch.max.bytes >= message.max.bytes
[ https://issues.apache.org/jira/browse/KAFKA-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226929#comment-14226929 ] Neha Narkhede commented on KAFKA-1273: -- I'm not able to reopen the issue either. [~junrao], are you able to? Brokers should make sure replica.fetch.max.bytes >= message.max.bytes - Key: KAFKA-1273 URL: https://issues.apache.org/jira/browse/KAFKA-1273 Project: Kafka Issue Type: Bug Components: replication Affects Versions: 0.8.0 Reporter: Dong Zhong Assignee: Sriharsha Chintalapani Labels: newbie If message.max.bytes is larger than replica.fetch.max.bytes, followers can't fetch data from the leader and will retry endlessly. And this may cause high network traffic between followers and leaders. Brokers should make sure replica.fetch.max.bytes >= message.max.bytes by adding a sanity check, or throw an exception.
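The proposed sanity check is just a startup-time comparison of the two configured values, failing fast rather than letting followers spin. A hypothetical sketch of such a validation (the actual broker-side check and config plumbing may differ):

```python
def validate_fetch_sizes(config):
    """Fail fast at broker startup if followers could not replicate the
    largest message the broker is willing to accept."""
    message_max = config["message.max.bytes"]
    fetch_max = config["replica.fetch.max.bytes"]
    if fetch_max < message_max:
        raise ValueError(
            f"replica.fetch.max.bytes ({fetch_max}) must be >= "
            f"message.max.bytes ({message_max}); otherwise followers "
            "cannot fetch oversized messages and will retry endlessly"
        )

# Passes: the replica fetch size covers the largest accepted message.
validate_fetch_sizes({"message.max.bytes": 1_000_000,
                      "replica.fetch.max.bytes": 1_048_576})
```

Throwing at startup turns a silent replication stall (and the resulting network churn between followers and leaders) into an immediate, actionable configuration error.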
[jira] [Commented] (KAFKA-1476) Get a list of consumer groups
[ https://issues.apache.org/jira/browse/KAFKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226935#comment-14226935 ] Neha Narkhede commented on KAFKA-1476: -- [~balaji.sesha...@dish.com] I had already left my comments on the rb. Get a list of consumer groups - Key: KAFKA-1476 URL: https://issues.apache.org/jira/browse/KAFKA-1476 Project: Kafka Issue Type: Wish Components: tools Affects Versions: 0.8.1.1 Reporter: Ryan Williams Assignee: BalajiSeshadri Labels: newbie Fix For: 0.9.0 Attachments: ConsumerCommand.scala, KAFKA-1476-LIST-GROUPS.patch, KAFKA-1476-RENAME.patch, KAFKA-1476-REVIEW-COMMENTS.patch, KAFKA-1476.patch, KAFKA-1476.patch, KAFKA-1476.patch, KAFKA-1476_2014-11-10_11:58:26.patch, KAFKA-1476_2014-11-10_12:04:01.patch, KAFKA-1476_2014-11-10_12:06:35.patch It would be useful to have a way to get a list of the consumer groups currently active via some tool/script that ships with Kafka. This would be helpful so that the system tools can be explored more easily. For example, when running the ConsumerOffsetChecker, it requires a group option: bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic test --group ? But when just getting started with Kafka, using the console producer and consumer, it is not clear what value to use for the group option. If the consumer groups could be listed, then it would be clear what value to use. Background: http://mail-archives.apache.org/mod_mbox/kafka-users/201405.mbox/%3cCAOq_b1w=slze5jrnakxvak0gu9ctdkpazak1g4dygvqzbsg...@mail.gmail.com%3e
[jira] [Commented] (KAFKA-1476) Get a list of consumer groups
[ https://issues.apache.org/jira/browse/KAFKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226947#comment-14226947 ] BalajiSeshadri commented on KAFKA-1476: --- [~nehanarkhede] I was asking for comments on exposing this as an API. Are you planning for it? Get a list of consumer groups - Key: KAFKA-1476 URL: https://issues.apache.org/jira/browse/KAFKA-1476 Project: Kafka Issue Type: Wish Components: tools Affects Versions: 0.8.1.1 Reporter: Ryan Williams Assignee: BalajiSeshadri Labels: newbie Fix For: 0.9.0 Attachments: ConsumerCommand.scala, KAFKA-1476-LIST-GROUPS.patch, KAFKA-1476-RENAME.patch, KAFKA-1476-REVIEW-COMMENTS.patch, KAFKA-1476.patch, KAFKA-1476.patch, KAFKA-1476.patch, KAFKA-1476_2014-11-10_11:58:26.patch, KAFKA-1476_2014-11-10_12:04:01.patch, KAFKA-1476_2014-11-10_12:06:35.patch It would be useful to have a way to get a list of the consumer groups currently active via some tool/script that ships with Kafka. This would be helpful so that the system tools can be explored more easily. For example, when running the ConsumerOffsetChecker, it requires a group option: bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic test --group ? But when just getting started with Kafka, using the console producer and consumer, it is not clear what value to use for the group option. If the consumer groups could be listed, then it would be clear what value to use. Background: http://mail-archives.apache.org/mod_mbox/kafka-users/201405.mbox/%3cCAOq_b1w=slze5jrnakxvak0gu9ctdkpazak1g4dygvqzbsg...@mail.gmail.com%3e
Re: Review Request 28479: Fix KAFKA-1800
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28479/#review63145 --- clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java https://reviews.apache.org/r/28479/#comment105340 It would be useful to clarify the comment on why this needed to be moved further up, as you explained offline - i.e., since buffer exhaustion (for example) can happen before the sender gets a chance to register the metrics. Also, we should probably discuss on the JIRA the additional caveat of failed metadata fetches - i.e., since that happens in the network client, the true record error rate would be higher than what's counted by the sender metrics. The options that we have are: * Expose Sender's maybeRegisterTopicMetrics and use that in NetworkClient maybeUpdateMetadata if there are no known partitions for a topic * Keep it as you have it for now and just accept the above discrepancy (or we could address that in a separate JIRA as it is orthogonal). - Joel Koshy On Nov. 26, 2014, 8:20 p.m., Guozhang Wang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28479/ --- (Updated Nov. 26, 2014, 8:20 p.m.) Review request for kafka. Bugs: KAFKA-1800 https://issues.apache.org/jira/browse/KAFKA-1800 Repository: kafka Description --- 1. Add the logic for recording per-topic record error rate in KafkaProducer when handling thrown KafkaExceptions; 2. Move the metrics registration earlier, out of the send requests, since for some corner cases the request may not be sent and hence the per-topic metrics would never be registered Diffs - clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 32f444ebbd27892275af7a0947b86a6b8317a374 clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 84a7a07269c51ccc22ebb4ff9797292d07ba778e Diff: https://reviews.apache.org/r/28479/diff/ Testing --- Thanks, Guozhang Wang
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227033#comment-14227033 ] Guozhang Wang commented on KAFKA-992: - We have seen some scenarios which are not fully resolved by this patch: in certain cases the ephemeral node is not deleted even after the session has expired (there is a ticket ZOOKEEPER-1208 for this and it is marked to be fixed in 3.3.4, but we are still seeing this issue with a newer version). For this corner case one thing we can do (or more precisely, hack around) is to force-delete the ZK path when the difference between the written timestamp and the current timestamp is already larger than the ZK session timeout value. Double Check on Broker Registration to Avoid False NodeExist Exception -- Key: KAFKA-992 URL: https://issues.apache.org/jira/browse/KAFKA-992 Project: Kafka Issue Type: Bug Reporter: Neha Narkhede Assignee: Guozhang Wang Attachments: KAFKA-992.v1.patch, KAFKA-992.v10.patch, KAFKA-992.v11.patch, KAFKA-992.v12.patch, KAFKA-992.v13.patch, KAFKA-992.v14.patch, KAFKA-992.v2.patch, KAFKA-992.v3.patch, KAFKA-992.v4.patch, KAFKA-992.v5.patch, KAFKA-992.v6.patch, KAFKA-992.v7.patch, KAFKA-992.v8.patch, KAFKA-992.v9.patch The current behavior of zookeeper for ephemeral nodes is that session expiration and ephemeral node deletion is not an atomic operation. The side-effect of the above zookeeper behavior in Kafka, for certain corner cases, is that ephemeral nodes can be lost even if the session is not expired. The sequence of events that can lead to lost ephemeral nodes is as follows - 1. The session expires on the client; it assumes the ephemeral nodes are deleted, so it establishes a new session with zookeeper and tries to re-create the ephemeral nodes. 2. However, when it tries to re-create the ephemeral node, zookeeper throws back a NodeExists error code. 
Now this is legitimate during a session disconnect event (since zkclient automatically retries the operation and raises a NodeExists error). Also, by design, Kafka server doesn't have multiple zookeeper clients create the same ephemeral node, so the Kafka server assumes the NodeExists is normal. 3. However, after a few seconds zookeeper deletes that ephemeral node. So from the client's perspective, even though the client has a new valid session, its ephemeral node is gone. This behavior is triggered by very long fsync operations on the zookeeper leader. When the leader wakes up from such a long fsync operation, it has several sessions to expire. And the time between the session expiration and the ephemeral node deletion is magnified. Between these 2 operations, a zookeeper client can issue an ephemeral node creation operation that could appear to have succeeded, but the leader later deletes the ephemeral node, leading to permanent ephemeral node loss from the client's perspective. Thread from zookeeper mailing list: http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results
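The force-delete workaround described in the comment above reduces to a timestamp comparison: if the znode's write time is older than one session timeout relative to now, its owning session must already have expired, so the leftover node can be safely cleared before re-registering. A hypothetical Python sketch of that decision (the `zk` client interface and method names here are illustrative assumptions, not Kafka's or zkclient's actual API):

```python
def is_stale_ephemeral(node_ctime_ms, now_ms, session_timeout_ms):
    """True if an ephemeral node written at node_ctime_ms should already
    have been cleaned up by ZooKeeper, i.e. its owning session is dead."""
    return now_ms - node_ctime_ms > session_timeout_ms

def register_broker(zk, path, data, session_timeout_ms, now_ms):
    """Create the broker's ephemeral node, force-deleting a leftover node
    whose owning session has demonstrably expired (the hack above).
    `zk` is a hypothetical ZooKeeper client with get/delete/create_ephemeral."""
    existing = zk.get(path)                  # assumed to return None if absent
    if existing is not None and is_stale_ephemeral(
            existing.ctime_ms, now_ms, session_timeout_ms):
        zk.delete(path)                      # clear the orphaned node
    zk.create_ephemeral(path, data)
```

The comparison is deliberately conservative: a node younger than the session timeout might still belong to a live (merely disconnected) session, so only provably expired nodes are force-deleted.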
[jira] [Updated] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics
[ https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-1800: - Description: When a KafkaException is thrown from the producer.send() call, it is not recorded on the per-topic record-error-rate, but only the global error-rate. Since users usually monitor the per-topic metrics, losing all dropped messages caused by exceptions thrown by the Kafka producer could be very dangerous. was: When a KafkaException is thrown from the producer.send() call, it is not recorded on the per-topic record-error-rate, but only the global error-rate. KafkaException was not recorded at the per-topic metrics Key: KAFKA-1800 URL: https://issues.apache.org/jira/browse/KAFKA-1800 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Fix For: 0.9.0 Attachments: KAFKA-1800.patch When a KafkaException is thrown from the producer.send() call, it is not recorded on the per-topic record-error-rate, but only the global error-rate. Since users usually monitor the per-topic metrics, losing all dropped messages caused by exceptions thrown by the Kafka producer could be very dangerous.
[jira] [Updated] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics
[ https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-1800: - Description: When a KafkaException is thrown from the producer.send() call, it is not recorded on the per-topic record-error-rate, but only the global error-rate. Since users usually monitor the per-topic metrics, losing all dropped message counts at this level that are caused by exceptions thrown by the Kafka producer, such as BufferExhaustedException, could be very dangerous. was: When a KafkaException is thrown from the producer.send() call, it is not recorded on the per-topic record-error-rate, but only the global error-rate. Since users usually monitor the per-topic metrics, losing all dropped messages caused by exceptions thrown by the Kafka producer could be very dangerous. KafkaException was not recorded at the per-topic metrics Key: KAFKA-1800 URL: https://issues.apache.org/jira/browse/KAFKA-1800 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Fix For: 0.9.0 Attachments: KAFKA-1800.patch When a KafkaException is thrown from the producer.send() call, it is not recorded on the per-topic record-error-rate, but only the global error-rate. Since users usually monitor the per-topic metrics, losing all dropped message counts at this level that are caused by exceptions thrown by the Kafka producer, such as BufferExhaustedException, could be very dangerous.
[jira] [Commented] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics
[ https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227042#comment-14227042 ] Guozhang Wang commented on KAFKA-1800: -- There is still a corner case after this patch where the per-topic metrics cannot be recorded because they are not registered: when KafkaProducer's waitOnMetadata throws a TimeoutException because the topic metadata is not available, this error cannot be recorded at the per-topic metrics because they are only registered at the sender level when the produce requests are being sent (in the patch this is changed to when the metadata is refreshed). To solve this issue, one proposal is: 1. In the Metrics.registerMetric() function, when the metric already exists, treat it as a no-op instead of throwing IllegalArgumentException. 2. Expose a registerSenderMetrics() API of the sender in the Kafka producer, which will be triggered before metadata.awaitUpdate(version, remainingWaitMs) in waitForMetadata. This fix is a little bit hacky though, so I would like to hear opinions from other people. [~jkreps] KafkaException was not recorded at the per-topic metrics Key: KAFKA-1800 URL: https://issues.apache.org/jira/browse/KAFKA-1800 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Fix For: 0.9.0 Attachments: KAFKA-1800.patch When a KafkaException is thrown from the producer.send() call, it is not recorded on the per-topic record-error-rate, but only the global error-rate. Since users usually monitor the per-topic metrics, losing all dropped message counts at this level that are caused by exceptions thrown by the Kafka producer, such as BufferExhaustedException, could be very dangerous.
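Proposal (1) above - treating re-registration of an existing metric as a no-op - can be sketched as a tiny registry. This is a hypothetical Python illustration of the idea only, not the actual org.apache.kafka.common.metrics API:

```python
class Metrics:
    """Minimal metric registry where duplicate registration is a no-op,
    so both the producer and the sender can safely 'ensure' a metric
    exists before recording to it."""
    def __init__(self):
        self._metrics = {}

    def register_metric(self, name, metric):
        # Instead of raising on a duplicate name (the IllegalArgumentException
        # behavior described above), keep the first registration and return it.
        return self._metrics.setdefault(name, metric)

metrics = Metrics()
a = metrics.register_metric("topic.test.record-error-rate", object())
b = metrics.register_metric("topic.test.record-error-rate", object())
assert a is b   # the second registration was a no-op
```

With idempotent registration, either code path (waitOnMetadata or the sender) can register the per-topic metric first and the other path simply reuses it, so errors raised before any request is sent are still counted.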
[jira] [Commented] (KAFKA-1781) Readme should specify that Gradle 2.0 is required for initial bootstrap
[ https://issues.apache.org/jira/browse/KAFKA-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227078#comment-14227078 ] Jun Rao commented on KAFKA-1781: We can probably just double-commit KAFKA-1624 to 0.8.2 and add the gradle version requirement for JDK 8 in the README. Readme should specify that Gradle 2.0 is required for initial bootstrap --- Key: KAFKA-1781 URL: https://issues.apache.org/jira/browse/KAFKA-1781 Project: Kafka Issue Type: Bug Components: build Affects Versions: 0.8.2 Reporter: Jean-Francois Im Priority: Blocker Fix For: 0.8.2 Attachments: gradle-2.0-readme.patch The current README.md says "You need to have gradle installed." As the bootstrap procedure doesn't work with gradle 1.12, this needs to say that 2.0 or greater is needed.
[jira] [Commented] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost
[ https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227286#comment-14227286 ] Soumen Sarkar commented on KAFKA-1642: -- Is it reasonable to expect that the timeout should have a lower bound (say *100 ms*) instead of being 0? [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost --- Key: KAFKA-1642 URL: https://issues.apache.org/jira/browse/KAFKA-1642 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.1.1, 0.8.2 Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Blocker Fix For: 0.8.2 Attachments: 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, KAFKA-1642_2014-10-23_16:19:41.patch I see my CPU spike to 100% when the network connection is lost for a while. It seems the network I/O threads are very busy logging the following error message. Is this expected behavior? 2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: java.lang.IllegalStateException: No entry found for node -2 at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110) at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99) at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394) at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115) at java.lang.Thread.run(Thread.java:744) Thanks, Bhavesh
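The lower bound suggested in the comment amounts to clamping the computed poll timeout before handing it to the selector, so a disconnected producer sleeps briefly instead of busy-waiting at 100% CPU. A hypothetical sketch (MIN_POLL_TIMEOUT_MS is an assumed name taken from the suggestion above, not an actual Kafka constant):

```python
MIN_POLL_TIMEOUT_MS = 100   # assumed floor, per the 100 ms suggestion above

def effective_poll_timeout(requested_ms):
    """Never let the I/O loop poll with a 0 ms timeout: a disconnected
    node would otherwise drive the loop in a tight busy-wait."""
    return max(requested_ms, MIN_POLL_TIMEOUT_MS)

assert effective_poll_timeout(0) == 100        # the pathological spin case
assert effective_poll_timeout(5_000) == 5_000  # normal timeouts unaffected
```

The trade-off is a bounded extra latency (up to 100 ms) on reconnect in exchange for eliminating the spin when the connection is down.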