[jira] [Commented] (KAFKA-1772) Add an Admin message type for request response

2014-11-26 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226259#comment-14226259
 ] 

Andrii Biletskyi commented on KAFKA-1772:
-

[~junrao]: Thanks for your feedback!  My vision:
1. Yes, it looks like utility+command fits only the topic command flawlessly. 
Still, I like the idea of separating request types at the utility level - it 
gives some structure to the large number of commands. I.e., limit the possible 
utilities to those mentioned in the ticket, but do not regulate commands - keep 
them shared among all utilities.
2. Not sure about that; it is still being discussed whether we should use json 
for that (because of the third-party json lib dependency). Maybe we can also 
represent args in a simple byte format (as all current requests do).
3. Since all commands are really subtypes of AdminRequest, we can't have a 
specific response for each command. Currently AdminResponse is just an outcome 
string (and optionally an error code). So if a mutating command is successful, 
an empty response is returned; if it is a list/describe command, the 
description is returned in the outcome string.
4. No final decision here. It's proposed to make all commands async on the 
broker side and leave it to the client to block, executing verify (or some 
similar method) to check that the command has completed. When commands are 
called from the cli, this logic, of course, will be plugged into the cli code, 
so the user (if he wants) will experience such commands as blocking.
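
To make point 4 concrete, here is a minimal client-side sketch of layering 
blocking on top of an async admin command. Everything here is hypothetical - 
AdminClient, submitCreateTopic() and verifyTopicCreated() are placeholder 
names for an API that does not exist yet:

{code}
// Hypothetical sketch only - none of these admin-client names exist yet.
public static void createTopicBlocking(AdminClient admin, String topic,
                                       long timeoutMs) throws Exception {
    admin.submitCreateTopic(topic);                 // async on the broker side
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
        if (admin.verifyTopicCreated(topic)) {      // the "verify" poll from point 4
            return;                                 // command completed
        }
        Thread.sleep(200);                          // simple poll interval
    }
    throw new java.util.concurrent.TimeoutException(
            "create of topic " + topic + " not confirmed within " + timeoutMs + " ms");
}
{code}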


 Add an Admin message type for request response
 --

 Key: KAFKA-1772
 URL: https://issues.apache.org/jira/browse/KAFKA-1772
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3

 Attachments: KAFKA-1772.patch


 - utility int8
 - command int8
 - format int8
 - args variable length bytes
 utility 
 0 - Broker
 1 - Topic
 2 - Replication
 3 - Controller
 4 - Consumer
 5 - Producer
 Command
 0 - Create
 1 - Alter
 3 - Delete
 4 - List
 5 - Audit
 format
 0 - JSON
 args e.g. (which would equate to the data structure values == 2,1,0)
 meta-store: {
   {zookeeper: localhost:12913/kafka}
 }
 args: {
   partitions: [
     {topic: topic1, partition: 0},
     {topic: topic1, partition: 1},
     {topic: topic1, partition: 2},
     {topic: topic2, partition: 0},
     {topic: topic2, partition: 1}
   ]
 }
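
For illustration, a hedged Java sketch of how the fixed header above might be 
laid out on the wire. Only the three int8 fields come from this ticket; the 
4-byte length prefix for the variable args bytes is my assumption, since the 
framing is not specified here:

{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Values 2,1,0 = utility Replication, command Alter, format JSON (per the tables above).
byte utility = 2, command = 1, format = 0;
byte[] args = "{\"partitions\": [...]}".getBytes(StandardCharsets.UTF_8);

ByteBuffer buffer = ByteBuffer.allocate(3 + 4 + args.length);
buffer.put(utility);          // utility int8
buffer.put(command);          // command int8
buffer.put(format);           // format int8
buffer.putInt(args.length);   // assumed int32 length prefix for the args bytes
buffer.put(args);             // args: variable length bytes
buffer.flip();                // ready to write to the socket
{code}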



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1774) REPL and Shell Client for Admin Message RQ/RP

2014-11-26 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226355#comment-14226355
 ] 

Andrii Biletskyi commented on KAFKA-1774:
-

[~junrao]: thanks for noting the dependency aspect.
I think now it's even better to have all the admin cmd code placed under a 
separate ./tools project. It seems reasonable that users will be interested in 
a thin, simple jar for running admin commands through the cli, without pulling 
in all the ./core classes and stuff.

 REPL and Shell Client for Admin Message RQ/RP
 -

 Key: KAFKA-1774
 URL: https://issues.apache.org/jira/browse/KAFKA-1774
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Andrii Biletskyi
 Fix For: 0.8.3


 We should have a REPL we can work in and execute the commands with the 
 arguments. With this we can do:
 ./kafka.sh --shell 
 kafka>attach cluster -b localhost:9092;
 kafka>describe topic sampleTopicNameForExample;
 the command line version can work like it does now so folks don't have to 
 re-write all of their tooling.
 kafka.sh --topics --everything the same like kafka-topics.sh is 
 kafka.sh --reassign --everything the same like kafka-reassign-partitions.sh 
 is 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1801) Remove non-functional variable definition in log4j.properties

2014-11-26 Thread Raman Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raman Gupta updated KAFKA-1801:
---
Fix Version/s: 0.8.1
   Status: Patch Available  (was: Open)

In log4j.properties, a property kafka.logs.dir was defined. However, modifying 
this property has no effect because log4j always uses the system property set 
in bin/kafka-run-class.sh before the locally set one.

 Remove non-functional variable definition in log4j.properties
 -

 Key: KAFKA-1801
 URL: https://issues.apache.org/jira/browse/KAFKA-1801
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.2
Reporter: Raman Gupta
Assignee: Jay Kreps
Priority: Trivial
  Labels: easyfix, patch
 Fix For: 0.8.1

   Original Estimate: 5m
  Remaining Estimate: 5m

 In log4j.properties, a property kafka.logs.dir is defined. However, modifying 
 this property has no effect because log4j will always use the system property 
 defined in kafka-run-class.sh before using the locally defined property in 
 log4j.properties. Therefore, it's probably less confusing to simply remove 
 this property from here.
 See 
 http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PropertyConfigurator.html
  for the property search order (system property first, locally defined 
 property second).
 An alternative solution: remove the system property from kafka-run-class.sh 
 and keep the one here.
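
To make the search order concrete, a small self-contained illustration (the 
paths are made up) of the precedence the PropertyConfigurator javadoc 
describes - the -D system property set in kafka-run-class.sh always shadows 
the value defined inside log4j.properties:

{code}
import java.util.Properties;

public class PrecedenceDemo {
    public static void main(String[] args) {
        Properties fileProps = new Properties();
        fileProps.setProperty("kafka.logs.dir", "/from/log4j.properties"); // local definition
        System.setProperty("kafka.logs.dir", "/from/kafka-run-class.sh");  // -D on the JVM

        // log4j 1.2 resolves ${kafka.logs.dir} by checking system properties
        // first and falling back to the locally defined property - equivalent to:
        String resolved = System.getProperty("kafka.logs.dir",
                fileProps.getProperty("kafka.logs.dir"));
        System.out.println(resolved); // prints /from/kafka-run-class.sh
    }
}
{code}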



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1796) Sanity check partition command line tools

2014-11-26 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1796:
-
Description: We need to sanity check that the input json has valid values 
before triggering the admin process. For example, we have seen a scenario where 
the json input for the partition reassignment tool had partition replica info 
of {broker-1, broker-1, broker-2}, and it was still accepted into ZK and 
eventually led to an under-replicated count, etc. This is partially because we 
use a Map rather than a Set when reading the json input for this case; but in 
general we need to make sure that input parameters like the json are valid 
before writing them to ZK.  (was: We need to sanity check the input json has 
the valid values (for example, the replica list does not have duplicate broker 
ids, etc) before triggering the admin process.)
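
A minimal sketch of the Set-based check being suggested, run against each 
partition's replica list before anything is written to ZK (the names are 
illustrative, not the tool's actual code):

{code}
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Rejects a replica list such as [1, 1, 2]: a Set collapses duplicates, so a
// size mismatch with the original List reveals a repeated broker id.
static void validateReplicas(String topic, int partition, List<Integer> replicas) {
    Set<Integer> unique = new HashSet<>(replicas);
    if (unique.size() != replicas.size()) {
        throw new IllegalArgumentException("Duplicate broker id in replica list "
                + replicas + " for partition " + topic + "-" + partition);
    }
}
{code}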

 Sanity check partition command line tools
 -

 Key: KAFKA-1796
 URL: https://issues.apache.org/jira/browse/KAFKA-1796
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
  Labels: newbie
 Fix For: 0.8.3


 We need to sanity check that the input json has valid values before triggering 
 the admin process. For example, we have seen a scenario where the json input 
 for the partition reassignment tool had partition replica info of {broker-1, 
 broker-1, broker-2}, and it was still accepted into ZK and eventually led to 
 an under-replicated count, etc. This is partially because we use a Map rather 
 than a Set when reading the json input for this case; but in general we need 
 to make sure that input parameters like the json are valid before writing them 
 to ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1796) Sanity check partition command line tools

2014-11-26 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226440#comment-14226440
 ] 

Guozhang Wang commented on KAFKA-1796:
--

Updated the description; this may also be related to [~joestein]'s proposal for 
admin requests.

 Sanity check partition command line tools
 -

 Key: KAFKA-1796
 URL: https://issues.apache.org/jira/browse/KAFKA-1796
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
  Labels: newbie
 Fix For: 0.8.3


 We need to sanity check that the input json has valid values before triggering 
 the admin process. For example, we have seen a scenario where the json input 
 for the partition reassignment tool had partition replica info of {broker-1, 
 broker-1, broker-2}, and it was still accepted into ZK and eventually led to 
 an under-replicated count, etc. This is partially because we use a Map rather 
 than a Set when reading the json input for this case; but in general we need 
 to make sure that input parameters like the json are valid before writing them 
 to ZK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (KAFKA-1801) Remove non-functional variable definition in log4j.properties

2014-11-26 Thread Raman Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raman Gupta updated KAFKA-1801:
---
Comment: was deleted

(was: In log4j.properties, a property kafka.logs.dir was defined. However, 
modifying this property has no effect because log4j.properties always uses the 
System property set in bin/kafka-run-class.sh before the locally set one.)

 Remove non-functional variable definition in log4j.properties
 -

 Key: KAFKA-1801
 URL: https://issues.apache.org/jira/browse/KAFKA-1801
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.2
Reporter: Raman Gupta
Assignee: Jay Kreps
Priority: Trivial
  Labels: easyfix, patch
 Fix For: 0.8.2

   Original Estimate: 5m
  Remaining Estimate: 5m

 In log4j.properties, a property kafka.logs.dir is defined. However, modifying 
 this property has no effect because log4j will always use the system property 
 defined in kafka-run-class.sh before using the locally defined property in 
 log4j.properties. Therefore, it's probably less confusing to simply remove 
 this property from here.
 See 
 http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PropertyConfigurator.html
  for the property search order (system property first, locally defined 
 property second).
 An alternative solution: remove the system property from kafka-run-class.sh 
 and keep the one here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1801) Remove non-functional variable definition in log4j.properties

2014-11-26 Thread Raman Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raman Gupta updated KAFKA-1801:
---
Fix Version/s: (was: 0.8.1)
   0.8.2
   Status: Patch Available  (was: Open)

In log4j.properties, a property kafka.logs.dir was defined. However, modifying 
this property has no effect because log4j always uses the system property set 
in bin/kafka-run-class.sh before the locally set one.

 Remove non-functional variable definition in log4j.properties
 -

 Key: KAFKA-1801
 URL: https://issues.apache.org/jira/browse/KAFKA-1801
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.2
Reporter: Raman Gupta
Assignee: Jay Kreps
Priority: Trivial
  Labels: easyfix, patch
 Fix For: 0.8.2

   Original Estimate: 5m
  Remaining Estimate: 5m

 In log4j.properties, a property kafka.logs.dir is defined. However, modifying 
 this property has no effect because log4j will always use the system property 
 defined in kafka-run-class.sh before using the locally defined property in 
 log4j.properties. Therefore, it's probably less confusing to simply remove 
 this property from here.
 See 
 http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PropertyConfigurator.html
  for the property search order (system property first, locally defined 
 property second).
 An alternative solution: remove the system property from kafka-run-class.sh 
 and keep the one here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1801) Remove non-functional variable definition in log4j.properties

2014-11-26 Thread Raman Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raman Gupta updated KAFKA-1801:
---
Status: Open  (was: Patch Available)

 Remove non-functional variable definition in log4j.properties
 -

 Key: KAFKA-1801
 URL: https://issues.apache.org/jira/browse/KAFKA-1801
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.2
Reporter: Raman Gupta
Assignee: Jay Kreps
Priority: Trivial
  Labels: easyfix, patch
 Fix For: 0.8.1

   Original Estimate: 5m
  Remaining Estimate: 5m

 In log4j.properties, a property kafka.logs.dir is defined. However, modifying 
 this property has no effect because log4j will always use the system property 
 defined in kafka-run-class.sh before using the locally defined property in 
 log4j.properties. Therefore, it's probably less confusing to simply remove 
 this property from here.
 See 
 http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PropertyConfigurator.html
  for the property search order (system property first, locally defined 
 property second).
 An alternative solution: remove the system property from kafka-run-class.sh 
 and keep the one here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost

2014-11-26 Thread Bhavesh Mistry (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226751#comment-14226751
 ] 

Bhavesh Mistry edited comment on KAFKA-1642 at 11/26/14 8:08 PM:
-

[~ewencp],

Even after setting the following parameters to long values, the state of the 
system is still impacted, no matter what reconnect.backoff.ms and 
retry.backoff.ms are set to. Once the node state is removed, the timeout is 
set to 0. Please see the following logs.

# 15 minutes
reconnect.backoff.ms=900000
retry.backoff.ms=900000

{code}
2014-11-26 11:01:27.898 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:02:27.903 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:03:27.903 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:04:27.903 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:05:27.904 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:06:27.905 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:07:27.906 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:08:27.908 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:09:27.908 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:10:27.909 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:11:27.909 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:12:27.910 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:13:27.911 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:14:27.912 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:15:27.914 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:00:27.613 [kafka-producer-network-thread | heartbeat] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: 
2014-11-26 11:00:27.613 [kafka-producer-network-thread | rawlog] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: 
java.lang.IllegalStateException: No entry found for node -1
    at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:131)
    at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:120)
    at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:407)
    at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:393)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:187)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
    at java.lang.Thread.run(Thread.java:744)
java.lang.IllegalStateException: No entry found for node -3
    at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:131)
    at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:120)
    at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:407)
    at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:393)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:187)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
    at java.lang.Thread.run(Thread.java:744)
2014-11-26 11:00:27.613 [kafka-producer-network-thread | heartbeat] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: 
2014-11-26 11:00:27.613 [kafka-producer-network-thread | error] ERROR 

[jira] [Commented] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost

2014-11-26 Thread Bhavesh Mistry (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226751#comment-14226751
 ] 

Bhavesh Mistry commented on KAFKA-1642:
---

[~ewencp],

Even after setting the following parameters to long values, the state of the 
system is still impacted, no matter what reconnect.backoff.ms and 
retry.backoff.ms are set to. Once the node state is removed, the timeout is 
set to 0. Please see the following logs.

# 15 minutes
reconnect.backoff.ms=900000
retry.backoff.ms=900000

{code}
2014-11-26 11:01:27.898 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:02:27.903 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:03:27.903 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:04:27.903 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:05:27.904 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:06:27.905 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:07:27.906 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:08:27.908 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:09:27.908 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:10:27.909 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:11:27.909 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:12:27.910 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:13:27.911 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:14:27.912 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:15:27.914 Kafka Drop message topic=.rawlog
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
2014-11-26 11:00:27.613 [kafka-producer-network-thread | heartbeat] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: 
2014-11-26 11:00:27.613 [kafka-producer-network-thread | rawlog] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: 
java.lang.IllegalStateException: No entry found for node -1
    at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:131)
    at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:120)
    at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:407)
    at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:393)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:187)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
    at java.lang.Thread.run(Thread.java:744)
java.lang.IllegalStateException: No entry found for node -3
    at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:131)
    at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:120)
    at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:407)
    at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:393)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:187)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
    at java.lang.Thread.run(Thread.java:744)
2014-11-26 11:00:27.613 [kafka-producer-network-thread | heartbeat] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: 
2014-11-26 11:00:27.613 [kafka-producer-network-thread | error] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught 

[jira] [Updated] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics

2014-11-26 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1800:
-
Attachment: KAFKA-1800.patch

 KafkaException was not recorded at the per-topic metrics
 

 Key: KAFKA-1800
 URL: https://issues.apache.org/jira/browse/KAFKA-1800
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.9.0

 Attachments: KAFKA-1800.patch


 When KafkaException was thrown from producer.send() call, it is not recorded 
 on the per-topic record-error-rate, but only the global error-rate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics

2014-11-26 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1800:
-
Status: Patch Available  (was: Open)

 KafkaException was not recorded at the per-topic metrics
 

 Key: KAFKA-1800
 URL: https://issues.apache.org/jira/browse/KAFKA-1800
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.9.0

 Attachments: KAFKA-1800.patch


 When KafkaException was thrown from producer.send() call, it is not recorded 
 on the per-topic record-error-rate, but only the global error-rate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 28479: Fix KAFKA-1800

2014-11-26 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28479/
---

Review request for kafka.


Bugs: KAFKA-1800
https://issues.apache.org/jira/browse/KAFKA-1800


Repository: kafka


Description
---

1. Add the logic for recording the per-topic record error rate in KafkaProducer 
when handling thrown KafkaExceptions; 2. Move the metrics registration out of 
the send requests, since in some corner cases the request may never be sent and 
hence the per-topic metrics would never be registered.


Diffs
-

  clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
32f444ebbd27892275af7a0947b86a6b8317a374 
  clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
84a7a07269c51ccc22ebb4ff9797292d07ba778e 

Diff: https://reviews.apache.org/r/28479/diff/


Testing
---


Thanks,

Guozhang Wang



Review Request 28481: Patch for KAFKA-1792

2014-11-26 Thread Dmitry Pekar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28481/
---

Review request for kafka.


Bugs: KAFKA-1792
https://issues.apache.org/jira/browse/KAFKA-1792


Repository: kafka


Description
---

KAFKA-1792: change behavior of --generate to produce assignment config with 
fair replica distribution and minimal number of reassignments


Diffs
-

  core/src/main/scala/kafka/admin/AdminUtils.scala 
28b12c7b89a56c113b665fbde1b95f873f8624a3 
  core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala 
979992b68af3723cd229845faff81c641123bb88 
  core/src/test/scala/unit/kafka/admin/AdminTest.scala 
e28979827110dfbbb92fe5b152e7f1cc973de400 
  topics.json ff011ed381e781b9a177036001d44dca3eac586f 

Diff: https://reviews.apache.org/r/28481/diff/


Testing
---


Thanks,

Dmitry Pekar



[jira] [Commented] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-11-26 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226816#comment-14226816
 ] 

Dmitry Pekar commented on KAFKA-1792:
-

Created reviewboard https://reviews.apache.org/r/28481/diff/
 against branch origin/trunk

 change behavior of --generate to produce assignment config with fair replica 
 distribution and minimal number of reassignments
 -

 Key: KAFKA-1792
 URL: https://issues.apache.org/jira/browse/KAFKA-1792
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3

 Attachments: KAFKA-1792.patch


 The current implementation produces a fair replica distribution across the 
 specified list of brokers. Unfortunately, it doesn't take the current replica 
 assignment into account.
 So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
 broker id=3, generate will create an assignment config which redistributes 
 replicas fairly across brokers [0..3], in the same way as if those partitions 
 were created from scratch. It will not take the current replica assignment 
 into consideration and accordingly will not try to minimize the number of 
 replica moves between brokers.
 As proposed by [~charmalloc], this should be improved. The new output of the 
 improved --generate algorithm should suit the following requirements:
 - fairness of replica distribution - every broker will have R or R+1 replicas 
 assigned;
 - minimum of reassignments - the number of replica moves between brokers will 
 be minimal.
 Example.
 Consider the following replica distribution across brokers [0..3] (we just 
 added brokers 2 and 3):
 - broker - 0, 1, 2, 3 
 - replicas - 7, 6, 0, 0
 The new algorithm will produce the following assignment:
 - broker - 0, 1, 2, 3 
 - replicas - 4, 3, 3, 3
 - moves - -3, -3, +3, +3
 It will be fair, and the number of moves will be 6, which is minimal for the 
 specified initial distribution.
 The scope of this issue is:
 - design an algorithm matching the above requirements;
 - implement this algorithm and unit tests;
 - test it manually using different initial assignments;
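
As a toy illustration of the two requirements, a sketch that computes the fair 
R/R+1 target counts from the example above; the real patch must work on full 
partition assignments, so this only shows the counting idea:

{code}
import java.util.Map;
import java.util.TreeMap;

// For input {0:7, 1:6, 2:0, 3:0} this returns {0:4, 1:3, 2:3, 3:3}, i.e. moves
// of -3, -3, +3, +3 - six moves in total, matching the example.
static Map<Integer, Integer> fairTargets(Map<Integer, Integer> replicasPerBroker) {
    int total = 0;
    for (int count : replicasPerBroker.values()) {
        total += count;
    }
    int brokers = replicasPerBroker.size();
    int base = total / brokers;        // every broker gets R replicas...
    int remainder = total % brokers;   // ...and `remainder` brokers get R+1
    Map<Integer, Integer> targets = new TreeMap<>();
    for (Integer broker : replicasPerBroker.keySet()) {
        targets.put(broker, base + (remainder-- > 0 ? 1 : 0));
    }
    return targets;
}
{code}

Moving each surplus replica directly to a broker below its target keeps the 
move count equal to the sum of the surpluses, which is the minimum for a given 
initial distribution.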



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-11-26 Thread Dmitry Pekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Pekar updated KAFKA-1792:

Status: Patch Available  (was: In Progress)

 change behavior of --generate to produce assignment config with fair replica 
 distribution and minimal number of reassignments
 -

 Key: KAFKA-1792
 URL: https://issues.apache.org/jira/browse/KAFKA-1792
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3

 Attachments: KAFKA-1792.patch


 The current implementation produces a fair replica distribution across the 
 specified list of brokers. Unfortunately, it doesn't take the current replica 
 assignment into account.
 So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
 broker id=3, generate will create an assignment config which redistributes 
 replicas fairly across brokers [0..3], in the same way as if those partitions 
 were created from scratch. It will not take the current replica assignment 
 into consideration and accordingly will not try to minimize the number of 
 replica moves between brokers.
 As proposed by [~charmalloc], this should be improved. The new output of the 
 improved --generate algorithm should suit the following requirements:
 - fairness of replica distribution - every broker will have R or R+1 replicas 
 assigned;
 - minimum of reassignments - the number of replica moves between brokers will 
 be minimal.
 Example.
 Consider the following replica distribution across brokers [0..3] (we just 
 added brokers 2 and 3):
 - broker - 0, 1, 2, 3 
 - replicas - 7, 6, 0, 0
 The new algorithm will produce the following assignment:
 - broker - 0, 1, 2, 3 
 - replicas - 4, 3, 3, 3
 - moves - -3, -3, +3, +3
 It will be fair, and the number of moves will be 6, which is minimal for the 
 specified initial distribution.
 The scope of this issue is:
 - design an algorithm matching the above requirements;
 - implement this algorithm and unit tests;
 - test it manually using different initial assignments;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2014-11-26 Thread Dmitry Pekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Pekar updated KAFKA-1792:

Attachment: KAFKA-1792.patch

 change behavior of --generate to produce assignment config with fair replica 
 distribution and minimal number of reassignments
 -

 Key: KAFKA-1792
 URL: https://issues.apache.org/jira/browse/KAFKA-1792
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3

 Attachments: KAFKA-1792.patch


 The current implementation produces a fair replica distribution across the 
 specified list of brokers. Unfortunately, it doesn't take the current replica 
 assignment into account.
 So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth 
 broker id=3, generate will create an assignment config which redistributes 
 replicas fairly across brokers [0..3], in the same way as if those partitions 
 were created from scratch. It will not take the current replica assignment 
 into consideration and accordingly will not try to minimize the number of 
 replica moves between brokers.
 As proposed by [~charmalloc], this should be improved. The new output of the 
 improved --generate algorithm should suit the following requirements:
 - fairness of replica distribution - every broker will have R or R+1 replicas 
 assigned;
 - minimum of reassignments - the number of replica moves between brokers will 
 be minimal.
 Example.
 Consider the following replica distribution across brokers [0..3] (we just 
 added brokers 2 and 3):
 - broker - 0, 1, 2, 3 
 - replicas - 7, 6, 0, 0
 The new algorithm will produce the following assignment:
 - broker - 0, 1, 2, 3 
 - replicas - 4, 3, 3, 3
 - moves - -3, -3, +3, +3
 It will be fair, and the number of moves will be 6, which is minimal for the 
 specified initial distribution.
 The scope of this issue is:
 - design an algorithm matching the above requirements;
 - implement this algorithm and unit tests;
 - test it manually using different initial assignments;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior

2014-11-26 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani reassigned KAFKA-1461:
-

Assignee: Sriharsha Chintalapani  (was: nicu marasoiu)

 Replica fetcher thread does not implement any back-off behavior
 ---

 Key: KAFKA-1461
 URL: https://issues.apache.org/jira/browse/KAFKA-1461
 Project: Kafka
  Issue Type: Improvement
  Components: replication
Affects Versions: 0.8.1.1
Reporter: Sam Meder
Assignee: Sriharsha Chintalapani
  Labels: newbie++
 Fix For: 0.8.3


 The current replica fetcher thread will retry in a tight loop if any error 
 occurs during the fetch call. For example, we've seen cases where the fetch 
 continuously throws a connection refused exception leading to several replica 
 fetcher threads that spin in a pretty tight loop.
 To a much lesser degree this is also an issue in the consumer fetcher thread, 
 although the fact that erroring partitions are removed so a leader can be 
 re-discovered helps some.
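
A hedged sketch of the back-off idea (not the actual fix): on a fetch error, 
sleep before retrying instead of spinning, doubling the delay up to a cap. 
fetchOnce() and isRunning are hypothetical stand-ins for one fetch round trip 
and the thread's shutdown flag:

{code}
// Illustrative back-off loop; the values are assumptions, not Kafka's defaults.
void fetchLoop() throws InterruptedException {
    long backoffMs = 100;              // assumed starting back-off
    final long maxBackoffMs = 10000;   // assumed cap
    while (isRunning) {
        try {
            fetchOnce();               // hypothetical: one fetch round trip
            backoffMs = 100;           // reset after a successful fetch
        } catch (Exception e) {
            Thread.sleep(backoffMs);   // back off instead of tight-looping
            backoffMs = Math.min(backoffMs * 2, maxBackoffMs);
        }
    }
}
{code}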



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior

2014-11-26 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226840#comment-14226840
 ] 

Sriharsha Chintalapani commented on KAFKA-1461:
---

[~charmalloc] [~nmarasoiu] I can take this. I am looking at this code for 
another JIRA.

 Replica fetcher thread does not implement any back-off behavior
 ---

 Key: KAFKA-1461
 URL: https://issues.apache.org/jira/browse/KAFKA-1461
 Project: Kafka
  Issue Type: Improvement
  Components: replication
Affects Versions: 0.8.1.1
Reporter: Sam Meder
Assignee: nicu marasoiu
  Labels: newbie++
 Fix For: 0.8.3


 The current replica fetcher thread will retry in a tight loop if any error 
 occurs during the fetch call. For example, we've seen cases where the fetch 
 continuously throws a connection refused exception leading to several replica 
 fetcher threads that spin in a pretty tight loop.
 To a much lesser degree this is also an issue in the consumer fetcher thread, 
 although the fact that erroring partitions are removed so a leader can be 
 re-discovered helps some.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1273) Brokers should make sure replica.fetch.max.bytes >= message.max.bytes

2014-11-26 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226929#comment-14226929
 ] 

Neha Narkhede commented on KAFKA-1273:
--

I'm not able to reopen the issue either. [~junrao], are you able to?

 Brokers should make sure replica.fetch.max.bytes >= message.max.bytes
 -

 Key: KAFKA-1273
 URL: https://issues.apache.org/jira/browse/KAFKA-1273
 Project: Kafka
  Issue Type: Bug
  Components: replication
Affects Versions: 0.8.0
Reporter: Dong Zhong
Assignee: Sriharsha Chintalapani
  Labels: newbie

 If message.max.bytes is larger than replica.fetch.max.bytes, followers can't 
 fetch data from the leader and will retry endlessly, which may cause high 
 network traffic between followers and leaders.
 Brokers should make sure replica.fetch.max.bytes >= message.max.bytes by 
 adding a sanity check, or throw an exception.
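
A minimal sketch of the sanity check being asked for, applied at broker 
config-validation time (illustrative Java, not the actual KafkaConfig code):

{code}
// Fail fast at startup instead of letting followers retry endlessly.
static void validateFetchSize(int replicaFetchMaxBytes, int messageMaxBytes) {
    if (replicaFetchMaxBytes < messageMaxBytes) {
        throw new IllegalArgumentException(
                "replica.fetch.max.bytes (" + replicaFetchMaxBytes + ") must be >= "
                + "message.max.bytes (" + messageMaxBytes + "); otherwise followers "
                + "cannot fetch the largest allowed message");
    }
}
{code}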



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1476) Get a list of consumer groups

2014-11-26 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226935#comment-14226935
 ] 

Neha Narkhede commented on KAFKA-1476:
--

[~balaji.sesha...@dish.com] I had already left my comments on the rb.

 Get a list of consumer groups
 -

 Key: KAFKA-1476
 URL: https://issues.apache.org/jira/browse/KAFKA-1476
 Project: Kafka
  Issue Type: Wish
  Components: tools
Affects Versions: 0.8.1.1
Reporter: Ryan Williams
Assignee: BalajiSeshadri
  Labels: newbie
 Fix For: 0.9.0

 Attachments: ConsumerCommand.scala, KAFKA-1476-LIST-GROUPS.patch, 
 KAFKA-1476-RENAME.patch, KAFKA-1476-REVIEW-COMMENTS.patch, KAFKA-1476.patch, 
 KAFKA-1476.patch, KAFKA-1476.patch, KAFKA-1476_2014-11-10_11:58:26.patch, 
 KAFKA-1476_2014-11-10_12:04:01.patch, KAFKA-1476_2014-11-10_12:06:35.patch


 It would be useful to have a way to get a list of consumer groups currently 
 active via some tool/script that ships with kafka. This would be helpful so 
 that the system tools can be explored more easily.
 For example, when running the ConsumerOffsetChecker, it requires a group 
 option
 bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic test --group 
 ?
 But, when just getting started with kafka, using the console producer and 
 consumer, it is not clear what value to use for the group option.  If a list 
 of consumer groups could be listed, then it would be clear what value to use.
 Background:
 http://mail-archives.apache.org/mod_mbox/kafka-users/201405.mbox/%3cCAOq_b1w=slze5jrnakxvak0gu9ctdkpazak1g4dygvqzbsg...@mail.gmail.com%3e
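
For the ZooKeeper-based consumers of this era, listing the groups amounts to 
reading the children of the /consumers path; a minimal sketch using the 
org.I0Itec.zkclient.ZkClient that ships with Kafka (the connection string and 
timeouts are assumptions):

{code}
import org.I0Itec.zkclient.ZkClient;

public class ListGroups {
    public static void main(String[] args) {
        ZkClient zkClient = new ZkClient("localhost:2181", 30000, 30000);
        try {
            // Each child of /consumers is a consumer group name.
            for (String group : zkClient.getChildren("/consumers")) {
                System.out.println(group);
            }
        } finally {
            zkClient.close();
        }
    }
}
{code}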



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1476) Get a list of consumer groups

2014-11-26 Thread BalajiSeshadri (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226947#comment-14226947
 ] 

BalajiSeshadri commented on KAFKA-1476:
---

[~nehanarkhede] I was requesting a comment regarding exposing this as an API. 
Are you guys planning for it?

 Get a list of consumer groups
 -

 Key: KAFKA-1476
 URL: https://issues.apache.org/jira/browse/KAFKA-1476
 Project: Kafka
  Issue Type: Wish
  Components: tools
Affects Versions: 0.8.1.1
Reporter: Ryan Williams
Assignee: BalajiSeshadri
  Labels: newbie
 Fix For: 0.9.0

 Attachments: ConsumerCommand.scala, KAFKA-1476-LIST-GROUPS.patch, 
 KAFKA-1476-RENAME.patch, KAFKA-1476-REVIEW-COMMENTS.patch, KAFKA-1476.patch, 
 KAFKA-1476.patch, KAFKA-1476.patch, KAFKA-1476_2014-11-10_11:58:26.patch, 
 KAFKA-1476_2014-11-10_12:04:01.patch, KAFKA-1476_2014-11-10_12:06:35.patch


 It would be useful to have a way to get a list of consumer groups currently 
 active via some tool/script that ships with kafka. This would be helpful so 
 that the system tools can be explored more easily.
 For example, when running the ConsumerOffsetChecker, it requires a group 
 option
 bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic test --group 
 ?
 But, when just getting started with kafka, using the console producer and 
 consumer, it is not clear what value to use for the group option.  If a list 
 of consumer groups could be listed, then it would be clear what value to use.
 Background:
 http://mail-archives.apache.org/mod_mbox/kafka-users/201405.mbox/%3cCAOq_b1w=slze5jrnakxvak0gu9ctdkpazak1g4dygvqzbsg...@mail.gmail.com%3e



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28479: Fix KAFKA-1800

2014-11-26 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28479/#review63145
---



clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java
https://reviews.apache.org/r/28479/#comment105340

It would be useful to clarify the comment on why this needed to be moved 
further up, as you explained offline - i.e., since buffer exhaustion (for 
example) can happen before the sender gets a chance to register the metrics.

Also, we should probably discuss on the jira the additional caveat of 
failed metadata fetches - i.e., since that happens in the network client, the 
true record error rate would be higher than what's counted by the sender 
metrics.

The options that we have are:
* Expose Sender's maybeRegisterTopicMetrics and use that in NetworkClient's 
maybeUpdateMetadata if there are no known partitions for a topic
* Keep it as you have it for now and just accept the above discrepancy 
(or we could address that in a separate jira, as it is orthogonal).


- Joel Koshy


On Nov. 26, 2014, 8:20 p.m., Guozhang Wang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/28479/
 ---
 
 (Updated Nov. 26, 2014, 8:20 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1800
 https://issues.apache.org/jira/browse/KAFKA-1800
 
 
 Repository: kafka
 
 
 Description
 ---
 
  1. Add the logic for recording the per-topic record error rate in 
  KafkaProducer when handling thrown KafkaExceptions; 2. Move the metrics 
  registration out of the send requests, since in some corner cases the request 
  may never be sent and hence the per-topic metrics would never be registered.
 
 
 Diffs
 -
 
   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
 32f444ebbd27892275af7a0947b86a6b8317a374 
   
 clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
 84a7a07269c51ccc22ebb4ff9797292d07ba778e 
 
 Diff: https://reviews.apache.org/r/28479/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Guozhang Wang
 




[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception

2014-11-26 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227033#comment-14227033
 ] 

Guozhang Wang commented on KAFKA-992:
-

We have seen some scenarios which are not fully resolved by this patch: in 
certain cases the ephemeral node is never deleted, even after the session has 
expired (there is a ticket ZOOKEEPER-1208 for this and it is marked to be 
fixed in 3.3.4, but we are still seeing this issue with a newer version).

For this corner case, one thing we can do (or, more precisely, hack around) is 
to force-delete the ZK path when the difference between the written timestamp 
and the current timestamp is already larger than the ZK session timeout value.
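
A hedged sketch of that hack with org.I0Itec.zkclient.ZkClient - the path, the 
timeout value, and the parseTimestamp() helper (reading the timestamp field 
Kafka writes into the broker registration JSON) are all assumptions for 
illustration:

{code}
// Force-delete a leftover ephemeral node whose write timestamp is already
// older than the ZK session timeout - per the workaround described above.
String path = "/brokers/ids/1";               // assumed broker registration path
long sessionTimeoutMs = 6000;                 // assumed session timeout
String data = zkClient.readData(path, true);  // null if the path does not exist
if (data != null) {
    long writtenTs = parseTimestamp(data);    // hypothetical JSON field lookup
    if (System.currentTimeMillis() - writtenTs > sessionTimeoutMs) {
        zkClient.delete(path);                // the session must be long expired
    }
}
{code}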

 Double Check on Broker Registration to Avoid False NodeExist Exception
 --

 Key: KAFKA-992
 URL: https://issues.apache.org/jira/browse/KAFKA-992
 Project: Kafka
  Issue Type: Bug
Reporter: Neha Narkhede
Assignee: Guozhang Wang
 Attachments: KAFKA-992.v1.patch, KAFKA-992.v10.patch, 
 KAFKA-992.v11.patch, KAFKA-992.v12.patch, KAFKA-992.v13.patch, 
 KAFKA-992.v14.patch, KAFKA-992.v2.patch, KAFKA-992.v3.patch, 
 KAFKA-992.v4.patch, KAFKA-992.v5.patch, KAFKA-992.v6.patch, 
 KAFKA-992.v7.patch, KAFKA-992.v8.patch, KAFKA-992.v9.patch


 The current behavior of zookeeper for ephemeral nodes is that session 
 expiration and ephemeral node deletion is not an atomic operation. 
 The side-effect of the above zookeeper behavior in Kafka, for certain corner 
 cases, is that ephemeral nodes can be lost even if the session is not 
 expired. The sequence of events that can lead to lossy ephemeral nodes is as 
 follows -
 1. The session expires on the client, it assumes the ephemeral nodes are 
 deleted, so it establishes a new session with zookeeper and tries to 
 re-create the ephemeral nodes. 
 2. However, when it tries to re-create the ephemeral node,zookeeper throws 
 back a NodeExists error code. Now this is legitimate during a session 
 disconnect event (since zkclient automatically retries the
 operation and raises a NodeExists error). Also by design, Kafka server 
 doesn't have multiple zookeeper clients create the same ephemeral node, so 
 Kafka server assumes the NodeExists is normal. 
 3. However, after a few seconds zookeeper deletes that ephemeral node. So 
 from the client's perspective, even though the client has a new valid 
 session, its ephemeral node is gone.
 This behavior is triggered due to very long fsync operations on the zookeeper 
 leader. When the leader wakes up from such a long fsync operation, it has 
 several sessions to expire. And the time between the session expiration and 
 the ephemeral node deletion is magnified. Between these 2 operations, a 
 zookeeper client can issue a ephemeral node creation operation, that could've 
 appeared to have succeeded, but the leader later deletes the ephemeral node 
 leading to permanent ephemeral node loss from the client's perspective. 
 Thread from zookeeper mailing list: 
 http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics

2014-11-26 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1800:
-
Description: 
When KafkaException was thrown from producer.send() call, it is not recorded on 
the per-topic record-error-rate, but only the global error-rate.

Since users are usually monitoring the per-topic metrics, losing all dropped 
messages caused by exceptions thrown by the kafka producer could be very 
dangerous.

  was:When KafkaException was thrown from producer.send() call, it is not 
recorded on the per-topic record-error-rate, but only the global error-rate.


 KafkaException was not recorded at the per-topic metrics
 

 Key: KAFKA-1800
 URL: https://issues.apache.org/jira/browse/KAFKA-1800
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.9.0

 Attachments: KAFKA-1800.patch


 When KafkaException was thrown from producer.send() call, it is not recorded 
 on the per-topic record-error-rate, but only the global error-rate.
  Since users are usually monitoring the per-topic metrics, losing all dropped 
  messages caused by exceptions thrown by the kafka producer could be very 
  dangerous.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics

2014-11-26 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1800:
-
Description: 
When KafkaException was thrown from producer.send() call, it is not recorded on 
the per-topic record-error-rate, but only the global error-rate.

Since users are usually monitoring the per-topic metrics, losing all dropped 
message counts at this level that are caused by exceptions thrown by the kafka 
producer, such as BufferExhaustedException, could be very dangerous.

  was:
When KafkaException was thrown from producer.send() call, it is not recorded on 
the per-topic record-error-rate, but only the global error-rate.

Since users are usually monitoring the per-topic metrics, losing all dropped 
messages caused by exceptions thrown by the kafka producer could be very 
dangerous.


 KafkaException was not recorded at the per-topic metrics
 

 Key: KAFKA-1800
 URL: https://issues.apache.org/jira/browse/KAFKA-1800
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.9.0

 Attachments: KAFKA-1800.patch


 When KafkaException was thrown from producer.send() call, it is not recorded 
 on the per-topic record-error-rate, but only the global error-rate.
  Since users are usually monitoring the per-topic metrics, losing all dropped 
  message counts at this level that are caused by exceptions thrown by the 
  kafka producer, such as BufferExhaustedException, could be very dangerous.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics

2014-11-26 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227042#comment-14227042
 ] 

Guozhang Wang commented on KAFKA-1800:
--

There is still a corner case after this patch where the per-topic metrics 
cannot be recorded because they are not registered: when KafkaProducer's 
waitOnMetadata throws a TimeoutException because the topic metadata is not 
available, the error cannot be recorded in the per-topic metrics, because they 
are only registered at the sender level when the produce requests are being 
sent (in the patch this is changed to when the metadata is refreshed).

To solve this issue, one proposal is:

1. In the Metrics.registerMetric() function, when the metric already exists, 
treat it as a no-op instead of throwing IllegalArgumentException.
2. Expose a registerSenderMetrics() API of the sender in the kafka producer, 
which will be triggered before metadata.awaitUpdate(version, remainingWaitMs) 
in waitOnMetadata.

This fix is a little bit hacky though, so I would like to hear opinions from 
other people. [~jkreps]
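
A rough sketch of what proposal (1) amounts to - an idempotent registration 
wrapper; the Metrics API names here are approximate, so treat the signature as 
an assumption rather than the actual clients code:

{code}
// Swallow the duplicate-registration error so a second registration is a
// no-op, per proposal (1) above. The addMetric() signature is assumed.
static void registerIfAbsent(Metrics metrics, String name, Measurable measurable) {
    try {
        metrics.addMetric(name, measurable);
    } catch (IllegalArgumentException e) {
        // Metric already registered through another code path - ignore.
    }
}
{code}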


 KafkaException was not recorded at the per-topic metrics
 

 Key: KAFKA-1800
 URL: https://issues.apache.org/jira/browse/KAFKA-1800
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.9.0

 Attachments: KAFKA-1800.patch


 When KafkaException was thrown from producer.send() call, it is not recorded 
 on the per-topic record-error-rate, but only the global error-rate.
  Since users are usually monitoring the per-topic metrics, losing all dropped 
  message counts at this level that are caused by exceptions thrown by the 
  kafka producer, such as BufferExhaustedException, could be very dangerous.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1781) Readme should specify that Gradle 2.0 is required for initial bootstrap

2014-11-26 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227078#comment-14227078
 ] 

Jun Rao commented on KAFKA-1781:


We can probably just double commit KAFKA-1624 to 0.8.2 and add the gradle 
version requirement for JDK 8 in README.

 Readme should specify that Gradle 2.0 is required for initial bootstrap
 ---

 Key: KAFKA-1781
 URL: https://issues.apache.org/jira/browse/KAFKA-1781
 Project: Kafka
  Issue Type: Bug
  Components: build
Affects Versions: 0.8.2
Reporter: Jean-Francois Im
Priority: Blocker
 Fix For: 0.8.2

 Attachments: gradle-2.0-readme.patch


 The current README.md says "You need to have gradle installed."
 As the bootstrap procedure doesn't work with gradle 1.12, this needs to say 
 that 2.0 or greater is required.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost

2014-11-26 Thread Soumen Sarkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227286#comment-14227286
 ] 

Soumen Sarkar commented on KAFKA-1642:
--

Is it reasonable to expect that the timeout should have a lower bound (say 
*100 ms*) instead of being 0?

 [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network 
 connection is lost
 ---

 Key: KAFKA-1642
 URL: https://issues.apache.org/jira/browse/KAFKA-1642
 Project: Kafka
  Issue Type: Bug
  Components: producer 
Affects Versions: 0.8.1.1, 0.8.2
Reporter: Bhavesh Mistry
Assignee: Ewen Cheslack-Postava
Priority: Blocker
 Fix For: 0.8.2

 Attachments: 
 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, 
 KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, 
 KAFKA-1642_2014-10-23_16:19:41.patch


 I see my CPU spike to 100% when the network connection is lost for a while. It 
 seems the network IO threads are very busy logging the following error 
 message. Is this expected behavior?
 2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR 
 org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka 
 producer I/O thread: 
 java.lang.IllegalStateException: No entry found for node -2
 at 
 org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110)
 at 
 org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99)
 at 
 org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394)
 at 
 org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380)
 at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174)
 at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175)
 at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
 at java.lang.Thread.run(Thread.java:744)
 Thanks,
 Bhavesh



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)