[jira] [Commented] (KAFKA-1501) transient unit tests failures due to port already in use

2014-09-10 Thread Chris Cope (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128457#comment-14128457
 ] 

Chris Cope commented on KAFKA-1501:
---

[~abhioncbr],
{code}
git clone https://github.com/apache/kafka.git .
./gradlew test
{code}
There will be failures anywhere from 10% to 20% of the time. I *think* there is a 
race condition with TestUtils.choosePorts(), where a port is grabbed, closed, 
and then when the KafkaTestHarness uses it, it's not available yet. Looking 
through the failures of the last few hundred test runs I've done, there is 
usually one (occasionally two) port at fault that then causes subsequent tests 
in the test class to fail. 
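
For reference, choosePorts() works roughly along these lines (a simplified 
sketch, not the exact TestUtils source):
{code}
import java.net.ServerSocket

// Ask the OS for ephemeral ports, remember their numbers, then release the
// sockets. Anything that binds in the window between close() and the test
// harness's own bind can steal the port -- the suspected race.
def choosePorts(count: Int): Seq[Int] = {
  val sockets = (0 until count).map(_ => new ServerSocket(0)) // 0 = any free port
  val ports = sockets.map(_.getLocalPort)
  sockets.foreach(_.close())
  ports
}
{code}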

Essentially, this race condition occurs in approximately 0.06% of socket 
server creations. Even so, my team frequently has to rerun tests after a code 
change because one of them fails; we see this multiple times a day. The best 
solution seems to be to catch this exception and then grab new ports. Again, 
we're talking about the test harness, so which ports it runs on doesn't 
matter. Thoughts?


> transient unit tests failures due to port already in use
> 
>
> Key: KAFKA-1501
> URL: https://issues.apache.org/jira/browse/KAFKA-1501
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Jun Rao
>  Labels: newbie
>
> Saw the following transient failures.
> kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne FAILED
> kafka.common.KafkaException: Socket server failed to bind to 
> localhost:59909: Address already in use.
> at kafka.network.Acceptor.openServerSocket(SocketServer.scala:195)
> at kafka.network.Acceptor.<init>(SocketServer.scala:141)
> at kafka.network.SocketServer.startup(SocketServer.scala:68)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:95)
> at kafka.utils.TestUtils$.createServer(TestUtils.scala:123)
> at 
> kafka.api.ProducerFailureHandlingTest.setUp(ProducerFailureHandlingTest.scala:68)





[jira] [Commented] (KAFKA-1501) transient unit tests failures due to port already in use

2014-09-10 Thread Abhishek Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128504#comment-14128504
 ] 

Abhishek Sharma commented on KAFKA-1501:


[~copester]

When I was not able to reproduce this issue, I intentionally tried to produce 
it by manually giving the same port value to different servers. 

I was thinking of a solution along the same lines as you suggested: catch the 
exception and then grab a new port from the TestUtils port-related function for 
the retry. One thing we need to take care of is to limit the number of retries; 
otherwise, in the worst case, it may end up in an endless loop. 
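
A minimal sketch of that bounded retry (the helper names here are hypothetical, 
not the actual TestUtils API):
{code}
import java.net.{BindException, ServerSocket}

// Hypothetical helper: ask the OS for a currently free ephemeral port.
def choosePort(): Int = {
  val socket = new ServerSocket(0)
  try socket.getLocalPort finally socket.close()
}

// Catch the bind failure and retry on a fresh port, but cap the number of
// attempts so a pathological environment cannot loop forever.
def bindWithRetry(maxTries: Int = 3): ServerSocket = {
  var lastError: BindException = null
  for (_ <- 1 to maxTries) {
    try {
      return new ServerSocket(choosePort())
    } catch {
      case e: BindException => lastError = e // the port was grabbed in between
    }
  }
  throw lastError
}
{code}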



[jira] [Commented] (KAFKA-1501) transient unit tests failures due to port already in use

2014-09-10 Thread Chris Cope (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128653#comment-14128653
 ] 

Chris Cope commented on KAFKA-1501:
---

I agree, [~absingh]. I'm running some more tests, and I think the best way to 
handle this unlikely event is to catch it specifically, rerun the entire test 
class *one* time, and note this in the test log. This bug does not affect the 
core Kafka code, and is simply exposed here because Kafka has such great unit 
tests, and we just happen to run them A LOT for our purposes. I'm proposing 
this solution instead of hunting down and fixing the underlying issue in 
choosePorts(), which, looking around at other projects, does seem like a decent 
implementation.

The probability of a test class failing twice in a row should be very low 
(<0.0001%), which should result in a test-class failure in less than 1% of the 
times `./gradlew test` is run.

Is this approach sound?


[jira] [Commented] (KAFKA-560) Garbage Collect obsolete topics

2014-09-10 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128771#comment-14128771
 ] 

Chris Riccomini commented on KAFKA-560:
---

bq. It would be good to have a tool that could delete any topic that had not 
been written to in a configurable period of time and had no active consumer 
groups. 

I would prefer not to depend on consumer groups. Samza, for example, doesn't 
have consumer groups, so looking at the last offset commit of a consumer group 
in ZK/OffsetManager will not help if the consumer is using Samza (or some other 
offset checkpoint mechanism). The better approach, to me, seems to be to just 
have brokers keep track of approximate last-reads for each topic/partition 
based on FetchRequests.
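
A rough sketch of that broker-side tracking, with all names hypothetical, just 
to make the idea concrete:
{code}
import java.util.concurrent.ConcurrentHashMap
import scala.collection.JavaConverters._

// Keep an approximate last-read timestamp per topic/partition, updated on
// every FetchRequest; a GC tool can later compare these against a cutoff.
class LastReadTracker {
  private val lastRead = new ConcurrentHashMap[(String, Int), Long]()

  // Called from the fetch path; "approximate" is fine, so a plain put suffices.
  def recordFetch(topic: String, partition: Int): Unit =
    lastRead.put((topic, partition), System.currentTimeMillis())

  // Topics whose every tracked partition was last read before the cutoff.
  def idleTopics(cutoffMs: Long): Set[String] =
    lastRead.asScala
      .groupBy { case ((topic, _), _) => topic }
      .collect { case (topic, reads) if reads.values.max < cutoffMs => topic }
      .toSet
}
{code}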

> Garbage Collect obsolete topics
> ---
>
> Key: KAFKA-560
> URL: https://issues.apache.org/jira/browse/KAFKA-560
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Jay Kreps
>  Labels: project
>
> Old junk topics tend to accumulate over time. Code may migrate to use new 
> topics leaving the old ones orphaned. Likewise there are some use cases for 
> temporary transient topics. It would be good to have a tool that could delete 
> any topic that had not been written to in a configurable period of time and 
> had no active consumer groups. Something like
>./bin/delete-unused-topics.sh --last-write [date] --zookeeper [zk_connect]
> This requires API support to get the last update time. I think it may be 
> possible to do this through the OffsetRequest now?





IRC logs now available on botbot.me

2014-09-10 Thread David Arthur

https://botbot.me/freenode/apache-kafka/

Just FYI, wasn't sure if we had any logging in place

Cheers,
David




[jira] [Commented] (KAFKA-1501) transient unit tests failures due to port already in use

2014-09-10 Thread Abhishek Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128928#comment-14128928
 ] 

Abhishek Sharma commented on KAFKA-1501:


Running an entire test class again is not a good approach, I think. Catching 
the specific exception and retrying in the catch block with an alternative port 
seems more efficient. That needs a change only in the test class, and I think 
it should be OK. 

Another approach is to add another util function to the TestUtils class for 
getting a connection using the same retry approach. 

What do you think about getting input from someone else on this?



[jira] [Commented] (KAFKA-1591) Clean-up Unnecessary stack trace in error/warn logs

2014-09-10 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129185#comment-14129185
 ] 

Guozhang Wang commented on KAFKA-1591:
--

Thank you for the patches. A few comments below:

0. Could you merge all the changes into one patch file, since committing is 
easier with a single patch?

1. We no longer need "possible cause: " in the log content. Instead, we can 
just print Exception.toString() at the end of the log entry. I think there are 
some other places where we write "possible cause: ", which can possibly result 
in "possible cause: null" if we are using Exception.getMessage(). Can you also 
search for and clean those up?

2. There are some other places that we may want to reduce the stack trace, for 
example:

a. In SyncProducer, "error("Producer connection to " +  config.host + ":" + 
config.port + " unsuccessful", e)".

I think we should just remove this line since the exception is re-thrown, and 
hence the upper-level caller will catch it and handle / log it.

b. In ClientUtils, "warn("Fetching topic metadata with correlation id %d for 
topics [%s] from broker [%s] failed".format(correlationId, topics, 
shuffledBrokers(i).toString), e)"

We can remove the stack trace logging also.

I think it would be good to go through all the error / warn logging, check 
whether it prints the stack trace, and ask whether that is really useful, in 
the sense that 1) the exceptions thrown will likely be Kafka exceptions, and 
2) they will come from a nested calling trace.
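
As a concrete before/after of that clean-up (the logger plumbing here is 
illustrative, not Kafka's actual Logging trait):
{code}
import org.slf4j.Logger

// Before: passing the Throwable prints a full stack trace for what is
// usually an expected, already-handled condition.
def logBefore(log: Logger, host: String, port: Int, e: Throwable): Unit =
  log.error("Producer connection to " + host + ":" + port + " unsuccessful", e)

// After: appending e.toString keeps the entry to one line and, unlike
// e.getMessage, never prints a bare "null".
def logAfter(log: Logger, host: String, port: Int, e: Throwable): Unit =
  log.error("Producer connection to " + host + ":" + port + " unsuccessful: " + e)
{code}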

> Clean-up Unnecessary stack trace in error/warn logs
> ---
>
> Key: KAFKA-1591
> URL: https://issues.apache.org/jira/browse/KAFKA-1591
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>  Labels: newbie
> Fix For: 0.9.0
>
> Attachments: Jira-1591-SocketConnection-Warning.patch, 
> Jira1591-SendProducerRequest-Warning.patch
>
>
> Some of the unnecessary stack traces in error / warning log entries can 
> easily pollute the log files. Examples include KAFKA-1066, etc.





Re: Need Document and Explanation Of New Metrics Name in New Java Producer on Kafka Trunk

2014-09-10 Thread Jun Rao
We actually don't allow "." in the topic name. A topic name can be
alphanumeric plus "-" and "_".

Thanks,

Jun

On Tue, Sep 9, 2014 at 6:29 PM, Bhavesh Mistry 
wrote:

> Thanks, I was using it without JMX.  I will go through the doc.  But what
> about a Topic Name Convention or Metric Name Convention?  The dot notation
> clashes with a topic having a ".".  Any future plan to enforce some standard
> rules?
>
> Thanks,
>
> Bhavesh
>
> On Tue, Sep 9, 2014 at 3:38 PM, Jay Kreps  wrote:
>
> > Hi Bhavesh,
> >
> > Each of those JMX attributes comes with documentation. If you open up
> > jconsole and attach to a jvm running the consumer you should be able
> > to read the descriptions for each attribute.
> >
> > -Jay
> >
> > On Tue, Sep 9, 2014 at 2:07 PM, Bhavesh Mistry
> >  wrote:
> > > Kafka Team,
> > >
> > > Can you please let me know what each of the following metrics means?
> > > Some of them are obvious, but some are hard to understand. My topic name
> > > is *TOPIC_NAME*.
> > >
> > > Can we enforce a Topic Name Convention or Metric Name Convention? In a
> > > previous version of Kafka, we had a similar issue parsing Kafka metrics
> > > names that contained the host name (codahale lib). I have a topic name
> > > with ".", so it is hard to distinguish the metric name from the topic.
> > > Also, when you guys get a chance, I would appreciate it if you could
> > > explain each metric's description on the wiki so the community would
> > > know what to monitor. Please see below for the full list of metrics from
> > > the new producer.
> > >
> > >
> > > Thanks,
> > >
> > > Bhavesh
> > >
> > >
> > > record-queue-time-avg NaN
> > > *node-1.*request-latency-max -Infinity
> > > record-size-max -Infinity
> > > *node-1.*incoming-byte-rate NaN
> > > request-size-avg NaN
> > > *node-1.*request-latency-avg NaN
> > > *node-2.*request-size-avg NaN
> > > requests-in-flight 0.0
> > > bufferpool-wait-ratio NaN
> > > network-io-rate NaN
> > > metadata-age 239.828
> > > records-per-request-avg NaN
> > > record-retry-rate NaN
> > > buffer-total-bytes 6.7108864E7
> > > buffer-available-bytes 6.7108864E7
> > > topic.*TOPIC_NAME*.record-error-rate NaN
> > > record-send-rate NaN
> > > select-rate NaN
> > > node-2.outgoing-byte-rate NaN
> > > topic.*TOPIC_NAME*.record-retry-rate NaN
> > > batch-size-max -Infinity
> > > connection-creation-rate NaN
> > > node-1.outgoing-byte-rate NaN
> > > topic.*TOPIC_NAME*.byte-rate NaN
> > > waiting-threads 0.0
> > > batch-size-avg NaN
> > > io-wait-ratio NaN
> > > io-wait-time-ns-avg NaN
> > > io-ratio NaN
> > > topic.TOPIC_NAME.record-send-rate NaN
> > > request-size-max -Infinity
> > > record-size-avg NaN
> > > request-latency-max -Infinity
> > > node-2.request-latency-max -Infinity
> > > record-queue-time-max -Infinity
> > > node-2.response-rate NaN
> > > node-1.request-rate NaN
> > > node-1.request-size-max -Infinity
> > > connection-count 3.0
> > > incoming-byte-rate NaN
> > > compression-rate-avg NaN
> > > request-rate NaN
> > > node-1.response-rate NaN
> > > node-2.request-latency-avg NaN
> > > request-latency-avg NaN
> > > record-error-rate NaN
> > > connection-close-rate NaN
> > > *node-2.*request-size-max -Infinity
> > > topic.TOPIC_NAME.compression-rate NaN
> > > node-2.incoming-byte-rate NaN
> > > node-1.request-size-avg NaN
> > > io-time-ns-avg NaN
> > > outgoing-byte-rate NaN
> > > *node-2*.request-rate NaN
> > > response-rate NaN
> >
>


Re: IRC logs now available on botbot.me

2014-09-10 Thread Jun Rao
David,

Thanks for the pointer. Added the link to our website.

Jun

On Wed, Sep 10, 2014 at 11:05 AM, David Arthur  wrote:

> https://botbot.me/freenode/apache-kafka/
>
> Just FYI, wasn't sure if we had any logging in place
>
> Cheers,
> David
>
>
>


[jira] [Updated] (KAFKA-1558) AdminUtils.deleteTopic does not work

2014-09-10 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-1558:
---
Priority: Blocker  (was: Major)

> AdminUtils.deleteTopic does not work
> 
>
> Key: KAFKA-1558
> URL: https://issues.apache.org/jira/browse/KAFKA-1558
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.1.1
>Reporter: Henning Schmiedehausen
>Assignee: Sriharsha Chintalapani
>Priority: Blocker
> Fix For: 0.8.2
>
>
> the AdminUtils.deleteTopic method is implemented as
> {code}
> def deleteTopic(zkClient: ZkClient, topic: String) {
> ZkUtils.createPersistentPath(zkClient, 
> ZkUtils.getDeleteTopicPath(topic))
> }
> {code}
> but the DeleteTopicCommand actually does
> {code}
> zkClient = new ZkClient(zkConnect, 3, 3, ZKStringSerializer)
> zkClient.deleteRecursive(ZkUtils.getTopicPath(topic))
> {code}
> so I guess that the 'createPersistentPath' above should actually be 
> {code}
> def deleteTopic(zkClient: ZkClient, topic: String) {
> ZkUtils.deletePathRecursive(zkClient, ZkUtils.getTopicPath(topic))
> }
> {code}





[jira] [Updated] (KAFKA-686) 0.8 Kafka broker should give a better error message when running against 0.7 zookeeper

2014-09-10 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-686:
--
Priority: Major  (was: Blocker)

> 0.8 Kafka broker should give a better error message when running against 0.7 
> zookeeper
> --
>
> Key: KAFKA-686
> URL: https://issues.apache.org/jira/browse/KAFKA-686
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Jay Kreps
>  Labels: newbie, patch
> Fix For: 0.8.2
>
> Attachments: KAFAK-686-null-pointer-fix.patch, 
> KAFKA-686-null-pointer-fix-2.patch
>
>
> People will not know that the zookeeper paths are not compatible. When you 
> try to start the 0.8 broker pointed at a 0.7 zookeeper you get a 
> NullPointerException. We should detect this and give a more sane error.
> Error:
> kafka.common.KafkaException: Can't parse json string: null
> at kafka.utils.Json$.liftedTree1$1(Json.scala:20)
> at kafka.utils.Json$.parseFull(Json.scala:16)
> at 
> kafka.utils.ZkUtils$$anonfun$getReplicaAssignmentForTopics$2.apply(ZkUtils.scala:498)
> at 
> kafka.utils.ZkUtils$$anonfun$getReplicaAssignmentForTopics$2.apply(ZkUtils.scala:494)
> at 
> scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
> at scala.collection.immutable.List.foreach(List.scala:45)
> at 
> kafka.utils.ZkUtils$.getReplicaAssignmentForTopics(ZkUtils.scala:494)
> at 
> kafka.controller.KafkaController.initializeControllerContext(KafkaController.scala:446)
> at 
> kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:220)
> at 
> kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:85)
> at 
> kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:53)
> at 
> kafka.server.ZookeeperLeaderElector.startup(ZookeeperLeaderElector.scala:43)
> at kafka.controller.KafkaController.startup(KafkaController.scala:381)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:90)
> at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
> at kafka.Kafka$.main(Kafka.scala:46)
> at kafka.Kafka.main(Kafka.scala)
> Caused by: java.lang.NullPointerException
> at 
> scala.util.parsing.combinator.lexical.Scanners$Scanner.<init>(Scanners.scala:52)
> at scala.util.parsing.json.JSON$.parseRaw(JSON.scala:71)
> at scala.util.parsing.json.JSON$.parseFull(JSON.scala:85)
> at kafka.utils.Json$.liftedTree1$1(Json.scala:17)
> ... 16 more
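
A hedged sketch of the saner check the ticket asks for (the method and message 
here are illustrative, not a proposed patch):
{code}
import kafka.common.KafkaException
import scala.util.parsing.json.JSON

// If ZooKeeper hands back null where 0.8 expects JSON, the broker is most
// likely pointed at an old (0.7-era) ZooKeeper layout; say so explicitly
// instead of letting the JSON parser throw a NullPointerException.
def parseBrokerJson(jsonOpt: Option[String], path: String): Any = jsonOpt match {
  case Some(json) =>
    JSON.parseFull(json).getOrElse(
      throw new KafkaException("Can't parse json string: " + json))
  case None =>
    throw new KafkaException("No JSON data found at " + path +
      ". The ZooKeeper registry may have been written by an incompatible " +
      "(pre-0.8) Kafka version.")
}
{code}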





Re: Review Request 25420: Patch for KAFKA-686

2014-09-10 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25420/#review52980
---


Thanks for the patch. Overall it is a great clean-up. Some comments below:


core/src/main/scala/kafka/controller/PartitionStateMachine.scala


The controller will only be shut down when the Kafka server is shutting down; 
we should probably just call controller.onControllerResignation().



core/src/main/scala/kafka/utils/ZkUtils.scala


Do we still need this function any more? Shall we just change the usages of 
getReplicasForPartition to getPartitionAssignmentForTopic?



core/src/main/scala/kafka/utils/ZkUtils.scala


Why do we want it to be mutable?



core/src/main/scala/kafka/utils/ZkUtils.scala


Shall we also use failAsValue on catching ZkNoNodeException in 
getChildrenParentMayNotExist, and make its return type an Option?
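
Something like this Option-returning variant, sketched against zkclient's API 
(the exact signature is a guess):
{code}
import org.I0Itec.zkclient.ZkClient
import org.I0Itec.zkclient.exception.ZkNoNodeException
import scala.collection.JavaConverters._

// Turn the missing-parent case into None so every caller has to decide what
// an absent node means, instead of catching ZkNoNodeException itself.
def getChildrenParentMayNotExist(zkClient: ZkClient, path: String): Option[Seq[String]] =
  try {
    Some(zkClient.getChildren(path).asScala.toSeq)
  } catch {
    case _: ZkNoNodeException => None
  }
{code}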



core/src/main/scala/kafka/utils/ZkUtils.scala


Can we rename to "consumerOffsetForPartitionDir" for naming consistency?


- Guozhang Wang


On Sept. 7, 2014, 7:22 p.m., Viktor Tarananeko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25420/
> ---
> 
> (Updated Sept. 7, 2014, 7:22 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-686
> https://issues.apache.org/jira/browse/KAFKA-686
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Merge branch 'trunk' into fix-null-pointer-in-zk-utils
> 
> 
> unify topic partition path constructing; refactoring and better code reuse
> 
> 
> reuse method to fetch broker data
> 
> 
> unify fetching topics
> 
> 
> raise InvalidTopicException if unable to properly parse json data from 
> Zookeeper
> 
> 
> TopicRegistrationInfo class
> 
> 
> base controller failure test on invalid topic data
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/common/TopicRegistrationInfo.scala PRE-CREATION 
>   core/src/main/scala/kafka/consumer/TopicCount.scala 
> 0954b3c3ff8b3b7a7a4095436bc9e6c494a38c37 
>   core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
> fbc680fde21b02f11285a4f4b442987356abd17b 
>   core/src/main/scala/kafka/controller/PartitionStateMachine.scala 
> e20b63a6ec1c1a848bc3823701b0f8ceaeb6100d 
>   core/src/main/scala/kafka/server/KafkaHealthcheck.scala 
> 4acdd70fe9c1ee78d6510741006c2ece65450671 
>   core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala 
> d1e7c434e77859d746b8dc68dd5d5a3740425e79 
>   core/src/main/scala/kafka/tools/ExportZkOffsets.scala 
> 4d051bc2db12f0e15aa6a3289abeb9dd25d373d2 
>   core/src/main/scala/kafka/tools/UpdateOffsetsInZK.scala 
> 111c9a8b94ce45d95551482e9fd3f8c1cccbf548 
>   core/src/main/scala/kafka/utils/ZkUtils.scala 
> a7b1fdcb50d5cf930352d37e39cb4fc9a080cb12 
>   core/src/test/scala/unit/kafka/integration/AutoOffsetResetTest.scala 
> 95303e098d40cd790fb370e9b5a47d20860a6da3 
>   core/src/test/scala/unit/kafka/server/ControllerFailureTest.scala 
> PRE-CREATION 
>   core/src/test/scala/unit/kafka/utils/TestUtils.scala 
> c4e13c5240c8303853d08cc3b40088f8c7dae460 
> 
> Diff: https://reviews.apache.org/r/25420/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Viktor Tarananeko
> 
>



Re: [DISCUSS] 0.8.2 release branch, "unofficial" release candidates(s), 0.8.1.2 release

2014-09-10 Thread Jun Rao
Joe,

Thanks for starting the discussion.

(1) I made a pass of the open jiras for 0.8.2 and marked a few of them as
blockers for now. There are currently 6 blockers. Ideally, we want to get
all those fixed before cutting the 0.8.2 branch. The rest of the jiras
don't really have to be fixed in 0.8.2. So, if anyone wants to help on
fixing those blocker jiras, that would be great. Perhaps we can circle back
in a couple of weeks and see how much progress we make on those blocker
jiras.

(2) A beta 0.8.2 may not be a bad idea.

(3) We can do 0.8.1.2. However, I'd prefer to back-port only trivial and
critical patches. The Scala 2.11 patch seems OK.

(4) Yes, we should start updating the wiki once 0.8.2 is cut.

(5) Yes, we can include kafka-1555 if it can be fixed in time.

Thanks,

Jun



On Wed, Sep 3, 2014 at 6:34 PM, Joe Stein  wrote:

> Hey, I wanted to take a quick pulse to see if we are getting closer to a
> branch for 0.8.2.
>
> 1) There still seems to be a lot of open issues
>
> https://issues.apache.org/jira/browse/KAFKA/fixforversion/12326167/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel
> and our 30-day summary is showing 51 issues created and *34* resolved. I'm not
> sure how much of that we could really just decide to push off to 0.8.3 or
> 0.9.0 vs. working on 0.8.2 as stable for release.  There is already so much
> goodness on trunk.  I appreciate the double commit pain especially as trunk
> and branch drift (ugh).
>
> 2) Also, I wanted to float the idea that, after making the 0.8.2 branch, I
> would do some unofficial release candidates for folks to test prior to the
> release and vote.  What I was thinking was I would build, upload and stage
> as if I were preparing artifacts for a vote, but let the community know to go
> in and "have at it" well prior to the release vote.  We don't get a lot of
> community votes during a release but do get issues after (which is natural
> because of how things are done).  I have seen four Apache projects do this
> very successfully: not only have they had fewer iterations of RC votes
> (sensitive to that myself), but the community kicked back issues they saw,
> because the "pre-release" time let them run the release through their own
> test and staging environments as it came about.
>
> 3) Checking again on whether we should have a 0.8.1.2 release, in case folks
> in the community find important features they can't wait for and that
> wouldn't be too painful/dangerous to back-port (this might be best asked on
> the user list, not sure). Two things that spring to the top of my head are
> 2.11 Scala support and fixing the source jars.  Both of these are easy to
> patch; personally I don't mind, but I want to gauge more from the community
> on this too.  I have heard gripes ad hoc from folks in direct communication
> but no real complaints in the public forum, and wanted to open the floor if
> folks had a need.
>
> 4) I feel the 0.9 work is being held up some (or at least resourcing it is,
> from my perspective).  We decided to hold off on including SSL (even though
> we have a path for it). Jay recently did a nice update to the Security wiki,
> which I think we should move forward with.  I have some more to
> add/change/update and want to start getting down to more details and getting
> specific people working on specific tasks, but without knowing what we are
> doing and when, it is hard to manage.
>
> 5) I just updated https://issues.apache.org/jira/browse/KAFKA-1555. I think
> it is a really important feature; it doesn't have to be in 0.8.2, but we need
> consensus (no pun intended). It fundamentally allows a minimum-two-rack
> requirement for data, which A LOT of data requires for a save to be
> considered successful.
>
> /***
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop 
> /
>


Re: IRC logs now available on botbot.me

2014-09-10 Thread Jay Kreps
That's awesome.

-Jay

On Wed, Sep 10, 2014 at 11:05 AM, David Arthur  wrote:
> https://botbot.me/freenode/apache-kafka/
>
> Just FYI, wasn't sure if we had any logging in place
>
> Cheers,
> David
>
>


[jira] [Commented] (KAFKA-1558) AdminUtils.deleteTopic does not work

2014-09-10 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129451#comment-14129451
 ] 

Sriharsha Chintalapani commented on KAFKA-1558:
---

[~junrao] Do you know of any steps for reproducing this? Thanks.



Re: Need Document and Explanation Of New Metrics Name in New Java Producer on Kafka Trunk

2014-09-10 Thread Bhavesh Mistry
I am using a topic name with "." and it works with both the old and new
producers/consumers.  Is Kafka enforcing this in code, or is it a documented
limitation?


Thanks,

Bhavesh

On Wed, Sep 10, 2014 at 3:24 PM, Jun Rao  wrote:

> We actually don't allow "." in the topic name. A topic name can be
> alphanumeric plus "-" and "_".
>
> Thanks,
>
> Jun


[jira] [Commented] (KAFKA-1558) AdminUtils.deleteTopic does not work

2014-09-10 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129506#comment-14129506
 ] 

Gwen Shapira commented on KAFKA-1558:
-

I don't think the issue as described above exists any more. 
I tested the trunk implementation of deleteTopic in simple cases, and in all of 
them, if delete.topic.enable was true, the topic was deleted.

I think what we need now is to test deleteTopic under failure modes: leader 
election, partition reassignment, etc.
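
For reference, topic deletion is gated by a broker config, e.g. in 
server.properties:
{code}
delete.topic.enable=true
{code}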



Re: Need Document and Explanation Of New Metrics Name in New Java Producer on Kafka Trunk

2014-09-10 Thread Jun Rao
Hmm, it seems that we do allow "." in the topic name. The topic name can't
be just "." or ".." though. So, if there is a topic "test.1", we will have
the following JMX metric name:

kafka.producer.console-producer.topic.test:type=1

It should be changed to:
kafka.producer.console-producer.topic:type=test.1
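
A toy illustration of why the dot is ambiguous (this is not Kafka's actual
metrics code): any parser that splits the flat name on "." cannot tell the
topic's own dots from the structural ones.
{code}
// The flat name mixes structural dots with the topic's own dot.
val name = "kafka.producer.console-producer.topic.test.1"

// Splitting on the last dot -- one plausible parsing rule -- tears the
// topic "test.1" apart.
val idx = name.lastIndexOf('.')
val prefix = name.substring(0, idx)  // "kafka.producer.console-producer.topic.test"
val kind   = name.substring(idx + 1) // "1" -- half of the topic name is gone
{code}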

Could you file a jira to follow up on this?

Thanks,

Jun

On Wed, Sep 10, 2014 at 5:56 PM, Bhavesh Mistry 
wrote:

> I am using a topic name with "." and it works with both the old and new
> producers/consumers.  Is Kafka enforcing this in code, or is it a documented
> limitation?
>
>
> Thanks,
>
> Bhavesh


[jira] [Created] (KAFKA-1628) [New Java Producer] Topic which contains "." does not get a correct corresponding metric name

2014-09-10 Thread Bhavesh Mistry (JIRA)
Bhavesh Mistry created KAFKA-1628:
-

 Summary: [New Java Producer] Topic which contains "." does not get a 
correct corresponding metric name 
 Key: KAFKA-1628
 URL: https://issues.apache.org/jira/browse/KAFKA-1628
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.8.2
 Environment: ALL
Reporter: Bhavesh Mistry
Priority: Minor



Hmm, it seems that we do allow "." in the topic name. The topic name can't
be just "." or ".." though. So, if there is a topic "test.1", we will have
the following JMX metric name:

kafka.producer.console-producer.topic.test:type=1

It should be changed to:
kafka.producer.console-producer.topic:type=test.1

Could you file a jira to follow up on this?

Thanks,

Jun






Re: Need Document and Explanation Of New Metrics Name in New Java Producer on Kafka Trunk

2014-09-10 Thread Bhavesh Mistry
Hi Jun,

I have created this issue for tracking purposes:
https://issues.apache.org/jira/browse/KAFKA-1628

Thanks,

Bhavesh

On Wed, Sep 10, 2014 at 9:06 PM, Jun Rao  wrote:

> Hmm, it seems that we do allow "." in the topic name. The topic name can't
> be just "." or ".." though. So, if there is a topic "test.1", we will have
> the following JMX metric name:
>
> kafka.producer.console-producer.topic.test:type=1
>
> It should be changed to:
> kafka.producer.console-producer.topic:type=test.1
>
> Could you file a jira to follow up on this?
>
> Thanks,
>
> Jun

[jira] [Created] (KAFKA-1629) Replica fetcher thread need to back off upon getting errors on partitions

2014-09-10 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-1629:


 Summary: Replica fetcher thread need to back off upon getting 
errors on partitions
 Key: KAFKA-1629
 URL: https://issues.apache.org/jira/browse/KAFKA-1629
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
 Fix For: 0.9.0


ReplicaFetcherThread's handlePartitionsWithErrors() function needs to be 
implemented (currently it is an empty function) such that, upon getting errors 
on some partitions, the fetcher thread backs off the corresponding simple 
consumer before retrying those partitions.

This can happen during leader migration: the replica may be a bit delayed in 
receiving the leader-and-ISR update request and keep retrying fetches against 
the old leader.
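
A minimal sketch of the requested behavior (the real AbstractFetcherThread hook 
has a different shape; this only shows the backoff idea):
{code}
case class TopicAndPartition(topic: String, partition: Int)

// Called with the partitions whose fetch responses carried errors. Rather
// than hot-looping against a possibly stale leader, pause briefly so the
// leader-and-ISR update has a chance to arrive before the next fetch.
def handlePartitionsWithErrors(partitions: Iterable[TopicAndPartition],
                               backoffMs: Long = 1000L): Unit = {
  if (partitions.nonEmpty)
    Thread.sleep(backoffMs) // finer-grained: delay only the failing partitions
}
{code}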





[jira] [Created] (KAFKA-1630) ConsumerFetcherThread locked in Tomcat

2014-09-10 Thread vijay (JIRA)
vijay created KAFKA-1630:


 Summary: ConsumerFetcherThread locked in Tomcat
 Key: KAFKA-1630
 URL: https://issues.apache.org/jira/browse/KAFKA-1630
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 0.8.0
 Environment: linux redhat
Reporter: vijay
Assignee: Neha Narkhede


I am using the high-level consumer API for consuming messages from Kafka. The 
ConsumerFetcherThread gets locked. Kindly look into the stack trace below:

"ConsumerFetcherThread-SocialTwitterStream6_172.31.240.136-1410398702143-61a247c3-0-1"
 prio=10 tid=0x7f294001e800 nid=0x1677 runnable [0x7f297aae9000]
   java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
- locked <0x7f2a7c38eb40> (a sun.nio.ch.Util$1)
- locked <0x7f2a7c38eb28> (a java.util.Collections$UnmodifiableSet)
- locked <0x7f2a7c326f20> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:193)
- locked <0x7f2a7c2163c0> (a java.lang.Object)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:86)
- locked <0x7f2a7c229950> (a 
sun.nio.ch.SocketAdaptor$SocketInputStream)
at 
java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:200)
- locked <0x7f2a7c38ea50> (a java.lang.Object)
at kafka.utils.Utils$.read(Utils.scala:395)
at 
kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
at 
kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:73)
at 
kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:71)
- locked <0x7f2a7c38e9f0> (a java.lang.Object)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:110)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:110)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:110)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:109)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:109)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:109)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:108)
at 
kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:94)
at 
kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:86)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)


