[jira] [Commented] (KAFKA-1365) Second Manual preferred replica leader election command always fails

2014-04-18 Thread BalajiSeshadri (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974159#comment-13974159
 ] 

BalajiSeshadri commented on KAFKA-1365:
---

We will test the controller issue one in dev ,we will test this also at that 
time.

Sorry for that.



 Second Manual preferred replica leader election command always fails
 

 Key: KAFKA-1365
 URL: https://issues.apache.org/jira/browse/KAFKA-1365
 Project: Kafka
  Issue Type: Bug
  Components: controller, tools
Affects Versions: 0.8.1
Reporter: Ryan Berdeen
Assignee: Neha Narkhede
Priority: Critical
 Fix For: 0.8.1.1

 Attachments: KAFKA-1365.patch


 After running kafka-preferred-replica-election.sh once, a second run will 
 fail with Preferred replica leader election currently in progress for 
 The /admin/preferred_replica_election key is never deleted from ZooKeeper, 
 because the isTriggeredByAutoRebalance parameter to 
 onPreferredReplicaElection 
 (https://github.com/apache/kafka/blob/0ffec142a991849833d9767be07e895428ccaea1/core/src/main/scala/kafka/controller/KafkaController.scala#L614)
  is used incorrectly. In the automatic case 
 (https://github.com/apache/kafka/blob/0ffec142a991849833d9767be07e895428ccaea1/core/src/main/scala/kafka/controller/KafkaController.scala#L1119),
  it is set to false. In the manual case 
 (https://github.com/apache/kafka/blob/0ffec142a991849833d9767be07e895428ccaea1/core/src/main/scala/kafka/controller/KafkaController.scala#L1266)
  the parameter is not passed, so it defaults to true.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 20471: Patch for KAFKA-1398

2014-04-18 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/#review40772
---


Could you add some comments at the beginning of TopicConfigManager to make it 
clear that the TopicConfigChange command directly updates the topic config path 
in ZK and the notification path is just for refreshing the topic config cache 
in each broker? This wasn't very clear to me after reading the current comment.


core/src/main/scala/kafka/server/TopicConfigManager.scala
https://reviews.apache.org/r/20471/#comment73897

The child list returned by ZK doesn't guarantee any ordering. We will need 
to sort this list so that we don't miss the latest config change.



core/src/main/scala/kafka/server/TopicConfigManager.scala
https://reviews.apache.org/r/20471/#comment73896

changeZnode is unused.



core/src/main/scala/kafka/server/TopicConfigManager.scala
https://reviews.apache.org/r/20471/#comment73899

In the corner case, it seems that it's still possible by the time that we 
try to read this ZK path, it's deleted by another broker. We probably need to 
handle the ZKNodeNotExistException.



core/src/main/scala/kafka/server/TopicConfigManager.scala
https://reviews.apache.org/r/20471/#comment73900

It seems that if there are no new config changes, the last few config 
changes in the notification path will not be deleted. This can be a bit 
confusing.



core/src/main/scala/kafka/server/TopicConfigManager.scala
https://reviews.apache.org/r/20471/#comment73898

This is not needed. In ZK, new sequential nodes always get a higher id 
whether previous sequential nodes are deleted or not. ZK server maintains 
enough info to achieve that.



core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala
https://reviews.apache.org/r/20471/#comment73901

Missing Apache header.


- Jun Rao


On April 18, 2014, 12:36 a.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20471/
 ---
 
 (Updated April 18, 2014, 12:36 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1398
 https://issues.apache.org/jira/browse/KAFKA-1398
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1398 dynamic config changes are broken.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/TopicConfigManager.scala 
 d41fd33d91406dfa2ce8c1e1b04a078e983ccadd 
   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/20471/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




[jira] [Resolved] (KAFKA-1398) Topic config changes can be lost and cause fatal exceptions on broker restarts

2014-04-18 Thread Jay Kreps (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps resolved KAFKA-1398.
--

Resolution: Fixed

 Topic config changes can be lost and cause fatal exceptions on broker restarts
 --

 Key: KAFKA-1398
 URL: https://issues.apache.org/jira/browse/KAFKA-1398
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joel Koshy
Assignee: Jay Kreps
Priority: Critical
 Fix For: 0.8.1.1

 Attachments: KAFKA-1398.patch


 Our topic config cleanup policy seems to be broken. When a broker is
 bounced and starting up:
 1 - Read all the children of the config change path
 2 - For each, if the change id is greater than the last executed change,
   then extract the topic information.
 3 - If there is a log for that topic on this broker, then apply the change.
   However, if there is no log, then delete the config change.
 In step 3, a delete triggers a child change watch firing on all the other
 brokers. The other brokers currently take all the children of the config
 path but will ignore those config changes that are less than the last
 executed change. At least one issue here is that if a broker does not have
 partitions for a topic then the lastExecutedChange is not updated (for
 that topic).
 Consider this scenario:
 - Three brokers 0, 1, 2
 - Topic A has partitions only assigned to broker 0
 - Topic B has partitions only assigned to broker 1
 - Topic C has partitions only assigned to broker 2
 - Change 0: topic A
 - Change 1: topic B
 - Change 2: topic C
 - lastExecutedChange on broker 0 is 0
 - lastExecutedChange on broker 1 is 1
 - lastExecutedChange on broker 2 is 2
 - Bounce broker 1
 - The above bounce will cause Change 0 and Change 2 to get deleted.
 - Watch fires on broker 0 and 1
 - Broker 0 will try and read the topic corresponding to change 1 (since its
   lastExecutedChange is 0) and then change 2. That read will fail:
 2014/04/15 19:35:34.236 INFO [TopicConfigManager] [main] [kafka-server] [] 
 Processed topic config change 25 for topic xyz, setting new config to
  {retention.ms=360, segment.ms=360}.
 2014/04/15 19:35:34.238 FATAL [KafkaServerStartable] [main] [kafka-server] [] 
 Fatal error during KafkaServerStable startup. Prepare to shutdown
 org.I0Itec.zkclient.exception.ZkNoNodeException: 
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
 NoNode for /config/changes/config_change_26
 at 
 org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
 at kafka.utils.ZkUtils$.readData(ZkUtils.scala:467)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:97)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:93)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
 at 
 kafka.server.TopicConfigManager.kafka$server$TopicConfigManager$$processConfigChanges(TopicConfigManager.scala:93)
 at 
 kafka.server.TopicConfigManager.processAllConfigChanges(TopicConfigManager.scala:81)
 at 
 kafka.server.TopicConfigManager.startup(TopicConfigManager.scala:72)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:104)
 at 
 kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
 ...
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
 KeeperErrorCode = NoNode for /config/changes/config_change_26
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
 at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
 ... 39 more
 Another issue is that there are two logging statements with incorrect
 qualifiers which makes things a little harder to debug. E.g.,
 2014/04/15 19:35:34.223 ERROR [TopicConfigManager] [kafka-server] [] Ignoring 
 topic config change %d for topic 

[jira] [Commented] (KAFKA-1405) Global JSON.globalNumberParser screws up other libraries

2014-04-18 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974266#comment-13974266
 ] 

Jun Rao commented on KAFKA-1405:


Vadim,

Thanks for reporting this. Yes, overriding JSON.globalNumberParser is not 
ideal. Are you sure JSON is going to be deprecated in scala 2.11? The api is 
still there.

http://www.scala-lang.org/api/2.11.0-M3/index.html#scala.util.parsing.json.JSON$

If JSON is still supported in scala 2.11, maybe we can see if there is a way to 
fix JSON.globalNumberParser. Could we just override the global value each time 
we do parsing and reset it back to the original value when done?

As for dragging in lift-json, this is also possible, but we have to be a bit 
careful. So, we need to know that the library is well maintained and has a good 
history of providing backward compatible releases. Currently, the consumer 
client depends on json parsing. Any jar compatibility issue could affect the 
consumer applications.

 Global JSON.globalNumberParser screws up other libraries
 

 Key: KAFKA-1405
 URL: https://issues.apache.org/jira/browse/KAFKA-1405
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Vadim Chekan
  Labels: json

 I'm getting exception kafka.common.ConsumerRebalanceFailedException but it 
 only happens when I do a call to scala/pickling serialization library. What 
 the connection you might ask? The underly exception is 
 ZookeeperConsumerConnector:76, exception during rebalance 
 kafka.common.KafkaException: Failed to parse the broker info from zookeeper: 
 {jmx_port:-1,timestamp:1397514497053,host:xxx,version:1,port:9092}
 Caused by: java.lang.ClassCastException: java.lang.Double cannot be cast to 
 java.lang.Integer
 A little bit looking at  kafka code lead me to this line:
 In 
 https://github.com/apache/kafka/blob/0.8.0/core/src/main/scala/kafka/utils/Json.scala#L27
 there is JSON.globalNumberParser redefined. It is terrible idea to change 
 global variable. This JSON library is used by other libraries and this global 
 assignment messes it up.
 My 5-minutes research shows that scala's JSON library was considered almost 
 of demo quality and most people prefer ligt-json implementation.
 https://groups.google.com/forum/#!topic/scala-user/P7-8PEUUj6A
 Also it is my understanding, that scala JSON is deprecated in scala-2.11, so 
 this change is needed anyway.
 If no objections to this ticket in general, I can work on a patch to use 3rd 
 party JSON library usage in kafka. Pleas let me know...



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 20471: Patch for KAFKA-1398

2014-04-18 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/#review40780
---



core/src/main/scala/kafka/server/TopicConfigManager.scala
https://reviews.apache.org/r/20471/#comment73918

This is done in readDataMaybeNull



core/src/main/scala/kafka/server/TopicConfigManager.scala
https://reviews.apache.org/r/20471/#comment73921

You mean that the purge is only triggered by a broker bounce or a new 
config change? I think that is fine right? i.e., in favor of some separate 
trigger such as a scheduled task doing that?

We could store the last executed change as part of the topic config itself 
for informational purposes (so people can be sure that a config change still 
sitting in zookeeper was in fact applied to the topic config).


- Joel Koshy


On April 18, 2014, 12:36 a.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20471/
 ---
 
 (Updated April 18, 2014, 12:36 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1398
 https://issues.apache.org/jira/browse/KAFKA-1398
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1398 dynamic config changes are broken.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/TopicConfigManager.scala 
 d41fd33d91406dfa2ce8c1e1b04a078e983ccadd 
   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/20471/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




Build failed in Jenkins: Kafka-trunk #167

2014-04-18 Thread Apache Jenkins Server
See https://builds.apache.org/job/Kafka-trunk/167/changes

Changes:

[jay.kreps] KAFKA-1398 dynamic config changes are broken.

--
[...truncated 738 lines...]
kafka.admin.AddPartitionsTest  testManualAssignmentOfReplicas PASSED

kafka.admin.AddPartitionsTest  testReplicaPlacement PASSED

kafka.admin.DeleteTopicTest  testFake PASSED

kafka.admin.AdminTest  testReplicaAssignment PASSED

kafka.admin.AdminTest  testManualReplicaAssignment PASSED

kafka.admin.AdminTest  testTopicCreationInZK PASSED

kafka.admin.AdminTest  testPartitionReassignmentWithLeaderInNewReplicas PASSED

kafka.admin.AdminTest  testPartitionReassignmentWithLeaderNotInNewReplicas 
PASSED

kafka.admin.AdminTest  testPartitionReassignmentNonOverlappingReplicas PASSED

kafka.admin.AdminTest  testReassigningNonExistingPartition PASSED

kafka.admin.AdminTest  testResumePartitionReassignmentThatWasCompleted PASSED

kafka.admin.AdminTest  testPreferredReplicaJsonData PASSED

kafka.admin.AdminTest  testBasicPreferredReplicaElection PASSED

kafka.admin.AdminTest  testShutdownBroker PASSED

kafka.admin.AdminTest  testTopicConfigChange PASSED

kafka.utils.IteratorTemplateTest  testIterator PASSED

kafka.utils.JsonTest  testJsonEncoding PASSED

kafka.utils.SchedulerTest  testMockSchedulerNonPeriodicTask PASSED

kafka.utils.SchedulerTest  testMockSchedulerPeriodicTask PASSED

kafka.utils.SchedulerTest  testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest  testNonPeriodicTask PASSED

kafka.utils.SchedulerTest  testPeriodicTask PASSED

kafka.utils.UtilsTest  testSwallow PASSED

kafka.utils.UtilsTest  testCircularIterator PASSED

kafka.utils.UtilsTest  testReadBytes PASSED

kafka.utils.UtilsTest  testReplaceSuffix PASSED

kafka.utils.UtilsTest  testReadInt PASSED

kafka.utils.UtilsTest  testCsvList PASSED

kafka.utils.UtilsTest  testInLock PASSED

kafka.api.RequestResponseSerializationTest  
testSerializationAndDeserialization PASSED

kafka.api.ApiUtilsTest  testShortStringNonASCII PASSED

kafka.api.ApiUtilsTest  testShortStringASCII PASSED

kafka.api.ProducerFailureHandlingTest  testInvalidPartition PASSED

kafka.api.ProducerFailureHandlingTest  testTooLargeRecordWithAckZero PASSED

kafka.api.ProducerFailureHandlingTest  testTooLargeRecordWithAckOne PASSED

kafka.api.ProducerFailureHandlingTest  testNonExistTopic PASSED

kafka.api.ProducerFailureHandlingTest  testWrongBrokerList FAILED
junit.framework.AssertionFailedError: Partition [topic-1,0] metadata not 
propagated after timeout
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.assertTrue(Assert.java:20)
at 
kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:557)
at 
kafka.utils.TestUtils$$anonfun$createTopic$1.apply(TestUtils.scala:161)
at 
kafka.utils.TestUtils$$anonfun$createTopic$1.apply(TestUtils.scala:160)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.Range.foreach(Range.scala:141)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at kafka.utils.TestUtils$.createTopic(TestUtils.scala:160)
at 
kafka.api.ProducerFailureHandlingTest.testWrongBrokerList(ProducerFailureHandlingTest.scala:162)

kafka.api.ProducerFailureHandlingTest  testNoResponse PASSED

kafka.api.ProducerFailureHandlingTest  testSendAfterClosed PASSED

kafka.api.test.ProducerCompressionTest  testCompression[0] PASSED

kafka.api.test.ProducerCompressionTest  testCompression[1] PASSED

kafka.api.test.ProducerSendTest  testSendOffset PASSED

kafka.api.test.ProducerSendTest  testClose PASSED

kafka.api.test.ProducerSendTest  testSendToPartition PASSED

kafka.api.test.ProducerSendTest  testAutoCreateTopic PASSED

kafka.consumer.TopicFilterTest  testWhitelists PASSED

kafka.consumer.TopicFilterTest  testBlacklists PASSED

kafka.consumer.ConsumerIteratorTest  
testConsumerIteratorDeduplicationDeepIterator PASSED

kafka.consumer.ConsumerIteratorTest  testConsumerIteratorDecodingFailure PASSED

kafka.consumer.ZookeeperConsumerConnectorTest  testCompression PASSED

kafka.consumer.ZookeeperConsumerConnectorTest  testBasic PASSED

kafka.consumer.ZookeeperConsumerConnectorTest  testCompressionSetConsumption 
PASSED

kafka.consumer.ZookeeperConsumerConnectorTest  testConsumerDecoder PASSED

kafka.consumer.ZookeeperConsumerConnectorTest  testLeaderSelectionForPartition 
PASSED

kafka.message.ByteBufferMessageSetTest  testWrittenEqualsRead PASSED

kafka.message.ByteBufferMessageSetTest  testIteratorIsConsistent PASSED

kafka.message.ByteBufferMessageSetTest  testSizeInBytes PASSED

kafka.message.ByteBufferMessageSetTest  testWriteTo PASSED

kafka.message.ByteBufferMessageSetTest  testIterator PASSED


[jira] [Commented] (KAFKA-1403) Adding timestamp to kafka index structure

2014-04-18 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974302#comment-13974302
 ] 

Jun Rao commented on KAFKA-1403:


Would it be good enough to just add the timestamp in the index? The timestamp 
won't be very accurate at the message level, but will be much better than the 
timestamp at the segment file level.

 Adding timestamp to kafka index structure
 -

 Key: KAFKA-1403
 URL: https://issues.apache.org/jira/browse/KAFKA-1403
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 0.8.1
Reporter: Xinyao Hu
   Original Estimate: 336h
  Remaining Estimate: 336h

 Right now, kafka doesn't have timestamp per message. It makes an assumption 
 that all the messages in the same file has the same timestamp which is the 
 mtime of the file. This makes it inefficient to scan all the messages within 
 a time window, which is a valid use case in a lot of realtime data analysis. 
 One way to hack this is to roll a new file in a short period of time. 
 However, this will result in opening lots of files (KAFKA-1404) which crashed 
 the servers eventually. 
 My guess this is not implemented due to the efficiency reason. It will cost 
 additional four bytes per message which might be pinned in memory for fast 
 access. There might be some simple perf optimization, such as differential 
 encoding + var length encoding, which should bring down the cost to 1-2 bytes 
 avg per message. 
 Let me know if this makes sense. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 20471: Patch for KAFKA-1398

2014-04-18 Thread Jay Kreps


 On April 18, 2014, 3:22 p.m., Jun Rao wrote:
  core/src/main/scala/kafka/server/TopicConfigManager.scala, lines 87-88
  https://reviews.apache.org/r/20471/diff/1/?file=561891#file561891line87
 
  The child list returned by ZK doesn't guarantee any ordering. We will 
  need to sort this list so that we don't miss the latest config change.

I don't think this depends on any ordering.


 On April 18, 2014, 3:22 p.m., Jun Rao wrote:
  core/src/main/scala/kafka/server/TopicConfigManager.scala, lines 121-134
  https://reviews.apache.org/r/20471/diff/1/?file=561891#file561891line121
 
  It seems that if there are no new config changes, the last few config 
  changes in the notification path will not be deleted. This can be a bit 
  confusing.

Yes, that is by design. Otherwise you need some background thread to check. 
This is less confusing. Change notifications can't pile up because if new ones 
come that will trigger purging.


 On April 18, 2014, 3:22 p.m., Jun Rao wrote:
  core/src/main/scala/kafka/server/TopicConfigManager.scala, lines 123-124
  https://reviews.apache.org/r/20471/diff/1/?file=561891#file561891line123
 
  This is not needed. In ZK, new sequential nodes always get a higher id 
  whether previous sequential nodes are deleted or not. ZK server maintains 
  enough info to achieve that.

If you are totally sure I will remove that bit.


- Jay


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/#review40772
---


On April 18, 2014, 12:36 a.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20471/
 ---
 
 (Updated April 18, 2014, 12:36 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1398
 https://issues.apache.org/jira/browse/KAFKA-1398
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1398 dynamic config changes are broken.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/TopicConfigManager.scala 
 d41fd33d91406dfa2ce8c1e1b04a078e983ccadd 
   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/20471/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




Re: Review Request 20471: Patch for KAFKA-1398

2014-04-18 Thread Jay Kreps


 On April 18, 2014, 5:25 p.m., Joel Koshy wrote:
  core/src/main/scala/kafka/server/TopicConfigManager.scala, lines 121-134
  https://reviews.apache.org/r/20471/diff/1/?file=561891#file561891line121
 
  You mean that the purge is only triggered by a broker bounce or a new 
  config change? I think that is fine right? i.e., in favor of some separate 
  trigger such as a scheduled task doing that?
  
  We could store the last executed change as part of the topic config 
  itself for informational purposes (so people can be sure that a config 
  change still sitting in zookeeper was in fact applied to the topic config).

Yes, though that ack would have to be per-broker, and we would need to clean 
it up if/when the partition moves to another broker.


- Jay


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/#review40780
---


On April 18, 2014, 12:36 a.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20471/
 ---
 
 (Updated April 18, 2014, 12:36 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1398
 https://issues.apache.org/jira/browse/KAFKA-1398
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1398 dynamic config changes are broken.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/TopicConfigManager.scala 
 d41fd33d91406dfa2ce8c1e1b04a078e983ccadd 
   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/20471/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




Re: Review Request 20376: Publish source jars and javadoc

2014-04-18 Thread Joel Koshy


 On April 18, 2014, 4:54 p.m., Jun Rao wrote:
  README.md, lines 10-11
  https://reviews.apache.org/r/20376/diff/1/?file=558371#file558371line10
 
  Do we need the trailing  ###?

Will make that change.


 On April 18, 2014, 4:54 p.m., Jun Rao wrote:
  build.gradle, lines 78-79
  https://reviews.apache.org/r/20376/diff/1/?file=558372#file558372line78
 
  Do we still need this given the jar config above?

Yes - the above applies to the jar task - this applies to the javadoc task.


- Joel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20376/#review40777
---


On April 15, 2014, 7:39 p.m., Joel Koshy wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20376/
 ---
 
 (Updated April 15, 2014, 7:39 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1362
 https://issues.apache.org/jira/browse/KAFKA-1362
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Scaladoc and javadoc
 
 
 Add scaladoc
 
 
 Generate src jar
 
 
 Diffs
 -
 
   README.md d0cdde9ec07fb8d7874a8cd1273388ae6f22d57f 
   build.gradle 3db5f67d801d65157683b5a4b18955b4475b9d3f 
   contrib/LICENSE PRE-CREATION 
   contrib/NOTICE PRE-CREATION 
   contrib/hadoop-consumer/LICENSE 6b0b1270ff0ca8f03867efcd09ba6ddb6392b1e1 
   contrib/hadoop-producer/LICENSE 6b0b1270ff0ca8f03867efcd09ba6ddb6392b1e1 
   core/src/main/scala/kafka/utils/VerifiableProperties.scala 
 b070bb4df117df8542a651b4f0b6bc476855308b 
 
 Diff: https://reviews.apache.org/r/20376/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Joel Koshy
 




Could I get a review

2014-04-18 Thread Jay Kreps
https://reviews.apache.org/r/20440/

Should be a simple one.

-Jay


Re: Review Request 20471: Patch for KAFKA-1398

2014-04-18 Thread Jun Rao


 On April 18, 2014, 3:22 p.m., Jun Rao wrote:
  core/src/main/scala/kafka/server/TopicConfigManager.scala, lines 123-124
  https://reviews.apache.org/r/20471/diff/1/?file=561891#file561891line123
 
  This is not needed. In ZK, new sequential nodes always get a higher id 
  whether previous sequential nodes are deleted or not. ZK server maintains 
  enough info to achieve that.
 
 Jay Kreps wrote:
 If you are totally sure I will remove that bit.

[zk: localhost:2181(CONNECTED) 2] create -s /a 1
Created /a01
[zk: localhost:2181(CONNECTED) 3] delete /a01
[zk: localhost:2181(CONNECTED) 4] create -s /a 2 
Created /a03
[zk: localhost:2181(CONNECTED) 5] ls /a
[]
[zk: localhost:2181(CONNECTED) 6] ls / 
[a03, zookeeper]


 On April 18, 2014, 3:22 p.m., Jun Rao wrote:
  core/src/main/scala/kafka/server/TopicConfigManager.scala, lines 87-88
  https://reviews.apache.org/r/20471/diff/1/?file=561891#file561891line87
 
  The child list returned by ZK doesn't guarantee any ordering. We will 
  need to sort this list so that we don't miss the latest config change.
 
 Jay Kreps wrote:
 I don't think this depends on any ordering.

Yes, you are right. It doesn't depend on ordering since it always reads from 
the config path in ZK.


- Jun


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/#review40772
---


On April 18, 2014, 12:36 a.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20471/
 ---
 
 (Updated April 18, 2014, 12:36 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1398
 https://issues.apache.org/jira/browse/KAFKA-1398
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1398 dynamic config changes are broken.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/TopicConfigManager.scala 
 d41fd33d91406dfa2ce8c1e1b04a078e983ccadd 
   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/20471/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




[jira] [Commented] (KAFKA-1398) Topic config changes can be lost and cause fatal exceptions on broker restarts

2014-04-18 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974311#comment-13974311
 ] 

Jay Kreps commented on KAFKA-1398:
--

Created reviewboard https://reviews.apache.org/r/20492/
 against branch trunk

 Topic config changes can be lost and cause fatal exceptions on broker restarts
 --

 Key: KAFKA-1398
 URL: https://issues.apache.org/jira/browse/KAFKA-1398
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joel Koshy
Assignee: Jay Kreps
Priority: Critical
 Fix For: 0.8.1.1

 Attachments: KAFKA-1398.patch, KAFKA-1398.patch


 Our topic config cleanup policy seems to be broken. When a broker is
 bounced and starting up:
 1 - Read all the children of the config change path
 2 - For each, if the change id is greater than the last executed change,
   then extract the topic information.
 3 - If there is a log for that topic on this broker, then apply the change.
   However, if there is no log, then delete the config change.
 In step 3, a delete triggers a child change watch firing on all the other
 brokers. The other brokers currently take all the children of the config
 path but will ignore those config changes that are less than the last
 executed change. At least one issue here is that if a broker does not have
 partitions for a topic then the lastExecutedChange is not updated (for
 that topic).
 Consider this scenario:
 - Three brokers 0, 1, 2
 - Topic A has partitions only assigned to broker 0
 - Topic B has partitions only assigned to broker 1
 - Topic C has partitions only assigned to broker 2
 - Change 0: topic A
 - Change 1: topic B
 - Change 2: topic C
 - lastExecutedChange on broker 0 is 0
 - lastExecutedChange on broker 1 is 1
 - lastExecutedChange on broker 2 is 2
 - Bounce broker 1
 - The above bounce will cause Change 0 and Change 2 to get deleted.
 - Watch fires on broker 0 and 1
 - Broker 0 will try and read the topic corresponding to change 1 (since its
   lastExecutedChange is 0) and then change 2. That read will fail:
 2014/04/15 19:35:34.236 INFO [TopicConfigManager] [main] [kafka-server] [] 
 Processed topic config change 25 for topic xyz, setting new config to
  {retention.ms=360, segment.ms=360}.
 2014/04/15 19:35:34.238 FATAL [KafkaServerStartable] [main] [kafka-server] [] 
 Fatal error during KafkaServerStable startup. Prepare to shutdown
 org.I0Itec.zkclient.exception.ZkNoNodeException: 
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
 NoNode for /config/changes/config_change_26
 at 
 org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
 at kafka.utils.ZkUtils$.readData(ZkUtils.scala:467)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:97)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:93)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
 at 
 kafka.server.TopicConfigManager.kafka$server$TopicConfigManager$$processConfigChanges(TopicConfigManager.scala:93)
 at 
 kafka.server.TopicConfigManager.processAllConfigChanges(TopicConfigManager.scala:81)
 at 
 kafka.server.TopicConfigManager.startup(TopicConfigManager.scala:72)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:104)
 at 
 kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
 ...
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
 KeeperErrorCode = NoNode for /config/changes/config_change_26
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
 at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
 ... 39 more
 Another issue is that there are two logging statements with incorrect
 qualifiers which makes things a little harder to 

[jira] [Updated] (KAFKA-1398) Topic config changes can be lost and cause fatal exceptions on broker restarts

2014-04-18 Thread Jay Kreps (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps updated KAFKA-1398:
-

Attachment: KAFKA-1398.patch

 Topic config changes can be lost and cause fatal exceptions on broker restarts
 --

 Key: KAFKA-1398
 URL: https://issues.apache.org/jira/browse/KAFKA-1398
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joel Koshy
Assignee: Jay Kreps
Priority: Critical
 Fix For: 0.8.1.1

 Attachments: KAFKA-1398.patch, KAFKA-1398.patch


 Our topic config cleanup policy seems to be broken. When a broker is
 bounced and starting up:
 1 - Read all the children of the config change path
 2 - For each, if the change id is greater than the last executed change,
   then extract the topic information.
 3 - If there is a log for that topic on this broker, then apply the change.
   However, if there is no log, then delete the config change.
 In step 3, a delete triggers a child change watch firing on all the other
 brokers. The other brokers currently take all the children of the config
 path but will ignore those config changes that are less than the last
 executed change. At least one issue here is that if a broker does not have
 partitions for a topic then the lastExecutedChange is not updated (for
 that topic).
 Consider this scenario:
 - Three brokers 0, 1, 2
 - Topic A has partitions only assigned to broker 0
 - Topic B has partitions only assigned to broker 1
 - Topic C has partitions only assigned to broker 2
 - Change 0: topic A
 - Change 1: topic B
 - Change 2: topic C
 - lastExecutedChange on broker 0 is 0
 - lastExecutedChange on broker 1 is 1
 - lastExecutedChange on broker 2 is 2
 - Bounce broker 1
 - The above bounce will cause Change 0 and Change 2 to get deleted.
 - Watch fires on broker 0 and 1
 - Broker 0 will try and read the topic corresponding to change 1 (since its
   lastExecutedChange is 0) and then change 2. That read will fail:
 2014/04/15 19:35:34.236 INFO [TopicConfigManager] [main] [kafka-server] [] 
 Processed topic config change 25 for topic xyz, setting new config to
  {retention.ms=360, segment.ms=360}.
 2014/04/15 19:35:34.238 FATAL [KafkaServerStartable] [main] [kafka-server] [] 
 Fatal error during KafkaServerStable startup. Prepare to shutdown
 org.I0Itec.zkclient.exception.ZkNoNodeException: 
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
 NoNode for /config/changes/config_change_26
 at 
 org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
 at kafka.utils.ZkUtils$.readData(ZkUtils.scala:467)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:97)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:93)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
 at 
 kafka.server.TopicConfigManager.kafka$server$TopicConfigManager$$processConfigChanges(TopicConfigManager.scala:93)
 at 
 kafka.server.TopicConfigManager.processAllConfigChanges(TopicConfigManager.scala:81)
 at 
 kafka.server.TopicConfigManager.startup(TopicConfigManager.scala:72)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:104)
 at 
 kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
 ...
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
 KeeperErrorCode = NoNode for /config/changes/config_change_26
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
 at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
 ... 39 more
 Another issue is that there are two logging statements with incorrect
 qualifiers which makes things a little harder to debug. E.g.,
 2014/04/15 19:35:34.223 ERROR [TopicConfigManager] [kafka-server] [] Ignoring 
 topic 

Re: Review Request 20440: Patch for KAFKA-1337

2014-04-18 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20440/#review40790
---

Ship it!


Ship It!

- Jun Rao


On April 17, 2014, 12:41 a.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20440/
 ---
 
 (Updated April 17, 2014, 12:41 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1337
 https://issues.apache.org/jira/browse/KAFKA-1337
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Fix incorrect producer configs after config renaming.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/producer/ConsoleProducer.scala 
 27b0ec86bf465a137d30904c615a19cdc6e59312 
   core/src/main/scala/kafka/tools/ReplayLogProducer.scala 
 f2246f97beb799f80538a14d878b755b422e88e5 
   core/src/main/scala/kafka/tools/TestEndToEndLatency.scala 
 ea856c7112080ae9e348d6f08942fec7b321c081 
   core/src/main/scala/kafka/tools/TestLogCleaning.scala 
 edb6e5f8fe511debd8b971dddb786e684ed7e1ac 
 
 Diff: https://reviews.apache.org/r/20440/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




Re: Review Request 20471: Patch for KAFKA-1398

2014-04-18 Thread Jay Kreps


 On April 18, 2014, 6:01 p.m., Joel Koshy wrote:
  core/src/main/scala/kafka/server/TopicConfigManager.scala, lines 87-88
  https://reviews.apache.org/r/20471/diff/1/?file=561891#file561891line87
 
  Say you override the retention to x hours, realize you made a mistake 
  and then override to y hours. If the order can be mixed then it would be an 
  issue no?

I don't think so--the notification just says read the config for topic X. So 
two updates will just mean two entries saying the config has changed. The only 
config value that is stored is the final value.


- Jay


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/#review40786
---


On April 18, 2014, 12:36 a.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20471/
 ---
 
 (Updated April 18, 2014, 12:36 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1398
 https://issues.apache.org/jira/browse/KAFKA-1398
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1398 dynamic config changes are broken.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/TopicConfigManager.scala 
 d41fd33d91406dfa2ce8c1e1b04a078e983ccadd 
   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/20471/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




Re: Review Request 20376: Publish source jars and javadoc

2014-04-18 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20376/#review40791
---

Ship it!


Ship It!

- Jun Rao


On April 15, 2014, 7:39 p.m., Joel Koshy wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20376/
 ---
 
 (Updated April 15, 2014, 7:39 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1362
 https://issues.apache.org/jira/browse/KAFKA-1362
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Scaladoc and javadoc
 
 
 Add scaladoc
 
 
 Generate src jar
 
 
 Diffs
 -
 
   README.md d0cdde9ec07fb8d7874a8cd1273388ae6f22d57f 
   build.gradle 3db5f67d801d65157683b5a4b18955b4475b9d3f 
   contrib/LICENSE PRE-CREATION 
   contrib/NOTICE PRE-CREATION 
   contrib/hadoop-consumer/LICENSE 6b0b1270ff0ca8f03867efcd09ba6ddb6392b1e1 
   contrib/hadoop-producer/LICENSE 6b0b1270ff0ca8f03867efcd09ba6ddb6392b1e1 
   core/src/main/scala/kafka/utils/VerifiableProperties.scala 
 b070bb4df117df8542a651b4f0b6bc476855308b 
 
 Diff: https://reviews.apache.org/r/20376/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Joel Koshy
 




Re: Review Request 20471: Patch for KAFKA-1398

2014-04-18 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/#review40792
---



core/src/main/scala/kafka/server/TopicConfigManager.scala
https://reviews.apache.org/r/20471/#comment73945

Actually that's not the right example - it would work fine since you only 
want the later change.

How about this:
Say you first override retention to x hours.
(change 1)

Then you override compaction policy (change 2)

If you bring up a broker that has been down for some time (and has that 
topic) then it could fail to apply change 1 if it sees change 2 first.


- Joel Koshy


On April 18, 2014, 12:36 a.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20471/
 ---
 
 (Updated April 18, 2014, 12:36 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1398
 https://issues.apache.org/jira/browse/KAFKA-1398
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1398 dynamic config changes are broken.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/TopicConfigManager.scala 
 d41fd33d91406dfa2ce8c1e1b04a078e983ccadd 
   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/20471/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




[jira] [Updated] (KAFKA-1380) 0.8.1.1 release candidate

2014-04-18 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1380:
--

Due Date: 21/Apr/14  (was: 16/Apr/14)

Quick update - we have to push this out to Monday. Most of the patches are 
ready/being reviewed. Will do some testing over the weekend and we should be 
good to proceed with the release process next week.

 0.8.1.1 release candidate
 -

 Key: KAFKA-1380
 URL: https://issues.apache.org/jira/browse/KAFKA-1380
 Project: Kafka
  Issue Type: Task
Reporter: Joel Koshy
Assignee: Joel Koshy
 Attachments: KAFKA-1380.patch


 Filing an umbrella tracker to list all dependencies for the 0.8.1.1 bug-fix 
 release.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1356) Topic metadata requests takes too long to process

2014-04-18 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974333#comment-13974333
 ] 

Jun Rao commented on KAFKA-1356:


Joel,

+1 on  
https://issues.apache.org/jira/secure/attachment/12640660/KAFKA-1356.patch 

Thanks,

 Topic metadata requests takes too long to process
 -

 Key: KAFKA-1356
 URL: https://issues.apache.org/jira/browse/KAFKA-1356
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Timothy Chen
Assignee: Timothy Chen
 Attachments: KAFKA-1356.patch, KAFKA-1356.patch, KAFKA-1356.patch, 
 KAFKA-1356.patch, KAFKA-1356.patch, KAFKA-1356.patch, KAFKA-1356.patch, 
 KAFKA-1356_2014-04-02_18:39:36.patch, KAFKA-1356_2014-04-04_14:40:18.patch, 
 KAFKA-1356_2014-04-04_17:45:37.patch, KAFKA-1356_2014-04-06_01:45:47.patch, 
 KAFKA-1356_2014-04-08_01:38:23.patch, KAFKA-1356_2014-04-11_11:39:10.patch


 Currently we're seeing slow response times in handling get topic metadata 
 requests.
 Local testing shows that even locally it takes 300 avg ms to respond, even 
 though it's not doing any IO operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1356) Topic metadata requests takes too long to process

2014-04-18 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974346#comment-13974346
 ] 

Joel Koshy commented on KAFKA-1356:
---

Thanks for the review. Committed the follow-up patch to 0.8.1

 Topic metadata requests takes too long to process
 -

 Key: KAFKA-1356
 URL: https://issues.apache.org/jira/browse/KAFKA-1356
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Timothy Chen
Assignee: Timothy Chen
 Attachments: KAFKA-1356.patch, KAFKA-1356.patch, KAFKA-1356.patch, 
 KAFKA-1356.patch, KAFKA-1356.patch, KAFKA-1356.patch, KAFKA-1356.patch, 
 KAFKA-1356_2014-04-02_18:39:36.patch, KAFKA-1356_2014-04-04_14:40:18.patch, 
 KAFKA-1356_2014-04-04_17:45:37.patch, KAFKA-1356_2014-04-06_01:45:47.patch, 
 KAFKA-1356_2014-04-08_01:38:23.patch, KAFKA-1356_2014-04-11_11:39:10.patch


 Currently we're seeing slow response times in handling get topic metadata 
 requests.
 Local testing shows that even locally it takes 300 avg ms to respond, even 
 though it's not doing any IO operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 20471: Patch for KAFKA-1398

2014-04-18 Thread Joel Koshy


 On April 18, 2014, 6:12 p.m., Joel Koshy wrote:
  core/src/main/scala/kafka/server/TopicConfigManager.scala, lines 87-88
  https://reviews.apache.org/r/20471/diff/1/?file=561891#file561891line87
 
  Actually that's not the right example - it would work fine since you 
  only want the later change.
  
  How about this:
  Say you first override retention to x hours.
  (change 1)
  
  Then you override compaction policy (change 2)
  
  If you bring up a broker that has been down for some time (and has that 
  topic) then it could fail to apply change 1 if it sees change 2 first.

Ok nm - the topic's config would have been written in zookeeper already. I'll 
cherry-pick this and the follow-up patch into 0.8.1


- Joel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/#review40792
---


On April 18, 2014, 12:36 a.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20471/
 ---
 
 (Updated April 18, 2014, 12:36 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1398
 https://issues.apache.org/jira/browse/KAFKA-1398
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1398 dynamic config changes are broken.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/TopicConfigManager.scala 
 d41fd33d91406dfa2ce8c1e1b04a078e983ccadd 
   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/20471/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




Review Request 20498: KAFKA-1398: Follow-up comments on dynamic config

2014-04-18 Thread Jay Kreps

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20498/
---

Review request for kafka.


Bugs: KAFKA-1398
https://issues.apache.org/jira/browse/KAFKA-1398


Repository: kafka


Description
---

KAFKA-1398 Dynamic config follow-on-comments.


KAFKA-1398 dynamic config changes are broken.


Diffs
-

  core/src/main/scala/kafka/server/TopicConfigManager.scala 
d41fd33d91406dfa2ce8c1e1b04a078e983ccadd 
  core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
PRE-CREATION 

Diff: https://reviews.apache.org/r/20498/diff/


Testing
---


Thanks,

Jay Kreps



[jira] [Updated] (KAFKA-1398) Topic config changes can be lost and cause fatal exceptions on broker restarts

2014-04-18 Thread Jay Kreps (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps updated KAFKA-1398:
-

Attachment: KAFKA-1398.patch

 Topic config changes can be lost and cause fatal exceptions on broker restarts
 --

 Key: KAFKA-1398
 URL: https://issues.apache.org/jira/browse/KAFKA-1398
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joel Koshy
Assignee: Jay Kreps
Priority: Critical
 Fix For: 0.8.1.1

 Attachments: KAFKA-1398.patch, KAFKA-1398.patch, KAFKA-1398.patch


 Our topic config cleanup policy seems to be broken. When a broker is
 bounced and starting up:
 1 - Read all the children of the config change path
 2 - For each, if the change id is greater than the last executed change,
   then extract the topic information.
 3 - If there is a log for that topic on this broker, then apply the change.
   However, if there is no log, then delete the config change.
 In step 3, a delete triggers a child change watch firing on all the other
 brokers. The other brokers currently take all the children of the config
 path but will ignore those config changes that are less than the last
 executed change. At least one issue here is that if a broker does not have
 partitions for a topic then the lastExecutedChange is not updated (for
 that topic).
 Consider this scenario:
 - Three brokers 0, 1, 2
 - Topic A has partitions only assigned to broker 0
 - Topic B has partitions only assigned to broker 1
 - Topic C has partitions only assigned to broker 2
 - Change 0: topic A
 - Change 1: topic B
 - Change 2: topic C
 - lastExecutedChange on broker 0 is 0
 - lastExecutedChange on broker 1 is 1
 - lastExecutedChange on broker 2 is 2
 - Bounce broker 1
 - The above bounce will cause Change 0 and Change 2 to get deleted.
 - Watch fires on broker 0 and 1
 - Broker 0 will try and read the topic corresponding to change 1 (since its
   lastExecutedChange is 0) and then change 2. That read will fail:
 2014/04/15 19:35:34.236 INFO [TopicConfigManager] [main] [kafka-server] [] 
 Processed topic config change 25 for topic xyz, setting new config to
  {retention.ms=360, segment.ms=360}.
 2014/04/15 19:35:34.238 FATAL [KafkaServerStartable] [main] [kafka-server] [] 
 Fatal error during KafkaServerStable startup. Prepare to shutdown
 org.I0Itec.zkclient.exception.ZkNoNodeException: 
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
 NoNode for /config/changes/config_change_26
 at 
 org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
 at kafka.utils.ZkUtils$.readData(ZkUtils.scala:467)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:97)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:93)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
 at 
 kafka.server.TopicConfigManager.kafka$server$TopicConfigManager$$processConfigChanges(TopicConfigManager.scala:93)
 at 
 kafka.server.TopicConfigManager.processAllConfigChanges(TopicConfigManager.scala:81)
 at 
 kafka.server.TopicConfigManager.startup(TopicConfigManager.scala:72)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:104)
 at 
 kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
 ...
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
 KeeperErrorCode = NoNode for /config/changes/config_change_26
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
 at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
 ... 39 more
 Another issue is that there are two logging statements with incorrect
 qualifiers which makes things a little harder to debug. E.g.,
 2014/04/15 19:35:34.223 ERROR [TopicConfigManager] [kafka-server] [] 

[jira] [Commented] (KAFKA-1398) Topic config changes can be lost and cause fatal exceptions on broker restarts

2014-04-18 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974418#comment-13974418
 ] 

Jay Kreps commented on KAFKA-1398:
--

Created reviewboard https://reviews.apache.org/r/20498/
 against branch trunk

 Topic config changes can be lost and cause fatal exceptions on broker restarts
 --

 Key: KAFKA-1398
 URL: https://issues.apache.org/jira/browse/KAFKA-1398
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joel Koshy
Assignee: Jay Kreps
Priority: Critical
 Fix For: 0.8.1.1

 Attachments: KAFKA-1398.patch, KAFKA-1398.patch, KAFKA-1398.patch


 Our topic config cleanup policy seems to be broken. When a broker is
 bounced and starting up:
 1 - Read all the children of the config change path
 2 - For each, if the change id is greater than the last executed change,
   then extract the topic information.
 3 - If there is a log for that topic on this broker, then apply the change.
   However, if there is no log, then delete the config change.
 In step 3, a delete triggers a child change watch firing on all the other
 brokers. The other brokers currently take all the children of the config
 path but will ignore those config changes that are less than the last
 executed change. At least one issue here is that if a broker does not have
 partitions for a topic then the lastExecutedChange is not updated (for
 that topic).
 Consider this scenario:
 - Three brokers 0, 1, 2
 - Topic A has partitions only assigned to broker 0
 - Topic B has partitions only assigned to broker 1
 - Topic C has partitions only assigned to broker 2
 - Change 0: topic A
 - Change 1: topic B
 - Change 2: topic C
 - lastExecutedChange on broker 0 is 0
 - lastExecutedChange on broker 1 is 1
 - lastExecutedChange on broker 2 is 2
 - Bounce broker 1
 - The above bounce will cause Change 0 and Change 2 to get deleted.
 - Watch fires on broker 0 and 1
 - Broker 0 will try and read the topic corresponding to change 1 (since its
   lastExecutedChange is 0) and then change 2. That read will fail:
 2014/04/15 19:35:34.236 INFO [TopicConfigManager] [main] [kafka-server] [] 
 Processed topic config change 25 for topic xyz, setting new config to
  {retention.ms=360, segment.ms=360}.
 2014/04/15 19:35:34.238 FATAL [KafkaServerStartable] [main] [kafka-server] [] 
 Fatal error during KafkaServerStable startup. Prepare to shutdown
 org.I0Itec.zkclient.exception.ZkNoNodeException: 
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
 NoNode for /config/changes/config_change_26
 at 
 org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
 at kafka.utils.ZkUtils$.readData(ZkUtils.scala:467)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:97)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:93)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
 at 
 kafka.server.TopicConfigManager.kafka$server$TopicConfigManager$$processConfigChanges(TopicConfigManager.scala:93)
 at 
 kafka.server.TopicConfigManager.processAllConfigChanges(TopicConfigManager.scala:81)
 at 
 kafka.server.TopicConfigManager.startup(TopicConfigManager.scala:72)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:104)
 at 
 kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
 ...
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
 KeeperErrorCode = NoNode for /config/changes/config_change_26
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
 at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
 ... 39 more
 Another issue is that there are two logging statements with incorrect
 qualifiers which makes things a 

Re: Review Request 20471: Patch for KAFKA-1398

2014-04-18 Thread Joel Koshy


 On April 18, 2014, 3:22 p.m., Jun Rao wrote:
  core/src/main/scala/kafka/server/TopicConfigManager.scala, lines 123-124
  https://reviews.apache.org/r/20471/diff/1/?file=561891#file561891line123
 
  This is not needed. In ZK, new sequential nodes always get a higher id 
  whether previous sequential nodes are deleted or not. ZK server maintains 
  enough info to achieve that.
 
 Jay Kreps wrote:
 If you are totally sure I will remove that bit.
 
 Jun Rao wrote:
 [zk: localhost:2181(CONNECTED) 2] create -s /a 1
 Created /a01
 [zk: localhost:2181(CONNECTED) 3] delete /a01
 [zk: localhost:2181(CONNECTED) 4] create -s /a 2 
 Created /a03
 [zk: localhost:2181(CONNECTED) 5] ls /a
 []
 [zk: localhost:2181(CONNECTED) 6] ls / 
 [a03, zookeeper]


Since we were wondering about this: the non-consecutive ids are because it is 
based on the parent's cVersion (which is incremented when you do the delete)


- Joel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/#review40772
---


On April 18, 2014, 12:36 a.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20471/
 ---
 
 (Updated April 18, 2014, 12:36 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1398
 https://issues.apache.org/jira/browse/KAFKA-1398
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1398 dynamic config changes are broken.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/TopicConfigManager.scala 
 d41fd33d91406dfa2ce8c1e1b04a078e983ccadd 
   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/20471/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




Re: Review Request 20471: KAFKA-1398: Follow-up comments on dynamic config

2014-04-18 Thread Jay Kreps

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/
---

(Updated April 18, 2014, 8:03 p.m.)


Review request for kafka.


Summary (updated)
-

KAFKA-1398: Follow-up comments on dynamic config


Bugs: KAFKA-1398
https://issues.apache.org/jira/browse/KAFKA-1398


Repository: kafka


Description (updated)
---

KAFKA-1398 Dynamic config follow-on-comments.


Diffs (updated)
-

  core/src/main/scala/kafka/server/TopicConfigManager.scala 
4a4274eef790ff1fd2dfbbd85d44722688bfadee 
  core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
5a1d5cc36033d0427c6c680d198c02207e10326c 

Diff: https://reviews.apache.org/r/20471/diff/


Testing
---


Thanks,

Jay Kreps



[jira] [Updated] (KAFKA-1398) Topic config changes can be lost and cause fatal exceptions on broker restarts

2014-04-18 Thread Jay Kreps (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps updated KAFKA-1398:
-

Attachment: KAFKA-1398_2014-04-18_13:03:03.patch

 Topic config changes can be lost and cause fatal exceptions on broker restarts
 --

 Key: KAFKA-1398
 URL: https://issues.apache.org/jira/browse/KAFKA-1398
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joel Koshy
Assignee: Jay Kreps
Priority: Critical
 Fix For: 0.8.1.1

 Attachments: KAFKA-1398.patch, KAFKA-1398.patch, KAFKA-1398.patch, 
 KAFKA-1398_2014-04-18_13:03:03.patch


 Our topic config cleanup policy seems to be broken. When a broker is
 bounced and starting up:
 1 - Read all the children of the config change path
 2 - For each, if the change id is greater than the last executed change,
   then extract the topic information.
 3 - If there is a log for that topic on this broker, then apply the change.
   However, if there is no log, then delete the config change.
 In step 3, a delete triggers a child change watch firing on all the other
 brokers. The other brokers currently take all the children of the config
 path but will ignore those config changes that are less than the last
 executed change. At least one issue here is that if a broker does not have
 partitions for a topic then the lastExecutedChange is not updated (for
 that topic).
 Consider this scenario:
 - Three brokers 0, 1, 2
 - Topic A has partitions only assigned to broker 0
 - Topic B has partitions only assigned to broker 1
 - Topic C has partitions only assigned to broker 2
 - Change 0: topic A
 - Change 1: topic B
 - Change 2: topic C
 - lastExecutedChange on broker 0 is 0
 - lastExecutedChange on broker 1 is 1
 - lastExecutedChange on broker 2 is 2
 - Bounce broker 1
 - The above bounce will cause Change 0 and Change 2 to get deleted.
 - Watch fires on broker 0 and 1
 - Broker 0 will try and read the topic corresponding to change 1 (since its
   lastExecutedChange is 0) and then change 2. That read will fail:
 2014/04/15 19:35:34.236 INFO [TopicConfigManager] [main] [kafka-server] [] 
 Processed topic config change 25 for topic xyz, setting new config to
  {retention.ms=360, segment.ms=360}.
 2014/04/15 19:35:34.238 FATAL [KafkaServerStartable] [main] [kafka-server] [] 
 Fatal error during KafkaServerStable startup. Prepare to shutdown
 org.I0Itec.zkclient.exception.ZkNoNodeException: 
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
 NoNode for /config/changes/config_change_26
 at 
 org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
 at kafka.utils.ZkUtils$.readData(ZkUtils.scala:467)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:97)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:93)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
 at 
 kafka.server.TopicConfigManager.kafka$server$TopicConfigManager$$processConfigChanges(TopicConfigManager.scala:93)
 at 
 kafka.server.TopicConfigManager.processAllConfigChanges(TopicConfigManager.scala:81)
 at 
 kafka.server.TopicConfigManager.startup(TopicConfigManager.scala:72)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:104)
 at 
 kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
 ...
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
 KeeperErrorCode = NoNode for /config/changes/config_change_26
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
 at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
 ... 39 more
 Another issue is that there are two logging statements with incorrect
 qualifiers which makes things a little harder to debug. E.g.,
 

[jira] [Commented] (KAFKA-1398) Topic config changes can be lost and cause fatal exceptions on broker restarts

2014-04-18 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974420#comment-13974420
 ] 

Jay Kreps commented on KAFKA-1398:
--

Updated reviewboard https://reviews.apache.org/r/20471/
 against branch trunk

 Topic config changes can be lost and cause fatal exceptions on broker restarts
 --

 Key: KAFKA-1398
 URL: https://issues.apache.org/jira/browse/KAFKA-1398
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joel Koshy
Assignee: Jay Kreps
Priority: Critical
 Fix For: 0.8.1.1

 Attachments: KAFKA-1398.patch, KAFKA-1398.patch, KAFKA-1398.patch, 
 KAFKA-1398_2014-04-18_13:03:03.patch


 Our topic config cleanup policy seems to be broken. When a broker is
 bounced and starting up:
 1 - Read all the children of the config change path
 2 - For each, if the change id is greater than the last executed change,
   then extract the topic information.
 3 - If there is a log for that topic on this broker, then apply the change.
   However, if there is no log, then delete the config change.
 In step 3, a delete triggers a child change watch firing on all the other
 brokers. The other brokers currently take all the children of the config
 path but will ignore those config changes that are less than the last
 executed change. At least one issue here is that if a broker does not have
 partitions for a topic then the lastExecutedChange is not updated (for
 that topic).
 Consider this scenario:
 - Three brokers 0, 1, 2
 - Topic A has partitions only assigned to broker 0
 - Topic B has partitions only assigned to broker 1
 - Topic C has partitions only assigned to broker 2
 - Change 0: topic A
 - Change 1: topic B
 - Change 2: topic C
 - lastExecutedChange on broker 0 is 0
 - lastExecutedChange on broker 1 is 1
 - lastExecutedChange on broker 2 is 2
 - Bounce broker 1
 - The above bounce will cause Change 0 and Change 2 to get deleted.
 - Watch fires on broker 0 and 1
 - Broker 0 will try and read the topic corresponding to change 1 (since its
   lastExecutedChange is 0) and then change 2. That read will fail:
 2014/04/15 19:35:34.236 INFO [TopicConfigManager] [main] [kafka-server] [] 
 Processed topic config change 25 for topic xyz, setting new config to
  {retention.ms=360, segment.ms=360}.
 2014/04/15 19:35:34.238 FATAL [KafkaServerStartable] [main] [kafka-server] [] 
 Fatal error during KafkaServerStable startup. Prepare to shutdown
 org.I0Itec.zkclient.exception.ZkNoNodeException: 
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
 NoNode for /config/changes/config_change_26
 at 
 org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
 at kafka.utils.ZkUtils$.readData(ZkUtils.scala:467)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:97)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:93)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
 at 
 kafka.server.TopicConfigManager.kafka$server$TopicConfigManager$$processConfigChanges(TopicConfigManager.scala:93)
 at 
 kafka.server.TopicConfigManager.processAllConfigChanges(TopicConfigManager.scala:81)
 at 
 kafka.server.TopicConfigManager.startup(TopicConfigManager.scala:72)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:104)
 at 
 kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
 ...
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
 KeeperErrorCode = NoNode for /config/changes/config_change_26
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
 at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
 ... 39 more
 Another issue is that there are two logging statements with 

Re: Review Request 20471: KAFKA-1398: Follow-up comments on dynamic config

2014-04-18 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/#review40809
---

Ship it!


Ship It!

- Joel Koshy


On April 18, 2014, 8:03 p.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20471/
 ---
 
 (Updated April 18, 2014, 8:03 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1398
 https://issues.apache.org/jira/browse/KAFKA-1398
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1398 Dynamic config follow-on-comments.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/TopicConfigManager.scala 
 4a4274eef790ff1fd2dfbbd85d44722688bfadee 
   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
 5a1d5cc36033d0427c6c680d198c02207e10326c 
 
 Diff: https://reviews.apache.org/r/20471/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




Re: Review Request 20471: KAFKA-1398: Follow-up comments on dynamic config

2014-04-18 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20471/#review40810
---

Ship it!


Ship It!

- Jun Rao


On April 18, 2014, 8:03 p.m., Jay Kreps wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20471/
 ---
 
 (Updated April 18, 2014, 8:03 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1398
 https://issues.apache.org/jira/browse/KAFKA-1398
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1398 Dynamic config follow-on-comments.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/TopicConfigManager.scala 
 4a4274eef790ff1fd2dfbbd85d44722688bfadee 
   core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala 
 5a1d5cc36033d0427c6c680d198c02207e10326c 
 
 Diff: https://reviews.apache.org/r/20471/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jay Kreps
 




[jira] [Commented] (KAFKA-1399) Drop Scala 2.8.x support

2014-04-18 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1397#comment-1397
 ] 

Jun Rao commented on KAFKA-1399:


We are re-rewriting the clients in java. Once that's done, it will be a lot 
easier to drop the scala 2.8.X support since most people won't care the version 
of scala on the broker.

 Drop Scala 2.8.x support
 

 Key: KAFKA-1399
 URL: https://issues.apache.org/jira/browse/KAFKA-1399
 Project: Kafka
  Issue Type: Task
  Components: packaging
Affects Versions: 0.8.1
Reporter: Stevo Slavic
  Labels: gradle, scala

 It's been almost 4 years since [Scala 2.8 has been 
 released|http://www.scala-lang.org/old/node/7009] and 3 years since [Scala 
 2.9 has been released|http://www.scala-lang.org/old/node/9483], so there was 
 more than plenty of time to migrate.
 Continued support of old Scala 2.8 is causing issues like 
 [this|https://issues.apache.org/jira/browse/KAFKA-1362?focusedCommentId=13970390page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13970390].



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (KAFKA-1406) Fix scaladoc/javadoc warnings

2014-04-18 Thread Joel Koshy (JIRA)
Joel Koshy created KAFKA-1406:
-

 Summary: Fix scaladoc/javadoc warnings
 Key: KAFKA-1406
 URL: https://issues.apache.org/jira/browse/KAFKA-1406
 Project: Kafka
  Issue Type: Bug
  Components: packaging
Reporter: Joel Koshy


./gradlew docsJarAll

You will see a bunch of warnings mainly due to typos/incorrect use of 
javadoc/scaladoc



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (KAFKA-1406) Fix scaladoc/javadoc warnings

2014-04-18 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1406:
--

Labels: build  (was: )

 Fix scaladoc/javadoc warnings
 -

 Key: KAFKA-1406
 URL: https://issues.apache.org/jira/browse/KAFKA-1406
 Project: Kafka
  Issue Type: Bug
  Components: packaging
Reporter: Joel Koshy
  Labels: build

 ./gradlew docsJarAll
 You will see a bunch of warnings mainly due to typos/incorrect use of 
 javadoc/scaladoc



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (KAFKA-1406) Fix scaladoc/javadoc warnings

2014-04-18 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1406:
--

Fix Version/s: 0.8.2

 Fix scaladoc/javadoc warnings
 -

 Key: KAFKA-1406
 URL: https://issues.apache.org/jira/browse/KAFKA-1406
 Project: Kafka
  Issue Type: Bug
  Components: packaging
Reporter: Joel Koshy
  Labels: build
 Fix For: 0.8.2


 ./gradlew docsJarAll
 You will see a bunch of warnings mainly due to typos/incorrect use of 
 javadoc/scaladoc



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1362) Publish sources and javadoc jars

2014-04-18 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974454#comment-13974454
 ] 

Joel Koshy commented on KAFKA-1362:
---

Committed to both 0.8.1 and trunk

Filed KAFKA-1406 to address the warnings when generating javadocs/scaladocs.


 Publish sources and javadoc jars
 

 Key: KAFKA-1362
 URL: https://issues.apache.org/jira/browse/KAFKA-1362
 Project: Kafka
  Issue Type: Bug
  Components: packaging
Affects Versions: 0.8.1
Reporter: Stevo Slavic
Assignee: Joel Koshy
  Labels: build
 Fix For: 0.8.1.1

 Attachments: KAFKA-1362.patch


 Currently just binaries jars get published on Maven Central (see 
 http://repo1.maven.org/maven2/org/apache/kafka/kafka_2.10/0.8.1/ ). Please 
 also publish sources and javadoc jars.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (KAFKA-1362) Publish sources and javadoc jars

2014-04-18 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy resolved KAFKA-1362.
---

Resolution: Fixed

 Publish sources and javadoc jars
 

 Key: KAFKA-1362
 URL: https://issues.apache.org/jira/browse/KAFKA-1362
 Project: Kafka
  Issue Type: Bug
  Components: packaging
Affects Versions: 0.8.1
Reporter: Stevo Slavic
Assignee: Joel Koshy
  Labels: build
 Fix For: 0.8.1.1

 Attachments: KAFKA-1362.patch


 Currently just binaries jars get published on Maven Central (see 
 http://repo1.maven.org/maven2/org/apache/kafka/kafka_2.10/0.8.1/ ). Please 
 also publish sources and javadoc jars.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Reopened] (KAFKA-1398) Topic config changes can be lost and cause fatal exceptions on broker restarts

2014-04-18 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy reopened KAFKA-1398:
---


Reopening to keep track of the follow-up. Also, I need to commit to 0.8.1.

 Topic config changes can be lost and cause fatal exceptions on broker restarts
 --

 Key: KAFKA-1398
 URL: https://issues.apache.org/jira/browse/KAFKA-1398
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joel Koshy
Assignee: Jay Kreps
Priority: Critical
 Fix For: 0.8.1.1

 Attachments: KAFKA-1398.patch, KAFKA-1398.patch, KAFKA-1398.patch, 
 KAFKA-1398_2014-04-18_13:03:03.patch


 Our topic config cleanup policy seems to be broken. When a broker is
 bounced and starting up:
 1 - Read all the children of the config change path
 2 - For each, if the change id is greater than the last executed change,
   then extract the topic information.
 3 - If there is a log for that topic on this broker, then apply the change.
   However, if there is no log, then delete the config change.
 In step 3, a delete triggers a child change watch firing on all the other
 brokers. The other brokers currently take all the children of the config
 path but will ignore those config changes that are less than the last
 executed change. At least one issue here is that if a broker does not have
 partitions for a topic then the lastExecutedChange is not updated (for
 that topic).
 Consider this scenario:
 - Three brokers 0, 1, 2
 - Topic A has partitions only assigned to broker 0
 - Topic B has partitions only assigned to broker 1
 - Topic C has partitions only assigned to broker 2
 - Change 0: topic A
 - Change 1: topic B
 - Change 2: topic C
 - lastExecutedChange on broker 0 is 0
 - lastExecutedChange on broker 1 is 1
 - lastExecutedChange on broker 2 is 2
 - Bounce broker 1
 - The above bounce will cause Change 0 and Change 2 to get deleted.
 - Watch fires on broker 0 and 1
 - Broker 0 will try and read the topic corresponding to change 1 (since its
   lastExecutedChange is 0) and then change 2. That read will fail:
 2014/04/15 19:35:34.236 INFO [TopicConfigManager] [main] [kafka-server] [] 
 Processed topic config change 25 for topic xyz, setting new config to
  {retention.ms=360, segment.ms=360}.
 2014/04/15 19:35:34.238 FATAL [KafkaServerStartable] [main] [kafka-server] [] 
 Fatal error during KafkaServerStable startup. Prepare to shutdown
 org.I0Itec.zkclient.exception.ZkNoNodeException: 
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
 NoNode for /config/changes/config_change_26
 at 
 org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
 at kafka.utils.ZkUtils$.readData(ZkUtils.scala:467)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:97)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:93)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
 at 
 kafka.server.TopicConfigManager.kafka$server$TopicConfigManager$$processConfigChanges(TopicConfigManager.scala:93)
 at 
 kafka.server.TopicConfigManager.processAllConfigChanges(TopicConfigManager.scala:81)
 at 
 kafka.server.TopicConfigManager.startup(TopicConfigManager.scala:72)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:104)
 at 
 kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
 ...
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
 KeeperErrorCode = NoNode for /config/changes/config_change_26
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
 at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
 ... 39 more
 Another issue is that there are two logging statements with incorrect
 qualifiers which makes things a little harder 

[jira] [Commented] (KAFKA-493) High CPU usage on inactive server

2014-04-18 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974471#comment-13974471
 ] 

Joel Koshy commented on KAFKA-493:
--

For (1) I think the issue was discussed (not yet fixed) in KAFKA-1150

 High CPU usage on inactive server
 -

 Key: KAFKA-493
 URL: https://issues.apache.org/jira/browse/KAFKA-493
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.0
Reporter: Jay Kreps
 Fix For: 0.8.2

 Attachments: Kafka-sampling1.zip, Kafka-sampling2.zip, 
 Kafka-sampling3.zip, Kafka-trace1.zip, Kafka-trace2.zip, Kafka-trace3.zip, 
 backtraces.txt, stacktrace.txt


  I've been playing with the 0.8 branch of Kafka and noticed that idle CPU 
  usage is fairly high (13% of a 
  core). Is that to be expected? I did look at the stack, but didn't see 
  anything obvious. A background 
  task?
  I wanted to mention how I am getting into this state. I've set up two 
  machines with the latest 0.8 
  code base and am using a replication factor of 2. On starting the brokers 
  there is no idle CPU activity. 
  Then I run a test that essential does 10k publish operations followed by 
  immediate consume operations 
  (I was measuring latency). Once this has run the kafka nodes seem to 
  consistently be consuming CPU 
  essentially forever.
 hprof results:
 THREAD START (obj=53ae, id = 24, name=RMI TCP Accept-0, 
 group=system)
 THREAD START (obj=53ae, id = 25, name=RMI TCP Accept-, 
 group=system)
 THREAD START (obj=53ae, id = 26, name=RMI TCP Accept-0, 
 group=system)
 THREAD START (obj=53ae, id = 21, name=main, group=main)
 THREAD START (obj=53ae, id = 27, name=Thread-2, group=main)
 THREAD START (obj=53ae, id = 28, name=Thread-3, group=main)
 THREAD START (obj=53ae, id = 29, name=kafka-processor-9092-0, 
 group=main)
 THREAD START (obj=53ae, id = 200010, name=kafka-processor-9092-1, 
 group=main)
 THREAD START (obj=53ae, id = 200011, name=kafka-acceptor, group=main)
 THREAD START (obj=574b, id = 200012, 
 name=ZkClient-EventThread-20-localhost:2181, group=main)
 THREAD START (obj=576e, id = 200014, name=main-SendThread(), 
 group=main)
 THREAD START (obj=576d, id = 200013, name=main-EventThread, 
 group=main)
 THREAD START (obj=53ae, id = 200015, name=metrics-meter-tick-thread-1, 
 group=main)
 THREAD START (obj=53ae, id = 200016, name=metrics-meter-tick-thread-2, 
 group=main)
 THREAD START (obj=53ae, id = 200017, name=request-expiration-task, 
 group=main)
 THREAD START (obj=53ae, id = 200018, name=request-expiration-task, 
 group=main)
 THREAD START (obj=53ae, id = 200019, name=kafka-request-handler-0, 
 group=main)
 THREAD START (obj=53ae, id = 200020, name=kafka-request-handler-1, 
 group=main)
 THREAD START (obj=53ae, id = 200021, name=Thread-6, group=main)
 THREAD START (obj=53ae, id = 200022, name=Thread-7, group=main)
 THREAD START (obj=5899, id = 200023, name=ReplicaFetcherThread-0-2 on 
 broker 1, , group=main)
 THREAD START (obj=5899, id = 200024, name=ReplicaFetcherThread-0-3 on 
 broker 1, , group=main)
 THREAD START (obj=5899, id = 200025, name=ReplicaFetcherThread-0-0 on 
 broker 1, , group=main)
 THREAD START (obj=5899, id = 200026, name=ReplicaFetcherThread-0-1 on 
 broker 1, , group=main)
 THREAD START (obj=53ae, id = 200028, name=SIGINT handler, 
 group=system)
 THREAD START (obj=53ae, id = 200029, name=Thread-5, group=main)
 THREAD START (obj=574b, id = 200030, name=Thread-1, group=main)
 THREAD START (obj=574b, id = 200031, name=Thread-0, group=main)
 THREAD END (id = 200031)
 THREAD END (id = 200029)
 THREAD END (id = 200020)
 THREAD END (id = 200019)
 THREAD END (id = 28)
 THREAD END (id = 200021)
 THREAD END (id = 27)
 THREAD END (id = 200022)
 THREAD END (id = 200018)
 THREAD END (id = 200017)
 THREAD END (id = 200012)
 THREAD END (id = 200013)
 THREAD END (id = 200014)
 THREAD END (id = 200025)
 THREAD END (id = 200023)
 THREAD END (id = 200026)
 THREAD END (id = 200024)
 THREAD END (id = 200011)
 THREAD END (id = 29)
 THREAD END (id = 200010)
 THREAD END (id = 200030)
 THREAD END (id = 200028)
 TRACE 301281:
 sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown 
 line)
 sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
 sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
 sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
 sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
 
 sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:218)
 sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
 
 

[jira] [Resolved] (KAFKA-1355) Reduce/optimize update metadata requests sent during leader election

2014-04-18 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy resolved KAFKA-1355.
---

Resolution: Fixed

Thanks for the review. Committed to 0.8.1 as well.

 Reduce/optimize update metadata requests sent during leader election
 

 Key: KAFKA-1355
 URL: https://issues.apache.org/jira/browse/KAFKA-1355
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joel Koshy
Priority: Critical
 Fix For: 0.8.1.1

 Attachments: KAFKA-1355.patch, KAFKA-1355_2014-04-04_13:48:34.patch, 
 KAFKA-1355_2014-04-04_13:51:22.patch, KAFKA-1355_2014-04-17_14:48:57.patch


 This is part of the investigation into slow shutdowns in 0.8.1. While
 logging contributes to bulk of the regression, this one also adds
 quite a bit of overhead:
 In addLeaderAndIsrRequest (called for every partition that is led by the
 broker being shut down) we also add an UpdateMetadataRequest - each call to
 addUpdateMetadataRequests does two traversals over _all_ (global)
 partitions. I think it should be straightforward to optimize this a bit.
 Marking as critical, since it is not as big an overhead as the logging.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1359) Add topic/broker metrics once new topic/broker is discovered

2014-04-18 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974561#comment-13974561
 ] 

Guozhang Wang commented on KAFKA-1359:
--

I did a small benchmark on my mac laptop and linux desktop:

{code}
 public class NanoVsMillis {
 public static void main(String[] args) {
 int count = 10 * 1000 * 1000;
 long[] times = new long[count];
 long t1 = System.nanoTime();
 for (int i = 0; i  count; i++) {
 times[i] = System.nanoTime();
 }
 long t2 = System.nanoTime();
 for (int i = 0; i  count; i++) {
 times[i] = System.currentTimeMillis();
 }
 long t3 = System.nanoTime();
 System.out.println(Total time nano :  + ((t2 - t1) / (1000 * 1000)) 
+ ms);
 System.out.println(Total time millis:  + ((t3 - t2) / (1000 * 1000)) 
+ ms);
 System.out.println();
 System.out.println(Avg time nano :  + ((t2 - t1) / count) + ns);
 System.out.println(Avg time millis:  + ((t3 - t2) / count) + ns);
 }
 }
{code}

On the mac laptop, since the QueryPerformanceCounter/ 
QueryPerformanceFrequency API is available, nanoTime actually returns 
currentTimeMillis\*10\^6). , the results are

{code}
Total time nano : 366ms
Total time millis: 354ms

Avg time nano : 36ns
Avg time millis: 35ns
{code}

On the linux machine, the results are

{code}
Total time nano : 520ms
Total time millis: 470ms

Avg time nano : 52ns
Avg time millis: 47ns
{code}

So I think nanoTime will be a bit more expensive than millis on the linux box. 
If millis' accuracy is sufficient for us, maybe we should switch from nano?

 Add topic/broker metrics once new topic/broker is discovered
 

 Key: KAFKA-1359
 URL: https://issues.apache.org/jira/browse/KAFKA-1359
 Project: Kafka
  Issue Type: Sub-task
  Components: producer 
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.2

 Attachments: KAFKA-1359.patch, KAFKA-1359_2014-04-10_10:11:40.patch, 
 KAFKA-1359_2014-04-11_14:20:45.patch, KAFKA-1359_2014-04-16_09:53:55.patch


 Today some topic/broker level metrics are only added the first time such an 
 event (record-retry, record-error, etc) happens. This has a potential issue 
 for customized mbean reporter which needs to register all the sensors at the 
 time the new broker/topic is discovered. It is better to add such metrics at 
 the very beginning when new topic/brokers are discovered.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Jenkins build is back to normal : Kafka-trunk #168

2014-04-18 Thread Apache Jenkins Server
See https://builds.apache.org/job/Kafka-trunk/168/changes



[jira] [Resolved] (KAFKA-1398) Topic config changes can be lost and cause fatal exceptions on broker restarts

2014-04-18 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy resolved KAFKA-1398.
---

Resolution: Fixed

Committed to 0.8.1 as well

 Topic config changes can be lost and cause fatal exceptions on broker restarts
 --

 Key: KAFKA-1398
 URL: https://issues.apache.org/jira/browse/KAFKA-1398
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joel Koshy
Assignee: Jay Kreps
Priority: Critical
 Fix For: 0.8.1.1

 Attachments: KAFKA-1398.patch, KAFKA-1398.patch, KAFKA-1398.patch, 
 KAFKA-1398_2014-04-18_13:03:03.patch


 Our topic config cleanup policy seems to be broken. When a broker is
 bounced and starting up:
 1 - Read all the children of the config change path
 2 - For each, if the change id is greater than the last executed change,
   then extract the topic information.
 3 - If there is a log for that topic on this broker, then apply the change.
   However, if there is no log, then delete the config change.
 In step 3, a delete triggers a child change watch firing on all the other
 brokers. The other brokers currently take all the children of the config
 path but will ignore those config changes that are less than the last
 executed change. At least one issue here is that if a broker does not have
 partitions for a topic then the lastExecutedChange is not updated (for
 that topic).
 Consider this scenario:
 - Three brokers 0, 1, 2
 - Topic A has partitions only assigned to broker 0
 - Topic B has partitions only assigned to broker 1
 - Topic C has partitions only assigned to broker 2
 - Change 0: topic A
 - Change 1: topic B
 - Change 2: topic C
 - lastExecutedChange on broker 0 is 0
 - lastExecutedChange on broker 1 is 1
 - lastExecutedChange on broker 2 is 2
 - Bounce broker 1
 - The above bounce will cause Change 0 and Change 2 to get deleted.
 - Watch fires on broker 0 and 1
 - Broker 0 will try and read the topic corresponding to change 1 (since its
   lastExecutedChange is 0) and then change 2. That read will fail:
 2014/04/15 19:35:34.236 INFO [TopicConfigManager] [main] [kafka-server] [] 
 Processed topic config change 25 for topic xyz, setting new config to
  {retention.ms=360, segment.ms=360}.
 2014/04/15 19:35:34.238 FATAL [KafkaServerStartable] [main] [kafka-server] [] 
 Fatal error during KafkaServerStable startup. Prepare to shutdown
 org.I0Itec.zkclient.exception.ZkNoNodeException: 
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
 NoNode for /config/changes/config_change_26
 at 
 org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
 at kafka.utils.ZkUtils$.readData(ZkUtils.scala:467)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:97)
 at 
 kafka.server.TopicConfigManager$$anonfun$kafka$server$TopicConfigManager$$processConfigChanges$2.apply(TopicConfigManager.scala:93)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
 at 
 kafka.server.TopicConfigManager.kafka$server$TopicConfigManager$$processConfigChanges(TopicConfigManager.scala:93)
 at 
 kafka.server.TopicConfigManager.processAllConfigChanges(TopicConfigManager.scala:81)
 at 
 kafka.server.TopicConfigManager.startup(TopicConfigManager.scala:72)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:104)
 at 
 kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
 ...
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
 KeeperErrorCode = NoNode for /config/changes/config_change_26
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:956)
 at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
 at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
 at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
 ... 39 more
 Another issue is that there are two logging statements with incorrect
 qualifiers which makes things a little harder to debug. E.g.,
 

[jira] [Commented] (KAFKA-1359) Add topic/broker metrics once new topic/broker is discovered

2014-04-18 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974624#comment-13974624
 ] 

Jay Kreps commented on KAFKA-1359:
--

Yeah I'm not opposed to switching.

 Add topic/broker metrics once new topic/broker is discovered
 

 Key: KAFKA-1359
 URL: https://issues.apache.org/jira/browse/KAFKA-1359
 Project: Kafka
  Issue Type: Sub-task
  Components: producer 
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.2

 Attachments: KAFKA-1359.patch, KAFKA-1359_2014-04-10_10:11:40.patch, 
 KAFKA-1359_2014-04-11_14:20:45.patch, KAFKA-1359_2014-04-16_09:53:55.patch


 Today some topic/broker level metrics are only added the first time such an 
 event (record-retry, record-error, etc) happens. This has a potential issue 
 for customized mbean reporter which needs to register all the sensors at the 
 time the new broker/topic is discovered. It is better to add such metrics at 
 the very beginning when new topic/brokers are discovered.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 20468: Add log cleaner metrics. (Cherry-picked from trunk)

2014-04-18 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20468/#review40838
---

Ship it!



core/src/main/scala/kafka/log/LogCleanerManager.scala
https://reviews.apache.org/r/20468/#comment74050

It seems that we should set the ratio back to 0 if dirtyLogs is empty. 
Otherwise, the reported metrics will stay with the last updated value, which 
can be miss-leading.


- Jun Rao


On April 17, 2014, 10:33 p.m., Joel Koshy wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20468/
 ---
 
 (Updated April 17, 2014, 10:33 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1380
 https://issues.apache.org/jira/browse/KAFKA-1380
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1327 Add log cleaner metrics.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/log/Log.scala 
 b3ab5220a66a2ae82084dad89877daf60f613e66 
   core/src/main/scala/kafka/log/LogCleaner.scala 
 312204c6ddd0cf46cd7349d79a43edec839bc361 
   core/src/main/scala/kafka/log/LogCleanerManager.scala 
 79e9d556d22c5ecd9be2439e5c54caad9b58735c 
   core/src/main/scala/kafka/utils/Throttler.scala 
 c6c3c75ee8408ca81aeeb5846f7987a287b5a6e8 
 
 Diff: https://reviews.apache.org/r/20468/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Joel Koshy
 




Re: Review Request 20468: Add log cleaner metrics. (Cherry-picked from trunk)

2014-04-18 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20468/#review40840
---


Also, I think we need to make the following var volatile.

CleanerStats.startTime, mapCompleteTime, endTime, bytesRead, bytesWritten, 
mapBytesRead, mapMessagesRead, messagesRead, messagesWritten

LogCleanerManager.dirtiestLogCleanableRatio

Jay,

Could you double check?

- Jun Rao


On April 17, 2014, 10:33 p.m., Joel Koshy wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20468/
 ---
 
 (Updated April 17, 2014, 10:33 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1380
 https://issues.apache.org/jira/browse/KAFKA-1380
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1327 Add log cleaner metrics.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/log/Log.scala 
 b3ab5220a66a2ae82084dad89877daf60f613e66 
   core/src/main/scala/kafka/log/LogCleaner.scala 
 312204c6ddd0cf46cd7349d79a43edec839bc361 
   core/src/main/scala/kafka/log/LogCleanerManager.scala 
 79e9d556d22c5ecd9be2439e5c54caad9b58735c 
   core/src/main/scala/kafka/utils/Throttler.scala 
 c6c3c75ee8408ca81aeeb5846f7987a287b5a6e8 
 
 Diff: https://reviews.apache.org/r/20468/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Joel Koshy