[ https://issues.apache.org/jira/browse/KAFKA-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Morgan updated KAFKA-2194:
------------------------------
    Description:
Trying to separate out Kafka-logs and Zookeeper data from the primary kafka folder, so that the distributed system can be distributed separately to the data folders. Initialisation seems to succeed (e.g. old topics from kafka-logs are loaded successfully).

Steps to reproduce:
1. Start ZooKeeper
2. Start Kafka (write data to Kafka so that data is available in kafka-logs).
3. Kill Kafka
4. Kill Zookeeper
5. Start Kafka
6. Start Zookeeper
7. Try reading from Kafka

Logs:
Seeing the following in server.log (where LcmdSegments is the topic).
2015-05-15 12:06:38,290 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148440 from client on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:38,384 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,384 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148519 from client on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,493 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,493 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148598 from client on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,603 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,603 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148677 from client on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,696 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,696 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148756 from client on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1

And KafkaNet client returns:
Topic:LcmdSegments returned error code of LeaderNotAvailable. Retrying.
Backing off metadata request retry. Waiting for 62500ms.

state-change.log shows an error:
2015-05-15 11:44:18,110 ERROR KAFKA.logger: Controller 1 epoch 2 initiated state change for partition [LcmdSegments,14] from OfflinePartition to OnlinePartition failed
kafka.common.NoReplicaOnlineException: No replica for partition [LcmdSegments,14] is alive. Live brokers are: [Set()], Assigned replicas are: [List(1)]
    at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75)
    at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357)
    at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206)
    at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120)
    at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117)
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
    at kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117)
    at kafka.controller.PartitionStateMachine.startup(PartitionStateMachine.scala:70)
    at kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:314)
    at kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:161)
    at kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:81)
    at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply$mcZ$sp(ZookeeperLeaderElector.scala:49)
    at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
    at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
    at kafka.utils.Utils$.inLock(Utils.scala:535)
    at kafka.server.ZookeeperLeaderElector.startup(ZookeeperLeaderElector.scala:47)
    at kafka.controller.KafkaController$$anonfun$startup$1.apply$mcV$sp(KafkaController.scala:650)
    at kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:646)
    at kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:646)
    at kafka.utils.Utils$.inLock(Utils.scala:535)
    at kafka.controller.KafkaController.startup(KafkaController.scala:646)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:117)
    at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:29)
    at kafka.Kafka$.main(Kafka.scala:46)
    at kafka.Kafka.main(Kafka.scala)

---
I upgraded from 0.8.0 to 0.8.2.1 to try and avoid the original issues with KAFKA-1029 and KAFKA-1451. Unfortunately it seems I have new issues with 0.8.2.1 with the above error.

  was:
Trying to separate out Kafka-logs and Zookeeper data from the primary kafka folder, so that the distributed system can be distributed separately to the data folders. Initialisation seems to succeed (e.g. old topics from kafka-logs are loaded successfully).

Steps to reproduce:
1. Kill Kafka
2. Kill Zookeeper
3. Start Kafka
4. Start Zookeeper

Logs:
Seeing the following in server.log (where LcmdSegments is the topic).
2015-05-15 12:06:38,290 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148440 from client on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:38,384 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,384 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148519 from client on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,493 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,493 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148598 from client on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,603 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,603 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148677 from client on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,696 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,696 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148756 from client on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1

And KafkaNet client returns:
Topic:LcmdSegments returned error code of LeaderNotAvailable. Retrying.
Backing off metadata request retry. Waiting for 62500ms.

state-change.log shows an error:
2015-05-15 11:44:18,110 ERROR KAFKA.logger: Controller 1 epoch 2 initiated state change for partition [LcmdSegments,14] from OfflinePartition to OnlinePartition failed
kafka.common.NoReplicaOnlineException: No replica for partition [LcmdSegments,14] is alive. Live brokers are: [Set()], Assigned replicas are: [List(1)]
    at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75)
    at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357)
    at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206)
    at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120)
    at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117)
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
    at kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117)
    at kafka.controller.PartitionStateMachine.startup(PartitionStateMachine.scala:70)
    at kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:314)
    at kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:161)
    at kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:81)
    at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply$mcZ$sp(ZookeeperLeaderElector.scala:49)
    at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
    at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
    at kafka.utils.Utils$.inLock(Utils.scala:535)
    at kafka.server.ZookeeperLeaderElector.startup(ZookeeperLeaderElector.scala:47)
    at kafka.controller.KafkaController$$anonfun$startup$1.apply$mcV$sp(KafkaController.scala:650)
    at kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:646)
    at kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:646)
    at kafka.utils.Utils$.inLock(Utils.scala:535)
    at kafka.controller.KafkaController.startup(KafkaController.scala:646)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:117)
    at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:29)
    at kafka.Kafka$.main(Kafka.scala:46)
    at kafka.Kafka.main(Kafka.scala)

---
I upgraded from 0.8.0 to 0.8.2.1 to try and avoid the original issues with KAFKA-1029 and KAFKA-1451. Unfortunately it seems I have new issues with 0.8.2.1 with the above error.


> Produce request failure after Kafka + Zookeeper restart
> -------------------------------------------------------
>
>                 Key: KAFKA-2194
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2194
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, producer
>    Affects Versions: 0.8.2.1
>         Environment: Windows Server 2012 R2
>            Reporter: Ian Morgan
>            Assignee: Jun Rao
>
> Trying to separate out Kafka-logs and Zookeeper data from the primary kafka
> folder, so that the distributed system can be distributed separately to the
> data folders. Initialisation seems to succeed (e.g. old topics from kafka-logs
> are loaded successfully).
> Steps to reproduce:
> 1. Start ZooKeeper
> 2. Start Kafka (write data to Kafka so that data is available in kafka-logs).
> 3. Kill Kafka
> 4. Kill Zookeeper
> 5. Start Kafka
> 6. Start Zookeeper
> 7. Try reading from Kafka
> Logs:
> Seeing the following in server.log (where LcmdSegments is the topic).
> 2015-05-15 12:06:38,290 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request
> with correlation id 148440 from client on partition [LcmdSegments,14] failed
> due to Partition [LcmdSegments,14] doesn't exist on 1
> 2015-05-15 12:06:38,384 INFO KAFKA.Processor: Closing socket connection to
> /10.44.18.75.
> 2015-05-15 12:06:39,384 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request
> with correlation id 148519 from client on partition [LcmdSegments,14] failed
> due to Partition [LcmdSegments,14] doesn't exist on 1
> 2015-05-15 12:06:39,493 INFO KAFKA.Processor: Closing socket connection to
> /10.44.18.75.
> 2015-05-15 12:06:39,493 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request
> with correlation id 148598 from client on partition [LcmdSegments,14] failed
> due to Partition [LcmdSegments,14] doesn't exist on 1
> 2015-05-15 12:06:39,603 INFO KAFKA.Processor: Closing socket connection to
> /10.44.18.75.
> 2015-05-15 12:06:39,603 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request
> with correlation id 148677 from client on partition [LcmdSegments,14] failed
> due to Partition [LcmdSegments,14] doesn't exist on 1
> 2015-05-15 12:06:39,696 INFO KAFKA.Processor: Closing socket connection to
> /10.44.18.75.
> 2015-05-15 12:06:39,696 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request
> with correlation id 148756 from client on partition [LcmdSegments,14] failed
> due to Partition [LcmdSegments,14] doesn't exist on 1
> And KafkaNet client returns:
> Topic:LcmdSegments returned error code of LeaderNotAvailable. Retrying.
> Backing off metadata request retry. Waiting for 62500ms.
> state-change.log shows an error:
> 2015-05-15 11:44:18,110 ERROR KAFKA.logger: Controller 1 epoch 2 initiated
> state change for partition [LcmdSegments,14] from OfflinePartition to
> OnlinePartition failed
> kafka.common.NoReplicaOnlineException: No replica for partition
> [LcmdSegments,14] is alive. Live brokers are: [Set()], Assigned replicas are:
> [List(1)]
>     at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75)
>     at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357)
>     at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206)
>     at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120)
>     at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117)
>     at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>     at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>     at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>     at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>     at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>     at kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117)
>     at kafka.controller.PartitionStateMachine.startup(PartitionStateMachine.scala:70)
>     at kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:314)
>     at kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:161)
>     at kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:81)
>     at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply$mcZ$sp(ZookeeperLeaderElector.scala:49)
>     at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
>     at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
>     at kafka.utils.Utils$.inLock(Utils.scala:535)
>     at kafka.server.ZookeeperLeaderElector.startup(ZookeeperLeaderElector.scala:47)
>     at kafka.controller.KafkaController$$anonfun$startup$1.apply$mcV$sp(KafkaController.scala:650)
>     at kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:646)
>     at kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:646)
>     at kafka.utils.Utils$.inLock(Utils.scala:535)
>     at kafka.controller.KafkaController.startup(KafkaController.scala:646)
>     at kafka.server.KafkaServer.startup(KafkaServer.scala:117)
>     at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:29)
>     at kafka.Kafka$.main(Kafka.scala:46)
>     at kafka.Kafka.main(Kafka.scala)
> ---
> I upgraded from 0.8.0 to 0.8.2.1 to try and avoid the original issues with
> KAFKA-1029 and KAFKA-1451. Unfortunately it seems I have new issues with
> 0.8.2.1 with the above error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
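
For anyone triaging logs like the ones quoted in this report, a minimal Python sketch follows. It is not part of the original report. It counts the "Produce request ... failed" warnings per partition and pulls the live-broker and assigned-replica lists out of the NoReplicaOnlineException message. The regular expressions are assumptions derived only from the sample lines above, not from any documented Kafka log format, and the function names (count_produce_failures, replica_status) are hypothetical.

```python
import re
from collections import Counter

# Assumed pattern, based only on the WARN lines quoted in the report.
PRODUCE_FAILURE = re.compile(
    r"Produce request with correlation id (?P<cid>\d+) from client\s+on partition "
    r"\[(?P<topic>[^,\]]+),(?P<partition>\d+)\] failed due to (?P<reason>.+)$"
)

# Assumed pattern for the NoReplicaOnlineException message in state-change.log.
NO_REPLICA = re.compile(
    r"Live brokers are: \[Set\((?P<live>[^)]*)\)\], "
    r"Assigned replicas are: \[List\((?P<assigned>[^)]*)\)\]"
)


def count_produce_failures(lines):
    """Count failed produce requests per (topic, partition)."""
    counts = Counter()
    for line in lines:
        match = PRODUCE_FAILURE.search(line)
        if match:
            key = (match.group("topic"), int(match.group("partition")))
            counts[key] += 1
    return counts


def _parse_ids(text):
    """Turn a comma-separated id list such as '1, 2' into [1, 2]."""
    return [int(item) for item in text.split(",") if item.strip()]


def replica_status(line):
    """Return (live_brokers, assigned_replicas), or None if the line does not match."""
    match = NO_REPLICA.search(line)
    if match is None:
        return None
    return _parse_ids(match.group("live")), _parse_ids(match.group("assigned"))


if __name__ == "__main__":
    sample = [
        "2015-05-15 12:06:38,290 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request "
        "with correlation id 148440 from client on partition [LcmdSegments,14] failed "
        "due to Partition [LcmdSegments,14] doesn't exist on 1",
        "2015-05-15 12:06:38,384 INFO KAFKA.Processor: Closing socket connection to "
        "/10.44.18.75.",
    ]
    print(count_produce_failures(sample))
    print(replica_status(
        "kafka.common.NoReplicaOnlineException: No replica for partition "
        "[LcmdSegments,14] is alive. Live brokers are: [Set()], "
        "Assigned replicas are: [List(1)]"
    ))
```

An empty live-broker list alongside a non-empty assigned list, as in the state-change.log excerpt, means the controller saw no registered broker when it attempted the leader election.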