[ 
https://issues.apache.org/jira/browse/KAFKA-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Morgan updated KAFKA-2194:
------------------------------
    Description: 
Trying to separate the kafka-logs and ZooKeeper data from the primary Kafka 
folder, so that the system can be distributed separately from the data 
folders. Initialisation seems to succeed (e.g. old topics from kafka-logs are 
loaded successfully).
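
For context, relocating the two data directories normally means changing one 
setting in each config file. A minimal sketch, assuming a hypothetical D:/data 
root (the actual paths used in this report are not given); `log.dirs` and 
`dataDir` are the standard broker and ZooKeeper settings:

```properties
# config/server.properties (Kafka broker)
# Hypothetical location for the partition logs, outside the Kafka install folder.
log.dirs=D:/data/kafka-logs

# config/zookeeper.properties (ZooKeeper)
# Hypothetical location for ZooKeeper snapshots and transaction logs.
dataDir=D:/data/zookeeper
```

Forward slashes are used because a backslash is an escape character in Java 
properties files.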

Steps to reproduce:

1. Start ZooKeeper
2. Start Kafka (write data to Kafka so that data is available in kafka-logs).
3. Kill Kafka
4. Kill ZooKeeper
5. Start Kafka
6. Start ZooKeeper
7. Try reading from Kafka
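
One way to check whether the ordering in steps 5 and 6 matters is to hold back 
the Kafka start until ZooKeeper's client port accepts connections. A minimal 
sketch, assuming ZooKeeper listens on its default client port 2181 on the local 
machine; `wait_for_port` is a hypothetical helper, not part of Kafka or 
ZooKeeper, and an open port does not by itself prove ZooKeeper is ready to 
serve requests:

```python
import socket
import time


def wait_for_port(host, port, timeout_s=30.0, interval_s=0.5):
    """Return True once a TCP connection to (host, port) succeeds.

    Retries every interval_s seconds and returns False if no connection
    succeeds before timeout_s has elapsed.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Open and immediately close a connection as a reachability probe.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

For example, a restart script could call `wait_for_port("localhost", 2181)` and 
launch the broker only when it returns True.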

Logs:

Seeing the following in server.log (where LcmdSegments is the topic). 

2015-05-15 12:06:38,290 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148440 from client  on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:38,384 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,384 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148519 from client  on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,493 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,493 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148598 from client  on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,603 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,603 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148677 from client  on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,696 INFO KAFKA.Processor: Closing socket connection to /10.44.18.75.
2015-05-15 12:06:39,696 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with correlation id 148756 from client  on partition [LcmdSegments,14] failed due to Partition [LcmdSegments,14] doesn't exist on 1
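
To see whether partitions other than 14 are affected, the WARN entries in a 
full server.log can be tallied per partition. A rough sketch, assuming each 
log entry occupies a single line in the actual file (the wrapping here comes 
from the email) and follows the message format quoted above; 
`count_failed_partitions` is a hypothetical helper name:

```python
import re
from collections import Counter

# Matches the "Produce request ... failed" WARN entries quoted above.
# The client id may be empty, as it is in these logs.
PRODUCE_FAILURE = re.compile(
    r"Produce request with correlation id \d+ from client.*?"
    r"on partition \[(?P<topic>[^,\]]+),(?P<partition>\d+)\] failed"
)


def count_failed_partitions(lines):
    """Count failed produce requests per (topic, partition) pair."""
    counts = Counter()
    for line in lines:
        match = PRODUCE_FAILURE.search(line)
        if match:
            key = (match.group("topic"), int(match.group("partition")))
            counts[key] += 1
    return counts
```

Passing an open file object for server.log as `lines` works, since iterating a 
text file yields one line at a time.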

And the KafkaNet client returns:

Topic:LcmdSegments returned error code of LeaderNotAvailable.  Retrying.
Backing off metadata request retry.  Waiting for 62500ms.

state-change.log shows an error:

2015-05-15 11:44:18,110 ERROR KAFKA.logger: Controller 1 epoch 2 initiated state change for partition [LcmdSegments,14] from OfflinePartition to OnlinePartition failed
kafka.common.NoReplicaOnlineException: No replica for partition [LcmdSegments,14] is alive. Live brokers are: [Set()], Assigned replicas are: [List(1)]
        at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75)
        at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357)
        at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206)
        at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120)
        at kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        at kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117)
        at kafka.controller.PartitionStateMachine.startup(PartitionStateMachine.scala:70)
        at kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:314)
        at kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:161)
        at kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:81)
        at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply$mcZ$sp(ZookeeperLeaderElector.scala:49)
        at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
        at kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
        at kafka.utils.Utils$.inLock(Utils.scala:535)
        at kafka.server.ZookeeperLeaderElector.startup(ZookeeperLeaderElector.scala:47)
        at kafka.controller.KafkaController$$anonfun$startup$1.apply$mcV$sp(KafkaController.scala:650)
        at kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:646)
        at kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:646)
        at kafka.utils.Utils$.inLock(Utils.scala:535)
        at kafka.controller.KafkaController.startup(KafkaController.scala:646)
        at kafka.server.KafkaServer.startup(KafkaServer.scala:117)
        at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:29)
        at kafka.Kafka$.main(Kafka.scala:46)
        at kafka.Kafka.main(Kafka.scala)

---
I upgraded from 0.8.0 to 0.8.2.1 to try to avoid the original issues with 
KAFKA-1029 and KAFKA-1451. Unfortunately, 0.8.2.1 seems to bring a new issue: 
the error above.



  was:
Trying to separate out Kafka-logs and Zookeeper data from the primary kafka 
folder, so that distributed system can be distributed separately to the data 
folders. Initialisation seems to succeed (e.g. old topics from kafka-logs is 
loaded successfully).

Steps to reproduce:

1. Kill Kafka
2. Kill Zookeeper
3. Start Kafka
4. Start Zookeeper

Logs:

Seeing the following in server.log (where LcmdSegments is the topic). 

2015-05-15 12:06:38,290 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148440 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:38,384 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,384 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148519 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,493 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,493 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148598 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,603 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,603 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148677 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,696 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,696 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148756 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1

And KafkaNet client returns:

Topic:LcmdSegments returned error code of LeaderNotAvailable.  Retrying.
Backing off metadata request retry.  Waiting for 62500ms.

state-change.log shows an error:

2015-05-15 11:44:18,110 ERROR KAFKA.logger: Controller 1 epoch 2 initiated 
state change for partition [LcmdSegments,14] from OfflinePartition to 
OnlinePartition failed
kafka.common.NoReplicaOnlineException: No replica for partition 
[LcmdSegments,14] is alive. Live brokers are: [Set()], Assigned replicas are: 
[List(1)]
        at 
kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75)
        at 
kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357)
        at 
kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206)
        at 
kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120)
        at 
kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117)
        at 
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at 
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        at 
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        at 
kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117)
        at 
kafka.controller.PartitionStateMachine.startup(PartitionStateMachine.scala:70)
        at 
kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:314)
        at 
kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:161)
        at 
kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:81)
        at 
kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply$mcZ$sp(ZookeeperLeaderElector.scala:49)
        at 
kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
        at 
kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
        at kafka.utils.Utils$.inLock(Utils.scala:535)
        at 
kafka.server.ZookeeperLeaderElector.startup(ZookeeperLeaderElector.scala:47)
        at 
kafka.controller.KafkaController$$anonfun$startup$1.apply$mcV$sp(KafkaController.scala:650)
        at 
kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:646)
        at 
kafka.controller.KafkaController$$anonfun$startup$1.apply(KafkaController.scala:646)
        at kafka.utils.Utils$.inLock(Utils.scala:535)
        at kafka.controller.KafkaController.startup(KafkaController.scala:646)
        at kafka.server.KafkaServer.startup(KafkaServer.scala:117)
        at 
kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:29)
        at kafka.Kafka$.main(Kafka.scala:46)
        at kafka.Kafka.main(Kafka.scala)

---
I upgraded from 0.8.0 to 0.8.2.1 to try and avoid the original issues with 
KAFKA-1029 and KAFKA-1451. Unfortunately it seems I have new issues with 
0.8.2.1 with the above error.




> Produce request failure after Kafka + Zookeeper restart
> -------------------------------------------------------
>
>                 Key: KAFKA-2194
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2194
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, producer 
>    Affects Versions: 0.8.2.1
>         Environment: Windows Server 2012 R2
>            Reporter: Ian Morgan
>            Assignee: Jun Rao



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
