Re: Kafka Performance Tuning
Hi Yashika,

Having no entries in the broker logs is not normal. Can you verify whether logging was turned off in your log4j properties file? If it was, please enable it, try again, and see what shows up in the logs.

Tim

On Thu, Apr 24, 2014 at 10:53 PM, Yashika Gupta wrote:

> Jun,
>
> I am using Kafka 2.8.0-0.8.0 version.
> There are no logs for the past month in the controller and state-change log.
>
> Though I can see some gc logs in the kafka-home-dir/logs folder.
> zookeeper-gc.log
> kafkaServer-gc.log
>
> Yashika
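One way to act on Tim's suggestion is to scan the log4j properties file for loggers whose level is OFF. The following is a minimal sketch, assuming log4j 1.x properties syntax (`log4j.rootLogger=LEVEL, appender` and `log4j.logger.<name>=LEVEL, appender`). The function name `find_disabled_loggers` and the sample configuration are hypothetical and are not taken from the thread.

```python
def find_disabled_loggers(properties_text):
    """Return the names of loggers whose level is OFF in log4j 1.x properties text."""
    disabled = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments ('#' or '!' start a comment in .properties files).
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key = key.strip()
        if key == "log4j.rootLogger":
            name = "root"
        elif key.startswith("log4j.logger."):
            name = key[len("log4j.logger."):]
        else:
            continue
        # The level is the first comma-separated item; appender names follow it.
        level = value.split(",")[0].strip().upper()
        if level == "OFF":
            disabled.append(name)
    return disabled


# Hypothetical sample configuration, for illustration only.
SAMPLE = """
# broker logging
log4j.rootLogger=INFO, stdout
log4j.logger.kafka.controller=OFF, controllerAppender
log4j.logger.state.change.logger=TRACE, stateChangeAppender
"""

print(find_disabled_loggers(SAMPLE))
```

If the list comes back non-empty, those loggers would be the first place to look before concluding that the broker genuinely has nothing to report.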
RE: Kafka Performance Tuning
Jun,

I am using Kafka 2.8.0-0.8.0 version.
There are no logs for the past month in the controller and state-change log.

Though I can see some gc logs in the kafka-home-dir/logs folder:
zookeeper-gc.log
kafkaServer-gc.log

Yashika
__
From: Jun Rao
Sent: Friday, April 25, 2014 9:03 AM
To: users@kafka.apache.org
Subject: Re: Kafka Performance Tuning

Which version of Kafka are you using? Any error in the controller and state-change log?

Thanks,

Jun
Re: Kafka Performance Tuning
Which version of Kafka are you using? Any error in the controller and state-change log?

Thanks,

Jun

On Thu, Apr 24, 2014 at 7:37 PM, Yashika Gupta wrote:

> I am running a single broker and the leader column has 0 as the value.
Re: Kafka Performance Tuning
I am running a single broker and the leader column has 0 as the value.

pushkar priyadarshi wrote:

> You can use kafka-list-topic.sh to find out whether the leader for a
> particular topic is available. A -1 in the leader column might indicate trouble.
Re: Kafka Performance Tuning
I had cleaned up the topics using the following command:

rm -rf /tmp/kafka-logs/*

and verified using the topics list command before executing the script.
Am I missing anything else?

Regards,
Yashika

Guozhang Wang wrote:

Could you double check if the topic LOGFILE04 is already created on the servers?

Guozhang

On Thu, Apr 24, 2014 at 10:46 AM, Yashika Gupta wrote:

> Jun,
>
> The detailed logs are as follows:
>
> 24.04.2014 13:37:31812 INFO main kafka.producer.SyncProducer - Disconnecting from localhost:9092
> 24.04.2014 13:37:38612 WARN main kafka.producer.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for topic LOGFILE04 -> No partition metadata for topic LOGFILE04 due to kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class kafka.common.LeaderNotAvailableException
> 24.04.2014 13:37:40712 INFO main kafka.client.ClientUtils$ - Fetching metadata from broker id:0,host:localhost,port:9092 with correlation id 1 for 1 topic(s) Set(LOGFILE04)
> 24.04.2014 13:37:41212 INFO main kafka.producer.SyncProducer - Connected to localhost:9092 for producing
> 24.04.2014 13:37:48812 INFO main kafka.producer.SyncProducer - Disconnecting from localhost:9092
> 24.04.2014 13:37:48912 WARN main kafka.producer.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for topic LOGFILE04 -> No partition metadata for topic LOGFILE04 due to kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class kafka.common.LeaderNotAvailableException
> 24.04.2014 13:37:49012 ERROR main kafka.producer.async.DefaultEventHandler - Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: LOGFILE04
>
> 24.04.2014 13:39:96513 WARN ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0 kafka.consumer.ConsumerFetcherThread - [ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 4; ClientId: produceLogLine2-ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0; ReplicaId: -1; MaxWait: 6 ms; MinBytes: 1 bytes; RequestInfo: [LOGFILE04,0] -> PartitionFetchInfo(2,1048576)
> java.net.SocketTimeoutException
>     at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
>     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
>     at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
>     at kafka.utils.Utils$.read(Unknown Source)
>     at kafka.network.BoundedByteBufferReceive.readFrom(Unknown Source)
>     at kafka.network.Receive$class.readCompletely(Unknown Source)
>     at kafka.network.BoundedByteBufferReceive.readCompletely(Unknown Source)
>     at kafka.network.BlockingChannel.receive(Unknown Source)
>     at kafka.consumer.SimpleConsumer.liftedTree1$1(Unknown Source)
>     at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown Source)
>     at kafka.metrics.KafkaTimer.time(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown Source)
>     at kafka.metrics.KafkaTimer.time(Unknown Source)
>     at kafka.consumer.SimpleConsumer.fetch(Unknown Source)
>     at kafka.server.AbstractFetcherThread.processFetchRequest(Unknown Source)
>     at kafka.server.AbstractFetcherThread.doWork(Unknown Source)
>     at kafka.utils.ShutdownableThread.run(Unknown Source)
>
> Regards,
> Yashika
>
> From: Jun Rao
> Sent: Thursday, April 24, 2014 10:49 PM
> To: users@kafka.apache.org
> Subject: Re: Kafka Performance Tuning
>
> Before that error message, the log should tell you the cause of the error.
> Could you dig that out?
>
> Thanks,
>
> Jun
>
> On Thu, Apr 24, 2014 at 10:12 AM, Yashika Gupta <yashika.gu...@impetus.co.in> wrote:
>
> > Hi,
> >
> > I am working on a POC where I have 1 Zookeeper and 2 Kafka Brokers on my
> > local machine. I am running 8 sets of Kafka consumers and producers
> > in parallel.
> >
> > Below are my configurations:
> > Consumer Configs:
> > zookeeper.session.timeout.ms=12
> > zookeeper.sync.time.ms=2000
> > zookeeper.connection.timeout.ms=12
> > auto.commit.interval.ms=6
> > rebalance.backoff.ms=2000
> > fetch.wait.max.ms=6
> > auto.offset.reset=smallest
> >
> > Produce
Re: Kafka Performance Tuning
You can use kafka-list-topic.sh to find out whether the leader for a particular topic is available. A -1 in the leader column might indicate trouble.

On Fri, Apr 25, 2014 at 6:34 AM, Guozhang Wang wrote:

> Could you double check if the topic LOGFILE04 is already created on the
> servers?
>
> Guozhang
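The leader check described in this message can be scripted. The following is a minimal sketch, assuming the listing tool prints one line per partition containing `topic:`, `partition:` and `leader:` fields. The exact output format differs between Kafka versions, so the regular expression, the function name `partitions_without_leader` and the sample lines are all assumptions made for illustration.

```python
import re

# Matches "topic: X ... partition: N ... leader: M", ignoring case.
LINE_RE = re.compile(
    r"topic:\s*(?P<topic>\S+).*?partition:\s*(?P<partition>\d+).*?leader:\s*(?P<leader>-?\d+)",
    re.IGNORECASE,
)


def partitions_without_leader(listing):
    """Return (topic, partition) pairs whose leader column is -1."""
    missing = []
    for line in listing.splitlines():
        match = LINE_RE.search(line)
        if match and int(match.group("leader")) == -1:
            missing.append((match.group("topic"), int(match.group("partition"))))
    return missing


# Hypothetical listing, not output captured from the cluster in this thread.
SAMPLE = """\
topic: LOGFILE04 partition: 0 leader: 0 replicas: 0 isr: 0
topic: LOGFILE04 partition: 1 leader: -1 replicas: 0 isr:
"""

print(partitions_without_leader(SAMPLE))
```

An empty result only says that every listed partition reports a leader; it does not by itself rule out the metadata errors shown earlier in the thread.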
Re: question about isr
I did not do any partition reassignment. This phenomenon happens when the broker logs the following errors.

[hadoop@nelo76 libs]$
[2014-03-14 12:11:44,310] INFO Partition [nelo2-normal-logs,0] on broker 0: Shrinking ISR for partition [nelo2-normal-logs,0] from 0,1 to 0 (kafka.cluster.Partition)
[2014-03-14 12:11:44,313] ERROR Conditional update of path /brokers/topics/nelo2-normal-logs/partitions/0/state with data {"controller_epoch":4,"leader":0,"version":1,"leader_epoch":5,"isr":[0]} and expected version 7 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/nelo2-normal-logs/partitions/0/state (kafka.utils.ZkUtils$)
[2014-03-14 12:11:44,313] INFO Partition [nelo2-normal-logs,0] on broker 0: Cached zkVersion [7] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2014-03-14 12:11:44,313] INFO Partition [nelo2-symbolicated-logs,1] on broker 0: Shrinking ISR for partition [nelo2-symbolicated-logs,1] from 0,2 to 0 (kafka.cluster.Partition)
[2014-03-14 12:11:44,315] ERROR Conditional update of path /brokers/topics/nelo2-symbolicated-logs/partitions/1/state with data {"controller_epoch":4,"leader":0,"version":1,"leader_epoch":6,"isr":[0]} and expected version 8 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/nelo2-symbolicated-logs/partitions/1/state (kafka.utils.ZkUtils$)
[2014-03-14 12:11:44,315] INFO Partition [nelo2-symbolicated-logs,1] on broker 0: Cached zkVersion [8] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2014-03-14 12:11:44,316] INFO Partition [nelo2-crash-logs,1] on broker 0: Shrinking ISR for partition [nelo2-crash-logs,1] from 0,1 to 0 (kafka.cluster.Partition)
[2014-03-14 12:11:44,318] ERROR Conditional update of path /brokers/topics/nelo2-crash-logs/partitions/1/state with data {"controller_epoch":4,"leader":0,"version":1,"leader_epoch":5,"isr":[0]} and expected version 7 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/nelo2-crash-logs/partitions/1/state (kafka.utils.ZkUtils$)
[2014-03-14 12:11:44,318] INFO Partition [nelo2-crash-logs,1] on broker 0: Cached zkVersion [7] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partit

Best Regards
Jerry

-----Original Message-----
From: "Jun Rao"
To: "users@kafka.apache.org"; "陈小军";
Cc:
Sent: 2014-04-25 (Friday) 02:12:02
Subject: Re: question about isr

Interesting. Which version of Kafka are you using? Were you doing some partition reassignment?

Thanks,

Jun

On Wed, Apr 23, 2014 at 11:14 PM, 陈小军 wrote:

Hi Team,

I found a strange phenomenon in the ISR list in my Kafka cluster.

When I use the tool that Kafka provides to get the topic information, it shows the ISR list as follows, which seems OK:

[irt...@xseed171.kdev bin]$ ./kafka-topics.sh --describe --zookeeper 10.96.250.215:10013,10.96.250.216:10013,10.96.250.217:10013/nelo2-kafka
Topic:nelo2-normal-logs  PartitionCount:3  ReplicationFactor:2  Configs:
    Topic: nelo2-normal-logs  Partition: 0  Leader: 3  Replicas: 3,0  Isr: 0,3
    Topic: nelo2-normal-logs  Partition: 1  Leader: 0  Replicas: 0,1  Isr: 0,1
    Topic: nelo2-normal-logs  Partition: 2  Leader: 1  Replicas: 1,3  Isr: 1,3

But when I use an SDK to get the metadata from the broker, the ISR is different:

metadata: { size: 246, correlationId: 0, brokerNum: -1, nodeId: 1, host: 'xseed171.kdev.nhnsystem.com', port: 9093, topicNum: 0, topicError: 0, topic: 'nelo2-normal-logs', partitionNum: 2, errorCode: 0, partition: 0, leader: 3, replicasNum: 2, replicas: [ 3, 0 ], isrNum: 2, isr: [ 0, 3 ] }
metadata: { size: 246, correlationId: 0, brokerNum: -1, nodeId: 1, host: 'xseed171.kdev.nhnsystem.com', port: 9093, topicNum: 0, topicError: 0, topic: 'nelo2-normal-logs', partitionNum: 1, errorCode: 0, partition: 1, leader: 0, replicasNum: 2, replicas: [ 0, 1 ], isrNum: 2, isr: [ 0, 1 ] }
metadata: { size: 246, correlationId: 0, brokerNum: -1, nodeId: 1, host: 'xseed171.kdev.nhnsystem.com', port: 9093, topicNum: 0, topicError: 0, topic: 'nelo2-normal-logs', partitionNum: 0, errorCode: 0, partition: 2, leader: 1, replicasNum: 2, replicas: [ 1, 3 ], isrNum: 1, isr: [ 1 ] }

I also used another SDK and got the same result. I checked the Kafka logs, and it seems the SDK result is right and the tool's result is wrong. Why does this happen?

[2014-04-24 14:53:57,705] TRACE Broker 3 cached leader info (LeaderAndIsrInfo:(Leader:0,ISR:0,1,LeaderEpoch:7,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:0,1) for partition [nel
Re: Brokers throwing warning messages after change in retention policy and multiple produce failures
Hi Sadhan,

Do you see any errors in the server logs?

Guozhang

On Thu, Apr 24, 2014 at 12:57 PM, Sadhan Sood wrote:

> We are seeing some strange behavior from brokers after we had to change
> our log retention policy on brokers yesterday. We had a huge spike in
> producer data for a small period which caused brokers to get very close to
> the max disk space. Normally our retention policy is good 6-7 days but
> since our consumers were synced up we changed the retention policy from
> hour based to size based and cut short the size to a safe number (half of
> our max disk space and normal usage is around 30%). After the restart, we
> started seeing multiple producer side failures with FailedSends metrics
> showing almost 10% failures and FailedProduceRequestsPerSec on the broker
> side a non-zero number. The traces from one of the brokers looked like
> this:
>
> [KafkaApi-8] Produce request with correlation id 2050686 from client xxx on
> partition [TOPIC_NAME,18] failed due to Partition [TOPIC_NAME,18] doesn't
> exist on 8 (kafka.server.KafkaApis)
> [KafkaApi-8] Produce request with correlation id 2102325 from client xxx
> on partition [TOPIC_NAME,28] failed due to Partition [TOPIC_NAME,28]
> doesn't exist on 8 (kafka.server.KafkaApis)
>
> We checked and made sure those partitions were present on the broker.
> Any help is appreciated. Also, is there a recommended way to purge log data
> quickly out from the brokers.
>
> Thanks,
> Sadhan

-- Guozhang
Re: Kafka Performance Tuning
Could you double check if the topic LOGFILE04 is already created on the servers? Guozhang On Thu, Apr 24, 2014 at 10:46 AM, Yashika Gupta wrote: > Jun, > > The detailed logs are as follows: > > 24.04.2014 13:37:31812 INFO main kafka.producer.SyncProducer - > Disconnecting from localhost:9092 > 24.04.2014 13:37:38612 WARN main kafka.producer.BrokerPartitionInfo - > Error while fetching metadata [{TopicMetadata for topic LOGFILE04 -> > No partition metadata for topic LOGFILE04 due to > kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class > kafka.common.LeaderNotAvailableException > 24.04.2014 13:37:40712 INFO main kafka.client.ClientUtils$ - Fetching > metadata from broker id:0,host:localhost,port:9092 with correlation id 1 > for 1 topic(s) Set(LOGFILE04) > 24.04.2014 13:37:41212 INFO main kafka.producer.SyncProducer - Connected > to localhost:9092 for producing > 24.04.2014 13:37:48812 INFO main kafka.producer.SyncProducer - > Disconnecting from localhost:9092 > 24.04.2014 13:37:48912 WARN main kafka.producer.BrokerPartitionInfo - > Error while fetching metadata [{TopicMetadata for topic LOGFILE04 -> > No partition metadata for topic LOGFILE04 due to > kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class > kafka.common.LeaderNotAvailableException > 24.04.2014 13:37:49012 ERROR main kafka.producer.async.DefaultEventHandler > - Failed to collate messages by topic, partition due to: Failed to fetch > topic metadata for topic: LOGFILE04 > > > 24.04.2014 13:39:96513 WARN > ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0 > kafka.consumer.ConsumerFetcherThread - > [ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0], > Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 4; ClientId: > produceLogLine2-ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0; > ReplicaId: -1; MaxWait: 6 ms; MinBytes: 1 bytes; RequestInfo: > [LOGFILE04,0] -> 
PartitionFetchInfo(2,1048576) > java.net.SocketTimeoutException > at > sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229) > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) > at > java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) > at kafka.utils.Utils$.read(Unknown Source) > at kafka.network.BoundedByteBufferReceive.readFrom(Unknown Source) > at kafka.network.Receive$class.readCompletely(Unknown Source) > at kafka.network.BoundedByteBufferReceive.readCompletely(Unknown > Source) > at kafka.network.BlockingChannel.receive(Unknown Source) > at kafka.consumer.SimpleConsumer.liftedTree1$1(Unknown Source) > at > kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(Unknown > Source) > at > kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Unknown > Source) > at > kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown > Source) > at > kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown > Source) > at kafka.metrics.KafkaTimer.time(Unknown Source) > at > kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(Unknown Source) > at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown > Source) > at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown > Source) > at kafka.metrics.KafkaTimer.time(Unknown Source) > at kafka.consumer.SimpleConsumer.fetch(Unknown Source) > at kafka.server.AbstractFetcherThread.processFetchRequest(Unknown > Source) > at kafka.server.AbstractFetcherThread.doWork(Unknown Source) > at kafka.utils.ShutdownableThread.run(Unknown Source) > > > Regards, > Yashika > > From: Jun Rao > Sent: Thursday, April 24, 2014 10:49 PM > To: users@kafka.apache.org > Subject: Re: Kafka Performance Tuning > > Before that error messge, the log should tell you the cause of the error. > Could you dig that out? 
> > Thanks, > > Jun > > > On Thu, Apr 24, 2014 at 10:12 AM, Yashika Gupta < > yashika.gu...@impetus.co.in > > wrote: > > > Hi, > > > > I am working on a POC where I have 1 Zookeeper and 2 Kafka Brokers on my > > local machine. I am running 8 sets of Kafka consumers and producers > running > > in parallel. > > > > Below are my configurations: > > Consumer Configs: > > zookeeper.session.timeout.ms=12 > > zookeeper.sync.time.ms=2000 > > zookeeper.connection.timeout.ms=12 > > auto.commit.interval.ms=6 > > rebalance.backoff.ms=2000 > > fetch.wait.max.ms=6 > > auto.offset.reset=smallest > > > > Producer configs: > > key.serializer.class=kafka.serializer.StringEncoder > > request.required.acks=-1 > > message.send.max.retries=3 > > request.timeout.ms=6 > > > > I have tried various more configuration changes but I am running
Re: Kafka Performance Tuning
I had this error before and corrected it by increasing the nofile limit. Add an entry for the user running the broker to /etc/security/limits.conf:

kafka - nofile 98304

On Thu, Apr 24, 2014 at 1:46 PM, Yashika Gupta wrote:

> Jun,
>
> The detailed logs are as follows:
>
> 24.04.2014 13:37:31812 INFO main kafka.producer.SyncProducer -
> Disconnecting from localhost:9092
> 24.04.2014 13:37:38612 WARN main kafka.producer.BrokerPartitionInfo -
> Error while fetching metadata [{TopicMetadata for topic LOGFILE04 ->
> No partition metadata for topic LOGFILE04 due to
> kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class
> kafka.common.LeaderNotAvailableException
> 24.04.2014 13:37:40712 INFO main kafka.client.ClientUtils$ - Fetching
> metadata from broker id:0,host:localhost,port:9092 with correlation id 1
> for 1 topic(s) Set(LOGFILE04)
> 24.04.2014 13:37:41212 INFO main kafka.producer.SyncProducer - Connected
> to localhost:9092 for producing
> 24.04.2014 13:37:48812 INFO main kafka.producer.SyncProducer -
> Disconnecting from localhost:9092
> 24.04.2014 13:37:48912 WARN main kafka.producer.BrokerPartitionInfo -
> Error while fetching metadata [{TopicMetadata for topic LOGFILE04 ->
> No partition metadata for topic LOGFILE04 due to
> kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class
> kafka.common.LeaderNotAvailableException
> 24.04.2014 13:37:49012 ERROR main kafka.producer.async.DefaultEventHandler
> - Failed to collate messages by topic, partition due to: Failed to fetch
> topic metadata for topic: LOGFILE04
>
>
> 24.04.2014 13:39:96513 WARN
> ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0
> kafka.consumer.ConsumerFetcherThread -
> [ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0],
> Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 4; ClientId:
> produceLogLine2-ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0;
> ReplicaId:
-1; MaxWait: 6 ms; MinBytes: 1 bytes; RequestInfo: > [LOGFILE04,0] -> PartitionFetchInfo(2,1048576) > java.net.SocketTimeoutException > at > sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229) > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) > at > java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) > at kafka.utils.Utils$.read(Unknown Source) > at kafka.network.BoundedByteBufferReceive.readFrom(Unknown Source) > at kafka.network.Receive$class.readCompletely(Unknown Source) > at kafka.network.BoundedByteBufferReceive.readCompletely(Unknown > Source) > at kafka.network.BlockingChannel.receive(Unknown Source) > at kafka.consumer.SimpleConsumer.liftedTree1$1(Unknown Source) > at > kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(Unknown > Source) > at > kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Unknown > Source) > at > kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown > Source) > at > kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown > Source) > at kafka.metrics.KafkaTimer.time(Unknown Source) > at > kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(Unknown Source) > at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown > Source) > at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown > Source) > at kafka.metrics.KafkaTimer.time(Unknown Source) > at kafka.consumer.SimpleConsumer.fetch(Unknown Source) > at kafka.server.AbstractFetcherThread.processFetchRequest(Unknown > Source) > at kafka.server.AbstractFetcherThread.doWork(Unknown Source) > at kafka.utils.ShutdownableThread.run(Unknown Source) > > > Regards, > Yashika > > From: Jun Rao > Sent: Thursday, April 24, 2014 10:49 PM > To: users@kafka.apache.org > Subject: Re: Kafka Performance Tuning > > Before that error messge, the log should tell you the cause of the error. 
> Could you dig that out? > > Thanks, > > Jun > > > On Thu, Apr 24, 2014 at 10:12 AM, Yashika Gupta < > yashika.gu...@impetus.co.in > > wrote: > > > Hi, > > > > I am working on a POC where I have 1 Zookeeper and 2 Kafka Brokers on my > > local machine. I am running 8 sets of Kafka consumers and producers > running > > in parallel. > > > > Below are my configurations: > > Consumer Configs: > > zookeeper.session.timeout.ms=12 > > zookeeper.sync.time.ms=2000 > > zookeeper.connection.timeout.ms=12 > > auto.commit.interval.ms=6 > > rebalance.backoff.ms=2000 > > fetch.wait.max.ms=6 > > auto.offset.reset=smallest > > > > Producer configs: > > key.serializer.class=kafka.serializer.StringEncoder > > request.required.acks=-1 > > message.send.max.retries=3 > > request.timeout.ms=60
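The nofile suggestion at the top of this reply can be checked from the process side. What follows is only a minimal sketch, assuming a Unix-like host and Python's standard `resource` module. The helper names `nofile_limits` and `nofile_ok` are made up for illustration, and 98304 is simply the value from the limits.conf entry quoted in this reply. Note that it reports the limits of the Python process itself, so it would have to run as the same user (and from the same kind of login session) as the broker to say anything useful about the broker.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def nofile_ok(required, limits=None):
    """True if the soft limit is unlimited or at least `required`.

    `limits` may be passed explicitly as a (soft, hard) tuple; otherwise
    the limits of the current process are used.
    """
    soft, _hard = limits if limits is not None else nofile_limits()
    return soft == resource.RLIM_INFINITY or soft >= required


if __name__ == "__main__":
    soft, hard = nofile_limits()
    # 98304 is the value from the limits.conf entry in this thread.
    print("soft=%s hard=%s ok=%s" % (soft, hard, nofile_ok(98304)))
```

On Linux, the limits of an already running broker can usually be read from /proc/<pid>/limits instead, which avoids guessing which session the broker was started from.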
Brokers throwing warning messages after change in retention policy and multiple produce failures
We are seeing some strange behavior from our brokers after we had to change our log retention policy yesterday. We had a huge spike in producer data for a short period, which brought the brokers very close to their maximum disk space. Normally our retention policy is a good 6-7 days, but since our consumers were synced up we changed the retention policy from hour-based to size-based and cut the size down to a safe number (half of our max disk space; normal usage is around 30%).

After the restart, we started seeing multiple producer-side failures, with the FailedSends metric showing almost 10% failures and FailedProduceRequestsPerSec on the broker side at a non-zero number. The traces from one of the brokers looked like this:

[KafkaApi-8] Produce request with correlation id 2050686 from client xxx on partition [TOPIC_NAME,18] failed due to Partition [TOPIC_NAME,18] doesn't exist on 8 (kafka.server.KafkaApis)
[KafkaApi-8] Produce request with correlation id 2102325 from client xxx on partition [TOPIC_NAME,28] failed due to Partition [TOPIC_NAME,28] doesn't exist on 8 (kafka.server.KafkaApis)

We checked and made sure those partitions were present on the broker. Any help is appreciated. Also, is there a recommended way to quickly purge log data from the brokers?

Thanks,
Sadhan
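For readers trying the same switch from hour-based to size-based retention, the arithmetic can be sketched roughly as follows. This assumes that `log.retention.bytes` is applied per partition rather than per broker (which is how the broker configuration documentation describes it), so a disk budget has to be divided by the number of partition replicas a broker hosts. The disk size, fraction, partition count, and the helper name `per_partition_retention_bytes` are all invented for illustration and are not figures from this thread.

```python
def per_partition_retention_bytes(disk_bytes, target_fraction, partitions_on_broker):
    """Split a per-broker disk budget evenly across the partitions it hosts.

    disk_bytes           -- total capacity of the log disk(s) on one broker
    target_fraction      -- share of the disk Kafka logs may use, in (0, 1]
    partitions_on_broker -- partition replicas stored on this broker
    """
    if partitions_on_broker <= 0:
        raise ValueError("partitions_on_broker must be positive")
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    budget = int(disk_bytes * target_fraction)
    return budget // partitions_on_broker


if __name__ == "__main__":
    TIB = 1024 ** 4
    # Hypothetical broker: 2 TiB of log disk, half of it for Kafka, 64 partition replicas.
    value = per_partition_retention_bytes(2 * TIB, 0.5, 64)
    print("log.retention.bytes=%d" % value)
```

Because retention works on whole log segments, actual disk usage can sit somewhat above the computed figure, so it is probably wise to leave headroom rather than budget the disk exactly.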
RE: Kafka Performance Tuning
Jun, The detailed logs are as follows: 24.04.2014 13:37:31812 INFO main kafka.producer.SyncProducer - Disconnecting from localhost:9092 24.04.2014 13:37:38612 WARN main kafka.producer.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for topic LOGFILE04 -> No partition metadata for topic LOGFILE04 due to kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class kafka.common.LeaderNotAvailableException 24.04.2014 13:37:40712 INFO main kafka.client.ClientUtils$ - Fetching metadata from broker id:0,host:localhost,port:9092 with correlation id 1 for 1 topic(s) Set(LOGFILE04) 24.04.2014 13:37:41212 INFO main kafka.producer.SyncProducer - Connected to localhost:9092 for producing 24.04.2014 13:37:48812 INFO main kafka.producer.SyncProducer - Disconnecting from localhost:9092 24.04.2014 13:37:48912 WARN main kafka.producer.BrokerPartitionInfo - Error while fetching metadata [{TopicMetadata for topic LOGFILE04 -> No partition metadata for topic LOGFILE04 due to kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class kafka.common.LeaderNotAvailableException 24.04.2014 13:37:49012 ERROR main kafka.producer.async.DefaultEventHandler - Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: LOGFILE04 24.04.2014 13:39:96513 WARN ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0 kafka.consumer.ConsumerFetcherThread - [ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 4; ClientId: produceLogLine2-ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0; ReplicaId: -1; MaxWait: 6 ms; MinBytes: 1 bytes; RequestInfo: [LOGFILE04,0] -> PartitionFetchInfo(2,1048576) java.net.SocketTimeoutException at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229) at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) at 
java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) at kafka.utils.Utils$.read(Unknown Source) at kafka.network.BoundedByteBufferReceive.readFrom(Unknown Source) at kafka.network.Receive$class.readCompletely(Unknown Source) at kafka.network.BoundedByteBufferReceive.readCompletely(Unknown Source) at kafka.network.BlockingChannel.receive(Unknown Source) at kafka.consumer.SimpleConsumer.liftedTree1$1(Unknown Source) at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(Unknown Source) at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Unknown Source) at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown Source) at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown Source) at kafka.metrics.KafkaTimer.time(Unknown Source) at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(Unknown Source) at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown Source) at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown Source) at kafka.metrics.KafkaTimer.time(Unknown Source) at kafka.consumer.SimpleConsumer.fetch(Unknown Source) at kafka.server.AbstractFetcherThread.processFetchRequest(Unknown Source) at kafka.server.AbstractFetcherThread.doWork(Unknown Source) at kafka.utils.ShutdownableThread.run(Unknown Source) Regards, Yashika From: Jun Rao Sent: Thursday, April 24, 2014 10:49 PM To: users@kafka.apache.org Subject: Re: Kafka Performance Tuning Before that error messge, the log should tell you the cause of the error. Could you dig that out? Thanks, Jun On Thu, Apr 24, 2014 at 10:12 AM, Yashika Gupta wrote: > Hi, > > I am working on a POC where I have 1 Zookeeper and 2 Kafka Brokers on my > local machine. I am running 8 sets of Kafka consumers and producers running > in parallel. 
> > Below are my configurations: > Consumer Configs: > zookeeper.session.timeout.ms=12 > zookeeper.sync.time.ms=2000 > zookeeper.connection.timeout.ms=12 > auto.commit.interval.ms=6 > rebalance.backoff.ms=2000 > fetch.wait.max.ms=6 > auto.offset.reset=smallest > > Producer configs: > key.serializer.class=kafka.serializer.StringEncoder > request.required.acks=-1 > message.send.max.retries=3 > request.timeout.ms=6 > > I have tried various more configuration changes but I am running into the > same exception again and again. > > 17.04.2014 06:31:83216 ERROR pool-5-thread-1 > kafka.producer.async.DefaultEventHandler - Failed to collate messages by > topic, partition due to: Failed to fetch topic metadata for topic: TOPIC99 > 17.04.2014 06:31:85616 ERROR pool-4-thread-1 > kafka.producer.async.DefaultEventHandler - Failed to collate messages by > topic, p
Re: Kafka Performance Tuning
Before that error message, the log should tell you the cause of the error. Could you dig that out?

Thanks,

Jun

On Thu, Apr 24, 2014 at 10:12 AM, Yashika Gupta wrote:

> Hi,
>
> I am working on a POC where I have 1 Zookeeper and 2 Kafka Brokers on my
> local machine. I am running 8 sets of Kafka consumers and producers running
> in parallel.
>
> Below are my configurations:
> Consumer Configs:
> zookeeper.session.timeout.ms=12
> zookeeper.sync.time.ms=2000
> zookeeper.connection.timeout.ms=12
> auto.commit.interval.ms=6
> rebalance.backoff.ms=2000
> fetch.wait.max.ms=6
> auto.offset.reset=smallest
>
> Producer configs:
> key.serializer.class=kafka.serializer.StringEncoder
> request.required.acks=-1
> message.send.max.retries=3
> request.timeout.ms=6
>
> I have tried various more configuration changes but I am running into the
> same exception again and again.
>
> 17.04.2014 06:31:83216 ERROR pool-5-thread-1
> kafka.producer.async.DefaultEventHandler - Failed to collate messages by
> topic, partition due to: Failed to fetch topic metadata for topic: TOPIC99
> 17.04.2014 06:31:85616 ERROR pool-4-thread-1
> kafka.producer.async.DefaultEventHandler - Failed to collate messages by
> topic, partition due to: Failed to fetch topic metadata for topic: TOPIC99
>
> I am not able to get to the root cause of the issue.
> Appreciate your help.
>
> Regards,
> Yashika
>
> NOTE: This message may contain information that is confidential,
> proprietary, privileged or otherwise protected by law. The message is
> intended solely for the named addressee. If received in error, please
> destroy and notify the sender. Any use of this email is prohibited when
> received in error. Impetus does not represent, warrant and/or guarantee,
> that the integrity of this communication has been maintained nor that the
> communication is free of errors, virus, interception or interference.
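Jun's suggestion to look at what the log says just before the error can be done by hand, but a small script may help when the log is long. The sketch below is only an illustration: the function name `context_before_errors`, the default of five context lines, and the `" ERROR "` marker are assumptions chosen to match the log format shown in this thread, not anything provided by Kafka.

```python
from collections import deque


def context_before_errors(lines, before=5, marker=" ERROR "):
    """Yield (error_line, preceding_lines) for every line containing `marker`.

    `preceding_lines` holds up to `before` lines that came immediately
    before the error line, oldest first.
    """
    recent = deque(maxlen=before)
    for line in lines:
        line = line.rstrip("\n")
        if marker in line:
            yield line, list(recent)
        recent.append(line)


if __name__ == "__main__":
    # Shortened lines in the style of the producer log from this thread.
    sample = [
        "24.04.2014 13:37:48812 INFO main kafka.producer.SyncProducer - Disconnecting from localhost:9092",
        "24.04.2014 13:37:48912 WARN main kafka.producer.BrokerPartitionInfo - Error while fetching metadata",
        "24.04.2014 13:37:49012 ERROR main kafka.producer.async.DefaultEventHandler - Failed to collate messages",
    ]
    for error, context in context_before_errors(sample, before=2):
        for ctx_line in context:
            print("  | " + ctx_line)
        print("ERR " + error)
```

To run it against a real file, pass an open file object as `lines`; the path to the producer log depends on the log4j setup and is not known from this thread.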
Re: Failing broker with errors for Conditional update
0.8.1.1 is being voted now. Thanks, Jun On Thu, Apr 24, 2014 at 10:07 AM, Drew Goya wrote: > This just hit me this morning as well, any news on 0.8.1.1? My ops guy is > going to kill me, we just rolled off my older build of 0.8.1 to the > official release. > > > On Thu, Apr 3, 2014 at 11:55 PM, Krzysztof Ociepa < > ociepa.krzysz...@gmail.com> wrote: > > > Hi Guozhang, > > Hi Neha, > > > > Thanks a lot for your answers. I will try new version 0.8.1.1 and let > > you know how it works for me. > > > > Best, > > Chris > > >
Re: Cluster expansion and upgrade
Partition reassignment wasn't fully working in 0.8-beta. So you probably will have to upgrade existing brokers to 0.8.1 before running partition reassignment. Also, 0.8.1.1 will be out soon. Thanks, Jun On Thu, Apr 24, 2014 at 9:49 AM, vimpy batra wrote: > Hello, > > We are currently running a kafka 0.8-beta cluster. We are planning to > expand the existing cluster and use 0.8.1 version on the new nodes. Before > upgrading the older ones we want the new ones to participate in the > cluster. We plan to use "reassign-partitions" tool in 0.8.1 to reassign > partitions to the newly added brokers. We will also add additional > partitions to the existing topics. Is there a best practice to go about > expanding clusters? Is this the recommended way to go or should we upgrade > our existing cluster to 0.8.1 first? > I also noticed that there are some fixes on top of 0.8.1. Are they > available to use? > > Thanks,
Re: Delete Topic - BadVersionException
Delete topic doesn't quite work yet and we will try to fix it in the next release. https://issues.apache.org/jira/browse/KAFKA-1397 Thanks, Jun On Thu, Apr 24, 2014 at 9:49 AM, Drew Goya wrote: > Just tried my first topic delete today and it looks like something went > wrong on the controller. I issued the command on a test topic and shortly > after that a describe looked like: > > Topic:TimeoutQueueTest PartitionCount:256 ReplicationFactor:3 Configs: > Topic: TimeoutQueueTest Partition: 0 Leader: -1 Replicas: 9,14,15 Isr: > Topic: TimeoutQueueTest Partition: 1 Leader: -1 Replicas: 10,15,1 Isr: > Topic: TimeoutQueueTest Partition: 2 Leader: -1 Replicas: 11,1,2 Isr: > Topic: TimeoutQueueTest Partition: 3 Leader: -1 Replicas: 12,2,3 Isr: > Topic: TimeoutQueueTest Partition: 4 Leader: -1 Replicas: 13,3,4 Isr: > Topic: TimeoutQueueTest Partition: 5 Leader: -1 Replicas: 14,4,5 Isr: > > It stayed that way for quite a while so I hit zookeeper and went looking > for who was the controller, I found these in that brokers logs: > > [2014-04-24 16:27:42,498] ERROR Conditional update of path > /brokers/topics/TimeoutQueueTest/partitions/170/state with data > > {"controller_epoch":18,"leader":2,"version":1,"leader_epoch":14,"isr":[2,14]} > and expected version 30 failed due to > org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = > BadVersion for /brokers/topics/TimeoutQueueTest/partitions/170/state > (kafka.utils.ZkUtils$) > [2014-04-24 16:27:42,504] ERROR Conditional update of path > /brokers/topics/TimeoutQueueTest/partitions/113/state with data > > {"controller_epoch":18,"leader":2,"version":1,"leader_epoch":4,"isr":[2,15,14]} > and expected version 17 failed due to > org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = > BadVersion for /brokers/topics/TimeoutQueueTest/partitions/113/state > (kafka.utils.ZkUtils$) > > Any ideas? >
Kafka Performance Tuning
Hi,

I am working on a POC where I have 1 Zookeeper and 2 Kafka Brokers on my local machine. I am running 8 sets of Kafka consumers and producers running in parallel.

Below are my configurations:

Consumer Configs:
zookeeper.session.timeout.ms=12
zookeeper.sync.time.ms=2000
zookeeper.connection.timeout.ms=12
auto.commit.interval.ms=6
rebalance.backoff.ms=2000
fetch.wait.max.ms=6
auto.offset.reset=smallest

Producer configs:
key.serializer.class=kafka.serializer.StringEncoder
request.required.acks=-1
message.send.max.retries=3
request.timeout.ms=6

I have tried various more configuration changes but I am running into the same exception again and again.

17.04.2014 06:31:83216 ERROR pool-5-thread-1 kafka.producer.async.DefaultEventHandler - Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: TOPIC99
17.04.2014 06:31:85616 ERROR pool-4-thread-1 kafka.producer.async.DefaultEventHandler - Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: TOPIC99

I am not able to get to the root cause of the issue.
Appreciate your help.

Regards,
Yashika

NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
Re: question about isr
Interesting. Which version of Kafka are you using? Were you doing some partition reassignment? Thanks, Jun On Wed, Apr 23, 2014 at 11:14 PM, 陈小军 wrote: > Hi Team, >I found a strange phenomenon of isr list in my kafka cluster > >When I use the tool that kafka provide to get the topic information, > and it show isr list as following, seem it is ok > > [irt...@xseed171.kdev bin]$ ./kafka-topics.sh --describe --zookeeper > 10.96.250.215:10013,10.96.250.216:10013,10.96.250.217:10013/nelo2-kafka > > Topic:nelo2-normal-logs PartitionCount:3ReplicationFactor:2 > Configs: > Topic: nelo2-normal-logsPartition: 0Leader: 3 > Replicas: 3,0 Isr: 0,3 > Topic: nelo2-normal-logsPartition: 1Leader: 0 > Replicas: 0,1 Isr: 0,1 > Topic: nelo2-normal-logsPartition: 2Leader: 1 > Replicas: 1,3 Isr: 1,3 > > but when I use some sdk to get the meta info from broker, the isr is > different. > metadata: { size: 246, > correlationId: 0, > brokerNum: -1, > nodeId: 1, > host: 'xseed171.kdev.nhnsystem.com', > port: 9093, > topicNum: 0, > topicError: 0, > topic: 'nelo2-normal-logs', > partitionNum: 2, > errorCode: 0, > partition: 0, > leader: 3, > replicasNum: 2, > replicas: [ 3, 0 ], > isrNum: 2, > isr: [ 0, 3 ] } > metadata: { size: 246, > correlationId: 0, > brokerNum: -1, > nodeId: 1, > host: 'xseed171.kdev.nhnsystem.com', > port: 9093, > topicNum: 0, > topicError: 0, > topic: 'nelo2-normal-logs', > partitionNum: 1, > errorCode: 0, > partition: 1, > leader: 0, > replicasNum: 2, > replicas: [ 0, 1 ], > isrNum: 2, > isr: [ 0, 1 ] } > metadata: { size: 246, > correlationId: 0, > brokerNum: -1, > nodeId: 1, > host: 'xseed171.kdev.nhnsystem.com', > port: 9093, > topicNum: 0, > topicError: 0, > topic: 'nelo2-normal-logs', > partitionNum: 0, > errorCode: 0, > partition: 2, > leader: 1, > replicasNum: 2, > replicas: [ 1, 3 ], > isrNum: 1, > isr: [ 1 ] } > > I also use other sdk, get the same result. I check the logs from kafka, > it seems the sdk result is right. the tool get the wrong result. 
why is it > happend? > > [2014-04-24 14:53:57,705] TRACE Broker 3 cached leader info > (LeaderAndIsrInfo:(Leader:0,ISR:0,1,LeaderEpoch:7,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:0,1) > for partition [nelo2-normal-logs,1] in response to UpdateMetadata request > sent by controller 0 epoch 10 with correlation id 13 (state.change.logger) > [2014-04-24 14:53:57,705] TRACE Broker 3 cached leader info > (LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:9,ControllerEpoch:10),ReplicationFactor:2),AllReplicas:1,3) > for partition [nelo2-normal-logs,2] in response to UpdateMetadata request > sent by controller 0 epoch 10 with correlation id 13 (state.change.logger) > [2014-04-24 14:53:57,705] TRACE Broker 3 cached leader info > (LeaderAndIsrInfo:(Leader:3,ISR:0,3,LeaderEpoch:10,ControllerEpoch:10),ReplicationFactor:2),AllReplicas:3,0) > for partition [nelo2-normal-logs,0] in response to UpdateMetadata request > sent by controller 0 epoch 10 with correlation id 13 (state.change.logger) > > Thanks~! > > Best Regards > Jerry >
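The comparison described in the quoted message (ISR as printed by `kafka-topics.sh --describe` versus ISR as returned in a metadata response) can be automated with a small script. This is a sketch only: the regular expression is written against the `--describe` layout shown in this thread, the function names are hypothetical, and the metadata side is assumed to have already been reduced to a plain dict of partition id to ISR list by whatever SDK is in use.

```python
import re

# Matches one partition row of --describe output; tolerant of missing whitespace.
PARTITION_RE = re.compile(
    r"Partition:\s*(\d+)\s*Leader:\s*(-?\d+)\s*Replicas:\s*([\d,]*)\s*Isr:\s*([\d,]*)"
)


def parse_describe(text):
    """Map partition id -> sorted ISR list from --describe style output."""
    isr_by_partition = {}
    for match in PARTITION_RE.finditer(text):
        partition = int(match.group(1))
        isr = sorted(int(b) for b in match.group(4).split(",") if b)
        isr_by_partition[partition] = isr
    return isr_by_partition


def isr_mismatches(from_tool, from_metadata):
    """Return {partition: (tool_isr, metadata_isr)} where the two views differ."""
    return {
        p: (from_tool.get(p), from_metadata.get(p))
        for p in sorted(set(from_tool) | set(from_metadata))
        if from_tool.get(p) != from_metadata.get(p)
    }


if __name__ == "__main__":
    # Retyped from the thread with whitespace restored.
    describe_output = (
        "Topic:nelo2-normal-logs\tPartitionCount:3\tReplicationFactor:2\tConfigs:\n"
        "\tTopic: nelo2-normal-logs\tPartition: 0\tLeader: 3\tReplicas: 3,0\tIsr: 0,3\n"
        "\tTopic: nelo2-normal-logs\tPartition: 1\tLeader: 0\tReplicas: 0,1\tIsr: 0,1\n"
        "\tTopic: nelo2-normal-logs\tPartition: 2\tLeader: 1\tReplicas: 1,3\tIsr: 1,3\n"
    )
    # ISR per partition as reported by the SDK metadata responses in the thread.
    metadata_isr = {0: [0, 3], 1: [0, 1], 2: [1]}
    print(isr_mismatches(parse_describe(describe_output), metadata_isr))
```

With the values from this thread, only partition 2 is reported as differing, which matches what was observed by hand.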
Re: Failing broker with errors for Conditional update
This just hit me this morning as well, any news on 0.8.1.1? My ops guy is going to kill me, we just rolled off my older build of 0.8.1 to the official release. On Thu, Apr 3, 2014 at 11:55 PM, Krzysztof Ociepa < ociepa.krzysz...@gmail.com> wrote: > Hi Guozhang, > Hi Neha, > > Thanks a lot for your answers. I will try new version 0.8.1.1 and let > you know how it works for me. > > Best, > Chris >
Delete Topic - BadVersionException
Just tried my first topic delete today and it looks like something went wrong on the controller. I issued the command on a test topic and shortly after that a describe looked like:

Topic:TimeoutQueueTest PartitionCount:256 ReplicationFactor:3 Configs:
Topic: TimeoutQueueTest Partition: 0 Leader: -1 Replicas: 9,14,15 Isr:
Topic: TimeoutQueueTest Partition: 1 Leader: -1 Replicas: 10,15,1 Isr:
Topic: TimeoutQueueTest Partition: 2 Leader: -1 Replicas: 11,1,2 Isr:
Topic: TimeoutQueueTest Partition: 3 Leader: -1 Replicas: 12,2,3 Isr:
Topic: TimeoutQueueTest Partition: 4 Leader: -1 Replicas: 13,3,4 Isr:
Topic: TimeoutQueueTest Partition: 5 Leader: -1 Replicas: 14,4,5 Isr:

It stayed that way for quite a while, so I hit zookeeper and went looking for who was the controller. I found these in that broker's logs:

[2014-04-24 16:27:42,498] ERROR Conditional update of path /brokers/topics/TimeoutQueueTest/partitions/170/state with data {"controller_epoch":18,"leader":2,"version":1,"leader_epoch":14,"isr":[2,14]} and expected version 30 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/TimeoutQueueTest/partitions/170/state (kafka.utils.ZkUtils$)
[2014-04-24 16:27:42,504] ERROR Conditional update of path /brokers/topics/TimeoutQueueTest/partitions/113/state with data {"controller_epoch":18,"leader":2,"version":1,"leader_epoch":4,"isr":[2,15,14]} and expected version 17 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/TimeoutQueueTest/partitions/113/state (kafka.utils.ZkUtils$)

Any ideas?
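For anyone puzzled by the "Conditional update ... expected version N failed ... BadVersion" wording in these logs: a ZooKeeper write can carry the version the writer last saw, and it is rejected if the node has been changed by someone else in the meantime. The toy model below illustrates only that compare-and-set idea. It is not ZooKeeper client code and not Kafka's implementation; the class and method names are invented for this sketch.

```python
class BadVersion(Exception):
    """Raised when the caller's expected version is stale."""


class VersionedNode:
    """Toy stand-in for a znode: data plus a version bumped on every write."""

    def __init__(self, data):
        self.data = data
        self.version = 0

    def conditional_update(self, new_data, expected_version):
        """Write `new_data` only if nobody has written since `expected_version`."""
        if expected_version != self.version:
            raise BadVersion(
                "expected %d, actual %d" % (expected_version, self.version)
            )
        self.data = new_data
        self.version += 1
        return self.version


if __name__ == "__main__":
    state = VersionedNode({"leader": 2, "isr": [2, 14, 15]})
    cached = state.version  # both writers read the node at version 0

    # First writer (say, the controller) updates the state successfully.
    state.conditional_update({"leader": 2, "isr": [2, 14]}, cached)

    # Second writer still holds the old cached version and is rejected.
    try:
        state.conditional_update({"leader": 2, "isr": [2]}, cached)
    except BadVersion as err:
        print("update skipped:", err)
```

In this model the rejected writer would need to re-read the node to pick up the new version before it can write again.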
Cluster expansion and upgrade
Hello,

We are currently running a Kafka 0.8-beta cluster. We are planning to expand the existing cluster and use version 0.8.1 on the new nodes. Before upgrading the older nodes, we want the new ones to participate in the cluster. We plan to use the "reassign-partitions" tool in 0.8.1 to reassign partitions to the newly added brokers. We will also add additional partitions to the existing topics.

Is there a best practice for expanding clusters? Is this the recommended way to go, or should we upgrade our existing cluster to 0.8.1 first?

I also noticed that there are some fixes on top of 0.8.1. Are they available to use?

Thanks,
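As an illustration of what a reassignment plan for newly added brokers might look like, here is a rough sketch that spreads the partitions of one topic round-robin over a broker list and emits JSON. The `{"version": 1, "partitions": [...]}` layout is, to the best of my knowledge, what the 0.8.1 reassignment tool reads through its `--reassignment-json-file` option, but that should be checked against the documentation for the version in use. The topic name, broker ids, and function name are hypothetical.

```python
import json


def round_robin_reassignment(topic, num_partitions, brokers, replication_factor):
    """Build a reassignment plan placing replicas round-robin over `brokers`."""
    if replication_factor > len(brokers):
        raise ValueError("replication_factor exceeds the number of brokers")
    partitions = []
    for p in range(num_partitions):
        # Consecutive brokers (wrapping around) hold the replicas of partition p.
        replicas = [
            brokers[(p + r) % len(brokers)] for r in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    # Hypothetical: 4 partitions, old brokers 1-2 plus new brokers 3-4, 2 replicas each.
    plan = round_robin_reassignment("my-topic", 4, [1, 2, 3, 4], 2)
    print(json.dumps(plan, indent=2))
```

A real plan would probably also want to minimize data movement by keeping existing replicas where possible, which this simple round-robin does not attempt.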
Re: Kafka/Zookeeper co-location
We typically run all of our Zookeeper instances separate, but we do have one Kafka cluster that is colocated with the Zookeeper nodes. It works just fine, probably in part because Zookeeper handles everything serially. The caveat is that the cluster that we're doing this on is not designed for performance, but rather compactness. And we're getting rid of it because we don't like the setup.

Keep in mind that you're colocating two applications that are sensitive to disk I/O. This is going to affect performance of both systems and you really need to decide if that's what you want. If you have enough spindles to keep everything separate (Kafka data storage and Zookeeper transaction logs on separate spindles from everything else) you might be OK.

-Todd

On 4/24/14, 8:53 AM, "Andrew Otto" wrote:

>Oo, I'm curious about this as well! Wikimedia is considering doing this
>if/when we install brokers in our web caching data centers.
>
>On Apr 24, 2014, at 11:49 AM, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
>wrote:
>
>> Are there any thoughts on running Zookeeper on the same physical nodes
>> that run the Kafka broker? So the loss of a node affects quorum and
>> possibly requires electing new leaders at both the ZK and the broker
>> level. Are there race conditions or other failure cases that could come
>> about from either a co-located (or, an independent) setup? Thanks.
>>
>> -sudarshan
Re: Kafka/Zookeeper co-location
Oo, I’m curious about this as well! Wikimedia is considering doing this if/when we install brokers in our web caching data centers. On Apr 24, 2014, at 11:49 AM, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) wrote: > Are there any thoughts on running Zookeeper on the same physical nodes that > run the Kafka broker? So the loss of a node affects quorum and possibly > requires electing new leaders at both the ZK and the broker level. Are there > race conditions or other failure cases that could come about from either a > co-located (or, an independent) setup? Thanks. > > -sudarshan > > ---
Kafka/Zookeeper co-location
Are there any thoughts on running Zookeeper on the same physical nodes that run the Kafka broker? So the loss of a node affects quorum and possibly requires electing new leaders at both the ZK and the broker level. Are there race conditions or other failure cases that could come about from either a co-located (or, an independent) setup? Thanks. -sudarshan ---
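On the quorum part of the question, the arithmetic is simple enough to sketch. ZooKeeper needs a strict majority of the ensemble to be up, so the number of node losses it can survive follows directly from the ensemble size. The function names below are invented for this illustration, and the code says nothing about how Kafka leader election behaves when a co-located node is lost.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be lost while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(
            "ensemble=%d quorum=%d tolerated_failures=%d"
            % (n, quorum_size(n), tolerated_failures(n))
        )
```

One consequence is that an even-sized ensemble tolerates no more failures than the next smaller odd size, which is worth keeping in mind when deciding how many broker hosts would also carry a ZooKeeper instance.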