Re: Kafka Performance Tuning

2014-04-24 Thread Timothy Chen
Hi Yashika,

Having no entries in the broker logs is not normal. Can you verify whether
logging is turned off in your log4j properties file?

If it is, please enable it, try again, and see what shows up in the logs.

Tim
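
One way to run the check suggested above is to scan the log4j properties file for loggers whose level is OFF. This is a minimal Python sketch, assuming log4j 1.x properties syntax; the function name find_disabled_loggers and the sample file contents are made up for illustration and are not taken from the poster's setup.

```python
def find_disabled_loggers(properties_text):
    """Return the log4j logger keys whose level is set to OFF."""
    disabled = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments ('#' or '!' in .properties files).
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key = key.strip()
        # Only logger definitions carry a level: rootLogger and logger.<name>.
        if key != "log4j.rootLogger" and not key.startswith("log4j.logger."):
            continue
        # The level is the first comma-separated token; appender names follow.
        level = value.split(",")[0].strip().upper()
        if level == "OFF":
            disabled.append(key)
    return disabled


# Hypothetical file contents, for illustration only.
sample = """
# example configuration
log4j.rootLogger=OFF, stdout
log4j.logger.kafka.controller=TRACE, controllerAppender
log4j.logger.state.change.logger=off, stateChangeAppender
"""
print(find_disabled_loggers(sample))
```

An empty result would suggest that logging is not switched off at the logger level, in which case the appender file paths would be the next thing to check.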


RE: Kafka Performance Tuning

2014-04-24 Thread Yashika Gupta
Jun,

I am using Kafka 2.8.0-0.8.0 (that is, Kafka 0.8.0 built for Scala 2.8.0).
There have been no entries in the controller and state-change logs for the past month.

However, I can see some GC logs in the kafka-home-dir/logs folder:
zookeeper-gc.log
kafkaServer-gc.log


Yashika

Re: Kafka Performance Tuning

2014-04-24 Thread Jun Rao
Which version of Kafka are you using? Any error in the controller and
state-change log?

Thanks,

Jun



Re: Kafka Performance Tuning

2014-04-24 Thread Yashika Gupta
I am running a single broker and the leader column has 0 as the value.

Re: Kafka Performance Tuning

2014-04-24 Thread Yashika Gupta
I had cleaned up the topics using the following command:

rm -rf /tmp/kafka-logs/*

I then verified with the topic list command before running the script.

Am I missing anything else?

Regards,
Yashika

Guozhang Wang  wrote:


Could you double check if the topic LOGFILE04 is already created on the
servers?

Guozhang


On Thu, Apr 24, 2014 at 10:46 AM, Yashika Gupta  wrote:

> Jun,
>
> The detailed logs are as follows:
>
> 24.04.2014 13:37:31812 INFO main kafka.producer.SyncProducer -
> Disconnecting from localhost:9092
> 24.04.2014 13:37:38612 WARN main kafka.producer.BrokerPartitionInfo -
> Error while fetching metadata [{TopicMetadata for topic LOGFILE04 ->
> No partition metadata for topic LOGFILE04 due to
> kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class
> kafka.common.LeaderNotAvailableException
> 24.04.2014 13:37:40712 INFO main kafka.client.ClientUtils$ - Fetching
> metadata from broker id:0,host:localhost,port:9092 with correlation id 1
> for 1 topic(s) Set(LOGFILE04)
> 24.04.2014 13:37:41212 INFO main kafka.producer.SyncProducer - Connected
> to localhost:9092 for producing
> 24.04.2014 13:37:48812 INFO main kafka.producer.SyncProducer -
> Disconnecting from localhost:9092
> 24.04.2014 13:37:48912 WARN main kafka.producer.BrokerPartitionInfo -
> Error while fetching metadata [{TopicMetadata for topic LOGFILE04 ->
> No partition metadata for topic LOGFILE04 due to
> kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class
> kafka.common.LeaderNotAvailableException
> 24.04.2014 13:37:49012 ERROR main kafka.producer.async.DefaultEventHandler
> - Failed to collate messages by topic, partition due to: Failed to fetch
> topic metadata for topic: LOGFILE04
>
>
> 24.04.2014 13:39:96513 WARN
> ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0
> kafka.consumer.ConsumerFetcherThread -
> [ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0],
> Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 4; ClientId:
> produceLogLine2-ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0;
> ReplicaId: -1; MaxWait: 6 ms; MinBytes: 1 bytes; RequestInfo:
> [LOGFILE04,0] -> PartitionFetchInfo(2,1048576)
> java.net.SocketTimeoutException
>     at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
>     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
>     at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
>     at kafka.utils.Utils$.read(Unknown Source)
>     at kafka.network.BoundedByteBufferReceive.readFrom(Unknown Source)
>     at kafka.network.Receive$class.readCompletely(Unknown Source)
>     at kafka.network.BoundedByteBufferReceive.readCompletely(Unknown Source)
>     at kafka.network.BlockingChannel.receive(Unknown Source)
>     at kafka.consumer.SimpleConsumer.liftedTree1$1(Unknown Source)
>     at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown Source)
>     at kafka.metrics.KafkaTimer.time(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown Source)
>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown Source)
>     at kafka.metrics.KafkaTimer.time(Unknown Source)
>     at kafka.consumer.SimpleConsumer.fetch(Unknown Source)
>     at kafka.server.AbstractFetcherThread.processFetchRequest(Unknown Source)
>     at kafka.server.AbstractFetcherThread.doWork(Unknown Source)
>     at kafka.utils.ShutdownableThread.run(Unknown Source)
>
>
> Regards,
> Yashika
> 
> From: Jun Rao 
> Sent: Thursday, April 24, 2014 10:49 PM
> To: users@kafka.apache.org
> Subject: Re: Kafka Performance Tuning
>
> Before that error message, the log should tell you the cause of the error.
> Could you dig that out?
>
> Thanks,
>
> Jun
>
>
> On Thu, Apr 24, 2014 at 10:12 AM, Yashika Gupta <
> yashika.gu...@impetus.co.in
> > wrote:
>
> > Hi,
> >
> > I am working on a POC where I have 1 ZooKeeper and 2 Kafka brokers on my
> > local machine. I am running 8 sets of Kafka consumers and producers
> > in parallel.
> >
> > Below are my configurations:
> > Consumer Configs:
> > zookeeper.session.timeout.ms=12
> > zookeeper.sync.time.ms=2000
> > zookeeper.connection.timeout.ms=12
> > auto.commit.interval.ms=6
> > rebalance.backoff.ms=2000
> > fetch.wait.max.ms=6
> > auto.offset.reset=smallest
> >
> > Produce

Re: Kafka Performance Tuning

2014-04-24 Thread pushkar priyadarshi
You can use kafka-list-topic.sh to find out whether a leader is available for
a particular topic. A value of -1 in the leader column might indicate trouble.
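
The leader check described above can also be scripted. The following Python sketch flags partitions whose leader is -1. It assumes the "Topic: ... Partition: ... Leader: ..." line layout printed by kafka-topics.sh --describe in the ISR thread further down this archive; the older kafka-list-topic.sh may format its output differently, so the pattern may need adjusting. The function name and the sample text are hypothetical.

```python
import re

# One partition line: "Topic: <name>  Partition: <n>  Leader: <id> ..."
# Case-insensitive, because the exact capitalisation is an assumption.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)"
    r"\s+Leader:\s*(?P<leader>-?\d+)",
    re.IGNORECASE,
)


def partitions_without_leader(tool_output):
    """Return (topic, partition) pairs whose leader column is -1."""
    missing = []
    for line in tool_output.splitlines():
        match = PARTITION_LINE.search(line)
        if match and int(match.group("leader")) == -1:
            missing.append((match.group("topic"), int(match.group("partition"))))
    return missing


# Made-up output for illustration; not taken from the thread.
sample_output = (
    "Topic: LOGFILE04\tPartition: 0\tLeader: -1\tReplicas: 0\tIsr: \n"
    "Topic: LOGFILE04\tPartition: 1\tLeader: 0\tReplicas: 0\tIsr: 0\n"
)
print(partitions_without_leader(sample_output))
```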



Re: question about isr

2014-04-24 Thread 陈小军
I did not do any partition reassignment.
 
This phenomenon happens when the broker hits the following errors.
 
[hadoop@nelo76 libs]$ [2014-03-14 12:11:44,310] INFO Partition 
[nelo2-normal-logs,0] on broker 0: Shrinking ISR for partition 
[nelo2-normal-logs,0] from 0,1 to 0 (kafka.cluster.Partition)
[2014-03-14 12:11:44,313] ERROR Conditional update of path 
/brokers/topics/nelo2-normal-logs/partitions/0/state with data 
{"controller_epoch":4,"leader":0,"version":1,"leader_epoch":5,"isr":[0]} and 
expected version 7 failed due to 
org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = 
BadVersion for /brokers/topics/nelo2-normal-logs/partitions/0/state 
(kafka.utils.ZkUtils$)
[2014-03-14 12:11:44,313] INFO Partition [nelo2-normal-logs,0] on broker 0: 
Cached zkVersion [7] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
[2014-03-14 12:11:44,313] INFO Partition [nelo2-symbolicated-logs,1] on broker 
0: Shrinking ISR for partition [nelo2-symbolicated-logs,1] from 0,2 to 0 
(kafka.cluster.Partition)
[2014-03-14 12:11:44,315] ERROR Conditional update of path 
/brokers/topics/nelo2-symbolicated-logs/partitions/1/state with data 
{"controller_epoch":4,"leader":0,"version":1,"leader_epoch":6,"isr":[0]} and 
expected version 8 failed due to 
org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = 
BadVersion for /brokers/topics/nelo2-symbolicated-logs/partitions/1/state 
(kafka.utils.ZkUtils$)
[2014-03-14 12:11:44,315] INFO Partition [nelo2-symbolicated-logs,1] on broker 
0: Cached zkVersion [8] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
[2014-03-14 12:11:44,316] INFO Partition [nelo2-crash-logs,1] on broker 0: 
Shrinking ISR for partition [nelo2-crash-logs,1] from 0,1 to 0 
(kafka.cluster.Partition)
[2014-03-14 12:11:44,318] ERROR Conditional update of path 
/brokers/topics/nelo2-crash-logs/partitions/1/state with data 
{"controller_epoch":4,"leader":0,"version":1,"leader_epoch":5,"isr":[0]} and 
expected version 7 failed due to 
org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = 
BadVersion for /brokers/topics/nelo2-crash-logs/partitions/1/state 
(kafka.utils.ZkUtils$)
[2014-03-14 12:11:44,318] INFO Partition [nelo2-crash-logs,1] on broker 0: 
Cached zkVersion [7] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partit
Best Regards
Jerry
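
The "Conditional update ... expected version 7 failed" and "Cached zkVersion [7] not equal to that in zookeeper, skip updating ISR" lines above describe a versioned compare-and-set. The Python sketch below models that pattern in memory only. It does not use the ZooKeeper client API, the class VersionedStore is hypothetical, and the idea that another writer (for example the controller) updated the node first is an assumption made for illustration.

```python
class VersionedStore:
    """Tiny in-memory stand-in for a single znode with a data version."""

    def __init__(self, data):
        self.data = data
        self.version = 0

    def conditional_update(self, new_data, expected_version):
        """Write only if expected_version matches; return True on success."""
        if expected_version != self.version:
            # Analogous to the BadVersion error in the log above.
            return False
        self.data = new_data
        self.version += 1
        return True


state = VersionedStore({"leader": 0, "isr": [0, 1]})
cached_version = state.version  # the broker remembers version 0

# Assumption: some other writer updates the node first, bumping the version.
state.conditional_update({"leader": 0, "isr": [0, 1]}, expected_version=0)

# The broker now tries to shrink the ISR with its stale cached version.
updated = state.conditional_update(
    {"leader": 0, "isr": [0]}, expected_version=cached_version
)
print(updated)
print(state.data["isr"], state.version)
```

In this model the stale writer keeps failing until it refreshes its cached version, which matches the repeated "skip updating ISR" messages in the log.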
-----Original Message-----
From: "Jun Rao" 
To: "users@kafka.apache.org"; 
"陈小军"; 
Cc: 
Sent: 2014-04-25 (Friday) 02:12:02
Subject: Re: question about isr
 
Interesting. Which version of Kafka are you using? Were you doing some 
partition reassignment? Thanks, Jun


On Wed, Apr 23, 2014 at 11:14 PM, 陈小军  wrote:

Hi Team,

   I found a strange phenomenon with the ISR list in my Kafka cluster.

   When I use the tool that Kafka provides to get the topic information, it
shows the ISR list as follows, which seems OK:



 [irt...@xseed171.kdev bin]$ ./kafka-topics.sh --describe --zookeeper 
10.96.250.215:10013,10.96.250.216:10013,10.96.250.217:10013/nelo2-kafka




Topic:nelo2-normal-logs  PartitionCount:3  ReplicationFactor:2  Configs:
    Topic: nelo2-normal-logs  Partition: 0  Leader: 3  Replicas: 3,0  Isr: 0,3
    Topic: nelo2-normal-logs  Partition: 1  Leader: 0  Replicas: 0,1  Isr: 0,1
    Topic: nelo2-normal-logs  Partition: 2  Leader: 1  Replicas: 1,3  Isr: 1,3



   But when I use an SDK to fetch the metadata from a broker, the ISR is
different:

metadata:  { size: 246,
  correlationId: 0,
  brokerNum: -1,
  nodeId: 1,
  host: 'xseed171.kdev.nhnsystem.com',
  port: 9093,
  topicNum: 0,
  topicError: 0,
  topic: 'nelo2-normal-logs',
  partitionNum: 2,
  errorCode: 0,
  partition: 0,
  leader: 3,
  replicasNum: 2,
  replicas: [ 3, 0 ],
  isrNum: 2,
  isr: [ 0, 3 ] }
metadata:  { size: 246,
  correlationId: 0,
  brokerNum: -1,
  nodeId: 1,
  host: 'xseed171.kdev.nhnsystem.com',
  port: 9093,
  topicNum: 0,
  topicError: 0,
  topic: 'nelo2-normal-logs',
  partitionNum: 1,
  errorCode: 0,
  partition: 1,
  leader: 0,
  replicasNum: 2,
  replicas: [ 0, 1 ],
  isrNum: 2,
  isr: [ 0, 1 ] }
metadata:  { size: 246,
  correlationId: 0,
  brokerNum: -1,
  nodeId: 1,
  host: 'xseed171.kdev.nhnsystem.com',
  port: 9093,
  topicNum: 0,
  topicError: 0,
  topic: 'nelo2-normal-logs',
  partitionNum: 0,
  errorCode: 0,
  partition: 2,
  leader: 1,
  replicasNum: 2,
  replicas: [ 1, 3 ],
  isrNum: 1,
  isr: [ 1 ] }



   I also tried another SDK and got the same result. I checked the Kafka logs,
and it seems the SDK result is right and the tool's result is wrong. Why does
this happen?



[2014-04-24 14:53:57,705] TRACE Broker 3 cached leader info 
(LeaderAndIsrInfo:(Leader:0,ISR:0,1,LeaderEpoch:7,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:0,1)
 for partition [nel

Re: Brokers throwing warning messages after change in retention policy and multiple produce failures

2014-04-24 Thread Guozhang Wang
Hi Sadhan,

Do you see any errors on the server logs?

Guozhang


On Thu, Apr 24, 2014 at 12:57 PM, Sadhan Sood  wrote:

> We are seeing some strange behavior from brokers after we had to change
> our log retention policy on brokers yesterday. We had a huge spike in
> producer data for a small period which caused brokers to get very close to
> the max disk space. Normally our retention policy is good 6-7 days but
> since our consumers were synced up we changed the retention policy from
> hour based to size based and cut short the size to a safe number (half of
> our max disk space and normal usage is around 30%). After the restart, we
> started seeing multiple producer side failures with FailedSends metrics
> showing almost 10% failures and FailedProduceRequestsPerSec on the broker
> side a non-zero number. The traces from one of the brokers looked like
> this:
>
> [KafkaApi-8] Produce request with correlation id 2050686 from client xxx on
> partition [TOPIC_NAME,18] failed due to Partition [TOPIC_NAME,18] doesn't
> exist on 8 (kafka.server.KafkaApis)
>  [KafkaApi-8] Produce request with correlation id 2102325 from client xxx
> on partition [TOPIC_NAME,28] failed due to Partition [TOPIC_NAME,28]
> doesn't exist on 8 (kafka.server.KafkaApis)
>
> We checked and made sure those partitions were present on the broker.
> Any help is appreciated. Also, is there a recommended way to purge log data
> quickly from the brokers?
>
> Thanks,
> Sadhan
>



-- 
-- Guozhang


Re: Kafka Performance Tuning

2014-04-24 Thread Guozhang Wang
Could you double check if the topic LOGFILE04 is already created on the
servers?

Guozhang


On Thu, Apr 24, 2014 at 10:46 AM, Yashika Gupta  wrote:

> Jun,
>
> The detailed logs are as follows:
>
> 24.04.2014 13:37:31812 INFO main kafka.producer.SyncProducer -
> Disconnecting from localhost:9092
> 24.04.2014 13:37:38612 WARN main kafka.producer.BrokerPartitionInfo -
> Error while fetching metadata [{TopicMetadata for topic LOGFILE04 ->
> No partition metadata for topic LOGFILE04 due to
> kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class
> kafka.common.LeaderNotAvailableException
> 24.04.2014 13:37:40712 INFO main kafka.client.ClientUtils$ - Fetching
> metadata from broker id:0,host:localhost,port:9092 with correlation id 1
> for 1 topic(s) Set(LOGFILE04)
> 24.04.2014 13:37:41212 INFO main kafka.producer.SyncProducer - Connected
> to localhost:9092 for producing
> 24.04.2014 13:37:48812 INFO main kafka.producer.SyncProducer -
> Disconnecting from localhost:9092
> 24.04.2014 13:37:48912 WARN main kafka.producer.BrokerPartitionInfo -
> Error while fetching metadata [{TopicMetadata for topic LOGFILE04 ->
> No partition metadata for topic LOGFILE04 due to
> kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class
> kafka.common.LeaderNotAvailableException
> 24.04.2014 13:37:49012 ERROR main kafka.producer.async.DefaultEventHandler
> - Failed to collate messages by topic, partition due to: Failed to fetch
> topic metadata for topic: LOGFILE04
>
>
> 24.04.2014 13:39:96513 WARN
> ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0
> kafka.consumer.ConsumerFetcherThread -
> [ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0],
> Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 4; ClientId:
> produceLogLine2-ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0;
> ReplicaId: -1; MaxWait: 6 ms; MinBytes: 1 bytes; RequestInfo:
> [LOGFILE04,0] -> PartitionFetchInfo(2,1048576)
> java.net.SocketTimeoutException
> at
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
> at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> at
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> at kafka.utils.Utils$.read(Unknown Source)
> at kafka.network.BoundedByteBufferReceive.readFrom(Unknown Source)
> at kafka.network.Receive$class.readCompletely(Unknown Source)
> at kafka.network.BoundedByteBufferReceive.readCompletely(Unknown
> Source)
> at kafka.network.BlockingChannel.receive(Unknown Source)
> at kafka.consumer.SimpleConsumer.liftedTree1$1(Unknown Source)
> at
> kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(Unknown
> Source)
> at
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Unknown
> Source)
> at
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown
> Source)
> at
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown
> Source)
> at kafka.metrics.KafkaTimer.time(Unknown Source)
> at
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(Unknown Source)
> at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown
> Source)
> at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown
> Source)
> at kafka.metrics.KafkaTimer.time(Unknown Source)
> at kafka.consumer.SimpleConsumer.fetch(Unknown Source)
> at kafka.server.AbstractFetcherThread.processFetchRequest(Unknown
> Source)
> at kafka.server.AbstractFetcherThread.doWork(Unknown Source)
> at kafka.utils.ShutdownableThread.run(Unknown Source)
>
>
> Regards,
> Yashika
> 
> From: Jun Rao 
> Sent: Thursday, April 24, 2014 10:49 PM
> To: users@kafka.apache.org
> Subject: Re: Kafka Performance Tuning
>
> Before that error message, the log should tell you the cause of the error.
> Could you dig that out?
>
> Thanks,
>
> Jun
>
>
> On Thu, Apr 24, 2014 at 10:12 AM, Yashika Gupta <
> yashika.gu...@impetus.co.in
> > wrote:
>
> > Hi,
> >
> > I am working on a POC where I have 1 Zookeeper and 2 Kafka Brokers on my
> > local machine. I am running 8 sets of Kafka consumers and producers
> > running in parallel.
> >
> > Below are my configurations:
> > Consumer Configs:
> > zookeeper.session.timeout.ms=12
> > zookeeper.sync.time.ms=2000
> > zookeeper.connection.timeout.ms=12
> > auto.commit.interval.ms=6
> > rebalance.backoff.ms=2000
> > fetch.wait.max.ms=6
> > auto.offset.reset=smallest
> >
> > Producer configs:
> > key.serializer.class=kafka.serializer.StringEncoder
> > request.required.acks=-1
> > message.send.max.retries=3
> > request.timeout.ms=6
> >
> > I have tried various more configuration changes but I am running

Re: Kafka Performance Tuning

2014-04-24 Thread Bert Corderman
I had this error before and fixed it by increasing the nofile limit.

Add an entry for the user running the broker to this file:

/etc/security/limits.conf

kafka - nofile 98304
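
Before editing limits.conf it can help to see what limit a process actually gets. The following is a minimal Python sketch, assuming a Unix host; the helper names `nofile_limits` and `is_below` are made up for illustration, and 98304 is simply the value from the entry above. It reports the limits of the process that runs it, so run it as the broker's user to see that user's values.

```python
import resource

def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

def is_below(target, limits):
    """True if the soft limit is lower than target (an unlimited soft limit never is)."""
    soft, _hard = limits
    if soft == resource.RLIM_INFINITY:
        return False
    return soft < target

if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft:", soft, "hard:", hard)
    print("below 98304:", is_below(98304, (soft, hard)))
```

A new limits.conf entry normally applies only to sessions started after the change, so a restart of the broker from a fresh login is usually needed before the numbers move.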


On Thu, Apr 24, 2014 at 1:46 PM, Yashika Gupta
wrote:

> Jun,
>
> The detailed logs are as follows:
>
> 24.04.2014 13:37:31812 INFO main kafka.producer.SyncProducer -
> Disconnecting from localhost:9092
> 24.04.2014 13:37:38612 WARN main kafka.producer.BrokerPartitionInfo -
> Error while fetching metadata [{TopicMetadata for topic LOGFILE04 ->
> No partition metadata for topic LOGFILE04 due to
> kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class
> kafka.common.LeaderNotAvailableException
> 24.04.2014 13:37:40712 INFO main kafka.client.ClientUtils$ - Fetching
> metadata from broker id:0,host:localhost,port:9092 with correlation id 1
> for 1 topic(s) Set(LOGFILE04)
> 24.04.2014 13:37:41212 INFO main kafka.producer.SyncProducer - Connected
> to localhost:9092 for producing
> 24.04.2014 13:37:48812 INFO main kafka.producer.SyncProducer -
> Disconnecting from localhost:9092
> 24.04.2014 13:37:48912 WARN main kafka.producer.BrokerPartitionInfo -
> Error while fetching metadata [{TopicMetadata for topic LOGFILE04 ->
> No partition metadata for topic LOGFILE04 due to
> kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class
> kafka.common.LeaderNotAvailableException
> 24.04.2014 13:37:49012 ERROR main kafka.producer.async.DefaultEventHandler
> - Failed to collate messages by topic, partition due to: Failed to fetch
> topic metadata for topic: LOGFILE04
>
>
> 24.04.2014 13:39:96513 WARN
> ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0
> kafka.consumer.ConsumerFetcherThread -
> [ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0],
> Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 4; ClientId:
> produceLogLine2-ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0;
> ReplicaId: -1; MaxWait: 6 ms; MinBytes: 1 bytes; RequestInfo:
> [LOGFILE04,0] -> PartitionFetchInfo(2,1048576)
> java.net.SocketTimeoutException
> at
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
> at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> at
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> at kafka.utils.Utils$.read(Unknown Source)
> at kafka.network.BoundedByteBufferReceive.readFrom(Unknown Source)
> at kafka.network.Receive$class.readCompletely(Unknown Source)
> at kafka.network.BoundedByteBufferReceive.readCompletely(Unknown
> Source)
> at kafka.network.BlockingChannel.receive(Unknown Source)
> at kafka.consumer.SimpleConsumer.liftedTree1$1(Unknown Source)
> at
> kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(Unknown
> Source)
> at
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Unknown
> Source)
> at
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown
> Source)
> at
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown
> Source)
> at kafka.metrics.KafkaTimer.time(Unknown Source)
> at
> kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(Unknown Source)
> at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown
> Source)
> at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown
> Source)
> at kafka.metrics.KafkaTimer.time(Unknown Source)
> at kafka.consumer.SimpleConsumer.fetch(Unknown Source)
> at kafka.server.AbstractFetcherThread.processFetchRequest(Unknown
> Source)
> at kafka.server.AbstractFetcherThread.doWork(Unknown Source)
> at kafka.utils.ShutdownableThread.run(Unknown Source)
>
>
> Regards,
> Yashika
> 
> From: Jun Rao 
> Sent: Thursday, April 24, 2014 10:49 PM
> To: users@kafka.apache.org
> Subject: Re: Kafka Performance Tuning
>
> Before that error message, the log should tell you the cause of the error.
> Could you dig that out?
>
> Thanks,
>
> Jun
>
>
> On Thu, Apr 24, 2014 at 10:12 AM, Yashika Gupta <
> yashika.gu...@impetus.co.in
> > wrote:
>
> > Hi,
> >
> > I am working on a POC where I have 1 Zookeeper and 2 Kafka Brokers on my
> > local machine. I am running 8 sets of Kafka consumers and producers
> > running in parallel.
> >
> > Below are my configurations:
> > Consumer Configs:
> > zookeeper.session.timeout.ms=12
> > zookeeper.sync.time.ms=2000
> > zookeeper.connection.timeout.ms=12
> > auto.commit.interval.ms=6
> > rebalance.backoff.ms=2000
> > fetch.wait.max.ms=6
> > auto.offset.reset=smallest
> >
> > Producer configs:
> > key.serializer.class=kafka.serializer.StringEncoder
> > request.required.acks=-1
> > message.send.max.retries=3
> > request.timeout.ms=60

Brokers throwing warning messages after change in retention policy and multiple produce failures

2014-04-24 Thread Sadhan Sood
We are seeing some strange behavior from brokers after we had to change
our log retention policy on brokers yesterday. We had a huge spike in
producer data for a small period which caused brokers to get very close to
the max disk space. Normally our retention policy is good 6-7 days but
since our consumers were synced up we changed the retention policy from
hour based to size based and cut short the size to a safe number (half of
our max disk space and normal usage is around 30%). After the restart, we
started seeing multiple producer side failures with FailedSends metrics
showing almost 10% failures and FailedProduceRequestsPerSec on the broker
side a non-zero number. The traces from one of the brokers looked like this:

[KafkaApi-8] Produce request with correlation id 2050686 from client xxx on
partition [TOPIC_NAME,18] failed due to Partition [TOPIC_NAME,18] doesn't
exist on 8 (kafka.server.KafkaApis)
 [KafkaApi-8] Produce request with correlation id 2102325 from client xxx
on partition [TOPIC_NAME,28] failed due to Partition [TOPIC_NAME,28]
doesn't exist on 8 (kafka.server.KafkaApis)

We checked and made sure those partitions were present on the broker.
Any help is appreciated. Also, is there a recommended way to purge log data
quickly from the brokers?

Thanks,
Sadhan
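
For anyone sizing a similar change from time-based to size-based retention: as far as I know, the broker's `log.retention.bytes` setting applies per partition log and not per broker, so a disk budget has to be divided by the number of partition replicas the broker hosts. A rough Python sketch of that arithmetic follows; the function name and the numbers (1 TiB disk, 64 partition replicas) are hypothetical, and only the "half of max disk" budget comes from the message above.

```python
def per_partition_retention_bytes(disk_bytes, budget_fraction, partitions_on_broker):
    """Split a disk budget evenly across the partition logs hosted on one broker."""
    if not 0 < budget_fraction <= 1:
        raise ValueError("budget_fraction must be in (0, 1]")
    if partitions_on_broker <= 0:
        raise ValueError("partitions_on_broker must be positive")
    return int(disk_bytes * budget_fraction) // partitions_on_broker

# Hypothetical broker: 1 TiB data disk, half of it budgeted, 64 partition replicas.
disk = 1024 ** 4
limit = per_partition_retention_bytes(disk, 0.5, 64)
print("log.retention.bytes=%d" % limit)
```

Retention is applied by deleting whole log segments, so actual usage can sit somewhat above the computed figure; leaving headroom is probably wise.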


RE: Kafka Performance Tuning

2014-04-24 Thread Yashika Gupta
Jun,

The detailed logs are as follows:

24.04.2014 13:37:31812 INFO main kafka.producer.SyncProducer - Disconnecting 
from localhost:9092
24.04.2014 13:37:38612 WARN main kafka.producer.BrokerPartitionInfo - Error 
while fetching metadata [{TopicMetadata for topic LOGFILE04 ->
No partition metadata for topic LOGFILE04 due to 
kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class 
kafka.common.LeaderNotAvailableException
24.04.2014 13:37:40712 INFO main kafka.client.ClientUtils$ - Fetching metadata 
from broker id:0,host:localhost,port:9092 with correlation id 1 for 1 topic(s) 
Set(LOGFILE04)
24.04.2014 13:37:41212 INFO main kafka.producer.SyncProducer - Connected to 
localhost:9092 for producing
24.04.2014 13:37:48812 INFO main kafka.producer.SyncProducer - Disconnecting 
from localhost:9092
24.04.2014 13:37:48912 WARN main kafka.producer.BrokerPartitionInfo - Error 
while fetching metadata [{TopicMetadata for topic LOGFILE04 ->
No partition metadata for topic LOGFILE04 due to 
kafka.common.LeaderNotAvailableException}] for topic [LOGFILE04]: class 
kafka.common.LeaderNotAvailableException
24.04.2014 13:37:49012 ERROR main kafka.producer.async.DefaultEventHandler - 
Failed to collate messages by topic, partition due to: Failed to fetch topic 
metadata for topic: LOGFILE04


24.04.2014 13:39:96513 WARN 
ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0 
kafka.consumer.ConsumerFetcherThread - 
[ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0],
 Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 4; ClientId: 
produceLogLine2-ConsumerFetcherThread-produceLogLine2_vcmd-devanshu-1398361030812-8a0c706e-0-0;
 ReplicaId: -1; MaxWait: 6 ms; MinBytes: 1 bytes; RequestInfo: 
[LOGFILE04,0] -> PartitionFetchInfo(2,1048576)
java.net.SocketTimeoutException
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at 
java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
at kafka.utils.Utils$.read(Unknown Source)
at kafka.network.BoundedByteBufferReceive.readFrom(Unknown Source)
at kafka.network.Receive$class.readCompletely(Unknown Source)
at kafka.network.BoundedByteBufferReceive.readCompletely(Unknown Source)
at kafka.network.BlockingChannel.receive(Unknown Source)
at kafka.consumer.SimpleConsumer.liftedTree1$1(Unknown Source)
at 
kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(Unknown
 Source)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Unknown
 Source)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown
 Source)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(Unknown
 Source)
at kafka.metrics.KafkaTimer.time(Unknown Source)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(Unknown 
Source)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown Source)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(Unknown Source)
at kafka.metrics.KafkaTimer.time(Unknown Source)
at kafka.consumer.SimpleConsumer.fetch(Unknown Source)
at kafka.server.AbstractFetcherThread.processFetchRequest(Unknown 
Source)
at kafka.server.AbstractFetcherThread.doWork(Unknown Source)
at kafka.utils.ShutdownableThread.run(Unknown Source)


Regards,
Yashika

From: Jun Rao 
Sent: Thursday, April 24, 2014 10:49 PM
To: users@kafka.apache.org
Subject: Re: Kafka Performance Tuning

Before that error message, the log should tell you the cause of the error.
Could you dig that out?

Thanks,

Jun


On Thu, Apr 24, 2014 at 10:12 AM, Yashika Gupta  wrote:

> Hi,
>
> I am working on a POC where I have 1 Zookeeper and 2 Kafka Brokers on my
> local machine. I am running 8 sets of Kafka consumers and producers running
> in parallel.
>
> Below are my configurations:
> Consumer Configs:
> zookeeper.session.timeout.ms=12
> zookeeper.sync.time.ms=2000
> zookeeper.connection.timeout.ms=12
> auto.commit.interval.ms=6
> rebalance.backoff.ms=2000
> fetch.wait.max.ms=6
> auto.offset.reset=smallest
>
> Producer configs:
> key.serializer.class=kafka.serializer.StringEncoder
> request.required.acks=-1
> message.send.max.retries=3
> request.timeout.ms=6
>
> I have tried various more configuration changes but I am running into the
> same exception again and again.
>
> 17.04.2014 06:31:83216 ERROR pool-5-thread-1
> kafka.producer.async.DefaultEventHandler - Failed to collate messages by
> topic, partition due to: Failed to fetch topic metadata for topic: TOPIC99
> 17.04.2014 06:31:85616 ERROR pool-4-thread-1
> kafka.producer.async.DefaultEventHandler - Failed to collate messages by
> topic, p

Re: Kafka Performance Tuning

2014-04-24 Thread Jun Rao
Before that error message, the log should tell you the cause of the error.
Could you dig that out?

Thanks,

Jun


On Thu, Apr 24, 2014 at 10:12 AM, Yashika Gupta  wrote:

> Hi,
>
> I am working on a POC where I have 1 Zookeeper and 2 Kafka Brokers on my
> local machine. I am running 8 sets of Kafka consumers and producers running
> in parallel.
>
> Below are my configurations:
> Consumer Configs:
> zookeeper.session.timeout.ms=12
> zookeeper.sync.time.ms=2000
> zookeeper.connection.timeout.ms=12
> auto.commit.interval.ms=6
> rebalance.backoff.ms=2000
> fetch.wait.max.ms=6
> auto.offset.reset=smallest
>
> Producer configs:
> key.serializer.class=kafka.serializer.StringEncoder
> request.required.acks=-1
> message.send.max.retries=3
> request.timeout.ms=6
>
> I have tried various more configuration changes but I am running into the
> same exception again and again.
>
> 17.04.2014 06:31:83216 ERROR pool-5-thread-1
> kafka.producer.async.DefaultEventHandler - Failed to collate messages by
> topic, partition due to: Failed to fetch topic metadata for topic: TOPIC99
> 17.04.2014 06:31:85616 ERROR pool-4-thread-1
> kafka.producer.async.DefaultEventHandler - Failed to collate messages by
> topic, partition due to: Failed to fetch topic metadata for topic: TOPIC99
>
> I am not able to get to the root cause of the issue.
> Appreciate your help.
>
> Regards,
> Yashika
>
> 
> NOTE: This message may contain information that is confidential,
> proprietary, privileged or otherwise protected by law. The message is
> intended solely for the named addressee. If received in error, please
> destroy and notify the sender. Any use of this email is prohibited when
> received in error. Impetus does not represent, warrant and/or guarantee,
> that the integrity of this communication has been maintained nor that the
> communication is free of errors, virus, interception or interference.
>


Re: Failing broker with errors for Conditional update

2014-04-24 Thread Jun Rao
0.8.1.1 is being voted now.

Thanks,

Jun


On Thu, Apr 24, 2014 at 10:07 AM, Drew Goya  wrote:

> This just hit me this morning as well, any news on 0.8.1.1?  My ops guy is
> going to kill me, we just rolled off my older build of 0.8.1 to the
> official release.
>
>
> On Thu, Apr 3, 2014 at 11:55 PM, Krzysztof Ociepa <
> ociepa.krzysz...@gmail.com> wrote:
>
> > Hi Guozhang,
> > Hi Neha,
> >
> > Thanks a lot for your answers. I will try new version 0.8.1.1 and let
> > you know how it works for me.
> >
> > Best,
> > Chris
> >
>


Re: Cluster expansion and upgrade

2014-04-24 Thread Jun Rao
Partition reassignment wasn't fully working in 0.8-beta. So you probably
will have to upgrade existing brokers to 0.8.1 before running partition
reassignment. Also, 0.8.1.1 will be out soon.

Thanks,

Jun


On Thu, Apr 24, 2014 at 9:49 AM, vimpy batra  wrote:

> Hello,
>
> We are currently running a kafka 0.8-beta cluster. We are planning to
> expand the existing cluster and use 0.8.1 version on the new nodes.  Before
> upgrading the older ones we want the new ones to participate in the
> cluster. We plan to use "reassign-partitions" tool in 0.8.1 to reassign
> partitions to the newly added brokers. We will also add additional
> partitions to the existing topics. Is there a best practice to go about
> expanding clusters? Is this the recommended way to go or should we upgrade
> our existing cluster to 0.8.1 first?
> I also noticed that there are some fixes on top of 0.8.1. Are they
> available to use?
>
> Thanks,
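
Once all brokers are on a version where reassignment works, the tool takes a JSON file describing the target replica list for each partition. The Python sketch below builds such a file with a simple round-robin placement. It is an illustration only: the function name, topic name, and broker ids are hypothetical, and the `version`/`partitions` layout is my understanding of what the reassignment tool accepts, so check it against the documentation for your release.

```python
import json

def build_reassignment(topic, partition_count, brokers, replication_factor):
    """Assign replicas round-robin over `brokers` and return the plan as a dict."""
    if replication_factor > len(brokers):
        raise ValueError("replication_factor exceeds number of brokers")
    partitions = []
    for p in range(partition_count):
        # Partition p starts at broker index p and takes the next brokers in order.
        replicas = [brokers[(p + r) % len(brokers)] for r in range(replication_factor)]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}

# Hypothetical cluster: brokers 1 and 2 are old, 3 and 4 are newly added.
plan = build_reassignment("my-topic", 3, [1, 2, 3, 4], 2)
print(json.dumps(plan, indent=2))
```

Round-robin is only one possible placement; it ignores rack layout and current data location, so a real plan would likely move fewer replicas.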


Re: Delete Topic - BadVersionException

2014-04-24 Thread Jun Rao
Delete topic doesn't quite work yet and we will try to fix it in the next
release. https://issues.apache.org/jira/browse/KAFKA-1397

Thanks,

Jun


On Thu, Apr 24, 2014 at 9:49 AM, Drew Goya  wrote:

> Just tried my first topic delete today and it looks like something went
> wrong on the controller.  I issued the command on a test topic and shortly
> after that a describe looked like:
>
> Topic:TimeoutQueueTest PartitionCount:256 ReplicationFactor:3 Configs:
> Topic: TimeoutQueueTest Partition: 0 Leader: -1 Replicas: 9,14,15 Isr:
> Topic: TimeoutQueueTest Partition: 1 Leader: -1 Replicas: 10,15,1 Isr:
> Topic: TimeoutQueueTest Partition: 2 Leader: -1 Replicas: 11,1,2 Isr:
> Topic: TimeoutQueueTest Partition: 3 Leader: -1 Replicas: 12,2,3 Isr:
> Topic: TimeoutQueueTest Partition: 4 Leader: -1 Replicas: 13,3,4 Isr:
> Topic: TimeoutQueueTest Partition: 5 Leader: -1 Replicas: 14,4,5 Isr:
>
> It stayed that way for quite a while, so I hit ZooKeeper and went looking
> for which broker was the controller. I found these in that broker's logs:
>
> [2014-04-24 16:27:42,498] ERROR Conditional update of path
> /brokers/topics/TimeoutQueueTest/partitions/170/state with data
>
> {"controller_epoch":18,"leader":2,"version":1,"leader_epoch":14,"isr":[2,14]}
> and expected version 30 failed due to
> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
> BadVersion for /brokers/topics/TimeoutQueueTest/partitions/170/state
> (kafka.utils.ZkUtils$)
> [2014-04-24 16:27:42,504] ERROR Conditional update of path
> /brokers/topics/TimeoutQueueTest/partitions/113/state with data
>
> {"controller_epoch":18,"leader":2,"version":1,"leader_epoch":4,"isr":[2,15,14]}
> and expected version 17 failed due to
> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
> BadVersion for /brokers/topics/TimeoutQueueTest/partitions/113/state
> (kafka.utils.ZkUtils$)
>
> Any ideas?
>


Kafka Performance Tuning

2014-04-24 Thread Yashika Gupta
Hi,

I am working on a POC where I have 1 Zookeeper and 2 Kafka Brokers on my local 
machine. I am running 8 sets of Kafka consumers and producers running in 
parallel.

Below are my configurations:
Consumer Configs:
zookeeper.session.timeout.ms=12
zookeeper.sync.time.ms=2000
zookeeper.connection.timeout.ms=12
auto.commit.interval.ms=6
rebalance.backoff.ms=2000
fetch.wait.max.ms=6
auto.offset.reset=smallest

Producer configs:
key.serializer.class=kafka.serializer.StringEncoder
request.required.acks=-1
message.send.max.retries=3
request.timeout.ms=6

I have tried various more configuration changes but I am running into the same 
exception again and again.

17.04.2014 06:31:83216 ERROR pool-5-thread-1 
kafka.producer.async.DefaultEventHandler - Failed to collate messages by topic, 
partition due to: Failed to fetch topic metadata for topic: TOPIC99
17.04.2014 06:31:85616 ERROR pool-4-thread-1 
kafka.producer.async.DefaultEventHandler - Failed to collate messages by topic, 
partition due to: Failed to fetch topic metadata for topic: TOPIC99

I am not able to get to the root cause of the issue.
Appreciate your help.

Regards,
Yashika










Re: question about isr

2014-04-24 Thread Jun Rao
Interesting. Which version of Kafka are you using? Were you doing some
partition reassignment?

Thanks,

Jun


On Wed, Apr 23, 2014 at 11:14 PM, 陈小军  wrote:

> Hi Team,
>    I found something strange about the ISR list in my Kafka cluster.
>
>    When I use the tool that Kafka provides to get the topic information,
> it shows the ISR list as follows, which seems OK:
>
>  [irt...@xseed171.kdev bin]$ ./kafka-topics.sh --describe --zookeeper
> 10.96.250.215:10013,10.96.250.216:10013,10.96.250.217:10013/nelo2-kafka
>
> Topic:nelo2-normal-logs  PartitionCount:3  ReplicationFactor:2  Configs:
>     Topic: nelo2-normal-logs  Partition: 0  Leader: 3  Replicas: 3,0  Isr: 0,3
>     Topic: nelo2-normal-logs  Partition: 1  Leader: 0  Replicas: 0,1  Isr: 0,1
>     Topic: nelo2-normal-logs  Partition: 2  Leader: 1  Replicas: 1,3  Isr: 1,3
>
>    But when I use an SDK to fetch the metadata from a broker, the ISR is
> different:
> metadata:  { size: 246,
>   correlationId: 0,
>   brokerNum: -1,
>   nodeId: 1,
>   host: 'xseed171.kdev.nhnsystem.com',
>   port: 9093,
>   topicNum: 0,
>   topicError: 0,
>   topic: 'nelo2-normal-logs',
>   partitionNum: 2,
>   errorCode: 0,
>   partition: 0,
>   leader: 3,
>   replicasNum: 2,
>   replicas: [ 3, 0 ],
>   isrNum: 2,
>   isr: [ 0, 3 ] }
> metadata:  { size: 246,
>   correlationId: 0,
>   brokerNum: -1,
>   nodeId: 1,
>   host: 'xseed171.kdev.nhnsystem.com',
>   port: 9093,
>   topicNum: 0,
>   topicError: 0,
>   topic: 'nelo2-normal-logs',
>   partitionNum: 1,
>   errorCode: 0,
>   partition: 1,
>   leader: 0,
>   replicasNum: 2,
>   replicas: [ 0, 1 ],
>   isrNum: 2,
>   isr: [ 0, 1 ] }
> metadata:  { size: 246,
>   correlationId: 0,
>   brokerNum: -1,
>   nodeId: 1,
>   host: 'xseed171.kdev.nhnsystem.com',
>   port: 9093,
>   topicNum: 0,
>   topicError: 0,
>   topic: 'nelo2-normal-logs',
>   partitionNum: 0,
>   errorCode: 0,
>   partition: 2,
>   leader: 1,
>   replicasNum: 2,
>   replicas: [ 1, 3 ],
>   isrNum: 1,
>   isr: [ 1 ] }
>
>    I also tried another SDK and got the same result. I checked the Kafka
> logs, and it seems the SDK result is right and the tool's result is wrong.
> Why does this happen?
>
> [2014-04-24 14:53:57,705] TRACE Broker 3 cached leader info
> (LeaderAndIsrInfo:(Leader:0,ISR:0,1,LeaderEpoch:7,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:0,1)
> for partition [nelo2-normal-logs,1] in response to UpdateMetadata request
> sent by controller 0 epoch 10 with correlation id 13 (state.change.logger)
> [2014-04-24 14:53:57,705] TRACE Broker 3 cached leader info
> (LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:9,ControllerEpoch:10),ReplicationFactor:2),AllReplicas:1,3)
> for partition [nelo2-normal-logs,2] in response to UpdateMetadata request
> sent by controller 0 epoch 10 with correlation id 13 (state.change.logger)
> [2014-04-24 14:53:57,705] TRACE Broker 3 cached leader info
> (LeaderAndIsrInfo:(Leader:3,ISR:0,3,LeaderEpoch:10,ControllerEpoch:10),ReplicationFactor:2),AllReplicas:3,0)
> for partition [nelo2-normal-logs,0] in response to UpdateMetadata request
> sent by controller 0 epoch 10 with correlation id 13 (state.change.logger)
>
> Thanks~!
>
> Best Regards
> Jerry
>
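
The comparison described in the quoted message (ISR from the describe tool versus ISR from broker metadata) can be automated. A small Python sketch follows, using the values quoted above as sample data. The function names are made up, the regular expression assumes the cleanly spaced `--describe` layout, and `BROKER_VIEW` is hand-copied from the SDK output, not fetched from a live cluster.

```python
import re

# Matches one partition row of `kafka-topics.sh --describe` style output.
DESCRIBE_LINE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+Replicas:\s*([\d,]*)\s+Isr:[ \t]*([\d,]*)"
)

def parse_describe(text):
    """Map partition id -> sorted ISR list parsed from describe output."""
    isr_by_partition = {}
    for match in DESCRIBE_LINE.finditer(text):
        partition = int(match.group(1))
        isr_by_partition[partition] = sorted(int(b) for b in match.group(4).split(",") if b)
    return isr_by_partition

def isr_mismatches(tool_view, broker_view):
    """Return {partition: (tool_isr, broker_isr)} for partitions where the views differ."""
    mismatches = {}
    for partition in sorted(set(tool_view) | set(broker_view)):
        tool_isr = sorted(tool_view.get(partition, []))
        broker_isr = sorted(broker_view.get(partition, []))
        if tool_isr != broker_isr:
            mismatches[partition] = (tool_isr, broker_isr)
    return mismatches

DESCRIBE = """
Topic: nelo2-normal-logs Partition: 0 Leader: 3 Replicas: 3,0 Isr: 0,3
Topic: nelo2-normal-logs Partition: 1 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: nelo2-normal-logs Partition: 2 Leader: 1 Replicas: 1,3 Isr: 1,3
"""
BROKER_VIEW = {0: [0, 3], 1: [0, 1], 2: [1]}

print(isr_mismatches(parse_describe(DESCRIBE), BROKER_VIEW))
```

With the sample data only partition 2 is reported, matching the discrepancy described in the message.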


Re: Failing broker with errors for Conditional update

2014-04-24 Thread Drew Goya
This just hit me this morning as well, any news on 0.8.1.1?  My ops guy is
going to kill me, we just rolled off my older build of 0.8.1 to the
official release.


On Thu, Apr 3, 2014 at 11:55 PM, Krzysztof Ociepa <
ociepa.krzysz...@gmail.com> wrote:

> Hi Guozhang,
> Hi Neha,
>
> Thanks a lot for your answers. I will try new version 0.8.1.1 and let
> you know how it works for me.
>
> Best,
> Chris
>


Delete Topic - BadVersionException

2014-04-24 Thread Drew Goya
Just tried my first topic delete today and it looks like something went
wrong on the controller.  I issued the command on a test topic and shortly
after that a describe looked like:

Topic:TimeoutQueueTest PartitionCount:256 ReplicationFactor:3 Configs:
Topic: TimeoutQueueTest Partition: 0 Leader: -1 Replicas: 9,14,15 Isr:
Topic: TimeoutQueueTest Partition: 1 Leader: -1 Replicas: 10,15,1 Isr:
Topic: TimeoutQueueTest Partition: 2 Leader: -1 Replicas: 11,1,2 Isr:
Topic: TimeoutQueueTest Partition: 3 Leader: -1 Replicas: 12,2,3 Isr:
Topic: TimeoutQueueTest Partition: 4 Leader: -1 Replicas: 13,3,4 Isr:
Topic: TimeoutQueueTest Partition: 5 Leader: -1 Replicas: 14,4,5 Isr:

It stayed that way for quite a while, so I hit ZooKeeper and went looking
for which broker was the controller. I found these in that broker's logs:

[2014-04-24 16:27:42,498] ERROR Conditional update of path
/brokers/topics/TimeoutQueueTest/partitions/170/state with data
{"controller_epoch":18,"leader":2,"version":1,"leader_epoch":14,"isr":[2,14]}
and expected version 30 failed due to
org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
BadVersion for /brokers/topics/TimeoutQueueTest/partitions/170/state
(kafka.utils.ZkUtils$)
[2014-04-24 16:27:42,504] ERROR Conditional update of path
/brokers/topics/TimeoutQueueTest/partitions/113/state with data
{"controller_epoch":18,"leader":2,"version":1,"leader_epoch":4,"isr":[2,15,14]}
and expected version 17 failed due to
org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
BadVersion for /brokers/topics/TimeoutQueueTest/partitions/113/state
(kafka.utils.ZkUtils$)

Any ideas?
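
For readers unfamiliar with the error: a conditional update only succeeds if
the node's version still equals the version the writer expects. The sketch
below is a toy in-memory model of that check, not the ZooKeeper client API or
Kafka's ZkUtils; the class and method names are hypothetical. It shows one
general way a BadVersion rejection can arise (a writer holding a stale
version). Whether that is what happened on this controller is not established
here.

```python
class BadVersionError(Exception):
    """Raised when the caller's expected version does not match the stored one."""


class VersionedStore:
    """Toy stand-in for a versioned key-value store (hypothetical, in-memory)."""

    def __init__(self):
        self._nodes = {}  # path -> (data, version)

    def create(self, path, data):
        self._nodes[path] = (data, 0)

    def get(self, path):
        return self._nodes[path]

    def conditional_update(self, path, data, expected_version):
        """Write data only if the stored version equals expected_version."""
        _, current = self._nodes[path]
        if current != expected_version:
            raise BadVersionError(
                f"{path}: expected version {expected_version}, found {current}"
            )
        self._nodes[path] = (data, current + 1)
        return current + 1


store = VersionedStore()
path = "/brokers/topics/TimeoutQueueTest/partitions/170/state"
store.create(path, '{"leader":2}')

_, seen_by_a = store.get(path)  # writer A reads the current version
_, seen_by_b = store.get(path)  # writer B reads the same version
store.conditional_update(path, '{"leader":-1}', seen_by_b)  # B writes first

try:
    store.conditional_update(path, '{"leader":2}', seen_by_a)  # A is now stale
    stale_write_rejected = False
except BadVersionError:
    stale_write_rejected = True
```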


Cluster expansion and upgrade

2014-04-24 Thread vimpy batra
Hello,

We are currently running a Kafka 0.8-beta cluster. We are planning to expand
the existing cluster and run version 0.8.1 on the new nodes. Before upgrading
the older brokers, we want the new ones to participate in the cluster. We plan to
use the "reassign-partitions" tool in 0.8.1 to reassign partitions to the newly
added brokers. We will also add partitions to the existing topics.
Is there a best practice for expanding clusters? Is this the
recommended approach, or should we upgrade our existing cluster to 0.8.1 first?
I also noticed that there are some fixes on top of 0.8.1. Are they available to
use?

Thanks,
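
To make the reassignment step concrete, here is a minimal sketch that builds a
reassignment plan as JSON. The layout (a "version" field plus a "partitions"
list of topic/partition/replicas entries) is the input format of the
reassign-partitions tool as I understand it; please verify it against the
documentation for your version. The topic name, broker ids, and the simple
round-robin placement are all made-up assumptions for illustration, not what
the tool itself would generate.

```python
import json


def build_reassignment(topic, partition_count, brokers, replication_factor):
    """Spread each partition's replicas round-robin over the given brokers."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor exceeds number of brokers")
    partitions = []
    for p in range(partition_count):
        # Consecutive brokers starting at offset p, wrapping around the list,
        # so the replicas of one partition are always distinct brokers.
        replicas = [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical cluster: brokers 1-3 are old, 4-5 are newly added.
plan = build_reassignment("example-topic", 4, [1, 2, 3, 4, 5], 2)
print(json.dumps(plan, indent=2))
```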

Re: Kafka/Zookeeper co-location

2014-04-24 Thread Todd Palino
We typically run all of our Zookeeper instances separate, but we do have
one Kafka cluster that is colocated with the Zookeeper nodes. It works
just fine, probably in part because Zookeeper handles everything serially.
The caveat is that the cluster that we're doing this on is not designed
for performance, but rather compactness. And we're getting rid of it
because we don't like the setup.

Keep in mind that you're colocating two applications that are sensitive to
disk I/O. This is going to affect performance of both systems and you
really need to decide if that's what you want. If you have enough spindles
to keep everything separate (Kafka data storage and Zookeeper transaction
logs on separate spindles from everything else) you might be OK.

-Todd


On 4/24/14, 8:53 AM, "Andrew Otto"  wrote:

>Oo, I'm curious about this as well!  Wikimedia is considering doing this
>if/when we install brokers in our web caching data centers.
>
>
>On Apr 24, 2014, at 11:49 AM, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
> wrote:
>
>> Are there any thoughts on running Zookeeper on the same physical nodes
>>that run the Kafka broker? So the loss of a node affects quorum and
>>possibly requires electing new leaders at both the ZK and the broker
>>level. Are there race conditions or other failure cases that could come
>>about from either a co-located (or, an independent) setup? Thanks.
>> 
>> -sudarshan
>> 
>> 



Re: Kafka/Zookeeper co-location

2014-04-24 Thread Andrew Otto
Oo, I’m curious about this as well!  Wikimedia is considering doing this 
if/when we install brokers in our web caching data centers.


On Apr 24, 2014, at 11:49 AM, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) 
 wrote:

> Are there any thoughts on running Zookeeper on the same physical nodes that 
> run the Kafka broker? So the loss of a node affects quorum and possibly 
> requires electing new leaders at both the ZK and the broker level. Are there 
> race conditions or other failure cases that could come about from either a 
> co-located (or, an independent) setup? Thanks.
> 
> -sudarshan
> 
> ---



Kafka/Zookeeper co-location

2014-04-24 Thread Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
Are there any thoughts on running Zookeeper on the same physical nodes that run 
the Kafka broker? So the loss of a node affects quorum and possibly requires 
electing new leaders at both the ZK and the broker level. Are there race 
conditions or other failure cases that could come about from either a 
co-located (or, an independent) setup? Thanks.

-sudarshan
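
On the quorum part of the question: a Zookeeper ensemble stays available only
while a strict majority of its servers is up, so every co-located node that
goes down takes one vote with it. A small sketch of that arithmetic (the
function names are mine, chosen for illustration):

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be lost while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Note that going from 3 to 4 servers does not increase the number of failures
the ensemble can survive, which is why odd ensemble sizes are the usual choice.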

---