Re: consumer.poll() takes approx. 30 seconds - 0.9 new consumer api

2016-06-18 Thread Rohit Sardesai
In my tests, I am using around 24 consumer groups. I never call
consumer.close() or consumer.unsubscribe() until the application is shutting
down.

So the consumers never leave, but new consumer instances do get created as the
parallel requests pile up. Also, I reuse consumer instances if they are idle
(i.e., not serving any consume request). So with 9 partitions, I issue 9
consume requests in parallel every second under the same consumer group.

To summarize, my test setup is: 3 Kafka brokers, 2 ZooKeeper nodes, 1 topic,
9 partitions, 24 consumer groups, and 9 consume requests at a time.



From: Dana Powers 
Sent: 19 June 2016 10:45
To: users@kafka.apache.org
Subject: Re: consumer.poll() takes approx. 30 seconds - 0.9 new consumer api

Is your test reusing a group name? And if so, are your consumer instances
leaving gracefully? If not, this may cause subsequent 'rebalance' operations
to block until those old consumers check in or the session timeout expires
(30 seconds).

-Dana


Re: consumer.poll() takes approx. 30 seconds - 0.9 new consumer api

2016-06-18 Thread Dana Powers
Is your test reusing a group name? And if so, are your consumer instances
leaving gracefully? If not, this may cause subsequent 'rebalance' operations
to block until those old consumers check in or the session timeout expires
(30 seconds).

-Dana
On Jun 18, 2016 8:56 PM, "Rohit Sardesai" 
wrote:

> I am using the group management feature of Kafka 0.9 to handle partition
> assignment to consumer instances. I use the subscribe() API to subscribe to
> the topic I am interested in reading data from. My environment has 3 Kafka
> brokers and a couple of ZooKeeper nodes. I created a topic with 9
> partitions. The performance tests attempt to send 9 parallel poll()
> requests to the Kafka brokers every second. The results show that the first
> poll() on each consumer takes around 30 seconds and returns 0 records.
> Also, when I print the partition assignment for this consumer instance, I
> see no partitions assigned to it. The next poll() returns quickly
> (~10-20 ms) with data and some partitions assigned.
>
> With each consumer taking 30 seconds, the performance tests report very
> low throughput. I run the tests for around 1000 seconds, producing messages
> on the topic for the complete duration, and I start the parallel consume
> requests after 400 seconds. So out of 400 seconds, with 9 consumers taking
> 30 seconds each, around 270 seconds are spent in the first poll without any
> data. Is it because of the rebalance operation that the consumers are
> blocked in poll()? What is the best way to use poll() if I have to serve
> many parallel requests per second? Should I prefer manual assignment of
> partitions in this case instead of relying on rebalance?
>
>
> Regards,
>
> Rohit Sardesai
>
>
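
The blocking behaviour Dana describes can be illustrated with a toy model. This is only a sketch of the reasoning in this message, not the Kafka group coordinator's actual logic: the function name rebalance_wait_s and its input format are hypothetical, and the 30-second default is the session timeout cited above.

```python
# Toy model (an assumption for illustration, not Kafka's implementation):
# a rebalance finishes once every existing group member has checked in,
# and a member that never checks in is given up on after the session timeout.

SESSION_TIMEOUT_S = 30.0  # value cited in the thread


def rebalance_wait_s(next_checkin_s, session_timeout_s=SESSION_TIMEOUT_S):
    """Return how long a joining consumer would wait in this model.

    next_checkin_s: one entry per existing member, giving the seconds until
    that member next checks in, or None if it never does (for example an
    idle instance that is neither polling nor closed).
    """
    if not next_checkin_s:
        return 0.0  # no existing members to wait for
    waits = [
        session_timeout_s if t is None else min(t, session_timeout_s)
        for t in next_checkin_s
    ]
    # The slowest member decides when the rebalance can complete.
    return max(waits)


print(rebalance_wait_s([0.5, 1.0]))   # all members check in quickly
print(rebalance_wait_s([0.5, None]))  # one member never checks in
```

In this model a single silent member is enough to push the wait to the full session timeout, which would match a first poll() of about 30 seconds.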


consumer.poll() takes approx. 30 seconds - 0.9 new consumer api

2016-06-18 Thread Rohit Sardesai
I am using the group management feature of Kafka 0.9 to handle partition
assignment to consumer instances. I use the subscribe() API to subscribe to the
topic I am interested in reading data from. My environment has 3 Kafka brokers
and a couple of ZooKeeper nodes. I created a topic with 9 partitions. The
performance tests attempt to send 9 parallel poll() requests to the Kafka
brokers every second. The results show that the first poll() on each consumer
takes around 30 seconds and returns 0 records. Also, when I print the partition
assignment for this consumer instance, I see no partitions assigned to it. The
next poll() returns quickly (~10-20 ms) with data and some partitions assigned.

With each consumer taking 30 seconds, the performance tests report very low
throughput. I run the tests for around 1000 seconds, producing messages on the
topic for the complete duration, and I start the parallel consume requests
after 400 seconds. So out of 400 seconds, with 9 consumers taking 30 seconds
each, around 270 seconds are spent in the first poll without any data. Is it
because of the rebalance operation that the consumers are blocked in poll()?
What is the best way to use poll() if I have to serve many parallel requests
per second? Should I prefer manual assignment of partitions in this case
instead of relying on rebalance?


Regards,

Rohit Sardesai
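
For reference, the 270-second figure in this message is 9 consumers times 30 seconds, which treats the first polls as happening one after another. The sketch below only works through that arithmetic and contrasts it with the case where all nine first polls overlap; the function names are hypothetical, and which case applies depends on how the test actually schedules its requests.

```python
# Worked arithmetic for the figures quoted in the message above.

FIRST_POLL_S = 30  # reported latency of the first poll() per consumer
CONSUMERS = 9      # one consumer per parallel consume request


def serial_wait_s(consumers=CONSUMERS, first_poll_s=FIRST_POLL_S):
    """Total wait if the first polls run strictly one after another."""
    return consumers * first_poll_s


def overlapped_wait_s(consumers=CONSUMERS, first_poll_s=FIRST_POLL_S):
    """Wall-clock wait if all first polls start together and fully overlap."""
    return first_poll_s if consumers > 0 else 0


print(serial_wait_s())      # the 270 s figure from the message
print(overlapped_wait_s())  # lower bound when the waits overlap
```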



Re: test of producer's delay and consumer's delay

2016-06-18 Thread Kafka
To Christian Posta,
I have taken the interpretation of time into account.
My producer and consumer are deployed on the same machine, and the machine is
well provisioned, so it will not be the bottleneck.
So there is no problem with the interpretation of time.

> On June 18, 2016, at 11:26 AM, Kafka wrote:
> 
> Hello, I have run a series of tests on Kafka 0.9.0, and one of the results
> confused me.
> 
> Test environment:
> Kafka cluster: 3 brokers, 8-core CPU / 8 GB memory / 1G network card
> Client: 4-core CPU / 4 GB memory
> Topic: 6 partitions, 2 replicas
> 
> Total messages: 1
> Single message size: 1024 bytes
> fetch.min.bytes: 1
> fetch.wait.max.ms: 100 ms
> 
> All send tests use the Scala sync interface.
> 
> When I set ack to 0, the producer's delay is 0.3 ms and the consumer's delay
> is 7.7 ms.
> When I set ack to 1, the producer's delay is 1.6 ms and the consumer's delay
> is 3.7 ms.
> When I set ack to -1, the producer's delay is 3.5 ms and the consumer's delay
> is 4.2 ms.
> 
> But why does the consumer's delay decrease when I change ack from 0 to 1?
> This confuses me.
> 



test of producer's delay and consumer's delay

2016-06-18 Thread Kafka
I send every message with a timestamp, and when I receive a message, I
subtract the message's timestamp from the current timestamp. That gives me the
consumer's delay.
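
The measurement described above can be sketched as follows. This is a minimal illustration, not the code used in the test: the JSON envelope, the field name sent_ms, and the helper names are assumptions, and no Kafka client is involved. It relies on the sender and receiver reading the same wall clock, which holds when both run on one machine.

```python
import json
import time


def make_message(payload, now_ms=None):
    """Wrap a payload with the send time in milliseconds (hypothetical format)."""
    sent_ms = int(time.time() * 1000) if now_ms is None else now_ms
    return json.dumps({"sent_ms": sent_ms, "payload": payload}).encode("utf-8")


def consumer_delay_ms(raw, now_ms=None):
    """Delay = receive time minus the send time embedded in the message."""
    received_ms = int(time.time() * 1000) if now_ms is None else now_ms
    sent_ms = json.loads(raw.decode("utf-8"))["sent_ms"]
    return received_ms - sent_ms


# Fixed clock values make the example deterministic.
msg = make_message("x" * 16, now_ms=1_000)
delay = consumer_delay_ms(msg, now_ms=1_004)
print(delay)
```

Because the delay is the difference of two wall-clock readings, it includes the producer's send time, broker handling, and the consumer's fetch wait, not only time spent in the consumer.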