Re: High CPU usage for idle kafka server

2015-06-05 Thread Jiangjie Qin
Could this be related to KAFKA-1461?
Can you see which thread is taking a lot of CPU? A jconsole plugin can
provide that information.
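
For reference, a minimal sketch of the per-thread CPU check suggested above, using the standard java.lang.management API. It inspects the JVM it runs in; pointing it at a broker would mean reading the same ThreadMXBean over a remote JMX connection, which is an assumption about your setup and is not shown. The class and method names (HotThreads, topThreads) are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: ranks the threads of the current JVM by accumulated CPU time.
final class HotThreads {
    /** Thread name plus accumulated CPU time in nanoseconds. */
    static final class ThreadCpu {
        final String name;
        final long cpuNanos;

        ThreadCpu(String name, long cpuNanos) {
            this.name = name;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Returns up to `limit` threads, busiest first; empty if CPU timing is unavailable. */
    static List<ThreadCpu> topThreads(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<ThreadCpu> result = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return result;
        }
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id);      // -1 if the thread has already died
            ThreadInfo info = mx.getThreadInfo(id);  // null if the thread has already died
            if (cpu >= 0 && info != null) {
                result.add(new ThreadCpu(info.getThreadName(), cpu));
            }
        }
        result.sort((a, b) -> Long.compare(b.cpuNanos, a.cpuNanos));
        int n = Math.min(Math.max(limit, 0), result.size());
        return new ArrayList<>(result.subList(0, n));
    }

    public static void main(String[] args) {
        for (ThreadCpu t : topThreads(5)) {
            System.out.printf("%-40s %d ms%n", t.name, t.cpuNanos / 1_000_000);
        }
    }
}
```

Sampling this twice a few seconds apart and comparing the two readings shows which threads are burning CPU right now, as opposed to since startup.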

Jiangjie (Becket) Qin

On 6/5/15, 2:57 PM, "pundlik.anuja"  wrote:

>Hi Jay,
>
>Good to hear from you. I met you at the kafka meetup at linkedin.
>
>- No, I am running kafka_2.11-0.8.2.1
>
>
>Are there any logs/ any info that I can provide that will help you
>understand what could be the issue?
>
>Thanks,
>Anuja



Re: High CPU usage for idle kafka server

2015-06-05 Thread pundlik.anuja
Hi Jay,

Good to hear from you. I met you at the kafka meetup at linkedin.

- No, I am running kafka_2.11-0.8.2.1


Are there any logs/ any info that I can provide that will help you
understand what could be the issue?

Thanks,
Anuja

On Fri, Jun 5, 2015 at 2:36 PM, Jay Kreps  wrote:

> This sounds a lot like a bug we fixed in 0.8.2.0, no chance you are running
> that pre-release version is there?
>
> -Jay


Re: High CPU usage for idle kafka server

2015-06-05 Thread Jay Kreps
This sounds a lot like a bug we fixed in 0.8.2.0. Is there any chance you
are running a pre-release version?

-Jay

On Wed, Jun 3, 2015 at 9:43 PM, Anuja Pundlik (apundlik)  wrote:

> Hi,
>
> I am using Kafka 0.8.2.1.
> We have 1 zookeeper, 3 kafka brokers.
> We have 9 topics, out of which 1 topic has 18 partitions, while another
> has 12 partitions. All other topics have 1 partition each.
>
> We see that idle kafka brokers (not carrying any message) are using more
> than 50% of CPU. See top output below.
>
> Is this a known issue?
>
>
> Thanks
>
>
>
> top - 04:42:30 up  2:07,  1 user,  load average: 1.50, 1.31, 0.92
> Tasks: 177 total,   1 running, 176 sleeping,   0 stopped,   0 zombie
> Cpu(s): 13.5%us,  4.5%sy,  0.0%ni, 81.3%id,  0.2%wa,  0.0%hi,  0.1%si,  0.4%st
> Mem:  65974296k total, 22310524k used, 43663772k free,   112688k buffers
> Swap:        0k total,        0k used,        0k free, 13382460k cached
>
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
>  9295 wae   20   0 5212m 894m  12m S   62  1.4  22:50.99 java
>  9323 wae   20   0 5502m 894m  12m S   56  1.4  24:28.69 java
>  9353 wae   20   0 5072m 896m  12m S   54  1.4  17:04.31 java
>


Re: High CPU usage for idle kafka server

2015-06-05 Thread pundlik.anuja
There are no messages being sent or received. The system is idle.
However, there seems to be some GC going on in the Kafka broker, along with
some socket reads and writes. It is using approx. 500MB of memory.

On Fri, Jun 5, 2015 at 1:44 PM, pundlik.anuja 
wrote:

> Hi Otis,
> How do I check garbage collection on kafka broker?


Re: High CPU usage for idle kafka server

2015-06-05 Thread pundlik.anuja
Hi Otis,
How do I check garbage collection on kafka broker?
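
One way to look at GC activity from Java is the standard GarbageCollectorMXBean, sketched below. This is a minimal sketch that reads the collectors of the JVM it runs in; reading them from a running broker would require that broker to expose JMX remotely, which is an assumption here. The class name GcSnapshot is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: snapshots GC counters for the current JVM.
final class GcSnapshot {
    /** Maps collector name to {collection count, total collection time in ms}. */
    static Map<String, long[]> read() {
        Map<String, long[]> stats = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Either value may be -1 if the collector does not report it.
            stats.put(gc.getName(), new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return stats;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, long[]> e : read().entrySet()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    e.getKey(), e.getValue()[0], e.getValue()[1]);
        }
    }
}
```

Taking two snapshots a minute apart and subtracting them gives collections per minute and time spent in GC, which is what matters when judging whether GC explains the CPU usage.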


On Thu, Jun 4, 2015 at 1:24 PM, Otis Gospodnetic  wrote:

> How's their garbage collection doing?
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/


Query on kafka topic metadata

2015-06-05 Thread Pavan Chenduluru
Hi,

I am new to Kafka and I have a question.

How can I read statistics for a specific topic from the Kafka server?

I want to read the following parameters for an existing topic:

1) How many activeMessages
2) How many activeSubscriptions
3) How many totalMessages
4) How many totalSubscriptions
5) How many deliveryFaults
6) How many pendingDelivery

Please advise.

Thanks & Regards,
Pavan


Re: Consumer lag lies - orphaned offsets?

2015-06-05 Thread Joel Koshy
On Fri, Jun 05, 2015 at 12:53:00AM -0400, Otis Gospodnetić wrote:
> Hi Joel,
> 
> On Thu, Jun 4, 2015 at 8:52 PM, Joel Koshy  wrote:
> 
> > Hi Otis,
> >
> > Yes, this is a limitation in the old consumer, i.e., a number of
> > per-topic/partition mbeans remain even on a rebalance. Those need to
> > be de-registered. So if you stop consuming from some partition after a
> > rebalance, that partition's lag mbean currently remains registered,
> > which is why its value stays flat.  This is a known issue.
> >
> 
> I see.  Is / should this be considered a bug?  Something worth fixing for
> 0.8.3?

Yes I would call it a bug, but it hasn't been a high priority so far
mainly because (I think) most users monitor lag with committed
offsets. This is what we do at LinkedIn for instance as Todd mentioned
in his reply.

> 
> Also, you say this is the limitation of the old consumer.  Does that mean
> that this problem goes away completely if one uses the new consumer?

This is sort of n/a at the moment as per-partition lag has not been
added yet to the new consumer. It does have the equivalent of max-lag.
If we add per-partition lag sensors we would need to be able to remove
those sensors if applicable after a rebalance.

> 
> > On the restart, the lag goes down to zero because - well the mbeans
> > get recreated and the consumer starts fetching. If the fetch request
> > reads up to the end of the log then the mbean will report zero. Your
> > actual committed offset may be behind though which is why your true
> > lag is > 0.
> >
> > The lag mbeans are useful, but have a number of limitations - it
> > depends on active fetches in progress;
> 
> 
> What do you mean by this?

If the fetcher threads die for any reason then fetches stop and the
consumer continues to report lag off the last fetched offset and the
last reported log end offset. So it will stay flat when it should be
increasing (since the log end offset on the broker is increasing if
producers are still sending to that partition).
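
To make the arithmetic concrete, here is a small sketch of the two lag figures being contrasted. The offsets are made-up illustrative numbers and the class name LagMath is hypothetical; this is not the consumer's actual code.

```java
// Hypothetical illustration of fetch-based vs. commit-based lag.
final class LagMath {
    /** Lag is the distance from a consumer-side offset to the log end offset, never negative. */
    static long lag(long logEndOffset, long consumerOffset) {
        return Math.max(0L, logEndOffset - consumerOffset);
    }

    public static void main(String[] args) {
        // What the mbean sees after the fetcher dies: both values are frozen
        // at whatever the last fetch response reported.
        long lastReportedLogEnd = 1000L;
        long lastFetchedOffset = 1000L;

        // What is actually true on the broker and in the offset store.
        long currentLogEnd = 1500L;   // producers kept writing
        long committedOffset = 900L;  // the application only committed up to here

        System.out.println("mbean lag     = " + lag(lastReportedLogEnd, lastFetchedOffset));
        System.out.println("committed lag = " + lag(currentLogEnd, committedOffset));
    }
}
```

With these numbers the mbean figure stays at 0 while the lag computed from the committed offset and the current log end offset is 600, and that gap keeps growing as producers write.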

Also, the old consumer pre-fetches chunks and buffers these
internally.  If the chunk queue is full fetches stop; and if the
consumer is extremely slow in actually processing the messages off
each chunk then lag can stay flat (perhaps even at zero) until the
next chunk, while the consumer is iterating messages off the previous
chunk.

> 
> > it also does not exactly
> > correspond with your actual processed (and committed) offset.
> 
> Right.  Though it should be updated in near real-time, so it will
> approximately match the reality, no?

Yes - I think it is fair to say that in most cases the lag mbeans
should be accurate within a small delta of the true lag. Although we
are trying to avoid further non-critical development on the old
consumer it is convenient to have these mbeans. So I think it may be
worth fixing this issue (i.e., deregistering mbeans on a rebalance).
Can you file a jira for this?

Thanks,

Joel

> 
> Thanks,
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
> 
> 
> 
> > The most
> > reliable way to monitor application lag is to use the committed
> > offsets and the current log end offsets. Todd has been doing a lot of
> > interesting work in making lag monitoring less painful and can comment
> > more.
> >
> > Joel
> >
> > On Thu, Jun 04, 2015 at 04:55:44PM -0400, Otis Gospodnetić wrote:
> > > Hi,
> > >
> > > On Thu, Jun 4, 2015 at 4:26 PM, Scott Reynolds wrote:
> > >
> > > > I believe the JMX metrics reflect the consumer PRIOR to committing offsets
> > > > to Kafka / Zookeeper. But when you query from the command line using the
> > > > kafka tools, you are just getting the committed offsets.
> > > >
> > >
> > > Even if that were the case, and maybe it is, it doesn't explain why the
> > > ConsumerLag in JMX often remains *completely constant*... forever... until
> > > the consumer is restarted.  You see what I mean?
> > >
> > > Otis
> > > --
> > > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > > Solr & Elasticsearch Support * http://sematext.com/
> > >
> > >
> > >
> > > > On Thu, Jun 4, 2015 at 1:23 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Here's something potentially useful.
> > > > >
> > > > > 1) Before: https://apps.sematext.com/spm-reports/s/eQ9WhLegW9 - the
> > > > > "flat Lag situation"
> > > > >
> > > > > 2) I restarted the consumer whose lag is shown in the above graph
> > > > >
> > > > > 3) After restart: https://apps.sematext.com/spm-reports/s/4YGkcUP9ms -
> > > > > NO lag at all!?
> > > > >
> > > > > So that 81560 Lag value that was stuck in JMX is gone.  Went down to 0.
> > > > > Kind of makes sense - the whole consumer was restarted, consumer/java
> > > > > process was restarted, everything that was in JMX got reset, and if there
> > > > > is truly no consumer lag it makes sense that the values in JMX a

Re: Multiple instances of HL Consumer

2015-06-05 Thread Sharninder Khera
You can use the same consumer group id and Kafka will balance partitions across the 
two instances automatically. When one of them dies, the partitions are 
rebalanced and assigned to the remaining live consumers. 
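
A simplified sketch of how such a rebalance could divide partitions, assuming a range-style assignment over the sorted consumer threads of one group. This illustrates the idea only and is not Kafka's own code; the class name and the thread ids ("a-0", "b-0", ...) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of range-style partition assignment within one consumer group.
final class RangeAssignment {
    /** Splits partitions 0..numPartitions-1 into contiguous ranges over the sorted thread ids. */
    static Map<String, List<Integer>> assign(int numPartitions, List<String> consumerThreads) {
        List<String> sorted = new ArrayList<>(consumerThreads);
        Collections.sort(sorted);
        Map<String, List<Integer>> out = new LinkedHashMap<>();
        if (sorted.isEmpty()) {
            return out;
        }
        int perThread = numPartitions / sorted.size();
        int withExtra = numPartitions % sorted.size(); // the first `withExtra` threads get one more
        for (int i = 0; i < sorted.size(); i++) {
            int start = perThread * i + Math.min(i, withExtra);
            int count = perThread + (i < withExtra ? 1 : 0);
            List<Integer> parts = new ArrayList<>();
            for (int p = start; p < start + count; p++) {
                parts.add(p);
            }
            out.put(sorted.get(i), parts);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> both = new ArrayList<>();
        Collections.addAll(both, "a-0", "a-1", "b-0", "b-1");
        System.out.println("both instances up: " + assign(4, both));

        List<String> onlyA = new ArrayList<>();
        Collections.addAll(onlyA, "a-0", "a-1");
        System.out.println("instance b died:   " + assign(4, onlyA));
    }
}
```

Under this assumption each partition has exactly one owner in the group at a time, so the second instance does not receive the same messages again. It also suggests that when the group has more threads than partitions, the surplus threads sit idle until a rebalance hands them work.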



_
From: Panda, Samaresh 
Sent: Friday, June 5, 2015 7:42 pm
Subject: Multiple instances of HL Consumer
To:  


I've a HL consumer receiving messages using four threads (four partitions). 
This is a stand-alone Java client. For fail-safe reasons, I want to run another 
instance of the exact same Java client in a different box.

Here are my questions:

> Can I keep the same consumer group name, or must it be different for the 2nd 
> instance?
> If I use the same consumer group, will the 2nd client receive the same set of 
> messages again?
> In general, what's the best practice for designing fail-safe clients?

Thanks
Sam

Multiple instances of HL Consumer

2015-06-05 Thread Panda, Samaresh
I've a HL consumer receiving messages using four threads (four partitions). 
This is a stand-alone Java client. For fail-safe reasons, I want to run another 
instance of the exact same Java client in a different box.

Here are my questions:

> Can I keep the same consumer group name, or must it be different for the 2nd 
> instance?
> If I use the same consumer group, will the 2nd client receive the same set of 
> messages again?
> In general, what's the best practice for designing fail-safe clients?

Thanks
Sam