Re: One of two consumers is always Idle though I have 2 partitions

2014-11-13 Thread Sharninder
If you're not using your own partitioning logic, messages are partitioned
randomly. This is the current default behavior I believe.


On Fri, Nov 14, 2014 at 12:01 PM, Palur Sandeep 
wrote:

> Thank you Chia-chun,Joe and Jagat.
>
> I am not using any custom partitioner logic. Here is what I observed when I
> ran kafka on 4 nodes with the following structure:
>
> 1. Each node has a producer, consumer and a broker (that contains one
> partition of my topic) and one of the machine has the Zookeeper too.
> 2. Producer in each node sends 1 messages to my topic.
> 3. I observed that consumer in all 4 nodes gets some messages and some
> times only 2 nodes receive messages and one doesn't and sometimes only one
> node receives messages and 3 doesnt receive any messages.
>
> So according to my observation, producer is sending messages to random
> partition.
>
> Am I correct?
>
> Thank you
> Sandeep



-- 
--
Sharninder


Re: Getting Simple consumer details using MBean

2014-11-13 Thread Madhukar Bharti
Hi Jun Rao,

Sorry to disturb you, but in my Kafka setup it is not showing. I am
attaching a screenshot taken from all brokers.

In kafka.consumer it lists only "ReplicaFetcherThread".

As I said earlier, I am using version "2.10-0.8.1.1". Do I need to configure
any extra parameter for this? I am simply using the same configuration as
described on the wiki page.



Thanks and Regards,
Madhukar


On Fri, Nov 14, 2014 at 1:17 AM, Jun Rao  wrote:

> I tried running kafka-simple-consumer-shell. I can see the following mbean.
>
>
> "kafka.consumer":type="FetchRequestAndResponseMetrics",name="SimpleConsumerShell-AllBrokersFetchRequestRateAndTimeMs"
>
> Thanks,
>
> Jun
>
> On Wed, Nov 12, 2014 at 9:57 PM, Madhukar Bharti  >
> wrote:
>
> > Hi Jun Rao,
> >
> > Thanks for your quick reply.
> >
> > I am not able to see any bean named "SimpleConsumer". Is there any
> > configuration related to this?
> >
> > How can I see this bean listed in the JConsole window?
> >
> >
> > Thanks and Regards
> > Madhukar
> >
> > On Thu, Nov 13, 2014 at 6:06 AM, Jun Rao  wrote:
> >
> > > Those are for 0.7. In 0.8, you should see sth
> > > like FetchRequestRateAndTimeMs in SimpleConsumer.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Nov 12, 2014 at 5:14 AM, Madhukar Bharti <
> > bhartimadhu...@gmail.com
> > > >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I want to get the simple consumer details using MBean as described here
> > > > <https://cwiki.apache.org/confluence/display/KAFKA/Operations#Operations-Monitoring>.
> > > > But these bean names are not showing in JConsole, nor when trying to
> > > > read from JMX.
> > > >
> > > > Please help me to get simple consumer details.
> > > >
> > > > I am using Kafka 0.8.1.1 version.
> > > >
> > > >
> > > > Thanks and Regards,
> > > > Madhukar Bharti
> > > >
> > >
> >
> >
> >
> > --
> > Thanks and Regards,
> > Madhukar Bharti
> > Mob: 7845755539
> >
>



-- 
Thanks and Regards,
Madhukar Bharti
Mob: 7845755539


Re: Location of Logging Files/How To Turn On Logging For Kafka Components

2014-11-13 Thread Alex Melville
Hi Jun,

These are the two lines of log4j-related warnings I get when I try to run
my producer:

log4j:WARN No appenders could be found for logger
(kafka.utils.VerifiableProperties).

log4j:WARN Please initialize the log4j system properly.


I have searched extensively online and have so far not found how to
"initialize the log4j system" properly. All I want is to create debug
logging so I can better find why my producer fails to send messages to the
broker cluster.



Alex
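
Those two warnings usually mean that log4j 1.x found no configuration on the
producer's classpath. A minimal sketch of a log4j.properties that sends DEBUG
output from the kafka.* loggers to the console is shown here; the file location
and the choice of a console appender are assumptions, not something stated in
this thread.

```properties
# Hypothetical log4j.properties for the producer application (log4j 1.x syntax).
log4j.rootLogger=INFO, stdout

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n

# Raise verbosity for Kafka client classes only.
log4j.logger.kafka=DEBUG
```

log4j 1.x normally picks the file up when it sits at the root of the classpath,
or when the JVM is started with
-Dlog4j.configuration=file:/path/to/log4j.properties (the path is a placeholder).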

On Thu, Nov 6, 2014 at 3:31 PM, Jun Rao  wrote:

> The log4j entries before that error should tell you the cause of the error.
>
> Thanks,
>
> Jun
>
> On Tue, Nov 4, 2014 at 11:25 PM, Alex Melville 
> wrote:
>
> > Background:
> >
> > I have searched for a while online, and through the files located in the
> > kafka/logs directory, trying to find where kafka writes log output to in
> > order to debug the SimpleProducer I wrote. My producer is almost identical
> > to the simple producer located here
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example
> >
> > except for I'm using Protobuffers instead of Strings to publish data to a
> > cluster. I'm receiving the following error when I try to run the
> > SimpleProducer
> >
> > Exception in thread "main" kafka.common.FailedToSendMessageException:
> > Failed to send messages after 3 tries.
> >     at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:90)
> >     at kafka.producer.Producer.send(Producer.scala:76)
> >     at kafka.javaapi.producer.Producer.send(Producer.scala:33)
> >     at stream.SimpleProducer.send(Unknown Source)
> >     at stream.SimpleProducer.main(Unknown Source)
> >
> >
> > I know this isn't a network problem, because I ran the console-producer and
> > successfully published data to the same broker that my Simple Producer is
> > trying to publish to. I now want to try to debug this error.
> >
> >
> >
> > Question:
> >
> > Where would my Simple Producer write info about its startup and eventual
> > error, such that I can read it and try to reason as to why it failed? If it
> > produces no log data on its own, what is the best way to write this data to
> > somewhere where I can use it to debug? I've noticed that log4j, which I
> > understand is an often-used library for logging in Java, came with my Kafka
> > download. Am I supposed to use log4j for this info? I do not know very much
> > about log4j, so any info on how to get this set up would also be
> > appreciated.
> >
> >
> > -Alex
> >
>


Re: One of two consumers is always Idle though I have 2 partitions

2014-11-13 Thread Palur Sandeep
Thank you Chia-Chun, Joe and Jagat.

I am not using any custom partitioner logic. Here is what I observed when I
ran kafka on 4 nodes with the following structure:

1. Each node has a producer, a consumer and a broker (that contains one
partition of my topic), and one of the machines also runs ZooKeeper.
2. Producer in each node sends 1 messages to my topic.
3. I observed that sometimes the consumers on all 4 nodes get some messages,
sometimes only 2 nodes receive messages and one doesn't, and sometimes only one
node receives messages and the other 3 don't receive any.

So, according to my observation, the producer is sending messages to a random
partition.

Am I correct?

Thank you
Sandeep






On Thu, Nov 13, 2014 at 9:34 PM, Joe Stein  wrote:

> Yup, sounds like
>
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified
> ?
>
> This should go away with 0.8.2 with the default partions now being 1 =8^)
> with auto create topics.
>
> /***
> Joe Stein
> Founder, Principal Consultant
> Big Data Open Source Security LLC
> http://www.stealth.ly
> Twitter: @allthingshadoop
> /



-- 
Regards,
Sandeep Palur
Data-Intensive Distributed Systems Laboratory, CS/IIT
Department of Computer Science, Illinois Institute of Technology (IIT)
Phone : 312-647-9833
Email : psand...@hawk.iit.edu 


Re: One of two consumers is always Idle though I have 2 partitions

2014-11-13 Thread Joe Stein
Yup, sounds like
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified
?

This should go away in 0.8.2, where the default number of partitions for
auto-created topics is now 1 =8^)

/***
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop
/
On Nov 13, 2014 8:34 PM, "Chia-Chun Shih"  wrote:

> Hi Palur,
>
> When producing messages, did you specify a key in your KeyedMessage? If
> not, producer will send all messages to ONE randomly selected partition and
> stick to this partition for 10 minutes by default.
>
> regards,
> Chia-Chun
>


Re: How to keep-alive connection from producer to broker?

2014-11-13 Thread Guozhang Wang
I think there are no docs yet for this feature since it is only included in
trunk (are you using a released version? If yes, then you should not hit
this), but here is the ticket you can take a look at:

https://issues.apache.org/jira/browse/KAFKA-1282
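
As a rough sketch of what the broker-side configuration could look like once
the change ships: later broker documentation exposes this idle timeout as
connections.max.idle.ms. That name and the default shown are taken from those
later docs rather than from this thread, so treat them as an assumption and
check them against your version.

```properties
# server.properties (broker side); setting name assumed from later releases.
# Default idle timeout: 10 minutes.
connections.max.idle.ms=600000

# To effectively disable idle-connection closing, use INT_MAX (about 24.8 days).
# connections.max.idle.ms=2147483647
```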

On Thu, Nov 13, 2014 at 6:37 PM, Kothakota, Neeraja (HP Software) <
neeraj...@hp.com> wrote:

> @Guozhang,
>
> Thanks for your reply.
> Is is available as part of broker configuration? If yes, how to configure.
>
> --Neeraja



-- 
-- Guozhang


RE: How to keep-alive connection from producer to broker?

2014-11-13 Thread Kothakota, Neeraja (HP Software)
@Guozhang,

Thanks for your reply.
Is it available as part of the broker configuration? If yes, how do I configure it?

--Neeraja


-Original Message-
From: Guozhang Wang [mailto:wangg...@gmail.com] 
Sent: Friday, November 14, 2014 12:02 AM
To: users@kafka.apache.org
Subject: Re: How to keep-alive connection from producer to broker?

Neeraja,

The producer does use keep-alive connections to the brokers, and a recent change
introduced in the broker will actively close a connection if it has not received
any requests from the producer for some time. The default period is 10 min; you
can set it to INT_MAX if you do not want this feature.

Guozhang

On Thu, Nov 13, 2014 at 12:09 AM, Kothakota, Neeraja (HP Software) < 
neeraj...@hp.com> wrote:

> Hi,
>
> I would like to know if there is a way/configuration to keep-alive 
> connections from producer/client to broker?
>
> I observed that connection is getting closed every time, which takes 
> considerable time.
>
> Can you please suggest ?
>
> Thanks & Regards,
> Neeraja
>



--
-- Guozhang


Re: Kafka to transport binary files

2014-11-13 Thread Jun Rao
Both the Kafka client and broker need to allocate memory for the whole
message. So, the larger the message, the more memory fragmentation it may
cause, which can lead to GC/OOME issues.
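
One way to keep each message small, sketched below in Python, is to split a
large file into bounded chunks before producing and to reassemble them on the
consumer side. This is an illustrative assumption rather than something
proposed in this thread: the 512 KB limit, the (file_id, index, total) header
and the helper names are all hypothetical, and no Kafka client is involved.

```python
import hashlib

MAX_CHUNK_BYTES = 512 * 1024  # hypothetical per-message payload limit


def split_into_chunks(payload: bytes, max_bytes: int = MAX_CHUNK_BYTES):
    """Return (file_id, index, total, chunk) tuples, each chunk <= max_bytes."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    # Content hash used as a grouping key for all chunks of one file.
    file_id = hashlib.sha1(payload).hexdigest()
    # Ceiling division; an empty payload still yields one (empty) chunk.
    total = max(1, -(-len(payload) // max_bytes))
    return [
        (file_id, i, total, payload[i * max_bytes:(i + 1) * max_bytes])
        for i in range(total)
    ]


def reassemble(chunks):
    """Rebuild the payload from the chunks of one file, received in any order."""
    ordered = sorted(chunks, key=lambda c: c[1])
    total = ordered[0][2]
    if [c[1] for c in ordered] != list(range(total)):
        raise ValueError("missing or duplicate chunk")
    return b"".join(c[3] for c in ordered)


if __name__ == "__main__":
    data = bytes(range(256)) * 20480  # 5 MB of sample bytes
    parts = split_into_chunks(data)
    print(len(parts), max(len(p[3]) for p in parts))
    assert reassemble(reversed(parts)) == data
```

If the file_id were used as the message key, all chunks of one file would go
to the same partition under key-based partitioning, so a single consumer would
see them together.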

Thanks,

Jun

On Wed, Nov 12, 2014 at 9:23 PM, Rohit Pujari 
wrote:

> I'm thinking of using Kafka for transporting binary files (tiff, jpeg,
> pdf). These files are anywhere between 10 KB and 5 MB. The thought behind
> considering Kafka is that it serves as a staging area for the files and
> facilitates asynchronous ingestion in near-real-time.
>
> Any thoughts on using Kafka for binary payloads? Any gotchas or things to watch out for?
>
> Thanks,
> Rohit Pujari
> Solutions Architect, Hortonworks
> rpuj...@hortonworks.com
>
>


Re: One of two consumers is always Idle though I have 2 partitions

2014-11-13 Thread Chia-Chun Shih
Hi Palur,

When producing messages, did you specify a key in your KeyedMessage? If
not, the producer will send all messages to ONE randomly selected partition and
stick to this partition for 10 minutes by default.
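
The difference can be illustrated with a small simulation. The sketch below is
plain Python and does not use any Kafka client; the function names, the CRC32
hash and the way the 10-minute window is modelled are assumptions chosen for
illustration, not the producer's actual code. As far as the FAQ linked in this
thread describes it, the window in the 0.8 producer follows
topic.metadata.refresh.interval.ms.

```python
import random
import zlib
from collections import Counter

STICKY_WINDOW_SECONDS = 600  # models the 10-minute default described above


def keyed_partition(key: str, num_partitions: int) -> int:
    """Pick a partition from a hash of the key (CRC32 here, as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class StickyRandomPartitioner:
    """Keyless behaviour: pick a random partition and keep it for a time window."""

    def __init__(self, num_partitions: int, rng: random.Random):
        self.num_partitions = num_partitions
        self.rng = rng
        self.current = None
        self.chosen_at = None

    def partition(self, now_seconds: float) -> int:
        expired = (
            self.chosen_at is None
            or now_seconds - self.chosen_at >= STICKY_WINDOW_SECONDS
        )
        if expired:
            # Re-pick only when the window has elapsed.
            self.current = self.rng.randrange(self.num_partitions)
            self.chosen_at = now_seconds
        return self.current


if __name__ == "__main__":
    partitions = 2
    sticky = StickyRandomPartitioner(partitions, random.Random(42))
    # 1000 keyless messages sent within one minute all land on one partition.
    keyless = Counter(sticky.partition(t * 0.06) for t in range(1000))
    # 1000 messages with distinct keys are spread across partitions by the hash.
    keyed = Counter(keyed_partition(f"msg-{i}", partitions) for i in range(1000))
    print("keyless:", dict(keyless))
    print("keyed:  ", dict(keyed))
```

With two partitions and one consumer per partition, the keyless case leaves one
consumer idle for the whole window, which matches the symptom described in this
thread.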

regards,
Chia-Chun

2014-11-14 7:19 GMT+08:00 Jagat Singh :

> It would be worth reading once the consumer section from the documentation.
>
> https://kafka.apache.org/documentation.html


Re: Broker keeps rebalancing

2014-11-13 Thread Jun Rao
Which version of ZK are you using?

Thanks,

Jun

On Thu, Nov 13, 2014 at 10:15 AM, Chen Wang 
wrote:

> Thanks for the info.
> It makes sense; however, I didn't see any "session timeout"/"expired"
> entries in the consumer log,
> but I do see lots of connection-closed entries in the ZooKeeper log:
>
> 2014-11-13 10:07:53,132 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.93.83.50:37180 which had sessionid 0x149a4cc1b580e7d
> 2014-11-13 10:08:04,499 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@197] - Accepted socket connection from /10.93.80.121:38437
> 2014-11-13 10:08:04,503 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:ZooKeeperServer@822] - Connection request from old client /10.93.80.121:38437; will be dropped if server is in r-o mode
> 2014-11-13 10:08:04,503 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:ZooKeeperServer@868] - Client attempting to establish new session at /10.93.80.121:38437
> 2014-11-13 10:08:04,538 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@617] - Established session 0x149a4cc1b580e7e with negotiated timeout 4 for client /10.93.80.121:38437
> 2014-11-13 10:08:08,746 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for client /10.93.80.121:38437 which had sessionid 0x149a4cc1b580e7e
>
> We are using -Xmx2048m for consumer, and I didn't see any GC related
> exceptions
>
> Chen
>
>
>
> On Thu, Nov 13, 2014 at 9:13 AM, Guozhang Wang  wrote:
>
> > Hey Chen,
> >
> > As Neha suggested, a typical reason for too many rebalances is that your
> > consumers keep timing out from ZK. You can verify this by checking your
> > consumer logs for something like "session timeout" entries (these are
> > not ERROR entries).
> >
> > Guozhang
> >
> > Guozhang
> >
> > On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede 
> > wrote:
> >
> > > Does this help?
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog
> > > ?
> > >
> > > On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang  >
> > > wrote:
> > >
> > > > Hi there,
> > > > My kafka client is reading a 3-partition topic from kafka with 3 threads
> > > > distributed on different machines. I am seeing frequent owner changes on
> > > > the topics when running:
> > > > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group my_test_group --topic mytopic -zkconnect localhost:2181
> > > >
> > > > The owner keeps changing once in a while, but I didn't see any exceptions
> > > > thrown from the consumer side. When checking the broker log, it's full of
> > > >  INFO Closing socket connection to /IP. (kafka.network.Processor)
> > > >
> > > > Is this expected behavior? If so, how can I tell when the leader is
> > > > imbalanced, and rebalance is triggered?
> > > > Thanks,
> > > > Chen
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>


log4j dir?

2014-11-13 Thread hsy...@gmail.com
Hi guys,

Just noticed that kafka.logs.dir in log4j.properties doesn't take effect.

It's always set to *$base_dir/logs* in kafka-run-class.sh:

LOG_DIR=$base_dir/logs
KAFKA_LOG4J_OPTS="-Dkafka.logs.dir=$LOG_DIR $KAFKA_LOG4J_OPTS"

Best,
Siyuan


Re: Is AWS Kinesis Kafka?

2014-11-13 Thread Jagat Singh
Do you think things will change now, with the Kafka makers setting up their own
company and providing commercial support?

On Fri, Nov 14, 2014 at 10:28 AM, cac...@gmail.com  wrote:

> Yeah the real question is really are the products built on top of Kafka
> (Kafka with a hat on). The last place I worked we ended up using Kinesis
> rather than Kafka basically for the reason Niek mentions, it seemed easier
> to accept the limitations and pay Amazon rather than run Kafka (small
> company <30 devs), and my current place (<10 people) is moving towards
> Azure Event Hubs (C#/Azure shop) for similar reasons.
>
> The Kafka producer and consumer code certainly seems way better than that
> for EventHubs and Kinesis (assuming you're in C# for Azure and Java for the
> others).
>
> Christian


Re: Is AWS Kinesis Kafka?

2014-11-13 Thread cac...@gmail.com
Yeah, the real question is whether the products are built on top of Kafka
(Kafka with a hat on). At the last place I worked we ended up using Kinesis
rather than Kafka, basically for the reason Niek mentions: it seemed easier
to accept the limitations and pay Amazon rather than run Kafka (small
company, <30 devs). My current place (<10 people) is moving towards
Azure Event Hubs (C#/Azure shop) for similar reasons.

The Kafka producer and consumer code certainly seems way better than that
for EventHubs and Kinesis (assuming you're in C# for Azure and Java for the
others).

Christian

On Thu, Nov 13, 2014 at 3:11 PM, Niek Sanders 
wrote:

> Many similarities.
>
> For Kinesis right now:
>
> * only a 1 day max retention
> * max 50KB message size
> * guaranteed throughput based on MB/sec in and out.
> * servers hosting the shards abstracted away by SaaS
>
> For collaborative consumption, Kinesis uses DynamoDB whereas Kafka
> uses Zookeeper.
>
> Until recently, the collaborative consumption library was Java only.
> They recently released a bridge daemon (MultiLangDaemon) which lets
> you use Python too.  I wrote a Golang client for using that same
> bridge daemon in about a day (https://github.com/nieksand/gokinesis).
>
> For handling the broker topology, you just write to the Kinesis API
> which takes care of the distribution to the appropriate shards
>
> Another downside on Kinesis is that it doesn't have Kafka's neat
> producer-side message batch compression.
>
> The most compelling use case for Kinesis right now is if you're and
> AWS shop and don't want to deal with setting up and maintaining a
> Kafka cluster.  And even then it's only applicable if you're use case
> fits inside the retention and message size limitations.
>
> Best,
> Niek
>
>
> On Thu, Nov 13, 2014 at 2:32 PM, Joseph Lawson 
> wrote:
> > Oh man they look similar.  Any comments?
>


Re: One of two consumers is always Idle though I have 2 partitions

2014-11-13 Thread Jagat Singh
It would be worth reading the consumer section of the documentation once.

https://kafka.apache.org/documentation.html



On Fri, Nov 14, 2014 at 10:09 AM, Palur Sandeep 
wrote:

> Yes, they are on the same consumer group, but I have two partitions.

Re: Is AWS Kinesis Kafka?

2014-11-13 Thread Niek Sanders
Many similarities.

For Kinesis right now:

* only a 1 day max retention
* max 50KB message size
* guaranteed throughput based on MB/sec in and out.
* servers hosting the shards abstracted away by SaaS

For collaborative consumption, Kinesis uses DynamoDB whereas Kafka
uses Zookeeper.

Until recently, the collaborative consumption library was Java only.
They recently released a bridge daemon (MultiLangDaemon) which lets
you use Python too.  I wrote a Golang client for using that same
bridge daemon in about a day (https://github.com/nieksand/gokinesis).

For handling the broker topology, you just write to the Kinesis API,
which takes care of the distribution to the appropriate shards.

Another downside of Kinesis is that it doesn't have Kafka's neat
producer-side message batch compression.

The most compelling use case for Kinesis right now is if you're an
AWS shop and don't want to deal with setting up and maintaining a
Kafka cluster.  And even then it's only applicable if your use case
fits inside the retention and message size limitations.

Best,
Niek


On Thu, Nov 13, 2014 at 2:32 PM, Joseph Lawson  wrote:
> Oh man they look similar.  Any comments?


Re: One of two consumers is always Idle though I have 2 partitions

2014-11-13 Thread Palur Sandeep
Yes, they are in the same consumer group, but I have two partitions.
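
For reference, a simplified sketch of range-style partition assignment
within one consumer group, written in Python rather than taken from the
Kafka source. The consumer names are hypothetical. It only illustrates
that with two partitions and two consumers in the same group, each
consumer would be expected to own one partition.

```python
def range_assign(partitions, consumers):
    """Split sorted partitions into contiguous ranges, one per sorted consumer.

    The first (len(partitions) % len(consumers)) consumers get one extra
    partition. This is a simplified model, not the actual Kafka code.
    """
    partitions = sorted(partitions)
    consumers = sorted(consumers)
    per_consumer, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    for i, consumer in enumerate(consumers):
        start = per_consumer * i + min(i, extra)
        count = per_consumer + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
    return assignment


# Two partitions, two consumers (hypothetical ids): one partition each.
print(range_assign([0, 1], ["consumer-node1", "consumer-node2"]))
```

If both consumers own a partition and one is still idle, the imbalance
is more likely on the producing side than in the group assignment.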

On Thu, Nov 13, 2014 at 5:04 PM, Jagat Singh  wrote:

> Are both of them in same Consumer Group?
>
> On Fri, Nov 14, 2014 at 9:12 AM, Palur Sandeep 
> wrote:
>
> > Dear Developers,
> >
> > I am 2nd year masters student at IIT. I am using Kafka for one of my
> > research projects.My question is the following:
> >
> > 1. I have a producer, consumer and a broker(that contains 1st partition
> of
> > my topic)  on node1
> > 2. I have a producer, consumer, zookeeper and a broker(that contains 2nd
> > partition of my topic)  on node2
> > 3. Here comes my problem: though I have two partitions only one consumer
> > pulls messages and the other one is always idle.
> >
> > What is that I can do to keep both of my consumer busy?
> >
> > Thank you
> >
> >
> > --
> > Regards,
> > Sandeep Palur
> > Data-Intensive Distributed Systems Laboratory, CS/IIT
> > Department of Computer Science, Illinois Institute of Technology (IIT)
> > Phone : 312-647-9833
> > Email : psand...@hawk.iit.edu 
> >
>



-- 
Regards,
Sandeep Palur
Data-Intensive Distributed Systems Laboratory, CS/IIT
Department of Computer Science, Illinois Institute of Technology (IIT)
Phone : 312-647-9833
Email : psand...@hawk.iit.edu 


Re: One of two consumers is always Idle though I have 2 partitions

2014-11-13 Thread Jagat Singh
Are both of them in the same consumer group?

On Fri, Nov 14, 2014 at 9:12 AM, Palur Sandeep 
wrote:

> Dear Developers,
>
> I am 2nd year masters student at IIT. I am using Kafka for one of my
> research projects.My question is the following:
>
> 1. I have a producer, consumer and a broker(that contains 1st partition of
> my topic)  on node1
> 2. I have a producer, consumer, zookeeper and a broker(that contains 2nd
> partition of my topic)  on node2
> 3. Here comes my problem: though I have two partitions only one consumer
> pulls messages and the other one is always idle.
>
> What is that I can do to keep both of my consumer busy?
>
> Thank you
>
>
> --
> Regards,
> Sandeep Palur
> Data-Intensive Distributed Systems Laboratory, CS/IIT
> Department of Computer Science, Illinois Institute of Technology (IIT)
> Phone : 312-647-9833
> Email : psand...@hawk.iit.edu 
>


Re: Is AWS Kinesis Kafka?

2014-11-13 Thread Joseph Lawson

Perhaps humanity just hit that inevitable point where we needed streaming event 
queues. Sort of like how Darwin and Alfred Russel Wallace both thought of 
evolution at the same time.

From: cac...@gmail.com 
Sent: Thursday, November 13, 2014 2:36:01 PM
To: users@kafka.apache.org
Subject: Re: Is AWS Kinesis Kafka?

I've wondered that about Azure Event Hubs as well. They both use a
different consumer offset tracking mechanism than the one in 0.8 for their
higher level consumers.

Christian

On Thu, Nov 13, 2014 at 2:32 PM, Joseph Lawson  wrote:

> Oh man they look similar.  Any comments?
>


Re: Is AWS Kinesis Kafka?

2014-11-13 Thread cac...@gmail.com
I've wondered that about Azure Event Hubs as well. They both use a
different consumer offset tracking mechanism than the one in 0.8 for their
higher level consumers.

Christian

On Thu, Nov 13, 2014 at 2:32 PM, Joseph Lawson  wrote:

> Oh man they look similar.  Any comments?
>


Is AWS Kinesis Kafka?

2014-11-13 Thread Joseph Lawson
Oh man they look similar.  Any comments?


One of two consumers is always Idle though I have 2 partitions

2014-11-13 Thread Palur Sandeep
Dear Developers,

I am a second-year master's student at IIT. I am using Kafka for one of my
research projects. My question is the following:

1. I have a producer, a consumer and a broker (that contains the 1st
partition of my topic) on node1.
2. I have a producer, a consumer, ZooKeeper and a broker (that contains the
2nd partition of my topic) on node2.
3. Here comes my problem: though I have two partitions, only one consumer
pulls messages and the other one is always idle.

What can I do to keep both of my consumers busy?

Thank you


-- 
Regards,
Sandeep Palur
Data-Intensive Distributed Systems Laboratory, CS/IIT
Department of Computer Science, Illinois Institute of Technology (IIT)
Phone : 312-647-9833
Email : psand...@hawk.iit.edu 


Re: Broker keeps rebalancing

2014-11-13 Thread Guozhang Wang
From your zk logs:

2014-11-13 10:07:53,132 [myid:1] - INFO  [NIOServerCxn.Factory: .. Closed
socket connection for
2014-11-13 10:08:08,746 [myid:1] - INFO  [NIOServerCxn.Factory: .. Closed
socket connection for

It does seem to kick out one consumer every 6ms, so you should probably
check your consumer GC logs to see whether the consumer pauses from time
to time.
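
To measure the gap between log entries like the two above, a small
Python sketch follows. The helper name is hypothetical; the timestamp
format is the one visible in the quoted ZooKeeper lines.

```python
from datetime import datetime

ZK_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # e.g. "2014-11-13 10:07:53,132"


def zk_interval_seconds(first, second):
    """Hypothetical helper: seconds elapsed between two ZooKeeper log timestamps."""
    t1 = datetime.strptime(first, ZK_TIME_FORMAT)
    t2 = datetime.strptime(second, ZK_TIME_FORMAT)
    return (t2 - t1).total_seconds()


# The two "Closed socket connection" entries quoted above.
print(zk_interval_seconds("2014-11-13 10:07:53,132", "2014-11-13 10:08:08,746"))
```

Running the same calculation over every "Closed socket connection" line
for a given client would show whether the disconnects are periodic.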

Guozhang

On Thu, Nov 13, 2014 at 11:01 AM, Chen Wang 
wrote:

> kafka.config.zookeeper.session.timeout.ms = 6
> kafka.config.rebalance.backoff.ms = 6000
> kafka.config.rebalance.max.retries = 6
>
> On Thu, Nov 13, 2014 at 10:56 AM, Guozhang Wang 
> wrote:
>
> > I was originally asking about consumer configs, which should contain the
> > following:
> >
> > http://kafka.apache.org/documentation.html#consumerconfigs
> >
> > zookeeper.session.timeout.ms
> > zookeeper.connection.timeout.ms
> >
> > On Thu, Nov 13, 2014 at 10:40 AM, Manish  wrote:
> >
> > > @Guozhang:
> > >
> > > In server.properties  we have :
> > >
> > > zookeeper.connection.timeout.ms=100
> > >
> > >
> > > In zoo.cfg we have
> > >
> > > tickTime=2000
> > >
> > > initLimit=10
> > >
> > > syncLimit=5
> > >
> > > dataDir=/opt/zookeeper/data
> > >
> > > dataLogDir=/opt/zookeeper/logs
> > >
> > > clientPort=2182
> > >
> > > server.1=.com:2888:3888
> > >
> > > server.2=.com:2888:3888
> > >
> > > server.3=.com:2888:3888
> > >
> > >
> > > On Thu, Nov 13, 2014 at 10:27 AM, Guozhang Wang 
> > > wrote:
> > >
> > > > Chen,
> > > >
> > > > From ZK logs it sounds like ZK kept timed out consumers which
> triggers
> > > > rebalance.
> > > >
> > > > What is the zk session timeout config value in your consumers?
> > > >
> > > > Guozhang
> > > >
> > > > On Thu, Nov 13, 2014 at 10:15 AM, Chen Wang <
> > chen.apache.s...@gmail.com>
> > > > wrote:
> > > >
> > > > > Thanks for the info.
> > > > > It makes sense, however, I didn't see any "session
> timeout"/"expired"
> > > > > entries in consumer log..
> > > > > but do see lots of connection closed entry in zookeeper log:
> > > > >
> > > > > 2014-11-13 10:07:53,132 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket
> connection
> > > for
> > > > > client /10.93.83.50:37180 which had sessionid 0x149a4cc1b580e7d
> > > > > 2014-11-13 10:08:04,499 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@197] - Accepted socket
> > > > > connection
> > > > > from /10.93.80.121:38437
> > > > > 2014-11-13 10:08:04,503 [myid:1] - WARN  [NIOServerCxn.Factory:
> > > > > 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@822] - Connection request
> from
> > > old
> > > > > client /10.93.80.121:38437; will be dropped if server is in r-o
> mode
> > > > > 2014-11-13 10:08:04,503 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > > > 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@868] - Client attempting to
> > > > establish
> > > > > new session at /10.93.80.121:38437
> > > > > 2014-11-13 10:08:04,538 [myid:1] - INFO
> > > > >  [CommitProcessor:1:ZooKeeperServer@617] - Established session
> > > > > 0x149a4cc1b580e7e with negotiated timeout 4 for client /
> > > > > 10.93.80.121:38437
> > > > > 2014-11-13 10:08:08,746 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket
> connection
> > > for
> > > > > client /10.93.80.121:38437 which had sessionid 0x149a4cc1b580e7e
> > > > >
> > > > > We are using -Xmx2048m for consumer, and I didn't see any GC
> related
> > > > > exceptions
> > > > >
> > > > > Chen
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Nov 13, 2014 at 9:13 AM, Guozhang Wang  >
> > > > wrote:
> > > > >
> > > > > > Hey Chen,
> > > > > >
> > > > > > As Neha suggested, typical reason of too many rebalances is that
> > your
> > > > > > consumers kept being timed out from ZK, and you can verify this
> by
> > > > > checking
> > > > > > in your consumer logs for sth. like "session timeout" entries
> > (these
> > > > are
> > > > > > not ERROR entries).
> > > > > >
> > > > > > Guozhang
> > > > > >
> > > > > > Guozhang
> > > > > >
> > > > > > On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede <
> > > > neha.narkh...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Does this help?
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog
> > > > > > > ?
> > > > > > >
> > > > > > > On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang <
> > > > chen.apache.s...@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi there,
> > > > > > > > My kafka client is reading a 3 partition topic from kafka
> with
> > 3
> > > > > > threads
> > > > > > > > distributed on different machines. I am seeing frequent owner
> > > > changes
> > > > > > on
> > > > > > > > the topics when running:
> > > > > > > > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker
> > --group
> > > > > > > > my_test_group --topic mytopic -zkconnect localhost:2181
> > 

Re: Getting Simple consumer details using MBean

2014-11-13 Thread Jun Rao
I tried running kafka-simple-consumer-shell. I can see the following mbean.

"kafka.consumer":type="FetchRequestAndResponseMetrics",name="SimpleConsumerShell-AllBrokersFetchRequestRateAndTimeMs"

Thanks,

Jun

On Wed, Nov 12, 2014 at 9:57 PM, Madhukar Bharti 
wrote:

> Hi Jun Rao,
>
> Thanks for your quick reply.
>
> I am not able to see this  any bean named as "SimpleConsumer". Is there any
> configuration related to this?
>
> How can I see this bean named listing in Jconsole window?
>
>
> Thanks and Regards
> Madhukar
>
> On Thu, Nov 13, 2014 at 6:06 AM, Jun Rao  wrote:
>
> > Those are for 0.7. In 0.8, you should see sth
> > like FetchRequestRateAndTimeMs in SimpleConsumer.
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Nov 12, 2014 at 5:14 AM, Madhukar Bharti <
> bhartimadhu...@gmail.com
> > >
> > wrote:
> >
> > > Hi,
> > >
> > > I want to get the simple consumer details using MBean as described here
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Operations#Operations-Monitoring
> > > >.
> > > But these bean names are not showing in JConsole as well as while
> trying
> > to
> > > read from JMX.
> > >
> > > Please help me to get simple consumer details.
> > >
> > > I am using Kafka 0.8.1.1 version.
> > >
> > >
> > > Thanks and Regards,
> > > Madhukar Bharti
> > >
> >
>
>
>
> --
> Thanks and Regards,
> Madhukar Bharti
> Mob: 7845755539
>


Re: Broker keeps rebalancing

2014-11-13 Thread Chen Wang
kafka.config.zookeeper.session.timeout.ms = 6
kafka.config.rebalance.backoff.ms = 6000
kafka.config.rebalance.max.retries = 6

On Thu, Nov 13, 2014 at 10:56 AM, Guozhang Wang  wrote:

> I was originally asking about consumer configs, which should contain the
> following:
>
> http://kafka.apache.org/documentation.html#consumerconfigs
>
> zookeeper.session.timeout.ms
> zookeeper.connection.timeout.ms
>
> On Thu, Nov 13, 2014 at 10:40 AM, Manish  wrote:
>
> > @Guozhang:
> >
> > In server.properties  we have :
> >
> > zookeeper.connection.timeout.ms=100
> >
> >
> > In zoo.cfg we have
> >
> > tickTime=2000
> >
> > initLimit=10
> >
> > syncLimit=5
> >
> > dataDir=/opt/zookeeper/data
> >
> > dataLogDir=/opt/zookeeper/logs
> >
> > clientPort=2182
> >
> > server.1=.com:2888:3888
> >
> > server.2=.com:2888:3888
> >
> > server.3=.com:2888:3888
> >
> >
> > On Thu, Nov 13, 2014 at 10:27 AM, Guozhang Wang 
> > wrote:
> >
> > > Chen,
> > >
> > > From ZK logs it sounds like ZK kept timed out consumers which triggers
> > > rebalance.
> > >
> > > What is the zk session timeout config value in your consumers?
> > >
> > > Guozhang
> > >
> > > On Thu, Nov 13, 2014 at 10:15 AM, Chen Wang <
> chen.apache.s...@gmail.com>
> > > wrote:
> > >
> > > > Thanks for the info.
> > > > It makes sense, however, I didn't see any "session timeout"/"expired"
> > > > entries in consumer log..
> > > > but do see lots of connection closed entry in zookeeper log:
> > > >
> > > > 2014-11-13 10:07:53,132 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection
> > for
> > > > client /10.93.83.50:37180 which had sessionid 0x149a4cc1b580e7d
> > > > 2014-11-13 10:08:04,499 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@197] - Accepted socket
> > > > connection
> > > > from /10.93.80.121:38437
> > > > 2014-11-13 10:08:04,503 [myid:1] - WARN  [NIOServerCxn.Factory:
> > > > 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@822] - Connection request from
> > old
> > > > client /10.93.80.121:38437; will be dropped if server is in r-o mode
> > > > 2014-11-13 10:08:04,503 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > > 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@868] - Client attempting to
> > > establish
> > > > new session at /10.93.80.121:38437
> > > > 2014-11-13 10:08:04,538 [myid:1] - INFO
> > > >  [CommitProcessor:1:ZooKeeperServer@617] - Established session
> > > > 0x149a4cc1b580e7e with negotiated timeout 4 for client /
> > > > 10.93.80.121:38437
> > > > 2014-11-13 10:08:08,746 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection
> > for
> > > > client /10.93.80.121:38437 which had sessionid 0x149a4cc1b580e7e
> > > >
> > > > We are using -Xmx2048m for consumer, and I didn't see any GC related
> > > > exceptions
> > > >
> > > > Chen
> > > >
> > > >
> > > >
> > > > On Thu, Nov 13, 2014 at 9:13 AM, Guozhang Wang 
> > > wrote:
> > > >
> > > > > Hey Chen,
> > > > >
> > > > > As Neha suggested, typical reason of too many rebalances is that
> your
> > > > > consumers kept being timed out from ZK, and you can verify this by
> > > > checking
> > > > > in your consumer logs for sth. like "session timeout" entries
> (these
> > > are
> > > > > not ERROR entries).
> > > > >
> > > > > Guozhang
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede <
> > > neha.narkh...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Does this help?
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog
> > > > > > ?
> > > > > >
> > > > > > On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang <
> > > chen.apache.s...@gmail.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi there,
> > > > > > > My kafka client is reading a 3 partition topic from kafka with
> 3
> > > > > threads
> > > > > > > distributed on different machines. I am seeing frequent owner
> > > changes
> > > > > on
> > > > > > > the topics when running:
> > > > > > > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker
> --group
> > > > > > > my_test_group --topic mytopic -zkconnect localhost:2181
> > > > > > >
> > > > > > > The owner kept changing once a while, but I didn't see any
> > > exceptions
> > > > > > > thrown from the consumer side. When checking broker log, its
> full
> > > of
> > > > > > >  INFO Closing socket connection to /IP.
> (kafka.network.Processor)
> > > > > > >
> > > > > > > Is this expected behavior? If so,  how can I tell when  the
> > leader
> > > is
> > > > > > > imbalanced, and rebalance is triggered?
> > > > > > > Thanks,
> > > > > > > Chen
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: Broker keeps rebalancing

2014-11-13 Thread Guozhang Wang
I was originally asking about consumer configs, which should contain the
following:

http://kafka.apache.org/documentation.html#consumerconfigs

zookeeper.session.timeout.ms
zookeeper.connection.timeout.ms

On Thu, Nov 13, 2014 at 10:40 AM, Manish  wrote:

> @Guozhang:
>
> In server.properties  we have :
>
> zookeeper.connection.timeout.ms=100
>
>
> In zoo.cfg we have
>
> tickTime=2000
>
> initLimit=10
>
> syncLimit=5
>
> dataDir=/opt/zookeeper/data
>
> dataLogDir=/opt/zookeeper/logs
>
> clientPort=2182
>
> server.1=.com:2888:3888
>
> server.2=.com:2888:3888
>
> server.3=.com:2888:3888
>
>
> On Thu, Nov 13, 2014 at 10:27 AM, Guozhang Wang 
> wrote:
>
> > Chen,
> >
> > From ZK logs it sounds like ZK kept timed out consumers which triggers
> > rebalance.
> >
> > What is the zk session timeout config value in your consumers?
> >
> > Guozhang
> >
> > On Thu, Nov 13, 2014 at 10:15 AM, Chen Wang 
> > wrote:
> >
> > > Thanks for the info.
> > > It makes sense, however, I didn't see any "session timeout"/"expired"
> > > entries in consumer log..
> > > but do see lots of connection closed entry in zookeeper log:
> > >
> > > 2014-11-13 10:07:53,132 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection
> for
> > > client /10.93.83.50:37180 which had sessionid 0x149a4cc1b580e7d
> > > 2014-11-13 10:08:04,499 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@197] - Accepted socket
> > > connection
> > > from /10.93.80.121:38437
> > > 2014-11-13 10:08:04,503 [myid:1] - WARN  [NIOServerCxn.Factory:
> > > 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@822] - Connection request from
> old
> > > client /10.93.80.121:38437; will be dropped if server is in r-o mode
> > > 2014-11-13 10:08:04,503 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@868] - Client attempting to
> > establish
> > > new session at /10.93.80.121:38437
> > > 2014-11-13 10:08:04,538 [myid:1] - INFO
> > >  [CommitProcessor:1:ZooKeeperServer@617] - Established session
> > > 0x149a4cc1b580e7e with negotiated timeout 4 for client /
> > > 10.93.80.121:38437
> > > 2014-11-13 10:08:08,746 [myid:1] - INFO  [NIOServerCxn.Factory:
> > > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection
> for
> > > client /10.93.80.121:38437 which had sessionid 0x149a4cc1b580e7e
> > >
> > > We are using -Xmx2048m for consumer, and I didn't see any GC related
> > > exceptions
> > >
> > > Chen
> > >
> > >
> > >
> > > On Thu, Nov 13, 2014 at 9:13 AM, Guozhang Wang 
> > wrote:
> > >
> > > > Hey Chen,
> > > >
> > > > As Neha suggested, typical reason of too many rebalances is that your
> > > > consumers kept being timed out from ZK, and you can verify this by
> > > checking
> > > > in your consumer logs for sth. like "session timeout" entries (these
> > are
> > > > not ERROR entries).
> > > >
> > > > Guozhang
> > > >
> > > > Guozhang
> > > >
> > > > On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede <
> > neha.narkh...@gmail.com>
> > > > wrote:
> > > >
> > > > > Does this help?
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog
> > > > > ?
> > > > >
> > > > > On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang <
> > chen.apache.s...@gmail.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > Hi there,
> > > > > > My kafka client is reading a 3 partition topic from kafka with 3
> > > > threads
> > > > > > distributed on different machines. I am seeing frequent owner
> > changes
> > > > on
> > > > > > the topics when running:
> > > > > > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group
> > > > > > my_test_group --topic mytopic -zkconnect localhost:2181
> > > > > >
> > > > > > The owner kept changing once a while, but I didn't see any
> > exceptions
> > > > > > thrown from the consumer side. When checking broker log, its full
> > of
> > > > > >  INFO Closing socket connection to /IP. (kafka.network.Processor)
> > > > > >
> > > > > > Is this expected behavior? If so,  how can I tell when  the
> leader
> > is
> > > > > > imbalanced, and rebalance is triggered?
> > > > > > Thanks,
> > > > > > Chen
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang


Re: Broker keeps rebalancing

2014-11-13 Thread Manish
@Guozhang:

In server.properties  we have :

zookeeper.connection.timeout.ms=100


In zoo.cfg we have

tickTime=2000

initLimit=10

syncLimit=5

dataDir=/opt/zookeeper/data

dataLogDir=/opt/zookeeper/logs

clientPort=2182

server.1=.com:2888:3888

server.2=.com:2888:3888

server.3=.com:2888:3888
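
A note on how tickTime interacts with client session timeouts: by
default the ZooKeeper server clamps the timeout a client requests to
between 2 x tickTime and 20 x tickTime (unless minSessionTimeout or
maxSessionTimeout are set in zoo.cfg). The Python sketch below models
that rule; the function name and the requested values are hypothetical
examples, not values taken from this cluster.

```python
def negotiated_session_timeout_ms(requested_ms, tick_time_ms,
                                  min_ms=None, max_ms=None):
    """Model of ZooKeeper's session timeout negotiation (a sketch only).

    Defaults: minimum is 2 * tickTime, maximum is 20 * tickTime.
    """
    lower = 2 * tick_time_ms if min_ms is None else min_ms
    upper = 20 * tick_time_ms if max_ms is None else max_ms
    return max(lower, min(requested_ms, upper))


# With tickTime=2000 as in the zoo.cfg above (requested values are examples):
print(negotiated_session_timeout_ms(6000, 2000))   # inside the allowed range
print(negotiated_session_timeout_ms(100, 2000))    # raised to 2 * tickTime
print(negotiated_session_timeout_ms(60000, 2000))  # capped at 20 * tickTime
```

The value the server actually granted is reported in its log as
"negotiated timeout" when the session is established.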


On Thu, Nov 13, 2014 at 10:27 AM, Guozhang Wang  wrote:

> Chen,
>
> From ZK logs it sounds like ZK kept timed out consumers which triggers
> rebalance.
>
> What is the zk session timeout config value in your consumers?
>
> Guozhang
>
> On Thu, Nov 13, 2014 at 10:15 AM, Chen Wang 
> wrote:
>
> > Thanks for the info.
> > It makes sense, however, I didn't see any "session timeout"/"expired"
> > entries in consumer log..
> > but do see lots of connection closed entry in zookeeper log:
> >
> > 2014-11-13 10:07:53,132 [myid:1] - INFO  [NIOServerCxn.Factory:
> > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
> > client /10.93.83.50:37180 which had sessionid 0x149a4cc1b580e7d
> > 2014-11-13 10:08:04,499 [myid:1] - INFO  [NIOServerCxn.Factory:
> > 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@197] - Accepted socket
> > connection
> > from /10.93.80.121:38437
> > 2014-11-13 10:08:04,503 [myid:1] - WARN  [NIOServerCxn.Factory:
> > 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@822] - Connection request from old
> > client /10.93.80.121:38437; will be dropped if server is in r-o mode
> > 2014-11-13 10:08:04,503 [myid:1] - INFO  [NIOServerCxn.Factory:
> > 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@868] - Client attempting to
> establish
> > new session at /10.93.80.121:38437
> > 2014-11-13 10:08:04,538 [myid:1] - INFO
> >  [CommitProcessor:1:ZooKeeperServer@617] - Established session
> > 0x149a4cc1b580e7e with negotiated timeout 4 for client /
> > 10.93.80.121:38437
> > 2014-11-13 10:08:08,746 [myid:1] - INFO  [NIOServerCxn.Factory:
> > 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
> > client /10.93.80.121:38437 which had sessionid 0x149a4cc1b580e7e
> >
> > We are using -Xmx2048m for consumer, and I didn't see any GC related
> > exceptions
> >
> > Chen
> >
> >
> >
> > On Thu, Nov 13, 2014 at 9:13 AM, Guozhang Wang 
> wrote:
> >
> > > Hey Chen,
> > >
> > > As Neha suggested, typical reason of too many rebalances is that your
> > > consumers kept being timed out from ZK, and you can verify this by
> > checking
> > > in your consumer logs for sth. like "session timeout" entries (these
> are
> > > not ERROR entries).
> > >
> > > Guozhang
> > >
> > > Guozhang
> > >
> > > On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede <
> neha.narkh...@gmail.com>
> > > wrote:
> > >
> > > > Does this help?
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog
> > > > ?
> > > >
> > > > On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang <
> chen.apache.s...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hi there,
> > > > > My kafka client is reading a 3 partition topic from kafka with 3
> > > threads
> > > > > distributed on different machines. I am seeing frequent owner
> changes
> > > on
> > > > > the topics when running:
> > > > > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group
> > > > > my_test_group --topic mytopic -zkconnect localhost:2181
> > > > >
> > > > > The owner kept changing once a while, but I didn't see any
> exceptions
> > > > > thrown from the consumer side. When checking broker log, its full
> of
> > > > >  INFO Closing socket connection to /IP. (kafka.network.Processor)
> > > > >
> > > > > Is this expected behavior? If so,  how can I tell when  the leader
> is
> > > > > imbalanced, and rebalance is triggered?
> > > > > Thanks,
> > > > > Chen
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: How to keep-alive connection from producer to broker?

2014-11-13 Thread Guozhang Wang
Neeraja,

The producer does use keep-alive connections to the brokers. A recent
broker change actively closes a connection if the broker has not received
any requests on it for some time. The default period is 10 minutes; you
can set it to INT_MAX if you do not want this feature.
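
A minimal Python sketch of the idle check described above, for
illustration only (it is not the broker's code, and the function name
is hypothetical). I believe the broker setting involved is
connections.max.idle.ms, but please confirm that against the
documentation for your version.

```python
INT_MAX = 2**31 - 1                 # Java Integer.MAX_VALUE
DEFAULT_MAX_IDLE_MS = 10 * 60 * 1000  # the 10 minute default mentioned above


def should_close(last_request_ms, now_ms, max_idle_ms=DEFAULT_MAX_IDLE_MS):
    """True once a connection has seen no request for max_idle_ms milliseconds."""
    return now_ms - last_request_ms >= max_idle_ms


# A producer that has been silent for 15 minutes:
fifteen_minutes = 15 * 60 * 1000
print(should_close(0, fifteen_minutes))           # closed with the default
print(should_close(0, fifteen_minutes, INT_MAX))  # kept open with INT_MAX
```

Assuming the unit is milliseconds, INT_MAX is just under 25 days, so
the feature is effectively off rather than strictly disabled.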

Guozhang

On Thu, Nov 13, 2014 at 12:09 AM, Kothakota, Neeraja (HP Software) <
neeraj...@hp.com> wrote:

> Hi,
>
> I would like to know if there is a way/configuration to keep-alive
> connections from producer/client to broker?
>
> I observed that connection is getting closed every time, which takes
> considerable time.
>
> Can you please suggest ?
>
> Thanks & Regards,
> Neeraja
>



-- 
-- Guozhang


Re: Broker keeps rebalancing

2014-11-13 Thread Guozhang Wang
Chen,

From the ZK logs, it sounds like ZK keeps timing out consumers, which
triggers rebalances.

What is the zk session timeout config value in your consumers?

Guozhang

On Thu, Nov 13, 2014 at 10:15 AM, Chen Wang 
wrote:

> Thanks for the info.
> It makes sense, however, I didn't see any "session timeout"/"expired"
> entries in consumer log..
> but do see lots of connection closed entry in zookeeper log:
>
> 2014-11-13 10:07:53,132 [myid:1] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
> client /10.93.83.50:37180 which had sessionid 0x149a4cc1b580e7d
> 2014-11-13 10:08:04,499 [myid:1] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@197] - Accepted socket
> connection
> from /10.93.80.121:38437
> 2014-11-13 10:08:04,503 [myid:1] - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@822] - Connection request from old
> client /10.93.80.121:38437; will be dropped if server is in r-o mode
> 2014-11-13 10:08:04,503 [myid:1] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:ZooKeeperServer@868] - Client attempting to establish
> new session at /10.93.80.121:38437
> 2014-11-13 10:08:04,538 [myid:1] - INFO
>  [CommitProcessor:1:ZooKeeperServer@617] - Established session
> 0x149a4cc1b580e7e with negotiated timeout 4 for client /
> 10.93.80.121:38437
> 2014-11-13 10:08:08,746 [myid:1] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
> client /10.93.80.121:38437 which had sessionid 0x149a4cc1b580e7e
>
> We are using -Xmx2048m for consumer, and I didn't see any GC related
> exceptions
>
> Chen
>
>
>
> On Thu, Nov 13, 2014 at 9:13 AM, Guozhang Wang  wrote:
>
> > Hey Chen,
> >
> > As Neha suggested, typical reason of too many rebalances is that your
> > consumers kept being timed out from ZK, and you can verify this by
> checking
> > in your consumer logs for sth. like "session timeout" entries (these are
> > not ERROR entries).
> >
> > Guozhang
> >
> > Guozhang
> >
> > On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede 
> > wrote:
> >
> > > Does this help?
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog
> > > ?
> > >
> > > On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang  >
> > > wrote:
> > >
> > > > Hi there,
> > > > My kafka client is reading a 3 partition topic from kafka with 3
> > threads
> > > > distributed on different machines. I am seeing frequent owner changes
> > on
> > > > the topics when running:
> > > > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group
> > > > my_test_group --topic mytopic -zkconnect localhost:2181
> > > >
> > > > The owner kept changing once a while, but I didn't see any exceptions
> > > > thrown from the consumer side. When checking broker log, its full of
> > > >  INFO Closing socket connection to /IP. (kafka.network.Processor)
> > > >
> > > > Is this expected behavior? If so,  how can I tell when  the leader is
> > > > imbalanced, and rebalance is triggered?
> > > > Thanks,
> > > > Chen
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang


Re: Broker keeps rebalancing

2014-11-13 Thread Neha Narkhede
@Neha, Can you share suggested consumer side GC settings?

Consumer side GC settings are not standard since it is a function of your
application that embeds the consumer. Your consumer application's memory
patterns will dictate your GC settings. Sorry, I know that's not very
helpful, but GC tuning is a dark art :-)

On Thu, Nov 13, 2014 at 9:13 AM, Guozhang Wang  wrote:

> Hey Chen,
>
> As Neha suggested, typical reason of too many rebalances is that your
> consumers kept being timed out from ZK, and you can verify this by checking
> in your consumer logs for sth. like "session timeout" entries (these are
> not ERROR entries).
>
> Guozhang
>
> Guozhang
>
> On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede 
> wrote:
>
> > Does this help?
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog
> > ?
> >
> > On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang 
> > wrote:
> >
> > > Hi there,
> > > My kafka client is reading a 3 partition topic from kafka with 3
> threads
> > > distributed on different machines. I am seeing frequent owner changes
> on
> > > the topics when running:
> > > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group
> > > my_test_group --topic mytopic -zkconnect localhost:2181
> > >
> > > The owner kept changing once a while, but I didn't see any exceptions
> > > thrown from the consumer side. When checking broker log, its full of
> > >  INFO Closing socket connection to /IP. (kafka.network.Processor)
> > >
> > > Is this expected behavior? If so,  how can I tell when  the leader is
> > > imbalanced, and rebalance is triggered?
> > > Thanks,
> > > Chen
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: Broker keeps rebalancing

2014-11-13 Thread Chen Wang
Thanks for the info.
It makes sense; however, I didn't see any "session timeout"/"expired"
entries in the consumer log, but I do see lots of "Closed socket connection"
entries in the ZooKeeper log:

2014-11-13 10:07:53,132 [myid:1] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
client /10.93.83.50:37180 which had sessionid 0x149a4cc1b580e7d
2014-11-13 10:08:04,499 [myid:1] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@197] - Accepted socket connection
from /10.93.80.121:38437
2014-11-13 10:08:04,503 [myid:1] - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:ZooKeeperServer@822] - Connection request from old
client /10.93.80.121:38437; will be dropped if server is in r-o mode
2014-11-13 10:08:04,503 [myid:1] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:ZooKeeperServer@868] - Client attempting to establish
new session at /10.93.80.121:38437
2014-11-13 10:08:04,538 [myid:1] - INFO
 [CommitProcessor:1:ZooKeeperServer@617] - Established session
0x149a4cc1b580e7e with negotiated timeout 4 for client /
10.93.80.121:38437
2014-11-13 10:08:08,746 [myid:1] - INFO  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1007] - Closed socket connection for
client /10.93.80.121:38437 which had sessionid 0x149a4cc1b580e7e

We are using -Xmx2048m for consumer, and I didn't see any GC related
exceptions

Chen



On Thu, Nov 13, 2014 at 9:13 AM, Guozhang Wang  wrote:

> Hey Chen,
>
> As Neha suggested, typical reason of too many rebalances is that your
> consumers kept being timed out from ZK, and you can verify this by checking
> in your consumer logs for sth. like "session timeout" entries (these are
> not ERROR entries).
>
> Guozhang
>
> Guozhang
>
> On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede 
> wrote:
>
> > Does this help?
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog
> > ?
> >
> > On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang 
> > wrote:
> >
> > > Hi there,
> > > My Kafka client is reading a 3-partition topic from Kafka with 3 threads
> > > distributed on different machines. I am seeing frequent owner changes on
> > > the topic when running:
> > > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group
> > > my_test_group --topic mytopic -zkconnect localhost:2181
> > >
> > > The owner keeps changing once in a while, but I didn't see any exceptions
> > > thrown from the consumer side. When checking the broker log, it's full of
> > >  INFO Closing socket connection to /IP. (kafka.network.Processor)
> > >
> > > Is this expected behavior? If so, how can I tell when the leader is
> > > imbalanced and a rebalance is triggered?
> > > Thanks,
> > > Chen
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: how to configure kafka over firewall

2014-11-13 Thread Manikumar Reddy
Kafka uses a binary protocol over TCP.
To allow Kafka traffic, you should open every broker's IP:port in the
firewall.
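
Before adjusting firewall rules, it can help to confirm from the producer
host that each broker's TCP port is reachable. The following is only a
minimal sketch and not something from this thread: the function name and the
host names in the usage comment are hypothetical, and 9092 is assumed as the
broker port (Kafka's usual default). A successful connect only shows that the
port is open; it does not prove that a protocol-inspecting firewall will let
Kafka requests through.

```python
import socket
from typing import Dict, Iterable, Tuple


def check_brokers(brokers: Iterable[Tuple[str, int]],
                  timeout: float = 3.0) -> Dict[str, bool]:
    """Try a plain TCP connect to each (host, port) and report reachability."""
    results = {}
    for host, port in brokers:
        key = f"{host}:{port}"
        try:
            # create_connection resolves the host and opens a TCP socket;
            # the "with" block closes it again immediately.
            with socket.create_connection((host, port), timeout=timeout):
                results[key] = True
        except OSError:
            # Covers refused connections, timeouts and DNS failures.
            results[key] = False
    return results


# Example usage (hypothetical hosts, assumed port):
# print(check_brokers([("broker1.example.com", 9092),
#                      ("broker2.example.com", 9092)]))
```

Producers connect directly to whichever broker leads each partition, so every
broker in the cluster needs to be checked, not just the one used for
bootstrapping.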

kumar

On Thu, Nov 13, 2014 at 9:47 PM, Kothakota, Neeraja (HP Software) <
neeraj...@hp.com> wrote:

> Hi,
>
> I would like to know how to configure Kafka to work through a firewall.
> We have a firewall that filters connections based on protocol, and there
> are security checks in place.
> Our 0.8.1 broker sits behind the firewall, so when producers send data to
> the broker, the traffic has to go through these checks.
> Could you suggest how to make sure the Kafka protocol passes these checks?
>
> Thanks & Regards,
> Neeraja
>


Re: Broker keeps rebalancing

2014-11-13 Thread Guozhang Wang
Hey Chen,

As Neha suggested, the typical reason for too many rebalances is that your
consumers keep timing out from ZK. You can verify this by checking your
consumer logs for something like "session timeout" entries (these are not
ERROR entries).
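
One way to run that check is to scan the consumer log for such entries at any
log level. The following is only a minimal sketch and not something from this
thread: the function name and the phrases it searches for are assumptions,
and the exact wording of the message depends on the ZooKeeper client version
in use.

```python
import re
from typing import Iterable, List, Sequence

# Assumed phrases; adjust them to match what your client actually logs.
DEFAULT_PATTERNS = ("session timeout", "session timed out", "session expired")


def find_session_timeouts(lines: Iterable[str],
                          patterns: Sequence[str] = DEFAULT_PATTERNS) -> List[str]:
    """Return the log lines that mention a ZK session timeout.

    The match is case-insensitive and ignores the log level, because these
    entries are not logged as ERROR.
    """
    regex = re.compile("|".join(re.escape(p) for p in patterns), re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if regex.search(line)]


# Example usage (hypothetical log path):
# with open("consumer.log", encoding="utf-8", errors="replace") as log:
#     for hit in find_session_timeouts(log):
#         print(hit)
```

If the timestamps of the hits line up with the owner changes, the ZK session
timeout is the likely trigger for the rebalances.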

Guozhang


On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede 
wrote:

> Does this help?
>
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog
> ?
>
> On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang 
> wrote:
>
> > Hi there,
> > My Kafka client is reading a 3-partition topic from Kafka with 3 threads
> > distributed on different machines. I am seeing frequent owner changes on
> > the topic when running:
> > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group
> > my_test_group --topic mytopic -zkconnect localhost:2181
> >
> > The owner keeps changing once in a while, but I didn't see any exceptions
> > thrown from the consumer side. When checking the broker log, it's full of
> >  INFO Closing socket connection to /IP. (kafka.network.Processor)
> >
> > Is this expected behavior? If so, how can I tell when the leader is
> > imbalanced and a rebalance is triggered?
> > Thanks,
> > Chen
> >
>



-- 
-- Guozhang


Re: Subscibe

2014-11-13 Thread Harsha
Rohit,
   Please send a mail to users-subscr...@kafka.apache.org.
More info is available at http://kafka.apache.org/contact.html
-Harsha

On Wed, Nov 12, 2014, at 09:20 PM, Rohit Pujari wrote:
> Hello there:
> 
> I'd like to be added to the mailing list
> 
> Thanks,
> -- 
> Rohit Pujari
> Solutions Engineer, Hortonworks
> rpuj...@hortonworks.com
> 716-430-6899
> 
> -- 
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to 
> which it is addressed and may contain information that is confidential, 
> privileged and exempt from disclosure under applicable law. If the reader 
> of this message is not the intended recipient, you are hereby notified
> that 
> any printing, copying, dissemination, distribution, disclosure or 
> forwarding of this communication is strictly prohibited. If you have 
> received this communication in error, please contact the sender
> immediately 
> and delete it from your system. Thank You.


how to configure kafka over firewall

2014-11-13 Thread Kothakota, Neeraja (HP Software)
Hi,

I would like to know how to configure Kafka to work through a firewall.
We have a firewall that filters connections based on protocol, and there
are security checks in place.
Our 0.8.1 broker sits behind the firewall, so when producers send data to
the broker, the traffic has to go through these checks.
Could you suggest how to make sure the Kafka protocol passes these checks?

Thanks & Regards,
Neeraja


Re: Indefinite growth of FetchRequestPurgatory

2014-11-13 Thread András Serény


Thanks a lot Guozhang, I've now upgraded to 0.8.2-beta and the issue 
seems to be gone.


András

On 11/3/2014 4:45 PM, Guozhang Wang wrote:

Hi Andras,

Could you try 0.8.2-beta and see if this issue comes up again? We fixed a
couple of the purgatory issues (e.g. KAFKA-1616) in 0.8.2, but I do not
remember any of them causing OOM.

Guozhang

On Mon, Nov 3, 2014 at 5:42 AM, András Serény 
wrote:


Hi Kafka users,

we're running a cluster of two Kafka 0.8.1.1 brokers, with twofold
replication of each topic.

When both brokers are up, after a short while the FetchRequestPurgatory
starts to grow indefinitely on the leader (detectable via a heap dump and
also via the "FetchRequestPurgatory"."PurgatorySize" JMX metric),
eventually leading to an OOM error. When one of the brokers is shut down,
the purgatory stops growing in size, and the remaining broker runs fine. In
https://issues.apache.org/jira/browse/KAFKA-1016, I see this can occur
when a fetcher specifies too large a max wait time, but we don't override
replica.fetch.wait.max.ms, leaving it at the default 500 ms.

Do you have any suggestions as to what the cause might be and how to fix it?

Thanks a lot,
András








Subscibe

2014-11-13 Thread Rohit Pujari
Hello there:

I'd like to be added to the mailing list

Thanks,
-- 
Rohit Pujari
Solutions Engineer, Hortonworks
rpuj...@hortonworks.com
716-430-6899



Kafka to transport binary files

2014-11-13 Thread Rohit Pujari
I'm thinking of using Kafka for transporting binary files (TIFF, JPEG,
PDF). These files are anywhere between 10 KB and 5 MB. The thought behind
considering Kafka is that it serves as a staging area for the files and
facilitates asynchronous ingestion in near real time.

Any thoughts on using Kafka for binary payloads? Gotchas, watch-outs?
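
One thing worth checking up front is whether the larger files fit under the
broker's message size limit. The following is only a minimal sketch and not
something from this thread: the function name and file names are
hypothetical, and the limit is an assumption based on the 0.8.x default for
message.max.bytes (about 1 MB); check the configuration of your own cluster.

```python
from typing import Dict, List

# Assumption: broker-side message.max.bytes left at its 0.8.x default.
ASSUMED_MESSAGE_MAX_BYTES = 1_000_000


def oversized_payloads(sizes: Dict[str, int],
                       limit: int = ASSUMED_MESSAGE_MAX_BYTES) -> List[str]:
    """Return the names of payloads whose size in bytes exceeds the limit."""
    return sorted(name for name, size in sizes.items() if size > limit)


# Example usage (hypothetical file names and sizes):
# sizes = {"scan.tiff": 5 * 1024 * 1024, "thumb.jpeg": 10 * 1024}
# print(oversized_payloads(sizes))
```

If the limit is raised on the broker, the consumer's fetch.message.max.bytes
and the broker's replica.fetch.max.bytes would need to be at least as large,
otherwise large messages can be written but not fetched or replicated.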

Thanks,
Rohit Pujari
Solutions Architect, Hortonworks
rpuj...@hortonworks.com



Re: Broker keeps rebalancing

2014-11-13 Thread Manish
@Neha, can you share the suggested consumer-side GC settings?

On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede 
wrote:

> Does this help?
>
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog
> ?
>
> On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang 
> wrote:
>
> > Hi there,
> > My Kafka client is reading a 3-partition topic from Kafka with 3 threads
> > distributed on different machines. I am seeing frequent owner changes on
> > the topic when running:
> > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group
> > my_test_group --topic mytopic -zkconnect localhost:2181
> >
> > The owner keeps changing once in a while, but I didn't see any exceptions
> > thrown from the consumer side. When checking the broker log, it's full of
> >  INFO Closing socket connection to /IP. (kafka.network.Processor)
> >
> > Is this expected behavior? If so, how can I tell when the leader is
> > imbalanced and a rebalance is triggered?
> > Thanks,
> > Chen
> >
>


Re: 0.8.2 producer with 0.8.1.1 cluster?

2014-11-13 Thread Shlomi Hazan
10x Christian

On Thu, Nov 13, 2014 at 9:50 AM, cac...@gmail.com  wrote:

> I used the 0.8.2 producer in a 0.8.1 cluster in a nonproduction
> environment. No problems to report; it worked great, but my testing at that
> time was not particularly extensive for failure scenarios.
>
> Christian
>
> On Wed, Nov 12, 2014 at 10:37 PM, Shlomi Hazan  wrote:
>
> > I was asking to know if there's a point in trying...
> > From your answer I understand the answer is yes.
> > 10x,
> > Shlomi
> >
> > On Wed, Nov 12, 2014 at 7:04 PM, Guozhang Wang 
> wrote:
> >
> > > Shlomi,
> > >
> > > > It should be compatible. Did you see any issues using it against a
> > > > 0.8.1.1 cluster?
> > >
> > > Guozhang
> > >
> > > On Wed, Nov 12, 2014 at 5:43 AM, Shlomi Hazan 
> wrote:
> > >
> > > > Hi,
> > > > > Is the new 0.8.2 producer supposed to work with a 0.8.1.1 cluster?
> > > > Shlomi
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>


How to keep-alive connection from producer to broker?

2014-11-13 Thread Kothakota, Neeraja (HP Software)
Hi,

I would like to know if there is a way or configuration to keep connections
from the producer/client to the broker alive.

I observed that the connection is closed every time, and re-establishing it
takes considerable time.

Could you please suggest something?

Thanks & Regards,
Neeraja