Jun,

I observed a similar kind of thing recently. (I didn't notice it before 
because our file descriptor limit is huge.)

I have a set of brokers in a datacenter, and producers in different data 
centers. 

At some point I got disconnections; from the producer's perspective there 
were something like 15 connections to the broker, while on the broker side 
I observed hundreds of connections from that producer in an ESTABLISHED 
state.

We had default settings for the socket timeout at the OS level, which we 
reduced in the hope of preventing the issue in the future. I'm not sure 
whether the issue comes from the broker or from the OS configuration, 
though. I'm keeping the broker under observation for the time being.
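For reference, the OS-level timers involved here are the TCP keepalive 
settings; the values below are purely illustrative, not the ones we use, 
and they only affect sockets that were opened with SO_KEEPALIVE:

```shell
# Inspect the current keepalive timers (Linux):
cat /proc/sys/net/ipv4/tcp_keepalive_time    # idle seconds before first probe (default 7200)
cat /proc/sys/net/ipv4/tcp_keepalive_intvl   # seconds between probes (default 75)
cat /proc/sys/net/ipv4/tcp_keepalive_probes  # failed probes before the peer is declared dead (default 9)

# Shortening them (illustrative values, not a recommendation) in /etc/sysctl.conf:
#   net.ipv4.tcp_keepalive_time = 600
#   net.ipv4.tcp_keepalive_intvl = 60
#   net.ipv4.tcp_keepalive_probes = 5
# then apply with `sysctl -p`.
```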

Note that for clients in the same datacenter we didn't see this issue; the 
socket count matches on both ends.
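One way to compare the two ends is to count ESTABLISHED sockets involving 
the broker port on each host. This is only a sketch: port 9092 and the 
`count_established` helper name are assumptions, so substitute your own 
listener port.

```shell
# Count ESTABLISHED TCP connections involving a given port, parsed from
# `netstat -tan` output (on Linux, column 6 is the state, columns 4 and 5
# are the local and remote addresses).
count_established() {
  netstat -tan 2>/dev/null | awk -v port=":$1" '
    $6 == "ESTABLISHED" && ($4 ~ port || $5 ~ port) { n++ }
    END { print n + 0 }'
}

# Run on the broker and on each producer host; if the broker's count is
# far higher than the sum of the clients' counts, the broker is likely
# holding sockets whose peers are long gone.
count_established 9092
```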

Nicolas Berthet 

-----Original Message-----
From: Jun Rao [mailto:jun...@gmail.com] 
Sent: Thursday, September 26, 2013 12:39 PM
To: users@kafka.apache.org
Subject: Re: Too many open files

If a client is gone, the broker should automatically close those broken 
sockets. Are you using a hardware load balancer?

Thanks,

Jun


On Wed, Sep 25, 2013 at 4:48 PM, Mark <static.void....@gmail.com> wrote:

> FYI if I kill all producers I don't see the number of open files drop. 
> I still see all the ESTABLISHED connections.
>
> Is there a broker setting to automatically kill any inactive TCP 
> connections?
>
>
> On Sep 25, 2013, at 4:30 PM, Mark <static.void....@gmail.com> wrote:
>
> > Any other ideas?
> >
> > On Sep 25, 2013, at 9:06 AM, Jun Rao <jun...@gmail.com> wrote:
> >
> >> We haven't seen any socket leaks with the java producer. If you have
> >> lots of unexplained socket connections in established mode, one
> >> possible cause is that the client created new producer instances, but
> >> didn't close the old ones.
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >>
> >> On Wed, Sep 25, 2013 at 6:08 AM, Mark <static.void....@gmail.com>
> >> wrote:
> >>
> >>> No. We are using the kafka-rb ruby gem producer.
> >>> https://github.com/acrosa/kafka-rb
> >>>
> >>> Now that you asked that question I need to ask. Is there a problem 
> >>> with the java producer?
> >>>
> >>> Sent from my iPhone
> >>>
> >>>> On Sep 24, 2013, at 9:01 PM, Jun Rao <jun...@gmail.com> wrote:
> >>>>
> >>>> Are you using the java producer client?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Jun
> >>>>
> >>>>
> >>>>> On Tue, Sep 24, 2013 at 5:33 PM, Mark <static.void....@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>> Our 0.7.2 Kafka cluster keeps crashing with:
> >>>>>
> >>>>> 2013-09-24 17:21:47,513 -  [kafka-acceptor:Acceptor@153] - Error 
> >>>>> in acceptor
> >>>>>      java.io.IOException: Too many open
> >>>>>
> >>>>> The obvious fix is to bump up the number of open files, but I'm
> >>>>> wondering if there is a leak on the Kafka side and/or our
> >>>>> application side. We currently have the ulimit set to a generous
> >>>>> 4096 but obviously we are hitting this ceiling. What's a
> >>>>> recommended value?
> >>>>>
> >>>>> We are running Rails and our Unicorn workers are connecting to our
> >>>>> Kafka cluster via round-robin load balancing. We have about 1500
> >>>>> workers, so that would be 1500 connections right there, but they
> >>>>> should be split across our 3 nodes. Instead netstat shows thousands
> >>>>> of connections that look like this:
> >>>>>
> >>>>> tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:22503    ESTABLISHED
> >>>>> tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:48398    ESTABLISHED
> >>>>> tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:29617    ESTABLISHED
> >>>>> tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:32444    ESTABLISHED
> >>>>> tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:34415    ESTABLISHED
> >>>>> tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:56901    ESTABLISHED
> >>>>> tcp        0      0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:45349    ESTABLISHED
> >>>>>
> >>>>> Has anyone come across this problem before? Is this a 0.7.2 
> >>>>> leak, LB misconfiguration... ?
> >>>>>
> >>>>> Thanks
> >>>
> >
>
>
