Hi Stefan, thanks for your report. I'm going to test this in the lab with
a view to applying the patch in the next release.

Thanks again and regards


2014-03-18 14:48 GMT+01:00 Stefan <[email protected]>:

> Hi Emilio,
>
> thanks for having a look at this.
>
> I did some more tests and research in the meantime.
>
> I was able to reproduce the problem using a simple shell script as the
> check command, which does nothing but sleep for longer than the
> timeout.
>
> I then tried to understand how farmguardian works. It is using
> $SIG{ALRM} for timeout handling when executing the check command.
> http://perldoc.perl.org/perlipc.html#Signals says "If the operation
> being timed out is system() or qx(), this technique is liable to
> generate zombies". I also read many forum postings from people having
> problems with zombies in Perl. I am not an expert, but as I understand
> it now, if the child process (check command) times out and the parent
> (farmguardian) continues, the child becomes a zombie once it exits
> unless the parent reads its status. So as a quick fix I modified the
> farmguardian script to use a non-blocking waitpid to read the child
> status when the check command times out. For whatever reason this
> still leaves me with one zombie per farm that has failing farmguardian
> checks, but more importantly they no longer keep adding up.
>
> This is how I modified farmguardian:
>
> use Proc::Daemon;
>
> changed to
>
> use Proc::Daemon;
> use POSIX qw(WNOHANG);
>
> and
>
>       if ($@){
>            warn "$command timed out.\n";
>            $timedout = 1;
>       }
>
> changed to
>
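>       if ($@){
>            warn "$command timed out.\n";
>            $timedout = 1;
>            # Reap any children that have already exited, without blocking.
>            # $kid is declared outside the block so the loop condition sees it.
>            my $kid;
>            do {
>              $kid = waitpid(-1, WNOHANG);
>              #print "pid $kid exited\n";
>            } while $kid > 0;
>       }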
>
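
A note on the lingering zombie: a non-blocking waitpid issued right after the
timeout will usually find the check command still running and return 0, so
that child is only reaped on a later pass. That may be why one zombie per farm
remains. A minimal sketch of an alternative follows, assuming the check can be
run via an explicit fork/exec so its pid is known. The helper name
run_with_timeout is hypothetical and not part of farmguardian, and passing the
command as a list (bypassing sh) is an assumption of this sketch.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper (not part of farmguardian): run a command with a
# timeout and always reap the child, so that no zombie is left behind.
sub run_with_timeout {
    my ($timeout, @cmd) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: replace this process with the check command.
        # A list argument to exec avoids an intermediate shell.
        exec(@cmd) or POSIX::_exit(127);
    }

    my ($timedout, $status) = (0, undef);
    eval {
        local $SIG{ALRM} = sub { die "alarm\n" };
        alarm($timeout);
        waitpid($pid, 0);    # blocks until the child exits or the alarm fires
        $status = $?;
        alarm(0);
    };
    if ($@) {
        die $@ unless $@ eq "alarm\n";    # propagate unexpected errors
        $timedout = 1;
        kill 'KILL', $pid;   # KILL cannot be ignored by the child
        waitpid($pid, 0);    # blocking reap, returns once the child is gone
        $status = $?;
    }
    return ($timedout, $status);
}

# Usage: a command that sleeps longer than the timeout, as in the
# reproduction described above.
my ($timedout, $status) = run_with_timeout(1, 'sleep', '5');
print "timedout: $timedout\n";
```

The blocking waitpid after the kill is what differs from the WNOHANG loop:
the parent does not continue until the child's status has been collected.
Whether killing the check on timeout is acceptable for farmguardian is a
separate design decision.
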
>
> I don't know if this would work for others but it seems to work for me
> right now and I hope it helps your lab study. I will post an update if
> anything worth mentioning changes.
>
> Kind Regards,
> Stefan
>
>
> On Thu, Feb 13, 2014 at 5:07 PM, Emilio Campos
> <[email protected]> wrote:
> > Hi Stefan, we have to set up a lab to study this behaviour. The
> > information you reported will help us.
> >
> > Thanks.
> >
> >
> > 2014-02-13 15:37 GMT+01:00 Stefan <[email protected]>:
> >>
> >> Please find a farmguardian log excerpt and additional info below.
> >>
> >> I have noticed that stopping the target service (e.g. Apache) does not
> >> seem to result in defunct processes. But they certainly appear when I
> >> disconnect the real server's virtual NIC for example.
> >>
> >> Kind Regards,
> >> Stefan
> >>
> >> Farmguardian Config:
> >> t-c-443: check_tcp -H HOST -p PORT -S -w 5 -c 5 -t 10
> >> t-n-80:  check_http -I HOST -p PORT -H 't-n.somedomain' -u '/abc/app' -w 5 -c 5 -t 10 -e '302 Found'
> >> t-n-443: check_http -I HOST -p PORT -S -H 't-n.somedomain' -u '/abc/app' -w 5 -c 5 -t 10 -e '200 OK'
> >> t-o-80:  check_http -I HOST -p PORT -H 't-o.somedomain' -w 5 -c 5 -t 10 -e '302 Found'
> >> t-o-443: check_http -I HOST -p PORT -S -H 't-o.somedomain' -w 5 -c 5 -t 10 -e '302 Found'
> >>
> >> Process list:
> >> ...
> >>     1  1342  1342  1342 ?           -1 Ss       0   0:00 /usr/local/zenloadbalancer/app/mini_httpd/mini_httpd -C /usr/local/zenloadbalancer/app/mini_httpd/mini_httpd.conf
> >>     1 27924 27924 27924 tty1      7057 Ss       0   0:00 /bin/login --
> >> 27924 21781 21781 27924 tty1      7057 S        0   0:00  \_ -bash
> >> 21781  7057  7057 27924 tty1      7057 R+       0   0:00      \_ ps axjf
> >>     1 28308 28085 28085 ?           -1 S        0   0:05 /usr/bin/perl /usr/local/zenloadbalancer/app/farmguardian/bin/farmguardian t-c-443 -l
> >>     1 28342 28085 28085 ?           -1 S        0   0:04 /usr/bin/perl /usr/local/zenloadbalancer/app/farmguardian/bin/farmguardian t-o-80 -l
> >> 28342  6507 28085 28085 ?           -1 Z        0   0:00  \_ [sh] <defunct>
> >>     1 28377 28085 28085 ?           -1 S        0   0:04 /usr/bin/perl /usr/local/zenloadbalancer/app/farmguardian/bin/farmguardian t-n-443 -l
> >> 28377  6512 28085 28085 ?           -1 Z        0   0:00  \_ [sh] <defunct>
> >>     1 28412 28085 28085 ?           -1 S        0   0:04 /usr/bin/perl /usr/local/zenloadbalancer/app/farmguardian/bin/farmguardian t-n-80 -l
> >> 28412  6506 28085 28085 ?           -1 Z        0   0:00  \_ [sh] <defunct>
> >>     1 28450 28085 28085 ?           -1 S        0   0:04 /usr/bin/perl /usr/local/zenloadbalancer/app/farmguardian/bin/farmguardian t-o-443 -l
> >> 28450  6500 28085 28085 ?           -1 Z        0   0:00  \_ [sh] <defunct>
> >>
> >> Farmguardian log excerpt t-o-80:
> >> The servers timeout is: 10
> >>     checking:
> >>         farmname: t-o-80
> >>         timeout: 10
> >>         blacklist:
> >>         timetocheck: 10
> >>         portadmin: -1
> >>         server[0]: a.b.234.20:80
> >>         server[1]: a.b.235.20:80
> >>         check: check_http -I HOST -p PORT -H 't-o.somedomain' -w 5 -c
> >> 5 -t 10 -e '302 Found'
> >>
> >> execution in Tue Feb  4 10:21:34 2014 ::
> >>         server[0]: a.b.234.20:80
> >> Backend status 0: up
> >> command: /usr/local/zenloadbalancer/app/libexec/check_http -I
> >> a.b.234.20 -p 80 -H 't-o.somedomain' -w 5 -c 5 -t 10 -e '302 Found'
> >> timedout: 0
> >> errorcode: 0
> >> No state changed for the backend.
> >>         server[1]: a.b.235.20:80
> >> Backend status 1: up
> >> command: /usr/local/zenloadbalancer/app/libexec/check_http -I
> >> a.b.235.20 -p 80 -H 't-o.somedomain' -w 5 -c 5 -t 10 -e '302 Found'
> >> timedout: 0
> >> errorcode: 0
> >> No state changed for the backend.
> >> The servers timeout is: 10
> >>     checking:
> >>         farmname: t-o-80
> >>         timeout: 10
> >>         blacklist:
> >>         timetocheck: 10
> >>         portadmin: -1
> >>         server[0]: a.b.234.20:80
> >>         server[1]: a.b.235.20:80
> >>         check: check_http -I HOST -p PORT -H 't-o.somedomain' -w 5 -c
> >> 5 -t 10 -e '302 Found'
> >>
> >> execution in Tue Feb  4 10:21:44 2014 ::
> >>         server[0]: a.b.234.20:80
> >> Backend status 0: up
> >> command: /usr/local/zenloadbalancer/app/libexec/check_http -I
> >> a.b.234.20 -p 80 -H 't-o.somedomain' -w 5 -c 5 -t 10 -e '302 Found'
> >> timedout: 0
> >> errorcode: 0
> >> No state changed for the backend.
> >>         server[1]: a.b.235.20:80
> >> Backend status 1: up
> >> command: /usr/local/zenloadbalancer/app/libexec/check_http -I
> >> a.b.235.20 -p 80 -H 't-o.somedomain' -w 5 -c 5 -t 10 -e '302 Found'
> >> /usr/local/zenloadbalancer/app/libexec/check_http -I a.b.235.20 -p 80
> >> -H 't-o.somedomain' -w 5 -c 5 -t 10 -e '302 Found' timed out.
> >> timedout: 1
> >> errorcode: 0
> >> **execution error in '
> >> /usr/local/zenloadbalancer/app/libexec/check_http -I a.b.235.20 -p 80
> >> -H 't-o.somedomain' -w 5 -c 5 -t 10 -e '302 Found' ', output::**
> >> The servers timeout is: 10
> >>     checking:
> >>         farmname: t-o-80
> >>         timeout: 10
> >>         blacklist:
> >>         timetocheck: 10
> >>         portadmin: -1
> >>         server[0]: a.b.234.20:80
> >>         server[1]: a.b.235.20:80
> >>         check: check_http -I HOST -p PORT -H 't-o.somedomain' -w 5 -c
> >> 5 -t 10 -e '302 Found'
> >>
> >> execution in Tue Feb  4 10:22:04 2014 ::
> >>         server[0]: a.b.234.20:80
> >> Backend status 0: up
> >> command: /usr/local/zenloadbalancer/app/libexec/check_http -I
> >> a.b.234.20 -p 80 -H 't-o.somedomain' -w 5 -c 5 -t 10 -e '302 Found'
> >> timedout: 0
> >> errorcode: 0
> >> No state changed for the backend.
> >>         server[1]: a.b.235.20:80
> >> Backend status 1: fgDOWN
> >> command: /usr/local/zenloadbalancer/app/libexec/check_http -I
> >> a.b.235.20 -p 80 -H 't-o.somedomain' -w 5 -c 5 -t 10 -e '302 Found'
> >> /usr/local/zenloadbalancer/app/libexec/check_http -I a.b.235.20 -p 80
> >> -H 't-o.somedomain' -w 5 -c 5 -t 10 -e '302 Found' timed out.
> >> timedout: 1
> >> errorcode: 0
> >> No state changed for the backend.
> >>
> >>
> >>
> >> _______________________________________________
> >> Zenloadbalancer-support mailing list
> >> [email protected]
> >> https://lists.sourceforge.net/lists/listinfo/zenloadbalancer-support
> >
> >
> >
> >
> > --
> > Load balancer distribution - Open Source Project
> > http://www.zenloadbalancer.com
> > Distribution list (subscribe): [email protected]
> >
> >
> >
>
>
>



-- 
Load balancer distribution - Open Source Project
http://www.zenloadbalancer.com
Distribution list (subscribe): [email protected]
_______________________________________________
Zenloadbalancer-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/zenloadbalancer-support
