I debugged the problem with your patch. The output from the patched ldirectord is not that illuminating, though:

[Thu May 6 13:03:15 2010|www.229|5621] system(/sbin/ipvsadm -e -t 216.246.59.229:5222 -r 172.20.1.104:5222 -m -w 5) failed to execute: No child processes
[Thu May 6 13:03:15 2010|www.229|5621] Restored real server: 172.20.1.104:5222 (216.246.59.229:5222) (Weight set to 5)
[Thu May 6 13:03:15 2010|www.229|5621] emailalert: Restored real server: 172.20.1.104:5222 (216.246.59.229:5222) (Weight set to 5)
[Thu May 6 13:03:15 2010|www.229|5621] failed to send email message
[Thu May 6 13:03:15 2010|www.229|5621] system(/sbin/ipvsadm -e -t 216.246.59.229:8000 -r 172.20.1.104:8000 -m -w 5) failed to execute: No child processes
[Thu May 6 13:03:15 2010|www.229|5621] Restored real server: 172.20.1.104:8000 (216.246.59.229:8000) (Weight set to 5)
[Thu May 6 13:03:15 2010|www.229|5621] emailalert: Restored real server: 172.20.1.104:8000 (216.246.59.229:8000) (Weight set to 5)
[Thu May 6 13:03:15 2010|www.229|5621] failed to send email message

Note that the "failed to send email message" errors are incorrect as well. The messages are actually sent.

Sid

On Wed, May 5, 2010 at 1:27 AM, Simon Horman <[email protected]> wrote:
> On Tue, May 04, 2010 at 01:06:15PM -0700, Sid Stuart wrote:
> > We upgraded ldirectord last week to heartbeat-ldirectord.x86_64
> > 2.1.4-11.el5 from the Fedora EPEL repository. After the upgrade we started
> > seeing error messages like,
> >
> > [Tue May 4 11:09:36 2010|www.228|3608] system(/sbin/ipvsadm -e -t 216.246.59.228:80 -r 172.20.1.121:80 -m -w 20) failed: No child processes
> > [Tue May 4 11:09:36 2010|www.228|3608] Restored real server: 172.20.1.121:80 (216.246.59.228:80) (Weight set to 20)
> >
> > ipvsadm -l shows the operation worked, but the message is a concern. Does
> > anyone know of a fix?
>
> Hi,
>
> that is rather curious. Are you able to reproduce the problem?
> If so, could you try applying the following patch to ldirectord
> and see if it sheds any light on the problem?
>
> Index: agents/ldirectord/ldirectord.in
> ===================================================================
> --- agents.orig/ldirectord/ldirectord.in 2010-05-05 18:15:06.000000000 +1000
> +++ agents/ldirectord/ldirectord.in 2010-05-05 18:19:53.000000000 +1000
> @@ -819,6 +819,11 @@ use Sys::Hostname;
>  use POSIX qw(setsid :sys_wait_h);
>  use Sys::Syslog qw(:DEFAULT setlogsock);
>
> +use IPC::Open3;
> +use Symbol qw(gensym);
> +
> +$| = 1;
> +
>  BEGIN
>  {
>      # wrap exit() to preserve replacability
> @@ -4380,12 +4385,33 @@ sub system_wrapper
>  {
>      my (@args)=(@_);
>
> -    my $status;
> +    my $status, $pid, @log;
>
>      &ld_log("Running system(@args)") if $DEBUG>2;
> -    $status = system(@args);
> -    if($status != 0) {
> -        &ld_log("system(@args) failed: $!");
> +
> +
> +    my $pid = open3(gensym, ">&STDERR", \*PH, @args);
> +    while( <PH> ) {
> +        push @log, $_;
> +    }
> +    waitpid($pid, 0);
> +    $status = $?;
> +
> +    if($status == -1) {
> +        &ld_log("system(@args) failed to execute: $!");
> +    } elsif ($status & 127) {
> +        &ld_log("system(@args) died with signal " . ($status & 127) .
> +            ", " . ($status & 128) ? "with" : "without" .
> +            " coredump");
> +    } elsif ($status != 0) {
> +        &ld_log("system(@args) failed with exit-status " .
> +            ($status >> 8));
> +    }
> +
> +    if ($status != 0) {
> +        foreach (@log) {
> +            &ld_log("stdio/stderr: $_");
> +        }
>      }
>
>      return($status)
_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - [email protected]
Send requests to [email protected]
or go to http://lists.graemef.net/mailman/listinfo/lvs-users
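
For anyone following the status checks in the quoted patch, here is a rough standalone sketch of how Perl's raw wait status ($? after system() or waitpid()) is usually decoded. A value of -1 means the command could not be started or the wait itself failed, with the reason in $!; "No child processes" is the ECHILD text, which generally means there was no child left to wait for (for example because it had already been reaped elsewhere). Whether that is what happens inside ldirectord is not established in this thread. The helper name describe_wait_status is hypothetical and not part of ldirectord.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of ldirectord): turn a raw wait status,
# as found in $? after system() or waitpid(), into a readable message.
sub describe_wait_status {
    my ($status, $errno_text) = @_;

    # -1: the command could not be started, or the wait itself failed.
    # The reason is then in $! (passed in here as plain text).
    return "failed to execute or wait: $errno_text" if $status == -1;

    # Low 7 bits: number of the signal that killed the child, if any.
    # Bit 128: set if the child also dumped core.
    if (my $signal = $status & 127) {
        my $core = ($status & 128) ? "with" : "without";
        return "died with signal $signal, $core coredump";
    }

    # Otherwise the exit code is in the high byte.
    return "exited with status " . ($status >> 8);
}

print describe_wait_status(-1, "No child processes"), "\n";
print describe_wait_status(9, ""), "\n";
print describe_wait_status(256, ""), "\n";
```

The ternary is evaluated into its own variable before the message is built; written inline, it would need parentheses around the whole conditional, because "." binds more tightly than "?:" in Perl.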
