Public bug reported:

After LP: #600941 was pushed out, all of our systems started experiencing
Nagios nrpe restart failures.

Commands like

  /etc/init.d/nagios-nrpe-server restart

cause nrpe to stop but not start again.

I tracked this down to the way that the /etc/init.d/nagios-nrpe-server
script is calling start-stop-daemon.

The issue is that the "stop" stanza in the /etc/init.d/nagios-nrpe-server
script first calls start-stop-daemon, which sends SIGTERM to nrpe, and
then waits for only one second.

If nrpe has not exited by that time, the pid file will still exist and
the /etc/init.d/nagios-nrpe-server script will remove it, even though
the daemon is still running.

Worse, if /etc/init.d/nagios-nrpe-server restart is used, not only is
the pid file removed, but the attempt to start nrpe again also fails
whenever the old nrpe daemon is still shutting down.

The start attempt fails under those circumstances because the old nrpe
is still bound to its socket, so the new process cannot bind and aborts
at startup.
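
To illustrate why the start has to wait for the old process, here is a
minimal sketch of a polling helper. The name wait_for_exit and its
arguments are hypothetical and are not part of the packaged init
script; it only shows the idea of not starting a new nrpe until the old
pid is really gone.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the nagios-nrpe-server init script.
# Returns 0 once process $1 no longer exists, or 1 if it is still alive
# after roughly $2 seconds. Assumes the caller may signal the process
# (an init script normally runs as root).
wait_for_exit() {
    pid=$1
    remaining=$2
    # kill -0 sends no signal; it only checks that the pid can be signalled
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

A restart would only go on to start nrpe when the helper returns 0, so
the new process never races the old one for the socket.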

The authors of the LP: #600941 change should have wondered why there
was a comment about "sometimes the pid file does not get removed".

They should have tested on systems that have a heavy load and therefore
slow nrpe response times.

The fix is to add --retry 10 or similar to the
start-stop-daemon ... --stop ... invocation.
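
As a rough sketch of what that could look like (this is not the
forthcoming patch), the stop step might be written as below. The
function name stop_nrpe, the SSD override variable, and the DAEMON and
PIDFILE defaults are assumptions for illustration; the real script may
use different names and paths. With a plain timeout, --retry 10 makes
start-stop-daemon send SIGTERM, wait up to 10 seconds for the process
to exit, then send SIGKILL and wait up to 10 seconds more.

```shell
#!/bin/sh
# Assumed values; the packaged init script may use different names and paths.
DAEMON=${DAEMON:-/usr/sbin/nrpe}
PIDFILE=${PIDFILE:-/var/run/nagios/nrpe.pid}
# SSD is a hypothetical override so the command can be swapped out for testing.
SSD=${SSD:-start-stop-daemon}

stop_nrpe() {
    rc=0
    # --retry 10: TERM, wait up to 10 s, then KILL, wait up to 10 s.
    # --oknodo: exit 0 even if no matching process was running.
    "$SSD" --stop --quiet --oknodo --retry 10 \
        --pidfile "$PIDFILE" --exec "$DAEMON" || rc=$?
    # Remove the pid file only after the daemon is confirmed gone.
    if [ "$rc" -eq 0 ]; then
        rm -f "$PIDFILE"
    fi
    return "$rc"
}
```

Because start-stop-daemon exits non-zero when the process is still
running at the end of the retry schedule, a restart can refuse to start
a second nrpe instead of failing on the bind.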

Patch forthcoming, see

http://askubuntu.com/questions/82631/what-is-the-way-to-submit-a-patch-to-fix-all-the-damage-that-lp-600941-causes

See also

https://launchpad.net/~nutznboltz/+archive/nrpe-unbreak-lp-600941

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: nagios-nrpe-server (not installed)
ProcVersionSignature: Ubuntu 3.2.0-1.3-generic 3.2.0-rc2
Uname: Linux 3.2.0-1-generic i686
ApportVersion: 1.90-0ubuntu1
Architecture: i386
Date: Fri Nov 25 14:38:05 2011
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Release i386 (20111011)
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: nagios-nrpe
UpgradeStatus: No upgrade log present (probably fresh install)

** Affects: nagios-nrpe (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: apport-bug i386 precise running-unity

https://bugs.launchpad.net/bugs/896388

Title:
  Fix Regression Caused by LP: #600941
