Re: nagios and freebsd threads issue : help please ...

Christophe Yayon Sun, 21 Aug 2005 06:57:52 -0700

I have already asked them.
Here is a summary of our conversation (between me and other FreeBSD guys):


-------
The thread I started is here:

 http://marc.theaimsgroup.com/?t=111930118000001&r=1&w=2

 There are some very interesting replies, a few in particular note that
 Nagios may be breaking POSIX spec in how it spawns/destroys threads:

 http://marc.theaimsgroup.com/?l=freebsd-hackers&m=111944526323754&w=2
 http://marc.theaimsgroup.com/?l=freebsd-hackers&m=111945035012258&w=2

 Anyhow, I'm sure if Ethan were to post some more specific info to
 [EMAIL PROTECTED] (it's an open list, no need to sub), this
 issue could get banged out pretty quickly.

 Shortly after this thread, I found another where the issue was brought up
 by another curious poster, and he was using 5.4, which uses a newer
 threading library:

 http://marc.theaimsgroup.com/?t=112119712600002&r=1&w=2

 This post again brings up the "fork without exec or exit" possibly not
 following spec:

 http://marc.theaimsgroup.com/?l=freebsd-hackers&m=112125883804481&w=2

 "I don't know what Nagios does just after fork(2), it would be worth
 checking.  It appears that fork(2)ing without exec(2)ing or _exit(2)ing
 in a pthreaded program is not "valid" behaviour according to
 SUSv3 [1].  I don't want to avoid admitting there is a problem in
 the FreeBSD threading library, I don't know how other OSes handle this,
 but Nagios folks should really avoid doing what is explicitly
 discouraged in SUSv3."
--------
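
For reference, the pattern the quoted poster recommends (exec or _exit right after fork) can be sketched as below. This is only an illustration, not Nagios code: run_check() is a hypothetical helper name, and the choice of exit code 127 for a failed exec is an assumption borrowed from shell convention.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run the program argv[0] in a child process.
 * Returns the child's exit status, or -1 on failure. */
int run_check(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;                  /* fork failed */

    if (pid == 0) {
        /* Child: nothing but async-signal-safe calls from here on.
         * execve() and _exit() are both on the POSIX list. */
        execve(argv[0], argv, environ);
        _exit(127);                 /* exec failed; skip atexit handlers */
    }

    /* Parent: wait for the child, retrying if a signal interrupts us. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_check() with { "/bin/sh", "-c", "exit 3", NULL } would return 3. The child never touches malloc(), stdio or any mutex, so it does not matter which locks other threads held at the time of the fork.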


--------
As the problem isn't in Nagios and no one seems to have an authoritative
 answer on what exactly is causing it, I'd say you would be better off
 switching to a GNU/Linux system, with at least Linux 2.4.29 and
 glibc-2.3 (a lot of work was put into thread safety in glibc-2.3).
--------


--------
  From

http://www.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html

 "It is suggested that programs that use fork() call an exec function
 very soon afterwards in the child process, thus resetting all states. In
 the meantime, only a short list of async-signal-safe library routines
 are promised to be available."

 Note *suggested*. This is a recommendation to protect against a shoddy
 pthread-implementation. The thread specifications rule that only the
 thread calling fork() is duplicated, which initially leads to the
 recommendation (other threads holding locks aren't around to release
 them in the new execution context).

 That said, Nagios would most likely benefit greatly from a different
 means of checking things than fork()'ing twice and sending the results
 through several tiers of FIFO's. Several different methods have already
 been benchmarked. For server machines (or at least cans with a lot of
 memory and quite regularly multiple CPU's), the best way seems to be to
 create a new thread for each check to run. popen() causes a fork() and
 execve(), so that should be safe enough.

 What limits this imposes I don't know, but the NPTL library in use on
 most modern Linux systems today handles 10,000 threads without barfing,
 so the limit would probably be sysconf(_SC_OPEN_MAX), or ulimit -n,
 which is required by POSIX to be at least 256. Note that half this value
 (give or take 5 or so for stdin and such) represents the number of
 checks that can run simultaneously at any given time. When one of them
 completes another can kick in.
--------
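
The rule of thumb in that last message (half the descriptor limit, minus a handful for stdin and such) can be written down as follows. The function names are hypothetical, and the reserve of 5 is simply the figure the poster guesses at. The sysconf() name for the per-process open file limit is _SC_OPEN_MAX.

```c
#include <assert.h>
#include <unistd.h>

/* Rule of thumb from the quoted message: two descriptors per running
 * check, after setting a few aside for stdin, stdout, stderr, etc. */
long max_concurrent_checks(long open_max, long reserved)
{
    assert(reserved >= 0);
    if (open_max <= reserved)
        return 0;
    return (open_max - reserved) / 2;
}

/* Same estimate, using the limit of the running process.
 * Returns -1 if the limit is indeterminate or cannot be queried. */
long max_concurrent_checks_here(void)
{
    long open_max = sysconf(_SC_OPEN_MAX);
    if (open_max < 0)
        return -1;
    return max_concurrent_checks(open_max, 5);
}
```

With a limit of 256 and 5 descriptors reserved, this gives 125 simultaneous checks.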

In other words, some say this is a Nagios problem and others say it is a FreeBSD problem...




Daniel Eischen wrote:

On Sun, 21 Aug 2005, Christophe Yayon wrote:

Hi again,

I just upgraded again to FreeBSD 5.4-STABLE of August 20, and I just
killed a looping nagios process which was consuming 100% of CPU.
The problem still seems to persist.

What do you think about this?
Thanks in advance.



Go ask the nagios guys.  If they are doing things after a fork()
from a threaded application that are not allowed by POSIX, then
they need to address it.

They chose to quote a weak reference to the actual requirement.
The standard says (in the fork() section):

 A process shall be created with a single thread.  If a
 multi-threaded process calls fork(), the new process shall
 contain a replica of the calling thread and its entire address
 space, possibly including the states of mutexes and other
 resources.  Consequently, to avoid errors, the child process may
 only execute async-signal-safe operations until such time as one
 of the exec functions is called.  Fork handlers may be
 established by means of the pthread_atfork() function in order
 to maintain application invariants across fork() calls.
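
As an illustration of the last sentence of that passage, here is a minimal sketch of fork handlers protecting a single application mutex. The names state_lock, install_fork_handlers() and fork_and_touch_state() are invented for the example. It only covers locks the application itself owns; locks inside the C library (malloc, stdio) are not handled this way, so this does not lift the async-signal-safe restriction in general.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical application lock guarding some shared state. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

/* prepare: take the lock so no other thread holds it during fork() */
static void before_fork(void) { pthread_mutex_lock(&state_lock); }

/* parent and child: release it again on both sides of the fork */
static void after_fork(void)  { pthread_mutex_unlock(&state_lock); }

/* Register the handlers once, early in program start-up. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork, after_fork);
}

/* Fork, and have the child take state_lock. Without the handlers the
 * child could block forever if another thread held the lock at the
 * moment of the fork. Returns the child's exit status, or -1. */
int fork_and_touch_state(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        int rc = pthread_mutex_lock(&state_lock);
        if (rc == 0)
            pthread_mutex_unlock(&state_lock);
        _exit(rc == 0 ? 0 : 1);
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

After install_fork_handlers() succeeds, fork_and_touch_state() should return 0, since the child always starts with state_lock released.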



_______________________________________________
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
