RE: ntpd hangs under FBSD 8
>So this is arguably a Python bug. Did you contact anybody who cares about
>Python?

I did not, mainly because this link:

http://bugs.python.org/msg61870

seems to imply they are already aware of the problem. I agree it must be a Python bug though. It worked in 2.5.1 but not in 2.5.5 and later, so clearly they changed how processes are launched from threads, and that change led to this problem. One should not be forced to make explicit calls to change the signal mask in order to launch an external app. Granted, we've only had this issue with ntpd--other apps launch fine--but there is clearly something wrong somewhere for even one app to hang when it is spawned from a thread.
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
Re: ntpd hangs under FBSD 8
On Thu, Feb 25, 2010 at 04:26:22PM -0600, Peter Steele wrote:
> > We'll likely go with this solution instead of downgrading Python and the
> > related libraries.
>
> In fact I came up with another solution. I realized that since the problem
> was related to the process signal mask, instead of calling ntpd directly, I
> could wrap it up in a C app that resets the signal mask to something that
> works. I have the following code:
>
>     sigset_t set, oset;
>     sigemptyset(&set);
>     pthread_sigmask(SIG_SETMASK, &set, &oset);
>     system("/usr/sbin/ntpd -g -q");
>     pthread_sigmask(SIG_SETMASK, &oset, NULL);
>
> I wrapped this up into a standalone app and call this from Python instead of
> calling ntpd directly. This solved the problem--no more hang. Thanks very
> much to Kostik Belousov for his "wild guess" that this was related to the
> process signal mask. His guess was dead on.

So this is arguably a Python bug. Did you contact anybody who cares about Python?
RE: ntpd hangs under FBSD 8
> We'll likely go with this solution instead of downgrading Python and the
> related libraries.

In fact I came up with another solution. I realized that since the problem was related to the process signal mask, instead of calling ntpd directly, I could wrap it up in a C app that resets the signal mask to something that works. I have the following code:

    sigset_t set, oset;
    sigemptyset(&set);
    pthread_sigmask(SIG_SETMASK, &set, &oset);
    system("/usr/sbin/ntpd -g -q");
    pthread_sigmask(SIG_SETMASK, &oset, NULL);

I wrapped this up into a standalone app and call this from Python instead of calling ntpd directly. This solved the problem--no more hang. Thanks very much to Kostik Belousov for his "wild guess" that this was related to the process signal mask. His guess was dead on.
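For readers on a much newer Python than the 2.5/2.6 versions discussed in this thread, the same idea can be sketched without a separate C wrapper. This is only an illustration under stated assumptions: it needs Python 3.7 or later (signal.pthread_sigmask appeared in 3.3, the text= argument in 3.7), and the helper name run_with_empty_sigmask is made up here. It clears the signal mask in the child between fork() and exec(), which is what the C wrapper above does before calling system().

```python
import signal
import subprocess


def run_with_empty_sigmask(argv):
    """Run argv in a child process whose signal mask is cleared before exec.

    Hypothetical helper, not part of any library. Requires Python 3.7+.
    """
    def clear_mask():
        # Runs in the child after fork() and before exec(): unblock everything.
        signal.pthread_sigmask(signal.SIG_SETMASK, set())

    # The Python docs warn that preexec_fn is not safe in every threaded
    # program; treat this as a sketch rather than production code.
    return subprocess.run(argv, preexec_fn=clear_mask,
                          stdout=subprocess.PIPE, text=True)


# Possible use, mirroring the command from the thread:
#   run_with_empty_sigmask(["/usr/sbin/ntpd", "-g", "-q"])
```

Unlike the C wrapper, this only changes the mask in the child, so the calling thread's own mask is left alone and nothing needs to be restored afterwards.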
Re: ntpd hangs under FBSD 8
On Thu, Feb 25, 2010 at 09:57:48AM -0600, Peter Steele wrote:
> >Very wild guess, check the process signal mask of the child for both methods
> >of spawning.
>
> I'm running ntpd through Python. How do I check the process signal mask? I
> did some quick searches and it seems Python does not support sigprocmask().
>
> In my searches I came across this link:
>
> http://bugs.python.org/msg61870
>
> I think you might be right that this is related to the signal mask. In my
> scenario the select call is hanging indefinitely, just as discussed in that
> report.
>

Below is a quickly made patch that adds the ability to show signal disposition to procstat(1). I am not sure about duplicating the information about the catch/ignore state of each signal for all threads (this information is process-global), but I think this is more usable for scripts.

diff --git a/usr.bin/procstat/Makefile b/usr.bin/procstat/Makefile
index 1c187b0..251fc06 100644
--- a/usr.bin/procstat/Makefile
+++ b/usr.bin/procstat/Makefile
@@ -10,6 +10,7 @@ SRCS= procstat.c \
 		procstat_files.c \
 		procstat_kstack.c \
 		procstat_threads.c \
+		procstat_threads_sigs.c \
 		procstat_vm.c
 
 LDADD+=	-lutil
diff --git a/usr.bin/procstat/procstat.c b/usr.bin/procstat/procstat.c
index bc02682..cbd4eca 100644
--- a/usr.bin/procstat/procstat.c
+++ b/usr.bin/procstat/procstat.c
@@ -38,7 +38,7 @@
 
 #include "procstat.h"
 
-static int aflag, bflag, cflag, fflag, kflag, sflag, tflag, vflag;
+static int aflag, bflag, cflag, fflag, iflag, kflag, sflag, tflag, vflag;
 int	hflag;
 
 static void
@@ -46,7 +46,7 @@ usage(void)
 {
 
 	fprintf(stderr, "usage: procstat [-h] [-w interval] [-b | -c | -f | "
-	    "-k | -s | -t | -v]\n");
+	    "-i | -k | -s | -t | -v]\n");
 	fprintf(stderr, "[-a | pid ...]\n");
 	exit(EX_USAGE);
 }
@@ -61,6 +61,8 @@ procstat(pid_t pid, struct kinfo_proc *kipp)
 		procstat_args(pid, kipp);
 	else if (fflag)
 		procstat_files(pid, kipp);
+	else if (iflag)
+		procstat_threads_sigs(pid, kipp);
 	else if (kflag)
 		procstat_kstack(pid, kipp, kflag);
 	else if (sflag)
@@ -109,7 +111,7 @@ main(int argc, char *argv[])
 	char *dummy;
 
 	interval = 0;
-	while ((ch = getopt(argc, argv, "abcfkhstvw:")) != -1) {
+	while ((ch = getopt(argc, argv, "abcfikhstvw:")) != -1) {
 		switch (ch) {
 		case 'a':
 			aflag++;
@@ -127,6 +129,10 @@ main(int argc, char *argv[])
 			fflag++;
 			break;
 
+		case 'i':
+			iflag++;
+			break;
+
 		case 'k':
 			kflag++;
 			break;
diff --git a/usr.bin/procstat/procstat.h b/usr.bin/procstat/procstat.h
index 8bacab7..10f8fce 100644
--- a/usr.bin/procstat/procstat.h
+++ b/usr.bin/procstat/procstat.h
@@ -41,6 +41,7 @@ void procstat_cred(pid_t pid, struct kinfo_proc *kipp);
 void	procstat_files(pid_t pid, struct kinfo_proc *kipp);
 void	procstat_kstack(pid_t pid, struct kinfo_proc *kipp, int kflag);
 void	procstat_threads(pid_t pid, struct kinfo_proc *kipp);
+void	procstat_threads_sigs(pid_t pid, struct kinfo_proc *kipp);
 void	procstat_vm(pid_t pid, struct kinfo_proc *kipp);
 
 #endif /* !PROCSTAT_H */
diff --git a/usr.bin/procstat/procstat_threads_sigs.c b/usr.bin/procstat/procstat_threads_sigs.c
new file mode 100644
index 000..814f0c4
--- /dev/null
+++ b/usr.bin/procstat/procstat_threads_sigs.c
@@ -0,0 +1,111 @@
+/*-
+ * Copyright (c) 2007 Robert N. M. Watson
+ * Copyright (c) 2010 Konstantin Belousov
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARI
RE: ntpd hangs under FBSD 8
>Very wild guess, check the process signal mask of the child for both methods
>of spawning.

I'm running ntpd through Python. How do I check the process signal mask? I did some quick searches and it seems Python does not support sigprocmask().

In my searches I came across this link:

http://bugs.python.org/msg61870

I think you might be right that this is related to the signal mask. In my scenario the select call is hanging indefinitely, just as discussed in that report.
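On the question of how to check the mask from Python: the Python versions in this thread have no way to do it, but from Python 3.3 on, signal.pthread_sigmask(signal.SIG_BLOCK, []) returns the calling thread's current mask without changing it. The sketch below is an illustration only, not what was run in this thread. It assumes Python 3.7+, and the helper names child_blocked_signals and child_blocked_signals_from_thread are invented for the example. It spawns a small probe child and reports which signals that child inherited as blocked, once from the calling thread and once from a worker thread.

```python
import json
import signal
import subprocess
import sys
import threading

# Program run in the child: print the numbers of the signals that are
# blocked at startup, i.e. the mask inherited across fork()/exec().
PROBE = ("import json, signal; "
         "print(json.dumps(sorted(int(s) for s in "
         "signal.pthread_sigmask(signal.SIG_BLOCK, []))))")


def child_blocked_signals():
    """Spawn a probe child from the calling thread; return its blocked signals."""
    out = subprocess.run([sys.executable, "-c", PROBE],
                         stdout=subprocess.PIPE, text=True, check=True).stdout
    return json.loads(out)


def child_blocked_signals_from_thread(block=()):
    """Same probe, spawned from a worker thread that first blocks `block`."""
    result = []

    def worker():
        if block:
            # Affects only this worker thread's mask.
            signal.pthread_sigmask(signal.SIG_BLOCK, block)
        result.append(child_blocked_signals())

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result[0]
```

Comparing child_blocked_signals() with child_blocked_signals_from_thread() is the "both methods of spawning" check suggested above: any signal the worker thread has blocked shows up in the second list, because a child starts with the mask of the thread that created it.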
Re: ntpd hangs under FBSD 8
On Thu, Feb 25, 2010 at 08:12:05AM -0600, Peter Steele wrote:
> >I don't think the problem is in ntpd, since I use ntpdate. About 50% of the
> >time, when it is run from a startup script, it hangs along with the kernel.
> >Ctrl+C doesn't work and the kernel doesn't answer pings; it just freezes.
> >The problem is somewhere in the kernel, maybe in the subsystems that set the
> >new time, maybe in the network (UDP) parts.
> >This problem doesn't affect other programs, so I think it is in the
> >time-handling code.
>
> I think you may be describing a different problem. For one thing, we don't
> use ntpdate, we use the "ntpd -g -q" alternative. Secondly, for us ntpd is
> hanging 100% of the time when run via a Python thread class. The exception is
> Python 2.5.1; this succeeds 100% of the time.
>
> >Peter, what platform do you use? I use MIPS BCM5354.
>
> We have a variety of 1U and 3U boxes. They all hang the same way.

Very wild guess, check the process signal mask of the child for both methods of spawning.
RE: ntpd hangs under FBSD 8
>I don't think the problem is in ntpd, since I use ntpdate. About 50% of the
>time, when it is run from a startup script, it hangs along with the kernel.
>Ctrl+C doesn't work and the kernel doesn't answer pings; it just freezes.
>The problem is somewhere in the kernel, maybe in the subsystems that set the
>new time, maybe in the network (UDP) parts.
>This problem doesn't affect other programs, so I think it is in the
>time-handling code.

I think you may be describing a different problem. For one thing, we don't use ntpdate, we use the "ntpd -g -q" alternative. Secondly, for us ntpd is hanging 100% of the time when run via a Python thread class. The exception is Python 2.5.1; this succeeds 100% of the time.

>Peter, what platform do you use? I use MIPS BCM5354.

We have a variety of 1U and 3U boxes. They all hang the same way.
Re: ntpd hangs under FBSD 8
Alexey Shuvaev writes:
> The flag you should look at is '-g'. GCC supports debugging symbols
> together with -O2 optimizations.

It is generally not a good idea to use -O2 for debugging versions, since gcc will optimize away many local variables.

DES
--
Dag-Erling Smørgrav - d...@des.no
Re: ntpd hangs under FBSD 8
On Wed, Feb 24, 2010 at 03:56:35PM -0600, Peter Steele wrote:
> >> How do I get libc built with full debug symbols?
> >
> >I haven't tried it myself, but I think this is the way to go: put the
> >following in /etc/make.conf and recompile the needed libraries / ports.
> >WITH_DEBUG=yes
> >DEBUG_FLAGS=-g
>
> That didn't seem to have any effect. I still see -O2 being used
> instead of -O0.
>

The flag you should look at is '-g'. GCC supports debugging symbols together with -O2 optimizations. Others have posted suggestions on how to build libraries with debugging symbols, and they go in the same direction. However, with the above variables in make.conf you do not need to remember all the places where you have to put DEBUG_FLAGS=-g on the command line. The normal buildworld and buildkernel targets will do the right thing. That is, you will get the complete base system with debug symbols. The variable WITH_DEBUG=yes is for the software from ports. Just FYI.

> >Mmm... Do other daemons (sshd, lpd, ...) also fail when started
> >through this script? Normal commands (ls, ps) do not seem to be affected.
>
> I tried a few other things and they all seemed to run correctly.
> We use this same general approach in the full version of this script
> to launch lots of applications. Its role in fact is a process
> launcher/monitor. I stripped it down to the bare minimum in order
> to isolate the cause of the problem. It seems that only ntpd hangs,
> but not if I use Python 2.5.1.
>
Re: ntpd hangs under FBSD 8
On Wed, 24 Feb 2010 15:56:35 -0600 Peter Steele wrote:
>> >> How do I get libc built with full debug symbols?
>> >
>> >I haven't tried it myself, but I think this is the way to go: put the
>> >following in /etc/make.conf and recompile the needed libraries / ports.
>> >WITH_DEBUG=yes
>> >DEBUG_FLAGS=-g
>>
>> That didn't seem to have any effect. I still see -O2 being used instead of
>> -O0.

Try to use

make DEBUG_FLAGS=-g WITH_DEBUG=yes buildworld
make DEBUG_FLAGS=-g WITH_DEBUG=yes buildkernel

>>
>> >Mmm... Do other daemons (sshd, lpd, ...) also fail when started through
>> >this script? Normal commands (ls, ps) do not seem to be affected.
>>
>> I tried a few other things and they all seemed to run correctly. We use this
>> same general approach in the full version of this script to launch lots of
>> applications. Its role in fact is a process launcher/monitor. I stripped it
>> down to the bare minimum in order to isolate the cause of the problem. It
>> seems that only ntpd hangs, but not if I use Python 2.5.1.
>>

I don't think the problem is in ntpd, since I use ntpdate. About 50% of the time, when it is run from a startup script, it hangs along with the kernel. Ctrl+C doesn't work and the kernel doesn't answer pings; it just freezes. The problem is somewhere in the kernel, maybe in the subsystems that set the new time, maybe in the network (UDP) parts. This problem doesn't affect other programs, so I think it is in the time-handling code.

Peter, what platform do you use? I use MIPS BCM5354.

--
Alexandr Rybalko aka Alex RAY
RE: ntpd hangs under FBSD 8
>> make install should be done with DEBUG_FLAGS containing -g too, otherwise
>> strip(1) is called on the installed binary.
>
>Doh, yes.

I did not do this; that's likely my problem. Thanks.
RE: ntpd hangs under FBSD 8
>I bet ntpd doesn't call select() in all that many places. Instead of going to
>all this trouble to build a debugging libc, you could just grep for select()
>and place breakpoints on all occurrences. (It might also be obvious from
>looking at them which one is the offender.)

I just checked--there are five calls to select. I might flag each one with a printf or something and recompile to see which one is the culprit.

>Also, since a system call is causing the trouble, you might learn something
>from truss or ktrace.

I'll check these out...
Re: ntpd hangs under FBSD 8
On Wednesday 24 February 2010 5:09:47 pm Kostik Belousov wrote:
> On Wed, Feb 24, 2010 at 03:17:25PM -0500, John Baldwin wrote:
> > On Wednesday 24 February 2010 1:17:50 pm Peter Steele wrote:
> > > >You're going to need a debug version of libc, too. gdb won't be able to
> > > >find a backtrace out of a libc function without it.
> > >
> > > What's the proper way to build a debug version of libc and the other
> > > libraries? I tried this:
> >
> > You can just do this:
> >
> > cd /usr/src/lib/libc
> > make clean
> > make DEBUG_FLAGS=-g
> > make install
>
> make install should be done with DEBUG_FLAGS containing -g too, otherwise
> strip(1) is called on the installed binary.

Doh, yes.

--
John Baldwin
RE: ntpd hangs under FBSD 8
On Mon, 22 Feb 2010, Peter Steele wrote:
>> Just out of curiosity, can you attach to the process via gdb and get a
>> backtrace? This smells like a locked pthread_join I hit in my own code a few
>> weeks ago
>
> I'm not using the debug version of ntpd so the backtrace isn't too useful,
> but here's what I get:
>
> (gdb) bt
> #0 0x000800d52bfc in select () from /lib/libc.so.7
> #1 0x00425273 in ?? ()
> #2 0x0040540e in ?? ()
> #3 0x00080058 in ?? ()
> #4 0x in ?? ()

I bet ntpd doesn't call select() in all that many places. Instead of going to all this trouble to build a debugging libc, you could just grep for select() and place breakpoints on all occurrences. (It might also be obvious from looking at them which one is the offender.)

Also, since a system call is causing the trouble, you might learn something from truss or ktrace.

--
Nate Eldredge
n...@thatsmathematics.com
Re: ntpd hangs under FBSD 8
On Wed, Feb 24, 2010 at 03:17:25PM -0500, John Baldwin wrote:
> On Wednesday 24 February 2010 1:17:50 pm Peter Steele wrote:
> > >You're going to need a debug version of libc, too. gdb won't be able to
> > >find a backtrace out of a libc function without it.
> >
> > What's the proper way to build a debug version of libc and the other
> > libraries? I tried this:
>
> You can just do this:
>
> cd /usr/src/lib/libc
> make clean
> make DEBUG_FLAGS=-g
> make install

make install should be done with DEBUG_FLAGS containing -g too, otherwise strip(1) is called on the installed binary.
RE: ntpd hangs under FBSD 8
>> How do I get libc built with full debug symbols?
>
>I haven't tried it myself, but I think this is the way to go: put the
>following in /etc/make.conf and recompile the needed libraries / ports.
>WITH_DEBUG=yes
>DEBUG_FLAGS=-g

That didn't seem to have any effect. I still see -O2 being used instead of -O0.

>Mmm... Do other daemons (sshd, lpd, ...) also fail when started through this
>script? Normal commands (ls, ps) do not seem to be affected.

I tried a few other things and they all seemed to run correctly. We use this same general approach in the full version of this script to launch lots of applications. Its role in fact is a process launcher/monitor. I stripped it down to the bare minimum in order to isolate the cause of the problem. It seems that only ntpd hangs, but not if I use Python 2.5.1.
RE: ntpd hangs under FBSD 8
>> What's the proper way to build a debug version of libc and the other
>> libraries? I tried this:
>
>You can just do this:
>
>cd /usr/src/lib/libc
>make clean
>make DEBUG_FLAGS=-g
>make install

When I tried this the make actually failed with various errors. So I decided to do a full "make buildworld DEBUG_FLAGS=-g", but looking at the output being generated I see -O2 in the cc commands, and this at least should be -O0. It doesn't look like DEBUG_FLAGS is having any effect.
Re: ntpd hangs under FBSD 8
On Wednesday 24 February 2010 1:17:50 pm Peter Steele wrote:
> >You're going to need a debug version of libc, too. gdb won't be able to
> >find a backtrace out of a libc function without it.
>
> What's the proper way to build a debug version of libc and the other
> libraries? I tried this:

You can just do this:

cd /usr/src/lib/libc
make clean
make DEBUG_FLAGS=-g
make install

--
John Baldwin
Re: ntpd hangs under FBSD 8
On Wed, Feb 24, 2010 at 12:17:50PM -0600, Peter Steele wrote:
> >You're going to need a debug version of libc, too.
> >gdb won't be able to find a backtrace out of a libc function without it.
>
> What's the proper way to build a debug version of libc and the other
> libraries? I tried this:
>
> export CFLAGS="-O0"
> make buildworld
> make installworld DESTDIR=/mydir
>
> and then copied libc.so.7 from /mydir/lib to the /lib dir on
> my target system. I also replaced the ntpd binary with the debug version.
> I can see that -O0 is being used in the various "cc" commands that are
> generated, but libc still doesn't seem to be built properly.
> When I attach to a hung ntpd process, I get this:
>
> # gdb /usr/sbin/ntpd -p 2113
> GNU gdb 6.1.1 [FreeBSD]
> ...
> Attaching to program: /usr/sbin/ntpd, process 2113
> Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done.
> ...
> Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
> Loaded symbols for /lib/libc.so.7
> Reading symbols from /lib/libthr.so.3...(no debugging symbols found)...done.
> ...
> [Switching to Thread 8012041c0 (LWP 100283)]
> 0x000800dbeddc in select () from /lib/libc.so.7
> (gdb) bt
> #0 0x000800dbeddc in select () from /lib/libc.so.7
> #1 0x004335de in ntpdmain ()
> #2 0x0043310b in main ()
>
> So I'm getting some symbols from ntpd but I still can't see into select().
>
I think not.
> It hangs in there forever so that's where I need to drill down further.
> How do I get libc built with full debug symbols?
>

I haven't tried it myself, but I think this is the way to go: put the following in /etc/make.conf and recompile the needed libraries / ports.

WITH_DEBUG=yes
DEBUG_FLAGS=-g

This should do the trick for both base and ports.

>
> [snip]
>
> If anyone has any clues at all as to what is causing this issue,
> I'd appreciate the feedback. Here's the code that reproduces this behavior.
>
> #! /usr/bin/env python
> import os
> import threading
>
> class RunProc(threading.Thread):
>     def __init__(self, cmd):
>         threading.Thread.__init__(self)
>         self.cmd = cmd
>
>     def run(self):
>         os.system(self.cmd)
>
> def main():
>     RunProc("/usr/sbin/ntpd -g -q").start()
>
> if __name__ == "__main__":
>     main()
>

Mmm... Do other daemons (sshd, lpd, ...) also fail when started through this script? Normal commands (ls, ps) do not seem to be affected.

0.02$,
Alexey.
RE: ntpd hangs under FBSD 8
>You're going to need a debug version of libc, too. gdb won't be able to find
>a backtrace out of a libc function without it.

What's the proper way to build a debug version of libc and the other libraries? I tried this:

export CFLAGS="-O0"
make buildworld
make installworld DESTDIR=/mydir

and then copied libc.so.7 from /mydir/lib to the /lib dir on my target system. I also replaced the ntpd binary with the debug version. I can see that -O0 is being used in the various "cc" commands that are generated, but libc still doesn't seem to be built properly. When I attach to a hung ntpd process, I get this:

# gdb /usr/sbin/ntpd -p 2113
GNU gdb 6.1.1 [FreeBSD]
...
Attaching to program: /usr/sbin/ntpd, process 2113
Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done.
...
Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /lib/libthr.so.3...(no debugging symbols found)...done.
...
[Switching to Thread 8012041c0 (LWP 100283)]
0x000800dbeddc in select () from /lib/libc.so.7
(gdb) bt
#0 0x000800dbeddc in select () from /lib/libc.so.7
#1 0x004335de in ntpdmain ()
#2 0x0043310b in main ()

So I'm getting some symbols from ntpd but I still can't see into select(). It hangs in there forever so that's where I need to drill down further. How do I get libc built with full debug symbols?

In other testing I've narrowed the problem down to some kind of Python issue. If I run the Python code at the end of this email, where "ntpd -g -q" is launched as part of a Python thread class, the command hangs (the code assumes that ntpd is not already running). If I run the same ntpd command in a normal function (e.g. main) no hang occurs. I've tried subprocess.Popen and os.spawnv to run ntpd and these calls behave exactly the same way--when called from a thread the ntpd process hangs, but it works fine when called from outside of a thread.

This is, of course, our larger project boiled down to a simple test app. In our real code we cannot so easily eliminate this thread wrapper. The same code BTW works fine on our FreeBSD 7 boxes, the main difference being we are running an older version of Python on those boxes (2.5.1 instead of 2.6.2). I tried installing the same 2.5.1 package on a FBSD 8 box and that solved the problem. Curiously, a slightly newer FBSD 7 version of Python, 2.5.5, causes the same hang to occur. So only Python 2.5.1 built under FreeBSD 7 works to get around this issue with ntpd on FreeBSD 8. That means one potential solution is to downgrade to this 2.5.1, but we have other libraries targeted to work with Python 2.6 and we don't really want to downgrade all these associated libraries.

If anyone has any clues at all as to what is causing this issue, I'd appreciate the feedback. Here's the code that reproduces this behavior.

    #! /usr/bin/env python
    import os
    import threading

    class RunProc(threading.Thread):
        def __init__(self, cmd):
            threading.Thread.__init__(self)
            self.cmd = cmd

        def run(self):
            # Runs in the worker thread; this is where ntpd hangs.
            os.system(self.cmd)

    def main():
        RunProc("/usr/sbin/ntpd -g -q").start()

    if __name__ == "__main__":
        main()
RE: ntpd hangs under FBSD 8
>You're going to need a debug version of libc, too. gdb won't be able to find
>a backtrace out of a libc function without it.

Yeah, you're right. This is definitely an annoying bug...
Re: ntpd hangs under FBSD 8
You're going to need a debug version of libc, too. gdb won't be able to find a backtrace out of a libc function without it.
RE: ntpd hangs under FBSD 8
>Just out of curiosity, can you attach to the process via gdb and get a
>backtrace? This smells like a locked pthread_join I hit in my own code a few
>weeks ago

I'm not using the debug version of ntpd so the backtrace isn't too useful, but here's what I get:

(gdb) bt
#0 0x000800d52bfc in select () from /lib/libc.so.7
#1 0x00425273 in ?? ()
#2 0x0040540e in ?? ()
#3 0x00080058 in ?? ()
#4 0x in ?? ()

The trace continues for 700+ entries. The first entry is useful enough though. One of the parameters to select() is a timeout parameter. Every time I do the backtrace it's stuck on this select call, so it seems they have an infinite timeout set. One of these was running all weekend in fact and it's still stuck. Curiously, this problem only happens when we make the call from code via a system() call. If I run the same command interactively, it never hangs:

# /usr/sbin/ntpd -g -q
ntpd: time set +28845.997063s

The same code that runs this command does not hang when we run it on a BSD 7 box. I think I'm going to have to build the debug version of ntpd and try to debug it. Definitely something weird going on.
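As a general illustration of the timeout point (not a claim about what ntpd itself passes to select(), which this thread does not establish): a select() call given no timeout blocks until a descriptor becomes ready or a signal interrupts it, while one given a finite timeout returns on its own. A minimal Python sketch, where wait_readable is a made-up helper name:

```python
import select


def wait_readable(fd, timeout):
    """Return True if fd becomes readable within `timeout` seconds.

    With timeout=None the call blocks indefinitely, like select(2) with a
    NULL timeout; with a number it gives up after that many seconds.
    """
    readable, _, _ = select.select([fd], [], [], timeout)
    return bool(readable)
```

A process stuck the way the backtrace shows would correspond to the timeout=None case with a descriptor that never becomes ready and no signal arriving to interrupt the wait.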
Re: ntpd hangs under FBSD 8
On Fri, 19 Feb 2010, Peter Steele wrote:
> I posted this originally on the -questions list but did not make any headway.
> We have an application where the user can change the date/time via a GUI. One
> of the options the user has is to specify that the time is to be synced using
> ntp. Our coding worked fine under BSD 7 but since we've moved to BSD 8 we've
> encountered a problem where the command that we initiate from the GUI:

Just out of curiosity, can you attach to the process via gdb and get a backtrace? This smells like a locked pthread_join I hit in my own code a few weeks ago.

Cheers,
-R. Tyler Ballance
--
Jabber: rty...@jabber.org
GitHub: http://github.com/rtyler
Twitter: http://twitter.com/agentdero
Blog: http://unethicalblogger.com