Re: ntpd hangs under FBSD 8

2010-02-26 Thread Kostik Belousov
On Thu, Feb 25, 2010 at 04:26:22PM -0600, Peter Steele wrote:
  We'll likely go with this solution instead of downgrading Python and the 
  related libraries.
 
 In fact I came up with another solution. I realized that since the problem 
 was related to the process signal mask, instead of calling ntpd directly, I 
 could wrap it up in a C app that resets the signal mask to something that 
 works. I have the following code:
 
  sigset_t set, oset;
  sigemptyset(&set);
  pthread_sigmask(SIG_SETMASK, &set, &oset);
  system("/usr/sbin/ntpd -g -q");
  pthread_sigmask(SIG_SETMASK, &oset, NULL);
 
 I wrapped this up into a standalone app and call this from Python instead of 
 calling ntpd directly. This solved the problem--no more hang. Thanks very 
 much to Kostik Belousov for his wild guess that this was related to the 
 process signal mask. His guess was dead on.

So this is arguably a Python bug. Did you contact anybody who
cares about Python?




RE: ntpd hangs under FBSD 8

2010-02-26 Thread Peter Steele
So this is arguably a Python bug. Did you contact anybody who cares about 
Python?

I did not, mainly because this link:

http://bugs.python.org/msg61870

seems to imply they are already aware of the problem. I agree it must be a 
Python bug though. It worked in 2.5.1 but not in 2.5.5 and later, so clearly 
they changed how processes are launched from threads in a way that has led to 
this problem. One should not have to make explicit calls to change the 
signal mask in order to launch an external app. Granted, we've only had this 
issue with ntpd--other apps launch fine--but there is clearly something wrong 
somewhere for even one app to hang when it is spawned from a thread.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: ntpd hangs under FBSD 8

2010-02-25 Thread Alexandr Rybalko
On Wed, 24 Feb 2010 15:56:35 -0600
Peter Steele pste...@maxiscale.com wrote:

  How do I get libc built with full debug symbols?
  
 I haven't tried it by myself but think here is the way to go: put the 
 following to /etc/make.conf and recompile needed libraries / ports.
 WITH_DEBUG=yes
 DEBUG_FLAGS=-g
 
 That didn't seem to have any effect. I still see -O2 being used instead of 
 -O0.

Try to use 
make DEBUG_FLAGS=-g WITH_DEBUG=yes buildworld
make DEBUG_FLAGS=-g WITH_DEBUG=yes buildkernel
 
 Mmm... Do other daemons (sshd, lpd, ...) also fail when started through 
 this script? Normal commands (ls, ps) seem not
 affected.
 
 I tried a few other things and they all seemed to run correctly. We use this 
 same general approach in the full version of this
 script to launch lots of applications. Its role in fact is a process 
 launcher/monitor. I stripped it down to the bare minimum
 in order to isolate the cause of the problem. It seems that only ntpd hangs, 
 but not if I use Python 2.5.1.
 

I think the problem is not in ntpd, since I use ntpdate. About 50% of the time, 
when it is run from a startup script, it hangs along with the kernel.
Ctrl+C does not work and the kernel does not answer pings; it just freezes.
The problem is somewhere in the kernel, maybe in the subsystems that set the 
new time, maybe in the network (UDP) parts.
This problem doesn't affect other programs, so I think it is in the time 
handling code.

Peter, what platform do you use? I use MIPS BCM5354.


 


-- 
Alexandr Rybalko r...@dlink.ua 
aka Alex RAY r...@ddteam.net


Re: ntpd hangs under FBSD 8

2010-02-25 Thread Alexey Shuvaev
On Wed, Feb 24, 2010 at 03:56:35PM -0600, Peter Steele wrote:
  How do I get libc built with full debug symbols?
  
 I haven't tried it by myself but think here is the way to go: put the 
 following to /etc/make.conf and recompile needed libraries / ports.
 WITH_DEBUG=yes
 DEBUG_FLAGS=-g
 
 That didn't seem to have any effect. I still see -O2 being used
 instead of -O0.
 
The flag you should look at is '-g'. GCC supports debugging symbols
together with -O2 optimizations.
Others have posted suggestions on how to build libraries with debugging
symbols which go in the same direction. However, with the above
variables in make.conf you do not need to remember all the places where
you have to put DEBUG_FLAGS=-g in the command line. Just normal
buildworld and buildkernel targets will dtrt. That is, you will get
the complete base system with debug symbols. The variable WITH_DEBUG=yes
is for the software from ports.

Just FYI.

 Mmm... Do other daemons (sshd, lpd, ...) also fail when started
 through this script? Normal commands (ls, ps) seem not affected.
 
 I tried a few other things and they all seemed to run correctly.
 We use this same general approach in the full version of this script
 to launch lots of applications. Its role in fact is a process
 launcher/monitor. I stripped it down to the bare minimum in order
 to isolate the cause of the problem. It seems that only ntpd hangs,
 but not if I use Python 2.5.1.
 


Re: ntpd hangs under FBSD 8

2010-02-25 Thread Dag-Erling Smørgrav
Alexey Shuvaev shuv...@physik.uni-wuerzburg.de writes:
 The flag you should look at is '-g'. GCC supports debugging symbols
 together with -O2 optimizations.

It is generally not a good idea to use -O2 for debugging versions, since
gcc will optimize away many local variables.

DES
-- 
Dag-Erling Smørgrav - d...@des.no


RE: ntpd hangs under FBSD 8

2010-02-25 Thread Peter Steele
I think the problem is not in ntpd, since I use ntpdate. About 50% of the time, 
when it is run from a startup script, it hangs along with the kernel.
Ctrl+C does not work and the kernel does not answer pings; it just freezes.
The problem is somewhere in the kernel, maybe in the subsystems that set the 
new time, maybe in the network (UDP) parts.
This problem doesn't affect other programs, so I think it is in the time 
handling code.

I think you may be describing a different problem. For one thing, we don't use 
ntpdate, we use the ntpd -g -q alternative. Secondly, for us ntpd is hanging 
100% of the time when run via a Python thread class. The exception is Python 
2.5.1; this succeeds 100% of the time.

Peter, what platform do you use? I use MIPS BCM5354.

We have a variety of 1U and 3U boxes. They all hang the same way.



Re: ntpd hangs under FBSD 8

2010-02-25 Thread Kostik Belousov
On Thu, Feb 25, 2010 at 08:12:05AM -0600, Peter Steele wrote:
 I think the problem is not in ntpd, since I use ntpdate. About 50% of the 
 time, when it is run from a startup script, it hangs along with the kernel.
 Ctrl+C does not work and the kernel does not answer pings; it just freezes.
 The problem is somewhere in the kernel, maybe in the subsystems that set the 
 new time, maybe in the network (UDP) parts.
 This problem doesn't affect other programs, so I think it is in the time 
 handling code.
 
 I think you may be describing a different problem. For one thing, we don't 
 use ntpdate, we use the ntpd -g -q alternative. Secondly, for us ntpd is 
 hanging 100% of the time when run via a Python thread class. The exception is 
 Python 2.5.1; this succeeds 100% of the time.
 
 Peter, what platform do you use? I use MIPS BCM5354.
 
 We have a variety of 1U and 3U boxes. They all hang the same way.

Very wild guess, check the process signal mask of the child for both
methods of spawning.
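
[Editor's sketch] One way to compare the child's signal mask for both
methods of spawning, assuming a much newer Python (3.3 or later) where
signal.pthread_sigmask() exists; the Python 2.x versions discussed in this
thread do not have it. The helper names and the PROBE string are
hypothetical, and the worker thread blocks SIGALRM by hand only to imitate
a thread that runs with signals blocked:

```python
import signal
import subprocess
import sys
import threading

# Child program: print the numbers of the signals blocked in its own mask.
PROBE = (
    "import signal; "
    "print(*sorted(int(s) for s in signal.pthread_sigmask(signal.SIG_BLOCK, [])))"
)

def child_blocked_signals():
    """Spawn a child and return the set of signal numbers it starts with blocked."""
    out = subprocess.check_output([sys.executable, "-c", PROBE])
    return {int(n) for n in out.split()}

def child_blocked_signals_from_thread(block):
    """Same, but spawn from a thread that first blocks the signals in `block`."""
    result = []

    def worker():
        # The mask is per-thread and is inherited by the child across fork/exec.
        signal.pthread_sigmask(signal.SIG_BLOCK, block)
        result.append(child_blocked_signals())

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result[0]

if __name__ == "__main__":
    print("from main thread:", child_blocked_signals())
    print("from thread blocking SIGALRM:",
          child_blocked_signals_from_thread([signal.SIGALRM]))
```

If the two sets differ, the child is inheriting a mask from the spawning
thread, which is the situation the guess above points at.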




RE: ntpd hangs under FBSD 8

2010-02-25 Thread Peter Steele
Very wild guess, check the process signal mask of the child for both methods 
of spawning.

I'm running ntpd through Python. How do I check the process signal mask? I did 
some quick searches and it seems Python does not support sigprocmask(). 

In my searches I came across this link:

http://bugs.python.org/msg61870

I think you might be right that this is related to the signal mask. In my 
scenario the select call is hanging indefinitely, just as discussed in that 
bug report.



Re: ntpd hangs under FBSD 8

2010-02-25 Thread Kostik Belousov
On Thu, Feb 25, 2010 at 09:57:48AM -0600, Peter Steele wrote:
 Very wild guess, check the process signal mask of the child for both methods 
 of spawning.
 
 I'm running ntpd through Python. How do I check the process signal mask? I 
 did some quick searches and it seems Python does not support sigprocmask(). 
 
 In my searches I came across this link:
 
 http://bugs.python.org/msg61870
 
 I think you might be right that this is related to the signal mask. In my 
 scenario the select call is hanging indefinitely, just as discussed in that 
 bug report.
 

Below is a quickly made patch that adds to procstat(1) the ability to show
signal disposition. I am not sure about duplicating the information about
the catch/ignore state of the signal for all threads (this information is
process-global), but I think this is more usable for scripts.

diff --git a/usr.bin/procstat/Makefile b/usr.bin/procstat/Makefile
index 1c187b0..251fc06 100644
--- a/usr.bin/procstat/Makefile
+++ b/usr.bin/procstat/Makefile
@@ -10,6 +10,7 @@ SRCS= procstat.c  \
procstat_files.c\
procstat_kstack.c   \
procstat_threads.c  \
+   procstat_threads_sigs.c \
procstat_vm.c
 
 LDADD+=-lutil
diff --git a/usr.bin/procstat/procstat.c b/usr.bin/procstat/procstat.c
index bc02682..cbd4eca 100644
--- a/usr.bin/procstat/procstat.c
+++ b/usr.bin/procstat/procstat.c
@@ -38,7 +38,7 @@
 
 #include "procstat.h"
 
-static int aflag, bflag, cflag, fflag, kflag, sflag, tflag, vflag;
+static int aflag, bflag, cflag, fflag, iflag, kflag, sflag, tflag, vflag;
 int	hflag;
 
 static void
@@ -46,7 +46,7 @@ usage(void)
 {
 
 	fprintf(stderr, "usage: procstat [-h] [-w interval] [-b | -c | -f | "
-	    "-k | -s | -t | -v]\n");
+	    "-i | -k | -s | -t | -v]\n");
 	fprintf(stderr, "[-a | pid ...]\n");
exit(EX_USAGE);
 }
@@ -61,6 +61,8 @@ procstat(pid_t pid, struct kinfo_proc *kipp)
procstat_args(pid, kipp);
else if (fflag)
procstat_files(pid, kipp);
+   else if (iflag)
+   procstat_threads_sigs(pid, kipp);
else if (kflag)
procstat_kstack(pid, kipp, kflag);
else if (sflag)
@@ -109,7 +111,7 @@ main(int argc, char *argv[])
char *dummy;
 
interval = 0;
-	while ((ch = getopt(argc, argv, "abcfkhstvw:")) != -1) {
+	while ((ch = getopt(argc, argv, "abcfikhstvw:")) != -1) {
switch (ch) {
case 'a':
aflag++;
@@ -127,6 +129,10 @@ main(int argc, char *argv[])
fflag++;
break;
 
+   case 'i':
+   iflag++;
+   break;
+
case 'k':
kflag++;
break;
diff --git a/usr.bin/procstat/procstat.h b/usr.bin/procstat/procstat.h
index 8bacab7..10f8fce 100644
--- a/usr.bin/procstat/procstat.h
+++ b/usr.bin/procstat/procstat.h
@@ -41,6 +41,7 @@ void  procstat_cred(pid_t pid, struct kinfo_proc *kipp);
 void   procstat_files(pid_t pid, struct kinfo_proc *kipp);
 void   procstat_kstack(pid_t pid, struct kinfo_proc *kipp, int kflag);
 void   procstat_threads(pid_t pid, struct kinfo_proc *kipp);
+void   procstat_threads_sigs(pid_t pid, struct kinfo_proc *kipp);
 void   procstat_vm(pid_t pid, struct kinfo_proc *kipp);
 
 #endif /* !PROCSTAT_H */
diff --git a/usr.bin/procstat/procstat_threads_sigs.c 
b/usr.bin/procstat/procstat_threads_sigs.c
new file mode 100644
index 000..814f0c4
--- /dev/null
+++ b/usr.bin/procstat/procstat_threads_sigs.c
@@ -0,0 +1,111 @@
+/*-
+ * Copyright (c) 2007 Robert N. M. Watson
+ * Copyright (c) 2010 Konstantin Belousov
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF 

RE: ntpd hangs under FBSD 8

2010-02-25 Thread Peter Steele
 We'll likely go with this solution instead of downgrading Python and the 
 related libraries.

In fact I came up with another solution. I realized that since the problem was 
related to the process signal mask, instead of calling ntpd directly, I could 
wrap it up in a C app that resets the signal mask to something that works. I 
have the following code:

   sigset_t set, oset;
   sigemptyset(&set);
   pthread_sigmask(SIG_SETMASK, &set, &oset);
   system("/usr/sbin/ntpd -g -q");
   pthread_sigmask(SIG_SETMASK, &oset, NULL);

I wrapped this up into a standalone app and call this from Python instead of 
calling ntpd directly. This solved the problem--no more hang. Thanks very much 
to Kostik Belousov for his wild guess that this was related to the process 
signal mask. His guess was dead on.
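
[Editor's sketch] The same save/clear/run/restore sequence could in
principle be done without the C wrapper on a much newer Python (3.3 or
later), where signal.pthread_sigmask() is available. This is only an
illustration under that assumption, not something that was possible with
the Python 2.5/2.6 versions in this thread; run_with_empty_mask is a
hypothetical helper name:

```python
import signal
import subprocess
import sys

def run_with_empty_mask(argv):
    """Run argv with no signals blocked and return its exit status.

    Mirrors the C wrapper: save the calling thread's mask, clear it,
    spawn the command (which inherits the empty mask), then restore.
    """
    old_mask = signal.pthread_sigmask(signal.SIG_SETMASK, [])
    try:
        return subprocess.call(argv)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)

if __name__ == "__main__":
    # Harmless demo: a child Python reports how many signals it starts
    # with blocked. For the case in this thread the argv would instead be
    # ["/usr/sbin/ntpd", "-g", "-q"].
    probe = "import signal; print(len(signal.pthread_sigmask(signal.SIG_BLOCK, [])))"
    run_with_empty_mask([sys.executable, "-c", probe])
```

Changing the mask in the calling thread rather than in a preexec_fn hook is
a deliberate choice here, since the subprocess documentation warns that
preexec_fn is not safe in threaded programs.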



RE: ntpd hangs under FBSD 8

2010-02-24 Thread Peter Steele
You're going to need a debug version of libc, too.  gdb won't be able to find 
a backtrace out of a libc function without it.

What's the proper way to build a debug version of libc and the other libraries? 
I tried this:

export CFLAGS=-O0
make buildworld
make installworld DESTDIR=/mydir

and then copied libc.so.7 from /mydir/lib to the /lib dir on my target system. 
I also replaced the ntpd binary with the debug version. I can see that -O0 is 
being used in the various cc commands that are generated, but libc still 
doesn't seem to be built properly. When I attach to a hung ntpd process, I get 
this:

# gdb /usr/sbin/ntpd -p 2113
GNU gdb 6.1.1 [FreeBSD]
...
Attaching to program: /usr/sbin/ntpd, process 2113
Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done.
...
Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /lib/libthr.so.3...(no debugging symbols found)...done.
...
[Switching to Thread 8012041c0 (LWP 100283)]
0x000800dbeddc in select () from /lib/libc.so.7
(gdb) bt
#0  0x000800dbeddc in select () from /lib/libc.so.7
#1  0x004335de in ntpdmain ()
#2  0x0043310b in main ()

So I'm getting some symbols from ntpd but I still can't see into select(). It 
hangs in there forever so that's where I need to drill down further. How do I 
get libc built with full debug symbols?

In other testing I've narrowed the problem down to some kind of Python issue. 
If I run the Python code at the end of this email where ntpd -g -q is 
launched as part of a Python thread class, the command hangs (the code assumes 
that ntpd is not already running). If I run the same ntpd command in a normal 
function (e.g. main) no hang occurs. I've tried subprocess.Popen and os.spawnv 
to run ntpd and these calls behave exactly the same way--when called from a 
thread the ntpd process hangs but it works fine when called from outside of a 
thread. This is a breakdown of course of our larger project into a simple test 
app. In our real code we cannot so easily eliminate this thread wrapper.

The same code BTW works fine on our FreeBSD 7 boxes, the main difference being 
we are running an older version of Python on those boxes (2.5.1 instead of 
2.6.2). I tried installing the same 2.5.1 package on a FBSD 8 box and that 
solved the problem. Curiously a slightly newer FBSD 7 version of Python, 2.5.5, 
causes the same hang to occur. So only Python 2.5.1 built under FreeBSD 7 works 
to get around this issue with ntpd on FreeBSD 8. That means one potential 
solution is to downgrade to this 2.5.1, but we have other libraries targeted to 
work with Python 2.6 and we don't really want to downgrade all these associated 
libraries.

If anyone has any clues at all as to what is causing this issue, I'd appreciate 
the feedback. Here's the code that reproduces this behavior.

#! /usr/bin/env python
import os
import threading

class RunProc(threading.Thread):
    """Thread that runs a single shell command."""

    def __init__(self, cmd):
        threading.Thread.__init__(self)
        self.cmd = cmd

    def run(self):
        # The child inherits the signal mask of this (non-main) thread.
        os.system(self.cmd)

def main():
    RunProc("/usr/sbin/ntpd -g -q").start()

if __name__ == "__main__":
    main()




Re: ntpd hangs under FBSD 8

2010-02-24 Thread Alexey Shuvaev
On Wed, Feb 24, 2010 at 12:17:50PM -0600, Peter Steele wrote:
 You're going to need a debug version of libc, too.
 gdb won't be able to find a backtrace out of a libc function without it.
 
 What's the proper way to build a debug version of libc and the other
 libraries? I tried this:
 
 export CFLAGS=-O0
 make buildworld
 make installworld DESTDIR=/mydir
 
 and then copied libc.so.7 from /mydir/lib to the /lib dir on
 my target system. I also replaced the ntpd binary with the debug version.
 I can see that -O0 is being used in the various cc commands that are
 generated, but libc still doesn't seem to be built properly.
 When I attach to a hung ntpd process, I get this:
 
 # gdb /usr/sbin/ntpd -p 2113
 GNU gdb 6.1.1 [FreeBSD]
 ...
 Attaching to program: /usr/sbin/ntpd, process 2113
 Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done.
 ...
 Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
 Loaded symbols for /lib/libc.so.7
 Reading symbols from /lib/libthr.so.3...(no debugging symbols found)...done.
 ...
 [Switching to Thread 8012041c0 (LWP 100283)]
 0x000800dbeddc in select () from /lib/libc.so.7
 (gdb) bt
 #0  0x000800dbeddc in select () from /lib/libc.so.7
 #1  0x004335de in ntpdmain ()
 #2  0x0043310b in main ()
 
 So I'm getting some symbols from ntpd but I still can't see into select().

I think not.

 It hangs in there forever so that's where I need to drill down further.
 How do I get libc built with full debug symbols?
 
I haven't tried it by myself but think here is the way to go: put the
following to /etc/make.conf and recompile needed libraries / ports.
WITH_DEBUG=yes
DEBUG_FLAGS=-g

This should do the trick for both base and ports.

 
 [snip]
 
 If anyone has any clues at all as to what is causing this issue,
 I'd appreciate the feedback. Here's the code that reproduces this behavior.
 
 #! /usr/bin/env python
 import os
 import threading
 
 class RunProc(threading.Thread):
     def __init__(self, cmd):
         threading.Thread.__init__(self)
         self.cmd = cmd
 
     def run(self):
         os.system(self.cmd)
 
 def main():
     RunProc("/usr/sbin/ntpd -g -q").start()
 
 if __name__ == "__main__":
     main()
 
Mmm... Do other daemons (sshd, lpd, ...) also fail when started
through this script? Normal commands (ls, ps) seem not affected.

0.02$,
Alexey.


Re: ntpd hangs under FBSD 8

2010-02-24 Thread John Baldwin
On Wednesday 24 February 2010 1:17:50 pm Peter Steele wrote:
 You're going to need a debug version of libc, too.  gdb won't be able to 
find a backtrace out of a libc function without it.
 
 What's the proper way to build a debug version of libc and the other 
libraries? I tried this:

You can just do this:

cd /usr/src/lib/libc
make clean
make DEBUG_FLAGS=-g
make install

-- 
John Baldwin


RE: ntpd hangs under FBSD 8

2010-02-24 Thread Peter Steele
 What's the proper way to build a debug version of libc and the other 
 libraries? I tried this:

You can just do this:

cd /usr/src/lib/libc
make clean
make DEBUG_FLAGS=-g
make install

When I tried this the make actually failed with various errors. So I decided to 
do a full make buildworld DEBUG_FLAGS=-g, but in looking at the output being 
generated I see -O2 in the cc commands, and this at least should be -O0. It 
doesn't look like DEBUG_FLAGS is having any effect.



RE: ntpd hangs under FBSD 8

2010-02-24 Thread Peter Steele
 How do I get libc built with full debug symbols?
 
I haven't tried it by myself but think here is the way to go: put the 
following to /etc/make.conf and recompile needed libraries / ports.
WITH_DEBUG=yes
DEBUG_FLAGS=-g

That didn't seem to have any effect. I still see -O2 being used instead of -O0.

Mmm... Do other daemons (sshd, lpd, ...) also fail when started through this 
script? Normal commands (ls, ps) seem not affected.

I tried a few other things and they all seemed to run correctly. We use this 
same general approach in the full version of this script to launch lots of 
applications. Its role in fact is a process launcher/monitor. I stripped it 
down to the bare minimum in order to isolate the cause of the problem. It seems 
that only ntpd hangs, but not if I use Python 2.5.1.




Re: ntpd hangs under FBSD 8

2010-02-24 Thread Kostik Belousov
On Wed, Feb 24, 2010 at 03:17:25PM -0500, John Baldwin wrote:
 On Wednesday 24 February 2010 1:17:50 pm Peter Steele wrote:
  You're going to need a debug version of libc, too.  gdb won't be able to 
 find a backtrace out of a libc function without it.
  
  What's the proper way to build a debug version of libc and the other 
 libraries? I tried this:
 
 You can just do this:
 
 cd /usr/src/lib/libc
 make clean
 make DEBUG_FLAGS=-g
 make install

make install should be done with DEBUG_FLAGS containing -g too, otherwise
strip(1) is called on the installed binary.




RE: ntpd hangs under FBSD 8

2010-02-24 Thread Nate Eldredge

On Mon, 22 Feb 2010, Peter Steele wrote:

Just out of curiosity, can you attach to the process via gdb and get a 
backtrace? This smells like a locked pthread_join I hit in my own code 
a few weeks ago


I'm not using the debug version of ntpd so the backtrace isn't too 
useful, but here's what I get:


(gdb) bt
#0  0x000800d52bfc in select () from /lib/libc.so.7
#1  0x00425273 in ?? ()
#2  0x0040540e in ?? ()
#3  0x00080058 in ?? ()
#4  0x in ?? ()


I bet ntpd doesn't call select() in all that many places.  Instead of 
going to all this trouble to build a debugging libc, you could just grep 
for select() and place breakpoints on all occurrences.  (It might also be 
obvious from looking at them which one is the offender.)


Also, since a system call is causing the trouble, you might learn 
something from truss or ktrace.


--

Nate Eldredge
n...@thatsmathematics.com


Re: ntpd hangs under FBSD 8

2010-02-24 Thread John Baldwin
On Wednesday 24 February 2010 5:09:47 pm Kostik Belousov wrote:
 On Wed, Feb 24, 2010 at 03:17:25PM -0500, John Baldwin wrote:
  On Wednesday 24 February 2010 1:17:50 pm Peter Steele wrote:
   You're going to need a debug version of libc, too.  gdb won't be able to 
  find a backtrace out of a libc function without it.
   
   What's the proper way to build a debug version of libc and the other 
  libraries? I tried this:
  
  You can just do this:
  
  cd /usr/src/lib/libc
  make clean
  make DEBUG_FLAGS=-g
  make install
 
 make install should be done with DEBUG_FLAGS containing -g too, otherwise
 strip(1) is called on the installed binary.

Doh, yes.

-- 
John Baldwin


RE: ntpd hangs under FBSD 8

2010-02-24 Thread Peter Steele
I bet ntpd doesn't call select() in all that many places.  Instead of going to 
all this trouble to build a debugging libc, you could just
grep for select() and place breakpoints on all occurrences.  (It might also be 
obvious from looking at them which one is the offender.)

I just checked--there are five calls to select. I might flag each one with a 
printf or something and recompile to see which one is the culprit.

Also, since a system call is causing the trouble, you might learn something 
from truss or ktrace.

I'll check these out...


RE: ntpd hangs under FBSD 8

2010-02-24 Thread Peter Steele
 make install should be done with DEBUG_FLAGS containing -g too, otherwise
 strip(1) is called on the installed binary.

Doh, yes.

I did not do this; that's likely my problem. Thanks.



RE: ntpd hangs under FBSD 8

2010-02-22 Thread Peter Steele
Just out of curiosity, can you attach to the process via gdb and get a 
backtrace? This smells like a locked pthread_join I hit in my own code a few 
weeks ago

I'm not using the debug version of ntpd so the backtrace isn't too useful, but 
here's what I get:

(gdb) bt
#0  0x000800d52bfc in select () from /lib/libc.so.7
#1  0x00425273 in ?? ()
#2  0x0040540e in ?? ()
#3  0x00080058 in ?? ()
#4  0x in ?? ()

The trace continues for 700+ entries. The first entry is useful enough though. 
One of the parameters to select() is a timeout parameter. Every time I do the 
backtrace it's stuck on this select call so it seems they have an infinite 
timeout set. One of these was running all weekend in fact and it's still stuck. 
Curiously, this problem only happens when we make the call from code via a 
system() call. If I run the same command interactively, it never hangs:

# /usr/sbin/ntpd -g -q
ntpd: time set +28845.997063s

The same code that runs this command does not hang when we run it on a BSD 7 
box. 
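
[Editor's sketch] For readers unfamiliar with the timeout parameter
mentioned above, this is how select() behaves with a finite timeout versus
none, shown with Python's select module on a pipe. It does not reproduce
the ntpd hang and says nothing about what ntpd actually passes;
wait_readable is a hypothetical helper name:

```python
import os
import select

def wait_readable(fd, timeout):
    """Return True if fd becomes readable within `timeout` seconds.

    timeout=None means "no timeout": select() then blocks until the
    descriptor is ready (or something else interrupts the call).
    """
    readable, _, _ = select.select([fd], [], [], timeout)
    return bool(readable)

if __name__ == "__main__":
    r, w = os.pipe()
    print(wait_readable(r, 0.2))   # nothing written yet, so this times out
    os.write(w, b"x")
    print(wait_readable(r, 0.2))   # data is pending, so this returns at once
    # wait_readable(r, None) on an empty pipe would block indefinitely.
```

A process that calls select() with no timeout and expects a signal to wake
it up will sit there forever if that signal never arrives, which matches
the symptom seen in the backtrace.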

I think I'm going to have to build the debug version of ntpd and try to debug 
it. Definitely something weird going on.



Re: ntpd hangs under FBSD 8

2010-02-22 Thread Ryan Stone
You're going to need a debug version of libc, too.  gdb won't be able
to find a backtrace out of a libc function without it.


RE: ntpd hangs under FBSD 8

2010-02-22 Thread Peter Steele
You're going to need a debug version of libc, too.  gdb won't be able to find 
a backtrace out of a libc function without it.

Yeah, you're right. This is definitely an annoying bug...



Re: ntpd hangs under FBSD 8

2010-02-20 Thread R. Tyler Ballance

On Fri, 19 Feb 2010, Peter Steele wrote:

 I posted this originally on the -questions list but did not make any headway. 
 We have an application where the user can change the date/time via a GUI. One 
 of the options the user has is to specify that the time is to be synced using 
 ntp. Our coding worked fine under BSD 7 but since we've moved to BSD 8 we've 
 encountered a problem where the command that we initiate from the GUI:

Just out of curiosity, can you attach to the process via gdb and get a
backtrace? This smells like a locked pthread_join I hit in my own code a few
weeks ago


Cheers,
-R. Tyler Ballance
--
 Jabber: rty...@jabber.org
 GitHub: http://github.com/rtyler
Twitter: http://twitter.com/agentdero
   Blog: http://unethicalblogger.com


