On Mon, 04 Apr 2011 10:39:37 +0200
Mark <[email protected]> wrote:

> Can do, its not in production yet. I did install 1.6.0pre4 on it, and
> that one runs fine btw.

Yeah, and actually, 1.6 testing is probably more worthwhile at this point with a platform like that. You don't need to keep fiddling with 1.4, unless you want to, of course.

> Backtrace from the generated bosserver.core:
>
> (gdb) bt
> #0 0x000000080077afcc in kill () from /lib/libc.so.7
> #1 0x0000000800779dcb in abort () from /lib/libc.so.7
> #2 0x000000000041389b in osi_Panic (msg=Variable "msg" is not available.) at
> rx_user.c:225

So, the panic message gets printed to stderr, but bosserver will normally redirect that to /dev/null. You can see the panic message if you run bosserver in the foreground with 'bosserver -nofork'. I expect you'll get something complaining like "rx packet not free".

On reproducing this myself, I see that the rx free packet queue is getting corrupted when we're in the middle of libc. At first this seemed very odd, but it just looks like our LWP stack is too small. The code path where it gets corrupted is in rxkad: decode_generalized_time -> generalizedtime2time -> timegm. timegm eventually calls some function to load some tz data, which seems to read from disk into some stack space (the tzload local variable u). That requires a rather large amount of stack (imo; it's like 60k or so for that one frame, if I'm reading this right). So, it's not too surprising that we crash, given that our stack for rx_Listener should, I believe, be the LWP minimum of 48k.

So, if you want to give this a quick try, start bosserver like so:

    AFS_LWP_STACK_SIZE=196608 /usr/afs/bin/bosserver

The problem should go away (it does for me). Of course, any other LWP daemon will probably have the same problem, so if you want to run dbserver processes on that machine, you'll need to do the same for them. (fileserver and volserver use pthreads, so they should not be affected.)

To me, this suggests that we just need to raise the minimum LWP stack size for FreeBSD (or maybe fbsd 8, or something more specific?).

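To put rough numbers on the stack sizes above, here is a small sketch in C. The macro and function names are made up for illustration and are not from the OpenAFS source, and the 60k frame size is only my estimate from above, so treat the output as a back-of-the-envelope check rather than a measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in this message; all approximate, names are hypothetical. */
#define LWP_MIN_STACK   ((size_t)48 * 1024)  /* believed LWP minimum stack    */
#define TZLOAD_FRAME    ((size_t)60 * 1024)  /* rough estimate, tzload frame  */
#define SUGGESTED_STACK ((size_t)196608)     /* AFS_LWP_STACK_SIZE workaround */

/* Return 1 if a single frame of frame_bytes fits in stack_bytes, else 0.
 * This ignores every other frame on the stack, so it is a best case. */
static int frame_fits(size_t stack_bytes, size_t frame_bytes)
{
    return frame_bytes <= stack_bytes;
}

/* Bytes left over for all other frames, or 0 if the frame alone overflows. */
static size_t headroom(size_t stack_bytes, size_t frame_bytes)
{
    return frame_fits(stack_bytes, frame_bytes) ? stack_bytes - frame_bytes : 0;
}

/* Check the relationships between the quoted figures. */
static void check_quoted_figures(void)
{
    /* The ~60k frame does not fit in a 48k stack even on its own. */
    assert(!frame_fits(LWP_MIN_STACK, TZLOAD_FRAME));
    /* The workaround value is 192k, i.e. four times the 48k minimum. */
    assert(SUGGESTED_STACK == 4 * LWP_MIN_STACK);
    /* With 192k, the frame fits with 132k to spare for everything else. */
    assert(headroom(SUGGESTED_STACK, TZLOAD_FRAME) == (size_t)132 * 1024);
}
```

Calling check_quoted_figures() from a test main passes with these numbers. The real headroom is smaller than this, since the frames above tzload use some stack as well.
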
I still have no idea why this isn't a problem on 1.6/master, though, as I thought we had similar stack sizes there. I haven't looked too much at that yet.

-- 
Andrew Deason
[email protected]
