On 2005.10.03, Janine Sisk <[EMAIL PROTECTED]> wrote:
> It is Linux. Unfortunately, it's always the production instances
> that hang, and I can't leave them that way while I poke around.
> Especially since it seems to be load related and so usually happens
> at the busiest times of the day. I don't really know how to use
> gdb; if there is something quick I can do that can be examined
> offline, let me know and I will try it.

Newer gdb has "gcore", which lets you force a core dump of a running process to be written out. The "downtime" could be as short (or as long, depending) as the time it takes for your monitoring software to detect that the nsd is unresponsive and kick off a gdb script that attaches to the running nsd, gcores it, and exits, after which the nsd restart is triggered. This way, you can do a post-mortem on the core files offline while keeping the nsds running with as little downtime as possible.

However, Linux core files of multi-threaded processes will only be useful on modern Linux kernels: 2.6.x kernels, and I *think* the more recent 2.4.x kernels, though I'm not sure exactly which one. Also, "gcore" in gdb is recent; I'd recommend using gdb 6.3 or newer.

-- Dossy

-- 
Dossy Shiobara [EMAIL PROTECTED] | http://dossy.org/
Panoptic Computer Network | http://panoptic.com/
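
For illustration, the attach / gcore / exit step could be scripted roughly as below. This is only a sketch under assumptions: the function names, the /var/tmp output directory, and the "nsd-core.PID" file name are made up for the example, and it assumes a gdb new enough to have "gcore" on the PATH and permission to attach to the nsd process. It writes a small gdb command file and runs gdb in batch mode against it.

```python
import subprocess
import tempfile
from pathlib import Path


def gdb_commands(pid: int, core_path: str) -> str:
    """Build the gdb command script: attach, write the core file, detach."""
    return "\n".join([
        f"attach {pid}",
        f"gcore {core_path}",
        "detach",
    ]) + "\n"


def dump_core(pid: int, core_dir: str = "/var/tmp") -> str:
    """Dump a core of the running process `pid` and return the core file path.

    The directory and file name are hypothetical defaults for this sketch.
    """
    core_path = str(Path(core_dir) / f"nsd-core.{pid}")

    # Write the commands to a temporary file so gdb can read them with -x.
    with tempfile.NamedTemporaryFile("w", suffix=".gdb", delete=False) as f:
        f.write(gdb_commands(pid, core_path))
        script = f.name

    try:
        # -batch makes gdb exit once the command file has been processed.
        subprocess.run(["gdb", "-batch", "-x", script], check=False)
    finally:
        Path(script).unlink()

    # Exit codes from batch-mode gdb vary by version, so check the result
    # by looking for the core file itself.
    if not Path(core_path).exists():
        raise RuntimeError(f"gdb did not write {core_path}")
    return core_path
```

A monitoring hook would call something like dump_core(pid) once it decides the nsd is hung, and only then trigger the restart. The process stays stopped while gdb writes the core, so the dump time adds to the downtime.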
