On Wednesday, March 11, 2015 02:00:41 PM Nick Frampton wrote:
> On 11/03/15 07:59, Mark Johnston wrote:
> > On Tue, Mar 10, 2015 at 02:10:09PM -0400, John Baldwin wrote:
> >> Often loops using libkvm are due to programs using libkvm are trying to
> >> read kernel data structures while they are changing. However, if you use
> >> sysctls to fetch this data instead, you should be able to get a stable
> >> snapshot of the system state without getting stuck in a possible loop.
> >> I believe for libkvm to use sysctl instead of /dev/kmem you have to pass
> >> a NULL for the kernel and "/dev/null" for the core image.
>
> In our code, we're invoking kvm_openfiles as you suggest:
> kd = kvm_openfiles (NULL, _PATH_DEVNULL, NULL, O_RDONLY, errbuf)
>
> > It sounds like this issue might be the one fixed in r272566: if the
> > KERN_PROC_ALL sysctl is read with an insufficiently large buffer, an
> > sbuf error return value could bubble up and be treated as ERESTART,
> > resulting in a loop.
> >
> > This can be confirmed with something like
> >
> > dtrace -n 'syscall:::entry /pid == $target/{@[probefunc] = count();} tick-3s {exit(0);}' -p <pid of looping proc>
> >
> > If the output consists solely of __sysctl, this bug is likely the
> > culprit.
>
> Unfortunately, I accidentally killed fstat this morning before I could do
> any further debug.
>
> I ran truss -p on it yesterday and it was spinning solely on __sysctl.
>
> I'll try compiling with debug symbols in case it happens again. I haven't
> been able to reproduce the problem in a reasonable time frame so it could
> be days or weeks before we see it happen again.

The truss output is consistent with Mark's suggestion, so I would try his
suggested fix of r272566.

-- 
John Baldwin