On Wednesday, March 11, 2015 02:00:41 PM Nick Frampton wrote:
> On 11/03/15 07:59, Mark Johnston wrote:
> > On Tue, Mar 10, 2015 at 02:10:09PM -0400, John Baldwin wrote:
> >> Often loops using libkvm are due to programs trying to read kernel data
> >> structures while they are changing.  However, if you use sysctls to fetch
> >> this data instead, you should be able to get a stable snapshot of the
> >> system state without getting stuck in a possible loop.  I believe for
> >> libkvm to use sysctl instead of /dev/kmem you have to pass a NULL for the
> >> kernel and "/dev/null" for the core image.
> 
> In our code, we're invoking kvm_openfiles as you suggest:
> kd = kvm_openfiles (NULL, _PATH_DEVNULL, NULL, O_RDONLY, errbuf)
> 
> 
> > It sounds like this issue might be the one fixed in r272566: if the
> > KERN_PROC_ALL sysctl is read with an insufficiently large buffer, an
> > sbuf error return value could bubble up and be treated as ERESTART,
> > resulting in a loop.
> >
> > This can be confirmed with something like
> >
> >    dtrace -n 'syscall:::entry /pid == $target/{@[probefunc] = count();} tick-3s {exit(0);}' -p <pid of looping proc>
> >
> > If the output consists solely of __sysctl, this bug is likely the
> > culprit.
> 
> Unfortunately, I accidentally killed fstat this morning before I could do any
> further debugging.
> 
> I ran truss -p on it yesterday and it was spinning solely on __sysctl.
> 
> I'll try compiling with debug symbols in case it happens again. I haven't
> been able to reproduce the problem in a reasonable time frame so it could be
> days or weeks before we see it happen again.

The truss output is consistent with Mark's suggestion, so I would try
the fix he pointed to, r272566.
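
For anyone checking their own libkvm usage, below is a minimal sketch of the
sysctl-backed mode discussed above. It assumes a FreeBSD system and linking
with -lkvm. The helper name count_procs_via_sysctl is made up for
illustration, and the non-FreeBSD branch exists only so the file compiles
elsewhere. With a live kernel, kvm_getprocs() with KERN_PROC_ALL should end up
issuing the same sysctl that this thread suspects of spinning.

```c
#include <stdio.h>

#ifdef __FreeBSD__
#include <sys/param.h>
#include <sys/sysctl.h>
#include <sys/user.h>
#include <fcntl.h>
#include <kvm.h>
#include <limits.h>
#include <paths.h>
#endif

/*
 * Hypothetical helper: count process entries through libkvm in sysctl mode.
 * Returns the number of entries reported, or -1 on error or on a
 * non-FreeBSD system.
 */
int
count_procs_via_sysctl(void)
{
#ifdef __FreeBSD__
	char errbuf[_POSIX2_LINE_MAX];
	struct kinfo_proc *kp;
	kvm_t *kd;
	int cnt = 0;

	/* NULL kernel image plus /dev/null core: use sysctl, not /dev/kmem. */
	kd = kvm_openfiles(NULL, _PATH_DEVNULL, NULL, O_RDONLY, errbuf);
	if (kd == NULL) {
		fprintf(stderr, "kvm_openfiles: %s\n", errbuf);
		return (-1);
	}

	/* Fetch the process table; the result is owned by kd. */
	kp = kvm_getprocs(kd, KERN_PROC_ALL, 0, &cnt);
	if (kp == NULL) {
		fprintf(stderr, "kvm_getprocs: %s\n", kvm_geterr(kd));
		cnt = -1;
	}

	kvm_close(kd);
	return (cnt);
#else
	/* Not FreeBSD: libkvm in this form is unavailable. */
	return (-1);
#endif
}
```

If a program built this way hangs again, truss or the dtrace one-liner above
should still show it spinning in __sysctl on an unpatched kernel.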

-- 
John Baldwin
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
