On 12/03/15 00:38, John Baldwin wrote:
> >It sounds like this issue might be the one fixed in r272566: if the
> >KERN_PROC_ALL sysctl is read with an insufficiently large buffer, an
> >sbuf error return value could bubble up and be treated as ERESTART,
> >resulting in a loop.
> >
> >This can be confirmed with something like
> >
> >    dtrace -n 'syscall:::entry/pid == $target/{@[probefunc] = count();} tick-3s {exit(0);}' -p <pid of looping proc>
> >
> >If the output consists solely of __sysctl, this bug is likely the
> >culprit.
>
>Unfortunately, I accidentally killed fstat this morning before I could do any further debug.
>
>I ran truss -p on it yesterday and it was spinning solely on __sysctl.
>
>I'll try compiling with debug symbols in case it happens again. I haven't been able to reproduce the
>problem in a reasonable time frame so it could be days or weeks before we see it happen again.
The truss output is consistent with Mark's suggestion, so I would try
his suggested fix of r272566.

I patched the 10.1 kernel with r272566 and it appears to have fixed the issue. Is this patch likely to be MFCed back to 10-stable?

Our RC script forks off about 200 processes when starting our software, and I wrote a small script to repeatedly stop/start the software, which fairly reliably reproduces the issue about 1 in 10 times. I've been running the script with the patched kernel for an hour now and I haven't seen the issue appear.
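For anyone wanting to try something similar, a stop/start loop of this kind could be sketched roughly as below. This is not the actual script: the function name stress_restart, the service name "myapp", the 5-second settle delay, and the idea of detecting the problem by looking for a leftover fstat process are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only. stress_restart is a hypothetical helper, not the real script.
#
# stress_restart COUNT STOP_CMD START_CMD CHECK_CMD
#   Runs STOP_CMD and then START_CMD, COUNT times over. After each cycle it
#   runs CHECK_CMD, which is expected to exit non-zero when the problem is
#   observed. Prints the number of cycles where CHECK_CMD failed.
stress_restart() {
    count=$1
    stop_cmd=$2
    start_cmd=$3
    check_cmd=$4
    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        i=$((i + 1))
        sh -c "$stop_cmd"
        sh -c "$start_cmd"
        if ! sh -c "$check_cmd"; then
            failures=$((failures + 1))
            echo "cycle $i: problem detected" >&2
        fi
    done
    echo "$failures"
}

# Example invocation (assumed service name "myapp"; the check treats an
# fstat process still running 5 seconds after startup as the symptom):
#   stress_restart 100 'service myapp stop' 'service myapp start' \
#       'sleep 5; ! pgrep -x fstat >/dev/null'
```

With a failure rate of roughly 1 in 10, a run of 100 cycles that prints 0 would be reasonably strong evidence that the patched kernel no longer shows the problem.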

Thanks for your help.

-Nick
--
Founder, CTO
www.akips.com
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
