Ken Moffat wrote:
> When I woke up in the
> morning, I was surprised to find that my OpenJDK script was still running
> rm -rf for the source directory, and had been doing so for more than 5
> hours (both wall-clock time and CPU time), and was now at 99%-100% of one
> CPU, according to top.

...Odd.

I upgraded to 3.13.5 a day after it got released, and just yesterday saw
something vaguely but not exactly similar.  Firefox (same binary as before the
kernel upgrade) crashed when I was mid-mouse-movement, which seemed odd, so I
poked at kernel logs.  There were a couple of "BUG: Bad page map in process
firefox" and "BUG: bad page state in process firefox", when trying to
madvise() (in zap_page_range -> unmap_single_vma) and page_fault,
respectively, about 15 minutes before the crash.  Then again at crash time,
the same pair of BUG messages, both in int_signal; the first was down in
do_group_exit -> unmap_vmas -> unmap_single_vma, and the second was down in
unmap_single_vma -> release_pages -> free_pages_prepare.

Then it logged "BUG: bad rss-counter state" twice, followed by "INFO:
rcu_preempt detected stalls on CPUs/tasks: {} (detected by 4, t=18002 jiffies,
g=81303, c=81302, q=7261)" and "INFO: Stall ended before state dump start".

And about when it logged the rcu_preempt message, CPU 4 went busy-looping in
kernel space (according to gkrellm, which showed 100% in orange instead of the
userspace cyan or userspace-niced green) in a kworker thread (according to
top).  I had to reboot to get it back.  Trying to exit X also hung, most likely
because something got scheduled onto that worker during the handoff to the
console driver, so it took alt-sysrq-u / b to get it to actually reboot.

So I guess this is a long way of saying -- are you sure the rm userspace code
is what was hung, and not something in the kernel?  Might be a prevalence of
cosmic rays I suppose, or it might be a memory corruption bug somewhere
causing issues with RCU.
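
One rough way to tell the two apart while the process is still running is to
look at how its CPU time splits between user and kernel mode.  The sketch below
is only an illustration, not something I ran at the time: it reads the state,
utime and stime fields (fields 3, 14 and 15) from /proc/<pid>/stat, and the
helper names parse_proc_stat and cpu_split are made up for this example.

```python
import os


def parse_proc_stat(line):
    """Return (state, utime_ticks, stime_ticks) from one /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or ')', so everything up to the LAST ')' is skipped before splitting.
    rest = line[line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15,
    # which land at indices 11 and 12 of the remainder.
    return rest[0], int(rest[11]), int(rest[12])


def cpu_split(pid):
    """Return (state, user_seconds, system_seconds) for a live process."""
    with open(f"/proc/{pid}/stat") as f:
        state, utime, stime = parse_proc_stat(f.read())
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    return state, utime / ticks_per_second, stime / ticks_per_second


if __name__ == "__main__":
    # Inspect this script's own process as a harmless demonstration.
    print(cpu_split(os.getpid()))
```

Calling cpu_split() on the rm PID a few seconds apart shows which counter is
growing.  A state of R with stime climbing and utime flat suggests the time is
going into the kernel on the process's behalf.  rm mostly makes syscalls
anyway, so treat that as a hint rather than proof.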

(OTOH this system isn't really anywhere near stock LFS, either.  Not sure how
different it is from yours, but it's multilib with a pretty old gcc/glibc.)

-- 
http://linuxfromscratch.org/mailman/listinfo/lfs-dev
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page
