Further progress trying to track this down:

I don't have to shut down the system to have problems.  "swapoff /dev/hd0s5"
is enough to cause them, once enough swap is in use.  After a failed
swapoff, I have 98 extra storeio processes running!
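
For anyone who wants to check the storeio count on their own system, here
is a minimal sketch of how it could be compared before and after the
swapoff.  The helper name count_procs is made up for illustration, and
"ps -e" listing every process is an assumption about the ps in use; this is
not a record of the exact commands behind the number above.

```shell
# count_procs NAME: count the lines on stdin that contain NAME.
# Hypothetical helper for illustration; not part of the Hurd.
count_procs() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the helper from reporting a failure.
    grep -c -- "$1" || true
}

# Assumed usage (swapoff needs root; "ps -e" is assumed to list all processes):
#   before=$(ps -e | count_procs storeio)
#   swapoff /dev/hd0s5
#   after=$(ps -e | count_procs storeio)
#   echo "extra storeio processes: $((after - before))"
```

If ps prints full command lines, the grep process itself may be counted,
but it would appear in both samples, so the difference should be unaffected.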

I don't have to swapoff to have "symptoms".  The kernel debugger normally
shows symbolic names, e.g.:

Stopped  at  machine_idle+0xe:   leave
machine_idle(0,81a2c630,3806f64,0,9b448b38)+0xe
idle_thread_continue(9fcbdde0,81028b50,9c0c7fe4,0,9c3d5548)+0x2a

Once enough swap is in use, though, it stops resolving symbols, and I see
only raw addresses:

Stopped       at  0x810000be: leave
0x810000be(0,0,9fcc5990,0,9fb90b30)
0x810293fa(9fcbdde0,81028b50,99526fe4,0,9c3d5548)

When I see a kernel page fault, it's always in strcmp().

It doesn't matter whether an ssh session is open or not (Riccardo Mottola's
suggestion).

I can't task_terminate the auth server: once the symptoms have started,
that typically does nothing.  I can, however, kill the auth server from the
command line (just "kill 7"), which triggers a reboot that leaves the
disk in a clean state.

I'm just learning Hurd.  Any ideas?

    agape
    brent
