Aloha -

While testing the exec server, I set up a minimal subhurd containing only
the most essential files, rather than a copy of the entire filesystem, and
uncovered a number of bugs.

I've refined the process into a shell script (attached) which creates the
subhurd on a ramdisk and then boots it.

At least three bugs become apparent:

1. /hurd/startup doesn't fall back on /bin/sh if it can't exec
/etc/hurd/runsystem.  This is easy to fix - just a missing increment.
Patch attached.
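
To illustrate the kind of bug I mean, here is a minimal sketch of a
fallback loop.  This is not the attached patch and not the real startup
code: the candidate table, the try_exec hook and the helper names are all
made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list of programs to try, in order of preference.  */
static const char *const init_candidates[] = {
  "/etc/hurd/runsystem",
  "/bin/sh",
  NULL
};

/* Hypothetical exec hook: returns 0 on success, nonzero on failure.  */
typedef int (*try_exec_fn) (const char *path);

/* Try each candidate in turn.  Return the path that started, or NULL if
   none of them could be executed.  */
static const char *
start_first_available (try_exec_fn try_exec)
{
  size_t i = 0;

  while (init_candidates[i] != NULL)
    {
      if (try_exec (init_candidates[i]) == 0)
        return init_candidates[i];
      /* In this sketch, dropping the increment would make the loop retry
         the same path forever and never reach the fallback.  */
      i++;
    }
  return NULL;
}

/* Stand-in for a system where only /bin/sh can be executed.  */
static int
only_sh_exists (const char *path)
{
  return strcmp (path, "/bin/sh") == 0 ? 0 : 1;
}

/* Stand-in for a system where nothing can be executed.  */
static int
nothing_exists (const char *path)
{
  (void) path;
  return 1;
}
```

With only_sh_exists as the hook, start_first_available() skips
/etc/hurd/runsystem and returns "/bin/sh"; with nothing_exists it
returns NULL.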

2. /hurd/startup naively assumes that SIGCHLD and waitpid() both work on
init (PID 1), but they don't.

I've been able to patch this up by special-casing HURD_PID_INIT in two
places in proc/wait.c: in alert_parent (if the PID is HURD_PID_INIT, ignore
the p_parent field and treat startup_proc as the parent) and in S_proc_wait
(if we're called from procserver, make a special attempt to
reap(init_proc)).  I hesitate to submit this as a patch, though, because
I'm not sure how we want to do this.  Do we introduce special cases for
init everywhere we've got a problem with it?  Also, once bug #1 is fixed,
this breaks startup's attempt to start a new shell if the old one dies:
proc doesn't like having a second init process started after the first one
has died and been reaped.  Maybe startup shouldn't try to start a second
init, even if the first one dies.  Either way, startup should still have
some way to detect that init has died.

Our current setup is that PID 5 (ext2fs) runs first, then starts PID 2
(startup), which starts PID 1 (init).  Weird.  The cleanest solution, of
course, would be to have proc actually respect these parenting
relationships; then SIGCHLD and waitpid() would work normally.
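
For concreteness, the alert_parent special case boils down to something
like the sketch below.  The struct is a cut-down stand-in, not proc's real
struct proc; the value of HURD_PID_INIT and the helper name
effective_parent are assumptions made for the example.  Only p_parent,
startup_proc and HURD_PID_INIT are names from the real code.

```c
#include <stddef.h>

/* Assumed value for the sketch; the real constant comes from the Hurd
   headers.  */
#define HURD_PID_INIT 1

/* Cut-down stand-in for proc's per-process structure.  */
struct proc
{
  int p_pid;
  struct proc *p_parent;
};

/* Decide who should be told about P's death.  For init, ignore the
   recorded parent and report to startup instead; for everyone else,
   use the recorded parent as usual.  */
static struct proc *
effective_parent (struct proc *p, struct proc *startup_proc)
{
  if (p->p_pid == HURD_PID_INIT)
    return startup_proc;
  return p->p_parent;
}
```

This is exactly the sort of one-off check that would have to be repeated
wherever init's parentage matters, which is why I'm wary of it.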

3. Booting the subhurd, then running "halt -f" from its shell crashes the
parent Hurd.  Here's what the subhurd displays:

# halt -f
startup: notifying ext2fs.static pseudo-root of halt...done
startup: Killing pid 1
startup: Killing pid 3

...and here's what I see on the parent's console:

panic: thread_invoke: thread 9fcbfc80 has unexpected state 86
Debugger invoked: panic
Kernel Breakpoint trap, eip 0x810200f4
Stopped    at  Debugger+0x13:    int    $3
Debugger(810e015e,7,0,81a2fc90,9fcc6960)+0x13
panic(810e4740,810d9666,9fcbfc80,86,9fcbfc80)+0x79
state_panic(81051d8f,9bbb48dc,0,9fcbfc80,81029300)+0x17
thread_invoke(9fcbfc80,81029300,9b58f698,810292b3)+0x258
thread_run(81029300,9b58f698,0,81029375)+0x49
idle_thread_continue(9bb92568,81028a70,9c092fe4,0,9bd15488)+0x125
db>

Thread 0x9fcbfc80 is a kernel thread, and state 86 (hex) is TH_RUN |
TH_SUSP | TH_IDLE.  I'm not sure how it gets into that state.
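
In case anyone wants to check the decoding, here is a small sketch that
breaks a state word into flag names.  The bit values are the ones I
believe GNU Mach's kern/thread.h uses; treat them as assumptions and
check them against your own tree.  The function name and the table are
made up for the example.

```c
#include <stdio.h>
#include <string.h>

/* Thread state bits, assumed to match GNU Mach's kern/thread.h.  */
#define TH_WAIT   0x01
#define TH_SUSP   0x02
#define TH_RUN    0x04
#define TH_UNINT  0x08
#define TH_HALTED 0x10
#define TH_IDLE   0x80

static const struct
{
  unsigned bit;
  const char *name;
} th_flags[] = {
  { TH_WAIT, "TH_WAIT" },
  { TH_SUSP, "TH_SUSP" },
  { TH_RUN, "TH_RUN" },
  { TH_UNINT, "TH_UNINT" },
  { TH_HALTED, "TH_HALTED" },
  { TH_IDLE, "TH_IDLE" },
};

/* Write the names of the flags set in STATE into BUF, separated by
   " | ", lowest bit first.  Returns BUF.  Output is cut short if BUF is
   too small.  */
static char *
decode_thread_state (unsigned state, char *buf, size_t len)
{
  size_t used = 0;

  if (len == 0)
    return buf;
  buf[0] = '\0';

  for (size_t i = 0; i < sizeof th_flags / sizeof th_flags[0]; i++)
    {
      if ((state & th_flags[i].bit) == 0)
        continue;
      int n = snprintf (buf + used, len - used, "%s%s",
                        used ? " | " : "", th_flags[i].name);
      if (n < 0 || (size_t) n >= len - used)
        break;                  /* Out of room; stop here.  */
      used += (size_t) n;
    }
  return buf;
}
```

Calling decode_thread_state (0x86, buf, sizeof buf) with a 64-byte
buffer gives the same three flags as above, listed lowest bit first.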

    agape
    brent

Attachment: subhurd
Description: Binary data

Attachment: patch
Description: Binary data
