(Please CC replies to me, I'm not subscribed to lkml.) I think I've found a bug, local to linux/init/main.c, related to freeing init pages, which seems to be present in 2.4.5. Executive summary: After start_kernel has created the init thread, it needs to proceed to cpu_idle before init pages are free:d, since it's __init-marked. But if the init thread is scheduled to run before the parent thread, and in turn its call to do_basic_setup returns quickly, start_kernel will be free_initmem:d before that thread has proceeded to cpu_idle. Boom. Patch at end. Somewhat lengthier description and analysis: I found this out the hard way when fixing free_initmem (and the linker script) in Linux/CRIS. In linux/init/main.c it looks as follows at a conceptual level: void __init start_kernel() { foo_init(); blah_init(); ... kernel_thread (init, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGNAL); current->need_resched = 1; cpu_idle(); } Correspondingly, the function init looks as follows: static int init(void * unused) { do_basic_setup(); free_initmem(); ... start up /bin/init or whatever ... } And do_basic_setup looks like this: static void __init do_basic_setup(void) { baz_init (); ... more *_init() calls ... ... mount root fs ... } If nothing in do_basic_setup causes rescheduling, it might (depending on e.g. timing) return before the parent has proceeded to call cpu_idle, from which it never returns. If so, free_initmem will be called, freeing among others start_kernel. Later, when the parent thread is scheduled to run and returns from the kernel_thread call, it will execute in unmapped memory. Bad: will crash, perhaps depending on that memory being recycled and modified. Normally, mounting the root fs involves mechanical disks or network activity or other stuff that causes I/O or takes time. Then rescheduling will cause the parent thread to resume to safety before do_basic_setup returns. You're most likely to see the race problem on platforms with a solid-state root fs. Analysis sanity-checked with [EMAIL PROTECTED] Repeatable on an "Axis Developer Board LX" (with patches to fix the linker script and remove the #if 0:s from linux/arch/cris/mm/init.c; contact me if you actually need to do this). One solution is to break out the last piece of start_kernel to a non-__init function as follows. The patch is against 2.4.5. 2001-06-12 Hans-Peter Nilsson <[EMAIL PROTECTED]> * linux/init/main.c (rest_init): New, broken out from start_kernel. --- linux/init/main.c.original Tue May 22 18:35:42 2001 +++ linux/init/main.c Wed Jun 13 01:20:54 2001 @@ -502,6 +502,21 @@ static void __init smp_init(void) #endif /* + * We need to finalize in a non-__init function or else race conditions + * between the root thread and the init thread may cause start_kernel to + * be reaped by free_initmem before the root thread has proceeded to + * cpu_idle. + */ + +static void rest_init(void) +{ + kernel_thread(init, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGNAL); + unlock_kernel(); + current->need_resched = 1; + cpu_idle(); +} + +/* * Activate the first processor. */ @@ -583,10 +598,7 @@ asmlinkage void __init start_kernel(void * make syscalls (and thus be locked). */ smp_init(); - kernel_thread(init, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGNAL); - unlock_kernel(); - current->need_resched = 1; - cpu_idle(); + rest_init(); } #ifdef CONFIG_BLK_DEV_INITRD brgds, H-P - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/