Re: initrd race conditions
On Tue, Feb 07, 2006 at 05:36:51PM +, Steven Haslam wrote:
> There shouldn't be any kernel changes required. Just use mkinitrd.yaird
> to build a new initrd.img.

I installed the sarge-backports yaird and modified /etc/kernel-pkg.conf to say "ramdisk=mkinitrd.yaird" and then ran "dpkg-reconfigure linux-image-2.6.15-1-amd64-k8-smp" which seemed to cause the ramdisk to be rebuilt using yaird. As you explained this meant that the network drivers were not loaded by the initrd.

I still couldn't get my local udev rules to rename the interfaces on startup though. In the end I just disabled the e100 in the BIOS - I hope that the two tg3s will always be discovered in the same order... :-)

Thanks for all your help. I think I'd like to log a bug report about this though. Is udev the best package to do that against?

--
Mike Crowe

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]
Re: initrd race conditions
Mike Crowe wrote:
>> 1. Failure to boot at all and being dumped at a shell prompt inside
>> the ramdisk if the onboard SATA driver is loaded prior to the
>> megaraid driver. This is despite the fact that nothing is connected
>> to SATA. I've worked around this one by disabling the onboard SATA.

On Tue, Feb 07, 2006 at 02:26:54PM +, Steven Haslam wrote:
> I actually had this exact problem myself for the first time this morning
> with my X2 system, with an initrd built by initramfs-tools.
>
> Using yaird may help because AIUI it will only load the modules required
> to mount the root filesystem during the initramfs stage, or at least
> will load those first.

Assuming I can persuade the stock 2.6.15 kernel to use yaird rather than initramfs then I'll give that a go. If I have to compile the kernel myself I might as well just compile in the drivers I need, which will also solve the problem for me.

>> 2. Almost random ordering of ethernet devices between boots. The
>> machine has a single e100 and two tg3 ports. Although I can believe
>> that the two tg3s always appear in the same order I've had the e100
>> detected either first, last or inbetween the two tg3s! Statically
>> configuring IP addresses is very hard if you don't know which will be
>> eth0 next time. I've not fathomed a workaround for this one. This
>> makes bug #342498 entered against the installer even worse.

> You can fix the names of network interfaces using udev-- istr that's
> available in sarge, even though a kernel that supports it isn't (fun).
>
> e.g. I have:
>
> bash$ cat /etc/udev/rules.d/010_local.rules
[snip]

But surely that's too late? The ramdisk will load the drivers before my root filesystem is mounted, and therefore that rules file can't be seen.

--
Mike Crowe
initrd race conditions
I'm running the backports.org 2.6.15 kernel on an otherwise sarge amd64 box with a couple of Opteron 275s.

It seems that modules are being loaded by the initrd at startup in parallel. This seems to lead to serious race conditions with the following symptoms:

1. Failure to boot at all and being dumped at a shell prompt inside the ramdisk if the onboard SATA driver is loaded prior to the megaraid driver. This is despite the fact that nothing is connected to SATA. I've worked around this one by disabling the onboard SATA.

2. Almost random ordering of ethernet devices between boots. The machine has a single e100 and two tg3 ports. Although I can believe that the two tg3s always appear in the same order I've had the e100 detected either first, last or in between the two tg3s! Statically configuring IP addresses is very hard if you don't know which will be eth0 next time. I've not fathomed a workaround for this one. This makes bug #342498 entered against the installer even worse.

Does anyone else see this problem? Is it likely to be amd64 specific? I haven't really used initrds, udev or 2.6 kernels on the dual processor x86 boxes we have so I don't know if they suffer similarly.

--
Mike Crowe
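For anyone wanting to compare interface ordering across boots, a small script along the following lines might help. It is only a sketch: it assumes sysfs is mounted and that each interface directory under /sys/class/net exposes an "address" file and a "device/driver" symlink, which is the usual layout on 2.6 kernels but was not verified on the machine described above. The function name list_interfaces and its sys_class_net parameter are made up for illustration.

```python
import os


def list_interfaces(sys_class_net="/sys/class/net"):
    """Return a sorted list of (name, mac, driver) tuples.

    mac or driver is None when sysfs does not provide it for an entry
    (for example the loopback device has no backing driver).
    """
    result = []
    for name in sorted(os.listdir(sys_class_net)):
        base = os.path.join(sys_class_net, name)
        try:
            with open(os.path.join(base, "address")) as f:
                mac = f.read().strip()
        except OSError:
            mac = None  # no address file, or the entry is not a directory
        driver = None
        driver_link = os.path.join(base, "device", "driver")
        if os.path.islink(driver_link):
            # The symlink target ends in the driver name, e.g. ".../tg3".
            driver = os.path.basename(os.readlink(driver_link))
        result.append((name, mac, driver))
    return result


if __name__ == "__main__" and os.path.isdir("/sys/class/net"):
    for name, mac, driver in list_interfaces():
        print(name, mac or "?", driver or "-")
```

Running it after each boot and comparing the output should show whether the e100 really moves relative to the two tg3 ports, and gives the MAC addresses that a naming rule would need to match on.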
Re: Make, ccache and unreaped defunct shells
On Fri, Feb 03, 2006 at 12:47:10AM +0100, Frederik Schueler wrote:
> I suggest you pick the backported linux-image-2.6.15-1-amd64-k8-smp
> and yaird from www.backports.org.

That's what I ended up doing not long after posting. The problem seemed to go away but it's been replaced by another network related problem that I'm currently discussing on the Linux netdev list. Once I was convinced that the make problem had gone away I was going to post a followup.

> The ccache version you are using is newer than the one in sarge
> (2.3-1.1), is it a sid backport?

It's compiled straight from source.

Thanks for the advice.

--
Mike Crowe
Make, ccache and unreaped defunct shells
I've installed sarge/amd64 on a machine with a couple of Opteron 275s. We're using make (Debian version), ccache 2.4 and distcc 2.18.3 to build quite a large amount of code using -j12.

At some point during the build make will stop spawning new jobs. Looking at the process listing there are a large number (equivalent to the number passed to -j) of completed shell processes sat unreaped with make sat asleep waiting for them.

mac      32196  1.0  0.2 24876 19648 pts/5    S+   11:51   0:02 make
mac      13234  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]
mac      14254  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]
mac      14296  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]
mac      14308  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]
mac      14470  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]
mac      14491  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]
mac      14518  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]
mac      14530  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]
mac      14545  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]
mac      14571  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]
mac      14589  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]
mac      14610  0.0  0.0     0     0 pts/5    Z+   11:52   0:00 [sh]

According to /proc/32196/wchan the make process is in pipe_wait.

It seems like there is a fundamental problem with waiting for child processes somewhere. Has anyone seen anything similar, or can anyone recommend the best course of action?

---8<---
ii  libc6                        2.3.2.ds1-22
ii  kernel-image-2.6.8-11-amd64  2.6.8-16sarge1
ii  make                         3.80-9
--->8---

--
Mike Crowe
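To catch the hang without eyeballing ps output, something like the sketch below could be left running alongside the build. It scans /proc for processes in state Z whose parent is a given PID, relying on the documented layout of /proc/<pid>/stat (pid, command name in parentheses, state, parent pid). The helper names parse_stat and zombie_children are invented here, and this only reports the symptom; it says nothing about why make is not reaping its children.

```python
import os


def parse_stat(text):
    """Parse the start of a /proc/<pid>/stat line into (pid, comm, state, ppid)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    lparen = text.index("(")
    rparen = text.rindex(")")
    pid = int(text[:lparen])
    comm = text[lparen + 1:rparen]
    fields = text[rparen + 1:].split()
    return pid, comm, fields[0], int(fields[1])


def zombie_children(parent_pid, proc="/proc"):
    """Return a sorted list of (pid, comm) for zombie children of parent_pid."""
    zombies = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state, ppid = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited mid-scan, or the line was unparsable
        if state == "Z" and ppid == parent_pid:
            zombies.append((pid, comm))
    return sorted(zombies)
```

With the listing above, zombie_children(32196) would be expected to return the twelve [sh] entries; a count equal to the -j value that does not drop over time is the stuck state described.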