Re: [Bug 818177] Re: boot failures caused by udev race

2011-10-06 Thread Steve Langasek
On Thu, Oct 06, 2011 at 01:18:27AM -0000, Serge Hallyn wrote: see comment #42 for one disk layout that reproduces it for me in a kvm VM. This describes a physical disk layout... it doesn't describe a partition or LV layout, which is what is key to reproducing the LVM-related hang. Is this VM

Re: [Bug 818177] Re: boot failures caused by udev race

2011-10-06 Thread Steve Langasek
On Thu, Oct 06, 2011 at 01:20:03AM -0000, Serge Hallyn wrote: also see bug 833891 as a udev bug specifically for the LVM case. Do you mean you're already tracking the LVM case on that bug instead, and that in the setup you're using there is *no* use of LVM? I guess no one told Adam that this is

[Bug 818177] Re: boot failures caused by udev race

2011-10-06 Thread James Hunt
@Adam/@Serge: could you try the following to see if it solves / reduces the occurrence of the problem. Also, can you report back if you see any interesting processes as logged by the change below: 1) Change initramfs to bind mount /dev and dump devices and running processes before udevd exits: $

[Bug 818177] Re: boot failures caused by udev race

2011-10-06 Thread Serge Hallyn
@Steve, (re comment #52) I'm not sure what you were asking for then, but as I said in that comment, partition 1 is a simple ext3 filesystem. Partition 2 is just an extended partition. Partition 5 (the only one on the extended partition) is swap. There is no LVM. With standard (non-instrumented) udev it hangs

Re: [Bug 818177] Re: boot failures caused by udev race

2011-10-06 Thread Serge Hallyn
Quoting Steve Langasek (steve.langa...@canonical.com): On Thu, Oct 06, 2011 at 01:20:03AM -0000, Serge Hallyn wrote: also see bug 833891 as a udev bug specifically for the LVM case. Do you mean you're already tracking the LVM case on that bug instead, and I was, yes. I filed that one some

Re: [Bug 818177] Re: boot failures caused by udev race

2011-10-06 Thread Serge Hallyn
Quoting James Hunt (818...@bugs.launchpad.net): @Adam/@Serge: could you try the following to see if it solves / reduces the occurrence of the problem. It certainly didn't solve it, hung on first try :) I'm not sure what you would deem interesting, but here is the output from the script as

[Bug 818177] Re: boot failures caused by udev race

2011-10-06 Thread Steve Langasek
Serge, I'm not sure what you were asking for then, but as I said in that comment, partition 1 is a simple ext3 filesystem. Oh, doh - I read the wrong comment. Sorry. :) -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.

[Bug 818177] Re: boot failures caused by udev race

2011-10-06 Thread Andy Whitcroft
Ok. I have a machine here which triggers something similar to this pretty often. For me udev finds /dev is read-only and halts boot. This is presumably because our devtmpfs /dev has not made it into / when it starts. I was getting a failed boot about every 5-6 boots, 4 in 20 overall. I then

[Bug 818177] Re: boot failures caused by udev race

2011-10-05 Thread Adam Gandelman
After modifying the initramfs along the lines of https://bugs.launchpad.net/ubuntu/oneiric/+source/udev/+bug/833783/comments/17, I've managed to hit the bug. Attached is output from a failed boot and a successful boot. ** Attachment added: 'udevadm monitor -e' log of a failed boot

[Bug 818177] Re: boot failures caused by udev race

2011-10-05 Thread Adam Gandelman
** Attachment added: 'udevadm monitor -e' log of a successful boot https://bugs.launchpad.net/ubuntu/+source/linux/+bug/818177/+attachment/2514537/+files/udev.success

[Bug 818177] Re: boot failures caused by udev race

2011-10-05 Thread Steve Langasek
analysis of the two logs shows the following events missing from udev in the failed case. UDEV [4.229209] add /devices/pci0000:00/0000:00:01.1/host2/target2:0:1/2:0:1:0/block/sdc (block) UDEV_LOG=3 ACTION=add DEVPATH=/devices/pci0000:00/0000:00:01.1/host2/target2:0:1/2:0:1:0/block/sdc

[Bug 818177] Re: boot failures caused by udev race

2011-10-05 Thread Steve Langasek
The corresponding kernel events *are* present, so this seems to be definitively a udev bug and not a kernel bug. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/818177 Title: boot failures caused by
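The comparison described in this message and the previous one (events that udev reported in the successful boot but not in the failed one, checked against the kernel events of the failed boot) can be sketched roughly as follows. This is only an illustrative sketch, not the method actually used in the thread: it assumes both logs are plain `udevadm monitor -e` output whose event header lines look like the `UDEV [3.735566] change /devices/virtual/block/dm-0 (block)` line quoted in this thread, and the function names `parse_events` and `missing_udev_events` are made up for the example.

```python
import re

# Header line of one event as printed by `udevadm monitor`, for example:
#   KERNEL[4.120000] add      /devices/.../block/sdc (block)
#   UDEV  [4.229209] add      /devices/.../block/sdc (block)
# The KEY=value property lines that follow each header are ignored here.
EVENT_RE = re.compile(
    r"^(KERNEL|UDEV)\s*\[\s*([\d.]+)\]\s+(\S+)\s+(\S+)\s+\((\S+)\)\s*$"
)


def parse_events(text):
    """Return {"KERNEL": set, "UDEV": set} of (action, devpath, subsystem)."""
    events = {"KERNEL": set(), "UDEV": set()}
    for line in text.splitlines():
        match = EVENT_RE.match(line)
        if match:
            source, _timestamp, action, devpath, subsystem = match.groups()
            events[source].add((action, devpath, subsystem))
    return events


def missing_udev_events(good_log, bad_log):
    """Compare a successful and a failed boot log (hypothetical helper).

    Returns two sorted lists of UDEV events that appear in the good log
    but not in the bad log:
      1. events the kernel DID emit in the failed boot (udev never
         reported them as processed),
      2. events the kernel did NOT emit in the failed boot either.
    """
    good = parse_events(good_log)
    bad = parse_events(bad_log)
    missing = good["UDEV"] - bad["UDEV"]
    kernel_saw = sorted(e for e in missing if e in bad["KERNEL"])
    kernel_missed = sorted(e for e in missing if e not in bad["KERNEL"])
    return kernel_saw, kernel_missed
```

Feeding the text of the two attached logs to `missing_udev_events` would give the first list as the candidates for "udev lost it" and the second as the candidates for "the kernel never sent it". Timestamps are deliberately left out of the comparison, since they differ from boot to boot.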

[Bug 818177] Re: boot failures caused by udev race

2011-10-05 Thread Steve Langasek
The last udev event shown in the failure case is this one: UDEV [3.735566] change /devices/virtual/block/dm-0 (block) UDEV_LOG=3 ACTION=change DEVPATH=/devices/virtual/block/dm-0 SUBSYSTEM=block DM_COOKIE=4228816 DEVNAME=/dev/dm-0 DEVTYPE=disk SEQNUM=1092 DM_UDEV_PRIMARY_SOURCE_FLAG=1

[Bug 818177] Re: boot failures caused by udev race

2011-10-05 Thread Steve Langasek
** Changed in: linux (Ubuntu Oneiric) Status: Incomplete => Invalid

[Bug 818177] Re: boot failures caused by udev race

2011-10-05 Thread Steve Langasek
Looking at bugs on lvm2 turns up this gem in bug #802626: Just a wild speculation, because I haven't yet digged into the interactions between kernel and udevd, but the semaphore decrementation event might be lost when transitioning from the initrd-udevd to the rootfs-udevd. In cases where

[Bug 818177] Re: boot failures caused by udev race

2011-10-05 Thread Serge Hallyn
@Steve, see comment #42 for one disk layout that reproduces it for me in a kvm VM. (the vm was created with 'vm-new oneiric amd64 clean' - well, technically with a customized vm-new using the mini iso which hasn't yet been merged into ubuntu-qa-tools)

[Bug 818177] Re: boot failures caused by udev race

2011-10-05 Thread Serge Hallyn
@Steve, also see bug 833891 as a udev bug specifically for the LVM case. In fact, Eduard toward the end speculated precisely the semaphore as a cause as you just did.

[Bug 818177] Re: boot failures caused by udev race

2011-10-04 Thread Serge Hallyn
@JamesHunt, you mention LVs in comment #34, but assuming this is the same bug causing my hangs and read-only rootfs on VMs, it does not require LVs. Unfortunately, like you, whenever I've instrumented grub to print out the list of pending events, I can't reproduce it :) Perhaps kgdb is the way

[Bug 818177] Re: boot failures caused by udev race

2011-10-04 Thread Serge Hallyn
@JamesHunt, have you pursued trying to reproduce with a version of udevd which continues to process events when udev_exit==1? I'm unclear as to whether (1) we need to continue to process inotify events as well (so that udev workers don't get hung), and (2) whether that just means that

[Bug 818177] Re: boot failures caused by udev race

2011-10-04 Thread Adam Gandelman
@James: I'm currently at a conference with limited wifi, but I checked yesterday and I can consistently reproduce on my thinkpad + kvm. Any chance you can publish those modded udev packages to a branch or PPA? I'm happy to test and see if I can get anything useful

[Bug 818177] Re: boot failures caused by udev race

2011-10-04 Thread James Hunt
@Adam: Not quite yet, but will work on that when I get a chance tomorrow if possible. I'm intrigued by your setup though as I have a thinkpad + kvm but only see the problem very infrequently (and I've tried setting up images as you specify). How exactly are you invoking kvm? To clarify, are you

[Bug 818177] Re: boot failures caused by udev race

2011-10-04 Thread Adam Gandelman
Hi James- I'm using the server image, kvm+libvirt for the VM, here is the corresponding XML config for the VM. http://paste.ubuntu.com/702405/ Should note that the root disk is a qcow2 image, the two additional images are raw dd'd images, each 100MB. I've also been sure to provide 2 CPUs to the

Re: [Bug 818177] Re: boot failures caused by udev race

2011-10-04 Thread Serge Hallyn
Quoting James Hunt (818...@bugs.launchpad.net): @Serge: what was your storage configuration when you saw the problem without LVs? I think I've seen the problem a couple of times simply by providing 2 extra raw disks to the system but at that point my udevd debug wasn't helpful. I have tried

Re: [Bug 818177] Re: boot failures caused by udev race

2011-10-04 Thread Serge Hallyn
Quoting James Hunt (818...@bugs.launchpad.net): @Serge: what was your storage configuration when you saw the problem without LVs? I think I've seen the problem a couple of times simply by providing 2 extra raw disks to the system but at that point my udevd debug wasn't helpful. I have tried

[Bug 818177] Re: boot failures caused by udev race

2011-10-04 Thread James Hunt
@Dave: I've modded udevd to display some internal details, but cannot now make the images I have fail to boot reliably. Currently working with @jamespage who has a machine that fails to boot most times. I've tried to force more frequent failures by installing with lots of LVs, but that doesn't

[Bug 818177] Re: boot failures caused by udev race

2011-10-04 Thread James Hunt
BTW - I've also looked at how Fedora 15 stops udev with dracut and, using Stefan's terminology, they club it to death like we used to. They also pepper the code with frequent calls to settle and add a few sub-second sleeps here and there, which feels horribly wrong IMHO. I have an off-beat idea as

[Bug 818177] Re: boot failures caused by udev race

2011-10-04 Thread James Hunt
@Adam: from comment #18, do you still have an image that fails to boot 1 in 5 times?

[Bug 818177] Re: boot failures caused by udev race

2011-10-03 Thread Dave Walker
@James, How did you get on with the debug version of udevd? When we tried to do the same, we were unable to reproduce the bug, as the debug statements seemed to slow down udev, hiding the race. Was this the same behaviour you encountered? Thanks.

[Bug 818177] Re: boot failures caused by udev race

2011-09-29 Thread Steve Langasek
Andrew, I'm not sure you're experiencing the same issue; I would say in fact that you have some unrelated kernel bug, since there's no excuse for it taking 2 minutes to settle the kernel event queue. Getting a dump of 'udevadm monitor -e' from this initramfs (which would need to be started

[Bug 818177] Re: boot failures caused by udev race

2011-09-28 Thread James Hunt
It looks like udevd.c is rather aggressive when handling the exit scenario. I'm currently building a debug version of udevd + initramfs to try and see if and how messages are getting lost.

[Bug 818177] Re: boot failures caused by udev race

2011-09-27 Thread Dave Walker
** Summary changed: - HP DL380G5 root disk mounted read-only on boot and boot fails + boot failures caused by udev race

[Bug 818177] Re: boot failures caused by udev race

2011-09-27 Thread Andrew Glen-Young
I have a few machines throwing a kernel panic while netbooting oneiric with a similar error message. I have attached the boot message log with the panic. ** Attachment added: udev-race.log https://bugs.launchpad.net/ubuntu/+source/udev/+bug/818177/+attachment/2471068/+files/udev-race.log