Hi Rusty, Rusty Russell wrote: > > Lately I upgraded my host and guest from kernel 2.6.25 (Debian) to > > vanilla 2.6.27.8. I have two virtual machines and in both I recognised > > the same issue after a while. > > > > After running for some hours the first guest showed the following > > message on the console: > > virtio_blk virtio1: id 1 is not a head! > This is interesting. There hasn't been a great deal of change between > those versions on the lguest side. This sounds like a race, and the only > thing I can see which would have introduced this is the change below.
I removed the "lguest: turn Waker into a thread, not a process" patch as proposed. Short result: The problem still exists. Longer story: As this issue is extremly hard to reproduce, I made an attempt to cause many file system reads. If I understood things correctly this is when the guest side of virtio_blk is likely to call vring_get_buf?! I did so by creating a large file (300MiB vs. 256MiB memory) in the guest and reading it multiple times in 1k pieces (dd). To verify that it works, I ran this loop on 2.6.27.8 guest with the original launcher: failure after ~200 iterations. So I went on with the patched version of lguest (launcher with waker process) and the same 2.6.27.8 guest: failure after ~600 iterations. Next step was a 2.6.25 guest with the original launcher: I stopped it after ~1600 iterations did not show a fault. As it occured to me that using a file in a guest's file system might not be the optimal target, I have now modified my loop to use the first 300MiB of the block device itself. With the original combination, I now got the failure after ~750 iterations. I am now running this test with 2.6.25 guest again. As the host is also serving some serious purpose, I can only run this kind of long lasting tests during night time. So I'll be forced to stop this test soon (but I'll move into bed first). My next step will be to use some non-root block device, to have a chance to access the guest after the crash. Do you have any idea what to look at then? One thing, I just recognised: While my 2.6.25 guest kernel is configured with "NOHz mode", the 2.6.27.8 guest is using "high resolution mode". Thanks so far, André -- Reality is a crutch for people who can't handle drugs.
signature.asc
Description: Digital signature
_______________________________________________ Lguest mailing list [email protected] https://ozlabs.org/mailman/listinfo/lguest
