* Kevin O'Connor ([email protected]) wrote:
> On Mon, Jan 23, 2017 at 06:49:07PM +0000, Dr. David Alan Gilbert wrote:
> > * Laszlo Ersek ([email protected]) wrote:
> > > On 01/23/17 16:49, Kevin O'Connor wrote:
> > > > On Mon, Jan 23, 2017 at 11:11:02AM +0100, Laszlo Ersek wrote:
> > > >> On 01/20/17 20:39, Dr. David Alan Gilbert wrote:
> > > >>> * Kevin O'Connor ([email protected]) wrote:
> > > >>>> On Fri, Jan 20, 2017 at 06:40:44PM +0000, Dr. David Alan Gilbert wrote:
> > > >>>>> Hi,
> > > >>>>> I turned the debug level up to 4 on our smaller (128k) ROM downstream
> > > >>>>> build and seem to have hit a case where it's been laid out so that the
> > > >>>>> 'ExtraStack' is at the same location as some code (display_uuid) which
> > > >>>>> was causing some very random behaviour;
> > > > [...]
> > > >> Would this be consistent with a stack overflow?
> > > >>
> > > >> See commit 46b82624c95b951e8825fab117d9352faeae0ec8. Perhaps
> > > >> BUILD_EXTRA_STACK_SIZE (2KB) is too small now?
> > > >
> > > > The ExtraStack isn't used at the point Dave reports the problem -
> > > > display_uuid() is part of the init phase and that happens on the main
> > > > "post" stack.
> > > >
> > > > [...]
> > > >> (This is based off 1.9.1)
> > > >
> > > > I missed that earlier - there were some important fixes post 1.9.1 wrt
> > > > reboots. Commits b837e68d / a48f602c2 could explain the issue. I'd
> > > > make sure the issue is still present on the latest version.
> > >
> > > That's a very promising hunch -- b837e68d explicitly mentions "reboot
> > > loop" in the subject. It seems that Dave didn't mention any RHBZ numbers
> > > in his email, but we have two somewhat similar bug reports (which I hope
> > > share a root cause) and the second report triggers the issue with a
> > > reboot loop specifically.
> > >
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1411275
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1382906
> > >
> > > (Apologies that the 2nd RHBZ is not public; it's currently filed for the
> > > RH kernel, and those BZs default to private. :/)
>
> That first report mentions migration to a different QEMU version.
> When migrating, is the BIOS software migrated as well (the copy at
> 0xffff0000), or does the new instance get a potentially different
> instance of the BIOS?

This case isn't with migration; as soon as I started debugging that, I found
I could hit it with just a simple reboot loop in the guest.
(And migration should give you exactly the same ROM image as the source,
because the read-only blocks get migrated and used rather than the local copy
on the destination. That holds until you power the VM down and restart QEMU,
at which point it rereads the local file.)

> > > CC'ing DavidH too, for RHBZ#1382906.
> >
> > Yeh, it's looking promising; I've done a build with low debug that
> > survived for 50+ reboots and turned my debug on and it's going for 20 so far,
> > so that's pretty good.
> >
> > However, reading the commits I'm a little confused.
> >
> > I don't seem to have hit any cases where it's taken the shutdown case after
> > failing to reboot; so it's not that path.
> >
> > My reboots in this case are always guest triggered, so they're not very
> > early reboots.
>
> Both of those seabios fixes are for reboots that occur while
> processing a reboot. Any chance the guest tries multiple reboot
> signals and one of them gets delayed?

It's possible; I'll have to add some qemu debug to watch for that.
The guest I'm using is a rhel 6.9-ish image, so an old kernel with lots of
patches, but I'll try and see what it's doing.
I've just got a loop that waits for the VM to boot, sees the login/password
prompt on the serial console, and as soon as it gets a shell issues 'reboot'.

> > One comment in there is:
> > + // Some old versions of KVM don't store a pristine copy of the
> > + // BIOS in high memory. Try to shutdown the machine instead.
> >
> > do you have a definition of 'old'; in this case it's a new-ish qemu
> > on our downstream (older) kernel but it's got fairly new kvm bits in,
> > but the qemu is configured in our rhel6 compatibility mode - so hmm.
>
> I don't have the kvm version handy, but it's really old. You're
> definitely not on that version, or every reboot would result in a
> shutdown instead.

OK, thanks.

Dave

> -Kevin
--
Dr. David Alan Gilbert / [email protected] / Manchester, UK

_______________________________________________
SeaBIOS mailing list
[email protected]
https://www.coreboot.org/mailman/listinfo/seabios
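
For anyone wanting to reproduce the guest-triggered reboot loop described
above (wait for boot, see the login/password prompt on the serial console,
get a shell, issue 'reboot'), here is a minimal sketch of the decision logic.
It is not the script actually used in this thread: the class name
RebootLoopDriver, the prompt strings, and the credentials are all
hypothetical, and wiring feed() to a real serial console (socket, pty, etc.)
is left out.

```python
class RebootLoopDriver:
    """Hypothetical helper: consumes serial-console output and decides
    what to type next in order to keep a guest in a reboot loop."""

    def __init__(self, user="root", password="password", shell_prompt="# "):
        # All three values are assumptions; adjust them to the guest image.
        self.user = user
        self.password = password
        self.shell_prompt = shell_prompt
        self.state = "wait_login"
        self.reboots = 0          # number of times 'reboot' has been issued
        self._buf = ""            # console output seen since the last reply

    def feed(self, text):
        """Consume a chunk of console output.

        Returns the string to send to the guest, or None if there is
        nothing to send yet.
        """
        self._buf += text
        if self.state == "wait_login" and "login: " in self._buf:
            return self._advance("wait_password", self.user + "\n")
        if self.state == "wait_password" and "Password: " in self._buf:
            return self._advance("wait_shell", self.password + "\n")
        if self.state == "wait_shell" and self.shell_prompt in self._buf:
            self.reboots += 1
            return self._advance("wait_login", "reboot\n")
        return None

    def _advance(self, new_state, reply):
        # Drop already-matched output so stale prompts are not matched again.
        self._buf = ""
        self.state = new_state
        return reply
```

Each chunk read from the serial console is passed to feed(), and whatever it
returns is written back to the console. The reboots counter gives the number
of iterations survived, which is the figure quoted above (50+ with low debug,
20 and counting with debug on).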
