On 03/08/13 04:35, Kevin O'Connor wrote:
> On Thu, Mar 07, 2013 at 09:43:04AM +0100, Aurelien Jarno wrote:
>> On Wed, Mar 06, 2013 at 07:53:51PM -0500, Kevin O'Connor wrote:
>>> That change is definitely just build related - I don't see how it
>>> could impact the final SeaBIOS binary.  How did you conclude that this
>>> commit is what fixes the issue?
>>>
>>
>> I did a git bisect to find the commit fixing the issue. Then, as I did
>> not believe the result, I tried the following sequence a dozen
>> times (for some unknown reason the FreeBSD install CD doesn't exhibit
>> the issue, so I used the Debian GNU/kFreeBSD installer):
> [...]
> 
> Thanks for the detailed bug report.  Here's what I see going on:
> 
> - the SeaBIOS 4219149a commit does change the resulting binary ever so
>   slightly - the src/virtio_ring.c code has a reference to __FILE__
>   (the only code in SeaBIOS that does that), and due to slightly
>   different build rules in this commit it evaluates to a slightly
>   different string.
> 
> - the freebsd crash has nothing to do with 4219149a or
>   src/virtio_ring.c - instead, random changes in the seabios binary
>   layout can cause (or avoid) the crash.  You can see this in action
>   by modifying seabios to have higher debug levels, commenting out
>   code, adding dprintf statements, etc.
> 
> - the crash happens when freebsd attempts to emulate the bios code (!)
>   in order to determine the keyboard typematic rate (!).  (See
>   sys/dev/atkbdc/atkbd.c.)  Since SeaBIOS doesn't support the typematic
>   rate call (int 0x16 ax=0x0306), this wouldn't achieve anything in
>   practice even if the call did not crash.  However, a crash does
>   (sometimes) result.
> 
> - the freebsd x86bios_get_pages() code is buggy (See
>   sys/compat/x86bios/x86bios.c).  It attempts to check that its x86
>   emulator (!) doesn't access a page it hasn't mapped.  However, it
>   does not check for the case where a two byte access spans two pages.
>   If the first page is mapped, but the second is not - splat.  The
>   crash I've seen in QEMU had a two byte access to 0xffffff8000015fff
>   with the fault at 0xffffff8000016000.
> 
> - I have not been able to determine why an attempt was made to access
>   a non-mapped page.  My best guess is that the x86emu code (!) goes
>   off the deep end in all cases - just some cases lead it to the bug
>   above and other cases lead it to a more friendly termination.
>   (Recall that SeaBIOS doesn't support the typematic call anyway.)  It
>   should be possible to track this down by adding debug statements to
>   the freebsd code if anyone is familiar with the freebsd kernel
>   compile-deploy-run cycle.
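
For anyone following along, the page-spanning case described above can be
sketched roughly as below.  This is only an illustration, not the actual
x86bios_get_pages() code: the mapping table, the helper names, the 4 KiB
page size and the use of small offsets instead of kernel virtual addresses
are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page size for this sketch */
#define NPAGES    32u     /* hypothetical size of the emulated region */

/* Hypothetical mapping table: mapped[i] is true if page i is backed. */
static bool mapped[NPAGES];

static bool page_is_mapped(uint32_t page)
{
    return page < NPAGES && mapped[page];
}

/* The kind of check described above: only the page holding the
 * first byte of the access is examined; the size is ignored. */
static bool access_ok_first_page_only(uint32_t addr, size_t size)
{
    (void)size;
    return page_is_mapped(addr / PAGE_SIZE);
}

/* A stricter check: every page touched by [addr, addr + size - 1]
 * must be mapped, so an access spanning two pages is caught. */
static bool access_ok_all_pages(uint32_t addr, size_t size)
{
    assert(size > 0);
    uint32_t first = addr / PAGE_SIZE;
    uint32_t last = (uint32_t)((addr + size - 1) / PAGE_SIZE);

    for (uint32_t p = first; p <= last; p++) {
        if (!page_is_mapped(p))
            return false;
    }
    return true;
}
```

With page 0x15 mapped and page 0x16 unmapped, a two byte access at offset
0x15fff passes the first check but fails the second, which mirrors the
0x...15fff / 0x...16000 pair in the crash above.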

Great analysis!

Laszlo
(sorry for the noise)

