On Tue, Feb 21, 2017 at 06:07:15PM +0100, Laszlo Ersek wrote: > CC Rebecca & Konrad > > On 02/21/17 17:39, Anthony PERARD wrote: > > On Sat, Dec 03, 2016 at 06:59:28PM +0100, Laszlo Ersek wrote: > >> On 12/02/16 20:26, Laszlo Ersek wrote: > >>> On 12/02/16 17:02, Anthony PERARD wrote: > >>>> On Thu, Dec 01, 2016 at 07:43:24PM +0100, Laszlo Ersek wrote: > >>>>> On 12/01/16 16:28, Anthony PERARD wrote: > >>>>>> Hi, > >>>>>> > >>>>>> That might be only with the Xen part of OVMF but now that the GCC5 > >>>>>> toolchains is used with my gcc (6.2.1 20160830, Arch Linux), OVMF fail > >>>>>> to boot in Xen guests. > >>>>>> > >>>> [...] > >>>>>> > >>>>>> Removing the gcc option -flto in only the XenBusDxe module makes OVMF > >>>>>> boot. > >>>>>> > >>>>>> While trying to debug that, I've added some debug prints (in this > >>>>>> module > >>>>>> and in XenPvBlkDxe), and the exception could change and become a "page > >>>>>> fault" instead, or even an assert failure in the PrintLib, that was the > >>>>>> ASSERT(Buffer != NULL) at I think > >>>>>> MdePkg/Library/BasePrintLib/PrintLibInternal.c:366 > >>>>>> > >>>>>> Adding EFIAPI to internal functions in XenBusDxe makes things work > >>>>>> again. My guest is that gcc would bypass (optimise) an exported > >>>>>> functions and call directly an internal one but without reordering the > >>>>>> arguments (EFIAPI vs nothing). > >>>>>> > >>>>>> Does that make sense? > >>>>> > >>>>> If "-b NOOPT" works for you, I'd prefer that as a temporary solution > >>>>> (until the root cause is found and addressed) to the XenBusDxe patches. > >>>> > >>>> That works, using GCC49 (with gcc 6.2.1) works as well. > >>>> > >>>>> Hrpmf, wait a second, I do see something interesting: in this series you > >>>>> *are* modifying APIs declared in a library class header (namely > >>>>> "OvmfPkg/Include/Library/XenHypercallLib.h"). Such functions (public > >>>>> libraries) *are* required to specify EFIAPI. > >>>>> > >>>>> What happens if you apply patch #1 only? > >>>> > >>>> With only XenHypercallLib changes, the error is the same. > >>>> > >>>> But I did find the minimum change needed, it envolve a function with a > >>>> VA_LIST as argument. > >>>> > >>>> With only the following patch, OVMF works again. > >>>> > >>>> diff --git a/OvmfPkg/XenBusDxe/XenStore.c b/OvmfPkg/XenBusDxe/XenStore.c > >>>> index 1666c4b..85b0956 100644 > >>>> --- a/OvmfPkg/XenBusDxe/XenStore.c > >>>> +++ b/OvmfPkg/XenBusDxe/XenStore.c > >>>> @@ -1307,6 +1307,7 @@ XenStoreTransactionEnd ( > >>>> } > >>>> > >>>> XENSTORE_STATUS > >>>> +EFIAPI > >>>> XenStoreVSPrint ( > >>>> IN CONST XENSTORE_TRANSACTION *Transaction, > >>>> IN CONST CHAR8 *DirectoryPath, > >>>> diff --git a/OvmfPkg/XenBusDxe/XenStore.h b/OvmfPkg/XenBusDxe/XenStore.h > >>>> index c9d4c65..33bb647 100644 > >>>> --- a/OvmfPkg/XenBusDxe/XenStore.h > >>>> +++ b/OvmfPkg/XenBusDxe/XenStore.h > >>>> @@ -209,6 +209,7 @@ XenStoreSPrint ( > >>>> indicating the type of write failure. > >>>> **/ > >>>> XENSTORE_STATUS > >>>> +EFIAPI > >>>> XenStoreVSPrint ( > >>>> IN CONST XENSTORE_TRANSACTION *Transaction, > >>>> IN CONST CHAR8 *DirectoryPath, > >>>> IN CONST CHAR8 *Node, > >>>> IN CONST CHAR8 *FormatString, > >>>> IN VA_LIST Marker > >>>> ); > >>>> > >>>> > >>>> I think the exception happen when this function is called via > >>>> XENBUS_PROTOCOL->XsPrintf() from XenPvBlockFrontInitialization() in > >>>> OvmfPkg/XenPvBlkDxe/BlockFront.c > >>>> > >>> > >>> It used to be a known requirement / limitation that all functions with > >>> variable argument lists had to be EFIAPI, regardless of cross-module > >>> use. However, commit 48d5f9a551a93acb45f272dda879b0ab5a504e36 changed > >>> that, and varargs should "just work" now. I suspect this is a > >>> __builtin_ms_va_* regression in gcc-6. Thank you for narrowing it down. > >>> It might make sense to report a bug in the upstream gcc tracker. > >>> > >>> ... Oh wow, this is a known gcc bug! See: > >>> > >>> https://lists.01.org/pipermail/edk2-devel/2016-August/001018.html > >>> > >>> Upstream gcc BZ <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70955> was > >>> apparently solved for "Target Milestone: 6.3" (your version is 6.2.1). > >>> So we'll either need a GCC6 toolchain in BaseTools that drops -flto, in > >>> order to work around this gcc issue, or we'll have to ask gcc-6 users to > >>> use at least gcc-6.3. > >>> > >>> Oh wait, gcc-6.3 hasn't been released yet. We need the BaseTools > >>> workaround then. > >> > >> I think I got confused in parts of the above; I got some details wrong. > >> Namely, commit 48d5f9a551a9 did not remove the requirement/limitation > >> that all varargs functions have to be EFIAPI. Said commit only changed > >> how the VA_*() macros would be implemented. > >> > >> The two caller functions of XenStoreVSPrint(), namely XenStoreSPrint() > >> and XenBusXenStoreSPrint(), are varargs functions, but they are already > >> EFIAPI. So the requirement/limitation (which was unaffected by > >> 48d5f9a551a9) is actually satisfied / considered in XenBusDxe. > >> > >> The XenStoreVSPrint() function, which you identified as the breaking > >> part, is *not* a varargs function itself, so it needn't be EFIAPI. It > >> simply receives a VA_LIST parameter (which is "__builtin_ms_va_list", > >> from commit 48d5f9a551a9), and (a) copies it with VA_COPY() for passing > >> the copy to SPrintLengthAsciiFormat(), (b) passes the original parameter > >> to AsciiVSPrint(). In turn both of those functions call the common > >> BasePrintLibSPrintMarker() function. > >> > >> Comment <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70955#c6> says, > >> > >>> This is bug report that the specialized > >>> __builtin_ms_va_{list,start,end,copy} builtins have stopped working > >>> when -flto is used. They worked until gcc 5.3, both with or without > >>> -flto. In gcc 6.1 with -flto, the canonical iterator __builtin_va_arg > >>> ignores them and works on a sysv_va_list. To be precise, it's > >>> __builtin_va_arg in the context of -flto that's broken, not the > >>> specialized builtins. __builtin_va_arg has always been a polymorphic > >>> builtin that changes its behavior based on the type of va_list it's > >>> given as an argument. Without this polymorphic behavior, there's no > >>> way to iterate over an ms_va_list. > >> > >> Apparently, when BasePrintLibSPrintMarker() finally calls VA_ARG() (== > >> __builtin_va_arg(), from commit 48d5f9a551a9) on Marker / Marker2, with > >> LTO enabled, __builtin_va_arg() fails to notice what context > >> VaListMarker comes from: > >> - __builtin_ms_va_start() in XenStoreSPrint() and XenBusXenStoreSPrint(), > >> or > >> - __builtin_ms_va_copy() in XenStoreVSPrint(). > >> > >> So I think we *are* being hit by gcc BZ#70955, and making > >> XenStoreVSPrint() EFIAPI only masks the issue. Comment > >> <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70955#c7> seems relevant: > >> > >>> The change with GCC 6 is that the builtins are now lowered during > >>> link-time optimization rather than at compile-time. Thus the abi > >>> selection bits are possibly not transfered correctly (type merging?). > >>> I remember the business was quite ugly, but eventually we just miss to > >>> properly transfer the function attribute. > >> > >> The end result for edk2 remains the same (= BaseTools should work around > >> this gcc issue with a new GCC6 toolchain that drops -flto, unless > >> gcc-6.3 is about to become available to users real quick). I just wanted > >> to point out that my earlier statement "commit 48d5f9a551a9 had removed > >> the need for varargs functions to be EFIAPI" was incorrect -- varargs > >> functions still must be EFIAPI (and XenBusDxe conforms, see > >> XenStoreSPrint() and XenBusXenStoreSPrint()). > > > > Hi Laszlo, > > > > Now that gcc 6.3 is out, the bug described in the thread strikes again. > > Building OVMF with -flto result in Page-Fault or General Protection > > fault, due to the way va_args are used in XenStoreVSPrint(). > > > > Also, now I've tried to build OVMF with gcc 5.4, same result, using > > -flto result in a firmware that does not work. > > > > > > I've tried to create a small programme that use the va_args in the same > > way, and compiled-test it with different gcc (gcc 4.9.2, gcc 5.4, gcc > > 6.3), then depending on the options use, it does not work or it works: > > > > Don't work (prog segv or wrong output): > > gcc -o prog va_main.c va_test.c > > gcc -o prog -flto va_main.c va_test.c > > gcc -Os -o prog -flto va_main.c va_test.c > > > > Work: > > gcc -Os -o prog va_main.c va_test.c > > > > I'll attach va_main.c and va_test.c. > > > > So, should I add EFIAPI to XenStoreVSPrint, as it is using VA_COPY? > > > > Hm, please help me jog my memory... > > If I remember correctly, this is still a GCC bug, one that we suppressed for > gcc-6.2 with your patch as follows:
Yes. > > commit 432f1d83f77acf92d52ef18d2cee6dbf7c5b9b86 > > Author: Anthony PERARD <[email protected]> > > Date: Tue Dec 6 12:03:25 2016 +0000 > > > > OvmfPkg/build.sh: Use GCC49 toolchains with GCC 6.[0-2] > > > > The goal of the patch is to avoid using -flto with GCC 6.0 to 6.2. > > > > This is to workaround a GCC bug: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70955 > > > > Contributed-under: TianoCore Contribution Agreement 1.0 > > Signed-off-by: Anthony PERARD <[email protected]> > > Reviewed-by: Laszlo Ersek <[email protected]> > > Regression-tested-by: Laszlo Ersek <[email protected]> > > > > diff --git a/OvmfPkg/build.sh b/OvmfPkg/build.sh > > index 95fe8fb07647..b6e936056ca0 100755 > > --- a/OvmfPkg/build.sh > > +++ b/OvmfPkg/build.sh > > @@ -102,7 +102,7 @@ case `uname` in > > 4.8.*) > > TARGET_TOOLS=GCC48 > > ;; > > - 4.9.*) > > + 4.9.*|6.[0-2].*) > > TARGET_TOOLS=GCC49 > > ;; > > *) > > Do I understand correctly that the gcc bug has not been fixed in gcc-6.3, and > -- because we don't suppress it for gcc-6.3 as the above expression does not > match -- it causes problems again? The bug describe in the GCC bugzilla is probably fix, but the test-case does not make use of __builtin_ms_va_copy. > You also mention gcc-5.4 as problematic. I think we haven't received such > reports about gcc-5 versions up to and including gcc-5.3 (that's why GCC5 is > the default selection in "OvmfPkg/build.sh"). Do you mean that the gcc bug > has now been "backported" from the gcc-6 series to the gcc-5 series (starting > with gcc-5.4)? I don't know the state of gcc-5.0 to gcc-5.3, I have never tested -flto with gcc-5.x (until now), I would say they are also problematic until proven otherwise. > If that's the case, then I suggest flipping "OvmfPkg/build.sh" from > black-listing gcc versions for -flto to white-listing. In other words, assume > that -flto is generally broken with GCC, except for a few known versions: 5.0 > through 5.3 inclusive. Those versions should trigger the use of the GCC5 > toolchain, and everything else (5.4+, 6.*, 4.9.*) should use GCC49. > > I don't feel comfortable about adding EFIAPI to XenStoreVSPrint just because > it takes a VA_LIST parameter -- note: it is *not* a varargs function itself! > --; the same issue might hit elsewhere in the edk2 tree at any time, outside > of OvmfPkg too. >From the different tests I've done, I feel more like VA_COPY might be the issue, but I don't know how __builtin_ms_va_* are supposed to be used. > Would the gcc white-listing work for you? > > Note that the white-listing would practically undo Konrad's commit > 2667ad40919a ("OvmfPkg/build.sh: Make GCC5 the default toolchain, catch GCC43 > and earlier", 2016-11-23), but given the recent gcc developments (gcc-6.3 has > not fixed the gcc bug, and the bug has even surfaced in gcc-5.4), I think it > would be justified. Do be honnest, I don't think the toolchain GCC5 has ever been tested with gcc-5.x and the module XenBusDxe. I think most people that want to start OVMF under Xen are likely to build it with gcc-4.9 or already had gcc-6.x when OVMF switch to the GCC5 toolchain by default. -- Anthony PERARD _______________________________________________ edk2-devel mailing list [email protected] https://lists.01.org/mailman/listinfo/edk2-devel

