On 24 July 2017 at 18:38, Christophe Lyon <christophe.l...@linaro.org> wrote: > > > Le 24 juil. 2017 18:30, "Ard Biesheuvel" <ard.biesheu...@linaro.org> a écrit > : > > On 18 July 2017 at 13:54, Christophe Lyon <christophe.l...@linaro.org> > wrote: >> On 13 July 2017 at 13:50, Christophe Lyon <christophe.l...@linaro.org> >> wrote: >>> On 12 July 2017 at 19:33, Ard Biesheuvel <ard.biesheu...@linaro.org> >>> wrote: >>>> On 12 July 2017 at 18:27, Alexei Fedorov <alexei.fedo...@arm.com> wrote: >>>>> >>>>> Christophe, Leif, Ard, Ryan at al. >>>>> >>>>> >>>>> We are observing unaligned memory access fault with UEFI code compiled >>>>> by >>>>> Linaro GCC 6.3.1 & 7.1.1 using -O3 optimisation option. >>>>> >>>>> The fault occures at the very early stage of UEFI boot with MMU not >>>>> being >>>>> enabled yet. >>>>> >>>>> The failing function is CalculateSum8() from >>>>> edk2\MdePkg\Library\BaseLib\CheckSum.c: >>>>> >>>>> >>>>> UINT8 >>>>> EFIAPI >>>>> CalculateSum8 ( >>>>> IN CONST UINT8 *Buffer, >>>>> IN UINTN Length >>>>> ) >>>>> { >>>>> UINT8 Sum; >>>>> UINTN Count; >>>>> >>>>> ASSERT (Buffer != NULL); >>>>> ASSERT (Length <= (MAX_ADDRESS - ((UINTN) Buffer) + 1)); >>>>> >>>>> for (Sum = 0, Count = 0; Count < Length; Count++) { >>>>> Sum = (UINT8) (Sum + *(Buffer + Count)); >>>>> } >>>>> >>>>> return Sum; >>>>> } >>>>> >>>>> & the instruction which causes the exception is "ldr q1, [x1], 16" >>>>> which >>>>> accesses Buffer = 0xE0000048 pointed by X1 register, see the part of >>>>> generated assembly code: >>>>> >>>>> >>>>> // r:\edk2\MdePkg\Library\BaseLib\CheckSum.c:49: for (Sum = 0, Count >>>>> = 0; >>>>> Count < Length; Count++) { >>>>> .loc 1 49 0 is_stmt 1 >>>>> cbz x19, .L10 // Length, >>>>> .L4: >>>>> sub x0, x19, #1 // tmp150, Length, >>>>> cmp x0, 14 // tmp150, >>>>> bls .L11 //, >>>>> // r:\edk2\MdePkg\Library\BaseLib\CheckSum.c:42: { >>>>> .loc 1 42 0 >>>>> movi v0.4s, 0 // vect_Sum_19.24 >>>>> lsr x2, x19, 4 // bnd.18, Length, >>>>> mov x1, x20 // ivtmp.29, Buffer >>>>> mov x0, 0 // ivtmp.28, >>>>> .LVL4: >>>>> .p2align 3 >>>>> .L7: >>>>> // r:\edk2\MdePkg\Library\BaseLib\CheckSum.c:50: Sum = (UINT8) (Sum >>>>> + >>>>> *(Buffer + Count)); >>>>> .loc 1 50 0 discriminator 3 >>>>> ldr q1, [x1], 16 // vect__6.23, MEM[(const UINT8 >>>>> *)vectp_Buffer.21_38] >>>>> add x0, x0, 1 // ivtmp.28, ivtmp.28, >>>>> cmp x0, x2 // ivtmp.28, bnd.18 >>>>> add v0.16b, v0.16b, v1.16b // vect_Sum_19.24, vect_Sum_19.24, >>>>> vect__6.23 >>>>> bcc .L7 //, >>>>> >>>>> ... >>>>> >>>>> Although all AARCH64 code is compiled with "-mstrict-align" option >>>>> which >>>>> according to GCC 3.18.1 AArch64 Options: >>>>> >>>>> "-mstrict-align >>>>> >>>>> Avoid generating memory accesses that may not be aligned on a natural >>>>> object >>>>> boundary as described in the architecture specification." >>>>> >>>>> >>>>> the generated code doesn't comply with this description. In this case >>>>> X1 = >>>>> Buffer @0xE0000048 and is not aligned to 16 bytes boundary. >>>>> >>>>> The similiar code is generated by GCC 6.3.1-2017.05 but 5.3.1-2016.05 >>>>> compiler produces only 16 bytes aligned memory accesses when loading Q1 >>>>> register. >>>>> >>>>> >>>>> I attached the simple test file which can be compiled by running GCC >>>>> compilation with >>>>> >>>>> -c test.c -O3 -mstrict-align -save-temps >>>>> >>>>> to see the difference between code generated by 7.1.1 & 5.3.1 GCC >>>>> versions. >>>>> >>>>> It seems that 5.3.1 ignores "-mstrict-align" option at all and always >>>>> generates aligned pointers for loading Q1 register, 7.1.1 & 6.3.1 also >>>>> ignore the option but generate slighly different code with unaligned >>>>> access >>>>> enabled. >>>>> >>>>> >>>>> Please share your thoughts regading this issue. >>>>> >>>> >>>> Hello Alexei, >>>> >>>> This does look like a compiler bug to me. 'Buffer' is a pointer to >>>> unsigned char, and so the compiler should never emit the ldr >>>> instruction under -mstrict-align. >>>> >>>> In the mean time, we could work around this with adding >>>> -mgeneral-regs-only in all places where -mstrict-align is being >>>> passed. In general, I don't really see the point of supporting the use >>>> of FP/ASIMD registers in UEFI beyond ensuring that our builds are >>>> compatible with 3rd party binaries that do use them. >>>> >>> >>> Hello Alexei, >>> >>> I agree with Ard: it looks like a compiler bug, I'm looking at it. >>> >>> And indeed in the mean time, using -mgeneral-regs-only should >>> workaround your problem. >>> >> >> Hello, >> >> As a follow-up, I've posted a patch: >> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01063.html >> >> We'll see if maintainers agree. >> > > Thanks. By the looks of it, nobody cared to respond, right? > > > Not yet and we are used to slow response. > > In addition I'm on holidays until Aug 21st so I won't ping until then. > >
Hi all, My patch was finally accepted last week and committed. I also backported it to the gcc-7-branch, so that the problem will be fixed in the next gcc-7 release (either FSF or Linaro). Thanks, Christophe _______________________________________________ linaro-toolchain mailing list linaro-toolchain@lists.linaro.org https://lists.linaro.org/mailman/listinfo/linaro-toolchain