On 24 July 2017 at 18:38, Christophe Lyon <christophe.l...@linaro.org> wrote:
>
>
> Le 24 juil. 2017 18:30, "Ard Biesheuvel" <ard.biesheu...@linaro.org> a écrit
> :
>
> On 18 July 2017 at 13:54, Christophe Lyon <christophe.l...@linaro.org>
> wrote:
>> On 13 July 2017 at 13:50, Christophe Lyon <christophe.l...@linaro.org>
>> wrote:
>>> On 12 July 2017 at 19:33, Ard Biesheuvel <ard.biesheu...@linaro.org>
>>> wrote:
>>>> On 12 July 2017 at 18:27, Alexei Fedorov <alexei.fedo...@arm.com> wrote:
>>>>>
>>>>> Christophe, Leif, Ard, Ryan at al.
>>>>>
>>>>>
>>>>> We are observing unaligned memory access fault with UEFI code compiled
>>>>> by
>>>>> Linaro GCC 6.3.1  & 7.1.1 using -O3 optimisation option.
>>>>>
>>>>> The fault occures at the very early stage of UEFI boot with MMU not
>>>>> being
>>>>> enabled yet.
>>>>>
>>>>> The failing function is CalculateSum8() from
>>>>> edk2\MdePkg\Library\BaseLib\CheckSum.c:
>>>>>
>>>>>
>>>>> UINT8
>>>>> EFIAPI
>>>>> CalculateSum8 (
>>>>>   IN      CONST UINT8       *Buffer,
>>>>>   IN      UINTN                     Length
>>>>>   )
>>>>> {
>>>>>   UINT8     Sum;
>>>>>   UINTN     Count;
>>>>>
>>>>>   ASSERT (Buffer != NULL);
>>>>>   ASSERT (Length <= (MAX_ADDRESS - ((UINTN) Buffer) + 1));
>>>>>
>>>>>   for (Sum = 0, Count = 0; Count < Length; Count++) {
>>>>>     Sum = (UINT8) (Sum + *(Buffer + Count));
>>>>>   }
>>>>>
>>>>>   return Sum;
>>>>> }
>>>>>
>>>>> & the instruction which causes the exception is "ldr    q1, [x1], 16"
>>>>> which
>>>>> accesses Buffer = 0xE0000048 pointed by X1 register, see the part of
>>>>> generated assembly code:
>>>>>
>>>>>
>>>>> // r:\edk2\MdePkg\Library\BaseLib\CheckSum.c:49:   for (Sum = 0, Count
>>>>> = 0;
>>>>> Count < Length; Count++) {
>>>>>     .loc 1 49 0 is_stmt 1
>>>>>     cbz    x19, .L10    // Length,
>>>>> .L4:
>>>>>     sub    x0, x19, #1    // tmp150, Length,
>>>>>     cmp    x0, 14    // tmp150,
>>>>>     bls    .L11    //,
>>>>> // r:\edk2\MdePkg\Library\BaseLib\CheckSum.c:42: {
>>>>>     .loc 1 42 0
>>>>>     movi    v0.4s, 0    // vect_Sum_19.24
>>>>>     lsr    x2, x19, 4    // bnd.18, Length,
>>>>>     mov    x1, x20    // ivtmp.29, Buffer
>>>>>     mov    x0, 0    // ivtmp.28,
>>>>> .LVL4:
>>>>>     .p2align 3
>>>>> .L7:
>>>>> // r:\edk2\MdePkg\Library\BaseLib\CheckSum.c:50:     Sum = (UINT8) (Sum
>>>>> +
>>>>> *(Buffer + Count));
>>>>>     .loc 1 50 0 discriminator 3
>>>>>     ldr    q1, [x1], 16    // vect__6.23, MEM[(const UINT8
>>>>> *)vectp_Buffer.21_38]
>>>>>     add    x0, x0, 1    // ivtmp.28, ivtmp.28,
>>>>>     cmp    x0, x2    // ivtmp.28, bnd.18
>>>>>     add    v0.16b, v0.16b, v1.16b    // vect_Sum_19.24, vect_Sum_19.24,
>>>>> vect__6.23
>>>>>     bcc    .L7    //,
>>>>>
>>>>> ...
>>>>>
>>>>> Although all AARCH64 code is compiled with "-mstrict-align" option
>>>>> which
>>>>> according to GCC 3.18.1 AArch64 Options:
>>>>>
>>>>> "-mstrict-align
>>>>>
>>>>> Avoid generating memory accesses that may not be aligned on a natural
>>>>> object
>>>>> boundary as described in the architecture specification."
>>>>>
>>>>>
>>>>> the generated code doesn't comply with this description. In this case
>>>>> X1 =
>>>>> Buffer @0xE0000048 and is not aligned to 16 bytes boundary.
>>>>>
>>>>> The similiar code is generated by GCC 6.3.1-2017.05 but 5.3.1-2016.05
>>>>> compiler produces only 16 bytes aligned memory accesses when loading Q1
>>>>> register.
>>>>>
>>>>>
>>>>> I attached the simple test file which can be compiled by running GCC
>>>>> compilation with
>>>>>
>>>>> -c test.c -O3 -mstrict-align -save-temps
>>>>>
>>>>> to see the difference between code generated by 7.1.1 & 5.3.1 GCC
>>>>> versions.
>>>>>
>>>>> It seems that 5.3.1 ignores "-mstrict-align" option at all and always
>>>>> generates aligned pointers for loading Q1 register, 7.1.1 & 6.3.1 also
>>>>> ignore the option but generate slighly different code with unaligned
>>>>> access
>>>>> enabled.
>>>>>
>>>>>
>>>>> Please share your thoughts regading this issue.
>>>>>
>>>>
>>>> Hello Alexei,
>>>>
>>>> This does look like a compiler bug to me. 'Buffer' is a pointer to
>>>> unsigned char, and so the compiler should never emit the ldr
>>>> instruction under -mstrict-align.
>>>>
>>>> In the mean time, we could work around this with adding
>>>> -mgeneral-regs-only in all places where -mstrict-align is being
>>>> passed. In general, I don't really see the point of supporting the use
>>>> of FP/ASIMD registers in UEFI beyond ensuring that our builds are
>>>> compatible with 3rd party binaries that do use them.
>>>>
>>>
>>> Hello Alexei,
>>>
>>> I agree with Ard: it looks like a compiler bug, I'm looking at it.
>>>
>>> And indeed in the mean time, using -mgeneral-regs-only should
>>> workaround your problem.
>>>
>>
>> Hello,
>>
>> As a follow-up, I've posted a patch:
>> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01063.html
>>
>> We'll see if maintainers agree.
>>
>
> Thanks. By the looks of it, nobody cared to respond, right?
>
>
> Not yet and we are used to slow response.
>
> In addition I'm on holidays until Aug 21st so I won't ping until then.
>
>

Hi all,

My patch was finally accepted last week and committed.
I also backported it to the gcc-7-branch, so that the problem will
be fixed in the next gcc-7 release (either FSF or Linaro).

Thanks,

Christophe
_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/linaro-toolchain

Reply via email to