On Tue, Dec 8, 2015 at 1:29 PM, Christian Bruel <christian.br...@st.com> wrote:
> Hello Ramana,
>
> On 12/08/2015 02:01 PM, Ramana Radhakrishnan wrote:
>>
>> On Tue, Dec 8, 2015 at 12:53 PM, Christian Bruel <christian.br...@st.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> The order of the NEON builtins construction has led to complications
>>> since
>>> the attribute target support. This was not a problem when driven from the
>>> command line, but was causing various issues when the builtins was mixed
>>> between fpu configurations or when used with LTO.
>>>
>>> Firstly the builtin functions was not initialized before the parsing of
>>> functions, leading to wrong type initializations.
>>>
>>> Then error catching code when a builtin was used without the proper fpu
>>> flags was incomprehensible for the user, for instance
>>>
>>> #include "arm_neon.h"
>>>
>>> int8x8_t a, b;
>>> int16x8_t e;
>>>
>>> void
>>> main()
>>> {
>>>    e = (int16x8_t)__builtin_neon_vaddlsv8qi (a, b);
>>> }
>>
>>
>> I'm not sure what problem you are trying to solve here - The user
>> should never be using __builtin_neon_vaddlsv8qi (a, b) here. What
>> happens with vaddl_s16 intrinsic ?
>>
>> They really have to only use the vaddl_s8 intrinsic.
>
>
>
> Sure, that's not the problem, replace _builtin_neon_vaddlsv8qi by vaddl_s8.
> The tests (part of the patch) equivalently fails.
>
> But anyway, users do use the __builtin directly, see for instance the Bug
> 65837

I think that's just a reduced testcase from the issue to illustrate
the problem from Prathamesh who was trying to build chromium with LTO.

The __builtin_neon* aren't published anywhere and people really
shouldn't be using that directly in source code and only use the
interface in arm_neon.h which implements pretty much all the Neon
intrinsics in the ACLE document.

regards
Ramana


>
>
>
>>
>>
>> Ramana
>>
>>>
>>> compiled with default options (without -mfpu=neon -mfloat-abi=hard) gave
>>> pages of
>>>
>>> /arm-none-eabi/6.0.0/include/arm_neon.h:39:9: error: unknown type name
>>> '__simd64_int8_t'
>>>   typedef __simd64_int8_t int8x8_t;
>>> ...
>>> ...
>>> arm_neon.h:4724:3: error: can't convert a vector of type 'poly64x2_t {aka
>>> __vector(4) int}' to type 'int' which has different size
>>>     return (poly64x2_t)__builtin_neon_vsli_nv2di ((int64x2_t) __a,
>>> (int64x2_t) __b, __c);
>>>     ^~~~~~
>>> ...
>>> ... and one for each arm_neon.h lines..
>>>
>>> by postponing the check into arm_expand_builtin, we now emit something
>>> more
>>> useful:
>>>
>>> testo.c: In function 'main':
>>> testo.c:9:7: error: '__builtin_neon_vaddlsv8qi' neon builtin is not
>>> supported in this configuration.
>>>     e = (int16x8_t)__builtin_neon_vaddlsv8qi (a, b);
>>>         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> One small side effect to note: The total memory allocated is 370k bigger
>>> when neon is not used, so this support will have a follow-up to make
>>> their
>>> initialization lazy. But I'd like first to stabilize the stuff for stage3
>>> (or get it pre-approved if the memory is an issue)
>>>
>>> tested without new failures with
>>> {,-mfpu=vfp,-mfpu=neon}{,-march=armv7-a\}
>>> (a few tests that was fail are now unsupported)
>>>
>>> OK for trunk ?
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>

Reply via email to