On Tue, Dec 8, 2015 at 1:29 PM, Christian Bruel <christian.br...@st.com> wrote: > Hello Ramana, > > On 12/08/2015 02:01 PM, Ramana Radhakrishnan wrote: >> >> On Tue, Dec 8, 2015 at 12:53 PM, Christian Bruel <christian.br...@st.com> >> wrote: >>> >>> Hi, >>> >>> The order of the NEON builtins construction has led to complications >>> since >>> the attribute target support. This was not a problem when driven from the >>> command line, but was causing various issues when the builtins was mixed >>> between fpu configurations or when used with LTO. >>> >>> Firstly the builtin functions was not initialized before the parsing of >>> functions, leading to wrong type initializations. >>> >>> Then error catching code when a builtin was used without the proper fpu >>> flags was incomprehensible for the user, for instance >>> >>> #include "arm_neon.h" >>> >>> int8x8_t a, b; >>> int16x8_t e; >>> >>> void >>> main() >>> { >>> e = (int16x8_t)__builtin_neon_vaddlsv8qi (a, b); >>> } >> >> >> I'm not sure what problem you are trying to solve here - The user >> should never be using __builtin_neon_vaddlsv8qi (a, b) here. What >> happens with vaddl_s16 intrinsic ? >> >> They really have to only use the vaddl_s8 intrinsic. > > > > Sure, that's not the problem, replace _builtin_neon_vaddlsv8qi by vaddl_s8. > The tests (part of the patch) equivalently fails. > > But anyway, users do use the __builtin directly, see for instance the Bug > 65837
I think that's just a reduced testcase from the issue to illustrate the problem from Prathamesh who was trying to build chromium with LTO. The __builtin_neon* aren't published anywhere and people really shouldn't be using that directly in source code and only use the interface in arm_neon.h which implements pretty much all the Neon intrinsics in the ACLE document. regards Ramana > > > >> >> >> Ramana >> >>> >>> compiled with default options (without -mfpu=neon -mfloat-abi=hard) gave >>> pages of >>> >>> /arm-none-eabi/6.0.0/include/arm_neon.h:39:9: error: unknown type name >>> '__simd64_int8_t' >>> typedef __simd64_int8_t int8x8_t; >>> ... >>> ... >>> arm_neon.h:4724:3: error: can't convert a vector of type 'poly64x2_t {aka >>> __vector(4) int}' to type 'int' which has different size >>> return (poly64x2_t)__builtin_neon_vsli_nv2di ((int64x2_t) __a, >>> (int64x2_t) __b, __c); >>> ^~~~~~ >>> ... >>> ... and one for each arm_neon.h lines.. >>> >>> by postponing the check into arm_expand_builtin, we now emit something >>> more >>> useful: >>> >>> testo.c: In function 'main': >>> testo.c:9:7: error: '__builtin_neon_vaddlsv8qi' neon builtin is not >>> supported in this configuration. >>> e = (int16x8_t)__builtin_neon_vaddlsv8qi (a, b); >>> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> One small side effect to note: The total memory allocated is 370k bigger >>> when neon is not used, so this support will have a follow-up to make >>> their >>> initialization lazy. But I'd like first to stabilize the stuff for stage3 >>> (or get it pre-approved if the memory is an issue) >>> >>> tested without new failures with >>> {,-mfpu=vfp,-mfpu=neon}{,-march=armv7-a\} >>> (a few tests that was fail are now unsupported) >>> >>> OK for trunk ? >>> >>> >>> >>> >>> >>> >>> >>> >