Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On Thu, May 22, 2014 at 03:24:23PM +0100, Marcus Shawcroft wrote:
On 2 May 2014 13:27, Kugan kugan.vivekanandara...@linaro.org wrote:

+2014-05-02 Kugan Vivekanandarajah kug...@linaro.org
+
+ * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New
+ define.
+ * config/aarch64/aarch64-protos.h (aarch64_atomic_assign_expand_fenv):
+ New function declaration.
+ * config/aarch64/aarch64-builtins.c (aarch64_builtins) : Add
+ AARCH64_BUILTIN_GET_FPCR, AARCH64_BUILTIN_SET_FPCR.
+ AARCH64_BUILTIN_GET_FPSR and AARCH64_BUILTIN_SET_FPSR.
+ (aarch64_init_builtins) : Initialize builtins
+ __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr.
+ __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr.
+ (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_set_fpcr
+ __builtins_aarch64_get_fpcr, __builtins_aarch64_get_fpsr,
+ and __builtins_aarch64_set_fpsr.
+ (aarch64_atomic_assign_expand_fenv): New function.
+ * config/aarch64/aarch64.md (set_fpcr): New pattern.
+ (get_fpcr) : Likewise.
+ (set_fpsr) : Likewise.
+ (get_fpsr) : Likewise.
+ (unspecv): Add UNSPECV_GET_FPCR and UNSPECV_SET_FPCR, UNSPECV_GET_FPSR
+ and UNSPECV_SET_FPSR.
+ * doc/extend.texi (AARCH64 Built-in Functions) : Document
+ __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr.
+ __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr.
+

OK, thanks Kugan.

I appreciate it is quite late in the day for the 4.9.3 branch, but do we want to consider this patch for backporting (either now or after the branch reopens)?

gcc.dg/atomic/c11-atomic-exec-5.c is the only interesting test I see failing on a native AArch64 build of the 4.9.3 release candidate (there are plenty of other FAILs, but they are guality, scan-assembler or missed-optimization failures).

Thanks,
James
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 2 May 2014 13:27, Kugan kugan.vivekanandara...@linaro.org wrote: +2014-05-02 Kugan Vivekanandarajah kug...@linaro.org + + * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New + define. + * config/aarch64/aarch64-protos.h (aarch64_atomic_assign_expand_fenv): + New function declaration. + * config/aarch64/aarch64-builtins.c (aarch64_builtins) : Add + AARCH64_BUILTIN_GET_FPCR, AARCH64_BUILTIN_SET_FPCR. + AARCH64_BUILTIN_GET_FPSR and AARCH64_BUILTIN_SET_FPSR. + (aarch64_init_builtins) : Initialize builtins + __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr. + __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. + (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_set_fpcr + __builtins_aarch64_get_fpcr, __builtins_aarch64_get_fpsr, + and __builtins_aarch64_set_fpsr. + (aarch64_atomic_assign_expand_fenv): New function. + * config/aarch64/aarch64.md (set_fpcr): New pattern. + (get_fpcr) : Likewise. + (set_fpsr) : Likewise. + (get_fpsr) : Likewise. + (unspecv): Add UNSPECV_GET_FPCR and UNSPECV_SET_FPCR, UNSPECV_GET_FPSR +and UNSPECV_SET_FPSR. + * doc/extend.texi (AARCH64 Built-in Functions) : Document + __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr. + __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. + OK, thanks Kugan. /Marcus
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
Ping?

Thanks,
Kugan

On 02/05/14 22:27, Kugan wrote:
On 02/05/14 20:06, Marcus Shawcroft wrote:
On 29 April 2014 03:37, Kugan kugan.vivekanandara...@linaro.org wrote:
On 28/04/14 21:01, Ramana Radhakrishnan wrote:
On 04/26/14 11:57, Kugan wrote:

Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for AARCH64. With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS. This implementation is based on SPARC and i386 implementations. Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regression. Is this OK for trunk?

Again like A32 please test on hardware to make sure this behaves correctly with c11-atomic-exec-5.c . If you don't have access to hardware, let us know : we'll take it for a spin once you update the patch according to Marcus's comments.

Thanks for the review. I have updated the patch. I also have updated hold, clear and update to be exactly as in feholdexcpt.c, fclrexcpt.c and feupdateenv.c of glibc/ports/sysdeps/aarch64/fpu.

Kugan, I've not looked at the respin in detail yet, but it has just occurred to me that the sequence used here to set FPCR is insufficient. The architecture reference manual requires that any write to FPCR must be synchronized by a context synchronization operation, so we need to plant an ISB after the write. Both the write and ISB are likely to be expensive on some implementations, so it would be good to ensure that both the write and the ISB are scheduled independently. IIRC there si

I have limited real hardware access and just did a bootstrap and tested c11-atomic-exec-5.c alone to make sure that it PASSes. I have also regression tested again on qemu-aarch64 for aarch64-none-linux-gnu with no new regressions. I would appreciate it if you could do the regression testing on real hw.

Once the ISB issue is resolved I'll give the patch a spin on HW here.

Here is the modified patch which also includes changes Yufeng has suggested.
Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regressions. Thanks, Kugan gcc/ +2014-05-02 Kugan Vivekanandarajah kug...@linaro.org + + * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New + define. + * config/aarch64/aarch64-protos.h (aarch64_atomic_assign_expand_fenv): + New function declaration. + * config/aarch64/aarch64-builtins.c (aarch64_builtins) : Add + AARCH64_BUILTIN_GET_FPCR, AARCH64_BUILTIN_SET_FPCR. + AARCH64_BUILTIN_GET_FPSR and AARCH64_BUILTIN_SET_FPSR. + (aarch64_init_builtins) : Initialize builtins + __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr. + __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. + (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_set_fpcr + __builtins_aarch64_get_fpcr, __builtins_aarch64_get_fpsr, + and __builtins_aarch64_set_fpsr. + (aarch64_atomic_assign_expand_fenv): New function. + * config/aarch64/aarch64.md (set_fpcr): New pattern. + (get_fpcr) : Likewise. + (set_fpsr) : Likewise. + (get_fpsr) : Likewise. + (unspecv): Add UNSPECV_GET_FPCR and UNSPECV_SET_FPCR, UNSPECV_GET_FPSR + and UNSPECV_SET_FPSR. + * doc/extend.texi (AARCH64 Built-in Functions) : Document + __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr. + __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. +
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
+2014-04-29 Kugan Vivekanandarajah kug...@linaro.org
+
+ * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New
+ define.
+ * config/aarch64/aarch64-protos.h (aarch64_atomic_assign_expand_fenv):
+ New function declaration.
+ * config/aarch64/aarch64-builtins.c (aarch64_builtins) : Add
+ AARCH64_BUILTIN_GET_FPCR, AARCH64_BUILTIN_SET_FPCR.
+ AARCH64_BUILTIN_GET_FPSR and AARCH64_BUILTIN_SET_FPSR.
+ (aarch64_init_builtins) : Initialize builtins
+ __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr.
+ __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr.
+ (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_set_fpcr
+ __builtins_aarch64_get_fpcr, __builtins_aarch64_get_fpsr,
+ and __builtins_aarch64_set_fpsr.
+ (aarch64_atomic_assign_expand_fenv): New function.
+ * config/aarch64/aarch64.md (set_fpcr): New pattern.
+ (get_fpcr) : Likewise.
+ (set_fpsr) : Likewise.
+ (get_fpsr) : Likewise.
+ (unspecv): Add UNSPECV_GET_FPCR and UNSPECV_SET_FPCR, UNSPECV_GET_FPSR
+ and UNSPECV_SET_FPSR.
+ * doc/extend.texi (AARCH64 Built-in Functions) : Document
+ __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr.
+ __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr.

The update is based on the review at http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00041.html. The FE_* values are now renamed to AARCH64_FE_*.

Thanks,
Kugan

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 55cfe0a..40d53b1 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -371,6 +371,12 @@ static aarch64_simd_builtin_datum aarch64_simd_builtin_data[] = {
 enum aarch64_builtins
 {
   AARCH64_BUILTIN_MIN,
+
+  AARCH64_BUILTIN_GET_FPCR,
+  AARCH64_BUILTIN_SET_FPCR,
+  AARCH64_BUILTIN_GET_FPSR,
+  AARCH64_BUILTIN_SET_FPSR,
+
   AARCH64_SIMD_BUILTIN_BASE,
 #include "aarch64-simd-builtins.def"
   AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE
@@ -752,6 +758,24 @@ aarch64_init_simd_builtins (void)
 void
 aarch64_init_builtins (void)
 {
+  tree ftype_set_fpr
+    = build_function_type_list (void_type_node, unsigned_type_node, NULL);
+  tree ftype_get_fpr
+    = build_function_type_list (unsigned_type_node, NULL);
+
+  aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPCR]
+    = add_builtin_function ("__builtin_aarch64_get_fpcr", ftype_get_fpr,
+                            AARCH64_BUILTIN_GET_FPCR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPCR]
+    = add_builtin_function ("__builtin_aarch64_set_fpcr", ftype_set_fpr,
+                            AARCH64_BUILTIN_SET_FPCR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPSR]
+    = add_builtin_function ("__builtin_aarch64_get_fpsr", ftype_get_fpr,
+                            AARCH64_BUILTIN_GET_FPSR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPSR]
+    = add_builtin_function ("__builtin_aarch64_set_fpsr", ftype_set_fpr,
+                            AARCH64_BUILTIN_SET_FPSR, BUILT_IN_MD, NULL, NULL_TREE);
+
   if (TARGET_SIMD)
     aarch64_init_simd_builtins ();
 }
@@ -964,6 +988,36 @@ aarch64_expand_builtin (tree exp,
 {
   tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
   int fcode = DECL_FUNCTION_CODE (fndecl);
+  int icode;
+  rtx pat, op0;
+  tree arg0;
+
+  switch (fcode)
+    {
+    case AARCH64_BUILTIN_GET_FPCR:
+    case AARCH64_BUILTIN_SET_FPCR:
+    case AARCH64_BUILTIN_GET_FPSR:
+    case AARCH64_BUILTIN_SET_FPSR:
+      if ((fcode == AARCH64_BUILTIN_GET_FPCR)
+          || (fcode == AARCH64_BUILTIN_GET_FPSR))
+        {
+          icode = (fcode == AARCH64_BUILTIN_GET_FPSR) ?
+            CODE_FOR_get_fpsr : CODE_FOR_get_fpcr;
+          target = gen_reg_rtx (SImode);
+          pat = GEN_FCN (icode) (target);
+        }
+      else
+        {
+          target = NULL_RTX;
+          icode = (fcode == AARCH64_BUILTIN_SET_FPSR) ?
+            CODE_FOR_set_fpsr : CODE_FOR_set_fpcr;
+          arg0 = CALL_EXPR_ARG (exp, 0);
+          op0 = expand_normal (arg0);
+          pat = GEN_FCN (icode) (op0);
+        }
+      emit_insn (pat);
+      return target;
+    }
 
   if (fcode >= AARCH64_SIMD_BUILTIN_BASE)
     return aarch64_simd_expand_builtin (fcode, exp, target);
@@ -1196,6 +1250,106 @@ aarch64_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return changed;
 }
 
+void
+aarch64_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
+{
+  const unsigned AARCH64_FE_INVALID = 1;
+  const unsigned AARCH64_FE_DIVBYZERO = 2;
+  const unsigned AARCH64_FE_OVERFLOW = 4;
+  const unsigned AARCH64_FE_UNDERFLOW = 8;
+  const unsigned AARCH64_FE_INEXACT = 16;
+  const unsigned HOST_WIDE_INT AARCH64_FE_ALL_EXCEPT = (AARCH64_FE_INVALID
+                                                        | AARCH64_FE_DIVBYZERO
+                                                        |
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 29 April 2014 03:37, Kugan kugan.vivekanandara...@linaro.org wrote:
On 28/04/14 21:01, Ramana Radhakrishnan wrote:
On 04/26/14 11:57, Kugan wrote:

Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for AARCH64. With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS. This implementation is based on SPARC and i386 implementations. Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regression. Is this OK for trunk?

Again like A32 please test on hardware to make sure this behaves correctly with c11-atomic-exec-5.c . If you don't have access to hardware, let us know : we'll take it for a spin once you update the patch according to Marcus's comments.

Thanks for the review. I have updated the patch. I also have updated hold, clear and update to be exactly as in feholdexcpt.c, fclrexcpt.c and feupdateenv.c of glibc/ports/sysdeps/aarch64/fpu.

Kugan, I've not looked at the respin in detail yet, but it has just occurred to me that the sequence used here to set FPCR is insufficient. The architecture reference manual requires that any write to FPCR must be synchronized by a context synchronization operation, so we need to plant an ISB after the write. Both the write and ISB are likely to be expensive on some implementations, so it would be good to ensure that both the write and the ISB are scheduled independently. IIRC there si

I have limited real hardware access and just did a bootstrap and tested c11-atomic-exec-5.c alone to make sure that it PASSes. I have also regression tested again on qemu-aarch64 for aarch64-none-linux-gnu with no new regressions. I would appreciate it if you could do the regression testing on real hw.

Once the ISB issue is resolved I'll give the patch a spin on HW here.

/Marcus
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 2 May 2014 11:06, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:

Kugan, I've not looked at the respin in detail yet, but it has just occurred to me that the sequence used here to set FPCR is insufficient. The architecture reference manual requires that any write to FPCR must be synchronized by a context synchronization operation, so we need to plant an ISB after the write. Both the write and ISB are likely to be expensive on some implementations, so it would be good to ensure that both the write and the ISB are scheduled independently. IIRC there si

Sorry, incomplete sentence. I had started to write that IIRC the same issue did not apply to FPSCR in the ARM patch. I have double-checked and the FPSCR does not have the issue, therefore the ARM patch is fine in this respect.

/Marcus
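For readers following the ISB point, here is a minimal C sketch of "write FPCR, then synchronize", assuming GNU-style inline assembly. The helper names fpcr_read, fpcr_write_sync and fpcr_set_rounding_mode are hypothetical and are not part of the patch. The sketch fuses the MSR and the ISB into one asm statement for brevity, whereas the review asks for the compiler to be able to schedule them independently, so treat it as an illustration of the requirement rather than of the eventual implementation. On hosts other than AArch64 a plain variable stands in for the register.

```c
#include <stdint.h>

/* Sketch only: these helper names are hypothetical, not from the patch.  */

#if defined(__aarch64__) && defined(__GNUC__)

/* Read the floating-point control register.  */
static inline uint64_t
fpcr_read (void)
{
  uint64_t value;
  __asm__ volatile ("mrs %0, fpcr" : "=r" (value));
  return value;
}

/* Write FPCR, then issue an ISB so that instructions after this point
   observe the new control value.  */
static inline void
fpcr_write_sync (uint64_t value)
{
  __asm__ volatile ("msr fpcr, %0\n\tisb" : : "r" (value) : "memory");
}

#else

/* Fallback for other hosts: a plain variable emulates FPCR so that the
   sketch still compiles and runs.  No barrier is needed here.  */
static uint64_t simulated_fpcr;

static inline uint64_t
fpcr_read (void)
{
  return simulated_fpcr;
}

static inline void
fpcr_write_sync (uint64_t value)
{
  simulated_fpcr = value;
}

#endif

/* Select a rounding mode (FPCR.RMode, bits 23:22) and return the
   previous FPCR value so that the caller can restore it later.  */
static inline uint64_t
fpcr_set_rounding_mode (unsigned mode)
{
  uint64_t old = fpcr_read ();
  uint64_t updated = (old & ~(UINT64_C (3) << 22))
                     | ((uint64_t) (mode & 3) << 22);
  fpcr_write_sync (updated);
  return old;
}
```

A caller would save the value returned by fpcr_set_rounding_mode and pass it back to fpcr_write_sync when done, so each change of the control register pays for one write and one barrier.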
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 05/02/14 10:08, Kugan wrote:

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 347a94a..8bd13f3 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9107,6 +9107,7 @@ to those machines. Generally these generate calls to specific machine
 instructions, but allow the compiler to schedule those calls.
 
 @menu
+* AARCH64 Built-in Functions::
 * Alpha Built-in Functions::
 * Altera Nios II Built-in Functions::
 * ARC Built-in Functions::
@@ -9139,6 +9140,18 @@ instructions, but allow the compiler to schedule those calls.
 * TILEPro Built-in Functions::
 @end menu
 
+@node AARCH64 Built-in Functions
+@subsection AARCH64 Built-in Functions
+
+These built-in functions are available for the AARCH64 family of
+processors.
+@smallexample
+unsigned int __builtin_aarch64_get_fpcr ()
+void __builtin_aarch64_set_fpcr (unsigned int)
+unsigned int __builtin_aarch64_get_fpsr ()
+void __builtin_aarch64_set_fpsr (unsigned int)
+@end smallexample
+
 @node Alpha Built-in Functions
 @subsection Alpha Built-in Functions
 

Please s/AARCH64/AArch64 to stay consistent with the existing usage, e.g. those in invoke.texi.

Thanks,
Yufeng
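As an illustration of how the documented builtins might be called from user code, here is a small C sketch that reads and clears the cumulative exception flags in FPSR. It assumes the builtins are only usable when compiling for AArch64 with a GCC that contains this patch (guessed here as GCC 5 or later, which is an assumption, not something stated in the thread). Everywhere else the macros fall back to a plain variable. GET_FPSR, SET_FPSR, SKETCH_FE_ALL_EXCEPT and the two functions are hypothetical names.

```c
/* Assumption: the builtins exist only on AArch64 GCC builds that include
   this patch.  On any other compiler or target, a plain variable
   emulates FPSR so that the sketch still compiles and runs.  */
#if defined(__aarch64__) && defined(__GNUC__) && !defined(__clang__) \
    && __GNUC__ >= 5
# define GET_FPSR() __builtin_aarch64_get_fpsr ()
# define SET_FPSR(v) __builtin_aarch64_set_fpsr (v)
#else
static unsigned int emulated_fpsr;
# define GET_FPSR() (emulated_fpsr)
# define SET_FPSR(v) ((void) (emulated_fpsr = (v)))
#endif

/* The five exception bits used in the patch (1, 2, 4, 8, 16) combined.
   In FPSR these are the cumulative flags in bits 0..4.  */
#define SKETCH_FE_ALL_EXCEPT 0x1fu

/* Return the exception flags that are currently raised.  */
static unsigned int
raised_exceptions (void)
{
  return GET_FPSR () & SKETCH_FE_ALL_EXCEPT;
}

/* Clear every cumulative exception flag and leave the other FPSR bits
   untouched.  */
static void
clear_exceptions (void)
{
  SET_FPSR (GET_FPSR () & ~SKETCH_FE_ALL_EXCEPT);
}
```

The get/set pairing mirrors the four prototypes in the texinfo hunk: the getters take no argument and return unsigned int, and the setters take one unsigned int and return nothing.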
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 02/05/14 20:06, Marcus Shawcroft wrote:
On 29 April 2014 03:37, Kugan kugan.vivekanandara...@linaro.org wrote:
On 28/04/14 21:01, Ramana Radhakrishnan wrote:
On 04/26/14 11:57, Kugan wrote:

Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for AARCH64. With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS. This implementation is based on SPARC and i386 implementations. Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regression. Is this OK for trunk?

Again like A32 please test on hardware to make sure this behaves correctly with c11-atomic-exec-5.c . If you don't have access to hardware, let us know : we'll take it for a spin once you update the patch according to Marcus's comments.

Thanks for the review. I have updated the patch. I also have updated hold, clear and update to be exactly as in feholdexcpt.c, fclrexcpt.c and feupdateenv.c of glibc/ports/sysdeps/aarch64/fpu.

Kugan, I've not looked at the respin in detail yet, but it has just occurred to me that the sequence used here to set FPCR is insufficient. The architecture reference manual requires that any write to FPCR must be synchronized by a context synchronization operation, so we need to plant an ISB after the write. Both the write and ISB are likely to be expensive on some implementations, so it would be good to ensure that both the write and the ISB are scheduled independently. IIRC there si

I have limited real hardware access and just did a bootstrap and tested c11-atomic-exec-5.c alone to make sure that it PASSes. I have also regression tested again on qemu-aarch64 for aarch64-none-linux-gnu with no new regressions. I would appreciate it if you could do the regression testing on real hw.

Once the ISB issue is resolved I'll give the patch a spin on HW here.

Here is the modified patch which also includes changes Yufeng has suggested. Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regressions.
Thanks, Kugan gcc/ +2014-05-02 Kugan Vivekanandarajah kug...@linaro.org + + * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New + define. + * config/aarch64/aarch64-protos.h (aarch64_atomic_assign_expand_fenv): + New function declaration. + * config/aarch64/aarch64-builtins.c (aarch64_builtins) : Add + AARCH64_BUILTIN_GET_FPCR, AARCH64_BUILTIN_SET_FPCR. + AARCH64_BUILTIN_GET_FPSR and AARCH64_BUILTIN_SET_FPSR. + (aarch64_init_builtins) : Initialize builtins + __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr. + __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. + (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_set_fpcr + __builtins_aarch64_get_fpcr, __builtins_aarch64_get_fpsr, + and __builtins_aarch64_set_fpsr. + (aarch64_atomic_assign_expand_fenv): New function. + * config/aarch64/aarch64.md (set_fpcr): New pattern. + (get_fpcr) : Likewise. + (set_fpsr) : Likewise. + (get_fpsr) : Likewise. + (unspecv): Add UNSPECV_GET_FPCR and UNSPECV_SET_FPCR, UNSPECV_GET_FPSR +and UNSPECV_SET_FPSR. + * doc/extend.texi (AARCH64 Built-in Functions) : Document + __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr. + __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. 
+ diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index 55cfe0a..a5af874 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -371,6 +371,12 @@ static aarch64_simd_builtin_datum aarch64_simd_builtin_data[] = { enum aarch64_builtins { AARCH64_BUILTIN_MIN, + + AARCH64_BUILTIN_GET_FPCR, + AARCH64_BUILTIN_SET_FPCR, + AARCH64_BUILTIN_GET_FPSR, + AARCH64_BUILTIN_SET_FPSR, + AARCH64_SIMD_BUILTIN_BASE, #include aarch64-simd-builtins.def AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE @@ -752,6 +758,24 @@ aarch64_init_simd_builtins (void) void aarch64_init_builtins (void) { + tree ftype_set_fpr += build_function_type_list (void_type_node, unsigned_type_node, NULL); + tree ftype_get_fpr += build_function_type_list (unsigned_type_node, NULL); + + aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPCR] += add_builtin_function (__builtin_aarch64_get_fpcr, ftype_get_fpr, + AARCH64_BUILTIN_GET_FPCR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPCR] += add_builtin_function (__builtin_aarch64_set_fpcr, ftype_set_fpr, + AARCH64_BUILTIN_SET_FPCR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPSR] += add_builtin_function (__builtin_aarch64_get_fpsr, ftype_get_fpr, + AARCH64_BUILTIN_GET_FPSR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPSR] += add_builtin_function (__builtin_aarch64_set_fpsr,
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
Hi Kugan,

Thanks for this, a couple of comments inline:

On 26 April 2014 11:57, Kugan kugan.vivekanandara...@linaro.org wrote:

gcc/
+2014-04-27 Kugan Vivekanandarajah kug...@linaro.org
+
+ * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New
+ define.
+ * config/aarch64/aarch64-builtins.c (arm_builtins) : Add

aarch64_builtins ?

+ AARCH64_BUILTIN_LDFPSCR and AARCH64_BUILTIN_STFPSCR.

AArch32 has the traditional combined FPSCR, but AArch64 splits this register into FPSR and FPCR, therefore I think AARCH64_BUILTIN_GET_FPCR and AARCH64_BUILTIN_SET_FPCR are more appropriate names. Likewise subsequent references to FPSCR in this patch should change to FPCR.

+ (aarch64_init_builtins) : Initialize builtins
+ __builtins_aarch64_stfpscr and __builtins_aarch64_ldfpscr.
+ (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_stfpscr
+ and __builtins_aarch64_ldfpscr.
+ (aarch64_atomic_assign_expand_fenv): New function.
+ * config/aarch64/aarch64.md (stfpscr): New pattern.
+ (ldfpscr) : Likewise.
+ (unspecv): Add UNSPECV_LDFPSCR and UNSPECV_STFPSCR.
+

+ aarch64_builtin_decls[AARCH64_BUILTIN_LDFPSCR]
+ = add_builtin_function ("__builtin_aarch64_ldfscr", ftype_ldfpscr,

I'd prefer __builtin_aarch64_get_fpcr and __builtin_aarch64_set_fpcr. We should document them in doc/extend.texi

+ const unsigned HOST_WIDE_INT FE_ALL_EXCEPT = (FE_INVALID | FE_DIVBYZERO
+ | FE_OVERFLOW | FE_UNDERFLOW
+ | FE_INEXACT);

Indentation is funny here..

+ /* Genareate the equivalence of :

Spelling.

+ tree fenv_var = create_tmp_var (unsigned_type_node, NULL);
+ tree ldfpscr = aarch64_builtin_decls[AARCH64_BUILTIN_LDFPSCR];
+ tree stfpscr = aarch64_builtin_decls[AARCH64_BUILTIN_STFPSCR];

Move the declarations to the top of the function please.

+void aarch64_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update);
+

Drop the argument names and relocate to aarch64-protos.h please.

+UNSPECV_LDFPSCR ; load floating point status and control register.

It isn't a status register, how about:

UNSPECV_GET_FPCR ; Represent fetch of FPCR content.

Cheers
/Marcus
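The FPSCR versus FPCR/FPSR distinction raised in this review can be made concrete with a little mask arithmetic. The C sketch below uses the exception bit values and the shift of 8 that appear in the patch. It assumes, from the architecture's register layout, that the cumulative flags sit in FPSR bits 0..4 and the trap-enable bits in FPCR bits 8..12. The function names are hypothetical, and the "hold" semantics shown (clear trap enables, clear flags) follow the glibc feholdexcept approach that the thread says the patch was aligned with. It is not the patch's own code.

```c
#include <stdint.h>

/* Exception bit values and shift as they appear in the patch.  */
enum
{
  SKETCH_FE_INVALID = 1,
  SKETCH_FE_DIVBYZERO = 2,
  SKETCH_FE_OVERFLOW = 4,
  SKETCH_FE_UNDERFLOW = 8,
  SKETCH_FE_INEXACT = 16,
  SKETCH_FE_ALL_EXCEPT = 31,
  SKETCH_FE_EXCEPT_SHIFT = 8
};

/* AArch64: disable all exception traps in an FPCR value.  */
static uint32_t
hold_fpcr (uint32_t fpcr)
{
  return fpcr & ~((uint32_t) SKETCH_FE_ALL_EXCEPT << SKETCH_FE_EXCEPT_SHIFT);
}

/* AArch64: clear all cumulative exception flags in an FPSR value.  */
static uint32_t
hold_fpsr (uint32_t fpsr)
{
  return fpsr & ~(uint32_t) SKETCH_FE_ALL_EXCEPT;
}

/* AArch32: the combined FPSCR carries both groups of bits, so a single
   mask covers trap enables and flags together.  */
static uint32_t
hold_fpscr (uint32_t fpscr)
{
  uint32_t both = ((uint32_t) SKETCH_FE_ALL_EXCEPT << SKETCH_FE_EXCEPT_SHIFT)
                  | (uint32_t) SKETCH_FE_ALL_EXCEPT;
  return fpscr & ~both;
}
```

Seen this way, the single mask in the first version of the patch fits the AArch32 register, while on AArch64 the same two bit groups have to be applied to two separate registers, which is why separate get/set builtins for FPCR and FPSR were requested.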
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 04/26/14 11:57, Kugan wrote: Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for AARCH64. With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS. This implementation is based on SPARC and i386 implementations. Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regression. Is this OK for trunk? Again like A32 please test on hardware to make sure this behaves correctly with c11-atomic-exec-5.c . If you don't have access to hardware, let us know : we'll take it for a spin once you update the patch according to Marcus's comments. regards Ramana Thanks, Kugan gcc/ +2014-04-27 Kugan Vivekanandarajah kug...@linaro.org + + * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New + define. + * config/aarch64/aarch64-builtins.c (arm_builtins) : Add + AARCH64_BUILTIN_LDFPSCR and AARCH64_BUILTIN_STFPSCR. + (aarch64_init_builtins) : Initialize builtins + __builtins_aarch64_stfpscr and __builtins_aarch64_ldfpscr. + (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_stfpscr + and __builtins_aarch64_ldfpscr. + (aarch64_atomic_assign_expand_fenv): New function. + * config/aarch64/aarch64.md (stfpscr): New pattern. + (ldfpscr) : Likewise. + (unspecv): Add UNSPECV_LDFPSCR and UNSPECV_STFPSCR. + -- Ramana Radhakrishnan Principal Engineer ARM Ltd.
Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 28/04/14 21:01, Ramana Radhakrishnan wrote:
On 04/26/14 11:57, Kugan wrote:

Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for AARCH64. With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS. This implementation is based on SPARC and i386 implementations. Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regression. Is this OK for trunk?

Again like A32 please test on hardware to make sure this behaves correctly with c11-atomic-exec-5.c . If you don't have access to hardware, let us know : we'll take it for a spin once you update the patch according to Marcus's comments.

Thanks for the review. I have updated the patch. I also have updated hold, clear and update to be exactly as in feholdexcpt.c, fclrexcpt.c and feupdateenv.c of glibc/ports/sysdeps/aarch64/fpu.

I have limited real hardware access and just did a bootstrap and tested c11-atomic-exec-5.c alone to make sure that it PASSes. I have also regression tested again on qemu-aarch64 for aarch64-none-linux-gnu with no new regressions. I would appreciate it if you could do the regression testing on real hw.

As for the ARM version of the patch, I tested the previous version with c11-atomic-exec-5.c and verified it on a Chromebook before I posted the patch. I have now updated the patch based on your review, and the full bootstrap and regression testing is now under way. I will post the patch once the results are available.

Thanks,
Kugan

+2014-04-29 Kugan Vivekanandarajah kug...@linaro.org
+
+ * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New
+ define.
+ * config/aarch64/aarch64-protos.h (aarch64_atomic_assign_expand_fenv):
+ New function declaration.
+ * config/aarch64/aarch64-builtins.c (aarch64_builtins) : Add
+ AARCH64_BUILTIN_GET_FPCR, AARCH64_BUILTIN_SET_FPCR.
+ AARCH64_BUILTIN_GET_FPSR and AARCH64_BUILTIN_SET_FPSR.
+ (aarch64_init_builtins) : Initialize builtins
+ __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr.
+ __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. + (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_set_fpcr + __builtins_aarch64_get_fpcr, __builtins_aarch64_get_fpsr, + and __builtins_aarch64_set_fpsr. + (aarch64_atomic_assign_expand_fenv): New function. + * config/aarch64/aarch64.md (set_fpcr): New pattern. + (get_fpcr) : Likewise. + (set_fpsr) : Likewise. + (get_fpsr) : Likewise. + (unspecv): Add UNSPECV_GET_FPCR and UNSPECV_SET_FPCR, UNSPECV_GET_FPSR +and UNSPECV_SET_FPSR. + * doc/extend.texi (AARCH64 Built-in Functions) : Document + __builtins_aarch64_set_fpcr, __builtins_aarch64_get_fpcr. + __builtins_aarch64_set_fpsr and __builtins_aarch64_get_fpsr. diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index 55cfe0a..5cdc978 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -371,6 +371,12 @@ static aarch64_simd_builtin_datum aarch64_simd_builtin_data[] = { enum aarch64_builtins { AARCH64_BUILTIN_MIN, + + AARCH64_BUILTIN_GET_FPCR, + AARCH64_BUILTIN_SET_FPCR, + AARCH64_BUILTIN_GET_FPSR, + AARCH64_BUILTIN_SET_FPSR, + AARCH64_SIMD_BUILTIN_BASE, #include aarch64-simd-builtins.def AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE @@ -752,6 +758,24 @@ aarch64_init_simd_builtins (void) void aarch64_init_builtins (void) { + tree ftype_set_fpr += build_function_type_list (void_type_node, unsigned_type_node, NULL); + tree ftype_get_fpr += build_function_type_list (unsigned_type_node, NULL); + + aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPCR] += add_builtin_function (__builtin_aarch64_get_fpcr, ftype_get_fpr, + AARCH64_BUILTIN_GET_FPCR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPCR] += add_builtin_function (__builtin_aarch64_set_fpcr, ftype_set_fpr, + AARCH64_BUILTIN_SET_FPCR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPSR] += add_builtin_function (__builtin_aarch64_get_fpsr, ftype_get_fpr, 
+ AARCH64_BUILTIN_GET_FPSR, BUILT_IN_MD, NULL, NULL_TREE); + aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPSR] += add_builtin_function (__builtin_aarch64_set_fpsr, ftype_set_fpr, + AARCH64_BUILTIN_SET_FPSR, BUILT_IN_MD, NULL, NULL_TREE); + if (TARGET_SIMD) aarch64_init_simd_builtins (); } @@ -964,6 +988,36 @@ aarch64_expand_builtin (tree exp, { tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); int fcode = DECL_FUNCTION_CODE (fndecl); + int icode; + rtx pat, op0; + tree arg0; + + switch (fcode) +{ +case AARCH64_BUILTIN_GET_FPCR: +case AARCH64_BUILTIN_SET_FPCR: +case AARCH64_BUILTIN_GET_FPSR: +case AARCH64_BUILTIN_SET_FPSR: + if ((fcode ==
[RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for AARCH64. With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS. This implementation is based on SPARC and i386 implementations.

Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new regression. Is this OK for trunk?

Thanks,
Kugan

gcc/
+2014-04-27 Kugan Vivekanandarajah kug...@linaro.org
+
+ * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New
+ define.
+ * config/aarch64/aarch64-builtins.c (arm_builtins) : Add
+ AARCH64_BUILTIN_LDFPSCR and AARCH64_BUILTIN_STFPSCR.
+ (aarch64_init_builtins) : Initialize builtins
+ __builtins_aarch64_stfpscr and __builtins_aarch64_ldfpscr.
+ (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_stfpscr
+ and __builtins_aarch64_ldfpscr.
+ (aarch64_atomic_assign_expand_fenv): New function.
+ * config/aarch64/aarch64.md (stfpscr): New pattern.
+ (ldfpscr) : Likewise.
+ (unspecv): Add UNSPECV_LDFPSCR and UNSPECV_STFPSCR.
+

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 55cfe0a..70d3efa 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -371,6 +371,10 @@ static aarch64_simd_builtin_datum aarch64_simd_builtin_data[] = {
 enum aarch64_builtins
 {
   AARCH64_BUILTIN_MIN,
+
+  AARCH64_BUILTIN_LDFPSCR,
+  AARCH64_BUILTIN_STFPSCR,
+
   AARCH64_SIMD_BUILTIN_BASE,
 #include "aarch64-simd-builtins.def"
   AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE
@@ -752,6 +756,18 @@ aarch64_init_simd_builtins (void)
 void
 aarch64_init_builtins (void)
 {
+  tree ftype_stfpscr
+    = build_function_type_list (void_type_node, unsigned_type_node, NULL);
+  tree ftype_ldfpscr
+    = build_function_type_list (unsigned_type_node, NULL);
+
+  aarch64_builtin_decls[AARCH64_BUILTIN_LDFPSCR]
+    = add_builtin_function ("__builtin_aarch64_ldfscr", ftype_ldfpscr,
+                            AARCH64_BUILTIN_LDFPSCR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_STFPSCR]
+    = add_builtin_function ("__builtin_aarch64_stfscr", ftype_stfpscr,
+                            AARCH64_BUILTIN_STFPSCR, BUILT_IN_MD, NULL, NULL_TREE);
+
   if (TARGET_SIMD)
     aarch64_init_simd_builtins ();
 }
@@ -964,6 +980,31 @@ aarch64_expand_builtin (tree exp,
 {
   tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
   int fcode = DECL_FUNCTION_CODE (fndecl);
+  int icode;
+  rtx pat, op0;
+  tree arg0;
+
+  switch (fcode)
+    {
+    case AARCH64_BUILTIN_LDFPSCR:
+    case AARCH64_BUILTIN_STFPSCR:
+      if (fcode == AARCH64_BUILTIN_LDFPSCR)
+        {
+          icode = CODE_FOR_ldfpscr;
+          target = gen_reg_rtx (SImode);
+          pat = GEN_FCN (icode) (target);
+        }
+      else
+        {
+          target = NULL_RTX;
+          icode = CODE_FOR_stfpscr;
+          arg0 = CALL_EXPR_ARG (exp, 0);
+          op0 = expand_normal (arg0);
+          pat = GEN_FCN (icode) (op0);
+        }
+      emit_insn (pat);
+      return target;
+    }
 
   if (fcode >= AARCH64_SIMD_BUILTIN_BASE)
     return aarch64_simd_expand_builtin (fcode, exp, target);
@@ -1196,6 +1237,70 @@ aarch64_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return changed;
 }
 
+void
+aarch64_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
+{
+  const unsigned FE_INVALID = 1;
+  const unsigned FE_DIVBYZERO = 2;
+  const unsigned FE_OVERFLOW = 4;
+  const unsigned FE_UNDERFLOW = 8;
+  const unsigned FE_INEXACT = 16;
+  const unsigned HOST_WIDE_INT FE_ALL_EXCEPT = (FE_INVALID | FE_DIVBYZERO
+                                                | FE_OVERFLOW | FE_UNDERFLOW
+                                                | FE_INEXACT);
+  const unsigned HOST_WIDE_INT FE_EXCEPT_SHIFT = 8;
+
+  /* Genareate the equivalence of :
+       unsigned int fenv_var;
+       fenv_var = __builtin_aarch64_ldfpscr ();
+
+       unsigned int masked_fenv;
+       tmp1_var = fenv_var & ~ mask;
+
+       __builtin_aarch64_fpscr (tmp1_var);  */
+
+  tree fenv_var = create_tmp_var (unsigned_type_node, NULL);
+  tree ldfpscr = aarch64_builtin_decls[AARCH64_BUILTIN_LDFPSCR];
+  tree stfpscr = aarch64_builtin_decls[AARCH64_BUILTIN_STFPSCR];
+  tree mask = build_int_cst (unsigned_type_node,
+                             ~((FE_ALL_EXCEPT << FE_EXCEPT_SHIFT)
+                               | FE_ALL_EXCEPT));
+  tree ld_fenv_stmt = build2 (MODIFY_EXPR, unsigned_type_node,
+                              fenv_var, build_call_expr (ldfpscr, 0));
+  tree masked_fenv = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_var, mask);
+  tree hold_fnclex = build_call_expr (stfpscr, 1, masked_fenv);
+  *hold = build2 (COMPOUND_EXPR, void_type_node,
+                  build2 (COMPOUND_EXPR, void_type_node, masked_fenv,
+                          ld_fenv_stmt), hold_fnclex);
+
+  /* Store the value of masked_fenv to clear the exceptions:
+     __builtin_aarch64_stfpscr (masked_fenv);  */
+