[PATCH, C++] Fix PR63366: __complex not equivalent to __complex double in C++
According to a comment in grokdeclarator in file gcc/cp/decl.c:

  /* If we just have "complex", it is equivalent to "complex double",
     but if any modifiers at all are specified it is the complex form of
     TYPE.  E.g, "complex short" is "complex short int".  */

Yet, __complex is equivalent to __complex int, as the following testcase shows:

#include <typeinfo>

int
main (void)
{
  return typeid (__complex) != typeid (__complex int);
}

The following patch fixes the problem.

ChangeLog entries are as follows:

*** gcc/cp/ChangeLog ***

2014-09-26  Thomas Preud'homme  thomas.preudho...@arm.com

        * decl.c (grokdeclarator): Set defaulted_int when defaulting to int
        because type is null.

*** gcc/testsuite/ChangeLog ***

2014-10-26  Thomas Preud'homme  thomas.preudho...@arm.com

        * g++.dg/torture/pr63366.C: New test.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index d26a432..449efdf 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -9212,6 +9212,7 @@ grokdeclarator (const cp_declarator *declarator,
                    "ISO C++ forbids declaration of %qs with no type", name);
 
       type = integer_type_node;
+      defaulted_int = 1;
     }
 
   ctype = NULL_TREE;
diff --git a/gcc/testsuite/g++.dg/torture/pr63366.C b/gcc/testsuite/g++.dg/torture/pr63366.C
new file mode 100644
index 000..af59b98
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr63366.C
@@ -0,0 +1,11 @@
+// { dg-do run }
+// { dg-options "-fpermissive" }
+// { dg-prune-output "ISO C\\+\\+ forbids declaration of 'type name' with no type" }
+
+#include <typeinfo>
+
+int
+main (void)
+{
+  return typeid (__complex) != typeid (__complex double);
+}

Is this ok for trunk?

Best regards,

Thomas Preud'homme
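For readers who want to poke at the type identity question without -fpermissive, here is a minimal sketch. It assumes a compiler that implements the GNU complex extension (GCC or Clang) and uses the spelled-out __complex__ keyword with an explicit base type. The helper name is hypothetical and not part of the patch.

```cpp
#include <typeinfo>

// Assumption: the GNU complex extension is available (GCC or Clang).
// With an explicit base type the two complex forms are distinct types,
// which is why a bare __complex has to pick exactly one of them.
bool complex_double_is_complex_int() {
  return typeid(__complex__ double) == typeid(__complex__ int);
}
```

Calling complex_double_is_complex_int() from main gives a quick check that the two forms are different types; the testcase in the patch additionally pins the bare spelling to the double form.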
Re: [PATCH i386 AVX512] [57/n] Extend blend/cmp/brodcast insn patterns.
On Fri, Sep 26, 2014 at 11:04 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
> Hello,
> The patch at the bottom extends blend/cmp/broadcast insn patterns.
>
> Bootstrapped.
> AVX-512* tests on top of patch-set all pass under simulator.
>
> Is it ok for trunk?
>
> gcc/
>       * config/i386/sse.md
>       (define_insn avx512f_blendmmode): Delete.
>       (define_insn avx512_blendmVI48_AVX512VL:mode): New.
>       (define_insn avx512_blendmVI12_AVX512VL:mode): Ditto.
>       (define_mode_attr cmp_imm_predicate): Add V8SF, V4DF, V8SI, V4DI, V4SF,
>       V2DF, V4SI, V2DI, V32HI, V64QI, V16HI, V32QI, V8HI, V16QI modes.
>       (define_insn avx512f_cmpmode3mask_scalar_merge_nameround_saeonly_name):
>       Remove.
>       (define_insn avx512_cmpVI48_AVX512VL:mode3mask_scalar_merge_nameround_saeonly_name):
>       New.
>       (define_insn avx512_cmpVI12_AVX512VL:mode3mask_scalar_merge_nameround_saeonly_name):
>       Ditto.
>       (define_insn mask_codeforavx512f_vec_dupmodemask_name): Delete.
>       (define_insn avx512_vec_dupV48_AVX512VL:modemask_name): New.
>       (define_insn avx512_vec_dupV12_AVX512VL:modemask_name): Ditto.
>       (define_insn mask_codeforavx512f_vec_dup_gprmodemask_name): Delete.
>       (define_insn mask_codeforavx512_vec_dup_gprVI48_AVX512VL:modemask_name):
>       New.
>       (define_insn mask_codeforavx512_vec_dup_gprVI12_AVX512VL:modemask_name):
>       Ditto.
>       (define_insn mask_codeforavx512f_vec_dup_memmodemask_name): Delete.
>       (define_insn mask_codeforavx512_vec_dup_memVI48_AVX512VL:modemask_name):
>       New.
>       (define_insn mask_codeforavx512_vec_dup_memVI12_AVX512VL:modemask_name):
>       Ditto.

OK with a small fix below.

Thanks,
Uros.
-- Thanks, K diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 9edfebc..43d6655 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -954,14 +954,26 @@ (set_attr memory none,load) (set_attr mode sseinsnmode)]) -(define_insn avx512f_blendmmode - [(set (match_operand:VI48F_512 0 register_operand =v) - (vec_merge:VI48F_512 - (match_operand:VI48F_512 2 nonimmediate_operand vm) - (match_operand:VI48F_512 1 register_operand v) +(define_insn avx512_blendmmode + [(set (match_operand:V48_AVX512VL 0 register_operand =v) + (vec_merge:V48_AVX512VL + (match_operand:V48_AVX512VL 2 nonimmediate_operand vm) + (match_operand:V48_AVX512VL 1 register_operand v) (match_operand:avx512fmaskmode 3 register_operand Yk)))] TARGET_AVX512F - vsseintprefixblendmssemodesuffix\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2} + vblendmssemodesuffix\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2} + [(set_attr type ssemov) + (set_attr prefix evex) + (set_attr mode sseinsnmode)]) + +(define_insn avx512_blendmmode + [(set (match_operand:VI12_AVX512VL 0 register_operand =v) + (vec_merge:VI12_AVX512VL + (match_operand:VI12_AVX512VL 2 nonimmediate_operand vm) + (match_operand:VI12_AVX512VL 1 register_operand v) + (match_operand:avx512fmaskmode 3 register_operand Yk)))] + TARGET_AVX512BW + vpblendmssemodesuffix\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2} [(set_attr type ssemov) (set_attr prefix evex) (set_attr mode sseinsnmode)]) @@ -2467,14 +2479,21 @@ (set_attr mode ssescalarmode)]) (define_mode_attr cmp_imm_predicate - [(V16SF const_0_to_31_operand) (V8DF const_0_to_31_operand) - (V16SI const_0_to_7_operand) (V8DI const_0_to_7_operand)]) - -(define_insn avx512f_cmpmode3mask_scalar_merge_nameround_saeonly_name + [(V16SF const_0_to_31_operand) (V8DF const_0_to_31_operand) + (V16SI const_0_to_7_operand) (V8DI const_0_to_7_operand) + (V8SF const_0_to_31_operand) (V4DF const_0_to_31_operand) + (V8SI const_0_to_7_operand)(V4DI const_0_to_7_operand) + (V4SF const_0_to_31_operand) (V2DF 
const_0_to_31_operand) + (V4SI const_0_to_7_operand)(V2DI const_0_to_7_operand) + (V32HI const_0_to_7_operand) (V64QI const_0_to_7_operand) + (V16HI const_0_to_7_operand) (V32QI const_0_to_7_operand) + (V8HI const_0_to_7_operand)(V16QI const_0_to_7_operand)]) + +(define_insn avx512_cmpmode3mask_scalar_merge_nameround_saeonly_name [(set (match_operand:avx512fmaskmode 0 register_operand =Yk) (unspec:avx512fmaskmode - [(match_operand:VI48F_512 1 register_operand v) - (match_operand:VI48F_512 2 round_saeonly_nimm_predicate round_saeonly_constraint) + [(match_operand:V48_AVX512VL 1 register_operand v) + (match_operand:V48_AVX512VL 2 nonimmediate_operand round_saeonly_constraint) (match_operand:SI 3 cmp_imm_predicate n)] UNSPEC_PCMP))] TARGET_AVX512F round_saeonly_mode512bit_condition @@ -2484,6 +2503,20 @@ (set_attr prefix evex) (set_attr mode sseinsnmode)])
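As a reading aid for the blendm patterns in this patch, the following is a scalar sketch of what (vec_merge op2 op1 mask) computes: a set mask bit selects the lane from the first vec_merge operand (operand 2 here), a clear bit selects it from operand 1. The function name and the use of std::array are illustrative assumptions, not anything taken from sse.md.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of (vec_merge op2 op1 mask) for up to 64 lanes:
// lane i is taken from op2 when mask bit i is set, otherwise from op1.
template <typename T, std::size_t N>
std::array<T, N> blend_by_mask(const std::array<T, N>& op1,
                               const std::array<T, N>& op2,
                               std::uint64_t mask) {
  static_assert(N <= 64, "the mask in this sketch holds at most 64 lanes");
  std::array<T, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    result[i] = ((mask >> i) & 1u) ? op2[i] : op1[i];
  return result;
}
```

The same model covers both new patterns; only the element width and the required ISA (TARGET_AVX512F versus TARGET_AVX512BW) differ between them.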
Re: [PATCH i386 AVX512] [58/n] Add vpmul[u]dq insn patterns.
On Fri, Sep 26, 2014 at 12:33 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Hello, Patch in the bottom adds support for vpmul[u]dq insn patterns. Bootstrapped. AVX-512* tests on top of patch-set all pass under simulator. Is it ok for trunk? gcc/ * config/i386/sse.md (define_expand vec_widen_umult_even_v8simask_name): Add masking. (define_insn *vec_widen_umult_even_v8simask_name): Ditto. (define_expand vec_widen_umult_even_v4simask_name): Ditto. (define_insn *vec_widen_umult_even_v4simask_name): Ditto. (define_expand vec_widen_smult_even_v8simask_name): Ditto. (define_insn *vec_widen_smult_even_v8simask_name): Ditto. (define_expand sse4_1_mulv2siv2di3mask_name): Ditto. (define_insn *sse4_1_mulv2siv2di3mask_name): Ditto. (define_insn avx512dq_mulmode3mask_name): New. OK. Thanks, Uros. -- Thanks, K diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 43d6655..e52d40c 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -9286,7 +9286,7 @@ (set_attr prefix evex) (set_attr mode XI)]) -(define_expand vec_widen_umult_even_v8si +(define_expand vec_widen_umult_even_v8simask_name [(set (match_operand:V4DI 0 register_operand) (mult:V4DI (zero_extend:V4DI @@ -9299,29 +9299,30 @@ (match_operand:V8SI 2 nonimmediate_operand) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6)])] - TARGET_AVX2 + TARGET_AVX2 mask_avx512vl_condition ix86_fixup_binary_operands_no_copy (MULT, V8SImode, operands);) -(define_insn *vec_widen_umult_even_v8si - [(set (match_operand:V4DI 0 register_operand =x) +(define_insn *vec_widen_umult_even_v8simask_name + [(set (match_operand:V4DI 0 register_operand =v) (mult:V4DI (zero_extend:V4DI (vec_select:V4SI - (match_operand:V8SI 1 nonimmediate_operand %x) + (match_operand:V8SI 1 nonimmediate_operand %v) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6)]))) (zero_extend:V4DI (vec_select:V4SI - (match_operand:V8SI 2 nonimmediate_operand xm) + (match_operand:V8SI 2 nonimmediate_operand vm) (parallel 
[(const_int 0) (const_int 2) (const_int 4) (const_int 6)])] - TARGET_AVX2 ix86_binary_operator_ok (MULT, V8SImode, operands) - vpmuludq\t{%2, %1, %0|%0, %1, %2} + TARGET_AVX2 mask_avx512vl_condition +ix86_binary_operator_ok (MULT, V8SImode, operands) + vpmuludq\t{%2, %1, %0mask_operand3|%0mask_operand3, %1, %2} [(set_attr type sseimul) - (set_attr prefix vex) + (set_attr prefix maybe_evex) (set_attr mode OI)]) -(define_expand vec_widen_umult_even_v4si +(define_expand vec_widen_umult_even_v4simask_name [(set (match_operand:V2DI 0 register_operand) (mult:V2DI (zero_extend:V2DI @@ -9332,28 +9333,29 @@ (vec_select:V2SI (match_operand:V4SI 2 nonimmediate_operand) (parallel [(const_int 0) (const_int 2)])] - TARGET_SSE2 + TARGET_SSE2 mask_avx512vl_condition ix86_fixup_binary_operands_no_copy (MULT, V4SImode, operands);) -(define_insn *vec_widen_umult_even_v4si - [(set (match_operand:V2DI 0 register_operand =x,x) +(define_insn *vec_widen_umult_even_v4simask_name + [(set (match_operand:V2DI 0 register_operand =x,v) (mult:V2DI (zero_extend:V2DI (vec_select:V2SI - (match_operand:V4SI 1 nonimmediate_operand %0,x) + (match_operand:V4SI 1 nonimmediate_operand %0,v) (parallel [(const_int 0) (const_int 2)]))) (zero_extend:V2DI (vec_select:V2SI - (match_operand:V4SI 2 nonimmediate_operand xm,xm) + (match_operand:V4SI 2 nonimmediate_operand xm,vm) (parallel [(const_int 0) (const_int 2)])] - TARGET_SSE2 ix86_binary_operator_ok (MULT, V4SImode, operands) + TARGET_SSE2 mask_avx512vl_condition +ix86_binary_operator_ok (MULT, V4SImode, operands) @ pmuludq\t{%2, %0|%0, %2} - vpmuludq\t{%2, %1, %0|%0, %1, %2} + vpmuludq\t{%2, %1, %0mask_operand3|%0mask_operand3, %1, %2} [(set_attr isa noavx,avx) (set_attr type sseimul) (set_attr prefix_data16 1,*) - (set_attr prefix orig,vex) + (set_attr prefix orig,maybe_evex) (set_attr mode TI)]) (define_expand vec_widen_smult_even_v16simask_name @@ -9401,7 +9403,7 @@ (set_attr prefix evex) (set_attr mode XI)]) -(define_expand vec_widen_smult_even_v8si 
+(define_expand vec_widen_smult_even_v8simask_name [(set (match_operand:V4DI 0 register_operand) (mult:V4DI (sign_extend:V4DI @@ -9414,30 +9416,31 @@ (match_operand:V8SI 2 nonimmediate_operand)
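The RTL of vec_widen_umult_even_v8si selects lanes 0, 2, 4 and 6 of each V8SI operand, zero-extends them to 64 bits and multiplies. A scalar sketch of that computation follows; the function name is hypothetical and the code is only a model of the pattern, not of the hardware encoding or of masking.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of the V8SI -> V4DI unsigned widening multiply:
// result[i] = zero_extend(a[2*i]) * zero_extend(b[2*i]).
std::array<std::uint64_t, 4> widen_umult_even(
    const std::array<std::uint32_t, 8>& a,
    const std::array<std::uint32_t, 8>& b) {
  std::array<std::uint64_t, 4> result{};
  for (std::size_t i = 0; i < 4; ++i)
    result[i] = static_cast<std::uint64_t>(a[2 * i]) *
                static_cast<std::uint64_t>(b[2 * i]);
  return result;
}
```

The odd lanes are ignored entirely, which is why the pattern is called the "even" widening multiply.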
Re: [PATCH i386 AVX512] [59/n] Add vptest[n]m, ucmp, cmpeq insn patterns.
On Fri, Sep 26, 2014 at 12:45 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
> Hello,
> Patch in the bottom adds support for vptest[n]m, ucmp, cmpeq.
>
> Bootstrapped.
> AVX-512* tests on top of patch-set all pass under simulator.
>
> Is it ok for trunk?
>
> gcc/
>       * config/i386/i386.c
>       (ix86_expand_args_builtin): Handle
>       CODE_FOR_avx512vl_cmpv4di3_mask, CODE_FOR_avx512vl_cmpv8si3_mask,
>       CODE_FOR_avx512vl_ucmpv4di3_mask, CODE_FOR_avx512vl_ucmpv8si3_mask,
>       CODE_FOR_avx512vl_cmpv2di3_mask, CODE_FOR_avx512vl_cmpv4si3_mask,
>       CODE_FOR_avx512vl_ucmpv2di3_mask, CODE_FOR_avx512vl_ucmpv4si3_mask.
>       * config/i386/sse.md
>       (define_insn

Double define_insn here.

>       (define_insn avx512f_ucmpmode3mask_scalar_merge_name): Delete.
>       avx512_ucmpVI12_AVX512VL:mode3mask_scalar_merge_name):New.
>       (define_insn avx512_ucmpVI48_AVX512VL:mode3mask_scalar_merge_name):Ditto.
>       (define_expand avx512_eqmode3mask_scalar_merge_name): Ditto.
>       (define_insn avx512_eqmode3mask_scalar_merge_name_1): Ditto.
>       (define_insn avx512_gtmode3mask_scalar_merge_name): Ditto.
>       (define_insn avx512_testmmode3mask_scalar_merge_name): Ditto.
>       (define_insn avx512_testnmmode3mask_scalar_merge_name): Ditto.

OK.

Thanks,
Uros.
-- Thanks, K diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 1aec70f..352ab81 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -34062,6 +34062,14 @@ ix86_expand_args_builtin (const struct builtin_description *d, case CODE_FOR_avx512f_cmpv16si3_mask: case CODE_FOR_avx512f_ucmpv8di3_mask: case CODE_FOR_avx512f_ucmpv16si3_mask: + case CODE_FOR_avx512vl_cmpv4di3_mask: + case CODE_FOR_avx512vl_cmpv8si3_mask: + case CODE_FOR_avx512vl_ucmpv4di3_mask: + case CODE_FOR_avx512vl_ucmpv8si3_mask: + case CODE_FOR_avx512vl_cmpv2di3_mask: + case CODE_FOR_avx512vl_cmpv4si3_mask: + case CODE_FOR_avx512vl_ucmpv2di3_mask: + case CODE_FOR_avx512vl_ucmpv4si3_mask: error (the last argument must be a 3-bit immediate); return const0_rtx; diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index e52d40c..625a2e0 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -2517,11 +2517,25 @@ (set_attr prefix evex) (set_attr mode sseinsnmode)]) -(define_insn avx512f_ucmpmode3mask_scalar_merge_name +(define_insn avx512_ucmpmode3mask_scalar_merge_name [(set (match_operand:avx512fmaskmode 0 register_operand =Yk) (unspec:avx512fmaskmode - [(match_operand:VI48_512 1 register_operand v) - (match_operand:VI48_512 2 nonimmediate_operand vm) + [(match_operand:VI12_AVX512VL 1 register_operand v) + (match_operand:VI12_AVX512VL 2 nonimmediate_operand vm) + (match_operand:SI 3 const_0_to_7_operand n)] + UNSPEC_UNSIGNED_PCMP))] + TARGET_AVX512BW + vpcmpussemodesuffix\t{%3, %2, %1, %0mask_scalar_merge_operand4|%0mask_scalar_merge_operand4, %1, %2, %3} + [(set_attr type ssecmp) + (set_attr length_immediate 1) + (set_attr prefix evex) + (set_attr mode sseinsnmode)]) + +(define_insn avx512_ucmpmode3mask_scalar_merge_name + [(set (match_operand:avx512fmaskmode 0 register_operand =Yk) + (unspec:avx512fmaskmode + [(match_operand:VI48_AVX512VL 1 register_operand v) + (match_operand:VI48_AVX512VL 2 nonimmediate_operand vm) (match_operand:SI 3 
const_0_to_7_operand n)] UNSPEC_UNSIGNED_PCMP))] TARGET_AVX512F @@ -10265,20 +10279,42 @@ (set_attr prefix vex) (set_attr mode OI)]) -(define_expand avx512f_eqmode3mask_scalar_merge_name +(define_expand avx512_eqmode3mask_scalar_merge_name + [(set (match_operand:avx512fmaskmode 0 register_operand) + (unspec:avx512fmaskmode + [(match_operand:VI12_AVX512VL 1 register_operand) + (match_operand:VI12_AVX512VL 2 nonimmediate_operand)] + UNSPEC_MASKED_EQ))] + TARGET_AVX512BW + ix86_fixup_binary_operands_no_copy (EQ, MODEmode, operands);) + +(define_expand avx512_eqmode3mask_scalar_merge_name [(set (match_operand:avx512fmaskmode 0 register_operand) (unspec:avx512fmaskmode - [(match_operand:VI48_512 1 register_operand) - (match_operand:VI48_512 2 nonimmediate_operand)] + [(match_operand:VI48_AVX512VL 1 register_operand) + (match_operand:VI48_AVX512VL 2 nonimmediate_operand)] UNSPEC_MASKED_EQ))] TARGET_AVX512F ix86_fixup_binary_operands_no_copy (EQ, MODEmode, operands);) -(define_insn avx512f_eqmode3mask_scalar_merge_name_1 +(define_insn avx512_eqmode3mask_scalar_merge_name_1 [(set (match_operand:avx512fmaskmode 0 register_operand =Yk) (unspec:avx512fmaskmode -
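The ucmp patterns take a 3-bit immediate (const_0_to_7_operand) and produce a mask register. A scalar sketch of that behaviour is given below. The mapping from immediate value to predicate is an assumption written from memory of Intel's description of VPCMPU (0 EQ, 1 LT, 2 LE, 3 FALSE, 4 NE, 5 NLT, 6 NLE, 7 TRUE) and should be checked against the SDM; the function names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed predicate encoding for the 3-bit immediate (verify against
// the Intel SDM before relying on it).
bool compare_unsigned(std::uint32_t a, std::uint32_t b, unsigned imm3) {
  switch (imm3 & 7u) {
    case 0: return a == b;   // EQ
    case 1: return a < b;    // LT
    case 2: return a <= b;   // LE
    case 3: return false;    // FALSE
    case 4: return a != b;   // NE
    case 5: return a >= b;   // NLT
    case 6: return a > b;    // NLE
    default: return true;    // TRUE
  }
}

// Builds the result mask: bit i is set when lane i satisfies the predicate.
template <std::size_t N>
std::uint64_t ucmp_mask(const std::array<std::uint32_t, N>& a,
                        const std::array<std::uint32_t, N>& b,
                        unsigned imm3) {
  static_assert(N <= 64, "the mask in this sketch holds at most 64 lanes");
  std::uint64_t mask = 0;
  for (std::size_t i = 0; i < N; ++i)
    if (compare_unsigned(a[i], b[i], imm3))
      mask |= std::uint64_t{1} << i;
  return mask;
}
```

This also shows why ix86_expand_args_builtin rejects anything wider than three bits for these builtins: only eight predicates exist in this model.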
Re: [PATCH i386 AVX512] [60/n] Update 128bit ashrv insn pattern.
On Fri, Sep 26, 2014 at 1:13 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Hello, This tiny patch extends 128bit ashrv expander. Bootstrapped. AVX-512* tests on top of patch-set all pass under simulator. Is it ok for trunk? gcc/ * config/i386/sse.md (define_mode_iterator VI128_128 [V16QI V8HI V2DI]): Delete. (define_expand vashrmode3mask_name): Add masking, use VI12_128 mode iterator. (define_expand ashrv2di3mask_name): New. -- Thanks, K diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 625a2e0..91d6778 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -498,7 +498,6 @@ (define_mode_iterator VI12_128 [V16QI V8HI]) (define_mode_iterator VI14_128 [V16QI V4SI]) (define_mode_iterator VI124_128 [V16QI V8HI V4SI]) -(define_mode_iterator VI128_128 [V16QI V8HI V2DI]) (define_mode_iterator VI24_128 [V8HI V4SI]) (define_mode_iterator VI248_128 [V8HI V4SI V2DI]) (define_mode_iterator VI48_128 [V4SI V2DI]) @@ -15720,17 +15719,36 @@ (match_operand:VI48_256 2 nonimmediate_operand)))] TARGET_AVX2) -(define_expand vashrmode3 - [(set (match_operand:VI128_128 0 register_operand) - (ashiftrt:VI128_128 - (match_operand:VI128_128 1 register_operand) - (match_operand:VI128_128 2 nonimmediate_operand)))] - TARGET_XOP +(define_expand vashrmode3mask_name + [(set (match_operand:VI12_128 0 register_operand) + (ashiftrt:VI12_128 + (match_operand:VI12_128 1 register_operand) + (match_operand:VI12_128 2 nonimmediate_operand)))] + TARGET_XOP || (TARGET_AVX512BW TARGET_AVX512VL) { - rtx neg = gen_reg_rtx (MODEmode); - emit_insn (gen_negmode2 (neg, operands[2])); - emit_insn (gen_xop_shamode3 (operands[0], operands[1], neg)); - DONE; + if (TARGET_XOP) +{ + rtx neg = gen_reg_rtx (MODEmode); + emit_insn (gen_negmode2 (neg, operands[2])); + emit_insn (gen_xop_shamode3 (operands[0], operands[1], neg)); + DONE; +} +}) + +(define_expand vashrv2di3mask_name + [(set (match_operand:V2DI 0 register_operand) + (ashiftrt:V2DI + (match_operand:V2DI 1 register_operand) + 
(match_operand:V2DI 2 nonimmediate_operand)))] + TARGET_XOP || TARGET_AVX512VL +{ + if (!TARGET_XOP) This condition is wrong. Please re-test the patch. +{ + rtx neg = gen_reg_rtx (V2DImode); + emit_insn (gen_negv2di2 (neg, operands[2])); + emit_insn (gen_xop_shav2di3 (operands[0], operands[1], neg)); + DONE; +} }) (define_expand vashrv4si3
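The XOP branch of the expander negates the shift counts before emitting xop_sha. A scalar sketch of why is given here, under the assumption (from memory of the XOP vpsha description, not from this patch) that a non-negative count shifts left and a negative count shifts right arithmetically. All names are hypothetical, and counts are assumed to stay within 0..63.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Arithmetic right shift that avoids relying on how >> treats negative
// values before C++20. Requires 0 <= count <= 63.
std::int64_t ashr64(std::int64_t value, unsigned count) {
  return value < 0 ? ~(~value >> count) : (value >> count);
}

// Assumed model of an XOP-style shift: the sign of the count picks the
// direction. Requires -63 <= count <= 63.
std::int64_t sha64(std::int64_t value, int count) {
  if (count >= 0)
    // Shift as unsigned; the conversion back is modular on the usual targets.
    return static_cast<std::int64_t>(static_cast<std::uint64_t>(value) << count);
  return ashr64(value, static_cast<unsigned>(-count));
}

// What the TARGET_XOP branch computes per lane: negate the count, then sha.
std::array<std::int64_t, 2> vashr_v2di(const std::array<std::int64_t, 2>& x,
                                       const std::array<int, 2>& counts) {
  std::array<std::int64_t, 2> result{};
  for (std::size_t i = 0; i < 2; ++i)
    result[i] = sha64(x[i], -counts[i]);
  return result;
}
```

Under that assumption the negate-then-sha sequence only makes sense when XOP is available, which is consistent with the review remark about the !TARGET_XOP condition.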
Re: [PATCH i386 AVX512] [61/n] Update FP logic insn patterns.
On Fri, Sep 26, 2014 at 2:32 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Hello, This patch extends andnot and any_logic insn patterns. Bootstrapped. AVX-512* tests on top of patch-set all pass under simulator. Is it ok for trunk? gcc/ * config/i386/sse.md (define_insn sse_andnotVF_128_256:mode3mask_name): Add masking, use VF_128_256 mode iterator and update assembler emit code. (define_insn sse_andnotVF_512:mode3mask_name): New. (define_expand any_logic:codeVF_128_256:mode3mask_name): Add masking, use VF_128_256 mode iterator. (define_expand any_logic:codeVF_512:mode3mask_name): New. (define_insn *any_logic:codeVF_128_256:mode3mask_name): Add masking, use VF_128_256 mode iterator and update assembler emit code. (define_insn *any_logic:codeVF_512:mode3mask_name): New. (define_mode_attr avx512flogicsuff): Delete. (define_insn avx512f_logicmode): Ditto. (define_insn *andnotmode3mask_name): Update MODE_XI, MODE_OI, MODE_TI. (define_insn mask_codeforcodemode3mask_name): Ditto. -- Thanks, K diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 91d6778..9835234 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -2687,15 +2687,15 @@ ;; ; -(define_insn sse_andnotmode3 - [(set (match_operand:VF 0 register_operand =x,v) - (and:VF - (not:VF - (match_operand:VF 1 register_operand 0,v)) - (match_operand:VF 2 nonimmediate_operand xm,vm)))] - TARGET_SSE +(define_insn sse_andnotmode3mask_name + [(set (match_operand:VF_128_256 0 register_operand =x,v) + (and:VF_128_256 + (not:VF_128_256 + (match_operand:VF_128_256 1 register_operand 0,v)) + (match_operand:VF_128_256 2 nonimmediate_operand xm,vm)))] + TARGET_SSE mask_avx512vl_condition { - static char buf[32]; + static char buf[128]; const char *ops; const char *suffix; @@ -2715,17 +2715,17 @@ ops = andn%s\t{%%2, %%0|%%0, %%2}; break; case 1: - ops = vandn%s\t{%%2, %%1, %%0|%%0, %%1, %%2}; + ops = vandn%s\t{%%2, %%1, %%0mask_operand3_1|%%0mask_operand3_1, %%1, %%2}; break; default: gcc_unreachable 
(); } - /* There is no vandnp[sd]. Use vpandnq. */ - if (MODE_SIZE == 64) + /* There is no vandnp[sd] in avx512f. Use vpandn[qd]. */ + if (mask_applied !TARGET_AVX512DQ) { - suffix = q; - ops = vpandn%s\t{%%2, %%1, %%0|%%0, %%1, %%2}; + suffix = GET_MODE_INNER (MODEmode) == DFmode ? q : d; + ops = vpandn%s\t{%%2, %%1, %%0mask_operand3_1|%%0mask_operand3_1, %%1, %%2}; } snprintf (buf, sizeof (buf), ops, suffix); @@ -2745,30 +2745,63 @@ ] (const_string MODE)))]) -(define_expand codemode3 + +(define_insn sse_andnotmode3mask_name + [(set (match_operand:VF_512 0 register_operand =v) + (and:VF_512 + (not:VF_512 + (match_operand:VF_512 1 register_operand v)) + (match_operand:VF_512 2 nonimmediate_operand vm)))] + TARGET_AVX512F +{ + static char buf[128]; + const char *ops; + const char *suffix; + + suffix = ssemodesuffix; + ops = ; + + /* There is no vandnp[sd] in avx512f. Use vpandn[qd]. */ + if (!TARGET_AVX512DQ) All other patterns also have mask_applied condition here. Is the above condition correct? +{ + suffix = GET_MODE_INNER (MODEmode) == DFmode ? 
q : d; + ops = p; +} + + snprintf (buf, sizeof (buf), + v%sandn%s\t{%%2, %%1, %%0mask_operand3_1|%%0mask_operand3_1, %%1, %%2}, + ops, suffix); + return buf; +} + [(set_attr type sselog) + (set_attr prefix evex) + (set_attr mode sseinsnmode)]) + +(define_expand codemode3mask_name [(set (match_operand:VF_128_256 0 register_operand) - (any_logic:VF_128_256 - (match_operand:VF_128_256 1 nonimmediate_operand) - (match_operand:VF_128_256 2 nonimmediate_operand)))] - TARGET_SSE + (any_logic:VF_128_256 + (match_operand:VF_128_256 1 nonimmediate_operand) + (match_operand:VF_128_256 2 nonimmediate_operand)))] + TARGET_SSE mask_avx512vl_condition ix86_fixup_binary_operands_no_copy (CODE, MODEmode, operands);) -(define_expand codemode3 +(define_expand codemode3mask_name [(set (match_operand:VF_512 0 register_operand) - (fpint_logic:VF_512 + (any_logic:VF_512 (match_operand:VF_512 1 nonimmediate_operand) (match_operand:VF_512 2 nonimmediate_operand)))] TARGET_AVX512F ix86_fixup_binary_operands_no_copy (CODE, MODEmode, operands);) -(define_insn *codemode3 - [(set (match_operand:VF 0 register_operand =x,v) - (any_logic:VF - (match_operand:VF 1 nonimmediate_operand %0,v) -
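Two things happen in the andnot patterns: the operation itself is (and (not op1) op2) on the raw bits, and the output template switches from the FP mnemonic to the integer one when AVX512DQ is not available. A small sketch of both follows. It assumes 32-bit IEEE-754 float; the function names and the boolean parameter are illustrative stand-ins for the pattern's conditions, not GCC code.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Bitwise (and (not op1) op2) on single-precision values.
// Assumes float is a 32-bit IEEE-754 type.
float andnot_ps(float op1, float op2) {
  std::uint32_t a = 0, b = 0;
  std::memcpy(&a, &op1, sizeof a);
  std::memcpy(&b, &op2, sizeof b);
  const std::uint32_t r = ~a & b;
  float out = 0.0f;
  std::memcpy(&out, &r, sizeof out);
  return out;
}

// Mirrors the "v%sandn%s" template of the 512-bit pattern: the integer
// form uses prefix "p" and suffix "q"/"d", the FP form uses "pd"/"ps".
std::string andnot_mnemonic(bool element_is_double, bool use_integer_form) {
  const std::string ops = use_integer_form ? "p" : "";
  const std::string suffix = use_integer_form
                                 ? (element_is_double ? "q" : "d")
                                 : (element_is_double ? "pd" : "ps");
  return "v" + ops + "andn" + suffix;
}
```

With the sign-bit mask as op1 (that is, -0.0f), andnot_ps clears the sign of op2, which is the classic use of this instruction for fabs.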
Re: [PATCH i386 AVX512] [62/n] Add vpmaddubsw,vdbpsadbw insn patterns.
On Fri, Sep 26, 2014 at 4:09 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Hello, This patch introduces patterns for vpmaddubsw and vdbpsadbw insn. Bootstrapped. AVX-512* tests on top of patch-set all pass under simulator. Is it ok for trunk? gcc/ * config/i386/sse.md (define_c_enum unspec): Add UNSPEC_DBPSADBW, UNSPEC_PMADDUBSW512. (define_insn avx512bw_pmaddubsw512modemask_name): New. (define_insn mask_codeforavx512bw_dbpsadbwmodemask_name): Ditto. -- Thanks, K diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 9835234..601373b 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -130,6 +130,8 @@ UNSPEC_SHA256RNDS2 ;; For AVX512BW support + UNSPEC_DBPSADBW + UNSPEC_PMADDUBSW512 UNSPEC_PSHUFHW UNSPEC_PSHUFLW UNSPEC_CVTINT2MASK @@ -13401,6 +13403,19 @@ (set_attr prefix vex) (set_attr mode OI)]) +;; Unspec version for intrinsics. +(define_insn avx512bw_pmaddubsw512modemask_name + [(set (match_operand:VI2_AVX512VL 0 register_operand =v) + (unspec:VI2_AVX512VL +[(match_operand:dbpsadbwmode 1 register_operand v) + (match_operand:dbpsadbwmode 2 nonimmediate_operand vm)] + UNSPEC_PMADDUBSW512))] + TARGET_AVX512BW + vpmaddubsw\t{%2, %1, %0mask_operand3|%0mask_operand3, %1, %2}; + [(set_attr type sseiadd) + (set_attr prefix evex) + (set_attr mode XI)]) + Can the one above be described using standard RTX, perhaps something similar to avx2_pmaddubsw256? 
(define_insn ssse3_pmaddubsw128 [(set (match_operand:V8HI 0 register_operand =x,x) (ss_plus:V8HI @@ -18097,6 +18112,21 @@ [(set_attr prefix evex) (set_attr mode ssescalarmode)]) +(define_insn mask_codeforavx512bw_dbpsadbwmodemask_name + [(set (match_operand:VI2_AVX512VL 0 register_operand =v) + (unspec:VI2_AVX512VL + [(match_operand:dbpsadbwmode 1 register_operand v) + (match_operand:dbpsadbwmode 2 nonimmediate_operand vm) + (match_operand:SI 3 const_0_to_255_operand)] + UNSPEC_DBPSADBW))] + TARGET_AVX512BW + vdbpsadbw\t{%3, %2, %1, %0mask_operand4|%0mask_operand4, %1, %2, %3} + [(set_attr isa avx) + (set_attr type sselog1) + (set_attr length_immediate 1) + (set_attr prefix evex) + (set_attr mode sseinsnmode)]) + (define_insn clzmode2mask_name [(set (match_operand:VI48_AVX512VL 0 register_operand =v) (clz:VI48_AVX512VL
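On the question of describing pmaddubsw with standard RTX: the existing ssse3_pmaddubsw128 pattern spells the operation out as a signed-saturating add (ss_plus) of widened products. A scalar sketch of the 128-bit form is below, assuming the usual semantics (unsigned bytes from the first operand, signed bytes from the second, adjacent products summed with signed saturation). The helper names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Clamp a 32-bit intermediate to the signed 16-bit range.
std::int16_t saturate16(std::int32_t v) {
  if (v > 32767) return 32767;
  if (v < -32768) return -32768;
  return static_cast<std::int16_t>(v);
}

// Scalar model of the 128-bit pmaddubsw:
// result[i] = sat16(a[2i] * b[2i] + a[2i+1] * b[2i+1]),
// where a holds unsigned bytes and b holds signed bytes.
std::array<std::int16_t, 8> maddubsw128(const std::array<std::uint8_t, 16>& a,
                                        const std::array<std::int8_t, 16>& b) {
  std::array<std::int16_t, 8> result{};
  for (std::size_t i = 0; i < 8; ++i) {
    const std::int32_t lo = static_cast<std::int32_t>(a[2 * i]) * b[2 * i];
    const std::int32_t hi = static_cast<std::int32_t>(a[2 * i + 1]) * b[2 * i + 1];
    result[i] = saturate16(lo + hi);
  }
  return result;
}
```

The 512-bit variant would follow the same per-lane rule with more lanes, which is what makes a standard-RTX description plausible.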
Re: [Patch, Fortran] Add CO_BROADCAST
Dominique Dhumieres wrote:
> The failures for the gfortran.dg/coarray_collectives_9.f90 are fixed with the following patch:

Looks good to me. The patch is OK with a ChangeLog. Thanks for the patch and sorry for the test failures.

Tobias
Re: [PATCH v2] PR libitm/61164: use always_inline consistently
On Sat, Sep 27, 2014 at 09:00:00PM +0400, Gleb Fotengauer-Malinovskiy wrote:
> 2014-09-27  Gleb Fotengauer-Malinovskiy  gle...@altlinux.org
>
> libitm/
>       PR libitm/61164
>       * local_atomic (__always_inline): Add inline.
>       (__calculate_memory_order): Remove inline.
>       (atomic_thread_fence): Likewise.
>       (atomic_signal_fence): Likewise.
>       (atomic_flag_test_and_set_explicit): Likewise.
>       (atomic_flag_clear_explicit): Likewise.
>       (atomic_flag_test_and_set): Likewise.
>       (atomic_flag_clear): Likewise.
> ---
> Sorry, previous patch is incomplete.

This patch doesn't seem to match the ChangeLog, there is no change in the #define, etc.

Furthermore, I think it is just wrong to redefine a glibc macro.

I'd suggest to just

  sed -i -e 's/__always_inline/__libitm_always_inline/g' libitm/local_atomic

Jakub
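To make the suggestion concrete, here is a sketch of a project-private always-inline macro that already contains the inline keyword, so the functions using it do not repeat inline themselves. The macro name example_always_inline is a hypothetical stand-in for __libitm_always_inline (double-underscore names are reserved for the implementation, so a user-level example should not define one), and the definition is an assumption rather than the text of libitm/local_atomic.

```cpp
// Project-private macro, so it cannot clash with glibc's __always_inline.
// The attribute is a GNU extension; other compilers fall back to inline.
#if defined(__GNUC__)
#define example_always_inline inline __attribute__((__always_inline__))
#else
#define example_always_inline inline
#endif

// The macro supplies "inline", so the declaration does not repeat it.
example_always_inline int add_one(int x) { return x + 1; }

// An ordinary caller; add_one is expanded in place here.
int increment_twice(int x) { return add_one(add_one(x)); }
```

Keeping the macro name private to the library is what avoids redefining a glibc macro.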
Re: [fortran,patch] Forbid assignment of different character kinds
It looks like the committee has reversed its opinion on this since the 2008 interp. There is wording in both the F2003 and F2008 standards that supports this view, so I’ve closed the PR.

Now, here’s a tiny patch to silence the related warning in PR36534. I also remove the condition on gfc_current_form != FORM_FIXED, as diagnostics should be emitted based on language/pedantic options, not source form.

Regtested on x86_64-apple-darwin14, comes with a testcase. OK to commit?

FX

Attachments: pr36534.ChangeLog, pr36534.diff
Re: [RFC: Patch, PR 60102] [4.9/4.10 Regression] powerpc fp-bit ices@dwf_regno
Maciej W. Rozycki wrote:
> On Mon, 4 Aug 2014, Edmar wrote:
> > Committed on trunk, revision 213596
> > Committed on 4.9 branch, revision 213597
>
> This change regressed GDB for e500v2 multilibs, presumably because it does not understand the new DWARF register numbers and does not know how to map them to hardware registers.

As I understand it, the change was supposed to only affect GCC internals, all externally generated debug info was supposed to remain unchanged. If there are changes in debug info, something must have gone wrong.

Bye,
Ulrich

--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
Re: [PATCH] Add libstdc++ baseline_symbols for aarch64
On 26/09/14 23:42 +0200, Andreas Schwab wrote:
> Generated by make new-abi-baseline on aarch64-suse-linux.
>
> Andreas.
>
>       * config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: New file.

OK, thanks.
Re: Enable TBAA on anonymous types with LTO
On Fri, 26 Sep 2014, Jan Hubicka wrote:

> Hello,
> this is a patch to preserve TBAA for anonymous types to LTO. The difference can be seen on the testcase:
>
> namespace {
>   struct A {int a;};
>   struct B {int b;};
> }
> struct A aa,*a=&aa;
> struct B bb,*b=&bb;
> void setA()
> {
>   a->a=1;
> }
> void setB()
> {
>   b->b=2;
> }
> int main()
> {
>   asm("":"=r"(a),"=r"(b));
>   setA();
>   setB();
>   if (!__builtin_constant_p (a->a))
>     __builtin_abort ();
>   return 0;
> }
>
> With patch it does get properly optimized with -O2 -fno-early-inlining.
>
> The basic idea is to:
>
> 1) stream canonical types when they are anonymous (and thus need not be structurally merged)

Why not just make all anonymous types their own canonical type? (of course considering type variants)

> 2) update canonical type hash so it can deal with types that already have canonical type set.
> I insert even anonymous types there because I am not able to get rid of cases where non-anonymous type explicitly mentions anonymous. Consider:
>
> namespace {
>   struct B {};
> }
> struct A
> {
>   void t(B);
>   void t2();
> };
> void A::t(B) { }
> void A::t2() { }
>
> Here we end up having type of method T non-anonymous but it builds from B that is anonymous.

But that makes B non-anonymous as well? How is A::t mangled? Consider also the simpler case

struct A { struct B b; };

> Being able to handle non-upwards closed cases will be needed soon for full ODR type handling
>
> 3) Disable tree merging of anonymous namespace nodes and anonymous types. The second is needed, because I can have two identically looking anonymous types from same unit with different canonical types.

But the container should be distinct? Isn't this again the issue that we merge anonymous namespace decls? Please try to fix that one and for all.

> This may go away once we get some ability to decide on unmergability at stream out time.
>
> I do not attempt to merge anonymous types with structurally equivalent non-anonymous types from other compilation units.
> I think it is the nature of the C++ language that types in anonymous namespaces can not be accessed by other units and I hope to use this for other optimizations, too.

What about cross-language LTO? With your scheme you say that you can't ever interoperate using anonymous entities (even if used from a non-anonymous one like in the examples above)? I think that's a dangerous route to go. Maybe detect the case where we compile from multiple source languages and behave differently?

> We can add documentation about this to -fstrict-aliasing section of manual I guess.
>
> What I am concerned about is the needed change in c-decl.c. C frontend currently outputs declarations that are confused by type_in_anonymous_namespace_p as anonymous in some cases. This is because it does not set PUBLIC flag on TYPE decl. This is a bug:
>
> /* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL,
>    nonzero means name is to be accessible from outside this translation unit.
>    In an IDENTIFIER_NODE, nonzero means an external declaration
>    accessible from outside this translation unit was previously seen
>    for this name in an inner scope.  */
> #define TREE_PUBLIC(NODE) ((NODE)->base.public_flag)
>
> This fortunately manifests itself as false warnings about type incompatibility from lto-symtab. I did not see these with other languages, but I suppose we will want to check that other FEs are behaving correctly here. I do not know how Ada and Fortran should behave here.
>
> Bootstrapped/regtested x86_64-linux, lto-bootstrapped, tested with Firefox and libreoffice. I also checked that tree merging is working still well. OK?

Not in this form, we have to discuss this further. It's way too aggressive in my view.

> Honza
>
>       * c-decl.c (pushtag): Set TREE_PUBLIC on STUB DECL
>
>       * lto-streamer-out.c (DFS::DFS_write_tree_body): Optionally stream
>       TYPE_CANONICAL.
>       * lto.c (iterative_hash_canonical_type): Handle cases where
>       TYPE_CANONICAL is pre-set.
>       (gimple_register_canonical_type_1): Likewise.
>       (lto_register_canonical_types): Likewise.
>       (compare_tree_sccs_1): Anonymous namespaces never compare;
>       neither do types in anonymous namespace.
>       (lto_read_decls): Do not check TYPE_CANONICAL.
>       * tree-streamer-out.c (write_ts_type_common_tree_pointers):
>       Optionally write TYPE_CANONICAL.
>       * lto-streamer-in.c (lto_read_body_or_constructor): Handle case
>       where TYPE_CANONICAL is pre-set.
>       * tree-streamer-in.c (lto_input_ts_type_common_tree_pointers):
>       Stream in TYPE_CANONICAL.
>
> Index: c/c-decl.c
> ===
> --- c/c-decl.c        (revision 215645)
> +++ c/c-decl.c        (working copy)
> @@ -1466,6 +1466,7 @@ pushtag (location_t loc, tree name,
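For readers less familiar with what TBAA buys in the testcase discussed in this thread, here is a reduced sketch without the asm barrier or LTO. Since A and B are distinct types, a conforming program cannot make pa->a and pb->b refer to the same object, so the compiler is allowed to assume the second store leaves pa->a unchanged. The function names are hypothetical, and the sketch only illustrates the aliasing rule, not the streaming changes in the patch.

```cpp
namespace {
struct A { int a; };
struct B { int b; };
}  // namespace

// With type-based alias analysis the store to pb->b cannot modify pa->a,
// so the returned value may be folded to the constant 1.
int store_both(A* pa, B* pb) {
  pa->a = 1;
  pb->b = 2;
  return pa->a;
}

// Small driver that uses two separate objects of the two types.
int store_both_demo() {
  A aa = {0};
  B bb = {0};
  return store_both(&aa, &bb);
}
```

The patch is about keeping this distinction alive after LTO streaming, where both types would otherwise be considered structurally equivalent.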
RE: [RFC: Patch, PR 60102] [4.9/4.10 Regression] powerpc fp-bit ices@dwf_regno
From: Ulrich Weigand [mailto:uweig...@de.ibm.com]
> Maciej W. Rozycki wrote:
> > On Mon, 4 Aug 2014, Edmar wrote:
> > > Committed on trunk, revision 213596
> > > Committed on 4.9 branch, revision 213597
> >
> > This change regressed GDB for e500v2 multilibs, presumably because it does not understand the new DWARF register numbers and does not know how to map them to hardware registers.
>
> As I understand it, the change was supposed to only affect GCC internals, all externally generated debug info was supposed to remain unchanged. If there are changes in debug info, something must have gone wrong.

Let me check if I can track this down.

Regards,
Rohit
RE: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C
Hi!

On Mon, 22 Sep 2014 19:21:33 +, Tannenbaum, Barry M barry.m.tannenb...@intel.com wrote:
> That's exactly correct.
>
> There are two ways to implement a work-stealing scheduler. We refer to them as: [...]

Thanks for the explanation.

> Cilk implements Parent Stealing. Since you're running with 1 worker, there's no other worker to steal the continuation. So steal_flag will never be set to 1 and you'll never break out of the loop.

Remains the question about how to address that in the testsuite:

> -Original Message-
> From: Thomas Schwinge [mailto:tho...@codesourcery.com]
> Sent: Monday, September 22, 2014 9:56 AM
> To: Iyer, Balaji V
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C
>
> Hi!
>
> On Tue, 27 Aug 2013 21:30:49 +, Iyer, Balaji V balaji.v.i...@intel.com wrote:
> > --- /dev/null
> > +++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c
> > @@ -0,0 +1,37 @@
> > +/* { dg-do run { target { i?86-*-* x86_64-*-* arm*-*-* } } } */
> > +/* { dg-options "-fcilkplus" } */
> > +/* { dg-options "-lcilkrts" { target { i?86-*-* x86_64-*-* arm*-*-* } } } */
> > +
> > +void f0(volatile int *steal_flag)
> > +{
> > +  int i = 0;
> > +  /* Wait for steal_flag to be set */
> > +  while (!*steal_flag)
> > +    ;
> > +}
> > +
> > +int f1()
> > +{
> > +
> > +  volatile int steal_flag = 0;
> > +  _Cilk_spawn f0(&steal_flag);
> > +  steal_flag = 1;  // Indicate stolen
> > +  _Cilk_sync;
> > +  return 0;
> > +}
> > +
> > +void f2(int q)
> > +{
> > +  q = 5;
> > +}
> > +
> > +void f3()
> > +{
> > +  _Cilk_spawn f2(f1());
> > +}
> > +
> > +int main()
> > +{
> > +  f3();
> > +  return 0;
> > +}
>
> Is this really well-formed Cilk Plus code? Running with CILK_NWORKERS=1, or -- the equivalent -- in a system with just one CPU (as per libcilkrts/runtime/os-unix.c:__cilkrts_hardware_cpu_count returning 1), I see this test busy-loop as follows:
>
>     Breakpoint 1, __cilkrts_hardware_cpu_count () at ../../../source/libcilkrts/runtime/os-unix.c:358
>     358     {
>     (gdb) return 1
>     Make __cilkrts_hardware_cpu_count return now? (y or n) y
>     #0  cilkg_get_user_settable_values () at ../../../source/libcilkrts/runtime/global_state.cpp:385
>     385         CILK_ASSERT(hardware_cpu_count > 0);
>     (gdb) c
>     Continuing.
>     ^C
>     Program received signal SIGINT, Interrupt.
>     f0 (steal_flag=steal_flag@entry=0x7fffd03c) at [...]/source/gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c:9
>     9         while (!*steal_flag)
>     (gdb) info threads
>       Id   Target Id         Frame
>     * 1    Thread 0x77fcd780 (LWP 30816) spawning_arg.ex f0 (steal_flag=steal_flag@entry=0x7fffd03c) at [...]/source/gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c:9
>     (gdb) list
>     4
>     5       void f0(volatile int *steal_flag)
>     6       {
>     7         int i = 0;
>     8         /* Wait for steal_flag to be set */
>     9         while (!*steal_flag)
>     10          ;
>     11      }
>     12
>     13      int f1()
>     (gdb) bt
>     #0  f0 (steal_flag=steal_flag@entry=0x7fffd03c) at [...]/source/gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c:9
>     #1  0x004009c8 in _cilk_spn_0 () at [...]/source/gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c:17
>     #2  0x00400a4b in f1 () at [...]/source/gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c:17
>     #3  0x00400d0e in _cilk_spn_1 () at [...]/source/gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c:30
>     #4  0x00400d7a in f3 () at [...]/source/gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c:30
>     #5  0x00400e33 in main () at [...]/source/gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c:35
>
> No additional thread has been spawned by libcilkrts, and the one initial thread is stuck in f0, without being able to make progress. Should in f0's while loop, some function be called to yield to libcilkrts scheduler, or should libcilkrts have spawned an additional thread, or is the test case just not valid Cilk Plus code?

Assuming the test cases are considered well-formed Cilk Plus code, I understand there is then a hard requirement to run them with more than one worker. OK to fix as follows?
commit ee7138e451d1f3284d6fa0f61fe517c82db94060 Author: Thomas Schwinge tho...@codesourcery.com Date: Mon Sep 29 12:47:34 2014 +0200 Audit Cilk Plus tests for CILK_NWORKERS=1. gcc/testsuite/ * c-c++-common/cilk-plus/CK/spawning_arg.c (main): Call __cilkrts_set_param to set two workers. * c-c++-common/cilk-plus/CK/steal_check.c (main): Likewise. * g++.dg/cilk-plus/CK/catch_exc.cc (main): Likewise. --- gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c | 15 +++ gcc/testsuite/c-c++-common/cilk-plus/CK/steal_check.c | 17 ++--- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc | 14 ++ 3 files changed, 43 insertions(+), 3 deletions(-) diff --git
[PATCH, ARM] attribute target (thumb,arm)
Hi Ramana, Richard, This patch implements the target attribute (and pragma) to allow function-based interworking. As in the updated documentation, the syntax is: __attribute__((target("thumb"))) int foo() Forces Thumb mode for function foo only. If the file was compiled with -mthumb, it has no effect. Similarly, __attribute__((target("arm"))) int foo() Forces ARM mode for function foo. It has no effect if the file was not compiled with -mthumb. Regions can be grouped together with #pragma GCC target ("thumb") or #pragma GCC target ("arm"). A few notes: - Inlining is allowed between functions of the same mode (compilation switch, #pragma and attribute). - 'arm_option_override' is now reorganized around 'arm_option_override_internal' for Thumb-related macros. - I kept TARGET_UNIFIED_ASM to minimize changes, although removing it would avoid switching between unified/divided asm and would simplify arm_declare_function_name. This should be considered at some point. - It is only available for Thumb2 variants (for Thumb1: lack of interest and a few complications I was unable to test, although this could be added easily if needed, I think). Tested for no regression for arm-none-eabi [,-with-arch=armv7-a] OK for trunk? Many thanks, Christian 2014-09-23 Christian Bruel christian.br...@st.com PR target/52144 * config/arm/arm.opt (THUMB): Save target option. * config/arm/arm-protos.h (arm_declare_function_name, arm_valid_target_attribute_tree, arm_register_target_pragmas, arm_reset_previous_fndecl): Declare. * config/arm/arm.c (arm_declare_function_name): Move here. Add attribute target support. (emit_thumb): New boolean. (arm_file_start): Set emit_thumb mode. (arm_pragma_target_parse): New function. (arm_valid_target_attribute_p, arm_valid_target_attribute_tree, arm_valid_target_attribute_rec): New functions. (arm_can_inline_p): New function. (arm_set_current_function, arm_reset_previous_fndecl): New functions. (arm_option_override): Split. (arm_option_override_internal): New function.
(TARGET_CAN_INLINE_P, TARGET_SET_CURRENT_FUNCTION, TARGET_OPTION_VALID_ATTRIBUTE_P): Define. * config/arm/arm-c.c (arm_pragma_target_parse, arm_target_modify_macros, arm_pragma_target_parse, arm_register_target_pragmas): New functions. * config/arm/arm.h (SWITCHABLE_TARGET): Define. (ARM_DECLARE_FUNCTION_NAME): Call arm_declare_function_name. (REGISTER_TARGET_PRAGMAS): Call arm_register_target_pragma. (TREE_TARGET_THUMB): New macro. * doc/extend.texi (arm, thumb): Document target attributes. * doc/invoke.texi (arm, thumb): Mention target attributes. 2014-09-23 Christian Bruel christian.br...@st.com PR target/52144 * gcc.target/arm/attr_thumb.c: New test. Index: gcc/config/arm/arm-c.c === --- gcc/config/arm/arm-c.c (revision 215680) +++ gcc/config/arm/arm-c.c (working copy) @@ -20,9 +20,12 @@ #include system.h #include coretypes.h #include tm.h -#include tm_p.h #include tree.h +#include tm_p.h #include c-family/c-common.h +#include target.h +#include target-def.h +#include c-family/c-pragma.h /* Output C specific EABI object attributes. These can not be done in arm.c because they require information from the C frontend. */ @@ -42,3 +45,109 @@ { arm_lang_output_object_attributes_hook = arm_output_c_attributes; } + + +/* Define or undefine macros based on the current target. If the user does + #pragma GCC target, we need to adjust the macros dynamically. */ + +static void +arm_target_modify_macros (bool thumb_p) +{ + if (thumb_p) + { + cpp_define (parse_in, __thumb__); + if (arm_arch_thumb2) + cpp_define (parse_in, __thumb2__); + if (TARGET_BIG_END) + cpp_define (parse_in, __THUMBEB__); + else + cpp_define (parse_in, __THUMBEL__); + } + else + { + cpp_undef (parse_in, __thumb__); + if (arm_arch_thumb2) + cpp_undef (parse_in, __thumb2__); + if (TARGET_BIG_END) + cpp_undef (parse_in, __THUMBEB__); + else + cpp_undef (parse_in, __THUMBEL__); + } + +} + +/* Hook to validate the current #pragma GCC target and set the FPU custom + code option state. 
If ARGS is NULL, then POP_TARGET is used to reset + the options. */ +static bool +arm_pragma_target_parse (tree args, tree pop_target) +{ + tree prev_tree = build_target_option_node (global_options); + tree cur_tree; + struct cl_target_option *prev_opt; + struct cl_target_option *cur_opt; + bool cur_mode, prev_mode; + + if (! args) +{ + cur_tree = ((pop_target) ? pop_target : target_option_default_node); + cl_target_option_restore (global_options, +TREE_TARGET_OPTION (cur_tree)); +} + else +{ + cur_tree = arm_valid_target_attribute_tree (args, global_options); + if (cur_tree == NULL_TREE) + { + cl_target_option_restore (global_options, +TREE_TARGET_OPTION (prev_tree)); + return false; + } +} + + target_option_current_node =
Re: Enable EBX for x86 in 32bits PIC code
On Wed, Sep 24, 2014 at 03:20:44PM -0600, Jeff Law wrote: On 09/24/14 14:32, Ilya Enkovich wrote: 2014-09-24 19:27 GMT+04:00 Jeff Law l...@redhat.com: On 09/24/14 00:56, Ilya Enkovich wrote: After register allocation we have no idea where GOT address is and therefore delegitimize_address target hook becomes less efficient and cannot remove UNSPECs. That's what I see now when build GCC with patch applied: In theory this shouldn't be too hard to fix. I haven't looked at the code, but it might be something looking explicitly for ebx by register #, or something similar. Which case within delegitimize_address isn't firing as it should after your changes? It is the case I had to fix: @@ -14415,7 +14433,8 @@ ix86_delegitimize_address (rtx x) ... movl foo@GOTOFF(%ecx), %edx in which case we return (%ecx - %ebx) + foo. */ - if (pic_offset_table_rtx) + if (pic_offset_table_rtx + (!reload_completed || !ix86_use_pseudo_pic_reg ())) result = gen_rtx_PLUS (Pmode, gen_rtx_MINUS (Pmode, copy_rtx (addend), pic_offset_table_rtx), result); Originally if there is a UNSPEC_GOTOFFSET but no EBX usage then we just remove this UNSPEC and substract EBX value. With pseudo PIC reg we should use PIC register instead of EBX but it is unclear what to use after register allocation. What's the RTL before after allocation? Feel free to just pass along the dump files for sum_r4 that you referenced in a prior message. I wonder if during/after reload we just couldn't look at ORIGINAL_REGNO of hard regs if ix86_use_pseudo_pic_reg. Or is that the other case, where you don't have any PIC register replacement around, and want to subtract something? Perhaps in that case we could just subtract the value of _GLOBAL_OFFSET_TABLE_ symbol if we have nothing better around. Jakub
Re: [PATCH 2/2] Make -Q --help print param defaults and min/max values
On Sat, Sep 27, 2014 at 3:54 PM, Andi Kleen a...@firstfloor.org wrote: From: Andi Kleen a...@linux.intel.com Make -Q --help print the --param default, min, max values, similar to how it prints the defaults for other flags. This is useful to let an option auto-tuner automatically query all needed information about gcc params (previously it needed to access the .def file in the source). Ok. Thanks, Richard. gcc/: 2014-09-26 Andi Kleen a...@linux.intel.com * opts.c (print_filtered_help): Print --param min/max/default with -Q. --- gcc/opts.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/gcc/opts.c b/gcc/opts.c index 0a49bc0..5cb5a39 100644 --- a/gcc/opts.c +++ b/gcc/opts.c @@ -953,6 +953,7 @@ print_filtered_help (unsigned int include_flags, const char *help; bool found = false; bool displayed = false; + char new_help[128]; if (include_flags == CL_PARAMS) { @@ -971,6 +972,15 @@ print_filtered_help (unsigned int include_flags, /* Get the translation. */ help = _(help); + if (!opts->x_quiet_flag) + { + snprintf (new_help, sizeof (new_help), + _("default %d minimum %d maximum %d"), + compiler_params[i].default_value, + compiler_params[i].min_value, + compiler_params[i].max_value); + help = new_help; + } wrap_help (help, param, strlen (param), columns); } putchar ('\n'); @@ -985,7 +995,6 @@ print_filtered_help (unsigned int include_flags, for (i = 0; i < cl_options_count; i++) { - char new_help[128]; const struct cl_option *option = cl_options + i; unsigned int len; const char *opt; -- 2.1.1
Re: [PATCH 1/2] Remove -fshort-double
On Sat, Sep 27, 2014 at 3:54 PM, Andi Kleen a...@firstfloor.org wrote: From: Andi Kleen a...@linux.intel.com -fshort-double has crashed the compiler since 4.6 (see PR60410). Since it's an obscure option that apparently nobody uses, the best way to fix it seems to be to just remove it. This prevents constant ICEs when running a gcc optimization flags autotuner. As we saw LTO fixes for -fshort-double it's clear that this flag _is_ used for some embedded archs. So - no, you can't simply remove it. But IMHO it should become a target-specific flag. Richard. gcc/testsuite/: 2014-09-26 Andi Kleen a...@linux.intel.com PR target/60410 * gcc.dg/lto/pr55113_0.c: Remove. gcc/: 2014-09-26 Andi Kleen a...@linux.intel.com PR target/60410 * doc/invoke.texi: Remove -fshort-double. * lto-wrapper.c (merge_and_complain): Ditto. (run_gcc): Ditto. gcc/c-family/: 2014-09-26 Andi Kleen a...@linux.intel.com PR target/60410 * c-common.c (c_common_nodes_and_builtins): Remove -fshort-double. * c.opt: Ditto. gcc/lto/: 2014-09-26 Andi Kleen a...@linux.intel.com PR target/60410 * lto-lang.c (lto_init): Remove -fshort-double. --- gcc/c-family/c-common.c | 2 +- gcc/c-family/c.opt | 4 gcc/doc/invoke.texi | 10 +- gcc/lto-wrapper.c| 3 --- gcc/lto/lto-lang.c | 2 +- gcc/testsuite/gcc.dg/lto/pr55113_0.c | 14 -- 6 files changed, 3 insertions(+), 32 deletions(-) delete mode 100644 gcc/testsuite/gcc.dg/lto/pr55113_0.c diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index a9e0191..7a529a2 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -5325,7 +5325,7 @@ c_common_nodes_and_builtins (void) tree va_list_ref_type_node; tree va_list_arg_type_node; - build_common_tree_nodes (flag_signed_char, flag_short_double); + build_common_tree_nodes (flag_signed_char, false); /* Define `int' and `char' first so that dbx will output them first.
*/ record_builtin_type (RID_INT, NULL, integer_type_node); diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 72ac2ed..d6a9698 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -1251,10 +1251,6 @@ frtti C++ ObjC++ Optimization Var(flag_rtti) Init(1) Generate run time type descriptor information -fshort-double -C ObjC C++ ObjC++ LTO Optimization Var(flag_short_double) -Use the same size for double as for float - fshort-enums C ObjC C++ ObjC++ LTO Optimization Var(flag_short_enums) Use the narrowest integer type possible for enumeration types diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 0c3f4be..b2b667d 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -1094,7 +1094,7 @@ See S/390 and zSeries Options. -fno-jump-tables @gol -frecord-gcc-switches @gol -freg-struct-return -fshort-enums @gol --fshort-double -fshort-wchar @gol +-fshort-wchar @gol -fverbose-asm -fpack-struct[=@var{n}] -fstack-check @gol -fstack-limit-register=@var{reg} -fstack-limit-symbol=@var{sym} @gol -fno-stack-limit -fsplit-stack @gol @@ -22598,14 +22598,6 @@ is equivalent to the smallest integer type that has enough room. code that is not binary compatible with code generated without that switch. Use it to conform to a non-default application binary interface. -@item -fshort-double -@opindex fshort-double -Use the same size for @code{double} as for @code{float}. - -@strong{Warning:} the @option{-fshort-double} switch causes GCC to generate -code that is not binary compatible with code generated without that switch. -Use it to conform to a non-default application binary interface. 
- @item -fshort-wchar @opindex fshort-wchar Override the underlying type for @samp{wchar_t} to be @samp{short diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c index 08fd090..a2ce79c 100644 --- a/gcc/lto-wrapper.c +++ b/gcc/lto-wrapper.c @@ -275,7 +275,6 @@ merge_and_complain (struct cl_decoded_option **decoded_options, case OPT_freg_struct_return: case OPT_fpcc_struct_return: - case OPT_fshort_double: for (j = 0; j *decoded_options_count; ++j) if ((*decoded_options)[j].opt_index == foption-opt_index) break; @@ -500,7 +499,6 @@ run_gcc (unsigned argc, char *argv[]) case OPT_fgnu_tm: case OPT_freg_struct_return: case OPT_fpcc_struct_return: - case OPT_fshort_double: case OPT_ffp_contract_: case OPT_fwrapv: case OPT_ftrapv: @@ -573,7 +571,6 @@ run_gcc (unsigned argc, char *argv[]) case OPT_freg_struct_return: case OPT_fpcc_struct_return: - case OPT_fshort_double: /* Ignore these, they are determined by the input files. ??? We fail to diagnose a possible
Re: [PATCH v3] PR libitm/61164: use always_inline consistently
2014-09-27 Gleb Fotengauer-Malinovskiy gle...@altlinux.org libitm/ PR libitm/61164 * local_atomic: Rename __always_inline to __libitm_always_inline to eliminate glibc macro redefinition. (__libitm_always_inline): Add inline. (__calculate_memory_order): Remove inline. (atomic_thread_fence): Likewise. (atomic_signal_fence): Likewise. (atomic_flag_test_and_set_explicit): Likewise. (atomic_flag_clear_explicit): Likewise. (atomic_flag_test_and_set): Likewise. (atomic_flag_clear): Likewise. --- local_atomic | 299 +-- 1 file changed, 149 insertions(+), 150 deletions(-) diff --git a/local_atomic b/local_atomic index c3e079f..84dd8c2 100644 --- a/local_atomic +++ b/local_atomic @@ -41,8 +41,7 @@ #ifndef _GLIBCXX_ATOMIC #define _GLIBCXX_ATOMIC 1 -#undef __always_inline -#define __always_inline __attribute__((always_inline)) +#define __libitm_always_inline inline __attribute__((always_inline)) // #pragma GCC system_header @@ -74,7 +73,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) memory_order_seq_cst } memory_order; - inline __always_inline memory_order + __libitm_always_inline memory_order __calculate_memory_order(memory_order __m) noexcept { const bool __cond1 = __m == memory_order_release; @@ -84,13 +83,13 @@ namespace std // _GLIBCXX_VISIBILITY(default) return __mo2; } - inline __always_inline void + __libitm_always_inline void atomic_thread_fence(memory_order __m) noexcept { __atomic_thread_fence (__m); } - inline __always_inline void + __libitm_always_inline void atomic_signal_fence(memory_order __m) noexcept { __atomic_thread_fence (__m); @@ -280,19 +279,19 @@ namespace std // _GLIBCXX_VISIBILITY(default) // Conversion to ATOMIC_FLAG_INIT. 
atomic_flag(bool __i) noexcept : __atomic_flag_base({ __i }) { } -__always_inline bool +__libitm_always_inline bool test_and_set(memory_order __m = memory_order_seq_cst) noexcept { return __atomic_test_and_set (_M_i, __m); } -__always_inline bool +__libitm_always_inline bool test_and_set(memory_order __m = memory_order_seq_cst) volatile noexcept { return __atomic_test_and_set (_M_i, __m); } -__always_inline void +__libitm_always_inline void clear(memory_order __m = memory_order_seq_cst) noexcept { // __glibcxx_assert(__m != memory_order_consume); @@ -302,7 +301,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) __atomic_clear (_M_i, __m); } -__always_inline void +__libitm_always_inline void clear(memory_order __m = memory_order_seq_cst) volatile noexcept { // __glibcxx_assert(__m != memory_order_consume); @@ -455,7 +454,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) is_lock_free() const volatile noexcept { return __atomic_is_lock_free (sizeof (_M_i), _M_i); } - __always_inline void + __libitm_always_inline void store(__int_type __i, memory_order __m = memory_order_seq_cst) noexcept { // __glibcxx_assert(__m != memory_order_acquire); @@ -465,7 +464,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) __atomic_store_n(_M_i, __i, __m); } - __always_inline void + __libitm_always_inline void store(__int_type __i, memory_order __m = memory_order_seq_cst) volatile noexcept { @@ -476,7 +475,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) __atomic_store_n(_M_i, __i, __m); } - __always_inline __int_type + __libitm_always_inline __int_type load(memory_order __m = memory_order_seq_cst) const noexcept { // __glibcxx_assert(__m != memory_order_release); @@ -485,7 +484,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) return __atomic_load_n(_M_i, __m); } - __always_inline __int_type + __libitm_always_inline __int_type load(memory_order __m = memory_order_seq_cst) const volatile noexcept { // __glibcxx_assert(__m != memory_order_release); @@ -494,21 +493,21 @@ namespace std 
// _GLIBCXX_VISIBILITY(default) return __atomic_load_n(_M_i, __m); } - __always_inline __int_type + __libitm_always_inline __int_type exchange(__int_type __i, memory_order __m = memory_order_seq_cst) noexcept { return __atomic_exchange_n(_M_i, __i, __m); } - __always_inline __int_type + __libitm_always_inline __int_type exchange(__int_type __i, memory_order __m = memory_order_seq_cst) volatile noexcept { return __atomic_exchange_n(_M_i, __i, __m); } - __always_inline bool + __libitm_always_inline bool compare_exchange_weak(__int_type __i1, __int_type __i2, memory_order __m1,
Re: [Patch, AArch64] Enable Address sanitizer and UB sanitizer
On 26 September 2014 23:05, Andreas Schwab sch...@linux-m68k.org wrote: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=34c65c4 * sanitizer_common/sanitizer_platform_limits_posix.h (__sanitizer___kernel_old_uid_t, __sanitizer___kernel_old_gid_t) [__aarch64__]: Define to unsigned short. Thanks for pointing this. My understanding is that this kind of patch has to be submitted to the libsanitizer maintainers via the LLVM project. I'm going to take care of it. Christophe. --- libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h index caa36a4..139fe0a 100644 --- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h +++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h @@ -470,7 +470,7 @@ namespace __sanitizer { typedef long __sanitizer___kernel_off_t; #endif -#if defined(__powerpc__) || defined(__aarch64__) || defined(__mips__) +#if defined(__powerpc__) || defined(__mips__) typedef unsigned int __sanitizer___kernel_old_uid_t; typedef unsigned int __sanitizer___kernel_old_gid_t; #else -- 2.1.1 -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
[libstdc++] Refactor python/hook.in
The attached patch refactors python/hook.in so that there are no individual function calls to load pretty printers and xmethods. This was suggested by Tom here: https://gcc.gnu.org/ml/gcc-patches/2014-08/msg02589.html. He indicates that it is better to put as little as possible in the hook file. The attached patch removes all code which explicitly loads the hooks from hook.in. 2014-09-29 Siva Chandra Reddy sivachan...@google.com * python/hook.in: Only import libstdcxx.v6. * python/libstdcxx/v6/__init__.py: Load printers and xmethods. diff --git a/libstdc++-v3/python/hook.in b/libstdc++-v3/python/hook.in index aeb1cdb..30cf538 100644 --- a/libstdc++-v3/python/hook.in +++ b/libstdc++-v3/python/hook.in @@ -55,18 +55,4 @@ if gdb.current_objfile () is not None: if not dir_ in sys.path: sys.path.insert(0, dir_) -# Load the pretty-printers. -from libstdcxx.v6.printers import register_libstdcxx_printers -register_libstdcxx_printers (gdb.current_objfile ()) - -# Load the xmethods if GDB supports them. -def gdb_has_xmethods(): -try: -import gdb.xmethod -return True -except ImportError: -return False - -if gdb_has_xmethods(): -from libstdcxx.v6.xmethods import register_libstdcxx_xmethods -register_libstdcxx_xmethods (gdb.current_objfile ()) +import libstdcxx.v6 diff --git a/libstdc++-v3/python/libstdcxx/v6/__init__.py b/libstdc++-v3/python/libstdcxx/v6/__init__.py index 8b13789..59c1f27 100644 --- a/libstdc++-v3/python/libstdcxx/v6/__init__.py +++ b/libstdc++-v3/python/libstdcxx/v6/__init__.py @@ -1 +1,32 @@ +# Copyright (C) 2014 Free Software Foundation, Inc. +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. 
+# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see http://www.gnu.org/licenses/. + +import gdb + +# Load the pretty-printers. +from printers import register_libstdcxx_printers +register_libstdcxx_printers(gdb.current_objfile()) + +# Load the xmethods if GDB supports them. +def gdb_has_xmethods(): +try: +import gdb.xmethod +return True +except ImportError: +return False + +if gdb_has_xmethods(): +from xmethods import register_libstdcxx_xmethods +register_libstdcxx_xmethods(gdb.current_objfile())
Re: [PATCH] Fix finding default baseline symbols directory
On 26/09/14 23:42 +0200, Andreas Schwab wrote: Tested on aarch64-suse-linux, where try_cpu=generic. Andreas. * configure.host: Use host_cpu, not try_cpu, to define default abi_baseline_pair. --- libstdc++-v3/configure.host | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host index a12871a..abfd609 100644 --- a/libstdc++-v3/configure.host +++ b/libstdc++-v3/configure.host @@ -346,8 +346,8 @@ case ${host} in abi_baseline_pair=x86_64-linux-gnu ;; *) -if test -d ${glibcxx_srcdir}/config/abi/post/${try_cpu}-linux-gnu; then - abi_baseline_pair=${try_cpu}-linux-gnu +if test -d ${glibcxx_srcdir}/config/abi/post/${host_cpu}-linux-gnu; then + abi_baseline_pair=${host_cpu}-linux-gnu fi esac case ${host} in Is this definitely right? If someone builds a target such as alphaev68-unknown-linux-gnu then try_cpu will be set to alpha by the first case in that file, and so it will use the alpha-linux-gnu baseline file today, but with your change it would try to use a alphaev68-linux-gnu baseline file which doesn't exist. Would a safer change be to just add a new pattern for aarch64? --- a/libstdc++-v3/configure.host +++ b/libstdc++-v3/configure.host @@ -345,6 +345,9 @@ case ${host} in x86_64) abi_baseline_pair=x86_64-linux-gnu ;; + aarch64) +abi_baseline_pair=aarch64-linux-gnu +;; *) if test -d ${glibcxx_srcdir}/config/abi/post/${try_cpu}-linux-gnu; then abi_baseline_pair=${try_cpu}-linux-gnu
Re: [libstdc++] Refactor python/hook.in
On 29/09/14 06:02 -0700, Siva Chandra wrote: The attached patch refactors python/hook.in so that there are no individual function calls to load pretty printers and xmethods. This was suggested by Tom here: https://gcc.gnu.org/ml/gcc-patches/2014-08/msg02589.html. He indicates that it is better to put as little as possible in the hook file. The attached patch removes all code which explicitly loads the hooks from hook.in. This looks good to me, thanks. I'll commit it later this week unless I hear objections from Tom.
Re: [PATCH v4 1/2] Fix __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__
Hi Mike, I have not seen any trouble arising following the fix to PR 61407 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61407). The patch committed is here: https://gcc.gnu.org/viewcvs?rev=215251&root=gcc&view=rev I’ve just tested the exact same change on the 4.9 branch, and it bootstraps and regtests fine on x86_64-apple-darwin14 (I’ve got unrelated objc/obj-c++ failures, see PS). I suggest we backport it to 4.9, so that when 4.9.2 is released it builds fine on Yosemite. OK? FX PS: I’ve got quite a few objc/obj-c++ failures on x86_64-apple-darwin14, but I assume it’s because of changes in the Objective-C system headers… I’ve posted results here: https://gcc.gnu.org/ml/gcc-testresults/2014-09/msg02802.html Nothing to do with parsing of Mac OS X versions, in any case.
Re: [PATCH v3] PR libitm/61164: use always_inline consistently
On Mon, Sep 29, 2014 at 04:53:26PM +0400, Gleb Fotengauer-Malinovskiy wrote: 2014-09-27 Gleb Fotengauer-Malinovskiy gle...@altlinux.org Two spaces around name on each side. libitm/ PR libitm/61164 * local_atomic: Rename __always_inline to __libitm_always_inline to eliminate glibc macro redefinition. That would be * local_atomic (__always_inline): Rename to... (__libitm_always_inline): ... this. --- a/local_atomic +++ b/local_atomic @@ -41,8 +41,7 @@ #ifndef _GLIBCXX_ATOMIC #define _GLIBCXX_ATOMIC 1 -#undef __always_inline -#define __always_inline __attribute__((always_inline)) +#define __libitm_always_inline inline __attribute__((always_inline)) Why do you want to add inline keyword to that? Some inline keywords are implicit (methods defined inline), so there is no point adding it there. Jakub
RE: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C
In a nutshell, add the following code to main() before the call to f3(): int status = __cilkrts_set_param("nworkers", "2"); if (0 != status) { /* Failed to set the number of Cilk workers */ return status; } Here are the details: There are three sources of information the Cilk runtime uses to set the number of workers. 1) By default the Cilk runtime will query the operating system for the number of cores and create a worker for each core. Note that the operating system counts each hyperthread as a core. 2) You can set your own default by defining the CILK_NWORKERS environment variable. For example: export CILK_NWORKERS=2 3) You can override the default programmatically by calling __cilkrts_set_param() before the first spawning function is called. For example, place the following call in main(): __cilkrts_set_param("nworkers", "2"); A spawning function is a function that contains a _Cilk_spawn. The Cilk runtime will be initialized the first time a spawning function is executed. For this test to run correctly you must set the number of workers to at least 2, so the Cilk runtime will create at least one worker thread. Please note that once the Cilk runtime has been initialized by the entry into the first spawning function, the number of workers cannot be changed (and the __cilkrts_set_param() call will return a non-zero value to indicate an error has occurred). The first two values are evaluated lazily; the runtime will load its default values the first time one of the __cilkrts APIs defined in cilk_api.h is called [for example __cilkrts_get_nworkers()], or the first spawning function is entered. So for example, the following code will not do what you want: if (1 == __cilkrts_get_nworkers()) setenv("CILK_NWORKERS", "2", 1); The call to __cilkrts_get_nworkers() will cause the Cilk runtime to set its default. Setting the CILK_NWORKERS environment variable after the __cilkrts_get_nworkers() call will be too late.
It is possible to change the number of workers, but to do that you must return from all spawning functions and then call __cilkrts_end_cilk() to have the Cilk runtime shut down and then call __cilkrts_set_param(). When the Cilk runtime starts at the entry to the next spawning function, it will obey the new setting. - Barry -Original Message- From: Thomas Schwinge [mailto:tho...@codesourcery.com] Sent: Monday, September 29, 2014 6:54 AM To: Tannenbaum, Barry M; Iyer, Balaji V; Zamyatin, Igor Cc: gcc-patches@gcc.gnu.org Subject: RE: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C Hi! On Mon, 22 Sep 2014 19:21:33 +, Tannenbaum, Barry M barry.m.tannenb...@intel.com wrote: That's exactly correct. There are two ways to implement a work-stealing scheduler. We refer to them as: [...] Thanks for the explanation. Cilk implements Parent Stealing. Since you're running with 1 worker, there's no other worker to steal the continuation. So steal_flag will never be set to 1 and you'll never break out of the loop. Remains the question about how to address that in the testsuite: -Original Message- From: Thomas Schwinge [mailto:tho...@codesourcery.com] Sent: Monday, September 22, 2014 9:56 AM To: Iyer, Balaji V Cc: gcc-patches@gcc.gnu.org Subject: Re: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C Hi! 
On Tue, 27 Aug 2013 21:30:49 +, Iyer, Balaji V balaji.v.i...@intel.com wrote: --- /dev/null +++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c @@ -0,0 +1,37 @@ +/* { dg-do run { target { i?86-*-* x86_64-*-* arm*-*-* } } } */ +/* { dg-options -fcilkplus } */ +/* { dg-options -lcilkrts { target { i?86-*-* x86_64-*-* arm*-*-* +} } } */ + +void f0(volatile int *steal_flag) +{ + int i = 0; + /* Wait for steal_flag to be set */ + while (!*steal_flag) +; +} + +int f1() +{ + + volatile int steal_flag = 0; + _Cilk_spawn f0(steal_flag); + steal_flag = 1; // Indicate stolen + _Cilk_sync; + return 0; +} + +void f2(int q) +{ + q = 5; +} + +void f3() +{ + _Cilk_spawn f2(f1()); +} + +int main() +{ + f3(); + return 0; +} Is this really well-formed Cilk Plus code? Running with CILK_NWORKERS=1, or -- the equivalent -- in a system with just one CPU (as per libcilkrts/runtime/os-unix.c:__cilkrts_hardware_cpu_count returning 1), I see this test busy-loop as follows: Breakpoint 1, __cilkrts_hardware_cpu_count () at ../../../source/libcilkrts/runtime/os-unix.c:358 358 { (gdb) return 1 Make __cilkrts_hardware_cpu_count return now? (y or n) y #0 cilkg_get_user_settable_values () at ../../../source/libcilkrts/runtime/global_state.cpp:385 385 CILK_ASSERT(hardware_cpu_count 0); (gdb) c Continuing. ^C Program received signal SIGINT, Interrupt. f0 (steal_flag=steal_flag@entry=0x7fffd03c) at
[PATCH] Don't call fatal_error before error reporting has been initialized.
Hi,

Currently, if the call to atexit (lto_wrapper_cleanup) fails, we won't report an error, because we haven't yet initialized the error-reporting infrastructure. This patch moves the call after diagnostic_initialize. I hope that we can't exit inside diagnostic_initialize; otherwise we won't clean up after it.

Ok for trunk?

2014-09-29  Ilya Tocar  ilya.to...@intel.com

	* lto-wrapper.c (main): Don't call fatal_error before
	diagnostic_initialize.

---
 gcc/lto-wrapper.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 08fd090..39e13b8 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -870,13 +870,13 @@ main (int argc, char *argv[])
 
   xmalloc_set_program_name (progname);
 
-  if (atexit (lto_wrapper_cleanup) != 0)
-    fatal_error ("atexit failed");
-
   gcc_init_libintl ();
 
   diagnostic_initialize (global_dc, 0);
 
+  if (atexit (lto_wrapper_cleanup) != 0)
+    fatal_error ("atexit failed");
+
   if (signal (SIGINT, SIG_IGN) != SIG_IGN)
     signal (SIGINT, fatal_signal);
 #ifdef SIGHUP
-- 
1.8.3.1
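
The ordering issue behind this patch can be shown in isolation. The following is only a minimal sketch under stated assumptions: `Diagnostics`, `failing_atexit` and `failure_is_reported` are hypothetical stand-ins invented for illustration, not GCC's diagnostic machinery or lto-wrapper code. It models a reporting facility that drops messages until it has been initialized, and a registration call that fails.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an error-reporting facility that only
// works once it has been initialized (not GCC's real diagnostics).
struct Diagnostics {
  bool initialized = false;
  std::vector<std::string> reported;

  void initialize() { initialized = true; }

  // Returns true if the message reached the user, false if it was lost
  // because reporting was not set up yet.
  bool fatal(const std::string& msg) {
    if (!initialized)
      return false;
    reported.push_back(msg);
    return true;
  }
};

using RegisterFn = int (*)(void (*)());

static void cleanup_stub() {}

// Simulated atexit-like registration that always fails.
int failing_atexit(void (*)()) { return -1; }

// Runs a toy start-up sequence. If init_first is true, diagnostics are
// initialized before the cleanup handler is registered (the order the
// patch establishes); otherwise registration comes first (the old order).
// Returns whether a registration failure was actually reported.
bool failure_is_reported(RegisterFn reg, bool init_first) {
  Diagnostics dc;
  if (init_first)
    dc.initialize();

  bool delivered = true;
  if (reg(cleanup_stub) != 0)
    delivered = dc.fatal("atexit failed");

  if (!init_first)
    dc.initialize();
  return delivered;
}
```

With the old order, `failure_is_reported(failing_atexit, false)` yields false because the message is lost; with the patched order, `failure_is_reported(failing_atexit, true)` yields true.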
Re: [PATCH] Redesign jump threading profile updates
On Fri, Aug 1, 2014 at 10:10 PM, Teresa Johnson tejohn...@google.com wrote:
> On Wed, Jul 23, 2014 at 2:08 PM, Teresa Johnson tejohn...@google.com wrote:
>> On Tue, Jul 22, 2014 at 7:29 PM, Jeff Law l...@redhat.com wrote:
>>> On 03/26/14 17:44, Teresa Johnson wrote:
>>>> Recently I discovered that the profile updates being performed by jump threading were incorrect in many cases, particularly in the case where the threading path contains a joiner. Some of the duplicated blocks/edges were not getting any counts, leading to incorrect function splitting and other downstream optimizations, and there were other insanities as well. After making a few attempts to fix the handling I ended up completely redesigning the profile update code, removing a few places throughout the code where it was attempting to do some updates.
>>>>
>>>> The biggest complication (see the large comment and example above the new routine compute_path_counts) is that we duplicate a conditional jump in the joiner case, possibly multiple times for multiple jump thread paths through that joiner, and it isn't trivial to figure out what probability to assign each of the duplicated successor edges (and the original after threading). Each jump thread path may need to have a different probability of staying on path through the joiner in order to keep the counts going out of the threading path sane.
>>>>
>>>> The patch below was bootstrapped and tested on x86_64-unknown-linux-gnu, and also tested with a profiledbootstrap. I additionally tested with cpu2006, confirming that the amount of resulting cycle samples in the split cold sections reduced, and through manual inspection that many different cases were now correct. I also measured performance with cpu2006, running each benchmark multiple times on a Westmere and see some speedups (453.povray 1-2%, 403.gcc 1-1.5%, and noisy but positive speedups in 471.omnetpp and 483.xalancbmk).
>>>>
>>>> Looks like my mailer is corrupting the spacing, which makes it harder to look at the CFG examples in the big header comment block I added. So I have also included the patch as an attachment.
>>>>
>>>> Ok for stage 1?
>>>>
>>>> Thanks,
>>>> Teresa
>>>>
>>>> 2014-03-26  Teresa Johnson  tejohn...@google.com
>>>>
>>>>         * tree-ssa-threadupdate.c (struct ssa_local_info_t): New duplicate_blocks bitmap.
>>>>         (remove_ctrl_stmt_and_useless_edges): Ditto.
>>>>         (create_block_for_threading): Ditto.
>>>>         (compute_path_counts): New function.
>>>>         (update_profile): Ditto.
>>>>         (deduce_freq): Ditto.
>>>>         (recompute_probabilities): Ditto.
>>>>         (update_joiner_offpath_counts): Ditto.
>>>>         (ssa_fix_duplicate_block_edges): Update profile info.
>>>>         (ssa_create_duplicates): Pass new parameter.
>>>>         (ssa_redirect_edges): Remove old profile update.
>>>>         (thread_block_1): New duplicate_blocks bitmap, remove old profile update.
>>>>         (thread_single_edge): Pass new parameter.
>>>
>>> First off, sorry this took so long to get reviewed. Most of what's going on in here is similar to something I sketched out, but never coded up a while back -- with the significant difference that you're handling joiner blocks as well.
>>>
>>> Everything looks to be well thought through and documented in the code at a level I wish existed throughout GCC.
>>>
>>> The only thing I see missing is regression tests. I don't think you need to do anything huge here, but it ought to be possible to set up relatively simple cases which show the probabilities/counts being updated properly.
>>>
>>> Otherwise it looks excellent. It's pre-approved once you've added some kind of testing and fixed the nits noted below.
>>
>> Thanks! I will fix the issues you note below and create some test cases before I commit.
>
> Just an update - I found some good test cases by compiling the c-torture tests with profile feedback with and without my patch. But in the cases I pulled out I saw that there were still a couple profile or probability insanities introduced by jump threading (albeit far less than before), so I wanted to investigate before I commit.
>
> I ran out of time this week and will not get to this until I get back from vacation the week after next.

Hi Jeff,

I finally had a chance to get back to this and look at the remaining insanities in the new test cases I created. It turns out that there were still a few issues in the case where there were guessed frequencies and no profile counts. The two test cases I created do use FDO, and the insanities in the routines with profile counts went away with my patch. But the outlined copies of routines that were also inlined into the main routine still had estimated frequencies, and these still had a few issues.

The problem is that the profile updates are done incrementally as we walk and update the paths in ssa_fix_duplicate_block_edges, including the block and edge counts, the block frequencies and the probabilities. This is very difficult to do when only operating on frequencies since the edge
RE: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C
Hi!

On Mon, 29 Sep 2014 13:58:31 +, Tannenbaum, Barry M barry.m.tannenb...@intel.com wrote:
> In a nutshell, add the following code to main() before the call to f3():
>
>     int status = __cilkrts_set_param("nworkers", "2");
>     if (0 != status) {
>         // Failed to set the number of Cilk workers
>         return status;
>     }

Yeah, that's what I had proposed with the patch at the end of my previous email, http://news.gmane.org/find-root.php?message_id=%3C8761g6g0je.fsf%40kepler.schwinge.homeip.net%3E. I'm sorry if I didn't make it obvious that more text and the patch were following after the full-quote of the original issue description.

> Here's the details: [...]

Thanks again for your helpful comments; that's appreciated.

Here's again my proposed patch. Note that the include paths in GCC compiler testing (gcc/testsuite/) are not set up to pick up the cilk/cilk_api.h include file, so I've manually added a prototype for the __cilkrts_set_param function to the three files. I can change that, if requested.

commit ee7138e451d1f3284d6fa0f61fe517c82db94060
Author: Thomas Schwinge tho...@codesourcery.com
Date:   Mon Sep 29 12:47:34 2014 +0200

    Audit Cilk Plus tests for CILK_NWORKERS=1.

    	gcc/testsuite/
    	* c-c++-common/cilk-plus/CK/spawning_arg.c (main): Call
    	__cilkrts_set_param to set two workers.
    	* c-c++-common/cilk-plus/CK/steal_check.c (main): Likewise.
    	* g++.dg/cilk-plus/CK/catch_exc.cc (main): Likewise.
---
 gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c | 15 +++
 gcc/testsuite/c-c++-common/cilk-plus/CK/steal_check.c  | 17 ++---
 gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc         | 14 ++
 3 files changed, 43 insertions(+), 3 deletions(-)

diff --git gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c
index 95e6cab..138b82c 100644
--- gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c
+++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c
@@ -2,6 +2,17 @@
 /* { dg-options "-fcilkplus" } */
 /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+extern int __cilkrts_set_param (const char *, const char *);
+
+#ifdef __cplusplus
+}
+#endif
+
+
 void f0(volatile int *steal_flag)
 {
   int i = 0;
@@ -32,6 +43,10 @@ void f3()
 
 int main()
 {
+  /* Ensure more than one worker.  */
+  if (__cilkrts_set_param("nworkers", "2") != 0)
+    __builtin_abort();
+
   f3();
   return 0;
 }
diff --git gcc/testsuite/c-c++-common/cilk-plus/CK/steal_check.c gcc/testsuite/c-c++-common/cilk-plus/CK/steal_check.c
index 6e28765..6b41c7f 100644
--- gcc/testsuite/c-c++-common/cilk-plus/CK/steal_check.c
+++ gcc/testsuite/c-c++-common/cilk-plus/CK/steal_check.c
@@ -2,8 +2,16 @@
 /* { dg-options "-fcilkplus" } */
 /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
 
-// #include <cilk/cilk_api.h>
-extern void __cilkrts_set_param (char *, char *);
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+extern int __cilkrts_set_param (const char *, const char *);
+
+#ifdef __cplusplus
+}
+#endif
+
 
 void foo(volatile int *);
 
@@ -11,7 +19,10 @@ void main2(void);
 
 int main(void)
 {
-  // __cilkrts_set_param ((char *)"nworkers", (char *)"2");
+  /* Ensure more than one worker.  */
+  if (__cilkrts_set_param("nworkers", "2") != 0)
+    __builtin_abort();
+
   main2();
   return 0;
 }
diff --git gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc
index 0633d19..09ddf8b 100644
--- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc
+++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc
@@ -10,6 +10,16 @@
 #endif
 #include <cstdlib>
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+extern int __cilkrts_set_param (const char *, const char *);
+
+#ifdef __cplusplus
+}
+#endif
+
 
 void func(int volatile* steal_me)
 {
@@ -59,6 +69,10 @@ void my_test()
 
 int main()
 {
+  /* Ensure more than one worker.  */
+  if (__cilkrts_set_param("nworkers", "2") != 0)
+    __builtin_abort();
+
   my_test();
 #if HAVE_IO
   printf("PASSED\n");

Grüße,
 Thomas
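
Why the test needs at least two workers can be illustrated with a toy model. The sketch below is only an assumption-laden illustration of the parent-stealing behaviour described earlier in the thread: `spawn_completes` and its variables are hypothetical names, the model is single-threaded and deterministic, and it neither uses nor emulates the real Cilk runtime API.

```cpp
#include <cassert>

// Toy, single-threaded model of the spawning_arg.c scenario under a
// parent-stealing (continuation-stealing) scheduler. All names are
// hypothetical; this is not the Cilk runtime.
//
// Worker 0 executes the spawned child (f0) immediately and spins on
// steal_flag. The continuation of the spawn, which would set
// steal_flag, waits to be stolen; only a worker other than worker 0
// can pick it up.
bool spawn_completes(int nworkers, int max_steps = 1000) {
  bool steal_flag = false;
  bool continuation_pending = true;  // left behind at the _Cilk_spawn

  for (int step = 0; step < max_steps; ++step) {
    // Worker 0: one iteration of f0's busy-wait loop.
    if (steal_flag)
      return true;  // f0 returns, so the sync can complete

    // Workers 1..nworkers-1: idle thieves looking for work.
    for (int w = 1; w < nworkers; ++w) {
      if (continuation_pending) {
        continuation_pending = false;  // steal the continuation ...
        steal_flag = true;             // ... and run "steal_flag = 1"
      }
    }
  }
  return false;  // step budget exhausted: f0 is still spinning
}
```

In this model `spawn_completes(1)` never finishes within its step budget, which corresponds to the busy loop seen with CILK_NWORKERS=1, while any worker count of two or more completes.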
[patch] Update libstdc++ status table.
Update the docs again. Committed to trunk.

commit ef8c7f18cdfc087d54b0bbe2a1cd171409eb1f18
Author: Jonathan Wakely jwak...@redhat.com
Date:   Mon Sep 29 15:13:33 2014 +0100

    	* doc/xml/manual/status_cxx2011.xml: Update.
    	* doc/html/manual/status.html: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
index 36630ce..b986cad 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
@@ -606,11 +606,10 @@ particular release.
       <entry/>
     </row>
     <row>
-      <?dbhtml bgcolor="#B0B0B0" ?>
       <entry>20.6.12.4</entry>
       <entry><code>uninitialized_fill_n</code></entry>
-      <entry>Partial</entry>
-      <entry>Returns <code>void</code>..</entry>
+      <entry>Y</entry>
+      <entry/>
     </row>
     <row>
       <entry>20.6.13</entry>
@@ -1372,8 +1371,7 @@ particular release.
       <entry>23.2.1</entry>
       <entry>General container requirements</entry>
       <entry>Partial</entry>
-      <entry><code>deque</code> and <code>list</code> do not
-             meet the requirements
+      <entry><code>list</code> does not meet the requirements
              relating to allocator use and propagation.</entry>
     </row>
     <row>
@@ -1413,11 +1411,10 @@ particular release.
       <entry/>
     </row>
     <row>
-      <?dbhtml bgcolor="#B0B0B0" ?>
       <entry>23.3.3</entry>
       <entry>Class template <code>deque</code></entry>
-      <entry>Partial</entry>
-      <entry>Incomplete allocator support.</entry>
+      <entry>Y</entry>
+      <entry/>
     </row>
     <row>
       <entry>23.3.4</entry>
Re: [Patch, Fortran] Add CO_BROADCAST
On Mon, Sep 29, 2014 at 10:17:04AM +0200, Tobias Burnus wrote:
> Dominique Dhumieres wrote:
>> The failures for the gfortran.dg/coarray_collectives_9.f90 are fixed with the following patch:
>
> Looks good to me. The patch is OK with a ChangeLog.

Actually, I missed the following part:

> ...
> - call co_broadcasr(vec(idx), 1) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_sum shall not have a vector subscript" }
> - call co_reduce(vec([1,3,2]), red_f) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_min shall not have a vector subscript" }
> + call co_broadcasr(vec(idx), 1) ! OK?
> ...

That is not fully okay: the error message should stay, but the procedure name (...casr) should be corrected (...cast).

Tobias

PS: I think I will soon post a patch to support Fortran 2015's IMPLICIT NONE (...), where ... is an implicit-none list with values TYPE and EXTERNAL. Because an implicit none (type, external) would have found the typo! (Or likewise: -Wimplicit-procedure.)
Re: [libstdc++] Refactor python/hook.in
Jonathan> I'll commit it later this week unless I hear objections from
Jonathan> Tom.

It looks reasonable to me.

Tom
[c++-concepts] Check function concept definitions
This fixes an ICE trying to normalize a function concept with multiple statements. That error will now be diagnosed at the point of definition.

Jason, do you want to review this before I commit? This is a pretty small patch.

2014-09-01  Andrew Sutton  andrew.n.sut...@gmail.com

	Check requirements on function concept definitions.
	* gcc/cp/decl.c (finish_function): Check properties of a function
	concept definition.
	* gcc/cp/constraint.cc (check_function_concept): New. Check for
	deduced return type and multiple statements.
	(normalize_misc): Don't normalize multiple statements.
	(normalize_stmt_list): Removed.
	* gcc/cp/cp-tree.h (check_function_concept): New.
	* gcc/testsuite/g++.dg/concepts/fn-concept1.C: New.

Andrew

Index: cp/cp-tree.h
===================================================================
--- cp/cp-tree.h	(revision 214991)
+++ cp/cp-tree.h	(working copy)
@@ -6444,6 +6444,7 @@ extern tree build_concept_check
 extern tree build_constrained_parameter (tree, tree, tree = NULL_TREE);
 extern bool deduce_constrained_parameter(tree, tree, tree);
 extern tree resolve_constraint_check(tree);
+extern tree check_function_concept (tree);
 extern tree finish_concept_introduction (tree, tree);
 extern tree finish_template_constraints (tree);
Index: cp/decl.c
===================================================================
--- cp/decl.c	(revision 214268)
+++ cp/decl.c	(working copy)
@@ -14360,6 +14360,10 @@ finish_function (int flags)
       fntype = TREE_TYPE (fndecl);
     }
 
+  // If this is a concept, check that the definition is reasonable.
+  if (DECL_DECLARED_CONCEPT_P (fndecl))
+    check_function_concept (fndecl);
+
   /* Save constexpr function body before it gets munged by
      the NRV transformation.  */
   maybe_save_function_definition (fndecl);
Index: cp/constraint.cc
===================================================================
--- cp/constraint.cc	(revision 214991)
+++ cp/constraint.cc	(working copy)
@@ -269,6 +269,35 @@ deduce_concept_introduction (tree expr)
   gcc_unreachable ();
 }
 
+
+// -- //
+// Declarations
+
+// Check that FN satisfies the structural requirements of a
+// function concept definition.
+tree
+check_function_concept (tree fn)
+{
+  location_t loc = DECL_SOURCE_LOCATION (fn);
+
+  // If fn was declared with auto, make sure the result type is bool.
+  if (FNDECL_USED_AUTO (fn) && TREE_TYPE (fn) != boolean_type_node)
+    error_at (loc, "deduced type of concept definition %qD is not %qT",
+              fn, boolean_type_node);
+
+  // Check that the function is comprised of only a single
+  // return statements.
+  tree body = DECL_SAVED_TREE (fn);
+  if (TREE_CODE (body) == BIND_EXPR)
+    body = BIND_EXPR_BODY (body);
+  if (TREE_CODE (body) != RETURN_EXPR)
+    error_at (loc, "function concept definition %qD has multiple statements",
+              fn);
+
+  return NULL_TREE;
+}
+
+
 // -- //
 // Normalization
 //
@@ -425,9 +454,10 @@ normalize_misc (tree t)
     case CONSTRUCTOR:
       return t;
 
-    case STATEMENT_LIST:
-      return normalize_stmt_list (t);
-
+    // This should have been caught as an error.
+    case STATEMENT_LIST:
+      return NULL_TREE;
+
     default:
       gcc_unreachable ();
     }
@@ -630,28 +660,6 @@ normalize_requires (tree t)
   return t;
 }
 
-// Reduction rules for the statement list STMTS.
-//
-// Recursively reduce each statement in the list, concatenating each
-// reduced result into a conjunction of requirements.
-//
-// A constexpr function may include statements other than a return
-// statement. The primary purpose of these rules is to filter those
-// non-return statements from the constraints language.
-tree
-normalize_stmt_list (tree stmts)
-{
-  tree lhs = NULL_TREE;
-  tree_stmt_iterator i = tsi_start (stmts);
-  while (!tsi_end_p (i))
-    {
-      if (tree rhs = normalize_node (tsi_stmt (i)))
-        lhs = conjoin_constraints (lhs, rhs);
-      tsi_next (i);
-    }
-  return lhs;
-}
-
 // Normalize a cast expression.
 tree
 normalize_cast (tree t)
@@ -686,6 +694,7 @@ normalize_constraints (tree reqs)
   ++processing_template_decl;
   tree expr = normalize_node (reqs);
   --processing_template_decl;
+
   return expr;
 }
 
Index: testsuite/g++.dg/concepts/fn-concept1.C
===================================================================
--- testsuite/g++.dg/concepts/fn-concept1.C	(revision 0)
+++ testsuite/g++.dg/concepts/fn-concept1.C	(revision 0)
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++1z" }
+
+template<typename T>
+  concept bool Tuple() { // { dg-error "multiple statements" }
+    static_assert(T::value, "");
+    return true;
+  }
+
+  void f(Tuple);
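
The structural check added by this patch can be sketched outside the compiler. The following is a simplified model under stated assumptions: `Stmt`, `Type`, `FunctionConcept` and `check_concept` are hypothetical stand-ins invented here, not GCC tree nodes or GCC functions. It mirrors the two diagnostics in the posted check_function_concept: a deduced return type other than bool, and a body that is anything but a single return statement.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins for GCC's tree nodes.
enum class Stmt { Return, StaticAssert, Other };
enum class Type { Bool, Int };

struct FunctionConcept {
  bool declared_auto;      // was the return type written as auto?
  Type return_type;        // declared or deduced return type
  std::vector<Stmt> body;  // statements of the function body
};

// Returns the diagnostics the toy check would emit for FN.
std::vector<std::string> check_concept(const FunctionConcept& fn) {
  std::vector<std::string> errors;

  // A deduced return type must come out as bool.
  if (fn.declared_auto && fn.return_type != Type::Bool)
    errors.push_back("deduced type of concept definition is not bool");

  // The body must consist of exactly one return statement.
  if (fn.body.size() != 1 || fn.body[0] != Stmt::Return)
    errors.push_back("function concept definition has multiple statements");

  return errors;
}
```

Under this model the `Tuple` test case above, whose body is a static_assert followed by a return, produces exactly one diagnostic, the "multiple statements" one the test expects.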
RE: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C
Looks good to me. I apologize for spamming you with the extra details. I wasn't clear that this was a patch instead of a bug report. I believe that Igor is the one who controls the GCC submission for Cilk Plus.

  - Barry

-----Original Message-----
From: Thomas Schwinge [mailto:tho...@codesourcery.com]
Sent: Monday, September 29, 2014 10:27 AM
To: Tannenbaum, Barry M; Iyer, Balaji V; Zamyatin, Igor
Cc: gcc-patches@gcc.gnu.org
Subject: RE: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C

[...]
Re: [PATCH 1/2] Remove -fshort-double
> As we saw LTO fixes for -fshort-double it's clear that this flag _is_ used for some embedded archs.

Did we? It has been ICEing since 4.5, which is before LTO.

-Andi
Re: [PATCH 1/2] Remove -fshort-double
On Mon, Sep 29, 2014 at 08:20:03AM -0700, Andi Kleen wrote:
> > As we saw LTO fixes for -fshort-double it's clear that this flag _is_ used for some embedded archs.
>
> Did we? It has been ICEing since 4.5, which is before LTO.

Depends on which target. Aren't the ICEs i?86/x86_64 backend ICEs?

	Jakub
Re: [c++-concepts] Check function concept definitions
On 09/29/2014 10:55 AM, Andrew Sutton wrote:
> Jason, do you want to review this before I commit? This is a pretty small patch.

No need to wait for review before committing to the branch.

> +  // If fn was declared with auto, make sure the result type is bool.
> +  if (FNDECL_USED_AUTO (fn) && TREE_TYPE (fn) != boolean_type_node)
> +    error_at (loc, "deduced type of concept definition %qD is not %qT",
> +              fn, boolean_type_node);

This should say what the deduced type is.

> +  // Check that the function is comprised of only a single
> +  // return statements.

*statement

> +template<typename T>
> +  concept bool Tuple() { // { dg-error "multiple statements" }
> +    static_assert(T::value, "");
> +    return true;
> +  }

Hmm, have we actually discussed this in core review? I'm not seeing it on the wiki. Constexpr started out this way too, and allowing static_assert was added pretty fast. C++11 said

  its function-body shall be = delete, = default, or a compound-statement that contains only
  — null statements,
  — static_assert-declarations
  — typedef declarations and alias-declarations that do not define classes or enumerations,
  — using-declarations,
  — using-directives,
  — and exactly one return statement;

Is there a reason we want to be more strict than this for concept functions?

Jason
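
For comparison, the C++11 constexpr rule quoted above already admits a body like the one in the rejected test case. The following is a small sketch of that rule, not of concept functions themselves; `fits_in_pointer` is a name made up for illustration. Its body contains a static_assert-declaration, an alias-declaration and exactly one return statement, and nothing else.

```cpp
#include <cassert>

// A C++11 constexpr function whose body uses only what the quoted rule
// allows besides the single return statement.
template <typename T>
constexpr bool fits_in_pointer() {
  static_assert(sizeof(T) > 0, "T must be a complete type");  // static_assert-declaration
  using value_type = T;                                       // alias-declaration
  return sizeof(value_type) <= sizeof(void*);                 // the one return statement
}

// Usable in a constant expression, as constexpr requires.
static_assert(fits_in_pointer<char>(), "char fits in a pointer");
```

A concept function following the same rule would accept the `Tuple` definition from the test case, which is the relaxation being suggested in the review.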
Re: [PATCH 1/2] Remove -fshort-double
On September 29, 2014 5:21:36 PM CEST, Jakub Jelinek ja...@redhat.com wrote:
> On Mon, Sep 29, 2014 at 08:20:03AM -0700, Andi Kleen wrote:
> > > As we saw LTO fixes for -fshort-double it's clear that this flag _is_ used for some embedded archs.
> >
> > Did we? It has been ICEing since 4.5, which is before LTO.
>
> Depends on which target. Aren't the ICEs i?86/x86_64 backend ICEs?

Yes, even works fine with x87 math

Richard.

> 	Jakub
Re: [PATCH C++] - SD-6 Implementation Part 1 - __has_include.
On 09/25/2014 12:57 PM, Jason Merrill wrote:
> On 09/01/2014 09:34 PM, Ed Smith-Rowland wrote:
>> (open_file_failed()): Not an error to not find a header file for __has_include__.
>
> Hmm, looks like this means that __has_include__ will silently return false if a header exists but is unreadable; I would think that we want it to be true (and have an error when the user tries to include it).
>
> Jason

Here is the new patch series. A patch addressing C++11 [[deprecated]] is coming later when these are in, as is a library patch for std::is_final.

I flipped the logic on the libcpp functions because there was a double negative of sorts - the logic was confusing to me on second look. Also, a file that a user can't read for permissions still returns true with __has_include.

Built and tested on x86_64-linux.

OK?

2014-09-29  Edward Smith-Rowland  3dw...@verizon.net

	Implement SD-6: SG10 Feature Test Recommendations
	* internal.h (lexer_state, spec_nodes): Add in__has_include__.
	* directives.c: Support __has_include__ builtin.
	* expr.c (parse_has_include): New function to parse __has_include__
	builtin; (eval_token()): Use it.
	* files.c (_cpp_has_header()): New function to look for header;
	(open_file_failed()): Not an error to not find a header file for
	__has_include__.
	* identifiers.c (_cpp_init_hashtable()): Add entry for __has_include__.
	* pch.c (cpp_read_state): Lookup __has_include__.
	* traditional.c (enum ls, _cpp_scan_out_logical_line()): Walk through
	__has_include__ statements.

Index: internal.h
===================================================================
--- internal.h	(revision 215628)
+++ internal.h	(working copy)
@@ -258,6 +258,9 @@
   /* Nonzero when parsing arguments to a function-like macro.  */
   unsigned char parsing_args;
 
+  /* Nonzero to prevent macro expansion.  */
+  unsigned char in__has_include__;
+
   /* Nonzero if prevent_expansion is true only because output is
      being discarded.  */
   unsigned char discarding_output;
@@ -279,6 +282,8 @@
   cpp_hashnode *n_true;               /* C++ keyword true */
   cpp_hashnode *n_false;              /* C++ keyword false */
   cpp_hashnode *n__VA_ARGS__;         /* C99 vararg macros */
+  cpp_hashnode *n__has_include__;     /* __has_include__ operator */
+  cpp_hashnode *n__has_include_next__; /* __has_include_next__ operator */
 };
 
 typedef struct _cpp_line_note _cpp_line_note;
@@ -645,6 +650,8 @@
 extern bool _cpp_read_file_entries (cpp_reader *, FILE *);
 extern const char *_cpp_get_file_name (_cpp_file *);
 extern struct stat *_cpp_get_file_stat (_cpp_file *);
+extern bool _cpp_has_header (cpp_reader *, const char *, int,
+                             enum include_type);
 
 /* In expr.c */
 extern bool _cpp_parse_expr (cpp_reader *, bool);
@@ -680,6 +687,7 @@
 extern void _cpp_do_file_change (cpp_reader *, enum lc_reason, const char *,
                                  linenum_type, unsigned int);
 extern void _cpp_pop_buffer (cpp_reader *);
+extern char *_cpp_bracket_include (cpp_reader *);
 
 /* In directives.c */
 struct _cpp_dir_only_callbacks
Index: directives.c
===================================================================
--- directives.c	(revision 215628)
+++ directives.c	(working copy)
@@ -566,6 +566,11 @@
       if (is_def_or_undef && node == pfile->spec_nodes.n_defined)
         cpp_error (pfile, CPP_DL_ERROR,
                    "\"defined\" cannot be used as a macro name");
+      else if (is_def_or_undef
+               && (node == pfile->spec_nodes.n__has_include__
+                   || node == pfile->spec_nodes.n__has_include_next__))
+        cpp_error (pfile, CPP_DL_ERROR,
+                   "\"__has_include__\" cannot be used as a macro name");
       else if (! (node->flags & NODE_POISONED))
         return node;
     }
@@ -2623,3 +2628,12 @@
       node->directive_index = i;
     }
 }
+
+/* Extract header file from a bracket include. Parsing starts after '<'.
+   The string is malloced and must be freed by the caller.  */
+char *
+_cpp_bracket_include(cpp_reader *pfile)
+{
+  return glue_header_name (pfile);
+}
+
Index: expr.c
===================================================================
--- expr.c	(revision 215628)
+++ expr.c	(working copy)
@@ -64,6 +64,8 @@
 static unsigned int interpret_int_suffix (cpp_reader *, const uchar *, size_t);
 static void check_promotion (cpp_reader *, const struct op *);
 
+static cpp_num parse_has_include (cpp_reader *, enum include_type);
+
 /* Token type abuse to create unary plus and minus operators. */
 #define CPP_UPLUS ((enum cpp_ttype) (CPP_LAST_CPP_OP + 1))
 #define CPP_UMINUS ((enum cpp_ttype) (CPP_LAST_CPP_OP + 2))
@@ -1048,6 +1050,10 @@
     case CPP_NAME:
       if (token->val.node.node == pfile->spec_nodes.n_defined)
         return parse_defined (pfile);
+      else if (token->val.node.node == pfile->spec_nodes.n__has_include__)
+        return
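
For readers less familiar with the feature this series implements, here is a minimal sketch of how user code typically consumes `__has_include`. It assumes a compiler that provides `__has_include` in the form recommended by SD-6; the `HAVE_CSTDDEF` macro and the `have_cstddef` function are names invented for this illustration and are not part of the patch.

```cpp
#include <cassert>

// Probe for a header only if the compiler supports __has_include at
// all. The two tests are nested rather than combined, so a compiler
// without __has_include never has to parse the second condition.
#if defined(__has_include)
#  if __has_include(<cstddef>)
#    include <cstddef>
#    define HAVE_CSTDDEF 1
#  endif
#endif

#ifndef HAVE_CSTDDEF
#  define HAVE_CSTDDEF 0
#endif

// Reports whether the probe above found <cstddef>.
constexpr bool have_cstddef() { return HAVE_CSTDDEF != 0; }
```

Per the discussion above, a header that exists but cannot be read still makes the probe succeed, so the error surfaces at the later `#include` rather than being silently swallowed.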
Re: [PATCH v3] PR libitm/61164: use always_inline consistently
On Mon, 2014-09-29 at 16:53 +0400, Gleb Fotengauer-Malinovskiy wrote:
> -#undef __always_inline
> -#define __always_inline __attribute__((always_inline))
> +#define __libitm_always_inline inline __attribute__((always_inline))

The previous code seems to work in libstdc++. I believe that eventually, we'll want to use libstdc++-v3/include/bits/atomic_base.h (see the comment at the top of the file). Can we keep the diff between the two files small?
Re: Enable TBAA on anonymous types with LTO
Why not just make all anonymous types their own canonical type? (of course considering type variants) If C++ FE sets canonical type always to main variant, it should work. Is it always the case? I noticed you do this for variadic types. I tought there is reason why canonical types differ from main variants and the way canonical types are built in FE seems more complex too, than just setting CANONICAL=MAIN_VARIANT 2) update canonical type hash so it can deal with types that already have canonical type set. I insert even anonymous types there because I am not able to get rid of cases where non-anonmous type explicitly mentions anonymous. Consider: namespace { struct B {}; } struct A { void t(B); void t2(); }; void A::t(B) { } void A::t2() { } Here we end up having type of method T non-anonymous but it builds from B that is anonymous. But that makes B non-anonymous as well? How is A::t mangled? Consider also the simpler case struct A { struct B b; }; Yep, A seems to be not anonymous and mangled as A. I think it is ODR violation to declare such type in more than one compilation unit (and we will warn on it). We can make it anonymous, but I think it is C++ FE to do so. Being bale to handle non-upwards closed cases will be needed soon for full ODR type handling 3) Disable tree merging of anonymous namespace nodes and anonymous types. The second is needed, because I can have two identically looking anonymous types from same unit with different canonical types. But the container should be distinct? Isn't this again the issue that we merge anonymous namespace decls? Please try to fix that one and forall. I already made anonymous namespaces unmerged in this patch. But still I saw cases where two anonymous namespace types differed only by TYPE_CANONICAL and had DFS components of different size. Because we do not compare TYPE_CANONICAl we declared them equivalent and DFS merging ICEd. This may go away once we get some ability to decide on unmergability at stream out time. 
I do not attept to merge anonymous types with structurally equivalent non-anonymous types from other compilation units. I think it is nature of C++ language that types in anonymous namespaces can not be accessed by other units and I hope to use this for other optimizations, too. What about cross-language LTO? With your scheme you say that you can't ever interoperate using anonymous entities (even if used from a non-anonymous one like in the examples above)? I think that's a dangerous route to go. Maybe detect the case where we compile from multiple source languages and behave differently? I really think that anonymous types are meant to not be accessible from other compilation unit and I do not see why other languages need different rule. Of course for proper ODR types we need to solve how to merge non-C++ types with them. So I can do that for anonymous types, too. I had to revisit my original plan for canonical types for ODR. I originally just insterted ODR types into ODR hash as well as canonical type hash (without killing their canonical type). While doing so I recordedif given canonical type has non-ODR type in it. Finally I went through ODR types and updated their canonical types by canonical hash if the conflict happen. This does not work for types build from ODR types that arenot ODR themselves. My plan is: 1) fork current canonical_type hash into two - structural_type_hash and canonical_type_hash 2) During the streaming in, I populate structural_type_hash and the existing odr_hash. I record what structural type hash buckets contains non-ODR type 3) During the existing loop that recomputes canonical type I start populating canonical type hash. It works same way as structural except for ODR types that do not conflict. THose are considered by ODR equality. We can add documentation about this to -fstrict-aliasing section of manual I guess. What I am concerned about is the needed change in c-decl.c. 
The C frontend currently outputs declarations that type_in_anonymous_namespace_p mistakes for anonymous in some cases. This is because it does not set the PUBLIC flag on the TYPE decl. This is a bug:

/* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL, nonzero means
   name is to be accessible from outside this translation unit.
   In an IDENTIFIER_NODE, nonzero means an external declaration
   accessible from outside this translation unit was previously seen
   for this name in an inner scope.  */
#define TREE_PUBLIC(NODE) ((NODE)->base.public_flag)

This fortunately manifests itself as false warnings about type incompatibility from lto-symtab. I did not see these with other
Re: Enable TBAA on anonymous types with LTO
Why not just make all anonymous types their own canonical type? (of course considering type variants) If C++ FE sets canonical type always to main variant, it should work. Is it always the case? I noticed you do this for variadic types. I tought there is reason why canonical types differ from main variants and the way canonical types are built in FE seems more complex too, than just setting CANONICAL=MAIN_VARIANT 2) update canonical type hash so it can deal with types that already have canonical type set. I insert even anonymous types there because I am not able to get rid of cases where non-anonmous type explicitly mentions anonymous. Consider: namespace { struct B {}; } struct A { void t(B); void t2(); }; void A::t(B) { } void A::t2() { } Here we end up having type of method T non-anonymous but it builds from B that is anonymous. But that makes B non-anonymous as well? How is A::t mangled? Consider also the simpler case struct A { struct B b; }; Yep, A seems to be not anonymous and mangled as A. I think it is ODR violation to declare such type in more than one compilation unit (and we will warn on it). We can make it anonymous, but I think it is C++ FE to do so. If I update my testcase to also have struct B b as a field I get: .type _ZN1A2t2Ev, @function _ZN1A2t2Ev: .LFB1: .cfi_startproc rep ret .cfi_endproc I.e. a::t is static (based on its anonymous namespace argument B) while a::t2 is normal externally visible method for type mangled as A.
Re: [c++-concepts] Check function concept definitions
Hmm, have we actually discussed this in core review? I'm not seeing it on the wiki. Constexpr started out this way too, and allowing static_assert was added pretty fast. C++11 said

its function-body shall be = delete, = default, or a compound-statement that contains only
- null statements,
- static_assert-declarations,
- typedef declarations and alias-declarations that do not define classes or enumerations,
- using-declarations,
- using-directives,
- and exactly one return statement;

Is there a reason we want to be more strict than this for concept functions?

I don't remember much controversy on that particular limitation in either Rapperswil or the previous telecon review. The main reason for the restriction is that concept definitions are normalized into a single constraint-expression. And it's not obvious how things like using-declarations and static-assertions should be interpreted within the constraint language.

That said, having a static_assert inside a concept kind of defeats the purpose since it triggers a diagnostic when its condition isn't satisfied. That's not very SFINAE friendly :)

Maybe the restriction can be relaxed when we consider the TS for adoption in 17.

Andrew
Re: [PATCH 1/2] Remove -fshort-double
So - no, you can't simply remove it. But IMHO it should become a target-specific flag. How about a patch to just disable it for x86? -Andi
Re: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C
On 09/29/14 08:26, Thomas Schwinge wrote:
Hi!

On Mon, 29 Sep 2014 13:58:31 +, Tannenbaum, Barry M barry.m.tannenb...@intel.com wrote:
In a nutshell, add the following code to main() before the call to f3():

    int status = __cilkrts_set_param("nworkers", "2");
    if (0 != status) {
        // Failed to set the number of Cilk workers
        return status;
    }

Yeah, that's what I had proposed with the patch at the end of my previous email, http://news.gmane.org/find-root.php?message_id=%3C8761g6g0je.fsf%40kepler.schwinge.homeip.net%3E. I'm sorry if I didn't make it obvious that more text and the patch were following after the full-quote of the original issue description. Here are the details: [...]

Thanks again for your helpful comments; that's appreciated. Here's again my proposed patch. Note that the include paths in GCC compiler testing (gcc/testsuite/) are not set up to pick up the cilk/cilk_api.h include file, so I've manually added a prototype for the __cilkrts_set_param function to the three files. I can change that, if requested.

commit ee7138e451d1f3284d6fa0f61fe517c82db94060
Author: Thomas Schwinge tho...@codesourcery.com
Date: Mon Sep 29 12:47:34 2014 +0200

    Audit Cilk Plus tests for CILK_NWORKERS=1.

    gcc/testsuite/
    * c-c++-common/cilk-plus/CK/spawning_arg.c (main): Call __cilkrts_set_param to set two workers.
    * c-c++-common/cilk-plus/CK/steal_check.c (main): Likewise.
    * g++.dg/cilk-plus/CK/catch_exc.cc (main): Likewise.

OK.
Jeff
Re: [PATCH v3] PR libitm/61164: use always_inline consistently
On Mon, Sep 29, 2014 at 05:35:24PM +0200, Torvald Riegel wrote:
On Mon, 2014-09-29 at 16:53 +0400, Gleb Fotengauer-Malinovskiy wrote:

    -#undef __always_inline
    -#define __always_inline __attribute__((always_inline))
    +#define __libitm_always_inline inline __attribute__((always_inline))

The previous code seems to work in libstdc++. I believe that eventually, we'll want to use libstdc++-v3/include/bits/atomic_base.h (see the comment at the top of the file). Can we keep the diff between the two files small?

libstdc++-v3/include/bits/atomic_base.h uses the _GLIBCXX_ALWAYS_INLINE macro:

    #ifndef _GLIBCXX_ALWAYS_INLINE
    #define _GLIBCXX_ALWAYS_INLINE inline __attribute__((always_inline))
    #endif

but using a libstdc++ specific macro in libitm sounds weird to me (any change to that macro would mean you need to change it accordingly also in libitm or vice versa).

Jakub
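
To make the macro shape under discussion concrete, here is a minimal sketch, assuming GCC or Clang, of a project-private always-inline macro that bundles the `inline` keyword with the attribute, in the same form as the `_GLIBCXX_ALWAYS_INLINE` definition quoted above. The names `MYLIB_ALWAYS_INLINE`, `add_one` and `counter` are hypothetical and appear in neither libitm nor libstdc++.

```cpp
#include <cassert>

// Hypothetical project-private macro; the #ifndef guard lets an earlier
// definition win, mirroring the pattern quoted from atomic_base.h.
#ifndef MYLIB_ALWAYS_INLINE
# define MYLIB_ALWAYS_INLINE inline __attribute__((always_inline))
#endif

// Free function: the macro already supplies `inline`, so it is not repeated.
MYLIB_ALWAYS_INLINE int
add_one (int x)
{
  return x + 1;
}

struct counter
{
  int value = 0;

  // Member functions defined in the class body are implicitly inline, so
  // the `inline` contributed by the macro is redundant here but harmless.
  MYLIB_ALWAYS_INLINE int
  bump ()
  {
    assert (value >= 0);
    value = add_one (value);
    return value;
  }
};
```

Using a private name avoids redefining `__always_inline`, an identifier that other headers may already define, and keeps the macro independent of any one library's internals.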
Commit: MSP430: Improve -mhwmult= opton and fix prologues and epilogues for naked functions
Hi Guys, I am applying the patch below to the MSP430 backend. It improves the -mhwmult=auto command line option so that MCUs without any hardware support will be recognised if the -mmcu= option has also been given. The patch also fixes a small problem with the prologue and epilogue generation in that code in function.c expects some RTL to be returned, even if the function is naked. Cheers Nick gcc/ChangeLog 2014-09-29 Nick Clifton ni...@redhat.com * config/msp430/msp430.c (msp430_expand_prologue): Return a CLOBBER rtx for naked functions. (msp430_expand_epilogue): Likewise. (msp430_use_f5_series_hwmult): Cache result. (use_32bit_hwmult): Cache result. (msp430_no_hwmult): New function. (msp430_output_labelref): Use it. Index: gcc/config/msp430/msp430.c === --- gcc/config/msp430/msp430.c (revision 215682) +++ gcc/config/msp430/msp430.c (working copy) @@ -1494,7 +1494,12 @@ rtx p; if (is_naked_func ()) -return; +{ + /* We must generate some RTX as thread_prologue_and_epilogue_insns() +examines the output of the gen_prologue() function. */ + emit_insn (gen_rtx_CLOBBER (VOIDmode, GEN_INT (0))); + return; +} emit_insn (gen_prologue_start_marker ()); @@ -1603,7 +1608,12 @@ int helper_n = 0; if (is_naked_func ()) -return; +{ + /* We must generate some RTX as thread_prologue_and_epilogue_insns() +examines the output of the gen_epilogue() function. */ + emit_insn (gen_rtx_CLOBBER (VOIDmode, GEN_INT (0))); + return; +} if (cfun-machine-need_to_save [10]) { @@ -2030,10 +2040,15 @@ { NULL, NULL } }; -/* Returns true if the current MCU is an F5xxx series. */ +/* Returns true if the current MCU supports an F5xxx series + hardware multiper. 
*/ + bool msp430_use_f5_series_hwmult (void) { + static const char * cached_match = NULL; + static bool cached_result; + if (msp430_hwmult_type == F5SERIES) return true; @@ -2040,10 +2055,35 @@ if (target_mcu == NULL || msp430_hwmult_type != AUTO) return false; - return strncasecmp (target_mcu, msp430f5, 8) == 0; + if (target_mcu == cached_match) +return cached_result; + + cached_match = target_mcu; + + if (strncasecmp (target_mcu, msp430f5, 8) == 0) +return cached_result = true; + + static const char * known_f5_mult_mcus [] = +{ + cc430f5123,cc430f5125, cc430f5133, + cc430f5135,cc430f5137, cc430f5143, + cc430f5145,cc430f5147, cc430f6125, + cc430f6126,cc430f6127, cc430f6135, + cc430f6137,cc430f6143, cc430f6145, + cc430f6147,msp430bt5190, msp430sl5438a +}; + int i; + + for (i = ARRAY_SIZE (known_f5_mult_mcus); i--;) +if (strcasecmp (target_mcu, known_f5_mult_mcus[i]) == 0) + return cached_result = true; + + return cached_result = false; } -/* Returns true id the current MCU has a second generation 32-bit hardware multiplier. */ +/* Returns true if the current MCU has a second generation + 32-bit hardware multiplier. */ + static bool use_32bit_hwmult (void) { @@ -2056,6 +2096,8 @@ msp430f47186, msp430f47196, msp430f47167, msp430f47177, msp430f47187, msp430f47197 }; + static const char * cached_match = NULL; + static bool cached_result; int i; if (msp430_hwmult_type == LARGE) @@ -2064,15 +2106,105 @@ if (target_mcu == NULL || msp430_hwmult_type != AUTO) return false; + if (target_mcu == cached_match) +return cached_result; + + cached_match = target_mcu; for (i = ARRAY_SIZE (known_32bit_mult_mcus); i--;) if (strcasecmp (target_mcu, known_32bit_mult_mcus[i]) == 0) - return true; + return cached_result = true; - return false; + return cached_result = false; } +/* Returns true if the current MCU does not have a + hardware multiplier of any kind. 
*/ + +static bool +msp430_no_hwmult (void) +{ + static const char * known_nomult_mcus [] = +{ + msp430c091,msp430c092, msp430c111, + msp430c, msp430c112, msp430c1121, + msp430c1331, msp430c1351, msp430c311s, + msp430c312,msp430c313, msp430c314, + msp430c315,msp430c323, msp430c325, + msp430c412,msp430c413, msp430e112, + msp430e313,msp430e315, msp430e325, + msp430f110,msp430f1101, msp430f1101a, + msp430f, msp430fa, msp430f112, + msp430f1121, msp430f1121a, msp430f1122, + msp430f1132, msp430f122, msp430f1222, + msp430f123,msp430f1232, msp430f133, + msp430f135,msp430f155, msp430f156, + msp430f157,msp430f2001, msp430f2002, + msp430f2003, msp430f2011, msp430f2012, + msp430f2013, msp430f2101, msp430f2111, + msp430f2112, msp430f2121, msp430f2122, + msp430f2131, msp430f2132, msp430f2232, +
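
The caching idiom the patch adds to msp430_use_f5_series_hwmult and use_32bit_hwmult can be sketched in isolation as follows. This is an illustration under stated assumptions, not the committed code: `selected_mcu` is a hypothetical stand-in for target_mcu, the device table is abridged to three names taken from the patch, `scan_count` exists only to make the cache observable, and the string arguments are assumed to have been quoted in the original source.

```cpp
#include <cassert>
#include <cstddef>
#include <strings.h>   // strcasecmp, strncasecmp (POSIX)

// Hypothetical stand-in for the value of the -mmcu= option.
static const char *selected_mcu = nullptr;

// Counts table scans so the effect of the cache can be observed.
static int scan_count = 0;

// Returns true if selected_mcu names a device in the abridged table.
static bool
mcu_has_f5_multiplier ()
{
  static const char *cached_match = nullptr;
  static bool cached_result = false;

  if (selected_mcu == nullptr)
    return false;

  // Pointer identity rather than string equality: cheap, and valid as
  // long as the option string is never modified in place.
  if (selected_mcu == cached_match)
    return cached_result;

  cached_match = selected_mcu;
  ++scan_count;

  if (strncasecmp (selected_mcu, "msp430f5", 8) == 0)
    return cached_result = true;

  static const char *const known[] =
    { "cc430f5123", "cc430f6147", "msp430bt5190" };

  for (std::size_t i = sizeof known / sizeof known[0]; i--;)
    if (strcasecmp (selected_mcu, known[i]) == 0)
      return cached_result = true;

  return cached_result = false;
}
```

Because the cache is keyed on the pointer, a second call with the same option string skips the scan entirely, while a different string (even one with identical contents at another address) triggers a fresh lookup.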
Re: [PATCH] PR libitm/61164: redefinition of __always_inline
2014-09-27 Gleb Fotengauer-Malinovskiy gle...@altlinux.org libitm/ PR libitm/61164 * local_atomic (__always_inline): Rename to... (__libitm_always_inline): ... this. --- On Mon, Sep 29, 2014 at 03:38:25PM +0200, Jakub Jelinek wrote: Why do you want to add inline keyword to that? Some inline keywords are implicit (methods defined inline), so there is no point adding it there. I just didn't get that redefinition of __always_inline was the source of the problem. local_atomic | 299 +-- 1 file changed, 149 insertions(+), 150 deletions(-) diff --git a/local_atomic b/local_atomic index c3e079f..552b919 100644 --- a/local_atomic +++ b/local_atomic @@ -41,8 +41,7 @@ #ifndef _GLIBCXX_ATOMIC #define _GLIBCXX_ATOMIC 1 -#undef __always_inline -#define __always_inline __attribute__((always_inline)) +#define __libitm_always_inline __attribute__((always_inline)) // #pragma GCC system_header @@ -74,7 +74,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) memory_order_seq_cst } memory_order; - inline __always_inline memory_order + inline __libitm_always_inline memory_order __calculate_memory_order(memory_order __m) noexcept { const bool __cond1 = __m == memory_order_release; @@ -84,13 +84,13 @@ namespace std // _GLIBCXX_VISIBILITY(default) return __mo2; } - inline __always_inline void + inline __libitm_always_inline void atomic_thread_fence(memory_order __m) noexcept { __atomic_thread_fence (__m); } - inline __always_inline void + inline __libitm_always_inline void atomic_signal_fence(memory_order __m) noexcept { __atomic_thread_fence (__m); @@ -280,19 +280,19 @@ namespace std // _GLIBCXX_VISIBILITY(default) // Conversion to ATOMIC_FLAG_INIT. 
atomic_flag(bool __i) noexcept : __atomic_flag_base({ __i }) { } -__always_inline bool +__libitm_always_inline bool test_and_set(memory_order __m = memory_order_seq_cst) noexcept { return __atomic_test_and_set (_M_i, __m); } -__always_inline bool +__libitm_always_inline bool test_and_set(memory_order __m = memory_order_seq_cst) volatile noexcept { return __atomic_test_and_set (_M_i, __m); } -__always_inline void +__libitm_always_inline void clear(memory_order __m = memory_order_seq_cst) noexcept { // __glibcxx_assert(__m != memory_order_consume); @@ -302,7 +302,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) __atomic_clear (_M_i, __m); } -__always_inline void +__libitm_always_inline void clear(memory_order __m = memory_order_seq_cst) volatile noexcept { // __glibcxx_assert(__m != memory_order_consume); @@ -455,7 +455,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) is_lock_free() const volatile noexcept { return __atomic_is_lock_free (sizeof (_M_i), _M_i); } - __always_inline void + __libitm_always_inline void store(__int_type __i, memory_order __m = memory_order_seq_cst) noexcept { // __glibcxx_assert(__m != memory_order_acquire); @@ -465,7 +465,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) __atomic_store_n(_M_i, __i, __m); } - __always_inline void + __libitm_always_inline void store(__int_type __i, memory_order __m = memory_order_seq_cst) volatile noexcept { @@ -476,7 +476,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) __atomic_store_n(_M_i, __i, __m); } - __always_inline __int_type + __libitm_always_inline __int_type load(memory_order __m = memory_order_seq_cst) const noexcept { // __glibcxx_assert(__m != memory_order_release); @@ -485,7 +485,7 @@ namespace std // _GLIBCXX_VISIBILITY(default) return __atomic_load_n(_M_i, __m); } - __always_inline __int_type + __libitm_always_inline __int_type load(memory_order __m = memory_order_seq_cst) const volatile noexcept { // __glibcxx_assert(__m != memory_order_release); @@ -494,21 +494,21 @@ namespace std 
// _GLIBCXX_VISIBILITY(default) return __atomic_load_n(_M_i, __m); } - __always_inline __int_type + __libitm_always_inline __int_type exchange(__int_type __i, memory_order __m = memory_order_seq_cst) noexcept { return __atomic_exchange_n(_M_i, __i, __m); } - __always_inline __int_type + __libitm_always_inline __int_type exchange(__int_type __i, memory_order __m = memory_order_seq_cst) volatile noexcept { return __atomic_exchange_n(_M_i, __i, __m); } - __always_inline bool + __libitm_always_inline bool compare_exchange_weak(__int_type __i1, __int_type __i2, memory_order __m1, memory_order __m2) noexcept { @@ -519,7 +519,7 @@ namespace std //
[jit] Documentation tweaks
Committed to branch dmalcolm/jit: gcc/jit/ChangeLog.jit: * TODO.rst: Update. * docs/topics/expressions.rst (gcc_jit_context_new_call): Add a note clarifying the behavior of this entrypoint. * docs/topics/functions.rst (Creating and using functions): Markup fix. (gcc_jit_block_add_assignment_op): Add an example. --- gcc/jit/TODO.rst| 9 ++--- gcc/jit/docs/topics/expressions.rst | 23 +++ gcc/jit/docs/topics/functions.rst | 13 +++-- 3 files changed, 36 insertions(+), 9 deletions(-) diff --git a/gcc/jit/TODO.rst b/gcc/jit/TODO.rst index c1ea024..09c4d9d 100644 --- a/gcc/jit/TODO.rst +++ b/gcc/jit/TODO.rst @@ -29,12 +29,6 @@ API gcc_jit_function_as_rvalue () -* clarify gcc_jit_function_add_eval():: - -(void)expression; - - and, indeed, clarify all of the other operations. - * expressing branch probabilies (like __builtin_expect):: extern gcc_jit_rvalue * @@ -99,7 +93,8 @@ Future milestones * Detect and issue warnings/errors about uses of uninitialized variables -* Warn about unused objects in a context (e.g. rvalues/lvalues)? +* Warn about unused objects in a context (e.g. rvalues/lvalues)? (e.g. + for gcc_jit_context_new_call vs gcc_jit_block_add_eval) Nice to have diff --git a/gcc/jit/docs/topics/expressions.rst b/gcc/jit/docs/topics/expressions.rst index 98d8e92..a95f5c9 100644 --- a/gcc/jit/docs/topics/expressions.rst +++ b/gcc/jit/docs/topics/expressions.rst @@ -364,6 +364,29 @@ Function calls Given a function and the given table of argument rvalues, construct a call to the function, with the result as an rvalue. + .. note:: + + :c:func:`gcc_jit_context_new_call` merely builds a + :c:type:`gcc_jit_rvalue` i.e. an expression that can be evaluated, + perhaps as part of a more complicated expression. + The call *won't* happen unless you add a statement to a function + that evaluates the expression. + + For example, if you want to call a function and discard the result + (or to call a function with ``void`` return type), use + :c:func:`gcc_jit_block_add_eval`: + + .. 
code-block:: c + + /* Add (void)printf (arg0, arg1);. */ + gcc_jit_block_add_eval ( + block, NULL, + gcc_jit_context_new_call ( + ctxt, + NULL, + printf_func, + 2, args)); + Type-coercion * diff --git a/gcc/jit/docs/topics/functions.rst b/gcc/jit/docs/topics/functions.rst index 15c895a..aa0c069 100644 --- a/gcc/jit/docs/topics/functions.rst +++ b/gcc/jit/docs/topics/functions.rst @@ -18,7 +18,7 @@ .. default-domain:: c Creating and using functions - + Params -- @@ -224,7 +224,16 @@ Statements lvalue *= rvalue; lvalue /= rvalue; - etc. + etc. For example: + + .. code-block:: c + + /* i++ */ + gcc_jit_block_add_assignment_op ( + loop_body, NULL, + i, + GCC_JIT_BINARY_OP_PLUS, + gcc_jit_context_one (ctxt, int_type)); .. function:: void\ gcc_jit_block_add_comment (gcc_jit_block *block,\ -- 1.7.11.7
Re: [c++-concepts] Check function concept definitions
On 09/29/2014 11:46 AM, Andrew Sutton wrote: The main reason for the restriction is that concept definitions are normalized into a single constraint-expression. And it's not obvious how things like using declarations and static-assertions should be interpreted within the constraint language. A using-declaration just affects name lookup. They and typedefs/aliases can help to make the return statement easier to write. That said, having a static_assert inside a concept kind of defeats the purpose since it triggers a diagnostic when its condition isn't satisfied. That's not very SFINAE friendly :) True. It might still be useful if for some reason testing a concept for a certain class of types indicates an error somewhere else. And people are likely to try it, as indicated by the bug report. :) Maybe the restriction can relaxed when we consider the TS for adoption in 17. I suppose, but I'd prefer not to wait that long. I guess we can talk about it on the call today. Jason
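
As an illustration of the point about using-declarations and typedefs making the return statement easier to write, here is a small sketch in plain C++11 constexpr terms. It is not Concepts TS syntax, and the names `traits_detail`, `is_small` and `small_and_trivial` are hypothetical; it only shows the kind of function body the C++11 rule quoted earlier in the thread permits.

```cpp
#include <cassert>
#include <type_traits>

namespace traits_detail
{
  // True for types no larger than a pointer.
  template<typename T>
  struct is_small
    : std::integral_constant<bool, (sizeof (T) <= sizeof (void *))> {};
}

// The body uses only what C++11 allows in a constexpr function.
template<typename T>
constexpr bool
small_and_trivial ()
{
  using traits_detail::is_small;                         // using-declaration
  typedef typename std::remove_cv<T>::type bare_type;    // typedef declaration
  static_assert (sizeof (T) > 0, "T must be complete");  // static_assert-declaration
  return is_small<bare_type>::value
         && std::is_trivial<bare_type>::value;           // exactly one return
}
```

The using-declaration and the typedef only affect name lookup and spelling; the value of the function is still determined by the single return expression.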
Re: [PATCH v3] PR libitm/61164: use always_inline consistently
On Mon, 2014-09-29 at 17:38 +0200, Jakub Jelinek wrote: On Mon, Sep 29, 2014 at 05:35:24PM +0200, Torvald Riegel wrote: On Mon, 2014-09-29 at 16:53 +0400, Gleb Fotengauer-Malinovskiy wrote: -#undef __always_inline -#define __always_inline __attribute__((always_inline)) +#define __libitm_always_inline inline __attribute__((always_inline)) The previous code seems to work in libstdc++. I believe that eventually, we'll want to use libstdc++-v3/include/bits/atomic_base.h (see the comment at the top of the file). Can we keep the diff between the two files small? libstdc++-v3/include/bits/atomic_base.h uses _GLIBCXX_ALWAYS_INLINE macro: #ifndef _GLIBCXX_ALWAYS_INLINE #define _GLIBCXX_ALWAYS_INLINE inline __attribute__((always_inline)) #endif Ahh.. I missed that inline in there when I had a quick look. Sorry for the noise :)
Re: [PATCH] microblaze: microblaze.md: Use VOID instead of SI to fix ((void (*)(void)) 0)() issue
On 09/25/14 07:03, Chen Gang wrote: 2014-09-25 Chen Ganggang.chen.5...@gmail.com gcc: * config/microblaze/microblaze.md (call_internal1): Use VOID instead of SI to fix ((void (*)(void)) 0)() issue gcc/testsuite/: 2014-09-28 Chen Gang gang.chen.5...@gmail.com * gcc.c-torture/compile/calls-void.c: New test. Committed revision 215684. Thanks for adding the test case. -- Michael Eagerea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [PATCH] combine: Allow substituting the target reg of a clobber
On 09/27/14 16:03, Segher Boessenkool wrote: On Mon, Sep 22, 2014 at 04:20:12PM -0600, Jeff Law wrote: Can you add a testcase which shows the 3-insn combination from PR62151 applying? I've tried to make a stable future-proof testcase that does such a three-insn combination. Not easy at all. But now it dawns on me: do you just want the actual testcase from the PR? (Well, fixed so that it is valid C, I suppose). With a test that combine does its job, of course? Not sure how to test that, but maybe I'll learn. Or is a test showing the testcase working after the change good enough? It's going to be hard to totally future proof this kind of test and make it independent across every target, etc. In those cases, I just ask you do something reasonable. That can include making the test only applicable on a small number of targets and testing the RTL or assembly dumps. So, I'd use the testcase from the PR and probably scan the combine dump and probably make it target dependent. If the combine dump isn't particularly good to search for some reason, scan the assembly output. jeff
Re: [PATCH] Fix ICE in redirect_jump_1 (PR inline-asm/63282)
On 09/28/14 23:36, Jakub Jelinek wrote:
Hi!

On the following testcase, dead_or_predicable decides to call redirect_jump_1 on asm goto which has more than one label, but the bb still has just two successors (one of the labels points to code label at the start of the fallthru bb) and redirect_jump_1 ICEs on that.

Usually dead_or_predicable fails if !any_condjump_p, but there is a shortcut (goto no_body) that bypasses that. I think it doesn't really make sense to allow anything but normal conditional jumps here, so the first patch just gives up in that case. Have done instrumented bootstrap on {i?86,x86_64,aarch64,armv7hl,ppc64,ppc64le,s390,s390x}-linux with this and the added goto cancel didn't trigger in any of the bootstraps, and triggered only on 1-3 testcases in the testsuite which all had asm goto in them (one of them this newly added testcase).

Alternately, the second patch turns an assert in redirect_jump_1 into return 0, so it will fail (that also fixes the testcase). With that patch alone, I'm worried about dead_or_predicable calling invert_jump_1 on asm goto, which I can't understand how it would work (there is no way how the condition can be inverted). So, if the second patch is preferable, I think dead_or_predicable should still goto cancel if (reversep && !any_condjump_p (jump)). Or invert_jump_1 should fail early if !any_condjump_p. Or both of the patches could be applied together as is (of course, testcase just from one of those).

2014-09-29 Jakub Jelinek ja...@redhat.com

PR inline-asm/63282
* ifcvt.c (dead_or_predicable): Don't call redirect_jump_1 or invert_jump_1 if jump isn't any_condjump_p.
* gcc.c-torture/compile/pr63282.c: New test.

I think restricting to normal jumps is fine. Approved.

jeff
Re: [PATCH v4 1/2] Fix __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__
On Sep 29, 2014, at 6:20 AM, FX fxcoud...@gmail.com wrote:
I have not seen any trouble arising following the fix to PR 61407 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61407). The patch committed is here: https://gcc.gnu.org/viewcvs?rev=215251&root=gcc&view=rev

I’ve just tested the exact same change on the 4.9 branch, and it bootstraps and regtests fine on x86_64-apple-darwin14 (I’ve got unrelated objc/obj-c++ failures, see PS). I suggest we backport it to 4.9, so that when 4.9.2 is released it builds fine on Yosemite. OK?

Ok.
Re: [PATCH] [ARM] [RFC] Fix longstanding push_minipool_fix ICE (PR49423, lp1296601)
Given that we've had this bake sufficiently on trunk and have seen no regressions reported it should be fine to go back to these branches. Further we've had the 4.9.1 and 4.8.3 releases recently so I'd say Yes, unless the RM's object in the next 24 hours. Make check passed on 4.8 and 4.9 with no regressions. Committed to 4.8 as r215686. Committed to 4.9 as r215685. Sorry for the delay.
[PATCHv3][PING] Enable -fsanitize-recover for KASan
Hi all, This patch enables -fsanitize-recover for KASan by default. This causes KASan to continue execution after error in case of inline instrumentation. This feature is needed because - reports during early bootstrap won't even be printed - needed to run all tests w/o rebooting machine for every test - needed for interactive work on desktop This is the third version of patch which renames -fsanitize-recover to -fubsan-recover and introduces -fasan-recover (enabled by default for KASan). It also moves flag handling to finish_options per Jakub's request. Bootstrapped and regtested on x64. -Y commit a9451a79bcdcab69856a38d228bec8986c0b0b2a Author: Yury Gribov y.gri...@samsung.com Date: Fri Aug 29 16:43:42 2014 +0400 2014-09-29 Yury Gribov y.gri...@samsung.com gcc/ * asan.c (report_error_func): Optionally call recoverable routines. (asan_expand_check_ifn): Ditto. (check_func): Fix formatting. * common.opt (fasan-recover): New option. (fubsan-recover): Rename. * doc/invoke.texi (-fasan-recover): Document new option. * sanitizer.def: New builtins. * opts.c (common_handle_option): Move default initialization to (finish_options): Here. Also initialize flag_asan_recover. * flag-types.h (SANITIZE_UNDEFINED_NONDEFAULT): Rename. * builtins.def: Ditto. * gcc.c (sanitize_spec_function): Ditto. * opts.c (common_handle_option): Ditto. * ubsan.c (ubsan_expand_bounds_ifn): Rename flag. (ubsan_expand_null_ifn): Ditto. (ubsan_build_overflow_builtin): Ditto. (instrument_bool_enum_load): Ditto. (ubsan_instrument_float_cast): Ditto. (instrument_nonnull_arg): Ditto. (instrument_nonnull_return): Ditto. gcc/c-family/ * c-ubsan.c (ubsan_instrument_division): Rename flag_sanitize_recover to flag_ubsan_recover. (ubsan_instrument_shift): Ditto. (ubsan_instrument_vla): Ditto. gcc/testsuite/ * c-c++-common/asan/recovery-1.c: New test. * c-c++-common/ubsan/align-1.c: Rename flag. * c-c++-common/ubsan/align-3.c: Ditto. * c-c++-common/ubsan/bounds-1.c: Ditto. 
* c-c++-common/ubsan/div-by-zero-7.c: Ditto. * c-c++-common/ubsan/float-cast-overflow-10.c: Ditto. * c-c++-common/ubsan/float-cast-overflow-7.c: Ditto. * c-c++-common/ubsan/float-cast-overflow-8.c: Ditto. * c-c++-common/ubsan/float-cast-overflow-9.c: Ditto. * c-c++-common/ubsan/nonnull-2.c: Ditto. * c-c++-common/ubsan/nonnull-3.c: Ditto. * c-c++-common/ubsan/overflow-1.c: Ditto. * c-c++-common/ubsan/overflow-add-1.c: Ditto. * c-c++-common/ubsan/overflow-add-3.c: Ditto. * c-c++-common/ubsan/overflow-mul-1.c: Ditto. * c-c++-common/ubsan/overflow-mul-3.c: Ditto. * c-c++-common/ubsan/overflow-negate-2.c: Ditto. * c-c++-common/ubsan/overflow-sub-1.c: Ditto. * c-c++-common/ubsan/pr59503.c: Ditto. * c-c++-common/ubsan/pr60613-1.c: Ditto. * c-c++-common/ubsan/save-expr-1.c: Ditto. * c-c++-common/ubsan/shift-3.c: Ditto. * c-c++-common/ubsan/shift-6.c: Ditto. * c-c++-common/ubsan/undefined-1.c: Ditto. * c-c++-common/ubsan/vla-2.c: Ditto. * c-c++-common/ubsan/vla-3.c: Ditto. * c-c++-common/ubsan/vla-4.c: Ditto. * g++.dg/ubsan/cxx11-shift-1.C: Ditto. * g++.dg/ubsan/return-2.C: Ditto. Conflicts: gcc/doc/invoke.texi diff --git a/gcc/asan.c b/gcc/asan.c index 63f99f5..fb7a660 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -1376,22 +1376,36 @@ asan_protect_global (tree decl) IS_STORE is either 1 (for a store) or 0 (for a load). 
*/ static tree -report_error_func (bool is_store, HOST_WIDE_INT size_in_bytes, int *nargs) -{ - static enum built_in_function report[2][6] -= { { BUILT_IN_ASAN_REPORT_LOAD1, BUILT_IN_ASAN_REPORT_LOAD2, - BUILT_IN_ASAN_REPORT_LOAD4, BUILT_IN_ASAN_REPORT_LOAD8, - BUILT_IN_ASAN_REPORT_LOAD16, BUILT_IN_ASAN_REPORT_LOAD_N }, - { BUILT_IN_ASAN_REPORT_STORE1, BUILT_IN_ASAN_REPORT_STORE2, - BUILT_IN_ASAN_REPORT_STORE4, BUILT_IN_ASAN_REPORT_STORE8, - BUILT_IN_ASAN_REPORT_STORE16, BUILT_IN_ASAN_REPORT_STORE_N } }; +report_error_func (bool is_store, bool recover_p, HOST_WIDE_INT size_in_bytes, + int *nargs) +{ + static enum built_in_function report[2][2][6] += { { { BUILT_IN_ASAN_REPORT_LOAD1, BUILT_IN_ASAN_REPORT_LOAD2, + BUILT_IN_ASAN_REPORT_LOAD4, BUILT_IN_ASAN_REPORT_LOAD8, + BUILT_IN_ASAN_REPORT_LOAD16, BUILT_IN_ASAN_REPORT_LOAD_N }, + { BUILT_IN_ASAN_REPORT_STORE1, BUILT_IN_ASAN_REPORT_STORE2, + BUILT_IN_ASAN_REPORT_STORE4, BUILT_IN_ASAN_REPORT_STORE8, + BUILT_IN_ASAN_REPORT_STORE16, BUILT_IN_ASAN_REPORT_STORE_N } }, + { { BUILT_IN_ASAN_REPORT_RECOVER_LOAD1, + BUILT_IN_ASAN_REPORT_RECOVER_LOAD2, + BUILT_IN_ASAN_REPORT_RECOVER_LOAD4, + BUILT_IN_ASAN_REPORT_RECOVER_LOAD8, + BUILT_IN_ASAN_REPORT_RECOVER_LOAD16, + BUILT_IN_ASAN_REPORT_RECOVER_LOAD_N }, + {
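
The table-driven selection that the new report_error_func signature implies can be sketched as below. How the three-dimensional table is indexed is not visible in the quoted part of the diff, so the order [recover_p][is_store][size slot] is an assumption read off the initializer layout, and the enumerators are hypothetical stand-ins for the BUILT_IN_ASAN_REPORT_* values.

```cpp
#include <cassert>

// Hypothetical stand-ins for the BUILT_IN_ASAN_REPORT_* enumerators.
enum report_kind
{
  LOAD1, LOAD2, LOAD4, LOAD8, LOAD16, LOAD_N,
  STORE1, STORE2, STORE4, STORE8, STORE16, STORE_N,
  RECOVER_LOAD1, RECOVER_LOAD2, RECOVER_LOAD4,
  RECOVER_LOAD8, RECOVER_LOAD16, RECOVER_LOAD_N,
  RECOVER_STORE1, RECOVER_STORE2, RECOVER_STORE4,
  RECOVER_STORE8, RECOVER_STORE16, RECOVER_STORE_N
};

// Maps an access size to a column: 1, 2, 4, 8 and 16 bytes each get their
// own slot; every other size shares the trailing "_N" slot.
static int
size_slot (long size_in_bytes)
{
  switch (size_in_bytes)
    {
    case 1: return 0;
    case 2: return 1;
    case 4: return 2;
    case 8: return 3;
    case 16: return 4;
    default: return 5;
    }
}

// Assumed indexing: [recover_p][is_store][size slot].
static report_kind
pick_report (bool is_store, bool recover_p, long size_in_bytes)
{
  static const report_kind table[2][2][6] = {
    { { LOAD1, LOAD2, LOAD4, LOAD8, LOAD16, LOAD_N },
      { STORE1, STORE2, STORE4, STORE8, STORE16, STORE_N } },
    { { RECOVER_LOAD1, RECOVER_LOAD2, RECOVER_LOAD4,
        RECOVER_LOAD8, RECOVER_LOAD16, RECOVER_LOAD_N },
      { RECOVER_STORE1, RECOVER_STORE2, RECOVER_STORE4,
        RECOVER_STORE8, RECOVER_STORE16, RECOVER_STORE_N } }
  };

  int slot = size_slot (size_in_bytes);
  assert (slot >= 0 && slot < 6);
  return table[recover_p][is_store][slot];
}
```

Adding the recover dimension to the table keeps the selection a single lookup instead of a second branch on the flag at every call site.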
[PATCHv3][Kasan][PING] Allow to override Asan shadow offset from command line
Hi all, Kasan developers has asked for an option to override offset of Asan shadow memory region. This should simplify experimenting with memory layouts on 64-bit architectures. New patch which checks that -fasan-shadow-offset is only enabled for -fsanitize=kernel-address. I (unfortunately) can't make this --param because this can be a 64-bit value. Bootstrapped and regtested on x64. -Y commit 05829f7922915b075c0f4275d3613947aa793a9c Author: Yury Gribov y.gri...@samsung.com Date: Fri Aug 29 11:58:03 2014 +0400 Allow to override Asan shadow offset. 2014-09-26 Yury Gribov y.gri...@samsung.com gcc/ * asan.c (set_asan_shadow_offset): New function. (asan_shadow_offset): Likewise. (asan_emit_stack_protection): Call asan_shadow_offset. (build_shadow_mem_access): Likewise. * asan.h (set_asan_shadow_offset): Declare. * common.opt (fasan-shadow-offset): New option. * doc/invoke.texi (fasan-shadow-offset): Describe new option. * opts-global.c (handle_common_deferred_options): Handle -fasan-shadow-offset. * opts.c (common_handle_option): Likewise. gcc/testsuite/ * c-c++-common/asan/shadow-offset-1.c: New test. diff --git a/gcc/asan.c b/gcc/asan.c index f520eab..63f99f5 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -238,6 +238,39 @@ along with GCC; see the file COPYING3. If not see A destructor function that calls the runtime asan library function _asan_unregister_globals is also installed. */ +static unsigned HOST_WIDE_INT asan_shadow_offset_value; +static bool asan_shadow_offset_computed; + +/* Sets shadow offset to value in string VAL. */ + +bool +set_asan_shadow_offset (const char *val) +{ + char *endp; + + errno = 0; + asan_shadow_offset_value = strtoul (val, endp, 0); + if (!(*val != '\0' *endp == '\0' errno == 0)) +return false; + + asan_shadow_offset_computed = true; + + return true; +} + +/* Returns Asan shadow offset. 
*/ + +static unsigned HOST_WIDE_INT +asan_shadow_offset () +{ + if (!asan_shadow_offset_computed) +{ + asan_shadow_offset_computed = true; + asan_shadow_offset_value = targetm.asan_shadow_offset (); +} + return asan_shadow_offset_value; +} + alias_set_type asan_shadow_set = -1; /* Pointer types to 1 resp. 2 byte integers in shadow memory. A separate @@ -1124,7 +1157,7 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb, NULL_RTX, 1, OPTAB_DIRECT); shadow_base = plus_constant (Pmode, shadow_base, - targetm.asan_shadow_offset () + asan_shadow_offset () + (base_align_bias ASAN_SHADOW_SHIFT)); gcc_assert (asan_shadow_set != -1 (ASAN_RED_ZONE_SIZE ASAN_SHADOW_SHIFT) == 4); @@ -1502,7 +1535,7 @@ insert_if_then_before_iter (gimple cond, } /* Build - (base_addr ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset (). */ + (base_addr ASAN_SHADOW_SHIFT) + asan_shadow_offset (). */ static tree build_shadow_mem_access (gimple_stmt_iterator *gsi, location_t location, @@ -1519,7 +1552,7 @@ build_shadow_mem_access (gimple_stmt_iterator *gsi, location_t location, gimple_set_location (g, location); gsi_insert_after (gsi, g, GSI_NEW_STMT); - t = build_int_cst (uintptr_type, targetm.asan_shadow_offset ()); + t = build_int_cst (uintptr_type, asan_shadow_offset ()); g = gimple_build_assign_with_ops (PLUS_EXPR, make_ssa_name (uintptr_type, NULL), gimple_assign_lhs (g), t); diff --git a/gcc/asan.h b/gcc/asan.h index 198433f..eadf029 100644 --- a/gcc/asan.h +++ b/gcc/asan.h @@ -36,7 +36,7 @@ extern gimple_stmt_iterator create_cond_insert_point extern alias_set_type asan_shadow_set; /* Shadow memory is found at - (address ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset (). */ + (address ASAN_SHADOW_SHIFT) + asan_shadow_offset (). */ #define ASAN_SHADOW_SHIFT 3 /* Red zone size, stack and global variables are padded by ASAN_RED_ZONE_SIZE @@ -76,4 +76,6 @@ asan_red_zone_size (unsigned int size) return c ? 
2 * ASAN_RED_ZONE_SIZE - c : ASAN_RED_ZONE_SIZE; } +extern bool set_asan_shadow_offset (const char *); + #endif /* TREE_ASAN */ diff --git a/gcc/common.opt b/gcc/common.opt index b4f0ed4..90f6bd4 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -879,6 +879,10 @@ fsanitize= Common Driver Report Joined Select what to sanitize +fasan-shadow-offset= +Common Joined RejectNegative Var(common_deferred_options) Defer +-fasan-shadow-offset=string Use custom shadow memory offset. + fsanitize-recover Common Report Var(flag_sanitize_recover) Init(1) After diagnosing undefined behavior attempt to continue execution diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index f6c3b42..d9bd1f7 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -297,7 +297,7 @@ Objective-C and Objective-C++ Dialects}. @xref{Debugging Options,,Options for Debugging Your Program or GCC}. @gccoptlist{-d@var{letters} -dumpspecs -dumpmachine
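For readers following the -fasan-shadow-offset patch above, here is a minimal standalone sketch of the two ideas it relies on: validating a numeric option string with errno and an end pointer, and mapping an address to its shadow address as (address >> 3) + offset. The names parse_shadow_offset, shadow_address and SHADOW_SHIFT are hypothetical and are not part of GCC. The sketch uses strtoull rather than the patch's strtoul, on the assumption that unsigned long may be narrower than 64 bits on some hosts; that is a choice made for this illustration, not something the patch does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical name; mirrors the value of ASAN_SHADOW_SHIFT in the patch.  */
#define SHADOW_SHIFT 3

/* Parse VAL as an unsigned 64-bit offset.  Base 0 accepts decimal, octal
   and hexadecimal input.  Returns true and stores the result in *OUT only
   if VAL is non-empty, the whole string was consumed, and strtoull did
   not report a range error.  */
bool
parse_shadow_offset (const char *val, uint64_t *out)
{
  char *endp;
  unsigned long long parsed;

  errno = 0;
  parsed = strtoull (val, &endp, 0);
  if (*val == '\0' || *endp != '\0' || errno != 0)
    return false;

  *out = (uint64_t) parsed;
  return true;
}

/* Shadow address for ADDR: (addr >> SHADOW_SHIFT) + offset.  */
uint64_t
shadow_address (uint64_t addr, uint64_t offset)
{
  return (addr >> SHADOW_SHIFT) + offset;
}
```

A caller would reject the option when parse_shadow_offset returns false, and otherwise pass the parsed value as the offset argument of shadow_address.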
[patch] Flatten function.h take 2
On 09/16/2014 05:23 PM, Andrew MacLeod wrote: On 09/16/2014 05:12 PM, Joseph S. Myers wrote: On Tue, 16 Sep 2014, Andrew MacLeod wrote: I did an include file reduction on all the language/*.[ch] and core *.[ch] files, but left the target files with the full complement of 7 includes that function.h used to have. It's probably easier when this is all done to fully reduce the targets one at a time... there are so many nooks and crannies I figured I'd bust something right now if I tried to do all the targets as well :-) How did you determine what includes to remove? You appear to have removed tm.h includes from various files that do in fact use target macros; maybe they get it indirectly included by some other header, but I thought a principle of this flattening was to avoid relying on such indirect inclusions. Because of possible use of target macros in #ifdef conditionals, "compiles with the include removed" is not a sufficient condition for removing it. cfgrtl.c gimple-fold.c mode-switching.c tree-inline.c vmsdbgout.c fortran/f95-lang.c fortran/trans-decl.c objc/objc-act.c Many of those files do in fact get numerous include files from expr.h, which are likely to get put back in when expr.h is flattened, but there is a risk as you point out. Perhaps I should proceed by simply moving the includes and removing any duplicate includes, leaving the reduction for a later date. There is less chance of that causing issues. I did forget about the discussion last year concerning target macros from the RTL end of things... My mind is slowly going :-). OK, here's take 2.. I left all the include files except ones which were duplicated as a result of the flattening. The first one was left, and any subsequent #includes of the files were removed. We'll address unneeded includes separately and all at once.. perhaps with a newer tool that has been taught about input and output dependencies. Bootstrapped on x86_64-unknown-linux-gnu with no new regressions. 
Currently config-list.mk is building, but I'm not expecting any issues there. Assuming all is OK, ok to check in? Andrew PS.. the original commentary: This flattens function.h. It wasn't too bad, there were a few prototypes and defines in expr.h and rtl.h that belong in function.h, and a couple of other prototypes that belonged in other .h files. A bunch of the gen*.c generated files actually use function.h.. so they needed some tweaking. * function.h: Flatten file. Remove includes, adjust prototypes to reflect only what is in function.h. (enum direction, struct args_size, struct locate_and_pad_arg_data, ADD_PARM_SIZE, SUB_PARM_SIZE, ARGS_SIZE_TREE, ARGS_SIZE_RTX): Relocate from expr.h. (ASLK_REDUCE_ALIGN, ASLK_RECORD_PAD): Relocate from rtl.h. (optimize_function_for_size_p, optimize_function_for_speed_p): Move prototypes to predict.h. (init_varasm_status): Move prototype to varasm.h. * expr.h: Adjust include files. (enum direction, struct args_size, struct locate_and_pad_arg_data, ADD_PARM_SIZE, SUB_PARM_SIZE, ARGS_SIZE_TREE, ARGS_SIZE_RTX): Move to function.h. (locate_and_pad_parm): Move prototype to function.h. * rtl.h (assign_stack_local, ASLK_REDUCE_ALIGN, ASLK_RECORD_PAD, assign_stack_local_1, assign_stack_temp, assign_stack_temp_for_type, assign_temp, reposition_prologue_and_epilogue_notes, prologue_epilogue_contains, sibcall_epilogue_contains, update_temp_slot_address, maybe_copy_prologue_epilogue_insn, set_return_jump_label): Move prototypes to function.h. * predict.h (optimize_function_for_size_p, optimize_function_for_speed_p): Relocate prototypes from function.h. * shrink-wrap.h (emit_return_into_block, active_insn_between, convert_jumps_to_returns, emit_return_for_exit): Move prototypes to function.h. * varasm.h (init_varasm_status): Relocate prototype from function.h. * genattrtab.c (write_header): Add predict.h to include list. * genconditions.c (write_header): Add predict.h to include list. * genemit.c (main): Adjust header file includes. 
* gengtype.c (ifiles): Add flattened function.h header files. * genoutput.c (output_prologue): Add predict.h to include list. * genpreds.c (write_insn_preds_c): Adjust header file includes. * genrecog.c (write_header): Add flattened function.h header files. * alias.c: Adjust include files. * auto-inc-dec.c: Likewise. * basic-block.h: Likewise. * bb-reorder.c: Likewise. * bt-load.c: Likewise. * builtins.c: Likewise. * caller-save.c: Likewise. * calls.c: Likewise. * cfgbuild.c: Likewise. * cfgcleanup.c: Likewise. * cfgexpand.c: Likewise. * cfgloop.c: Likewise. * cfgloop.h: Likewise. * cfgrtl.c: Likewise. * cgraph.h: Likewise. * cgraphclones.c: Likewise. * cgraphunit.c: Likewise. * combine-stack-adj.c: Likewise. * combine.c: Likewise. * coverage.c: Likewise. * cprop.c: Likewise. * cse.c: Likewise. * cselib.c: Likewise. * dbxout.c: Likewise. * ddg.c: Likewise. * df-core.c: Likewise. * df-problems.c: Likewise. *
Re: [PATCH] Fix finding default baseline symbols directory
Jonathan Wakely jwak...@redhat.com writes: Would a safer change be to just add a new pattern for aarch64? --- a/libstdc++-v3/configure.host +++ b/libstdc++-v3/configure.host @@ -345,6 +345,9 @@ case ${host} in x86_64) abi_baseline_pair=x86_64-linux-gnu ;; + aarch64) +abi_baseline_pair=aarch64-linux-gnu +;; *) if test -d ${glibcxx_srcdir}/config/abi/post/${try_cpu}-linux-gnu; then abi_baseline_pair=${try_cpu}-linux-gnu IMHO it doesn't make sense to use try_cpu here if it is generic. * configure.host (abi_baseline_pair): If try_cpu is generic use host_cpu for the default. diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host index a12871a..d1298c4 100644 --- a/libstdc++-v3/configure.host +++ b/libstdc++-v3/configure.host @@ -346,8 +346,13 @@ case ${host} in abi_baseline_pair=x86_64-linux-gnu ;; *) -if test -d ${glibcxx_srcdir}/config/abi/post/${try_cpu}-linux-gnu; then - abi_baseline_pair=${try_cpu}-linux-gnu +if test $try_cpu = generic; then + try_abi_cpu=$host_cpu +else + try_abi_cpu=$try_cpu +fi +if test -d ${glibcxx_srcdir}/config/abi/post/${try_abi_cpu}-linux-gnu; then + abi_baseline_pair=${try_abi_cpu}-linux-gnu fi esac case ${host} in Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
RE: [RFC: Patch, PR 60102] [4.9/4.10 Regression] powerpc fp-bit ices@dwf_regno
On Mon, 29 Sep 2014, rohitarul...@freescale.com wrote: As I understand it, the change was supposed to only affect GCC internals, all externally generated debug info was supposed to remain unchanged. If there are changes in debug info, something must have gone wrong. Let me check if I can track this down. Thanks. In case that helps the multilib flags I used for this testing were: -mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe Maciej
[PATCH, rs6000] Remove splat calls with out-of-range arguments from gcc.dg/vmx/ops.c
Hi, While working on another patch, I observed that the test case gcc.dg/vmx/ops.c contains numerous calls to vec_splat and friends for which the second argument (the element selector) is out of range. At best these calls are invalid; as it is, we generate insns that can cause trouble during optimization. (In the case I saw, simplify-rtx tried to reduce the splat of its input at compile time, but the out-of-range element selector caused it to report a bad insn and abort.) This patch removes all of the calls with out-of-range element selectors from the test case. Tested on powerpc64le-unknown-linux-gnu. Ok to commit? Thanks, Bill 2014-09-29 Bill Schmidt wschm...@vnet.linux.ibm.com * gcc.dg/vmx/ops.c: Remove calls to vec_splat, vec_vsplth, vec_vspltw, and vec_vspltb for which the second argument is out of range. Index: gcc/testsuite/gcc.dg/vmx/ops.c === --- gcc/testsuite/gcc.dg/vmx/ops.c (revision 215683) +++ gcc/testsuite/gcc.dg/vmx/ops.c (working copy) @@ -337,32 +337,8 @@ void f2() { *var_vec_b16++ = vec_splat(var_vec_b16[0], 5); *var_vec_b16++ = vec_splat(var_vec_b16[0], 6); *var_vec_b16++ = vec_splat(var_vec_b16[0], 7); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 8); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 9); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 10); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 11); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 12); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 13); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 14); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 15); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 16); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 17); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 18); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 19); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 20); } void f3() { - *var_vec_b16++ = vec_splat(var_vec_b16[0], 21); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 22); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 23); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 
24); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 25); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 26); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 27); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 28); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 29); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 30); - *var_vec_b16++ = vec_splat(var_vec_b16[0], 31); *var_vec_b16++ = vec_srl(var_vec_b16[0], var_vec_u16[1]); *var_vec_b16++ = vec_srl(var_vec_b16[0], var_vec_u32[1]); *var_vec_b16++ = vec_srl(var_vec_b16[0], var_vec_u8[1]); @@ -393,30 +369,6 @@ void f3() { *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 5); *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 6); *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 7); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 8); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 9); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 10); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 11); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 12); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 13); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 14); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 15); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 16); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 17); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 18); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 19); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 20); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 21); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 22); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 23); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 24); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 25); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 26); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 27); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 28); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 29); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 30); - *var_vec_b16++ = vec_vsplth(var_vec_b16[0], 31); *var_vec_b16++ = vec_vsr(var_vec_b16[0], var_vec_u16[1]); *var_vec_b16++ 
= vec_vsr(var_vec_b16[0], var_vec_u32[1]); *var_vec_b16++ = vec_vsr(var_vec_b16[0], var_vec_u8[1]); @@ -451,36 +403,8 @@ void f3() { *var_vec_b32++ = vec_splat(var_vec_b32[0], 1); *var_vec_b32++ = vec_splat(var_vec_b32[0], 2); *var_vec_b32++ = vec_splat(var_vec_b32[0], 3); - *var_vec_b32++ = vec_splat(var_vec_b32[0], 4); } void f4() { - *var_vec_b32++ = vec_splat(var_vec_b32[0], 5); - *var_vec_b32++ = vec_splat(var_vec_b32[0], 6); - *var_vec_b32++ = vec_splat(var_vec_b32[0], 7); - *var_vec_b32++ = vec_splat(var_vec_b32[0], 8); - *var_vec_b32++ = vec_splat(var_vec_b32[0], 9); - *var_vec_b32++ = vec_splat(var_vec_b32[0], 10); - *var_vec_b32++ = vec_splat(var_vec_b32[0], 11); - *var_vec_b32++ = vec_splat(var_vec_b32[0], 12); - *var_vec_b32++ = vec_splat(var_vec_b32[0], 13);
Re: [PATCH 2/n] OpenMP 4.0 offloading infrastructure: LTO streaming
On 29 Sep 03:10, Jan Hubicka wrote: dump for me implied debug dump. LTO is usually called streaming, so perhaps need_lto_streaming? Fixed. +initialize_offload (void) Perhaps have_offload_p? Nothing is initialized here... The next patch will add some initialization to this function. And they'll be committed in a series. So, I'd prefer to keep this name. How does LTO combine with offloading? Both .gnu.lto_ and .gnu.target_lto_ sections are created. LTO just ignores target sections, and offload compiler ignores .gnu.lto_ sections. Everything works fine on my testcases. @@ -4325,11 +4325,6 @@ void inline_free_summary (void) { struct cgraph_node *node; - if (!inline_edge_summary_vec.exists ()) -return; - FOR_EACH_DEFINED_FUNCTION (node) -if (!node->alias) - reset_inline_summary (node); if (function_insertion_hook_holder) symtab->remove_cgraph_insertion_hook (function_insertion_hook_holder); function_insertion_hook_holder = NULL; @@ -4345,6 +4340,11 @@ inline_free_summary (void) if (edge_duplication_hook_holder) symtab->remove_edge_duplication_hook (edge_duplication_hook_holder); edge_duplication_hook_holder = NULL; + if (!inline_edge_summary_vec.exists ()) +return; + FOR_EACH_DEFINED_FUNCTION (node) +if (!node->alias) + reset_inline_summary (node); Why is this needed? Without this change gcc/testsuite/g++.dg/gomp/declare-simd-1.C will fail at -O0, since inline_generate_summary adds add_new_function hook, but at -O0 the inline_edge_summary_vec is empty, and we don't call remove_cgraph_insertion_hook ( https://gcc.gnu.org/ml/gcc-patches/2014-02/msg00055.html ) lto_set_symtab_encoder_in_partition (lto_symtab_encoder_t encoder, symtab_node *node) { + /* Ignore not needed nodes. */ + if (!node->need_dump) +return; I think it should be rather done at caller side (in the loop setting what to output) rather than in this simple datastructure accessor. Done. + /* Ignore references from non-target functions in offload lto mode. 
*/ + if (offload_lto_mode + && !lookup_attribute ("omp declare target", + DECL_ATTRIBUTES (ref->referring->decl))) + continue; Those are quite busy loops, you may consider making offload a flag. Why can't you test need_dump here? Definitely. I have no idea why I did not use this flag here :) Fixed. I think you also need to run free lang data when you decide to stream something. When I compile a file with offloading, but without -flto, I see free lang data, executed during all_small_ipa_passes: #0 free_lang_data () at gcc/tree.c:5655 #1 in (anonymous namespace)::pass_ipa_free_lang_data::execute (this=0x20ce470) at gcc/tree.c:5708 #2 in execute_one_pass (pass=0x20ce470) at gcc/passes.c:2151 #3 in execute_ipa_pass_list (pass=0x20ce470) at gcc/passes.c:2543 #4 in ipa_passes () at gcc/cgraphunit.c:2055 #5 in symbol_table::compile (this=0x719fd000) at gcc/cgraphunit.c:2187 #6 in symbol_table::finalize_compilation_unit (this=0x719fd000) at gcc/cgraphunit.c:2340 #7 in c_write_global_declarations () at gcc/c/c-decl.c:10431 #8 in compile_file () at gcc/toplev.c:566 #9 in do_compile () at gcc/toplev.c:1949 #10 in toplev_main (argc=17, argv=0x7fffe3a8) at gcc/toplev.c:2025 #11 in main (argc=17, argv=0x7fffe3a8) at gcc/main.c:36 Otherwise the cgraph bits seem reasonable. I think Richi will want to comment on LTO part. Here is updated patch. Bootstrapped and regtested. OK for trunk (after all patches from the series are approved)? Thanks, -- Ilya gcc/ * cgraph.h (symtab_node): Add need_lto_streaming flag. * cgraphunit.c: Include lto-section-names.h. (initialize_offload): New function. (ipa_passes): Initialize offload and call ipa_write_summaries if there is something to write to OMP_SECTION_NAME_PREFIX sections. (symbol_table::compile): Call lto_streamer_hooks_init under flag_openmp. * ipa-inline-analysis.c (inline_generate_summary): Do not exit under flag_openmp. (inline_free_summary): Always remove hooks. 
* lto-cgraph.c (referenced_from_other_partition_p): Ignore references from non-target functions to target functions if we are streaming out target-side bytecode (offload lto mode). (reachable_from_other_partition_p): Likewise. (select_what_to_stream): New function. (compute_ltrans_boundary): Do not call lto_set_symtab_encoder_in_partition if the node should not be streamed. * lto-section-names.h (OMP_SECTION_NAME_PREFIX): Define. (section_name_prefix): Declare. * lto-streamer.c (section_name_prefix): New variable. (lto_get_section_name): Use section_name_prefix instead of LTO_SECTION_NAME_PREFIX. * lto-streamer.h (select_what_to_stream): Declare.
Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan
On Mon, Sep 29, 2014 at 09:21:11PM +0400, Yury Gribov wrote: This patch enables -fsanitize-recover for KASan by default. This causes KASan to continue execution after error in case of inline instrumentation. This feature is needed because - reports during early bootstrap won't even be printed - needed to run all tests w/o rebooting machine for every test - needed for interactive work on desktop This is the third version of patch which renames -fsanitize-recover to -fubsan-recover and introduces -fasan-recover (enabled by default for KASan). It also moves flag handling to finish_options per Jakub's request. As the -fsanitize-recover option comes from clang originally, I think this needs coordination with them (whether clang will also rename the option), and certainly keep -fsanitize-recover as a non-documented compat option alias for -fubsan-recover. So, can you please talk to the clang folks about it? Jakub
Re: [PATCH, rs6000] Remove splat calls with out-of-range arguments from gcc.dg/vmx/ops.c
On Mon, Sep 29, 2014 at 1:27 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, While working on another patch, I observed that the test case gcc.dg/vmx/ops.c contains numerous calls to vec_splat and friends for which the second argument (the element selector) is out of range. At best these calls are invalid; as it is, we generate insns that can cause trouble during optimization. (In the case I saw, simplify-rtx tried to reduce the splat of its input at compile time, but the out-of-range element selector caused it to report a bad insn and abort.) This patch removes all of the calls with out-of-range element selectors from the test case. Tested on powerpc64le-unknown-linux-gnu. Ok to commit? Thanks, Bill 2014-09-29 Bill Schmidt wschm...@vnet.linux.ibm.com * gcc.dg/vmx/ops.c: Remove calls to vec_splat, vec_vsplth, vec_vspltw, and vec_vspltb for which the second argument is out of range. Okay. Thanks, David
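The range rule behind the removed calls can be sketched as follows. AltiVec/VMX vectors are 128 bits wide, so a splat's element selector is valid only when it is smaller than the number of elements of that size in a vector; this matches the patch keeping selectors 0 through 7 for 16-bit elements and 0 through 3 for 32-bit elements. The helper names below are hypothetical and exist only for this illustration; they are not GCC or altivec.h APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* AltiVec/VMX vectors are 128 bits (16 bytes) wide.  */
#define VECTOR_BYTES 16

/* Number of elements of ELEM_BYTES bytes in one vector, or 0 if the
   element size is not usable.  Hypothetical helper.  */
size_t
splat_element_count (size_t elem_bytes)
{
  if (elem_bytes == 0 || elem_bytes > VECTOR_BYTES)
    return 0;
  return VECTOR_BYTES / elem_bytes;
}

/* True if SELECTOR names an existing element for the given element size.  */
bool
splat_selector_in_range (size_t elem_bytes, unsigned int selector)
{
  return selector < splat_element_count (elem_bytes);
}
```

Under this rule, a call such as vec_splat(var_vec_b16[0], 8) is out of range because a vector of 16-bit elements has only eight elements, numbered 0 to 7.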
Re: [c++-concepts] Check function concept definitions
On 2014-09-29 18:32, Jason Merrill wrote: On 09/29/2014 11:46 AM, Andrew Sutton wrote: The main reason for the restriction is that concept definitions are normalized into a single constraint-expression. And it's not obvious how things like using declarations and static-assertions should be interpreted within the constraint language. A using-declaration just affects name lookup. They and typedefs/aliases can help to make the return statement easier to write. That said, having a static_assert inside a concept kind of defeats the purpose since it triggers a diagnostic when its condition isn't satisfied. That's not very SFINAE friendly :) True. It might still be useful if for some reason testing a concept for a certain class of types indicates an error somewhere else. And people are likely to try it, as indicated by the bug report. :) Maybe the restriction can be relaxed when we consider the TS for adoption in 17. I suppose, but I'd prefer not to wait that long. I guess we can talk about it on the call today. Jason Since I sent that sample code with the static_assert inside the concept, let me add that I wanted to test something completely different and stumbled over that internal error by accident :-) That being said, I had expected the same rules for concepts as for constexpr functions indeed. It seemed quite natural (and would require less RAM in my brain). Best, Roland
Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations
Ping. On Fri, Sep 19, 2014 at 2:11 PM, Sriraman Tallam tmsri...@google.com wrote: Hi Richard, I also ran the gcc testsuite with RUNTESTFLAGS=--tool_opts=-mcopyrelocs to check for issues. The only test that failed was g++.dg/tsan/default_options.C. It uses -fpie -pie and BFD ld to link. Since BFD ld does not support copy relocations with -pie, it does not link. I linked with gold to make the test pass. Could you please take another look at this patch? Thanks Sri On Mon, Sep 8, 2014 at 3:19 PM, Sriraman Tallam tmsri...@google.com wrote: On Tue, Sep 2, 2014 at 1:40 PM, Richard Henderson r...@redhat.com wrote: On 06/20/2014 05:17 PM, Sriraman Tallam wrote: Index: config/i386/i386.c === --- config/i386/i386.c(revision 211826) +++ config/i386/i386.c(working copy) @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp) return true; } else if (!SYMBOL_REF_FAR_ADDR_P (op0) - && SYMBOL_REF_LOCAL_P (op0) + && (SYMBOL_REF_LOCAL_P (op0) +|| (TARGET_64BIT && ix86_copyrelocs && flag_pie + && !SYMBOL_REF_FUNCTION_P (op0))) && ix86_cmodel != CM_LARGE_PIC) return true; break; This is the wrong place to patch. You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified TARGET_BINDS_LOCAL_P. I have done this in the new attached patch, I added a new function i386_binds_local_p which will check for this and call default_binds_local_p otherwise. Note in particular that I believe that you are doing the wrong thing with weak and COMMON symbols, in that you probably ought not force a copy reloc there. I added an extra check to not do this for WEAK symbols. I also added a check for DECL_EXTERNAL so I believe this will also not be called for COMMON symbols. Note the complexity of default_binds_local_p_1, and the fact that all you really want to modify is /* If PIC, then assume that any global name can be overridden by symbols resolved from other modules. */ else if (shlib) local_p = false; near the bottom of that function. I did not understand what you mean here? 
Were you suggesting an alternative way of doing this? Thanks for reviewing Sri r~
Re: [Patch AArch64] Fix extended register width
Ping. On Mon, Sep 22, 2014 at 11:41 AM, Carrot Wei car...@google.com wrote: Hi, The extended register width in add/adds/sub/subs/cmp instructions is not always the same as the target register; it depends on both the target register width and the extension type. But in the current implementation the extended register width is always the same as the target register. We have noticed it can generate the following wrong assembler code when compiling an internal application: add x2, x20, x0, sxtw 3 The correct assembler should be add x2, x20, w0, sxtw 3 On the other hand, I noticed current gcc can only generate the following extension types: xtb, xth, xtw. In these cases the extended register width can only be 'w'. So this patch changes the extended register size attribute to 'w'. Passed regression tests on qemu without failure. OK for trunk and 4.9 branch? thanks Guozhi Wei 2014-09-22 Guozhi Wei car...@google.com * config/aarch64/aarch64.md (*adds_optabALLX:mode_GPI:mode): Change the extended register width to w. (*subs_optabALLX:mode_GPI:mode): Likewise. (*adds_optabmode_multp2): Likewise. (*subs_optabmode_multp2): Likewise. (*add_optabALLX:mode_GPI:mode): Likewise. (*add_optabALLX:mode_shft_GPI:mode): Likewise. (*add_optabALLX:mode_mult_GPI:mode): Likewise. (*add_optabmode_multp2): Likewise. (*add_uxtmode_multp2): Likewise. (*sub_optabALLX:mode_GPI:mode): Likewise. (*sub_optabALLX:mode_shft_GPI:mode): Likewise. (*sub_optabmode_multp2): Likewise. (*sub_uxtmode_multp2): Likewise. (*cmp_swp_optabALLX:mode_regGPI:mode): Likewise. (*cmp_swp_optabALLX:mode_shft_GPI:mode): Likewise. 2014-09-22 Guozhi Wei car...@google.com * gcc.target/aarch64/subs3.c: Change the extended register width to w. * gcc.target/aarch64/adds3.c: Likewise. * gcc.target/aarch64/cmp.c: Likewise.
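The rule described in this message can be illustrated with a small sketch. It assumes, based on my reading of the A64 extended-register operand form and not on anything in the patch itself, that byte, halfword and word extensions take a 'w' source register while the uxtx/sxtx forms take an 'x' source register. The function name extended_reg_prefix is hypothetical. The patch simply uses 'w' everywhere, because GCC only emits the xtb, xth and xtw forms.

```c
#include <string.h>

/* Return the register-name prefix ('w' or 'x') for the register being
   extended, given the extension mnemonic, or '\0' if EXTEND_OP is not an
   extension operator known to this sketch.  Hypothetical helper.  */
char
extended_reg_prefix (const char *extend_op)
{
  static const char *const w_forms[] = {
    "uxtb", "uxth", "uxtw", "sxtb", "sxth", "sxtw"
  };
  static const char *const x_forms[] = { "uxtx", "sxtx" };
  size_t i;

  /* Extensions from 8, 16 or 32 bits read a 32-bit source register.  */
  for (i = 0; i < sizeof w_forms / sizeof w_forms[0]; i++)
    if (strcmp (extend_op, w_forms[i]) == 0)
      return 'w';

  /* The 64-bit forms read a 64-bit source register.  */
  for (i = 0; i < sizeof x_forms / sizeof x_forms[0]; i++)
    if (strcmp (extend_op, x_forms[i]) == 0)
      return 'x';

  return '\0';
}
```

For the example in the message, extended_reg_prefix ("sxtw") yields 'w', which corresponds to the corrected operand w0 in add x2, x20, w0, sxtw 3.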
Re: __intN patch 3/5: main __int128 - __intN conversion.
Just one question about the include/std/limits changes below. It seems that __glibcxx_signed_b isn't strictly necessary as it doesn't use the B argument, so is it just there for consistency? Yup.
Re: [shrink-wrap] should not sink instructions which may cause trap ?
On 26/09/14 17:12, Jeff Law wrote: On 09/26/14 08:50, Jiong Wang wrote: if (may_trap_p (x)) don't sink this instruction. any comments? Should be checking if x may throw internally instead. Richard, thanks for the suggestion, I have used insn_could_throw_p to do the check, which will only do the check when flag_exceptions and flag_non_call_exceptions are true, so those instructions could still be sunk for normal C/C++ programs. Jeff, below is the fix for the pr49847.C regression on aarch64. I re-ran the full test on aarch64-none-elf bare metal, no regression. Bootstrap ok on x86, no regression on check-gcc/g++. ok for trunk? (re-sent with changelog entry) gcc/ 2014-09-26 Jiong Wang jiong.w...@arm.com * shrink-wrap.c (move_insn_for_shrink_wrap): Check insn_could_throw_p before sinking insn. I think can_throw_internal, per Richi's recommendation is better. Note that can_throw_internal keys off the existence of the EH landing pads for the particular insn. If flag_exceptions is false (for example), then I would not expect those landing pads to exist and the insn would not be considered as potentially throwing. Can you test with can_throw_internal to verify its behaviour and resubmit? Thanks for pointing this out, patch updated. Re-tested, passes x86-64 bootstrap and no regression on check-gcc/g++. Passes aarch64-none-elf cross check also. ok for trunk? BTW, another bug was exposed by the linux x86-64 kernel build; it's at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63404 The problem is that we missed the clobber/use check. I will send a separate patch for review. Really sorry for causing the trouble, the insn move in generic code is actually not that generic, related with some backend features... 2014-09-26 Jiong Wang jiong.w...@arm.com * shrink-wrap.c (move_insn_for_shrink_wrap): Check can_throw_internal before sinking insn. 
commit bff6072abd52fecde5916d1967a7833f581c1e98 Author: Jiong Wang jiong.w...@arm.com Date: Mon Sep 29 13:32:02 2014 +0100 1 diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c index bd4813c..b1ff8a2 100644 --- a/gcc/shrink-wrap.c +++ b/gcc/shrink-wrap.c @@ -189,6 +189,9 @@ move_insn_for_shrink_wrap (basic_block bb, rtx_insn *insn, unsigned int nonconstobj_num = 0; rtx src_inner = NULL_RTX; + if (can_throw_internal (insn)) + return false; + subrtx_var_iterator::array_type array; FOR_EACH_SUBRTX_VAR (iter, array, src, ALL) {
[PATCH] PR63404, gcc 5 miscompiles linux block layer
This is exposed by the linux kernel for x86. The root cause is that the current single_set ignores CLOBBER and USE, while we need to take them into account when handling shrink-wrap. This patch adds one parameter to single_set_2 so that it returns NULL_RTX when the pattern contains a USE or CLOBBER and the caller asked for that. A new helper function, single_set_no_clobber_use, is added. Passes x86-64 bootstrap and check-gcc/g++; I also manually checked that the issue reported at 63404 is gone. Also no regression in the aarch64-none-elf regression test. Comments? Thanks. 2014-09-26 Jiong Wang jiong.w...@arm.com * rtl.h (single_set_no_clobber_use): New function. (single_set_2): New parameter fail_on_clobber_use. (single_set): Likewise. * config/ia64/ia64.c (ia64_single_set): Likewise. * rtlanal.c (single_set_2): Return NULL_RTX if fail_on_clobber_use is true. * shrink-wrap.c (move_insn_for_shrink_wrap): Use single_set_no_clobber_use. diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c index 9337be1..09d3c4a 100644 --- a/gcc/config/ia64/ia64.c +++ b/gcc/config/ia64/ia64.c @@ -7172,7 +7172,7 @@ ia64_single_set (rtx_insn *insn) break; default: - ret = single_set_2 (insn, x); + ret = single_set_2 (insn, x, false); break; } diff --git a/gcc/rtl.h b/gcc/rtl.h index e73f731..7c40d5a 100644 --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -2797,7 +2797,7 @@ extern void set_insn_deleted (rtx); /* Functions in rtlanal.c */ -extern rtx single_set_2 (const rtx_insn *, const_rtx); +extern rtx single_set_2 (const rtx_insn *, const_rtx, bool fail_on_clobber_use); /* Handle the cheap and common cases inline for performance. */ @@ -2810,7 +2810,20 @@ inline rtx single_set (const rtx_insn *insn) return PATTERN (insn); /* Defer to the more expensive case. 
*/ - return single_set_2 (insn, PATTERN (insn)); + return single_set_2 (insn, PATTERN (insn), false); +} + +inline rtx single_set_no_clobber_use (const rtx_insn *insn) +{ + if (!INSN_P (insn)) +return NULL_RTX; + + if (GET_CODE (PATTERN (insn)) == SET) +return PATTERN (insn); + + /* Defer to the more expensive case, and return NULL_RTX if there is + USE or CLOBBER. */ + return single_set_2 (insn, PATTERN (insn), true); } extern enum machine_mode get_address_mode (rtx mem); diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c index 3063458..cb5e36a 100644 --- a/gcc/rtlanal.c +++ b/gcc/rtlanal.c @@ -1182,7 +1182,7 @@ record_hard_reg_uses (rtx *px, void *data) will not be used, which we ignore. */ rtx -single_set_2 (const rtx_insn *insn, const_rtx pat) +single_set_2 (const rtx_insn *insn, const_rtx pat, bool fail_on_clobber_use) { rtx set = NULL; int set_verified = 1; @@ -1197,6 +1197,8 @@ single_set_2 (const rtx_insn *insn, const_rtx pat) { case USE: case CLOBBER: + if (fail_on_clobber_use) + return NULL_RTX; break; case SET: diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c index b1ff8a2..5624ef7 100644 --- a/gcc/shrink-wrap.c +++ b/gcc/shrink-wrap.c @@ -177,7 +177,7 @@ move_insn_for_shrink_wrap (basic_block bb, rtx_insn *insn, edge live_edge; /* Look for a simple register copy. */ - set = single_set (insn); + set = single_set_no_clobber_use (insn); if (!set) return false; src = SET_SRC (set);
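The effect of the new fail_on_clobber_use parameter can be sketched with a toy model. This is not GCC code: the enum toy_code and the function toy_single_set are hypothetical, an insn pattern is reduced to a flat array of element kinds, and the real single_set_2 additionally consults REG_UNUSED notes and side effects to decide whether extra SETs can be ignored, which this sketch does not attempt.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the RTL codes that appear inside a PARALLEL.  */
enum toy_code { TOY_SET, TOY_USE, TOY_CLOBBER };

/* Return the index of the only SET among the N pattern elements ELEMS,
   or -1 if there is no SET or more than one.  When FAIL_ON_CLOBBER_USE
   is true, any USE or CLOBBER also makes the function return -1; when it
   is false they are skipped, as the plain single_set does.  */
int
toy_single_set (const enum toy_code *elems, size_t n,
                bool fail_on_clobber_use)
{
  int found = -1;
  size_t i;

  for (i = 0; i < n; i++)
    switch (elems[i])
      {
      case TOY_USE:
      case TOY_CLOBBER:
        if (fail_on_clobber_use)
          return -1;
        break;

      case TOY_SET:
        if (found != -1)
          return -1; /* A second SET: not a single set in this toy model.  */
        found = (int) i;
        break;
      }

  return found;
}
```

In this model, a pattern made of one SET plus one CLOBBER counts as a single set for ordinary callers but is rejected when the strict flag is passed, which is the behaviour the shrink-wrap code asks for through single_set_no_clobber_use.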
libcilkrts: Remove unused function __cilkrts_sysdep_is_worker_thread_id (was: Cilk Library)
Hi! On Wed, 9 Oct 2013 18:32:11 +, Iyer, Balaji V balaji.v.i...@intel.com wrote: [libcilkrts] I have found a function that is -- as far as I can tell -- unused, and I'm thus proposing to remove it. This increases portability, as this code has dependencies on the operating system. Tested on x86 GNU/Hurd, and x86_64 GNU/Linux is in progress. OK for trunk once testing completed? commit 4f32339be3c95330b7fcd3bc6bb520a7401aa510 Author: Thomas Schwinge tho...@codesourcery.com Date: Sat Sep 20 19:53:56 2014 +0200 libcilkrts: Remove unused function __cilkrts_sysdep_is_worker_thread_id. libcilkrts/ * runtime/sysdep-unix.c (__cilkrts_sysdep_is_worker_thread_id): Remove function. --- libcilkrts/runtime/sysdep-unix.c | 22 -- 1 file changed, 22 deletions(-) diff --git libcilkrts/runtime/sysdep-unix.c libcilkrts/runtime/sysdep-unix.c index 1f82b62..b9f1ad0 100644 --- libcilkrts/runtime/sysdep-unix.c +++ libcilkrts/runtime/sysdep-unix.c @@ -571,28 +571,6 @@ void __cilkrts_make_unrunnable_sysdep(__cilkrts_worker *w, } } -/* - * __cilkrts_sysdep_is_worker_thread_id - * - * Returns true if the thread ID specified matches the thread ID we saved - * for a worker. - */ - -int __cilkrts_sysdep_is_worker_thread_id(global_state_t *g, - int i, - void *thread_id) -{ -#if defined( __linux__) || defined(__VXWORKS__) -pthread_t tid = *(pthread_t *)thread_id; -if (i < 0 || i > g->total_workers) -return 0; -return g->sysdep->threads[i] == tid; -#else -// Needs to be implemented -return 0; -#endif -} - Regards, Thomas
libcilkrts: Use AC_USE_SYSTEM_EXTENSIONS (was: Cilk Library)
Hi! On Wed, 9 Oct 2013 18:32:11 +, Iyer, Balaji V balaji.v.i...@intel.com wrote: [libcilkrts] Here is a patch to have libcilkrts use AC_USE_SYSTEM_EXTENSIONS (as other libraries are doing) instead of manually fiddling with the _GNU_SOURCE definition. This increases portability, as most of those definitions are currently hard-coded for __linux__ only. Tested on x86 GNU/Hurd, and x86_64 GNU/Linux is in progress. OK for trunk once testing completed? commit 18a4ed57818379684dc4ce86f5d20141e0a6040d Author: Thomas Schwinge tho...@codesourcery.com Date: Sat Sep 20 19:49:59 2014 +0200 libcilkrts: Use AC_USE_SYSTEM_EXTENSIONS. libcilkrts/ * configure.ac (AC_USE_SYSTEM_EXTENSIONS): Instantiate. HAVE_PTHREAD_AFFINITY_NP: Don't define _GNU_SOURCE. * configure: Regenerate. * runtime/os-unix.c [__linux__] (_GNU_SOURCE): Don't define. * runtime/sysdep-unix.c [__linux__] (_GNU_SOURCE): Likewise. --- libcilkrts/configure | 3479 +++--- libcilkrts/configure.ac |6 +- libcilkrts/runtime/os-unix.c |7 - libcilkrts/runtime/sysdep-unix.c |7 - 4 files changed, 2091 insertions(+), 1408 deletions(-) diff --git libcilkrts/configure libcilkrts/configure index 1e8eabd..be96533 100644 --- libcilkrts/configure +++ libcilkrts/configure @@ -632,9 +632,6 @@ MAC_LINKER_SCRIPT_TRUE LINUX_LINKER_SCRIPT_FALSE LINUX_LINKER_SCRIPT_TRUE config_dir -EGREP -GREP -CPP ALLOCA multi_basedir am__fastdepCXX_FALSE @@ -643,6 +640,9 @@ CXXDEPMODE ac_ct_CXX CXXFLAGS CXX +MAINT +MAINTAINER_MODE_FALSE +MAINTAINER_MODE_TRUE am__fastdepCC_FALSE am__fastdepCC_TRUE CCDEPMODE @@ -652,16 +652,6 @@ AMDEP_TRUE am__quote am__include DEPDIR -OBJEXT -EXEEXT -ac_ct_CC -CPPFLAGS -LDFLAGS -CFLAGS -CC -MAINT -MAINTAINER_MODE_FALSE -MAINTAINER_MODE_TRUE am__untar am__tar AMTAR @@ -685,6 +675,16 @@ am__isrc INSTALL_DATA INSTALL_SCRIPT INSTALL_PROGRAM +EGREP +GREP +CPP +OBJEXT +EXEEXT +ac_ct_CC +CPPFLAGS +LDFLAGS +CFLAGS +CC target_os target_vendor target_cpu @@ -738,8 +738,8 @@ SHELL' ac_subst_files='' ac_user_opts=' 
enable_option_checking -enable_maintainer_mode enable_dependency_tracking +enable_maintainer_mode enable_multilib enable_version_specific_runtime_libs enable_shared @@ -757,10 +757,10 @@ CFLAGS LDFLAGS LIBS CPPFLAGS +CPP CXX CXXFLAGS CCC -CPP CXXCPP' @@ -1383,10 +1383,10 @@ Optional Features: --disable-option-checking ignore unrecognized --enable/--with options --disable-FEATURE do not include FEATURE (same as --enable-FEATURE=no) --enable-FEATURE[=ARG] include FEATURE [ARG=yes] - --enable-maintainer-mode enable make rules and dependencies not useful - (and sometimes confusing) to the casual installer --disable-dependency-tracking speeds up one-time build --enable-dependency-tracking do not reject slow dependency extractors + --enable-maintainer-mode enable make rules and dependencies not useful + (and sometimes confusing) to the casual installer --enable-multilib build many library versions (default) --enable-version-specific-runtime-libs Specify that runtime libraries should be installed @@ -1412,9 +1412,9 @@ Some influential environment variables: LIBSlibraries to pass to the linker, e.g. -llibrary CPPFLAGSC/C++/Objective C preprocessor flags, e.g. -Iinclude dir if you have headers in a nonstandard directory include dir + CPP C preprocessor CXX C++ compiler command CXXFLAGSC++ compiler flags - CPP C preprocessor CXXCPP C++ preprocessor Use these variables to override the choices made by `configure' or to help @@ -1535,6 +1535,209 @@ fi } # ac_fn_c_try_compile +# ac_fn_c_try_cpp LINENO +# -- +# Try to preprocess conftest.$ac_ext, and return whether this succeeded. +ac_fn_c_try_cpp () +{ + as_lineno=${as_lineno-$1} as_lineno_stack=as_lineno_stack=$as_lineno_stack + if { { ac_try=$ac_cpp conftest.$ac_ext +case (($ac_try in + *\* | *\`* | *\\*) ac_try_echo=\$ac_try;; + *) ac_try_echo=$ac_try;; +esac +eval ac_try_echo=\\$as_me:${as_lineno-$LINENO}: $ac_try_echo\ +$as_echo $ac_try_echo; } 5 + (eval $ac_cpp conftest.$ac_ext) 2conftest.err + ac_status=$? 
+ if test -s conftest.err; then +grep -v '^ *+' conftest.err conftest.er1 +cat conftest.er1 5 +mv -f conftest.er1 conftest.err + fi + $as_echo $as_me:${as_lineno-$LINENO}: \$? = $ac_status 5 + test $ac_status = 0; } /dev/null { +test -z $ac_c_preproc_warn_flag$ac_c_werror_flag || +test ! -s conftest.err + }; then : + ac_retval=0 +else + $as_echo $as_me: failed program was: 5 +sed 's/^/| /' conftest.$ac_ext 5 + +ac_retval=1 +fi + eval $as_lineno_stack; test x$as_lineno_stack = x { as_lineno=; unset as_lineno;} + return $ac_retval + +} # ac_fn_c_try_cpp + +#
libcilkrts: GNU toolchain, GNU linker scripts (was: Cilk Library)
Hi! On Wed, 9 Oct 2013 18:32:11 +, Iyer, Balaji V balaji.v.i...@intel.com wrote: [libcilkrts] As requested during patch review, symbol versioning infrastructure has been added to libcilkrts. However, this is currently described/implemented as Linux-only, while in fact it's standard GNU linker scripts, generally supported with the GNU toolchain, so I'm proposing to change this as follows. This increases portability. Tested on x86 GNU/Hurd, and x86_64 GNU/Linux is in progress. OK for trunk once testing completed? commit 44e41129a59a4f69d26923a6fa6091902ae584b2 Author: Thomas Schwinge tho...@codesourcery.com Date: Sat Sep 20 19:16:44 2014 +0200 libcilkrts: GNU toolchain, GNU linker scripts. libcilkrts/ * configure.ac (linux_linker_script): Rename to gnu_linker_script. Also set for *-*-gnu*. (LINUX_LINKER_SCRIPT): Rename to GNU_LINKER_SCRIPT. Adapt all users. * configure: Regenerate. * Makefile.in: Regenerate. * runtime/linux-symbols.ver: Rename to runtime/gnu-symbols.ver. --- libcilkrts/Makefile.am | 6 ++-- libcilkrts/Makefile.in | 4 +-- libcilkrts/configure | 32 -- libcilkrts/configure.ac| 12 .../runtime/{linux-symbols.ver = gnu-symbols.ver} | 0 5 files changed, 29 insertions(+), 25 deletions(-) diff --git libcilkrts/Makefile.am libcilkrts/Makefile.am index 70538a2..e77dfa6 100644 --- libcilkrts/Makefile.am +++ libcilkrts/Makefile.am @@ -95,9 +95,9 @@ libcilkrts_la_LDFLAGS = -version-info 5:0:0 libcilkrts_la_LDFLAGS += @lt_cv_dlopen_libs@ libcilkrts_la_LDFLAGS += $(AM_LDFLAGS) -# If we're building on Linux, use the Linux version script -if LINUX_LINKER_SCRIPT -libcilkrts_la_LDFLAGS += -Wl,--version-script,$(srcdir)/runtime/linux-symbols.ver +# If we're building with a GNU toolchain, use the GNU version script. 
+if GNU_LINKER_SCRIPT + libcilkrts_la_LDFLAGS += -Wl,--version-script,$(srcdir)/runtime/gnu-symbols.ver endif # If we're building on MacOS, use the Mac versioning diff --git libcilkrts/Makefile.in libcilkrts/Makefile.in index e1a54b5..dd482b4 100644 --- libcilkrts/Makefile.in +++ libcilkrts/Makefile.in @@ -115,8 +115,8 @@ DIST_COMMON = $(srcdir)/include/internal/rev.mk README ChangeLog \ $(srcdir)/../mkinstalldirs $(srcdir)/libcilkrts.spec.in \ $(srcdir)/../depcomp -# If we're building on Linux, use the Linux version script -@LINUX_LINKER_SCRIPT_TRUE@am__append_1 = -Wl,--version-script,$(srcdir)/runtime/linux-symbols.ver +# If we're building with a GNU toolchain, use the GNU version script. +@GNU_LINKER_SCRIPT_TRUE@am__append_1 = -Wl,--version-script,$(srcdir)/runtime/gnu-symbols.ver # If we're building on MacOS, use the Mac versioning @MAC_LINKER_SCRIPT_TRUE@am__append_2 = -Wl,-exported_symbols_list,$(srcdir)/runtime/mac-symbols.txt diff --git libcilkrts/configure libcilkrts/configure index be96533..b75533c 100644 --- libcilkrts/configure +++ libcilkrts/configure @@ -629,8 +629,8 @@ SED LIBTOOL MAC_LINKER_SCRIPT_FALSE MAC_LINKER_SCRIPT_TRUE -LINUX_LINKER_SCRIPT_FALSE -LINUX_LINKER_SCRIPT_TRUE +GNU_LINKER_SCRIPT_FALSE +GNU_LINKER_SCRIPT_TRUE config_dir ALLOCA multi_basedir @@ -5647,19 +5647,21 @@ case ${target} in esac -# We have linker scripts for appropriate operating systems -linux_linker_script=no +# We have linker scripts for appropriate toolchains. + +gnu_linker_script=no case ${host} in -*-*-linux*) -linux_linker_script=yes +*-*-gnu* | *-*-linux*) + # Assume a GNU toolchain. 
+gnu_linker_script=yes ;; esac - if test $linux_linker_script = yes; then - LINUX_LINKER_SCRIPT_TRUE= - LINUX_LINKER_SCRIPT_FALSE='#' + if test $gnu_linker_script = yes; then + GNU_LINKER_SCRIPT_TRUE= + GNU_LINKER_SCRIPT_FALSE='#' else - LINUX_LINKER_SCRIPT_TRUE='#' - LINUX_LINKER_SCRIPT_FALSE= + GNU_LINKER_SCRIPT_TRUE='#' + GNU_LINKER_SCRIPT_FALSE= fi @@ -11755,7 +11757,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat conftest.$ac_ext _LT_EOF -#line 11758 configure +#line 11760 configure #include confdefs.h #if HAVE_DLFCN_H @@ -11861,7 +11863,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat conftest.$ac_ext _LT_EOF -#line 11864 configure +#line 11866 configure #include confdefs.h #if HAVE_DLFCN_H @@ -15379,8 +15381,8 @@ if test -z ${am__fastdepCXX_TRUE} test -z ${am__fastdepCXX_FALSE}; then as_fn_error conditional \am__fastdepCXX\ was never defined. Usually this means the macro was only invoked conditionally. $LINENO 5 fi -if test -z ${LINUX_LINKER_SCRIPT_TRUE} test -z ${LINUX_LINKER_SCRIPT_FALSE}; then - as_fn_error conditional \LINUX_LINKER_SCRIPT\ was never defined. +if test -z ${GNU_LINKER_SCRIPT_TRUE} test -z ${GNU_LINKER_SCRIPT_FALSE}; then +
libcilkrts: GNU Hurd port, and some code cleanup/consolidation (was: Cilk Library)
Hi! On Wed, 9 Oct 2013 18:32:11 +, Iyer, Balaji V balaji.v.i...@intel.com wrote: [libcilkrts] Currently, by means of the libcilkrts/configure.tgt file that has been added during patch review, libcilkrts is attempted to be built for all *-*-gnu* system, but it has actually only been ported to GNU/Linux. This is Debian bug http://bugs.debian.org/734973. Here is a basic GNU Hurd port, and some code cleanup/consolidation. Tested on x86 GNU/Hurd, and x86_64 GNU/Linux is in progress. OK for trunk once testing completed? commit ca8d437e22c659aa6a8d2d57afd9e3944f9b33ce Author: Thomas Schwinge tho...@codesourcery.com Date: Sun Sep 21 20:35:49 2014 +0200 libcilkrts: GNU Hurd port, and some code cleanup/consolidation. libcilkrts/ * runtime/cilk_malloc.c: Consider __GLIBC__ next to __linux__. * runtime/os-unix.c: Basic port for __GNU__. Apply some code cleanup/consolidation. --- libcilkrts/runtime/cilk_malloc.c | 2 +- libcilkrts/runtime/os-unix.c | 45 +++- 2 files changed, 22 insertions(+), 25 deletions(-) diff --git libcilkrts/runtime/cilk_malloc.c libcilkrts/runtime/cilk_malloc.c index 9d02c52..d3de756 100644 --- libcilkrts/runtime/cilk_malloc.c +++ libcilkrts/runtime/cilk_malloc.c @@ -39,7 +39,7 @@ #include cilk_malloc.h #include stdlib.h -#if defined _WIN32 || defined _WIN64 || defined __linux__ +#if defined _WIN32 || defined _WIN64 || defined __GLIBC__ || defined __linux__ #include malloc.h #define HAS_MEMALIGN 1 #endif diff --git libcilkrts/runtime/os-unix.c libcilkrts/runtime/os-unix.c index 229c438..70acb14 100644 --- libcilkrts/runtime/os-unix.c +++ libcilkrts/runtime/os-unix.c @@ -47,12 +47,6 @@ #elif defined __APPLE__ # include sys/sysctl.h // Uses sysconf(_SC_NPROCESSORS_ONLN) in verbose output -#elif defined __DragonFly__ -// No additional include files -#elif defined __FreeBSD__ -// No additional include files -#elif defined __CYGWIN__ -// Cygwin on Windows - no additional include files #elif defined __VXWORKS__ # include vxWorks.h # include vxCpuLib.h @@ -60,6 
+54,9 @@ // Solaris #elif defined __sun__ defined __svr4__ # include sched.h +#elif defined __CYGWIN__ || defined __DragonFly__ || defined __FreeBSD__ \ + || defined __GNU__ +// No additional include files. #else # error Unsupported OS #endif @@ -349,7 +346,12 @@ static int linux_get_affinity_count (int tid) COMMON_SYSDEP int __cilkrts_hardware_cpu_count(void) { -#if defined __ANDROID__ || (defined(__sun__) defined(__svr4__)) +#if defined __ANDROID__ \ + || defined __CYGWIN__ \ + || defined __DragonFly__ \ + || defined __FreeBSD__ \ + || defined __GNU__ \ + || (defined (__sun__) defined (__svr4__)) return sysconf (_SC_NPROCESSORS_ONLN); #elif defined __MIC__ /// HACK: Usually, the 3rd and 4th hyperthreads are not beneficial @@ -369,16 +371,10 @@ COMMON_SYSDEP int __cilkrts_hardware_cpu_count(void) assert((unsigned)count == count); return count; -#elif defined __FreeBSD__ || defined __CYGWIN__ || defined __DragonFly__ -int ncores = sysconf(_SC_NPROCESSORS_ONLN); - -return ncores; -// Just get the number of processors -//return sysconf(_SC_NPROCESSORS_ONLN); #elif defined __VXWORKS__ return __builtin_popcount( vxCpuEnabledGet() ); #else -#error Unknown architecture +# error Unsupported architecture #endif } @@ -393,13 +389,16 @@ COMMON_SYSDEP void __cilkrts_sleep(void) COMMON_SYSDEP void __cilkrts_yield(void) { -#if __APPLE__ || __FreeBSD__ || __VXWORKS__ -// On MacOS, call sched_yield to yield quantum. I'm not sure why we +#if defined (__ANDROID__) \ + || __APPLE__ \ + || defined (__DragonFly__) \ + || __FreeBSD__ \ + || defined (__GNU__) \ + || (defined (__sun__) defined (__svr4__)) \ + || __VXWORKS__ +// Call sched_yield to yield quantum. I'm not sure why we // don't do this on Linux also. sched_yield(); -#elif defined(__DragonFly__) -// On DragonFly BSD, call sched_yield to yield quantum. -sched_yield(); #elif defined(__MIC__) // On MIC, pthread_yield() really trashes things. 
Arch's measurements // showed that calling _mm_delay_32() (or doing nothing) was a better @@ -407,14 +406,12 @@ COMMON_SYSDEP void __cilkrts_yield(void) // giving up the processor and latency starting up when work becomes // available _mm_delay_32(1024); -#elif defined(__ANDROID__) || (defined(__sun__) defined(__svr4__)) -// On Android and Solaris, call sched_yield to yield quantum. I'm not -// sure why we don't do this on Linux also. -sched_yield(); -#else +#elif defined __linux__ // On Linux, call pthread_yield (which in turn will call sched_yield) // to yield quantum. pthread_yield(); +#else +# error Unsupported architecture #endif } Grüße, Thomas pgpU8anISy7j1.pgp Description: PGP signature
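For readers following the consolidation in the patch above: most of the merged branches boil down to two calls, sysconf (_SC_NPROCESSORS_ONLN) for the CPU count and sched_yield () for giving up the processor. Below is a minimal sketch of that pattern, assuming a system that provides both (the _SC_NPROCESSORS_ONLN name is a widely available extension rather than something every platform guarantees). The helper names demo_cpu_count and demo_yield are hypothetical and are not part of libcilkrts.

```c
#include <assert.h>
#include <sched.h>
#include <unistd.h>

/* Hypothetical helper (not libcilkrts code): number of processors
   currently online, falling back to 1 if the system cannot say.  */
static int
demo_cpu_count (void)
{
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  int count = n > 0 ? (int) n : 1;

  /* Same sanity idea as the runtime: the value must fit in an int.  */
  assert (count >= 1);
  return count;
}

/* Hypothetical helper: give up the rest of the current time slice.
   Returns 0 on success, as sched_yield does.  */
static int
demo_yield (void)
{
  return sched_yield ();
}
```

The sketch deliberately has no per-OS #if ladder; the ladder in os-unix.c exists to select between this generic path and the special cases (MIC, Linux affinity counting, VxWorks).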
RE: libcilkrts: Remove unused function __cilkrts_sysdep_is_worker_thread_id (was: Cilk Library)
It's a remnant from something we were attempting to support (and abandoned) 4 years ago. I'm fine with removing it from the runtime.

Igor, I'll make the change and send you a new copy.

  - Barry

-----Original Message-----
From: Thomas Schwinge [mailto:tho...@codesourcery.com]
Sent: Monday, September 29, 2014 2:13 PM
To: Iyer, Balaji V; Tannenbaum, Barry M; Zamyatin, Igor
Cc: gcc-patches@gcc.gnu.org
Subject: libcilkrts: Remove unused function __cilkrts_sysdep_is_worker_thread_id (was: Cilk Library)

Hi!

On Wed, 9 Oct 2013 18:32:11 +, Iyer, Balaji V balaji.v.i...@intel.com wrote:
> [libcilkrts]

I have found a function that is -- as far as I can tell -- unused, and
I'm thus proposing to remove it.  This increases portability, as this
code has dependencies on the operating system.

Tested on x86 GNU/Hurd, and x86_64 GNU/Linux is in progress.  OK for
trunk once testing completed?

commit 4f32339be3c95330b7fcd3bc6bb520a7401aa510
Author: Thomas Schwinge tho...@codesourcery.com
Date:   Sat Sep 20 19:53:56 2014 +0200

    libcilkrts: Remove unused function __cilkrts_sysdep_is_worker_thread_id.

        libcilkrts/
        * runtime/sysdep-unix.c (__cilkrts_sysdep_is_worker_thread_id):
        Remove function.
---
 libcilkrts/runtime/sysdep-unix.c | 22 --
 1 file changed, 22 deletions(-)

diff --git libcilkrts/runtime/sysdep-unix.c libcilkrts/runtime/sysdep-unix.c
index 1f82b62..b9f1ad0 100644
--- libcilkrts/runtime/sysdep-unix.c
+++ libcilkrts/runtime/sysdep-unix.c
@@ -571,28 +571,6 @@ void __cilkrts_make_unrunnable_sysdep(__cilkrts_worker *w,
     }
 }
 
-/*
- * __cilkrts_sysdep_is_worker_thread_id
- *
- * Returns true if the thread ID specified matches the thread ID we saved
- * for a worker.
- */
-
-int __cilkrts_sysdep_is_worker_thread_id(global_state_t *g,
-                                         int i,
-                                         void *thread_id)
-{
-#if defined( __linux__) || defined(__VXWORKS__)
-    pthread_t tid = *(pthread_t *)thread_id;
-    if (i < 0 || i > g->total_workers)
-        return 0;
-    return g->sysdep->threads[i] == tid;
-#else
-    // Needs to be implemented
-    return 0;
-#endif
-}
-


Grüße,
 Thomas
Re: [PATCH] PR63404, gcc 5 miscompiles linux block layer
On 09/29/2014 11:12 AM, Jiong Wang wrote:
> +inline rtx single_set_no_clobber_use (const rtx_insn *insn)
> +{
> +  if (!INSN_P (insn))
> +    return NULL_RTX;
> +
> +  if (GET_CODE (PATTERN (insn)) == SET)
> +    return PATTERN (insn);
> +
> +  /* Defer to the more expensive case, and return NULL_RTX if there is
> +     USE or CLOBBER.  */
> +  return single_set_2 (insn, PATTERN (insn), true);
>  }

What more expensive case?

If you're disallowing USE and CLOBBER, then single_set is just GET_CODE == SET.

I think this function is somewhat useless, and should not be added.

An adjustment to move_insn_for_shrink_wrap may be reasonable though. I haven't
tried to understand the miscompilation yet. I can imagine that this would
disable quite a bit of shrink wrapping for x86 though. Can we do better in
understanding when the clobbered register is live at the location to which we'd
like to move the insns?


r~
Re: __intN patch 3/5: main __int128 - __intN conversion.
On 29/09/14 14:06 -0400, DJ Delorie wrote:
>> Just one question about the include/std/limits changes below. It seems
>> that __glibcxx_signed_b isn't strictly necessary as it doesn't use the
>> B argument, so is it just there for consistency?
>
> Yup.

OK, thanks for confirming.
Re: [PATCH] Fix finding default baseline symbols directory
On 29/09/14 19:24 +0200, Andreas Schwab wrote:
> Jonathan Wakely jwak...@redhat.com writes:
>
>> Would a safer change be to just add a new pattern for aarch64?
>>
>> --- a/libstdc++-v3/configure.host
>> +++ b/libstdc++-v3/configure.host
>> @@ -345,6 +345,9 @@ case ${host} in
>>        x86_64)
>>          abi_baseline_pair=x86_64-linux-gnu
>>          ;;
>> +      aarch64)
>> +        abi_baseline_pair=aarch64-linux-gnu
>> +        ;;
>>        *)
>>          if test -d ${glibcxx_srcdir}/config/abi/post/${try_cpu}-linux-gnu; then
>>            abi_baseline_pair=${try_cpu}-linux-gnu
>
> IMHO it doesn't make sense to use try_cpu here if it is generic.
>
>     * configure.host (abi_baseline_pair): If try_cpu is generic use
>     host_cpu for the default.
>
> diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
> index a12871a..d1298c4 100644
> --- a/libstdc++-v3/configure.host
> +++ b/libstdc++-v3/configure.host
> @@ -346,8 +346,13 @@ case ${host} in
>          abi_baseline_pair=x86_64-linux-gnu
>          ;;
>        *)
> -        if test -d ${glibcxx_srcdir}/config/abi/post/${try_cpu}-linux-gnu; then
> -          abi_baseline_pair=${try_cpu}-linux-gnu
> +        if test $try_cpu = generic; then
> +          try_abi_cpu=$host_cpu
> +        else
> +          try_abi_cpu=$try_cpu
> +        fi
> +        if test -d ${glibcxx_srcdir}/config/abi/post/${try_abi_cpu}-linux-gnu; then
> +          abi_baseline_pair=${try_abi_cpu}-linux-gnu
>          fi
>      esac
>      case ${host} in

Yes, that looks sensible to me - OK for trunk, thanks.
[debug-early] rearrange some checks in gen_subprogram_die
I'm rearranging some code in Michael's original patch to minimize the
difference with mainline.

It seems that the check for DECL_STRUCT_FUNCTION (decl)->gimple_df was
merely a check to see if we had already set the FDE bits for the decl in
question.  I've moved the check inside the original DECL_EXTERNAL check,
thus making it obvious what is being accomplished.

I also got rid of mainline's gcc_checking_assert of fun.  We're
dereferencing it immediately after.  That should be enough to trigger an
ICE.

Also I removed Michael's check for DECL_STRUCT_FUNCTION(decl), since
mainline drops into this codepath regardless, and has/had that
gcc_checking_assert anyhow.

No regressions.  Committed to branch.

Aldy

commit a23ae1821d5c45b2c56ea5db6940ea8982f8fd69
Author: Aldy Hernandez al...@redhat.com
Date:   Mon Sep 29 11:48:41 2014 -0700

        * dwarf2out.c (gen_subprogram_die): Do not check
        DECL_STRUCT_FUNCTION, thus leaving the check as mainline.
        Test for fun->fde instead of fun->gimple_df.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 41c4feb..c92101f 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -18441,9 +18441,7 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
         equate_decl_number_to_die (decl, subr_die);
     }
-  else if (!DECL_EXTERNAL (decl)
-           && (!DECL_STRUCT_FUNCTION (decl)
-               || DECL_STRUCT_FUNCTION (decl)->gimple_df))
+  else if (!DECL_EXTERNAL (decl))
     {
       HOST_WIDE_INT cfa_fb_offset;
 
@@ -18452,7 +18450,9 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
       if (!old_die || !get_AT (old_die, DW_AT_inline))
         equate_decl_number_to_die (decl, subr_die);
 
-      gcc_checking_assert (fun);
+      if (!fun->fde)
+        goto no_fde_continue;
+
       if (!flag_reorder_blocks_and_partition)
         {
           dw_fde_ref fde = fun->fde;
@@ -18608,11 +18608,8 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
       if (fun->static_chain_decl)
         add_AT_location_description (subr_die, DW_AT_static_link,
                  loc_list_from_tree (fun->static_chain_decl, 2));
-    }
-  else if (!DECL_EXTERNAL (decl))
-    {
-      if (!old_die || !get_AT (old_die, DW_AT_inline))
-        equate_decl_number_to_die (decl, subr_die);
+no_fde_continue:
+      ;
     }
 
   /* Generate child dies for template paramaters.  */
Re: [PATCH] PR63404, gcc 5 miscompiles linux block layer
On 29/09/14 19:32, Richard Henderson wrote:
> On 09/29/2014 11:12 AM, Jiong Wang wrote:
>> +inline rtx single_set_no_clobber_use (const rtx_insn *insn)
>> +{
>> +  if (!INSN_P (insn))
>> +    return NULL_RTX;
>> +
>> +  if (GET_CODE (PATTERN (insn)) == SET)
>> +    return PATTERN (insn);
>> +
>> +  /* Defer to the more expensive case, and return NULL_RTX if there is
>> +     USE or CLOBBER.  */
>> +  return single_set_2 (insn, PATTERN (insn), true);
>>  }

Richard,

thanks for the review.

> What more expensive case?

single_set_no_clobber_use is just a clone of single_set; I copied the comments
with only minor modifications.

I think "the more expensive case" here means the case where there is a
PARALLEL whose inner rtxes we need to check.

> If you're disallowing USE and CLOBBER, then single_set is just GET_CODE == SET.
>
> I think this function is somewhat useless, and should not be added.
>
> An adjustment to move_insn_for_shrink_wrap may be reasonable though. I haven't
> tried to understand the miscompilation yet. I can imagine that this would
> disable quite a bit of shrink wrapping for x86 though.

I don't think so. In the x86-64 bootstrap, there is no regression in the number
of functions shrink-wrapped. Strictly speaking, previously only a single
"mov dest, src" was handled, so disallowing USE/CLOBBER will not remove any
shrink-wrap opportunity that was allowed previously.

I am also afraid that if we don't reuse single_set_2, there will be another
loop to check all those inner rtxes, which single_set_2 already does. So, IMHO,
just modifying single_set_2 will be more efficient.

> Can we do better in understanding when the clobbered register is live at the
> location to which we'd like to move the insns?

Currently, the generic code in move_insn_for_shrink_wrap only handles the case
where dest and src are each a single register, so if there is a CLOBBER or USE
we might need to check multiple regs, which might need a few modifications. I
think that is better done after all the single dest/src issues are fixed.

--
Jiong

> r~
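To make the point about reusing the existing loop concrete, here is a toy model of the idea, not GCC code: the pattern is reduced to an array of element codes, and a flag makes the scan give up as soon as it sees a USE or CLOBBER. All toy_* names are hypothetical, and the real single_set_2 also consults REG_UNUSED notes when there are several SETs, which this sketch deliberately leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the rtx codes that matter here.  */
enum toy_code { TOY_SET, TOY_USE, TOY_CLOBBER };

struct toy_elt
{
  enum toy_code code;
};

/* Return the only SET among ELTS[0..N-1], or NULL if there is none or
   more than one.  When FAIL_ON_CLOBBER_USE is nonzero, the presence of
   any USE or CLOBBER also yields NULL, in the same single pass.  */
static const struct toy_elt *
toy_single_set (const struct toy_elt *elts, size_t n,
                int fail_on_clobber_use)
{
  const struct toy_elt *set = NULL;

  assert (elts != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    switch (elts[i].code)
      {
      case TOY_USE:
      case TOY_CLOBBER:
        if (fail_on_clobber_use)
          return NULL;
        break;

      case TOY_SET:
        if (set != NULL)
          return NULL;      /* A second SET: not a single set.  */
        set = &elts[i];
        break;
      }

  return set;
}
```

With the flag clear, a SET followed by a CLOBBER still counts as a single set; with the flag set, the same input is rejected without a second walk over the elements.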
Add myself as libstdc++ special modes maintainer
I added myself as libstdc++ special modes maintainer. Special modes are debug, profile and parallel modes. Thanks for your trust. François
[PATCH, rs6000, libcpp] Revise search_line_fast to avoid old unaligned load sequences
Hi, The vec_lvsl and vec_lvsr interfaces are deprecated for little-endian Power, and really should not be used on big-endian Power either when the target CPU is power8 or above. The lexer in libcpp currently makes use of these interfaces in search_line_fast(). This patch provides a new version of search_line_fast() that allows unaligned loads to be handled by the hardware. The new version is used when _ARCH_PWR8 and __ALTIVEC__ are defined. Otherwise, the older version may be used; however it is now restricted for use only on big-endian systems. If we are targeting little-endian (which requires P8 or higher) and either Altivec support or Power8 architecture support has been disabled, then we revert to a slower search routine. This prevents ever using the deprecated instructions for little-endian code generation. I haven't added a new test case, as bootstrapping GCC is an excellent test of search_line_fast(), and that appears to be all we do at present for the existing implementations. Bootstrapped and tested on powerpc64le-unknown-linux-gnu and powerpc64-unknown-linux-gnu with no new regressions. Is this ok for trunk? Thanks, Bill 2014-09-29 Bill Schmidt wschm...@linux.vnet.ibm.com * lex.c (search_line_fast): Add new version to be used for Power8 and later targets when Altivec is enabled. Restrict the existing Altivec version to big-endian systems so that lvsr is not used on little endian, where it is deprecated. Remove LE-specific code from the now-BE-only version. Index: libcpp/lex.c === --- libcpp/lex.c(revision 215683) +++ libcpp/lex.c(working copy) @@ -513,9 +513,111 @@ init_vectorized_lexer (void) search_line_fast = impl; } -#elif (GCC_VERSION = 4005) defined(__ALTIVEC__) +#elif defined(_ARCH_PWR8) defined(__ALTIVEC__) -/* A vection of the fast scanner using AltiVec vectorized byte compares. */ +/* A vection of the fast scanner using AltiVec vectorized byte compares + and VSX unaligned loads (when VSX is available). 
This is otherwise + the same as the pre-GCC 5 version. */ + +static const uchar * +search_line_fast (const uchar *s, const uchar *end ATTRIBUTE_UNUSED) +{ + typedef __attribute__((altivec(vector))) unsigned char vc; + + const vc repl_nl = { +'\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', +'\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n' + }; + const vc repl_cr = { +'\r', '\r', '\r', '\r', '\r', '\r', '\r', '\r', +'\r', '\r', '\r', '\r', '\r', '\r', '\r', '\r' + }; + const vc repl_bs = { +'\\', '\\', '\\', '\\', '\\', '\\', '\\', '\\', +'\\', '\\', '\\', '\\', '\\', '\\', '\\', '\\' + }; + const vc repl_qm = { +'?', '?', '?', '?', '?', '?', '?', '?', +'?', '?', '?', '?', '?', '?', '?', '?', + }; + const vc zero = { 0 }; + + vc data, t; + + /* Main loop processing 16 bytes at a time. */ + do +{ + vc m_nl, m_cr, m_bs, m_qm; + + data = *((const vc *)s); + s += 16; + + m_nl = (vc) __builtin_vec_cmpeq(data, repl_nl); + m_cr = (vc) __builtin_vec_cmpeq(data, repl_cr); + m_bs = (vc) __builtin_vec_cmpeq(data, repl_bs); + m_qm = (vc) __builtin_vec_cmpeq(data, repl_qm); + t = (m_nl | m_cr) | (m_bs | m_qm); + + /* T now contains 0xff in bytes for which we matched one of the relevant +characters. We want to exit the loop if any byte in T is non-zero. +Below is the expansion of vec_any_ne(t, zero). */ +} + while (!__builtin_vec_vcmpeq_p(/*__CR6_LT_REV*/3, t, zero)); + + /* Restore s to to point to the 16 bytes we just processed. */ + s -= 16; + + { +#define N (sizeof(vc) / sizeof(long)) + +union { + vc v; + /* Statically assert that N is 2 or 4. */ + unsigned long l[(N == 2 || N == 4) ? N : -1]; +} u; +unsigned long l, i = 0; + +u.v = t; + +/* Find the first word of T that is non-zero. 
*/ +switch (N) + { + case 4: + l = u.l[i++]; + if (l != 0) + break; + s += sizeof(unsigned long); + l = u.l[i++]; + if (l != 0) + break; + s += sizeof(unsigned long); + case 2: + l = u.l[i++]; + if (l != 0) + break; + s += sizeof(unsigned long); + l = u.l[i]; + } + +/* L now contains 0xff in bytes for which we matched one of the + relevant characters. We can find the byte index by finding + its bit index and dividing by 8. */ +#ifdef __BIG_ENDIAN__ +l = __builtin_clzl(l) 3; +#else +l = __builtin_ctzl(l) 3; +#endif +return s + l; + +#undef N + } +} + +#elif (GCC_VERSION = 4005) defined(__ALTIVEC__) defined (__BIG_ENDIAN__) + +/* A vection of the fast scanner using AltiVec vectorized byte compares. + This cannot be used for little endian because vec_lvsl/lvsr are + deprecated for little endian and the code won't work properly. */ /* ??? Unfortunately,
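The last step of the routine above, turning a non-zero word of match bytes into a byte offset, can be tried in isolation. Below is a minimal sketch, assuming GCC or Clang (for __builtin_clzl, __builtin_ctzl and the __BYTE_ORDER__ macro, which stands in here for the __BIG_ENDIAN__ test used in the patch). The helper names load_word and first_match_index are hypothetical and do not appear in lex.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: read one unsigned long from memory, the way the
   union in the patch reinterprets the vector as words.  */
static unsigned long
load_word (const unsigned char *p)
{
  unsigned long w;
  memcpy (&w, p, sizeof w);
  return w;
}

/* WORD holds 0xff in every byte that matched and 0 elsewhere, in memory
   order.  Return the memory index of the first matching byte: find the
   first set bit from the low-address end and divide by 8.  */
static unsigned
first_match_index (unsigned long word)
{
  assert (word != 0);   /* clz/ctz are undefined for zero.  */
#if defined (__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  /* Lowest address is the most significant byte.  */
  return (unsigned) __builtin_clzl (word) >> 3;
#else
  /* Lowest address is the least significant byte.  */
  return (unsigned) __builtin_ctzl (word) >> 3;
#endif
}
```

Because the word is loaded with memcpy rather than built from shifts, the same caller works on either endianness.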
Re: [PATCH] PR63404, gcc 5 miscompiles linux block layer
On Mon, 2014-09-29 at 20:24 +0100, Jiong Wang wrote:
> On 29/09/14 19:32, Richard Henderson wrote:
>> On 09/29/2014 11:12 AM, Jiong Wang wrote:
>>> +inline rtx single_set_no_clobber_use (const rtx_insn *insn)
>>> +{
>>> +  if (!INSN_P (insn))
>>> +    return NULL_RTX;
>>> +
>>> +  if (GET_CODE (PATTERN (insn)) == SET)
>>> +    return PATTERN (insn);
>>> +
>>> +  /* Defer to the more expensive case, and return NULL_RTX if there is
>>> +     USE or CLOBBER.  */
>>> +  return single_set_2 (insn, PATTERN (insn), true);
>>>  }
>
> Richard,
>
> thanks for the review.
>
>> What more expensive case?
>
> single_set_no_clobber_use is just a clone of single_set; I copied the comments
> with only minor modifications.

I introduced that comment to single_set, in r215089, when making
single_set into an inline function (so that it could check that it
received an rtx_insn *, rather than an rtx).

> I think "the more expensive case" here means the case where there is a
> PARALLEL whose inner rtxes we need to check.

My comment may have been misleading, sorry.  IIRC, what I was thinking
was that the old implementation had split single_set into a macro and a
function.  This was done by Honza (CCed), 14 years ago to the day, back
in r36664 (on 2000-09-29):
https://gcc.gnu.org/ml/gcc-patches/2000-09/msg00893.html

/* Single set is implemented as macro for performance reasons.  */
#define single_set(I) (INSN_P (I) \
                       ? (GET_CODE (PATTERN (I)) == SET \
                          ? PATTERN (I) : single_set_1 (I)) \
                       : NULL_RTX)

I think by "the more expensive case" I meant having to make a function
call to handle the less-common cases (which indeed covers the PARALLEL
case), rather than having logic inline; preserving that inlined vs
not-inlined split was one of my aims for r215089.

Perhaps it should be rewritten to "Defer to a function call to handle
the less common cases", or somesuch?

[...snip rest of post...]

Dave
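The inlined versus not-inlined split described above can be shown with a small stand-alone sketch. This is a toy illustration, not the GCC implementation: struct toy_insn and both functions are hypothetical, and the "slow path" here is trivially cheap where the real one walks a PARALLEL.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction: either a bare SET, or a group containing N_SETS
   sets (standing in for a PARALLEL).  */
struct toy_insn
{
  int is_bare_set;
  int n_sets;
};

/* Less common case, kept out of line so the inline wrapper stays
   small.  A group counts as a single set only if it has exactly one.  */
static const struct toy_insn *
toy_single_set_slow (const struct toy_insn *insn)
{
  assert (insn != NULL);
  return insn->n_sets == 1 ? insn : NULL;
}

/* Common case handled inline: a bare SET is returned directly, and
   everything else defers to a function call.  */
static inline const struct toy_insn *
toy_single_set (const struct toy_insn *insn)
{
  if (insn == NULL)
    return NULL;
  if (insn->is_bare_set)
    return insn;
  return toy_single_set_slow (insn);
}
```

The design choice is the same one the old macro made: callers pay for a call only when the cheap test fails, which is why the comment's "more expensive case" refers to the call rather than to any particular rtx shape.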
Re: libcilkrts: Remove unused function __cilkrts_sysdep_is_worker_thread_id
On 09/29/14 12:13, Thomas Schwinge wrote:
> Hi!
>
> On Wed, 9 Oct 2013 18:32:11 +, Iyer, Balaji V balaji.v.i...@intel.com wrote:
>> [libcilkrts]
>
> I have found a function that is -- as far as I can tell -- unused, and
> I'm thus proposing to remove it.  This increases portability, as this
> code has dependencies on the operating system.
>
> Tested on x86 GNU/Hurd, and x86_64 GNU/Linux is in progress.  OK for
> trunk once testing completed?
>
> commit 4f32339be3c95330b7fcd3bc6bb520a7401aa510
> Author: Thomas Schwinge tho...@codesourcery.com
> Date:   Sat Sep 20 19:53:56 2014 +0200
>
>     libcilkrts: Remove unused function __cilkrts_sysdep_is_worker_thread_id.
>
>         libcilkrts/
>         * runtime/sysdep-unix.c (__cilkrts_sysdep_is_worker_thread_id):
>         Remove function.

The Cilk+ runtime is shared with ICC and they'll need to pull in this
code first.  We can then pick it up via merges.

jeff
Re: libcilkrts: Remove unused function __cilkrts_sysdep_is_worker_thread_id
Hi!

On Mon, 29 Sep 2014 15:00:03 -0600, Jeff Law l...@redhat.com wrote:
> On 09/29/14 12:13, Thomas Schwinge wrote:
>>     libcilkrts/
>>     * [...]
>
> The Cilk+ runtime is shared with ICC and they'll need to pull in this
> code first.  We can then pick it up via merges.

Yeah, Barry is already guiding me through this process.


Grüße,
 Thomas
Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan
+Alexey Samsonov On Mon, Sep 29, 2014 at 10:43 AM, Jakub Jelinek ja...@redhat.com wrote: On Mon, Sep 29, 2014 at 09:21:11PM +0400, Yury Gribov wrote: This patch enables -fsanitize-recover for KASan by default. This causes KASan to continue execution after error in case of inline instrumentation. This feature is needed because - reports during early bootstrap won't even be printed - needed to run all tests w/o rebooting machine for every test - needed for interactive work on desktop This is the third version of patch which renames -fsanitize-recover to -fubsan-recover and introduces -fasan-recover (enabled by default for KASan). It also moves flag handling to finish_options per Jakub's request. As the -fsanitize-recover option comes from clang originally, I think this needs coordination with them (whether clang will also rename the option), and certainly keep -fsanitize-recover as a non-documented compat option alias for -fubsan-recover. So, can you please talk to the clang folks about it? Jakub
Re: [PATCH IRA] update_equiv_regs fails to set EQUIV reg-note for pseudo with more than one definition
On 09/27/14 08:48, Felix Yang wrote:
> Thanks for the explanation.
> I have changed the loop_depth into a short integer hoping that we
> can save some memory :-)

Thanks.

> Attached please find the updated patch. Bootstrapped and
> reg-tested on x86_64-suse-linux.
> Please do a final review once the assignment is ready.
>
> As for the new list walking interface, I chose the function
> no_equiv and tried the checked cast way.
> The bad news is that GCC failed to bootstrap with the following change:
>
> Index: ira.c
> ===================================================================
> --- ira.c (revision 215536)
> +++ ira.c (working copy)
> @@ -3242,12 +3242,12 @@ no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSE
>         void *data ATTRIBUTE_UNUSED)
>  {
>    int regno;
> -  rtx list;
> +  rtx_insn_list *list;
>
>    if (!REG_P (reg))
>      return;
>    regno = REGNO (reg);
> -  list = reg_equiv[regno].init_insns;
> +  list = as_a <rtx_insn_list *> (reg_equiv[regno].init_insns);
>    if (list == const0_rtx)
>      return;
>    reg_equiv[regno].init_insns = const0_rtx;
> @@ -3258,9 +3258,9 @@ no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSE
>      return;
>    ira_reg_equiv[regno].defined_p = false;
>    ira_reg_equiv[regno].init_insns = NULL;
> -  for (; list; list = XEXP (list, 1))
> +  for (; list; list = list->next ())
>      {
> -      rtx insn = XEXP (list, 0);
> +      rtx_insn *insn = list->insn ();
>        remove_note (insn, find_reg_note (insn, REG_EQUIV, NULL_RTX));
>      }
>  }

Yea.  I'm going to post a patch shortly to go ahead with this
conversion.  There's a couple issues that come into play.

First const0_rtx is not an INSN, so we *really* don't want it in the
INSN field of an INSN_LIST.  That's probably the ICE you're seeing.

const0_rtx is being used to mark pseudos which we've already determined
can't have a valid equivalence.  So we just need a different marker.
That different marker must be embeddable in an INSN_LIST node.  The
easiest is just a NULL insn ;-)

The other tests for the const0_rtx marker in ira.c need relatively
trivial updating.  And in the end we don't need the checked cast at
all ;-)

Jeff
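One possible reading of the "NULL insn" marker idea, sketched as a toy in plain C rather than with GCC's rtx_insn_list: the "no equivalence possible" state becomes an ordinary list node whose insn field is NULL, so the field can keep a single static type and no foreign constant has to be stored in it. Every toy_* name below is hypothetical; this is not the patch that was eventually posted.

```c
#include <assert.h>
#include <stddef.h>

struct toy_insn
{
  int uid;
};

struct toy_insn_list
{
  struct toy_insn *insn;
  struct toy_insn_list *next;
};

/* Shared marker node: a list whose first insn is NULL means "this
   pseudo can never have a valid equivalence".  */
static struct toy_insn_list toy_no_equiv_marker = { NULL, NULL };

/* Nonzero if LIST is the marker (or any node carrying a NULL insn).  */
static int
toy_no_equiv_p (const struct toy_insn_list *list)
{
  return list != NULL && list->insn == NULL;
}

/* Count the initializing insns on LIST; the marker counts as empty.  */
static size_t
toy_count_init_insns (const struct toy_insn_list *list)
{
  size_t n = 0;

  if (toy_no_equiv_p (list))
    return 0;

  for (; list != NULL; list = list->next)
    {
      /* Real entries always carry an insn.  */
      assert (list->insn != NULL);
      n++;
    }
  return n;
}
```

Since the marker has the same type as every other node, walking code needs one extra test up front and no checked cast.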
[PATCH, rs6000] Warn for deprecated use of vec_lvsl and vec_lvsr for little endian
Hi, The vec_lvsl and vec_lvsr interfaces are deprecated for little endian in the ELFv2 ABI document. At the moment, these interfaces will produce incorrect code, and the only indication a programmer has of this is that his or her code does not function correctly. This patch adds a warning message to inform the little endian programmer of the deprecated usage. The patch described in https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02580.html is a prerequisite for this patch, as otherwise the deprecation message causes bootstrap failure due to -Werror in the later stages. I feel the deprecation message is needed because, in a future patch, we plan to make vec_lvsl and vec_lvsr work so that BE code will run on LE without requiring code modifications. However, code modifications are still desirable because the LE code, while correct, will be pretty poor. The deprecation message will encourage programmers to rewrite their code that makes use of vec_lvsl/lvsr. I've added a new test to demonstrate the message, and updated a number of tests to use -Wno-deprecated so the new message doesn't disturb them. Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions. Is this ok for trunk? Thanks, Bill [gcc] 2014-09-29 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Issue a warning message when vec_lvsl or vec_lvsr is used with a little endian target. [gcc/testsuite] 2014-09-29 Bill Schmidt wschm...@linux.vnet.ibm.com * g++.dg/ext/altivec-2.C: Compile with -Wno-deprecated to avoid failing with the new warning message. * gcc.dg/vmx/3c-01a.c: Likewise. * gcc.dg/vmx/ops-long-1.c: Likewise. * gcc.dg/vmx/ops.c: Likewise. * gcc.target/powerpc/altivec-20.c: Likewise. * gcc.target/powerpc/altivec-6.c: Likewise. * gcc.target/powerpc/altivec-vec-merge.c: Likewise. * gcc.target/powerpc/vsx-builtin-8.c: Likewise. * gcc.target/powerpc/warn-lvsl-lvsr.c: New test. 
Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c (revision 215691)
+++ gcc/config/rs6000/rs6000-c.c (working copy)
@@ -4326,6 +4326,14 @@ altivec_resolve_overloaded_builtin (location_t loc
   if (TARGET_DEBUG_BUILTIN)
     fprintf (stderr, "altivec_resolve_overloaded_builtin, code = %4d, %s\n",
              (int)fcode, IDENTIFIER_POINTER (DECL_NAME (fndecl)));
+
+  /* vec_lvsl and vec_lvsr are deprecated for use with LE element order.  */
+  if (fcode == ALTIVEC_BUILTIN_VEC_LVSL && !VECTOR_ELT_ORDER_BIG)
+    warning (OPT_Wdeprecated, "vec_lvsl is deprecated for little endian; use \
+assignment for unaligned loads and stores");
+  else if (fcode == ALTIVEC_BUILTIN_VEC_LVSR && !VECTOR_ELT_ORDER_BIG)
+    warning (OPT_Wdeprecated, "vec_lvsr is deprecated for little endian; use \
+assignment for unaligned loads and stores");
 
   /* For now treat vec_splats and vec_promote as the same.  */
   if (fcode == ALTIVEC_BUILTIN_VEC_SPLATS
Index: gcc/testsuite/g++.dg/ext/altivec-2.C
===
--- gcc/testsuite/g++.dg/ext/altivec-2.C (revision 215691)
+++ gcc/testsuite/g++.dg/ext/altivec-2.C (working copy)
@@ -1,6 +1,6 @@
 /* { dg-do compile { target powerpc*-*-* } } */
 /* { dg-require-effective-target powerpc_altivec_ok } */
-/* { dg-options "-maltivec -Wall -Wno-unused-but-set-variable" } */
+/* { dg-options "-maltivec -Wall -Wno-unused-but-set-variable -Wno-deprecated" } */
 
 /* This test checks if AltiVec builtins accept const-qualified
    arguments.  */
Index: gcc/testsuite/gcc.dg/vmx/3c-01a.c
===
--- gcc/testsuite/gcc.dg/vmx/3c-01a.c (revision 215691)
+++ gcc/testsuite/gcc.dg/vmx/3c-01a.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-options "-Wno-deprecated" } */
 #include <altivec.h>
 typedef const volatile unsigned int _1;
 typedef const unsigned int _2;
Index: gcc/testsuite/gcc.dg/vmx/ops-long-1.c
===
--- gcc/testsuite/gcc.dg/vmx/ops-long-1.c (revision 215691)
+++ gcc/testsuite/gcc.dg/vmx/ops-long-1.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-options "-Wno-deprecated" } */
 
 /* Checks from the original ops.c that pass pointers to long or
    unsigned long for operations that support that in released versions
Index: gcc/testsuite/gcc.dg/vmx/ops.c
===
--- gcc/testsuite/gcc.dg/vmx/ops.c (revision 215691)
+++ gcc/testsuite/gcc.dg/vmx/ops.c (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-options "-Wno-deprecated" } */
 #include <altivec.h>
 #include <stdlib.h>
 extern char * *var_char_ptr;
Index: gcc/testsuite/gcc.target/powerpc/altivec-20.c
Re: [Patch, Fortran] Add CO_BROADCAST
On 29 Sept. 2014 at 23:56, Dominique d'Humières domi...@lps.ens.fr wrote:

Unless there is an objection, I plan to commit the following patch tomorrow, with a ChangeLog entry:

--- ../_clean/gcc/testsuite/gfortran.dg/coarray_collectives_9.f90 2014-09-25 12:14:05.0 +0200
+++ gcc/testsuite/gfortran.dg/coarray_collectives_9.f90 2014-09-29 20:23:24.0 +0200
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-fcoarray=single" }
+! { dg-options "-fcoarray=single -fmax-errors=40" }
 !
 !
 ! CO_BROADCAST/CO_REDUCE
@@ -29,7 +29,7 @@ program test
   call co_reduce(abc) ! { dg-error "Missing actual argument 'operator' in call to 'co_reduce'" }
   call co_broadcast(1, source_image=1) ! { dg-error "'a' argument of 'co_broadcast' intrinsic at .1. must be a variable" }
   call co_reduce(a=1, operator=red_f) ! { dg-error "'a' argument of 'co_reduce' intrinsic at .1. must be a variable" }
-  call co_reduce(a=val, operator=red_f2) ! { dg-error "OPERATOR argument at (1) must be a PURE function" }
+  call co_reduce(a=val, operator=red_f2) ! { dg-error "OPERATOR argument at \\(1\\) must be a PURE function" }
   call co_broadcast(val, source_image=[1,2]) ! { dg-error "must be a scalar" }
   call co_broadcast(val, source_image=1.0) ! { dg-error "must be INTEGER" }
@@ -49,14 +49,14 @@ program test
   call co_reduce(val, red_f, stat=[1,2]) ! { dg-error "must be a scalar" }
   call co_reduce(val, red_f, stat=1.0) ! { dg-error "must be INTEGER" }
   call co_reduce(val, red_f, stat=1) ! { dg-error "must be a variable" }
-  call co_reduce(val, red_f, stat=i, result_image=1) ! OK
-  call co_reduce(val, red_f, stat=i, errmsg=errmsg, result_image=1) ! OK
+  call co_reduce(val, red_f, stat=i, result_image=1) ! { dg-error "CO_REDUCE at \\(1\\) is not yet implemented" }
+  call co_reduce(val, red_f, stat=i, errmsg=errmsg, result_image=1) ! { dg-error "CO_REDUCE at \\(1\\) is not yet implemented" }
   call co_reduce(val, red_f, stat=i, errmsg=[errmsg], result_image=1) ! { dg-error "must be a scalar" }
   call co_reduce(val, red_f, stat=i, errmsg=5, result_image=1) ! { dg-error "must be CHARACTER" }
   call co_reduce(val, red_f, errmsg=abc) ! { dg-error "must be a variable" }
   call co_reduce(val, red_f, stat=i8) ! { dg-error "The stat= argument at .1. must be a kind=4 integer variable" }
   call co_reduce(val, red_f, errmsg=msg4) ! { dg-error "The errmsg= argument at .1. must be a default-kind character variable" }
-  call co_broadcasr(vec(idx), 1) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_sum shall not have a vector subscript" }
-  call co_reduce(vec([1,3,2]), red_f) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_min shall not have a vector subscript" }
+  call co_broadcast(vec(idx), 1) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_broadcast shall not have a vector subscript" }
+  call co_reduce(vec([1,3,2]), red_f) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_reduce shall not have a vector subscript" }
 end program test

On 29 Sept. 2014 at 16:31, Tobias Burnus tobias.bur...@physik.fu-berlin.de wrote:

On Mon, Sep 29, 2014 at 10:17:04AM +0200, Tobias Burnus wrote:

Dominique Dhumieres wrote:

The failures for gfortran.dg/coarray_collectives_9.f90 are fixed with the following patch:

Looks good to me. The patch is OK with a ChangeLog.

Actually, I missed the following part:

...
-  call co_broadcasr(vec(idx), 1) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_sum shall not have a vector subscript" }
-  call co_reduce(vec([1,3,2]), red_f) ! { dg-error "Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_min shall not have a vector subscript" }
+  call co_broadcasr(vec(idx), 1) ! OK?
...

Which is not fully okay: the error message should stay, but the procedure name (...casr) should be corrected (...cast).

Tobias

PS: I think I will soon post a patch to support Fortran 2015's IMPLICIT NONE () where ... is an implicit-none list with values TYPE and EXTERNAL. Because an implicit none (type, external) would have found the typo! (Or likewise: -Wimplicit-procedure.)
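The change from `(1)` to `\\(1\\)` in the dg-error patterns matters because unescaped parentheses form a regular-expression group instead of matching literal parentheses. The testsuite evaluates these patterns with Tcl regular expressions; the sketch below uses POSIX extended regular expressions from `<regex.h>` as a stand-in, on the assumption that grouping behaves the same way for this case. The helper name `pattern_matches` is made up for the illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if the extended regular expression PATTERN matches somewhere
   in TEXT, 0 if it does not, and -1 if PATTERN fails to compile.  */
static int
pattern_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && text != NULL);

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

Given the message text "OPERATOR argument at (1) must be a PURE function", the pattern `argument at (1) must` looks for "argument at 1 must" and so does not match, while `argument at \(1\) must` (written `\\(1\\)` inside a C or Tcl double-quoted string) matches the literal parentheses. The `.1.` form used elsewhere in the test also matches, since each dot accepts any single character.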
[PATCH, rs6000] Generate LE code for vec_lvsl and vec_lvsr that is compatible with BE code
Hi, Up till now we have not attempted to generate code for LE usage of vec_lvsl and vec_lvsr that is compatible with expected BE usage. The LE code sequence corresponding to lvsl/vperm is not good, and we encourage programmers to convert those sequences to use direct assignment and the type system for unaligned loads. However, the issue comes up frequently enough that it seems best to provide this sequence together with a warning message (in a previous patch submission) to avoid confusion. The method used in this patch is to perform a byte-reversal of the result of the lvsl/lvsr. This is accomplished by loading the vector char constant {0,1,...,15}, which will appear in the register from left to right as {15,...,1,0}. A vperm instruction (which uses BE element ordering) is applied to the result of the lvsl/lvsr using the loaded constant as the permute control vector. Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions. Is this ok for trunk? Thanks, Bill [gcc] 2014-09-29 Bill Schmidt wschm...@linux.vnet.ibm.com * altivec.md (altivec_lvsl): New define_expand. (altivec_lvsl_direct): Rename define_insn from altivec_lvsl. (altivec_lvsr): New define_expand. (altivec_lvsr_direct): Rename define_insn from altivec_lvsr. * rs6000.c (rs6000_expand_builtin): Change to use altivec_lvs[lr]_direct; remove commented-out code. [gcc/testsuite] 2014-09-29 Bill Schmidt wschm...@linux.vnet.ibm.com * gcc.target/powerpc/lvsl-lvsr.c: New test. 
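The byte-reversal method described above can be modelled in portable C. This is a rough sketch, not the code the patch generates: the `vreg` type and the `model_*` helpers are invented for illustration, a register is modelled as 16 bytes with index 0 as the leftmost (BE element 0) byte, and only lvsl is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* A 16-byte vector register; b[0] is the leftmost byte, i.e. element 0
   in BE element ordering.  */
typedef struct { uint8_t b[16]; } vreg;

/* Model of lvsl: bytes sh, sh+1, ..., sh+15, where sh is the low four
   bits of the address.  */
static vreg
model_lvsl (uintptr_t addr)
{
  vreg r;
  unsigned sh = (unsigned) (addr & 0xf);

  for (int i = 0; i < 16; i++)
    r.b[i] = (uint8_t) (sh + i);
  return r;
}

/* Model of vperm: the low five bits of each control byte select one
   byte from the 32-byte concatenation of A and B.  */
static vreg
model_vperm (vreg a, vreg b, vreg ctl)
{
  vreg r;

  for (int i = 0; i < 16; i++)
    {
      unsigned sel = ctl.b[i] & 0x1f;
      r.b[i] = sel < 16 ? a.b[sel] : b.b[sel - 16];
    }
  return r;
}

/* The constant {0,1,...,15} as it sits in a register after an LE load:
   left to right it reads {15,...,1,0}.  */
static vreg
le_loaded_iota (void)
{
  vreg r;

  for (int i = 0; i < 16; i++)
    r.b[i] = (uint8_t) (15 - i);
  return r;
}

/* lvsl followed by the byte-reversing vperm.  */
static vreg
model_le_lvsl (uintptr_t addr)
{
  vreg mask = model_lvsl (addr);

  assert (mask.b[0] < 16);
  return model_vperm (mask, mask, le_loaded_iota ());
}
```

In this model the permute control byte at position i is 15 - i, so the output byte at position i is the lvsl byte at position 15 - i, which is the reversal the message describes.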
Index: gcc/config/rs6000/altivec.md === --- gcc/config/rs6000/altivec.md(revision 215689) +++ gcc/config/rs6000/altivec.md(working copy) @@ -2297,7 +2297,32 @@ dststt %0,%1,%2 [(set_attr type vecsimple)]) -(define_insn altivec_lvsl +(define_expand altivec_lvsl + [(use (match_operand:V16QI 0 register_operand )) + (use (match_operand:V16QI 1 memory_operand Z))] + TARGET_ALTIVEC + +{ + if (VECTOR_ELT_ORDER_BIG) +emit_insn (gen_altivec_lvsl_direct (operands[0], operands[1])); + else +{ + int i; + rtx mask, perm[16], constv, vperm; + mask = gen_reg_rtx (V16QImode); + emit_insn (gen_altivec_lvsl_direct (mask, operands[1])); + for (i = 0; i 16; ++i) +perm[i] = GEN_INT (i); + constv = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, perm)); + constv = force_reg (V16QImode, constv); + vperm = gen_rtx_UNSPEC (V16QImode, gen_rtvec (3, mask, mask, constv), + UNSPEC_VPERM); + emit_insn (gen_rtx_SET (VOIDmode, operands[0], vperm)); +} + DONE; +}) + +(define_insn altivec_lvsl_direct [(set (match_operand:V16QI 0 register_operand =v) (unspec:V16QI [(match_operand:V16QI 1 memory_operand Z)] UNSPEC_LVSL))] @@ -2305,7 +2330,32 @@ lvsl %0,%y1 [(set_attr type vecload)]) -(define_insn altivec_lvsr +(define_expand altivec_lvsr + [(use (match_operand:V16QI 0 register_operand )) + (use (match_operand:V16QI 1 memory_operand Z))] + TARGET_ALTIVEC + +{ + if (VECTOR_ELT_ORDER_BIG) +emit_insn (gen_altivec_lvsr_direct (operands[0], operands[1])); + else +{ + int i; + rtx mask, perm[16], constv, vperm; + mask = gen_reg_rtx (V16QImode); + emit_insn (gen_altivec_lvsr_direct (mask, operands[1])); + for (i = 0; i 16; ++i) +perm[i] = GEN_INT (i); + constv = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, perm)); + constv = force_reg (V16QImode, constv); + vperm = gen_rtx_UNSPEC (V16QImode, gen_rtvec (3, mask, mask, constv), + UNSPEC_VPERM); + emit_insn (gen_rtx_SET (VOIDmode, operands[0], vperm)); +} + DONE; +}) + +(define_insn altivec_lvsr_direct [(set (match_operand:V16QI 0 register_operand =v) 
(unspec:V16QI [(match_operand:V16QI 1 memory_operand Z)] UNSPEC_LVSR))] Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 215689) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -13898,8 +13898,8 @@ rs6000_expand_builtin (tree exp, rtx target, rtx s case ALTIVEC_BUILTIN_MASK_FOR_LOAD: case ALTIVEC_BUILTIN_MASK_FOR_STORE: { - int icode = (BYTES_BIG_ENDIAN ? (int) CODE_FOR_altivec_lvsr -: (int) CODE_FOR_altivec_lvsl); + int icode = (BYTES_BIG_ENDIAN ? (int) CODE_FOR_altivec_lvsr_direct +: (int) CODE_FOR_altivec_lvsl_direct); enum machine_mode tmode = insn_data[icode].operand[0].mode; enum machine_mode mode = insn_data[icode].operand[1].mode; tree arg; @@ -13927,7 +13927,6 @@ rs6000_expand_builtin (tree exp, rtx target, rtx s || ! (*insn_data[icode].operand[0].predicate) (target, tmode)) target = gen_reg_rtx (tmode); - /*pat = gen_altivec_lvsr (target,
Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan
(resending in plain-text mode)

-fasan-recover doesn't look like a good idea - for instance, in Clang, we never use ?san in flag names, preferring -fsanitize-whatever. What's the rationale behind splitting -fsanitize-recover into two flags (ASan- and UBSan-specific)? Is there no way to keep a single -f(no-)sanitize-recover for that purpose? Now it works only for UBSan checks, but we may extend it to other sanitizers as well.

On Mon, Sep 29, 2014 at 2:20 PM, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote:

+Alexey Samsonov

On Mon, Sep 29, 2014 at 10:43 AM, Jakub Jelinek ja...@redhat.com wrote:

On Mon, Sep 29, 2014 at 09:21:11PM +0400, Yury Gribov wrote:

This patch enables -fsanitize-recover for KASan by default. This causes KASan to continue execution after an error in case of inline instrumentation. This feature is needed because
- reports during early bootstrap won't even be printed
- needed to run all tests w/o rebooting machine for every test
- needed for interactive work on desktop

This is the third version of the patch, which renames -fsanitize-recover to -fubsan-recover and introduces -fasan-recover (enabled by default for KASan). It also moves flag handling to finish_options per Jakub's request.

As the -fsanitize-recover option comes from clang originally, I think this needs coordination with them (whether clang will also rename the option), and certainly keep -fsanitize-recover as a non-documented compat option alias for -fubsan-recover. So, can you please talk to the clang folks about it?

Jakub

-- Alexey Samsonov, Mountain View, CA
Re: [RFC] optimize x - y cmp 0 with undefined overflow
Yeah, that sounds good to me.

Here's what I have at last committed after testing on x86-64/Linux.

2014-09-29 Eric Botcazou ebotca...@adacore.com

* tree-vrp.c (get_single_symbol): New function.
(build_symbolic_expr): Likewise.
(symbolic_range_based_on_p): New predicate.
(extract_range_from_binary_expr_1): Deal with single-symbolic ranges for PLUS and MINUS. Do not drop symbolic ranges at the end.
(extract_range_from_binary_expr): Try harder for PLUS and MINUS if operand is symbolic and based on the other operand.

2014-09-29 Eric Botcazou ebotca...@adacore.com

* gcc.dg/tree-ssa/vrp94.c: New test.
* gnat.dg/opt40.adb: Likewise.

-- Eric Botcazou

Index: tree-vrp.c
===
--- tree-vrp.c (revision 215656)
+++ tree-vrp.c (working copy)
@@ -916,6 +916,98 @@ symbolic_range_p (value_range_t *vr)
           || !is_gimple_min_invariant (vr->max));
 }
 
+/* Return the single symbol (an SSA_NAME) contained in T if any, or NULL_TREE
+   otherwise.  We only handle additive operations and set NEG to true if the
+   symbol is negated and INV to the invariant part, if any.  */
+
+static tree
+get_single_symbol (tree t, bool *neg, tree *inv)
+{
+  bool neg_;
+  tree inv_;
+
+  if (TREE_CODE (t) == PLUS_EXPR
+      || TREE_CODE (t) == POINTER_PLUS_EXPR
+      || TREE_CODE (t) == MINUS_EXPR)
+    {
+      if (is_gimple_min_invariant (TREE_OPERAND (t, 0)))
+        {
+          neg_ = (TREE_CODE (t) == MINUS_EXPR);
+          inv_ = TREE_OPERAND (t, 0);
+          t = TREE_OPERAND (t, 1);
+        }
+      else if (is_gimple_min_invariant (TREE_OPERAND (t, 1)))
+        {
+          neg_ = false;
+          inv_ = TREE_OPERAND (t, 1);
+          t = TREE_OPERAND (t, 0);
+        }
+      else
+        return NULL_TREE;
+    }
+  else
+    {
+      neg_ = false;
+      inv_ = NULL_TREE;
+    }
+
+  if (TREE_CODE (t) == NEGATE_EXPR)
+    {
+      t = TREE_OPERAND (t, 0);
+      neg_ = !neg_;
+    }
+
+  if (TREE_CODE (t) != SSA_NAME)
+    return NULL_TREE;
+
+  *neg = neg_;
+  *inv = inv_;
+  return t;
+}
+
+/* The reverse operation: build a symbolic expression with TYPE
+   from symbol SYM, negated according to NEG, and invariant INV.  */
+
+static tree
+build_symbolic_expr (tree type, tree sym, bool neg, tree inv)
+{
+  const bool pointer_p = POINTER_TYPE_P (type);
+  tree t = sym;
+
+  if (neg)
+    t = build1 (NEGATE_EXPR, type, t);
+
+  if (integer_zerop (inv))
+    return t;
+
+  return build2 (pointer_p ? POINTER_PLUS_EXPR : PLUS_EXPR, type, t, inv);
+}
+
+/* Return true if value range VR involves exactly one symbol SYM.  */
+
+static bool
+symbolic_range_based_on_p (value_range_t *vr, const_tree sym)
+{
+  bool neg, min_has_symbol, max_has_symbol;
+  tree inv;
+
+  if (is_gimple_min_invariant (vr->min))
+    min_has_symbol = false;
+  else if (get_single_symbol (vr->min, &neg, &inv) == sym)
+    min_has_symbol = true;
+  else
+    return false;
+
+  if (is_gimple_min_invariant (vr->max))
+    max_has_symbol = false;
+  else if (get_single_symbol (vr->max, &neg, &inv) == sym)
+    max_has_symbol = true;
+  else
+    return false;
+
+  return (min_has_symbol || max_has_symbol);
+}
+
 /* Return true if value range VR uses an overflow infinity.  */
 
 static inline bool
@@ -1199,25 +1291,30 @@ compare_values_warnv (tree val1, tree va
      both integers.  */
   gcc_assert (POINTER_TYPE_P (TREE_TYPE (val1))
               == POINTER_TYPE_P (TREE_TYPE (val2)));
+
   /* Convert the two values into the same type.  This is needed because
      sizetype causes sign extension even for unsigned types.  */
   val2 = fold_convert (TREE_TYPE (val1), val2);
   STRIP_USELESS_TYPE_CONVERSION (val2);
 
   if ((TREE_CODE (val1) == SSA_NAME
+       || (TREE_CODE (val1) == NEGATE_EXPR
+           && TREE_CODE (TREE_OPERAND (val1, 0)) == SSA_NAME)
        || TREE_CODE (val1) == PLUS_EXPR
        || TREE_CODE (val1) == MINUS_EXPR)
       && (TREE_CODE (val2) == SSA_NAME
+          || (TREE_CODE (val2) == NEGATE_EXPR
+              && TREE_CODE (TREE_OPERAND (val2, 0)) == SSA_NAME)
           || TREE_CODE (val2) == PLUS_EXPR
           || TREE_CODE (val2) == MINUS_EXPR))
     {
       tree n1, c1, n2, c2;
       enum tree_code code1, code2;
 
-      /* If VAL1 and VAL2 are of the form 'NAME [+-] CST' or 'NAME',
+      /* If VAL1 and VAL2 are of the form '[-]NAME [+-] CST' or 'NAME',
          return -1 or +1 accordingly.  If VAL1 and VAL2 don't use the
          same name, return -2.  */
-      if (TREE_CODE (val1) == SSA_NAME)
+      if (TREE_CODE (val1) == SSA_NAME || TREE_CODE (val1) == NEGATE_EXPR)
         {
           code1 = SSA_NAME;
           n1 = val1;
@@ -1239,7 +1336,7 @@ compare_values_warnv (tree val1, tree va
             }
         }
 
-      if (TREE_CODE (val2) == SSA_NAME)
+      if (TREE_CODE (val2) == SSA_NAME || TREE_CODE (val2) == NEGATE_EXPR)
         {
           code2 = SSA_NAME;
           n2 = val2;
@@ -1262,11 +1359,15 @@ compare_values_warnv (tree val1, tree va
         }
 
       /* Both values must use the same name.  */
+      if (TREE_CODE (n1) == NEGATE_EXPR
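To make the decomposition performed by get_single_symbol easier to follow, here is a toy version over a hand-rolled expression type. Everything in it (`struct expr`, `enum kind`, `toy_single_symbol`) is hypothetical and unrelated to GCC's tree API. One deliberate difference from the patch: for `X - C` the toy folds the sign into the addend and reports it as `-C`, whereas the patch records operand 1 unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression kinds; K_SYM plays the role of an SSA_NAME and
   K_CONST the role of an invariant.  */
enum kind { K_CONST, K_SYM, K_NEG, K_PLUS, K_MINUS };

struct expr {
  enum kind kind;
  long value;                     /* used by K_CONST only */
  const struct expr *op0, *op1;   /* operands; op1 unused for K_NEG */
};

/* Return the single symbol contained in T, or NULL if T is not of the
   form [-]SYM, C +/- [-]SYM or [-]SYM +/- C.  On success set *NEG if
   the symbol is negated and *INV to the constant addend (0 if none).  */
static const struct expr *
toy_single_symbol (const struct expr *t, bool *neg, long *inv)
{
  bool neg_ = false;
  long inv_ = 0;

  assert (t != NULL);

  if (t->kind == K_PLUS || t->kind == K_MINUS)
    {
      if (t->op0->kind == K_CONST)
        {
          /* C + X or C - X: a subtraction negates the symbol.  */
          neg_ = (t->kind == K_MINUS);
          inv_ = t->op0->value;
          t = t->op1;
        }
      else if (t->op1->kind == K_CONST)
        {
          /* X + C or X - C: fold the sign into the addend.  */
          inv_ = (t->kind == K_MINUS) ? -t->op1->value : t->op1->value;
          t = t->op0;
        }
      else
        return NULL;
    }

  if (t->kind == K_NEG)
    {
      t = t->op0;
      neg_ = !neg_;
    }

  if (t->kind != K_SYM)
    return NULL;

  *neg = neg_;
  *inv = inv_;
  return t;
}
```

Both `5 - x` and `-x + 5` decompose to the same triple (symbol x, negated, addend 5), which is the kind of normal form that lets two range bounds be compared when they are based on the same symbol.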
Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan
On Mon, Sep 29, 2014 at 03:36:20PM -0700, Alexey Samsonov wrote:

-fasan-recover doesn't look like a good idea - for instance, in Clang, we never use ?san in flag names, preferring -fsanitize-whatever. What's the rationale behind splitting -fsanitize-recover into two flags (ASan- and UBSan-specific)? Is there no way to keep a single -f(no-)sanitize-recover for that purpose? Now it works only for UBSan checks, but we may extend it to other sanitizers as well.

The problem is that if we start using it for ASan, it needs to have a different default, because ASan wants to abort by default, while UBSan recovers by default. -fsanitize=kernel-address (KASan) wants to recover by default.

So, the option is either to never support recover for -fsanitize=address, keep -fsanitize-recover (on by default) as is for ubsan and use that same switch for kasan, or have separate flags.

Jakub
[patch commit] [SH] Use define_c_enum in sh.md
I've noticed that config/sh/sh.md uses define_constants to define unspec and unspecv numbers, though config/sh/sync.md uses define_c_enum for them. This causes collisions of some numbers. The attached patch would be an obvious fix. Tested on sh4-unknown-linux. Applied to trunk. I'll backport it to branches in a week or so. Regards, kaz -- 2014-09-29 Kaz Kojima kkoj...@gcc.gnu.org * config/sh/sh.md: Use define_c_enum for unspec and unspecv. --- ORIG/trunk/gcc/config/sh/sh.md 2014-09-20 08:59:46.0 +0900 +++ trunk/gcc/config/sh/sh.md 2014-09-29 09:36:41.661822991 +0900 @@ -109,73 +109,73 @@ (TR2_REG 130) (XD0_REG 136) +]) +(define_c_enum unspec [ ;; These are used with unspec. - (UNSPEC_COMPACT_ARGS 0) - (UNSPEC_MOVA 1) - (UNSPEC_CASESI 2) - (UNSPEC_DATALABEL3) - (UNSPEC_BBR 4) - (UNSPEC_SFUNC5) - (UNSPEC_PIC 6) - (UNSPEC_GOT 7) - (UNSPEC_GOTOFF 8) - (UNSPEC_PLT 9) - (UNSPEC_CALLER 10) - (UNSPEC_GOTPLT 11) - (UNSPEC_ICACHE 12) - (UNSPEC_INIT_TRAMP 13) - (UNSPEC_FCOSA14) - (UNSPEC_FSRRA15) - (UNSPEC_FSINA16) - (UNSPEC_NSB 17) - (UNSPEC_ALLOCO 18) - (UNSPEC_TLSGD20) - (UNSPEC_TLSLDM 21) - (UNSPEC_TLSIE22) - (UNSPEC_DTPOFF 23) - (UNSPEC_GOTTPOFF 24) - (UNSPEC_TPOFF25) - (UNSPEC_RA 26) - (UNSPEC_DIV_INV_M0 30) - (UNSPEC_DIV_INV_M1 31) - (UNSPEC_DIV_INV_M2 32) - (UNSPEC_DIV_INV_M3 33) - (UNSPEC_DIV_INV2034) - (UNSPEC_DIV_INV_TABLE37) - (UNSPEC_ASHIFTRT 35) - (UNSPEC_THUNK36) - (UNSPEC_CHKADD 38) - (UNSPEC_SP_SET 40) - (UNSPEC_SP_TEST 41) - (UNSPEC_MOVUA42) - + UNSPEC_COMPACT_ARGS + UNSPEC_MOVA + UNSPEC_CASESI + UNSPEC_DATALABEL + UNSPEC_BBR + UNSPEC_SFUNC + UNSPEC_PIC + UNSPEC_GOT + UNSPEC_GOTOFF + UNSPEC_PLT + UNSPEC_CALLER + UNSPEC_GOTPLT + UNSPEC_ICACHE + UNSPEC_INIT_TRAMP + UNSPEC_FCOSA + UNSPEC_FSRRA + UNSPEC_FSINA + UNSPEC_NSB + UNSPEC_ALLOCO + UNSPEC_TLSGD + UNSPEC_TLSLDM + UNSPEC_TLSIE + UNSPEC_DTPOFF + UNSPEC_GOTTPOFF + UNSPEC_TPOFF + UNSPEC_RA + UNSPEC_DIV_INV_M0 + UNSPEC_DIV_INV_M1 + UNSPEC_DIV_INV_M2 + UNSPEC_DIV_INV_M3 + UNSPEC_DIV_INV20 + UNSPEC_DIV_INV_TABLE 
+ UNSPEC_ASHIFTRT + UNSPEC_THUNK + UNSPEC_CHKADD + UNSPEC_SP_SET + UNSPEC_SP_TEST + UNSPEC_MOVUA ;; (unspec [VAL SHIFT] UNSPEC_EXTRACT_S16) computes (short) (VAL SHIFT). ;; UNSPEC_EXTRACT_U16 is the unsigned equivalent. - (UNSPEC_EXTRACT_S16 43) - (UNSPEC_EXTRACT_U16 44) - + UNSPEC_EXTRACT_S16 + UNSPEC_EXTRACT_U16 ;; (unspec [TARGET ANCHOR] UNSPEC_SYMOFF) == TARGET - ANCHOR. - (UNSPEC_SYMOFF 45) - + UNSPEC_SYMOFF ;; (unspec [OFFSET ANCHOR] UNSPEC_PCREL_SYMOFF) == OFFSET - (ANCHOR - .). - (UNSPEC_PCREL_SYMOFF 46) - + UNSPEC_PCREL_SYMOFF ;; Misc builtins - (UNSPEC_BUILTIN_STRLEN 47) + UNSPEC_BUILTIN_STRLEN +]) +(define_c_enum unspecv [ ;; These are used with unspec_volatile. - (UNSPECV_BLOCKAGE0) - (UNSPECV_ALIGN 1) - (UNSPECV_CONST2 2) - (UNSPECV_CONST4 4) - (UNSPECV_CONST8 6) - (UNSPECV_WINDOW_END 10) - (UNSPECV_CONST_END 11) - (UNSPECV_EH_RETURN 12) - (UNSPECV_GBR 13) - (UNSPECV_SP_SWITCH_B 14) - (UNSPECV_SP_SWITCH_E 15) + UNSPECV_BLOCKAGE + UNSPECV_ALIGN + UNSPECV_CONST2 + UNSPECV_CONST4 + UNSPECV_CONST8 + UNSPECV_WINDOW_END + UNSPECV_CONST_END + UNSPECV_EH_RETURN + UNSPECV_GBR + UNSPECV_SP_SWITCH_B + UNSPECV_SP_SWITCH_E ]) ;; -
Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan
On Mon, Sep 29, 2014 at 4:17 PM, Jakub Jelinek ja...@redhat.com wrote:

On Mon, Sep 29, 2014 at 03:36:20PM -0700, Alexey Samsonov wrote:

-fasan-recover doesn't look like a good idea - for instance, in Clang, we never use ?san in flag names, preferring -fsanitize-whatever. What's the rationale behind splitting -fsanitize-recover into two flags (ASan- and UBSan-specific)? Is there no way to keep a single -f(no-)sanitize-recover for that purpose? Now it works only for UBSan checks, but we may extend it to other sanitizers as well.

The problem is that if we start using it for ASan, it needs to have a different default, because ASan wants to abort by default, while UBSan recovers by default. -fsanitize=kernel-address (KASan) wants to recover by default.

So, the option is either to never support recover for -fsanitize=address, keep -fsanitize-recover (on by default) as is for ubsan and use that same switch for kasan, or have separate flags.

Jakub

I don't think we are ever going to support recovery for regular ASan (Kostya, correct me if I'm wrong). I see no problem in enabling -fsanitize-recover by default for -fsanitize=undefined and -fsanitize=kernel-address.

We can, potentially, extend the -fsanitize-recover flag to take the same values as the -fsanitize= one, so that one can specify which sanitizers are recoverable and which are not, but I'd try to make this a last resort - this is too complex.

-- Alexey Samsonov, Mountain View, CA
libgo patch committed: Use -Qunused-arguments with asm tests
This patch to the libgo configure script changes it to use -Qunused-arguments when running the assembler tests. Apparently clang by default complains when arguments to the driver are unused, as in -I arguments when invoked on a .s file. Passing -Qunused-arguments disables these warnings, and lets us run the tests as they are supposed to run. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 225a208260a6 libgo/configure.ac
--- a/libgo/configure.ac Mon Sep 22 14:14:24 2014 -0700
+++ b/libgo/configure.ac Mon Sep 29 15:53:00 2014 -0700
@@ -785,14 +785,28 @@
     [Define to the flags needed for the .section .eh_frame directive.])
 fi
 
+AC_CACHE_CHECK([if compiler supports -Qunused-arguments],
+[libgo_cv_c_unused_arguments],
+[CFLAGS_hold=$CFLAGS
+CFLAGS="$CFLAGS -Qunused-arguments"
+AC_COMPILE_IFELSE([[int i;]],
+[libgo_cv_c_unused_arguments=yes],
+[libgo_cv_c_unused_arguments=no])
+CFLAGS=$CFLAGS_hold])
+
 AC_CACHE_CHECK([if assembler supports GNU comdat group syntax],
 libgo_cv_as_comdat_gnu, [
 echo '.section .text,"axG",@progbits,.foo,comdat' > conftest.s
+CFLAGS_hold=$CFLAGS
+if test $libgo_cv_c_unused_arguments = yes; then
+  CFLAGS="$CFLAGS -Qunused-arguments"
+fi
 if $CC $CFLAGS -c conftest.s > /dev/null 2>&1; then
   libgo_cv_as_comdat_gnu=yes
 else
   libgo_cv_as_comdat_gnu=no
 fi
+CFLAGS=$CFLAGS_hold
 ])
 if test x$libgo_cv_as_comdat_gnu = xyes; then
   AC_DEFINE(HAVE_AS_COMDAT_GAS, 1,
@@ -803,9 +817,14 @@
 libgo_cv_as_x86_pcrel, [
 libgo_cv_as_x86_pcrel=yes
 echo '.text; foo: nop; .data; .long foo-.; .text' > conftest.s
+CFLAGS_hold=$CFLAGS
+if test $libgo_cv_c_unused_arguments = yes; then
+  CFLAGS="$CFLAGS -Qunused-arguments"
+fi
 if $CC $CFLAGS -c conftest.s 2>&1 | $EGREP -i 'illegal|warning' > /dev/null; then
   libgo_cv_as_x86_pcrel=no
 fi
+CFLAGS=$CFLAGS_hold
 ])
 if test x$libgo_cv_as_x86_pcrel = xyes; then
   AC_DEFINE(HAVE_AS_X86_PCREL, 1,
@@ -816,9 +835,14 @@
 libgo_cv_as_x86_64_unwind_section_type, [
 libgo_cv_as_x86_64_unwind_section_type=yes
 echo '.section .eh_frame,"a",@unwind' > conftest.s
+CFLAGS_hold=$CFLAGS
+if test $libgo_cv_c_unused_arguments = yes; then
+  CFLAGS="$CFLAGS -Qunused-arguments"
+fi
 if $CC $CFLAGS -c conftest.s 2>&1 | grep -i warning > /dev/null; then
   libgo_cv_as_x86_64_unwind_section_type=no
 fi
+CFLAGS=$CFLAGS_hold
 ])
 if test x$libgo_cv_as_x86_64_unwind_section_type = xyes; then
   AC_DEFINE(HAVE_AS_X86_64_UNWIND_SECTION_TYPE, 1,
libffi patch RFA: Pass -Qunused-arguments for asm files
Similar to a recent patch to libgo, this patch to the libffi configure script checks whether the compiler supports -Qunused-arguments. If it does, it passes -Qunused-arguments when invoking the compiler on .s files. This is because the clang driver complains by default when given useless arguments, such as -I options when compiling a .s file. This somewhat annoying behaviour works poorly with configure scripts. The -Qunused-arguments option disables it.

Bootstrapped and ran libffi and libgo tests on x86_64-unknown-linux-gnu. OK for mainline?

Ian

2014-09-29 Ian Lance Taylor i...@google.com

* configure.ac: If the compiler supports -Qunused-arguments, use it when running the compiler on .s files.
* configure: Regenerated.

Index: configure.ac
===
--- configure.ac (revision 215699)
+++ configure.ac (working copy)
@@ -295,6 +295,15 @@ AC_C_BIGENDIAN
 
 GCC_AS_CFI_PSEUDO_OP
 
+AC_CACHE_CHECK([if compiler supports -Qunused-arguments],
+[libffi_cv_c_unused_arguments],
+[CFLAGS_hold=$CFLAGS
+CFLAGS="$CFLAGS -Qunused-arguments"
+AC_COMPILE_IFELSE([[int i;]],
+[libffi_cv_c_unused_arguments=yes],
+[libffi_cv_c_unused_arguments=no])
+CFLAGS=$CFLAGS_hold])
+
 if test x$TARGET = xSPARC; then
     AC_CACHE_CHECK([assembler and linker support unaligned pc related relocs],
         libffi_cv_as_sparc_ua_pcrel, [
@@ -331,9 +340,14 @@ if test x$TARGET = xX86 || test x$TARGET
         libffi_cv_as_x86_pcrel, [
         libffi_cv_as_x86_pcrel=yes
         echo '.text; foo: nop; .data; .long foo-.; .text' > conftest.s
+        CFLAGS_hold=$CFLAGS
+        if test $libffi_cv_c_unused_arguments = yes; then
+          CFLAGS="$CFLAGS -Qunused-arguments"
+        fi
         if $CC $CFLAGS -c conftest.s 2>&1 | $EGREP -i 'illegal|warning' > /dev/null; then
             libffi_cv_as_x86_pcrel=no
         fi
+        CFLAGS=$CFLAGS_hold
         ])
     if test x$libffi_cv_as_x86_pcrel = xyes; then
         AC_DEFINE(HAVE_AS_X86_PCREL, 1,
@@ -397,9 +411,14 @@ if test x$TARGET = xX86_64; then
         libffi_cv_as_x86_64_unwind_section_type, [
         libffi_cv_as_x86_64_unwind_section_type=yes
         echo '.section .eh_frame,"a",@unwind' > conftest.s
+        CFLAGS_hold=$CFLAGS
+        if test $libffi_cv_c_unused_arguments = yes; then
+          CFLAGS="$CFLAGS -Qunused-arguments"
+        fi
         if $CC $CFLAGS -c conftest.s 2>&1 | grep -i warning > /dev/null; then
             libffi_cv_as_x86_64_unwind_section_type=no
         fi
+        CFLAGS=$CFLAGS_hold
         ])
     if test x$libffi_cv_as_x86_64_unwind_section_type = xyes; then
         AC_DEFINE(HAVE_AS_X86_64_UNWIND_SECTION_TYPE, 1,
Re: [PATCHv3][PING] Enable -fsanitize-recover for KASan
On Mon, Sep 29, 2014 at 4:26 PM, Alexey Samsonov samso...@google.com wrote:

On Mon, Sep 29, 2014 at 4:17 PM, Jakub Jelinek ja...@redhat.com wrote:

On Mon, Sep 29, 2014 at 03:36:20PM -0700, Alexey Samsonov wrote:

-fasan-recover doesn't look like a good idea - for instance, in Clang, we never use ?san in flag names, preferring -fsanitize-whatever. What's the rationale behind splitting -fsanitize-recover into two flags (ASan- and UBSan-specific)? Is there no way to keep a single -f(no-)sanitize-recover for that purpose? Now it works only for UBSan checks, but we may extend it to other sanitizers as well.

The problem is that if we start using it for ASan, it needs to have a different default, because ASan wants to abort by default, while UBSan recovers by default. -fsanitize=kernel-address (KASan) wants to recover by default.

So, the option is either to never support recover for -fsanitize=address, keep -fsanitize-recover (on by default) as is for ubsan and use that same switch for kasan, or have separate flags.

Jakub

I don't think we are ever going to support recovery for regular ASan (Kostya, correct me if I'm wrong).

I hope so too. Another point is that with asan-instrumentation-with-call-threshold=0 (instrumentation with callbacks) we can, and probably will, allow recovery from errors (glibc demands that), but that does not require any compile-time flag.

I see no problem in enabling -fsanitize-recover by default for -fsanitize=undefined and

This becomes more interesting when we use asan and ubsan together. Which default setting is stronger? :)

-fsanitize=kernel-address. We can, potentially, extend the -fsanitize-recover flag to take the same values as the -fsanitize= one, so that one can specify which sanitizers are recoverable and which are not, but I'd try to make this a last resort - this is too complex.

-- Alexey Samsonov, Mountain View, CA
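The "which default is stronger" question goes away if recovery is tracked per sanitizer instead of as one global switch, which is roughly what the "take the same values as -fsanitize=" idea amounts to. The sketch below only illustrates that bookkeeping: the `SAN_*` bits, `DEFAULT_RECOVER` and `effective_recover` are hypothetical names, not GCC or Clang option handling, and the defaults are the ones stated in the thread (UBSan and KASan recover, ASan aborts).

```c
#include <assert.h>

/* One bit per sanitizer kind (hypothetical names).  */
enum {
  SAN_ADDRESS        = 1u << 0,
  SAN_KERNEL_ADDRESS = 1u << 1,
  SAN_UNDEFINED      = 1u << 2
};

/* Sanitizers that recover unless the user says otherwise.  */
#define DEFAULT_RECOVER (SAN_UNDEFINED | SAN_KERNEL_ADDRESS)

/* Compute which of the ENABLED sanitizers continue after a report.
   FORCE_ON and FORCE_OFF model explicit per-sanitizer requests to
   enable or disable recovery; they must not overlap.  */
static unsigned
effective_recover (unsigned enabled, unsigned force_on, unsigned force_off)
{
  unsigned recover = DEFAULT_RECOVER;

  assert ((force_on & force_off) == 0);

  recover |= force_on;
  recover &= ~force_off;
  return recover & enabled;
}
```

With ASan and UBSan enabled together and no explicit request, only the UBSan bit stays set, so each sanitizer keeps its own default and neither overrides the other.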