avoid alignment of static variables affecting stack's
Function (or narrower) scope static variables (as well as others not
placed on the stack) should also not have any effect on the stack
alignment. I noticed the issue first with Linux's dynamic_pr_debug()
construct using an 8-byte aligned sub-file-scope local variable.

According to my checking bad behavior started with 4.6.x (4.5.3 was
still okay), but generated code got quite a bit worse as of 4.9.0.

gcc/
2014-10-23  Jan Beulich  <jbeul...@suse.com>

    * cfgexpand.c (expand_one_var): Exclude static, external, and
    hard register variables when adjusting stack alignment related
    state.

gcc/testsuite/
2014-10-23  Jan Beulich  <jbeul...@suse.com>

    * gcc.c-torture/execute/stkalign.c: New.

--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1233,12 +1233,16 @@ static HOST_WIDE_INT
 expand_one_var (tree var, bool toplevel, bool really_expand)
 {
   unsigned int align = BITS_PER_UNIT;
+  bool stack = true;
   tree origvar = var;
 
   var = SSAVAR (var);
 
   if (TREE_TYPE (var) != error_mark_node && TREE_CODE (var) == VAR_DECL)
     {
+      stack = !TREE_STATIC (var) && !DECL_EXTERNAL (var)
+              && !DECL_HARD_REGISTER (var);
+
       /* Because we don't know if VAR will be in register or on stack,
          we conservatively assume it will be on stack even if VAR is
          eventually put into register after RA pass.  For non-automatic
@@ -1267,22 +1271,25 @@ expand_one_var (tree var, bool toplevel,
         align = POINTER_SIZE;
     }
 
-  if (SUPPORTS_STACK_ALIGNMENT
-      && crtl->stack_alignment_estimated < align)
+  if (stack)
     {
-      /* stack_alignment_estimated shouldn't change after stack
-         realign decision made */
-      gcc_assert (!crtl->stack_realign_processed);
-      crtl->stack_alignment_estimated = align;
+      if (SUPPORTS_STACK_ALIGNMENT
+          && crtl->stack_alignment_estimated < align)
+        {
+          /* stack_alignment_estimated shouldn't change after stack
+             realign decision made */
+          gcc_assert (!crtl->stack_realign_processed);
+          crtl->stack_alignment_estimated = align;
+        }
+
+      /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
+         So here we only make sure stack_alignment_needed >= align.  */
+      if (crtl->stack_alignment_needed < align)
+        crtl->stack_alignment_needed = align;
+      if (crtl->max_used_stack_slot_alignment < align)
+        crtl->max_used_stack_slot_alignment = align;
     }
 
-  /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted.
-     So here we only make sure stack_alignment_needed >= align.  */
-  if (crtl->stack_alignment_needed < align)
-    crtl->stack_alignment_needed = align;
-  if (crtl->max_used_stack_slot_alignment < align)
-    crtl->max_used_stack_slot_alignment = align;
-
   if (TREE_CODE (origvar) == SSA_NAME)
     {
       gcc_assert (TREE_CODE (var) != VAR_DECL
--- a/gcc/testsuite/gcc.c-torture/execute/stkalign.c
+++ b/gcc/testsuite/gcc.c-torture/execute/stkalign.c
@@ -0,0 +1,26 @@
+/* { dg-options "-fno-inline" } */
+
+#include <assert.h>
+
+#define ALIGNMENT 64
+
+unsigned test(unsigned n, unsigned p)
+{
+  static struct { char __attribute__((__aligned__(ALIGNMENT))) c; } s;
+  unsigned x;
+
+  assert(__alignof__(s) == ALIGNMENT);
+  asm ("" : "=g" (x), "+m" (s) : "0" (x));
+
+  return n ? test(n - 1, x) : (x ^ p);
+}
+
+int main (int argc, char *argv[] __attribute__((unused)))
+{
+  unsigned int x = test(argc, 0);
+
+  x |= test(argc + 1, 0);
+  x |= test(argc + 2, 0);
+
+  return !(x & (ALIGNMENT - 1));
+}
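The point of the message above can be illustrated with a minimal C11
sketch. It is not part of the patch; the names `aligned_blob` and
`static_is_aligned` are made up for illustration. An over-aligned
function-scope static lives in static storage, so its alignment is
satisfied where the object is placed in the image and needs nothing
from the stack frame of the function that declares it.

```c
#include <assert.h>
#include <stdint.h>

#define ALIGNMENT 64

/* Hypothetical type, mirroring the struct used in the testcase.  */
struct aligned_blob { _Alignas(ALIGNMENT) char c; };

/* Hypothetical helper: returns 1 when the function-scope static sits on
   an ALIGNMENT boundary.  The object has static storage duration, so no
   stack slot (and no stack realignment) is involved in aligning it.  */
int static_is_aligned(void)
{
    static struct aligned_blob s;

    assert(_Alignof(struct aligned_blob) == ALIGNMENT);
    return ((uintptr_t)&s % ALIGNMENT) == 0;
}
```

Calling `static_is_aligned()` from any caller, whatever its own frame
alignment, is expected to return 1 on implementations that support this
extended alignment for static objects.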
Re: avoid alignment of static variables affecting stack's
On Thu, Oct 23, 2014 at 07:30:27AM +0100, Jan Beulich wrote:
> Function (or narrower) scope static variables (as well as others not
> placed on the stack) should also not have any effect on the stack
> alignment. I noticed the issue first with Linux's dynamic_pr_debug()
> construct using an 8-byte aligned sub-file-scope local variable.
>
> According to my checking bad behavior started with 4.6.x (4.5.3 was
> still okay), but generated code got quite a bit worse as of 4.9.0.

If the static/external var has BLKmode, then perhaps it is safe, but I
wonder about other vars, say vectors etc.  Such vars are most likely
loaded from their memory location, and if for some reason that needs to
be spilled again, stack realignment would not be able to do that.
Or do we inspect the IL and for any pseudos with modes needing larger
alignment we adjust the dynamic stack realignment fields?

    Jakub
[PATCH PR63173] [AARCH64, NEON] Improve vld[234](q?)_dup intrinsics
Hi,

This patch converts the vld[234](q?)_dup intrinsics to use builtin
functions instead of the previous inline assembly syntax. This should
fix the performance issue reported in PR63173.

Reg-tested with aarch64-linux-gnu-gcc on qemu. OK for the trunk?

Thanks

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog (revision 216558)
+++ gcc/ChangeLog (working copy)
@@ -1,3 +1,39 @@
+2014-10-23  Felix Yang  <felix.y...@huawei.com>
+            Jiji Jiang  <jiangj...@huawei.com>
+
+    PR target/63173
+    * config/aarch64/arm_neon.h (__LD2R_FUNC): Remove macro.
+    (__LD3R_FUNC): Ditto.
+    (__LD4R_FUNC): Ditto.
+    (vld2_dup_s8, vld2_dup_s16, vld2_dup_s32, vld2_dup_f32, vld2_dup_f64,
+     vld2_dup_u8, vld2_dup_u16, vld2_dup_u32, vld2_dup_p8, vld2_dup_p16
+     vld2_dup_s64, vld2_dup_u64, vld2q_dup_s8, vld2q_dup_p8,
+     vld2q_dup_s16, vld2q_dup_p16, vld2q_dup_s32, vld2q_dup_s64,
+     vld2q_dup_u8, vld2q_dup_u16, vld2q_dup_u32, vld2q_dup_u64
+     vld2q_dup_f32, vld2q_dup_f64): Rewrite using builtin functions.
+    (vld3_dup_s64, vld3_dup_u64, vld3_dup_f64, vld3_dup_s8
+     vld3_dup_p8, vld3_dup_s16, vld3_dup_p16, vld3_dup_s32
+     vld3_dup_u8, vld3_dup_u16, vld3_dup_u32, vld3_dup_f32
+     vld3q_dup_s8, vld3q_dup_p8, vld3q_dup_s16, vld3q_dup_p16
+     vld3q_dup_s32, vld3q_dup_s64, vld3q_dup_u8, vld3q_dup_u16
+     vld3q_dup_u32, vld3q_dup_u64, vld3q_dup_f32, vld3q_dup_f64): Likewise.
+    (vld4_dup_s64, vld4_dup_u64, vld4_dup_f64, vld4_dup_s8
+     vld4_dup_p8, vld4_dup_s16, vld4_dup_p16, vld4_dup_s32
+     vld4_dup_u8, vld4_dup_u16, vld4_dup_u32, vld4_dup_f32
+     vld4q_dup_s8, vld4q_dup_p8, vld4q_dup_s16, vld4q_dup_p16
+     vld4q_dup_s32, vld4q_dup_s64, vld4q_dup_u8, vld4q_dup_u16
+     vld4q_dup_u32, vld4q_dup_u64, vld4q_dup_f32, vld4q_dup_f64): Likewise.
+    * config/aarch64/aarch64.md (define_c_enum "unspec"): Add
+    UNSPEC_LD2_DUP, UNSPEC_LD3_DUP, UNSPEC_LD4_DUP.
+    * config/aarch64/aarch64-simd-builtins.def (ld2r, ld3r, ld4r): New
+    builtins.
+    * config/aarch64/aarch64-simd.md (aarch64_simd_ld2r<mode>): New pattern.
+    (aarch64_simd_ld3r<mode>): Likewise.
+    (aarch64_simd_ld4r<mode>): Likewise.
+    (aarch64_ld2r<mode>): New expand.
+    (aarch64_ld3r<mode>): Likewise.
+    (aarch64_ld4r<mode>): Likewise.
+
 2014-10-22  Richard Sandiford  <richard.sandif...@arm.com>
 
     * lra.c (lra): Remove call to recog_init.
Index: gcc/config/aarch64/aarch64-simd.md
===================================================================
--- gcc/config/aarch64/aarch64-simd.md (revision 216558)
+++ gcc/config/aarch64/aarch64-simd.md (working copy)
@@ -3991,6 +3991,16 @@
   [(set_attr "type" "neon_load2_2reg<q>")]
 )
 
+(define_insn "aarch64_simd_ld2r<mode>"
+  [(set (match_operand:OI 0 "register_operand" "=w")
+       (unspec:OI [(match_operand:<V_TWO_ELEM> 1 "aarch64_simd_struct_operand" "Utv")
+                   (unspec:VALLDIF [(const_int 0)] UNSPEC_VSTRUCTDUMMY) ]
+                  UNSPEC_LD2_DUP))]
+  "TARGET_SIMD"
+  "ld2r\\t{%S0.<Vtype> - %T0.<Vtype>}, %1"
+  [(set_attr "type" "neon_load2_all_lanes<q>")]
+)
+
 (define_insn "vec_store_lanesoi<mode>"
   [(set (match_operand:OI 0 "aarch64_simd_struct_operand" "=Utv")
         (unspec:OI [(match_operand:OI 1 "register_operand" "w")
@@ -4022,6 +4032,16 @@
   [(set_attr "type" "neon_load3_3reg<q>")]
 )
 
+(define_insn "aarch64_simd_ld3r<mode>"
+  [(set (match_operand:CI 0 "register_operand" "=w")
+       (unspec:CI [(match_operand:<V_THREE_ELEM> 1 "aarch64_simd_struct_operand" "Utv")
+                   (unspec:VALLDIF [(const_int 0)] UNSPEC_VSTRUCTDUMMY) ]
+                  UNSPEC_LD3_DUP))]
+  "TARGET_SIMD"
+  "ld3r\\t{%S0.<Vtype> - %U0.<Vtype>}, %1"
+  [(set_attr "type" "neon_load3_all_lanes<q>")]
+)
+
 (define_insn "vec_store_lanesci<mode>"
   [(set (match_operand:CI 0 "aarch64_simd_struct_operand" "=Utv")
         (unspec:CI [(match_operand:CI 1 "register_operand" "w")
@@ -4053,6 +4073,16 @@
   [(set_attr "type" "neon_load4_4reg<q>")]
 )
 
+(define_insn "aarch64_simd_ld4r<mode>"
+  [(set (match_operand:XI 0 "register_operand" "=w")
+       (unspec:XI [(match_operand:<V_FOUR_ELEM> 1 "aarch64_simd_struct_operand" "Utv")
+                   (unspec:VALLDIF [(const_int 0)] UNSPEC_VSTRUCTDUMMY) ]
+                  UNSPEC_LD4_DUP))]
+  "TARGET_SIMD"
+  "ld4r\\t{%S0.<Vtype> - %V0.<Vtype>}, %1"
+  [(set_attr "type" "neon_load4_all_lanes<q>")]
+)
+
 (define_insn "vec_store_lanesxi<mode>"
   [(set (match_operand:XI 0 "aarch64_simd_struct_operand" "=Utv")
         (unspec:XI [(match_operand:XI 1 "register_operand" "w")
@@ -4193,6 +4223,45 @@
   aarch64_simd_disambiguate_copy (operands, dest, src, 4);
 })
 
+(define_expand "aarch64_ld2r<mode>"
+  [(match_operand:OI 0 "register_operand" "=w")
+   (match_operand:DI 1 "register_operand" "w")
+   (unspec:VALLDIF [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
+  "TARGET_SIMD"
+{
+  enum machine_mode mode = <V_TWO_ELEM>mode;
+  rtx mem = gen_rtx_MEM (mode, operands[1]);
+
+  emit_insn
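For readers unfamiliar with the intrinsics being converted, the
following portable C sketch models what a vld2_dup load does: it reads
a two-element structure from memory and replicates element k into every
lane of result vector k, which is the behaviour of the LD2R instruction
emitted by the new patterns. The types `u8x8_model` and `u8x8x2_model`
and the function `model_vld2_dup_u8` are hypothetical scalar stand-ins,
not the real arm_neon.h definitions.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8  /* lanes in a 64-bit vector of uint8_t */

/* Hypothetical scalar stand-ins for uint8x8_t and uint8x8x2_t.  */
typedef struct { uint8_t lane[LANES]; } u8x8_model;
typedef struct { u8x8_model val[2]; } u8x8x2_model;

/* Scalar model of vld2_dup_u8: read two consecutive elements and
   broadcast element k to every lane of val[k].  */
u8x8x2_model model_vld2_dup_u8(const uint8_t *ptr)
{
    u8x8x2_model r;

    assert(ptr != 0);
    for (int k = 0; k < 2; ++k)
        for (int i = 0; i < LANES; ++i)
            r.val[k].lane[i] = ptr[k];
    return r;
}
```

With a source array `{7, 9}`, every lane of `val[0]` holds 7 and every
lane of `val[1]` holds 9. The vld3/vld4 variants follow the same
pattern with three and four result vectors.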
Fix 63615 - FAIL: gcc.target/i386/addr-sel-1.c
PR 63615 was caused by r21642 fixing the accounting of n_constants in
simplify_plus_minus.  Previously, any expression handled by
simplify_plus_minus that expanded to more than two elements and
contained at least one constant would have resulted in n_constants
being larger than one, even if it had only one constant.  This had the
effect of setting canonicalized for such expressions.

The missed optimisation had these operands to simplify_plus_minus:

(gdb) p debug_rtx(op0)
(plus:SI (reg:SI 0 ax [96])
    (const_int 1 [0x1]))
$1 = void
(gdb) p debug_rtx(op1)
(symbol_ref:SI ("a") <var_decl 0x76c2f900 a>)
$2 = void

resulting in the ops array being populated as

(gdb) p n_ops
$3 = 3
(gdb) p ops[0]@3
$4 = {{op = 0x76d4b360, neg = 0}, {op = 0x76d483a8, neg = 0}, {op = 0x76c29490, neg = 0}}
(gdb) p debug_rtx(ops[0].op)
(reg:SI 0 ax [96])
$5 = void
(gdb) p debug_rtx(ops[1].op)
(symbol_ref:SI ("a") <var_decl 0x76c2f900 a>)
$6 = void
(gdb) p debug_rtx(ops[2].op)
(const_int 1 [0x1])
$7 = void

Of note here is that the operands have been reordered from their
original positions.  What was ax + 1 + sym is now ax + sym + 1, and it
happens that this ordering is correct in the sense that
simplify_plus_minus_op_data_cmp sorting of the ops array produces no
changes.  Now any change made during sorting sets canonicalized, so I
figure reordering while decomposing operands ought to set canonicalized
too.  Indeed, the reordering seen above has canonicalized the
expression.  (Of course the reordering during decomposition might be
exactly cancelled by later sorting, but it hardly seems worth fixing
that, and other cases where we might return the input expression
unchanged..)

I'm still running bootstrap and regression tests on x86_64-linux, this
time with both -m64 and -m32 regression tests.  OK to apply assuming no
regressions?

    PR rtl-optimization/63615
    * simplify-rtx.c (simplify_plus_minus): Set canonicalized on
    decomposing PLUS or MINUS if operands are not placed adjacent
    in the ops array.

Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c (revision 216573)
+++ gcc/simplify-rtx.c (working copy)
@@ -4006,7 +4006,7 @@
 
               ops[i].op = XEXP (this_op, 0);
               changed = 1;
-              canonicalized |= this_neg;
+              canonicalized |= this_neg || i != n_ops - 2;
               break;
 
             case NEG:

-- 
Alan Modra
Australia Development Lab, IBM
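The adjacency test `i != n_ops - 2` can be seen in a toy model of the
decomposition loop. This is only a sketch under simplifying
assumptions: it handles PLUS only, ignores negation and constants, and
`struct expr` and `decompose` are hypothetical names, not GCC's RTL
API. When the PLUS in slot i is split, its first operand stays in slot
i and its second operand is appended at the end, so the pair is
adjacent only if slot i was the last slot before the append.

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression node (hypothetical): a leaf has lhs == rhs == NULL,
   anything else is a PLUS of lhs and rhs.  */
struct expr { const char *name; const struct expr *lhs, *rhs; };

#define MAX_OPS 8

/* Split every PLUS in ops[0..*n_ops-1] in place: the first operand
   stays in slot i, the second is appended at the end.  Returns nonzero
   if any split left the two operands non-adjacent, i.e. reordered the
   expression relative to its original operand order.  */
int decompose(const struct expr *ops[MAX_OPS], int *n_ops)
{
    int canonicalized = 0;
    int changed;

    do {
        changed = 0;
        for (int i = 0; i < *n_ops; ++i) {
            const struct expr *this_op = ops[i];

            if (this_op->lhs == NULL)
                continue;                  /* leaf, nothing to split */
            assert(*n_ops < MAX_OPS);
            ops[*n_ops] = this_op->rhs;    /* second operand goes last */
            ++*n_ops;
            ops[i] = this_op->lhs;         /* first operand stays put */
            changed = 1;
            canonicalized |= (i != *n_ops - 2);
        }
    } while (changed);
    return canonicalized;
}
```

Starting from `{ (ax + 1), a }` the result is `{ ax, a, 1 }` and the
function reports a reordering, matching the gdb session above.
Starting from the single entry `{ (ax + 1) }` the result is
`{ ax, 1 }`, the operands stay adjacent, and nothing is reported.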
Re: avoid alignment of static variables affecting stack's
On 23.10.14 at 08:50, ja...@redhat.com wrote:
> On Thu, Oct 23, 2014 at 07:30:27AM +0100, Jan Beulich wrote:
>> Function (or narrower) scope static variables (as well as others not
>> placed on the stack) should also not have any effect on the stack
>> alignment. I noticed the issue first with Linux's dynamic_pr_debug()
>> construct using an 8-byte aligned sub-file-scope local variable.
>>
>> According to my checking bad behavior started with 4.6.x (4.5.3 was
>> still okay), but generated code got quite a bit worse as of 4.9.0.
>
> If the static/external var has BLKmode, then perhaps it is safe, but I
> wonder about other vars, say vectors etc.  Such vars are most likely
> loaded from their memory location, and if for some reason that needs to
> be spilled again, stack realignment would not be able to do that.
> Or do we inspect the IL and for any pseudos with modes needing larger
> alignment we adjust the dynamic stack realignment fields?

I don't know, but it would seem to me that this ought to happen anyway:
If the pseudo holds the result of some computation other than a simple
load from memory and needs spilling, the same would apply afaict.

Furthermore, shouldn't there be an existing test case for what you
describe, and hence me not seeing regressions with the patch in place
should be sufficient proof of there not being an issue?

Jan
[PATCHv4] Enable -fsanitize-recover for KASan
Hi all,

On 09/29/2014 09:21 PM, Yury Gribov wrote:
>> This patch enables -fsanitize-recover for KASan by default. This causes
>> KASan to continue execution after error in case of inline
>> instrumentation. This feature is needed because
>> - reports during early bootstrap won't even be printed
>> - needed to run all tests w/o rebooting machine for every test
>> - needed for interactive work on desktop
>
> This is the third version of patch which renames -fsanitize-recover to
> -fubsan-recover and introduces -fasan-recover (enabled by default for
> KASan). It also moves flag handling to finish_options per Jakub's request.

A new version of patch based upon Jakub's recent changes to
-fsanitize-recover=. I've renamed __asan_report_recover_load* to
__asan_report_load*_noabort to match UBSan's style.

Note that currently -fsanitize=kernel-address
-fno-sanitize-recover=kernel-address won't work as expected because we
miss __asan_load*_abort family of functions in libasan.

Bootstrapped and regtested on x64.

-Y

From 75586eb21f1272a4fcf7c626d9b740eed7150c2c Mon Sep 17 00:00:00 2001
From: Yury Gribov <y.gri...@samsung.com>
Date: Wed, 22 Oct 2014 17:24:55 +0400
Subject: [PATCH] Enable -fsanitize-recover for KASan.

2014-10-22  Yury Gribov  <y.gri...@samsung.com>

gcc/
    * asan.c (report_error_func): Add noabort path.
    (check_func): Formatting.
    (asan_expand_check_ifn): Handle noabort path.
    * common.opt (flag_sanitize_recover): Add SANITIZE_KERNEL_ADDRESS
    to default value.
    * doc/invoke.texi (-fsanitize-recover=): Mention KASan.
    * opts.c (finish_options): Reword comment.
    * sanitizer.def: Add noabort ASan builtins.

gcc/testsuite/
    * c-c++-common/asan/kasan-recover-1.c: New test.
---
 gcc/asan.c                                        | 52 ++---
 gcc/common.opt                                    |  2 +-
 gcc/doc/invoke.texi                               |  8 ++--
 gcc/opts.c                                        |  4 +-
 gcc/sanitizer.def                                 | 38 +++
 gcc/testsuite/c-c++-common/asan/kasan-recover-1.c | 12 +
 6 files changed, 93 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/asan/kasan-recover-1.c

diff --git a/gcc/asan.c b/gcc/asan.c
index 97f0b4c..6b04591 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1344,22 +1344,36 @@ asan_protect_global (tree decl)
    IS_STORE is either 1 (for a store) or 0 (for a load).  */
 
 static tree
-report_error_func (bool is_store, HOST_WIDE_INT size_in_bytes, int *nargs)
-{
-  static enum built_in_function report[2][6]
-    = { { BUILT_IN_ASAN_REPORT_LOAD1, BUILT_IN_ASAN_REPORT_LOAD2,
-          BUILT_IN_ASAN_REPORT_LOAD4, BUILT_IN_ASAN_REPORT_LOAD8,
-          BUILT_IN_ASAN_REPORT_LOAD16, BUILT_IN_ASAN_REPORT_LOAD_N },
-        { BUILT_IN_ASAN_REPORT_STORE1, BUILT_IN_ASAN_REPORT_STORE2,
-          BUILT_IN_ASAN_REPORT_STORE4, BUILT_IN_ASAN_REPORT_STORE8,
-          BUILT_IN_ASAN_REPORT_STORE16, BUILT_IN_ASAN_REPORT_STORE_N } };
+report_error_func (bool is_store, bool recover_p, HOST_WIDE_INT size_in_bytes,
+                   int *nargs)
+{
+  static enum built_in_function report[2][2][6]
+    = { { { BUILT_IN_ASAN_REPORT_LOAD1, BUILT_IN_ASAN_REPORT_LOAD2,
+            BUILT_IN_ASAN_REPORT_LOAD4, BUILT_IN_ASAN_REPORT_LOAD8,
+            BUILT_IN_ASAN_REPORT_LOAD16, BUILT_IN_ASAN_REPORT_LOAD_N },
+          { BUILT_IN_ASAN_REPORT_STORE1, BUILT_IN_ASAN_REPORT_STORE2,
+            BUILT_IN_ASAN_REPORT_STORE4, BUILT_IN_ASAN_REPORT_STORE8,
+            BUILT_IN_ASAN_REPORT_STORE16, BUILT_IN_ASAN_REPORT_STORE_N } },
+        { { BUILT_IN_ASAN_REPORT_LOAD1_NOABORT,
+            BUILT_IN_ASAN_REPORT_LOAD2_NOABORT,
+            BUILT_IN_ASAN_REPORT_LOAD4_NOABORT,
+            BUILT_IN_ASAN_REPORT_LOAD8_NOABORT,
+            BUILT_IN_ASAN_REPORT_LOAD16_NOABORT,
+            BUILT_IN_ASAN_REPORT_LOAD_N_NOABORT },
+          { BUILT_IN_ASAN_REPORT_STORE1_NOABORT,
+            BUILT_IN_ASAN_REPORT_STORE2_NOABORT,
+            BUILT_IN_ASAN_REPORT_STORE4_NOABORT,
+            BUILT_IN_ASAN_REPORT_STORE8_NOABORT,
+            BUILT_IN_ASAN_REPORT_STORE16_NOABORT,
+            BUILT_IN_ASAN_REPORT_STORE_N_NOABORT } } };
   if (size_in_bytes == -1)
     {
       *nargs = 2;
-      return builtin_decl_implicit (report[is_store][5]);
+      return builtin_decl_implicit (report[recover_p][is_store][5]);
     }
   *nargs = 1;
-  return builtin_decl_implicit (report[is_store][exact_log2 (size_in_bytes)]);
+  int size_log2 = exact_log2 (size_in_bytes);
+  return builtin_decl_implicit (report[recover_p][is_store][size_log2]);
 }
 
 /* Construct a function tree for __asan_{load,store}{1,2,4,8,16,_n}.
@@ -1370,11 +1384,11 @@ check_func (bool is_store, int size_in_bytes, int *nargs)
 {
   static enum built_in_function check[2][6]
     = { { BUILT_IN_ASAN_LOAD1, BUILT_IN_ASAN_LOAD2,
-          BUILT_IN_ASAN_LOAD4, BUILT_IN_ASAN_LOAD8,
-          BUILT_IN_ASAN_LOAD16, BUILT_IN_ASAN_LOADN },
+          BUILT_IN_ASAN_LOAD4, BUILT_IN_ASAN_LOAD8,
+          BUILT_IN_ASAN_LOAD16, BUILT_IN_ASAN_LOADN },
         { BUILT_IN_ASAN_STORE1, BUILT_IN_ASAN_STORE2,
-          BUILT_IN_ASAN_STORE4, BUILT_IN_ASAN_STORE8,
-
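The three-dimensional lookup that report_error_func gains in this patch
can be sketched in standalone C. The table below maps
(recover_p, is_store, size index) to a runtime function name instead of
a builtin enum; the helper names `size_index` and `report_func_name`
are hypothetical, and the strings follow the `*_noabort` naming
described in the message rather than being taken from the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>  /* strcmp, for checking the returned names */

/* Index of SIZE_IN_BYTES in the last table dimension: 0..4 for the
   power-of-two sizes 1..16, 5 for the variable-size "_n" entry
   (requested with -1, as in the patch).  */
static int size_index(long size_in_bytes)
{
    int idx = 0;

    if (size_in_bytes == -1)
        return 5;
    while ((1L << idx) < size_in_bytes)
        ++idx;
    assert(idx <= 4 && (1L << idx) == size_in_bytes);
    return idx;
}

/* Hypothetical name lookup mirroring report[recover_p][is_store][n].  */
const char *report_func_name(int is_store, int recover_p, long size_in_bytes)
{
    static const char *const report[2][2][6] = {
        { { "__asan_report_load1", "__asan_report_load2",
            "__asan_report_load4", "__asan_report_load8",
            "__asan_report_load16", "__asan_report_load_n" },
          { "__asan_report_store1", "__asan_report_store2",
            "__asan_report_store4", "__asan_report_store8",
            "__asan_report_store16", "__asan_report_store_n" } },
        { { "__asan_report_load1_noabort", "__asan_report_load2_noabort",
            "__asan_report_load4_noabort", "__asan_report_load8_noabort",
            "__asan_report_load16_noabort", "__asan_report_load_n_noabort" },
          { "__asan_report_store1_noabort", "__asan_report_store2_noabort",
            "__asan_report_store4_noabort", "__asan_report_store8_noabort",
            "__asan_report_store16_noabort",
            "__asan_report_store_n_noabort" } } };

    return report[recover_p != 0][is_store != 0][size_index(size_in_bytes)];
}
```

For example, a recoverable 4-byte load selects
`__asan_report_load4_noabort`, and a non-recoverable variable-size
store selects `__asan_report_store_n`. Adding the recover dimension as
the outermost index keeps the existing [is_store][size] layout intact.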
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On Thu, Oct 23, 2014 at 11:11:29AM +0400, Yury Gribov wrote:
> Hi all,
>
> On 09/29/2014 09:21 PM, Yury Gribov wrote:
>>> This patch enables -fsanitize-recover for KASan by default. This causes
>>> KASan to continue execution after error in case of inline
>>> instrumentation. This feature is needed because
>>> - reports during early bootstrap won't even be printed
>>> - needed to run all tests w/o rebooting machine for every test
>>> - needed for interactive work on desktop
>>
>> This is the third version of patch which renames -fsanitize-recover to
>> -fubsan-recover and introduces -fasan-recover (enabled by default for
>> KASan). It also moves flag handling to finish_options per Jakub's request.
>
> A new version of patch based upon Jakub's recent changes to
> -fsanitize-recover=. I've renamed __asan_report_recover_load* to
> __asan_report_load*_noabort to match UBSan's style.
>
> Note that currently -fsanitize=kernel-address
> -fno-sanitize-recover=kernel-address won't work as expected because we
> miss __asan_load*_abort family of functions in libasan.

I thought __asan_* functions are provided by the kernel, not libasan,
for -fsanitize=kernel-address.  Or is kernel linked with real libasan.a
or some stripped down version thereof?

    Jakub
Re: [PATCH] Account for prologue spills in reg_pressure scheduling
Maxim Kuvyrkov <maxim.kuvyr...@linaro.org> writes:
> @@ -6701,6 +6753,24 @@ alloc_global_sched_pressure_data (void)
>            saved_reg_live = BITMAP_ALLOC (NULL);
>            region_ref_regs = BITMAP_ALLOC (NULL);
>          }
> +
> +      /* Calculate number of CALL_USED_REGS in register classes that
> +         we calculate register pressure for.  */
> +      for (int c = 0; c < ira_pressure_classes_num; ++c)
> +        {
> +          enum reg_class cl = ira_pressure_classes[c];
> +
> +          call_used_regs_num[cl] = 0;
> +
> +          for (int i = 0; i < ira_class_hard_regs_num[cl]; ++i)
> +            if (call_used_regs[ira_class_hard_regs[cl][i]])
> +              {
> +                /* Register I belongs to pressure class CL.  Pressure
> +                   classes do not intersect, so don't look further.  */
> +                ++call_used_regs_num[cl];
> +                break;
> +              }
> +        }

I don't think we want the break here.  The effect would be to count
at most one call-used register per pressure class.

Thanks,
Richard
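The effect of the stray break is easy to reproduce outside GCC. The
sketch below is a simplified standalone model, with the hypothetical
function `count_call_used` and plain arrays standing in for
ira_class_hard_regs and call_used_regs: with the early exit the count
saturates at one, without it every call-used register in the class is
counted.

```c
#include <assert.h>
#include <stdbool.h>

/* Count the call-used registers among CLASS_REGS[0..N_REGS-1].
   CALL_USED is indexed by hard register number.  With STOP_AT_FIRST
   set, this mimics the loop with the questioned break.  */
int count_call_used(const int *class_regs, int n_regs,
                    const bool *call_used, bool stop_at_first)
{
    int count = 0;

    assert(n_regs >= 0);
    for (int i = 0; i < n_regs; ++i)
        if (call_used[class_regs[i]]) {
            ++count;
            if (stop_at_first)
                break;   /* the bug: at most one register is counted */
        }
    return count;
}
```

For a class holding registers 0..3 of which 0, 2 and 3 are call-used,
the version with the break reports 1 while the version without it
reports 3.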
Re: [PATCH] Account for prologue spills in reg_pressure scheduling
On Oct 23, 2014, at 8:20 PM, Richard Sandiford <richard.sandif...@arm.com> wrote:
> Maxim Kuvyrkov <maxim.kuvyr...@linaro.org> writes:
>> @@ -6701,6 +6753,24 @@ alloc_global_sched_pressure_data (void)
>>            saved_reg_live = BITMAP_ALLOC (NULL);
>>            region_ref_regs = BITMAP_ALLOC (NULL);
>>          }
>> +
>> +      /* Calculate number of CALL_USED_REGS in register classes that
>> +         we calculate register pressure for.  */
>> +      for (int c = 0; c < ira_pressure_classes_num; ++c)
>> +        {
>> +          enum reg_class cl = ira_pressure_classes[c];
>> +
>> +          call_used_regs_num[cl] = 0;
>> +
>> +          for (int i = 0; i < ira_class_hard_regs_num[cl]; ++i)
>> +            if (call_used_regs[ira_class_hard_regs[cl][i]])
>> +              {
>> +                /* Register I belongs to pressure class CL.  Pressure
>> +                   classes do not intersect, so don't look further.  */
>> +                ++call_used_regs_num[cl];
>> +                break;
>> +              }
>> +        }
>
> I don't think we want the break here.  The effect would be to count
> at most one call-used register per pressure class.

You make me feel dumb :-)

Thank you,

--
Maxim Kuvyrkov
www.linaro.org
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On 10/23/2014 11:13 AM, Jakub Jelinek wrote:
> On Thu, Oct 23, 2014 at 11:11:29AM +0400, Yury Gribov wrote:
>> Hi all,
>>
>> On 09/29/2014 09:21 PM, Yury Gribov wrote:
>>>> This patch enables -fsanitize-recover for KASan by default. This causes
>>>> KASan to continue execution after error in case of inline
>>>> instrumentation. This feature is needed because
>>>> - reports during early bootstrap won't even be printed
>>>> - needed to run all tests w/o rebooting machine for every test
>>>> - needed for interactive work on desktop
>>>
>>> This is the third version of patch which renames -fsanitize-recover to
>>> -fubsan-recover and introduces -fasan-recover (enabled by default for
>>> KASan). It also moves flag handling to finish_options per Jakub's request.
>>
>> A new version of patch based upon Jakub's recent changes to
>> -fsanitize-recover=. I've renamed __asan_report_recover_load* to
>> __asan_report_load*_noabort to match UBSan's style.
>>
>> Note that currently -fsanitize=kernel-address
>> -fno-sanitize-recover=kernel-address won't work as expected because we
>> miss __asan_load*_abort family of functions in libasan.
>
> I thought __asan_* functions are provided by the kernel, not libasan,
> for -fsanitize=kernel-address.  Or is kernel linked with real libasan.a
> or some stripped down version thereof?

Hm, right, libasan is not linked to kernel so it indeed does not need
any changes.

But now I see that for -fsanitize=kernel-address we need both
__asan_load* and __asan_load*_noabort (the latter being default)
depending on -fsanitize-recover setting. Let me update the patch for
this.

-Y
[PATCH] Var-tracking initialization fix (PR debug/63623)
Hi!

As I wrote in the PR, vt_stack_adjustments can often compute wrong
offsets, because it never considers pops with autoinc addressing, which
can lead either to wrong debug info, or turning off -fvar-tracking
altogether for a function on which that issue resulted in stack depth
inconsistencies on the edges.

Here are some stats from --enable-checking=yes,rtl cc1plus bootstrapped
with/without the patch (without the patch got then rebuilt stage3 with
the patch, i.e. just var-tracking.o and cc1plus-checksum.o got
recompiled, so I'm comparing identical code, different debug info):

x86_64-linux cc1plus built without the patch:
cov%    samples         cumul
0.0     506230/38%      506230/38%
0..10   10327/0%        516557/39%
11..20  12390/0%        528947/39%
21..30  31265/2%        560212/42%
31..40  18775/1%        578987/43%
41..50  20631/1%        599618/45%
51..60  24921/1%        624539/47%
61..70  40959/3%        665498/50%
71..80  23771/1%        689269/52%
81..90  41771/3%        731040/55%
91..99  81667/6%        812707/61%
100     510564/38%      1323271/100%

x86_64-linux cc1plus built with the patch:
cov%    samples         cumul
0.0     382214/28%      382214/28%
0..10   13100/0%        395314/29%
11..20  14568/1%        409882/30%
21..30  33708/2%        443590/33%
31..40  21927/1%        465517/35%
41..50  23924/1%        489441/36%
51..60  28736/2%        518177/39%
61..70  45847/3%        564024/42%
71..80  29284/2%        593308/44%
81..90  52085/3%        645393/48%
91..99  99971/7%        745364/56%
100     577907/43%      1323271/100%

i686-linux cc1plus built without the patch:
cov%    samples         cumul
0.0     631348/48%      631348/48%
0..10   7764/0%         639112/48%
11..20  9690/0%         648802/49%
21..30  25036/1%        673838/51%
31..40  16113/1%        689951/52%
41..50  19753/1%        709704/54%
51..60  14563/1%        724267/55%
61..70  34093/2%        758360/58%
71..80  17450/1%        775810/59%
81..90  31339/2%        807149/61%
91..99  60368/4%        867517/66%
100     437548/33%      1305065/100%

i686-linux cc1plus built with the patch:
cov%    samples         cumul
0.0     377352/28%      377352/28%
0..10   16077/1%        393429/30%
11..20  15390/1%        408819/31%
21..30  31790/2%        440609/33%
31..40  23889/1%        464498/35%
41..50  29267/2%        493765/37%
51..60  22902/1%        516667/39%
61..70  45629/3%        562296/43%
71..80  29511/2%        591807/45%
81..90  50536/3%        642343/49%
91..99  93584/7%        735927/56%
100     569138/43%      1305065/100%

.debug_info/.debug_loc sizes in bytes:
x86_64-linux cc1plus without patch  .debug_info 75411710, .debug_loc 75421077
x86_64-linux cc1plus with patch     .debug_info 78498790, .debug_loc 90530117
i686-linux cc1plus without patch    .debug_info 59921183, .debug_loc 37823166
i686-linux cc1plus with patch       .debug_info 63009554, .debug_loc 59535100

I've also performed instrumented bootstraps/regtests (x86_64-linux and
i686-linux), where I've logged in how many functions the result of
vt_stack_adjustments differed between the bad old way and new way.
In both the bootstraps/regtests, it affected 16892 32-bit and 6646
64-bit functions, in all cases it was old way giving up and new way
succeeding.

Not adding a testcase, as the one in the PR failed to produce proper
debug info only in 4.8 (then got latent), there already are some
guality improvements with the patch:
-FAIL: gcc.dg/guality/pr54693-2.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions line 21 i == v + 1
-FAIL: gcc.dg/guality/pr54693-2.c -O3 -fomit-frame-pointer -funroll-loops line 21 i == v + 1
on x86_64 and:
-FAIL: gcc.dg/guality/pr54519-1.c -O2 line 20 y == 25
-FAIL: gcc.dg/guality/pr54519-1.c -O2 line 20 z == 6
-FAIL: gcc.dg/guality/pr54519-1.c -O2 line 23 y == 117
-FAIL: gcc.dg/guality/pr54519-1.c -O2 line 23 z == 8
-FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 x == 36
-FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 y == 25
-FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 z == 6
-FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 x == 36
-FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 y == 25
-FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 z == 6
-FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 x == 36
-FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 y == 25
-FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 z == 6
-FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 x == 98
-FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 y == 117
-FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 z == 8
-FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto line 20 x == 36
-FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto line 23 x == 98
-FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto -flto-partition=none line 20 x == 36
-FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto -flto-partition=none line 23 x == 98
-FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 20 x == 36
-FAIL:
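The failure mode described at the top of this message can be pictured
with a toy model. This is not GCC's var-tracking code; the instruction
kinds, `struct insn` and `stack_depth` are hypothetical. The function
sums stack adjustments along one path, optionally ignoring pops that
are expressed as auto-increment loads through the stack pointer, which
is the case the old computation is said to miss.

```c
#include <assert.h>

/* Toy instruction kinds (hypothetical).  INSN_POP_AUTOINC stands for a
   pop expressed as a load with auto-increment of the stack pointer.  */
enum insn_kind { INSN_PUSH, INSN_POP_AUTOINC, INSN_SP_ADJUST };

struct insn {
    enum insn_kind kind;
    int bytes;  /* bytes pushed or popped, or signed change in depth */
};

/* Net stack depth after executing INSNS[0..N-1].  With
   TRACK_AUTOINC_POPS clear, auto-increment pops are ignored, as in the
   behaviour the message describes as broken.  */
int stack_depth(const struct insn *insns, int n, int track_autoinc_pops)
{
    int depth = 0;

    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        switch (insns[i].kind) {
        case INSN_PUSH:
            depth += insns[i].bytes;
            break;
        case INSN_POP_AUTOINC:
            if (track_autoinc_pops)
                depth -= insns[i].bytes;
            break;
        case INSN_SP_ADJUST:
            depth += insns[i].bytes;
            break;
        }
    }
    return depth;
}
```

For a path that pushes 8 bytes and pops them again, the tracking
variant ends at depth 0 while the non-tracking variant ends at 8. If a
sibling path that touches no stack joins the same block, the two
incoming depths disagree in the non-tracking variant, which is the kind
of edge inconsistency that makes the analysis give up on a function.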
Re: [patch,gomp-4_0-branch] openacc parallel reduction part 1
Hi!

On Mon, 28 Jul 2014 10:36:03 -0700, Cesar Philippidis <cesar_philippi...@mentor.com> wrote:
> On 07/28/2014 10:02 AM, Thomas Schwinge wrote:
>> On Sun, 6 Jul 2014 16:10:56 -0700, Cesar Philippidis <cesar_philippi...@mentor.com> wrote:
>>> This patch is the first step to enabling parallel reductions in openacc.
>>
>> I think I have found one issue in this code -- but please verify that my
>> understanding of reductions is correct.  Namely:
>>
>>> --- a/gcc/omp-low.c
>>> +++ b/gcc/omp-low.c
>>
>>> +/* Helper function to finalize local data for the reduction arrays. The
>>> +   reduction array needs to be reduced to the original reduction variable.
>>> +   FIXME: This function assumes that there are vector_length threads in
>>> +   total.  Also, it assumes that there are at least vector_length iterations
>>> +   in the for loop.  */
>>> +
>>> +static void
>>> +finalize_reduction_data (tree clauses, tree nthreads, gimple_seq *stmt_seqp,
>>> +                         omp_context *ctx)
>>> +{
>>> +  gcc_assert (is_gimple_omp_oacc_specifically (ctx->stmt));
>>> +
>>> +  tree c, var, array, loop_header, loop_body, loop_exit;
>>> +  gimple stmt;
>>> +
>>> +  /* Create for loop.
>>> +
>>> +     let var = the original reduction variable
>>> +     let array = reduction variable array
>>> +
>>> +     var = array[0]
>>> +     for (i = 1; i < nthreads; i++)
>>> +       var op= array[i]
>>> +  */
>>
>> This should also consider the reduction variable's original value.  Test
>> case (which does the expected thing if modified for OpenMP):
>>
>>     #include <stdlib.h>
>>
>>     int
>>     main(void)
>>     {
>>     #define I 5
>>     #define N 11
>>     #define A 8
>>
>>       int a = A;
>>       int s = I;
>>
>>     #pragma acc parallel vector_length(N)
>>       {
>>         int i;
>>     #pragma acc loop reduction(+:s)
>>         for (i = 0; i < N; ++i)
>>           s += a;
>>       }
>>
>>       if (s != I + N * A)
>>         abort ();
>>
>>       return 0;
>>     }
>>
>> OK to check in the following?
>
> Reductions can be specified with both the parallel and loop constructs.
> According to section 2.5.11 in the OpenACC spec, a reduction in a
> parallel construct should behave as you described:
>
>   At the end of the region, the values for each gang are combined
>   using the reduction operator, and the result combined with the
>   value of the original variable and stored in the original variable.
>
> However, in section 2.7.11, a reduction in a loop construct behaves as
> follows:
>
>   At the end of the loop, the values for each thread are combined
>   using the specified reduction operator, and the result stored in
>   the original variable at the end of the parallel or kernels region.
>
> The parallel reduction behavior does make more sense though. I'll ask
> the openacc gurus if there's a typo in section 2.7.11. It does refer to
> parallel reduction.

I proceeded by checking in the following patch to gomp-4_0-branch,
r216574:

commit 75e2a58b8ef7d20be2239ff029493986542ee7e3
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Thu Oct 23 07:26:40 2014 +0000

    OpenACC reductions: Don't skip the reduction variable's original value.

        gcc/
        * omp-low.c (finalize_reduction_data): Don't skip the reduction
        variable's original value.
        libgomp/
        * testsuite/libgomp.oacc-c/reduction-initial-1.c: New file.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@216574 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp                                 |  5 +++
 gcc/omp-low.c                                      | 40 ++
 libgomp/ChangeLog.gomp                             |  4 +++
 .../testsuite/libgomp.oacc-c/reduction-initial-1.c | 32 +
 4 files changed, 44 insertions(+), 37 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 6d107d2..28e7252 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-10-23  Thomas Schwinge  <tho...@codesourcery.com>
+
+    * omp-low.c (finalize_reduction_data): Don't skip the reduction
+    variable's original value.
+
 2014-10-20  Cesar Philippidis  <ce...@codesourcery.com>
 
     * gimplify.c (gimplify_scan_omp_clauses): Remove switch stmt which
diff --git gcc/omp-low.c gcc/omp-low.c
index b8022c2..b21235f 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -9869,8 +9869,7 @@ finalize_reduction_data (tree clauses, tree nthreads, gimple_seq *stmt_seqp,
      let var = the original reduction variable
      let array = reduction variable array
 
-     var = array[0]
-     for (i = 1; i < nthreads; i++)
+     for (i = 0; i < nthreads; i++)
        var op= array[i]
   */
 
@@ -9878,42 +9877,9 @@ finalize_reduction_data (tree clauses, tree nthreads, gimple_seq *stmt_seqp,
   loop_body = create_artificial_label (UNKNOWN_LOCATION);
   loop_exit = create_artificial_label (UNKNOWN_LOCATION);
 
-  /* Initialize the reduction variables to be value of the first array
-     element.  */
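The difference between the two finalization loops can be checked with a
small sequential model. It assumes a `+` reduction over `int` and
per-thread partial results that start from the identity 0; the function
names `finalize_skipping_original` and `finalize_with_original` are
hypothetical and only mirror the pseudocode comments in the diff.

```c
#include <assert.h>

/* Old scheme: var = array[0]; for (i = 1; ...) var op= array[i].
   The incoming value of VAR is overwritten and therefore lost.  */
int finalize_skipping_original(int var, const int *array, int nthreads)
{
    assert(nthreads > 0);
    var = array[0];
    for (int i = 1; i < nthreads; i++)
        var += array[i];
    return var;
}

/* New scheme: for (i = 0; ...) var op= array[i].
   The partial results are folded into the original value of VAR.  */
int finalize_with_original(int var, const int *array, int nthreads)
{
    assert(nthreads >= 0);
    for (int i = 0; i < nthreads; i++)
        var += array[i];
    return var;
}
```

With the values from the test case (I = 5, N = 11, A = 8) and one
partial result of 8 per thread, the old scheme yields 88 and the new
scheme yields 93, which equals I + N * A.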
Re: [PATCHv2] Don't expand string/memory builtins if ASan is enabled.
On 10/21/2014 05:26 PM, Maxim Ostapenko wrote:
> Hi,
>
> this is the second version of the patch. Here the major changes from the previous one:

[snip]

> case BUILT_IN_BCMP:
> + *intercepted_p = false;

The code which identifies interceptors seems to be duplicated in expand_builtin. What about factoring this out to some is_asan_intercepted() in asan.h?

-Y
[PATCH] Fix genmatch linking
This adds a libcpp host module without NLS and ICONV support and properly links genmatch against the build libcpp instead of the host one. Bootstrap running on x86_64-unknown-linux-gnu (stage1 all-gcc finished fine). Ok for trunk? Thanks, Richard. 2014-10-23 Richard Biener rguent...@suse.de * Makefile.def: Add libcpp build module and dependencies. * configure.ac: Add libcpp build module. * Makefile.in: Regenerate. * configure: Likewise. gcc/ * Makefile.in (BUILD_CPPLIB): Add. (build/genmatch$(build_exeext)): Use BUILD_CPPLIB, not CPPLIB. Drop LIBIBERTY. Index: Makefile.def === --- Makefile.def(revision 216543) +++ Makefile.def(working copy) @@ -30,6 +30,8 @@ build_modules= { module= flex; }; build_modules= { module= m4; }; build_modules= { module= texinfo; }; build_modules= { module= fixincludes; }; +build_modules= { module= libcpp; + extra_configure_flags='--disable-nls am_cv_func_iconv=no';}; host_modules= { module= bfd; bootstrap=true; }; host_modules= { module= opcodes; bootstrap=true; }; @@ -301,6 +303,7 @@ dependencies = { module=all-build-flex; dependencies = { module=all-build-libiberty; on=all-build-texinfo; }; dependencies = { module=all-build-m4; on=all-build-texinfo; }; dependencies = { module=all-build-fixincludes; on=all-build-libiberty; }; +dependencies = { module=all-build-libcpp; on=all-build-libiberty; }; // Host modules specific to gcc. 
dependencies = { module=configure-gcc; on=configure-intl; }; @@ -321,6 +324,7 @@ dependencies = { module=all-gcc; on=all- dependencies = { module=all-gcc; on=all-build-flex; }; dependencies = { module=all-gcc; on=all-build-libiberty; }; dependencies = { module=all-gcc; on=all-build-fixincludes; }; +dependencies = { module=all-gcc; on=all-build-libcpp; }; dependencies = { module=all-gcc; on=all-zlib; }; dependencies = { module=all-gcc; on=all-libbacktrace; hard=true; }; dependencies = { module=all-gcc; on=all-libcpp; hard=true; }; Index: configure.ac === --- configure.ac(revision 216543) +++ configure.ac(working copy) @@ -126,7 +126,7 @@ extra_host_args= # these library is used by various programs built for the build # environment # -build_libs=build-libiberty +build_libs=build-libiberty build-libcpp # these tools are built for the build environment build_tools=build-texinfo build-flex build-bison build-m4 build-fixincludes Index: gcc/Makefile.in === --- gcc/Makefile.in (revision 216543) +++ gcc/Makefile.in (working copy) @@ -981,6 +981,7 @@ else LIBIBERTY = ../libiberty/libiberty.a BUILD_LIBIBERTY = $(build_libobjdir)/libiberty/libiberty.a endif +BUILD_CPPLIB = $(build_libobjdir)/libcpp/libcpp.a # Dependencies on the intl and portability libraries. LIBDEPS= libcommon.a $(CPPLIB) $(LIBIBERTY) $(LIBINTL_DEP) $(LIBICONV_DEP) \ @@ -2520,7 +2521,7 @@ genprog = $(genprogerr) check checksum c # These programs need libs over and above what they get from the above list. build/genautomata$(build_exeext) : BUILD_LIBS += -lm -build/genmatch$(build_exeext) : $(CPPLIB) $(LIBIBERTY) \ +build/genmatch$(build_exeext) : $(BUILD_CPPLIB) \ $(BUILD_ERRORS) build/vec.o build/hash-table.o # These programs are not linked with the MD reader. 
Index: Makefile.in === --- Makefile.in (revision 216543) +++ Makefile.in (working copy) @@ -991,6 +991,7 @@ all-build: maybe-all-build-flex all-build: maybe-all-build-m4 all-build: maybe-all-build-texinfo all-build: maybe-all-build-fixincludes +all-build: maybe-all-build-libcpp .PHONY: all-host @@ -3005,6 +3006,63 @@ all-build-fixincludes: configure-build-f +.PHONY: configure-build-libcpp maybe-configure-build-libcpp +maybe-configure-build-libcpp: +@if gcc-bootstrap +configure-build-libcpp: stage_current +@endif gcc-bootstrap +@if build-libcpp +maybe-configure-build-libcpp: configure-build-libcpp +configure-build-libcpp: + @: $(MAKE); $(unstage) + @r=`${PWD_COMMAND}`; export r; \ + s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ + test ! -f $(BUILD_SUBDIR)/libcpp/Makefile || exit 0; \ + $(SHELL) $(srcdir)/mkinstalldirs $(BUILD_SUBDIR)/libcpp ; \ + $(BUILD_EXPORTS) \ + echo Configuring in $(BUILD_SUBDIR)/libcpp; \ + cd $(BUILD_SUBDIR)/libcpp || exit 1; \ + case $(srcdir) in \ + /* | [A-Za-z]:[\\/]*) topdir=$(srcdir) ;; \ + *) topdir=`echo $(BUILD_SUBDIR)/libcpp/ | \ + sed -e 's,\./,,g' -e 's,[^/]*/,../,g' `$(srcdir) ;; \ + esac; \ + module_srcdir=libcpp; \ + rm -f no-such-file || : ; \ + CONFIG_SITE=no-such-file $(SHELL) \ + $$s/$$module_srcdir/configure \ +
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
Rainer Orth r...@cebitec.uni-bielefeld.de writes:

> Two of the new gcc.dg/ipa/ipa-icf-*.c tests were FAILing on Solaris:
>
> FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors)
> UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Equal symbols: 1
> UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Semantic equality hit:bar-foo
> FAIL: gcc.dg/ipa/ipa-icf-28.c (test for excess errors)
> UNRESOLVED: gcc.dg/ipa/ipa-icf-28.c scan-ipa-dump icf Equal symbols: 0
> UNRESOLVED: gcc.dg/ipa/ipa-icf-28.c scan-ipa-dump icf attribute values are different
>
> The former requires xmmintrin.h which is x86-only, thus cannot work anywhere else. The other uses constructor priorities, but doesn't declare this dependency.
>
> The following patch fixes both, so far tested with the appropriate runtest invocation on sparc-sun-solaris2.11; will commit to mainline once they've also been included in a current x86_64-unknown-linux-gnu bootstrap.

I've now installed the patch to remove the obvious testsuite failures, but it turned out that the ipa-icf-21.c testcase has further issues on i686-unknown-linux-gnu:

FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors)
Excess errors:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c:10:10: warning: SSE vector return without SSE enabled changes the ABI [-Wpsabi]
/var/gcc/regression/trunk/3.14.17-gcc-gas-gld-32/build/gcc/include/xmmintrin.h:909:1: error: inlining failed in call to always_inline '_mm_load1_ps': target specific option mismatch
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c:10:10: error: called from here

The former can be cured by adding -Wno-psabi to dg-options, but I've no idea what to do about the latter, especially since it doesn't occur on i386-pc-solaris2.11.

Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
[match-and-simplify] add new lower function
Instead of writing a separate loop for each lowering function (lower_opt_convert, etc.), add a new lower function and pass lower_opt_convert, lower_for, etc. to it as a callback.

* genmatch.c (lower): New overloaded function.
(lower): Adjust to call overloaded lower.

Thanks,
Prathamesh

Index: gcc/genmatch.c
===
--- gcc/genmatch.c (revision 216577)
+++ gcc/genmatch.c (working copy)
@@ -880,24 +880,25 @@
 /* Lower the AST for everything in SIMPLIFIERS. */
 
 static void
+lower (vec<simplify *>& in, vec<simplify *>& out, void (*lower_fn)(simplify *, vec<simplify *>&))
+{
+  for (unsigned i = 0; i < in.length (); ++i)
+    lower_fn (in[i], out);
+}
+
+static void
 lower (vec<simplify *>& simplifiers)
 {
   auto_vec<simplify *> out_simplifiers0;
-  for (unsigned i = 0; i < simplifiers.length (); ++i)
-    lower_opt_convert (simplifiers[i], out_simplifiers0);
-
   auto_vec<simplify *> out_simplifiers1;
-  for (unsigned i = 0; i < out_simplifiers0.length (); ++i)
-    lower_commutative (out_simplifiers0[i], out_simplifiers1);
-
+
+  lower (simplifiers, out_simplifiers0, lower_opt_convert);
+  lower (out_simplifiers0, out_simplifiers1, lower_commutative);
+
   simplifiers.truncate (0);
-  for (unsigned i = 0; i < out_simplifiers1.length (); ++i)
-    lower_for (out_simplifiers1[i], simplifiers);
+  lower (out_simplifiers1, simplifiers, lower_for);
 }
 
-
-
-
 /* The decision tree built for generating GIMPLE and GENERIC pattern
    matching code. It represents the 'match' expression of all
    simplifies and has those as its leafs. */
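The callback-driven shape of this refactoring can be sketched in plain C. This is only an illustration under simplifying assumptions: a fixed-capacity int vector stands in for the vector of simplify pointers, and `lower_each`, `lower_duplicate` and `lower_increment` are made-up names, not genmatch functions.

```c
#include <stddef.h>

/* Hypothetical, simplified container standing in for the vector of
   simplifiers used in genmatch.  */
#define INT_VEC_CAP 64

struct int_vec
{
  int data[INT_VEC_CAP];
  size_t len;
};

/* A lowering step takes one input element and appends zero or more
   results to OUT.  */
typedef void (*lower_fn) (int, struct int_vec *);

static void
int_vec_push (struct int_vec *v, int x)
{
  if (v->len < INT_VEC_CAP)
    v->data[v->len++] = x;
}

/* Example step: emits every input twice.  */
static void
lower_duplicate (int x, struct int_vec *out)
{
  int_vec_push (out, x);
  int_vec_push (out, x);
}

/* Example step: emits the input plus one.  */
static void
lower_increment (int x, struct int_vec *out)
{
  int_vec_push (out, x + 1);
}

/* The single driver loop: apply FN to every element of IN, collecting
   the results in OUT.  Each pass then becomes one call instead of one
   hand-written loop.  */
static void
lower_each (const struct int_vec *in, struct int_vec *out, lower_fn fn)
{
  for (size_t i = 0; i < in->len; i++)
    fn (in->data[i], out);
}
```

Chaining passes is then a matter of feeding one output vector into the next call, for example `lower_each (&a, &b, lower_duplicate); lower_each (&b, &c, lower_increment);`.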
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
On Thu, Oct 23, 2014 at 11:16:39AM +0200, Rainer Orth wrote: Rainer Orth r...@cebitec.uni-bielefeld.de writes: Two of the new gcc.dg/ipa/ipa-icf-*.c tests were FAILing on Solaris: FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors) UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Equal symbols: 1 UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Semantic equality hit:bar-foo FAIL: gcc.dg/ipa/ipa-icf-28.c (test for excess errors) UNRESOLVED: gcc.dg/ipa/ipa-icf-28.c scan-ipa-dump icf Equal symbols: 0 UNRESOLVED: gcc.dg/ipa/ipa-icf-28.c scan-ipa-dump icf attribute values are different The former requires xmmintrin.h which is x86-only, thus cannot work anywhere else. The other uses constructor priorities, but doesn't declare this dependency. The following patch fixes both, so far tested with the appropriate runtest invocation on sparc-sun-solaris2.11; will commit to mainline once they've also been included in a current x86_64-unknown-linux-gnu bootstrap. I've now installed the patch to remove the obvious testsuite failures, but it turned out that the ipa-icf-21.c testcase has further issues on i686-unknown-linux-gnu: FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors) Excess errors: /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c:10:10: warning: SSE vector return without SSE enabled changes the ABI [-Wpsabi] /var/gcc/regression/trunk/3.14.17-gcc-gas-gld-32/build/gcc/include/xmmintrin.h:909:1: error: inlining failed in call to always_inline '_mm_load1_ps': target specific option mismatch /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c:10:10: error: called from here The former can be cured by adding -Wno-psabi to dg-options, but I've no idea what to do about the latter, especially since it doesn't occur on i386-pc-solaris2.11. 
The testcase is clearly i?86/x86_64 specific, so it must be guarded with { target i?86-*-* x86_64-*-* }, and as it is a compile time test, no need for assembler support, so just adding -msse2 option to dg-option is the right thing to do. Jakub
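For readers unfamiliar with the DejaGnu directives being discussed, a guarded compile-only test might start as sketched below. This is not the actual header of ipa-icf-21.c: the "-O2" option and the function body are assumptions made for illustration, and only the target selector and -msse2 come from the discussion above.

```c
/* Sketch only: "-O2" and the function below are assumptions, not the
   real contents of ipa-icf-21.c.  */
/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
/* { dg-options "-O2 -msse2" } */

#include <xmmintrin.h>

/* Uses an SSE intrinsic, which is why the test is restricted to x86
   targets and why SSE must be enabled explicitly on 32-bit
   configurations that do not default to it.  */
__m128
load_splat (const float *p)
{
  return _mm_load1_ps (p);
}
```

Because the test only compiles, no assembler or hardware support check is needed; the explicit option simply makes the result independent of the compiler's configured default architecture.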
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
Jakub Jelinek ja...@redhat.com writes: On Thu, Oct 23, 2014 at 11:16:39AM +0200, Rainer Orth wrote: Rainer Orth r...@cebitec.uni-bielefeld.de writes: Two of the new gcc.dg/ipa/ipa-icf-*.c tests were FAILing on Solaris: FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors) UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Equal symbols: 1 UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Semantic equality hit:bar-foo FAIL: gcc.dg/ipa/ipa-icf-28.c (test for excess errors) UNRESOLVED: gcc.dg/ipa/ipa-icf-28.c scan-ipa-dump icf Equal symbols: 0 UNRESOLVED: gcc.dg/ipa/ipa-icf-28.c scan-ipa-dump icf attribute values are different The former requires xmmintrin.h which is x86-only, thus cannot work anywhere else. The other uses constructor priorities, but doesn't declare this dependency. The following patch fixes both, so far tested with the appropriate runtest invocation on sparc-sun-solaris2.11; will commit to mainline once they've also been included in a current x86_64-unknown-linux-gnu bootstrap. I've now installed the patch to remove the obvious testsuite failures, but it turned out that the ipa-icf-21.c testcase has further issues on i686-unknown-linux-gnu: FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors) Excess errors: /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c:10:10: warning: SSE vector return without SSE enabled changes the ABI [-Wpsabi] /var/gcc/regression/trunk/3.14.17-gcc-gas-gld-32/build/gcc/include/xmmintrin.h:909:1: error: inlining failed in call to always_inline '_mm_load1_ps': target specific option mismatch /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c:10:10: error: called from here The former can be cured by adding -Wno-psabi to dg-options, but I've no idea what to do about the latter, especially since it doesn't occur on i386-pc-solaris2.11. 
> The testcase is clearly i?86/x86_64 specific, so it must be guarded with { target i?86-*-* x86_64-*-* }, and as it is a compile time test, no need

Right, that's what my patch already does.

> for assembler support, so just adding -msse2 option to dg-option is the right thing to do.

Ok, will try that. Still no idea why Linux/x86 and Solaris/x86 are different here.

Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
On Thu, Oct 23, 2014 at 11:35:44AM +0200, Rainer Orth wrote:
> > The testcase is clearly i?86/x86_64 specific, so it must be guarded with { target i?86-*-* x86_64-*-* }, and as it is a compile time test, no need
>
> Right, that's what my patch already does.
>
> > for assembler support, so just adding -msse2 option to dg-option is the right thing to do.
>
> Ok, will try that. Still no idea why Linux/x86 and Solaris/x86 are different here.

Why do you think so? I certainly have:

FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors)
UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Equal symbols: 1
UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Semantic equality hit:bar-foo

in my i686-linux test_summary log. It really depends on how the compiler is configured: if it defaults to an -march that has sse/sse2 by default, it will succeed, otherwise it will not.

Jakub
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On 10/23/2014 11:28 AM, Yury Gribov wrote: On 10/23/2014 11:13 AM, Jakub Jelinek wrote: On Thu, Oct 23, 2014 at 11:11:29AM +0400, Yury Gribov wrote: Hi all, On 09/29/2014 09:21 PM, Yury Gribov wrote: This patch enables -fsanitize-recover for KASan by default. This causes KASan to continue execution after error in case of inline instrumentation. This feature is needed because - reports during early bootstrap won't even be printed - needed to run all tests w/o rebooting machine for every test - needed for interactive work on desktop This is the third version of patch which renames -fsanitize-recover to -fubsan-recover and introduces -fasan-recover (enabled by default for KASan). It also moves flag handling to finish_options per Jakub's request. A new version of patch based upon Jakub's recent changes to -fsanitize-recover=. I've renamed __asan_report_recover_load* to __asan_report_load*_noabort to match UBSan's style. Note that currently -fsanitize=kernel-address -fno-sanitize-recover=kernel-address won't work as expected because we miss __asan_load*_abort family of functions in libasan. I thought __asan_* functions are provided by the kernel, not libasan, for -fsanitize=kernel-address. Or is kernel linked with real libasan.a or some stripped down version thereof? Hm, right, libasan is not linked to kernel so it indeed does not need any changes. But now I see that for -fsanitize=kernel-address we need both __asan_load* and __asan_load*_noabort (the latter being default) depending on -fsanitize-recover setting. Let me update the patch for this. IMO we don't need different versions of __asan_load* and __asan_load*_noabort, because -fno-sanitize-recover=kernel-address will never work with the linux kernel. I already said this before, and repeat this once again: There is few places in kernel where we validly touch poisoned memory, so we need to disable error reporting in runtime for such memory accesses. 
I use a per-thread flag which is raised before a valid access to poisoned memory. This flag is checked in the __asan_report*() functions. If it is raised, then we shouldn't print any error message, just silently exit from the report. -fno-sanitize-recover=kernel-address will just cause an early kernel crash on boot, so we will never use it.
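The per-thread suppression scheme described here could look roughly like the following user-space sketch. All names are hypothetical and nothing below is kernel or libasan code; a depth counter is used instead of a plain flag (an assumption of this sketch) so that nested suppressed regions behave sensibly.

```c
#include <stdio.h>

/* Hypothetical per-thread suppression depth.  _Thread_local is C11;
   a kernel would keep this in its per-task structure instead.  */
static _Thread_local int report_disable_depth;

/* Counts reports actually emitted; for demonstration only.  */
static int reports_printed;

/* Raise the flag before a known-valid access to poisoned memory.  */
static void
report_disable (void)
{
  report_disable_depth++;
}

/* Lower the flag again after the access.  */
static void
report_enable (void)
{
  report_disable_depth--;
}

/* Stand-in for an __asan_report*()-style function: silently returns
   while the current thread has reporting disabled.  */
static void
report_bad_access (const void *addr, unsigned size)
{
  if (report_disable_depth > 0)
    return;
  reports_printed++;
  fprintf (stderr, "bad access of size %u at %p\n", size, (void *) addr);
}
```

This only works if the report function returns to its caller, which is why the scheme depends on recovery being enabled.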
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On Thu, Oct 23, 2014 at 01:51:12PM +0400, Andrey Ryabinin wrote:
> IMO we don't need different versions of __asan_load* and __asan_load*_noabort, because -fno-sanitize-recover=kernel-address will never work with the linux kernel.
>
> I already said this before, and repeat this once again: There is few places in kernel where we validly touch poisoned memory, so we need to disable error reporting in runtime for such memory accesses.
>
> I use per-thread flag which is raised before the valid access to poisoned memory. This flag checked in __asan_report*() function. If it raised then we shouldn't print any error message, just silently exit from report.

Can't you just use __attribute__((no_sanitize_address)) on the functions that have such a code? Or you could use special macros for those accesses (which could e.g. call function to read memory or write memory, implemented in assembly or in __attribute__((no_sanitize_address)) function), or temporarily unpoison and poison again.

Jakub
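The accessor idea suggested above can be sketched as follows. `no_sanitize_address` is an existing GCC function attribute; the macro and function names are made up for this example, and adding `noinline` is a precaution assumed here so the uninstrumented access stays in its own function.

```c
/* Hypothetical helper macro: on GNU-compatible compilers, mark a
   function as not instrumented by AddressSanitizer.  */
#if defined(__GNUC__)
# define UNCHECKED_ACCESS __attribute__ ((no_sanitize_address, noinline))
#else
# define UNCHECKED_ACCESS
#endif

/* Reads one word without ASan instrumentation.  Intended only for the
   few call sites that legitimately touch poisoned memory.  */
UNCHECKED_ACCESS static unsigned long
read_word_unchecked (const unsigned long *p)
{
  return *(const volatile unsigned long *) p;
}

/* Writes one word without ASan instrumentation.  */
UNCHECKED_ACCESS static void
write_word_unchecked (unsigned long *p, unsigned long value)
{
  *(volatile unsigned long *) p = value;
}
```

Callers that are otherwise fully instrumented would route only their special accesses through these helpers, leaving every other access checked.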
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On Thu, Oct 23, 2014 at 11:55:32AM +0200, Jakub Jelinek wrote:
> On Thu, Oct 23, 2014 at 01:51:12PM +0400, Andrey Ryabinin wrote:
> > IMO we don't need different versions of __asan_load* and __asan_load*_noabort, because -fno-sanitize-recover=kernel-address will never work with the linux kernel.
> >
> > I already said this before, and repeat this once again: There is few places in kernel where we validly touch poisoned memory, so we need to disable error reporting in runtime for such memory accesses.
> >
> > I use per-thread flag which is raised before the valid access to poisoned memory. This flag checked in __asan_report*() function. If it raised then we shouldn't print any error message, just silently exit from report.
>
> Can't you just use __attribute__((no_sanitize_address)) on the functions that have such a code? Or you could use special macros for those accesses (which could e.g. call function to read memory or write memory, implemented in assembly or in __attribute__((no_sanitize_address)) function), or temporarily unpoison and poison again.

Also, if you always rely on recovery for kernel-address, I wonder why all the effort to make it optional (when it could be decided based on flag_sanitize & SANITIZE_KERNEL_ADDRESS), and whether I should wait with 4.9.2-rc1 for that (given that the 4.9 branch now has kasan support backported, but not -fsanitize-recover (neither old style, nor new style)). I'd really like to release 4.9.2 soon...

Jakub
Re: [COMMITTED][PATCH][ARM]Add ACLE 2.0 predefined marco __ARM_FEATURE_IDIV
On Wed, Oct 22, 2014 at 11:02 AM, Jiong Wang jiong.w...@arm.com wrote: On 21/10/14 15:30, Ramana Radhakrishnan wrote: On Mon, Oct 13, 2014 at 3:15 PM, Renlin Li renlin...@arm.com wrote: Hi all, This is a simple patch to add missing __ARM_FEATURE_IDIV__ predefined marco(ACLE 2.0) into TARGET_CPU_CPP_BUILTINS. Is it Okay to commit? gcc/ChangeLog: 2014-10-13 Renlin Li renlin...@arm.com * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Add ACLE 2.0 predefined marco __ARM_FEATURE_IDIV__. Replace this with Define __ARM_FEATURE_IDIV__. in the Changelog. Are you sure that the ACLE documents this with trailing underscores ? The copy that I have doesn't. Ramana Ok with that change. committed on behalf of Renlin as 216547. 2014-10-22 Renlin Li renlin...@arm.com gcc/ * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Define __ARM_FEATURE_IDIV__. Also fixed the missing '\' on the last line as obvious 2014-10-22 Jiong Wang jiong.w...@arm.com gcc/ * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Add missing '\'. Index: gcc/config/arm/arm.h === --- gcc/config/arm/arm.h(revision 216547) +++ gcc/config/arm/arm.h(working copy) @@ -167,7 +167,7 @@ { \ builtin_define (__ARM_ARCH_EXT_IDIV__); \ builtin_define (__ARM_FEATURE_IDIV__); \ - } + } \ } while (0) #include config/arm/arm-opts.h Ramana
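As a small illustration of the two points in this thread, the sketch below shows a multi-line do/while (0) macro, where every line except the last needs a trailing backslash, and a feature test that accepts either spelling of the macro under discussion. `BUMP_TWICE` and `have_hw_idiv` are hypothetical names, not part of arm.h or ACLE.

```c
/* Every line of the macro body except the last must end in '\'.
   Dropping one after an inner closing brace, as in the bug fixed
   above, would end the macro early and leave the remaining lines as
   stray code.  */
#define BUMP_TWICE(counter)        \
  do                               \
    {                              \
      if (1)                       \
        {                          \
          (counter)++;             \
          (counter)++;             \
        }                          \
    } while (0)

/* Returns 1 if the compiler advertises hardware integer divide.  Both
   spellings are checked because the thread leaves open which one the
   ACLE document uses.  */
static int
have_hw_idiv (void)
{
#if defined(__ARM_FEATURE_IDIV) || defined(__ARM_FEATURE_IDIV__)
  return 1;
#else
  return 0;
#endif
}
```

On non-ARM targets `have_hw_idiv` simply returns 0, so the snippet compiles anywhere.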
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On 10/23/2014 01:55 PM, Jakub Jelinek wrote:
> On Thu, Oct 23, 2014 at 01:51:12PM +0400, Andrey Ryabinin wrote:
> > IMO we don't need different versions of __asan_load* and __asan_load*_noabort, because -fno-sanitize-recover=kernel-address will never work with the linux kernel.
> >
> > I already said this before, and repeat this once again: There is few places in kernel where we validly touch poisoned memory, so we need to disable error reporting in runtime for such memory accesses.
> >
> > I use per-thread flag which is raised before the valid access to poisoned memory. This flag checked in __asan_report*() function. If it raised then we shouldn't print any error message, just silently exit from report.
>
> Can't you just use __attribute__((no_sanitize_address)) on the functions that have such a code? Or you could use special macros for those accesses (which could e.g. call function to read memory or write memory, implemented in assembly or in __attribute__((no_sanitize_address)) function), or

Those are quite generic functions used from a lot of places. So we want to instrument them in general, but there are a few call sites which use those functions for poisoned memory.

> temporarily unpoison and poison again.

That's a bit tricky. The state of shadow memory is unknown, so we would need to store the shadow somewhere before unpoisoning to restore it later.

> Jakub
[Ada] Undefined symbol when using -gnatVa
This patch modifies the generation of validity checks for the bounds of a range to propagate the related subtype. This ensures that the same range will produce the same temporaries to capture the bounds with and without validity checks enabled. No small reproducer available.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-23 Hristian Kirtchev kirtc...@adacore.com

        * checks.adb (Ensure_Valid): Update the subprogram profile. Propagate the context attributes to Insert_Valid_Check.
        (Insert_Valid_Check): Update the subprogram profile. Propagate the attributes of the context to Duplicate_Subexpr_No_Checks.
        (Validity_Check_Range): Update the subprogram profile. Propagate the context attribute to Ensure_Valid.
        * checks.ads (Ensure_Valid): Update the subprogram profile along with the comment on usage.
        (Insert_Valid_Check): Update the subprogram profile along with the comment on usage.
        (Validity_Check_Range): Update the subprogram profile along with the comment on usage.
        * exp_util.adb (Build_Temporary): New routine.
        (Duplicate_Subexpr_No_Checks): Update the subprogram profile. Propagate the attributes of the context to Remove_Side_Effects.
        (Remove_Side_Effects): Update the subprogram profile. Update all calls to Make_Temporary to invoke Build_Temporary.
        * exp_util.ads (Duplicate_Subexpr_No_Checks): Update the subprogram profile along with the comment on usage.
        (Remove_Side_Effects): Update the subprogram profile along with the comment on usage.
        * sem_ch3.adb (Process_Range_Expr_In_Decl): Pass the subtype to the validity check machinery. Explain the reason for this propagation.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 216574)
+++ sem_ch3.adb (working copy)
@@ -19734,16 +19734,29 @@
 Lo := Low_Bound (R);
 Hi := High_Bound (R);
+ -- Validity checks on the range of a quantified expression are
+ -- delayed until the construct is transformed into a loop.
+ + if Nkind (Parent (R)) = N_Loop_Parameter_Specification + and then Nkind (Parent (Parent (R))) = N_Quantified_Expression + then +null; + -- We need to ensure validity of the bounds here, because if we -- go ahead and do the expansion, then the expanded code will get -- analyzed with range checks suppressed and we miss the check. - -- Validity checks on the range of a quantified expression are - -- delayed until the construct is transformed into a loop. - if Nkind (Parent (R)) /= N_Loop_Parameter_Specification - or else Nkind (Parent (Parent (R))) /= N_Quantified_Expression - then -Validity_Check_Range (R); + -- WARNING: The capture of the range bounds with xxx_FIRST/_LAST and + -- the temporaries generated by routine Remove_Side_Effects by means + -- of validity checks must use the same names. When a range appears + -- in the parent of a generic, the range is processed with checks + -- disabled as part of the generic context and with checks enabled + -- for code generation purposes. This leads to link issues as the + -- generic contains references to xxx_FIRST/_LAST, but the inlined + -- template sees the temporaries generated by Remove_Side_Effects. 
+ + else +Validity_Check_Range (R, Subtyp); end if; -- If there were errors in the declaration, try and patch up some @@ -19784,16 +19797,16 @@ if Nkind (Lo) = N_String_Literal then Rewrite (Lo, Make_Attribute_Reference (Sloc (Lo), -Attribute_Name = Name_First, -Prefix = New_Occurrence_Of (T, Sloc (Lo; +Prefix = New_Occurrence_Of (T, Sloc (Lo)), +Attribute_Name = Name_First)); Analyze_And_Resolve (Lo); end if; if Nkind (Hi) = N_String_Literal then Rewrite (Hi, Make_Attribute_Reference (Sloc (Hi), -Attribute_Name = Name_First, -Prefix = New_Occurrence_Of (T, Sloc (Hi; +Prefix = New_Occurrence_Of (T, Sloc (Hi)), +Attribute_Name = Name_First)); Analyze_And_Resolve (Hi); end if; Index: exp_util.adb === --- exp_util.adb(revision 216574) +++ exp_util.adb(working copy) @@ -1922,14 +1922,24 @@ - function Duplicate_Subexpr_No_Checks - (Exp : Node_Id; - Name_Req : Boolean := False; - Renaming_Req : Boolean := False) return Node_Id + (Exp : Node_Id;
Re: [PATCH 2/2] [AARCH64,NEON] Convert arm_neon.h to use new builtins for vld[234](q?)_lane_*
On 8 October 2014 18:27, charles.bay...@linaro.org wrote:

> +#define __LD2_LANE_FUNC(intype, vectype, largetype, ptrtype, \

Just above the insertion point in arm_neon.h is the comment:

/* Start of temporary inline asm for vldn, vstn and friends. */

This patch removes the temporary inline asm vldn implementation; the replacement implementation should be inserted in the section below. Look for the comment that starts:

Start of optimal implementations

> +mode, ptrmode, funcsuffix, signedtype)\
> +__extension__ static __inline intype __attribute__ ((__always_inline__)) \
> +vld2_lane_##funcsuffix (const ptrtype * __ptr, intype __b, const int __c) \
> +{ \
> + __builtin_aarch64_simd_oi __o; \
> + largetype __temp; \
> + __temp.val[0] = \

There is something odd about the white space here, space before tab? This is repeated in various places through the rest of the patch.

Otherwise this and the previous 1/2 associated patch look good. Can you respin with these tidy-ups?

Thanks
/Marcus
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On Thu, Oct 23, 2014 at 02:09:47PM +0400, Andrey Ryabinin wrote:
> On 10/23/2014 01:55 PM, Jakub Jelinek wrote:
> > On Thu, Oct 23, 2014 at 01:51:12PM +0400, Andrey Ryabinin wrote:
> > > IMO we don't need different versions of __asan_load* and __asan_load*_noabort, because -fno-sanitize-recover=kernel-address will never work with the linux kernel.
> > >
> > > I already said this before, and repeat this once again: There is few places in kernel where we validly touch poisoned memory, so we need to disable error reporting in runtime for such memory accesses.
> > >
> > > I use per-thread flag which is raised before the valid access to poisoned memory. This flag checked in __asan_report*() function. If it raised then we shouldn't print any error message, just silently exit from report.
> >
> > Can't you just use __attribute__((no_sanitize_address)) on the functions that have such a code? Or you could use special macros for those accesses (which could e.g. call function to read memory or write memory, implemented in assembly or in __attribute__((no_sanitize_address)) function), or
>
> Those are quite generic functions used from a lot of places. So we want to instrument them in general, but there are few call sites which use those functions for poisoned memory.

Actually, -fsanitize=kernel-address forcibly uses function calls (i.e. __asan_load* etc. rather than __asan_report_load* only if inline shadow memory test suggested there is a problem). So, at that point you can include your ugly hacks in __asan_load* logic in the kernel; the difference between __asan_load4 and __asan_load4_noabort will be just that the latter will always return, while the former will not if some error has been reported.

All the __asan_load* and __asan_store* entrypoints, regardless of -f{,no-}sanitize-recover=kernel-address, are by definition not noreturn; in the common case (if the code is not buggy) they return.

Jakub
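The difference between the two entry point flavors described above can be modeled in a few lines of user-space C. Everything here is hypothetical: a single poisoned address range stands in for shadow memory, and the `demo_` functions only mimic the return behavior attributed to __asan_load4 and __asan_load4_noabort in this message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy replacement for shadow memory: one poisoned byte range.  */
static uintptr_t poison_start;
static size_t poison_len;

/* Counts detected errors; for demonstration only.  */
static int errors_reported;

/* True if [addr, addr + size) overlaps the poisoned range.  */
static bool
range_is_poisoned (const void *addr, size_t size)
{
  uintptr_t a = (uintptr_t) addr;
  return poison_len != 0
         && a < poison_start + poison_len
         && a + size > poison_start;
}

/* Shared check: returns true if the access is fine, otherwise records
   an error and returns false.  A per-thread suppression hook could be
   consulted here as well.  */
static bool
check_access (const void *addr, size_t size)
{
  if (!range_is_poisoned (addr, size))
    return true;
  errors_reported++;
  return false;
}

/* Recovering flavor: always returns to the caller.  */
static void
demo_load4_noabort (const void *addr)
{
  (void) check_access (addr, 4);
}

/* Aborting flavor: returns only if no error was reported.  Note that
   it still returns in the common, non-buggy case.  */
static void
demo_load4 (const void *addr)
{
  if (!check_access (addr, 4))
    abort ();
}
```

Both functions share the same check, so any special-casing for valid accesses to poisoned memory would live in one place regardless of which flavor the compiler emits calls to.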
[Ada] Freezing issues with expression functions
An expression function does not freeze the corresponding expression. The function and the expression are frozen by a call, or by the end of the enclosing declarative part. If the expression includes attribute references to outer types, these must be frozen at that point. This patch ensures that freeze nodes for such types are generated properly. Executing the following:: gnatmake -q bad_main.adb bad_main must yield: bad main OK --- with Text_IO; use Text_IO; with Bad_Gen; procedure Bad_Main is package Test is new Bad_Gen (Test); begin Put_Line (bad main (if Test.Names.Max_Length 2 then OK else Fails)); end Bad_Main; --- with Ada.Strings.Bounded; generic Name: String := ; package Bad_Gen is type Desc is tagged null record; -- D : Desc; function Size return Natural is ( if Desc'External_Tag'Length 2 then Desc'External_Tag'Length else Name'Length ); package Names is new Ada.Strings.Bounded.Generic_Bounded_Length (Size); Test: Names.Bounded_String := Names.To_Bounded_String (Desc'External_Tag); end Bad_Gen; Tested on x86_64-pc-linux-gnu, committed on trunk 2014-10-23 Ed Schonberg schonb...@adacore.com * sem_ch12.adb (Analyze_Associations): If an actual for a formal object is a call to a parameterless expression function, add the function to the list of actuals to freeze. * freeze.adb (Check_Expression_Function): Create freeze nodes of outer types that may be references in the body of the expression. Index: sem_ch12.adb === --- sem_ch12.adb(revision 216574) +++ sem_ch12.adb(working copy) @@ -1664,6 +1664,18 @@ Assoc); end if; + -- If the object is a call to an expression function, this + -- is a freezing point for it. 
+ + if Is_Entity_Name (Match) +and then Present (Entity (Match)) +and then Nkind + (Original_Node (Unit_Declaration_Node (Entity (Match += N_Expression_Function + then + Append_Elmt (Entity (Match), Actuals_To_Freeze); + end if; + when N_Formal_Type_Declaration = Match := Matching_Actual ( Index: freeze.adb === --- freeze.adb (revision 216574) +++ freeze.adb (working copy) @@ -112,6 +112,11 @@ -- to deferred constants without completion. We report this at the freeze -- point of the function, to provide a better error message. + -- In most cases the expression itself is frozen by the time the function + -- itself is frozen, because the formals will be frozen by then. However, + -- Attribute references to outer types are freeze points for those types; + -- this routine generates the required freeze nodes for them. + procedure Check_Strict_Alignment (E : Entity_Id); -- E is a base type. If E is tagged or has a component that is aliased -- or tagged or contains something this is aliased or tagged, set @@ -1272,6 +1277,14 @@ then Error_Msg_NE (premature use of in call or instance, N, Entity (Nod)); + + elsif Nkind (Nod) = N_Attribute_Reference then +Analyze (Prefix (Nod)); +if Is_Entity_Name (Prefix (Nod)) + and then Is_Type (Entity (Prefix (Nod))) +then + Freeze_Before (N, Entity (Prefix (Nod))); +end if; end if; return OK; @@ -5983,7 +5996,7 @@ -- and the expressions include allocators, the designed type is frozen -- as well. - function In_Exp_Body (N : Node_Id) return Boolean; + function In_Expanded_Body (N : Node_Id) return Boolean; -- Given an N_Handled_Sequence_Of_Statements node N, determines whether -- it is the handled statement sequence of an expander-generated -- subprogram (init proc, stream subprogram, or renaming as body). 
@@ -6023,11 +6036,11 @@ return Empty; end Find_Aggregate_Component_Desig_Type; - - - -- In_Exp_Body -- - - + -- + -- In_Expanded_Body -- + -- - function In_Exp_Body (N : Node_Id) return Boolean is + function In_Expanded_Body (N : Node_Id) return Boolean is P : Node_Id; Id : Entity_Id; @@ -6044,7 +6057,8 @@ else Id := Defining_Unit_Name (Specification (P)); --- Following complex conditional could use comments ??? +-- The following are expander-created
[Ada] Illegal declarations of deferred constants
A deferred constant whose type is an anonymous array is always illegal
(unless it is imported). This patch makes the error message more detailed,
to help future Ada beginners.

Compiling array_constant.ads must yield:

   array_constant.ads:2:17: a deferred constant must have a named type
   array_constant.ads:4:04: type does not match declaration at line 2
   array_constant.ads:4:04: each anonymous array is a distinct type

---
package Array_Constant is
   A : constant array (Boolean) of Integer;
private
   A : constant array (Boolean) of Integer := ( True => 1, False => 0 );
end Array_Constant;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-23  Ed Schonberg  schonb...@adacore.com

	* sem_ch3.adb (Constant_Redeclaration): Make error message more
	explicit on a deferred constant whose object_definition is an
	anonymous array.

Index: sem_ch3.adb
===================================================================
--- sem_ch3.adb (revision 216583)
+++ sem_ch3.adb (working copy)
@@ -11751,6 +11751,20 @@
             Set_Full_View (Prev, Id);
             Set_Etype (Id, Any_Type);

+            --  A deferred constant whose type is an anonymous array is always
+            --  illegal (unless imported). A detailed error message might be
+            --  helpful for Ada beginners.
+
+            if Nkind (Object_Definition (Parent (Prev)))
+               = N_Constrained_Array_Definition
+              and then Nkind (Object_Definition (N))
+                = N_Constrained_Array_Definition
+            then
+               Error_Msg_N ("\each anonymous array is a distinct type", N);
+               Error_Msg_N ("a deferred constant must have a named type",
+                 Object_Definition (Parent (Prev)));
+            end if;
+
          elsif Null_Exclusion_Present (Parent (Prev))
            and then not Null_Exclusion_Present (N)
[Ada] Ignore pragmas Inline/Inline_Always in GNATprove mode
Frontend inlining is applied independently of pragmas Inline or Inline_Always in GNATprove mode, to benefit from contextual analysis whenever possible. Hence, ignore such pragmas in GNATprove mode, to avoid getting useless errors on these when the corresponding subprogram is inlined by the frontend already. Tested on x86_64-pc-linux-gnu, committed on trunk 2014-10-23 Yannick Moy m...@adacore.com * sem_prag.adb (Analyze_Pragma/Pragma_Inline Pragma_Inline_Always): Disable analysis in GNATprove mode. Index: sem_prag.adb === --- sem_prag.adb(revision 216574) +++ sem_prag.adb(working copy) @@ -14894,12 +14894,21 @@ when Pragma_Inline = --- Inline status is Enabled if inlining option is active +-- Pragma always active unless in GNATprove mode. It is disabled +-- in GNATprove mode because frontend inlining is applied +-- independently of pragmas Inline and Inline_Always for +-- formal verification, see Can_Be_Inlined_In_GNATprove_Mode +-- in inline.ads. -if Inline_Active then - Process_Inline (Enabled); -else - Process_Inline (Disabled); +if not GNATprove_Mode then + + -- Inline status is Enabled if inlining option is active + + if Inline_Active then + Process_Inline (Enabled); + else + Process_Inline (Disabled); + end if; end if; --- @@ -14911,15 +14920,15 @@ when Pragma_Inline_Always = GNAT_Pragma; --- Pragma always active unless in CodePeer mode. It is disabled --- in CodePeer mode because inlining is not helpful, and enabling --- if caused walk order issues. +-- Pragma always active unless in CodePeer mode or GNATprove +-- mode. It is disabled in CodePeer mode because inlining is +-- not helpful, and enabling it caused walk order issues. It +-- is disabled in GNATprove mode because frontend inlining is +-- applied independently of pragmas Inline and Inline_Always for +-- formal verification, see Can_Be_Inlined_In_GNATprove_Mode in +-- inline.ads. 
--- Historical note: this pragma used to be disabled in GNATprove --- mode as well, but that was odd since walk order should not be --- an issue in that case. - -if not CodePeer_Mode then +if not CodePeer_Mode and not GNATprove_Mode then Process_Inline (Enabled); end if;
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On 10/23/2014 02:00 PM, Jakub Jelinek wrote:
> On Thu, Oct 23, 2014 at 11:55:32AM +0200, Jakub Jelinek wrote:
>> On Thu, Oct 23, 2014 at 01:51:12PM +0400, Andrey Ryabinin wrote:
>>> IMO we don't need different versions of __asan_load* and
>>> __asan_load*_noabort, because -fno-sanitize-recover=kernel-address
>>> will never work with the linux kernel.
>>>
>>> I already said this before, and repeat this once again:
>>> There is few places in kernel where we validly touch poisoned memory,
>>> so we need to disable error reporting in runtime for such memory
>>> accesses. I use per-thread flag which is raised before the valid
>>> access to poisoned memory. This flag checked in __asan_report*()
>>> function. If it raised then we shouldn't print any error message,
>>> just silently exit from report.
>>
>> Can't you just use __attribute__((no_sanitize_address)) on the functions
>> that have such a code? Or you could use special macros for those
>> accesses (which could e.g. call function to read memory or write memory,
>> implemented in assembly or in __attribute__((no_sanitize_address))
>> function), or temporarily unpoison and poison again.
>
> Also, if you always rely on recovery for kernel-address, wonder why all
> the effort to make it optional (when it could be decided based on
> flag_sanitize & SANITIZE_KERNEL_ADDRESS), and whether I should wait with
> 4.9.2-rc1 for that (given that 4.9 branch now has kasan support
> backported, but not -fsanitize-recover (neither old style, nor new
> style)). I'd really like to release 4.9.2 soon...

-fsanitize-recover needed only for inline instrumentation, and 4.9 don't
support inline instrumentation for kernel-address. There is no reason to
delay release unless you want to see inline support in 4.9.

> 	Jakub
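The two approaches debated in this message can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not kernel or libsanitizer code: `raw_read_byte` and all `toy_*` names are hypothetical, and the only real compiler feature used is GCC's `no_sanitize_address` function attribute.

```c
#include <assert.h>
#include <stddef.h>

/* Approach suggested by Jakub: route the deliberate access through a helper
   that the compiler does not instrument.  raw_read_byte is a hypothetical
   name; the attribute itself is a real GCC attribute.  */
__attribute__((no_sanitize_address))
unsigned char raw_read_byte(const unsigned char *p)
{
    return *p;
}

/* Approach described by Andrey: a per-thread flag that the report routine
   consults.  Everything below is a toy model of that idea.  */
static _Thread_local int toy_reports_disabled;  /* nesting depth */
static int toy_printed;                         /* reports actually "printed" */

void toy_disable_reports(void)
{
    toy_reports_disabled++;
}

void toy_enable_reports(void)
{
    assert(toy_reports_disabled > 0);
    toy_reports_disabled--;
}

/* Stand-in for a report routine: silently returns while the flag is raised. */
void toy_report(const void *addr)
{
    (void)addr;
    if (toy_reports_disabled)
        return;
    toy_printed++;
}

int toy_reports_printed(void)
{
    return toy_printed;
}
```

With the flag approach the instrumented access still happens and only the report is suppressed; with the attribute approach the access inside the helper is never checked at all.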
[Ada] Use of SPARK_Mode with front end inlining (-gnatN)
This patch modifies the front end inlining mechanism to ensure that a package body is always analyzed with the SPARK_Mode of the enclosing context. -- Source -- -- front_end_inlining.adc pragma SPARK_Mode (On); -- front_end_inlining.ads package Front_End_Inlining is procedure P; end Front_End_Inlining; -- front_end_inlining.adb with Ada.Text_IO; use Ada.Text_IO; package body Front_End_Inlining with SPARK_Mode = Off is subtype Small_Int is Integer range 0 .. 3; procedure P is package Small_Int_IO is new Integer_IO (Small_Int); begin null; end P; end Front_End_Inlining; - -- Compilation -- - $ gcc -c -gnatec=front_end_inlining.adc -gnatN front_end_inlining.adb Tested on x86_64-pc-linux-gnu, committed on trunk 2014-10-23 Hristian Kirtchev kirtc...@adacore.com * sem_ch12.adb (Inline_Instance_Body): Alphabetize local variables and constants. Add constants Save_SM and Save_SMP to capture SPARK_Mode-related attributes. Compile the inlined body with the SPARK_Mode of the enclosing context. Index: sem_ch12.adb === --- sem_ch12.adb(revision 216583) +++ sem_ch12.adb(working copy) @@ -4425,25 +4425,31 @@ Gen_Unit : Entity_Id; Act_Decl : Node_Id) is - Vis : Boolean; - Gen_Comp : constant Entity_Id := - Cunit_Entity (Get_Source_Unit (Gen_Unit)); - Curr_Comp: constant Node_Id := Cunit (Current_Sem_Unit); - Curr_Scope : Entity_Id := Empty; - Curr_Unit: constant Entity_Id := Cunit_Entity (Current_Sem_Unit); - Removed : Boolean := False; - Num_Scopes : Int := 0; + Curr_Comp : constant Node_Id := Cunit (Current_Sem_Unit); + Curr_Unit : constant Entity_Id := Cunit_Entity (Current_Sem_Unit); + Gen_Comp : constant Entity_Id := +Cunit_Entity (Get_Source_Unit (Gen_Unit)); + Save_SM : constant SPARK_Mode_Type := SPARK_Mode; + Save_SMP : constant Node_Id := SPARK_Mode_Pragma; + -- Save all SPARK_Mode-related attributes as removing enclosing scopes + -- to provide a clean environment for analysis of the inlined body will + -- eliminate any previously set SPARK_Mode. 
+ Scope_Stack_Depth : constant Int := Scope_Stack.Last - Scope_Stack.First + 1; Use_Clauses : array (1 .. Scope_Stack_Depth) of Node_Id; Instances: array (1 .. Scope_Stack_Depth) of Entity_Id; Inner_Scopes : array (1 .. Scope_Stack_Depth) of Entity_Id; + Curr_Scope : Entity_Id := Empty; List : Elist_Id; Num_Inner: Int := 0; + Num_Scopes : Int := 0; N_Instances : Int := 0; + Removed : Boolean := False; S: Entity_Id; + Vis : Boolean; begin -- Case of generic unit defined in another unit. We must remove the @@ -4574,6 +4580,10 @@ pragma Assert (Num_Inner Num_Scopes); + -- The inlined package body must be analyzed with the SPARK_Mode of + -- the enclosing context, otherwise the body may cause bogus errors + -- if a configuration SPARK_Mode pragma in in effect. + Push_Scope (Standard_Standard); Scope_Stack.Table (Scope_Stack.Last).Is_Active_Stack_Base := True; Instantiate_Package_Body @@ -4587,8 +4597,8 @@ Version = Ada_Version, Version_Pragma = Ada_Version_Pragma, Warnings = Save_Warnings, - SPARK_Mode = SPARK_Mode, - SPARK_Mode_Pragma= SPARK_Mode_Pragma)), + SPARK_Mode = Save_SM, + SPARK_Mode_Pragma= Save_SMP)), Inlined_Body = True); Pop_Scope; @@ -4692,7 +4702,9 @@ end loop; end; - -- If generic unit is in current unit, current context is correct + -- If generic unit is in current unit, current context is correct. Note + -- that the context is guaranteed to carry the correct SPARK_Mode as no + -- enclosing scopes were removed. else Instantiate_Package_Body
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On 10/23/2014 02:20 PM, Andrey Ryabinin wrote:
> On 10/23/2014 02:00 PM, Jakub Jelinek wrote:
>> On Thu, Oct 23, 2014 at 11:55:32AM +0200, Jakub Jelinek wrote:
>>> On Thu, Oct 23, 2014 at 01:51:12PM +0400, Andrey Ryabinin wrote:
>>>> IMO we don't need different versions of __asan_load* and
>>>> __asan_load*_noabort, because -fno-sanitize-recover=kernel-address
>>>> will never work with the linux kernel.
>>>>
>>>> I already said this before, and repeat this once again:
>>>> There is few places in kernel where we validly touch poisoned memory,
>>>> so we need to disable error reporting in runtime for such memory
>>>> accesses. I use per-thread flag which is raised before the valid
>>>> access to poisoned memory. This flag checked in __asan_report*()
>>>> function. If it raised then we shouldn't print any error message,
>>>> just silently exit from report.
>>>
>>> Can't you just use __attribute__((no_sanitize_address)) on the
>>> functions that have such a code? Or you could use special macros for
>>> those accesses (which could e.g. call function to read memory or write
>>> memory, implemented in assembly or in
>>> __attribute__((no_sanitize_address)) function), or temporarily
>>> unpoison and poison again.
>>
>> Also, if you always rely on recovery for kernel-address, wonder why all
>> the effort to make it optional (when it could be decided based on
>> flag_sanitize & SANITIZE_KERNEL_ADDRESS), and whether I should wait with
>> 4.9.2-rc1 for that (given that 4.9 branch now has kasan support
>> backported, but not -fsanitize-recover (neither old style, nor new
>> style)). I'd really like to release 4.9.2 soon...
>
> -fsanitize-recover needed only for inline instrumentation, and 4.9 don't
> support inline instrumentation for kernel-address. There is no reason to
> delay release unless you want to see inline support in 4.9.

+1
[Ada] Bug with interfaces and limited views
A type that implements interfaces generates internal procedures (thunks)
that access the dispatch table to retrieve the proper operation on a
dispatching call. This patch fixes a bug in the handling of these thunks,
when one of its formals has a limited view of a type. Usually the presence
of such a type forces a delay in generating code for the subprogram, but
this delay must not be applied to a thunk.

The following must compile quietly:

   gcc -c p.ads

---
with Q;
package P is
   type Instance is new Q.Instance with null record;
end P;
---
with R;
with S;
limited with T;
package Q is
   type Instance is new R.Instance and S.Instance with null record;
   procedure Proc (Self : Instance; Current : T.Instance);
end Q;
---
package R is
   type Instance is tagged null record;
end R;
---
limited with T;
package S is
   type Instance is interface;
   procedure Proc (Self : Instance; Current : T.Instance) is abstract;
end S;
---
package T is
   type Instance is tagged null record;
end T;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-23  Ed Schonberg  schonb...@adacore.com

	* sem_ch6.adb (Process_Formals): A thunk procedure with a parameter
	of a limited view does not need a freeze node.

Index: sem_ch6.adb
===================================================================
--- sem_ch6.adb (revision 216584)
+++ sem_ch6.adb (working copy)
@@ -9946,7 +9946,9 @@
                   --  (Note that the same is done for controlling access
                   --  parameter cases in function Access_Definition.)

-                  Set_Has_Delayed_Freeze (Current_Scope);
+                  if not Is_Thunk (Current_Scope) then
+                     Set_Has_Delayed_Freeze (Current_Scope);
+                  end if;
                end if;
             end if;
[Ada] Local renaming should not be externally visible
This patch modifies the analysis of a package body to hide object and subprogram renamings from external visibility. -- Source -- -- externals.ads package Externals is procedure Force_Body; end Externals; -- externals.adb package body Externals is Obj : constant String := Hello; Obj_Ren : String renames Obj; procedure Force_Body is begin null; end Force_Body; end Externals; -- main.adb with Externals; procedure Main is begin Externals.Force_Body; end Main; -- Compilation and output -- $ gnatmake -q main.adb $ nm main | grep externals__obj | cut -d' ' -f2- r externals__obj r externals__obj_ren Tested on x86_64-pc-linux-gnu, committed on trunk 2014-10-23 Hristian Kirtchev kirtc...@adacore.com * sem_ch7.adb (Analyze_Package_Body_Helper): The logic which hides local entities from external visibility is now contained in routine Hide_Public_Entities. (Hide_Public_Entities): New routine. Object and subprogram renamings are now hidden from external visibility the same way objects are. Index: sem_ch7.adb === --- sem_ch7.adb (revision 216574) +++ sem_ch7.adb (working copy) @@ -220,12 +220,12 @@ - procedure Analyze_Package_Body_Helper (N : Node_Id) is - HSS : Node_Id; - Body_Id : Entity_Id; - Spec_Id : Entity_Id; - Last_Spec_Entity : Entity_Id; - New_N: Node_Id; - Pack_Decl: Node_Id; + procedure Hide_Public_Entities (Decls : List_Id); + -- Attempt to hide all public entities found in declarative list Decls + -- by resetting their Is_Public flag to False depending on whether the + -- entities are not referenced by inlined or generic bodies. This kind + -- of processing is a conservative approximation and may still leave + -- certain entities externally visible. procedure Install_Composite_Operations (P : Entity_Id); -- Composite types declared in the current scope may depend on types @@ -233,6 +233,310 @@ -- is now in scope. Indicate that the corresponding operations on the -- composite type are available. 
+ -- + -- Hide_Public_Entities -- + -- + + procedure Hide_Public_Entities (Decls : List_Id) is + function Contains_Subp_Or_Const_Refs (N : Node_Id) return Boolean; + -- Subsidiary to routine Has_Referencer. Determine whether a node + -- contains a reference to a subprogram or a non-static constant. + -- WARNING: this is a very expensive routine as it performs a full + -- tree traversal. + + function Has_Referencer + (Decls : List_Id; +Top_Level : Boolean := False) return Boolean; + -- A referencer is a construct which may reference a previous + -- declaration. Examine all declarations in list Decls in reverse + -- and determine whether once such referencer exists. All entities + -- in the range Last (Decls) .. Referencer are hidden from external + -- visibility. + + - + -- Contains_Subp_Or_Const_Refs -- + - + + function Contains_Subp_Or_Const_Refs (N : Node_Id) return Boolean is +Reference_Seen : Boolean := False; + +function Is_Subp_Or_Const_Ref + (N : Node_Id) return Traverse_Result; +-- Determine whether a node denotes a reference to a subprogram or +-- a non-static constant. + +-- +-- Is_Subp_Or_Const_Ref -- +-- + +function Is_Subp_Or_Const_Ref + (N : Node_Id) return Traverse_Result +is + Val : Node_Id; + +begin + -- Detect a reference of the form + --Subp_Call + + if Nkind (N) in N_Subprogram_Call + and then Is_Entity_Name (Name (N)) + then + Reference_Seen := True; + return Abandon; + + -- Detect a reference of the form + --Subp'Some_Attribute + + elsif Nkind (N) = N_Attribute_Reference + and then Is_Entity_Name (Prefix (N)) + and then Is_Subprogram (Entity (Prefix (N))) + then + Reference_Seen := True; + return Abandon; + + -- Detect the use of a non-static constant + + elsif Is_Entity_Name (N) + and then Present (Entity (N)) + and then Ekind (Entity (N)) = E_Constant + then + Val
[Ada] Infinite loop on nested instantiations with dynamic elaboration checks
This patch fixes an infinite loop in GNAT when dynamic elaboration checks are generated for an instantiation of a generic function whose type is obtained from a formal package. The following must compile quietly: gcc -c -gnatE main.adb --- with Ada.Text_IO; use Ada.Text_IO; with Optional_Values; with Optional_Values_Map; procedure Main is package Optional_Floats is new Optional_Values (T = Float); package Optional_Integers is new Optional_Values (T = Integer); function Int_Of_Float (X : Float) return Integer is begin return Integer (X); end Int_Of_Float; function Map is new Optional_Values_Map (Input_Type = Float, Output_Type = Integer, Optional_Inputs = Optional_Floats, Optional_Outputs = Optional_Integers, Map = Int_Of_Float); Pi : constant := 3.1415; Optional_Three : constant Optional_Integers.Optional_Value_Type := Map (Optional_Floats.Of_Value (Value = Pi)); begin if Optional_Integers.Has_Value (Optional_Three) then declare Three : constant Integer := Optional_Integers.Get_Value (Optional_Three); begin Put_Line (Result = Three'Img); end; end if; end Main; --- package body Optional_Values is function Of_Value (Value : T) return Optional_Value_Type is ((Optional = (Has_Value = True, Value = Value))); function Get_Value (Optional_Value : Optional_Value_Type) return T is (Optional_Value.Optional.Value); end Optional_Values; --- generic type T is private; package Optional_Values is pragma Pure; type Optional_Value_Type is private; Null_Optional_Value : constant Optional_Value_Type; function Of_Value (Value : T) return Optional_Value_Type; function Has_Value (Optional_Value : Optional_Value_Type) return Boolean; function Get_Value (Optional_Value : Optional_Value_Type) return T with Pre = Has_Value (Optional_Value); private type Internal_Type (Has_Value : Boolean := False) is record case Has_Value is when True = Value : T; when False = null; end case; end record; type Optional_Value_Type is record Optional : Internal_Type; end record; Null_Optional_Value : constant 
Optional_Value_Type := (Optional = (Has_Value = False)); function Has_Value (Optional_Value : Optional_Value_Type) return Boolean is (Optional_Value.Optional.Has_Value); end Optional_Values; --- function Optional_Values_Map (Optional_Input : Optional_Inputs.Optional_Value_Type) return Optional_Outputs.Optional_Value_Type is use Optional_Inputs; begin if Has_Value (Optional_Input) then return Optional_Outputs.Of_Value (Map (Get_Value (Optional_Input))); end if; return Optional_Outputs.Null_Optional_Value; end Optional_Values_Map; --- with Optional_Values; generic type Input_Type is private; type Output_Type is private; with package Optional_Inputs is new Optional_Values (T = Input_Type); with package Optional_Outputs is new Optional_Values (T = Output_Type); with function Map (Input : Input_Type) return Output_Type; function Optional_Values_Map (Optional_Input : Optional_Inputs.Optional_Value_Type) return Optional_Outputs.Optional_Value_Type; Tested on x86_64-pc-linux-gnu, committed on trunk 2014-10-23 Ed Schonberg schonb...@adacore.com * sem_attr.adb (Analyze_Attribute): The prefix of attribute Elaborated does not require freezing, in particular if it denotes a generic function. Index: sem_attr.adb === --- sem_attr.adb(revision 216574) +++ sem_attr.adb(working copy) @@ -11164,8 +11164,17 @@ -- Normally the Freezing is done by Resolve but sometimes the Prefix -- is not resolved, in which case the freezing must be done now. - Freeze_Expression (P); + -- For an elaboration check on a subprogram, we do not freeze its type. + -- It may be declared in an unrelated scope, in particular in the case + -- of a generic function whose type may remain unelaborated. + if Attr_Id = Attribute_Elaborated then + null; + + else + Freeze_Expression (P); + end if; + -- Finally perform static evaluation on the attribute reference Analyze_Dimension (N);
Re: [COMMITTED][PATCH][ARM] Add ACLE 2.0 predefined macro __ARM_FEATURE_IDIV
> Are you sure that the ACLE documents this with trailing underscores? The
> copy that I have doesn't.

You are right, that was careless of me. I have double-checked: the macro
should be __ARM_FEATURE_IDIV. Could you please apply an obvious fix?

Thank you so much!

Kind regards,
Renlin
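For context, a feature-test macro such as __ARM_FEATURE_IDIV is normally consumed with a plain `#ifdef`, which is why the exact spelling (no trailing underscores) matters. The sketch below is illustrative only: `udiv32` is a hypothetical helper, and the shift-subtract fallback is just one possible software path, not what GCC's runtime library does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: divide two 32-bit unsigned values, choosing the code
   path at compile time from the ACLE feature macro.  */
uint32_t udiv32(uint32_t n, uint32_t d)
{
    assert(d != 0);
#ifdef __ARM_FEATURE_IDIV
    /* The target advertises hardware integer division, so a plain '/'
       is expected to be cheap.  */
    return n / d;
#else
    /* Illustrative software fallback: restoring (shift-subtract) division.
       The remainder is kept in 64 bits so the left shift cannot overflow.  */
    uint32_t q = 0;
    uint64_t r = 0;
    for (int i = 31; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1u);
        if (r >= d) {
            r -= d;
            q |= (uint32_t)1 << i;
        }
    }
    return q;
#endif
}
```

A misspelled macro name would silently select the `#else` branch on every target, which is the kind of breakage the obvious fix avoids.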
Re: [PATCH diagnostics] PR 53061 cleanup initialization
Hello Manuel, Manuel López-Ibáñez lopeziba...@gmail.com writes: This is an old patch of mine that never got finished. I updated it following the suggestions of Gabriel here https://gcc.gnu.org/ml/gcc-patches/2012-04/msg00443.html Thanks for looking at this again. Bootstrapped and tested on x86_64-linux-gnu. OK? I think the patch is good. I only have minor observations regarding comments and one function naming. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 215890) +++ gcc/doc/invoke.texi (working copy) @@ -3075,15 +3075,14 @@ information should be reported. Note th honor these options. @table @gcctabopt @item -fmessage-length=@var{n} @opindex fmessage-length -Try to format error messages so that they fit on lines of about @var{n} -characters. The default is 72 characters for @command{g++} and 0 for the rest of -the front ends supported by GCC@. If @var{n} is zero, then no -line-wrapping is done; each error message appears on a single -line. +Try to format error messages so that they fit on lines of about +@var{n} characters. If @var{n} is zero, then no line-wrapping will be +done; each error message will appear on a single line. This is the +default for all front ends. Agreed. @item -fdiagnostics-show-location=once @opindex fdiagnostics-show-location Only meaningful in line-wrapping mode. Instructs the diagnostic messages reporter to emit source location information @emph{once}; that is, in Index: gcc/c-family/c-opts.c === --- gcc/c-family/c-opts.c (revision 215890) +++ gcc/c-family/c-opts.c (working copy) @@ -176,25 +176,14 @@ c_diagnostic_finalizer (diagnostic_conte virt_loc_aware_diagnostic_finalizer (context, diagnostic); pp_destroy_prefix (context-printer); pp_newline_and_flush (context-printer); } -/* Common diagnostics initialization. */ +/* Common default settings for diagnostics. 
*/ void -c_common_initialize_diagnostics (diagnostic_context *context) +c_common_diagnostics_defaults (diagnostic_context *context) Please, call this c_common_diagnostics_set_defaults(). Having a verb in the function name (at least for functions that are not accessors) makes it easier to read, I think. { - /* This is conditionalized only because that is the way the front - ends used to do it. Maybe this should be unconditional? */ - if (c_dialect_cxx ()) -{ - /* By default wrap lines at 80 characters. Is getenv - (COLUMNS) preferable? */ - diagnostic_line_cutoff (context) = 80; - /* By default, emit location information once for every - diagnostic message. */ - diagnostic_prefixing_rule (context) = DIAGNOSTICS_SHOW_PREFIX_ONCE; -} diagnostic_finalizer (context) = c_diagnostic_finalizer; context-opt_permissive = OPT_fpermissive; } OK. /* Whether options from all C-family languages should be accepted Index: gcc/c-family/c-common.h === --- gcc/c-family/c-common.h (revision 215890) +++ gcc/c-family/c-common.h (working copy) @@ -824,11 +824,11 @@ extern void set_compound_literal_name (t extern tree build_va_arg (location_t, tree, tree); extern const unsigned int c_family_lang_mask; extern unsigned int c_common_option_lang_mask (void); -extern void c_common_initialize_diagnostics (diagnostic_context *); +extern void c_common_diagnostics_defaults (diagnostic_context *); c_common_diagnostics_defaults - c_common_diagnostics_set_defaults. 
extern bool c_common_complain_wrong_lang_p (const struct cl_option *); extern void c_common_init_options_struct (struct gcc_options *); extern void c_common_init_options (unsigned int, struct cl_decoded_option *); extern bool c_common_post_options (const char **); extern bool c_common_init (void); Index: gcc/c/c-objc-common.c === --- gcc/c/c-objc-common.c (revision 215890) +++ gcc/c/c-objc-common.c (working copy) @@ -60,19 +60,11 @@ c_warn_unused_global_decl (const_tree de bool c_objc_common_init (void) { c_init_decl_processing (); - if (c_common_init () == false) -return false; - - /* These were not defined in the Objective-C front end, but I'm - putting them here anyway. The diagnostic format decoder might - want an enhanced ObjC implementation. */ - diagnostic_format_decoder (global_dc) = c_tree_printer; - - return true; + return c_common_init (); } OK. /* Called during diagnostic message formatting process to print a source-level entity onto BUFFER. The meaning of the format specifiers is as follows: @@ -184,19 +176,20 @@ has_c_linkage (const_tree decl ATTRIBUTE } void c_initialize_diagnostics (diagnostic_context *context)
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On 10/23/2014 02:16 PM, Jakub Jelinek wrote:
> On Thu, Oct 23, 2014 at 02:09:47PM +0400, Andrey Ryabinin wrote:
>> On 10/23/2014 01:55 PM, Jakub Jelinek wrote:
>>> On Thu, Oct 23, 2014 at 01:51:12PM +0400, Andrey Ryabinin wrote:
>>>> IMO we don't need different versions of __asan_load* and
>>>> __asan_load*_noabort, because -fno-sanitize-recover=kernel-address
>>>> will never work with the linux kernel.
>>>>
>>>> I already said this before, and repeat this once again:
>>>> There is few places in kernel where we validly touch poisoned memory,
>>>> so we need to disable error reporting in runtime for such memory
>>>> accesses. I use per-thread flag which is raised before the valid
>>>> access to poisoned memory. This flag checked in __asan_report*()
>>>> function. If it raised then we shouldn't print any error message,
>>>> just silently exit from report.
>>>
>>> Can't you just use __attribute__((no_sanitize_address)) on the
>>> functions that have such a code? Or you could use special macros for
>>> those accesses (which could e.g. call function to read memory or write
>>> memory, implemented in assembly or in
>>> __attribute__((no_sanitize_address)) function), or
>>
>> Those are quite generic functions used from a lot of places. So we want
>> to instrument them in general, but there are few call sites which use
>> those functions for poisoned memory.
>
> Actually, -fsanitize=kernel-address forcibly uses function calls
> (i.e. __asan_load* etc. rather than __asan_report_load* only if inline
> shadow memory test suggested there is a problem).

Actually this is a historical artifact. If inlining proves to be
significantly faster, they may want to switch.

> So, at that point you can include your ugly hacks in __asan_load* logic
> in the kernel, the difference between __asan_load4 and
> __asan_load4_noabort will be just that the latter will always return,
> while the former will not if some error has been reported.
> All the __asan_load* and __asan_store* entrypoints, regardless of
> -f{,no-}sanitize-recover=kernel-address are by definition not noreturn,
> they in the common case (if the code is not buggy) return.

Perhaps we should just keep __asan_load* as is and leave the decision
whether to abort or continue for the runtime? This would make semantics of
-fsanitize-recover cumbersome though (because it wouldn't work if user
selects outline instrumentation).

-Y
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On Thu, Oct 23, 2014 at 02:33:42PM +0400, Yury Gribov wrote:
> Actually this is a historical artifact. If inlining proves to be
> significantly faster, they may want to switch.

Ok.

>> So, at that point you can include your ugly hacks in __asan_load* logic
>> in the kernel, the difference between __asan_load4 and
>> __asan_load4_noabort will be just that the latter will always return,
>> while the former will not if some error has been reported.
>> All the __asan_load* and __asan_store* entrypoints, regardless of
>> -f{,no-}sanitize-recover=kernel-address are by definition not noreturn,
>> they in the common case (if the code is not buggy) return.
>
> Perhaps we should just keep __asan_load* as is and leave the decision
> whether to abort or continue for the runtime? This would make semantics
> of -fsanitize-recover cumbersome though (because it wouldn't work if user
> selects outline instrumentation).

Well, the "don't ever report anything while some per-CPU flag is set" thing
can be considered as part of the "is this memory access ok" test, it is
pretending everything is accessible.

But, otherwise, if it is supposed to be developer's decision at compile
time, __asan_load*_noabort should better always continue, even if it
reported issues, and __asan_load* should better not return after reporting
errors.

	Jakub
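The semantics Jakub describes can be modelled with a small C sketch. It rests on loud assumptions: the `toy_*` functions are hypothetical stand-ins rather than the real `__asan_load*` entry points, and the "shadow memory" is reduced to a single poisoned address range. The suppression flag is folded into the "is this access ok" test, the `_noabort` variant always returns, and the plain variant does not return once it has reported an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy shadow state: one poisoned range [poison_start, poison_end).  */
static uintptr_t poison_start, poison_end;
static _Thread_local bool checks_disabled;  /* hypothetical suppression flag */
static int errors_reported;

void toy_poison(const void *p, size_t n)
{
    assert(n > 0);
    poison_start = (uintptr_t)p;
    poison_end = poison_start + n;
}

void toy_set_checks_disabled(bool disabled)
{
    checks_disabled = disabled;
}

int toy_errors_reported(void)
{
    return errors_reported;
}

/* The suppression flag is part of the accessibility test itself: while it
   is set, every access is treated as valid.  */
static bool toy_access_ok(const void *addr, size_t size)
{
    uintptr_t a = (uintptr_t)addr;
    if (checks_disabled)
        return true;
    return a + size <= poison_start || a >= poison_end;
}

/* Recovering variant: reports the error and always returns.  */
void toy_load4_noabort(const void *addr)
{
    if (!toy_access_ok(addr, 4))
        errors_reported++;
}

/* Non-recovering variant: returns in the common (valid) case, but does not
   return after an error has been reported.  */
void toy_load4(const void *addr)
{
    if (!toy_access_ok(addr, 4)) {
        errors_reported++;
        abort();
    }
}
```

Both variants return normally for valid accesses, so neither could be declared noreturn, which matches the point made in the quoted text.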
Re: C/C++ diagnostics guidelines
Manuel López-Ibáñez lopeziba...@gmail.com writes:

> On 17 October 2014 19:33, Joseph S. Myers jos...@codesourcery.com wrote:
>> On Fri, 17 Oct 2014, Manuel López-Ibáñez wrote:
>>
>>> Thus, I drafted some guidelines at:
>>> https://gcc.gnu.org/wiki/Better_Diagnostics#guidelines
>>>
>>> Please, could you take a look and comment whether I got it right/wrong?
>>
>> Yes, that looks right to me.
>
> Thanks! I added guidelines also about locations and warning options.

I like these very much. Thank you for looking into this.

To take this further, I am thinking that these guidelines would be even
better served by standing on their own page. If nobody objects, I can
create a DiagnosticsGuidelines page in the wiki with the content that you
added.

Cheers,

-- 
		Dodji
[Ada] Spurious warnings about unused units
A prefixed call is resolved by examining the dispatch table of the
controlling object, not by visibility. The operation may be defined in
another unit, and no reference to it, or to its scope, might be created for
it through the usual Generate_Reference machinery. To prevent spurious
warnings about unused units, indicate that the operation and its scope are
in fact referenced.

No simple example available.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-23  Ed Schonberg  schonb...@adacore.com

	* sem_ch4.adb (Complete_Object_Operation): Indicate that the scope
	of the operation (s) is referenced, to prevent spurious warnings
	about unused units.

Index: sem_ch4.adb
===================================================================
--- sem_ch4.adb (revision 216574)
+++ sem_ch4.adb (working copy)
@@ -7617,6 +7617,17 @@
             Rewrite (First_Actual, Obj);
          end if;

+         --  The operation is obtained from the dispatch table and not by
+         --  visibility, and may be declared in a unit that is not explicitly
+         --  referenced in the source, but is nevertheless required in the
+         --  context of the current unit. Indicate that operation and its scope
+         --  are referenced, to prevent spurious and misleading warnings. If
+         --  the operation is overloaded, all primitives are in the same scope
+         --  and we can use any of them.
+
+         Set_Referenced (Entity (Subprog), True);
+         Set_Referenced (Scope (Entity (Subprog)), True);
+
          Rewrite (Node_To_Replace, Call_Node);

          --  Propagate the interpretations collected in subprog to the new
[Ada] Introduce new message kind Check
This patch introduces a kind of message which is not an error (i.e. is non-fatal), but is not a warning either (cannot be suppressed with pragma Warnings). This new kind is called a check, and is recognized by a severity prefix low: , medium: or high: . This new message kind is to be used by the gnat2why backend for detected runtime-checks and other issues. Tested on x86_64-pc-linux-gnu, committed on trunk 2014-10-23 Johannes Kanig ka...@adacore.com * errout.adb (Error_Msg_Internal): Copy check flag, increment check msg count. * erroutc.adb (Delete_Msg) adjust check msg count. (Output_Msg_Text) handle check msg case (do nothing). (Prescan_Message) recognize check messages with severity prefixes. * errutil.adb (Error_Msg) handle check flag, adjust counter. Index: errout.adb === --- errout.adb (revision 216574) +++ errout.adb (working copy) @@ -982,6 +982,7 @@ Col = Get_Column_Number (Sptr), Warn = Is_Warning_Msg, Info = Is_Info_Msg, + Check= Is_Check_Msg, Warn_Err = False, -- reset below Warn_Chr = Warning_Msg_Char, Style= Is_Style_Msg, @@ -1140,6 +1141,9 @@ Info_Messages := Info_Messages + 1; end if; + elsif Errors.Table (Cur_Msg).Check then + Check_Messages := Check_Messages + 1; + else Total_Errors_Detected := Total_Errors_Detected + 1; Index: errout.ads === --- errout.ads (revision 216574) +++ errout.ads (working copy) @@ -413,6 +413,13 @@ -- are continuations that are not printed using the -gnatj switch they -- will also have this prefix. + --Insertion sequence low: or medium: or high: (check message) + -- This appears only at the start of the message (and not any of its + -- continuations, if any), and indicates that the message is a check + -- message. The message will be output with this prefix. Check + -- messages are not fatal (so are like info messages in that respect) + -- and are not controlled by pragma Warnings. 
+ - -- Global Values Used for Error Message Insertions -- - Index: atree.ads === --- atree.ads (revision 216574) +++ atree.ads (working copy) @@ -320,6 +320,10 @@ -- Number of info messages generated. Info messages are neved treated as -- errors (whether from use of the pragma, or the compiler switch -gnatwe). + Check_Messages : Nat := 0; + -- Number of check messages generated. Check messages are neither warnings + -- nor errors. + Warnings_Treated_As_Errors : Nat := 0; -- Number of warnings changed into errors as a result of matching a pattern -- given in a Warning_As_Error configuration pragma. Index: errutil.adb === --- errutil.adb (revision 216574) +++ errutil.adb (working copy) @@ -213,6 +213,7 @@ Col = Get_Column_Number (Sptr), Warn = Is_Warning_Msg, Info = Is_Info_Msg, +Check= Is_Check_Msg, Warn_Err = Warning_Mode = Treat_As_Error, Warn_Chr = Warning_Msg_Char, Style= Is_Style_Msg, @@ -313,6 +314,9 @@ Info_Messages := Info_Messages + 1; end if; + elsif Errors.Table (Cur_Msg).Check then + Check_Messages := Check_Messages + 1; + else Total_Errors_Detected := Total_Errors_Detected + 1; Index: erroutc.adb === --- erroutc.adb (revision 216582) +++ erroutc.adb (working copy) @@ -145,6 +145,9 @@ -- because this only gets incremented if we actually output the -- message, which we won't do if we are deleting it here! +elsif Errors.Table (D).Check then + Check_Messages := Check_Messages - 1; + else Total_Errors_Detected := Total_Errors_Detected - 1; @@ -653,6 +656,11 @@ elsif Errors.Table (E).Style then null; +-- No prefix needed for check message, severity is there already + + elsif Errors.Table (E).Check then +null; + -- All other cases, add error: if unique error tag set elsif Opt.Unique_Error_Tag then @@ -765,6 +773,15 @@ Is_Info_Msg := Msg'Length 6 and then Msg (Msg'First .. Msg'First + 5) = info: ; + -- Check check message + + Is_Check_Msg := +(Msg'Length 8 and then Msg (Msg'First .. Msg'First + 7) = medium: ) +or else + (Msg'Length 6 and then Msg
[Ada] Spurious conformance error with instance and child unit
This patch fixes a spurious subtype conformance error in a child unit when the actual is a private type declared in a package instantiation, whose full view is a constrained discriminated type. The following must compile quietly: gcc -c new_bounded_strings-child.adb --- with G_Instance; package body New_Bounded_Strings.Child is procedure P (S : New_Bounded_Strings.NBString) is null; begin G_Instance.R (P'Access); end; --- package New_Bounded_Strings.Child is -- Needs to be a child package procedure P (S : New_Bounded_Strings.NBString); end; -- with Ada.Strings.Bounded; package Bounded_Strings_Instance is package BS is new Ada.Strings.Bounded.Generic_Bounded_Length (10); type BString is new BS.Bounded_String; end; -- generic type T is private; package G is type Proc is access procedure (X : T); procedure R (P : Proc); end; -- with G; with New_Bounded_Strings; package G_Instance is new G (New_Bounded_Strings.NBString); -- with Bounded_Strings_Instance; package New_Bounded_Strings is type NBString is private; -- compiles if not private private type NBString is new Bounded_Strings_Instance.BString; end; Tested on x86_64-pc-linux-gnu, committed on trunk 2014-10-23 Ed Schonberg schonb...@adacore.com * sem_eval.adb (Subtypes_Statically_Match): For a generic actual type, check for the presence of discriminants in its parent type, against the presence of discriminants in the context type. Index: sem_eval.adb === --- sem_eval.adb(revision 216582) +++ sem_eval.adb(working copy) @@ -5737,7 +5737,17 @@ -- same base type. if Has_Discriminants (T1) /= Has_Discriminants (T2) then -if In_Instance then +-- A generic actual type is declared through a subtype declaration +-- and may have an inconsistent indication of the presence of +-- discriminants, so check the type it renames. 
+ +if Is_Generic_Actual_Type (T1) + and then not Has_Discriminants (Etype (T1)) + and then not Has_Discriminants (T2) +then + return True; + +elsif In_Instance then if Is_Private_Type (T2) and then Present (Full_View (T2)) and then Has_Discriminants (Full_View (T2))
[Ada] Spurious warnings about unreferenced entities of a packed type
A reference to an element of a packed array is rewritten through conversions and masks and shifts. If the prefix of the reference is a source entity we must generate a reference to it for cross-reference purposes, and to prevent spurious warnings about unused entities. Compiling gcc -c -gnatwa case_2.adb must yield: case_2.adb:10:06: warning: unit System is not referenced case_2.adb:37:13: warning: 1 bits of Storage_Record_Type unused -- -- Test case for anomalous GNAT compiler warning. -- Compile as follows: -- gnatmake -f -gnatwf case_2 -- -- This will create a warning that the formal parameter Storage of the -- function Test_Function is not referenced, even though it is. -- with System; with Text_IO; procedure Case_2 is -- -- 32-bit signed integer type -- type My_Long is range -(2 ** 31) .. +(2 ** 31) - 1; for My_Long'size use 32; -- -- Changing this to 64 will prevent the warning from being reported. -- Record_Size : constant := 65; -- -- Note: this type is defined as having an extra bit unused for ease of -- switching the size to 64 to see the warning go away. -- Even adding an extra field to fill in the unused bit, GNAT still -- reports the warning. -- type Storage_Record_Type is record E : My_Long; F : My_Long; end record with Size = Record_Size; type My_Storage_Array_Type is array (My_Long range ) of Storage_Record_Type with Pack = True; -- -- GNAT gives a warning that Storage is not used. -- function Test_Function (Storage : in My_Storage_Array_Type; Left: in My_Long; Right : in My_Long) return Boolean is begin return Storage (Left).E Storage (Right).F; end Test_Function; -- -- Dummy variables to support the below code. -- Test : My_Storage_Array_Type (1 .. 1); Result : Boolean; begin -- -- This small bit of code is only here to ensure that none of the above -- parts get optimized out during compilation. 
-- Test (1) := (E = 0, F = 0); Result := Test_Function (Test, 1, 1); Text_IO.Put_Line (Boolean'image (Result)); end Case_2; Tested on x86_64-pc-linux-gnu, committed on trunk 2014-10-23 Ed Schonberg schonb...@adacore.com * exp_pakd.adb (Expand_Packed_Element_Reference): If the prefix is a source entity, generate a reference to it before transformation, because rewritten node might not generate a proper reference, leading to spurious warnings. Index: exp_pakd.adb === --- exp_pakd.adb(revision 216574) +++ exp_pakd.adb(working copy) @@ -30,6 +30,7 @@ with Exp_Dbug; use Exp_Dbug; with Exp_Util; use Exp_Util; with Layout; use Layout; +with Lib.Xref; use Lib.Xref; with Namet;use Namet; with Nlists; use Nlists; with Nmake;use Nmake; @@ -1682,6 +1683,16 @@ Expand_Packed_Element_Reference (Prefix (N)); end if; + -- The prefix may be rewritten below as a conversion. If it is a source + -- entity generate reference to it now, to prevent spurious warnings + -- about unused entities. + + if Is_Entity_Name (Prefix (N)) +and then Comes_From_Source (Prefix (N)) + then + Generate_Reference (Entity (Prefix (N)), Prefix (N), 'r'); + end if; + -- If not bit packed, we have the enumeration case, which is easily -- dealt with (just adjust the subscripts of the indexed component)
Re: C/C++ diagnostics guidelines
Manuel López-Ibáñez lopeziba...@gmail.com writes:

> On 17 October 2014 20:04, Manuel López-Ibáñez lopeziba...@gmail.com wrote:
> > On 17 October 2014 19:33, Joseph S. Myers jos...@codesourcery.com wrote:
> > > On Fri, 17 Oct 2014, Manuel López-Ibáñez wrote:
> > > > Thus, I drafted some guidelines at:
> > > > https://gcc.gnu.org/wiki/Better_Diagnostics#guidelines
> > > >
> > > > Please, could you take a look and comment whether I got it
> > > > right/wrong?
> > >
> > > Yes, that looks right to me.
> >
> > Thanks! I added guidelines also about locations and warning options. I
> > believe there are also some rules about when to use some special
> > line-map functions that arise when warning about macros like NULL, but
> > I am not aware of the specifics. It would be useful if someone added
> > those.
>
> Dodji, Paolo? Do you know what I'm talking about?

Hmmh, I am not sure. Do you have any example of warning about such macros?

Cheers,

--
                Dodji
Re: [PATCH AVX512] [81.1/n] Extend `function_code' field in `tree_var_decl' structure.
On Wed, Oct 22, 2014 at 11:51:32AM +0400, Kirill Yukhin wrote:
> gcc/
>         * tree-core.h (tree_var_decl): Extend `function_code' field
>         by one bit, move `regdecl_flag' field to ...
>         (tree_decl_with_vis): Here.
>         * tree.h (DECL_STATIC_CHAIN): Update struct name.
>
> Bootstrapped.
>
> Is it ok for trunk?

Ok for trunk for now, I hope we can reduce the number of builtins in the future though. Even the current 1470 i?86 builtins is too big IMHO, and if you add hundreds further builtins...

        Jakub
Re: [PATCHv4] Enable -fsanitize-recover for KASan
On 10/23/2014 02:38 PM, Jakub Jelinek wrote:
> On Thu, Oct 23, 2014 at 02:33:42PM +0400, Yury Gribov wrote:
> > Actually this is a historical artifact. If inlining proves to be
> > significantly faster, they may want to switch.
>
> Ok. So, at that point you can include your ugly hacks in __asan_load*
> logic in the kernel, the difference between __asan_load4 and
> __asan_load4_noabort will be just that the latter will always return,
> while the former will not if some error has been reported.
> All the __asan_load* and __asan_store* entrypoints, regardless of
> -f{,no-}sanitize-recover=kernel-address are by definition not noreturn,
> they in the common case (if the code is not buggy) return.
>
> > Perhaps we should just keep __asan_load* as is and leave the decision
> > whether to abort or continue for the runtime? This would make semantics
> > of -fsanitize-recover cumbersome though (because it wouldn't work if
> > user selects outline instrumentation).
>
> Well, the "don't ever report anything while some per-CPU flag is set"
> thing can be considered as part of the "is this memory access ok" test,
> it is pretending everything is accessible.
>
> But, otherwise, if it is supposed to be developer's decision at compile
> time, __asan_load*_noabort should better always continue, even if it
> reported issues, and __asan_load* should better not return after
> reporting errors.

True, but why do we need new functions for that? __asan_load could also abort or not depending on what the user/developer wants. Why do we have to rebuild the entire kernel if someone wants to switch from abort to noabort?

I'm not against __asan_load_noabort, I'm just saying that there is no point in having separate __asan_load/__asan_load_noabort functions in the kernel.
Re: [PATCH diagnostics/fortran] dynamically generate locations from offset + handle %C
Hello Manuel, Manuel López-Ibáñez lopeziba...@gmail.com writes: Dodji, are the linemap_asserts() appropriate? Yes they are. I have some additional comments though. libcpp/ChangeLog: 2014-10-16 Manuel López-Ibáñez m...@gcc.gnu.org PR fortran/44054 * include/line-map.h (linemap_position_for_loc_and_offset): Declare. * line-map.c (linemap_position_for_loc_and_offset): New. [...] --- libcpp/include/line-map.h (revision 216257) +++ libcpp/include/line-map.h (working copy) @@ -601,10 +601,17 @@ linemap_position_for_column (struct line column. */ source_location linemap_position_for_line_and_column (const struct line_map *, linenum_type, unsigned int); +/* Encode and return a source_location starting from location LOC + and shifting it by OFFSET columns. */ +source_location +linemap_position_for_loc_and_offset (struct line_maps *set, + source_location loc, + unsigned int offset); + OK. [...] --- libcpp/line-map.c (revision 216257) +++ libcpp/line-map.c (working copy) [...] +/* Encode and return a source_location starting from location LOC + and shifting it by OFFSET columns. */ + The comment is OK. I would just add that this function currently only works with non-virtual locations. +source_location +linemap_position_for_loc_and_offset (struct line_maps *set, + source_location loc, + unsigned int offset) +{ + const struct line_map * map = NULL; + + /* This function does not support virtual locations yet. */ + linemap_assert (!linemap_location_from_macro_expansion_p (set, loc)); + + if (offset == 0) +return loc; Here, I'd replace the above condition and return status statement with: if (offset == 0 /* Adding an offset to a reserved location (like UNKNOWN_LOCATION for the C/C++ FEs) does not really make sense. So let's live the location intact in that case. */ || loc RESERVED_LOCATION) return loc; + + /* First, we find the real location and shift it. 
*/ + loc = linemap_resolve_location (set, loc, LRK_SPELLING_LOCATION, map); + linemap_assert (MAP_START_LOCATION (map) loc + offset); OK. First I'd add a comment above the assert that says: /* The new location (loc + offset) should be higher than the first location encoded by MAP. */ and I'd add another assert: /* If MAP is not the last line map of its set, then the new location (loc + offset) should be less than the first location encoded by the next line map of the set. */ if (map LINEMAPS_LAST_ORDINARY_MAP(set)) linemap_assert(MAP_START_LOCATION(map[1]) loc + offset); + + offset += SOURCE_COLUMN (map, loc); + linemap_assert (offset (1u map-d.ordinary.column_bits)); + + source_location r = +linemap_position_for_line_and_column (map, + SOURCE_LINE (map, loc), + offset); + linemap_assert (map == linemap_lookup (set, r)); + return r; +} + OK. So the line map part of the patch is OK from me if it passes bootstrap with the added asserts. Thank you for looking into this. Cheers. -- Dodji
Re: [PATCH] Var-tracking initialization fix (PR debug/63623)
On Thu, Oct 23, 2014 at 9:30 AM, Jakub Jelinek ja...@redhat.com wrote: Hi! As I wrote in the PR, vt_stack_adjustments can often compute wrong offsets, because it never considers pops with autoinc addressing, which can lead either to wrong debug info, or turning off -fvar-tracking altogether for a function on which that issue resulted in stack depth inconsistencies on the edges. Here are some stats from --enable-checking=yes,rtl cc1plus bootstrapped with/without the patch (without the patch got then rebuilt stage3 with the patch, i.e. just var-tracking.o and cc1plus-checksum.o got recompiled, so I'm comparing identical code, different debug info): x86_64-linux cc1plus built without the patch: cov%samples cumul 0.0 506230/38% 506230/38% 0..10 10327/0%516557/39% 11..20 12390/0%528947/39% 21..30 31265/2%560212/42% 31..40 18775/1%578987/43% 41..50 20631/1%599618/45% 51..60 24921/1%624539/47% 61..70 40959/3%665498/50% 71..80 23771/1%689269/52% 81..90 41771/3%731040/55% 91..99 81667/6%812707/61% 100 510564/38% 1323271/100% x86_64-linux cc1plus built with the patch: cov%samples cumul 0.0 382214/28% 382214/28% 0..10 13100/0%395314/29% 11..20 14568/1%409882/30% 21..30 33708/2%443590/33% 31..40 21927/1%465517/35% 41..50 23924/1%489441/36% 51..60 28736/2%518177/39% 61..70 45847/3%564024/42% 71..80 29284/2%593308/44% 81..90 52085/3%645393/48% 91..99 99971/7%745364/56% 100 577907/43% 1323271/100% i686-linux cc1plus built without the patch: cov%samples cumul 0.0 631348/48% 631348/48% 0..10 7764/0% 639112/48% 11..20 9690/0% 648802/49% 21..30 25036/1%673838/51% 31..40 16113/1%689951/52% 41..50 19753/1%709704/54% 51..60 14563/1%724267/55% 61..70 34093/2%758360/58% 71..80 17450/1%775810/59% 81..90 31339/2%807149/61% 91..99 60368/4%867517/66% 100 437548/33% 1305065/100% i686-linux cc1plus built with the patch: cov%samples cumul 0.0 377352/28% 377352/28% 0..10 16077/1%393429/30% 11..20 15390/1%408819/31% 21..30 31790/2%440609/33% 31..40 23889/1%464498/35% 41..50 29267/2%493765/37% 
51..60 22902/1%516667/39% 61..70 45629/3%562296/43% 71..80 29511/2%591807/45% 81..90 50536/3%642343/49% 91..99 93584/7%735927/56% 100 569138/43% 1305065/100% .debug_info/.debug_loc sizes in bytes: x86_64-linux cc1plus without patch .debug_info 75411710, .debug_loc 75421077 x86_64-linux cc1plus withpatch .debug_info 78498790, .debug_loc 90530117 i686-linux cc1plus without patch .debug_info 59921183, .debug_loc 37823166 i686-linux cc1plus withpatch .debug_info 63009554, .debug_loc 59535100 I've also performed instrumented bootstraps/regtests (x86_64-linux and i686-linux), where I've logged in how many functions the result of vt_stack_adjustments differed between the bad old way and new way. In both the bootstraps/regtests, it affected 16892 32-bit and 6646 64-bit functions, in all cases it was old way giving up and new way succeeding. Not adding a testcase, as the one in the PR failed to produce proper debug info only in 4.8 (then got latent), there already are some guality improvements with the patch: -FAIL: gcc.dg/guality/pr54693-2.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions line 21 i == v + 1 -FAIL: gcc.dg/guality/pr54693-2.c -O3 -fomit-frame-pointer -funroll-loops line 21 i == v + 1 on x86_64 and: -FAIL: gcc.dg/guality/pr54519-1.c -O2 line 20 y == 25 -FAIL: gcc.dg/guality/pr54519-1.c -O2 line 20 z == 6 -FAIL: gcc.dg/guality/pr54519-1.c -O2 line 23 y == 117 -FAIL: gcc.dg/guality/pr54519-1.c -O2 line 23 z == 8 -FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 x == 36 -FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 y == 25 -FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 z == 6 -FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 x == 36 -FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 y == 25 -FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 z == 6 -FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 x == 36 -FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 y == 25 -FAIL: gcc.dg/guality/pr54519-3.c -O2 line 
20 z == 6 -FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 x == 98 -FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 y == 117 -FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 z == 8 -FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto line 20 x == 36 -FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto line 23 x == 98 -FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto -flto-partition=none line 20 x == 36
[C PATCH] Don't output warning twice (PR c/63626)
At present, we print the "inline function ... declared but never defined" warning twice. The reason for that is that this warning is being printed in pop_scope, which is called when popping file scope (c_common_parse_file->pop_file_scope), and when popping external scope (c_write_global_declarations). I think we should not print this warning when popping the external scope. We don't have to worry about nested functions here.

Writing a proper testcase is a little bit tricky, but I hope what I did would work fine.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-10-23  Marek Polacek  pola...@redhat.com

        PR c/63626
        * c-decl.c (pop_scope): Don't print warning in external_scope.

        * gcc.dg/pr63626.c: New test.

diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index b18da48..7d1840e 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -1182,7 +1182,8 @@ pop_scope (void)
                  with an inline function specifier ... shall also be defined
                  in the same translation unit."  */
               if (!flag_gnu89_inline
-                  && !lookup_attribute ("gnu_inline", DECL_ATTRIBUTES (p)))
+                  && !lookup_attribute ("gnu_inline", DECL_ATTRIBUTES (p))
+                  && scope != external_scope)
                 pedwarn (input_location, 0,
                          "inline function %q+D declared but never defined", p);
               DECL_EXTERNAL (p) = 1;
diff --git gcc/testsuite/gcc.dg/pr63626.c gcc/testsuite/gcc.dg/pr63626.c
index e69de29..1f03818 100644
--- gcc/testsuite/gcc.dg/pr63626.c
+++ gcc/testsuite/gcc.dg/pr63626.c
@@ -0,0 +1,8 @@
+/* PR c/63626 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+/* Test that we don't output the warning twice.  */
+
+inline int foo (void); /* { dg-bogus "inline function.*inline function" } */
+/* { dg-warning "inline function .foo. declared but never defined" "" { target *-*-* } 7 } */

        Marek
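For readers who do not have c-decl.c at hand, the effect of the extra scope test can be illustrated with a toy model. This is only a sketch: `toy_decl`, `toy_pop_scope`, `toy_count_warnings` and the `scope_kind` enum are hypothetical names invented for illustration, not GCC internals. The only assumption taken from the message is that the same declaration is visited once when the file scope is popped and once when the external scope is popped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two scopes named in the message.  */
enum scope_kind { FILE_SCOPE, EXTERNAL_SCOPE };

/* Toy declaration record; GCC's real tree nodes look nothing like this.  */
struct toy_decl
{
  bool declared_inline;
  bool defined;
  int warnings;   /* how many times the diagnostic fired */
};

/* Model of the check done when a scope is popped.  With SKIP_EXTERNAL
   set, the external scope stays silent, as the patch proposes.  */
static void
toy_pop_scope (enum scope_kind scope, struct toy_decl *d, bool skip_external)
{
  assert (d != NULL);
  if (d->declared_inline
      && !d->defined
      && !(skip_external && scope == EXTERNAL_SCOPE))
    d->warnings++;
}

/* Pop the file scope, then the external scope, and count diagnostics
   for an inline function that is declared but never defined.  */
static int
toy_count_warnings (bool skip_external)
{
  struct toy_decl foo = { true, false, 0 };
  toy_pop_scope (FILE_SCOPE, &foo, skip_external);
  toy_pop_scope (EXTERNAL_SCOPE, &foo, skip_external);
  return foo.warnings;
}
```

In this model, `toy_count_warnings (false)` counts the diagnostic once per popped scope, while `toy_count_warnings (true)` counts it only for the file scope, which is the behavior the dg-bogus directive in the new test is meant to enforce.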
Re: [patch] Excessive alignment in ix86_data_alignment
On 09 Oct 08:25, H.J. Lu wrote: On Thu, Oct 9, 2014 at 1:37 AM, Uros Bizjak ubiz...@gmail.com wrote: On Thu, Oct 9, 2014 at 10:25 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote: On 08 Oct 23:02, Petr Murzin wrote: Hi, I have measured performance impact on Haswell platform according to this input: https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00978.html Kirill, please mention: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61296 in your ChangeLog. What about older processors? Kirill, please collect data on Nehelam/Westmere, Sandybrigde/Ivybride and Silvermont. The optimization was introduced well before Haswell for then current processors, and it was based on the recommendation from Intel optimization guide. If this optimization doesn't apply for new processors, then tune option should be introduced and set accordingly. I believe the original excessive alignment was introduced by cut/paste from https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=ed45e834f305d1f2709bf200a13d5beebc2fcfee to improve x86 FP performance, which might be partially copied from CONSTANT_ALIGNMENT: https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=f7d6703c5d83fc9fb06246d6eb49e9b61098045c -- H.J. Hello, Please have a look at collected data. 
SLM -O2: Test Previous Current Ratio(%) 400.perlbench 11.500011.5000 +0% 401.bzip2 8.7800 8.7500 -0.34% 403.gcc9.7700 9.8200 +0.51% 429.mcf9.900010.1000 +2.02% 445.gobmk 10.400010.4000 +0% 456.hmmer 12.700012.7000 +0% 458.sjeng 10.600010.6000 +0% 462.libquantum25.24.7000 -1.20% 464.h264ref 17.500017.4000 -0.57% 471.omnetpp7.2700 7.2100 -0.82% 473.astar 8.5700 8.5600 -0.11% 483.xalancbmk 10.400010.4000 +0% 410.bwaves24.100024.1000 +0% 416.gamess 9.6900 9.6700 -0.20% 433.milc 9.5400 9.7300 +1.99% 434.zeusmp 8.7000 8.6900 -0.11% 435.gromacs7.7800 7.7700 -0.12% 436.cactusADM 12.400012.3000 -0.80% 437.leslie3d 10.500010.4000 -0.95% 444.namd 9.0100 9.0100 +0% 447.dealII17.800017.8000 +0% 450.soplex11.700011.7000 +0% 453.povray11.700011.7000 +0% 454.calculix 5.8700 5.8700 +0% 459.GemsFDTD 12.100012.1000 +0% 465.tonto 8.4700 8.4700 +0% 470.lbm 17.800017.8000 +0% 481.wrf 13.500013.6000 +0.74% 482.sphinx3 12.600012.6000 +0% Geomeans: INT : 11.20 11.19 -0.05% FP : 11.29 11.29 +0.03% ALL : 11.25 11.25 -0.00% SLM -O3: Test Previous Current Ratio(%) 400.perlbench 11.500011.5000 +0% 401.bzip2 8.7400 8.7400 +0% 403.gcc9.7800 9.8000 +0.20% 429.mcf9.890010.2000 +3.13% 445.gobmk 10.400010.4000 +0% 456.hmmer 12.700012.7000 +0% 458.sjeng 10.600010.6000 +0% 462.libquantum24.800025. +0.80% 464.h264ref 17.400017.4000 +0% 471.omnetpp7.1900 7.3100 +1.66% 473.astar 8.6000 8.5800 -0.23% 483.xalancbmk 10.400010.4000 +0% 410.bwaves24.200024.2000 +0% 416.gamess 9.7000 9.6700 -0.30% 433.milc 9.7300 9.7500 +0.20% 434.zeusmp 8.7000 8.7000 +0% 435.gromacs7.7700 7.7700 +0% 436.cactusADM 12.400012.3000 -0.80% 437.leslie3d 10.400010.4000 +0% 444.namd 9.0100 9.0100 +0% 447.dealII17.800017.9000 +0.56% 450.soplex11.900011.8000 -0.84% 453.povray11.700011.7000 +0% 454.calculix 5.8600 5.8700 +0.17% 459.GemsFDTD 12.100012. 
-0.82% 465.tonto 8.4800 8.4700 -0.11% 470.lbm 17.800017.8000 +0% 481.wrf 13.500013.5000 +0% 482.sphinx3 12.700012.7000 +0% Geomeans: INT : 11.17 11.22 +0.46% FP : 11.31 11.30 -0.12% ALL : 11.25 11.27 +0.12% SNB -O2: Test Previous Current Ratio(%) 400.perlbench 31.300031.3000 +0% 401.bzip2 21.700021.7000 +0% 403.gcc 30.600030.6000 +0% 429.mcf 43.200043.3000 +0.23% 445.gobmk 24.900024.9000 +0% 456.hmmer 23.800023.8000 +0% 458.sjeng 26.26.1000 +0.38% 462.libquantum63.200063.6000 +0.63% 464.h264ref 46.700046.9000 +0.42% 471.omnetpp 23.900023.7000 -0.83% 473.astar 22.800022.8000 +0% 483.xalancbmk 38.900038.7000 -0.51% 410.bwaves55.55.2000 +0.36% 416.gamess28.300028.3000 +0% 433.milc 23.100023. -0.43% 434.zeusmp28.600028.4000 -0.69% 435.gromacs 24.100024.2000 +0.41% 436.cactusADM 43.600043. -1.37% 437.leslie3d 38.2000
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
On 10/23/2014 11:38 AM, Jakub Jelinek wrote: On Thu, Oct 23, 2014 at 11:35:44AM +0200, Rainer Orth wrote: The testcase is clearly i?86/x86_64 specific, so it must be guarded with { target i?86-*-* x86_64-*-* }, and as it is a compile time test, no need Right, that's what my patch already does. for assembler support, so just adding -msse2 option to dg-option is the right thing to do. Ok, will try that. Still no idea why Linux/x86 and Solaris/x86 are different here. Why do you think so? I certainly have: FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors) UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Equal symbols: 1 UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Semantic equality hit:bar-foo in my i686-linux test_summary log. It really depends on how the compiler is configured, if it defaults to march that has sse/sse2 by default, it will succeed, otherwise it will not. Jakub Hello. I combined both patches and run regression tests on x86_64-linux-pc. Ready for trunk? Thanks, Martin diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c index 7358e43..68aabc5 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c +++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c @@ -1,5 +1,5 @@ -/* { dg-do compile } */ -/* { dg-options -O2 -fdump-ipa-icf } */ +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ +/* { dg-options -O2 -msse2 -fdump-ipa-icf } */ #include xmmintrin.h diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-28.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-28.c index 538e0ab..bcaf84e 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipa-icf-28.c +++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-28.c @@ -1,4 +1,4 @@ -/* { dg-do compile } */ +/* { dg-do compile { target init_priority } } */ /* { dg-options -O2 -fdump-ipa-icf-details -fno-inline } */ __attribute__ ((noinline, constructor(200)))
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
On Thu, Oct 23, 2014 at 01:31:53PM +0200, Martin Liška wrote: On 10/23/2014 11:38 AM, Jakub Jelinek wrote: On Thu, Oct 23, 2014 at 11:35:44AM +0200, Rainer Orth wrote: The testcase is clearly i?86/x86_64 specific, so it must be guarded with { target i?86-*-* x86_64-*-* }, and as it is a compile time test, no need Right, that's what my patch already does. for assembler support, so just adding -msse2 option to dg-option is the right thing to do. Ok, will try that. Still no idea why Linux/x86 and Solaris/x86 are different here. Why do you think so? I certainly have: FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors) UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Equal symbols: 1 UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Semantic equality hit:bar-foo in my i686-linux test_summary log. It really depends on how the compiler is configured, if it defaults to march that has sse/sse2 by default, it will succeed, otherwise it will not. Jakub Hello. I combined both patches and run regression tests on x86_64-linux-pc. Ready for trunk? With proper ChangeLog entry yes. diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c index 7358e43..68aabc5 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c +++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c @@ -1,5 +1,5 @@ -/* { dg-do compile } */ -/* { dg-options -O2 -fdump-ipa-icf } */ +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ +/* { dg-options -O2 -msse2 -fdump-ipa-icf } */ #include xmmintrin.h diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-28.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-28.c index 538e0ab..bcaf84e 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipa-icf-28.c +++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-28.c @@ -1,4 +1,4 @@ -/* { dg-do compile } */ +/* { dg-do compile { target init_priority } } */ /* { dg-options -O2 -fdump-ipa-icf-details -fno-inline } */ __attribute__ ((noinline, constructor(200))) Jakub
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
Jakub Jelinek ja...@redhat.com writes: The former can be cured by adding -Wno-psabi to dg-options, but I've no idea what to do about the latter, especially since it doesn't occur on i386-pc-solaris2.11. The testcase is clearly i?86/x86_64 specific, so it must be guarded with { target i?86-*-* x86_64-*-* }, and as it is a compile time test, no need for assembler support, so just adding -msse2 option to dg-option is the right thing to do. Here's what I've committed after testing on x86_64-unknown-linux-gnu, i686-unknown-linux-gnu, and i386-pc-solaris2.11 (both multilibs in each case). Rainer 2014-10-23 Rainer Orth r...@cebitec.uni-bielefeld.de * gcc.dg/ipa/ipa-icf-21.c: Add -msse2 to dg-options. diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c --- a/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c +++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c @@ -1,5 +1,5 @@ /* { dg-do compile { target i?86-*-* x86_64-*-* } } */ -/* { dg-options -O2 -fdump-ipa-icf } */ +/* { dg-options -O2 -msse2 -fdump-ipa-icf } */ #include xmmintrin.h -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
Jakub Jelinek ja...@redhat.com writes: On Thu, Oct 23, 2014 at 11:35:44AM +0200, Rainer Orth wrote: The testcase is clearly i?86/x86_64 specific, so it must be guarded with { target i?86-*-* x86_64-*-* }, and as it is a compile time test, no need Right, that's what my patch already does. for assembler support, so just adding -msse2 option to dg-option is the right thing to do. Ok, will try that. Still no idea why Linux/x86 and Solaris/x86 are different here. Why do you think so? I certainly have: FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors) UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Equal symbols: 1 UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Semantic equality hit:bar-foo in my i686-linux test_summary log. It really depends on how the compiler Right, but on i386-pc-solaris2.11 it passes. is configured, if it defaults to march that has sse/sse2 by default, it will succeed, otherwise it will not. I had a terrible time figuring out what march value i686-unknown-linux-gnu uses; only in gcc -v it's possible to see that it uses pentiumpro, unlike pentium4 on Solaris/x86, which explains the difference. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
On Thu, Oct 23, 2014 at 01:33:18PM +0200, Jakub Jelinek wrote:
> > I combined both patches and run regression tests on x86_64-linux-pc.
> > Ready for trunk?
>
> With proper ChangeLog entry yes.

Actually, Rainer seems to have committed everything already.

        Jakub
Re: [PATCH 5/5] New tests introduction
On 10/19/2014 09:50 AM, Andreas Schwab wrote: Martin Liška mli...@suse.cz writes: diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c new file mode 100644 index 000..7358e43 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-ipa-icf } */ + +#include xmmintrin.h + +__attribute__ ((noinline)) +void foo() +{ + float x = 1.2345f; + __m128 v =_mm_load1_ps(x); +} + +__attribute__ ((noinline)) +void bar() +{ + float x = 1.2345f; + __m128 v =_mm_load1_ps(x); +} + +int main() +{ + return 2; +} + +/* { dg-final { scan-ipa-dump Semantic equality hit:bar-foo icf } } */ +/* { dg-final { scan-ipa-dump Equal symbols: 1 icf } } */ +/* { dg-final { cleanup-ipa-dump icf } } */ FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors) Excess errors: /usr/local/gcc/gcc-20141019/gcc/testsuite/gcc.dg/ipa/ipa-icf-21.c:4:23: fatal e\ rror: xmmintrin.h: No such file or directory compilation terminated. Andreas. Hello Andreas. Starting from r216589 this problem should be fixed. Thanks, Martin
[PATCH] Fix ubsan shift instrumentation
The issue here was that we were diagnosing an artificial check that we
created within the scope of shift instrumentation.  In other words, for
shifts we create something like (unsigned) A >> (B - C), and
signed-integer-overflow triggered on that subtraction.  Fixed by making
the subtraction work on unsigned types.  This only happened in C99/C++11
mode.  Middle end seems to cope well with RSHIFT_EXPR whose second op
has an unsigned type.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-10-23  Marek Polacek  pola...@redhat.com

	* c-ubsan.c (ubsan_instrument_shift): Perform the MINUS_EXPR
	in unsigned type.

	* c-c++-common/ubsan/undefined-2.c: New test.

diff --git gcc/c-family/c-ubsan.c gcc/c-family/c-ubsan.c
index 5a42303..7f4dc25 100644
--- gcc/c-family/c-ubsan.c
+++ gcc/c-family/c-ubsan.c
@@ -128,19 +128,19 @@ ubsan_instrument_shift (location_t loc, enum tree_code code,
   tree op1_utype = unsigned_type_for (type1);
   HOST_WIDE_INT op0_prec = TYPE_PRECISION (type0);
   tree uprecm1 = build_int_cst (op1_utype, op0_prec - 1);
-  tree precm1 = build_int_cst (type1, op0_prec - 1);
 
   t = fold_convert_loc (loc, op1_utype, op1);
   t = fold_build2 (GT_EXPR, boolean_type_node, t, uprecm1);
 
   /* For signed x << y, in C99/C11, the following:
-     (unsigned) x >> (precm1 - y)
+     (unsigned) x >> (uprecm1 - y)
      if non-zero, is undefined.  */
   if (code == LSHIFT_EXPR
       && !TYPE_UNSIGNED (type0)
       && flag_isoc99)
     {
-      tree x = fold_build2 (MINUS_EXPR, integer_type_node, precm1, op1);
+      tree x = fold_build2 (MINUS_EXPR, unsigned_type_node, uprecm1,
+                            fold_convert (op1_utype, op1));
       tt = fold_convert_loc (loc, unsigned_type_for (type0), op0);
       tt = fold_build2 (RSHIFT_EXPR, TREE_TYPE (tt), tt, x);
       tt = fold_build2 (NE_EXPR, boolean_type_node, tt,
@@ -148,13 +148,14 @@ ubsan_instrument_shift (location_t loc, enum tree_code code,
     }
 
   /* For signed x << y, in C++11 and later, the following:
-     x < 0 || ((unsigned) x >> (precm1 - y))
+     x < 0 || ((unsigned) x >> (uprecm1 - y))
      if > 1, is undefined.  */
   if (code == LSHIFT_EXPR
       && !TYPE_UNSIGNED (TREE_TYPE (op0))
       && (cxx_dialect >= cxx11))
     {
-      tree x = fold_build2 (MINUS_EXPR, integer_type_node, precm1, op1);
+      tree x = fold_build2 (MINUS_EXPR, unsigned_type_node, uprecm1,
+                            fold_convert (op1_utype, op1));
       tt = fold_convert_loc (loc, unsigned_type_for (type0), op0);
       tt = fold_build2 (RSHIFT_EXPR, TREE_TYPE (tt), tt, x);
       tt = fold_build2 (GT_EXPR, boolean_type_node, tt,
diff --git gcc/testsuite/c-c++-common/ubsan/undefined-2.c gcc/testsuite/c-c++-common/ubsan/undefined-2.c
index e69de29..7b06709 100644
--- gcc/testsuite/c-c++-common/ubsan/undefined-2.c
+++ gcc/testsuite/c-c++-common/ubsan/undefined-2.c
@@ -0,0 +1,22 @@
+/* { dg-do run } */
+/* { dg-options "-fsanitize=undefined -fno-sanitize-recover=signed-integer-overflow" } */
+/* { dg-additional-options "-std=gnu11" { target c } } */
+/* { dg-additional-options "-std=c++11" { target c++ } } */
+
+volatile int w, z;
+
+__attribute__ ((noinline, noclone)) int
+foo (int x, int y)
+{
+  z++;
+  return x << y;
+}
+
+int
+main ()
+{
+  w = foo (0, -__INT_MAX__);
+  return 0;
+}
+
+/* { dg-output "shift exponent -\[^\n\r]* is negative\[^\n\r]*(\n|\r\n|\r)" } */

	Marek
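For readers who want to see the effect of the change outside the compiler, here is a minimal stand-alone model of the C99 check. It is only a sketch, not GCC code: the names shift_is_flagged and demo_shift_check are made up for illustration, and the model skips the value check when the exponent is already out of range, because plain C cannot shift by the wrapped amount. The point it shows is that the subtraction uprecm1 - y is harmless in an unsigned type, while precm1 - y in int would overflow for y == -INT_MAX.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model of the C99 left-shift check; not GCC code.
   Returns true if x << y would be reported.  */
bool
shift_is_flagged (int x, int y)
{
  /* Counterpart of uprecm1: precision of int minus one, unsigned.  */
  const unsigned uprecm1 = sizeof (int) * CHAR_BIT - 1;
  const unsigned uy = (unsigned) y;

  /* Exponent check: a negative y converts to a huge unsigned value,
     so one comparison catches negative and too-large exponents.  */
  bool bad_exponent = uy > uprecm1;

  /* Evaluated unconditionally, like the instrumentation does.  In int,
     precm1 - (-INT_MAX) would overflow; in unsigned it merely wraps,
     which is well defined.  */
  unsigned amount = uprecm1 - uy;

  /* Value check: a set bit at or above position AMOUNT would be shifted
     into or past the sign bit.  Skipped in this model when the exponent
     is already bad, since shifting by >= the width is undefined in C.  */
  bool bad_value = !bad_exponent && ((unsigned) x >> amount) != 0;

  return bad_exponent || bad_value;
}

/* A few spot checks, including the call from the new test.  */
void
demo_shift_check (void)
{
  const int top = (int) (sizeof (int) * CHAR_BIT) - 1;

  assert (shift_is_flagged (0, -INT_MAX));  /* foo (0, -__INT_MAX__) */
  assert (!shift_is_flagged (1, top - 1));  /* still representable */
  assert (shift_is_flagged (1, top));       /* would reach the sign bit */
  assert (shift_is_flagged (-1, 1));        /* negative left operand */
  assert (!shift_is_flagged (0, 0));
}
```

Calling demo_shift_check () from a test driver runs through the cases; the first assertion corresponds to the situation in undefined-2.c, where only the negative exponent should be diagnosed.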
[patch,avr,committed]: Fix coding rules.
This no-op change fixed typos and restored coding rules as obvious like: spaces around operators, space before opening parentheses, wrapping long lines, indentation rules, trailing blanks, etc... http://gcc.gnu.org/r216592 Johann gcc/ * config/avr/avr.c: Fix GNU coding rules and typos. * config/avr/avr.h: Dito. * config/avr/avr-c.c: Dito. * config/avr/avr.md: Dito. Index: config/avr/avr.md === --- config/avr/avr.md (revision 216591) +++ config/avr/avr.md (revision 216592) @@ -64,8 +64,8 @@ (define_constants ]) (define_constants - [ (TMP_REGNO_TINY 16) ; r16 is temp register for AVR_TINY -(ZERO_REGNO_TINY 17) ; r17 is zero register for AVR_TINY + [(TMP_REGNO_TINY 16) ; r16 is temp register for AVR_TINY + (ZERO_REGNO_TINY 17) ; r17 is zero register for AVR_TINY ]) (define_c_enum unspec @@ -640,32 +640,31 @@ (define_expand movmode DONE; } -/* AVRTC-579 -if the source operand expression is out of range for 'lds' instruction - copy source operand expression to register -For tiny core, LDS instruction's memory access range limited to 0x40..0xbf -*/ -if (!tiny_valid_direct_memory_access_range(src,MODEmode)) +// AVRTC-579 +// If the source operand expression is out of range for LDS instruction +// copy source operand expression to register. +// For tiny core, LDS instruction's memory access range limited to 0x40..0xbf. 
+ +if (!tiny_valid_direct_memory_access_range (src, MODEmode)) { -rtx srcx = XEXP(src,0); -operands[1] = src = replace_equiv_address (src,copy_to_mode_reg (GET_MODE(srcx),srcx)); -emit_move_insn(dest,src); +rtx srcx = XEXP (src, 0); +operands[1] = src = replace_equiv_address (src, copy_to_mode_reg (GET_MODE (srcx), srcx)); +emit_move_insn (dest, src); DONE; } -/* AVRTC-579 -if the destination operand expression is out of range for 'sts' instruction - copy destination operand expression to register -For tiny core, STS instruction's memory access range limited to 0x40..0xbf -*/ -if (!tiny_valid_direct_memory_access_range(dest,MODEmode)) -{ - rtx destx = XEXP(dest,0); - operands[0] = dest = replace_equiv_address (dest,copy_to_mode_reg (GET_MODE(destx),destx)); - emit_move_insn(dest,src); - DONE; -} +// AVRTC-579 +// If the destination operand expression is out of range for STS instruction +// copy destination operand expression to register. +// For tiny core, STS instruction's memory access range limited to 0x40..0xbf. 
+if (!tiny_valid_direct_memory_access_range (dest, MODEmode)) + { +rtx destx = XEXP (dest, 0); +operands[0] = dest = replace_equiv_address (dest, copy_to_mode_reg (GET_MODE (destx), destx)); +emit_move_insn (dest, src); +DONE; + } }) ;; @@ -683,12 +682,12 @@ (define_insn movmode_insn [(set (match_operand:ALL1 0 nonimmediate_operand =r,d,Qm ,r ,q,r,*r) (match_operand:ALL1 1 nox_general_operand r Y00,n Ynn,r Y00,Qm,r,q,i))] (register_operand (operands[0], MODEmode) - || reg_or_0_operand (operands[1], MODEmode)) - /* skip if operands are out of lds/sts memory access range(0x40..0xbf) - though access range is checked during define_expand, it is required - here to avoid merging rtls during combine pass */ - tiny_valid_direct_memory_access_range(operands[0],QImode) - tiny_valid_direct_memory_access_range(operands[1],QImode) +|| reg_or_0_operand (operands[1], MODEmode)) + /* Skip if operands are out of lds/sts memory access range(0x40..0xbf) + though access range is checked during define_expand, it is required + here to avoid merging RTXes during combine pass. 
*/ +tiny_valid_direct_memory_access_range (operands[0], QImode) +tiny_valid_direct_memory_access_range (operands[1], QImode) { return output_movqi (insn, operands, NULL); } @@ -782,12 +781,12 @@ (define_insn *movmode [(set (match_operand:ALL2 0 nonimmediate_operand =r,r ,r,m,d,*r,q,r) (match_operand:ALL2 1 nox_general_operand r,Y00,m,r Y00,i,i ,r,q))] (register_operand (operands[0], MODEmode) - || reg_or_0_operand (operands[1], MODEmode)) - /* skip if operands are out of lds/sts memory access range(0x40..0xbf) - though access range is checked during define_expand, it is required - here to avoid merging rtls during combine pass */ - tiny_valid_direct_memory_access_range(operands[0],HImode) - tiny_valid_direct_memory_access_range(operands[1],HImode) +|| reg_or_0_operand (operands[1], MODEmode)) + /* Skip if operands are out of lds/sts memory access range(0x40..0xbf) + though access range is checked during define_expand, it is required + here to avoid merging RTXes
Re: [PATCH] Fix genmatch linking
Richard Biener rguent...@suse.de writes: This adds a libcpp host module without NLS and ICONV support and properly links genmatch against the build libcpp instead of the host one. Bootstrap running on x86_64-unknown-linux-gnu (stage1 all-gcc finished fine). Unfortunately, this doesn't work on i386-pc-solaris2.10: In stage1, build/genmatch is linked correctly g++ -g -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -static-libstdc++ -static-libgcc -o build/genmatch \ build/genmatch.o ../build-i386-pc-solaris2.10/libcpp/libcpp.a build/errors.o build/vec.o build/hash-table.o ../build-i386-pc-solaris2.10/libiberty/libiberty.a OTOH, in stage2, I get /var/gcc/regression/trunk/10-gcc/build/./prev-gcc/xg++ -B/var/gcc/regression/trunk/10-gcc/build/./prev-gcc/ -B/vol/gcc/i386-pc-solaris2.10/bin/ -nostdinc++ -B/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/src/.libs -B/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/libsupc++/.libs -I/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/include/i386-pc-solaris2.10 -I/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/include -I/vol/gcc/src/hg/trunk/local/libstdc++-v3/libsupc++ -L/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/src/.libs -L/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/libsupc++/.libs -g -O2 -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -static-libstdc++ -static-libgcc -o 
build/genmatch \ build/genmatch.o .././libcpp/libcpp.a build/errors.o build/vec.o build/hash-table.o .././libiberty/libiberty.a Undefined first referenced symbol in file libintl_bindtextdomain .././libcpp/libcpp.a(init.o) libintl_dgettext.././libcpp/libcpp.a(errors.o) ld: fatal: symbol referencing errors. No output written to build/genmatch collect2: error: ld returned 1 exit status make[3]: *** [build/genmatch] Error 1 Diffing the stage1 and 2 gcc/Makefiles, I see --- prev-gcc/Makefile 2014-10-23 13:45:20.720460523 +0200 +++ gcc/Makefile2014-10-23 14:32:06.111476537 +0200 @@ -86 +86 @@ -build_libsubdir=build-i386-pc-solaris2.10 +build_libsubdir=. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Fix ubsan shift instrumentation
On Thu, Oct 23, 2014 at 02:46:52PM +0200, Jakub Jelinek wrote:
> Ok.  Can you please queue it for 4.9 branch too, after 4.9.2 is released?
> There is no -f*sanitize-recover* support, but it can be supposedly left out
> from dg-options for the branch.

Sure.

	Marek
Re: [PATCH] Fix and improve avx2 broadcasts (PR target/63594)
Hi Jakub, This patch fixes a bunch of recent regressions: FAIL: gcc.target/i386/avx-1.c (internal compiler error) FAIL: gcc.target/i386/avx-1.c (test for excess errors) FAIL: gcc.target/i386/avx-2.c (internal compiler error) FAIL: gcc.target/i386/avx-2.c (test for excess errors) FAIL: gcc.target/i386/avx512f-vec-init.c (internal compiler error) FAIL: gcc.target/i386/avx512f-vec-init.c (test for excess errors) UNRESOLVED: gcc.target/i386/avx512f-vec-init.c scan-assembler-times vbroadcastsd 1 UNRESOLVED: gcc.target/i386/avx512f-vec-init.c scan-assembler-times vbroadcastss 1 UNRESOLVED: gcc.target/i386/avx512f-vec-init.c scan-assembler-times vmovdqa64[ t]+%zmm 2 UNRESOLVED: gcc.target/i386/avx512f-vec-init.c scan-assembler-times vpbroadcastb 2 UNRESOLVED: gcc.target/i386/avx512f-vec-init.c scan-assembler-times vpbroadcastd 1 UNRESOLVED: gcc.target/i386/avx512f-vec-init.c scan-assembler-times vpbroadcastq 1 UNRESOLVED: gcc.target/i386/avx512f-vec-init.c scan-assembler-times vpbroadcastw 2 FAIL: gcc.target/i386/sse-14.c (internal compiler error) FAIL: gcc.target/i386/sse-14.c (test for excess errors) FAIL: gcc.target/i386/sse-22.c (internal compiler error) FAIL: gcc.target/i386/sse-22.c (test for excess errors) FAIL: gcc.target/i386/sse-22a.c (internal compiler error) FAIL: gcc.target/i386/sse-22a.c (test for excess errors) FAIL: gcc.target/i386/sse-23.c (internal compiler error) FAIL: gcc.target/i386/sse-23.c (test for excess errors) FAIL: gcc.target/i386/sse-24.c (internal compiler error) FAIL: gcc.target/i386/sse-24.c (test for excess errors) and improves quality of code generated for AVX2 and AVX512F broadcasts; as AVX2 broadcast instructions can have source in memory or vector register (but only AVX512F can have it in GPRs), the patch adds splitter for the GPR case and adds ! for that, so that RA can choose what is best and if broadcast from GPR is desirable, it first performs vmovd from GPR into the dest register and then vpbroadcast{b,w,d} it. 
The AVX512* patterns should be IMHO merged, so that whether GPR or MEM is used are just alternatives of the same define_insn rather than different define_insns, but am not changing that right now, will leave that to Kirill as a follow-up. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? [...] * gcc.dg/pr63594-1.c: New test. * gcc.dg/pr63594-2.c: New test. Unfortunately, I see some problems with those tests on Solaris: * On Solaris/x86, I get FAIL: gcc.dg/pr63594-2.c execution test for 32-bit. Any particular reason to restrict -mno-mmx to Linux/x86? Manually building the testcase with -mno-mmx on Solaris/x86 seems to cure the failure. * On 64-bit Solaris/SPARC, I get FAIL: gcc.dg/pr63594-1.c (internal compiler error) FAIL: gcc.dg/pr63594-1.c (test for excess errors) /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/pr63594-1.c: In function 'test1float1': /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/pr63594-1.c:19:1: internal compiler error: Bus Error /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/pr63594-1.c:57:1: note: in expansion of macro 'T' 0x751c03 crash_signal /vol/gcc/src/hg/trunk/local/gcc/toplev.c:349 0x44ffb4 gen_group_rtx(rtx_def*) /vol/gcc/src/hg/trunk/local/gcc/expr.c:1624 0x4f8167 expand_function_start(tree_node*) /vol/gcc/src/hg/trunk/local/gcc/function.c:4803 0x36278f execute /vol/gcc/src/hg/trunk/local/gcc/cfgexpand.c:5709 In gdb, I see a SEGV instead: Program received signal SIGSEGV, Segmentation fault. 
[Switching to Thread 1 (LWP 1)] gen_group_rtx (orig=0xfb5a3690) at /vol/gcc/src/hg/trunk/local/gcc/expr.c:1624 1624 if (i) (gdb) where #0 gen_group_rtx (orig=0xfb5a3690) at /vol/gcc/src/hg/trunk/local/gcc/expr.c:1624 #1 0x004f8168 in expand_function_start (subr=0xfb497680) at /vol/gcc/src/hg/trunk/local/gcc/function.c:4803 #2 0x00362790 in (anonymous namespace)::pass_expand::execute ( this=optimized out, fun=0xfb4a11e0) at /vol/gcc/src/hg/trunk/local/gcc/cfgexpand.c:5709 #3 0x006819b8 in execute_one_pass (pass=pass@entry=0x112aab0) at /vol/gcc/src/hg/trunk/local/gcc/passes.c:2156 #4 0x00682020 in execute_pass_list_1 (pass=0x112aab0, pass@entry=0x1128610) at /vol/gcc/src/hg/trunk/local/gcc/passes.c:2208 #5 0x00682088 in execute_pass_list (fn=0xfb4a11e0, pass=0x1128610) at /vol/gcc/src/hg/trunk/local/gcc/passes.c:2219 #6 0x0038fda4 in cgraph_node::expand (this=this@entry=0xfb4b2700) at /vol/gcc/src/hg/trunk/local/gcc/cgraphunit.c:1742 #7 0x003918c4 in expand_all_functions () at /vol/gcc/src/hg/trunk/local/gcc/cgraphunit.c:1878 #8 symbol_table::compile (this=0xfb41) at /vol/gcc/src/hg/trunk/local/gcc/cgraphunit.c:2213 #9 0x003935f0 in symbol_table::finalize_compilation_unit (this=0xfb41) at /vol/gcc/src/hg/trunk/local/gcc/cgraphunit.c:2290 #10 0x002205dc in c_write_global_declarations () at
Re: [PATCH] Fix ubsan shift instrumentation
On Thu, Oct 23, 2014 at 02:34:04PM +0200, Marek Polacek wrote:
> The issue here was that we were diagnosing an artificial check that we
> created within the scope of shift instrumentation.  In other words, for
> shifts we create something like (unsigned) A >> (B - C), and
> signed-integer-overflow triggered on that subtraction.  Fixed by making
> the subtraction work on unsigned types.  This only happened in C99/C++11
> mode.  Middle end seems to cope well with RSHIFT_EXPR whose second op
> has an unsigned type.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2014-10-23  Marek Polacek  pola...@redhat.com
>
> 	* c-ubsan.c (ubsan_instrument_shift): Perform the MINUS_EXPR
> 	in unsigned type.
>
> 	* c-c++-common/ubsan/undefined-2.c: New test.

Ok.  Can you please queue it for 4.9 branch too, after 4.9.2 is released?
There is no -f*sanitize-recover* support, but it can be supposedly left out
from dg-options for the branch.

	Jakub
Re: [PATCH] Fix genmatch linking
On Thu, 23 Oct 2014, Rainer Orth wrote: Richard Biener rguent...@suse.de writes: This adds a libcpp host module without NLS and ICONV support and properly links genmatch against the build libcpp instead of the host one. Bootstrap running on x86_64-unknown-linux-gnu (stage1 all-gcc finished fine). Unfortunately, this doesn't work on i386-pc-solaris2.10: In stage1, build/genmatch is linked correctly g++ -g -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -static-libstdc++ -static-libgcc -o build/genmatch \ build/genmatch.o ../build-i386-pc-solaris2.10/libcpp/libcpp.a build/errors.o build/vec.o build/hash-table.o ../build-i386-pc-solaris2.10/libiberty/libiberty.a OTOH, in stage2, I get /var/gcc/regression/trunk/10-gcc/build/./prev-gcc/xg++ -B/var/gcc/regression/trunk/10-gcc/build/./prev-gcc/ -B/vol/gcc/i386-pc-solaris2.10/bin/ -nostdinc++ -B/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/src/.libs -B/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/libsupc++/.libs -I/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/include/i386-pc-solaris2.10 -I/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/include -I/vol/gcc/src/hg/trunk/local/libstdc++-v3/libsupc++ -L/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/src/.libs -L/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/libsupc++/.libs -g -O2 -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE 
-static-libstdc++ -static-libgcc -o build/genmatch \ build/genmatch.o .././libcpp/libcpp.a build/errors.o build/vec.o build/hash-table.o .././libiberty/libiberty.a Undefined first referenced symbol in file libintl_bindtextdomain .././libcpp/libcpp.a(init.o) libintl_dgettext.././libcpp/libcpp.a(errors.o) ld: fatal: symbol referencing errors. No output written to build/genmatch collect2: error: ld returned 1 exit status make[3]: *** [build/genmatch] Error 1 Diffing the stage1 and 2 gcc/Makefiles, I see --- prev-gcc/Makefile 2014-10-23 13:45:20.720460523 +0200 +++ gcc/Makefile2014-10-23 14:32:06.111476537 +0200 @@ -86 +86 @@ -build_libsubdir=build-i386-pc-solaris2.10 +build_libsubdir=. Hmm, why do stage2+ use the host libraries? Ah, because they can assume that host == build. Ok, so when bootstrapping you then need to link libcpp dependencies as well, for stage2+. Not sure how to achieve that. I guess doing sth in gcc/configure - but how does that know what stage we are in? Richard.
Re: [PATCH] Relax check against commuting XOR and ASHIFTRT in combine.c
Andreas Schwab sch...@linux-m68k.org writes:

> Alan Lawrence alan.lawre...@arm.com writes:
>
>> diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c b/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
>> new file mode 100644
>> index ..90e64fd10dc358f10ad03a90041605bc3ccb7011
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile {target sparc64*-*-* aarch64*-*-* x86_64-*-* powerpc64*-*-*} } */
>
> You should check for lp64 instead of matching 64 in target names, to
> reject -m32.
>
>> diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c b/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
>> new file mode 100644
>> index ..fd6827caed230ea5dd2d6ec4431b11bf826531ea
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile {target arm*-*-* i?86-*-* powerpc-*-* sparc-*-*} } */
>
> Likewise, using ilp32 to reject -m64.

Right, the current target lists are simply bogus on biarch targets.

Alan, what's the reasoning behind your current target lists here?  Any
reason the test couldn't work elsewhere?  If not, it would be way better
to introduce a corresponding effective-target keyword than listing
particular targets without explanation.

This needs to be fixed: the issue has been known for three weeks now and
causes testsuite noise on many platforms.

	Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Fix and improve avx2 broadcasts (PR target/63594)
On Thu, Oct 23, 2014 at 02:58:06PM +0200, Rainer Orth wrote:
> Unfortunately, I see some problems with those tests on Solaris:
>
> * On Solaris/x86, I get
>
> FAIL: gcc.dg/pr63594-2.c execution test
>
>   for 32-bit.  Any particular reason to restrict -mno-mmx to Linux/x86?
>   Manually building the testcase with -mno-mmx on Solaris/x86 seems to
>   cure the failure.

No reason, probably finger memory without lots of thinking.
The reason for -mno-mmx is that the functions use floating point
vectors and scalar floating point arithmetics in the same function.
Feel free to change both pr63594-{1,2}.c with s/linux//g .

> * On 64-bit Solaris/SPARC, I get
>
> FAIL: gcc.dg/pr63594-1.c (internal compiler error)
> FAIL: gcc.dg/pr63594-1.c (test for excess errors)
>
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/pr63594-1.c: In function 'test1float1':
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/pr63594-1.c:19:1: internal compiler error: Bus Error
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/pr63594-1.c:57:1: note: in expansion of macro 'T'
> 0x751c03 crash_signal
>         /vol/gcc/src/hg/trunk/local/gcc/toplev.c:349
> 0x44ffb4 gen_group_rtx(rtx_def*)
>         /vol/gcc/src/hg/trunk/local/gcc/expr.c:1624
> 0x4f8167 expand_function_start(tree_node*)
>         /vol/gcc/src/hg/trunk/local/gcc/function.c:4803
> 0x36278f execute
>         /vol/gcc/src/hg/trunk/local/gcc/cfgexpand.c:5709

Works fine on x86_64, and doesn't seem to be related to the fix in any way,
it seems the ICE is related to returning or passing the vectors, so
supposedly some latent Solaris/SPARC issue?

	Jakub
[PATCH] Avoid recursion in fold
On match-and-simplify branch I run into a recursion folding
(bool_var & 1) == 0 via !(bool_var & 1) to (bool_var & 1) == 0.

The apparent recursion prevention (well, not documented as such)
in fold_truth_not_expr doesn't work because we still fold its
result via the various fold_convert calls which on the branch
results in folding (T)((bool_var & 1) == 0) to (bool_var & 1) == 0
with type T (thus folding that comparison again).

The fix is simple as TRUTH_NOT_EXPR does not need to have the
same type as its operand (see fold_truth_not_expr for an existing
example).  Simply avoid all the excessive conversions and build
it with the desired type in the first place.

Btw, the simplification match-and-simplify has but fold-const.c
on trunk has not (yet) which triggers this is

(simplify
 (convert (eq @0 @1))
 (if (TREE_CODE (type) == BOOLEAN_TYPE)
  (cmp @0 @1))

which folds conversions of comparisons to bool into the comparison
itself (that's from forwprop).

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2014-10-23  Richard Biener  rguent...@suse.de

	* fold-const.c (fold_truth_not_expr): Take type as argument.
	(fold_unary_loc): Pass type to fold_truth_not_expr, remove
	redundant conversion.
	(fold_binary_loc): Build TRUTH_NOT_EXPR with the desired type
	directly.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c    (revision 216549)
+++ gcc/fold-const.c    (working copy)
@@ -3125,9 +3125,8 @@ omit_two_operands_loc (location_t loc, t
    problems with the dominator optimizer.  */
 
 static tree
-fold_truth_not_expr (location_t loc, tree arg)
+fold_truth_not_expr (location_t loc, tree type, tree arg)
 {
-  tree type = TREE_TYPE (arg);
   enum tree_code code = TREE_CODE (arg);
   location_t loc1, loc2;
 
@@ -8108,10 +8107,7 @@ fold_unary_loc (location_t loc, enum tre
          and its values must be 0 or 1.
          ("true" is a fixed value perhaps depending on the language,
          but we don't handle values other than 1 correctly yet.)  */
-      tem = fold_truth_not_expr (loc, arg0);
-      if (!tem)
-        return NULL_TREE;
-      return fold_convert_loc (loc, type, tem);
+      return fold_truth_not_expr (loc, type, arg0);
 
     case REALPART_EXPR:
       if (TREE_CODE (TREE_TYPE (arg0)) != COMPLEX_TYPE)
@@ -12628,16 +12624,12 @@ fold_binary_loc (location_t loc,
       /* bool_var != 1 becomes !bool_var. */
       if (TREE_CODE (TREE_TYPE (arg0)) == BOOLEAN_TYPE && integer_onep (arg1)
           && code == NE_EXPR)
-        return fold_convert_loc (loc, type,
-                                 fold_build1_loc (loc, TRUTH_NOT_EXPR,
-                                                  TREE_TYPE (arg0), arg0));
+        return fold_build1_loc (loc, TRUTH_NOT_EXPR, type, arg0);
 
       /* bool_var == 0 becomes !bool_var. */
       if (TREE_CODE (TREE_TYPE (arg0)) == BOOLEAN_TYPE && integer_zerop (arg1)
           && code == EQ_EXPR)
-        return fold_convert_loc (loc, type,
-                                 fold_build1_loc (loc, TRUTH_NOT_EXPR,
-                                                  TREE_TYPE (arg0), arg0));
+        return fold_build1_loc (loc, TRUTH_NOT_EXPR, type, arg0);
 
       /* !exp != 0 becomes !exp */
       if (TREE_CODE (arg0) == TRUTH_NOT_EXPR && integer_zerop (arg1)
[PATCH][match-and-simplify] 2nd try handling TREE_SIDE_EFFECTS
This is a second attempt - it fixes the bugs in the previous one and handles falling back or not separately for the incoming arguments. What we handle too conservatively right now is non-captured leafs and captured expressions that are substituted into the result. Both result in the incoming arguments that mention the respective leaf / non-capture to be forced side-effect free. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2014-10-23 Richard Biener rguent...@suse.de * genmatch.c (capture_info): New class. (capture_info::capture_info): New constructor. (capture_info::walk_match): New method. (capture_info::walk_result): New method. (capture_info::walk_c_expr): New method. (dt_simplify::gen): Handle preserving side-effects for GENERIC code generation. (decision_tree::gen_generic): Do not reject operands with TREE_SIDE_EFFECTS. Index: gcc/genmatch.c === --- gcc/genmatch.c (revision 216549) +++ gcc/genmatch.c (working copy) @@ -1861,6 +1861,166 @@ dt_operand::gen (FILE *f, bool gimple) fprintf (f, }\n); } + +/* For GENERIC we have to take care of wrapping multiple-used + expressions with side-effects in save_expr and preserve side-effects + of expressions with omit_one_operand. Analyze captures in + match, result and with expressions and perform early-outs + on the outermost match expression operands for cases we cannot + handle. */ + +struct capture_info +{ + capture_info (simplify *s); + void walk_match (operand *o, unsigned toplevel_arg, bool); + void walk_result (operand *o, bool); + void walk_c_expr (c_expr *); + + struct cinfo +{ + bool expr_p; + bool cse_p; + bool force_no_side_effects_p; + unsigned long toplevel_msk; + int result_use_count; +}; + + auto_veccinfo info; + unsigned long force_no_side_effects; +}; + +/* Analyze captures in S. 
*/ + +capture_info::capture_info (simplify *s) +{ + expr *e; + if (!s-result + || ((e = dyn_cast expr * (s-result)) + is_a predicate_id * (e-operation))) +{ + force_no_side_effects = -1; + return; +} + + info.safe_grow_cleared (s-capture_max + 1); + e = as_a expr * (s-match); + for (unsigned i = 0; i e-ops.length (); ++i) +walk_match (e-ops[i], i, false); + + walk_result (s-result, false); + + for (unsigned i = 0; i s-ifexpr_vec.length (); ++i) +if (s-ifexpr_vec[i].is_with) + walk_c_expr (as_a c_expr *(s-ifexpr_vec[i].cexpr)); +} + +/* Analyze captures in the match expression piece O. */ + +void +capture_info::walk_match (operand *o, unsigned toplevel_arg, bool conditional_p) +{ + if (capture *c = dyn_cast capture * (o)) +{ + info[c-where].toplevel_msk |= 1 toplevel_arg; + info[c-where].force_no_side_effects_p |= conditional_p; + /* Mark expr (non-leaf) captures and recurse. */ + if (c-what + is_a expr * (c-what)) + { + info[c-where].expr_p = true; + walk_match (c-what, toplevel_arg, conditional_p); + } +} + else if (expr *e = dyn_cast expr * (o)) +{ + for (unsigned i = 0; i e-ops.length (); ++i) + { + bool cond_p = conditional_p; + if (i == 0 + *e-operation == COND_EXPR) + cond_p = true; + else if (*e-operation == TRUTH_ANDIF_EXPR + || *e-operation == TRUTH_ORIF_EXPR) + cond_p = true; + walk_match (e-ops[i], toplevel_arg, cond_p); + } +} + else if (is_a predicate * (o)) +{ + /* Mark non-captured leafs toplevel arg for checking. */ + force_no_side_effects |= 1 toplevel_arg; +} + else +gcc_unreachable (); +} + +/* Analyze captures in the result expression piece O. */ + +void +capture_info::walk_result (operand *o, bool conditional_p) +{ + if (capture *c = dyn_cast capture * (o)) +{ + info[c-where].result_use_count++; + /* If we substitute an expression capture we don't know + which captures this will end up using (well, we don't +compute that). Force the uses to be side-effect free +which means forcing the toplevels that reach the +expression side-effect free. 
*/ + if (info[c-where].expr_p) + force_no_side_effects |= info[c-where].toplevel_msk; + /* Mark CSE capture capture uses as forced to have + no side-effects. */ + if (c-what + is_a expr * (c-what)) + { + info[c-where].cse_p = true; + walk_result (c-what, true); + } +} + else if (expr *e = dyn_cast expr * (o)) +{ + for (unsigned i = 0; i e-ops.length (); ++i) + { + bool cond_p = conditional_p; + if (i == 0 + *e-operation == COND_EXPR) + cond_p = true; + else if (*e-operation == TRUTH_ANDIF_EXPR + || *e-operation ==
Re: [PATCH 1/X, i386, PR54232] Enable EBX for x86 in 32bits PIC code
Evgeny Stupachenko evstu...@gmail.com writes:

> Reattached.
>
> On Mon, Oct 13, 2014 at 8:22 PM, Uros Bizjak ubiz...@gmail.com wrote:
>> On Mon, Oct 13, 2014 at 4:53 PM, Evgeny Stupachenko evstu...@gmail.com wrote:
>>> ChangeLog for testsuite:
>>>
>>> 2014-10-13  Evgeny Stupachenko  evstu...@gmail.com
>>>
>>>         PR target/8340
>>>         PR middle-end/47602
>>>         PR rtl-optimization/55458
>>>         * gcc.target/i386/pic-1.c: Remove dg-error as test should pass now.
>>>         * gcc.target/i386/pr55458.c: Likewise.
>>>         * gcc.target/i386/pr47602.c: New.
>>>         * gcc.target/i386/pr23098.c: Move to XFAIL.

The unconditional XFAIL is wrong: the test now XPASSes on
i386-pc-solaris2.11 and x86_64-unknown-linux-gnu, i686-unknown-linux-gnu
for 32-bit.

Please fix.

	Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Fix and improve avx2 broadcasts (PR target/63594)
Jakub Jelinek ja...@redhat.com writes:

> On Thu, Oct 23, 2014 at 02:58:06PM +0200, Rainer Orth wrote:
>> Unfortunately, I see some problems with those tests on Solaris:
>>
>> * On Solaris/x86, I get
>>
>> FAIL: gcc.dg/pr63594-2.c execution test
>>
>>   for 32-bit.  Any particular reason to restrict -mno-mmx to Linux/x86?
>>   Manually building the testcase with -mno-mmx on Solaris/x86 seems to
>>   cure the failure.
>
> No reason, probably finger memory without lots of thinking.
> The reason for -mno-mmx is that the functions use floating point
> vectors and scalar floating point arithmetics in the same function.
> Feel free to change both pr63594-{1,2}.c with s/linux//g .

Ok, will do and commit after Linux and Solaris testing.

>> * On 64-bit Solaris/SPARC, I get
>>
>> FAIL: gcc.dg/pr63594-1.c (internal compiler error)
>> FAIL: gcc.dg/pr63594-1.c (test for excess errors)
>>
>> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/pr63594-1.c: In function 'test1float1':
>> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/pr63594-1.c:19:1: internal compiler error: Bus Error
>> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/pr63594-1.c:57:1: note: in expansion of macro 'T'
>> 0x751c03 crash_signal
>>         /vol/gcc/src/hg/trunk/local/gcc/toplev.c:349
>> 0x44ffb4 gen_group_rtx(rtx_def*)
>>         /vol/gcc/src/hg/trunk/local/gcc/expr.c:1624
>> 0x4f8167 expand_function_start(tree_node*)
>>         /vol/gcc/src/hg/trunk/local/gcc/function.c:4803
>> 0x36278f execute
>>         /vol/gcc/src/hg/trunk/local/gcc/cfgexpand.c:5709
>
> Works fine on x86_64, and doesn't seem to be related to the fix in any way,
> it seems the ICE is related to returning or passing the vectors, so
> supposedly some latent Solaris/SPARC issue?

Ok, I'll file a PR and Cc Eric.

Thanks.
	Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH i386 AVX512] [81.2/n] Add new built-ins.
On Thu, Oct 23, 2014 at 2:50 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
> Hello Uroš,
> On 20 Oct 17:41, Kirill Yukhin wrote:
>> This patch adds (almost) all built-ins
> This is second, target dependent, part of the initial patch.
> ChangeLog is the same as in initial mail - tree* changes.
>
> Is it ok for trunk?

OK.

(I was not able to review this huge patch in detail, but the approach
is OK and these are fairly mechanical additions).

Thanks,
Uros.
[PATCH] Fix copypaste error in fold
The following makes sure to convert the correct argument, not the
one we already converted.  Currently we end up folding
ptr-type - integer-type.  Oops.  Fails spectacularly on
match-and-simplify where this is the first strip-nops
missed-optimization (on bogus input, that is).

I'm considering this is obvious but will include it in a
bootstrap/regtest anyway.

Thanks,
Richard.

2014-10-23  Richard Biener  rguent...@suse.de

	* fold-const.c (fold_binary_loc): Fix copy-and-pasto.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c    (revision 216546)
+++ gcc/fold-const.c    (working copy)
@@ -10596,8 +10596,9 @@ fold_binary_loc (location_t loc,
                                          TREE_OPERAND (arg1, 0));
          tree arg11 = fold_convert_loc (loc, type,
                                         TREE_OPERAND (arg1, 1));
-         tree tmp = fold_binary_loc (loc, MINUS_EXPR, type, arg0,
-                                     fold_convert_loc (loc, type, arg10));
+         tree tmp = fold_binary_loc (loc, MINUS_EXPR, type,
+                                     fold_convert_loc (loc, type, arg0),
+                                     arg10);
          if (tmp)
            return fold_build2_loc (loc, MINUS_EXPR, type, tmp, arg11);
        }
[patch,avr] tweak sign extensions
This optimization makes most sign-extensions one instruction shorter in the case when the source register may be clobbered and the register numbers are different. Source and destination may overlap.

Ok for trunk?

Johann

gcc/
	* config/avr/avr.md (extendqihi2, extendqipsi2, extendqisi2)
	(extendhipsi2, extendhisi2): Optimize if source reg is unused
	after the insns and has different REGNO than destination.

Index: config/avr/avr.md
===================================================================
--- config/avr/avr.md	(revision 216592)
+++ config/avr/avr.md	(working copy)
@@ -4174,9 +4174,14 @@ (define_insn "extendqihi2"
   [(set (match_operand:HI 0 "register_operand" "=r,r")
         (sign_extend:HI (match_operand:QI 1 "combine_pseudo_register_operand" "0,*r")))]
   ""
-  "@
-	clr %B0\;sbrc %0,7\;com %B0
-	mov %A0,%A1\;clr %B0\;sbrc %A0,7\;com %B0"
+  {
+    if (REGNO (operands[0]) == REGNO (operands[1]))
+      return "clr %B0\;sbrc %0,7\;com %B0";
+    else if (reg_unused_after (insn, operands[1]))
+      return "mov %A0,%1\;lsl %1\;sbc %B0,%B0";
+    else
+      return "mov %A0,%1\;clr %B0\;sbrc %A0,7\;com %B0";
+  }
   [(set_attr "length" "3,4")
    (set_attr "cc" "set_n,set_n")])
 
@@ -4184,9 +4189,14 @@ (define_insn "extendqipsi2"
   [(set (match_operand:PSI 0 "register_operand" "=r,r")
         (sign_extend:PSI (match_operand:QI 1 "combine_pseudo_register_operand" "0,*r")))]
   ""
-  "@
-	clr %B0\;sbrc %A0,7\;com %B0\;mov %C0,%B0
-	mov %A0,%A1\;clr %B0\;sbrc %A0,7\;com %B0\;mov %C0,%B0"
+  {
+    if (REGNO (operands[0]) == REGNO (operands[1]))
+      return "clr %B0\;sbrc %0,7\;com %B0\;mov %C0,%B0";
+    else if (reg_unused_after (insn, operands[1]))
+      return "mov %A0,%1\;lsl %1\;sbc %B0,%B0\;mov %C0,%B0";
+    else
+      return "mov %A0,%1\;clr %B0\;sbrc %A0,7\;com %B0\;mov %C0,%B0";
+  }
   [(set_attr "length" "4,5")
    (set_attr "cc" "set_n,set_n")])
 
@@ -4194,9 +4204,14 @@ (define_insn "extendqisi2"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
         (sign_extend:SI (match_operand:QI 1 "combine_pseudo_register_operand" "0,*r")))]
   ""
-  "@
-	clr %B0\;sbrc %A0,7\;com %B0\;mov %C0,%B0\;mov %D0,%B0
-	mov %A0,%A1\;clr %B0\;sbrc %A0,7\;com %B0\;mov %C0,%B0\;mov %D0,%B0"
+  {
+    if (REGNO (operands[0]) == REGNO (operands[1]))
+      return "clr %B0\;sbrc %0,7\;com %B0\;mov %C0,%B0\;mov %D0,%B0";
+    else if (reg_unused_after (insn, operands[1]))
+      return "mov %A0,%1\;lsl %1\;sbc %B0,%B0\;mov %C0,%B0\;mov %D0,%B0";
+    else
+      return "mov %A0,%1\;clr %B0\;sbrc %A0,7\;com %B0\;mov %C0,%B0\;mov %D0,%B0";
+  }
   [(set_attr "length" "5,6")
    (set_attr "cc" "set_n,set_n")])
 
@@ -4204,10 +4219,16 @@ (define_insn "extendhipsi2"
   [(set (match_operand:PSI 0 "register_operand" "=r,r ,r")
         (sign_extend:PSI (match_operand:HI 1 "combine_pseudo_register_operand" "0,*r,*r")))]
   ""
-  "@
-	clr %C0\;sbrc %B0,7\;com %C0
-	mov %A0,%A1\;mov %B0,%B1\;clr %C0\;sbrc %B0,7\;com %C0
-	movw %A0,%A1\;clr %C0\;sbrc %B0,7\;com %C0"
+  {
+    if (REGNO (operands[0]) == REGNO (operands[1]))
+      return "clr %C0\;sbrc %B0,7\;com %C0";
+    output_asm_insn (AVR_HAVE_MOVW
+                     ? "movw %A0,%A1"
+                     : "mov %A0,%A1\n\tmov %B0,%B1", operands);
+    return (reg_unused_after (insn, operands[1])
+            ? "lsl %B1\;sbc %C0,%C0"
+            : "clr %C0\;sbrc %B0,7\;com %C0");
+  }
   [(set_attr "length" "3,5,4")
    (set_attr "isa" "*,mov,movw")
    (set_attr "cc" "set_n")])
@@ -4216,10 +4237,16 @@ (define_insn "extendhisi2"
   [(set (match_operand:SI 0 "register_operand" "=r,r ,r")
         (sign_extend:SI (match_operand:HI 1 "combine_pseudo_register_operand" "0,*r,*r")))]
   ""
-  "@
-	clr %C0\;sbrc %B0,7\;com %C0\;mov %D0,%C0
-	mov %A0,%A1\;mov %B0,%B1\;clr %C0\;sbrc %B0,7\;com %C0\;mov %D0,%C0
-	movw %A0,%A1\;clr %C0\;sbrc %B0,7\;com %C0\;mov %D0,%C0"
+  {
+    if (REGNO (operands[0]) == REGNO (operands[1]))
+      return "clr %C0\;sbrc %B0,7\;com %C0\;mov %D0,%C0";
+    output_asm_insn (AVR_HAVE_MOVW
+                     ? "movw %A0,%A1"
+                     : "mov %A0,%A1\n\tmov %B0,%B1", operands);
+    return (reg_unused_after (insn, operands[1])
+            ? "lsl %B1\;sbc %C0,%C0\;mov %D0,%C0"
+            : "clr %C0\;sbrc %B0,7\;com %C0\;mov %D0,%C0");
+  }
   [(set_attr "length" "4,6,5")
    (set_attr "isa" "*,mov,movw")
    (set_attr "cc" "set_n")])
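To see why the shorter "lsl %1 ; sbc %B0,%B0" tail from the patch above yields the same high byte as the old "clr ; sbrc ; com" sequence, the two instructions can be modelled in plain C. This is only an illustrative sketch and not code from avr.md: the function name sign_extend_8_to_16 and the stale_hi parameter (standing in for whatever the destination's high byte held beforehand) are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "mov %A0,%1 ; lsl %1 ; sbc %B0,%B0" on 8-bit registers.
   stale_hi is the (arbitrary) previous content of the high byte.  */
static uint16_t sign_extend_8_to_16(uint8_t src, uint8_t stale_hi)
{
    uint8_t lo = src;                    /* mov %A0,%1 */

    unsigned carry = src >> 7;           /* lsl %1: bit 7 moves into carry */
    src = (uint8_t)(src << 1);           /* ... and the source is clobbered */
    (void)src;

    /* sbc %B0,%B0 computes Rd - Rd - C, i.e. 0x00 or 0xFF,
       regardless of what Rd held before.  */
    uint8_t hi = (uint8_t)(stale_hi - stale_hi - carry);
    assert(hi == 0x00 || hi == 0xFF);

    return (uint16_t)((hi << 8) | lo);
}
```

The shift destroys the source register, which is why the patch only selects this form when reg_unused_after reports that operand 1 is dead after the insn.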
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
Hello! Ok, will try that. Still no idea why Linux/x86 and Solaris/x86 are different here. Why do you think so? I certainly have: FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors) UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Equal symbols: 1 UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Semantic equality hit:bar-foo in my i686-linux test_summary log. It really depends on how the compiler Right, but on i386-pc-solaris2.11 it passes. is configured, if it defaults to march that has sse/sse2 by default, it will succeed, otherwise it will not. For some reason, I hit a division by zero on alphaev68-linux-gnu in the report generation code: Starting program: /space/uros/gcc-build/prev-gcc/cc1 -O2 -fdump-ipa-icf -quiet ipa-icf-14.i Program received signal SIGFPE, Arithmetic exception. ipa_icf::sem_item_optimizer::merge_classes (this=0x12187dd80, prev_class_count=3) at /space/homedirs/uros/gcc-svn/trunk/gcc/ipa-icf.c:2203 2203 non_singular_classes_count); (gdb) list 2198 fprintf (dump_file, Average class size before: %.2f, after: %.2f\n, 2199 1.0f * item_count / prev_class_count, 2200 1.0f * item_count / class_count); 2201 fprintf (dump_file, Average non-singular class size: %.2f, count: %u\n, 2202 1.0f * non_singular_classes_sum / non_singular_classes_count, 2203 non_singular_classes_count); 2204 fprintf (dump_file, Equal symbols: %u\n, equal_items); 2205 fprintf (dump_file, Fraction of visited symbols: %.2f%%\n\n, 2206 100.0f * equal_items / item_count); 2207} (gdb) p non_singular_classes_count $1 = 0 Uros.
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
On Thu, Oct 23, 2014 at 3:54 PM, Uros Bizjak ubiz...@gmail.com wrote: Ok, will try that. Still no idea why Linux/x86 and Solaris/x86 are different here. Why do you think so? I certainly have: FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors) UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Equal symbols: 1 UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Semantic equality hit:bar-foo in my i686-linux test_summary log. It really depends on how the compiler Right, but on i386-pc-solaris2.11 it passes. is configured, if it defaults to march that has sse/sse2 by default, it will succeed, otherwise it will not. For some reason, I hit a division by zero on alphaev68-linux-gnu in the report generation code: Starting program: /space/uros/gcc-build/prev-gcc/cc1 -O2 -fdump-ipa-icf -quiet ipa-icf-14.i Program received signal SIGFPE, Arithmetic exception. ipa_icf::sem_item_optimizer::merge_classes (this=0x12187dd80, prev_class_count=3) at /space/homedirs/uros/gcc-svn/trunk/gcc/ipa-icf.c:2203 2203 non_singular_classes_count); (gdb) list 2198 fprintf (dump_file, Average class size before: %.2f, after: %.2f\n, 2199 1.0f * item_count / prev_class_count, 2200 1.0f * item_count / class_count); 2201 fprintf (dump_file, Average non-singular class size: %.2f, count: %u\n, 2202 1.0f * non_singular_classes_sum / non_singular_classes_count, 2203 non_singular_classes_count); 2204 fprintf (dump_file, Equal symbols: %u\n, equal_items); 2205 fprintf (dump_file, Fraction of visited symbols: %.2f%%\n\n, 2206 100.0f * equal_items / item_count); 2207} (gdb) p non_singular_classes_count $1 = 0 Also: (gdb) p non_singular_classes_sum $1 = 0 This creates a nice NaN which can throw an exception. 
(gdb) bt
#0  ipa_icf::sem_item_optimizer::merge_classes (this=0x12187dd80, prev_class_count=3) at /space/homedirs/uros/gcc-svn/trunk/gcc/ipa-icf.c:2203
#1  0x000121255c70 in ipa_icf::sem_item_optimizer::execute (this=0x12187dd80) at /space/homedirs/uros/gcc-svn/trunk/gcc/ipa-icf.c:1602
#2  0x00012125999c in ipa_icf::ipa_icf_driver () at /space/homedirs/uros/gcc-svn/trunk/gcc/ipa-icf.c:2319
#3  0x00012125a46c in ipa_icf::pass_ipa_icf::execute (this=0x121850f20) at /space/homedirs/uros/gcc-svn/trunk/gcc/ipa-icf.c:2367
#4  0x0001209c10c4 in execute_one_pass (pass=0x121850f20) at /space/homedirs/uros/gcc-svn/trunk/gcc/passes.c:2156
#5  0x0001209c26f4 in execute_ipa_pass_list (pass=0x121850f20) at /space/homedirs/uros/gcc-svn/trunk/gcc/passes.c:2550
#6  0x00012048fcdc in ipa_passes () at /space/homedirs/uros/gcc-svn/trunk/gcc/cgraphunit.c:2057

Uros.
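The failing statement divides a zero sum by a zero count in floating point. One way such a dump routine could avoid the 0/0 is to guard the division; the sketch below shows the idea in standalone C. It is not the change that went into ipa-icf.c, and the helper names average_or_zero and dump_non_singular_stats are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Return sum / count as a float, or 0 when there is nothing to average,
   so that no 0/0 (NaN, possibly trapping) is ever evaluated.  */
static float average_or_zero(unsigned sum, unsigned count)
{
    return count ? 1.0f * sum / count : 0.0f;
}

/* Print the same statistics line as the dump code quoted above,
   but through the guarded helper.  */
static void dump_non_singular_stats(FILE *out, unsigned sum, unsigned count)
{
    assert(out != NULL);
    fprintf(out, "Average non-singular class size: %.2f, count: %u\n",
            average_or_zero(sum, count), count);
}
```

Reporting 0.00 for an empty set is a design choice made for this sketch; skipping the line entirely when the count is zero would work just as well.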
Re: [PATCH PR63173] [AARCH64, NEON] Improve vld[234](q?)_dup intrinsics
Index: gcc/config/aarch64/arm_neon.h === --- gcc/config/aarch64/arm_neon.h (revision 216558) +++ gcc/config/aarch64/arm_neon.h (working copy) @@ -11765,46 +11765,271 @@ __STRUCTN (poly, 8, 4) __STRUCTN (float, 64, 4) #undef __STRUCTN -#define __LD2R_FUNC(rettype, structtype, ptrtype, \ - regsuffix, funcsuffix, Q) \ - __extension__ static __inline rettype \ - __attribute__ ((__always_inline__)) \ - vld2 ## Q ## _dup_ ## funcsuffix (const ptrtype *ptr) \ - {\ -rettype result;\ -__asm__ (ld2r {v16. #regsuffix , v17. #regsuffix }, %1\n\t \ -st1 {v16. #regsuffix , v17. #regsuffix }, %0\n\t \ -: =Q(result) \ -: Q(*(const structtype *)ptr)\ -: memory, v16, v17); \ -return result; \ - } +__extension__ static __inline int8x8x2_t __attribute__ ((__always_inline__)) +vld2_dup_s8 (const int8_t * __a) +{ + int8x8x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld2rv8qi ((const __builtin_aarch64_simd_qi *) __a); + ret.val[0] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 0); + ret.val[1] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 1); + return ret; +} -__LD2R_FUNC (float32x2x2_t, float32x2_t, float32_t, 2s, f32,) -__LD2R_FUNC (float64x1x2_t, float64x2_t, float64_t, 1d, f64,) -__LD2R_FUNC (poly8x8x2_t, poly8x2_t, poly8_t, 8b, p8,) -__LD2R_FUNC (poly16x4x2_t, poly16x2_t, poly16_t, 4h, p16,) -__LD2R_FUNC (int8x8x2_t, int8x2_t, int8_t, 8b, s8,) -__LD2R_FUNC (int16x4x2_t, int16x2_t, int16_t, 4h, s16,) -__LD2R_FUNC (int32x2x2_t, int32x2_t, int32_t, 2s, s32,) -__LD2R_FUNC (int64x1x2_t, int64x2_t, int64_t, 1d, s64,) -__LD2R_FUNC (uint8x8x2_t, uint8x2_t, uint8_t, 8b, u8,) -__LD2R_FUNC (uint16x4x2_t, uint16x2_t, uint16_t, 4h, u16,) -__LD2R_FUNC (uint32x2x2_t, uint32x2_t, uint32_t, 2s, u32,) -__LD2R_FUNC (uint64x1x2_t, uint64x2_t, uint64_t, 1d, u64,) -__LD2R_FUNC (float32x4x2_t, float32x2_t, float32_t, 4s, f32, q) -__LD2R_FUNC (float64x2x2_t, float64x2_t, float64_t, 2d, f64, q) -__LD2R_FUNC (poly8x16x2_t, poly8x2_t, poly8_t, 16b, p8, q) -__LD2R_FUNC 
(poly16x8x2_t, poly16x2_t, poly16_t, 8h, p16, q) -__LD2R_FUNC (int8x16x2_t, int8x2_t, int8_t, 16b, s8, q) -__LD2R_FUNC (int16x8x2_t, int16x2_t, int16_t, 8h, s16, q) -__LD2R_FUNC (int32x4x2_t, int32x2_t, int32_t, 4s, s32, q) -__LD2R_FUNC (int64x2x2_t, int64x2_t, int64_t, 2d, s64, q) -__LD2R_FUNC (uint8x16x2_t, uint8x2_t, uint8_t, 16b, u8, q) -__LD2R_FUNC (uint16x8x2_t, uint16x2_t, uint16_t, 8h, u16, q) -__LD2R_FUNC (uint32x4x2_t, uint32x2_t, uint32_t, 4s, u32, q) -__LD2R_FUNC (uint64x2x2_t, uint64x2_t, uint64_t, 2d, u64, q) +__extension__ static __inline int16x4x2_t __attribute__ ((__always_inline__)) +vld2_dup_s16 (const int16_t * __a) +{ + int16x4x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld2rv4hi ((const __builtin_aarch64_simd_hi *) __a); + ret.val[0] = (int16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 0); + ret.val[1] = (int16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 1); + return ret; +} +__extension__ static __inline int32x2x2_t __attribute__ ((__always_inline__)) +vld2_dup_s32 (const int32_t * __a) +{ + int32x2x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld2rv2si ((const __builtin_aarch64_simd_si *) __a); + ret.val[0] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 0); + ret.val[1] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 1); + return ret; +} + +__extension__ static __inline float32x2x2_t __attribute__ ((__always_inline__)) +vld2_dup_f32 (const float32_t * __a) +{ + float32x2x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld2rv2sf ((const __builtin_aarch64_simd_sf *) __a); + ret.val[0] = (float32x2_t) __builtin_aarch64_get_dregoiv2sf (__o, 0); + ret.val[1] = (float32x2_t) __builtin_aarch64_get_dregoiv2sf (__o, 1); + return ret; +} + +__extension__ static __inline float64x1x2_t __attribute__ ((__always_inline__)) +vld2_dup_f64 (const float64_t * __a) +{ + float64x1x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld2rdf ((const 
__builtin_aarch64_simd_df *) __a); + ret.val[0] = (float64x1_t) {__builtin_aarch64_get_dregoidf (__o, 0)}; + ret.val[1] = (float64x1_t) {__builtin_aarch64_get_dregoidf (__o, 1)}; + return ret; +} + +__extension__ static __inline uint8x8x2_t __attribute__ ((__always_inline__)) +vld2_dup_u8 (const uint8_t * __a) +{ + uint8x8x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld2rv8qi
Re: [PATCH] Fix genmatch linking
On Thu, 23 Oct 2014, Richard Biener wrote: On Thu, 23 Oct 2014, Rainer Orth wrote: Richard Biener rguent...@suse.de writes: This adds a libcpp host module without NLS and ICONV support and properly links genmatch against the build libcpp instead of the host one. Bootstrap running on x86_64-unknown-linux-gnu (stage1 all-gcc finished fine). Unfortunately, this doesn't work on i386-pc-solaris2.10: In stage1, build/genmatch is linked correctly g++ -g -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -static-libstdc++ -static-libgcc -o build/genmatch \ build/genmatch.o ../build-i386-pc-solaris2.10/libcpp/libcpp.a build/errors.o build/vec.o build/hash-table.o ../build-i386-pc-solaris2.10/libiberty/libiberty.a OTOH, in stage2, I get /var/gcc/regression/trunk/10-gcc/build/./prev-gcc/xg++ -B/var/gcc/regression/trunk/10-gcc/build/./prev-gcc/ -B/vol/gcc/i386-pc-solaris2.10/bin/ -nostdinc++ -B/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/src/.libs -B/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/libsupc++/.libs -I/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/include/i386-pc-solaris2.10 -I/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/include -I/vol/gcc/src/hg/trunk/local/libstdc++-v3/libsupc++ -L/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/src/.libs -L/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/libsupc++/.libs -g -O2 -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macro s -Wno-overlength-strings -Werror 
-fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -static-libstdc++ -static-libgcc -o build/genmatch \ build/genmatch.o .././libcpp/libcpp.a build/errors.o build/vec.o build/hash-table.o .././libiberty/libiberty.a Undefined first referenced symbol in file libintl_bindtextdomain .././libcpp/libcpp.a(init.o) libintl_dgettext.././libcpp/libcpp.a(errors.o) ld: fatal: symbol referencing errors. No output written to build/genmatch collect2: error: ld returned 1 exit status make[3]: *** [build/genmatch] Error 1 Diffing the stage1 and 2 gcc/Makefiles, I see --- prev-gcc/Makefile 2014-10-23 13:45:20.720460523 +0200 +++ gcc/Makefile2014-10-23 14:32:06.111476537 +0200 @@ -86 +86 @@ -build_libsubdir=build-i386-pc-solaris2.10 +build_libsubdir=. Hmm, why do stage2+ use the host libraries? Ah, because they can assume that host == build. Ok, so when bootstrapping you then need to link libcpp dependencies as well, for stage2+. Not sure how to achieve that. I guess doing sth in gcc/configure - but how does that know what stage we are in? I wonder if the following works Richard. Index: gcc/Makefile.in === --- gcc/Makefile.in (revision 216590) +++ gcc/Makefile.in (working copy) @@ -981,7 +981,15 @@ else LIBIBERTY = ../libiberty/libiberty.a BUILD_LIBIBERTY = $(build_libobjdir)/libiberty/libiberty.a endif +ifeq ($(current_stage),stage1) BUILD_CPPLIB = $(build_libobjdir)/libcpp/libcpp.a +else +ifeq ($(current_stage),) +BUILD_CPPLIB = $(build_libobjdir)/libcpp/libcpp.a +else +BUILD_CPPLIB = $(CPPLIB) $(LIBIBERTY) $(LIBINTL_DEP) ($LIBICONV_DEP) +endif +endif # Dependencies on the intl and portability libraries. LIBDEPS= libcommon.a $(CPPLIB) $(LIBIBERTY) $(LIBINTL_DEP) $(LIBICONV_DEP) \
Re: [PATCH] Fix genmatch linking
On Thu, 23 Oct 2014, Richard Biener wrote: On Thu, 23 Oct 2014, Richard Biener wrote: On Thu, 23 Oct 2014, Rainer Orth wrote: Richard Biener rguent...@suse.de writes: This adds a libcpp host module without NLS and ICONV support and properly links genmatch against the build libcpp instead of the host one. Bootstrap running on x86_64-unknown-linux-gnu (stage1 all-gcc finished fine). Unfortunately, this doesn't work on i386-pc-solaris2.10: In stage1, build/genmatch is linked correctly g++ -g -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -static-libstdc++ -static-libgcc -o build/genmatch \ build/genmatch.o ../build-i386-pc-solaris2.10/libcpp/libcpp.a build/errors.o build/vec.o build/hash-table.o ../build-i386-pc-solaris2.10/libiberty/libiberty.a OTOH, in stage2, I get /var/gcc/regression/trunk/10-gcc/build/./prev-gcc/xg++ -B/var/gcc/regression/trunk/10-gcc/build/./prev-gcc/ -B/vol/gcc/i386-pc-solaris2.10/bin/ -nostdinc++ -B/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/src/.libs -B/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/libsupc++/.libs -I/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/include/i386-pc-solaris2.10 -I/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/include -I/vol/gcc/src/hg/trunk/local/libstdc++-v3/libsupc++ -L/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/src/.libs -L/var/gcc/regression/trunk/10-gcc/build/prev-i386-pc-solaris2.10/libstdc++-v3/libsupc++/.libs -g -O2 -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long 
-Wno-variadic-mac ros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -static-libstdc++ -static-libgcc -o build/genmatch \ build/genmatch.o .././libcpp/libcpp.a build/errors.o build/vec.o build/hash-table.o .././libiberty/libiberty.a Undefined first referenced symbol in file libintl_bindtextdomain .././libcpp/libcpp.a(init.o) libintl_dgettext.././libcpp/libcpp.a(errors.o) ld: fatal: symbol referencing errors. No output written to build/genmatch collect2: error: ld returned 1 exit status make[3]: *** [build/genmatch] Error 1 Diffing the stage1 and 2 gcc/Makefiles, I see --- prev-gcc/Makefile 2014-10-23 13:45:20.720460523 +0200 +++ gcc/Makefile2014-10-23 14:32:06.111476537 +0200 @@ -86 +86 @@ -build_libsubdir=build-i386-pc-solaris2.10 +build_libsubdir=. Hmm, why do stage2+ use the host libraries? Ah, because they can assume that host == build. Ok, so when bootstrapping you then need to link libcpp dependencies as well, for stage2+. Not sure how to achieve that. I guess doing sth in gcc/configure - but how does that know what stage we are in? I wonder if the following works Or this one. Index: gcc/Makefile.in === --- gcc/Makefile.in (revision 216590) +++ gcc/Makefile.in (working copy) @@ -981,7 +981,14 @@ else LIBIBERTY = ../libiberty/libiberty.a BUILD_LIBIBERTY = $(build_libobjdir)/libiberty/libiberty.a endif +# For stage1 and when cross-compiling use the build libcpp which is +# built with NLS disabled. For stage2+ use the host library and +# its dependencies. +ifeq ($(build_objdir),$(build_libobjdir)) BUILD_CPPLIB = $(build_libobjdir)/libcpp/libcpp.a +else +BUILD_CPPLIB = $(CPPLIB) $(LIBIBERTY) $(LIBINTL_DEP) $(LIBICONV_DEP) +endif # Dependencies on the intl and portability libraries. LIBDEPS= libcommon.a $(CPPLIB) $(LIBIBERTY) $(LIBINTL_DEP) $(LIBICONV_DEP) \ pre-approved if it works for you. Richard. Richard. 
Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in	(revision 216590)
+++ gcc/Makefile.in	(working copy)
@@ -981,7 +981,15 @@ else
 LIBIBERTY = ../libiberty/libiberty.a
 BUILD_LIBIBERTY = $(build_libobjdir)/libiberty/libiberty.a
 endif
+ifeq ($(current_stage),stage1)
 BUILD_CPPLIB = $(build_libobjdir)/libcpp/libcpp.a
+else
+ifeq ($(current_stage),)
+BUILD_CPPLIB = $(build_libobjdir)/libcpp/libcpp.a
+else
+BUILD_CPPLIB = $(CPPLIB) $(LIBIBERTY) $(LIBINTL_DEP) $(LIBICONV_DEP)
+endif
+endif
 
 # Dependencies on the intl and portability libraries.
 LIBDEPS= libcommon.a $(CPPLIB) $(LIBIBERTY)
Re: [PATCH] Add missing requirement to crossmodule-indircall-1a.c
Jeff Law l...@redhat.com: On 10/21/14 12:21, jb...@gmx.de wrote: Jeff Law l...@redhat.com: On 10/21/14 16:13, Haswell wrote: The additional source must have the same requirement crossmodule-indircall-1.c has. * crossmodule-indircall-1a.c: Add missing requirement. Why? When used by crossmodule-indircall-1.c we'll have already tested the marker and when used by itself, it does nothing. So I don't see why you think a marker is needed for this source file. When configuring --disable-lto it gets compiled twice: FAIL: gcc.dg/tree-prof/crossmodule-indircall-1a.c compilation, -fprofile-generate -D_PROFILE_GENERATE UNRESOLVED: gcc.dg/tree-prof/crossmodule-indircall-1a.c execution, -fprofile-generate -D_PROFILE_GENERATE UNRESOLVED: gcc.dg/tree-prof/crossmodule-indircall-1a.c compilation, -fprofile-use -D_PROFILE_USE UNRESOLVED: gcc.dg/tree-prof/crossmodule-indircall-1a.c execution, -fprofile-use -D_PROFILE_USE I'd recommend looking deeper. I believe that file should be collapsing down to main () { return 0; } when LTO is not enabled. I'm not a dejagnu expert, but this is what happens: /tmp/build/gcc/xgcc -B/tmp/build/gcc/ /tmp/gcc-4.9.1/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1a.c -fno-diagnostics-show-caret -fdiagnostics-color=never /tmp/gcc-4.9.1/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1a.c -fprofile-generate -D_PROFILE_GENERATE -lm -o /tmp/build/gcc/testsuite/gcc/crossmodule-indircall-1a.x01 /tmp/cc4rrWCn.o: In function `main': crossmodule-indircall-1a.c:(.text+0x0): multiple definition of `main' /tmp/ccgMlXGi.o:crossmodule-indircall-1a.c:(.text+0x0): first defined here collect2: error: ld returned 1 exit status compiler exited with status 1
Re: [PATCH] Fix genmatch linking
Final try for today.

Richard.

Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in	(revision 216590)
+++ gcc/Makefile.in	(working copy)
@@ -981,7 +981,15 @@ else
 LIBIBERTY = ../libiberty/libiberty.a
 BUILD_LIBIBERTY = $(build_libobjdir)/libiberty/libiberty.a
 endif
+# For stage1 and when cross-compiling use the build libcpp which is
+# built with NLS disabled.  For stage2+ use the host library and
+# its dependencies.
+ifeq ($(build_objdir),$(build_libobjdir))
 BUILD_CPPLIB = $(build_libobjdir)/libcpp/libcpp.a
+else
+BUILD_CPPLIB = $(CPPLIB) $(LIBIBERTY) $(LIBINTL) $(LIBICONV) $(LIBINTL_DEP) $(LIBICONV_DEP)
+build/genmatch$(build_exeext): BUILD_LIBDEPS += $(LIBINTL_DEP) $(LIBICONV_DEP)
+endif
 
 # Dependencies on the intl and portability libraries.
 LIBDEPS= libcommon.a $(CPPLIB) $(LIBIBERTY) $(LIBINTL_DEP) $(LIBICONV_DEP) \
Re: [PATCH] Fix genmatch linking
On Thu, 23 Oct 2014, Richard Biener wrote:
> Final try for today.

And this may work as well and is slightly simpler.

Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in	(revision 216590)
+++ gcc/Makefile.in	(working copy)
@@ -981,7 +981,15 @@ else
 LIBIBERTY = ../libiberty/libiberty.a
 BUILD_LIBIBERTY = $(build_libobjdir)/libiberty/libiberty.a
 endif
+# For stage1 and when cross-compiling use the build libcpp which is
+# built with NLS disabled.  For stage2+ use the host library and
+# its dependencies.
+ifeq ($(build_objdir),$(build_libobjdir))
 BUILD_CPPLIB = $(build_libobjdir)/libcpp/libcpp.a
+else
+BUILD_CPPLIB = $(CPPLIB) $(LIBIBERTY) $(LIBINTL) $(LIBICONV)
+build/genmatch$(build_exeext): BUILD_LIBDEPS += $(LIBINTL_DEP) $(LIBICONV_DEP)
+endif
 
 # Dependencies on the intl and portability libraries.
 LIBDEPS= libcommon.a $(CPPLIB) $(LIBIBERTY) $(LIBINTL_DEP) $(LIBICONV_DEP) \

Can you test it please?

Thanks,
Richard.
Re: [PATCH] Fix genmatch linking
On Thu, Oct 23, 2014 at 04:31:36PM +0200, Richard Biener wrote:
> Final try for today.
>
> Richard.
>
> Index: gcc/Makefile.in
> ===================================================================
> --- gcc/Makefile.in	(revision 216590)
> +++ gcc/Makefile.in	(working copy)
> @@ -981,7 +981,15 @@ else
>  LIBIBERTY = ../libiberty/libiberty.a
>  BUILD_LIBIBERTY = $(build_libobjdir)/libiberty/libiberty.a
>  endif
> +# For stage1 and when cross-compiling use the build libcpp which is
> +# built with NLS disabled.  For stage2+ use the host library and
> +# its dependencies.
> +ifeq ($(build_objdir),$(build_libobjdir))
>  BUILD_CPPLIB = $(build_libobjdir)/libcpp/libcpp.a
> +else
> +BUILD_CPPLIB = $(CPPLIB) $(LIBIBERTY) $(LIBINTL) $(LIBICONV) $(LIBINTL_DEP) $(LIBICONV_DEP)

The $(LIBINTL_DEP) $(LIBICONV_DEP) on the above line is unnecessary IMHO,

# Rule for the generator programs:
$(genprog:%=build/gen%$(build_exeext)): build/gen%$(build_exeext): build/gen%.o $(BUILD_LIBDEPS)
	+$(LINKER_FOR_BUILD) $(BUILD_LINKERFLAGS) $(BUILD_LDFLAGS) -o $@ \
	    $(filter-out $(BUILD_LIBDEPS), $^) $(BUILD_LIBS)

adds that automatically.

> +build/genmatch$(build_exeext): BUILD_LIBDEPS += $(LIBINTL_DEP) $(LIBICONV_DEP)

Here it is needed supposedly.

> +endif
>
>  # Dependencies on the intl and portability libraries.
>  LIBDEPS= libcommon.a $(CPPLIB) $(LIBIBERTY) $(LIBINTL_DEP) $(LIBICONV_DEP) \

	Jakub
Re: Patch RFA: Top-level configure patch: disable go on systems where it doesn't work
On 10/23/2014 04:36 AM, Ian Taylor wrote:
> This patch to the top level GCC configure script disables the Go language on some systems where it is known to not work. Bootstrapped on x86_64-unknown-linux-gnu. OK for mainline?
>
> Ian
>
> 2014-10-22  Ian Lance Taylor  i...@google.com
>
> 	* configure.ac: Disable the Go frontend on systems where it is
> 	known to not work.
> 	* configure: Regenerate.

I think it'd be better if knowledge specific to subdirs was pushed down to the subdirs, rather than being kept in the top level, in the direction of how we disable libatomic, libsanitizer, etc. That is, by sourcing something in the subdir to get back the info the top level needs. With that in place, changes to the set of supported hosts/targets/configurations no longer need to be synchronized between the projects that use the top-level scripts.

In the specific case of languages, it seems to me we already have such a script. E.g.:

    # First scan to see if an enabled language requires some other language.
    # We assume that a given config-lang.in will list all the language
    # front ends it requires, even if some are required indirectly.
    for lang_frag in ${srcdir}/gcc/*/config-lang.in .. ; do
      case ${lang_frag} in

Each config-lang.in sets some output variables. For the case of a language being unsupported for some reason, we'd declare that the lang fragments can specify one more output variable, simply called "unsupported", and then in go's gcc/go/config-lang.in, we'd add:

    # Disable the go frontend on systems where it is known to not work.
    case ${target} in
    *-*-darwin* | *-*-cygwin* | *-*-mingw* | *-*-aix*)
        unsupported=true
        ;;
    esac

Then back in the top level, near where we do:

    # Disable languages that need other directories if these aren't available.
    for i in $subdir_requires; do
      test -f $srcdir/gcc/$i/config-lang.in && continue
      ...

We'd handle the "unsupported" variable.

WDYT?

Thanks,
Pedro Alves
Re: Patch RFA: Top-level configure patch: disable go on systems where it doesn't work
On Thu, Oct 23, 2014 at 8:27 AM, Pedro Alves pal...@redhat.com wrote:
> I think it'd be better if knowledge specific to subdirs was pushed down to the subdirs, rather than being kept in the top level, in the direction of how we disable libatomic, libsanitizer, etc. That is, by sourcing something in the subdir to get back the info the top level needs. With that in place, changes to the set of supported hosts/targets/configurations no longer need to be synchronized between the projects that use the top-level scripts.
>
> In the specific case of languages, it seems to me we already have such a script. E.g.:
>
>     # First scan to see if an enabled language requires some other language.
>     # We assume that a given config-lang.in will list all the language
>     # front ends it requires, even if some are required indirectly.
>     for lang_frag in ${srcdir}/gcc/*/config-lang.in .. ; do
>       case ${lang_frag} in
>
> Each config-lang.in sets some output variables. For the case of a language being unsupported for some reason, we'd declare that the lang fragments can specify one more output variable, simply called "unsupported", and then in go's gcc/go/config-lang.in, we'd add:
>
>     # Disable the go frontend on systems where it is known to not work.
>     case ${target} in
>     *-*-darwin* | *-*-cygwin* | *-*-mingw* | *-*-aix*)
>         unsupported=true
>         ;;
>     esac
>
> Then back in the top level, near where we do:
>
>     # Disable languages that need other directories if these aren't available.
>     for i in $subdir_requires; do
>       test -f $srcdir/gcc/$i/config-lang.in && continue
>       ...
>
> We'd handle the "unsupported" variable.

My patch was, of course, just building on the existing unsupported_languages support. You are suggesting that we move that support from the top level configure script to the language-specific config-lang.in scripts.

That change sounds fine to me.

Ian
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
Jakub Jelinek ja...@redhat.com writes: On Thu, Oct 23, 2014 at 11:35:44AM +0200, Rainer Orth wrote: The testcase is clearly i?86/x86_64 specific, so it must be guarded with { target i?86-*-* x86_64-*-* }, and as it is a compile time test, no need Right, that's what my patch already does. for assembler support, so just adding -msse2 option to dg-option is the right thing to do. Ok, will try that. Still no idea why Linux/x86 and Solaris/x86 are different here. Why do you think so? I certainly have: FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors) UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Equal symbols: 1 UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf Semantic equality hit:bar-foo in my i686-linux test_summary log. It really depends on how the compiler Right, but on i386-pc-solaris2.11 it passes. is configured, if it defaults to march that has sse/sse2 by default, it will succeed, otherwise it will not. I had a terrible time figuring out what march value i686-unknown-linux-gnu uses; only in gcc -v it's possible to see that it uses pentiumpro, unlike pentium4 on Solaris/x86, which explains the difference. What is the reason to use pentium4 for Solaris/x86? Is it -march setting or -mtune? For -mtune it is a terrible choice Honza Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Patch RFA: Top-level configure patch: disable go on systems where it doesn't work
On 10/23/2014 04:31 PM, Ian Taylor wrote:
> On Thu, Oct 23, 2014 at 8:27 AM, Pedro Alves pal...@redhat.com wrote:
>> I think it'd be better if knowledge specific to subdirs was pushed down to the subdirs, rather than being kept in the top level, in the direction of how we disable libatomic, libsanitizer, etc. That is, by sourcing something in the subdir to get back the info the top level needs. With that in place, changes to the set of supported hosts/targets/configurations no longer need to be synchronized between the projects that use the top-level scripts.
>>
>> In the specific case of languages, it seems to me we already have such a script. E.g.:
>>
>>     # First scan to see if an enabled language requires some other language.
>>     # We assume that a given config-lang.in will list all the language
>>     # front ends it requires, even if some are required indirectly.
>>     for lang_frag in ${srcdir}/gcc/*/config-lang.in .. ; do
>>       case ${lang_frag} in
>>
>> Each config-lang.in sets some output variables. For the case of a language being unsupported for some reason, we'd declare that the lang fragments can specify one more output variable, simply called "unsupported", and then in go's gcc/go/config-lang.in, we'd add:
>>
>>     # Disable the go frontend on systems where it is known to not work.
>>     case ${target} in
>>     *-*-darwin* | *-*-cygwin* | *-*-mingw* | *-*-aix*)
>>         unsupported=true
>>         ;;
>>     esac
>>
>> Then back in the top level, near where we do:
>>
>>     # Disable languages that need other directories if these aren't available.
>>     for i in $subdir_requires; do
>>       test -f $srcdir/gcc/$i/config-lang.in && continue
>>       ...
>>
>> We'd handle the "unsupported" variable.
>
> My patch was, of course, just building on the existing unsupported_languages support. You are suggesting that we move that support from the top level configure script to the language-specific config-lang.in scripts.

AFAICS no target in the top level disables go yet, so we could IMO do the config-lang.in mechanism without moving any of the existing target checks for other languages. It'd be a small change that way. As a bonus, I think you wouldn't need approval for further changes to the set of go supported systems (though that may not change often).

My .02c.

> That change sounds fine to me.

Thanks,
Pedro Alves
Re: [PATCH 3/4] Add libgomp plugin for Intel MIC
On 22 Oct 11:22, Jakub Jelinek wrote:
> On Tue, Oct 21, 2014 at 09:24:13PM +0400, Ilya Verbin wrote:
> > +  mic_lib_path_new = (char *) malloc ((mic_lib_path ? strlen (mic_lib_path) : 0)
> > +                                      + strlen (ld_lib_path) + 2);
>
> malloc can fail, SIGSEGV in response to that is not desirable.
> Can't you fallback to alloca, or use just alloca, or use alloca
> with malloc fallback?

I replaced it with alloca.

> > +extern "C" void
> > +GOMP_OFFLOAD_register_image (void *host_table, void *target_image)
> > +{
> > +  TRACE ("(host_table = %p, target_image = %p)", host_table, target_image);
> > +
> > +  if (num_libraries >= 1000)
> > +    {
> > +      fprintf (stderr,
> > +               "%s: The number of loaded shared libraries is over 1000!\n",
> > +               __FILE__);
> > +      exit (1);
>
> Where does this artificial limit come from?  Using libNNN.so library names?
> Can't you use lib%d.so instead?

Yes, it comes from the Image structure (liboffloadmic/runtime/offload_host.h:52).
It must contain a null-terminated name, therefore I need to allocate some space
for the name in plugin's struct TargetImage.  But the structure can't contain
any bytes after the trailing zero and before the actual data.  So, now I
extended the name to 10 digits and removed the comparison with 1000.

> Also, seeing register_image, shouldn't there be
> GOMP_OFFLOAD_unregister_image which would be invoked when the library
> containing MIC offloading regions is dlclosed?
> One could use __cxa_atexit or similar for that, something that is given
> __dso_handle.  Or is no cleanup necessary?  At least unregistering it
> from translation tables, because the same addresses might be reused by a
> different shared library?
> With dlopen/dlclose in mind, 1000 might be easily reached, consider
> 1 times dlopening/dlclosing (perhaps over longer time, by long running
> daemon) a shared library containing #pragma omp target region.

Hmm, previously we've tested only cases when all libraries are loaded before
the first offload.  Offloading from a dlopened library after the call to
gomp_target_init isn't working.  So, this will require some changes in
libgomp/target.c.  Is it ok to fix this bug in a separate patch?

And yes, there should be GOMP_OFFLOAD_unregister_image for this case.

> > +static int first_init = 1;
> > +
> > +/* Load offload_target_main on target.  */
> > +extern "C" void
> > +GOMP_OFFLOAD_init_device (int device)
> > +{
> > +  TRACE ("");
> > +  pthread_mutex_lock (&mutex);
> > +  if (first_init)
> > +    {
> > +      __offload_register_image (&main_target_image);
> > +      first_init = 0;
> > +    }
> > +  pthread_mutex_unlock (&mutex);
>
> pthread_once instead?

Done.

Patch is updated.

Thanks,
  -- Ilya

---
diff --git a/liboffloadmic/configure.ac b/liboffloadmic/configure.ac
index fb575b3..81fae8f 100644
--- a/liboffloadmic/configure.ac
+++ b/liboffloadmic/configure.ac
@@ -42,6 +42,7 @@ AC_PROG_CC
 AC_PROG_CXX
 AC_CONFIG_FILES([Makefile liboffloadmic_host.spec liboffloadmic_target.spec])
 AM_ENABLE_MULTILIB(, ..)
+AC_CONFIG_SUBDIRS(plugin)
 AC_FUNC_ALLOCA
 AC_CHECK_HEADERS([mm_malloc.h], [], [AC_MSG_ERROR([Couldn't find mm_malloc.h])])
 AC_CHECK_FUNCS([__secure_getenv secure_getenv])
diff --git a/liboffloadmic/plugin/Makefile.am b/liboffloadmic/plugin/Makefile.am
new file mode 100644
index 000..0baf70d
--- /dev/null
+++ b/liboffloadmic/plugin/Makefile.am
@@ -0,0 +1,123 @@
+# Plugin for offload execution on Intel MIC devices.
+#
+# Copyright (C) 2014 Free Software Foundation, Inc.
+#
+# Contributed by Ilya Verbin ilya.ver...@intel.com and
+# Andrey Turetskiy andrey.turets...@intel.com.
+#
+# This file is part of the GNU OpenMP Library (libgomp).
+#
+# Libgomp is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+# FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+# more details.
+#
+# Under Section 7 of GPL version 3, you are granted additional
+# permissions described in the GCC Runtime Library Exception, version
+# 3.1, as published by the Free Software Foundation.
+#
+# You should have received a copy of the GNU General Public License and
+# a copy of the GCC Runtime Library Exception along with this program;
+# see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+# http://www.gnu.org/licenses/.
+
+
+AUTOMAKE_OPTIONS = foreign
+ACLOCAL_AMFLAGS = -I ../.. -I ../../config
+
+# Directories
+build_dir = $(top_builddir)
+source_dir = $(top_srcdir)
+coi_inc_dir = $(top_srcdir)/../include/coi
+myo_inc_dir = $(top_srcdir)/../include/myo
+libgomp_src_dir = $(top_srcdir)/../../libgomp
+libgomp_dir = $(build_dir)/../../libgomp
+liboffload_src_dir = $(top_srcdir)/../runtime
+liboffload_dir = $(top_builddir)/..
+
+# May be used
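
For readers following the review, the pthread_once suggestion discussed in this message might look roughly like the sketch below. It is not the code from the updated patch: `register_main_image`, `sketch_init_device` and `init_count` are hypothetical names, and the counter merely stands in for the one-time image registration.

```cpp
#include <pthread.h>

// Hypothetical counter standing in for the one-time side effect
// (the real plugin would register the main target image here).
static int init_count = 0;

static pthread_once_t main_image_once = PTHREAD_ONCE_INIT;

// Hypothetical one-time initializer; pthread_once runs it at most once.
static void
register_main_image (void)
{
  ++init_count;
}

// Sketch of a device-init entry point.  No explicit mutex or
// first_init flag is needed: pthread_once serializes concurrent
// first calls and makes later calls cheap no-ops.
extern "C" void
sketch_init_device (int device)
{
  (void) device;  // unused in this sketch
  pthread_once (&main_image_once, register_main_image);
}
```

Calling `sketch_init_device` for several devices, or from several threads, runs the initializer a single time, which is the property the mutex plus `first_init` pair was providing.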
[patch] Define std::promise::set_value_at_thread_exit() etc.
This adds:

  std::notify_all_at_thread_exit()
  std::promise::set_value_at_thread_exit()
  std::promise::set_exception_at_thread_exit()
  std::packaged_task::make_ready_at_thread_exit()

There's a linked list of callbacks that run after TLS destructors
(called by a pthread_key_create destructor) to make shared states
ready and notify condition variables.

The core of the change to futures is that the shared state is
considered ready when _M_ready == true, instead of when
_M_result != nullptr, so that we can store a result in _M_result
without making it ready. The callback that would make it ready at
thread exit stores a weak_ptr so it can safely check whether the
shared state has already been destroyed before thread exit (see thread
c++std-parallel-1162 on the SG1 list for related discussion).

Tested x86_64-linux, I'd like to commit this next week some time.

commit 9f0d052017db0e0484cbfaf79677c62e3532904f
Author: Jonathan Wakely jwak...@redhat.com
Date:   Mon Oct 20 12:23:24 2014 +0100

    Define *_at_thread_exit() functions.

    * config/abi/pre/gnu.ver: Add new exports.
    * include/std/condition_variable (notify_all_at_thread_exit): Declare.
    (__at_thread_exit_elt): New base class.
    * include/std/future (__future_base::_State_baseV2::_State_baseV2()):
    Use brace-or-equal initializers and define constructor as defaulted.
    (__future_base::_State_baseV2::_M_ready): Replace member function
    with member variable.
    (__future_base::_State_baseV2::_M_set_result): Set _M_ready.
    (__future_base::_State_baseV2::_M_set_result_aside): Define.
    (__future_base::_State_baseV2::_M_break_promise): Set _M_ready.
    (__future_base::_State_baseV2::_Make_ready): New helper class.
    (__future_base::_Task_state_base::__run_not_ready): Declare new pure
    virtual function.
    (__future_base::_Task_state::__run_not_ready): Define override.
    (promise::set_value_at_thread_exit): Define.
    (promise::set_exception_at_thread_exit): Define.
    (packaged_task::make_ready_at_thread_exit): Define.
    * src/c++11/condition_variable.cc (notify_all_at_thread_exit): Define.
    * src/c++11/future.cc
    (__future_base::_State_baseV2::_Make_ready::_M_set): Define.
    * testsuite/30_threads/condition_variable/members/3.cc: New.
    * testsuite/30_threads/packaged_task/members/at_thread_exit.cc: New.
    * testsuite/30_threads/promise/members/at_thread_exit.cc: New.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index 4c6d994..5404094 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -124,7 +124,8 @@ GLIBCXX_3.4 {
       std::messages*;
       std::money*;
 #     std::n[^u]*;
-      std::n[^aue]*;
+      std::n[^aueo]*;
+      std::nothrow;
       std::nu[^m]*;
       std::num[^e]*;
       std::ostrstream*;
@@ -1476,6 +1477,11 @@ GLIBCXX_3.4.21 {
     # std::ctype_base::blank
     _ZNSt10ctype_base5blankE;

+    # std::notify_all_at_thread_exit
+    _ZSt25notify_all_at_thread_exitRSt18condition_variableSt11unique_lockISt5mutexE;
+    # std::__future_base::_State_baseV2::_Make_ready::_M_set()
+    _ZNSt13__future_base13_State_baseV211_Make_ready6_M_setEv;
+
 } GLIBCXX_3.4.20;

diff --git a/libstdc++-v3/include/std/condition_variable b/libstdc++-v3/include/std/condition_variable
index 921cb83..a3682c0 100644
--- a/libstdc++-v3/include/std/condition_variable
+++ b/libstdc++-v3/include/std/condition_variable
@@ -170,6 +170,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     }
   };

+  void
+  notify_all_at_thread_exit(condition_variable&, unique_lock<mutex>);
+
+  struct __at_thread_exit_elt
+  {
+    __at_thread_exit_elt* _M_next;
+    void (*_M_cb)(void*);
+  };
+
   inline namespace _V2 {

   /// condition_variable_any
diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future
index 8989474..9008b78 100644
--- a/libstdc++-v3/include/std/future
+++ b/libstdc++-v3/include/std/future
@@ -294,12 +294,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       _Ptr_type                 _M_result;
       mutex                     _M_mutex;
       condition_variable        _M_cond;
-      atomic_flag               _M_retrieved;
+      atomic_flag               _M_retrieved = ATOMIC_FLAG_INIT;
+      bool                      _M_ready = false;
       once_flag                 _M_once;

     public:
-      _State_baseV2() noexcept : _M_result(), _M_retrieved(ATOMIC_FLAG_INIT)
-        { }
+      _State_baseV2() noexcept = default;
       _State_baseV2(const _State_baseV2&) = delete;
       _State_baseV2& operator=(const _State_baseV2&) = delete;
       virtual ~_State_baseV2() = default;
@@ -309,7 +309,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       {
         _M_complete_async();
         unique_lock<mutex> __lock(_M_mutex);
-        _M_cond.wait(__lock, [&] { return _M_ready(); });
+        _M_cond.wait(__lock, [&] { return _M_ready; });
         return *_M_result;
       }

@@ -318,11 +318,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
         wait_for(const chrono::duration<_Rep, _Period>& __rel)
         {
           unique_lock<mutex>
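
As a usage illustration for the functions this patch defines, here is a minimal sketch of std::promise::set_value_at_thread_exit() from the caller's side. It assumes a C++11 library that provides the function; `value_at_thread_exit` is a hypothetical helper name and the code is not taken from the patch or its testsuite.

```cpp
#include <future>
#include <thread>

// Hypothetical helper: returns a value delivered through a promise
// whose shared state is only made ready when the worker thread exits.
int
value_at_thread_exit ()
{
  std::promise<int> p;
  std::future<int> f = p.get_future ();

  std::thread t ([&p] {
    // The result is stored now, but the shared state is not made
    // ready until this thread exits, after its thread-local
    // destructors have run.
    p.set_value_at_thread_exit (42);
  });

  int v = f.get ();  // blocks until the shared state is made ready
  t.join ();
  return v;
}
```

The point of the deferred readiness is that a waiter released by `f.get ()` can rely on the worker's thread-local objects having been destroyed already, which a plain set_value() does not guarantee.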
Re: PR lto/63603: Fix -fno-lto handling in driver (gcc/gcc.c)
On 2014.10.22 at 13:57 +0200, Tobias Burnus wrote:
> I intend to commit the following patch this evening, which was
> pre-approved for 4.9/trunk by Richard in the PR. However, I still
> need to do the bootstrapping and regtesting.

This breaks bootstrap-lto:
...
configure: error: Link tests are not allowed after GCC_NO_EXECUTABLES.

-- 
Markus
Re: [C PATCH] Don't output warning twice (PR c/63626)
On Thu, 23 Oct 2014, Marek Polacek wrote:

> At present, we print the "inline function ... declared but never
> defined" warning twice.  The reason for that is that this warning
> is being printed in pop_scope, which is called when popping file
> scope (c_common_parse_file->pop_file_scope), and when popping
> external scope (c_write_global_declarations).  I think we should not
> print this warning when popping the external scope.  We don't have to
> worry about nested functions here.
>
> Writing a proper testcase is a little bit tricky, but I hope what
> I did would work fine.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2014-10-23  Marek Polacek  pola...@redhat.com
>
>         PR c/63626
>         * c-decl.c (pop_scope): Don't print warning in external_scope.
>
>         * gcc.dg/pr63626.c: New test.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com
RE: [PATCH] Relax check against commuting XOR and ASHIFTRT in combine.c
Mmmm, I've made a few attempts at filtering according to LP64 and ILP32, but
not managed to get anything working so far (that is, I've ended up with the
test not being executed on platforms where it should)... ah, I see now where
I've been going wrong, patch attached.

The original intent was pretty much to execute the test on everything with the
appropriate word size, i.e. where a 64-bit comparison would be done in 64 bits
rather than emulated in 2*32; and for 32-bit where that was not sign-extended
to 64 (or some other such problem). The architectures I wrote in the file were
those on which I tested the rtl dump, excluding some archs where you get
(neg (lt 0 x)) rather than (neg (ge x 0)); but the latter really shouldn't be
a problem, it should be possible to use a regex matching either form, and then
drop the target selection.

However, as a quick first step, does adding the ilp32 / lp64 (and keeping the
architectures list for now) solve the immediate problem? Patch attached, OK
for trunk?

gcc/testsuite/ChangeLog:

        * gcc.dg/combine_ashiftrt_1.c: require-effective-target LP64
        * gcc.dg/combine_ashiftrt_2.c: require-effective-target ILP32

--Alan


From: Rainer Orth [r...@cebitec.uni-bielefeld.de]
Sent: 23 October 2014 14:10
To: Andreas Schwab
Cc: Alan Lawrence; Jeff Law; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Relax check against commuting XOR and ASHIFTRT in combine.c

Andreas Schwab sch...@linux-m68k.org writes:

> Alan Lawrence alan.lawre...@arm.com writes:
>
>> diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c b/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
>> new file mode 100644
>> index ..90e64fd10dc358f10ad03a90041605bc3ccb7011
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile {target sparc64*-*-* aarch64*-*-* x86_64-*-* powerpc64*-*-*} } */
>
> You should check for lp64 instead of matching 64 in target names, to
> reject -m32.
>
>> diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c b/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
>> new file mode 100644
>> index ..fd6827caed230ea5dd2d6ec4431b11bf826531ea
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile {target arm*-*-* i?86-*-* powerpc-*-* sparc-*-*} } */
>
> Likewise, using ilp32 to reject -m64.

Right, the current target lists are simply bogus on biarch targets.

Alan, what's the reasoning behind your current target lists here? Any
reason the test couldn't work elsewhere? If not, it would be way better
to introduce a corresponding effective-target keyword than listing
particular targets without explanation.

This needs to be fixed: the issue has been known for three weeks now and
causes testsuite noise on many platforms.

        Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

commit 43e8585f475dff386d245cb150940755cd9b43d9
Author: Alan Lawrence alan.lawre...@arm.com
Date:   Thu Oct 23 17:41:28 2014 +0100

    Add ILP32 / LP64

diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c b/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
index 90e64fd..cb669c9 100644
--- a/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
+++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
@@ -1,4 +1,5 @@
 /* { dg-do compile {target sparc64*-*-* aarch64*-*-* x86_64-*-* powerpc64*-*-*} } */
+/* { dg-require-effective-target lp64 } */
 /* { dg-options "-O2 -fdump-rtl-combine-all" } */

 typedef long long int int64_t;
diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c b/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
index fd6827c..6bd6f2f 100644
--- a/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
+++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
@@ -1,4 +1,5 @@
 /* { dg-do compile {target arm*-*-* i?86-*-* powerpc-*-* sparc-*-*} } */
+/* { dg-require-effective-target ilp32} */
 /* { dg-options "-O2 -fdump-rtl-combine-all" } */

 typedef long int32_t;
[patch] Update libstdc++ status docs
Committed to trunk.

commit 65b21ceb68ef845874f79555cd51d7e54e69340f
Author: Jonathan Wakely jwak...@redhat.com
Date:   Thu Oct 23 17:54:34 2014 +0100

    * doc/xml/manual/status_cxx2011.xml: Update status.
    * doc/xml/manual/status_cxx2014.xml: Update TS status.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
index a553adf..7b631c2 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
@@ -1176,11 +1176,10 @@ particular release.
       <entry/>
     </row>
     <row>
-      <?dbhtml bgcolor="#B0B0B0" ?>
       <entry>22.3.3.1</entry>
       <entry>Character classification</entry>
-      <entry>Partial</entry>
-      <entry>Missing <code>isblank</code>.</entry>
+      <entry>Y</entry>
+      <entry/>
     </row>
     <row>
       <entry>22.3.3.2</entry>
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
index b7ed7ed..e2a2d94 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
@@ -319,18 +319,6 @@ not in any particular release.
     <row>
       <?dbhtml bgcolor="#C8B0B0" ?>
       <entry>
-        <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3843.pdf">
-          N3843
-        </link>
-      </entry>
-      <entry>A SFINAE-Friendly std::common_type (option 2 was approved)</entry>
-      <entry>N</entry>
-      <entry>Library Fundamentals TS</entry>
-    </row>
-
-    <row>
-      <?dbhtml bgcolor="#C8B0B0" ?>
-      <entry>
         <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3866.html">
           N3866
         </link>
@@ -411,14 +399,14 @@ not in any particular release.
     </row>

     <row>
-      <?dbhtml bgcolor="#C8B0B0" ?>
+      <?dbhtml bgcolor="#C8C8B0" ?>
       <entry>
         <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3932.htm">
           N3932
         </link>
       </entry>
       <entry>Variable Templates For Type Traits</entry>
-      <entry>N</entry>
+      <entry>Partial</entry>
       <entry>Library Fundamentals TS</entry>
     </row>
Re: [PATCH] Relax check against commuting XOR and ASHIFTRT in combine.c
Alan Lawrence alan.lawre...@arm.com writes:

> Mmmm, I've made a few attempts at filtering according to LP64 and ILP32, but
> not managed to get anything working so far (that is, I've ended up with the
> test not being executed on platforms where it should)... ah, I see now where
> I've been going wrong, patch attached.
>
> The original intent was pretty much to execute the test on everything with
> the appropriate word size, i.e. where a 64-bit comparison would be done in
> 64 bits rather than emulated in 2*32; and for 32-bit where that was not
> sign-extended to 64 (or some other such problem). The architectures I wrote
> in the file were those on which I tested the rtl dump, excluding some archs
> where you get (neg (lt 0 x)) rather than (neg (ge x 0)); but the latter
> really shouldn't be a problem, it should be possible to use a regex matching
> either form, and then drop the target selection.
>
> However, as a quick first step, does adding the ilp32 / lp64 (and keeping
> the architectures list for now) solve the immediate problem? Patch attached,
> OK for trunk?

No, as I said this is wrong for biarch targets like sparc and i386.

> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/combine_ashiftrt_1.c: require-effective-target LP64
>         * gcc.dg/combine_ashiftrt_2.c: require-effective-target ILP32

Nit: write this as e.g.

        Require lp64.

> diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c b/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
> index 90e64fd..cb669c9 100644
> --- a/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
> +++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
> @@ -1,4 +1,5 @@
>  /* { dg-do compile {target sparc64*-*-* aarch64*-*-* x86_64-*-* powerpc64*-*-*} } */

This should be something like

  { target aarch64*-*-* i?86-*-* powerpc*-*-* sparc*-*-* x86_64-*-* }

E.g. sparc-sun-solaris2.11 with -m64 is lp64, but would be excluded by
your target list. Keep the list sorted alphabetically and best add an
explanation so others know what those targets have in common.

> +/* { dg-require-effective-target lp64 } */
>  /* { dg-options "-O2 -fdump-rtl-combine-all" } */
>
>  typedef long long int int64_t;
> diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c b/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
> index fd6827c..6bd6f2f 100644
> --- a/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
> +++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
> @@ -1,4 +1,5 @@
>  /* { dg-do compile {target arm*-*-* i?86-*-* powerpc-*-* sparc-*-*} } */

Same here:

  { target arm*-*-* i?86-*-* powerpc*-*-* sparc*-*-* x86_64-*-* }

> +/* { dg-require-effective-target ilp32} */
>  /* { dg-options "-O2 -fdump-rtl-combine-all" } */
>
>  typedef long int32_t;

        Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [testsuite] Fix gcc.dg/ipa/ipa-icf-2[18].c on Solaris
On 10/23/2014 04:03 PM, Uros Bizjak wrote:
> On Thu, Oct 23, 2014 at 3:54 PM, Uros Bizjak ubiz...@gmail.com wrote:
>
> Ok, will try that. Still no idea why Linux/x86 and Solaris/x86 are
> different here.
>
> Why do you think so? I certainly have:
>
> FAIL: gcc.dg/ipa/ipa-icf-21.c (test for excess errors)
> UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf "Equal symbols: 1"
> UNRESOLVED: gcc.dg/ipa/ipa-icf-21.c scan-ipa-dump icf "Semantic equality hit:bar->foo"
>
> in my i686-linux test_summary log.
>
> Right, but on i386-pc-solaris2.11 it passes.
>
> It really depends on how the compiler is configured, if it defaults to
> march that has sse/sse2 by default, it will succeed, otherwise it will
> not.
>
> For some reason, I hit a division by zero on alphaev68-linux-gnu in
> the report generation code:
>
> Starting program: /space/uros/gcc-build/prev-gcc/cc1 -O2 -fdump-ipa-icf -quiet ipa-icf-14.i
>
> Program received signal SIGFPE, Arithmetic exception.
> ipa_icf::sem_item_optimizer::merge_classes (this=0x12187dd80, prev_class_count=3)
>     at /space/homedirs/uros/gcc-svn/trunk/gcc/ipa-icf.c:2203
> 2203                   non_singular_classes_count);
> (gdb) list
> 2198          fprintf (dump_file, "Average class size before: %.2f, after: %.2f\n",
> 2199                   1.0f * item_count / prev_class_count,
> 2200                   1.0f * item_count / class_count);
> 2201          fprintf (dump_file, "Average non-singular class size: %.2f, count: %u\n",
> 2202                   1.0f * non_singular_classes_sum / non_singular_classes_count,
> 2203                   non_singular_classes_count);
> 2204          fprintf (dump_file, "Equal symbols: %u\n", equal_items);
> 2205          fprintf (dump_file, "Fraction of visited symbols: %.2f%%\n\n",
> 2206                   100.0f * equal_items / item_count);
> 2207        }
> (gdb) p non_singular_classes_count
> $1 = 0
>
> Also:
>
> (gdb) p non_singular_classes_sum
> $1 = 0
>
> This creates a nice NaN which can throw an exception.

Hello.

Sorry for a stupid bug. I attached patch that should fix these divisions
by zero. I'm just wondering if we have a machine in compile farm with alpha?

Thanks,
Martin

> (gdb) bt
> #0  ipa_icf::sem_item_optimizer::merge_classes (this=0x12187dd80, prev_class_count=3)
>     at /space/homedirs/uros/gcc-svn/trunk/gcc/ipa-icf.c:2203
> #1  0x000121255c70 in ipa_icf::sem_item_optimizer::execute (this=0x12187dd80)
>     at /space/homedirs/uros/gcc-svn/trunk/gcc/ipa-icf.c:1602
> #2  0x00012125999c in ipa_icf::ipa_icf_driver ()
>     at /space/homedirs/uros/gcc-svn/trunk/gcc/ipa-icf.c:2319
> #3  0x00012125a46c in ipa_icf::pass_ipa_icf::execute (this=0x121850f20)
>     at /space/homedirs/uros/gcc-svn/trunk/gcc/ipa-icf.c:2367
> #4  0x0001209c10c4 in execute_one_pass (pass=0x121850f20)
>     at /space/homedirs/uros/gcc-svn/trunk/gcc/passes.c:2156
> #5  0x0001209c26f4 in execute_ipa_pass_list (pass=0x121850f20)
>     at /space/homedirs/uros/gcc-svn/trunk/gcc/passes.c:2550
> #6  0x00012048fcdc in ipa_passes ()
>     at /space/homedirs/uros/gcc-svn/trunk/gcc/cgraphunit.c:2057
>
> Uros.

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index d1238a4..e7a293e 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -1736,7 +1736,7 @@ sem_item_optimizer::parse_nonsingleton_classes (void)

   if (dump_file)
     fprintf (dump_file, "Init called for %u items (%.2f%%).\n", init_called_count,
-             100.0f * init_called_count / m_items.length ());
+             m_items.length () ? 100.0f * init_called_count / m_items.length (): 0.0f);
 }

 /* Equality function for semantic items is used to subdivide existing
@@ -2196,14 +2196,15 @@ sem_item_optimizer::merge_classes (unsigned int prev_class_count)
       fprintf (dump_file, "Congruent classes before: %u, after: %u\n",
                prev_class_count, class_count);
       fprintf (dump_file, "Average class size before: %.2f, after: %.2f\n",
-               1.0f * item_count / prev_class_count,
-               1.0f * item_count / class_count);
+               prev_class_count ? 1.0f * item_count / prev_class_count : 0.0f,
+               class_count ? 1.0f * item_count / class_count : 0.0f);
       fprintf (dump_file, "Average non-singular class size: %.2f, count: %u\n",
-               1.0f * non_singular_classes_sum / non_singular_classes_count,
+               non_singular_classes_count ? 1.0f * non_singular_classes_sum /
+               non_singular_classes_count : 0.0f,
                non_singular_classes_count);
       fprintf (dump_file, "Equal symbols: %u\n", equal_items);
       fprintf (dump_file, "Fraction of visited symbols: %.2f%%\n\n",
-               100.0f * equal_items / item_count);
+               item_count ? 100.0f * equal_items / item_count : 0.0f);
     }

   for (hash_table<congruence_class_group_hash>::iterator it = m_classes.begin ();
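
The guard used throughout the patch above (test the denominator, fall back to 0.0f) could also be factored into a small helper. The following is only an illustrative sketch; `safe_ratio` and `dump_average` are hypothetical names and are not part of ipa-icf.c.

```cpp
#include <cstdio>

// Hypothetical helper: num / den as a float, or 0.0f when den is zero,
// so statistics code never evaluates a float division by zero (which
// traps on targets such as alpha, as in the report above).
static float
safe_ratio (unsigned int num, unsigned int den)
{
  return den ? 1.0f * num / den : 0.0f;
}

// Hypothetical stand-in for one line of the statistics dump.
static void
dump_average (FILE *f, unsigned int sum, unsigned int count)
{
  fprintf (f, "Average non-singular class size: %.2f, count: %u\n",
           safe_ratio (sum, count), count);
}
```

With such a helper each fprintf argument stays on one line, at the cost of reporting 0.00 for an empty set, which is the same convention the patch chooses.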
Re: [PATCH] Add zero-overhead looping for xtensa backend
On Tue, Oct 21, 2014 at 7:20 PM, Yangfei (Felix) felix.y...@huawei.com wrote:
> If the tripcount spill issue is not handled in the pattern, ICE may happen
> then. Here reload is trying to spill pseudo 173, but a memory operand is
> not allowed in zero_cost_loop_end pattern. And this is what I am trying to
> solve.

We have full control of the zero_cost_loop_end pattern. Plus, it
doesn't actually generate any real code. Edit it so it can take a
memory operand.
Re: [PATCH i386 AVX512] [82/n] Add gather built-ins.
On Wed, Oct 22, 2014 at 2:17 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
> Hello,
> This patch introduces built-ins for gather insns.
>
> Bootstrapped.
> AVX-512* tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
>         * config/i386/i386.c
>         (ix86_builtins): Add IX86_BUILTIN_GATHER3ALTSIV4DF,
>         IX86_BUILTIN_GATHER3ALTDIV8SF, IX86_BUILTIN_GATHER3ALTSIV4DI,
>         IX86_BUILTIN_GATHER3ALTDIV8SI.
>         (ix86_expand_builtin): Handle IX86_BUILTIN_GATHER3ALTDIV8SF,
>         IX86_BUILTIN_GATHER3ALTDIV8SI, IX86_BUILTIN_SCATTERSIV4DF,
>         IX86_BUILTIN_SCATTERSIV4DI, IX86_BUILTIN_SCATTERDIV2DF,
>         IX86_BUILTIN_SCATTERDIV4DF, IX86_BUILTIN_GATHER3ALTSIV4DI,
>         IX86_BUILTIN_GATHER3ALTSIV4DF, IX86_BUILTIN_SCATTERDIV4DI,
>         IX86_BUILTIN_SCATTERDIV2DI, IX86_BUILTIN_GATHER3SIV8SI,
>         IX86_BUILTIN_GATHER3DIV8SI, IX86_BUILTIN_GATHER3SIV4DF,
>         IX86_BUILTIN_GATHER3SIV4DI, IX86_BUILTIN_GATHER3DIV8SF,
>         IX86_BUILTIN_GATHER3SIV8SF, IX86_BUILTIN_GATHER3DIV4DF,
>         IX86_BUILTIN_GATHER3DIV2DF, IX86_BUILTIN_GATHER3DIV2DI,
>         IX86_BUILTIN_GATHER3DIV4DI, IX86_BUILTIN_SCATTERDIV4SF,
>         IX86_BUILTIN_SCATTERSIV2DI, IX86_BUILTIN_GATHER3SIV2DI,
>         IX86_BUILTIN_GATHER3SIV4SI, IX86_BUILTIN_GATHER3SIV4SF,
>         IX86_BUILTIN_GATHER3SIV2DF, IX86_BUILTIN_SCATTERSIV2DF,
>         IX86_BUILTIN_SCATTERDIV4SI, IX86_BUILTIN_SCATTERSIV4SF,
>         IX86_BUILTIN_SCATTERSIV4SI, IX86_BUILTIN_SCATTERDIV8SI,
>         IX86_BUILTIN_GATHER3DIV4SI, IX86_BUILTIN_SCATTERSIV8SI,
>         IX86_BUILTIN_SCATTERSIV8SF, IX86_BUILTIN_GATHER3DIV4SF,
>         IX86_BUILTIN_SCATTERDIV8SF.
>         (ix86_vectorize_builtin_gather): Update V2DFmode, V4DFmode,
>         V2DImode, V4DImode, V4SFmode, V8SFmode, V4SImode, V8SImode.

OK.

Thanks,
Uros.
Re: [PATCH 2/2] [AARCH64,NEON] Convert arm_neon.h to use new builtins for vld[234](q?)_lane_*
On 23 October 2014 11:14, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:
> On 8 October 2014 18:27, charles.bay...@linaro.org wrote:
>
>> +#define __LD2_LANE_FUNC(intype, vectype, largetype, ptrtype, \
>
> Just above the insertion point in arm-neon.h is the comment:
>
> /* Start of temporary inline asm for vldn, vstn and friends. */
>
> This patch removes the temporary inline asm vldn implementation, the
> replacement implementation should be inserted in the section below,
> look for the comment that starts:

OK. The vstN_lane intrinsics are similarly misplaced, I'll do a
separate patch to move them.

> Start of optimal implementations
>
>> +mode, ptrmode, funcsuffix, signedtype)\
>> +__extension__ static __inline intype __attribute__ ((__always_inline__)) \
>> +vld2_lane_##funcsuffix (const ptrtype * __ptr, intype __b, const int __c) \
>> +{ \
>> + __builtin_aarch64_simd_oi __o; \
>> + largetype __temp; \
>> + __temp.val[0] = \
>
> There is something odd about the white space here, space before tab?
> This is repeated in various places through the rest of the patch.

There are a few spaces before tabs, but the weird misalignment of the
\'s when reading the patch is just due to the effect of the unified
diff misaligning the code with the tabstops.

I have respun the patch with those spaces removed.

> Otherwise this and the previous 1/2 associated patch look good, can
> you respin with these tidy ups?

OK for trunk?

From a37d24c57f6c7abe4ade05c1f383e82ebd20c052 Mon Sep 17 00:00:00 2001
From: Charles Baylis charles.bay...@linaro.org
Date: Wed, 10 Sep 2014 13:45:25 +0100
Subject: [PATCH 2/2] [AARCH64,NEON] Convert arm_neon.h to use new builtins for
 vld[234](q?)_lane_*

This patch replaces the inline assembler implementations of the
vld[234](q?)_lane_* intrinsics with new versions which exploit the new builtin
functions added in patch 1.

Tested (with the rest of the patch series) with make check on aarch64-oe-linux
with qemu, and also causes no regressions in clyon's NEON intrinsics tests.

DATE  Charles Baylis  charles.bay...@linaro.org

        * config/aarch64/arm_neon.h (__LD2_LANE_FUNC): Rewrite using builtins,
        update uses to use new macro arguments.
        (__LD3_LANE_FUNC): Likewise.
        (__LD4_LANE_FUNC): Likewise.
---
 gcc/config/aarch64/arm_neon.h | 404 +-
 1 file changed, 281 insertions(+), 123 deletions(-)

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 9b1873f..18c6e92 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -11805,47 +11805,6 @@ __LD2R_FUNC (uint16x8x2_t, uint16x2_t, uint16_t, 8h, u16, q)
 __LD2R_FUNC (uint32x4x2_t, uint32x2_t, uint32_t, 4s, u32, q)
 __LD2R_FUNC (uint64x2x2_t, uint64x2_t, uint64_t, 2d, u64, q)

-#define __LD2_LANE_FUNC(rettype, ptrtype, regsuffix,                   \
-                        lnsuffix, funcsuffix, Q)                        \
-  __extension__ static __inline rettype                                \
-  __attribute__ ((__always_inline__))                                  \
-  vld2 ## Q ## _lane_ ## funcsuffix (const ptrtype *ptr,               \
-                                     rettype b, const int c)           \
-  {                                                                    \
-    rettype result;                                                    \
-    __asm__ ("ld1 {v16." #regsuffix ", v17." #regsuffix "}, %1\n\t"    \
-             "ld2 {v16." #lnsuffix ", v17." #lnsuffix "}[%3], %2\n\t"  \
-             "st1 {v16." #regsuffix ", v17." #regsuffix "}, %0\n\t"    \
-             : "=Q"(result)                                            \
-             : "Q"(b), "Q"(*(const rettype *)ptr), "i"(c)              \
-             : "memory", "v16", "v17");                                \
-    return result;                                                     \
-  }
-
-__LD2_LANE_FUNC (int8x8x2_t, uint8_t, 8b, b, s8,)
-__LD2_LANE_FUNC (float32x2x2_t, float32_t, 2s, s, f32,)
-__LD2_LANE_FUNC (float64x1x2_t, float64_t, 1d, d, f64,)
-__LD2_LANE_FUNC (poly8x8x2_t, poly8_t, 8b, b, p8,)
-__LD2_LANE_FUNC (poly16x4x2_t, poly16_t, 4h, h, p16,)
-__LD2_LANE_FUNC (int16x4x2_t, int16_t, 4h, h, s16,)
-__LD2_LANE_FUNC (int32x2x2_t, int32_t, 2s, s, s32,)
-__LD2_LANE_FUNC (int64x1x2_t, int64_t, 1d, d, s64,)
-__LD2_LANE_FUNC (uint8x8x2_t, uint8_t, 8b, b, u8,)
-__LD2_LANE_FUNC (uint16x4x2_t, uint16_t, 4h, h, u16,)
-__LD2_LANE_FUNC (uint32x2x2_t, uint32_t, 2s, s, u32,)
-__LD2_LANE_FUNC (uint64x1x2_t, uint64_t, 1d, d, u64,)
-__LD2_LANE_FUNC (float32x4x2_t, float32_t, 4s, s, f32, q)
-__LD2_LANE_FUNC (float64x2x2_t, float64_t, 2d, d, f64, q)
-__LD2_LANE_FUNC (poly8x16x2_t, poly8_t, 16b, b, p8, q)
-__LD2_LANE_FUNC (poly16x8x2_t, poly16_t, 8h, h, p16, q)
-__LD2_LANE_FUNC (int8x16x2_t, int8_t, 16b, b, s8, q)
-__LD2_LANE_FUNC (int16x8x2_t, int16_t, 8h, h, s16, q)
-__LD2_LANE_FUNC (int32x4x2_t, int32_t, 4s, s, s32, q)
-__LD2_LANE_FUNC (int64x2x2_t, int64_t, 2d, d, s64, q)
-__LD2_LANE_FUNC (uint8x16x2_t, uint8_t, 16b, b, u8, q)
-__LD2_LANE_FUNC (uint16x8x2_t, uint16_t, 8h, h, u16, q)
-__LD2_LANE_FUNC (uint32x4x2_t, uint32_t, 4s, s, u32, q)
-__LD2_LANE_FUNC (uint64x2x2_t, uint64_t, 2d, d,
Re: [PATCH i386 AVX512] [83/n] Add scatter built-ins.
On Wed, Oct 22, 2014 at 2:19 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
> Hello,
> This patch adds built-ins for scatter insns.
>
> Bootstrapped.
> AVX-512* tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
>         * config/i386/i386.c
>         (ix86_init_mmx_sse_builtins): Define
>         __builtin_ia32_gather3siv2df, __builtin_ia32_gather3siv4df,
>         __builtin_ia32_gather3div2df, __builtin_ia32_gather3div4df,
>         __builtin_ia32_gather3siv4sf, __builtin_ia32_gather3siv8sf,
>         __builtin_ia32_gather3div4sf, __builtin_ia32_gather3div8sf,
>         __builtin_ia32_gather3siv2di, __builtin_ia32_gather3siv4di,
>         __builtin_ia32_gather3div2di, __builtin_ia32_gather3div4di,
>         __builtin_ia32_gather3siv4si, __builtin_ia32_gather3siv8si,
>         __builtin_ia32_gather3div4si, __builtin_ia32_gather3div8si,
>         __builtin_ia32_gather3altsiv4df, __builtin_ia32_gather3altdiv8sf,
>         __builtin_ia32_gather3altsiv4di, __builtin_ia32_gather3altdiv8si,
>         __builtin_ia32_scattersiv8sf, __builtin_ia32_scattersiv4sf,
>         __builtin_ia32_scattersiv4df, __builtin_ia32_scattersiv2df,
>         __builtin_ia32_scatterdiv8sf, __builtin_ia32_scatterdiv4sf,
>         __builtin_ia32_scatterdiv4df, __builtin_ia32_scatterdiv2df,
>         __builtin_ia32_scattersiv8si, __builtin_ia32_scattersiv4si,
>         __builtin_ia32_scattersiv4di, __builtin_ia32_scattersiv2di,
>         __builtin_ia32_scatterdiv8si, __builtin_ia32_scatterdiv4si,
>         __builtin_ia32_scatterdiv4di, __builtin_ia32_scatterdiv2di.

OK.

Thanks,
Uros.
Re: [PATCH i386 AVX512] [84/n] Add missing immediate checks.
On Wed, Oct 22, 2014 at 2:28 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
> Hello,
> This tiny patch adds a couple of missing immediate checks.
>
> Bootstrapped.
> AVX-512* tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
>         * config/i386/i386.c
>         (ix86_expand_args_builtin): Handle avx_vpermilv4df_mask,
>         avx_shufpd256_mask, avx_vpermilv2df_mask.

OK.

Thanks,
Uros.