[RFC] Add aarch64 support for ada
From: Richard Henderson r...@redhat.com

The primary bit of rfc here is the hunk that applies to ada/types.h with respect to Fat_Pointer. Given that the Ada type, as defined in s-stratt.ads, does not include alignment, I can't imagine why the C type should have it. This causes problems with the AArch64 calling convention, which honors this alignment in the set of registers it chooses to pass the struct. One can see this difference in create_concat_name vs Exp_Dbug.Get_External_Name_With_Suffix.

The secondary bit of rfc is in the Makefile change. In particular,

+  system.ads<system-linux-x86_64.ads

IMO, this should really be called system-linux-lp64.ads, and should be usable for any 64-bit target that uses full ieee floating point, which is all of them.

IMO basically all of the differences between x86 and the other linux targets is a bug in the other linux targets. I.e. missing functionality. There are rare exceptions, such as ARM32 and its AAPCS unwinding.

Similarly with the HAVE_GNAT_ALTERNATE_STACK stuff. There aren't any linux hosts that don't support sigaltstack, so why is this conditionalized?

Anyway,

=== gnat Summary ===

# of expected passes            2308
# of expected failures          34
# of unsupported tests          22

I'll see about putting some rpms somewhere public so that no one else has to do the whole canadian-cross compile dance.

r~

	* gcc-interface/Makefile.in: Support aarch64-linux.
	* init.c (__gnat_alternate_stack): Use for aarch64.
	(__gnat_install_handler): Do sigaltstack for aarch64 too.
	* types.h (Fat_Pointer): Remove alignment attribute.
diff --git a/gcc/ada/gcc-interface/Makefile.in b/gcc/ada/gcc-interface/Makefile.in
index dc5e912..302d9a3 100644
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -2123,6 +2123,44 @@ ifeq ($(strip $(filter-out alpha% linux%,$(arch) $(osys))),)
   LIBRARY_VERSION := $(LIB_VERSION)
 endif
 
+# AArch64 Linux
+ifeq ($(strip $(filter-out aarch64% linux%,$(arch) $(osys))),)
+  LIBGNAT_TARGET_PAIRS = \
+  a-exetim.adb<a-exetim-posix.adb \
+  a-exetim.ads<a-exetim-default.ads \
+  a-intnam.ads<a-intnam-linux.ads \
+  a-synbar.adb<a-synbar-posix.adb \
+  a-synbar.ads<a-synbar-posix.ads \
+  s-inmaop.adb<s-inmaop-posix.adb \
+  s-intman.adb<s-intman-posix.adb \
+  s-linux.ads<s-linux.ads \
+  s-mudido.adb<s-mudido-affinity.adb \
+  s-osinte.ads<s-osinte-linux.ads \
+  s-osinte.adb<s-osinte-posix.adb \
+  s-osprim.adb<s-osprim-posix.adb \
+  s-taprop.adb<s-taprop-linux.adb \
+  s-tasinf.ads<s-tasinf-linux.ads \
+  s-tasinf.adb<s-tasinf-linux.adb \
+  s-tpopsp.adb<s-tpopsp-tls.adb \
+  s-taspri.ads<s-taspri-posix.ads \
+  g-sercom.adb<g-sercom-linux.adb \
+  $(ATOMICS_TARGET_PAIRS) \
+  $(ATOMICS_BUILTINS_TARGET_PAIRS) \
+  system.ads<system-linux-x86_64.ads
+  ## ^^ Note the above is a pretty-close placeholder.
+
+  TOOLS_TARGET_PAIRS = \
+    mlib-tgt-specific.adb<mlib-tgt-specific-linux.adb \
+    indepsw.adb<indepsw-gnu.adb
+
+  EXTRA_GNATRTL_TASKING_OBJS=s-linux.o a-exetim.o
+  EH_MECHANISM=-gcc
+  THREADSLIB=-lpthread -lrt
+  GNATLIB_SHARED=gnatlib-shared-dual
+  GMEM_LIB = gmemlib
+  LIBRARY_VERSION := $(LIB_VERSION)
+endif
+
 # x86-64 Linux
 ifeq ($(strip $(filter-out %x86_64 linux%,$(arch) $(osys))),)
   LIBGNAT_TARGET_PAIRS = \
diff --git a/gcc/ada/init.c b/gcc/ada/init.c
index f5c3a81..0ac2398 100644
--- a/gcc/ada/init.c
+++ b/gcc/ada/init.c
@@ -562,7 +562,9 @@ __gnat_error_handler (int sig, siginfo_t *si ATTRIBUTE_UNUSED, void *ucontext)
   Raise_From_Signal_Handler (exception, msg);
 }
 
-#if defined (i386) || defined (__x86_64__) || defined (__powerpc__)
+#if defined (i386) || defined (__x86_64__) || defined (__powerpc__) \
+    || defined (__aarch64__)
+#define HAVE_GNAT_ALTERNATE_STACK 1
 /* This must be in keeping with System.OS_Interface.Alternate_Stack_Size.  */
 char __gnat_alternate_stack[16 * 1024]; /* 2 * SIGSTKSZ */
 #endif
@@ -603,7 +605,7 @@ __gnat_install_handler (void)
      handled properly, avoiding a SEGV generation from stack usage by the
      handler itself.  */
 
-#if defined (i386) || defined (__x86_64__) || defined (__powerpc__)
+#ifdef HAVE_GNAT_ALTERNATE_STACK
   stack_t stack;
   stack.ss_sp = __gnat_alternate_stack;
   stack.ss_size = sizeof (__gnat_alternate_stack);
@@ -624,7 +626,7 @@ __gnat_install_handler (void)
     sigaction (SIGILL, &act, NULL);
   if (__gnat_get_interrupt_state (SIGBUS) != 's')
     sigaction (SIGBUS, &act, NULL);
-#if defined (i386) || defined (__x86_64__) || defined (__powerpc__)
+#ifdef HAVE_GNAT_ALTERNATE_STACK
   act.sa_flags |= SA_ONSTACK;
 #endif
   if (__gnat_get_interrupt_state (SIGSEGV) != 's')
diff --git a/gcc/ada/types.h b/gcc/ada/types.h
index a0f2891..6b3db93 100644
--- a/gcc/ada/types.h
+++ b/gcc/ada/types.h
@@ -79,8 +79,7 @@ typedef Char *Str_Ptr;
 /* Types for the fat pointer used for strings and the template it points to.  */
 typedef struct {int Low_Bound, High_Bound; } String_Template;
Re: [RFC] Add aarch64 support for ada
The Makefile.in and init.c changes are OK. The types.h change is likely more controversial and may be problematic, I'll let Eric comment.

> +  system.ads<system-linux-x86_64.ads
>
> IMO, this should really be called system-linux-lp64.ads, and should be usable for any 64-bit target that uses full ieee floating point, which is all of them.

Well, in our experience, each time we've tried to share system files, this came back and bit us at some point. But I do not know the aarch64 architecture to comment on this specific case.

Arno
Re: [PATCH] Fix reassociation with -g (PR tree-optimization/60844)
On Tue, 15 Apr 2014, Jakub Jelinek wrote:

> Hi!
>
> The (admittedly ugly) reassoc stmt positioning stuff requires that we maintain uids in ascending order within each bb (equal uid for several adjacent stmts is ok), including debug stmts. We assign those initially, and for stmts we add we make sure to copy the uid from some adjacent insn. But, as the following testcase shows, we don't take into account that gsi_remove can add debug stmts, and those don't have uid set.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.9.1?

Ok.

Thanks,
Richard.

> 2014-04-15  Jakub Jelinek  ja...@redhat.com
>
> 	PR tree-optimization/60844
> 	* tree-ssa-reassoc.c (reassoc_remove_stmt): New function.
> 	(propagate_op_to_single_use, remove_visited_stmt_chain,
> 	linearize_expr, repropagate_negates, reassociate_bb): Use it
> 	instead of gsi_remove.
>
> 	* gcc.dg/pr60844.c: New test.
>
> --- gcc/tree-ssa-reassoc.c.jj	2014-03-13 10:38:09.0 +0100
> +++ gcc/tree-ssa-reassoc.c	2014-04-15 13:59:14.511383249 +0200
> @@ -221,6 +221,35 @@ static struct pointer_map_t *operand_ran
>  static long get_rank (tree);
>  static bool reassoc_stmt_dominates_stmt_p (gimple, gimple);
>  
> +/* Wrapper around gsi_remove, which adjusts gimple_uid of debug stmts
> +   possibly added by gsi_remove.  */
> +
> +bool
> +reassoc_remove_stmt (gimple_stmt_iterator *gsi)
> +{
> +  gimple stmt = gsi_stmt (*gsi);
> +
> +  if (!MAY_HAVE_DEBUG_STMTS || gimple_code (stmt) == GIMPLE_PHI)
> +    return gsi_remove (gsi, true);
> +
> +  gimple_stmt_iterator prev = *gsi;
> +  gsi_prev (&prev);
> +  unsigned uid = gimple_uid (stmt);
> +  basic_block bb = gimple_bb (stmt);
> +  bool ret = gsi_remove (gsi, true);
> +  if (!gsi_end_p (prev))
> +    gsi_next (&prev);
> +  else
> +    prev = gsi_start_bb (bb);
> +  gimple end_stmt = gsi_stmt (*gsi);
> +  while ((stmt = gsi_stmt (prev)) != end_stmt)
> +    {
> +      gcc_assert (stmt && is_gimple_debug (stmt) && gimple_uid (stmt) == 0);
> +      gimple_set_uid (stmt, uid);
> +      gsi_next (&prev);
> +    }
> +  return ret;
> +}
>  
>  /* Bias amount for loop-carried phis.  We want this to be larger than
>     the depth of any reassociation tree we can see, but not larger than
> @@ -1123,7 +1152,7 @@ propagate_op_to_single_use (tree op, gim
>        update_stmt (use_stmt);
>        gsi = gsi_for_stmt (stmt);
>        unlink_stmt_vdef (stmt);
> -      gsi_remove (&gsi, true);
> +      reassoc_remove_stmt (&gsi);
>        release_defs (stmt);
>      }
>  
> @@ -3072,7 +3101,7 @@ remove_visited_stmt_chain (tree var)
>  	{
>  	  var = gimple_assign_rhs1 (stmt);
>  	  gsi = gsi_for_stmt (stmt);
> -	  gsi_remove (&gsi, true);
> +	  reassoc_remove_stmt (&gsi);
>  	  release_defs (stmt);
>  	}
>        else
> @@ -3494,7 +3523,7 @@ linearize_expr (gimple stmt)
>    update_stmt (stmt);
>  
>    gsi = gsi_for_stmt (oldbinrhs);
> -  gsi_remove (&gsi, true);
> +  reassoc_remove_stmt (&gsi);
>    release_defs (oldbinrhs);
>  
>    gimple_set_visited (stmt, true);
> @@ -3896,7 +3925,7 @@ repropagate_negates (void)
>  	      gimple_assign_set_rhs_with_ops (&gsi2, NEGATE_EXPR, x, NULL);
>  	      user = gsi_stmt (gsi2);
>  	      update_stmt (user);
> -	      gsi_remove (&gsi, true);
> +	      reassoc_remove_stmt (&gsi);
>  	      release_defs (feed);
>  	      plus_negates.safe_push (gimple_assign_lhs (user));
>  	    }
> @@ -4413,7 +4442,7 @@ reassociate_bb (basic_block bb)
>  		 reassociations.  */
>  	      if (has_zero_uses (gimple_get_lhs (stmt)))
>  		{
> -		  gsi_remove (&gsi, true);
> +		  reassoc_remove_stmt (&gsi);
>  		  release_defs (stmt);
>  		  /* We might end up removing the last stmt above which
>  		     places the iterator to the end of the sequence.
> --- gcc/testsuite/gcc.dg/pr60844.c.jj	2014-04-15 14:01:27.561689401 +0200
> +++ gcc/testsuite/gcc.dg/pr60844.c	2014-04-15 14:01:10.0 +0200
> @@ -0,0 +1,16 @@
> +/* PR tree-optimization/60844 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -g" } */
> +/* { dg-additional-options "-mtune=atom" { target { i?86-*-* x86_64-*-* } } } */
> +
> +void
> +foo (int *x, int y, int z)
> +{
> +  int b, c = x[0], d = x[1];
> +  for (b = 0; b < 1; b++)
> +    {
> +      int e = (y ? 1 : 0) | (d ? 2 : 0) | (z ? 1 : 0);
> +      e |= (c ? 2 : 0) | ((1 >> b) ? 1 : 0);
> +      x[2 + b] = e;
> +    }
> +}
>
> 	Jakub

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
Re: [RFC] Add aarch64 support for ada
> The primary bit of rfc here is the hunk that applies to ada/types.h with respect to Fat_Pointer. Given that the Ada type, as defined in s-stratt.ads, does not include alignment, I can't imagine why the C type should have it.

See gcc-interface/utils.c:finish_fat_pointer_type.

> This causes problems with the AArch64 calling convention, which honors this alignment in the set of registers it chooses to pass the struct. One can see this difference in create_concat_name vs Exp_Dbug.Get_External_Name_With_Suffix.

This should not happen though, since String is passed as a fat pointer too.

> Similarly with the HAVE_GNAT_ALTERNATE_STACK stuff. There aren't any linux hosts that don't support sigaltstack, so why is this conditionalized?

Because we don't want to use it if we can avoid it, as this generally makes things less robust. It's mandatory for x86 and x86-64, but I'm not sure why PowerPC is in the list. I'll try and remove it. Given that aarch32 works without it, I don't think that we should add it for aarch64. This may make some stack checking tests fail, but that's OK since there is no stack checking support in the aarch64 back-end AFAIK.

-- 
Eric Botcazou
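For readers following the Fat_Pointer question, the effect of an alignment attribute on a two-pointer struct can be observed from plain C. This is only a minimal sketch, assuming a GCC-compatible compiler; the type and function names are illustrative stand-ins and not the declarations in gcc/ada/types.h.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins, not the declarations from gcc/ada/types.h.  */
typedef struct { int low_bound, high_bound; } bounds_t;

/* Two-pointer struct with its natural alignment.  */
typedef struct { const char *array; const bounds_t *bounds; } fat_plain_t;

/* Same members, but over-aligned to twice the pointer size.  */
typedef struct { const char *array; const bounds_t *bounds; }
  __attribute__ ((aligned (sizeof (char *) * 2))) fat_aligned_t;

/* Alignment that a calling convention would see for each variant.  */
static size_t
plain_alignment (void)
{
  return __alignof__ (fat_plain_t);
}

static size_t
aligned_alignment (void)
{
  return __alignof__ (fat_aligned_t);
}
```

Both variants have the same size and member layout; only the reported alignment differs, which is the property the thread says the AArch64 calling convention takes into account when picking registers.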
RE: [patch] Disable if_conversion2 for Og
-----Original Message-----
From: Joey Ye [mailto:joey...@arm.com]
Sent: Tuesday, April 15, 2014 6:37 PM
To: 'Richard Biener'
Cc: GCC Patches
Subject: RE: [patch] Disable if_conversion2 for Og

Ok for trunk and branches after a while.

Why does if-conversion not have the same problem? On the GIMPLE part we avoid all kinds of if-conversion with -Og.

I think if-conversion should be disabled for Og too, but I don't have a case to show that it is harmful. If GIMPLE avoids all if-conversion, it is natural to do the same for RTL. I'll test and send another patch to also disable if-conversion.

New patch tested with more regressions with -Og, which are expected.

FAIL: gcc.target/arm/its.c scan-assembler-times \\tit 2
FAIL: gcc.target/arm/pr40956.c scan-assembler-times mov[t ]*r., #0 1
FAIL: gcc.target/arm/pr42835.c scan-assembler-times moveq[t ]*r.,[t ]*# 1
FAIL: gcc.target/arm/pr42835.c scan-assembler-times movne[t ]*r.,[t ]*# 1
FAIL: gcc.target/arm/shiftable.c scan-assembler sub.*[al]sl #6
FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler asreq
FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler lslne
FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler asrne
FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler lslne
FAIL: gcc.target/arm/vseleqdf.c scan-assembler-times vseleq.f64\\td[0-9]+ 1
FAIL: gcc.target/arm/vseleqsf.c scan-assembler-times vseleq.f32\\ts[0-9]+ 1
FAIL: gcc.target/arm/vselgedf.c scan-assembler-times vselge.f64\\td[0-9]+ 1
FAIL: gcc.target/arm/vselgesf.c scan-assembler-times vselge.f32\\ts[0-9]+ 1
FAIL: gcc.target/arm/vselgtdf.c scan-assembler-times vselgt.f64\\td[0-9]+ 1
FAIL: gcc.target/arm/vselgtsf.c scan-assembler-times vselgt.f32\\ts[0-9]+ 1
FAIL: gcc.target/arm/vselledf.c scan-assembler-times vselgt.f64\\td[0-9]+ 1
FAIL: gcc.target/arm/vsellesf.c scan-assembler-times vselgt.f32\\ts[0-9]+ 1
FAIL: gcc.target/arm/vselltdf.c scan-assembler-times vselge.f64\\td[0-9]+ 1
FAIL: gcc.target/arm/vselltsf.c scan-assembler-times vselge.f32\\ts[0-9]+ 1
FAIL: gcc.target/arm/vselnedf.c scan-assembler-times vseleq.f64\\td[0-9]+ 1
FAIL: gcc.target/arm/vselnesf.c scan-assembler-times vseleq.f32\\ts[0-9]+ 1
FAIL: gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\\td[0-9]+ 1
FAIL: gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\\ts[0-9]+ 1
FAIL: gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\\td[0-9]+ 1
FAIL: gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\\ts[0-9]+ 1

OK?

ChangeLog:

	* opts.c (OPT_fif_conversion, OPT_fif_conversion2): Disable for Og.

diff --git a/gcc/opts.c b/gcc/opts.c
index fdc903f..3f3db1a 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -431,8 +431,8 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_1_PLUS, OPT_fguess_branch_probability, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fcprop_registers, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fforward_propagate, NULL, 1 },
-    { OPT_LEVELS_1_PLUS, OPT_fif_conversion, NULL, 1 },
-    { OPT_LEVELS_1_PLUS, OPT_fif_conversion2, NULL, 1 },
+    { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fif_conversion, NULL, 1 },
+    { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fif_conversion2, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fipa_pure_const, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fipa_reference, NULL, 1 },
     { OPT_LEVELS_1_PLUS, OPT_fipa_profile, NULL, 1 },
Re: [RFC] Add aarch64 support for ada
> Similarly with the HAVE_GNAT_ALTERNATE_STACK stuff. There aren't any linux hosts that don't support sigaltstack, so why is this conditionalized?

Hum, I didn't know that Android also used the alternate stack... OK, let's use it unconditionally on Linux then, except for IA-64 which is a totally different beast. Can you change the patch accordingly?

-- 
Eric Botcazou
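The mechanism under discussion, registering an alternate signal stack and asking for it with SA_ONSTACK, can be sketched in standalone C roughly as follows. This is a minimal illustration using the POSIX sigaltstack and sigaction calls; the buffer size, the handler, and every name in it are hypothetical and it is not the code from init.c.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; this is not the code from init.c.  */
static char alt_stack[64 * 1024];
static volatile sig_atomic_t ran_on_alt_stack;

/* Record whether the handler's own frame lies inside ALT_STACK.  */
static void
on_signal (int sig)
{
  char marker;                  /* Lives on whatever stack we run on.  */
  uintptr_t here = (uintptr_t) &marker;
  uintptr_t low = (uintptr_t) alt_stack;

  (void) sig;
  ran_on_alt_stack = (here >= low && here < low + sizeof (alt_stack));
}

/* Register ALT_STACK, then install ON_SIGNAL for SIG with SA_ONSTACK.
   Returns 0 on success and -1 on failure.  */
static int
install_on_alt_stack (int sig)
{
  stack_t stack;
  struct sigaction act;

  stack.ss_sp = alt_stack;
  stack.ss_size = sizeof (alt_stack);
  stack.ss_flags = 0;
  if (sigaltstack (&stack, NULL) != 0)
    return -1;

  memset (&act, 0, sizeof (act));
  act.sa_handler = on_signal;
  act.sa_flags = SA_ONSTACK;
  sigemptyset (&act.sa_mask);
  return sigaction (sig, &act, NULL);
}
```

Without SA_ONSTACK the handler would run on the interrupted thread's normal stack, which is what makes a stack-overflow SIGSEGV hard to handle.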
Re: [PATCH 2/3, x86] X86 Silvermont vector cost model tune
On Tue, Apr 15, 2014 at 6:08 PM, Evgeny Stupachenko evstu...@gmail.com wrote:
> 2d part:
>
> 2014-04-15  Evgeny Stupachenko  evstu...@gmail.com
>
> 	* config/i386/x86-tune.def (TARGET_SLOW_PHUFFB): Target for slow byte
> 	shuffle on some x86 architectures.

... (X86_TUNE_SLOW_PSHUFB): New tune definition.

Typo: TARGET_SLOW_PHUFFB -> TARGET_SLOW_PSHUFB.

> 	* config/i386/i386.h (TARGET_SLOW_PHUFFB): Ditto.

... : New tune flag.

> 	* config/i386/i386.c (expand_vec_perm_even_odd_1): Avoid byte shuffles
> 	in architectures where they are slow (TARGET_SLOW_PHUFFB).

...: Avoid byte shuffles for TARGET_SLOW_PSHUFB.

OK for mainline with the above ChangeLog modifications.

Thanks,
Uros.
Re: [PATCH 3/3, x86] X86 Silvermont vector cost model tune
On Tue, Apr 15, 2014 at 6:12 PM, Evgeny Stupachenko evstu...@gmail.com wrote: 3d part: 2014-04-15 Evgeny Stupachenko evstu...@gmail.com * config/i386/i386.c (x86_add_stmt_cost): Fixing vector cost model for Silvermont. ... : Fix vector cost ... OK for mainline with the above ChangeLog fix. Thanks, Uros.
Re: [PATCH][AArch64] Vectorise bswap[16,32,64]
On 15/04/14 18:45, Eric Christopher wrote: Testcase weirdness? for (i 0; i N; ++i) { arr[i] = i; expect[i] = __builtin_bswap64 (i); if (y) /* Avoid vectorisation. */ abort (); } i 0 :) duplicated in all 3 testcases btw. Oops, here it is fixed. Thanks for catching this. Kyrill -eric On Tue, Apr 15, 2014 at 4:25 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote: Hi all, This patch enables aarch64 to vectorise bswap[16,32,64] operations by using the AdvancedSIMD forms of the rev[16,32,64] instructions. The TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION hook is extended to return the vectorised forms of __builtin_bswap* where possible and vector bswap patterns are added. I've added the tests in vect.exp and a new effective target check (vect_bswap) that can be extended for other targets in the future if they can also vectorise these operations. Is that ok? Bootstrapped and tested aarch64-none-linux-gnu. Ok for trunk? Thanks, Kyrill 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/aarch64/aarch64-builtins.c (aarch64_builtin_vectorized_function): Handle BUILT_IN_BSWAP16, BUILT_IN_BSWAP32, BUILT_IN_BSWAP64. * config/aarch64/aarch64-simd.md (bswapmode): New pattern. * config/aarch64/aarch64-simd-builtins.def: Define vector bswap builtins. * config/aarch64/iterator.md (VDQHSD): New mode iterator. (Vrevsuff): New mode attribute. 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * lib/target-supports.exp (check_effective_target_vect_bswap): New. * gcc.dg/vect/vect-bswap16: New test. * gcc.dg/vect/vect-bswap32: Likewise. * gcc.dg/vect/vect-bswap64: Likewise. 
commit 0d6d820881443a7ce7f9bd51f35aff04866c5e57 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Thu Apr 3 09:22:14 2014 +0100 [AArch64] vectorise bswap diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index 55cfe0a..d839a40 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -1086,7 +1086,29 @@ aarch64_builtin_vectorized_function (tree fndecl, tree type_out, tree type_in) return aarch64_builtin_decls[builtin]; } - + case BUILT_IN_BSWAP16: +#undef AARCH64_CHECK_BUILTIN_MODE +#define AARCH64_CHECK_BUILTIN_MODE(C, N) \ + (out_mode == N##Imode out_n == C \ +in_mode == N##Imode in_n == C) + if (AARCH64_CHECK_BUILTIN_MODE (4, H)) + return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_UNOPU_bswapv4hi]; + else if (AARCH64_CHECK_BUILTIN_MODE (8, H)) + return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_UNOPU_bswapv8hi]; + else + return NULL_TREE; + case BUILT_IN_BSWAP32: + if (AARCH64_CHECK_BUILTIN_MODE (2, S)) + return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_UNOPU_bswapv2si]; + else if (AARCH64_CHECK_BUILTIN_MODE (4, S)) + return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_UNOPU_bswapv4si]; + else + return NULL_TREE; + case BUILT_IN_BSWAP64: + if (AARCH64_CHECK_BUILTIN_MODE (2, D)) + return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_UNOPU_bswapv2di]; + else + return NULL_TREE; default: return NULL_TREE; } diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index c9b7570..e9736da 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -330,6 +330,8 @@ VAR1 (UNOP, floatunsv4si, 2, v4sf) VAR1 (UNOP, floatunsv2di, 2, v2df) + VAR5 (UNOPU, bswap, 10, v4hi, v8hi, v2si, v4si, v2di) + /* Implemented by aarch64_PERMUTE:perm_insnPERMUTE:perm_hilomode. 
*/ BUILTIN_VALL (BINOP, zip1, 0) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 73aee2c..75db3e8 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -286,6 +286,14 @@ [(set_attr type neon_mul_Vetypeq)] ) +(define_insn bswapmode + [(set (match_operand:VDQHSD 0 register_operand =w) +(bswap:VDQHSD (match_operand:VDQHSD 1 register_operand w)))] + TARGET_SIMD + revVrevsuff\\t%0.Vbtype, %1.Vbtype + [(set_attr type neon_revq)] +) + (define_insn *aarch64_mul3_eltmode [(set (match_operand:VMUL 0 register_operand =w) (mult:VMUL diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index f1339b8..2b5ebd1 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -150,6 +150,9 @@ ;; Vector modes for H and S types. (define_mode_iterator VDQHS [V4HI V8HI V2SI V4SI]) +;; Vector modes for H, S and D types. +(define_mode_iterator VDQHSD [V4HI V8HI V2SI V4SI V2DI]) + ;; Vector modes for Q, H and S types. (define_mode_iterator VDQQHS [V8QI V16QI V4HI V8HI V2SI V4SI]) @@ -352,6 +355,9 @@ (V2DI 2d) (V2SF 2s) (V4SF 4s) (V2DF 2d)]) +(define_mode_attr Vrevsuff [(V4HI 16) (V8HI 16) (V2SI 32) +(V4SI 32)
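As a rough illustration of the kind of loop this patch targets, here is a minimal C sketch that byte-swaps an array element by element with the GCC builtin. The function name is hypothetical, and whether the vectoriser actually turns the loop into Advanced SIMD rev instructions depends on the target and the optimisation flags; the sketch only shows the scalar source form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte-swap N 32-bit elements from IN into OUT.  The loop body is a
   single __builtin_bswap32 call per element, which is the shape a
   vectoriser can map onto a vector byte-reverse when one exists.  */
static void
bswap32_array (uint32_t *out, const uint32_t *in, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = __builtin_bswap32 (in[i]);
}
```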
Re: [PATCH][ARM] Handle simple SImode PLUS and MINUS operations in rtx costs
On 02/04/14 13:55, Kyrill Tkachov wrote: Pinging this for stage1, otherwise I'll forget about it and it'll fall through the cracks... http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01276.html Thanks, Kyrill Ping. http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01276.html Thanks, Kyrill On 24/03/14 17:21, Kyrill Tkachov wrote: Hi all, I noticed that we don't handle simple reg-to-reg arithmetic operations in the arm rtx cost functions. We should be adding the cost of alu.arith to the costs of the operands. This patch does that. Since we don't have any cost tables yet that have a non-zero value for that field it shouldn't affect code-gen for any current cores. Bootstrapped and tested on arm-none-linux-gnueabihf. Ok for next stage1? Thanks, Kyrill 2014-03-24 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/arm/arm.c (arm_new_rtx_costs): Handle reg-to-reg PLUS and MINUS RTXs.
Re: [PATCH] [CLEANUP] Wrap locally-used functions in anonymous namespaces
On Tue, Apr 15, 2014 at 7:33 PM, Patrick Palka patr...@parcs.ath.cx wrote:
> On Tue, Apr 15, 2014 at 3:51 AM, Richard Biener richard.guent...@gmail.com wrote:
>> On Mon, Apr 14, 2014 at 4:51 PM, Patrick Palka patr...@parcs.ath.cx wrote:
>>> Hi everyone,
>>>
>>> This patch wraps a bunch of locally-used, non-debug functions in an anonymous namespace. These functions can't simply be marked as static because they are used as template arguments to hash_table::traverse, and the C++98 standard does not allow non-extern variables to be used as template arguments. The next best thing to marking them static is to define each of these functions inside an anonymous namespace.
>>
>> Hum, the formatting used looks super-ugly. I suppose a local visibility attribute would work as well? (well, what's the goal of the patch?)
>>
>> Thanks,
>> Richard.
>
> The goal of this patch is to resolve warnings emitted by -Wmissing-declarations for the GCC sources. Later I would like to propose adding -Wmissing-declarations to GCC's build flags, but I figured that these kinds of cleanup patches are good on their own.
>
> I don't think a local visibility attribute would squelch the -Wmissing-declaration warnings. Is there a better/standardized format for defining a function within an anonymous namespace? I personally don't think the formatting is too bad.

Existing uses at least add vertical spacing after namespace { and before the corresponding closing }, adding // anon namespace after that.

Thus the patch is ok with that style change.

Thanks,
Richard.
Re: [PATCH] [CLEANUP] Declare global functions before defining them
On Tue, Apr 15, 2014 at 7:33 PM, Patrick Palka patr...@parcs.ath.cx wrote:
> On Tue, Apr 15, 2014 at 3:52 AM, Richard Biener richard.guent...@gmail.com wrote:
>> On Mon, Apr 14, 2014 at 4:52 PM, Patrick Palka patr...@parcs.ath.cx wrote:
>>> Hi everyone,
>>>
>>> Many source files currently define a global function that is not previously declared within that source file because the source file did not include the appropriate header file that declares said function. This patch fixes a number of these occurrences by making sure to include the appropriate header file within the offending source files.
>>>
>>> Bootstrapped and regtested on x86_64-unknown-linux-gnu.
>>
>> How did you find these? (in the C bootstrap times -Wstrict-prototypes did that)
>>
>> Thanks,
>> Richard.
>
> Like with the other two patches, the changes in this patch address a subset of the warnings emitted by -Wmissing-declarations. In this case the subset is extern functions that are declared inside a header file but whose defining source file does not include said header file.

Ok. Then we should enable -Wmissing-declarations to not regress here.

The patch is ok.

Thanks,
Richard.
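The pattern these cleanup patches address can be shown in a small C sketch: a global function should see its own declaration before its definition, while a file-local helper is simply made static. The function names and the header mentioned in the comment are hypothetical; in a real tree the prototype would come from an included header rather than being written inline.

```c
#include <assert.h>

/* In a real tree this prototype would live in a header (say "clamp.h",
   a hypothetical name) that the defining file also includes, so that
   -Wmissing-declarations stays quiet and mismatches are diagnosed.  */
int clamp_to_byte (int value);

/* File-local helper: static, so no prior declaration is expected.  */
static int
max_int (int a, int b)
{
  return a > b ? a : b;
}

/* Global function, defined after its declaration is visible.  */
int
clamp_to_byte (int value)
{
  int floored = max_int (value, 0);
  return floored > 255 ? 255 : floored;
}
```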
Re: [PATCH] Make SRA tolerate most throwing statements
> back in January in http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00848.html Eric pointed out a testcase where the problem was SRA not scalarizing an aggregate because it was involved in a throwing statement. The reason is that SRA is likely to need to append new statements after each one where a replaced aggregate is present, but throwing statements must end their BBs. This patch comes up with a fix for most such situations by adding these new statements onto a single successor non-EH edge, if there is one and only one such edge.

Thanks for working on this.

> I have bootstrapped and tested a very similar version on x86_64-linux, bootstrap and testing of this exact one is currently underway. OK for trunk?
>
> Eric, if and once this gets in, can you please add the testcase from your original post to the suite?

Reduced testcase attached, you can install it with the patch.

	* gnat.dg/opt34.adb: New test.
	* gnat.dg/opt34_pkg.ads: New helper.

-- 
Eric Botcazou

-- { dg-do compile }
-- { dg-options "-O -fdump-tree-esra" }

with Opt34_Pkg; use Opt34_Pkg;

procedure Opt34 is

   function Local_Fun (Arg : T_Private) return T_Private is
      Result : T_Private;
   begin

      case Get_Integer (Arg) is
         when 1 =>
            Result := Get_Private (100);
         when 2 =>
            Result := T_Private_Zero;
         when others =>
            null;
      end case;

      return Result;
   end Local_Fun;

begin
   Assert (Get_Integer (Local_Fun (Get_Private (1))) = 100);
end;

-- { dg-final { scan-tree-dump "Created a replacement for result" "esra" } }
-- { dg-final { cleanup-tree-dump "esra" } }

package Opt34_Pkg is

   type T_Private is record
      I : Integer := 0;
   end record;

   T_Private_Zero : constant T_Private := (I => 0);

   function Get_Private (I : Integer) return T_Private;
   function Get_Integer (X : T_Private) return Integer;

   procedure Assert (Cond : Boolean);

end Opt34_Pkg;
Re: [RFC] Enable virtual operands at -O0
On Tue, Apr 15, 2014 at 7:55 PM, Eric Botcazou ebotca...@adacore.com wrote:
>> ISTR some more ???/FIXMEs and/or special-casings we could remove with that. As followup, of course.
>
> It would be better to remove them all at once, so if you have specifics...

grepping for '[ \t!(]optimize[ )\$]' I find in tree-ssa-ter.c:

  /* Without alias info we can't move around loads.  */
  if (!optimize
      && gimple_assign_single_p (stmt)
      && !is_gimple_val (gimple_assign_rhs1 (stmt)))
    return false;

I think that's all I found, the above check can be safely removed after the patch (TER is disabled by default at -O0 thus this was a guard for a miscompile with -O0 -ftree-ter IIRC).

>> The single reason why we don't have virtual operands at -O0 is compile-time btw - SSA rewrite doesn't come for free. But I don't mind - still maybe a quick comparison of stage1-gcc compile-time with/without that patch would be interesting?
>
> 3m3.306s vs 3m3.041s for the 64-bit build of an earlier compiler version. The difference doesn't seem to be much more noticeable on big preprocessed files, e.g. combine.i or pt.i, but I'm not sure this means anything.

In theory we have much more convoluted CFGs at -O0 and more memory vars (we don't prune TREE_ADDRESSABLE at -O0) and thus PHI insertion and vop renaming will be comparably more expensive at -O0 than with optimization.

But given that virtual operands are pretty much an important correctness factor of GIMPLE omitting them constrains what utility functions and passes we can run at -O0. Thus removing that -O0 difference was always on my list ...

Which means - ok for trunk (with the above check in TER removed).

Thanks,
Richard.
Libstdc++ ABI transition (was: Re: [PATCH 1/3] libstdc++: Add time_get::get support.)
On 16 April 2014 06:58, Marc Glisse wrote:
> On Tue, 15 Apr 2014, Paolo Carlini wrote:
>> Anyway, the real issue is indeed that implementing those bits requires a new virtual function, and that would break the ABI.
>
> What is the status of the ABI half-break plan (abi_tag and such), necessary to get the remaining pieces of C++11?

It's happening. I'm working on it this week and hope to post patches for discussion pretty soon.

The plan is to default to new, abi_tagged versions of all classes that need an incompatible change. Users will be allowed to select the old version of classes using a single macro to get the old versions of everything, or more fine-grained macros to get the old locales/streams but new std::list, for example.

Any affected symbols exported from the library will need to be compiled twice and exported twice, using the old and new mangled names.
Re: RFA: Tighten checking for 'X' constraints
On Tue, Apr 15, 2014 at 09:53:16PM +0100, Richard Sandiford wrote:
> As Robert pointed out here:
>
>   http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00416.html
>
> we're a bit too eager when folding stuff into an 'X' constraint. The value at expand time is sensible, but after that asm_operand_ok allows arbitrary rtx expressions, including any number of registers as well as MEMs with unchecked addresses.
>
> This is a target-independent problem, as shown by the testcase below. Reload would give bogus impossible constraint in asm errors while LRA ICEs.
>
> Tested on x86_64-linux-gnu. OK to install?

But then what will be X good for compared to gin or similar? X constraint is meant for operands that aren't really needed, trying to print is a user error.

I guess the documentation agrees with this:

'X'
     Any operand whatsoever is allowed, even if it does not satisfy 'general_operand'. This is normally used in the constraint of a 'match_scratch' when certain alternatives will not actually require a scratch register.

So I think we should just error out if somebody tries to print something that satisfies X constraint.

	Jakub
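For context, the uncontroversial use of 'X' in user code is an asm operand that is named in the constraint list but never printed in the template. The following is a minimal sketch assuming a GCC-compatible compiler; the function name is hypothetical and it is not the testcase from the patch.

```c
#include <assert.h>

/* Mention VALUE as an asm input without ever printing it.  The "X"
   constraint accepts any operand, and the empty template contains no
   %0, so nothing about the operand has to be printable.  */
static int
keep_alive (int value)
{
  __asm__ volatile ("" : : "X" (value));
  return value;
}
```

The disagreement in the thread is about what should happen once such an operand is actually referenced in the template, which this sketch deliberately avoids.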
Re: [PATCH] Make SRA tolerate most throwing statements
On Tue, 15 Apr 2014, Martin Jambor wrote:

> Hi,
>
> back in January in http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00848.html Eric pointed out a testcase where the problem was SRA not scalarizing an aggregate because it was involved in a throwing statement. The reason is that SRA is likely to need to append new statements after each one where a replaced aggregate is present, but throwing statements must end their BBs. This patch comes up with a fix for most such situations by adding these new statements onto a single successor non-EH edge, if there is one and only one such edge.
>
> I have bootstrapped and tested a very similar version on x86_64-linux, bootstrap and testing of this exact one is currently underway. OK for trunk?
>
> Eric, if and once this gets in, can you please add the testcase from your original post to the suite?
>
> Thanks,
>
> Martin
>
> 2014-04-15  Martin Jambor  mjam...@suse.cz
>
>     * tree-sra.c (single_non_eh_succ): New function.
>     (disqualify_ops_if_throwing_stmt): Renamed to
>     disqualify_if_bad_bb_terminating_stmt.  Allow throwing statements
>     having one non-EH successor BB.
>     (gsi_for_eh_followups): New function.
>     (sra_modify_expr): If stmt ends bb, use single non-EH successor to
>     generate loads into replacements.
>     (sra_modify_assign): Likewise and also use the simple path for
>     such statements.
>     (sra_modify_function_body): Iterate safely over BBs.
>
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index ffef13d..4fd0f5e 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -1142,17 +1142,41 @@ build_access_from_expr (tree expr, gimple stmt, bool write)
>    return false;
>  }
>
> -/* Disqualify LHS and RHS for scalarization if STMT must end its basic block in
> -   modes in which it matters, return true iff they have been disqualified.  RHS
> -   may be NULL, in that case ignore it.  If we scalarize an aggregate in
> -   intra-SRA we may need to add statements after each statement.  This is not
> -   possible if a statement unconditionally has to end the basic block.  */
> +/* Return the single non-EH successor edge of BB or NULL if there is none or
> +   more than one.  */
> +
> +static edge
> +single_non_eh_succ (basic_block bb)
> +{
> +  edge e, res = NULL;
> +  edge_iterator ei;
> +
> +  FOR_EACH_EDGE (e, ei, bb->succs)
> +    if (!(e->flags & EDGE_EH))
> +      {
> +        if (res)
> +          return NULL;
> +        res = e;
> +      }
> +
> +  return res;
> +}
> +
> +/* Disqualify LHS and RHS for scalarization if STMT has to terminate its BB and
> +   there is no alternative spot where to put statements SRA might need to
> +   generate after it.  The spot we are looking for is an edge leading to a
> +   single non-EH successor, if it exists and is indeed single.  RHS may be
> +   NULL, in that case ignore it.  */
> +
>  static bool
> -disqualify_ops_if_throwing_stmt (gimple stmt, tree lhs, tree rhs)
> +disqualify_if_bad_bb_terminating_stmt (gimple stmt, tree lhs, tree rhs)
>  {
>    if ((sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA)
> -      && (stmt_can_throw_internal (stmt) || stmt_ends_bb_p (stmt)))
> +      && stmt_ends_bb_p (stmt))
>      {
> +      if (single_non_eh_succ (gimple_bb (stmt)))
> +        return false;
> +
>        disqualify_base_of_expr (lhs, "LHS of a throwing stmt.");
>        if (rhs)
>          disqualify_base_of_expr (rhs, "RHS of a throwing stmt.");
> @@ -1180,7 +1204,7 @@ build_accesses_from_assign (gimple stmt)
>    lhs = gimple_assign_lhs (stmt);
>    rhs = gimple_assign_rhs1 (stmt);
>
> -  if (disqualify_ops_if_throwing_stmt (stmt, lhs, rhs))
> +  if (disqualify_if_bad_bb_terminating_stmt (stmt, lhs, rhs))
>      return false;
>
>    racc = build_access_from_expr_1 (rhs, stmt, false);
> @@ -1319,7 +1343,7 @@ scan_function (void)
>                  }
>
>                t = gimple_call_lhs (stmt);
> -              if (t && !disqualify_ops_if_throwing_stmt (stmt, t, NULL))
> +              if (t && !disqualify_if_bad_bb_terminating_stmt (stmt, t, NULL))
>                  ret |= build_access_from_expr (t, stmt, true);
>                break;
>
> @@ -2734,6 +2758,19 @@ get_access_for_expr (tree expr)
>    return get_var_base_offset_size_access (base, offset, max_size);
>  }
>
> +/* Split the single non-EH successor edge from BB (there must be exactly one)
> +   and return a gimple iterator to the new block.  */
> +
> +static gimple_stmt_iterator
> +gsi_for_eh_followups (basic_block bb)
> +{
> +  edge e = single_non_eh_succ (bb);
> +  gcc_assert (e);
> +
> +  basic_block new_bb = split_edge (e);
> +  return gsi_start_bb (new_bb);
> +}
> +
>  /* Replace the expression EXPR with a scalar replacement if there is one and
>     generate other statements to do type conversion or subtree copying if
>     necessary.  GSI is used to place newly created statements, WRITE is true if
> @@ -2763,6 +2800,13 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
>    type = TREE_TYPE (*expr);
>
>    loc = gimple_location (gsi_stmt (*gsi));
> +  gimple_stmt_iterator
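
For readers without a GCC tree at hand, the edge selection in single_non_eh_succ can be sketched on a toy CFG. This is only an illustration: the toy_* types, the TOY_EDGE_EH flag and the array-based successor list are assumptions made for the example, not GCC's real edge/basic_block structures.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC's CFG types.  All toy_* names are hypothetical
   and only mirror the shape of the patch.  */
enum { TOY_EDGE_EH = 1 << 0 };

struct toy_edge
{
  int flags;   /* TOY_EDGE_EH marks an exception-handling edge.  */
  int dest;    /* Index of the destination block.  */
};

struct toy_block
{
  const struct toy_edge *succs;
  size_t n_succs;
};

/* Return the only non-EH successor edge of BB, or NULL if there is
   none or more than one.  */
static const struct toy_edge *
toy_single_non_eh_succ (const struct toy_block *bb)
{
  const struct toy_edge *res = NULL;

  assert (bb != NULL);
  for (size_t i = 0; i < bb->n_succs; i++)
    if (!(bb->succs[i].flags & TOY_EDGE_EH))
      {
        if (res)
          return NULL;          /* Second non-EH edge: ambiguous.  */
        res = &bb->succs[i];
      }

  return res;
}
```

A block ending in a throwing call with one fallthru edge and one EH edge yields the fallthru edge, which is where the follow-up statements could go; a block with two normal successors yields NULL.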
Re: RFA: Tighten checking for 'X' constraints
On Tue, Apr 15, 2014 at 1:53 PM, Richard Sandiford rdsandif...@googlemail.com wrote:
> As Robert pointed out here:
>
>     http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00416.html
>
> we're a bit too eager when folding stuff into an 'X' constraint. The value at expand time is sensible, but after that asm_operand_ok allows arbitrary rtx expressions, including any number of registers as well as MEMs with unchecked addresses.
>
> This is a target-independent problem, as shown by the testcase below. Reload would give bogus "impossible constraint in asm" errors while LRA ICEs.
>
> Tested on x86_64-linux-gnu. OK to install?

AARCH64 ran into something similar and we did a similar patch though rejecting only mems which are invalid:
http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00765.html
(http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01128.html)

Thanks,
Andrew Pinski

> Thanks,
> Richard
>
> gcc/
>     * recog.c (asm_operand_ok): Tighten MEM validity for 'X'.
>
> gcc/testsuite/
>     * gcc.dg/torture/asm-x-constraint-1.c: New test.
>
> Index: gcc/recog.c
> ===================================================================
> --- gcc/recog.c 2014-04-12 22:43:54.729854903 +0100
> +++ gcc/recog.c 2014-04-15 21:47:32.139873570 +0100
> @@ -1840,7 +1840,17 @@ asm_operand_ok (rtx op, const char *cons
>            break;
>
>          case 'X':
> -          result = 1;
> +          /* Although the asm itself doesn't impose any restrictions on
> +             the operand, we still need to restrict it to something that
> +             can be reloaded and printed.
> +
> +             MEM operands are always reloaded to make them legitimate,
> +             regardless of the constraint, so we need to handle them
> +             in the same way as for 'm' and 'g'.  Since 'X' is not treated
> +             as an address constraint, the only other valid operand types
> +             are constants and registers.  */
> +          result = (CONSTANT_P (op)
> +                    || general_operand (op, VOIDmode));
>            break;
>
>          case 'g':
> Index: gcc/testsuite/gcc.dg/torture/asm-x-constraint-1.c
> ===================================================================
> --- /dev/null 2014-04-15 08:10:27.294524132 +0100
> +++ gcc/testsuite/gcc.dg/torture/asm-x-constraint-1.c 2014-04-15 19:11:29.830962008 +0100
> @@ -0,0 +1,27 @@
> +void
> +noprop1 (int **x, int y, int z)
> +{
> +  int *ptr = *x + y * z / 11;
> +  __asm__ __volatile__ ("noprop1 %0" : : "X" (*ptr));
> +}
> +
> +void
> +noprop2 (int **x, int y, int z)
> +{
> +  int *ptr = *x + y * z / 11;
> +  __asm__ __volatile__ ("noprop2 %0" : : "X" (ptr));
> +}
> +
> +int *global_var;
> +
> +void
> +const1 (void)
> +{
> +  __asm__ __volatile__ ("const1 %0" : : "X" (global_var));
> +}
> +
> +void
> +const2 (void)
> +{
> +  __asm__ __volatile__ ("const2 %0" : : "X" (*global_var));
> +}
Re: [patch] Disable if_conversion2 for Og
On Wed, Apr 16, 2014 at 9:46 AM, Joey Ye joey...@arm.com wrote:
>> -----Original Message-----
>> From: Joey Ye [mailto:joey...@arm.com]
>> Sent: Tuesday, April 15, 2014 6:37 PM
>> To: 'Richard Biener'
>> Cc: GCC Patches
>> Subject: RE: [patch] Disable if_conversion2 for Og
>>
>>> Ok for trunk and branches after a while.
>>>
>>> Why does if-conversion not have the same problem? On the GIMPLE part we avoid all kinds of if-conversion with -Og.
>>
>> I think if-conversion should be disabled for Og too, but I don't have a case to show that it is harmful. If GIMPLE avoids all if-conversion, it is natural to do the same for RTL.
>>
>> I'll test and send another patch to also disable if-conversion.
>
> New patch tested with more regressions with -Og, which are expected.
> FAIL: gcc.target/arm/its.c scan-assembler-times \\tit 2
> FAIL: gcc.target/arm/pr40956.c scan-assembler-times mov[t ]*r., #0 1
> FAIL: gcc.target/arm/pr42835.c scan-assembler-times moveq[t ]*r.,[t ]*# 1
> FAIL: gcc.target/arm/pr42835.c scan-assembler-times movne[t ]*r.,[t ]*# 1
> FAIL: gcc.target/arm/shiftable.c scan-assembler sub.*[al]sl #6
> FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler asreq
> FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler lslne
> FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler asrne
> FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler lslne
> FAIL: gcc.target/arm/vseleqdf.c scan-assembler-times vseleq.f64\\td[0-9]+ 1
> FAIL: gcc.target/arm/vseleqsf.c scan-assembler-times vseleq.f32\\ts[0-9]+ 1
> FAIL: gcc.target/arm/vselgedf.c scan-assembler-times vselge.f64\\td[0-9]+ 1
> FAIL: gcc.target/arm/vselgesf.c scan-assembler-times vselge.f32\\ts[0-9]+ 1
> FAIL: gcc.target/arm/vselgtdf.c scan-assembler-times vselgt.f64\\td[0-9]+ 1
> FAIL: gcc.target/arm/vselgtsf.c scan-assembler-times vselgt.f32\\ts[0-9]+ 1
> FAIL: gcc.target/arm/vselledf.c scan-assembler-times vselgt.f64\\td[0-9]+ 1
> FAIL: gcc.target/arm/vsellesf.c scan-assembler-times vselgt.f32\\ts[0-9]+ 1
> FAIL: gcc.target/arm/vselltdf.c scan-assembler-times vselge.f64\\td[0-9]+ 1
> FAIL: gcc.target/arm/vselltsf.c scan-assembler-times vselge.f32\\ts[0-9]+ 1
> FAIL: gcc.target/arm/vselnedf.c scan-assembler-times vseleq.f64\\td[0-9]+ 1
> FAIL: gcc.target/arm/vselnesf.c scan-assembler-times vseleq.f32\\ts[0-9]+ 1
> FAIL: gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\\td[0-9]+ 1
> FAIL: gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\\ts[0-9]+ 1
> FAIL: gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\\td[0-9]+ 1
> FAIL: gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\\ts[0-9]+ 1
>
> OK?

I suppose the tests above are usually not run with -Og, so the patch won't regress regular testing.

Ok for trunk.

Thanks,
Richard.

> ChangeLog:
>     * opts.c (OPT_fif_conversion, OPT_fif_conversion2): Disable for Og.
>
> diff --git a/gcc/opts.c b/gcc/opts.c
> index fdc903f..3f3db1a 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -431,8 +431,8 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_1_PLUS, OPT_fguess_branch_probability, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fcprop_registers, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fforward_propagate, NULL, 1 },
> -    { OPT_LEVELS_1_PLUS, OPT_fif_conversion, NULL, 1 },
> -    { OPT_LEVELS_1_PLUS, OPT_fif_conversion2, NULL, 1 },
> +    { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fif_conversion, NULL, 1 },
> +    { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fif_conversion2, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fipa_pure_const, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fipa_reference, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fipa_profile, NULL, 1 },
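
How a table entry's level class decides whether an option is on by default can be sketched as follows. This is a simplified model under stated assumptions: the toy_* names, the string option keys and the (level, debug) pair are invented for the example, with -Og modelled as level 1 plus a debug flag; the real logic lives in opts.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of two of the level classes used in the patch.  */
enum toy_levels { TOY_LEVELS_1_PLUS, TOY_LEVELS_1_PLUS_NOT_DEBUG };

struct toy_default
{
  enum toy_levels levels;
  const char *opt;
};

/* State of the table after the patch, reduced to three entries.  */
static const struct toy_default toy_table[] = {
  { TOY_LEVELS_1_PLUS, "fforward-propagate" },
  { TOY_LEVELS_1_PLUS_NOT_DEBUG, "fif-conversion" },
  { TOY_LEVELS_1_PLUS_NOT_DEBUG, "fif-conversion2" },
};

/* Does an entry of class LEVELS apply at optimization LEVEL, where
   DEBUG is true for -Og?  */
static bool
toy_level_applies (enum toy_levels levels, int level, bool debug)
{
  switch (levels)
    {
    case TOY_LEVELS_1_PLUS:
      return level >= 1;
    case TOY_LEVELS_1_PLUS_NOT_DEBUG:
      return level >= 1 && !debug;
    }
  return false;
}

/* Look OPT up in the table and report whether it is on by default.  */
static bool
toy_enabled_by_default (const char *opt, int level, bool debug)
{
  assert (opt != NULL);
  for (size_t i = 0; i < sizeof toy_table / sizeof toy_table[0]; i++)
    if (strcmp (toy_table[i].opt, opt) == 0)
      return toy_level_applies (toy_table[i].levels, level, debug);
  return false;
}
```

With this model, moving an entry from the first class to the second leaves -O1 and above unchanged and turns the option off only for -Og.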
Re: [patch] Disable if_conversion2 for Og
On 15/04/14 02:59, Joey Ye wrote:
> If-conversion is harmful to optimized debugging as it generates conditional execution instructions with line number information, which gives developers the illusion that both the then and else branches are executed.
>
> For example:
> test.c:
>
> 1: unsigned oldest_sequence;
> 2:
> 3: unsigned foo(unsigned seq_number)
> 4: {
> 5:   if ((seq_number + 5) < 10)
> 6:     seq_number += 100;
> 7:   else
> 8:     seq_number = oldest_sequence;
>
>      if (seq_number oldest_sequence)
>        seq_number = oldest_sequence;
>
>      return seq_number;
>    }

Arguably, this is a bug in gdb. The debugger should understand when a breakpointed conditional instruction is not going to execute and silently continue. That preserves the illusion of not executing the code without requiring the compiler to de-optimize things.

R.

> $ arm-none-eabi-gcc -mthumb -mcpu=cortex-m3 -Og -g3
>
> gets:
>         .loc 1 5 0
>         adds    r3, r0, #5
>         cmp     r3, #9
>         .loc 1 6 0      <- line 6, then branch
>         itee    ls
>         addls   r0, r0, #100
> .LVL1:
>         .loc 1 8 0      <- line 8, else branch. Both branches seem to be executed in GDB
>         ldrhi   r3, .L5
>         ldrhi   r0, [r3]
>
> The reason is that if_conversion2 is still enabled in Og. The patch simply disables it for Og.
>
> Tests:
> * -Og bootstrap passed.
> * Make check default (no additional option): No regression.
> * Make check with -Og: expected regressions. Cases relying on if-conversion2 failed.
> FAIL: gcc.target/arm/its.c scan-assembler-times \\tit 2
> FAIL: gcc.target/arm/pr40956.c scan-assembler-times mov[t ]*r., #0 1
> FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler asreq
> FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler lslne
> FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler asrne
> FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler lslne
>
> OK to trunk and 4.8/4.9 branch?
>
> ChangeLog:
>     * opts.c (OPT_fif_conversion2): Disable for Og.
>
> diff --git a/gcc/opts.c b/gcc/opts.c
> index fdc903f..e076253 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -432,7 +432,7 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_1_PLUS, OPT_fcprop_registers, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fforward_propagate, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fif_conversion, NULL, 1 },
> -    { OPT_LEVELS_1_PLUS, OPT_fif_conversion2, NULL, 1 },
> +    { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fif_conversion2, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fipa_pure_const, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fipa_reference, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fipa_profile, NULL, 1 },
Re: [patch] Disable if_conversion2 for Og
On 16/04/14 10:30, Richard Biener wrote:
> On Wed, Apr 16, 2014 at 9:46 AM, Joey Ye joey...@arm.com wrote:
>>> -----Original Message-----
>>> From: Joey Ye [mailto:joey...@arm.com]
>>> Sent: Tuesday, April 15, 2014 6:37 PM
>>> To: 'Richard Biener'
>>> Cc: GCC Patches
>>> Subject: RE: [patch] Disable if_conversion2 for Og
>>>
>>>> Ok for trunk and branches after a while.
>>>>
>>>> Why does if-conversion not have the same problem? On the GIMPLE part we avoid all kinds of if-conversion with -Og.
>>>
>>> I think if-conversion should be disabled for Og too, but I don't have a case to show that it is harmful. If GIMPLE avoids all if-conversion, it is natural to do the same for RTL.
>>>
>>> I'll test and send another patch to also disable if-conversion.
>>
>> New patch tested with more regressions with -Og, which are expected.
>> FAIL: gcc.target/arm/its.c scan-assembler-times \\tit 2
>> FAIL: gcc.target/arm/pr40956.c scan-assembler-times mov[t ]*r., #0 1
>> FAIL: gcc.target/arm/pr42835.c scan-assembler-times moveq[t ]*r.,[t ]*# 1
>> FAIL: gcc.target/arm/pr42835.c scan-assembler-times movne[t ]*r.,[t ]*# 1
>> FAIL: gcc.target/arm/shiftable.c scan-assembler sub.*[al]sl #6
>> FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler asreq
>> FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler lslne
>> FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler asrne
>> FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler lslne
>> FAIL: gcc.target/arm/vseleqdf.c scan-assembler-times vseleq.f64\\td[0-9]+ 1
>> FAIL: gcc.target/arm/vseleqsf.c scan-assembler-times vseleq.f32\\ts[0-9]+ 1
>> FAIL: gcc.target/arm/vselgedf.c scan-assembler-times vselge.f64\\td[0-9]+ 1
>> FAIL: gcc.target/arm/vselgesf.c scan-assembler-times vselge.f32\\ts[0-9]+ 1
>> FAIL: gcc.target/arm/vselgtdf.c scan-assembler-times vselgt.f64\\td[0-9]+ 1
>> FAIL: gcc.target/arm/vselgtsf.c scan-assembler-times vselgt.f32\\ts[0-9]+ 1
>> FAIL: gcc.target/arm/vselledf.c scan-assembler-times vselgt.f64\\td[0-9]+ 1
>> FAIL: gcc.target/arm/vsellesf.c scan-assembler-times vselgt.f32\\ts[0-9]+ 1
>> FAIL: gcc.target/arm/vselltdf.c scan-assembler-times vselge.f64\\td[0-9]+ 1
>> FAIL: gcc.target/arm/vselltsf.c scan-assembler-times vselge.f32\\ts[0-9]+ 1
>> FAIL: gcc.target/arm/vselnedf.c scan-assembler-times vseleq.f64\\td[0-9]+ 1
>> FAIL: gcc.target/arm/vselnesf.c scan-assembler-times vseleq.f32\\ts[0-9]+ 1
>> FAIL: gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\\td[0-9]+ 1
>> FAIL: gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\\ts[0-9]+ 1
>> FAIL: gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\\td[0-9]+ 1
>> FAIL: gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\\ts[0-9]+ 1
>>
>> OK?
>
> I suppose the tests above are usually not run with -Og, so the patch won't regress regular testing.
>
> Ok for trunk.

I'd like to see these tests fixed to ensure that they're skipped if this goes in.

R.

> Thanks,
> Richard.
>
>> ChangeLog:
>>     * opts.c (OPT_fif_conversion, OPT_fif_conversion2): Disable for Og.
>>
>> diff --git a/gcc/opts.c b/gcc/opts.c
>> index fdc903f..3f3db1a 100644
>> --- a/gcc/opts.c
>> +++ b/gcc/opts.c
>> @@ -431,8 +431,8 @@ static const struct default_options default_options_table[] =
>>      { OPT_LEVELS_1_PLUS, OPT_fguess_branch_probability, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_fcprop_registers, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_fforward_propagate, NULL, 1 },
>> -    { OPT_LEVELS_1_PLUS, OPT_fif_conversion, NULL, 1 },
>> -    { OPT_LEVELS_1_PLUS, OPT_fif_conversion2, NULL, 1 },
>> +    { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fif_conversion, NULL, 1 },
>> +    { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fif_conversion2, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_fipa_pure_const, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_fipa_reference, NULL, 1 },
>>      { OPT_LEVELS_1_PLUS, OPT_fipa_profile, NULL, 1 },
Re: [PATCH] register CALL_INSN_FUNCTION_USAGE in find_all_hard_reg_sets
On 16-01-14 09:13, Richard Sandiford wrote:
> Tom de Vries tom_devr...@mentor.com writes:
>> * The set of registers which are clobbered during a call by things like the plt - these are not picked up by the use-caller-save optimization. We need the hook to inform the compiler about these registers
>
> Right, but...
>
>> * And finally, registers clobbered in the caller itself during a sequence of instructions implementing a function call. On mips, that's R6, which may be clobbered by the call. Normally that doesn't need mentioning in the RTL since it's a call_used_reg, but since use-caller-save might discover a set of registers for the called function that does not include R6, it becomes important to record this clobber explicitly. It could be represented in the RTL by a clobber on the insn, or a clobber in C_I_F_U. Or it could just be part of the registers returned by the hook - but that was previously deemed not acceptable (and it doesn't match the description of the hook).
>
> ...why do we need two different mechanisms to deal with these two? IMO the set recorded for the callee should contain what the callee instructions clobber and nothing else. CALL_INSN_FUNCTION_USAGE should contain everything clobbered by a call outside the callee, whether that's in the calling function itself, in a PLT, in a MIPS16 stub, or whatever.

Richard,

Is this what you mean?

This patch introduces a hook that specifies which registers are implicitly clobbered by a call, not including the registers that are clobbered in the called function, and then uses that hook to add those registers to CALL_INSN_FUNCTION_USAGE.

Thanks,
- Tom

2013-04-29  Radovan Obradovic  robrado...@mips.com
            Tom de Vries  t...@codesourcery.com

    * target.def (call_clobbered_regs): New DEFHOOK.
    * doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous Register
    Hooks to @menu.
    (@node Miscellaneous Register Hooks): New node.
    (@hook TARGET_CALL_CLOBBERED_REGS): New hook.
    * doc/tm.texi: Regenerate.
    * calls.c (expand_call, emit_library_call_value_1): Add regs in
    targetm.call_clobbered_regs to CALL_INSN_FUNCTION_USAGE.

diff --git a/gcc/calls.c b/gcc/calls.c
index e798c7a..edee262 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -3191,6 +3191,27 @@ expand_call (tree exp, rtx target, int ignore)
           add_reg_note (last, REG_CALL_DECL, datum);
         }

+      if (targetm.call_clobbered_regs != NULL)
+        {
+          struct hard_reg_set_container call_clobbered_regs;
+          rtx last = last_call_insn ();
+
+          CLEAR_HARD_REG_SET (call_clobbered_regs.set);
+          if (targetm.call_clobbered_regs (fndecl, &call_clobbered_regs))
+            {
+              hard_reg_set_iterator hrsi;
+              unsigned int i;
+              EXECUTE_IF_SET_IN_HARD_REG_SET (call_clobbered_regs.set, 0, i, hrsi)
+                {
+                  rtx reg = gen_rtx_REG (word_mode, i);
+                  CALL_INSN_FUNCTION_USAGE (last)
+                    = gen_rtx_EXPR_LIST (VOIDmode,
+                                         gen_rtx_CLOBBER (VOIDmode, reg),
+                                         CALL_INSN_FUNCTION_USAGE (last));
+                }
+            }
+        }
+
       /* If the call setup or the call itself overlaps with anything
          of the argument setup we probably clobbered our call address.
          In that case we can't do sibcalls.  */
@@ -4226,6 +4247,27 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
       add_reg_note (last, REG_CALL_DECL, datum);
     }

+  if (targetm.call_clobbered_regs != NULL)
+    {
+      struct hard_reg_set_container call_clobbered_regs;
+      rtx last = last_call_insn ();
+
+      CLEAR_HARD_REG_SET (call_clobbered_regs.set);
+      if (targetm.call_clobbered_regs (fndecl, &call_clobbered_regs))
+        {
+          hard_reg_set_iterator hrsi;
+          unsigned int i;
+          EXECUTE_IF_SET_IN_HARD_REG_SET (call_clobbered_regs.set, 0, i, hrsi)
+            {
+              rtx reg = gen_rtx_REG (word_mode, i);
+              CALL_INSN_FUNCTION_USAGE (last)
+                = gen_rtx_EXPR_LIST (VOIDmode,
+                                     gen_rtx_CLOBBER (VOIDmode, reg),
+                                     CALL_INSN_FUNCTION_USAGE (last));
+            }
+        }
+    }
+
   /* Right-shift returned value if necessary.  */
   if (!pcc_struct_value
       && TYPE_MODE (tfom) != BLKmode
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index b8ca17e..cd52f73 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -3091,6 +3091,7 @@ This describes the stack layout and calling conventions.
 * Profiling::
 * Tail Calls::
 * Stack Smashing Protection::
+* Miscellaneous Register Hooks::
 @end menu

 @node Frame Layout
@@ -5016,6 +5017,14 @@ normally defined in @file{libgcc2.c}.
 Whether this target supports splitting the stack when the options described in @var{opts} have been passed.  This is called after options have been parsed, so the target may reject splitting the stack in some configurations.  The default version of this hook returns false.  If @var{report} is true, this function may issue a warning or error; if @var{report} is false, it must simply return a value
 @end deftypefn

+@node Miscellaneous Register Hooks
[PING] [PATCH] register CALL_INSN_FUNCTION_USAGE in find_all_hard_reg_sets
On 15-01-14 17:53, Tom de Vries wrote:
> Eric,
>
> This patch adds scanning of clobbers in CALL_INSN_FUNCTION_USAGE to find_all_hard_reg_sets.
>
> For MIPS, calls are split at some point. After the split, one of the resulting insns may clobber $6. But before the split, that's not explicit in the rtl representation of the unsplit call.
>
> For -fuse-caller-save, that's a problem, and Richard S. suggested adding the clobber of $6 to the CALL_INSN_FUNCTION_USAGE of the unsplit call.
>
> I wrote a patch for that ( http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00730.html ), but found that doing so did not fix the problem with -fuse-caller-save, because find_all_hard_reg_sets (the mechanism -fuse-caller-save uses to detect which registers are set or clobbered) does not take CALL_INSN_FUNCTION_USAGE into account. This patch fixes that.
>
> Built and reg-tested on MIPS.
>
> OK for stage1 if x86_64 bootstrap and reg-test succeed?

Eric,

Ping of this ( http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00888.html ) patch.

Ok for stage1?

Thanks,
- Tom
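
The gap described in the quoted message can be illustrated with a small model. It is only a sketch under assumptions: registers are bits in a 32-bit mask, and the toy_call_insn type with its usage array is invented for the example rather than taken from GCC's rtl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one entry in a call's usage list.  */
enum toy_usage_kind { TOY_USE, TOY_CLOBBER };

struct toy_usage
{
  enum toy_usage_kind kind;
  unsigned regno;
};

/* Hypothetical call insn: registers set by the pattern itself, plus
   the side list of uses and clobbers attached to the call.  */
struct toy_call_insn
{
  uint32_t pattern_sets;
  const struct toy_usage *usage;
  size_t n_usage;
};

/* Return the mask of registers set or clobbered by INSN.  When
   SCAN_USAGE is zero, only the pattern is considered (the behaviour
   the message describes as insufficient); otherwise clobbers in the
   usage list are included as well.  */
static uint32_t
toy_hard_reg_sets (const struct toy_call_insn *insn, int scan_usage)
{
  uint32_t regs;

  assert (insn != NULL);
  regs = insn->pattern_sets;
  if (scan_usage)
    for (size_t i = 0; i < insn->n_usage; i++)
      if (insn->usage[i].kind == TOY_CLOBBER)
        {
          assert (insn->usage[i].regno < 32);
          regs |= UINT32_C (1) << insn->usage[i].regno;
        }
  return regs;
}
```

In this model a call that sets $2 in its pattern and carries a clobber of $6 in its usage list reports only $2 when the list is ignored, and both registers when it is scanned.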
RE: [patch] Disable if_conversion2 for Og
> -----Original Message-----
> From: Richard Earnshaw
> Sent: Wednesday, April 16, 2014 5:44 PM
> To: Joey Ye
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [patch] Disable if_conversion2 for Og
>
> Arguably, this is a bug in gdb. The debugger should understand when a breakpointed conditional instruction is not going to execute and silently continue. That preserves the illusion of not executing the code without requiring the compiler to de-optimize things.
>
> R.

Or the compiler just optimizes it and emits generic DWARFx information to help GDB handle it in a more target-independent way?

- Joey
Re: [patch] Disable if_conversion2 for Og
On 16/04/14 11:02, Joey Ye wrote:
>> -----Original Message-----
>> From: Richard Earnshaw
>> Sent: Wednesday, April 16, 2014 5:44 PM
>> To: Joey Ye
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: [patch] Disable if_conversion2 for Og
>>
>> Arguably, this is a bug in gdb. The debugger should understand when a breakpointed conditional instruction is not going to execute and silently continue. That preserves the illusion of not executing the code without requiring the compiler to de-optimize things.
>>
>> R.
>
> Or the compiler just optimizes it and emits generic DWARFx information to help GDB handle it in a more target-independent way?
>
> - Joey

I'm not sure extra dwarf info would help much. The debugger still has to understand that the breakpoint has not really been hit.

R.
Re: [PATCH v2] libstdc++: Add hexfloat/defaultfloat io manipulators.
On 16/04/14 14:26 +0900, Luke Allardyce wrote:
>>> Also the old standard seems to require that ios_base::fixed | ios_base::scientific (or any other combination) falls through to the uppercase test; I was trying to use abi_tag for a solution as not only would two versions of _S_format_float be necessary, but also num_get due to the pre-instantiated templates for char and wchar, which led me to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60642. It might just be more trouble than it's worth.
>>
>> I don't think we need to worry about that, if I understand correctly the combination of fixed|scientific has unspecified behaviour in C++03, so we can make our implementation do exactly what it does in C++11.
>
> It seems to me that it is well defined, going from [lib.facet.num.put.virtuals]
>
>     6 All tables used in describing stage 1 are ordered. That is, the first line whose condition is true applies. A line without a condition is the default behavior when none of the earlier lines apply.
>
> So fixed|scientific would be equivalent to specifying neither according to table 58, and the resulting specifier would be %g or %G depending on whether uppercase is set or not.

Thanks, I was wrong about that.

Then I think we should just bite the bullet and provide the new behaviour. If we do have an abi_tag on those types in the next release then we can preserve the old behaviour in the old ABI and use the C++11 semantics for the abi_tagged type, which will be used for both C++03 and C++11 code.

I am not too concerned that people who use a meaningless modifier in C++03 code get the C++11 behaviour. If they really want %g or %G then they shouldn't use fixed|scientific.
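
The ordered-table reading discussed above can be written out as a small selection function. This is a sketch only: the TOY_* flag values and the cxx11 switch are assumptions made for the example and are unrelated to libstdc++'s actual _S_format_float; the C++11 branch adds the %a/%A rows for fixed|scientific ahead of the uppercase fallback.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for the ios_base fmtflags.  */
enum
{
  TOY_FIXED = 1 << 0,
  TOY_SCIENTIFIC = 1 << 1,
  TOY_UPPERCASE = 1 << 2,
  TOY_FLOATFIELD = TOY_FIXED | TOY_SCIENTIFIC
};

/* Return the printf conversion character chosen for FLAGS.  The rows
   are tested in order and the first match wins.  With CXX11 false,
   fixed|scientific matches no floatfield row and falls through to the
   uppercase test; with CXX11 true it selects hexfloat.  */
static char
toy_float_conv (unsigned flags, bool cxx11)
{
  unsigned floatfield = flags & TOY_FLOATFIELD;
  bool upper = (flags & TOY_UPPERCASE) != 0;

  if (floatfield == TOY_FIXED)
    return 'f';
  if (floatfield == TOY_SCIENTIFIC)
    return upper ? 'E' : 'e';
  if (cxx11 && floatfield == TOY_FLOATFIELD)
    return upper ? 'A' : 'a';
  return upper ? 'G' : 'g';
}
```

Calling toy_float_conv with both floatfield bits set shows the behavioural difference under discussion: the older table gives %g or %G, the newer one %a or %A.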
Re: [patch] Disable if_conversion2 for Og
On 16/04/14 11:17, Joey Ye wrote:
>> -----Original Message-----
>> From: Richard Earnshaw
>> Sent: Wednesday, April 16, 2014 6:04 PM
>> To: Joey Ye
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: [patch] Disable if_conversion2 for Og
>>
>> On 16/04/14 11:02, Joey Ye wrote:
>>>> -----Original Message-----
>>>> From: Richard Earnshaw
>>>> Sent: Wednesday, April 16, 2014 5:44 PM
>>>> To: Joey Ye
>>>> Cc: gcc-patches@gcc.gnu.org
>>>> Subject: Re: [patch] Disable if_conversion2 for Og
>>>>
>>>> Arguably, this is a bug in gdb. The debugger should understand when a breakpointed conditional instruction is not going to execute and silently continue. That preserves the illusion of not executing the code without requiring the compiler to de-optimize things.
>>>>
>>>> R.
>>>
>>> Or the compiler just optimizes it and emits generic DWARFx information to help GDB handle it in a more target-independent way?
>>>
>>> - Joey
>>
>> I'm not sure extra dwarf info would help much. The debugger still has to understand that the breakpoint has not really been hit.
>>
>> R.
>
> Yes, it is inevitable. But without extra dwarf info it will be even more painful: each time a breakpoint is set or hit, it has to decode the breakpointed instruction and its context to search for conditional execution and IT blocks.

For thumb code it can get the conditional information it needs from the IT state in the PSR; for ARM code it has to look no further than the instruction itself.

R.
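
For the ARM-state half of that remark, the check a debugger would need is small. The sketch below is an illustration, not gdb code: the toy_* function names are made up, it assumes a 32-bit ARM encoding with the condition in bits 31:28 and the N, Z, C, V flags in bits 31..28 of the status register, and it treats the 0b1111 encoding as always executing. The Thumb case, where the condition comes from the IT state, is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Evaluate a 4-bit ARM condition code COND against the N, Z, C, V
   flags held in bits 31..28 of CPSR.  */
static bool
toy_arm_cond_passes (unsigned cond, uint32_t cpsr)
{
  bool n = (cpsr >> 31) & 1;
  bool z = (cpsr >> 30) & 1;
  bool c = (cpsr >> 29) & 1;
  bool v = (cpsr >> 28) & 1;

  assert (cond < 16);
  switch (cond)
    {
    case 0x0: return z;              /* EQ */
    case 0x1: return !z;             /* NE */
    case 0x2: return c;              /* CS/HS */
    case 0x3: return !c;             /* CC/LO */
    case 0x4: return n;              /* MI */
    case 0x5: return !n;             /* PL */
    case 0x6: return v;              /* VS */
    case 0x7: return !v;             /* VC */
    case 0x8: return c && !z;        /* HI */
    case 0x9: return !c || z;        /* LS */
    case 0xa: return n == v;         /* GE */
    case 0xb: return n != v;         /* LT */
    case 0xc: return !z && n == v;   /* GT */
    case 0xd: return z || n != v;    /* LE */
    default:  return true;           /* AL, and 0b1111 treated as always */
    }
}

/* Would the ARM-state instruction INSN execute given CPSR?  */
static bool
toy_arm_insn_will_execute (uint32_t insn, uint32_t cpsr)
{
  return toy_arm_cond_passes (insn >> 28, cpsr);
}
```

A debugger stopped on an instruction whose condition fails could use a test of this shape to decide to continue silently instead of reporting the breakpoint.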
RE: [patch] Disable if_conversion2 for Og
> -----Original Message-----
> From: Richard Earnshaw
> Sent: Wednesday, April 16, 2014 6:04 PM
> To: Joey Ye
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [patch] Disable if_conversion2 for Og
>
> On 16/04/14 11:02, Joey Ye wrote:
>>> -----Original Message-----
>>> From: Richard Earnshaw
>>> Sent: Wednesday, April 16, 2014 5:44 PM
>>> To: Joey Ye
>>> Cc: gcc-patches@gcc.gnu.org
>>> Subject: Re: [patch] Disable if_conversion2 for Og
>>>
>>> Arguably, this is a bug in gdb. The debugger should understand when a breakpointed conditional instruction is not going to execute and silently continue. That preserves the illusion of not executing the code without requiring the compiler to de-optimize things.
>>>
>>> R.
>>
>> Or the compiler just optimizes it and emits generic DWARFx information to help GDB handle it in a more target-independent way?
>>
>> - Joey
>
> I'm not sure extra dwarf info would help much. The debugger still has to understand that the breakpoint has not really been hit.
>
> R.

Yes, it is inevitable. But without extra dwarf info it will be even more painful: each time a breakpoint is set or hit, it has to decode the breakpointed instruction and its context to search for conditional execution and IT blocks.

- Joey
Re: [PATCH] register CALL_INSN_FUNCTION_USAGE in find_all_hard_reg_sets
Tom de Vries tom_devr...@mentor.com writes:
> On 16-01-14 09:13, Richard Sandiford wrote:
>> Tom de Vries tom_devr...@mentor.com writes:
>>> * The set of registers which are clobbered during a call by things like the plt - these are not picked up by the use-caller-save optimization. We need the hook to inform the compiler about these registers
>>
>> Right, but...
>>
>>> * And finally, registers clobbered in the caller itself during a sequence of instructions implementing a function call. On mips, that's R6, which may be clobbered by the call. Normally that doesn't need mentioning in the RTL since it's a call_used_reg, but since use-caller-save might discover a set of registers for the called function that does not include R6, it becomes important to record this clobber explicitly. It could be represented in the RTL by a clobber on the insn, or a clobber in C_I_F_U. Or it could just be part of the registers returned by the hook - but that was previously deemed not acceptable (and it doesn't match the description of the hook).
>>
>> ...why do we need two different mechanisms to deal with these two? IMO the set recorded for the callee should contain what the callee instructions clobber and nothing else. CALL_INSN_FUNCTION_USAGE should contain everything clobbered by a call outside the callee, whether that's in the calling function itself, in a PLT, in a MIPS16 stub, or whatever.
>
> Richard,
>
> Is this what you mean?
>
> This patch introduces a hook that specifies which registers are implicitly clobbered by a call, not including the registers that are clobbered in the called function, and then uses that hook to add those registers to CALL_INSN_FUNCTION_USAGE.

I don't think a new hook is needed. The call patterns should just add the registers to CALL_INSN_FUNCTION_USAGE when generating the call insn. MIPS already does this in some cases where normally-call-saved registers are actually clobbered:

      /* If we are handling a floating-point return value, we need to
         save $18 in the function prologue.  Putting a note on the call
         will mean that df_regs_ever_live_p ($18) will be true if the
         call is not eliminated, and we can check that in the prologue
         code.  */
      if (fp_ret_p)
        CALL_INSN_FUNCTION_USAGE (insn) =
          gen_rtx_EXPR_LIST (VOIDmode,
                             gen_rtx_CLOBBER (VOIDmode,
                                              gen_rtx_REG (word_mode, 18)),
                             CALL_INSN_FUNCTION_USAGE (insn));

Although we really should have a utility function like use_reg, but for clobbers, so that the above would become:

      clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn),
                   gen_rtx_REG (word_mode, 18));

Thanks,
Richard
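
The helper suggested above amounts to prepending a clobber entry to the call's usage list. A standalone sketch of that shape follows; it is not GCC code: the toy_usage_node list and all toy_* functions are assumptions for illustration, with a plain linked list standing in for the EXPR_LIST chain.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry of a call's usage list.  */
enum toy_usage_kind { TOY_USE, TOY_CLOBBER };

struct toy_usage_node
{
  enum toy_usage_kind kind;
  unsigned regno;
  struct toy_usage_node *next;
};

/* Prepend an entry of KIND for register REGNO to *LIST.  */
static void
toy_add_usage (struct toy_usage_node **list, enum toy_usage_kind kind,
               unsigned regno)
{
  struct toy_usage_node *node;

  assert (list != NULL);
  node = malloc (sizeof *node);
  assert (node != NULL);
  node->kind = kind;
  node->regno = regno;
  node->next = *list;
  *list = node;
}

/* Record that the call uses REGNO.  */
static void
toy_use_reg (struct toy_usage_node **list, unsigned regno)
{
  toy_add_usage (list, TOY_USE, regno);
}

/* Record that the call clobbers REGNO; the counterpart asked for in
   the message above.  */
static void
toy_clobber_reg (struct toy_usage_node **list, unsigned regno)
{
  toy_add_usage (list, TOY_CLOBBER, regno);
}

/* Release every node in LIST.  */
static void
toy_free_usage (struct toy_usage_node *list)
{
  while (list)
    {
      struct toy_usage_node *next = list->next;
      free (list);
      list = next;
    }
}
```

The point of the wrapper is only that call sites read as one line, as in the clobber_reg call shown in the message, instead of repeating the list construction by hand.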
RE: [patch] Disable if_conversion2 for Og
> -----Original Message-----
> From: Richard Earnshaw
> Sent: Wednesday, April 16, 2014 6:21 PM
> To: Joey Ye
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [patch] Disable if_conversion2 for Og
>
> On 16/04/14 11:17, Joey Ye wrote:
>>> -----Original Message-----
>>> From: Richard Earnshaw
>>> Sent: Wednesday, April 16, 2014 6:04 PM
>>> To: Joey Ye
>>> Cc: gcc-patches@gcc.gnu.org
>>> Subject: Re: [patch] Disable if_conversion2 for Og
>>>
>>> On 16/04/14 11:02, Joey Ye wrote:
>>>>> -----Original Message-----
>>>>> From: Richard Earnshaw
>>>>> Sent: Wednesday, April 16, 2014 5:44 PM
>>>>> To: Joey Ye
>>>>> Cc: gcc-patches@gcc.gnu.org
>>>>> Subject: Re: [patch] Disable if_conversion2 for Og
>>>>>
>>>>> Arguably, this is a bug in gdb. The debugger should understand when a breakpointed conditional instruction is not going to execute and silently continue. That preserves the illusion of not executing the code without requiring the compiler to de-optimize things.
>>>>>
>>>>> R.
>>>>
>>>> Or the compiler just optimizes it and emits generic DWARFx information to help GDB handle it in a more target-independent way?
>>>>
>>>> - Joey
>>>
>>> I'm not sure extra dwarf info would help much. The debugger still has to understand that the breakpoint has not really been hit.
>>>
>>> R.
>>
>> Yes, it is inevitable. But without extra dwarf info it will be even more painful: each time a breakpoint is set or hit, it has to decode the breakpointed instruction and its context to search for conditional execution and IT blocks.
>
> For thumb code it can get the conditional information it needs from the IT state in the PSR; for ARM code it has to look no further than the instruction itself.
>
> R.

The thing is, the debugger has to do this for every breakpoint, even though most of them are not conditionally executed, which isn't efficient.
Re: [patch] Disable if_conversion2 for Og
On 16/04/14 11:30, Joey Ye wrote:
>> -----Original Message-----
>> From: Richard Earnshaw
>> Sent: Wednesday, April 16, 2014 6:21 PM
>> To: Joey Ye
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: [patch] Disable if_conversion2 for Og
>>
>> On 16/04/14 11:17, Joey Ye wrote:
>>>> -----Original Message-----
>>>> From: Richard Earnshaw
>>>> Sent: Wednesday, April 16, 2014 6:04 PM
>>>> To: Joey Ye
>>>> Cc: gcc-patches@gcc.gnu.org
>>>> Subject: Re: [patch] Disable if_conversion2 for Og
>>>>
>>>> On 16/04/14 11:02, Joey Ye wrote:
>>>>>> -----Original Message-----
>>>>>> From: Richard Earnshaw
>>>>>> Sent: Wednesday, April 16, 2014 5:44 PM
>>>>>> To: Joey Ye
>>>>>> Cc: gcc-patches@gcc.gnu.org
>>>>>> Subject: Re: [patch] Disable if_conversion2 for Og
>>>>>>
>>>>>> Arguably, this is a bug in gdb. The debugger should understand when a breakpointed conditional instruction is not going to execute and silently continue. That preserves the illusion of not executing the code without requiring the compiler to de-optimize things.
>>>>>>
>>>>>> R.
>>>>>
>>>>> Or the compiler just optimizes it and emits generic DWARFx information to help GDB handle it in a more target-independent way?
>>>>>
>>>>> - Joey
>>>>
>>>> I'm not sure extra dwarf info would help much. The debugger still has to understand that the breakpoint has not really been hit.
>>>>
>>>> R.
>>>
>>> Yes, it is inevitable. But without extra dwarf info it will be even more painful: each time a breakpoint is set or hit, it has to decode the breakpointed instruction and its context to search for conditional execution and IT blocks.
>>
>> For thumb code it can get the conditional information it needs from the IT state in the PSR; for ARM code it has to look no further than the instruction itself.
>>
>> R.
>
> The thing is, the debugger has to do this for every breakpoint, even though most of them are not conditionally executed, which isn't efficient.

Then cache whether the instruction might be conditional when it's created. That's more work when creating the bp (and you do have to scan back to check if a thumb instruction might be made conditional with IT), but would save time when hitting it.

Anyway, this is getting off-topic for the GCC-patches list.

R.
Re: RFA: Tighten checking for 'X' constraints
Jakub Jelinek ja...@redhat.com writes: On Tue, Apr 15, 2014 at 09:53:16PM +0100, Richard Sandiford wrote: As Robert pointed out here: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00416.html we're a bit too eager when folding stuff into an 'X' constraint. The value at expand time is sensible, but after that asm_operand_ok allows arbitrary rtx expressions, including any number of registers as well as MEMs with unchecked addresses. This is a target-independent problem, as shown by the testcase below. Reload would give bogus impossible constraint in asm errors while LRA ICEs. Tested on x86_64-linux-gnu. OK to install? But then what will be X good for compared to gin or similar? X constraint is meant for operands that aren't really needed, trying to print is a user error. I guess the documentation agrees with this: 'X' Any operand whatsoever is allowed, even if it does not satisfy 'general_operand'. This is normally used in the constraint of a 'match_scratch' when certain alternatives will not actually require a scratch register. So I think we should just error out if somebody tries to print something that satisfies X constraint. That's the internal documentation though, whereas here we're checking asm uses. The documentation for asms just says Any operand whatsoever is allowed. It doesn't say anything about it being unprintable. I just added the printing side for completeness though. It wasn't the point of the patch. Like I say, the point is that LRA ICEs (on x86_64) because it can't reload the operands. Reload couldn't reload them either but raised impossible constraint in asm errors instead of internal errors. (IMO those errors were also invalid though. If X allows any operand whatsoever, how can the operands in the testcase be invalid?) X was defined against reload, which always reloaded MEM addresses to follow the appropriate base and index register classes. 
This was done as a first pass before matching against the constraints:

  /* Examine each operand that is a memory reference or memory address
     and reload parts of the addresses into index registers.
     Also here any references to pseudo regs that didn't get hard regs
     but are equivalent to constants get replaced in the insn itself
     with those constants.  Nobody will ever see them again.

     Finally, set up the preferred classes of each operand.  */

  for (i = 0; i < noperands; i++)
    {
      ...
      else if (code == MEM)
        {
          address_reloaded[i]
            = find_reloads_address (GET_MODE (recog_data.operand[i]),
                                    recog_data.operand_loc[i],
                                    XEXP (recog_data.operand[i], 0),
                                    &XEXP (recog_data.operand[i], 0),
                                    i, address_type[i], ind_levels, insn);
          recog_data.operand[i] = *recog_data.operand_loc[i];
          substed_operand[i] = recog_data.operand[i];

So I don't think it has ever been the case that X allowed MEMs with arbitrary expressions as the address. IMO the point of X (as implied by the doc you quoted) is that it allows (scratch) operands to be kept as (scratch)s in cases where no scratch register is needed. Thanks, Richard
Re: RFA: Tighten checking for 'X' constraints
Andrew Pinski pins...@gmail.com writes: On Tue, Apr 15, 2014 at 1:53 PM, Richard Sandiford rdsandif...@googlemail.com wrote: As Robert pointed out here: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00416.html we're a bit too eager when folding stuff into an 'X' constraint. The value at expand time is sensible, but after that asm_operand_ok allows arbitrary rtx expressions, including any number of registers as well as MEMs with unchecked addresses. This is a target-independent problem, as shown by the testcase below. Reload would give bogus impossible constraint in asm errors while LRA ICEs. Tested on x86_64-linux-gnu. OK to install? AARCH64 ran into something similar and we did a similar patch though rejecting only mems which are invalid: http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00765.html (http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01128.html) Sorry, missed that. I went for the same thing at first, but the second example in the testcase shows that it's needed for non-MEM operands too. Thanks, Richard
[PATCH, Pointer Bounds Checker 1/x] Pointer bounds type and mode
Hi, This patch restarts the series for introducing Pointer Bounds Checker instrumentation and supporting Intel Memory Protection Extension (MPX) technology. Detailed description is on GCC Wiki page: http://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler. The first patch introduces pointer bounds type and mode. It was approved earlier for 4.9 and had no significant changes since then. I'll assume patch is OK if no objections arise. Patch was bootstrapped and tested for linux-x86_64. Thanks, Ilya -- gcc/ 2014-04-16 Ilya Enkovich ilya.enkov...@intel.com * mode-classes.def (MODE_POINTER_BOUNDS): New. * tree.def (POINTER_BOUNDS_TYPE): New. * genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS. (POINTER_BOUNDS_MODE): New. (make_pointer_bounds_mode): New. * machmode.h (POINTER_BOUNDS_MODE_P): New. * stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS. (layout_type): Support POINTER_BOUNDS_TYPE. * tree-pretty-print.c (dump_generic_node): Support POINTER_BOUNDS_TYPE. * tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE. (type_contains_placeholder_1): Likewise. * tree.h (POINTER_BOUNDS_TYPE_P): New. * varasm.c (output_constant): Support POINTER_BOUNDS_TYPE. * doc/rtl.texi (MODE_POINTER_BOUNDS): New. diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi index 20b7187..3a1014d 100644 --- a/gcc/doc/rtl.texi +++ b/gcc/doc/rtl.texi @@ -1382,6 +1382,12 @@ any @code{CC_MODE} modes listed in the @file{@var{machine}-modes.def}. @xref{Jump Patterns}, also see @ref{Condition Code}. +@findex MODE_POINTER_BOUNDS +@item MODE_POINTER_BOUNDS +Pointer bounds modes. Used to represent values of pointer bounds type. +Operations in these modes may be executed as NOPs depending on hardware +features and environment setup. 
+ @findex MODE_RANDOM @item MODE_RANDOM This is a catchall mode class for modes which don't fit into the above diff --git a/gcc/genmodes.c b/gcc/genmodes.c index 8cc3cde..9d0b413 100644 --- a/gcc/genmodes.c +++ b/gcc/genmodes.c @@ -333,6 +333,7 @@ complete_mode (struct mode_data *m) break; case MODE_INT: +case MODE_POINTER_BOUNDS: case MODE_FLOAT: case MODE_DECIMAL_FLOAT: case MODE_FRACT: @@ -534,6 +535,19 @@ make_special_mode (enum mode_class cl, const char *name, new_mode (cl, name, file, line); } +#define POINTER_BOUNDS_MODE(N, Y) \ + make_pointer_bounds_mode (#N, Y, __FILE__, __LINE__) + +static void ATTRIBUTE_UNUSED +make_pointer_bounds_mode (const char *name, + unsigned int bytesize, + const char *file, unsigned int line) +{ + struct mode_data *m = new_mode (MODE_POINTER_BOUNDS, name, file, line); + m-bytesize = bytesize; +} + + #define INT_MODE(N, Y) FRACTIONAL_INT_MODE (N, -1U, Y) #define FRACTIONAL_INT_MODE(N, B, Y) \ make_int_mode (#N, B, Y, __FILE__, __LINE__) diff --git a/gcc/machmode.h b/gcc/machmode.h index bc5d901..cbe5042 100644 --- a/gcc/machmode.h +++ b/gcc/machmode.h @@ -174,6 +174,9 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES]; || CLASS == MODE_ACCUM \ || CLASS == MODE_UACCUM) +#define POINTER_BOUNDS_MODE_P(MODE) \ + (GET_MODE_CLASS (MODE) == MODE_POINTER_BOUNDS) + /* Get the size in bytes and bits of an object of mode MODE. */ extern CONST_MODE_SIZE unsigned char mode_size[NUM_MACHINE_MODES]; diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def index 9c6a8bb..b645484 100644 --- a/gcc/mode-classes.def +++ b/gcc/mode-classes.def @@ -22,6 +22,7 @@ along with GCC; see the file COPYING3. 
If not see DEF_MODE_CLASS (MODE_CC),/* condition code in a register */ \ DEF_MODE_CLASS (MODE_INT), /* integer */ \ DEF_MODE_CLASS (MODE_PARTIAL_INT), /* integer with padding bits */\ + DEF_MODE_CLASS (MODE_POINTER_BOUNDS), /* bounds */ \ DEF_MODE_CLASS (MODE_FRACT), /* signed fractional number */ \ DEF_MODE_CLASS (MODE_UFRACT),/* unsigned fractional number */ \ DEF_MODE_CLASS (MODE_ACCUM), /* signed accumulator */ \ diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c index 084d195..af0ab88 100644 --- a/gcc/stor-layout.c +++ b/gcc/stor-layout.c @@ -386,6 +386,7 @@ int_mode_for_mode (enum machine_mode mode) case MODE_VECTOR_ACCUM: case MODE_VECTOR_UFRACT: case MODE_VECTOR_UACCUM: +case MODE_POINTER_BOUNDS: mode = mode_for_size (GET_MODE_BITSIZE (mode), MODE_INT, 0); break; @@ -2124,6 +2125,11 @@ layout_type (tree type) SET_TYPE_MODE (type, VOIDmode); break; +case POINTER_BOUNDS_TYPE: + TYPE_SIZE (type) = bitsize_int (GET_MODE_BITSIZE (TYPE_MODE (type))); + TYPE_SIZE_UNIT (type) = size_int (GET_MODE_SIZE (TYPE_MODE (type))); + break; + case OFFSET_TYPE: TYPE_SIZE (type) =
Re: [PATCH] register CALL_INSN_FUNCTION_USAGE in find_all_hard_reg_sets
On 16/04/14 12:28, Richard Sandiford wrote: This patch introduces a hook that specifies which registers are implicitly clobbered by a call, not including the registers that are clobbered in the called function, and then uses that hook to add those registers to CALL_INSN_FUNCTION_USAGE. I don't think a new hook is needed. Richard, the hook enables us to determine whether a target supplies the information provided by the hook. If the target does not provide this information, the fuse-caller-save optimization is possibly not safe. How do you propose to handle this without this hook? Apart from that, I don't see the reason why we should add similar code to several targets, if we can add a hook that specifies information about the target, and add generic code that handles the information. Thanks, - Tom
Re: [PATCH] register CALL_INSN_FUNCTION_USAGE in find_all_hard_reg_sets
On Wed, Apr 16, 2014 at 11:46:14AM +0200, Tom de Vries wrote: ...why do we need two different mechanisms to deal with these two? IMO the set recorded for the callee should contain what the callee instructions clobber and nothing else. CALL_INSN_FUNCTION_USAGE should contain everything clobbered by a call outside the callee, whether that's in the calling function itself, in a PLT, in a MIPS16 stub, or whatever. Always putting all call clobbered registers to C_I_F_U explicitly can be a serious memory hog on some architectures, e.g. doesn't ia64 have ~ 160 call clobbered hard registers, times number of calls in a function (sometimes tens of thousands)? Jakub
Re: [PATCH 1/3] libstdc++: Add time_get::get support.
On Tuesday 15 April 2014 23:36:51 Paolo Carlini wrote: Those should be isolated and a compiler bug report opened including a minimized reproducer. I'm not sure if this is a compiler bug or simply due to the fact that I didn't add the virtual function to the ABI linker script. Anyway, the real issue is indeed that implementing those bits requires a new virtual function, and that would break the ABI. We do have a bug in Bugzilla tracking the issue. But now that 4.9 has branched, isn't it acceptable to extend the ABI? Regards, Rüdiger
[PATCH, i386, Pointer Bounds Checker 2/x] Intel Memory Protection Extensions (MPX) instructions support
Hi, This patch introduces Intel MPX bound registers and instructions. It was approved earlier for 4.9 and had no significant changes since then. I'll assume patch is OK if no objections arise. Patch was bootstrapped and tested for linux-x86_64. Thanks, Ilya -- gcc/ 2014-04-16 Ilya Enkovich ilya.enkov...@intel.com * mode-classes.def (MODE_BOUND): New. * tree.def (BOUND_TYPE): New. * genmodes.c (complete_mode): Support MODE_BOUND. (BOUND_MODE): New. (make_bound_mode): New. * machmode.h (BOUND_MODE_P): New. * stor-layout.c (int_mode_for_mode): Support MODE_BOUND. (layout_type): Support BOUND_TYPE. * tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE. * tree.c (build_int_cst_wide): Support BOUND_TYPE. (type_contains_placeholder_1): Likewise. * tree.h (BOUND_TYPE_P): New. * varasm.c (output_constant): Support BOUND_TYPE. * config/i386/constraints.md (B): New. (Ti): New. (Tb): New. * config/i386/i386-modes.def (BND32): New. (BND64): New. * config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New. * config/i386/i386.c (isa_opts): Add mmpx. (regclass_map): Add bound registers. (dbx_register_map): Likewise. (dbx64_register_map): Likewise. (svr4_dbx_register_map): Likewise. (PTA_MPX): New. (ix86_option_override_internal) Support MPX ISA. (ix86_code_end): Add MPX bnd prefix. (output_set_got): Likewise. (ix86_output_call_insn): Likewise. (get_some_local_dynamic_name): Add '!' (MPX bnd) print prefix support. (ix86_print_operand_punct_valid_p): Likewise. (ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and UNSPEC_BNDMK_ADDR. (ix86_class_likely_spilled_p): Add bound regs support. (ix86_hard_regno_mode_ok): Likewise. (x86_order_regs_for_local_alloc): Likewise. (ix86_bnd_prefixed_insn_p): New. * config/i386/i386.h (FIRST_PSEUDO_REGISTER): Fix to new value. (FIXED_REGISTERS): Add bound registers. (CALL_USED_REGISTERS): Likewise. (REG_ALLOC_ORDER): Likewise. (HARD_REGNO_NREGS): Likewise. (TARGET_MPX): New. (VALID_BND_REG_MODE): New. (FIRST_BND_REG): New. 
(LAST_BND_REG): New. (reg_class): Add BND_REGS. (REG_CLASS_NAMES): Likewise. (REG_CLASS_CONTENTS): Likewise. (BND_REGNO_P): New. (ANY_BND_REG_P): New. (BNDmode): New. (HI_REGISTER_NAMES): Add bound registers. * config/i386/i386.md (UNSPEC_BNDMK): New. (UNSPEC_BNDMK_ADDR): New. (UNSPEC_BNDSTX): New. (UNSPEC_BNDLDX): New. (UNSPEC_BNDLDX_ADDR): New. (UNSPEC_BNDCL): New. (UNSPEC_BNDCU): New. (UNSPEC_BNDCN): New. (UNSPEC_MPX_FENCE): New. (BND0_REG): New. (BND1_REG): New. (type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst. (length_immediate): Likewise. (prefix_0f): Likewise. (memory): Likewise. (prefix_rep): Check for bnd prefix. (BND): New. (bnd_ptr): New. (BNDCHECK): New. (bndcheck): New. (*jcc_1): Add MPX bnd prefix and fix length. (*jcc_2): Likewise. (jump): Likewise. (simple_return_internal): Likewise. (simple_return_pop_internal): Likewise. (*indirect_jump): Add MPX bnd prefix. (*tablejump_1): Likewise. (simple_return_internal_long): Likewise. (simple_return_indirect_internal): Likewise. (mode_mk): New. (*mode_mk): New. (movmode): New. (*movmode_internal_mpx): New. (mode_bndcheck): New. (*mode_bndcheck): New. (mode_ldx): New. (*mode_ldx): New. (mode_stx): New. (*mode_stx): New. * config/i386/predicates.md (lea_address_operand) Rename to... (address_no_seg_operand): ... this. (address_mpx_no_base_operand): New. (address_mpx_no_index_operand): New. (bnd_mem_operator): New. * config/i386/i386.opt (mmpx): New. diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md index 567e705..3cd7e43 100644 --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -18,7 +18,7 @@ ;; http://www.gnu.org/licenses/. ;;; Unused letters: -;;; B H +;;; H ;;; h j ;; Integer register constraints. @@ -91,6 +91,9 @@ (define_register_constraint x TARGET_SSE ? SSE_REGS : NO_REGS Any SSE register.) +(define_register_constraint B TARGET_MPX ? BND_REGS : NO_REGS + @internal Any bound register.) 
+ ;; We use the Y prefix to denote any number of conditional register sets: ;; z First SSE register. ;; i SSE2 inter-unit moves to SSE register enabled @@ -243,6 +246,8 @@ ;; T prefix is used for different address constraints ;; v -
RE: [PATCH] Add a new option -fmerge-bitfields (patch / doc inside)
Hello, This is new patch version. Lowering is applied only for bit-fields copy sequences that are merged. Data structure representing bit-field copy sequences is renamed and reduced in size. Optimization turned on by default for -O2 and higher. Some comments fixed. Benchmarking performed on WebKit for Android. Code size reduction noticed on several files, best examples are: core/rendering/style/StyleMultiColData (632-520 bytes) core/platform/graphics/FontDescription (1715-1475 bytes) core/rendering/style/FillLayer (5069-4513 bytes) core/rendering/style/StyleRareInheritedData (5618-5346) core/css/CSSSelectorList(4047-3887) core/platform/animation/CSSAnimationData (3844-3440 bytes) core/css/resolver/FontBuilder (13818-13350 bytes) core/platform/graphics/Font (16447-15975 bytes) Example: One of the motivating examples for this work was copy constructor of the class which contains bit-fields. C++ code: class A { public: A(const A x); unsigned a : 1; unsigned b : 2; unsigned c : 4; }; A::A(const Ax) { a = x.a; b = x.b; c = x.c; } GIMPLE code without optimization: bb 2: _3 = x_2(D)-a; this_4(D)-a = _3; _6 = x_2(D)-b; this_4(D)-b = _6; _8 = x_2(D)-c; this_4(D)-c = _8; return; Optimized GIMPLE code: bb 2: _10 = x_2(D)-D.1867; _11 = BIT_FIELD_REF _10, 7, 0; _12 = this_4(D)-D.1867; _13 = _12 128; _14 = (unsigned char) _11; _15 = _13 | _14; this_4(D)-D.1867 = _15; return; Generated MIPS32r2 assembly code without optimization: lw $3,0($5) lbu $2,0($4) andi$3,$3,0x1 andi$2,$2,0xfe or $2,$2,$3 sb $2,0($4) lw $3,0($5) andi$2,$2,0xf9 andi$3,$3,0x6 or $2,$2,$3 sb $2,0($4) lw $3,0($5) andi$2,$2,0x87 andi$3,$3,0x78 or $2,$2,$3 j $31 sb $2,0($4) Optimized MIPS32r2 assembly code: lw $3,0($5) lbu $2,0($4) andi$3,$3,0x7f andi$2,$2,0x80 or $2,$3,$2 j $31 sb $2,0($4) Algorithm works on basic block level and consists of following 3 major steps: 1. Go through basic block statements list. 
If there are statement pairs that implement copy of bit field content from one memory location to another record statements pointers and other necessary data in corresponding data structure. 2. Identify records that represent adjacent bit field accesses and mark them as merged. 3. Lower bit-field accesses by using new field size for those that can be merged. New command line option -fmerge-bitfields is introduced. Tested - passed gcc regression tests for MIPS32r2. Changelog - gcc/ChangeLog: 2014-04-16 Zoran Jovanovic (zoran.jovano...@imgtec.com) * common.opt (fmerge-bitfields): New option. * doc/invoke.texi: Add reference to -fmerge-bitfields. * tree-sra.c (lower_bitfields): New function. Entry for (-fmerge-bitfields). (bf_access_candidate_p): New function. (lower_bitfield_read): New function. (lower_bitfield_write): New function. (bitfield_stmt_bfcopy_pair::hash): New function. (bitfield_stmt_bfcopy_pair::equal): New function. (bitfield_stmt_bfcopy_pair::remove): New function. (create_and_insert_bfcopy): New function. (get_bit_offset): New function. (add_stmt_bfcopy_pair): New function. (cmp_bfcopies): New function. (get_merged_bit_field_size): New function. * dwarf2out.c (simple_type_size_in_bits): Move to tree.c. (field_byte_offset): Move declaration to tree.h and make it extern. * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test. * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test. * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c. * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h. * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c. (simple_type_size_in_bits): Move from dwarf2out.c. * tree.h (expressions_equal_p): Add declaration. (field_byte_offset): Add declaration. 
Patch - diff --git a/gcc/common.opt b/gcc/common.opt index da275e5..52c7f58 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2203,6 +2203,10 @@ ftree-sra Common Report Var(flag_tree_sra) Optimization Perform scalar replacement of aggregates +fmerge-bitfields +Common Report Var(flag_tree_bitfield_merge) Optimization +Merge loads and stores of consecutive bitfields + ftree-ter Common Report Var(flag_tree_ter) Optimization Replace temporary expressions in the SSA-normal pass diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 3fdfeb9..546638e 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -411,7 +411,7 @@ Objective-C and Objective-C++ Dialects}. -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
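The merging described in the -fmerge-bitfields message above can be sketched in portable C. This is only an illustration under stated assumptions, not the tree-sra.c implementation: the bit-fields a, b and c are modelled with explicit masks (0x01, 0x06, 0x78, matching the andi masks in the MIPS listing), and the function names are made up for this sketch. It shows that three per-field read-modify-write copies leave the same result as one copy over the combined 0x7f mask, which is the property the optimization relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout for illustration: three adjacent fields packed
   into the low 7 bits of one byte (a: 1 bit, b: 2 bits, c: 4 bits).  */
#define A_MASK 0x01u
#define B_MASK 0x06u
#define C_MASK 0x78u
#define MERGED_MASK (A_MASK | B_MASK | C_MASK)  /* 0x7f */

/* Field-by-field copy: one read-modify-write per bit-field.  */
static uint8_t
copy_fieldwise (uint8_t dst, uint8_t src)
{
  dst = (uint8_t) ((dst & ~A_MASK) | (src & A_MASK));
  dst = (uint8_t) ((dst & ~B_MASK) | (src & B_MASK));
  dst = (uint8_t) ((dst & ~C_MASK) | (src & C_MASK));
  return dst;
}

/* Merged copy: a single read-modify-write over the combined mask.  */
static uint8_t
copy_merged (uint8_t dst, uint8_t src)
{
  return (uint8_t) ((dst & ~MERGED_MASK) | (src & MERGED_MASK));
}

/* Returns 1 if both strategies agree for every dst/src byte pair.  */
static int
strategies_agree (void)
{
  for (unsigned d = 0; d < 256; d++)
    for (unsigned s = 0; s < 256; s++)
      if (copy_fieldwise ((uint8_t) d, (uint8_t) s)
          != copy_merged ((uint8_t) d, (uint8_t) s))
        return 0;
  return 1;
}
```

The bit outside the merged mask (0x80 here) is preserved by both versions, which corresponds to the `andi $2,$2,0x80` in the optimized assembly.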
Re: [PATCH] register CALL_INSN_FUNCTION_USAGE in find_all_hard_reg_sets
Jakub Jelinek ja...@redhat.com writes:

On Wed, Apr 16, 2014 at 11:46:14AM +0200, Tom de Vries wrote: ...why do we need two different mechanisms to deal with these two? IMO the set recorded for the callee should contain what the callee instructions clobber and nothing else. CALL_INSN_FUNCTION_USAGE should contain everything clobbered by a call outside the callee, whether that's in the calling function itself, in a PLT, in a MIPS16 stub, or whatever.

Always putting all call clobbered registers to C_I_F_U explicitly can be a serious memory hog on some architectures, e.g. doesn't ia64 have ~ 160 call clobbered hard registers, times number of calls in a function (sometimes tens of thousands)?

That isn't what we're doing though. The problem Tom's trying to solve is that call sequences themselves can sometimes use call-clobbered registers internally, on the assumption that they cannot possibly hold a live value. These uses aren't always exposed in the rtl, at least not until after reload (which is later than Tom needs the information). So it isn't always correct to assume that a call only clobbers the registers that are clobbered by the call target.

E.g. $gp is call-clobbered on MIPS o32, so we need to restore it after a call if the GOT base is still needed. This is done using a post-reload split of the call insn. And on MIPS16 we need a temporary register to do that, since it isn't possible to load directly into $gp. The temporary register we use is the call-clobbered $6.

Tom's original approach was to have a hook that told the target-independent code that $6 might be clobbered in this way. My argument was that we should reuse CALL_INSN_FUNCTION_USAGE instead, since it already holds other such special uses and clobbers.
The difference is that the special registers we need to know about here are call-clobbered, whereas until now it has only been necessary to mention call-saved ones (or, in the case of USEs, registers set up before the call and not otherwise used by the call). Thanks, Richard
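A minimal sketch of the reasoning above, assuming a register set can be modelled as a 64-bit mask. The type `reg_set` and both function names are hypothetical and are not GCC's HARD_REG_SET API. The point is only that the set of registers a call may clobber is the union of what the callee's instructions clobber and what the call sequence itself clobbers (the part the thread proposes to record in CALL_INSN_FUNCTION_USAGE).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a hard register set: one bit per register.  */
typedef uint64_t reg_set;

#define REG_BIT(n) ((reg_set) 1 << (n))

/* Registers a call may clobber: the callee's own clobbers plus the
   clobbers of the call sequence itself (PLT, stub, $gp restore...).  */
static reg_set
call_clobbers (reg_set callee_clobbers, reg_set call_site_clobbers)
{
  return callee_clobbers | call_site_clobbers;
}

/* A value in REGNO survives the call only if neither side clobbers it.  */
static int
live_across_call_p (unsigned regno, reg_set callee_clobbers,
                    reg_set call_site_clobbers)
{
  return (call_clobbers (callee_clobbers, call_site_clobbers)
          & REG_BIT (regno)) == 0;
}
```

With the MIPS16 case from the message, a callee that only touches registers 2 and 3 still cannot be assumed to preserve register 6 once the call-site clobber of $6 is recorded; leaving the call-site set empty reproduces the unsafe assumption.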
Re: [PATCH] register CALL_INSN_FUNCTION_USAGE in find_all_hard_reg_sets
Tom de Vries tom_devr...@mentor.com writes: On 16/04/14 12:28, Richard Sandiford wrote: This patch introduces a hook that specifies which registers are implicitly clobbered by a call, not including the registers that are clobbered in the called function, and then uses that hook to add those registers to CALL_INSN_FUNCTION_USAGE. I don't think a new hook is needed. Richard, the hook enables us to determine whether a target supplies the information provided by the hook. If the target does not provide this information, the fuse-caller-save optimization is possibly not safe. How do you propose to handle this without this hook? Maybe we should just have a bool field in the target structure to say whether -fuse-caller-save is supported, a bit like delay_sched2. Thanks, Richard
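The suggestion of a bool field in the target structure could look roughly like the sketch below. Everything here is an assumption made for illustration: `struct target_desc`, the field name and the helper are invented and do not correspond to the real targetm layout. The idea is that a target that says nothing gets the safe default, and the optimization runs only when both the user option and the target flag are set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, much-reduced target description.  The field name is
   made up for this sketch.  */
struct target_desc
{
  bool supports_use_caller_save;
};

/* The optimization runs only when the user asked for it and the target
   has opted in; a zero-initialized target gets the safe default.  */
static bool
use_caller_save_enabled_p (const struct target_desc *target,
                           bool flag_use_caller_save)
{
  return flag_use_caller_save && target->supports_use_caller_save;
}
```

Compared with a hook that returns register sets, a plain flag keeps the clobber information in one place (the call insn) and only asks the target whether that information is complete.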
[PATCH, Pointer Bounds Checker 3/x] Target hooks for Pointer Bounds Checker
Hi, This patch introduces target hooks to be used by Pointer Bounds Checker. Hooks set is different from what was approved for 4.9 (and later reverted). I added hooks to work with returned bounds and to prepare incoming bounds for vararg functions. It allowed to remove some target assumptions from expand code. Bootstrapped and tested on linux-x86_64. OK for trunk? Thanks, Ilya -- gcc/ 2014-04-16 Ilya Enkovich ilya.enkov...@intel.com * target.def (builtin_chkp_function): New. (chkp_bound_type): New. (chkp_bound_mode): New. (chkp_make_bounds_constant): New. (chkp_initialize_bounds): New. (fn_abi_va_list_bounds_size): New. (load_bounds_for_arg): New. (store_bounds_for_arg): New. (load_returned_bounds): New. (store_returned_bounds): New. (chkp_function_value_bounds): New. (setup_incoming_vararg_bounds): New. * targhooks.h (default_load_bounds_for_arg): New. (default_store_bounds_for_arg): New. (default_load_returned_bounds): New. (default_store_returned_bounds): New. (default_fn_abi_va_list_bounds_size): New. (default_chkp_bound_type): New. (default_chkp_bound_mode): New. (default_builtin_chkp_function): New. (default_chkp_function_value_bounds): New. (default_chkp_make_bounds_constant): New. (default_chkp_initialize_bounds): New. (default_setup_incoming_vararg_bounds): New. * targhooks.c (default_load_bounds_for_arg): New. (default_store_bounds_for_arg): New. (default_load_returned_bounds): New. (default_store_returned_bounds): New. (default_fn_abi_va_list_bounds_size): New. (default_chkp_bound_type): New. (default_chkp_bound_mode); New. (default_builtin_chkp_function): New. (default_chkp_function_value_bounds): New. (default_chkp_make_bounds_constant): New. (default_chkp_initialize_bounds): New. (default_setup_incoming_vararg_bounds): New. * doc/tm.texi.in (TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE): New. (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_LOAD_RETURNED_BOUNDS): New. (TARGET_STORE_RETURNED_BOUNDS): New. 
(TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New. (TARGET_SETUP_INCOMING_VARARG_BOUNDS): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_CHKP_BOUND_TYPE): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New. (TARGET_CHKP_INITIALIZE_BOUNDS): New. * doc/tm.texi: Regenerated. diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index b8ca17e..d868129 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -4333,6 +4333,13 @@ This hook returns the va_list type of the calling convention specified by The default version of this hook returns @code{va_list_type_node}. @end deftypefn +@deftypefn {Target Hook} tree TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE (tree @var{fndecl}) +This hook returns size for @code{va_list} object in function specified +by @var{fndecl}. This hook is used by Pointer Bounds Checker to build bounds +for @code{va_list} object. Return @code{integer_zero_node} if no bounds +should be used (e.g. @code{va_list} is a scalar pointer to the stack). +@end deftypefn + @deftypefn {Target Hook} tree TARGET_CANONICAL_VA_LIST_TYPE (tree @var{type}) This hook returns the va_list type of the calling convention specified by the type of @var{type}. If @var{type} is not a valid va_list type, it returns @@ -5150,6 +5157,49 @@ defined, then define this hook to return @code{true} if Otherwise, you should not define this hook. @end deftypefn +@deftypefn {Target Hook} rtx TARGET_LOAD_BOUNDS_FOR_ARG (rtx @var{slot}, rtx @var{arg}, rtx @var{slot_no}) +This hook is used by expand pass to emit insn to load bounds of +@var{arg} passed in @var{slot}. Expand pass uses this hook in case +bounds of @var{arg} are not passed in register. If @var{slot} is a +memory, then bounds are loaded as for regular pointer loaded from +memory. If @var{slot} is not a memory then @var{slot_no} is an integer +constant holding number of the target dependent special slot which +should be used to obtain bounds. Hook returns RTX holding loaded bounds. 
+@end deftypefn + +@deftypefn {Target Hook} void TARGET_STORE_BOUNDS_FOR_ARG (rtx @var{arg}, rtx @var{slot}, rtx @var{bounds}, rtx @var{slot_no}) +This hook is used by expand pass to emit insns to store @var{bounds} of +@var{arg} passed in @var{slot}. Expand pass uses this hook in case +@var{bounds} of @var{arg} are not passed in register. If @var{slot} is a +memory, then @var{bounds} are stored as for regular pointer stored in +memory. If @var{slot} is not a memory then @var{slot_no} is an integer +constant holding number of the target dependent special slot which +should be used to store @var{bounds}. +@end deftypefn + +@deftypefn {Target
Re: [PATCH] Change HONOR_REG_ALLOC_ORDER to a marco for C expression
On 2014-04-15, 9:26 AM, Kito Cheng wrote: Hi Vladimir: Although this patch is safe, I guess it could wait for stage 1 as right now we don't need this functionality. The patch is ok for stage1, which is probably about a month away. ping, is this patch ok now? Yes, I approved it already a month ago. Thanks, Kito.
Re: RFA: Tighten checking for 'X' constraints
On Wed, Apr 16, 2014 at 11:43:12AM +0100, Richard Sandiford wrote: X was defined against reload, which always reloaded MEM addresses to follow the appropriate base and index register classes. This was done as a first pass before matching against the constraints:

I think it would be fine, if X had a MEM that isn't valid, to replace it with, say, (mem (scratch)) or similar. What I think X is useful for is, e.g., describing a side-effect of inline-asm on a certain location in memory when you don't really need the address of that memory. Often "memory" is too big a hammer; people often say that a certain inline-asm uses, sets, uses/sets or clobbers, say, a 100-byte-long piece of memory somewhere, but the operand is there solely to tell the compiler what memory it is. I think the X constraint is good for that if you aren't planning to actually use the address anywhere. E.g. you call some function in inline-asm, but the address construction is in the callee, so there is no point in expensively computing the address in the caller (say for -fPIC). Jakub
Re: [PATCH] Fix PR c++/60765
Could someone install this on my behalf?
Re: [PATCH] Change HONOR_REG_ALLOC_ORDER to a marco for C expression
Hi Vladimir: thanks for your reply and approval. However, I don't have commit rights yet; could you help commit it? Thanks! On Wed, Apr 16, 2014 at 8:10 PM, Vladimir Makarov vmaka...@redhat.com wrote: On 2014-04-15, 9:26 AM, Kito Cheng wrote: Hi Vladimir: Although this patch is safe, I guess it could wait for stage 1 as right now we don't need this functionality. The patch is ok for stage1, which is probably about a month away. ping, is this patch ok now? Yes, I approved it already a month ago. Thanks, Kito.
Re: [PATCH] Fix PR c++/60764
Could someone install this for me?
[PATCH, Pointer Bounds Checker 4/x] Built-in functions
Hi, This patch introduces built-in functions used by Pointer Bounds Checker. It is mostly similar to what was reverted from 4.9, I just added types and attributes to builtins. This patch also introduces pointer_bounds_type_node to be used in built-in function type declarations. Bootstrapped and tested on linux-x86_64. OK for trunk? Thanks, Ilya -- gcc/ 2014-04-16 Ilya Enkovich ilya.enkov...@intel.com * tree-core.h (tree_index): Add TI_POINTER_BOUNDS_TYPE. * tree.h (pointer_bounds_type_node): New. * tree.c (build_common_tree_nodes): Initialize pointer_bounds_type_node. * builtin-types.def (BT_BND): New. (BT_FN_PTR_CONST_PTR): New. (BT_FN_CONST_PTR_CONST_PTR): New. (BT_FN_BND_CONST_PTR): New. (BT_FN_CONST_PTR_BND): New. (BT_FN_PTR_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR): New. (BT_FN_VOID_PTRPTR_CONST_PTR): New. (BT_FN_VOID_CONST_PTR_SIZE): New. (BT_FN_VOID_PTR_BND): New. (BT_FN_CONST_PTR_CONST_PTR_CONST_PTR): New. (BT_FN_BND_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New. (BT_FN_VOID_CONST_PTR_BND_CONST_PTR): New. * chkp-builtins.def: New. * builtins.def: include chkp-builtins.def. (DEF_CHKP_BUILTIN): New. * builtins.c (expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS, BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS, BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS, BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS, BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS, BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND, BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL, BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET, BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_NARROW, BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER. * c-family/c.opt (fcheck-pointer-bounds): New. * toplev.c (process_options): Check Pointer Bounds Checker is supported. * doc/extend.texi: Document Pointer Bounds Checker built-in functions. 
diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def index fba9c7d..2e5f361 100644 --- a/gcc/builtin-types.def +++ b/gcc/builtin-types.def @@ -133,6 +133,8 @@ DEF_PRIMITIVE_TYPE (BT_I4, builtin_type_for_size (BITS_PER_UNIT*4, 1)) DEF_PRIMITIVE_TYPE (BT_I8, builtin_type_for_size (BITS_PER_UNIT*8, 1)) DEF_PRIMITIVE_TYPE (BT_I16, builtin_type_for_size (BITS_PER_UNIT*16, 1)) +DEF_PRIMITIVE_TYPE (BT_BND, pointer_bounds_type_node) + DEF_POINTER_TYPE (BT_PTR_CONST_STRING, BT_CONST_STRING) DEF_POINTER_TYPE (BT_PTR_LONG, BT_LONG) DEF_POINTER_TYPE (BT_PTR_ULONGLONG, BT_ULONGLONG) @@ -234,6 +236,10 @@ DEF_FUNCTION_TYPE_1 (BT_FN_UINT16_UINT16, BT_UINT16, BT_UINT16) DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_UINT32, BT_UINT32, BT_UINT32) DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_UINT64, BT_UINT64, BT_UINT64) DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_INT, BT_BOOL, BT_INT) +DEF_FUNCTION_TYPE_1 (BT_FN_PTR_CONST_PTR, BT_PTR, BT_CONST_PTR) +DEF_FUNCTION_TYPE_1 (BT_FN_CONST_PTR_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR) +DEF_FUNCTION_TYPE_1 (BT_FN_BND_CONST_PTR, BT_BND, BT_CONST_PTR) +DEF_FUNCTION_TYPE_1 (BT_FN_CONST_PTR_BND, BT_CONST_PTR, BT_BND) DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR) @@ -347,6 +353,13 @@ DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_SIZE_CONST_VPTR, BT_BOOL, BT_SIZE, BT_CONST_VOLATILE_PTR) DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_INT_BOOL, BT_BOOL, BT_INT, BT_BOOL) DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT_UINT, BT_VOID, BT_UINT, BT_UINT) +DEF_FUNCTION_TYPE_2 (BT_FN_PTR_CONST_PTR_SIZE, BT_PTR, BT_CONST_PTR, BT_SIZE) +DEF_FUNCTION_TYPE_2 (BT_FN_PTR_CONST_PTR_CONST_PTR, BT_PTR, BT_CONST_PTR, BT_CONST_PTR) +DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTRPTR_CONST_PTR, BT_VOID, BT_PTR_PTR, BT_CONST_PTR) +DEF_FUNCTION_TYPE_2 (BT_FN_VOID_CONST_PTR_SIZE, BT_VOID, BT_CONST_PTR, BT_SIZE) +DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTR_BND, BT_VOID, BT_PTR, BT_BND) +DEF_FUNCTION_TYPE_2 (BT_FN_CONST_PTR_CONST_PTR_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR) +DEF_FUNCTION_TYPE_2 (BT_FN_BND_CONST_PTR_SIZE, BT_BND, BT_CONST_PTR, 
BT_SIZE) DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR_PTR, BT_FN_VOID_PTR_PTR) @@ -430,6 +443,8 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, BT DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, BT_I8, BT_INT) DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT, BT_VOID, BT_VOLATILE_PTR, BT_I16, BT_INT) DEF_FUNCTION_TYPE_3 (BT_FN_INT_PTRPTR_SIZE_SIZE, BT_INT, BT_PTR_PTR, BT_SIZE, BT_SIZE) +DEF_FUNCTION_TYPE_3 (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE, BT_PTR, BT_CONST_PTR, BT_CONST_PTR, BT_SIZE) +DEF_FUNCTION_TYPE_3 (BT_FN_VOID_CONST_PTR_BND_CONST_PTR, BT_VOID, BT_CONST_PTR, BT_BND, BT_CONST_PTR) DEF_FUNCTION_TYPE_4
Re: [PATCH 1/3] libstdc++: Add time_get::get support.
On 16/04/14 13:19 +0200, Rüdiger Sonderfeld wrote:
> On Tuesday 15 April 2014 23:36:51 Paolo Carlini wrote:
> > Those should be isolated and a compiler bug report opened including a minimized reproducer.
>
> I'm not sure if this is a compiler bug or simply due to the fact that I didn't add the virtual function to the ABI linker script.
>
> > Anyway, the real issue is indeed that implementing those bits requires a new virtual function, and that would break the ABI. We do have a bug in Bugzilla tracking the issue.
>
> But now that 4.9 has branched isn't it acceptable to extend the ABI?

Adding new virtual functions does not extend the ABI, it makes it incompatible. We either need to change the library's SONAME (not acceptable to some GNU/Linux distributions) or mangle the type differently so it is a new type for linkage purposes (which is what we plan to do using the abi_tag attribute).
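A minimal sketch of the abi_tag approach mentioned above, assuming GCC or Clang (abi_tag is a compiler extension). The namespaces and classes are hypothetical stand-ins, not the real std::time_get. The only point shown is that the tag becomes part of the mangled type name, so the class with the extra virtual function is a different type for linkage purposes.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

namespace old_abi {
// Hypothetical facet-like class with the original set of virtuals.
struct facet {
  virtual ~facet() {}
  virtual int do_get_date() const { return 1; }
};
}  // namespace old_abi

namespace new_abi {
// Same idea with one more virtual function. The abi_tag attribute makes
// "tagged" part of the mangled name, so objects built against the old
// vtable layout cannot silently link against this one.
struct __attribute__((abi_tag("tagged"))) facet {
  virtual ~facet() {}
  virtual int do_get_date() const { return 1; }
  virtual int do_get() const { return 2; }  // the newly added virtual
};
}  // namespace new_abi

// With GCC and Clang, type_info::name() returns the mangled type name.
std::string old_mangled_name() { return typeid(old_abi::facet).name(); }
std::string new_mangled_name() { return typeid(new_abi::facet).name(); }
```

Printing the two names side by side shows the tag only in the second one.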
[PATCH, Pointer Bounds Checker 6/x] New static constructor types
Hi,

This patch adds new static constructor types used by Pointer Bounds Checker. It was approved earlier for 4.9 and I'll assume the patch is OK for trunk if no objections arise.

Patch was bootstrapped and tested for linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-04-16  Ilya Enkovich  ilya.enkov...@intel.com

        * ipa.c (cgraph_build_static_cdtor_1): Support constructors
        with "chkp ctor" and "bnd_legacy" attributes.
        * gimplify.c (gimplify_init_constructor): Avoid infinite
        loop during gimplification of bounds initializer.

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 7441784..67ab515 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -3803,10 +3803,19 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
           individual element initialization.  Also don't do this
           for small all-zero initializers (which aren't big enough to merit
           clearing), and don't try to make bitwise copies of
-          TREE_ADDRESSABLE types.  */
+          TREE_ADDRESSABLE types.
+
+          We cannot apply such transformation when compiling chkp static
+          initializer because creation of initializer image in the memory
+          will require static initialization of bounds for it.  It should
+          result in another gimplification of similar initializer and we
+          may fall into infinite loop.  */
        if (valid_const_initializer
            && !(cleared || num_nonzero_elements == 0)
-           && !TREE_ADDRESSABLE (type))
+           && !TREE_ADDRESSABLE (type)
+           && (!current_function_decl
+               || !lookup_attribute ("chkp ctor",
+                                     DECL_ATTRIBUTES (current_function_decl))))
          {
            HOST_WIDE_INT size = int_size_in_bytes (type);
            unsigned int align;
diff --git a/gcc/ipa.c b/gcc/ipa.c
index 26e9b03..5ab3aed 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -1345,9 +1345,11 @@ make_pass_ipa_whole_program_visibility (gcc::context *ctxt)
 }

 /* Generate and emit a static constructor or destructor.  WHICH must
-   be one of 'I' (for a constructor) or 'D' (for a destructor).  BODY
-   is a STATEMENT_LIST containing GENERIC statements.  PRIORITY is the
-   initialization priority for this constructor or destructor.
+   be one of 'I' (for a constructor), 'D' (for a destructor), 'P'
+   (for chkp static vars constructor) or 'B' (for chkp static bounds
+   constructor).  BODY is a STATEMENT_LIST containing GENERIC
+   statements.  PRIORITY is the initialization priority for this
+   constructor or destructor.

    FINAL specify whether the externally visible name for collect2 should
    be produced. */
@@ -1406,6 +1408,20 @@ cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final)
       DECL_STATIC_CONSTRUCTOR (decl) = 1;
       decl_init_priority_insert (decl, priority);
       break;
+    case 'P':
+      DECL_STATIC_CONSTRUCTOR (decl) = 1;
+      DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("chkp ctor"),
+                                          NULL,
+                                          NULL_TREE);
+      decl_init_priority_insert (decl, priority);
+      break;
+    case 'B':
+      DECL_STATIC_CONSTRUCTOR (decl) = 1;
+      DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("bnd_legacy"),
+                                          NULL,
+                                          NULL_TREE);
+      decl_init_priority_insert (decl, priority);
+      break;
     case 'D':
       DECL_STATIC_DESTRUCTOR (decl) = 1;
       decl_fini_priority_insert (decl, priority);
@@ -1423,9 +1439,11 @@ cgraph_build_static_cdtor_1 (char which, tree body, int priority, bool final)
 }

 /* Generate and emit a static constructor or destructor.  WHICH must
-   be one of 'I' (for a constructor) or 'D' (for a destructor).  BODY
-   is a STATEMENT_LIST containing GENERIC statements.  PRIORITY is the
-   initialization priority for this constructor or destructor.  */
+   be one of 'I' (for a constructor), 'D' (for a destructor), 'P'
+   (for chkp static vars constructor) or 'B' (for chkp static bounds
+   constructor).  BODY is a STATEMENT_LIST containing GENERIC
+   statements.  PRIORITY is the initialization priority for this
+   constructor or destructor.  */

 void
 cgraph_build_static_cdtor (char which, tree body, int priority)
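The mapping from the WHICH letter to the kind of function, as described in the comment above, can be summarised in a small toy model. This is only a sketch: cdtor_kind and classify_cdtor are hypothetical names that do not exist in GCC, and the real code sets flags and attributes on a tree decl rather than filling in a struct.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of what one WHICH letter means; not a GCC type.
struct cdtor_kind {
  bool is_constructor;
  bool is_destructor;
  std::string attribute;  // empty when no attribute is attached
};

// Toy model of the switch in the patch. Returns false for an unknown
// letter and leaves OUT untouched in that case.
bool classify_cdtor(char which, cdtor_kind& out) {
  switch (which) {
    case 'I': out = {true, false, ""};           return true;  // plain constructor
    case 'P': out = {true, false, "chkp ctor"};  return true;  // chkp static vars
    case 'B': out = {true, false, "bnd_legacy"}; return true;  // chkp static bounds
    case 'D': out = {false, true, ""};           return true;  // destructor
    default:  return false;
  }
}
```

Both new letters still produce constructors; they differ from 'I' only in the attribute attached to the generated function.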
[PATCH, Pointer Bounds Checker 7/x] Call/ret ifaces
Hi, This patch adds flags and ifaces to mark instrumented calls, extends return stms with additional operand and introduces some basic bounds predicates. These changes were previously reverted from 4.9 and I'll assume patch is OK for trunk if no objections arise. Patch was bootstrapped and tested for linux-x86_64. Thanks, Ilya -- gcc/ 2014-04-16 Ilya Enkovich ilya.enkov...@intel.com * gimple.h (gf_mask): Add GF_CALL_WITH_BOUNDS. (gimple_call_with_bounds_p): New. (gimple_call_set_with_bounds): New. (gimple_return_retbnd): New. (gimple_return_set_retbnd): New * rtl.h (CALL_EXPR_WITH_BOUNDS_P): New. * tree.h (POINTER_BOUNDS_P): New. (BOUNDED_TYPE_P): New. (BOUNDED_P): New. (CALL_WITH_BOUNDS_P): New. * gimple.c (gimple_build_return): Increase number of ops for return statement. (gimple_build_call_from_tree): Propagate CALL_WITH_BOUNDS_P flag. * gimple-pretty-print.c (dump_gimple_return): Print second op. diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c index 741cd92..a792bb9 100644 --- a/gcc/gimple-pretty-print.c +++ b/gcc/gimple-pretty-print.c @@ -547,11 +547,12 @@ dump_gimple_assign (pretty_printer *buffer, gimple gs, int spc, int flags) static void dump_gimple_return (pretty_printer *buffer, gimple gs, int spc, int flags) { - tree t; + tree t, t2; t = gimple_return_retval (gs); + t2 = gimple_return_retbnd (gs); if (flags TDF_RAW) -dump_gimple_fmt (buffer, spc, flags, %G %T, gs, t); +dump_gimple_fmt (buffer, spc, flags, %G %T %T, gs, t, t2); else { pp_string (buffer, return); @@ -560,6 +561,11 @@ dump_gimple_return (pretty_printer *buffer, gimple gs, int spc, int flags) pp_space (buffer); dump_generic_node (buffer, t, spc, flags, false); } + if (t2) + { + pp_string (buffer, , ); + dump_generic_node (buffer, t2, spc, flags, false); + } pp_semicolon (buffer); } } diff --git a/gcc/gimple.c b/gcc/gimple.c index 2a278e4..89f9b4c 100644 --- a/gcc/gimple.c +++ b/gcc/gimple.c @@ -180,7 +180,7 @@ gimple_build_with_ops_stat (enum gimple_code code, unsigned 
subcode, gimple gimple_build_return (tree retval) { - gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 1); + gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 2); if (retval) gimple_return_set_retval (s, retval); return s; @@ -367,6 +367,7 @@ gimple_build_call_from_tree (tree t) gimple_call_set_va_arg_pack (call, CALL_EXPR_VA_ARG_PACK (t)); gimple_call_set_nothrow (call, TREE_NOTHROW (t)); gimple_set_no_warning (call, TREE_NO_WARNING (t)); + gimple_call_set_with_bounds (call, CALL_WITH_BOUNDS_P (t)); return call; } diff --git a/gcc/gimple.h b/gcc/gimple.h index 11959a8..8b8693c 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -90,6 +90,7 @@ enum gf_mask { GF_CALL_NOTHROW= 1 4, GF_CALL_ALLOCA_FOR_VAR = 1 5, GF_CALL_INTERNAL = 1 6, +GF_CALL_WITH_BOUNDS= 1 7, GF_OMP_PARALLEL_COMBINED = 1 0, GF_OMP_FOR_KIND_MASK = 3 0, GF_OMP_FOR_KIND_FOR= 0 0, @@ -2438,6 +2439,31 @@ gimple_call_internal_p (const_gimple gs) } +/* Return true if call GS is marked as instrumented by + Pointer Bounds Checker. */ + +static inline bool +gimple_call_with_bounds_p (const_gimple gs) +{ + GIMPLE_CHECK (gs, GIMPLE_CALL); + return (gs-subcode GF_CALL_WITH_BOUNDS) != 0; +} + + +/* If INSTRUMENTED_P is true, marm statement GS as instrumented by + Pointer Bounds Checker. */ + +static inline void +gimple_call_set_with_bounds (gimple gs, bool with_bounds) +{ + GIMPLE_CHECK (gs, GIMPLE_CALL); + if (with_bounds) +gs-subcode |= GF_CALL_WITH_BOUNDS; + else +gs-subcode = ~GF_CALL_WITH_BOUNDS; +} + + /* Return the target of internal call GS. */ static inline enum internal_fn @@ -5517,6 +5543,26 @@ gimple_return_set_retval (gimple gs, tree retval) } +/* Return the return bounds for GIMPLE_RETURN GS. */ + +static inline tree +gimple_return_retbnd (const_gimple gs) +{ + GIMPLE_CHECK (gs, GIMPLE_RETURN); + return gimple_op (gs, 1); +} + + +/* Set RETVAL to be the return bounds for GIMPLE_RETURN GS. 
*/ + +static inline void +gimple_return_set_retbnd (gimple gs, tree retval) +{ + GIMPLE_CHECK (gs, GIMPLE_RETURN); + gimple_set_op (gs, 1, retval); +} + + /* Returns true when the gimple statement STMT is any of the OpenMP types. */ #define CASE_GIMPLE_OMP\ diff --git a/gcc/rtl.h b/gcc/rtl.h index f1cda4c..54d1cf1 100644 --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -265,7 +265,8 @@ struct GTY((chain_next (RTX_NEXT (%h)), In a CODE_LABEL, part of the two-bit alternate entry field. 1 in a CONCAT is VAL_EXPR_IS_COPIED in var-tracking.c. 1 in a VALUE is SP_BASED_VALUE_P in cselib.c. - 1 in a SUBREG generated by LRA for
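The flag handling added to gimple.h above follows the usual single-bit mask pattern: test with AND, set with OR, clear with AND-NOT. A minimal standalone sketch, assuming the bit positions 6 and 7 shown in the gf_mask hunk; stmt_sketch and the function names are hypothetical and only stand in for the real gimple statement and accessors.

```cpp
#include <cassert>

// Bit positions taken from the gf_mask hunk; the enum itself is a stand-in.
enum gf_mask_sketch : unsigned {
  CALL_INTERNAL    = 1u << 6,
  CALL_WITH_BOUNDS = 1u << 7
};

// Hypothetical stand-in for a statement carrying a subcode bit field.
struct stmt_sketch {
  unsigned subcode = 0;
};

// True if the statement is marked as instrumented.
inline bool call_with_bounds_p(const stmt_sketch& s) {
  return (s.subcode & CALL_WITH_BOUNDS) != 0;
}

// Set or clear the flag without disturbing the other bits.
inline void set_call_with_bounds(stmt_sketch& s, bool with_bounds) {
  if (with_bounds)
    s.subcode |= CALL_WITH_BOUNDS;
  else
    s.subcode &= ~CALL_WITH_BOUNDS;
}
```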
[PATCH, Pointer Bounds Checker 8/x] Add varpool node field
Hi,

This patch adds a new field to varpool_node to mark vars requiring bounds initialization. These changes were previously reverted from 4.9 and I'll assume the patch is OK for trunk if no objections arise.

Patch was bootstrapped and tested for linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-04-16  Ilya Enkovich  ilya.enkov...@intel.com

        * cgraph.h (varpool_node): Add need_bounds_init field.
        * lto-cgraph.c (lto_output_varpool_node): Output
        need_bounds_init value.
        (input_varpool_node): Read need_bounds_init value.
        * varpool.c (dump_varpool_node): Dump need_bounds_init field.

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 15310d8..a6a51cf 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -640,6 +640,10 @@ public:
   /* Set when variable is scheduled to be assembled.  */
   unsigned output : 1;

+  /* Set when variable has statically initialized pointer
+     or is a static bounds variable and needs initialization.  */
+  unsigned need_bounds_init : 1;
+
   /* Set if the variable is dynamically initialized, except for
      function local statics.   */
   unsigned dynamically_initialized : 1;
diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index 173067f..999ce3d 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -585,6 +585,7 @@ lto_output_varpool_node (struct lto_simple_output_block *ob, varpool_node *node,
                     && boundary_p && !DECL_EXTERNAL (node->decl), 1);
          /* in_other_partition.  */
     }
+  bp_pack_value (&bp, node->need_bounds_init, 1);
   streamer_write_bitpack (&bp);
   if (node->same_comdat_group && !boundary_p)
     {
@@ -1160,6 +1161,7 @@ input_varpool_node (struct lto_file_decl_data *file_data,
   node->analyzed = bp_unpack_value (&bp, 1);
   node->used_from_other_partition = bp_unpack_value (&bp, 1);
   node->in_other_partition = bp_unpack_value (&bp, 1);
+  node->need_bounds_init = bp_unpack_value (&bp, 1);
   if (node->in_other_partition)
     {
       DECL_EXTERNAL (node->decl) = 1;
diff --git a/gcc/varpool.c b/gcc/varpool.c
index acb5221..0eeb2b6 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -205,6 +205,8 @@ dump_varpool_node (FILE *f, varpool_node *node)
     fprintf (f, " initialized");
   if (node->output)
     fprintf (f, " output");
+  if (node->need_bounds_init)
+    fprintf (f, " need-bounds-init");
   if (TREE_READONLY (node->decl))
     fprintf (f, " read-only");
   if (ctor_for_folding (node->decl) != error_mark_node)
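The lto-cgraph.c hunks add the new bit at the same position on the writing and the reading side (right after in_other_partition). A toy bit packer shows why the order has to match. This is a sketch under stated assumptions: bitpack_sketch, pack_value, unpack_value and var_flags_sketch are hypothetical and are not GCC's bitpack_d API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy bit stream: values are appended at increasing bit
// positions and must be read back in the same order.
struct bitpack_sketch {
  std::uint64_t word = 0;
  unsigned pos = 0;
};

inline void pack_value(bitpack_sketch& bp, std::uint64_t value, unsigned nbits) {
  assert(nbits > 0 && bp.pos + nbits < 64);
  const std::uint64_t mask = (std::uint64_t(1) << nbits) - 1;
  bp.word |= (value & mask) << bp.pos;
  bp.pos += nbits;
}

inline std::uint64_t unpack_value(bitpack_sketch& bp, unsigned nbits) {
  assert(nbits > 0 && bp.pos + nbits < 64);
  const std::uint64_t mask = (std::uint64_t(1) << nbits) - 1;
  const std::uint64_t value = (bp.word >> bp.pos) & mask;
  bp.pos += nbits;
  return value;
}

// Two of the one-bit flags from the patch, in stream order.
struct var_flags_sketch {
  bool in_other_partition;
  bool need_bounds_init;
};

inline bitpack_sketch write_flags(const var_flags_sketch& f) {
  bitpack_sketch bp;
  pack_value(bp, f.in_other_partition, 1);
  pack_value(bp, f.need_bounds_init, 1);  // new bit goes last on output...
  return bp;
}

inline var_flags_sketch read_flags(bitpack_sketch bp) {
  bp.pos = 0;  // start reading from the first bit
  var_flags_sketch f;
  f.in_other_partition = unpack_value(bp, 1) != 0;
  f.need_bounds_init = unpack_value(bp, 1) != 0;  // ...and last on input
  return f;
}
```

If the reader unpacked the two bits in the opposite order, each flag would silently receive the other's value, which is the failure mode the matching hunks avoid.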
Re: RFA: Tighten checking for 'X' constraints
Jakub Jelinek ja...@redhat.com writes:
> On Wed, Apr 16, 2014 at 11:43:12AM +0100, Richard Sandiford wrote:
> > "X" was defined against reload, which always reloaded MEM addresses to follow the appropriate base and index register classes. This was done as a first pass before matching against the constraints:
>
> I think it would be fine, if "X" had a MEM that isn't valid, to replace it by, say, (mem (scratch)) or similar. What I think "X" is useful for is, e.g., if you want to describe a side-effect of inline-asm on a certain location in memory, but don't really need the address of that memory. Often "memory" is too big a hammer; people often say that a certain inline-asm uses or sets or uses/sets or clobbers, say, a 100 byte long piece of memory somewhere, but the operand is there solely to tell the compiler what memory it is. I think the "X" constraint is good for that if you aren't planning to actually use the address anywhere. E.g. you call some function in inline-asm, but the address construction is in the callee, so there is no point in computing the address at some cost in the caller (say for -fPIC).

If we want to replace the address with a scratch, I think we should do it at expand time so that all unneeded dependent code gets removed, rather than doing it only if the address isn't valid after optimisation. But since there's not AFAIK ever been a rule that "X" operands can't be printed, I think we should only do that if the operand isn't mentioned in the asm string.

So would that be OK as a compromise? At expand time, check which operands are used (via get_referenced_operands). If an operand isn't used, is a MEM, has a constraint string that is exactly "X", and is not tied to other operands via matching constraints, replace the MEM address with (scratch). Then allow (mem (scratch)) as well as CONSTANT_P and general_operand in check_asm_operands?

I suppose a follow-on optimisation would be to convert "m" into "X" in the same situation.

Thanks,
Richard
Re: RFA: Tighten checking for 'X' constraints
On Wed, Apr 16, 2014 at 02:24:06PM +0100, Richard Sandiford wrote:
> > side-effect of inline-asm on a certain location in memory, but don't really need the address of that memory. Often "memory" is too big a hammer; people often say that a certain inline-asm uses or sets or uses/sets or clobbers, say, a 100 byte long piece of memory somewhere, but the operand is there solely to tell the compiler what memory it is. I think the "X" constraint is good for that if you aren't planning to actually use the address anywhere. E.g. you call some function in inline-asm, but the address construction is in the callee, so there is no point in computing the address at some cost in the caller (say for -fPIC).
>
> If we want to replace the address with a scratch, I think we should do it at expand time so that all unneeded dependent code gets removed, rather than doing it only if the address isn't valid after optimisation. But since there's not AFAIK ever been a rule that "X" operands can't be printed, I think we should only do that if the operand isn't mentioned in the asm string.
>
> So would that be OK as a compromise? At expand time, check which operands are used (via get_referenced_operands). If an operand isn't used, is a MEM, has a constraint string that is exactly "X", and is not tied to other operands via matching constraints, replace the MEM address with (scratch). Then allow (mem (scratch)) as well as CONSTANT_P and general_operand in check_asm_operands?
>
> I suppose a follow-on optimisation would be to convert "m" into "X" in the same situation.

Creating a (mem (scratch)) too early may pessimize code too much: perhaps the original MEM can be used during, say, sched1 etc. for alias analysis, whereas (mem (scratch)) is considered to alias everything. Plus, I think at least so far we have not been making different decisions based on whether some operand has been referenced in the template or not, and I'm not sure it is desirable to introduce that.

Anyway, others may have a different opinion on what "X" should mean; CCing Jeff and Eric.

        Jakub
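For readers unfamiliar with the use case discussed above (telling the compiler which piece of memory an inline asm touches without resorting to a "memory" clobber), here is a minimal sketch for GCC or Clang. It deliberately uses the ordinary "m" constraint rather than "X", since the exact semantics of "X" are what this thread is still debating. The blob type and function names are hypothetical, and the asm body is empty, so only the operand description matters.

```cpp
#include <cassert>
#include <cstring>

// A 100-byte object, matching the "100 byte long piece of memory" example.
struct blob {
  unsigned char bytes[100];
};

// The empty asm claims to read and write exactly the bytes of B.
// Unlike a "memory" clobber, this says nothing about any other object.
inline void touch_blob(blob& b) {
  asm volatile("" : "+m"(b));
}

// The compiler must assume the asm may have changed B, so the values
// are reloaded after the asm, but nothing was actually modified.
int sum_after_asm() {
  blob b;
  std::memset(b.bytes, 1, sizeof b.bytes);
  touch_blob(b);
  int sum = 0;
  for (unsigned char c : b.bytes)
    sum += c;
  return sum;
}
```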
Re: [PATCH] Add a new option -fmerge-bitfields (patch / doc inside)
On 16 April 2014 13:38, Zoran Jovanovic zoran.jovano...@imgtec.com wrote:
> Hello,
> This is new patch version.

The comment from the previous iteration still holds true:

> +@item -fbitfield-merge

You are talking about '-fmerge-bitfields' up until here. Please fix all occurrences of bitfield-merge, both in the docs and in the gcc.dg/tree-ssa/bitfldmrg2.c testcase. How did that pass anyway, as that option is presumably not recognized? :)

thanks,
[PATCH, Pointer Bounds Checker 9/x] Cgraph extension
Hi, This patch introduces changes in call graph for Pointer Bounds Checker. New fields instrumented_version, instrumentation_clone and orig_decl are added for cgraph_node: - instrumentation_clone field is 1 for nodes created for instrumented version of functions - instrumented_version points to instrumented/original node - orig_decl holds original function declaration for instrumented nodes in case original node is removed IPA_REF_CHKP reference type is introduced for nodes to reference instrumented function versions from originals. It is used to have proper reachability analysis. When original function bodies are not needed anymore, functions are transformed into thunks having call edge to the instrumented function. Therefore new field appeared in cgraph_thunk_info to mark such thunks. Does it look OK? Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-04-16 Ilya Enkovich ilya.enkov...@intel.com * cgraph.h (cgraph_thunk_info): Add add_pointer_bounds_args field. (cgraph_node): Add instrumented_version, orig_decl and instrumentation_clone fields. (symtab_alias_target): Allow IPA_REF_CHKP reference. * cgraph.c (cgraph_remove_node): Fix instrumented_version of the referenced node if any. (dump_cgraph_node): Dump instrumentation_clone and instrumented_version fields. (verify_cgraph_node): Check correctness of IPA_REF_CHKP references and instrumentation thunks. * cgraphbuild.c (rebuild_cgraph_edges): Rebuild IPA_REF_CHKP reference. (cgraph_rebuild_references): Likewise. * cgraphunit.c (assemble_thunks_and_aliases): Skip thunks calling instrumneted function version. * ipa-ref.h (ipa_ref_use): Add IPA_REF_CHKP. (ipa_ref): increase size of use field. * ipa-ref.c (ipa_ref_use_name): Add element for IPA_REF_CHKP. * lto-cgraph.c (lto_output_node): Output instrumentation_clone, thunk.add_pointer_bounds_args and orig_decl field. (lto_output_ref): Adjust to new ipa_ref::use field size. (input_overwrite_node): Read instrumentation_clone field. 
(input_node): Read thunk.add_pointer_bounds_args and orig_decl fields. (input_ref): Adjust to new ipa_ref::use field size. (input_cgraph_1): Compute instrumented_version fields and restore IDENTIFIER_TRANSPARENT_ALIAS chains. * lto-streamer.h (LTO_minor_version): Change minor version from 0 to 1. * ipa.c (symtab_remove_unreachable_nodes): Consider instrumented clone as address taken if the original one is address taken. (cgraph_externally_visible_p): Mark instrumented 'main' as externally visible. (function_and_variable_visibility): Filter instrumentation thunks. diff --git a/gcc/cgraph.c b/gcc/cgraph.c index be3661a..6210c68 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -1828,6 +1828,12 @@ cgraph_remove_node (struct cgraph_node *node) } cgraph_n_nodes--; + if (node-instrumented_version) +{ + node-instrumented_version-instrumented_version = NULL; + node-instrumented_version = NULL; +} + /* Clear out the node to NULL all pointers and add the node to the free list. */ memset (node, 0, sizeof (*node)); @@ -2070,6 +2076,11 @@ dump_cgraph_node (FILE *f, struct cgraph_node *node) if (indirect_calls_count) fprintf (f, Has %i outgoing edges for indirect calls.\n, indirect_calls_count); + + if (node-instrumentation_clone) +fprintf (f, Is instrumented version.\n); + else if (node-instrumented_version) +fprintf (f, Has instrumented version.\n); } @@ -2850,7 +2861,9 @@ verify_cgraph_node (struct cgraph_node *node) } for (i = 0; ipa_ref_list_reference_iterate (node-ref_list, i, ref); i++) - if (ref-use != IPA_REF_ALIAS) + if (ref-use == IPA_REF_CHKP) + ; + else if (ref-use != IPA_REF_ALIAS) { error (Alias has non-alias reference); error_found = true; @@ -2868,6 +2881,35 @@ verify_cgraph_node (struct cgraph_node *node) error_found = true; } } + + /* Check all nodes reference their instrumented versions. 
*/ + if (node-analyzed + node-instrumented_version + !node-instrumentation_clone) +{ + bool ref_found = false; + int i; + struct ipa_ref *ref; + + for (i = 0; ipa_ref_list_reference_iterate (node-ref_list, + i, ref); i++) + if (ref-use == IPA_REF_CHKP) + { + if (ref_found) + { + error (Node has more than one chkp reference); + error_found = true; + } + ref_found = true; + } + + if (!ref_found) + { + error (Analyzed node has no reference to
Re: [PATCH] Add a new option -fmerge-bitfields (patch / doc inside)
On Wed, Apr 16, 2014 at 8:38 AM, Zoran Jovanovic zoran.jovano...@imgtec.com wrote: Hello, This is new patch version. Lowering is applied only for bit-fields copy sequences that are merged. Data structure representing bit-field copy sequences is renamed and reduced in size. Optimization turned on by default for -O2 and higher. Some comments fixed. Benchmarking performed on WebKit for Android. Code size reduction noticed on several files, best examples are: core/rendering/style/StyleMultiColData (632-520 bytes) core/platform/graphics/FontDescription (1715-1475 bytes) core/rendering/style/FillLayer (5069-4513 bytes) core/rendering/style/StyleRareInheritedData (5618-5346) core/css/CSSSelectorList(4047-3887) core/platform/animation/CSSAnimationData (3844-3440 bytes) core/css/resolver/FontBuilder (13818-13350 bytes) core/platform/graphics/Font (16447-15975 bytes) Example: One of the motivating examples for this work was copy constructor of the class which contains bit-fields. C++ code: class A { public: A(const A x); unsigned a : 1; unsigned b : 2; unsigned c : 4; }; A::A(const Ax) { a = x.a; b = x.b; c = x.c; } Very interesting. Does this work with inheritance too? E.g. 
struct Base { uint32_t x:1; uint32_t y:3; Base(const Base other) { x = other.x; y = other.y; } }; struct Der : Base { Der() = default; Der(const Der other) : Base(other) { z = other.z; } uint32_t z:9; }; GIMPLE code without optimization: bb 2: _3 = x_2(D)-a; this_4(D)-a = _3; _6 = x_2(D)-b; this_4(D)-b = _6; _8 = x_2(D)-c; this_4(D)-c = _8; return; Optimized GIMPLE code: bb 2: _10 = x_2(D)-D.1867; _11 = BIT_FIELD_REF _10, 7, 0; _12 = this_4(D)-D.1867; _13 = _12 128; _14 = (unsigned char) _11; _15 = _13 | _14; this_4(D)-D.1867 = _15; return; Generated MIPS32r2 assembly code without optimization: lw $3,0($5) lbu $2,0($4) andi$3,$3,0x1 andi$2,$2,0xfe or $2,$2,$3 sb $2,0($4) lw $3,0($5) andi$2,$2,0xf9 andi$3,$3,0x6 or $2,$2,$3 sb $2,0($4) lw $3,0($5) andi$2,$2,0x87 andi$3,$3,0x78 or $2,$2,$3 j $31 sb $2,0($4) Optimized MIPS32r2 assembly code: lw $3,0($5) lbu $2,0($4) andi$3,$3,0x7f andi$2,$2,0x80 or $2,$3,$2 j $31 sb $2,0($4) Algorithm works on basic block level and consists of following 3 major steps: 1. Go through basic block statements list. If there are statement pairs that implement copy of bit field content from one memory location to another record statements pointers and other necessary data in corresponding data structure. 2. Identify records that represent adjacent bit field accesses and mark them as merged. 3. Lower bit-field accesses by using new field size for those that can be merged. New command line option -fmerge-bitfields is introduced. Tested - passed gcc regression tests for MIPS32r2. Changelog - gcc/ChangeLog: 2014-04-16 Zoran Jovanovic (zoran.jovano...@imgtec.com) * common.opt (fmerge-bitfields): New option. * doc/invoke.texi: Add reference to -fmerge-bitfields. * tree-sra.c (lower_bitfields): New function. Entry for (-fmerge-bitfields). (bf_access_candidate_p): New function. (lower_bitfield_read): New function. (lower_bitfield_write): New function. (bitfield_stmt_bfcopy_pair::hash): New function. (bitfield_stmt_bfcopy_pair::equal): New function. 
(bitfield_stmt_bfcopy_pair::remove): New function. (create_and_insert_bfcopy): New function. (get_bit_offset): New function. (add_stmt_bfcopy_pair): New function. (cmp_bfcopies): New function. (get_merged_bit_field_size): New function. * dwarf2out.c (simple_type_size_in_bits): Move to tree.c. (field_byte_offset): Move declaration to tree.h and make it extern. * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test. * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test. * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c. * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h. * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c. (simple_type_size_in_bits): Move from dwarf2out.c. * tree.h (expressions_equal_p): Add declaration. (field_byte_offset): Add declaration. Patch - diff --git a/gcc/common.opt b/gcc/common.opt index da275e5..52c7f58 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2203,6 +2203,10 @@ ftree-sra Common Report Var(flag_tree_sra) Optimization Perform scalar replacement of aggregates +fmerge-bitfields +Common Report Var(flag_tree_bitfield_merge) Optimization +Merge loads and stores of consecutive bitfields +
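The effect of the merge can be reproduced by hand on a single byte. The sketch below assumes the layout implied by the MIPS assembly above (a in bit 0, b in bits 1-2, c in bits 3-6, bit 7 unrelated) and reuses the masks from that listing. The function names are hypothetical, and this models only the arithmetic, not the GIMPLE pass itself.

```cpp
#include <cassert>
#include <cstdint>

// Field-by-field copy, one read-modify-write per bit-field, as in the
// unoptimized assembly (masks 0x1/0xfe, 0x6/0xf9, 0x78/0x87).
inline std::uint8_t copy_fields_separately(std::uint8_t dst, std::uint8_t src) {
  dst = static_cast<std::uint8_t>((dst & 0xfe) | (src & 0x01));  // a : 1
  dst = static_cast<std::uint8_t>((dst & 0xf9) | (src & 0x06));  // b : 2
  dst = static_cast<std::uint8_t>((dst & 0x87) | (src & 0x78));  // c : 4
  return dst;
}

// Merged copy: the three adjacent fields are treated as one 7-bit field,
// as in the optimized assembly (masks 0x7f/0x80).
inline std::uint8_t copy_fields_merged(std::uint8_t dst, std::uint8_t src) {
  return static_cast<std::uint8_t>((dst & 0x80) | (src & 0x7f));
}
```

Both functions leave bit 7 of the destination untouched, which is why the merged form is a valid replacement for the three separate copies.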
Re: [PATCH 3/3, x86] X86 Silvermont vector cost model tune
For the 3rd part of the patch there was a misprint in the estimated constant. It should be 1.7 instead of 1.8.

-      retval = (retval * 18) / 10;
+      retval = (retval * 17) / 10;

Bootstrap passed.

On Wed, Apr 16, 2014 at 12:02 PM, Uros Bizjak ubiz...@gmail.com wrote:
> On Tue, Apr 15, 2014 at 6:12 PM, Evgeny Stupachenko evstu...@gmail.com wrote:
> > 3d part:
> >
> > 2014-04-15  Evgeny Stupachenko  evstu...@gmail.com
> >
> >         * config/i386/i386.c (x86_add_stmt_cost): Fixing vector cost model for Silvermont.
>
> ... : Fix vector cost ...
>
> OK for mainline with the above ChangeLog fix.
>
> Thanks,
> Uros.
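The corrected line scales an integer cost by 1.7 without floating point, by multiplying by 17 and dividing by 10. A tiny sketch of that arithmetic follows; scale_by_tenths is a hypothetical helper, not a function in i386.c. Note that integer division truncates, so small costs round down.

```cpp
#include <cassert>

// Scale COST by TENTHS/10 using integer arithmetic only.
// The division truncates toward zero.
inline int scale_by_tenths(int cost, int tenths) {
  return (cost * tenths) / 10;
}
```

For example, a cost of 10 becomes 17 with the corrected factor and 18 with the misprinted one, while a cost of 3 becomes 5 in both cases because 51/10 and 54/10 both truncate to 5.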
Re: Make string_view operations involving CharT* *not* noexcept and consistent beween string_view and string_view.tcc.
On 04/15/2014 03:06 PM, Jonathan Wakely wrote:
> On 29/03/14 14:54 -0400, Ed Smith-Rowland wrote:
> > All,
> >
> > In string_view I botched the noexcept specification of operations like find and friends with CharT* arguments. I'm a little surprised the inconsistency between string_view and string_view.tcc didn't error. In fact, in one repo that's a little behind trunk it does. I'll continue to look after that issue separately.
>
> I'm fixing this differently, by strengthening the exception specs as Marc suggested. I haven't addressed Marc's other comments, but we should do.
>
> Tested x86_64-linux, committed to trunk.

Thanks,

The latest library fundamentals paper has a lot of changes coming: a lot of constexpr in the find-type functions. Unfortunately, most of that will have to wait until we get C++14 constexpr. Also, if the built-in strlen is or could be made constexpr then all the char* ctors could be constexpr as well.
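To illustrate the last point: with C++14 relaxed constexpr, a string length can be computed in a plain loop at compile time, which in turn lets a constructor taking const char* be constexpr. This is only a sketch. cstr_length and view_sketch are hypothetical names and are neither the built-in strlen nor the library's string_view.

```cpp
#include <cassert>
#include <cstddef>

// C++14 constexpr allows loops and local variables, so this can be
// evaluated at compile time for a string literal.
constexpr std::size_t cstr_length(const char* s) {
  std::size_t n = 0;
  while (s[n] != '\0')
    ++n;
  return n;
}

// Hypothetical minimal view type with a constexpr char* constructor.
struct view_sketch {
  const char* data;
  std::size_t size;
  constexpr view_sketch(const char* s) : data(s), size(cstr_length(s)) {}
};

static_assert(cstr_length("find") == 4, "length computed at compile time");
constexpr view_sketch greeting("hello");
static_assert(greeting.size == 5, "constexpr construction from a literal");
```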
[PATCH 1/2] libstdc++: Add std::align.
C++11 [ptr.align]. This should probably not be inline. But for now this avoids any ABI changes. * libstdc++-v3/testsuite/20_util/align/1.cc: New file. * libstdc++-v3/include/std/memory (align): New function. --- libstdc++-v3/include/std/memory | 35 + libstdc++-v3/testsuite/20_util/align/1.cc | 82 +++ 2 files changed, 117 insertions(+) create mode 100644 libstdc++-v3/testsuite/20_util/align/1.cc diff --git a/libstdc++-v3/include/std/memory b/libstdc++-v3/include/std/memory index dafec0c..f9ae7b9 100644 --- a/libstdc++-v3/include/std/memory +++ b/libstdc++-v3/include/std/memory @@ -83,6 +83,41 @@ # if _GLIBCXX_USE_DEPRECATED #include backward/auto_ptr.h # endif + +/** + * @brief Fit aligned storage in buffer. + * + * [ptr.align] + * + * This function tries to fit __size storage with __alignment into + * the buffer __ptr of size __space bytes. If such a buffer fits + * then __ptr is changed to point to the storage and __space is + * reduced by the bytes needed to adjust the alignment. + * + * @param __alignment A fundamental or extended alignment value + of the desired storage. + * @param __size Size of the aligned storage. + * @param __ptr Pointer to a buffer of __space byte size. + * @param __space Size of the buffer pointed to by __ptr. + * @return the updated pointer if the aligned storage fits or + nullptr else. 
+ */ +inline +void* +align(size_t __alignment, size_t __size, void* __ptr, size_t __space) +{ + const size_t __diff = __alignment - +reinterpret_castuintptr_t(__ptr) % __alignment; + if (__diff + __size = __space) +return nullptr; + else +{ + __space -= __diff; + __ptr = static_castchar*(__ptr) + __diff; + return __ptr; +} +} + #else # include backward/auto_ptr.h #endif diff --git a/libstdc++-v3/testsuite/20_util/align/1.cc b/libstdc++-v3/testsuite/20_util/align/1.cc new file mode 100644 index 000..2e74806 --- /dev/null +++ b/libstdc++-v3/testsuite/20_util/align/1.cc @@ -0,0 +1,82 @@ +// { dg-options -std=gnu++11 } + +// 2014-04-16 Rüdiger Sonderfeld ruedi...@c-plusplus.de + +// Copyright (C) 2014 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the terms +// of the GNU General Public License as published by the Free Software +// Foundation; either version 3, or (at your option) any later +// version. + +// This library is distributed in the hope that it will be useful, but +// WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +// General Public License for more details. + +// You should have received a copy of the GNU General Public License +// along with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. 
+ +// C++11 [ptr.align] (20.6.5): std::align + +#include memory +#include cstdint +#include testsuite_hooks.h + +#ifndef _GLIBCXX_ASSERT +# include iostream +# define STRINGIZE2(x) #x +# define STRINGIZE(x) STRINGIZE2(x) +# define CHECK(a, op, b) \ + do\ +{ \ + if ( !( (a) op (b) ) )\ +{ \ + cerr Mismatch: \ +STRINGIZE(a) \ + ( (a) )\ +STRINGIZE(op) ' ' \ +STRINGIZE(b) \ + ( (b) )\n; \ +} \ +} while(false) +#else +# define CHECK(a, op, b) VERIFY( (a) op (b) ) +#endif + +void test01() +{ + using namespace std; + bool test __attribute__((unused)) = true; + + size_t space = 100; + void* ptr = new char[space]; + char* const orig_ptr = static_castchar*(ptr); + char* old_ptr = orig_ptr; + const size_t orig_space = space; + size_t old_space = space; + const size_t alignment = 16; + const size_t size = 10; + while( void* const r = align(alignment, size, ptr, space) ) +{ + CHECK( r, ==, ptr ); + uintptr_t p = reinterpret_castuintptr_t(ptr); + CHECK( p % alignment, ==, 0 ); + char* const x = static_castchar*(ptr); + CHECK( x - old_ptr, ==, old_space - space ); + CHECK( (void*)x, , (void*)(orig_ptr + orig_space) ); + CHECK( (void*)(x + size), , (void*)(orig_ptr + orig_space) ); + ptr = x + size; + old_ptr = x; + old_space = space; + space -= size; +} + delete [] orig_ptr; +} + +int main() +{ + test01(); +} -- 1.9.2
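For comparison with the test above, here is a small usage sketch of std::align as specified in C++11 [ptr.align], where the pointer and the remaining space are passed by reference and updated on success. It assumes a standard library that already provides std::align in <memory>; count_aligned_chunks is a hypothetical helper that mirrors the loop in test01.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Carve as many SIZE-byte, ALIGNMENT-aligned chunks as fit into the
// buffer and return how many were found.
std::size_t count_aligned_chunks(void* buffer, std::size_t space,
                                 std::size_t alignment, std::size_t size) {
  std::size_t count = 0;
  void* ptr = buffer;
  // On success std::align bumps PTR up to the next aligned address and
  // reduces SPACE by the padding; on failure it returns nullptr.
  while (std::align(alignment, size, ptr, space)) {
    assert(reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0);
    ptr = static_cast<char*>(ptr) + size;  // consume the chunk ourselves
    space -= size;
    ++count;
  }
  return count;
}
```

With a 16-byte-aligned buffer of 100 bytes, alignment 16 and size 10, chunks start at offsets 0, 16, 32, 48, 64 and 80; the next candidate at offset 96 no longer fits.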
[PATCH 2/2] libstdc++: Add std::aligned_union.
C++11: [meta.trans.other]

* libstdc++-v3/testsuite/20_util/aligned_union/1.cc: New file.
* libstdc++-v3/include/std/type_traits (__strictest_alignment): New
  helper struct.
  (aligned_union): New struct (C++11).
  (aligned_union_t): New type alias (C++14).
---
 libstdc++-v3/include/std/type_traits              | 43 ++
 libstdc++-v3/testsuite/20_util/aligned_union/1.cc | 72 +++
 2 files changed, 115 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/20_util/aligned_union/1.cc

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 4b434a6..2b75345 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1837,6 +1837,46 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       };
     };

+  template <typename... _Types>
+    struct __strictest_alignment
+    {
+      static const size_t _M_alignment = 0;
+      static const size_t _M_size = 0;
+    };
+
+  template <typename _T, typename... _Types>
+    struct __strictest_alignment<_T, _Types...>
+    {
+      static const size_t _M_alignment =
+        alignof(_T) > __strictest_alignment<_Types...>::_M_alignment ?
+        alignof(_T) : __strictest_alignment<_Types...>::_M_alignment;
+      static const size_t _M_size =
+        sizeof(_T) > __strictest_alignment<_Types...>::_M_size ?
+        sizeof(_T) : __strictest_alignment<_Types...>::_M_size;
+    };
+
+  /**
+   *  @brief Provide aligned storage for types.
+   *
+   *  [meta.trans.other]
+   *
+   *  Provides aligned storage for any of the provided types of at
+   *  least size _Len.
+   *
+   *  @see aligned_storage
+   */
+  template <size_t _Len, typename... _Types>
+    struct aligned_union
+    {
+      /// The value of the strictest alignment of _Types.
+      static const size_t alignment_value =
+        __strictest_alignment<_Types...>::_M_alignment;
+      static const size_t _M_len =
+        _Len > __strictest_alignment<_Types...>::_M_size ?
+        _Len : __strictest_alignment<_Types...>::_M_size;
+      /// The storage.
+      typedef typename aligned_storage<_M_len, alignment_value>::type type;
+    };

   // Decay trait for arrays and functions, used for perfect forwarding
   // in make_pair, make_tuple, etc.
@@ -2173,6 +2213,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
            __alignof__(typename __aligned_storage_msa<_Len>::__type)>
     using aligned_storage_t = typename aligned_storage<_Len, _Align>::type;

+  template <size_t Len, class... Types>
+    using aligned_union_t = typename aligned_union<Len,Types...>::type;
+
   /// Alias template for decay
   template<typename _Tp>
     using decay_t = typename decay<_Tp>::type;
diff --git a/libstdc++-v3/testsuite/20_util/aligned_union/1.cc b/libstdc++-v3/testsuite/20_util/aligned_union/1.cc
new file mode 100644
index 000..c01ecc0
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/aligned_union/1.cc
@@ -0,0 +1,72 @@
+// { dg-options "-std=gnu++11" }
+
+// 2014-04-16 Rüdiger Sonderfeld <ruedi...@c-plusplus.de>
+
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the terms
+// of the GNU General Public License as published by the Free Software
+// Foundation; either version 3, or (at your option) any later
+// version.
+
+// This library is distributed in the hope that it will be useful, but
+// WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+// General Public License for more details.
+
+// You should have received a copy of the GNU General Public License
+// along with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// C++11 [meta.trans.other] 20.9.7.6: aligned_union
+
+#include <type_traits>
+#include <initializer_list>
+#include <testsuite_tr1.h>
+
+struct MSAlignType { } __attribute__((__aligned__));
+
+template<typename...T>
+  struct mymax
+  {
+    static const std::size_t alignment = 0;
+    static const std::size_t size = 0;
+  };
+
+template<typename L, typename...T>
+  struct mymax<L, T...>
+  {
+    static const std::size_t alignment = alignof(L) > mymax<T...>::alignment
+      ? alignof(L) : mymax<T...>::alignment;
+    static const std::size_t size = sizeof(L) > mymax<T...>::size
+      ? sizeof(L) : mymax<T...>::size;
+  };
+
+void test01()
+{
+  using std::aligned_union;
+  using std::alignment_of;
+  using std::size_t;
+  using namespace __gnu_test;
+
+  const size_t max_a = mymax<char, short, int, double, int[4],
+                             ClassType, MSAlignType>::alignment;
+  const size_t max_s = mymax<char, short, int, double, int[4],
+                             ClassType, MSAlignType>::size;
+
+  typedef aligned_union<0, char, short, int, double, int[4],
+                        ClassType, MSAlignType> au_type;
+  static_assert(au_type::alignment_value == max_a, "Alignment value");
+
RE: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16
-----Original Message-----
From: Richard Sandiford [mailto:rdsandif...@googlemail.com]
Sent: Tuesday, April 15, 2014 4:32 PM
To: Moore, Catherine
Cc: Rozycki, Maciej; Matthew Fortune; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16

Moore, Catherine catherine_mo...@mentor.com writes:

-----Original Message-----
From: Moore, Catherine
Sent: Tuesday, April 15, 2014 8:49 AM
To: Rozycki, Maciej; Richard Sandiford
Cc: Matthew Fortune; gcc-patches@gcc.gnu.org; Moore, Catherine
Subject: RE: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16

-----Original Message-----
From: Maciej W. Rozycki [mailto:ma...@codesourcery.com]
Sent: Tuesday, April 15, 2014 7:28 AM
To: Richard Sandiford
Cc: Matthew Fortune; Moore, Catherine; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16

On Tue, 15 Apr 2014, Richard Sandiford wrote:

I believe you need to adjust constraints to ensure constant 0 is known to produce a 16-bit instruction encoding where possible. Otherwise you'll end up with suboptimal code when the instruction is in a branch delay slot. Yeah, it'd be good to do that too (although this is a preexisting problem). Well, it depends on how you look at the problem being solved here -- if it is "for SW16, SH16 and SB16 GCC produces broken code for the `s0' source register", then indeed it is, whereas if it is "GCC does not handle the source register set for SW16, SH16 and SB16 correctly", then it is a part of the same problem, not completely corrected. I can live with that until 4.10/4.9.1 though if you prefer. I'm relying on you guys to do the microMIPS stuff though -- I don't have a way of testing it. An assembly/objdump test is enough to cover this, so you've got all tools at hand, although I understand you may not be inclined to rush working on it. ;)

I'll take care of this bit.

I've attached an updated patch to address Maciej's concern with $0 and the microMIPS store instructions. Does this look okay to install?

No, the point was that zero is modelled as a constant in RTL, so like Maciej says, the way to handle it is to use the J constraint (like some of the existing constraints use "dJ" for any GPR or zero). What we want to test is that:

  *ptr = 0;

is a 16-bit instruction. You could do that by adding -dp to the options and matching something like:

  MICROMIPS void
  f1 (unsigned char *ptr)
  {
    *ptr = 0;
  }
  ...[similarly for short and int]...

  /* { dg-final { scan-assembler "\tsb\t\\\$0, 0\\(\\\$4\\)\[^\n\]length = 2" } } */
  ...[similarly for sh and sw]...

Completely untested. I bet the regexp needs different backslashes. :-)

Okay, this patch modifies the constraints instead. Okay?

umips-store.cl
Description: umips-store.cl

umips-store.patch
Description: umips-store.patch
Re: [PATCH 1/2] libstdc++: Add std::align.
On 16/04/14 17:06 +0200, Rüdiger Sonderfeld wrote:
> C++11 [ptr.align].
>
> This should probably not be inline. But for now this avoids any ABI changes.

Adding new non-member functions is fine for ABI purposes (adding new virtual functions is not).
[RFC] proof-of-concept: warning for a bit comparison that is always true/false
Hello!

I am new to GCC.

I want to add a warning to GCC when a bit comparison is always true/false.

Example:

  if ((x&4)==0) {}  // <- no warning
  if ((x&4)==4) {}  // <- no warning
  if ((x&4)==5) {}  // <- warn!

When this warning is triggered, the most common cause is that somebody made a mistake when using bitmasks.

I attach a proof-of-concept patch. I would like comments.

The patch needs some cleanup before it's applied. I would like it to handle at least != also and not just ==. And I would like it to be less strict about where integer constants are located.

I wonder where I should put this code. Is gcc/c/c-typeck.c a good file to put this in? Should I put it somewhere else?

What warning flags should be used to enable this? Is some -Wcondition-bitop a good idea? Can this be added by -Wall?

I wrote this check for Cppcheck years ago. In my experience this warning has a good signal/noise ratio.

Best regards,
Daniel Marjamäki

Index: gcc/testsuite/c-c++-common/Wcondition-bitop.c
===================================================================
--- gcc/testsuite/c-c++-common/Wcondition-bitop.c (revision 0)
+++ gcc/testsuite/c-c++-common/Wcondition-bitop.c (revision 0)
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall" } */
+
+void f(int x) {
+  if ((x&4)==0){}
+  if ((x&4)==4){}
+  if ((x&4)==2){} /* { dg-warning "Comparison is always false" } */
+}
Index: gcc/c/c-typeck.c
===================================================================
--- gcc/c/c-typeck.c (revision 209347)
+++ gcc/c/c-typeck.c (arbetskopia)
@@ -9980,6 +9980,15 @@
   bool op0_int_operands, op1_int_operands;
   bool int_const, int_const_or_overflow, int_operands;

+  if ((code == EQ_EXPR) &&
+      (TREE_CODE(orig_op0) == BIT_AND_EXPR) &&
+      (TREE_CODE(orig_op1) == INTEGER_CST) &&
+      (TREE_INT_CST_LOW(orig_op1) != 0) &&
+      (TREE_CODE(TREE_OPERAND_CHECK(orig_op0,1)) == INTEGER_CST) &&
+      (((TREE_INT_CST_LOW(TREE_OPERAND_CHECK(orig_op0,1)) & TREE_INT_CST_LOW(orig_op1)) != TREE_INT_CST_LOW(orig_op1)) ||
+       ((TREE_INT_CST_HIGH(TREE_OPERAND_CHECK(orig_op0,1)) & TREE_INT_CST_HIGH(orig_op1)) != TREE_INT_CST_HIGH(orig_op1))))
+    warning_at(location, 0, "Comparison is always false");
+
   /* Expression code to give to the expression when it is built.
      Normally this is CODE, which is what the caller asked for,
      but in some special cases we change it.  */
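The core of the proposed check can be stated without any GCC internals: `(x & mask) == value` can only hold if every bit set in value is also set in mask. A small sketch of that rule follows; the function name and_compare_always_false is hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not GCC code: returns true when (x & mask) == value
// is false for every possible x, i.e. value has a bit that mask clears.
constexpr bool and_compare_always_false(std::uint64_t mask, std::uint64_t value)
{
  return (mask & value) != value;
}

// The three cases from the mail, checked at compile time.
static_assert(!and_compare_always_false(4, 0), "(x&4)==0 can be true");
static_assert(!and_compare_always_false(4, 4), "(x&4)==4 can be true");
static_assert(and_compare_always_false(4, 5), "(x&4)==5 is always false");
```

The patch applies the same test separately to the low and high words of the constants and skips a zero right-hand side, for which the test can never fire anyway.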
Re: [PATCH 2/2] libstdc++: Add std::aligned_union.
On 16/04/14 17:06 +0200, Rüdiger Sonderfeld wrote:
> C++11: [meta.trans.other]
>
> * libstdc++-v3/testsuite/20_util/aligned_union/1.cc: New file.
> * libstdc++-v3/include/std/type_traits (__strictest_alignment): New
>   helper struct.
>   (aligned_union): New struct (C++11).
>   (aligned_union_t): New type alias (C++14).

Thanks! I was hoping to implement it the straightforward way, but was thwarted by http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59012

> +  template <typename... _Types>
> +    struct __strictest_alignment
> +    {
> +      static const size_t _M_alignment = 0;
> +      static const size_t _M_size = 0;

The naming convention is to use _S_xxx for static members.

> +  template <size_t Len, class... Types>
> +    using aligned_union_t = typename aligned_union<Len,Types...>::type;

It should be s/class/typename/ here.

Is <initializer_list> needed for the testcase?

The testcase can use { dg-do compile } to turn it into a compile-only test, and then doesn't need a main() function.

Otherwise I think this looks good, thanks.
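For readers who want to try the idea under review outside libstdc++, here is a minimal standalone sketch of the recursive "strictest alignment, largest size" computation. The names strictest and my_aligned_union are hypothetical, and the storage uses alignas on a byte array instead of aligned_storage, so this is an illustration of the technique rather than the patch itself.

```cpp
#include <cassert>
#include <cstddef>

// Base case: an empty pack contributes nothing (same convention as the patch).
template <typename... Types>
struct strictest
{
  static constexpr std::size_t alignment = 0;
  static constexpr std::size_t size = 0;
};

// Recursive case: keep the larger of this type's value and the rest's.
template <typename T, typename... Types>
struct strictest<T, Types...>
{
  static constexpr std::size_t alignment =
    alignof(T) > strictest<Types...>::alignment
      ? alignof(T) : strictest<Types...>::alignment;
  static constexpr std::size_t size =
    sizeof(T) > strictest<Types...>::size
      ? sizeof(T) : strictest<Types...>::size;
};

// Hypothetical stand-in for aligned_union: storage of at least Len bytes,
// suitably aligned and sized for any of Types.
template <std::size_t Len, typename... Types>
struct my_aligned_union
{
  static constexpr std::size_t alignment_value = strictest<Types...>::alignment;
  static constexpr std::size_t len =
    Len > strictest<Types...>::size ? Len : strictest<Types...>::size;
  struct type { alignas(alignment_value) unsigned char data[len]; };
};

static_assert(my_aligned_union<0, char, int[4]>::len == sizeof(int[4]),
              "size follows the largest type");
```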
Re: [PATCH 1/2] libstdc++: Add std::align.
On 16/04/14 16:19 +0100, Jonathan Wakely wrote:
> On 16/04/14 17:06 +0200, Rüdiger Sonderfeld wrote:
>> C++11 [ptr.align].
>>
>> This should probably not be inline. But for now this avoids any ABI changes.
>
> Adding new non-member functions is fine for ABI purposes (adding new virtual functions is not).

Actually I think that is OK to be inline anyway, patch approved.

I'll apply it shortly - thanks.
RE: [PATCH] Do not run IPA transform phases multiple times
Likely after this was checked in, the following failures appeared on x86:

FAIL: gcc.dg/vect/vect-simd-clone-11.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-simd-clone-11.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/vect-simd-clone-12.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-simd-clone-12.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/vect-simd-clone-1.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-simd-clone-1.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/vect-simd-clone-2.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-simd-clone-2.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/vect-simd-clone-3.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-simd-clone-3.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/vect-simd-clone-4.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-simd-clone-4.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/vect-simd-clone-5.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-simd-clone-5.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/vect-simd-clone-6.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-simd-clone-6.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/vect-simd-clone-7.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-simd-clone-7.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/vect-simd-clone-8.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-simd-clone-8.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/vect-simd-clone-9.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-simd-clone-9.c -flto -ffat-lto-objects (test for excess errors)

-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Martin Jambor
Sent: Monday, April 14, 2014 9:37 PM
To: GCC Patches
Cc: Jan Hubicka
Subject: [PATCH] Do not run IPA transform phases multiple times

Hi,

I have noticed that we often run IPA-transformation phases multiple times. The reason is that when a virtual clone is created, it starts with an empty ipa_transforms_to_apply vector and all subsequent IPA passes that see it are pushed onto it. The original node gets these pushed on their ipa_transforms_to_apply too. Later on, when a clone receives its body, in tree_function_versioning, the contents of original node's ipa_transforms_to_apply is added to clone's ipa_transforms_to_apply (in a slightly convoluted way). I do not understand how this was supposed to work but obviously we can and do end up with multiple copies of a few passes in ipa_transforms_to_apply of the clone.

I believe the correct thing to do is to copy the vector when the virtual clone is created and remove the part in tree_function_versioning that does the copying. I have verified that we no longer call transformation of IPA-CP multiple times when we previously did and added asserts to check that no other caller of tree_function_versioning expects to have the vector copied.

Bootstrapped and tested on x86_64-linux, OK for trunk?

Thanks,

Martin

2014-04-10  Martin Jambor  <mjam...@suse.cz>

	* cgraphclones.c (cgraph_create_virtual_clone): Duplicate
	ipa_transforms_to_apply.
	(cgraph_function_versioning): Assert that old_node has empty
	ipa_transforms_to_apply.
	* trans-mem.c (ipa_tm_create_version): Likewise.
	* tree-inline.c (tree_function_versioning): Do not duplicate
	ipa_transforms_to_apply.

Index: src/gcc/cgraphclones.c
===================================================================
--- src.orig/gcc/cgraphclones.c
+++ src/gcc/cgraphclones.c
@@ -600,6 +600,9 @@ cgraph_create_virtual_clone (struct cgra
     }
   else
     new_node->clone.combined_args_to_skip = args_to_skip;
+  if (old_node->ipa_transforms_to_apply.exists ())
+    new_node->ipa_transforms_to_apply
+      = old_node->ipa_transforms_to_apply.copy ();

   cgraph_call_node_duplication_hooks (old_node, new_node);

@@ -971,6 +974,7 @@ cgraph_function_versioning (struct cgrap
   cgraph_copy_node_for_versioning (old_version_node, new_decl, redirect_callers,
				    bbs_to_copy);

+  gcc_assert (!old_version_node->ipa_transforms_to_apply.exists ());
   /* Copy the OLD_VERSION_NODE function tree to the new version.  */
   tree_function_versioning (old_decl, new_decl, tree_map, false, args_to_skip,
			    skip_return, bbs_to_copy, new_entry_block);
Index: src/gcc/trans-mem.c
===================================================================
--- src.orig/gcc/trans-mem.c
+++
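The duplication Martin describes is easy to reproduce with a toy model. The sketch below uses plain std::vector and a hypothetical Node struct, not GCC's cgraph structures, and simplifies the "slightly convoluted" copying in tree_function_versioning to a straight append; it only illustrates why copying at clone creation avoids duplicates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a call-graph node; pass ids are plain ints.
struct Node
{
  std::vector<int> transforms_to_apply;
};

// Old scheme (simplified): the clone starts empty, later passes are pushed
// onto both nodes, and the original's list is appended when the clone
// receives its body.
inline std::vector<int> old_scheme(const std::vector<int>& before_clone,
                                   const std::vector<int>& after_clone)
{
  Node orig, clone;
  orig.transforms_to_apply = before_clone;
  for (int pass : after_clone)
    {
      orig.transforms_to_apply.push_back(pass);
      clone.transforms_to_apply.push_back(pass);
    }
  clone.transforms_to_apply.insert(clone.transforms_to_apply.end(),
                                   orig.transforms_to_apply.begin(),
                                   orig.transforms_to_apply.end());
  return clone.transforms_to_apply;
}

// Proposed scheme: copy the vector when the clone is created, append nothing
// later.
inline std::vector<int> new_scheme(const std::vector<int>& before_clone,
                                   const std::vector<int>& after_clone)
{
  Node orig;
  orig.transforms_to_apply = before_clone;
  Node clone = orig;  // copy at clone creation
  for (int pass : after_clone)
    {
      orig.transforms_to_apply.push_back(pass);
      clone.transforms_to_apply.push_back(pass);
    }
  return clone.transforms_to_apply;
}
```

With one pass registered before cloning and two after, the old scheme leaves the clone with five entries (the two later passes appear twice), while the new scheme leaves exactly three.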
Re: [RFC] Add aarch64 support for ada
On 04/16/2014 12:39 AM, Eric Botcazou wrote:
>> The primary bit of rfc here is the hunk that applies to ada/types.h
>> with respect to Fat_Pointer. Given that the Ada type, as defined in
>> s-stratt.ads, does not include alignment, I can't imagine why the C
>> type should have it.
>
> See gcc-interface/utils.c:finish_fat_pointer_type.

Ah hah.

  /* Make sure we can put it into a register.  */
  if (STRICT_ALIGNMENT)
    TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);

AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.

If we were to make this alignment unconditional, would it be better to drop the code from here in finish_fat_pointer_type and instead record that in the Ada source, as we do with the C source?

I presume

  for Fat_Pointer'Alignment use System.Address'Size * 2;

or some such incantation would do that...

r~
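The mismatch discussed above comes down to two struct definitions with the same members but different alignment. The sketch below shows the difference with two hypothetical types (PlainFatPointer and AlignedFatPointer are illustrative names, not the declarations from ada/types.h); on a calling convention that takes alignment into account when assigning argument registers, as the original RFC says AArch64 does, the two can be passed differently even though their size is identical.

```cpp
#include <cassert>
#include <cstddef>

// Two pointers, natural alignment: what the Ada side effectively declares
// when no alignment clause is given.
struct PlainFatPointer
{
  void* array;
  void* bounds;
};

// Same members, but over-aligned to twice the pointer size, similar in
// spirit to an aligned attribute on the C declaration.
struct alignas(2 * sizeof(void*)) AlignedFatPointer
{
  void* array;
  void* bounds;
};

static_assert(sizeof(PlainFatPointer) == sizeof(AlignedFatPointer),
              "same size either way");
static_assert(alignof(AlignedFatPointer) == 2 * sizeof(void*),
              "over-aligned variant");
```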
[PATCHv2 2/2] libstdc++: Add std::aligned_union.
> Thanks! I was hoping to implement it the straightforward way, but was
> thwarted by http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59012

There are certainly nicer ways to implement it. At least in C++14 there should be a usable constexpr std::max instead of the verbose ?: usage. Maybe sizeof/alignof could be used with the parameter pack expansion.

I fixed the other issues below.

Regards,
Rüdiger

-- 8< --- >8 --

C++11: [meta.trans.other]

* libstdc++-v3/testsuite/20_util/aligned_union/1.cc: New file.
* libstdc++-v3/include/std/type_traits (__strictest_alignment): New
  helper struct.
  (aligned_union): New struct (C++11).
  (aligned_union_t): New type alias (C++14).
---
 libstdc++-v3/include/std/type_traits              | 43 ++
 libstdc++-v3/testsuite/20_util/aligned_union/1.cc | 72 +++
 2 files changed, 115 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/20_util/aligned_union/1.cc

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 4b434a6..4441290 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1837,6 +1837,46 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       };
     };

+  template <typename... _Types>
+    struct __strictest_alignment
+    {
+      static const size_t _S_alignment = 0;
+      static const size_t _S_size = 0;
+    };
+
+  template <typename _T, typename... _Types>
+    struct __strictest_alignment<_T, _Types...>
+    {
+      static const size_t _S_alignment =
+        alignof(_T) > __strictest_alignment<_Types...>::_S_alignment ?
+        alignof(_T) : __strictest_alignment<_Types...>::_S_alignment;
+      static const size_t _S_size =
+        sizeof(_T) > __strictest_alignment<_Types...>::_S_size ?
+        sizeof(_T) : __strictest_alignment<_Types...>::_S_size;
+    };
+
+  /**
+   *  @brief Provide aligned storage for types.
+   *
+   *  [meta.trans.other]
+   *
+   *  Provides aligned storage for any of the provided types of at
+   *  least size _Len.
+   *
+   *  @see aligned_storage
+   */
+  template <size_t _Len, typename... _Types>
+    struct aligned_union
+    {
+      /// The value of the strictest alignment of _Types.
+      static const size_t alignment_value =
+        __strictest_alignment<_Types...>::_M_alignment;
+      static const size_t _S_len =
+        _Len > __strictest_alignment<_Types...>::_S_size ?
+        _Len : __strictest_alignment<_Types...>::_S_size;
+      /// The storage.
+      typedef typename aligned_storage<_S_len, alignment_value>::type type;
+    };

   // Decay trait for arrays and functions, used for perfect forwarding
   // in make_pair, make_tuple, etc.
@@ -2173,6 +2213,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
            __alignof__(typename __aligned_storage_msa<_Len>::__type)>
     using aligned_storage_t = typename aligned_storage<_Len, _Align>::type;

+  template <size_t _Len, typename... _Types>
+    using aligned_union_t = typename aligned_union<_Len, _Types...>::type;
+
   /// Alias template for decay
   template<typename _Tp>
     using decay_t = typename decay<_Tp>::type;
diff --git a/libstdc++-v3/testsuite/20_util/aligned_union/1.cc b/libstdc++-v3/testsuite/20_util/aligned_union/1.cc
new file mode 100644
index 000..5285bb0
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/aligned_union/1.cc
@@ -0,0 +1,72 @@
+// { dg-options "-std=gnu++11" }
+// { dg-do compile }
+
+// 2014-04-16 Rüdiger Sonderfeld <ruedi...@c-plusplus.de>
+
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the terms
+// of the GNU General Public License as published by the Free Software
+// Foundation; either version 3, or (at your option) any later
+// version.
+
+// This library is distributed in the hope that it will be useful, but
+// WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+// General Public License for more details.
+
+// You should have received a copy of the GNU General Public License
+// along with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// C++11 [meta.trans.other] 20.9.7.6: aligned_union
+
+#include <type_traits>
+#include <testsuite_tr1.h>
+
+struct MSAlignType { } __attribute__((__aligned__));
+
+template<typename...T>
+  struct mymax
+  {
+    static const std::size_t alignment = 0;
+    static const std::size_t size = 0;
+  };
+
+template<typename L, typename...T>
+  struct mymax<L, T...>
+  {
+    static const std::size_t alignment = alignof(L) > mymax<T...>::alignment
+      ? alignof(L) : mymax<T...>::alignment;
+    static const std::size_t size = sizeof(L) > mymax<T...>::size
+      ? sizeof(L) : mymax<T...>::size;
+  };
+
+void test01()
+{
+  using std::aligned_union;
+  using std::alignment_of;
+  using std::size_t;
+  using namespace
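The remark in the PATCHv2 mail about a constexpr std::max and pack expansion can be sketched as follows. This assumes a C++14 (or later) compiler, where the initializer-list overload of std::max is constexpr; the name strictest14 is hypothetical and this is not what the patch does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical C++14 variant: expand the pack directly into an initializer
// list and let constexpr std::max pick the largest value.
template <typename... Types>
struct strictest14
{
  static_assert(sizeof...(Types) > 0, "at least one type is required");
  static constexpr std::size_t alignment = std::max({alignof(Types)...});
  static constexpr std::size_t size = std::max({sizeof(Types)...});
};

static_assert(strictest14<char, int[4]>::size == sizeof(int[4]),
              "largest size wins");
```

Unlike the recursive ?: version, this needs no partial specialization, but it cannot handle an empty pack without a separate case, since std::max on an empty list is undefined.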
[PATCHv3 2/2] libstdc++: Add std::aligned_union.
Of course I forgot to replace one _M_ instance. This should work now. Sorry about this.

-- 8< - >8 --

C++11: [meta.trans.other]

* libstdc++-v3/testsuite/20_util/aligned_union/1.cc: New file.
* libstdc++-v3/include/std/type_traits (__strictest_alignment): New
  helper struct.
  (aligned_union): New struct (C++11).
  (aligned_union_t): New type alias (C++14).
---
 libstdc++-v3/include/std/type_traits              | 43 ++
 libstdc++-v3/testsuite/20_util/aligned_union/1.cc | 72 +++
 2 files changed, 115 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/20_util/aligned_union/1.cc

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 4b434a6..7fb3b74 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1837,6 +1837,46 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       };
     };

+  template <typename... _Types>
+    struct __strictest_alignment
+    {
+      static const size_t _S_alignment = 0;
+      static const size_t _S_size = 0;
+    };
+
+  template <typename _T, typename... _Types>
+    struct __strictest_alignment<_T, _Types...>
+    {
+      static const size_t _S_alignment =
+        alignof(_T) > __strictest_alignment<_Types...>::_S_alignment ?
+        alignof(_T) : __strictest_alignment<_Types...>::_S_alignment;
+      static const size_t _S_size =
+        sizeof(_T) > __strictest_alignment<_Types...>::_S_size ?
+        sizeof(_T) : __strictest_alignment<_Types...>::_S_size;
+    };
+
+  /**
+   *  @brief Provide aligned storage for types.
+   *
+   *  [meta.trans.other]
+   *
+   *  Provides aligned storage for any of the provided types of at
+   *  least size _Len.
+   *
+   *  @see aligned_storage
+   */
+  template <size_t _Len, typename... _Types>
+    struct aligned_union
+    {
+      /// The value of the strictest alignment of _Types.
+      static const size_t alignment_value =
+        __strictest_alignment<_Types...>::_S_alignment;
+      static const size_t _S_len =
+        _Len > __strictest_alignment<_Types...>::_S_size ?
+        _Len : __strictest_alignment<_Types...>::_S_size;
+      /// The storage.
+      typedef typename aligned_storage<_S_len, alignment_value>::type type;
+    };

   // Decay trait for arrays and functions, used for perfect forwarding
   // in make_pair, make_tuple, etc.
@@ -2173,6 +2213,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
            __alignof__(typename __aligned_storage_msa<_Len>::__type)>
     using aligned_storage_t = typename aligned_storage<_Len, _Align>::type;

+  template <size_t _Len, typename... _Types>
+    using aligned_union_t = typename aligned_union<_Len, _Types...>::type;
+
   /// Alias template for decay
   template<typename _Tp>
     using decay_t = typename decay<_Tp>::type;
diff --git a/libstdc++-v3/testsuite/20_util/aligned_union/1.cc b/libstdc++-v3/testsuite/20_util/aligned_union/1.cc
new file mode 100644
index 000..5285bb0
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/aligned_union/1.cc
@@ -0,0 +1,72 @@
+// { dg-options "-std=gnu++11" }
+// { dg-do compile }
+
+// 2014-04-16 Rüdiger Sonderfeld <ruedi...@c-plusplus.de>
+
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the terms
+// of the GNU General Public License as published by the Free Software
+// Foundation; either version 3, or (at your option) any later
+// version.
+
+// This library is distributed in the hope that it will be useful, but
+// WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+// General Public License for more details.
+
+// You should have received a copy of the GNU General Public License
+// along with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// C++11 [meta.trans.other] 20.9.7.6: aligned_union
+
+#include <type_traits>
+#include <testsuite_tr1.h>
+
+struct MSAlignType { } __attribute__((__aligned__));
+
+template<typename...T>
+  struct mymax
+  {
+    static const std::size_t alignment = 0;
+    static const std::size_t size = 0;
+  };
+
+template<typename L, typename...T>
+  struct mymax<L, T...>
+  {
+    static const std::size_t alignment = alignof(L) > mymax<T...>::alignment
+      ? alignof(L) : mymax<T...>::alignment;
+    static const std::size_t size = sizeof(L) > mymax<T...>::size
+      ? sizeof(L) : mymax<T...>::size;
+  };
+
+void test01()
+{
+  using std::aligned_union;
+  using std::alignment_of;
+  using std::size_t;
+  using namespace __gnu_test;
+
+  const size_t max_a = mymax<char, short, int, double, int[4],
+                             ClassType, MSAlignType>::alignment;
+  const size_t max_s = mymax<char, short, int, double, int[4],
+                             ClassType, MSAlignType>::size;
+
+  typedef aligned_union<0, char, short, int, double,
Re: Remove obsolete Solaris 9 support
> * Eric: In libgcc/config/sparc/sol2-unwind.h, I've removed the Solaris 9
>   cases after verifying that the cuh_pattern's used there only occur in
>   Solaris 9 (from FCS to the latest libthread.so.1 patch), but not even
>   in Solaris 10 FCS.
>
>   For Solaris 10, do you have any more details on when the 2-frame case
>   occurs? I've found that patch submission
>
>   http://gcc.gnu.org/ml/gcc-patches/2010-10/msg02370.html
>
>   but no details on what S10 update/patch this happens with.

Let's not touch libgcc/config/sparc/sol2-unwind.h, the small gain is not worth the potential trouble IMO.

--
Eric Botcazou
Re: RFA: Tighten checking for 'X' constraints
> Anyway, others can have different opinion on what X should mean, CCing Jeff
> and Eric.

I personally think that we should not change it and adjust LRA instead to error out instead of ICEing (even if this means erroring out in a few more cases with LRA than with reload for now, e.g. gcc.dg/torture/asm-subreg-1.c which looks somewhat dubious to me).

--
Eric Botcazou
Re: Remove obsolete Solaris 9 support
Eric Botcazou ebotca...@adacore.com writes:

>> * Eric: In libgcc/config/sparc/sol2-unwind.h, I've removed the Solaris 9
>>   cases after verifying that the cuh_pattern's used there only occur in
>>   Solaris 9 (from FCS to the latest libthread.so.1 patch), but not even
>>   in Solaris 10 FCS.
>>
>>   For Solaris 10, do you have any more details on when the 2-frame case
>>   occurs? I've found that patch submission
>>
>>   http://gcc.gnu.org/ml/gcc-patches/2010-10/msg02370.html
>>
>>   but no details on what S10 update/patch this happens with.
>
> Let's not touch libgcc/config/sparc/sol2-unwind.h, the small gain is not
> worth the potential trouble IMO.

Maybe not for the 2-frame vs. 3-frame case, though it would still be good to know the exact circumstances. But for the Solaris 9 stuff, it is crystal clear that this cannot occur on Solaris 10 and up (no single-threaded case anymore since libthread.so.1 has been folded into libc.so.1). Ok to remove this part?

	Rainer

--
-
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: RFA: Tighten checking for 'X' constraints
Eric Botcazou ebotca...@adacore.com writes:

>> Anyway, others can have different opinion on what X should mean, CCing Jeff
>> and Eric.
>
> I personally think that we should not change it and adjust LRA instead to
> error out instead of ICEing (even if this means erroring out in a few more
> cases with LRA than with reload for now, e.g. gcc.dg/torture/asm-subreg-1.c
> which looks somewhat dubious to me).

But the error isn't really under user control. The original asm in the testcase is:

  __asm__ __volatile__ (... : : "X" (ptr));

which seems fine, rather than:

  __asm__ __volatile__ (... : : "X" (*x + y * z / 11));

I don't think it's the user's fault that the compiler decided to convert the former to the latter.

Thanks,
Richard
Re: C++ PATCH for c++/51747 (list-initialization from same type)
OK. Jason
Re: [PATCH] offline gcda profile processing tool
On Tue, Apr 15, 2014 at 2:38 PM, Jan Hubicka hubi...@ucw.cz wrote:
> Rong, David, Dehao, Teresa
> I would like to have some rough idea of what we could merge this stage1. There is certainly a lot of interesting stuff on the google branch including AutoFDO, LIPO, the multivalue profile counters that may be used by the new devirtualization bits and more. I also think we should switch counts into floating point representation so Teresa's splitting patch works.
>
> Can we get plans to make this effective? My personal schedule is quite free until April 29 when I go to Czech Republic for wedding and I will be back in Calgary at 14th.
>
> > 2014-03-03  Rong Xu  <x...@google.com>
> >
> >	* gcc/gcov-io.c (gcov_read_string): Make this routine available
> >	to gcov-tool.
> >	(gcov_sync): Ditto.
> >	* gcc/Makefile.in: Build and install gcov-tool.
> >	* gcc/gcov-tool.c (unlink_gcda_file): Remove one gcda file.
> >	(unlink_profile_dir): Remove gcda files from the profile path.
> >	(profile_merge): Merge two profiles in directory.
> >	(print_merge_usage_message): Print merge usage.
> >	(merge_usage): Print merge usage and exit.
> >	(do_merge): Driver for profile merge sub-command.
> >	(profile_rewrite): Rewrite profile.
> >	(print_rewrite_usage_message): Print rewrite usage.
> >	(rewrite_usage): Print rewrite usage and exit.
> >	(do_rewrite): Driver for profile rewrite sub-command.
> >	(print_usage): Print gcov-info usage and exit.
> >	(print_version): Print gcov-info version.
> >	(process_args): Process arguments.
> >	(main): Main routine for gcov-tool.
> >	* libgcc/libgcov.h : Include the set of base-type headers for
> >	gcov-tool.
> >	(struct gcov_info): Make the functions field mutable in gcov-tool
> >	compilation.
> >	* libgcc/libgcov-merge.c (gcov_get_counter): New wrapper function
> >	to get the profile counter.
> >	(gcov_get_counter_target): New wrapper function to get the profile
> >	values that should not be scaled.
> >	(__gcov_merge_add): Replace gcov_read_counter() with the wrapper
> >	functions.
> >	(__gcov_merge_ior): Ditto.
> >	(__gcov_merge_time_profile): Ditto.
> >	(__gcov_merge_single): Ditto.
> >	(__gcov_merge_delta): Ditto.
> >	* libgcc/libgcov-util.c (void gcov_set_verbose): Set the verbose flag
> >	in the utility functions.
> >	(set_fn_ctrs): Utility function for reading gcda files to in-memory
> >	gcov_list object link lists.
> >	(tag_function): Ditto.
> >	(tag_blocks): Ditto.
> >	(tag_arcs): Ditto.
> >	(tag_lines): Ditto.
> >	(tag_counters): Ditto.
> >	(tag_summary): Ditto.
> >	(read_gcda_finalize): Ditto.
> >	(read_gcda_file): Ditto.
> >	(ftw_read_file): Ditto.
> >	(read_profile_dir_init): Ditto.
> >	(gcov_read_profile_dir): Ditto.
> >	(gcov_read_counter_mem): Ditto.
> >	(gcov_get_merge_weight): Ditto.
> >	(merge_wrapper): A wrapper function that calls merging handler.
> >	(gcov_merge): Merge two gcov_info objects with weights.
> >	(find_match_gcov_info): Find the matched gcov_info in the list.
> >	(gcov_profile_merge): Merge two gcov_info object lists.
> >	(__gcov_add_counter_op): Process edge profile counter values.
> >	(__gcov_ior_counter_op): Process IOR profile counter values.
> >	(__gcov_delta_counter_op): Process delta profile counter values.
> >	(__gcov_single_counter_op): Process single profile counter values.
> >	(fp_scale): Callback function for float-point scaling.
> >	(int_scale): Callback function for integer fraction scaling.
> >	(gcov_profile_scale): Scaling profile counters.
> >	(gcov_profile_normalize): Normalize profile counters.
> >
> > Index: gcc/gcov-io.c
> > ===================================================================
> > --- gcc/gcov-io.c (revision 208237)
> > +++ gcc/gcov-io.c (working copy)
> > @@ -564,7 +564,7 @@ gcov_read_counter (void)
> >     buffer, or NULL on empty string. You must copy the string before
> >     calling another gcov function.  */
> >
> > -#if !IN_LIBGCOV
> > +#if !IN_LIBGCOV || defined (IN_GCOV_TOOL)
> >  GCOV_LINKAGE const char *
> >  gcov_read_string (void)
> >  {
> > @@ -641,7 +641,7 @@ gcov_read_summary (struct gcov_summary *summary)
> >      }
> >  }
> >
> > -#if !IN_LIBGCOV
> > +#if !IN_LIBGCOV || defined (IN_GCOV_TOOL)
> >  /* Reset to a known position.  BASE should have been obtained from
> >     gcov_position, LENGTH should be a record length.  */
>
> I am slightly confused here, IN_LIBGCOV IMO means that the gcov-io is going to be linked into the gcov runtime as opposed to gcc, gcov, gcov-dump or gcov-tool. Why do we define IN_LIBGCOV && IN_GCOV_TOOL?

gcov-tool needs to use this function to read the string in the gcda file into memory to construct gcov_info objects. As you noticed, the gcov runtime does not need this interface. But gcov-tool links with the gcov runtime and it also uses the function. We could make it available in the gcov runtime, but that will slightly increase the memory footprint.
Re: [PATCH 3/3, x86] X86 Silvermont vector cost model tune
On Wed, Apr 16, 2014 at 4:31 PM, Evgeny Stupachenko evstu...@gmail.com wrote:
> For the 3rd part of the patch there was a misprint in the estimated constant.
> It should be 1.7 instead of 1.8.
>
> -      retval = (retval * 18) / 10;
> +      retval = (retval * 17) / 10;
>
> Bootstrap passed.

The change is also OK.

BTW: trivial patch adjustments like this do not need re-approvals. The message to the ML should be enough.

Uros.
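The adjusted line scales an integer cost by 1.7 without floating point: multiply by 17, then divide by 10 with truncation. A tiny sketch of that arithmetic follows; the function names are hypothetical, and only the expression itself comes from the patch.

```cpp
#include <cassert>

// Hypothetical wrappers around the expressions from the patch: integer
// scaling by 1.7 (new) and 1.8 (old), truncating toward zero.
constexpr int scale_by_1_7(int cost) { return (cost * 17) / 10; }
constexpr int scale_by_1_8(int cost) { return (cost * 18) / 10; }

static_assert(scale_by_1_7(10) == 17, "10 * 1.7 = 17");
static_assert(scale_by_1_8(10) == 18, "10 * 1.8 = 18");
```

Because of the truncation, small costs can map to the same result under both factors (for example a cost of 3 gives 5 either way), so the change only matters for larger cost values.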
Re: Remove obsolete Solaris 9 support
On Wed, Apr 16, 2014 at 1:16 PM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Now that 4.9 has branched, it's time to actually remove the obsolete Solaris 9 configuration. Most of this is just legwork and falls under my Solaris maintainership. A couple of questions, though: * Uros: I'm removing all sse_os_support() checks from the testsuite. Solaris 9 was the only consumer, so it seems best to do away with it. This is OK, but please leave sse-os-check.h (and corresponding sse_os_support calls) in the testsuite. Just remove the Solaris 9 specific code from sse-os-check.h and always return 1, perhaps with the comment that all currently supported OSes support SSE instructions. Uros.
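A minimal sketch of what the simplified check suggested above might look like once the Solaris 9 code is gone. Only the function name sse_os_support comes from the thread; the body and comment wording are assumptions, and the real sse-os-check.h may differ.

```c
/* Sketch only: assumes nothing else in the header needs to change.  */
static int
sse_os_support (void)
{
  /* All currently supported OSes support SSE instructions.  */
  return 1;
}
```

Keeping the function means existing callers in the testsuite do not have to be touched.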
Re: [C PATCH] Make attributes accept enum values (PR c/50459)
On 04/15/2014 03:56 PM, Marek Polacek wrote: The testsuite doesn't hit this code with C++, but does hit this code with C. The thing is, if we have e.g. enum { A = 128 }; void *fn1 (void) __attribute__((assume_aligned (A))); then handle_assume_aligned_attribute walks the attribute arguments and gets the argument via TREE_VALUE. If this argument is an enum value, then for C the argument is identifier_node that contains const_decl, Ah. Then I think the C parser should be fixed to check attribute_takes_identifier_p and look up the argument if false. Jason
[patch] Minor simplification in functional
This avoids a template instantiation when storing a function pointer in a std::function. At some point I want to extend the definition of __is_location_invariant to include trivially-copyable object types. I suspect this may be why boost::function can perform significantly better than our std::function in some cases.

Tested x86_64-linux, committed to trunk.

commit 11ba841cb0b5655c974dfed32330b077a4d3b312
Author: Jonathan Wakely jwak...@redhat.com
Date: Wed Apr 16 18:21:28 2014 +0100

    * include/std/functional (__is_location_invariant): Use __or_ helper.

diff --git a/libstdc++-v3/include/std/functional b/libstdc++-v3/include/std/functional
index 0e80fa3..295022d 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -1747,8 +1747,7 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type)
    */
   template<typename _Tp>
     struct __is_location_invariant
-    : integral_constant<bool, (is_pointer<_Tp>::value
-                               || is_member_pointer<_Tp>::value)>
+    : __or_<is_pointer<_Tp>, is_member_pointer<_Tp>>::type
     { };

   class _Undefined_class;
Re: Patch ping
On 04/14/2014 01:02 PM, Jakub Jelinek wrote: On Thu, Apr 10, 2014 at 12:01:31PM -0400, DJ Delorie wrote: So, now that 4.9 has branched, are both patches ok for trunk, or just the first one? The first one fixes --with-build-config=bootstrap-ubsan fully and --with-build-config=bootstrap-asan partially, the second one --with-build-config=bootstrap-asan fully. Now that the 4.9 branch happened, I sincerely hope this goes in (both parts of it) - my bootstrap-asan run this morning still failed. I'm quite sure regular asan/ubsan bootstraps on various platforms (mine is only the most common x86-64 one) would be helpful to find bugs in the compilers' frontends, middle end and libraries ... Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Patch ping
I'll approve both patches, if you agree to think about a way to solve this problem without module-specific configury changes for each such command line option. I understand the usefulness of having instrumentation, but the configure hack is a hack. Note that in a combined tree this isn't a problem, because we'd just instrument the linker at the same time.
Re: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16
Moore, Catherine catherine_mo...@mentor.com writes: -Original Message- From: Richard Sandiford [mailto:rdsandif...@googlemail.com] Sent: Tuesday, April 15, 2014 4:32 PM To: Moore, Catherine Cc: Rozycki, Maciej; Matthew Fortune; gcc-patches@gcc.gnu.org Subject: Re: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16 Moore, Catherine catherine_mo...@mentor.com writes: -Original Message- From: Moore, Catherine Sent: Tuesday, April 15, 2014 8:49 AM To: Rozycki, Maciej; Richard Sandiford Cc: Matthew Fortune; gcc-patches@gcc.gnu.org; Moore, Catherine Subject: RE: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16 -Original Message- From: Maciej W. Rozycki [mailto:ma...@codesourcery.com] Sent: Tuesday, April 15, 2014 7:28 AM To: Richard Sandiford Cc: Matthew Fortune; Moore, Catherine; gcc-patches@gcc.gnu.org Subject: Re: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16 On Tue, 15 Apr 2014, Richard Sandiford wrote: I believe you need to adjust constraints to ensure constant 0 is known to produce a 16-bit instruction encoding where possible. Otherwise you'll end up with suboptimal code when the instruction is in a branch delay slot. Yeah, it'd be good to do that too (although this is a preexisting problem). Well, it depends on how you look at the problem being solved here -- if it is for SW16, SH16 and SB16 GCC produces broken code for the `s0' source register, then indeed it is, whereas if it is GCC does not handle the source register set for SW16, SH16 and SB16 correctly, then it is a part of the same problem, not completely corrected. I can live with that until 4.10/4.9.1 though if you prefer. I'm relying on you guys to do the microMIPS stuff though -- I don't have a way of testing it. An assembly/objdump test is enough to cover this, so you've got all tools at hand, although I understand you may not be inclined to rush working on it. ;) I'll take care of this bit. 
I've attached an updated patch to address Maciej's concern with $0 and the microMIPS store instructions. Does this look okay to install?

No, the point was that zero is modelled as a constant in RTL, so like Maciej says, the way to handle it is to use the J constraint (like some of the existing constraints use dJ for any GPR or zero). What we want to test is that:

*ptr = 0;

is a 16-bit instruction. You could do that by adding -dp to the options and matching something like:

MICROMIPS void
f1 (unsigned char *ptr)
{
  *ptr = 0;
}
...[similarly for short and int]...
/* { dg-final { scan-assembler "\tsb\t\\\$0, 0\\(\\\$4\\)\[^\n\]length = 2" } }

Oops, I see I forgot the *, should have: \[^\n\]*length. But:

*/
...[similarly for sh and sw]...

Completely untested. I bet the regexp needs different backslashes. :-)

Okay, this patch modifies the constraints instead. Okay?

Index: testsuite/gcc.target/mips/umips-store16-2.c
===
--- testsuite/gcc.target/mips/umips-store16-2.c (revision 0)
+++ testsuite/gcc.target/mips/umips-store16-2.c (revision 0)
@@ -0,0 +1,22 @@
+/* { dg-options "(-mmicromips) -dp" } */
+
+MICROMIPS void
+f1 (unsigned char *ptr)
+{
+  *ptr = 0;
+}
+
+MICROMIPS void
+f2 (unsigned short *ptr)
+{
+  *ptr = 0;
+}
+
+MICROMIPS void
+f3 (unsigned int *ptr)
+{
+  *ptr = 0;
+}
+/* { dg-final { scan-assembler "\tsb\t\\\$0,0\\(\\\$\[0-9\]+\\).*length = 2" } } */

...it does need to be \[^\n\], since . can match newlines in Tcl. OK with that change if the new tests still pass, and if a full test run passes with -mmicromips.

Thanks,
Richard
fuse-caller-save - hook format
Vladimir,

All patches for the fuse-caller-save optimization have been ok-ed. The only part not approved is the MIPS-specific part. The objection of Richard S. is not so much the patch itself, but more the idea of the hook fn_other_hard_reg_usage. For clarity, I'm restating the current hook definition here:

...
+@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
Add any hard registers to @var{regs} that are set or clobbered by a call to the function. This hook only needs to add registers that cannot be found by examination of the final RTL representation of a function. This hook returns true if it managed to determine which registers need to be added. The default version of this hook returns false.
...

Richard prefers to, rather than having a hook specifying what registers are implicitly clobbered, adding those clobbers to CALL_INSN_FUNCTION_USAGE.

I can see these possibilities (and perhaps there are more):

1. We go with Richard's proposal: we make each target responsible for adding these clobbers in CALL_INSN_FUNCTION_USAGE, and use a hook called f.i. targetm.fuse_caller_save_p or targetm.implicit_call_clobbers_in_fusage_p, to indicate whether a target has taken care of that, meaning it's safe to do the fuse-caller-save optimization.

2. A mixed solution: we make each target responsible for specifying which clobbers need to be added in CALL_INSN_FUNCTION_USAGE, using a hook called f.i. targetm.call_clobbered_regs, and add generic code to add those clobbers to CALL_INSN_FUNCTION_USAGE.

3. We stick with the current, approved hook format, and try to convince Richard to live with it.

Since you are a register allocator maintainer, familiar with the fuse-caller-save optimization, and have approved the original hook, I would like to ask you to make a decision on how to proceed from here.

Thanks,
- Tom
[PATCH] PR60822 (m68k, missing earlyclobber in extendplussidi)
operand[0] has a subreg taken (as operand[3]), which is modified before operand[1] is used.

Built successfully but I'm not set up to run the testsuite, sorry. It fixes the testcase of course.

gcc/ChangeLog:
2014-04-16 Segher Boessenkool seg...@kernel.crashing.org

	* config/m68k/m68k.md (extendplussidi): Add earlyclobber.
---
 gcc/config/m68k/m68k.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
index e61048b..9e7f3e2 100644
--- a/gcc/config/m68k/m68k.md
+++ b/gcc/config/m68k/m68k.md
@@ -1869,7 +1869,7 @@ (define_insn "extendsidi2"
 ;; result of the SI tree to be in the lower register of the DI target
 (define_insn "extendplussidi"
-  [(set (match_operand:DI 0 "register_operand" "=d")
+  [(set (match_operand:DI 0 "register_operand" "=&d")
     (sign_extend:DI (plus:SI (match_operand:SI 1 "general_operand" "%rmn")
             (match_operand:SI 2 "general_operand" "rmn"))))]
-- 
1.8.1.4
Re: fuse-caller-save - hook format
Tom de Vries tom_devr...@mentor.com writes:

Vladimir, All patches for the fuse-caller-save optimization have been ok-ed. The only part not approved is the MIPS-specific part. The objection of Richard S. is not so much the patch itself, but more the idea of the hook fn_other_hard_reg_usage. For clarity, I'm restating the current hook definition here:

...
+@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
Add any hard registers to @var{regs} that are set or clobbered by a call to the function. This hook only needs to add registers that cannot be found by examination of the final RTL representation of a function. This hook returns true if it managed to determine which registers need to be added. The default version of this hook returns false.
...

Just for the record, I think this hook was defined as applying during final to a potential callee function, rather than applying to a particular call. I.e., after calculating which registers a function uses, the code would add the registers returned by this hook to the set. My objection to that was that the set of registers clobbered while making a call depends on the caller. With that proviso...

Richard prefers to, rather than having a hook specifying what registers are implicitly clobbered, adding those clobbers to CALL_INSN_FUNCTION_USAGE.

...I agree this is a fair summary.

I can see these possibilities (and perhaps there are more):

1. We go with Richard's proposal: we make each target responsible for adding these clobbers in CALL_INSN_FUNCTION_USAGE, and use a hook called f.i. targetm.fuse_caller_save_p or targetm.implicit_call_clobbers_in_fusage_p, to indicate whether a target has taken care of that, meaning it's safe to do the fuse-caller-save optimization.

2. A mixed solution: we make each target responsible for specifying which clobbers need to be added in CALL_INSN_FUNCTION_USAGE, using a hook called f.i. targetm.call_clobbered_regs, and add generic code to add those clobbers to CALL_INSN_FUNCTION_USAGE.

3. We stick with the current, approved hook format, and try to convince Richard to live with it.

The reason I don't like (2) is that, on targets like MIPS where the different call cases are quite complicated, the implementation of the hook would need to follow the same logic as the call expander to figure out which case applies. It just seems more elegant to me to add the clobbers when emitting the call.

IMO CALL_INSN_FUNCTION_USAGE is like a varargs part of the call pattern. In other words it's a way of allowing the set of uses and clobbers to vary from call to call without having to define lots of different call define_insns. If you look at it like that, adding the clobbers when emitting the insn seems more correct as well.

Thanks,
Richard
Re: [C++ Patch/RFC] Remove unify_success / unify_invalid unused parameter?
On 04/15/2014 12:21 PM, Paolo Carlini wrote: a lot of time ago I noticed that these parameters are unused: should I prepare a ChangeLog for the below or we have stylistic, etc, reasons for keeping the parameters? I'd leave them alone, we might want to print something sometime. PS: I also see many int return types in the various unify* which could as well be bool. Opinions about that? Doesn't seem worth bothering to change. Jason
Re: Remove obsolete Solaris 9 support
On Wed, Apr 16, 2014 at 4:16 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: * Ian: I've removed Solaris 8 and 9 support from libgo. I'm uncertain if you want this or rather keep that support for the 4.[789] branches? I want it. I don't try to maintain exact copies of older GCC branches. Your patch appears separable, and I can commit the libgo part. Let me know when I should do so. Ian
Re: Remove obsolete Solaris 9 support
Ian Lance Taylor i...@google.com writes: On Wed, Apr 16, 2014 at 4:16 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: * Ian: I've removed Solaris 8 and 9 support from libgo. I'm uncertain if you want this or rather keep that support for the 4.[789] branches? I want it. I don't try to maintain exact copies of older GCC branches. Your patch appears separable, and I can commit the libgo part. Let me know when I should do so. Go ahead whenever you like. It's at most a few days until I commit the rest, Solaris 9 isn't supposed to work on mainline any longer, and the libgo part is independent of the rest. I've already separated the libgo and classpath parts from my main patch. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [C++ Patch/RFC] Remove unify_success / unify_invalid unused parameter?
Hi, On 04/16/2014 09:47 PM, Jason Merrill wrote: On 04/15/2014 12:21 PM, Paolo Carlini wrote: a lot of time ago I noticed that these parameters are unused: should I prepare a ChangeLog for the below or we have stylistic, etc, reasons for keeping the parameters? I'd leave them alone, we might want to print something sometime. Makes sense. PS: I also see many int return types in the various unify* which could as well be bool. Opinions about that? Doesn't seem worth bothering to change. Ok. Personally, I believe that using bool instead of int, thus true and false instead of 1 and 0, may add clarity to the code whenever 0s and 1s are used as return values in the body of large functions, because one cannot wonder whether 2 and 3, etc, also would make sense (in fact elsewhere we have got functions returning int - not an enumeration type - with values outside 0 and 1 too, with special meanings... maybe those should be changed instead ;) Paolo.
Re: Remove obsolete Solaris 9 support
On Wed, Apr 16, 2014 at 1:02 PM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:

Ian Lance Taylor i...@google.com writes:

On Wed, Apr 16, 2014 at 4:16 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:

* Ian: I've removed Solaris 8 and 9 support from libgo. I'm uncertain if you want this or rather keep that support for the 4.[789] branches?

I want it. I don't try to maintain exact copies of older GCC branches. Your patch appears separable, and I can commit the libgo part. Let me know when I should do so.

Go ahead whenever you like. It's at most a few days until I commit the rest, Solaris 9 isn't supposed to work on mainline any longer, and the libgo part is independent of the rest. I've already separated the libgo and classpath parts from my main patch.

Thanks. Committed attached patch to mainline.

Ian

diff -r 98547f162e12 libgo/configure.ac
--- a/libgo/configure.ac Thu Apr 10 09:25:24 2014 -0700
+++ b/libgo/configure.ac Wed Apr 16 13:22:16 2014 -0700
@@ -316,11 +316,6 @@
     # msghdr in sys/socket.h.
     OSCFLAGS="$OSCFLAGS -D_XOPEN_SOURCE=500"
     ;;
-  *-*-solaris2.[[89]])
-    # Solaris 8/9 need this so struct msghdr gets the msg_control
-    # etc. fields in sys/socket.h (_XPG4_2).
-    OSCFLAGS="$OSCFLAGS -D_XOPEN_SOURCE=500 -D_XOPEN_SOURCE_EXTENDED -D__EXTENSIONS__"
-    ;;
   *-*-solaris2.1[[01]])
     # Solaris 10+ needs this so struct msghdr gets the msg_control
     # etc. fields in sys/socket.h (_XPG4_2). _XOPEN_SOURCE=600 as
@@ -662,21 +657,6 @@
     [Define to 1 if math.h defines struct exception])
 fi

-dnl Check if makecontext expects the uc_stack member of ucontext to point
-dnl to the top of the stack.
-case $target in
-  sparc*-*-solaris2.[[89]]*)
-    libgo_cv_lib_makecontext_stack_top=yes
-    ;;
-  *)
-    libgo_cv_lib_makecontext_stack_top=no
-    ;;
-esac
-if test $libgo_cv_lib_makecontext_stack_top = yes; then
-  AC_DEFINE(MAKECONTEXT_STACK_TOP, 1,
-    [Define if makecontext expects top of stack in uc_stack.])
-fi
-
 dnl See whether setcontext changes the value of TLS variables.
 AC_CACHE_CHECK([whether setcontext clobbers TLS variables],
 [libgo_cv_lib_setcontext_clobbers_tls],
diff -r 98547f162e12 libgo/go/math/ldexp.go
--- a/libgo/go/math/ldexp.go Thu Apr 10 09:25:24 2014 -0700
+++ b/libgo/go/math/ldexp.go Wed Apr 16 13:22:16 2014 -0700
@@ -17,16 +17,6 @@
 func Ldexp(frac float64, exp int) float64 {
 	r := libc_ldexp(frac, exp)
-
-	// Work around a bug in the implementation of ldexp on Solaris
-	// 9. If multiplying a negative number by 2 raised to a
-	// negative exponent underflows, we want to return negative
-	// zero, but the Solaris 9 implementation returns positive
-	// zero. This workaround can be removed when and if we no
-	// longer care about Solaris 9.
-	if r == 0 && frac < 0 && exp < 0 {
-		r = Copysign(0, frac)
-	}
 	return r
 }

diff -r 98547f162e12 libgo/runtime/proc.c
--- a/libgo/runtime/proc.c Thu Apr 10 09:25:24 2014 -0700
+++ b/libgo/runtime/proc.c Wed Apr 16 13:22:16 2014 -0700
@@ -1212,9 +1212,6 @@
 	// here we need to set up the context for g0.
 	getcontext(&mp->g0->context);
 	mp->g0->context.uc_stack.ss_sp = g0_sp;
-#ifdef MAKECONTEXT_STACK_TOP
-	mp->g0->context.uc_stack.ss_sp += g0_spsize;
-#endif
 	mp->g0->context.uc_stack.ss_size = g0_spsize;
 	makecontext(&mp->g0->context, kickoff, 0);
Re: [PATCH] PR60822 (m68k, missing earlyclobber in extendplussidi)
On 04/16/14 13:18, Segher Boessenkool wrote:

operand[0] has a subreg taken (as operand[3]), which is modified before operand[1] is used. Built successfully but I'm not set up to run the testsuite, sorry. It fixes the testcase of course.

gcc/ChangeLog:
2014-04-16 Segher Boessenkool seg...@kernel.crashing.org
* config/m68k/m68k.md (extendplussidi): Add earlyclobber.

But in the case where writing operand3 would overwrite operand1, shouldn't we have used the true arm of this statement:

if (GET_CODE (operands[1]) == REG && REGNO (operands[1]) == REGNO (operands[3]))
  output_asm_insn ("add%.l %2,%3", operands);
else
  output_asm_insn ("move%.l %2,%3\;add%.l %1,%3", operands);

Looking at the .reload dump I see:

(insn 11 33 14 2 (set (reg:DI 0 %d0 [orig:47 D.1394 ] [47]) (sign_extend:DI (plus:SI (mem:SI (plus:SI (reg/v/f:SI 8 %a0 [orig:40 p ] [40]) (reg:SI 1 %d1 [44])) [3 p_4(D)->a+0 S4 A16]) (mem:SI (plus:SI (reg/v/f:SI 8 %a0 [orig:40 p ] [40]) (reg:SI 0 %d0 [45])) [3 p_4(D)->b+0 S4 A16] j.c:12 78 {extendplussidi}

Isn't the problem that operand 1 is a MEM which uses the same register as operand 3 in the memory address?

ISTM either removing the memory constraint entirely, or splitting it off into a separate alternative and only earlyclobbering that alternative would be better.

Or am I missing something?

jeff
Re: [PATCH, rs6000] Improve TImode add/sub
On 04/08/2014 09:56 PM, seg...@kernel.crashing.org wrote: +/* { dg-do compile { target { powerpc*-*-* lp64 } } } */ +/* { dg-skip-if { powerpc*-*-darwin* } { * } { } } */ Please leave out the default arguments. Why does this need skipping on Darwin? +;; Define the TImode operations that can be done in a small number +;; of instructions. The constraints are to prevent the register +;; allocator from allocating registers that overlap with the inputs +;; (for example, having an input in 7,8 and an output in 6,7). We +;; also allow for the output being the same as one of the inputs. + +(define_insn addti3 + [(set (match_operand:TI 0 gpc_reg_operand =r,r,r,r) + (plus:TI (match_operand:TI 1 gpc_reg_operand %r,r,0,0) +(match_operand:TI 2 reg_or_short_operand r,I,r,I)))] + TARGET_POWERPC64 That's not the correct condition: the carry bit is set based on the 32-bit carry in 32-bit mode, so the condition has to be TARGET_64BIT. The adddi3 pattern has !TARGET_POWERPC64 since a 64-bit addition can be done without addc on a 64-bit machine, no matter what mode the CPU is in. + * +{ Might as well leave out this stuff on new code, just use the braces :-) Updated patch with above comments incorporated. Bootstrap/regtest on BE/LE with no new regressions. Ok for trunk? 
-Pat Index: gcc/testsuite/gcc.target/powerpc/ti_math1.c === --- gcc/testsuite/gcc.target/powerpc/ti_math1.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/ti_math1.c (revision 0) @@ -0,0 +1,20 @@ +/* { dg-do compile { target { powerpc*-*-* lp64 } } } */ +/* { dg-options -O2 } */ +/* { dg-final { scan-assembler-times addc 1 } } */ +/* { dg-final { scan-assembler-times adde 1 } } */ +/* { dg-final { scan-assembler-times subfc 1 } } */ +/* { dg-final { scan-assembler-times subfe 1 } } */ +/* { dg-final { scan-assembler-not subf } } */ + +__int128 +add_128 (__int128 *ptr, __int128 val) +{ + return (*ptr + val); +} + +__int128 +sub_128 (__int128 *ptr, __int128 val) +{ + return (*ptr - val); +} + Index: gcc/testsuite/gcc.target/powerpc/ti_math2.c === --- gcc/testsuite/gcc.target/powerpc/ti_math2.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/ti_math2.c (revision 0) @@ -0,0 +1,73 @@ +/* { dg-do run { target { powerpc*-*-* lp64 } } } */ +/* { dg-options -O2 -fno-inline } */ + +union U { + __int128 i128; + struct { +long l1; +long l2; + } s; +}; + +union U u1,u2; + +__int128 +create_128 (long most_sig, long least_sig) +{ + union U u; + +#if __LITTLE_ENDIAN__ + u.s.l1 = least_sig; + u.s.l2 = most_sig; +#else + u.s.l1 = most_sig; + u.s.l2 = least_sig; +#endif + return u.i128; +} + +long most_sig (union U * u) +{ +#if __LITTLE_ENDIAN__ + return (*u).s.l2; +#else + return (*u).s.l1; +#endif +} + +long least_sig (union U * u) +{ +#if __LITTLE_ENDIAN__ + return (*u).s.l1; +#else + return (*u).s.l2; +#endif +} + +__int128 +add_128 (__int128 *ptr, __int128 val) +{ + return (*ptr + val); +} + +__int128 +sub_128 (__int128 *ptr, __int128 val) +{ + return (*ptr - val); +} + +int +main (void) +{ + /* Do a simple add/sub to make sure carry is happening between the dwords + and that dwords are in correct endian order. 
*/ + u1.i128 = create_128 (1, -1); + u2.i128 = add_128 (u1.i128, 1); + if ((most_sig (u2) != 2) || (least_sig (u2) != 0)) +__builtin_abort (); + u2.i128 = sub_128 (u2.i128, 1); + if ((most_sig (u2) != 1) || (least_sig (u2) != -1)) +__builtin_abort (); + return 0; +} + Index: gcc/config/rs6000/rs6000.md === --- gcc/config/rs6000/rs6000.md (revision 209226) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -6535,6 +6535,49 @@ (define_insn_and_split *floatunsdisf2_m [(set_attr length 8) (set_attr type fpload)]) +;; Define the TImode operations that can be done in a small number +;; of instructions. The constraints are to prevent the register +;; allocator from allocating registers that overlap with the inputs +;; (for example, having an input in 7,8 and an output in 6,7). We +;; also allow for the output being the same as one of the inputs. + +(define_insn addti3 + [(set (match_operand:TI 0 gpc_reg_operand =r,r,r,r) + (plus:TI (match_operand:TI 1 gpc_reg_operand %r,r,0,0) + (match_operand:TI 2 reg_or_short_operand r,I,r,I)))] + TARGET_64BIT +{ + if (WORDS_BIG_ENDIAN) +return (GET_CODE (operands[2])) != CONST_INT + ? \addc %L0,%L1,%L2\;adde %0,%1,%2\ + : \addic %L0,%L1,%2\;add%G2e %0,%1\; + else +return (GET_CODE (operands[2])) != CONST_INT + ? \addc %0,%1,%2\;adde %L0,%L1,%L2\ + : \addic %0,%1,%2\;add%G2e %L0,%L1\; +} + [(set_attr type two) + (set_attr length 8)]) + +(define_insn subti3 + [(set (match_operand:TI 0 gpc_reg_operand =r,r,r,r,r) + (minus:TI (match_operand:TI 1 reg_or_short_operand r,I,0,r,I) + (match_operand:TI 2
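The carry behaviour that ti_math2.c checks can be sketched in portable C without __int128. This is only an illustration of the addc/adde and subfc/subfe pairing; the struct and function names (u128_demo, add_128_demo, sub_128_demo) are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Hypothetical two-dword value: hi is the most significant half.  */
struct u128_demo
{
  uint64_t hi;
  uint64_t lo;
};

/* Add the low dwords first (like addc), then add the high dwords plus
   the carry out of the low half (like adde).  */
static struct u128_demo
add_128_demo (struct u128_demo a, struct u128_demo b)
{
  struct u128_demo r;
  r.lo = a.lo + b.lo;
  uint64_t carry = r.lo < a.lo;   /* 1 if the low add wrapped */
  r.hi = a.hi + b.hi + carry;
  return r;
}

/* Subtract the low dwords first (like subfc), then the high dwords
   minus the borrow from the low half (like subfe).  */
static struct u128_demo
sub_128_demo (struct u128_demo a, struct u128_demo b)
{
  struct u128_demo r;
  uint64_t borrow = a.lo < b.lo;  /* 1 if the low subtract wrapped */
  r.lo = a.lo - b.lo;
  r.hi = a.hi - b.hi - borrow;
  return r;
}
```

Starting from (hi = 1, lo = all ones), adding 1 carries into the high dword and subtracting 1 again borrows back, which is the same round trip the test's main performs.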
RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
Did you see the failures even after your mips_regno_mode_ok_for_base_p change? LRA should know how to reload a W address.

Yes but I realize there is more. It fails because $sp is now included in BASE_REG_CLASS and W is based on it. However, I suppose that it would be too eager to say it is wrong and likely there is something missing in LRA if we want to keep all alternatives. Currently there is no check if a reloaded operand has a valid address, use of $sp in lbu/lhu cases. Even if we added extra checks we are less likely to benefit as we need to reload the base into register.

Not sure what you mean, sorry. W exists specifically to exclude $sp-based and $pc-based addresses. LRA AFAIK should already be able to reload addresses that are valid in the TARGET_LEGITIMATE_ADDRESS_P sense but which do not match the constraints for a particular insn. Can you remember one of the tests that fails?

I couldn't trigger the problem with the original testcase but found another one that reveals it. The following needs to be compiled with -mips32r2 -mips16 -Os:

struct { int addr; } c;
struct command { int args[1]; };
unsigned short a;
fn1 (struct command *p1)
{
  unsigned short d;
  d = fn2 ();
  a = p1->args[0];
  fn3 (a);
  if (c.addr)
    {
      fn4 (p1->args[0]);
      return;
    }
  (&c)->addr = fn5 ();
  fn6 (d);
}

Not sure how the constraint would/should exclude $sp-based address in LRA. In this particular case, a spilled pseudo is changed to memory giving the following RTL form:

(insn 30 29 31 4 (set (reg:SI 4 $4) (and:SI (mem/c:SI (plus:SI (reg/f:SI 78 $frame) (const_int 16 [0x10])) [7 %sfp+16 S4 A32]) (const_int 65535 [0xffff]))) shell.i:17 161 {*andsi3_mips16} (expr_list:REG_DEAD (reg:SI 194 [ D.1469 ]) (nil)))

The operand 1 during alternative selection is not marked as a bad operand as it is a memory operand. $frame appears to be fine as it could be eliminated later to hard register. No reloads are inserted for the instructions concerned.
Unless, $frame should be temporarily eliminated and then a reload would be inserted? Regards, Robert
Re: RFA: Tighten checking for 'X' constraints
On 04/16/14 07:37, Jakub Jelinek wrote:

Creating a (mem (scratch)) too early may pessimize code too much, perhaps it can be used during say sched1 etc. for alias analysis, (mem (scratch)) is considered to alias everything. Plus, I think at least so far we have not been doing different decisions based on whether some operand has been referenced in the template or not, not sure if it is desirable to introduce it. Anyway, others can have different opinion on what X should mean, CCing Jeff and Eric.

My recollection is that X was supposed to be used in cases where we conditionally needed a scratch operand. The X constraint was used to identify alternatives where the scratch operand wasn't actually needed. Conceptually that meant that literally anything could go in there, it need not be valid or reloadable.

I'm a bit surprised to see it showing up outside MD files.

jeff
Re: [RFC] proof-of-concept: warning for a bit comparison that is always true/false
On 04/16/14 09:27, Daniel Marjamäki wrote:

Hello! I am new to GCC. I want to add a warning to GCC when bit comparison is always true/false. Example:

if ((x&4)==0) {} // <- no warning
if ((x&4)==4) {} // <- no warning
if ((x&4)==5) {} // <- warn!

When this warning is triggered, the most common cause is that somebody made a mistake when using bitmasks. I attach a proof-of-concept patch. I would like comments. The patch needs some cleanup before it's applied. I would like it to handle at least != also and not just ==. And I would like it to be less strict about where integer constants are located. I wonder where I should put this code. Is gcc/c/c-typeck.c a good file to put this in? Should I put it in somewhere else? What warning flags should be used to enable this? Is some -Wcondition-bitop a good idea? Can this be added by -Wall? I wrote this check for Cppcheck years ago. In my experience this warning has a good signal/noise ratio.

I'd actually do this down at the gimple level. You'll have an SSA graph you can use to identify the masking and verify it's producing a single bit result. You'll also have canonicalized comparisons, so there'll be fewer things to test. Depending on exactly where you put the optimization, you may also see more constants on the RHS due to propagation.

A completely different approach would be to have VRP identify objects which have values that are only powers of 2. Once such a value is in the lattice, you can identify and warn/optimize when they're compared against values which aren't powers of 2.

Jeff

Best regards, Daniel Marjamäki
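The condition behind the proposed warning can be stated in a few lines of C. This is a sketch of the underlying arithmetic only, not the proof-of-concept patch and not GCC code; bit_compare_always_false is a hypothetical name.

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether (x & mask) == cmp is false for
   every possible x.  */
static bool
bit_compare_always_false (unsigned long mask, unsigned long cmp)
{
  /* x & mask can only contain bits that are set in mask, so any bit of
     cmp that lies outside mask can never be matched.  */
  return (cmp & ~mask) != 0;
}
```

For the examples in the mail, a mask of 4 compared against 0 or 4 is satisfiable, while a comparison against 5 needs bit 0, which the mask clears. The != form would be always true under the same condition.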
Re: fuse-caller-save - hook format
On 04/16/14 13:41, Richard Sandiford wrote: IMO CALL_INSN_FUNCTION_USAGE is like a varargs part of the call pattern. In other words it's a way of allowing the set of uses and clobbers to vary from call to call without having to define lots of different call define_insns. If you look at it like that, adding the clobbers when emitting the insn seems more correct as well. This seems like a better direction to me as well. There's just something clean and elegant about attaching this stuff to the CALL_INSN. jeff
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
Samuel Thibault, le Sat 12 Apr 2014 01:04:49 +0200, a écrit : Samuel Thibault, le Fri 11 Apr 2014 23:51:44 +0200, a écrit : So, do we really want to let munmap poke a hole at address 0 and thus let further vm_map() return address 0? i.e. we could apply this: I have applied it. Samuel
Re: Patch ping
On 01/13/14 01:07, Jakub Jelinek wrote:

Hi! I'd like to ping 2 patches:

http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00140.html - Ensure GET_MODE_{SIZE,INNER,NUNITS} (const) is constant rather than memory load after optimization (I'd like to keep the current MODE_SIZE patch for the reasons mentioned there, but also add this patch)

This is fine. Per the follow-up discussion, I think you can mark it as resolving 36109 as well.

http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00131.html - PR target/59617 handle gather loads for AVX512 (at least non-masked ones, masked ones will need to wait for 5.0 and we need to find how to represent it in GIMPLE)

I'll leave this to Uros :-)

jeff
Re: [C PATCH] Make attributes accept enum values (PR c/50459)
On 04/14/2014 10:32 AM, Marek Polacek wrote:

+ if (TREE_CODE (val) != IDENTIFIER_NODE
+     && TREE_CODE (val) != FUNCTION_DECL)
+   val = default_conversion (val);
+ else if (TREE_CODE (val) == IDENTIFIER_NODE)
+   {
+     tree t = lookup_name (val);
+     if (t && TREE_CODE (t) == CONST_DECL)
+       val = default_conversion (t);
+   }

In addition to Jason's comment, a general style point:

  if (X != A && X != B) ... else if (X == A) ...

should be written

  if (X == A) ... else if (X != B) ...

As a general rule, positive tests are easier to reason with than negative tests.

r~
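To see that the two orderings select the same branches, here is a small self-contained sketch. The enum and both function names are made up for illustration and have nothing to do with the tree codes in the patch.

```c
/* Hypothetical three-way classification standing in for X, A and B.  */
enum kind_demo { KIND_A, KIND_B, KIND_C };

/* Negative test first, as in the reviewed hunk.  */
static int
classify_negative_first (enum kind_demo x)
{
  if (x != KIND_A && x != KIND_B)
    return 1;                   /* neither A nor B */
  else if (x == KIND_A)
    return 2;                   /* A */
  return 0;                     /* B: nothing to do */
}

/* Positive test first, as suggested in the review.  */
static int
classify_positive_first (enum kind_demo x)
{
  if (x == KIND_A)
    return 2;                   /* A */
  else if (x != KIND_B)
    return 1;                   /* neither A nor B */
  return 0;                     /* B: nothing to do */
}
```

Both functions return the same value for every input; the second form simply reaches the positive case without a compound negated condition.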