Re: C++ PATCH for c++/49418 (lost cv-quals on template parameter type)
On Thu, Jun 23, 2011 at 8:07 PM, H.J. Lu wrote:
> On Thu, Jun 23, 2011 at 7:18 PM, Jason Merrill wrote:
>> On 06/23/2011 08:45 PM, H.J. Lu wrote:
>>>
>>> This caused:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49519
>>
>> I'm checking this in as an alternate fix.  Testing hasn't completed yet, but
>> I'm confident that this version is safe.
>>
>
> I still got the same failure with revision 175368.  The problem is caused by
> the "While looking at this, I've also changed a few more TYPE_MAIN_VARIANTs
> to cv_unqualified" change in
>
> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01622.html

Reverting this patch:

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 4d2caa8..2716f78 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10246,7 +10246,7 @@ tsubst_arg_types (tree arg_types,

     /* Do array-to-pointer, function-to-pointer conversion, and ignore
        top-level qualifiers as required.  */
-    type = TYPE_MAIN_VARIANT (type_decays_to (type));
+    type = cv_unqualified (type_decays_to (type));

     /* We do not substitute into default arguments here.  The standard
        mandates that they be instantiated only when needed, which is

seems to fix the crash.

--
H.J.
Re: C++ PATCH for c++/49418 (lost cv-quals on template parameter type)
On Thu, Jun 23, 2011 at 7:18 PM, Jason Merrill wrote:
> On 06/23/2011 08:45 PM, H.J. Lu wrote:
>>
>> This caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49519
>
> I'm checking this in as an alternate fix.  Testing hasn't completed yet, but
> I'm confident that this version is safe.
>

I still got the same failure with revision 175368.  The problem is caused by
the "While looking at this, I've also changed a few more TYPE_MAIN_VARIANTs
to cv_unqualified" change in

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01622.html

commit b84cd9b997b527960d42d0855ff281af1550b627
Author: Jason Merrill
Date:   Tue Jun 21 13:22:34 2011 -0400

    * call.c (add_builtin_candidates): Use cv_unqualified rather than
    TYPE_MAIN_VARIANT.
    * pt.c (tsubst_arg_types): Likewise.
    * except.c (build_throw): Use cv_unqualified.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 3ac7a8e..8123e3d 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -2773,7 +2773,7 @@ add_builtin_candidates (struct z_candidate **candidates, enum tree_code code,
           type = non_reference (type);
           if (i != 0 || ! ref1)
             {
-              type = TYPE_MAIN_VARIANT (type_decays_to (type));
+              type = cv_unqualified (type_decays_to (type));
               if (enum_p && TREE_CODE (type) == ENUMERAL_TYPE)
                 VEC_safe_push (tree, gc, types[i], type);
               if (INTEGRAL_OR_UNSCOPED_ENUMERATION_TYPE_P (type))
@@ -2792,7 +2792,7 @@ add_builtin_candidates (struct z_candidate **candidates, enum tree_code code,
           type = non_reference (argtypes[i]);
           if (i != 0 || ! ref1)
             {
-              type = TYPE_MAIN_VARIANT (type_decays_to (type));
+              type = cv_unqualified (type_decays_to (type));
               if (enum_p && UNSCOPED_ENUM_P (type))
                 VEC_safe_push (tree, gc, types[i], type);
               if (INTEGRAL_OR_UNSCOPED_ENUMERATION_TYPE_P (type))
diff --git a/gcc/cp/except.c b/gcc/cp/except.c
index 3399652..f8c8e47 100644
--- a/gcc/cp/except.c
+++ b/gcc/cp/except.c
@@ -722,7 +722,7 @@ build_throw (tree exp)
          respectively.  */
       temp_type = is_bitfield_expr_with_lowered_type (exp);
       if (!temp_type)
-        temp_type = type_decays_to (TREE_TYPE (exp));
+        temp_type = cv_unqualified (type_decays_to (TREE_TYPE (exp)));

       /* OK, this is kind of wacky.  The standard says that we call
          terminate when the exception handling mechanism, after
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 4d2caa8..2716f78 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10246,7 +10246,7 @@ tsubst_arg_types (tree arg_types,

     /* Do array-to-pointer, function-to-pointer conversion, and ignore
        top-level qualifiers as required.  */
-    type = TYPE_MAIN_VARIANT (type_decays_to (type));
+    type = cv_unqualified (type_decays_to (type));

     /* We do not substitute into default arguments here.  The standard
        mandates that they be instantiated only when needed, which is

--
H.J.
Re: [RFC, ARM] Convert thumb1 prologue completely to rtl
Ping. This will shortly be holding up dwarf2 maintenance. http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01398.html r~
Re: C++ PATCH for c++/49418 (lost cv-quals on template parameter type)
On 06/23/2011 08:45 PM, H.J. Lu wrote:
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49519

I'm checking this in as an alternate fix.  Testing hasn't completed yet, but
I'm confident that this version is safe.

commit 9084174e504fed2e454948c24144e2a93fabdad2
Author: Jason Merrill
Date:   Thu Jun 23 22:07:43 2011 -0400

    PR c++/49418
    * typeck2.c (build_functional_cast): Strip cv-quals for value init.
    * init.c (build_zero_init_1): Not here.

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 62b68f2..3c347a4 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -176,7 +176,7 @@ build_zero_init_1 (tree type, tree nelts, bool static_storage_p,
        initialized are initialized to zero.  */
     ;
   else if (SCALAR_TYPE_P (type))
-    init = convert (cv_unqualified (type), integer_zero_node);
+    init = convert (type, integer_zero_node);
   else if (CLASS_TYPE_P (type))
     {
       tree field;
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index ff2949c..8bb938e 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1641,7 +1641,7 @@ build_functional_cast (tree exp, tree parms, tsubst_flags_t complain)
     {
       if (VOID_TYPE_P (type))
         return void_zero_node;
-      return build_value_init (type, complain);
+      return build_value_init (cv_unqualified (type), complain);
     }

   /* This must build a C cast.  */
C++ PATCH for c++/35255 (address of template-id)
Per DR 115, if the context of a template-id doesn't give enough type information to resolve it and the template-id fully resolves exactly one specialization, we should use that one. The code in resolve_overloaded_unification was trying to do this, but was failing to handle the case where there are additional templates that aren't fully resolved. Tested x86_64-pc-linux-gnu, applying to trunk.
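For readers who want a concrete picture of the DR 115 case described above, here is a minimal sketch. It is not the testcase from the PR; the names `f`, `call_with_int` and `resolve_demo` are made up for illustration. The template-id `f<int>` is passed where the context supplies no target type, the one-parameter template is fully resolved by `f<int>`, and the two-parameter template is not (its `U` stays unknown), so the expectation under DR 115 is that the first specialization is chosen.

```cpp
// Hypothetical illustration of DR 115; not taken from the PR or the patch.
template <class T> int f(T) { return 1; }              // f<int> fully resolves this one
template <class T, class U> int f(T, U) { return 2; }  // f<int> leaves U unresolved here

// F receives no type information from the call context itself.
template <class F> int call_with_int(F fn) { return fn(42); }

int resolve_demo() {
  // Exactly one specialization is fully resolved by the template-id,
  // so &f<int> is expected to denote f<int>(int).
  return call_with_int(&f<int>);
}
```

With a compiler that implements the rule, `resolve_demo()` should return 1; a compiler with the bug described in this message would presumably reject the call instead.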
Re: Removed unused cp_binding_level field names_size. (issue4662052)
OK. Jason
Re: [AVX2] PATCH: Fixed 64-bit integer of gather* intrinsic declaration.
On Thu, Jun 23, 2011 at 9:38 AM, Kirill Yukhin wrote:
> Hi,
> I've updated 64-bit integer variant of gather intrinsics declarations.
> It now works while passing '-pedantic' flag.
> Also I fixed copy-paste problem to avoid AVX2 tests to be executed on
> AVX-capable machines.
>
> ChangeLog.avx2 entry:
> 2011-06-22  Yukhin Kirill
>
>        * gcc/config/i386/avx2intrin.h (_mm_i32gather_epi64): Fixed
>        pointer type.
>        (_mm256_i32gather_epi64): Likewise.
>        (_mm256_mask_i32gather_epi64): Likewise.
>        (_mm_i64gather_epi64): Likewise.
>        (_mm_mask_i64gather_epi64): Likewise.
>        (_mm256_i64gather_epi64): Likewise.
>        (_mm256_mask_i64gather_epi64): Likewise.
>
> testsuite/ChangeLog.avx2 entry:
> 2011-06-22  Yukhin Kirill
>
>        * gcc.target/i386/avx2-vbroadcastsd_pd-2.c: Fixed test to run
>        on AVX2 machine, not AVX.
>        * gcc.target/i386/avx2-vbroadcastsi128-2.c: Likewise.
>        * gcc.target/i386/avx2-vbroadcastss_ps-2.c: Likewise.
>        * gcc.target/i386/avx2-vbroadcastss_ps256-2.c: Likewise.
>        * gcc.target/i386/avx2-vextracti128-2.c: Likewise.
>        * gcc.target/i386/avx2-vinserti128-2.c: Likewise.
>        * gcc.target/i386/avx2-vpmaskloadq256-2.c: Likewise.
>        * gcc.target/i386/avx2-vpmaskstoreq256-2.c: Likewise.
>
> Going to commit to avx2 branch.
>

I checked all 4 patches into the avx2 branch.  But I had to apply 2
ChangeLog.avx2 patches by hand.  Please sync with the avx2 branch and
double-check your patches next time.

Thanks.

--
H.J.
Re: C++ PATCH for c++/49418 (lost cv-quals on template parameter type)
On Tue, Jun 21, 2011 at 12:03 PM, Jason Merrill wrote:
> cv-qualifiers are dropped from a function parameter type in order to produce
> the parameter-type-list, but the parameter itself still has the qualified
> type within the function body.  When I added cv-qualification stripping to
> type_decays_to, it started affecting instantiation of template function
> parameters, which is wrong.  So I've reverted that change and instead added
> explicit cv-qualification stripping to lambda_return_type.
>
> While looking at this, I've also changed a few more TYPE_MAIN_VARIANTs to
> cv_unqualified.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49519

--
H.J.
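The language rule Jason describes in the quoted text (top-level cv-qualifiers vanish from the function type but remain on the parameter inside the body) can be checked with a small sketch. This is an illustration only, not the PR 49418 testcase, and the function names are hypothetical.

```cpp
#include <type_traits>

// Hypothetical probe: reports whether the parameter is const inside the body.
template <class T> bool body_sees_const(const T t) {
  // decltype of an unparenthesized parameter name yields its declared type,
  // which here is "const T".
  return std::is_const<decltype(t)>::value;
}

// Reports whether the instantiated function's type has dropped the
// top-level const from its parameter-type-list.
inline bool signature_drops_const() {
  return std::is_same<decltype(&body_sees_const<int>), bool (*)(int)>::value;
}

// The same rule for ordinary function types.
static_assert(std::is_same<void(const int), void(int)>::value,
              "top-level const is not part of the function type");
```

Both `body_sees_const<int>(0)` and `signature_drops_const()` should be true on a conforming compiler; the bug under discussion concerned the first of the two properties being lost during template instantiation.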
[testsuite] ARM test pr42093.c: thumb2 or thumb1
Test gcc.target/arm/pr42093.c, added by Ramana, requires support for
arm_thumb2 but fails for those targets.  The patch for which it was
added modified support for thumb1.  Should the test instead require
arm_thumb1_ok, as in this patch?

Janis

2011-06-23  Janis Johnson

	* gcc.target/arm/pr42093.c: Require thumb1, not thumb2.

Index: gcc.target/arm/pr42093.c
===
--- gcc.target/arm/pr42093.c (revision 175313)
+++ gcc.target/arm/pr42093.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-options "-mthumb -O2" } */
-/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_thumb1_ok } */
 /* { dg-final { scan-assembler-not "tbb" } } */
 /* { dg-final { scan-assembler-not "tbh" } } */
[testsuite] ARM tests vfp-ldm*.c and vfp-stm*.c
Tests target/arm/vfp-ldm*.c and vfp-stm*.c add -mfloat-abi=softfp but
fail if multilib flags override that option.  This patch skips the test
for multilibs that specify a different value for -mfloat-abi.

Tested on arm-none-linux-gnueabi with 43 sets of multilib flags.

OK for trunk, and later for 4.6?

Janis

2011-06-23  Janis Johnson

	* gcc.target/arm/vfp-ldmdbd.c: Skip if multilib flags override
	-mfloat-abi from test.
	* gcc.target/arm/vfp-ldmdbs.c: Ditto.
	* gcc.target/arm/vfp-ldmiad.c: Ditto.
	* gcc.target/arm/vfp-ldmias.c: Ditto.
	* gcc.target/arm/vfp-stmdbd.c: Ditto.
	* gcc.target/arm/vfp-stmdbs.c: Ditto.
	* gcc.target/arm/vfp-stmiad.c: Ditto.
	* gcc.target/arm/vfp-stmias.c: Ditto.

Index: gcc.target/arm/vfp-ldmdbd.c
===
--- gcc.target/arm/vfp-ldmdbd.c (revision 175313)
+++ gcc.target/arm/vfp-ldmdbd.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-skip-if "don't override float abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
 /* { dg-options "-O2 -mfpu=vfp -mfloat-abi=softfp" } */

 extern void bar (double);
Index: gcc.target/arm/vfp-ldmdbs.c
===
--- gcc.target/arm/vfp-ldmdbs.c (revision 175313)
+++ gcc.target/arm/vfp-ldmdbs.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-skip-if "don't override float abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
 /* { dg-options "-O2 -mfpu=vfp -mfloat-abi=softfp" } */

 extern void baz (float);
Index: gcc.target/arm/vfp-ldmiad.c
===
--- gcc.target/arm/vfp-ldmiad.c (revision 175313)
+++ gcc.target/arm/vfp-ldmiad.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-skip-if "don't override float abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
 /* { dg-options "-O2 -mfpu=vfp -mfloat-abi=softfp" } */

 extern void bar (double);
Index: gcc.target/arm/vfp-ldmias.c
===
--- gcc.target/arm/vfp-ldmias.c (revision 175313)
+++ gcc.target/arm/vfp-ldmias.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-skip-if "don't override float abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
 /* { dg-options "-O2 -mfpu=vfp -mfloat-abi=softfp" } */

 extern void baz (float);
Index: gcc.target/arm/vfp-stmdbd.c
===
--- gcc.target/arm/vfp-stmdbd.c (revision 175313)
+++ gcc.target/arm/vfp-stmdbd.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-skip-if "don't override float abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
 /* { dg-options "-O2 -mfpu=vfp -mfloat-abi=softfp" } */

 void
Index: gcc.target/arm/vfp-stmdbs.c
===
--- gcc.target/arm/vfp-stmdbs.c (revision 175313)
+++ gcc.target/arm/vfp-stmdbs.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-skip-if "don't override float abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
 /* { dg-options "-O2 -mfpu=vfp -mfloat-abi=softfp" } */

 void
Index: gcc.target/arm/vfp-stmiad.c
===
--- gcc.target/arm/vfp-stmiad.c (revision 175313)
+++ gcc.target/arm/vfp-stmiad.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-skip-if "don't override float abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
 /* { dg-options "-O2 -mfpu=vfp -mfloat-abi=softfp" } */

 void
Index: gcc.target/arm/vfp-stmias.c
===
--- gcc.target/arm/vfp-stmias.c (revision 175313)
+++ gcc.target/arm/vfp-stmias.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-skip-if "don't override float abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=softfp" } } */
 /* { dg-options "-O2 -mfpu=vfp -mfloat-abi=softfp" } */

 void
Re: __sync_swap* [ rename sync builtins ]
On 06/23/2011 08:14 PM, Ian Lance Taylor wrote:
> On Tue, Jun 21, 2011 at 4:03 PM, Andrew MacLeod wrote:
>> On 06/21/2011 06:26 PM, Graham Stott wrote:
>>> This looks to have broken the go frontend
>>
>> ah, missed its .cc file, and I guess it doesn't build by default :-P
>>
>> This ought to fix it, checking in as obvious...
>
> Note that the files in gcc/go/gofrontend are mirrored from a different
> repository, and should be changed there before changing the files in
> the gcc repository.  I am slowly fixing the cases where this causes
> generic gcc problems like this one.
>
> I will take care of moving this trivial patch over.

oops, sorry I had no idea... and thanks :-)

Andrew
Re: __sync_swap* [ rename sync builtins ]
On Tue, Jun 21, 2011 at 4:03 PM, Andrew MacLeod wrote:
> On 06/21/2011 06:26 PM, Graham Stott wrote:
>> This looks to have broken the go frontend
>
> ah, missed its .cc file, and I guess it doesn't build by default :-P
>
> This ought to fix it, checking in as obvious...

Note that the files in gcc/go/gofrontend are mirrored from a different
repository, and should be changed there before changing the files in
the gcc repository.  I am slowly fixing the cases where this causes
generic gcc problems like this one.

I will take care of moving this trivial patch over.

Ian
[PING] ARM testsuite for fp16
Ping for ARM testsuite patch: http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01195.html This patch is a different approach for the ARM tests that use effective target arm_neon_fp16_ok. It was apparently lost in the thread about an earlier proposal for those tests. The tests that use arm_neon_fp16_ok and add_options_for_arm_neon_fp16 don't require neon, so the patch removes "_neon" from their names. The check itself now skips tests if multilib flags -mfpu or -mfloat-abi are not appropriate, and defines defaults for -mfloat-abi and -mfpu for use in the tests. Tested on arm-none-linux-gnueabi with 43 sets of multilib flags. OK for trunk, and later for 4.6? Janis
Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
Hi,
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2128,6 +2128,9 @@ static const unsigned int x86_avx256_split_unaligned_load
>  static const unsigned int x86_avx256_split_unaligned_store
>    = m_COREI7 | m_BDVER1 | m_GENERIC;
>
> +static const unsigned int x86_prefer_avx128
> +  = m_BDVER1;

What is the reason for stuff like this not to go into
initial_ix86_tune_features?  I sort of liked them better when they were
individual flags, but having the target tuning flags spread across
multiple places seems unnecessary.

Honza
Re: [PATCH] [annotalysis] Merge of google/integration into annotalysis
google/integration up to revision 175149 has now been merged into
branches/annotalysis.

On Thu, Jun 23, 2011 at 3:37 PM, Diego Novillo wrote:
> On Thu, Jun 23, 2011 at 18:08, Delesley Hutchins wrote:
>> This patch merges recent changes from google/integration into
>> branches/annotalysis.
>>
>> Bootstrapped and passed GCC regression testsuite on x86_64-unknown-linux-gnu.
>>
>> Okay for branches/annotalysis?
>
> OK.
>
>
> Diego.
>

--
DeLesley Hutchins | Software Engineer | deles...@google.com | 505-206-0315
[cxx-mem-model] __sync_mem_load
Here's the patch for __sync_mem_load, complete with tests.

I'll change and correct the actual implementation of the load pattern
later (I think I have the x86 fence wrong).  It occurs to me that if I
implement the __sync_mem_thread_fence (model) routine, then the
appropriate fences can be created by using that __sync instead of
sync_synchronize all the time as well.

I'm trying to gather a set of optimal, or at least decent, sequences for
each pattern.  I'll go back and make changes when I have collected some
and finished all the other atomic operations.  It'll be easier then.

Andrew

	* doc/extend.texi (__sync_mem_load): Document.
	* c-family/c-common.c (resolve_overloaded_builtin): Add
	BUILT_IN_SYNC_MEM_LOAD_N.
	* optabs.c (expand_sync_mem_load): New.
	* optabs.h (enum direct_optab_index): Add DOI_sync_mem_load.
	(sync_mem_load_optab): Define.
	* genopinit.c: Add entry for sync_mem_load.
	* builtins.c (expand_builtin_sync_mem_load): New.
	(expand_builtin): Handle BUILT_IN_SYNC_MEM_LOAD_*
	* sync-builtins.def: Add entries for BUILT_IN_SYNC_MEM_LOAD_*.
	* testsuite/gcc.dg/sync-mem-invalid.c: Add invalid load tests.
	* testsuite/gcc.dg/sync-mem.h: Add load executable tests.
	* builtin-types.def (BT_FN_I{1,2,4,8,16}_VPTR_INT): New.
	* expr.h (expand_sync_mem_load): Declare.
	* fortran/types.def (BT_FN_I{1,2,4,8,16}_VPTR_INT): New.
	* config/i386/sync.md (sync_mem_load): New pattern.

Index: doc/extend.texi
===
*** doc/extend.texi (revision 175331)
--- doc/extend.texi (working copy)
*** This means that all previous memory stor
*** 6729,6734
--- 6729,6747
  previous memory loads have been satisfied, but following memory reads
  are not prevented from being speculated to before the barrier.

+ @item @var{type} __sync_mem_load (@var{type} *ptr, int memmodel, ...)
+ @findex __sync_mem_load
+ This builtin implements an atomic load operation within the constraints of a
+ memory model.  It returns the contents of @code{*@var{ptr}}.
+
+ The valid memory model variants for this builtin are
+ __SYNC_MEM_RELAXED, __SYNC_MEM_SEQ_CST, __SYNC_MEM_ACQUIRE, and
+ __SYNC_MEM_CONSUME.  The target pattern is responsible
+ for issuing the different synchronization instructions.  It should default to
+ the more restrictive memory model, the sequentially consistent model.  If
+ nothing is implemented for the target, the compiler will implement it by
+ issuing a memory barrier, the load, and then another memory barrier.
+
  @item @var{type} __sync_mem_exchange (@var{type} *ptr, @var{type} value, int memmodel, ...)
  @findex __sync_mem_exchange
  This builtin implements an atomic exchange operation within the
Index: c-family/c-common.c
===
*** c-family/c-common.c (revision 175331)
--- c-family/c-common.c (working copy)
*** resolve_overloaded_builtin (location_t l
*** 9061,9066
--- 9061,9067
      case BUILT_IN_SYNC_LOCK_TEST_AND_SET_N:
      case BUILT_IN_SYNC_LOCK_RELEASE_N:
      case BUILT_IN_SYNC_MEM_EXCHANGE_N:
+     case BUILT_IN_SYNC_MEM_LOAD_N:
        {
  	int n = sync_resolve_size (function, params);
  	tree new_function, first_param, result;
Index: optabs.c
===
*** optabs.c (revision 175331)
--- optabs.c (working copy)
*** expand_sync_lock_test_and_set (rtx mem,
*** 7057,7062
--- 7057,7104
    return NULL_RTX;
  }

+ /* This function expands the atomic load operation:
+    return the atomically loaded value in MEM.
+
+    MEMMODEL is the memory model variant to use.
+    TARGET is an optional place to stick the return value.  */
+
+ rtx
+ expand_sync_mem_load (enum memmodel model, rtx mem, rtx target)
+ {
+   enum machine_mode mode = GET_MODE (mem);
+   enum insn_code icode;
+
+   /* If the target supports the load directly, great.  */
+   icode = direct_optab_handler (sync_mem_load_optab, mode);
+   if (icode != CODE_FOR_nothing)
+     {
+       struct expand_operand ops[3];
+
+       create_output_operand (&ops[0], target, mode);
+       create_fixed_operand (&ops[1], mem);
+       create_integer_operand (&ops[2], model);
+       if (maybe_expand_insn (icode, 3, ops))
+ 	return ops[0].value;
+     }
+
+   /* For any model other than RELAXED, perform a synchronization first.  */
+   if (model != MEMMODEL_RELAXED)
+     expand_builtin_sync_synchronize ();
+
+   /* If the result is unused, don't bother loading.  */
+   if (target != const0_rtx)
+     emit_move_insn (target, mem);
+   else
+     target = NULL_RTX;
+
+   /* For SEQ_CST, also emit a barrier after the load.  */
+   if (model == MEMMODEL_SEQ_CST)
+     expand_builtin_sync_synchronize ();
+
+   return target;
+ }
+
  /* This function expands th
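Editorial sketch, not part of the posted patch: the control flow of the generic fallback in expand_sync_mem_load (barrier unless relaxed, plain load, second barrier only for seq_cst) can be mirrored in portable C++ as follows. The `model` enum and `fallback_load` are hypothetical names, and `std::atomic_thread_fence` merely stands in for the full barrier that expand_builtin_sync_synchronize would emit. The sketch follows the structure of the fallback as posted and makes no claim about which fences are actually sufficient for each model, which the author says he still intends to revisit.

```cpp
#include <atomic>

// Hypothetical mirror of the four models the documentation lists for loads.
enum class model { relaxed, consume, acquire, seq_cst };

// Mirrors the shape of the fallback path only: fence, load, optional fence.
inline int fallback_load(const std::atomic<int>& mem, model m) {
  // "For any model other than RELAXED, perform a synchronization first."
  if (m != model::relaxed)
    std::atomic_thread_fence(std::memory_order_seq_cst);

  // The load itself carries no ordering; the fences are meant to supply it.
  int value = mem.load(std::memory_order_relaxed);

  // "For SEQ_CST, also emit a barrier after the load."
  if (m == model::seq_cst)
    std::atomic_thread_fence(std::memory_order_seq_cst);

  return value;
}
```

For example, `fallback_load(x, model::seq_cst)` on an atomic holding 7 returns 7 after issuing a fence on each side of the load.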
Re: [PATCH] [annotalysis] Merge of google/integration into annotalysis
On Thu, Jun 23, 2011 at 18:08, Delesley Hutchins wrote:
> This patch merges recent changes from google/integration into
> branches/annotalysis.
>
> Bootstrapped and passed GCC regression testsuite on x86_64-unknown-linux-gnu.
>
> Okay for branches/annotalysis?

OK.

Diego.
[PATCH] [annotalysis] Merge of google/integration into annotalysis
This patch merges recent changes from google/integration into branches/annotalysis. Bootstrapped and passed GCC regression testsuite on x86_64-unknown-linux-gnu. Okay for branches/annotalysis? -- DeLesley Hutchins | Software Engineer | deles...@google.com | 505-206-0315 Property changes on: . ___ Modified: svnmerge-integrated - /branches/google/integration:1-171014 /trunk:1-170776,170779-170934 + /branches/google/integration:1-175319 /trunk:1-170776,170779-170934 Modified: svn:mergeinfo Merged /trunk:r171161 Merged /branches/google/integration:r171167-175149 Index: libstdc++-v3/scripts/extract_symvers.in === --- libstdc++-v3/scripts/extract_symvers.in (revision 175318) +++ libstdc++-v3/scripts/extract_symvers.in (working copy) @@ -52,6 +52,9 @@ ${readelf} ${lib} |\ sed -e 's/ \[: [A-Fa-f0-9]*\] //' -e '/\.dynsym/,/^$/p;d' |\ egrep -v ' (LOCAL|UND) ' |\ + sed -e 's/ : / :_/g' |\ + sed -e 's/ : / :_/g' |\ + sed -e 's/ : / :_/g' |\ awk '{ if ($4 == "FUNC" || $4 == "NOTYPE") printf "%s:%s\n", $4, $8; else if ($4 == "OBJECT" || $4 == "TLS") Index: libstdc++-v3/src/Makefile.in === --- libstdc++-v3/src/Makefile.in (revision 175318) +++ libstdc++-v3/src/Makefile.in (working copy) @@ -484,7 +484,8 @@ $(XTEMPLATE_FLAGS) \ $(WARN_CXXFLAGS) \ $(OPTIMIZE_CXXFLAGS) \ - $(CONFIG_CXXFLAGS) + $(CONFIG_CXXFLAGS) \ + $($(@)_no_omit_frame_pointer) # libstdc++ libtool notes @@ -522,6 +523,9 @@ $(CXX) $(OPT_LDFLAGS) $(SECTION_LDFLAGS) $(AM_CXXFLAGS) $(LTLDFLAGS) -o $@ debugdir = debug + +# Google-specific pessimization +functexcept.lo_no_omit_frame_pointer = -fno-omit-frame-pointer all: all-am .SUFFIXES: Index: libstdc++-v3/src/Makefile.am === --- libstdc++-v3/src/Makefile.am (revision 175318) +++ libstdc++-v3/src/Makefile.am (working copy) @@ -395,7 +395,8 @@ $(XTEMPLATE_FLAGS) \ $(WARN_CXXFLAGS) \ $(OPTIMIZE_CXXFLAGS) \ - $(CONFIG_CXXFLAGS) + $(CONFIG_CXXFLAGS) \ + $($(@)_no_omit_frame_pointer) # libstdc++ libtool notes @@ -469,3 +470,6 @@ install_debug: (cd ${debugdir} && $(MAKE) \ 
toolexeclibdir=$(glibcxx_toolexeclibdir)/debug install) + +# Google-specific pessimization +functexcept.lo_no_omit_frame_pointer = -fno-omit-frame-pointer Index: libstdc++-v3/include/ext/vstring.h === --- libstdc++-v3/include/ext/vstring.h (revision 175318) +++ libstdc++-v3/include/ext/vstring.h (working copy) @@ -37,6 +37,21 @@ #include #include +#if __google_stl_debug_string && !defined(_GLIBCXX_DEBUG) +# undef _GLIBCXX_DEBUG_ASSERT +# undef _GLIBCXX_DEBUG_PEDASSERT +// Perform additional checks (but only in this file). +# define _GLIBCXX_DEBUG_ASSERT(_Condition) \ + if (! (_Condition)) {\ +char buf[512]; \ +__builtin_snprintf(buf, sizeof(buf), \ + "%s:%d: %s: Assertion '%s' failed.\n", \ + __FILE__, __LINE__, __func__, # _Condition); \ +std::__throw_runtime_error(buf); \ + } +# define _GLIBCXX_DEBUG_PEDASSERT(_Condition) _GLIBCXX_DEBUG_ASSERT(_Condition) +#endif + namespace __gnu_cxx _GLIBCXX_VISIBILITY(default) { _GLIBCXX_BEGIN_NAMESPACE_VERSION @@ -2793,4 +2808,12 @@ #include "vstring.tcc" +#if __google_stl_debug_string && !defined(_GLIBCXX_DEBUG) +// Undo our defines, so they don't affect anything else. +# undef _GLIBCXX_DEBUG_ASSERT +# undef _GLIBCXX_DEBUG_PEDASSERT +# define _GLIBCXX_DEBUG_ASSERT(_Condition) +# define _GLIBCXX_DEBUG_PEDASSERT(_Condition) +#endif + #endif /* _VSTRING_H */ Index: libstdc++-v3/include/ext/sso_string_base.h === --- libstdc++-v3/include/ext/sso_string_base.h (revision 175318) +++ libstdc++-v3/include/ext/sso_string_base.h (working copy) @@ -86,6 +86,13 @@ { if (!_M_is_local()) _M_destroy(_M_allocated_capacity); +#if __google_stl_debug_string_dangling + else { + // Wipe local storage for destructed string with 0xCD. + // This mimics what DebugAllocation does to free()d memory. 
+ __builtin_memset(_M_local_data, 0xcd, sizeof(_M_local_data)); +} +#endif } void @@ -169,15 +176,29 @@ _M_leak() { } void - _M_set_length(size_type __n) + _M_set_length_no_wipe(size_type __n) { _M_length(__n); traits_type::assign(_M_data()[__n], _CharT()); } + void + _M_set_length(size_type __n) + { +#if __google_stl_debug_string_dangling + if (__n + 1 < _M_length()) + { +
Re: [testsuite] ARM ivopts tests: skip for no thumb support
On 06/23/2011 02:56 PM, Ramana Radhakrishnan wrote:
> On 23 June 2011 22:36, Janis Johnson wrote:
>> Tests gcc.target/arm/ivopts*.c add -mthumb but fail on targets without
>> thumb support; skip those targets.  The tests save temporary files and
>> need to remove them at the end, easily done with cleanup-saved-temps.
>>
>> Test ivopts-6.c is the only one of the set that does not require thumb2
>> support in the check for object-size, and it fails for -march=iwmmxt
>> and iwmmxt2; the check should probably be used on that test as well,
>> although I haven't included it here.
>
> I'm not sure I understand the change for ivopts-6.c :
>
> It's skipping if there is no Thumb support by default but the test
> assumes the test will run with -marm on the command line ?
>
> Ramana

Oops, I got carried away and didn't notice that it uses -marm rather
than -mthumb.  I'll take another look at that one.

Janis
Re: [testsuite] ARM ivopts tests: skip for no thumb support
On 23 June 2011 22:36, Janis Johnson wrote:
> Tests gcc.target/arm/ivopts*.c add -mthumb but fail on targets without
> thumb support; skip those targets.  The tests save temporary files and
> need to remove them at the end, easily done with cleanup-saved-temps.
>
> Test ivopts-6.c is the only one of the set that does not require thumb2
> support in the check for object-size, and it fails for -march=iwmmxt
> and iwmmxt2; the check should probably be used on that test as well,
> although I haven't included it here.

I'm not sure I understand the change for ivopts-6.c :

It's skipping if there is no Thumb support by default but the test
assumes the test will run with -marm on the command line ?

Ramana
Re: [Patch, AVR]: Fix PR46779
On 06/23/2011 01:15 PM, Denis Chertykov wrote:
>>    text    data     bss     dec     hex filename
>>   10032      25       0   10057    2749 bld-avr-orig/gcc/z.o
>>    5816      25       0    5841    16d1 bld-avr-new/gcc/z.o
>
> Richard, can you send me this z.c file ?
> Right now I notice that the new code is worse.

That's gcc.c-torture/compile/950612-1.c.

r~
Re: Mark variables addressable if they are copied using libcall in RTL expander
On Thu, Jun 23, 2011 at 12:16 PM, Jakub Jelinek wrote:
> On Thu, Jun 23, 2011 at 12:02:35PM -0700, Easwaran Raman wrote:
>> +      if (y_expr)
>> +        mark_addressable (y_expr);
>
> Please watch formatting, a tab should be used instead of 8 spaces.
>
>> +      if (x_expr)
>> +        mark_addressable (x_expr);
>
> Ditto.
>
>> @@ -1084,6 +1084,8 @@ initialize_argument_information (int num_actuals A
>>               && TREE_CODE (base) != SSA_NAME
>>               && (!DECL_P (base) || MEM_P (DECL_RTL (base)
>>             {
>> +               mark_addressable (args[i].tree_value);
>> +
>
> Likewise, plus the line is indented too much, each level should be indented
> by 2 chars.
>
>        Jakub
>

I have attached a new patch that fixes the formatting issues.

Thanks,
Easwaran

Index: gcc/expr.c
===
--- gcc/expr.c (revision 175346)
+++ gcc/expr.c (working copy)
@@ -1181,8 +1181,19 @@ emit_block_move_hints (rtx x, rtx y, rtx size, enu
   else if (may_use_call
            && ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (x))
            && ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (y)))
-    retval = emit_block_move_via_libcall (x, y, size,
-                                          method == BLOCK_OP_TAILCALL);
+    {
+      /* Since x and y are passed to a libcall, mark the corresponding
+         tree EXPR as addressable.  */
+      tree y_expr = MEM_EXPR (y);
+      tree x_expr = MEM_EXPR (x);
+      if (y_expr)
+        mark_addressable (y_expr);
+      if (x_expr)
+        mark_addressable (x_expr);
+      retval = emit_block_move_via_libcall (x, y, size,
+                                            method == BLOCK_OP_TAILCALL);
+    }
+
   else
     emit_block_move_via_loop (x, y, size, align);

Index: gcc/calls.c
===
--- gcc/calls.c (revision 175346)
+++ gcc/calls.c (working copy)
@@ -1084,6 +1084,8 @@ initialize_argument_information (int num_actuals A
               && TREE_CODE (base) != SSA_NAME
               && (!DECL_P (base) || MEM_P (DECL_RTL (base)
             {
+              mark_addressable (args[i].tree_value);
+
               /* We can't use sibcalls if a callee-copied argument is
                  stored in the current function's frame.  */
               if (!call_from_thunk_p && DECL_P (base) && !TREE_STATIC (base))
@@ -3524,7 +3526,12 @@ emit_library_call_value_1 (int retval, rtx orgfun,
             }

           if (MEM_P (val) && !must_copy)
-            slot = val;
+            {
+              tree val_expr = MEM_EXPR (val);
+              if (val_expr)
+                mark_addressable (val_expr);
+              slot = val;
+            }
           else
             {
               slot = assign_temp (lang_hooks.types.type_for_mode (mode, 0),
[testsuite] ARM wmul tests: require arm_dsp_multiply
Tests wmul-[1234].c and mla-2.c in gcc.target/arm require support that
the arm backend identifies as TARGET_DSP_MULTIPLY.  The tests all
specify a -march option with that support, but it is overridden by
multilib flags.

This patch adds a new effective target, arm_dsp_multiply, and requires
it for those tests instead of having them specify a -march value.  This
means that the tests will be skipped for older targets and test coverage
relies on testing for some newer multilibs.

The same effective target is needed for tests smlaltb-1.c, smlaltt-1.c,
smlatb-1.c, and smlatt-1.c, but those also need to be renamed so the
scans don't pass just because the file name is in the assembly file.

OK for trunk, and later for 4.6?

(btw, I'm currently testing ARM compile-only tests with 43 sets of
multilib flags)

2011-06-23  Janis Johnson

	* lib/target-supports.exp (check_effective_target_arm_dsp_multiply):
	New.
	* gcc.target/arm/wmul-1.c: Require arm_dsp_multiply, don't supply
	-march.
	* gcc.target/arm/wmul-2.c: Likewise.
	* gcc.target/arm/wmul-3.c: Likewise.
	* gcc.target/arm/wmul-4.c: Likewise.
	* gcc.target/arm/mla-2.c: Likewise.

Index: lib/target-supports.exp
===
--- lib/target-supports.exp (revision 175313)
+++ lib/target-supports.exp (working copy)
@@ -1902,6 +1902,33 @@
     }
 }

+# Return 1 if this is an ARM target that supports DSP multiply with
+# current multilib flags.
+
+proc check_effective_target_arm_dsp_multiply { } {
+    return [check_no_compiler_messages arm_dsp_multiply assembly {
+	#if defined(__ARM_ARCH_2__) || defined(__ARM_ARCH_2A__) \
+	    || defined(__ARM_ARCH_3__) || defined(__ARM_ARCH_3M__) \
+	    || defined(__ARM_ARCH_4__) || defined(__ARM_ARCH_4T__) \
+	    || defined(__ARM_ARCH_5T__)
+	# error NOT_SUPPORTED
+	#elif defined(__thumb__) || defined(__thumb2__)
+	# if defined(__ARM_ARCH_5TE__) || defined(__ARM_ARCH_6__) \
+	    || defined(__ARM_ARCH_6J__) || defined(__ARM_ARCH_6M__) \
+	    || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) \
+	    || defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7M__) \
+	    || defined(__ARM_ARCH_IWMMXT__) || defined(__ARM_ARCH_IWMMXT2__)
+	#error NOT_SUPPORTED
+	# endif
+	#else
+	# if defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_EP9312__)
+	#error NOT_SUPPORTED
+	# endif
+	#endif
+	int i;
+    }]
+}
+
 # Add the options needed for NEON.  We need either -mfloat-abi=softfp
 # or -mfloat-abi=hard, but if one is already specified by the
 # multilib, use it.  Similarly, if a -mfpu option already enables
Index: gcc.target/arm/wmul-1.c
===
--- gcc.target/arm/wmul-1.c (revision 175313)
+++ gcc.target/arm/wmul-1.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv6t2" } */
+/* { dg-require-effective-target arm_dsp_multiply } */
+/* { dg-options "-O2" } */

 int mac(const short *a, const short *b, int sqr, int *sum)
 {
Index: gcc.target/arm/wmul-2.c
===
--- gcc.target/arm/wmul-2.c (revision 175313)
+++ gcc.target/arm/wmul-2.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv6t2" } */
+/* { dg-require-effective-target arm_dsp_multiply } */
+/* { dg-options "-O2" } */

 void vec_mpy(int y[], const short x[], short scaler)
 {
Index: gcc.target/arm/wmul-3.c
===
--- gcc.target/arm/wmul-3.c (revision 175313)
+++ gcc.target/arm/wmul-3.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv6t2" } */
+/* { dg-require-effective-target arm_dsp_multiply } */
+/* { dg-options "-O2" } */

 int mac(const short *a, const short *b, int sqr, int *sum)
 {
Index: gcc.target/arm/wmul-4.c
===
--- gcc.target/arm/wmul-4.c (revision 175313)
+++ gcc.target/arm/wmul-4.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv6t2" } */
+/* { dg-require-effective-target arm_dsp_multiply } */
+/* { dg-options "-O2" } */

 int mac(const int *a, const int *b, long long sqr, long long *sum)
 {
Index: gcc.target/arm/mla-2.c
===
--- gcc.target/arm/mla-2.c (revision 175313)
+++ gcc.target/arm/mla-2.c (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv7-a" } */
+/* { dg-require-effective-target arm_dsp_multiply } */
+/* { dg-options "-O2" } */

 long long foolong (long long x, short *a, short *b)
 {
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 06/23/2011 07:40 AM, Andrew Stubbs wrote: +++ b/gcc/testsuite/gcc.target/arm/umlal-1.c +/* { dg-final { scan-assembler "umlal" } } */ Don't use the name of the instruction as the test name or the scan will always pass, because the file name shows up in assembly output. See http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01823.html for a proposed effective target that can be used in this test. Janis
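The point about file names can be illustrated with a small sketch. Everything in it is hypothetical: scan_matches only mimics a scan-assembler check conceptually (the real directive matches a regular expression, while this uses a plain substring search), and the assembly strings and the file name widen-mac-1.c are made up for illustration. It assumes, as described above, that the assembly output carries the source file name in a .file directive.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a scan-assembler check: report whether
   PATTERN occurs anywhere in the assembly text ASM_TEXT.  A substring
   search is enough to show the problem.  */
static bool
scan_matches (const char *asm_text, const char *pattern)
{
  return strstr (asm_text, pattern) != NULL;
}

/* Made-up assembly output for a test named umlal-1.c in which the
   compiler did NOT emit a umlal instruction.  */
static const char asm_named_after_insn[] =
  "\t.file\t\"umlal-1.c\"\n"
  "\tmul\tr0, r1, r2\n";

/* The same made-up output for a test whose file name does not
   contain the instruction name.  */
static const char asm_neutral_name[] =
  "\t.file\t\"widen-mac-1.c\"\n"
  "\tmul\tr0, r1, r2\n";
```

With these inputs, scan_matches (asm_named_after_insn, "umlal") is true even though no umlal instruction is present, because the .file line alone satisfies the search; scan_matches (asm_neutral_name, "umlal") is false, which is the result the test needs in order to be meaningful.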
Re: Removed unused cp_binding_level field names_size. (issue4662052)
Also: Tested on x86-64. Ok to commit to trunk? On Thu, Jun 23, 2011 at 2:32 PM, Gabriel Charette wrote: > The names_size member of cp_binding_level was write only. Removed it. > Seems like it was introduced for java in 2002, but it's not used anywhere > anymore in the code. > > Tested with bootstrap and full regression testing. > > 2011-06-23 Gabriel Charette > > * name-lookup.h (cp_binding_level): Removed unused > member names_size. Update all users. > > diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c > index 953edd5..8bf5f5f 100644 > --- a/gcc/cp/name-lookup.c > +++ b/gcc/cp/name-lookup.c > @@ -541,7 +541,6 @@ add_decl_to_level (tree decl, cxx_scope *b) > necessary. */ > TREE_CHAIN (decl) = b->names; > b->names = decl; > - b->names_size++; > > /* If appropriate, add decl to separate list of statics. We > include extern variables because they might turn out to be > diff --git a/gcc/cp/name-lookup.h b/gcc/cp/name-lookup.h > index 009b5d9..5f266eb 100644 > --- a/gcc/cp/name-lookup.h > +++ b/gcc/cp/name-lookup.h > @@ -191,9 +191,6 @@ struct GTY(()) cp_binding_level { > are wrapped in TREE_LISTs; the TREE_VALUE is the OVERLOAD. */ > tree names; > > - /* Count of elements in names chain. */ > - size_t names_size; > - > /* A chain of NAMESPACE_DECL nodes. */ > tree namespaces; > > > -- > This patch is available for review at http://codereview.appspot.com/4662052 >
[testsuite] ARM ivopts tests: skip for no thumb support
Tests gcc.target/arm/ivopts*.c add -mthumb but fail on targets without thumb support; skip those targets. The tests save temporary files and need to remove them at the end, easily done with cleanup-saved-temps. Test ivopts-6.c is the only one of the set that does not require thumb2 support in the check for object-size, and it fails for -march=iwmmxt and iwmmxt2; the check should probably be used on that test as well, although I haven't included it here. OK for trunk? 2011-06-23 Janis Johnson * gcc.target/arm/ivopts-2.c: Require thumb support, clean up temporary files. * gcc.target/arm/ivopts-3.c: Likewise. * gcc.target/arm/ivopts-4.c: Likewise. * gcc.target/arm/ivopts-5.c: Likewise. * gcc.target/arm/ivopts-6.c: Likewise. * gcc.target/arm/ivopts.c: Likewise. Index: gcc.target/arm/ivopts-2.c === --- gcc.target/arm/ivopts-2.c (revision 175313) +++ gcc.target/arm/ivopts-2.c (working copy) @@ -1,4 +1,5 @@ /* { dg-do assemble } */ +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */ /* { dg-options "-Os -mthumb -fdump-tree-ivopts -save-temps" } */ extern void foo2 (short*); @@ -16,3 +17,4 @@ /* { dg-final { scan-tree-dump-times "PHI <" 1 "ivopts"} } */ /* { dg-final { object-size text <= 26 { target arm_thumb2_ok } } } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ +/* { dg-final { cleanup-saved-temps "ivopts" } } */ Index: gcc.target/arm/ivopts-3.c === --- gcc.target/arm/ivopts-3.c (revision 175313) +++ gcc.target/arm/ivopts-3.c (working copy) @@ -1,4 +1,5 @@ /* { dg-do assemble } */ +/* { dg-skip-if "" { ! 
{ arm_thumb1_ok || arm_thumb2_ok } } } */ /* { dg-options "-Os -mthumb -fdump-tree-ivopts -save-temps" } */ extern unsigned int foo2 (short*) __attribute__((pure)); @@ -18,3 +19,4 @@ /* { dg-final { scan-tree-dump-times ", x" 0 "ivopts"} } */ /* { dg-final { object-size text <= 30 { target arm_thumb2_ok } } } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ +/* { dg-final { cleanup-saved-temps "ivopts" } } */ Index: gcc.target/arm/ivopts-4.c === --- gcc.target/arm/ivopts-4.c (revision 175313) +++ gcc.target/arm/ivopts-4.c (working copy) @@ -1,4 +1,5 @@ /* { dg-do assemble } */ +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */ /* { dg-options "-mthumb -Os -fdump-tree-ivopts -save-temps" } */ extern unsigned int foo (int*) __attribute__((pure)); @@ -19,3 +20,4 @@ /* { dg-final { scan-tree-dump-times ", x" 0 "ivopts"} } */ /* { dg-final { object-size text <= 36 { target arm_thumb2_ok } } } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ +/* { dg-final { cleanup-saved-temps "ivopts" } } */ Index: gcc.target/arm/ivopts-5.c === --- gcc.target/arm/ivopts-5.c (revision 175313) +++ gcc.target/arm/ivopts-5.c (working copy) @@ -1,4 +1,5 @@ /* { dg-do assemble } */ +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */ /* { dg-options "-Os -mthumb -fdump-tree-ivopts -save-temps" } */ extern unsigned int foo (int*) __attribute__((pure)); @@ -18,3 +19,4 @@ /* { dg-final { scan-tree-dump-times ", x" 0 "ivopts"} } */ /* { dg-final { object-size text <= 30 { target arm_thumb2_ok } } } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ +/* { dg-final { cleanup-saved-temps "ivopts" } } */ Index: gcc.target/arm/ivopts-6.c === --- gcc.target/arm/ivopts-6.c (revision 175313) +++ gcc.target/arm/ivopts-6.c (working copy) @@ -1,4 +1,5 @@ /* { dg-do assemble } */ +/* { dg-skip-if "" { ! 
{ arm_thumb1_ok || arm_thumb2_ok } } } */ /* { dg-options "-Os -fdump-tree-ivopts -save-temps -marm" } */ void @@ -13,3 +14,4 @@ /* { dg-final { scan-tree-dump-times "PHI <" 1 "ivopts"} } */ /* { dg-final { object-size text <= 32 } } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ +/* { dg-final { cleanup-saved-temps "ivopts" } } */ Index: gcc.target/arm/ivopts.c === --- gcc.target/arm/ivopts.c (revision 175313) +++ gcc.target/arm/ivopts.c (working copy) @@ -1,4 +1,5 @@ /* { dg-do assemble } */ +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */ /* { dg-options "-Os -mthumb -fdump-tree-ivopts -save-temps" } */ void @@ -13,3 +14,4 @@ /* { dg-final { scan-tree-dump-times "PHI <" 1 "ivopts"} } */ /* { dg-final { object-size text <= 20 { target arm_thumb2_ok } } } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ +/* { dg-final { cleanup-saved-temps "ivopts" } } */
Removed unused cp_binding_level field names_size. (issue4662052)
The names_size member of cp_binding_level was write only. Removed it. Seems like it was introduced for java in 2002, but it's not used anywhere anymore in the code. Tested with bootstrap and full regression testing. 2011-06-23 Gabriel Charette * name-lookup.h (cp_binding_level): Removed unused member names_size. Update all users. diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c index 953edd5..8bf5f5f 100644 --- a/gcc/cp/name-lookup.c +++ b/gcc/cp/name-lookup.c @@ -541,7 +541,6 @@ add_decl_to_level (tree decl, cxx_scope *b) necessary. */ TREE_CHAIN (decl) = b->names; b->names = decl; - b->names_size++; /* If appropriate, add decl to separate list of statics. We include extern variables because they might turn out to be diff --git a/gcc/cp/name-lookup.h b/gcc/cp/name-lookup.h index 009b5d9..5f266eb 100644 --- a/gcc/cp/name-lookup.h +++ b/gcc/cp/name-lookup.h @@ -191,9 +191,6 @@ struct GTY(()) cp_binding_level { are wrapped in TREE_LISTs; the TREE_VALUE is the OVERLOAD. */ tree names; -/* Count of elements in names chain. */ -size_t names_size; - /* A chain of NAMESPACE_DECL nodes. */ tree namespaces; -- This patch is available for review at http://codereview.appspot.com/4662052
Re: RFA PR middle-end/48770
On 06/22/11 08:25, Bernd Schmidt wrote: > On 06/14/2011 05:32 PM, Jeff Law wrote: >> >> This version incorporates suggestions from Bernd. Basically we have >> reload1.c set reload_completed internally rather than deferring it into >> ira.c. That allows the call to reload() to return whether or not a DCE >> pass is desirable at the end of reload. >> >> That in turn allows us to avoid the DF clumsiness of the previous version. >> >> Bootstrapped and regression tested on x86_64-unknown-linux-gnu. > > This is OK, although I'm slightly confused - wasn't your original > problem caused by the other delete_dead_insn call? It was caused by the recursive nature of delete_dead_insn. Removing the equivalencing insn is OK, it's deleting the insns feeding the equivalencing insn that causes problems. We could have delete_dead_insn not remove the equivalencing insn, but that's going to trigger a lot more calls to DCE after reload has completed and thus a higher compile-time hit. jeff
Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
On Thu, Jun 23, 2011 at 03:41:01PM -0500, Fang, Changpeng wrote: > This patch enables 128-bit avx instruction generation for the auto-vectorizer > for AMD bulldozer > machines. This enablement gives additional ~3% improvement on polyhedron 2005 > and cpu2006 > floating point programs. > > The patch passed bootstrapping on a x86_64-unknown-linux-gnu system with > Bulldozer cores. > > Is it OK to commit to trunk and backport to 4.6 branch? For 4.6 branch, if it is approved for trunk, please wait after 4.6.1 is released. Jakub
[PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
Hi, This patch enables 128-bit avx instruction generation for the auto-vectorizer for AMD bulldozer machines. This enablement gives additional ~3% improvement on polyhedron 2005 and cpu2006 floating point programs. The patch passed bootstrapping on a x86_64-unknown-linux-gnu system with Bulldozer cores. Is it OK to commit to trunk and backport to 4.6 branch? Thanks, Changpeng From b5015593b0b30b14783866ac68c2c5f2e014d206 Mon Sep 17 00:00:00 2001 From: Changpeng Fang Date: Wed, 22 Jun 2011 15:03:05 -0700 Subject: [PATCH] Auto-vectorizer generates 128-bit AVX insns by default for bdver1 * config/i386/i386.opt (mprefer-avx128): Redefine the flag as a Mask option. * config/i386/i386.c (x86_prefer_avx128): New tune option definition. (ix86_option_override_internal): Enable the generation of the 128-bit instructions when x86_prefer_avx128 is set. (ix86_preferred_simd_mode): Use TARGET_PREFER_AVX128. (ix86_autovectorize_vector_sizes): Use TARGET_PREFER_AVX128. --- gcc/config/i386/i386.c | 13 ++--- gcc/config/i386/i386.opt |2 +- 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 014401b..1f5113f 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -2128,6 +2128,9 @@ static const unsigned int x86_avx256_split_unaligned_load static const unsigned int x86_avx256_split_unaligned_store = m_COREI7 | m_BDVER1 | m_GENERIC; +static const unsigned int x86_prefer_avx128 + = m_BDVER1; + /* In case the average insn count for single function invocation is lower than this constant, emit fast (but longer) prologue and epilogue code. 
*/ @@ -2623,6 +2626,7 @@ ix86_target_string (int isa, int flags, const char *arch, const char *tune, { "-mvzeroupper", MASK_VZEROUPPER }, { "-mavx256-split-unaligned-load", MASK_AVX256_SPLIT_UNALIGNED_LOAD}, { "-mavx256-split-unaligned-store", MASK_AVX256_SPLIT_UNALIGNED_STORE}, +{ "-mprefer-avx128", MASK_PREFER_AVX128}, }; const char *opts[ARRAY_SIZE (isa_opts) + ARRAY_SIZE (flag_opts) + 6][2]; @@ -3672,6 +3676,9 @@ ix86_option_override_internal (bool main_args_p) if ((x86_avx256_split_unaligned_store & ix86_tune_mask) && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE)) target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE; + if ((x86_prefer_avx128 & ix86_tune_mask) + && !(target_flags_explicit & MASK_PREFER_AVX128)) + target_flags |= MASK_PREFER_AVX128; } } else @@ -34614,7 +34621,7 @@ ix86_preferred_simd_mode (enum machine_mode mode) return V2DImode; case SFmode: - if (TARGET_AVX && !flag_prefer_avx128) + if (TARGET_AVX && !TARGET_PREFER_AVX128) return V8SFmode; else return V4SFmode; @@ -34622,7 +34629,7 @@ ix86_preferred_simd_mode (enum machine_mode mode) case DFmode: if (!TARGET_VECTORIZE_DOUBLE) return word_mode; - else if (TARGET_AVX && !flag_prefer_avx128) + else if (TARGET_AVX && !TARGET_PREFER_AVX128) return V4DFmode; else if (TARGET_SSE2) return V2DFmode; @@ -34639,7 +34646,7 @@ ix86_preferred_simd_mode (enum machine_mode mode) static unsigned int ix86_autovectorize_vector_sizes (void) { - return (TARGET_AVX && !flag_prefer_avx128) ? 32 | 16 : 0; + return (TARGET_AVX && !TARGET_PREFER_AVX128) ? 32 | 16 : 0; } /* Initialize the GCC target structure. */ diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index 21e0def..9886b7b 100644 --- a/gcc/config/i386/i386.opt +++ b/gcc/config/i386/i386.opt @@ -388,7 +388,7 @@ Do dispatch scheduling if processor is bdver1 and Haifa scheduling is selected. 
mprefer-avx128 -Target Report Var(flag_prefer_avx128) Init(0) +Target Report Mask(PREFER_AVX128) SAVE Use 128-bit AVX instructions instead of 256-bit AVX instructions in the auto-vectorizer. ;; ISA support -- 1.7.0.4
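The defaulting logic that the patch adds to ix86_option_override_internal can be sketched in isolation. This is a simplified model and not the GCC code: the bit values, the tuning constants M_BDVER1 and M_COREI7, and the helper apply_tune_default are hypothetical; only the names target_flags, target_flags_explicit, and MASK_PREFER_AVX128 echo the patch. The sketch shows a tuning default switching the option on only when the user has not set it explicitly.

```c
/* Hypothetical bit values; in GCC the option mask is generated from
   i386.opt and the tuning masks are defined in i386.c.  */
#define MASK_PREFER_AVX128 (1u << 0)
#define M_BDVER1           (1u << 0)
#define M_COREI7           (1u << 1)

/* Processors for which 128-bit AVX is preferred by default
   (modelled on x86_prefer_avx128 in the patch).  */
static const unsigned int prefer_avx128_tune = M_BDVER1;

/* Return TARGET_FLAGS with MASK_PREFER_AVX128 added when the selected
   tuning (TUNE_MASK) prefers it and the user did not set the option
   explicitly (no matching bit in TARGET_FLAGS_EXPLICIT).  */
static unsigned int
apply_tune_default (unsigned int target_flags,
                    unsigned int target_flags_explicit,
                    unsigned int tune_mask)
{
  if ((prefer_avx128_tune & tune_mask)
      && !(target_flags_explicit & MASK_PREFER_AVX128))
    target_flags |= MASK_PREFER_AVX128;
  return target_flags;
}
```

Under this model, tuning for bdver1 with no explicit option yields the flag set, while an explicit negative option (explicit bit set, flag bit clear) is left alone. That is presumably why the option has to become a Mask rather than a Var: the explicit-setting check relies on target_flags_explicit, which tracks mask bits.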
Re: [Patch, AVR]: Fix PR46779
2011/6/16 Richard Henderson :
> On 06/15/2011 02:58 PM, Richard Henderson wrote:
>> Indeed, I can work around this particular crash by either
>> hacking Z to be call-saved, or hacking the frame pointer to
>> not be required. The former of course changes the abi, and
>> the second produces awful code due to too many copies from
>> the stack pointer. So neither option is "preferred".
>
> Perhaps I spoke too soon re the frame pointer. The old
> code is even worse.
>
>    text    data     bss     dec     hex filename
>   10032      25       0   10057    2749 bld-avr-orig/gcc/z.o
>    5816      25       0    5841    16d1 bld-avr-new/gcc/z.o

Richard, can you send me this z.c file? Right now I notice that the new code is worse.

Denis.
[PATCH, PR 49516] Avoid SRA mem-refing its scalar replacements
Hi, When SRA tries to modify an assignment where on one side it should put a new scalar replacement but the other is actually an aggregate with a number of replacements for it, it will generate MEM-REFs into the former replacement which can lead to miscompilations. This is avoided by the simple patch below. With it, we deal with these situations like with other type-casts that SRA cannot handle: we channel the data through the original variable and the original statement. The testcase is not miscompiled with 4.6 gcc but the bug is just latent there. I have verified the problem goes away on i686-linux. I have bootstrapped and tested the patch on x86_64-linux too. I intend to do a full i686 bootstrap and test but so far have not managed to do it. OK for trunk and 4.6 after it is unfrozen? Thanks, Martin 2011-06-22 Martin Jambor PR tree-optimizations/49516 * tree-sra.c (sra_modify_assign): Choose the safe path for aggregate copies if we also did scalar replacements. * testsuite/g++.dg/tree-ssa/pr49516.C: New test. Index: src/gcc/tree-sra.c === --- src.orig/gcc/tree-sra.c +++ src/gcc/tree-sra.c @@ -2804,7 +2804,8 @@ sra_modify_assign (gimple *stmt, gimple_ there to do the copying and then load the scalar replacements of the LHS. This is what the first branch does. 
*/ - if (gimple_has_volatile_ops (*stmt) + if (modify_this_stmt + || gimple_has_volatile_ops (*stmt) || contains_vce_or_bfcref_p (rhs) || contains_vce_or_bfcref_p (lhs)) { Index: src/gcc/testsuite/g++.dg/tree-ssa/pr49516.C === --- /dev/null +++ src/gcc/testsuite/g++.dg/tree-ssa/pr49516.C @@ -0,0 +1,86 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +extern "C" void abort (void); + +typedef int int32; +typedef unsigned int uint32; +typedef unsigned long long uint64; +typedef short int16; + +class Tp { + public: + Tp(int, const int segment, const int index) __attribute__((noinline)); + + inline bool operator==(const Tp& other) const; + inline bool operator!=(const Tp& other) const; + int GetType() const { return type_; } + int GetSegment() const { return segment_; } + int GetIndex() const { return index_; } + private: + inline static bool IsValidSegment(const int segment); + static const int kSegmentBits = 28; + static const int kTypeBits = 4; + static const int kMaxSegment = (1 << kSegmentBits) - 1; + + union { + +struct { + int32 index_; + uint32 segment_ : kSegmentBits; + uint32 type_ : kTypeBits; +}; +struct { + int32 dummy_; + uint32 type_and_segment_; +}; +uint64 value_; + }; +}; + +Tp::Tp(int t, const int segment, const int index) + : index_(index), segment_(segment), type_(t) {} + +inline bool Tp::operator==(const Tp& other) const { + return value_ == other.value_; +} +inline bool Tp::operator!=(const Tp& other) const { + return value_ != other.value_; +} + +class Range { + public: + inline Range(const Tp& position, const int count) __attribute__((always_inline)); + inline Tp GetBeginTokenPosition() const; + inline Tp GetEndTokenPosition() const; + private: + Tp position_; + int count_; + int16 begin_index_; + int16 end_index_; +}; + +inline Range::Range(const Tp& position, +const int count) +: position_(position), count_(count), begin_index_(0), end_index_(0) +{ } + +inline Tp Range::GetBeginTokenPosition() const { + return position_; +} +inline Tp 
Range::GetEndTokenPosition() const { + return Tp(position_.GetType(), position_.GetSegment(), +position_.GetIndex() + count_); +} + +int main () +{ + Range range(Tp(0, 0, 3), 0); + if (!(range.GetBeginTokenPosition() == Tp(0, 0, 3))) +abort (); + + if (!(range.GetEndTokenPosition() == Tp(0, 0, 3))) +abort(); + + return 0; +}
[lra] some cleaning up to speed up LRA
The following patch removes code that was used for some experiments in pseudo live range splitting during the assignment sub-pass and, as a consequence, speeds LRA up. The patch was successfully bootstrapped on x86-64. 2011-06-23 Vladimir Makarov * lra-int.h (struct lra_bb_info, lra_bb_info): Remove. * lra.c (lra_bb_info, init_bb_info, finish_bb_info): Remove. (lra): Remove calls of init_bb_info and finish_bb_info. * lra-lives.c (live_regs): Remove. (make_hard_regno_born, make_pseudo_live, make_pseudo_dead): Don't update live_regs. (process_bb_lives): Ditto. Don't set up lra_bb_info. (lra_create_live_ranges): Don't initialize/finalize live_regs. Index: lra.c === --- lra.c (revision 175313) +++ lra.c (working copy) @@ -45,35 +45,6 @@ along with GCC; see the file COPYING3. #include "lra-int.h" #include "df.h" -/* Info about BBs used by several LRA files. Remember that we never - create new BBs during LRA. */ -struct lra_bb_info *lra_bb_info; - -/* Allocate and initialize the BB info. */ -static void -init_bb_info (void) -{ - basic_block bb; - - lra_bb_info = (struct lra_bb_info *) xmalloc (sizeof (struct lra_bb_info) - * last_basic_block); - FOR_EACH_BB (bb) -bitmap_initialize (&lra_bb_info[bb->index].live_in_regs, &reg_obstack); -} - -/* Finish and free the BB info. */ -static void -finish_bb_info (void) -{ - basic_block bb; - - FOR_EACH_BB (bb) -bitmap_clear (&lra_bb_info[bb->index].live_in_regs); - free (lra_bb_info); -} - - - /* Hard registers currently not available for allocation. It can changed after some registers become not eliminable. 
*/ HARD_REG_SET lra_no_alloc_regs; @@ -2041,7 +2012,6 @@ lra (FILE *f) lra_dump_file = f; - init_bb_info (); init_insn_recog_data (); #ifdef ENABLE_CHECKING @@ -2141,7 +2111,6 @@ lra (FILE *f) lra_live_ranges_finish (); lra_contraints_finish (); finish_reg_info (); - finish_bb_info (); bitmap_clear (&lra_constraint_insn_stack_bitmap); VEC_free (rtx, heap, lra_constraint_insn_stack); finish_insn_recog_data (); Index: lra-int.h === --- lra-int.h (revision 175313) +++ lra-int.h (working copy) @@ -48,19 +48,6 @@ lra_get_preferred_class (int regno) return reg_preferred_class (regno); } -/* Info about BBs used by several LRA files. */ -struct lra_bb_info -{ - /* DFA creates a bit different data (DF_LR_IN) than we need for LRA - live range splitting. E.g. DF_LR_IN might be not accurate for BB - having EH predecessors. */ - bitmap_head live_in_regs; -}; - -/* Info about BBs used by several LRA files. Remember that we never - create new BBs during LRA. */ -extern struct lra_bb_info *lra_bb_info; - typedef struct lra_live_range *lra_live_range_t; /* The structure describes program points where a given pseudo lives. Index: lra-lives.c === --- lra-lives.c (revision 175313) +++ lra-lives.c (working copy) @@ -72,10 +72,6 @@ static sparseset pseudos_live; /* Set of hard regs (except eliminable ones) currently live. */ static HARD_REG_SET hard_regs_live; -/* Another representation of living pseudos and hard registers at the - current point. */ -static bitmap_head live_regs; - /* Set of pseudos and hard registers start living/dying. 
*/ static sparseset start_living, start_dying; @@ -283,7 +279,6 @@ make_hard_regno_born (int regno) || TEST_HARD_REG_BIT (hard_regs_live, regno)) return; SET_HARD_REG_BIT (hard_regs_live, regno); - bitmap_set_bit (&live_regs, regno); sparseset_set_bit (start_living, regno); EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i) SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs, regno); @@ -300,7 +295,6 @@ make_hard_regno_dead (int regno) gcc_assert (regno < FIRST_PSEUDO_REGISTER); sparseset_set_bit (start_dying, regno); CLEAR_HARD_REG_BIT (hard_regs_live, regno); - bitmap_clear_bit (&live_regs, regno); } /* Mark pseudo REGNO as currently living, update conflicting hard @@ -314,7 +308,6 @@ mark_pseudo_live (int regno) gcc_assert (regno >= FIRST_PSEUDO_REGISTER); gcc_assert (! sparseset_bit_p (pseudos_live, regno)); sparseset_set_bit (pseudos_live, regno); - bitmap_set_bit (&live_regs, regno); IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs, hard_regs_live); if ((complete_info_p || lra_get_regno_hard_regno (regno) < 0) @@ -335,7 +328,6 @@ mark_pseudo_dead (int regno) gcc_assert (regno >= FIRST_PSEUDO_REGISTER); gcc_assert (sparseset_bit_p (pseudos_live, regno)); sparseset_clear_bit (pseudos_live, regno); - bitmap_clear_bit (&live_regs, regno); sparseset_set_bit (start_dying, regno); if (complete_info_p || lra_get_regno_hard_regno (regno) < 0) { @@ -488,16
Re: Mark variables addressable if they are copied using libcall in RTL expander
On Thu, Jun 23, 2011 at 12:02:35PM -0700, Easwaran Raman wrote: > + if (y_expr) > +mark_addressable (y_expr); Please watch formatting, a tab should be used instead of 8 spaces. > + if (x_expr) > +mark_addressable (x_expr); Ditto. > @@ -1084,6 +1084,8 @@ initialize_argument_information (int num_actuals A > && TREE_CODE (base) != SSA_NAME > && (!DECL_P (base) || MEM_P (DECL_RTL (base) > { > + mark_addressable (args[i].tree_value); > + Likewise, plus the line is indented too much, each level should be indented by 2 chars. Jakub
New German PO file for 'gcc' (version 4.6.0)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the German team of translators. The file is available at: http://translationproject.org/latest/gcc/de.po (This file, 'gcc-4.6.0.de.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: http://translationproject.org/latest/gcc/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: http://translationproject.org/domain/gcc.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
Re: Mark variables addressable if they are copied using libcall in RTL expander
On Thu, Jun 23, 2011 at 3:22 AM, Eric Botcazou wrote: >> So, what's the patch(es) that need approval now? > > Original expr.c patch for PR rtl-optimization/49429 + adjusted and augmented > calls.c patch for PR target/49454. Everything is in this thread. > > Easwaran, would you mind posting a consolidated patch? > > -- > Eric Botcazou > Here is the revised patch. Bootstraps and all tests pass on x86_64-unknown-linux. OK for trunk? 2011-06-23 Easwaran Raman PR rtl-optimization/49429 PR target/49454 * expr.c (emit_block_move_hints): Mark MEM_EXPR(x) and MEM_EXPR(y) addressable if emit_block_move_via_libcall is used to copy y into x. * calls.c (initialize_argument_information): Mark an argument addressable if it is passed by invisible reference. (emit_library_call_value_1): Mark MEM_EXPR (val) addressable if it is passed by reference. Index: gcc/expr.c === --- gcc/expr.c (revision 175346) +++ gcc/expr.c (working copy) @@ -1181,8 +1181,19 @@ emit_block_move_hints (rtx x, rtx y, rtx size, enu else if (may_use_call && ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (x)) && ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (y))) -retval = emit_block_move_via_libcall (x, y, size, - method == BLOCK_OP_TAILCALL); +{ + /* Since x and y are passed to a libcall, mark the corresponding + tree EXPR as addressable. */ + tree y_expr = MEM_EXPR (y); + tree x_expr = MEM_EXPR (x); + if (y_expr) +mark_addressable (y_expr); + if (x_expr) +mark_addressable (x_expr); + retval = emit_block_move_via_libcall (x, y, size, + method == BLOCK_OP_TAILCALL); +} + else emit_block_move_via_loop (x, y, size, align); Index: gcc/calls.c === --- gcc/calls.c (revision 175346) +++ gcc/calls.c (working copy) @@ -1084,6 +1084,8 @@ initialize_argument_information (int num_actuals A && TREE_CODE (base) != SSA_NAME && (!DECL_P (base) || MEM_P (DECL_RTL (base) { + mark_addressable (args[i].tree_value); + /* We can't use sibcalls if a callee-copied argument is stored in the current function's frame. 
*/ if (!call_from_thunk_p && DECL_P (base) && !TREE_STATIC (base)) @@ -3524,7 +3526,12 @@ emit_library_call_value_1 (int retval, rtx orgfun, } if (MEM_P (val) && !must_copy) - slot = val; +{ + tree val_expr = MEM_EXPR (val); + if (val_expr) +mark_addressable (val_expr); + slot = val; +} else { slot = assign_temp (lang_hooks.types.type_for_mode (mode, 0),
[PATCH] Fix tree-ssa/asm-1.c testcase (PR testsuite/49512)
Hi! This testcase checks whether two 2-digit numbers occur in *.optimized just once. They can from time to time match decl_uid=. too though, so this patch ensures DECL_UID isn't printed. Regtested on x86_64-linux and i686-linux, committed as obvious. 2011-06-23 Jakub Jelinek PR testsuite/49512 * gcc.dg/tree-ssa/asm-1.c: Use -fdump-tree-optimized-nouid instead of -fdump-tree-optimized. --- gcc/testsuite/gcc.dg/tree-ssa/asm-1.c.jj2008-09-05 12:54:29.0 +0200 +++ gcc/testsuite/gcc.dg/tree-ssa/asm-1.c 2011-06-23 10:19:09.0 +0200 @@ -2,7 +2,7 @@ as a def. */ /* { dg-do compile } */ -/* { dg-options "-O -fdump-tree-optimized" } */ +/* { dg-options "-O -fdump-tree-optimized-nouid" } */ void f() { Jakub
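A rough sketch of the failure mode described above, under stated assumptions: count_matches is a hypothetical helper that counts plain substring occurrences (the real check counts regular-expression matches in the dump), and both dump strings are invented for illustration rather than taken from asm-1.c. It shows how a number embedded in a printed declaration UID can inflate a count that is expected to be exactly one.

```c
#include <string.h>

/* Count non-overlapping occurrences of PATTERN in TEXT.
   Hypothetical helper; an empty pattern counts as zero matches.  */
static int
count_matches (const char *text, const char *pattern)
{
  int count = 0;
  size_t len = strlen (pattern);

  if (len == 0)
    return 0;
  for (const char *p = strstr (text, pattern);
       p != NULL;
       p = strstr (p + len, pattern))
    count++;
  return count;
}

/* Invented dump text in which a temporary's UID happens to contain
   the digits being counted.  */
static const char dump_with_uid[] =
  "  x_1 = 42;\n"
  "  D.1742 = x_1;\n";

/* The same invented text with the UID digits suppressed.  */
static const char dump_without_uid[] =
  "  x_1 = 42;\n"
  "  D.xxxx = x_1;\n";
```

Here count_matches (dump_with_uid, "42") is 2, so a check expecting exactly one match would fail, while count_matches (dump_without_uid, "42") is 1. Whether a given run is affected depends on which UIDs the compiler happens to assign, which fits the "from time to time" in the description.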
Re: [pph] Stream scope_chain->bindings instead of global namespace (issue4661045)
Yes I did fill the form, included you as an approver, haven't heard back from it yet. Gab On Thu, Jun 23, 2011 at 10:23 AM, Diego Novillo wrote: > > On Thu, Jun 23, 2011 at 13:21, Diego Novillo wrote: > > I've made a couple of minor edits to comments and formatting and > > committed to the branch (final patch below). > > Incidentally, did you fill-in the svn write access form? You've > produced enough good patches already. Time for you to be able to > commit your own. > > > Diego.
[RFA:] Removing target-libiberty on branches
Here's the patch I tested for 4.6, native x86_64-unknown-linux-gnu, cross to cris-axis-elf, both with old and new ("breaking") newlib. Ok for 4.6 and after testing, earlier branches? 2011-06-22 Hans-Peter Nilsson PR regression/47836 PR bootstrap/23656 PR other/47733 PR bootstrap/49247 PR c/48825 * configure.ac (target_libraries): Remove target-libiberty. Remove all target-specific settings adding target-libiberty to skipdirs and noconfigdirs. Remove checking target_configdirs and removing target-libiberty but keeping target-libgcc if otherwise empty. * Makefile.def (target_modules): Don't add libiberty. (dependencies): Remove all traces of target-libiberty. * configure, Makefile.in: Regenerate. Index: configure.ac === --- configure.ac(revision 175300) +++ configure.ac(working copy) @@ -186,9 +186,8 @@ libgcj="target-libffi \ # these libraries are built for the target environment, and are built after # the host libraries and the host tools (which may be a cross compiler) -# +# Note that libiberty is not a target library. target_libraries="target-libgcc \ - target-libiberty \ target-libgloss \ target-newlib \ target-libgomp \ @@ -595,14 +594,14 @@ case "${target}" in ;; *-*-kaos*) # Remove unsupported stuff on all kaOS configurations. -skipdirs="target-libiberty ${libgcj} target-libstdc++-v3 target-librx" +skipdirs="${libgcj} target-libstdc++-v3 target-librx" skipdirs="$skipdirs target-libobjc target-examples target-groff target-gperf" skipdirs="$skipdirs zlib fastjar target-libjava target-boehm-gc target-zlib" noconfigdirs="$noconfigdirs target-libgloss" ;; *-*-netbsd*) # Skip some stuff on all NetBSD configurations. -noconfigdirs="$noconfigdirs target-newlib target-libiberty target-libgloss" +noconfigdirs="$noconfigdirs target-newlib target-libgloss" # Skip some stuff that's unsupported on some NetBSD configurations. 
case "${target}" in @@ -614,21 +613,20 @@ case "${target}" in esac ;; *-*-netware*) -noconfigdirs="$noconfigdirs target-newlib target-libiberty target-libgloss ${libgcj} target-libmudflap" +noconfigdirs="$noconfigdirs target-newlib target-libgloss ${libgcj} target-libmudflap" ;; *-*-rtems*) -skipdirs="${skipdirs} target-libiberty" noconfigdirs="$noconfigdirs target-libgloss ${libgcj}" ;; # The tpf target doesn't support gdb yet. *-*-tpf*) -noconfigdirs="$noconfigdirs target-newlib target-libgloss target-libiberty ${libgcj} target-libmudflap gdb tcl tk libgui itcl" +noconfigdirs="$noconfigdirs target-newlib target-libgloss ${libgcj} target-libmudflap gdb tcl tk libgui itcl" ;; *-*-uclinux*) noconfigdirs="$noconfigdirs target-newlib target-libgloss target-rda ${libgcj}" ;; *-*-vxworks*) -noconfigdirs="$noconfigdirs target-newlib target-libgloss target-libiberty target-libstdc++-v3 ${libgcj}" +noconfigdirs="$noconfigdirs target-newlib target-libgloss target-libstdc++-v3 ${libgcj}" ;; alpha*-dec-osf*) # ld works, but does not support shared libraries. 
@@ -656,7 +654,7 @@ case "${target}" in sh*-*-pe|mips*-*-pe|*arm-wince-pe) noconfigdirs="$noconfigdirs ${libgcj}" noconfigdirs="$noconfigdirs target-examples" -noconfigdirs="$noconfigdirs target-libiberty texinfo send-pr" +noconfigdirs="$noconfigdirs texinfo send-pr" noconfigdirs="$noconfigdirs tcl tk itcl libgui sim" noconfigdirs="$noconfigdirs expect dejagnu" # the C++ libraries don't build on top of CE's C libraries @@ -690,7 +688,7 @@ case "${target}" in libgloss_dir=arm ;; arm*-*-symbianelf*) -noconfigdirs="$noconfigdirs ${libgcj} target-libiberty" +noconfigdirs="$noconfigdirs ${libgcj}" libgloss_dir=arm ;; arm-*-pe*) @@ -709,7 +707,7 @@ case "${target}" in noconfigdirs="$noconfigdirs ld target-libgloss ${libgcj}" ;; avr-*-*) -noconfigdirs="$noconfigdirs target-libiberty target-libstdc++-v3 ${libgcj} target-libssp" +noconfigdirs="$noconfigdirs target-libstdc++-v3 ${libgcj} target-libssp" ;; bfin-*-*) unsupported_languages="$unsupported_languages java" @@ -888,7 +886,7 @@ case "${target}" in noconfigdirs="$noconfigdirs ${libgcj}" ;; m68hc11-*-*|m6811-*-*|m68hc12-*-*|m6812-*-*) -noconfigdirs="$noconfigdirs target-libiberty target-libstdc++-v3 ${libgcj}" +noconfigdirs="$noconfigdirs target-libstdc++-v3 ${libgcj}" libgloss_dir=m68hc11 ;; m68k-*-elf*) @@ -918,9 +916,6 @@ case "${target}" in mt-*-*) noconfigdirs="$noconfigdirs sim" ;; - picochip-*-*) -noconfigdirs="$noconfigdirs target-libiberty" -;; powerpc-*-aix*) # copied from rs6000-*-* entry noco
Re: [pph] Stream scope_chain->bindings instead of global namespace (issue4661045)
On Thu, Jun 23, 2011 at 13:21, Diego Novillo wrote:
> I've made a couple of minor edits to comments and formatting and
> committed to the branch (final patch below).

Incidentally, did you fill in the svn write access form? You've produced enough good patches already. Time for you to be able to commit your own.

Diego.
Re: [pph] Stream scope_chain->bindings instead of global namespace (issue4661045)
I've made a couple of minor edits to comments and formatting and committed to the branch (final patch below). Diego. commit 2f0fa40cd3e0c9debb4efae6c65530a7c6d3fb0f Author: dnovillo Date: Thu Jun 23 17:18:27 2011 + 2011-06-22 Gabriel Charette * pph-streamer-in.c (pph_add_names_to_namespace): Replaced by pph_add_bindings_to_namespace. (pph_add_bindings_to_namespace): New. (pph_in_scope_chain): New. (pph_read_file_contents): Remove unused variable file_ns. (pph_read_file_contents): Call pph_in_scope_chain. * pph-streamer-out.c (pph_out_scope_chain): New. (pph_write_file_contents): Call pph_out_scope_chain. * pph-streamer.c (pph_preload_common_nodes): Call lto_streamer_cache_append. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/pph@175346 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph index 8cdc575..24e1584 100644 --- a/gcc/cp/ChangeLog.pph +++ b/gcc/cp/ChangeLog.pph @@ -1,5 +1,18 @@ 2011-06-22 Gabriel Charette + * pph-streamer-in.c (pph_add_names_to_namespace): Replaced by + pph_add_bindings_to_namespace. + (pph_add_bindings_to_namespace): New. + (pph_in_scope_chain): New. + (pph_read_file_contents): Remove unused variable file_ns. + (pph_read_file_contents): Call pph_in_scope_chain. + * pph-streamer-out.c (pph_out_scope_chain): New. + (pph_write_file_contents): Call pph_out_scope_chain. + * pph-streamer.c (pph_preload_common_nodes): + Call lto_streamer_cache_append. + +2011-06-22 Gabriel Charette + * pph-streamer-out.c (pph_out_lang_specific): Removed extra space. (pph_write_tree): Removed extra space. diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c index 0e8c6bf..7d501ef 100644 --- a/gcc/cp/pph-streamer-in.c +++ b/gcc/cp/pph-streamer-in.c @@ -977,15 +977,14 @@ pph_in_lang_type (pph_stream *stream) } -/* Add all the new names declared in NEW_NS to NS. */ +/* Add all bindings declared in BL to NS. 
*/ static void -pph_add_names_to_namespace (tree ns, tree new_ns) +pph_add_bindings_to_namespace (struct cp_binding_level *bl, tree ns) { tree t, chain; - struct cp_binding_level *level = NAMESPACE_LEVEL (new_ns); - for (t = level->names; t; t = chain) + for (t = bl->names; t; t = chain) { /* Pushing a decl into a scope clobbers its DECL_CHAIN. Preserve it. */ @@ -993,18 +992,34 @@ pph_add_names_to_namespace (tree ns, tree new_ns) pushdecl_into_namespace (t, ns); } - for (t = level->namespaces; t; t = chain) + for (t = bl->namespaces; t; t = chain) { /* Pushing a decl into a scope clobbers its DECL_CHAIN. Preserve it. */ - /* FIXME pph: we should first check to see if it isn't already there. */ chain = DECL_CHAIN (t); + + /* FIXME pph: we should first check to see if it isn't already there. +If it is, we should use this function recursively to merge +the bindings in T in the corresponding namespace. */ pushdecl_into_namespace (t, ns); - pph_add_names_to_namespace (t, t); } } +/* Merge scope_chain bindings from STREAM into global_namespace. */ + +static void +pph_in_scope_chain (pph_stream *stream) +{ + struct cp_binding_level *pph_bindings; + + pph_bindings = pph_in_binding_level (stream); + + /* Merge the bindings obtained from STREAM in the global namespace. */ + pph_add_bindings_to_namespace (pph_bindings, global_namespace); +} + + /* Wrap a macro DEFINITION for printing in an error. */ static char * @@ -1129,7 +1144,6 @@ pph_read_file_contents (pph_stream *stream) cpp_ident_use *bad_use; const char *cur_def; cpp_idents_used idents_used; - tree file_ns; pth_load_identifiers (&idents_used, stream); @@ -1142,16 +1156,16 @@ pph_read_file_contents (pph_stream *stream) /* Re-instantiate all the pre-processor symbols defined by STREAM. */ cpp_lt_replay (parse_in, &idents_used); - /* Read global_namespace from STREAM and add all the names defined - there to the current global_namespace. 
*/ - file_ns = pph_in_tree (stream); + /* Read the bindings from STREAM and merge them with the current bindings. */ + pph_in_scope_chain (stream); + if (flag_pph_dump_tree) -pph_dump_namespace (pph_logfile, file_ns); - pph_add_names_to_namespace (global_namespace, file_ns); +pph_dump_namespace (pph_logfile, global_namespace); + keyed_classes = pph_in_tree (stream); - unemitted_tinfo_decls = pph_in_tree_vec (stream); /* FIXME pph: This call replaces the tinfo, we should merge instead. - See pph_in_tree_VEC. */ + See pph_in_tree_vec. */ + unemitted_tinfo_decls = pph_in_tree_vec (stream); } diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c index f219cef..dd26571 100644 --- a/gcc/cp/pph-streamer-out.c +++ b/gcc/cp/pph-streamer-out.c @@ -912,6 +912,35 @@ pph_out_lang_
Re: [pph] Fixed extra space typos in pph-streamer-out.c (issue4663041)
On Wed, Jun 22, 2011 at 19:03, Gabriel Charette wrote:
> 2011-06-22  Gabriel Charette
>
>         * gcc/cp/pph-streamer-out.c (pph_out_lang_specific):
>         Removed extra space.
>         (pph_write_tree): Removed extra space.

OK. Committed to branch.

Diego.
C++ PATCH for c++/49507 (ICE with template dtor defaulted out of class)
Here, if we're going to synthesize the dtor in instantiate_decl, we need to not also do it directly in mark_used.

Tested x86_64-pc-linux-gnu, applying to trunk and 4.6.1 (since Jakub asked about fixing this for 4.6.1 and it seems safe).

commit 2c7d73cc244974ee3e483cdfcddc210fdf98e25a
Author: Jason Merrill
Date:   Thu Jun 23 11:06:43 2011 -0400

    PR c++/49507
    * decl2.c (mark_used): Don't call synthesize_method for
    functions defaulted outside the class.

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index d2f075d..9e5a229 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -4297,6 +4297,9 @@ mark_used (tree decl)
   if (TREE_CODE (decl) == FUNCTION_DECL
       && DECL_NONSTATIC_MEMBER_FUNCTION_P (decl)
       && DECL_DEFAULTED_FN (decl)
+      /* A function defaulted outside the class is synthesized either by
+         cp_finish_decl or instantiate_decl.  */
+      && !DECL_DEFAULTED_OUTSIDE_CLASS_P (decl)
       && ! DECL_INITIAL (decl))
     {
       /* Remember the current location for a function we will end up
diff --git a/gcc/testsuite/g++.dg/cpp0x/defaulted30.C b/gcc/testsuite/g++.dg/cpp0x/defaulted30.C
new file mode 100644
index 000..0bf4425
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/defaulted30.C
@@ -0,0 +1,16 @@
+// PR c++/49507
+// { dg-options -std=c++0x }
+
+template
+struct ConcretePoolKey
+{
+    virtual ~ConcretePoolKey();
+};
+
+template
+ConcretePoolKey::~ConcretePoolKey() = default;
+
+int main()
+{
+    ConcretePoolKey foo;
+}
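For readers without the testcase at hand, the situation this message describes can be sketched as below. This is only an illustration and not the committed test: the template parameter name T, the instantiation with int, and the helper name instantiate_pool_key are assumptions, since the archived copy of the test lost the contents of its angle brackets.

```cpp
#include <cassert>

// Class template with a virtual destructor that is only declared in-class.
template <typename T>
struct ConcretePoolKey
{
    virtual ~ConcretePoolKey();
};

// The destructor is defaulted on a later declaration, outside the class
// body. This is the "defaulted outside the class" case from the ChangeLog.
template <typename T>
ConcretePoolKey<T>::~ConcretePoolKey() = default;

// Hypothetical helper: creating and destroying an object uses the
// destructor, so the compiler has to instantiate and synthesize it.
inline bool instantiate_pool_key()
{
    ConcretePoolKey<int> foo;
    (void) foo;
    return true;
}
```

Compiling this with a C++11 or later compiler should be enough to exercise the path; nothing about the runtime behaviour is interesting.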
Re: [AVX2] PATCH: Fixed 64-bit integer of gather* intrinsic declaration.
On Thu, Jun 23, 2011 at 9:38 AM, Kirill Yukhin wrote:
> Hi,
> I've updated the 64-bit integer variants of the gather intrinsic declarations.
> They now work when the '-pedantic' flag is passed.
> I also fixed a copy-paste problem so that AVX2 tests are no longer executed
> on machines that support AVX but not AVX2.
>
> ChangeLog.avx2 entry:
> 2011-06-22  Yukhin Kirill
>
>         * gcc/config/i386/avx2intrin.h (_mm_i32gather_epi64): Fixed
>         pointer type.
>         (_mm256_i32gather_epi64): Likewise.
>         (_mm256_mask_i32gather_epi64): Likewise.
>         (_mm_i64gather_epi64): Likewise.
>         (_mm_mask_i64gather_epi64): Likewise.
>         (_mm256_i64gather_epi64): Likewise.
>         (_mm256_mask_i64gather_epi64): Likewise.
>
> testsuite/ChangeLog.avx2 entry:
> 2011-06-22  Yukhin Kirill
>
>         * gcc.target/i386/avx2-vbroadcastsd_pd-2.c: Fixed test to run
>         on AVX2 machine, not AVX.
>         * gcc.target/i386/avx2-vbroadcastsi128-2.c: Likewise.
>         * gcc.target/i386/avx2-vbroadcastss_ps-2.c: Likewise.
>         * gcc.target/i386/avx2-vbroadcastss_ps256-2.c: Likewise.
>         * gcc.target/i386/avx2-vextracti128-2.c: Likewise.
>         * gcc.target/i386/avx2-vinserti128-2.c: Likewise.
>         * gcc.target/i386/avx2-vpmaskloadq256-2.c: Likewise.
>         * gcc.target/i386/avx2-vpmaskstoreq256-2.c: Likewise.
>
> Going to commit to avx2 branch.
>
> Thanks, K

I will check it in for you.

Thanks.

--
H.J.
Re: [C++ Patch] PR 44625
OK. Jason
C++ PATCH for c++/49395 (cv-quals on scalar functional cast)
We've been stripping cv-quals from scalar prvalues in other situations, but missed this one.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit 8f1186d00e9f78a8b64f678cc322332568bbec59
Author: Jason Merrill
Date:   Wed Jun 22 16:34:53 2011 -0400

    PR c++/49395
    * init.c (build_zero_init_1): Strip cv-quals from scalar types.

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 3c347a4..62b68f2 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -176,7 +176,7 @@ build_zero_init_1 (tree type, tree nelts, bool static_storage_p,
        initialized are initialized to zero.  */
     ;
   else if (SCALAR_TYPE_P (type))
-    init = convert (type, integer_zero_node);
+    init = convert (cv_unqualified (type), integer_zero_node);
   else if (CLASS_TYPE_P (type))
     {
       tree field;
diff --git a/gcc/testsuite/g++.dg/init/ref18.C b/gcc/testsuite/g++.dg/init/ref18.C
new file mode 100644
index 000..e704077
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/ref18.C
@@ -0,0 +1,12 @@
+// PR c++/49395
+
+volatile int foo();
+struct A { volatile int i; };
+typedef volatile int vi;
+
+volatile int i;
+
+const int& ir1 = foo();
+//const int& ir2 = A().i;  // line 8
+const int& ir3 = static_cast(i);
+const int& ir4 = vi();  // line 10
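The functional-cast case from the test (the `vi()` line) can be tried in isolation with something like the sketch below. The helper name bind_functional_cast is made up for illustration. My reading of the rule is that the reference can only bind if the volatile qualifier is dropped from the scalar prvalue, which is what the patch arranges.

```cpp
#include <cassert>

typedef volatile int vi;

// Hypothetical helper. vi() is a value-initialized scalar prvalue, so it
// yields zero. The const int& binds to a temporary int, whose lifetime is
// extended for as long as the reference r is in scope.
inline int bind_functional_cast()
{
    const int& r = vi();
    return r;
}
```

If the compiler kept the type as volatile int, the reference initialization would be expected to be rejected rather than to produce a different value.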
C++ PATCH for c++/49440 (wrong tinfo comparison with anonymous namespaces)
We were wrongly considering two classes in anonymous namespaces in different files to be the same class due to string comparison. We have a way to avoid that, by putting a '*' at the beginning of the typeinfo name, but weren't doing that in this case because TREE_PUBLIC was wrongly set on the typeinfo by set_linkage_according_to_type. Rather than try to fix that function, we should just use determine_visibility, which already gets this stuff right.

While looking at this, I also noticed that we were adopting a tentative alias as first_global_object_name because its linkage flags hadn't been set properly yet.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit 5576725d9e58e9201695756f6cd228199e3ea724
Author: Jason Merrill
Date:   Wed Jun 22 23:39:42 2011 -0400

    PR c++/49440
    * class.c (set_linkage_according_to_type): Just check TREE_PUBLIC
    on the type's name.

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 09444fb..9e387a6 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -677,21 +677,10 @@ get_vtable_name (tree type)
    the abstract.  */
 
 void
-set_linkage_according_to_type (tree type, tree decl)
+set_linkage_according_to_type (tree type ATTRIBUTE_UNUSED, tree decl)
 {
-  /* If TYPE involves a local class in a function with internal
-     linkage, then DECL should have internal linkage too.  Other local
-     classes have no linkage -- but if their containing functions
-     have external linkage, it makes sense for DECL to have external
-     linkage too.  That will allow template definitions to be merged,
-     for example.  */
-  if (no_linkage_check (type, /*relaxed_p=*/true))
-    {
-      TREE_PUBLIC (decl) = 0;
-      DECL_INTERFACE_KNOWN (decl) = 1;
-    }
-  else
-    TREE_PUBLIC (decl) = 1;
+  TREE_PUBLIC (decl) = 1;
+  determine_visibility (decl);
 }
 
 /* Create a VAR_DECL for a primary or secondary vtable for CLASS_TYPE.
diff --git a/gcc/testsuite/g++.dg/rtti/anon-ns1.C b/gcc/testsuite/g++.dg/rtti/anon-ns1.C
new file mode 100644
index 000..fd6f8af
--- /dev/null
+++ b/gcc/testsuite/g++.dg/rtti/anon-ns1.C
@@ -0,0 +1,15 @@
+// PR c++/49440
+// The typeinfo name for A should start with * so we compare
+// it by address rather than contents.
+
+// { dg-final { scan-assembler "\"\*N\[^\"\]+1AE\"" } }
+
+namespace
+{
+  class A { };
+}
+
+void f()
+{
+  throw A();
+}

commit d567cac789228a15f7ce98350e19a7b4c52429ab
Author: Jason Merrill
Date:   Wed Jun 22 23:40:07 2011 -0400

    * optimize.c (maybe_clone_body): Set linkage flags before
    cgraph_same_body_alias.

diff --git a/gcc/cp/optimize.c b/gcc/cp/optimize.c
index 87302dc..b9e3551 100644
--- a/gcc/cp/optimize.c
+++ b/gcc/cp/optimize.c
@@ -310,8 +310,11 @@ maybe_clone_body (tree fn)
               || (HAVE_COMDAT_GROUP
                   && DECL_WEAK (fns[0])))
           && (flag_syntax_only
-              || cgraph_same_body_alias (cgraph_get_node (fns[0]), clone,
-                                         fns[0])))
+              /* Set linkage flags appropriately before
+                 cgraph_create_function_alias looks at them.  */
+              || (expand_or_defer_fn_1 (clone)
+                  && cgraph_same_body_alias (cgraph_get_node (fns[0]),
+                                             clone, fns[0]))))
         {
           alias = true;
           if (DECL_ONE_ONLY (fns[0]))
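To illustrate why the leading '*' matters, here is a small model of the kind of name comparison the message describes. It is a sketch under assumptions, not the runtime library's code: the function name same_type_name is hypothetical, and the name strings used with it are made up.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical model of comparing two typeinfo names.
// A name that starts with '*' is treated as unique to its translation
// unit, so it only matches when the two pointers are identical.
// Any other name is compared by contents.
inline bool same_type_name(const char* a, const char* b)
{
    if (a == b)
        return true;                 // same object, same type
    if (a[0] == '*' || b[0] == '*')
        return false;                // address comparison only
    return std::strcmp(a, b) == 0;   // ordinary string comparison
}
```

With this model, two separate copies of the string "*anon_A" (standing in for class A in two different files) compare unequal, while two copies of an unmarked name compare equal. Without the '*', the two anonymous-namespace classes would be treated as the same type, which is the bug described above.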
Re: [C++ Patch] PR 44625
On 06/23/2011 06:05 PM, Jason Merrill wrote:
> So we should be able to just reject nested anonymous aggregates and not
> worry about how to make them work.

The patch below appears to work pretty well and regtests fine. I had to tweak the existing error17.C: we no longer emit the warning about no members in the nested anonymous struct, because we bail out early. That doesn't seem a serious issue to me...

Ok?

Paolo.

//

/cp
2011-06-23  Paolo Carlini

        PR c++/44625
        * decl2.c (build_anon_union_vars): Early return error_mark_node
        for a nested anonymous struct.

/testsuite
2011-06-23  Paolo Carlini

        PR c++/44625
        * g++.dg/template/crash107.C: New.
        * g++.dg/template/error17.C: Adjust.

Index: testsuite/g++.dg/template/error17.C
===================================================================
--- testsuite/g++.dg/template/error17.C (revision 175330)
+++ testsuite/g++.dg/template/error17.C (working copy)
@@ -6,5 +6,4 @@ foo()
 {
   union { struct { }; }; // { dg-error "prohibits anonymous struct" "anon" }
   // { dg-error "not inside" "not inside" { target *-*-* } 7 }
-  // { dg-warning "no members" "no members" { target *-*-* } 7 }
 }
Index: testsuite/g++.dg/template/crash107.C
===================================================================
--- testsuite/g++.dg/template/crash107.C (revision 0)
+++ testsuite/g++.dg/template/crash107.C (revision 0)
@@ -0,0 +1,20 @@
+// PR c++/44625
+// { dg-do compile }
+// { dg-options "" }
+
+template struct Vec { // { dg-message "note" }
+    Vec& operator^=(Vec& rhs) {
+        union {
+            struct {FP_ x,y,z;};
+        }; // { dg-error "anonymous struct" }
+        X = y*rhs.z() - z*rhs.y(); // { dg-error "not declared|no member" }
+    }
+    Vec& operator^(Vec& rhs) {
+        return Vec(*this)^=rhs; // { dg-message "required" }
+    }
+};
+Vec v(3,4,12); // { dg-error "no matching" }
+// { dg-message "note" { target *-*-* } 16 }
+Vec V(12,4,3); // { dg-error "no matching" }
+// { dg-message "note" { target *-*-* } 18 }
+Vec c = v^V; // { dg-message "required" }
Index: cp/decl2.c
===================================================================
--- cp/decl2.c (revision 175335)
+++ cp/decl2.c (working copy)
@@ -1327,7 +1327,10 @@ build_anon_union_vars (tree type, tree object)
   /* Rather than write the code to handle the non-union case,
      just give an error.  */
   if (TREE_CODE (type) != UNION_TYPE)
-    error ("anonymous struct not inside named type");
+    {
+      error ("anonymous struct not inside named type");
+      return error_mark_node;
+    }
 
   for (field = TYPE_FIELDS (type);
        field != NULL_TREE;
Re: [PATCH, MELT] loading extra module before setting options
On Thu, 23 Jun 2011 18:05:38 +0200 Pierre Vittet wrote:
> 2011-06-22  Pierre Vittet
>
>         * melt-runtime.c (load_melt_modules_and_do_mode): load extra module
>         before setting options

Thanks. Committed revision 175337 [on the MELT branch].

Cheers
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***
[PING] [PATCH, libstdc++-v3] Add newlib specific ctype_members.cc
Ping for: http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00440.html

> libstdc++-v3/ChangeLog
> 2011-06-06  Yufeng Zhang
>
>         * config/locale/newlib/ctype_members.cc: New file.
>         * acinclude.m4 (GLIBCXX_ENABLE_CLOCALE): Add a new C locale
>         kind: newlib. Configure to use the newlib specific
>         ctype_members.cc when with_newlib is enabled.
>         * configure: Regenerate.
Re: varpool alias reorg
> On Sat, Jun 18, 2011 at 7:19 AM, H.J. Lu wrote:
> > On Sat, Jun 18, 2011 at 1:32 AM, Jan Hubicka wrote:
> >> Hi,
> >> this patch makes symmetric changes to varpool as did the previous series
> >> to cgraph.
> >> Basically the aliases are now represented as separate varpool nodes with
> >> alias reference to the variable they refer to, with some infrastructure
> >> to walk the alias references as needed.
> >>
> >> Bootstrapped/regtested x86_64-linux, committed.
> >>
> >> Honza
> >>
> >>         * lto-symtab.c (lto_varpool_replace_node): Remove code handling
> >>         extra name aliases.
> >>         (lto_symtab_resolve_can_prevail_p): Likewise.
> >>         (lto_symtab_merge_cgraph_nodes): Update alias_of pointers.
> >>         * cgraphbuild.c (record_reference): Remove extra body alias code.
> >>         (mark_load): Likewise.
> >>         (mark_store): Likewise.
> >>         * cgraph.h (varpool_node): Remove extra_name field;
> >>         add alias_of and extraname_alias.
> >>         (varpool_create_variable_alias, varpool_for_node_and_aliases):
> >>         Declare.
> >>         (varpool_alias_aliased_node): New inline function.
> >>         (varpool_variable_node): New function.
> >>         * cgraphunit.c (handle_alias_pairs): Handle also variable aliases.
> >>         * ipa-ref.c (ipa_record_reference): Allow aliases on variables.
> >>         * lto-cgraph.c (lto_output_varpool_node): Update streaming.
> >>         (input_varpool_node): Likewise.
> >>         * lto-streamer-out.c (produce_symtab): Remove extra name aliases.
> >>         (varpool_externally_visible_p): Remove extra body alias code.
> >>         (function_and_variable_visibility): Likewise.
> >>         * tree-ssa-structalias.c (associate_varinfo_to_alias_1): New
> >>         function.
> >>         (ipa_pta_execute): Use it.
> >>         * varpool.c (varpool_remove_node): Remove extra name alias code.
> >>         (varpool_mark_needed_node): Likewise.
> >>         (varpool_analyze_pending_decls): Analyze aliases.
> >>         (assemble_aliases): New function.
> >>         (varpool_assemble_decl): Use it.
> >>         (varpool_create_variable_alias): New function.
> >>         (varpool_extra_name_alias): Rewrite.
> >>         (varpool_for_node_and_aliases): New function.
> >
> > This caused:
> >
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49463
> >
>
> This patch is incorrect as shown in the PR above.

The builtins/strstr-asm.c failure is the same issue I patched some time ago in http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00031.html; in this case the problem simply remained hidden until now. Again, the problem needs the plugin to manifest.

There are two problems here:

1) We do not stream builtin decls and instead merge them outside lto-symtab (by just streaming references to builtins with their asm names). There is at least one extra PR related to this, and simply removing that code is on my TODO list.

2) Aliases within a single unit work only when both the alias and the target use an asm name. This is because internally we store mangled DECL_ASSEMBLER_NAME and the alias_pair code. With LTO this breaks existing code simply because what used to be multiple units, and thus safe, is now a single LTO unit.

Dave Korn fixed part of the problem by introducing mangling code into lto-symtab. His code solves similar problems with aliases from the asm code, but it did not solve the problem with aliases from LTO units, like here, simply because the alias pair code bypasses lto-symtab.

One of the goals of the incorrect patch above is to make lto-symtab do the merging and thus fix this issue (for now, for all decls except builtins). Still, it would be good to solve the problem for non-LTO compilation, too. We discussed introducing a proper symbol table into GCC at the GCC gathering last weekend. That is where I am heading, but it will take some time.

Until that happens, I suggest fixing the testcase the same way as we already fixed memops-asm-lib.c in the 4.6 timeframe.

Bootstrapped/regtested x86_64-linux, OK?
Index: strstr-asm-lib.c
===================================================================
--- strstr-asm-lib.c (revision 175183)
+++ strstr-asm-lib.c (working copy)
@@ -7,6 +7,7 @@ extern int inside_main;
 extern const char *p;
 
+__attribute__ ((used))
 char *
 my_strstr (const char *s1, const char *s2)
 {
[AVX2] PATCH: Fixed 64-bit integer of gather* intrinsic declaration.
Hi,
I've updated the 64-bit integer variants of the gather intrinsic declarations. They now work when the '-pedantic' flag is passed.
I also fixed a copy-paste problem so that AVX2 tests are no longer executed on machines that support AVX but not AVX2.

ChangeLog.avx2 entry:
2011-06-22  Yukhin Kirill

        * gcc/config/i386/avx2intrin.h (_mm_i32gather_epi64): Fixed
        pointer type.
        (_mm256_i32gather_epi64): Likewise.
        (_mm256_mask_i32gather_epi64): Likewise.
        (_mm_i64gather_epi64): Likewise.
        (_mm_mask_i64gather_epi64): Likewise.
        (_mm256_i64gather_epi64): Likewise.
        (_mm256_mask_i64gather_epi64): Likewise.

testsuite/ChangeLog.avx2 entry:
2011-06-22  Yukhin Kirill

        * gcc.target/i386/avx2-vbroadcastsd_pd-2.c: Fixed test to run
        on AVX2 machine, not AVX.
        * gcc.target/i386/avx2-vbroadcastsi128-2.c: Likewise.
        * gcc.target/i386/avx2-vbroadcastss_ps-2.c: Likewise.
        * gcc.target/i386/avx2-vbroadcastss_ps256-2.c: Likewise.
        * gcc.target/i386/avx2-vextracti128-2.c: Likewise.
        * gcc.target/i386/avx2-vinserti128-2.c: Likewise.
        * gcc.target/i386/avx2-vpmaskloadq256-2.c: Likewise.
        * gcc.target/i386/avx2-vpmaskstoreq256-2.c: Likewise.

Going to commit to avx2 branch.

Thanks, K

avx2.gatherdecls.gcc.patch
Description: Binary data
Re: [AVX2] PATCH: Adding support of AVX2 to driver
On Thu, Jun 23, 2011 at 9:27 AM, Kirill Yukhin wrote:
> Hi,
> I added a check for AVX2 support to driver-i386.c.
> I also added entries to doc/extend.texi and updated a couple of tests to
> work with AVX2.
>
> ChangeLog.avx2 entry:
> 2011-06-21  Yukhin Kirill
>
>         * gcc/config/i386/driver-i386.c (host_detect_local_cpu): Define
>         and set has_avx2.
>         * gcc/doc/invoke.texi: Document -mavx2 and -mno-avx2.
>
> testsuite/ChangeLog.avx2 entry:
> 2011-06-20  Yukhin Kirill
>
>         * gcc.target/i386/funcspec-5.c: Add avx2 and no-avx2 targets.
>         * gcc.target/i386/funcspec-6.c: Likewise.
>         * gcc.target/i386/sse-12.c: Likewise.
>
> Going to commit to avx2 branch.

I will check it in for you.

--
H.J.
Re: RFA PR middle-end/49465
On Thu, Jun 23, 2011 at 6:13 PM, Jeff Law wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > > At the core of this PR is a case where we were threading through a > successor of a joiner block where there was already an edge from the > threadable successor to the final target. ie, the successor of the > joiner ended with a conditional branch, after threading both arms > reached the same location. > > For this case, we have to ensure that if the target block has PHIs that > the PHI arguments are the same for both incoming edges so that when the > two edges are combined into a single edge we don't lose information. > > We already check and reject cases where this is not true prior to > registering the jump thread. Unfortunately, the updating code tried to > update the PHIs when no update is necessary or desirable. > > I don't have a small testcase for this bug. > > Bootstrapped and regression tested on x86_64-unknown-linux-gnu; also > built and run cpu2006 C integer benchmarks which were failing with -O2 > - -fast-math. Those benchmarks run correctly with this patch applied. > > OK for trunk? Ok. Thanks, Richard. > Jeff > -BEGIN PGP SIGNATURE- > Version: GnuPG v1.4.11 (GNU/Linux) > Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ > > iQEcBAEBAgAGBQJOA2YgAAoJEBRtltQi2kC7R4kH/1jRce606GysmFOjgbIapPGb > PyQk4NjnrMY3WujiQOghPe6D7wRi+UCs0DLhqW7zTcBUlBAJjCTpo3DbYyfnmSHp > vQu74JYeZItjMfkPRI+6JFkaUlEpGVaAqCK7+CAqI7e3qLEook05lsJfHtzOQ/60 > kGeczpjWn6p7hvVj9Q5p2OCo9+tWsu+vowdQrzF/2nBTrtSyqwE4oEW9fL7h/6Sw > 27/YBheJeGEf7HtwL3Pm/G0M9mEv7u57rIek2G+VweUqwmrADvI5NuVOs1mQdDTw > iYY61sjSOO8iPMAATufndjhLJKATndoXmdHL9sfpYrlsDZqsF6cCrR18o+RFymo= > =ylGm > -END PGP SIGNATURE- >
Re: [AVX2] PATCH: Improved error reporting for AVX2 immediates in vextracti/inserti128. New tests.
On Thu, Jun 23, 2011 at 9:19 AM, Kirill Yukhin wrote:
> Attaching the patch
>
> On Thu, Jun 23, 2011 at 8:18 PM, Kirill Yukhin wrote:
>> Hi,
>> I added a special case for immediate expansion of vinserti128 and
>> vextractf128 (AVX2) to improve error reporting.
>> I also added a bunch of new tests to check error reporting of
>> out-of-range immediates for AVX2.
>>
>> ChangeLog.avx2 entry:
>> 2011-06-20  Yukhin Kirill
>>
>>         * gcc/config/i386/i386.c (ix86_expand_args_builtin): Improved
>>         error diagnostic for extracti128/inserti128 immediates.
>>
>> testsuite/ChangeLog.avx2 entry:
>> 2011-06-20  Yukhin Kirill
>>
>>         * gcc.target/i386/avx2-mpsadbw-3.c: New test to check error
>>         diagnostic while passing wrong immediate.
>>         * gcc.target/i386/avx2-vextracti128-3.c: Likewise.
>>         * gcc.target/i386/avx2-vinserti128-3.c: Likewise.
>>         * gcc.target/i386/avx2-vpalignr256-3.c: Likewise.
>>         * gcc.target/i386/avx2-vpblendd128-3.c: Likewise.
>>         * gcc.target/i386/avx2-vpblendd256-3.c: Likewise.
>>         * gcc.target/i386/avx2-vpblendw-3.c: Likewise.
>>         * gcc.target/i386/avx2-vperm2i128-3.c: Likewise.
>>         * gcc.target/i386/avx2-vpermpd-3.c: Likewise.
>>         * gcc.target/i386/avx2-vpermq-3.c: Likewise.
>>         * gcc.target/i386/avx2-vpshufd-3.c: Likewise.
>>         * gcc.target/i386/avx2-vpshufhw-3.c: Likewise.
>>         * gcc.target/i386/avx2-vpshuflw-3.c: Likewise.
>>         * gcc.target/i386/avx2-vpslldq-3.c: Likewise.
>>         * gcc.target/i386/avx2-vpsrldq-3.c: Likewise.
>>
>> Going to commit to avx2 branch.

I will check it in for you.

Thanks.

--
H.J.
[AVX2] PATCH: Adding support of AVX2 to driver
Hi,
I added a check for AVX2 support to driver-i386.c.
I also added entries to doc/extend.texi and updated a couple of tests to work with AVX2.

ChangeLog.avx2 entry:
2011-06-21  Yukhin Kirill

        * gcc/config/i386/driver-i386.c (host_detect_local_cpu): Define
        and set has_avx2.
        * gcc/doc/invoke.texi: Document -mavx2 and -mno-avx2.

testsuite/ChangeLog.avx2 entry:
2011-06-20  Yukhin Kirill

        * gcc.target/i386/funcspec-5.c: Add avx2 and no-avx2 targets.
        * gcc.target/i386/funcspec-6.c: Likewise.
        * gcc.target/i386/sse-12.c: Likewise.

Going to commit to avx2 branch.

Thanks, K

avx2.drv.gcc.patch
Description: Binary data
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On Thu, Jun 23, 2011 at 4:40 PM, Andrew Stubbs wrote:
> There are many cases where the widening_mult pass does not recognise
> widening multiply-and-accumulate cases simply because there is a type
> conversion step between the multiply and add statements.
>
> This patch should rectify that simply by looking beyond those conversions.

That's surely wrong for (int)(short)int_var. You have to constrain the conversions you look through properly.

Richard.

> OK?
>
> Andrew
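The objection about (int)(short)int_var can be made concrete with a small sketch. The function names are made up for illustration, and the numbers assume a 16-bit short, a 32-bit int, and the usual wrap-around behaviour for out-of-range conversion to short (which is what GCC does).

```cpp
#include <cassert>

static_assert(sizeof(short) == 2 && sizeof(int) == 4,
              "the example values assume 16-bit short and 32-bit int");

// Multiply-accumulate where the product goes through a narrowing
// conversion before the add. The cast discards the high bits.
inline int mac_with_narrowing(int acc, int a, int b)
{
    return acc + (int)(short)(a * b);
}

// The same operation with the conversions "looked through", i.e. ignored.
inline int mac_plain(int acc, int a, int b)
{
    return acc + a * b;
}
```

For a = b = 300 the product is 90000, which does not fit in a short, so the two functions give different results (24464 versus 90000 under the assumptions above). A pass that looks beyond such a conversion would therefore change the program's result, which is presumably why the conversions need to be constrained.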
Re: [PATCH] Use get_pointer_alignment in vect_compute_data_ref_alignment
On Thu, Jun 23, 2011 at 4:07 PM, Jakub Jelinek wrote:
> Hi!
>
> This is a precondition of the __builtin_assume_aligned patch (otherwise
> it wouldn't be useful for vectorization for which it has been designed),
> but I've bootstrapped/regtested it on x86_64-linux and i686-linux
> separately.
>
> get_pointer_alignment can tell us that a pointer is already sufficiently
> aligned and we don't need to use misaligned loads/stores. It should be
> useful even in other cases, such as when the code contains explicit
>   ptr = (double *) (((uintptr_t) ptr) & ~(uintptr_t) 15);
> and similar to guarantee that ptr is already 16 byte aligned, etc.
>
> I haven't played with doing something additionally just with
> SSA_NAME_PTR_INFO (base_addr)->misalign yet if it isn't sufficiently aligned,
> but ->align is big enough, for integer_zerop (misalign) I guess we could
> just set misalign to that, otherwise? Also, I think we can't leave out the
> TYPE_ALIGN_UNIT test, because get_pointer_alignment will often return
> just BITS_PER_UNIT, e.g. for PARM_DECLs, even if they are pointers
> to sufficiently aligned types.
>
> Ok for trunk?

Looks good to me. To make use of misalign info we have to fold it into that of DR_OFFSET/INIT though. I eventually wanted to make get_pointer_alignment also return misalign info (like get_object_alignment_1 does).

Thanks,
Richard.

> 2011-06-23  Jakub Jelinek
>
>         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use
>         get_pointer_alignment to see if base isn't sufficiently aligned.
>
> --- gcc/tree-vect-data-refs.c.jj 2011-06-17 11:02:19.0 +0200
> +++ gcc/tree-vect-data-refs.c 2011-06-23 12:37:43.0 +0200
> @@ -859,7 +859,9 @@ vect_compute_data_ref_alignment (struct
>        || (TREE_CODE (base_addr) == SSA_NAME
>            && tree_int_cst_compare (ssize_int (TYPE_ALIGN_UNIT (TREE_TYPE (
>                                                        TREE_TYPE (base_addr)))),
> -                                   alignment) >= 0))
> +                                   alignment) >= 0)
> +      || (get_pointer_alignment (base_addr, TYPE_ALIGN (vectype))
> +          >= TYPE_ALIGN (vectype)))
>      base_aligned = true;
>    else
>      base_aligned = false;
>
>         Jakub
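The explicit masking idiom quoted in this message can be tried outside the compiler with a sketch like the following. The helper names align_down_16 and is_aligned_16 are assumptions made for the example; only the masking expression itself comes from the thread.

```cpp
#include <cassert>
#include <cstdint>

// Round a pointer down to the nearest 16-byte boundary by clearing the
// low four bits of its address, as in the quoted idiom.
inline double* align_down_16(double* ptr)
{
    return (double*) (((std::uintptr_t) ptr) & ~(std::uintptr_t) 15);
}

// Check whether an address is a multiple of 16.
inline bool is_aligned_16(const void* p)
{
    return (((std::uintptr_t) p) & 15) == 0;
}
```

After such a statement the low bits of the pointer are known to be zero, which is the kind of fact the message says get_pointer_alignment can report so that aligned loads and stores can be used.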
RFA PR middle-end/49465
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

At the core of this PR is a case where we were threading through a successor of a joiner block where there was already an edge from the threadable successor to the final target. ie, the successor of the joiner ended with a conditional branch, after threading both arms reached the same location.

For this case, we have to ensure that if the target block has PHIs that the PHI arguments are the same for both incoming edges so that when the two edges are combined into a single edge we don't lose information.

We already check and reject cases where this is not true prior to registering the jump thread. Unfortunately, the updating code tried to update the PHIs when no update is necessary or desirable.

I don't have a small testcase for this bug.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu; also built and run cpu2006 C integer benchmarks which were failing with -O2
- -fast-math. Those benchmarks run correctly with this patch applied.

OK for trunk?

Jeff
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOA2YgAAoJEBRtltQi2kC7R4kH/1jRce606GysmFOjgbIapPGb
PyQk4NjnrMY3WujiQOghPe6D7wRi+UCs0DLhqW7zTcBUlBAJjCTpo3DbYyfnmSHp
vQu74JYeZItjMfkPRI+6JFkaUlEpGVaAqCK7+CAqI7e3qLEook05lsJfHtzOQ/60
kGeczpjWn6p7hvVj9Q5p2OCo9+tWsu+vowdQrzF/2nBTrtSyqwE4oEW9fL7h/6Sw
27/YBheJeGEf7HtwL3Pm/G0M9mEv7u57rIek2G+VweUqwmrADvI5NuVOs1mQdDTw
iYY61sjSOO8iPMAATufndjhLJKATndoXmdHL9sfpYrlsDZqsF6cCrR18o+RFymo=
=ylGm
-END PGP SIGNATURE-

        PR middle-end/49465
        * tree-ssa-threadupdate.c (fix_duplicate_block_edges): Fix condition
        to detect threading through joiner block. If there was already
        an edge to the new target, then do not change the PHI nodes.
Index: tree-ssa-threadupdate.c
===================================================================
*** tree-ssa-threadupdate.c (revision 175192)
--- tree-ssa-threadupdate.c (working copy)
*************** fix_duplicate_block_edges (struct redire
*** 385,391 ****
       to keep its control statement and redirect an outgoing edge.
       Else we want to remove the control statement & edges, then create
       a new outgoing edge.  In both cases we may need to update PHIs.  */
!   if (THREAD_TARGET2 (rd->incoming_edges->e) == rd->outgoing_edge)
      {
        edge victim;
        edge e2;
--- 385,391 ----
       to keep its control statement and redirect an outgoing edge.
       Else we want to remove the control statement & edges, then create
       a new outgoing edge.  In both cases we may need to update PHIs.  */
!   if (THREAD_TARGET2 (rd->incoming_edges->e))
      {
        edge victim;
        edge e2;
*************** fix_duplicate_block_edges (struct redire
*** 400,407 ****
        victim = find_edge (rd->dup_block, THREAD_TARGET (e)->dest);
        e2 = redirect_edge_and_branch (victim, THREAD_TARGET2 (e)->dest);
  
!       /* This updates the PHI at the target of the threaded edge.  */
!       copy_phi_args (e2->dest, THREAD_TARGET2 (e), e2);
      }
    else
      {
--- 400,410 ----
        victim = find_edge (rd->dup_block, THREAD_TARGET (e)->dest);
        e2 = redirect_edge_and_branch (victim, THREAD_TARGET2 (e)->dest);
  
!       /* If we redirected the edge, then we need to copy PHI arguments
!          at the target.  If the edge already existed (e2 != victim case),
!          then the PHIs in the target already have the correct arguments.  */
!       if (e2 == victim)
!         copy_phi_args (e2->dest, THREAD_TARGET2 (e), e2);
      }
    else
      {
Re: [AVX2] PATCH: Improved error reporting for AVX2 immediates in vextracti/inserti128. New tests.
Attaching the patch

On Thu, Jun 23, 2011 at 8:18 PM, Kirill Yukhin wrote:
> Hi,
> I added a special case for immediate expansion of vinserti128 and
> vextractf128 (AVX2) to improve error reporting.
> I also added a bunch of new tests to check error reporting of
> out-of-range immediates for AVX2.
>
> ChangeLog.avx2 entry:
> 2011-06-20  Yukhin Kirill
>
>         * gcc/config/i386/i386.c (ix86_expand_args_builtin): Improved
>         error diagnostic for extracti128/inserti128 immediates.
>
> testsuite/ChangeLog.avx2 entry:
> 2011-06-20  Yukhin Kirill
>
>         * gcc.target/i386/avx2-mpsadbw-3.c: New test to check error
>         diagnostic while passing wrong immediate.
>         * gcc.target/i386/avx2-vextracti128-3.c: Likewise.
>         * gcc.target/i386/avx2-vinserti128-3.c: Likewise.
>         * gcc.target/i386/avx2-vpalignr256-3.c: Likewise.
>         * gcc.target/i386/avx2-vpblendd128-3.c: Likewise.
>         * gcc.target/i386/avx2-vpblendd256-3.c: Likewise.
>         * gcc.target/i386/avx2-vpblendw-3.c: Likewise.
>         * gcc.target/i386/avx2-vperm2i128-3.c: Likewise.
>         * gcc.target/i386/avx2-vpermpd-3.c: Likewise.
>         * gcc.target/i386/avx2-vpermq-3.c: Likewise.
>         * gcc.target/i386/avx2-vpshufd-3.c: Likewise.
>         * gcc.target/i386/avx2-vpshufhw-3.c: Likewise.
>         * gcc.target/i386/avx2-vpshuflw-3.c: Likewise.
>         * gcc.target/i386/avx2-vpslldq-3.c: Likewise.
>         * gcc.target/i386/avx2-vpsrldq-3.c: Likewise.
>
> Going to commit to avx2 branch.
>
> Thanks, K

avx2.imm.tests.gcc.patch
Description: Binary data
[AVX2] PATCH: Improved error reporting for AVX2 immediates in vextracti/inserti128. New tests.
Hi,
I added a special case for immediate expansion of vinserti128 and
vextracti128 (AVX2) to improve error reporting.
I also added a bunch of new tests to check error reporting of
out-of-range immediates for AVX2.

ChangeLog.avx2 entry:
2011-06-20  Yukhin Kirill

        * gcc/config/i386/i386.c (ix86_expand_args_builtin): Improved
        error diagnostic for extracti128/inserti128 immediates.

testsuite/ChangeLog.avx2 entry:
2011-06-20  Yukhin Kirill

        * gcc.target/i386/avx2-mpsadbw-3.c: New test to check error
        diagnostic while passing wrong immediate.
        * gcc.target/i386/avx2-vextracti128-3.c: Likewise.
        * gcc.target/i386/avx2-vinserti128-3.c: Likewise.
        * gcc.target/i386/avx2-vpalignr256-3.c: Likewise.
        * gcc.target/i386/avx2-vpblendd128-3.c: Likewise.
        * gcc.target/i386/avx2-vpblendd256-3.c: Likewise.
        * gcc.target/i386/avx2-vpblendw-3.c: Likewise.
        * gcc.target/i386/avx2-vperm2i128-3.c: Likewise.
        * gcc.target/i386/avx2-vpermpd-3.c: Likewise.
        * gcc.target/i386/avx2-vpermq-3.c: Likewise.
        * gcc.target/i386/avx2-vpshufd-3.c: Likewise.
        * gcc.target/i386/avx2-vpshufhw-3.c: Likewise.
        * gcc.target/i386/avx2-vpshuflw-3.c: Likewise.
        * gcc.target/i386/avx2-vpslldq-3.c: Likewise.
        * gcc.target/i386/avx2-vpsrldq-3.c: Likewise.

Going to commit to avx2 branch.

Thanks, K
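For readers who have not seen the attached tests, the kind of range check they exercise can be sketched in plain C. This is only an illustration: `imm_fits_bits` is a hypothetical helper, not a function from i386.c, and the bit widths mentioned afterwards are assumptions rather than values taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not taken from the patch: return true when VALUE
   can be encoded as an unsigned immediate that is BITS bits wide.  */
bool
imm_fits_bits (long long value, unsigned bits)
{
  assert (bits > 0 && bits < 63);   /* keep the shift below well defined */
  return value >= 0 && value < (1LL << bits);
}
```

Under that assumption, a 128-bit lane selector would be checked with `imm_fits_bits (imm, 1)` and a shuffle control byte with `imm_fits_bits (imm, 8)`; anything outside the range is what the new tests expect the compiler to diagnose.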
Re: [C++ Patch] PR 44625
On 06/23/2011 06:11 PM, Paolo Carlini wrote: On 06/23/2011 06:05 PM, Jason Merrill wrote: Actually, 9.5 says A union of the form union { member-specification } ; is called an anonymous union; it defines an unnamed object of unnamed type. The member-specification of an anonymous union shall only define non-static data members. [ Note: Nested types and functions cannot be declared within an anonymous union. — end note ] So we should be able to just reject nested anonymous aggregates and not worry about how to make them work. Yes, but we are accepting already some of that as an extension. If I compile the testcase with -pedantic-errors I get: 44625.C:8:31: error: ISO C++ prohibits anonymous structs [-pedantic] 44625.C:9:9: error: anonymous struct not inside named type Of course only the first message is new with -pedantic-errors. Thus, the idea would be rejecting *nested* anonymous, now I see. Uhmm. Paolo.
C++ PATCH for c++/36435 (partial ordering ignoring return type)
Partial ordering to match a particular function signature should consider the return type.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit 344d1ea28e060dc539b7f8cbcbeb33a32420f638
Author: Jason Merrill
Date:   Wed Jun 22 14:39:13 2011 -0400

    PR c++/36435
    * pt.c (most_specialized_instantiation): Do check return types.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2716f78..08ce5af 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -16610,12 +16610,12 @@ most_specialized_instantiation (tree templates)

       if (get_bindings (TREE_VALUE (champ),
                         DECL_TEMPLATE_RESULT (TREE_VALUE (fn)),
-                        NULL_TREE, /*check_ret=*/false))
+                        NULL_TREE, /*check_ret=*/true))
         fate--;

       if (get_bindings (TREE_VALUE (fn),
                         DECL_TEMPLATE_RESULT (TREE_VALUE (champ)),
-                        NULL_TREE, /*check_ret=*/false))
+                        NULL_TREE, /*check_ret=*/true))
         fate++;

       if (fate == -1)
@@ -16637,10 +16637,10 @@ most_specialized_instantiation (tree templates)
     for (fn = templates; fn != champ; fn = TREE_CHAIN (fn))
       if (get_bindings (TREE_VALUE (champ),
                         DECL_TEMPLATE_RESULT (TREE_VALUE (fn)),
-                        NULL_TREE, /*check_ret=*/false)
+                        NULL_TREE, /*check_ret=*/true)
           || !get_bindings (TREE_VALUE (fn),
                             DECL_TEMPLATE_RESULT (TREE_VALUE (champ)),
-                            NULL_TREE, /*check_ret=*/false))
+                            NULL_TREE, /*check_ret=*/true))
         {
           champ = NULL_TREE;
           break;
diff --git a/gcc/testsuite/g++.dg/template/partial9.C b/gcc/testsuite/g++.dg/template/partial9.C
new file mode 100644
index 000..4c340fc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/partial9.C
@@ -0,0 +1,6 @@
+// PR c++/36435
+
+template <class T> T f();
+template <class T> T* f() { }
+
+template int* f();
Re: [C++ Patch] PR 44625
On 06/23/2011 06:05 PM, Jason Merrill wrote: Actually, 9.5 says A union of the form union { member-specification } ; is called an anonymous union; it defines an unnamed object of unnamed type. The member-specification of an anonymous union shall only define non-static data members. [ Note: Nested types and functions cannot be declared within an anonymous union. — end note ] So we should be able to just reject nested anonymous aggregates and not worry about how to make them work. Yes, but we are accepting already some of that as an extension. If I compile the testcase with -pedantic-errors I get: 44625.C:8:31: error: ISO C++ prohibits anonymous structs [-pedantic] 44625.C:9:9: error: anonymous struct not inside named type thus, it's not clear to me where we should stop, exactly. Paolo.
Re: [C++ Patch] PR 44625
On 06/23/2011 06:01 PM, Jason Merrill wrote:
> On 06/23/2011 11:52 AM, Paolo Carlini wrote:
>> Ok, then, from what you are saying I understand that it should be possible to actually construct a reject-valid or an ice-on-valid in this area; it isn't just about improving the diagnostic, which is only the tip of the iceberg, so to speak. I guess it's not for me, at this time...
>
> For the time being you could improve the diagnostic by adding a sorry for the case of null DECL_NAME in a template in build_anon_union_vars.

Indeed. Let me try then...

Paolo.
[PATCH, MELT] loading extra module before setting options
Hello,

In the function load_melt_modules_and_do_mode of melt-runtime.c, we first load initial modules, then we set options, and then we look at extra modules.

With this patch, we load extra modules before we set options, because extra modules can contain code to handle options.

This change has been compiled and tested without errors.

ChangeLog:
2011-06-22  Pierre Vittet

        * melt-runtime.c (load_melt_modules_and_do_mode): load extra
        module before setting options

Pierre Vittet

Index: gcc/melt-runtime.c
===================================================================
--- gcc/melt-runtime.c (revision 175330)
+++ gcc/melt-runtime.c (working copy)
@@ -8721,65 +8721,6 @@ load_melt_modules_and_do_mode (void)
     }

   /**
-   * Then we set MELT options.
-   **/
-  MELT_LOCATION_HERE ("before setting options");
-  optstr = melt_argument ("option");
-  debugeprintf ("load_initial_melt_modules optstr %s", optstr);
-  if (optstr && optstr[0]
-      && (optsetv=melt_get_inisysdata (FSYSDAT_OPTION_SET)) != NULL
-      && melt_magic_discr ((melt_ptr_t) optsetv) == MELTOBMAG_CLOSURE)
-    {
-      char *optc = 0;
-      char *optname = 0;
-      char *optvalue = 0;
-      for (optc = CONST_CAST (char *, optstr);
-           optc && *optc;
-           )
-        {
-          optname = optvalue = NULL;
-          if (!ISALPHA(*optc))
-            melt_fatal_error ("invalid MELT option name %s [should start with letter]",
-                              optc);
-          optname = optc;
-          while (*optc && (ISALNUM(*optc) || *optc=='_' || *optc=='-'))
-            optc++;
-          if (*optc == '=') {
-            *optc = (char)0;
-            optc++;
-            optvalue = optc;
-            while (*optc && *optc != ',')
-              optc++;
-          }
-          if (*optc==',') {
-            *optc = (char)0;
-            optc++;
-          }
-          optsymbv = meltgc_named_symbol (optname, MELT_CREATE);
-          {
-            union meltparam_un pararg[1];
-            memset (&pararg, 0, sizeof (pararg));
-            pararg[0].meltbp_cstring = optvalue;
-            MELT_LOCATION_HERE ("option set before apply");
-            debugeprintf ("MELT option %s value %s", optname,
-                          optvalue?optvalue:"_");
-            optresv =
-              melt_apply ((meltclosure_ptr_t) optsetv,
-                          (melt_ptr_t) optsymbv,
-                          MELTBPARSTR_CSTRING, pararg, "", NULL);
-            if (!optresv)
-              warning (0, "unhandled MELT option %s", optname);
-          }
-        }
-
-      /* after options setting, force a minor collection to ensure
-         nothing is left in young region */
-      MELT_LOCATION_HERE ("option set done");
-      melt_garbcoll (0, MELT_ONLY_MINOR);
-    }
-  MELT_LOCATION_HERE ("after setting options");
-
-  /**
    * Then we handle extra modules if given.
    **/
   debugeprintf ("xtrastr %p %s", xtrastr, xtrastr);
@@ -8845,6 +8786,65 @@ load_melt_modules_and_do_mode (void)
     debugeprintf ("no xtrastr %p", xtrastr);

   /**
+   * Then we set MELT options.
+   **/
+  MELT_LOCATION_HERE ("before setting options");
+  optstr = melt_argument ("option");
+  debugeprintf ("load_initial_melt_modules optstr %s", optstr);
+  if (optstr && optstr[0]
+      && (optsetv=melt_get_inisysdata (FSYSDAT_OPTION_SET)) != NULL
+      && melt_magic_discr ((melt_ptr_t) optsetv) == MELTOBMAG_CLOSURE)
+    {
+      char *optc = 0;
+      char *optname = 0;
+      char *optvalue = 0;
+      for (optc = CONST_CAST (char *, optstr);
+           optc && *optc;
+           )
+        {
+          optname = optvalue = NULL;
+          if (!ISALPHA(*optc))
+            melt_fatal_error ("invalid MELT option name %s [should start with letter]",
+                              optc);
+          optname = optc;
+          while (*optc && (ISALNUM(*optc) || *optc=='_' || *optc=='-'))
+            optc++;
+          if (*optc == '=') {
+            *optc = (char)0;
+            optc++;
+            optvalue = optc;
+            while (*optc && *optc != ',')
+              optc++;
+          }
+          if (*optc==',') {
+            *optc = (char)0;
+            optc++;
+          }
+          optsymbv = meltgc_named_symbol (optname, MELT_CREATE);
+          {
+            union meltparam_un pararg[1];
+            memset (&pararg, 0, sizeof (pararg));
+            pararg[0].meltbp_cstring = optvalue;
+            MELT_LOCATION_HERE ("option set before apply");
+            debugeprintf ("MELT option %s value %s", optname,
+                          optvalue?optvalue:"_");
+            optresv =
+              melt_apply ((meltclosure_ptr_t) optsetv,
+                          (melt_ptr_t) optsymbv,
+                          MELTBPARSTR_CSTRING, pararg, "", NULL);
+            if (!optresv)
+              warning (0, "unhandled MELT option %s", optname);
+          }
+        }
+
+      /* after options setting, force a minor collection to ensure
+         nothing is left in young region *
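The option loop moved by this patch splits a string of the form "name=value,name2,..." in place, overwriting the '=' and ',' separators with NUL bytes. A minimal standalone sketch of that splitting step is shown here. It assumes nothing about MELT itself: `struct opt_pair` and `split_options` are hypothetical names, and where the real code applies a MELT closure to each option this sketch only records the name and value pointers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* One parsed option; VALUE is NULL when no '=' was given.  */
struct opt_pair
{
  const char *name;
  const char *value;
};

/* Split BUF, a writable string of the form "name=value,name2,...", in
   place.  Store up to MAX pairs in OUT and return the number stored, or
   -1 if an option name does not start with a letter.  Illustrative
   only; these names are not MELT's.  */
int
split_options (char *buf, struct opt_pair *out, int max)
{
  char *p = buf;
  int n = 0;

  assert (buf != NULL && out != NULL);
  while (*p && n < max)
    {
      char *name;
      char *value = NULL;

      if (!isalpha ((unsigned char) *p))
        return -1;
      name = p;
      /* Option names may contain letters, digits, '_' and '-'.  */
      while (*p && (isalnum ((unsigned char) *p) || *p == '_' || *p == '-'))
        p++;
      if (*p == '=')
        {
          *p++ = '\0';          /* terminate the name */
          value = p;
          while (*p && *p != ',')
            p++;
        }
      if (*p == ',')
        *p++ = '\0';            /* terminate the name or the value */
      out[n].name = name;
      out[n].value = value;
      n++;
    }
  return n;
}
```

Because the buffer is modified, callers need a writable copy of the option string, which is presumably why the original code goes through CONST_CAST.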
Re: [C++ Patch] PR 44625
Actually, 9.5 says A union of the form union { member-specification } ; is called an anonymous union; it defines an unnamed object of unnamed type. The member-specification of an anonymous union shall only define non-static data members. [ Note: Nested types and functions cannot be declared within an anonymous union. — end note ] So we should be able to just reject nested anonymous aggregates and not worry about how to make them work. Jason
Re: [C++ Patch] PR 44625
On 06/23/2011 11:52 AM, Paolo Carlini wrote:
> Ok, then, from what you are saying I understand that it should be possible to actually construct a reject-valid or an ice-on-valid in this area; it isn't just about improving the diagnostic, which is only the tip of the iceberg, so to speak. I guess it's not for me, at this time...

For the time being you could improve the diagnostic by adding a sorry for the case of null DECL_NAME in a template in build_anon_union_vars.

Jason
Re: [C++ Patch] PR 44625
On 06/23/2011 05:47 PM, Jason Merrill wrote:
> Indeed. The code is using DECL_NAME in templates so that tsubst can look them up again by name, but as we see in this PR that can't work if the field has no name. We need a different strategy for handling anonymous aggregates nested in other anonymous aggregates in templates.

Ok, then, from what you are saying I understand that it should be possible to actually construct a reject-valid or an ice-on-valid in this area; it isn't just about improving the diagnostic, which is only the tip of the iceberg, so to speak. I guess it's not for me, at this time...

Paolo.
Re: [C++ Patch] PR 44625
On 06/23/2011 11:30 AM, Paolo Carlini wrote: Why are we creating a COMPONENT_REF with a null op1 in the first place? As a matter of fact, the possibility that DECL_NAME (field) could be null is considered in build_anon_union_vars itself, because right after the above mentioned build_min_nt call, there is a conditional 'if (DECL_NAME (field)) ...' Indeed. The code is using DECL_NAME in templates so that tsubst can look them up again by name, but as we see in this PR that can't work if the field has no name. We need a different strategy for handling anonymous aggregates nested in other anonymous aggregates in templates. Jason
Re: [AVX2] PATCH: Fixed predicates for AVX2's pshuf* description
On Thu, Jun 23, 2011 at 8:17 AM, Kirill Yukhin wrote: > Hi, > Attached patch, which limits immediates for pshuf* insns to 0..255 range. > > ChangeLog.avx2 entry: > 2011-06-20 Yukhin Kirill > > * gcc/config/i386/sse.md (avx2_pshufdv3): Fixed immediate's > predicate. > (avx2_pshuflwv3): Likewise. > (avx2_pshufhwv3): Likewise. > > Going to commit to avx2 branch. > I will check it in for you. Thanks. -- H.J.
Re: [C++ Patch] PR 44625
Hi, Why are we creating a COMPONENT_REF with a null op1 in the first place? For now, what I figured out is the following: build_anon_union_vars calls build_min_nt (COMPONENT_REF, object, DECL_NAME (field), NULL_TREE) with a null third argument, which becomes the null op1. As a matter of fact, the possibility that DECL_NAME (field) could be null is considered in build_anon_union_vars itself, because right after the above mentioned build_min_nt call, there is a conditional 'if (DECL_NAME (field)) ...' I don't know if the above is enough for you to suggest the way we should go... Paolo.
[AVX2] PATCH: Fixed predicates for AVX2's pshuf* description
Hi, Attached patch, which limits immediates for pshuf* insns to 0..255 range. ChangeLog.avx2 entry: 2011-06-20 Yukhin Kirill * gcc/config/i386/sse.md (avx2_pshufdv3): Fixed immediate's predicate. (avx2_pshuflwv3): Likewise. (avx2_pshufhwv3): Likewise. Going to commit to avx2 branch. Thanks, K avx2.pshufpred.gcc.patch Description: Binary data
Re: PATCH: PR rtl-optimization/49088: Combine fails to properly handle zero-extension and signed int constant
On Thu, Jun 16, 2011 at 3:20 AM, Eric Botcazou wrote:
>> force_to_mode has
>>
>>   /* If X is a CONST_INT, return a new one.  Do this here since the
>>      test below will fail.  */
>>   if (CONST_INT_P (x))
>>     {
>>       if (SCALAR_INT_MODE_P (mode))
>>         return gen_int_mode (INTVAL (x) & mask, mode);
>>       else
>>         {
>>           x = GEN_INT (INTVAL (x) & mask);
>>           return gen_lowpart_common (mode, x);
>>         }
>>     }
>>
>>   /* If X is narrower than MODE and we want all the bits in X's mode, just
>>      get X in the proper mode.  */
>>   if (GET_MODE_SIZE (GET_MODE (x)) < GET_MODE_SIZE (mode)
>>       && (GET_MODE_MASK (GET_MODE (x)) & ~mask) == 0)
>>     return gen_lowpart (mode, x);
>>
>> When it gets
>>
>> (zero_extend:DI (plus:SI (subreg:SI (reg/f:DI 20 frame) 0)
>>         (const_int -58 [0xffffffffffffffc6])))
>>
>> It first sets mask to 32bit and leads to
>>
>> (subreg:DI (plus:SI (subreg:SI (reg/f:DI 20 frame) 0)
>>         (const_int -58 [0xffffffffffffffc6])) 0)
>>
>> with mask == 0xffffffff.  The problem is
>>
>>     binop:
>>       /* For most binary operations, just propagate into the operation and
>>          change the mode if we have an operation of that mode.  */
>>
>>       op0 = force_to_mode (XEXP (x, 0), mode, mask, next_select);
>>       op1 = force_to_mode (XEXP (x, 1), mode, mask, next_select);
>>
>> where it calls force_to_mode with -58, 0xffffffff mask and DImode.  This
>> transformation is incorrect.
>
> I think that the conclusion is questionable.  If MASK is really 0xffffffff,
> then you're guaranteeing to force_to_mode that you don't care about the upper
> 32 bits.  Of course this is wrong for (zero_extend:DI ...).
>
> So it seems to me that the origin of the problem is the transition from:
>
> (zero_extend:DI (plus:SI (subreg:SI (reg/f:DI 20 frame) 0)
>         (const_int -58 [0xffffffffffffffc6])))
>
> to force_to_mode being invoked on:
>
> (subreg:DI (plus:SI (subreg:SI (reg/f:DI 20 frame) 0)
>         (const_int -58 [0xffffffffffffffc6])) 0)
>
> with mask == 0xffffffff.  This isn't equivalent, at least on its own.
>
>
> Who computes the mask and calls force_to_mode?  Is it simplify_and_const_int_1?
>
> Then it should also mask the returned value, as explained in the code:
>
>   /* Simplify VAROP knowing that we will be only looking at some of the
>      bits in it.
>
>      Note by passing in CONSTOP, we guarantee that the bits not set in
>      CONSTOP are not significant and will never be examined.  We must
>      ensure that is the case by explicitly masking out those bits
>      before returning.  */
>   varop = force_to_mode (varop, mode, constop, 0);
>
> If this is what actually happens, why does this masking get lost somewhere?
>

You are right. The real bug is

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49504

The fix is at

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01704.html

--
H.J.
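Eric's point, that a 64-bit computation done under a 32-bit mask is only valid if the result is masked again before use, can be checked with ordinary integer arithmetic. The sketch below is a plain C model of the RTL semantics under discussion, not GCC code; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics of
   (zero_extend:DI (plus:SI (subreg:SI x 0) (const_int -58))):
   add in 32 bits, then zero-extend to 64 bits.  */
uint64_t
zext_plus_si (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  return (uint64_t) (uint32_t) (lo - 58u);
}

/* The same addition carried out in 64 bits with constant C.  The upper
   32 bits are only meaningful if the caller masks them off afterwards.  */
uint64_t
plus_di (uint64_t x, uint64_t c)
{
  return x + c;
}

/* Both forms of the constant agree with the reference once the result
   is masked to the low 32 bits.  */
void
check_masked (uint64_t x)
{
  const uint64_t mask = 0xffffffffu;
  assert ((plus_di (x, 0xffffffc6u) & mask) == zext_plus_si (x));
  assert ((plus_di (x, (uint64_t) -58) & mask) == zext_plus_si (x));
}
```

For an input such as 0x500000010 the unmasked 64-bit sums differ from the zero-extended result in their upper half, which is the symptom described in the report; with the final mask in place they all agree.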
Re: [C++ Patch] PR 44625
Why are we creating a COMPONENT_REF with a null op1 in the first place? Jason
[PATCH (7/7)] Mixed-sign multiplies using narrowest mode
Patch 4 introduced support for using signed multiplies to code unsigned multiplies in a narrower mode. Patch 5 then introduced support for mis-matched input modes.

These two combined mean that there is a case where only the smaller of two inputs is unsigned, and yet it still tries to use a mode wider than the larger, signed input. This is bad because it means unnecessary extends and because the wider operation might not exist.

This patch catches that case, and ensures that the smaller, unsigned input is zero-extended to match the mode of the larger, signed input.

Of course, both inputs may still have to be extended to fit the nearest available instruction, so it doesn't make a difference every time.

OK?

Andrew

2011-06-23  Andrew Stubbs

        gcc/
        * tree-ssa-math-opts.c (convert_mult_to_widen): Better handle
        unsigned inputs of different modes.
        (convert_plusminus_to_widen): Likewise.

        gcc/testsuite/
        * gcc.target/arm/smlalbb-3.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/smlalbb-3.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, short *b, char *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2103,9 +2103,17 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
     {
       if (op != smul_widen_optab)
         {
-          from_mode = GET_MODE_WIDER_MODE (from_mode);
-          if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
-            return false;
+          /* We can use a signed multiply with unsigned types as long as
+             there is a wider mode to use, or it is the smaller of the two
+             types that is unsigned.  Note that type1 >= type2, always.  */
+          if (TYPE_UNSIGNED (type1)
+              || (TYPE_UNSIGNED (type2)
+                  && TYPE_MODE (type2) == from_mode))
+            {
+              from_mode = GET_MODE_WIDER_MODE (from_mode);
+              if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+                return false;
+            }

           op = smul_widen_optab;
           handler = find_widening_optab_handler_and_mode (op, to_mode,
@@ -2244,14 +2252,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
     {
       enum machine_mode mode = TYPE_MODE (type1);
-      mode = GET_MODE_WIDER_MODE (mode);
-      if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
+
+      /* We can use a signed multiply with unsigned types as long as
+         there is a wider mode to use, or it is the smaller of the two
+         types that is unsigned.  Note that type1 >= type2, always.  */
+      if (TYPE_UNSIGNED (type1)
+          || (TYPE_UNSIGNED (type2)
+              && TYPE_MODE (type2) == mode))
         {
-          type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
-          cast1 = cast2 = true;
+          mode = GET_MODE_WIDER_MODE (mode);
+          if (GET_MODE_SIZE (mode) >= GET_MODE_SIZE (TYPE_MODE (type)))
+            return false;
         }
-      else
-        return false;
+
+      type1 = type2 = lang_hooks.types.type_for_mode (mode, 0);
+      cast1 = cast2 = true;
     }

   if (TYPE_MODE (type2) != TYPE_MODE (type1))
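The arithmetic behind this case can be checked exhaustively in plain C. The sketch below models a signed 16x16->32 widening multiply and shows that zero-extending an unsigned 8-bit input to the 16-bit mode of the signed input gives the exact product, so no mode wider than the signed input is needed. The function names are hypothetical and the fixed 8/16/32-bit widths are an assumption chosen to mirror the `short` by `char` test above (with `char` assumed unsigned, as is usual on ARM); this is not the GCC implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a signed 16x16->32 widening multiply; illustrative only.  */
int32_t
smul_widen_16 (int16_t a, int16_t b)
{
  return (int32_t) a * (int32_t) b;
}

/* Multiply a signed 16-bit value by an unsigned 8-bit value by first
   zero-extending the unsigned input to the mode of the signed one.  */
int32_t
mul_s16_u8 (int16_t a, uint8_t b)
{
  int16_t b_ext = (int16_t) b;   /* 0..255 always fits; top bit is clear */
  return smul_widen_16 (a, b_ext);
}

/* Exhaustively compare against the exact product computed in int.  */
void
check_mul_s16_u8 (void)
{
  for (int a = INT16_MIN; a <= INT16_MAX; a++)
    for (int b = 0; b <= UINT8_MAX; b++)
      assert (mul_s16_u8 ((int16_t) a, (uint8_t) b) == (int32_t) (a * b));
}
```

The key property is that the zero-extended value has a clear top bit in the wider mode, so interpreting it as signed does not change its value.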
[PATCH (6/7)] More widening multiply-and-accumulate pattern matching
This patch fixes the case where widening multiply-and-accumulate were not recognised because the multiplication itself is not actually widening. This can happen when you have "DI + SI * SI" - the multiplication will be done in SImode as a non-widening multiply, and it's only the final accumulate step that is widening. This was not recognised for two reasons: 1. is_widening_mult_p inferred the output type from the multiply statement, which in not useful in this case. 2. The inputs to the multiply instruction may not have been converted at all (because they're not being widened), so the pattern match failed. The patch fixes these issues by making the output type explicit, and by permitting unconverted inputs (the types are still checked, so this is safe). OK? Andrew 2011-06-23 Andrew Stubbs gcc/ * tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument 'type'. Use 'type' from caller, not inferred from 'rhs'. Don't reject non-conversion statements. Do return lhs in this case. (is_widening_mult_p): Add new argument 'type'. Use 'type' from caller, not inferred from 'stmt'. Pass type to is_widening_mult_rhs_p. (convert_mult_to_widen): Pass type to is_widening_mult_p. (convert_plusminus_to_widen): Likewise. gcc/testsuite/ * gcc.target/arm/smlal-1.c: New file. --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/smlal-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv7-a" } */ + +long long +foo (long long a, int *b, int *c) +{ + return a + *b * *c; +} + +/* { dg-final { scan-assembler "smlal" } } */ --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -1963,7 +1963,8 @@ struct gimple_opt_pass pass_optimize_bswap = } }; -/* Return true if RHS is a suitable operand for a widening multiplication. +/* Return true if RHS is a suitable operand for a widening multiplication, + assuming a target type of TYPE. There are two cases: - RHS makes some value at least twice as wide. 
Store that value @@ -1973,32 +1974,32 @@ struct gimple_opt_pass pass_optimize_bswap = but leave *TYPE_OUT untouched. */ static bool -is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out) +is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out, + tree *new_rhs_out) { gimple stmt; - tree type, type1, rhs1; + tree type1, rhs1; enum tree_code rhs_code; if (TREE_CODE (rhs) == SSA_NAME) { - type = TREE_TYPE (rhs); stmt = SSA_NAME_DEF_STMT (rhs); if (!is_gimple_assign (stmt)) return false; - rhs_code = gimple_assign_rhs_code (stmt); - if (TREE_CODE (type) == INTEGER_TYPE - ? !CONVERT_EXPR_CODE_P (rhs_code) - : rhs_code != FIXED_CONVERT_EXPR) - return false; - rhs1 = gimple_assign_rhs1 (stmt); type1 = TREE_TYPE (rhs1); if (TREE_CODE (type1) != TREE_CODE (type) || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type)) return false; - *new_rhs_out = rhs1; + rhs_code = gimple_assign_rhs_code (stmt); + if (TREE_CODE (type) == INTEGER_TYPE + ? !CONVERT_EXPR_CODE_P (rhs_code) + : rhs_code != FIXED_CONVERT_EXPR) + *new_rhs_out = gimple_assign_lhs (stmt); + else + *new_rhs_out = rhs1; *type_out = type1; return true; } @@ -2013,28 +2014,27 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out) return false; } -/* Return true if STMT performs a widening multiplication. If so, - store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT - respectively. Also fill *RHS1_OUT and *RHS2_OUT such that converting - those operands to types *TYPE1_OUT and *TYPE2_OUT would give the - operands of the multiplication. */ +/* Return true if STMT performs a widening multiplication, assuming the + output type is TYPE. If so, store the unwidened types of the operands + in *TYPE1_OUT and *TYPE2_OUT respectively. Also fill *RHS1_OUT and + *RHS2_OUT such that converting those operands to types *TYPE1_OUT + and *TYPE2_OUT would give the operands of the multiplication. 
*/ static bool -is_widening_mult_p (gimple stmt, +is_widening_mult_p (tree type, gimple stmt, tree *type1_out, tree *rhs1_out, tree *type2_out, tree *rhs2_out) { - tree type; - - type = TREE_TYPE (gimple_assign_lhs (stmt)); if (TREE_CODE (type) != INTEGER_TYPE && TREE_CODE (type) != FIXED_POINT_TYPE) return false; - if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out)) + if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out, + rhs1_out)) return false; - if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out)) + if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out, + rhs2_out)) return false; if (*type1_out == NULL) @@ -2084,7 +2084,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi) if (TREE_CODE (type) != INTEGER_TYPE) return false;
[PATCH (5/7)] Widening multiplies for mis-matched mode inputs
This patch removes the restriction that the inputs to a widening multiply must be of the same mode. It does this by extending the smaller of the two inputs to match the larger; therefore, it remains the case that subsequent code (in the expand pass, for example) can rely on the type of rhs1 being the input type of the operation, and the gimple verification code is still valid. OK? Andrew 2011-06-23 Andrew Stubbs gcc/ * tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME. Ensure the the larger type is the first operand. (convert_mult_to_widen): Insert cast if type2 is smaller than type1. (convert_plusminus_to_widen): Likewise. gcc/testsuite/ * gcc.target/arm/smlalbb-2.c: New file. --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/smlalbb-2.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv7-a" } */ + +unsigned long long +foo (unsigned long long a, unsigned char *b, unsigned short *c) +{ + return a + *b * *c; +} + +/* { dg-final { scan-assembler "smlalbb" } } */ --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -2051,9 +2051,17 @@ is_widening_mult_p (gimple stmt, *type2_out = *type1_out; } - /* FIXME: remove this restriction. */ - if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out)) -return false; + /* Ensure that the larger of the two operands comes first. 
*/ + if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out)) +{ + tree tmp; + tmp = *type1_out; + *type1_out = *type2_out; + *type2_out = tmp; + tmp = *rhs1_out; + *rhs1_out = *rhs2_out; + *rhs2_out = tmp; +} return true; } @@ -2069,6 +2077,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi) enum insn_code handler; enum machine_mode to_mode, from_mode; optab op; + int cast1 = false, cast2 = false; lhs = gimple_assign_lhs (stmt); type = TREE_TYPE (lhs); @@ -2107,16 +2116,26 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi) return false; type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0); - - rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), - create_tmp_var (type1, NULL), rhs1, type1); - rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), - create_tmp_var (type2, NULL), rhs2, type2); + cast1 = cast2 = true; } else return false; } + if (TYPE_MODE (type2) != from_mode) +{ + type2 = lang_hooks.types.type_for_mode (from_mode, + TYPE_UNSIGNED (type2)); + cast2 = true; +} + + if (cast1) +rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type1, NULL), rhs1, type1); + if (cast2) +rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type2, NULL), rhs2, type2); + gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1)); gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2)); gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR); @@ -2142,6 +2161,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, optab this_optab; enum tree_code wmult_code; enum insn_code handler; + int cast1 = false, cast2 = false; lhs = gimple_assign_lhs (stmt); type = TREE_TYPE (lhs); @@ -2228,17 +2248,28 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type))) { type1 = type2 = lang_hooks.types.type_for_mode (mode, 0); - mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), - 
create_tmp_var (type1, NULL), - mult_rhs1, type1); - mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), - create_tmp_var (type2, NULL), - mult_rhs2, type2); + cast1 = cast2 = true; } else return false; } + if (TYPE_MODE (type2) != TYPE_MODE (type1)) +{ + type2 = lang_hooks.types.type_for_mode (TYPE_MODE (type1), + TYPE_UNSIGNED (type2)); + cast2 = true; +} + + if (cast1) +mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type1, NULL), + mult_rhs1, type1); + if (cast2) +mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type2, NULL), + mult_rhs2, type2); + /* Verify that the machine can perform a widening multiply accumulate in this mode/signedness combination, otherwise this transformation is likely to pessimize code. */
[PATCH (4/7)] Unsigned multiplies using wider signed multiplies
If one or both of the inputs to a widening multiply are of unsigned type then the compiler will attempt to use usmul_widen_optab or umul_widen_optab, respectively. That works fine, but only if the target supports those operations directly. Otherwise, it just bombs out and reverts to the normal inefficient non-widening multiply. This patch attempts to catch these cases and use an alternative signed widening multiply instruction, if one of those is available. I believe this should be legal as long as the top bit of both inputs is guaranteed to be zero. The code achieves this guarantee by zero-extending the inputs to a wider mode (which must still be narrower than the output mode). OK? Andrew 2011-06-23 Andrew Stubbs gcc/ * Makefile.in (tree-ssa-math-opts.o): Add langhooks.h dependency. * optabs.c (find_widening_optab_handler): Rename to ... (find_widening_optab_handler_and_mode): ... this, and add new argument 'found_mode'. * optabs.h (find_widening_optab_handler): Rename to ... (find_widening_optab_handler_and_mode): ... this. (find_widening_optab_handler): New macro. * tree-ssa-math-opts.c: Include langhooks.h (build_and_insert_cast): New function. (convert_mult_to_widen): Add new argument 'gsi'. Convert unsupported unsigned multiplies to signed. (convert_plusminus_to_widen): Likewise. (execute_optimize_widening_mul): Pass gsi to convert_mult_to_widen. gcc/testsuite/ * gcc.target/arm/smlalbb-1.c: New file. 
--- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -2672,7 +2672,8 @@ tree-ssa-loop-im.o : tree-ssa-loop-im.c $(TREE_FLOW_H) $(CONFIG_H) \ tree-ssa-math-opts.o : tree-ssa-math-opts.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ $(TM_H) $(FLAGS_H) $(TREE_H) $(TREE_FLOW_H) $(TIMEVAR_H) \ $(TREE_PASS_H) alloc-pool.h $(BASIC_BLOCK_H) $(TARGET_H) \ - $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h + $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h \ + langhooks.h tree-ssa-alias.o : tree-ssa-alias.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \ $(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) $(TREE_INLINE_H) $(FLAGS_H) \ $(FUNCTION_H) $(TIMEVAR_H) convert.h $(TM_H) coretypes.h langhooks.h \ --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -232,9 +232,10 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1) non-widening optabs also. */ enum insn_code -find_widening_optab_handler (optab op, enum machine_mode to_mode, - enum machine_mode from_mode, - int permit_non_widening) +find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode, + enum machine_mode from_mode, + int permit_non_widening, + enum machine_mode *found_mode) { for (; (permit_non_widening || from_mode != to_mode) && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode) @@ -245,7 +246,11 @@ find_widening_optab_handler (optab op, enum machine_mode to_mode, from_mode); if (handler != CODE_FOR_nothing) - return handler; + { + if (found_mode) + *found_mode = from_mode; + return handler; + } } return CODE_FOR_nothing; --- a/gcc/optabs.h +++ b/gcc/optabs.h @@ -808,8 +808,13 @@ extern void emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code); extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code); /* Find a widening optab even if it doesn't widen as much as we want. 
*/ -extern enum insn_code find_widening_optab_handler (optab, enum machine_mode, - enum machine_mode, int); +#define find_widening_optab_handler(A,B,C,D) \ + find_widening_optab_handler_and_mode (A, B, C, D, NULL) +extern enum insn_code find_widening_optab_handler_and_mode (optab, + enum machine_mode, + enum machine_mode, + int, + enum machine_mode *); /* An extra flag to control optab_for_tree_code's behavior. This is needed to distinguish between machines with a vector shift that takes a scalar for the --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/smlalbb-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv7-a" } */ + +long long +foo (long long a, unsigned char *b, signed char *c) +{ + return a + (long long)*b * (long long)*c; +} + +/* { dg-final { scan-assembler "smlalbb" } } */ --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -98,6 +98,7 @@ along with GCC; see the file COPYING3. If not see #include "basic-block.h" #include "target.h" #include "gimple-pretty-print.h" +#include "langhooks.h" /* FIXME: RTL headers have to be included here for optabs. */ #include "rtl.h" /* Because optabs.h wants enum rtx_code. */ @@ -1086,6 +1087,21 @@ build_and_insert_ref (gimple_stmt_iterator *gsi, location_t loc, tree type, return result; } +/* Build a gimple assignment to cast VAL to TYPE, and put the result in + TARGET. Insert the statement prior to GSI's current position, and + return the from SSA name. */ + +static tree +build_and_insert_cast (g
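The claim that a signed widening multiply is safe once the top bit of both inputs is known to be zero can be sanity-checked with a small C model. The sketch below assumes 8-bit unsigned inputs zero-extended to 16 bits and a signed 16x16->32 multiply; the function names are hypothetical and none of this is taken from tree-ssa-math-opts.c.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a signed 16x16->32 widening multiply; illustrative only.  */
int32_t
smul_16x16 (int16_t a, int16_t b)
{
  return (int32_t) a * (int32_t) b;
}

/* Unsigned 8x8 product computed with the signed multiply one mode
   wider, after zero-extending both inputs.  */
uint32_t
umul_8_via_signed_16 (uint8_t a, uint8_t b)
{
  return (uint32_t) smul_16x16 ((int16_t) a, (int16_t) b);
}

/* What goes wrong without the extension: the same bit patterns read as
   signed 8-bit values and multiplied as signed numbers.  */
int32_t
smul_8_reinterpreted (uint8_t a, uint8_t b)
{
  int sa = a >= 128 ? (int) a - 256 : (int) a;   /* two's complement view */
  int sb = b >= 128 ? (int) b - 256 : (int) b;
  return sa * sb;
}

/* Exhaustively confirm the zero-extended form gives the exact product.  */
void
check_umul_8 (void)
{
  for (unsigned a = 0; a <= UINT8_MAX; a++)
    for (unsigned b = 0; b <= UINT8_MAX; b++)
      assert (umul_8_via_signed_16 ((uint8_t) a, (uint8_t) b) == a * b);
}
```

For example, 200 * 2 comes out as 400 through the zero-extended path, while reinterpreting 200 as a signed 8-bit value (-56) would give -112, which is why the extension to a wider mode is required.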
[PATCH (3/7)] Widening multiply-and-accumulate pattern matching
There are many cases where the widening_mult pass does not recognise widening multiply-and-accumulate cases simply because there is a type conversion step between the multiply and add statements. This patch should rectify that simply by looking beyond those conversions. OK? Andrew 2011-06-23 Andrew Stubbs gcc/ * tree-ssa-math-opts.c (convert_plusminus_to_widen): Look for multiply statement beyond NOP_EXPR statements. gcc/testsuite/ * gcc.target/arm/umlal-1.c: New file. --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/umlal-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv7-a" } */ + +long long +foo (long long a, char *b, char *c) +{ + return a + *b * *c; +} + +/* { dg-final { scan-assembler "umlal" } } */ --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -2114,26 +2114,39 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, else wmult_code = WIDEN_MULT_PLUS_EXPR; - rhs1 = gimple_assign_rhs1 (stmt); - rhs2 = gimple_assign_rhs2 (stmt); - - if (TREE_CODE (rhs1) == SSA_NAME) + rhs1_stmt = stmt; + do { - rhs1_stmt = SSA_NAME_DEF_STMT (rhs1); - if (is_gimple_assign (rhs1_stmt)) - rhs1_code = gimple_assign_rhs_code (rhs1_stmt); + rhs1_code = ERROR_MARK; + rhs1 = gimple_assign_rhs1 (rhs1_stmt); + + if (TREE_CODE (rhs1) == SSA_NAME) + { + rhs1_stmt = SSA_NAME_DEF_STMT (rhs1); + if (is_gimple_assign (rhs1_stmt)) + rhs1_code = gimple_assign_rhs_code (rhs1_stmt); + } + else + return false; } - else -return false; + while (rhs1_code == NOP_EXPR); - if (TREE_CODE (rhs2) == SSA_NAME) + rhs2_stmt = stmt; + do { - rhs2_stmt = SSA_NAME_DEF_STMT (rhs2); - if (is_gimple_assign (rhs2_stmt)) - rhs2_code = gimple_assign_rhs_code (rhs2_stmt); + rhs2_code = ERROR_MARK; + rhs2 = gimple_assign_rhs2 (rhs2_stmt); + + if (rhs2 && TREE_CODE (rhs2) == SSA_NAME) + { + rhs2_stmt = SSA_NAME_DEF_STMT (rhs2); + if (is_gimple_assign (rhs2_stmt)) + rhs2_code = gimple_assign_rhs_code (rhs2_stmt); + } + else + return false; } - else -return 
false; + while (rhs2_code == NOP_EXPR); if (code == PLUS_EXPR && rhs1_code == MULT_EXPR) {
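For readers following the do/while loops in the hunk above, the idea of stepping back through conversion statements until something other than a conversion turns up can be sketched in standalone C. This is only an illustration: toy_code, toy_stmt and skip_conversions are hypothetical names made up here and are not part of GCC's gimple API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for statement codes; not GCC's tree codes.  */
enum toy_code { TOY_OTHER, TOY_NOP, TOY_MULT };

/* A toy statement: its code plus the statement that defines its first
   operand, or NULL if that operand is not defined by a statement.  */
struct toy_stmt
{
  enum toy_code code;
  const struct toy_stmt *rhs1_def;
};

/* Starting from STMT's first operand, step over any chain of
   conversions and return the first defining statement that is not a
   conversion.  Return NULL if the chain ends without finding one.  */
static const struct toy_stmt *
skip_conversions (const struct toy_stmt *stmt)
{
  const struct toy_stmt *def = stmt;

  assert (stmt != NULL);
  do
    {
      def = def->rhs1_def;
      if (def == NULL)
        return NULL;
    }
  while (def->code == TOY_NOP);

  return def;
}
```

With a chain add <- cast <- cast <- mult, skip_conversions on the add statement yields the mult statement, which is the situation the patch description says was previously missed.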
[PATCH (2/7)] Widening multiplies by more than one mode
This patch has two effects:

 1. It permits the use of widening multiply instructions that widen by
    more than one mode. E.g. HImode -> DImode.

 2. It enables the use of widening multiply instructions for (extended)
    inputs of narrower mode than the instruction takes. E.g. QImode ->
    DImode where only HI->DI or SI->DI is available.

Hopefully, most of the patch is self-explanatory, but here are a few notes:

The code introduces a temporary FIXME comment; this will be removed later
in the patch series. In fact, this is not a new restriction; previously
"type1" and "type2" were implicitly identical because they were required
to be one mode smaller than "type".

I regard the ARM portion of this patch as obvious, so I don't think I need
an ARM maintainer to read this.

Is the patch OK?

Andrew

2011-06-23  Andrew Stubbs

	gcc/
	* config/arm/arm.md (maddhidi4): Remove '*' from name.
	* expr.c (expand_expr_real_2): Use find_widening_optab_handler.
	* optabs.c (find_widening_optab_handler): New function.
	(expand_widen_pattern_expr): Use find_widening_optab_handler.
	(expand_binop_directly): Likewise.
	(expand_binop): Likewise.
	* optabs.h (find_widening_optab_handler): New prototype.
	* tree-cfg.c (verify_gimple_assign_binary): Adjust WIDEN_MULT_EXPR
	type precision rules.
	(verify_gimple_assign_ternary): Likewise for WIDEN_MULT_PLUS_EXPR.
	* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Allow widening by
	more than one mode.
	Explicitly disallow mis-matched input types.
	(convert_mult_to_widen): Use find_widening_optab_handler.
	(convert_plusminus_to_widen): Likewise.

--- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -1857,7 +1857,7 @@ (set_attr "predicable" "yes")] ) -(define_insn "*maddhidi4" +(define_insn "maddhidi4" [(set (match_operand:DI 0 "s_register_operand" "=r") (plus:DI (mult:DI (sign_extend:DI --- a/gcc/expr.c +++ b/gcc/expr.c @@ -7632,19 +7632,16 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, { enum machine_mode innermode = TYPE_MODE (TREE_TYPE (treeop0)); this_optab = usmul_widen_optab; - if (mode == GET_MODE_2XWIDER_MODE (innermode)) + if (find_widening_optab_handler (this_optab, mode, innermode, 0) + != CODE_FOR_nothing) { - if (widening_optab_handler (this_optab, mode, innermode) - != CODE_FOR_nothing) - { - if (TYPE_UNSIGNED (TREE_TYPE (treeop0))) - expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, - EXPAND_NORMAL); - else - expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0, - EXPAND_NORMAL); - goto binop3; - } + if (TYPE_UNSIGNED (TREE_TYPE (treeop0))) + expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, + EXPAND_NORMAL); + else + expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0, + EXPAND_NORMAL); + goto binop3; } } /* Check for a multiplication with matching signedness. */ @@ -7659,10 +7656,9 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, optab other_optab = zextend_p ? smul_widen_optab : umul_widen_optab; this_optab = zextend_p ? 
umul_widen_optab : smul_widen_optab; - if (mode == GET_MODE_2XWIDER_MODE (innermode) - && TREE_CODE (treeop0) != INTEGER_CST) + if (TREE_CODE (treeop0) != INTEGER_CST) { - if (widening_optab_handler (this_optab, mode, innermode) + if (find_widening_optab_handler (this_optab, mode, innermode, 0) != CODE_FOR_nothing) { expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, @@ -7671,7 +7667,7 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, unsignedp, this_optab); return REDUCE_BIT_FIELD (temp); } - if (widening_optab_handler (other_optab, mode, innermode) + if (find_widening_optab_handler (other_optab, mode, innermode, 0) != CODE_FOR_nothing && innermode == word_mode) { --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -225,6 +225,32 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1) return 1; } +/* Find a widening optab even if it doesn't widen as much as we want. + E.g. if from_mode is HImode, and to_mode is DImode, and there is no + direct HI->SI insn, then return SI->DI, if that exists. + If PERMIT_NON_WIDENING is non-zero then this can be used with + non-widening optabs also. */ + +enum insn_code +find_widening_optab_handler (optab op, enum machine_mode to_mode, + enum machine_mode from_mode, + int permit_non_widening) +{ + for (; (permit_non_widening || from_mode != to_mode) + && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode) + && from_mode != VOIDmode; + from_mode = GET_MODE_WIDER_MODE (from_mode)) +{ + enum insn_code handler = widening_optab_handler (op, to_mode, + from_mode); + + if (handler != CODE_FOR_nothing) + return handler; +} + + return CODE_FOR_nothing; +} + /* Widen OP to MODE and return the rtx for the widene
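
As a rough illustration of the search this patch describes (try the requested input mode first, then progressively wider input modes, and take the first instruction that produces the wanted output mode), here is a small standalone sketch. The toy_mode enum, the toy_table layout and toy_find_widening_handler are hypothetical and exist only for this example; the sketch also covers only the strictly widening case, not the permit_non_widening variant.

```c
#include <assert.h>

/* Hypothetical integer modes, ordered narrowest to widest.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI, TOY_NUM_MODES };

/* Marker meaning "no instruction available".  */
#define TOY_NO_HANDLER 0

/* table[to][from] holds a nonzero instruction id when the toy target
   has a widening instruction taking FROM inputs and producing TO.  */
typedef int toy_table[TOY_NUM_MODES][TOY_NUM_MODES];

/* Return the first handler that widens to TO from FROM or from any
   mode between FROM and TO, trying the narrowest input mode first.  */
static int
toy_find_widening_handler (toy_table table, enum toy_mode to,
                           enum toy_mode from)
{
  int m;

  assert (from <= to && to < TOY_NUM_MODES);
  for (m = (int) from; m < (int) to; m++)
    if (table[to][m] != TOY_NO_HANDLER)
      return table[to][m];

  return TOY_NO_HANDLER;
}
```

On a toy target that only has an SI->DI instruction, a request for HI->DI returns that SI->DI instruction; the caller is then expected to extend the HI inputs to SI itself.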
Re: [pph] Stream scope_chain->bindings instead of global namespace (issue4661045)
On Wed, Jun 22, 2011 at 18:25, wrote: > I fixed the comment: removing it and elaborating on the FIXME above it > mentioning that > we need to look if the namespace already exists. Ah, OK. Thanks. > I removed the include, I originally included it because that's where > global_namespace is defined, but it compiles without it (which I guess > is fine if we don't need to stick to "include what you use"). Yeah, we don't have that rule in GCC. Diego.
[PATCH (1/7)] New optab framework for widening multiplies
This patch should have no effect on the compiler output. It merely replaces
one way to represent widening operations with another, and refactors the
other parts of the compiler to match. The rest of the patch set uses this
new framework to implement the optimization improvements.

I considered and discarded many approaches to this patch before arriving at
this solution, and I feel sure that there'll be somebody out there who will
think I chose the wrong one, so let me first explain how I got here.

The aim is to be able to encode and query optabs that have any given input
mode, and any given output mode. This is similar to the convert_optab, but
not compatible with that optab since it is handled completely differently
in the code.

(Just to be clear, the existing widening multiply support only covers
instructions that widen by *one* mode, so it's only ever been necessary to
know the output mode, up to now.)

Option 1 was to add a second dimension to the handlers table in optab_d,
but I discarded this option because it would increase the memory usage by
the square of the number of modes, which is a bit much.

Option 2 was to add a whole new optab, similar to optab_d, but with a
second dimension like convert_optab_d, however this turned out to cause way
too many pointer type mismatches in the code, and would have been very
difficult to fix up.

Option 3 was to add new optab entries for widening by two modes, by three
modes, and so on. True, I would only need to add one extra set for what I
need, but there would be so many places in the code that compare against
smul_widen_optab, for example, that would need to be taught about these,
that it seemed like a bad idea.

Option 4 was to have a separate table that contained the widening
operations, and refer to that whenever a widening entry in the main optab
is referenced, but I found that there was no easy way to do the mapping
without putting some sort of switch table in widening_optab_handler, and
that negates the other advantages.

So, what I've done in the end is add a new pointer entry "widening" into
optab_d, and dynamically build the widening operations table for each optab
that needs it. I've then added new accessor functions that take both input
and output modes, and altered the code to use them where appropriate.

The down-side of this approach is that the optab entries for widening
operations now have two "handlers" tables, one of which is redundant. That
said, those cases are in the minority, and it is the smaller table which is
unused.

If people find that very distasteful, it might be possible to remove the
*_widen_optab entries and unify smul_optab with smul_widen_optab, and so
on, and save space that way. I've not done so yet, but I expect I could if
people feel strongly about it.

As a side-effect, it's now possible for any optab to be "widening", should
some target happen to have a widening add, shift, or whatever.

Is this patch OK?

Andrew

2011-06-23  Andrew Stubbs

	gcc/
	* expr.c (expand_expr_real_2): Use widening_optab_handler.
	* genopinit.c (optabs): Use set_widening_optab_handler for $N.
	(gen_insn): $N now means $a must be wider than $b, not consecutive.
	* optabs.c (expand_widen_pattern_expr): Use widening_optab_handler.
	(expand_binop_directly): Likewise.
	(expand_binop): Likewise.
	* optabs.h (widening_optab_handlers): New struct.
	(optab_d): New member, 'widening'.
	(widening_optab_handler): New function.
	(set_widening_optab_handler): New function.
	* tree-ssa-math-opts.c (convert_mult_to_widen): Use
	widening_optab_handler.
	(convert_plusminus_to_widen): Likewise.

--- a/gcc/expr.c +++ b/gcc/expr.c @@ -7634,7 +7634,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, this_optab = usmul_widen_optab; if (mode == GET_MODE_2XWIDER_MODE (innermode)) { - if (optab_handler (this_optab, mode) != CODE_FOR_nothing) + if (widening_optab_handler (this_optab, mode, innermode) + != CODE_FOR_nothing) { if (TYPE_UNSIGNED (TREE_TYPE (treeop0))) expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, @@ -7661,7 +7662,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, if (mode == GET_MODE_2XWIDER_MODE (innermode) && TREE_CODE (treeop0) != INTEGER_CST) { - if (optab_handler (this_optab, mode) != CODE_FOR_nothing) + if (widening_optab_handler (this_optab, mode, innermode) + != CODE_FOR_nothing) { expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1, EXPAND_NORMAL); @@ -7669,7 +7671,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, unsignedp, this_optab); return REDUCE_BIT_FIELD (temp); } - if (optab_handler (other_optab, mode) != CODE_FOR_nothing + if (widening_optab_handler (other_optab, mode, innermode) + != CODE_FOR_nothing && innermode == word_mode) { rtx htem, hipart; --- a/gcc
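
The design described in this message (a per-optab pointer to a two-dimensional table that is only built when a widening entry is first recorded, plus accessors that take both modes) might look roughly like the following standalone sketch. Everything here is hypothetical: the toy_ names, the use of 0 as the "nothing" marker, and the choice to route the same-mode case to the ordinary one-dimensional table are assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical integer modes, ordered narrowest to widest.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI, TOY_NUM_MODES };

/* Marker meaning "no instruction"; zero so calloc gives empty tables.  */
#define TOY_NO_HANDLER 0

/* Two-dimensional table, indexed [output mode][input mode].  */
struct toy_widening_handlers
{
  int handlers[TOY_NUM_MODES][TOY_NUM_MODES];
};

/* A toy optab: the usual table indexed by mode, plus an optional
   widening table that stays NULL until it is needed.  */
struct toy_optab
{
  int handlers[TOY_NUM_MODES];
  struct toy_widening_handlers *widening;
};

/* Look up the instruction for OP with output mode TO and input mode
   FROM.  Same-mode queries use the ordinary table (an assumption).  */
static int
toy_widening_handler (const struct toy_optab *op, enum toy_mode to,
                      enum toy_mode from)
{
  if (to == from)
    return op->handlers[to];
  if (op->widening == NULL)
    return TOY_NO_HANDLER;
  return op->widening->handlers[to][from];
}

/* Record CODE for OP, building the widening table on first use.  */
static void
toy_set_widening_handler (struct toy_optab *op, enum toy_mode to,
                          enum toy_mode from, int code)
{
  if (to == from)
    {
      op->handlers[to] = code;
      return;
    }
  if (op->widening == NULL)
    {
      op->widening = calloc (1, sizeof *op->widening);
      assert (op->widening != NULL);
    }
  op->widening->handlers[to][from] = code;
}
```

The point of the lazy allocation is the memory argument made above: optabs that never record a widening entry pay only for one extra pointer.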
[PATCH (0/7)] Improve use of Widening Multiplies
Hi all,

This patch series is intended to improve use of widening multiply, and
widening multiply-and-accumulate instructions. This is primarily for the
benefit of ARM targets, but should give some improvements to other targets
also.

The patches provide a number of improvements:

 * Support for instructions that widen by more than one mode (e.g. from
   HImode to DImode).

 * Use of widening multiplies even when the input mode is narrower than
   the instruction uses. (e.g. Use HI->DI to do QI->DI).

 * Use of signed widening multiplies (of a larger mode) where unsigned
   multiplies are not available.

 * Support for input operands with mis-matched signedness, with or
   without usmul_widen_optab.

 * Support for input operands with mis-matched mode [1].

 * Improved pattern matching in the widening_mult pass.

 * Recognition of true types, even if obscured by a cast.

 * Insertion of extra gimple statements where the existing code was
   incompatible with widening multiplies.

 * Recognition of widening multiply-and-accumulate even where the
   multiply expression was not widening.

The end result is that, on ARM, many of the cases where the compiler would
fall back to regular multiplies, extensions, and add instructions can now
be handled with just one instruction.

For those interested in the before and after states, I have attached a
couple of shell scripts. These generate test cases with many permutations
of types and signedness.

[1] Operands of mis-matched mode are multiplied by extending the smaller
one to match the larger one. Although this does not support mis-matched
mode instructions directly, this ought to improve the chances of the
combine pass doing The Right Thing. (Although this does depend on there
being a suitable matched-mode instruction for widen_mult/expand to use.)

So, on to the patches

Andrew

#!/bin/bash

for op in madd mul; do
  for i1 in char short int "long long"; do
    for i2 in char short int "long long"; do
      for o in char short int "long long"; do
        for x in unsigned signed; do
          for y in unsigned signed; do
            for z in unsigned signed; do
              for c in cast nocast; do
                echo "$x $o"
                echo "${op}_${x}_${o/ /}_${y}_${i1/ /}_${z}_${i2/ /}_$c ($x $o a, $y $i1 *b, $z $i2 *c)"
                echo "{"
                case $op+$c in
                  madd+cast) echo " return a + ($x $o)*b * ($x $o)*c;" ;;
                  madd+nocast) echo " return a + *b * *c;" ;;
                  mul+cast) echo " return ($x $o)*b * ($x $o)*c;" ;;
                  mul+nocast) echo " return *b * *c;" ;;
                esac
                echo "}"
                echo
              done
            done
          done
        done
      done
    done
  done
done

#!/bin/bash

for op in madd mul; do
  for i1 in char short int "long long"; do
    for i2 in char short int "long long"; do
      for o in char short int "long long"; do
        for x in unsigned signed; do
          for y in unsigned signed; do
            for z in unsigned signed; do
              for c in cast nocast; do
                echo "$x $o"
                echo "${op}_${x}_${o/ /}_${y}_${i1/ /}_${z}_${i2/ /}_$c ($x $o a, $y $i1 b, $z $i2 c)"
                echo "{"
                case $op+$c in
                  madd+cast) echo " return a + ($x $o)b * ($x $o)c;" ;;
                  madd+nocast) echo " return a + b * c;" ;;
                  mul+cast) echo " return ($x $o)b * ($x $o)c;" ;;
                  mul+nocast) echo " return b * c;" ;;
                esac
                echo "}"
                echo
              done
            done
          done
        done
      done
    done
  done
done

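
To give a feel for what the first script emits, here is one of its generated functions written out by hand for op=madd, o="long long", i1=i2=short, everything signed, and c=cast (line breaks in the parameter list added for readability). The toy_check_madd helper below it is a hypothetical addition for checking the arithmetic and is not something the scripts produce.

```c
#include <assert.h>

/* One function of the kind the first script prints.  The name follows
   the script's ${op}_${x}_${o}_${y}_${i1}_${z}_${i2}_$c pattern.  */
signed long long
madd_signed_longlong_signed_short_signed_short_cast (signed long long a,
                                                     signed short *b,
                                                     signed short *c)
{
  return a + (signed long long)*b * (signed long long)*c;
}

/* Hypothetical sanity check, not part of the scripts' output:
   10 + (-3 * 7) is -11.  */
static void
toy_check_madd (void)
{
  signed short b = -3, c = 7;

  assert (madd_signed_longlong_signed_short_signed_short_cast (10, &b, &c)
          == -11);
}
```

Compiling the generated file before and after the series and comparing the assembly is presumably how the scripts are meant to be used, though the message does not spell that out.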
Re: SRA generates uninitialized var use
Hi, On Mon, Jun 20, 2011 at 10:47:58PM +0200, Richard Guenther wrote: > On Mon, Jun 20, 2011 at 6:15 PM, Xinliang David Li wrote: > > It is used to indicate the fact the var decl needs to have a memory > > home (addressable) -- is there another way to do this? this is to > > avoid the following situation: > > > > 1) after SRA before update SSA, the IR looks like: > > > > MEM[ &SR_123] = ... > > > > other_var = SR_123; < (x) > > > > > > In this case, SR_123 is not of aggregate type, and it is not > > addressable, update_ssa won't assign a VUSE for (x), leading to > > The point is, SRA should never have created the above > > MEM[ &SR_123] = ... > > Martin, why would it even create new _memory_ backed decls? This is now PR 49516. I will submit a patch later today after bootstrapping and testing it. Martin > > Richard. > > > 2) final IR after SRA: > > > > MEM[..., &SR_123] = .. > > other_var = SR_123_yyy(D); > > > > > > David > > > > On Mon, Jun 20, 2011 at 4:13 AM, Richard Guenther > > wrote: > >> On Sat, Jun 18, 2011 at 10:56 AM, Xinliang David Li > >> wrote: > >>> Compiling the test case in the patch with -O2 -m32 without the fix, > >>> the program will abort. The problem is a var decl whose address is > >>> taken is not marked as addressable leading to bad SSA update (missing > >>> VUSE). (the triaging used the the .after and .after_cleanup dump diff > >>> and found the problem). > >>> > >>> the test is on going. Ok after testing? > >> > >> That doesn't make sense. SRA shouldn't generate anything that has > >> its address taken. So, where do we take its address? > >> > >> Richard. > >> > >>> Thanks, > >>> > >>> David > >>> > >> > >
Re: varpool alias reorg
On Sat, Jun 18, 2011 at 7:19 AM, H.J. Lu wrote:
> On Sat, Jun 18, 2011 at 1:32 AM, Jan Hubicka wrote:
>> Hi,
>> this patch makes symmetric changes to varpool as did the previous series
>> to cgraph.
>> Basically the aliases are now represented as separate varpool nodes with
>> alias reference to the variable they refer to, with some infrastructure
>> to walk the alias references as needed.
>>
>> Bootstrapped/regtested x86_64-linux, committed.
>>
>> Honza
>>
>>	* lto-symtab.c (lto_varpool_replace_node): Remove code handling
>>	extra name aliases.
>>	(lto_symtab_resolve_can_prevail_p): Likewise.
>>	(lto_symtab_merge_cgraph_nodes): Update alias_of pointers.
>>	* cgraphbuild.c (record_reference): Remove extra body alias code.
>>	(mark_load): Likewise.
>>	(mark_store): Likewise.
>>	* cgraph.h (varpool_node): Remove extra_name field;
>>	add alias_of and extraname_alias.
>>	(varpool_create_variable_alias, varpool_for_node_and_aliases):
>>	Declare.
>>	(varpool_alias_aliased_node): New inline function.
>>	(varpool_variable_node): New function.
>>	* cgraphunit.c (handle_alias_pairs): Handle also variable aliases.
>>	* ipa-ref.c (ipa_record_reference): Allow aliases on variables.
>>	* lto-cgraph.c (lto_output_varpool_node): Update streaming.
>>	(input_varpool_node): Likewise.
>>	* lto-streamer-out.c (produce_symtab): Remove extra name aliases.
>>	(varpool_externally_visible_p): Remove extra body alias code.
>>	(function_and_variable_visibility): Likewise.
>>	* tree-ssa-structalias.c (associate_varinfo_to_alias_1): New function.
>>	(ipa_pta_execute): Use it.
>>	* varpool.c (varpool_remove_node): Remove extra name alias code.
>>	(varpool_mark_needed_node): Likewise.
>>	(varpool_analyze_pending_decls): Analyze aliases.
>>	(assemble_aliases): New function.
>>	(varpool_assemble_decl): Use it.
>>	(varpool_create_variable_alias): New function.
>>	(varpool_extra_name_alias): Rewrite.
>>	(varpool_for_node_and_aliases): New function.
>
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49463
>

This patch is incorrect as shown in the PR above.

--
H.J.
[PATCH] Use get_pointer_alignment in vect_compute_data_ref_alignment
Hi! This is a precondition of the __builtin_assume_aligned patch (otherwise it wouldn't be useful for vectorization for which it has been designed), but I've bootstrapped/regtested it on x86_64-linux and i686-linux separately. get_pointer_alignment can tell us that a pointer is already sufficiently aligned and we don't need to use misaligned loads/stores. It should be useful even in other cases, such as when the code contains explicit ptr = (double *) (((uintptr_t) ptr) & ~(uintptr_t) 15); and similar to guarantee that ptr is already 16 byte aligned, etc. I haven't played with doing something additionally just with SSA_NAME_PTR_INFO (base_addr)->misalign yet if it isn't sufficiently aligned, but ->align is big enough, for integer_zerop (misalign) I guess we could just set misalign to that, otherwise? Also, I think we can't leave out the TYPE_ALIGN_UNIT test, because get_pointer_alignment will often return just BITS_PER_UNIT, e.g. for PARM_DECLs, even if they are pointers to sufficiently aligned types. Ok for trunk? 2011-06-23 Jakub Jelinek * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use get_pointer_alignment to see if base isn't sufficiently aligned. --- gcc/tree-vect-data-refs.c.jj2011-06-17 11:02:19.0 +0200 +++ gcc/tree-vect-data-refs.c 2011-06-23 12:37:43.0 +0200 @@ -859,7 +859,9 @@ vect_compute_data_ref_alignment (struct || (TREE_CODE (base_addr) == SSA_NAME && tree_int_cst_compare (ssize_int (TYPE_ALIGN_UNIT (TREE_TYPE ( TREE_TYPE (base_addr, - alignment) >= 0)) + alignment) >= 0) + || (get_pointer_alignment (base_addr, TYPE_ALIGN (vectype)) + >= TYPE_ALIGN (vectype))) base_aligned = true; else base_aligned = false; Jakub
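
The explicit alignment idiom quoted in the message above, masking the low bits of a pointer so that it is known to be 16 byte aligned, can be tried out in isolation. The sketch below is plain C and does not use any GCC internals; toy_align_down and toy_is_aligned are hypothetical helper names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Round PTR down to a multiple of ALIGN, which must be a power of two.
   This is the "ptr & ~(uintptr_t) 15" idiom with ALIGN == 16.  */
static void *
toy_align_down (void *ptr, uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (void *) ((uintptr_t) ptr & ~(align - 1));
}

/* Return nonzero if PTR is a multiple of ALIGN (a power of two).  */
static int
toy_is_aligned (const void *ptr, uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return ((uintptr_t) ptr & (align - 1)) == 0;
}
```

After a call such as toy_align_down (p, 16) the low four address bits are zero by construction, which is the kind of fact the message says get_pointer_alignment can report to the vectorizer.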
Re: PATCH: PR rtl-optimization/49504: Invalid optimization for Pmode != ptr_mode
On Wed, Jun 22, 2011 at 12:36:57PM -0700, H.J. Lu wrote:
> Hi,
>
> I just don't see how nonzero_bits1 can assume if pointers extend unsigned
> and this is an addition or subtraction to a pointer in Pmode, all the bits
> above ptr_mode are known to be zero. We never run into it before x32
> since x32 is the first such target.
>
> This patch deletes it. OK to install the nonzero_bits1 part for trunk?
>
> Thanks.
>
>

I checked this patch into x32 branch.

H.J.
---
commit de145b6ad18327c34009d96f1a1f0a9510023f31
Author: H.J. Lu
Date:   Thu Jun 23 06:09:20 2011 -0700

    Check correct return value.

diff --git a/gcc/testsuite/ChangeLog.x32 b/gcc/testsuite/ChangeLog.x32
index 6581a45..cde8d41 100644
--- a/gcc/testsuite/ChangeLog.x32
+++ b/gcc/testsuite/ChangeLog.x32
@@ -1,3 +1,7 @@
+2011-06-23  H.J. Lu
+
+	* gcc.target/i386/pr49504.c (main): Check correct return value.
+
 2011-06-22  H.J. Lu
 
 	PR rtl-optimization/49504
diff --git a/gcc/testsuite/gcc.target/i386/pr49504.c b/gcc/testsuite/gcc.target/i386/pr49504.c
index 9128196..503e6c2 100644
--- a/gcc/testsuite/gcc.target/i386/pr49504.c
+++ b/gcc/testsuite/gcc.target/i386/pr49504.c
@@ -12,7 +12,7 @@ foo (const void* p, unsigned long long q)
 int
 main ()
 {
-  if (foo ((const void*) 0x100, 0x1ULL) == 0)
+  if (foo (foo, 0x1ULL) != 0x1)
     __builtin_abort ();
   return 0;
 }
Re: PR tree-optimize/49373 (IPA-PTA regression)
On Thu, Jun 23, 2011 at 2:50 PM, Jan Hubicka wrote: >> > Ok, but please change the IPA inline gate to honor flag_no_inline >> > (thus, (optimize && !flag_no_inline) || flag_lto || flag_wpa). >> OK. > Actually it won't work, since results of inline-analysis are used by most of > other > IPA passes (i.e. ipa-cp and ipa-sra for cloning decisions, etc.). > As we chatted about shortly on the summit, perhaps it would make sense to > declare > the jump-functions and ipa-inline analysis to be independent analysis passes > (aka ipa-pta). > One computing jump functions and other deciding on size/time estimates. > But definitely incrementally. Ok, fair enough. Richard. > Honza >> > >> > Thanks for working on this, I'll look to some followup cleanups >> > for PTA. Now, when it works on LTRANS units we have to do >> > some adjustments (like not disable it in opts.c ;)) - do we know >> >> Yep, I decided that it can go as a followup. Thanks for working on this! >> BTW the PTA solving time seems rather high now not only for libjava, but >> also for tramp3d and other bigger units I tested. >> >> > whether a function is only called from within a ltrans unit somehow? >> >> When you look at the cgraph, the flags are set as at WPA time. >> I.e. if function is local to program it has externally_visible 0 >> and then you have used_from_other_partition/in_other_partition flags >> saying how the other ltrans partitions behave to your function. >> >> If you decide to ignore cgraph (that is probably not coolest idea), >> you have the usual PUBLIC flag that is set for all objects used cross >> ltrans boundary (since they are now hidden public symbols). >> >> You also have the address taken shipped from WPA info, so you know if other >> units reads/writes the objects or also take its address that probably comes >> handy. >> >> Honza >
Re: PR tree-optimize/49373 (IPA-PTA regression)
> > Ok, but please change the IPA inline gate to honor flag_no_inline > > (thus, (optimize && !flag_no_inline) || flag_lto || flag_wpa). > OK. Actually it won't work, since results of inline-analysis are used by most of other IPA passes (i.e. ipa-cp and ipa-sra for cloning decisions, etc.). As we chatted about shortly on the summit, perhaps it would make sense to declare the jump-functions and ipa-inline analysis to be independent analysis passes (aka ipa-pta). One computing jump functions and other deciding on size/time estimates. But definitely incrementally. Honza > > > > Thanks for working on this, I'll look to some followup cleanups > > for PTA. Now, when it works on LTRANS units we have to do > > some adjustments (like not disable it in opts.c ;)) - do we know > > Yep, I decided that it can go as a followup. Thanks for working on this! > BTW the PTA solving time seems rather high now not only for libjava, but > also for tramp3d and other bigger units I tested. > > > whether a function is only called from within a ltrans unit somehow? > > When you look at the cgraph, the flags are set as at WPA time. > I.e. if function is local to program it has externally_visible 0 > and then you have used_from_other_partition/in_other_partition flags > saying how the other ltrans partitions behave to your function. > > If you decide to ignore cgraph (that is probably not coolest idea), > you have the usual PUBLIC flag that is set for all objects used cross > ltrans boundary (since they are now hidden public symbols). > > You also have the address taken shipped from WPA info, so you know if other > units reads/writes the objects or also take its address that probably comes > handy. > > Honza
Re: [Patch, AVR]: Fix PR46779
Sorry for the earlier semi-empty mail (just quoting G-J), I meant to cancel it. Happy midsummer. brgds, H-P
Re: [Patch, AVR]: Fix PR46779
On Wed, 22 Jun 2011, Georg-Johann Lay wrote: > Hans-Peter Nilsson wrote: > > On Mon, 13 Jun 2011, Georg-Johann Lay wrote: > >> [In CCing Richard Henderson] > >> Denis Chertykov wrote: > >>> 2011/6/10 Georg-Johann Lay : > > > Then I observed trouble with DI patterns during libgcc build and had > to remove > > * "zero_extendqidi2" > * "zero_extendhidi2" > * "zero_extendsidi2" > > These are "orphan" insns: they deal with DI without having movdi > support so I removed them. > >>> This seems strange for me. > >> As far as I know, to support a mode a respective mov insn is needed, > > > > For the record, not in general, just if you have patterns > > operating on DImode. I.e. if you always have to call into > > libgcc for every operation, you're fine with just SImode, as the > > access will be split into SImode accesses. (That reload can't > > split the access is arguably a wart.) > > For avr it's actually split in QImode (word_mode), SImode would be > more efficient. > > > It's even documented, "node Standard Names" for mov@var{m}: > > "If there are patterns accepting operands in larger modes, > > @samp{mov@var{m}} must be defined for integer modes of those > > sizes." > > Thanks for pointing that out. > > For avr that means: There is movsf pattern that is implemented less > efficient than movsi. So removing movsf could improve code a bit. > Besides efficiency, code for movsi and movsf can be the same on avr. > > >> which is > >> not the case for DI. I don't know the exact rationale behind that > >> (reloading?), > > > > Yes. (I ran into problems with this myself long ago.) > > So the zero_extend*di2 pattern are bogus because there is no movdi. > > >> just read it on gcc list by Ian Taylor (and also that it is > >> strongly discouraged to have more than one mov insn per mode). > > > > That is correct.
> > > >> So if the requirement to have mov insn is dropped and without the burden to > >> implement movdi, it would be rather easy to implement adddi3 and subdi3 for > >> avr... > > > > Resist the temptation... I see you did. :) > > The preferred handling is still that optabs cared for calling __adddi3 > if there is no adddi3 pattern... The target would have to care for > implementing __adddi3 so generic libgcc need not to be changed and IMO > changing libgcc for that would not be adequate. > > Johann > > > brgds, H-P >
Re: [PATCH][RFC][2/2] Bitfield lowering
On Thu, 23 Jun 2011, Richard Guenther wrote: > On Wed, Jun 22, 2011 at 10:24 PM, Hans-Peter Nilsson > wrote: > > On Thu, 16 Jun 2011, Richard Guenther wrote: > > > >> This implements lowering a subset of COMPONENT_REFs with DECL_BIT_FIELD > >> FIELD_DECLs and BIT_FIELD_REFs - thus bitfield operations in general. > >> It lowers those to memory loads/stores that the (non-strict-align) target > >> is able to carry out, adjusting for the bit-field-ness by inserting > >> proper shifting and masking operations (just like expand does). > > > >> > >> Comments welcome - I wanted to post this before London to get > >> some input from people that won't attend. > > > > What does it do to code for targets with some kind of bitfield > > access insns? (insv, extv, various test insns taking a > > zero_extract or sign_extract argument) > > Do those usually work on memory operands? I guess not. > If not, we can > lower to insv/extv like tree codes (BIT_FIELD_COMPOSE_EXPR, > BIT_FIELD_REF) that operate on registers. One goal of bitfield > lowering was to disallow memory operations in gimple that > are not addressable or are of non-byte-aligned size. Ah, ok. > > From the above "just like expand" I guess it's expected to be a > > no change, right? > > Well, I suppose the code in expand has to be adjusted to do the > proper things for those targets. The question is whether that's > going to be easy or not. > > So, do you know of a target that can do insv with a memory > target? Not from top of my head, no. brgds, H-P
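
For anyone unfamiliar with what "adjusting for the bit-field-ness by inserting proper shifting and masking operations" amounts to, the arithmetic can be sketched in plain C on a 32-bit word. This is an illustration only and not the lowering code under discussion: the toy_ function names, the fixed 32-bit container, and counting the bit position from the least significant bit are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Read the WIDTH-bit field that starts POS bits above the least
   significant bit of WORD: shift it down, then mask off the rest.  */
static uint32_t
toy_bitfield_extract (uint32_t word, unsigned pos, unsigned width)
{
  assert (width >= 1 && width < 32 && pos + width <= 32);
  return (word >> pos) & ((UINT32_C (1) << width) - 1);
}

/* Return WORD with the same field replaced by the low WIDTH bits of
   VALUE: clear the field with the inverted mask, then OR in the
   shifted and masked new value.  */
static uint32_t
toy_bitfield_insert (uint32_t word, unsigned pos, unsigned width,
                     uint32_t value)
{
  uint32_t mask;

  assert (width >= 1 && width < 32 && pos + width <= 32);
  mask = ((UINT32_C (1) << width) - 1) << pos;
  return (word & ~mask) | ((value << pos) & mask);
}
```

A store to a bit-field then becomes load word, toy_bitfield_insert, store word, which is why a target with insv/extv style instructions might prefer a different lowering, as the question above suggests.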
Re: PR tree-optimize/49373 (IPA-PTA regression)
> Ok, but please change the IPA inline gate to honor flag_no_inline > (thus, (optimize && !flag_no_inline) || flag_lto || flag_wpa). OK. > > Thanks for working on this, I'll look to some followup cleanups > for PTA. Now, when it works on LTRANS units we have to do > some adjustments (like not disable it in opts.c ;)) - do we know Yep, I decided that it can go as a followup. Thanks for working on this! BTW the PTA solving time seems rather high now not only for libjava, but also for tramp3d and other bigger units I tested. > whether a function is only called from within a ltrans unit somehow? When you look at the cgraph, the flags are set as at WPA time. I.e. if function is local to program it has externally_visible 0 and then you have used_from_other_partition/in_other_partition flags saying how the other ltrans partitions behave to your function. If you decide to ignore cgraph (that is probably not coolest idea), you have the usual PUBLIC flag that is set for all objects used cross ltrans boundary (since they are now hidden public symbols). You also have the address taken shipped from WPA info, so you know if other units reads/writes the objects or also take its address that probably comes handy. Honza
[C++ Patch] PR 44625
Hi, a patchlet for a pretty old ice-on-invalid regression. Tested x86_64-linux. Ok for mainline? Thanks, Paolo. /cp 2011-06-23 Paolo Carlini PR c++/44625 * pt.c (tsubst_copy_and_build): Do not use BASELINK_P on a NULL_TREE. /testsuite 2011-06-23 Paolo Carlini PR c++/44625 * g++.dg/template/crash107.C: New. Index: testsuite/g++.dg/template/crash107.C === --- testsuite/g++.dg/template/crash107.C(revision 0) +++ testsuite/g++.dg/template/crash107.C(revision 0) @@ -0,0 +1,20 @@ +// PR c++/44625 +// { dg-do compile } +// { dg-options "" } + +template struct Vec { // { dg-message "note" } +Vec& operator^=(Vec& rhs) { +union { +struct {FP_ x,y,z;}; +}; // { dg-error "anonymous struct" } +X = y*rhs.z() - z*rhs.y(); // { dg-error "not declared|no member" } +} +Vec& operator^(Vec& rhs) { +return Vec(*this)^=rhs; // { dg-message "required" } +} +}; +Vec v(3,4,12); // { dg-error "no matching" } +// { dg-message "note" { target *-*-* } 16 } +Vec V(12,4,3); // { dg-error "no matching" } +// { dg-message "note" { target *-*-* } 18 } +Vec c = v^V; // { dg-message "required" } Index: cp/pt.c === --- cp/pt.c (revision 175328) +++ cp/pt.c (working copy) @@ -13252,6 +13252,9 @@ tsubst_copy_and_build (tree t, object_type = TREE_TYPE (object); member = TREE_OPERAND (t, 1); + if (!member) + return error_mark_node; + if (BASELINK_P (member)) member = tsubst_baselink (member, non_reference (TREE_TYPE (object)),
Re: [pph] Fix binding_level's names_size streaming (issue4634071)
On Wed, Jun 22, 2011 at 20:17, Gabriel Dos Reis wrote:
> On Wed, Jun 22, 2011 at 7:05 PM, Gabriel Charette wrote:
>> And it looks like this wasn't sent to anyone directly...
>> Adding back dnovillo and crowl (Diego I don't think Jason was ever
>> added to the original message...?)
>
> should not this go to mainline too?

Yes, I CC'd Jason in the original thread that started this discussion.
Gab, could you send a patch for trunk? Please CC Jason when you do.

Diego.
[v3] adjust tests to work in c++0x mode
This changes a few tests so they still work when the testsuite is run in
c++0x mode, adding _GLIBCXX_CONSTEXPR to some declarations and qualifying
some TR1 names to disambiguate them from the same names in namespace std.
I think this is the right way to handle the failures.

	* testsuite/tr1/6_containers/tuple/creation_functions/tie2.cc: Fix
	for C++0x mode.
	* testsuite/25_algorithms/sort/35588.cc: Likewise.
	* testsuite/26_numerics/headers/complex/synopsis.cc: Likewise.

Tested on x86_64-linux, with and without -std=gnu++0x

There's still one more failure like this in
tr1/6_containers/utility/pair.cc, which needs every 'get' to be qualified
as tr1::get, which I can't be bothered to do today!

Index: testsuite/25_algorithms/sort/35588.cc
===
--- testsuite/25_algorithms/sort/35588.cc	(revision 174948)
+++ testsuite/25_algorithms/sort/35588.cc	(working copy)
@@ -23,9 +23,8 @@
 int main()
 {
   using namespace std;
-  using namespace tr1;
   using namespace std::tr1::placeholders;

   int t[10];
-  sort(t, t+10, bind(less(), _1, _2));
+  sort(t, t+10, tr1::bind(less(), _1, _2));
 }
Index: testsuite/26_numerics/headers/complex/synopsis.cc
===
--- testsuite/26_numerics/headers/complex/synopsis.cc	(revision 174948)
+++ testsuite/26_numerics/headers/complex/synopsis.cc	(working copy)
@@ -44,15 +44,20 @@
   template complex operator/(const T&, const complex&);
   template complex operator+(const complex&);
   template complex operator-(const complex&);
-  template bool operator==
+  template _GLIBCXX_CONSTEXPR bool operator==
     (const complex&, const complex&);
-  template bool operator==(const complex&, const T&);
-  template bool operator==(const T&, const complex&);
+  template _GLIBCXX_CONSTEXPR bool operator==
+    (const complex&, const T&);
+  template _GLIBCXX_CONSTEXPR bool operator==
+    (const T&, const complex&);

-  template bool operator!=(const complex&, const complex&);
-  template bool operator!=(const complex&, const T&);
-  template bool operator!=(const T&, const complex&);
+  template _GLIBCXX_CONSTEXPR bool operator!=
+    (const complex&, const complex&);
+  template _GLIBCXX_CONSTEXPR bool operator!=
+    (const complex&, const T&);
+  template _GLIBCXX_CONSTEXPR bool operator!=
+    (const T&, const complex&);

   template basic_istream& operator>>(basic_istream&, complex&);
@@ -61,8 +66,8 @@
     operator<<(basic_ostream&, const complex&);

   // 26.2.7 values:
-  template T real(const complex&);
-  template T imag(const complex&);
+  template _GLIBCXX_CONSTEXPR T real(const complex&);
+  template _GLIBCXX_CONSTEXPR T imag(const complex&);
   template T abs(const complex&);
   template T arg(const complex&);
   template T norm(const complex&);
Index: testsuite/tr1/6_containers/tuple/creation_functions/tie2.cc
===
--- testsuite/tr1/6_containers/tuple/creation_functions/tie2.cc	(revision 174948)
+++ testsuite/tr1/6_containers/tuple/creation_functions/tie2.cc	(working copy)
@@ -30,7 +30,7 @@
   int i;
   std::string s;

-  tie(i, ignore, s) = make_tuple(42, 3.14, "C++");
+  std::tr1::tie(i, ignore, s) = make_tuple(42, 3.14, "C++");
   VERIFY( i == 42 );
   VERIFY( s == "C++" );
 }
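The name clash that the 35588.cc change works around can be reproduced in miniature without any library headers. This is only a sketch: `lib`, `lib::v1` and `pick` are made-up names standing in for `std`, `std::tr1` and `bind`.

```cpp
#include <cassert>

// `lib` plays the role of std, `lib::v1` the role of std::tr1, and `pick`
// the role of a name such as bind that exists in both namespaces.
namespace lib {
  inline int pick(int) { return 1; }
  namespace v1 {
    inline int pick(int) { return 2; }
  }
}

// With only the outer using-directive, the unqualified name finds the
// outer function.
inline int call_unqualified() {
  using namespace lib;
  return pick(0);
}

// Qualifying with the nested namespace selects the nested function.
// `v1` itself is found through the using-directive for `lib`.
inline int call_qualified() {
  using namespace lib;
  return v1::pick(0);
}

// Adding `using namespace lib::v1;` next to `using namespace lib;` and
// then calling `pick(0)` unqualified would be rejected as ambiguous,
// because both functions take the same argument type.
```

Dropping the second using-directive and qualifying the call, as the patch does with `tr1::bind`, keeps the call pointing at the same function whether or not the outer namespace gains a function of the same name.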
Re: Mark variables addressable if they are copied using libcall in RTL expander
> So, what's the patch(es) that need approval now?

Original expr.c patch for PR rtl-optimization/49429 + adjusted and augmented
calls.c patch for PR target/49454. Everything is in this thread.

Easwaran, would you mind posting a consolidated patch?

--
Eric Botcazou
Re: Mark variables addressable if they are copied using libcall in RTL expander
On Thu, Jun 23, 2011 at 10:00 AM, Eric Botcazou wrote:
>> Is the following patch a reasonable fix for this case?
>
> The lines should be moved to within the first branch of the subsequent "if".
> They aren't needed if the second branch is taken because, in this case, we're
> back to the usual caller-copied scheme where we pass the address of the copy.
>
>> I assume I should add similar code inside emit_library_call_value_1.
>
> Yes, we need the same treatment for 'val' in the MEM_P (val) && !must_copy
> case as the one applied in emit_block_move_hints.
>
>
> But these problems show that there is a slight discrepancy between what dse.c
> really needs (is the address of the variable taken?) and what may_be_aliased
> answers (might the variable be indirectly modified?). Another viewpoint is to
> say that there is a slight discrepancy between Tree and RTL level when it
> comes to the address-taken property. Not clear what to do about it so I think
> that we should try this kludgy way and see how it fares.

Yeah, I agree. Unless we want to really do alias analysis on RTL (and not
just export what the tree level has in some way) let's go forward with this.

So, what's the patch(es) that need approval now?

Thanks,
Richard.

> --
> Eric Botcazou
>
Re: [PATCH] Improve dump files for SRA early candidate check
On Thu, Jun 23, 2011 at 8:07 AM, Eric Botcazou wrote:
>> +      if (!host_integerp (DECL_FIELD_OFFSET (fld), 1))
>> +        {
>> +          *msg = "structure field offset not host integer"; /* ??? */
>> +          return true;
>> +        }
>
> Offsets can be variable, like sizes, in Ada for example.
>
>>        if (TYPE_VOLATILE (et))
>> -        return true;
>> +        {
>> +          *msg = "array type is volatile";
>> +          return true;
>> +        }
>
> "element type is volatile"
>
>> +      if (!COMPLETE_TYPE_P (type))
>> +        {
>> +          reject (var, "is not complete");
>> +          continue;
>> +        }
>
> "has incomplete type" is better I think
>
>> +      if (!host_integerp (TYPE_SIZE (type), 1))
>> +        {
>> +          reject (var, "not host integer");
>> +          continue;
>> +        }
>
> missing "type size"
>
>> +      if (tree_low_cst (TYPE_SIZE (type), 1) == 0)
>> +        {
>> +          reject (var, "tree_low_cst is zero"); /* what is that? */
>> +          continue;
>> +        }
>
> This is equivalent to saying that the type size is zero.

Ok with the suggested changes and the questioning comments removed.

Thanks,
Richard.

> --
> Eric Botcazou
>
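The pattern under review, a candidate check that reports why it rejected something through an out-parameter, can be sketched on its own with the message wording suggested in the review. `TypeInfo` and `type_precludes_sra` are hypothetical simplifications and do not reproduce the actual SRA code.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, much simplified description of a variable's type.
struct TypeInfo {
  bool complete;          // the type is fully defined
  bool size_is_constant;  // size known as a host integer, not variable
  unsigned long size;     // size, only meaningful if size_is_constant
  bool element_volatile;  // for arrays: element type is volatile
};

// Returns true if the type rules the variable out as a candidate and
// stores a human-readable reason in *msg for the dump file. On success
// *msg is set to a null pointer.
inline bool type_precludes_sra(const TypeInfo &t, const char **msg) {
  if (!t.complete) {
    *msg = "has incomplete type";
    return true;
  }
  if (!t.size_is_constant) {
    *msg = "type size not host integer";
    return true;
  }
  if (t.size == 0) {
    *msg = "type size is zero";
    return true;
  }
  if (t.element_volatile) {
    *msg = "element type is volatile";
    return true;
  }
  *msg = nullptr;
  return false;
}
```

A caller would print the stored reason next to the variable name when the check returns true, which is what makes the dump file say why a candidate was dropped rather than only that it was.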