[PATCH, PR preprocessor/42014] Added LAST_SOURCE_COLUMN in while loop
Hi, The attached patch adds LAST_SOURCE_COLUMN to pp_verbatim function in the while loop present in diagnostic_report_current_module(). This makes the output consistent for any error parsing program as stated in the bug. 2013-05-10 Shakthi Kannan skan...@redhat.com PR preprocessor/42014 * gcc/diagnostic.c: Added LAST_SOURCE_COLUMN in while loop. --- gcc/diagnostic.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index f9a236b..2addbf0 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -528,8 +528,9 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where) { map = INCLUDED_FROM (line_table, map); pp_verbatim (context-printer, - ,\n from %s:%d, - LINEMAP_FILE (map), LAST_SOURCE_LINE (map)); + ,\n from %s:%d:%d, + LINEMAP_FILE (map), + LAST_SOURCE_LINE (map), LAST_SOURCE_COLUMN (map)); } pp_verbatim (context-printer, :); pp_newline (context-printer); -- 1.7.11.7
Re: [PATCH, PR preprocessor/42014] Added LAST_SOURCE_COLUMN in while loop
On Fri, May 10, 2013 at 02:55:07AM -0400, Shakthi Kannan wrote: Hi, The attached patch adds LAST_SOURCE_COLUMN to pp_verbatim function in the while loop present in diagnostic_report_current_module(). This makes the output consistent for any error parsing program as stated in the bug. 2013-05-10 Shakthi Kannan skan...@redhat.com PR preprocessor/42014 * gcc/diagnostic.c: Added LAST_SOURCE_COLUMN in while loop. Maybe PR preprocessor/42014 * diagnostic.c: Print LAST_SOURCE_COLUMN as well. ? Marek
Re: [PATCH] PR55033: Fix
On 05/10/2013 03:55 AM, Alan Modra wrote: On Thu, May 09, 2013 at 08:47:19AM -0700, Richard Henderson wrote: On 02/22/2013 02:51 AM, Alan Modra wrote: Richard, can I bother you with a review request? http://gcc.gnu.org/ml/gcc-patches/2013-02/msg00544.html Sorry for the truly excessive delay here. This patch is ok for mainline and 4.8.1. Thanks. Committed revision 198762 and 198763. Thanks a lot, and sorry for being so obtrusive. -- Sebastian Huber, embedded brains GmbH Address : Dornierstr. 4, D-82178 Puchheim, Germany Phone : +49 89 189 47 41-16 Fax : +49 89 189 47 41-09 E-Mail : sebastian.hu...@embedded-brains.de PGP : Public key available on request. Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.
[PATCH] Test case for PR55033
gcc/testsuite/ChangeLog 2013-05-10 Sebastian Huber sebastian.hu...@embedded-brains.de PR target/55033 * gcc.target/powerpc/pr55033.c: New. --- gcc/testsuite/gcc.target/powerpc/pr55033.c | 21 + 1 files changed, 21 insertions(+), 0 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr55033.c diff --git a/gcc/testsuite/gcc.target/powerpc/pr55033.c b/gcc/testsuite/gcc.target/powerpc/pr55033.c new file mode 100644 index 000..cd8abcb --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr55033.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target powerpc_eabi_ok } */ +/* { dg-options -mcpu=8540 -msoft-float -msdata=eabi -G 8 -fno-common } */ + +void f(void); + +struct s { + int *p; + int *q; +}; + +extern int a; + +extern const struct s c; + +const struct s c = { a, 0 }; + +void f(void) +{ + char buf[4] = { 0, 1, 2, 3 }; +} -- 1.7.7
Re: [PATCH] Rotate pattern recognition in forwprop (PR tree-optimization/57157)
On Thu, 9 May 2013, Jakub Jelinek wrote: Hi! The bit_rotate: code in fold-const.c handles only a subset of rotates, and e.g. with C++ FE when sizeof isn't folded early fold_binary often even isn't called with something that could recognize the rotate pattern, even if bit_rotate: has been extended. Furthermore, not all the rotate patterns need to be written in a single statement. So, this patch adds a rotate pattern recognizer into forwprop, and handles several different patterns, the comment above the function shows which ones. Bootstrapped/regtested on x86_64-linux and i686-linux. On the Botan KASUMI distilled testcase (PR55278 #c6 at -O3) I get 6.3% speedup on Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz and 9.9% speedup on Quad-Core AMD Opteron(tm) Processor 8354. Ok for trunk? Ok. Thanks, Richard. 2013-05-09 Jakub Jelinek ja...@redhat.com PR tree-optimization/45216 PR tree-optimization/57157 * tree-ssa-forwprop.c (simplify_rotate): New function. (ssa_forward_propagate_and_combine): Call it. * c-c++-common/rotate-1.c: New test. * c-c++-common/rotate-1a.c: New test. * c-c++-common/rotate-2.c: New test. * c-c++-common/rotate-2a.c: New test. * c-c++-common/rotate-3.c: New test. * c-c++-common/rotate-3a.c: New test. * c-c++-common/rotate-4.c: New test. * c-c++-common/rotate-4a.c: New test. * gcc.target/i386/rotate-1.c: Accept rolb or rorb instruction. --- gcc/tree-ssa-forwprop.c.jj2013-04-22 08:06:41.0 +0200 +++ gcc/tree-ssa-forwprop.c 2013-05-09 15:31:49.35512 +0200 @@ -2124,6 +2124,242 @@ simplify_bitwise_binary (gimple_stmt_ite } +/* Recognize rotation patterns. Return true if a transformation + applied, otherwise return false. + + We are looking for X with unsigned type T with bitsize B, OP being + +, | or ^, some type T2 wider than T and + (X CNT1) OP (X CNT2)iff CNT1 + CNT2 == B + ((T) ((T2) X CNT1)) OP ((T) ((T2) X CNT2)) iff CNT1 + CNT2 == B + (X Y) OP (X (B - Y)) + (X (int) Y) OP (X (int) (B - Y)) + ((T) ((T2) X Y)) OP ((T) ((T2) X (B - Y))) + ((T) ((T2) X (int) Y)) OP ((T) ((T2) X (int) (B - Y))) + (X Y) OP (X ((-Y) (B - 1))) + (X (int) Y) OP (X (int) ((-Y) (B - 1))) + ((T) ((T2) X Y)) OP ((T) ((T2) X ((-Y) (B - 1 + ((T) ((T2) X (int) Y)) OP ((T) ((T2) X (int) ((-Y) (B - 1 + + and transform these into: + X r CNT1 + X r Y + + Note, in the patterns with T2 type, the type of OP operands + might be even a signed type, but should have precision B. */ + +static bool +simplify_rotate (gimple_stmt_iterator *gsi) +{ + gimple stmt = gsi_stmt (*gsi); + tree arg[2], rtype, rotcnt = NULL_TREE; + tree def_arg1[2], def_arg2[2]; + enum tree_code def_code[2]; + tree lhs; + int i; + bool swapped_p = false; + gimple g; + + arg[0] = gimple_assign_rhs1 (stmt); + arg[1] = gimple_assign_rhs2 (stmt); + rtype = TREE_TYPE (arg[0]); + + /* Only create rotates in complete modes. Other cases are not + expanded properly. */ + if (!INTEGRAL_TYPE_P (rtype) + || TYPE_PRECISION (rtype) != GET_MODE_PRECISION (TYPE_MODE (rtype))) +return false; + + for (i = 0; i 2; i++) +defcodefor_name (arg[i], def_code[i], def_arg1[i], def_arg2[i]); + + /* Look through narrowing conversions. */ + if (CONVERT_EXPR_CODE_P (def_code[0]) + CONVERT_EXPR_CODE_P (def_code[1]) + INTEGRAL_TYPE_P (TREE_TYPE (def_arg1[0])) + INTEGRAL_TYPE_P (TREE_TYPE (def_arg1[1])) + TYPE_PRECISION (TREE_TYPE (def_arg1[0])) + == TYPE_PRECISION (TREE_TYPE (def_arg1[1])) + TYPE_PRECISION (TREE_TYPE (def_arg1[0])) TYPE_PRECISION (rtype) + has_single_use (arg[0]) + has_single_use (arg[1])) +{ + for (i = 0; i 2; i++) + { + arg[i] = def_arg1[i]; + defcodefor_name (arg[i], def_code[i], def_arg1[i], def_arg2[i]); + } +} + + /* One operand has to be LSHIFT_EXPR and one RSHIFT_EXPR. */ + for (i = 0; i 2; i++) +if (def_code[i] != LSHIFT_EXPR def_code[i] != RSHIFT_EXPR) + return false; +else if (!has_single_use (arg[i])) + return false; + if (def_code[0] == def_code[1]) +return false; + + /* If we've looked through narrowing conversions before, look through + widening conversions from unsigned type with the same precision + as rtype here. */ + if (TYPE_PRECISION (TREE_TYPE (def_arg1[0])) != TYPE_PRECISION (rtype)) +for (i = 0; i 2; i++) + { + tree tem; + enum tree_code code; + defcodefor_name (def_arg1[i], code, tem, NULL); + if (!CONVERT_EXPR_CODE_P (code) + || !INTEGRAL_TYPE_P (TREE_TYPE (tem)) + || TYPE_PRECISION (TREE_TYPE (tem)) != TYPE_PRECISION (rtype)) + return false; + def_arg1[i] = tem; + } + /* Both shifts have to
Re: [PATCH][1/n] Re-organize -fvect-cost-model, enable basic vectorization at -O2
On Tue, 7 May 2013, Richard Biener wrote: The following patch is a first step towards being able to enable vectorizing of a subset of all vectorizable functions at -O2 by default. Analysis of Polyhedron (loop heavy code) shows that the cost of doing vectorizer analysis is in the noise but compile-time (and binary size) grows only with the number of loops we emit vectorized code for (because we generate up to 3 extra loops for each vectorizable loop that we have to move down the pass pipeline). This very first patch makes sure that a runtime cost model check comes first - and not after alias or alignment versioning checks. That part of the patch would be a no-op without the rest because currently peeling for alignment and versioning cannot coexist (well - they could not, before I introduced STMT_VINFO_LOOP_PHI_EVOLUTION_PART a few releases ago ...). Thus the patch enables doing both which may eventually speedup things. Sofar tested on the testsuite and polyhedron, full bootstrap and regtest is underway. Patches in this series will transform -fvect-cost-model to get an additional parameter (not sure how that will end up looking like) to be able to control vectorization in the following 'steps' 1 vectorize loops that do not require a runtime cost check to be profitable and that do not require versioning (to be enabled at -O2) 2 vectorize loops like we do now 3 vectorize loops like we do now but assume the runtime cost check will always succeed (thus, omit it) [-fno-vect-cost-model] I'm not sure yet if restricting the versioning makes sense (it's supposed to reduce code bloat and compile-time of course), esp. considering that peeling for alignment could be disabled as well (but then HW without unaligned accesses will likely vectorize nothing). Thus the complication seems to be the code size considerations (the cost model is currently set up to compare runtime cost only). Richard. 2013-05-07 Richard Biener rguent...@suse.de * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Do not disable peeling when we version for aliasing. * tree-vect-loop-manip.c (vect_can_advance_ivs_p): Use STMT_VINFO_LOOP_PHI_EVOLUTION_PART instead of recomputing it. * tree-vect-loop.c (vect_transform_loop): First apply versioning, then peeling to arrange for the cost-model check to come first. Runs into issues with gcc.target/i386/l_fma_double_1.c and friends because that now does peeling for alignment instead of just versioning for alias. And peeling for alignment is only applied on x86_64 for double because on i686 alignment might not be reachable with peeling. Trying to fixup in the testcase with a properly aligned double type shows that we don't honor user alignment in the vectorizer for this purpose - thus, fixed, to allow not too ugly adjustments of the fma testcases. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2013-05-10 Richard Biener rguent...@suse.de * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Do not disable peeling when we version for aliasing. (vector_alignment_reachable_p): Honor explicit user alignment. (vect_supportable_dr_alignment): Likewise. * tree-vect-loop-manip.c (vect_can_advance_ivs_p): Use STMT_VINFO_LOOP_PHI_EVOLUTION_PART instead of recomputing it. * tree-vect-loop.c (vect_transform_loop): First apply versioning, then peeling to arrange for the cost-model check to come first. * gcc.target/i386/avx256-unaligned-load-2.c: Make well-defined. * gcc.target/i386/l_fma_double_1.c: Adjust. * gcc.target/i386/l_fma_double_2.c: Likewise. * gcc.target/i386/l_fma_double_3.c: Likewise. * gcc.target/i386/l_fma_double_4.c: Likewise. * gcc.target/i386/l_fma_double_5.c: Likewise. * gcc.target/i386/l_fma_double_6.c: Likewise. * gcc.target/i386/l_fma_float_1.c: Likewise. * gcc.target/i386/l_fma_float_2.c: Likewise. * gcc.target/i386/l_fma_float_3.c: Likewise. * gcc.target/i386/l_fma_float_4.c: Likewise. * gcc.target/i386/l_fma_float_5.c: Likewise. * gcc.target/i386/l_fma_float_6.c: Likewise. Index: gcc/tree-vect-data-refs.c === *** gcc/tree-vect-data-refs.c.orig 2013-05-08 13:26:16.0 +0200 --- gcc/tree-vect-data-refs.c 2013-05-08 14:23:42.937265437 +0200 *** vector_alignment_reachable_p (struct dat *** 1024,1030 if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, Unknown misalignment, is_packed = %d,is_packed); ! if (targetm.vectorize.vector_alignment_reachable (type, is_packed)) return true; else return false; --- 1024,1031 if (dump_enabled_p ()) dump_printf_loc
[PATCH] Fix PR57214
This fixes PR57214 - the simple constant propagation pass we perform over unrolled loop bodies does not properly avoid propagating across abnormal edges. Bootstrap regtest running on x86_64-unknown-linux-gnu, I'll install this on trunk and the 4.8 branch where the bug is latent. Richard. 2013-05-10 Richard Biener rguent...@suse.de PR tree-optimization/57214 * tree-ssa-loop-ivcanon.c (propagate_constants_for_unrolling): Do not propagate from SSA names that occur in abnormal PHI nodes. Index: gcc/tree-ssa-loop-ivcanon.c === *** gcc/tree-ssa-loop-ivcanon.c (revision 198768) --- gcc/tree-ssa-loop-ivcanon.c (working copy) *** propagate_constants_for_unrolling (basic *** 1085,1092 tree lhs; if (is_gimple_assign (stmt) (lhs = gimple_assign_lhs (stmt), TREE_CODE (lhs) == SSA_NAME) ! gimple_assign_rhs_code (stmt) == INTEGER_CST) { propagate_into_all_uses (lhs, gimple_assign_rhs1 (stmt)); gsi_remove (gsi, true); --- 1085,1093 tree lhs; if (is_gimple_assign (stmt) + gimple_assign_rhs_code (stmt) == INTEGER_CST (lhs = gimple_assign_lhs (stmt), TREE_CODE (lhs) == SSA_NAME) ! !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) { propagate_into_all_uses (lhs, gimple_assign_rhs1 (stmt)); gsi_remove (gsi, true); Index: gcc/testsuite/gcc.dg/torture/pr57214.c === *** gcc/testsuite/gcc.dg/torture/pr57214.c (revision 0) --- gcc/testsuite/gcc.dg/torture/pr57214.c (working copy) *** *** 0 --- 1,25 + /* { dg-do compile } */ + + extern int baz (void); + extern int foo (void) __attribute__ ((returns_twice)); + + void + bar (_Bool b) + { + int buf[1]; + while (1) + { + _Bool x = 1; + if (b) + baz (); + b = 1; + baz (); + x = 0; + int i; + while (buf[i] i) + i++; + foo (); + if (!x) + b = 0; + } + }
RFC patch for fold-const.c's fold_indirect_ref_1 / avoid type-checking ICE with TYPE_MIN_VALUE vs. offset signess mismatches
I run into the following issue on the Fortran-Dev branch. I get an ICE due to a tree-checking assert for fold-const.c's fold_indirect_ref_1, which has tree min_val = size_zero_node; if (type_domain TYPE_MIN_VALUE (type_domain)) min_val = TYPE_MIN_VALUE (type_domain); op01 = size_binop_loc (loc, EXACT_DIV_EXPR, op01, TYPE_SIZE_UNIT (type)); op01 = size_binop_loc (loc, PLUS_EXPR, op01, min_val); The problem is that op01 is unsigned (size_type_node) while min_val alias TYPE_MIN_VALUE is signed (gfc_array_index_type which is PTRDIFF_TYPE). TYPE_MIN_VALUE gets set via tree.c's build_range_type_1, which has: TYPE_MIN_VALUE (itype) = fold_convert (type, lowval); TYPE_MAX_VALUE (itype) = highval ? fold_convert (type, highval) : NULL; And op01 becomes a size_type_node via tree.h's fold_build_pointer_plus_loc (location_t loc, tree ptr, tree off) { return fold_build2_loc (loc, POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, fold_convert_loc (loc, sizetype, off)); I am not completely sure what's the best way to solve that, but the attached patch avoids the ICE for me. I did an all-all language bootstrap (all,ada,go,objc,obj-c++) and had no new failures with regresting. OK? Or do you have a better idea how to solve this? Tobias 2013-05-10 Tobias Burnus bur...@net-b.de * fold-const.c (fold_indirect_ref_1): Fold offset to TREE_TYPE (min_val) before converting to array syntax. diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 74b451e..8ac873a 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -16591,10 +16591,11 @@ fold_indirect_ref_1 (location_t loc, tree type, tree op0) tree min_val = size_zero_node; if (type_domain TYPE_MIN_VALUE (type_domain)) min_val = TYPE_MIN_VALUE (type_domain); op01 = size_binop_loc (loc, EXACT_DIV_EXPR, op01, TYPE_SIZE_UNIT (type)); + op01 = fold_convert (TREE_TYPE (min_val), op01); op01 = size_binop_loc (loc, PLUS_EXPR, op01, min_val); return build4_loc (loc, ARRAY_REF, type, op00, op01, NULL_TREE, NULL_TREE); } }
Re: [PATCH] Improve rotation by mode bitsize - 1
On Thu, May 9, 2013 at 10:24 PM, Jakub Jelinek ja...@redhat.com wrote: On Thu, May 09, 2013 at 09:47:49PM +0200, Uros Bizjak wrote: On Thu, May 9, 2013 at 8:45 PM, Jakub Jelinek ja...@redhat.com wrote: 2013-05-09 Jakub Jelinek ja...@redhat.com * config/i386/i386.md (rotateinv): New code attr. (*rotate_insnmode3_1, *rotate_insnsi3_1_zext, *rotate_insnqi3_1_slp): Emit rorl %eax instead of roll $31, %eax, etc. OK. Thanks. While talking about rotates, seems they are quite expensive on SandyBridge if the destination (== source1) is memory. With -O3 -mavx: unsigned int a[1024] = { 0, 1, 2, 3, 4, 5, 6 }; __attribute__((noinline, noclone)) void foo (void) { int i; for (i = 0; i 1024; i++) { int j = i 31; a[i] = (a[i] j) ^ (a[i] ((-j) 31)); } } int main () { int i; for (i = 0; i 100; i++) foo (); return 0; } the timing for vanilla gcc, gcc with the rotate recognizer patch and the latter hand editted to use rotate on register gives: Intel i7-2600 0m1.421s0m1.914s0m0.826s AMD 83540m2.353s0m0.988s0m1.407s The vanilla gcc loop is long: .L2: movla(,%rax,4), %edi movl%eax, %esi andl$31, %esi movl%esi, %ecx negl%ecx movl%edi, %edx shrl%cl, %edx movl%esi, %ecx sall%cl, %edi xorl%edi, %edx movl%edx, a(,%rax,4) addq$1, %rax cmpq$1024, %rax jne .L2 With rotate using mem: .L2: roll%cl, a(,%rcx,4) addq$1, %rcx cmpq$1024, %rcx jne .L2 and with the hand tweaked rotate: .L2: movla(,%rcx,4), %eax roll%cl, %eax addq$1, %rcx cmpq$1024, %rcx movl%eax, a-4(,%rcx,4) jne .L2 So, on SandyBridge (haven't tried other Intel CPUs) it perhaps might be beneficial to disallow rotate with =m , 0 constraints, while for the above mentioned AMD it is best to use it. Guess the above isn't sufficient data for that, we'd need info on more CPUs and more rotation sizes. BTW, with -mavx2 we vectorize the loop without the rotate recognition patch and don't vectorize anymore, I'll need to adjust the pattern recognizer to handle those. The question is if for missing vector rotate optab we can replace it with the probably faster and shorter x r y - (x y) | (x (32 - y)) - which is invalid for y = 0, but perhaps if the only thing the vector shift would do for shift by 32 would be either return 0, or x, it would still work, or if we want to go for the (x y) | (x ((-y) 31)). The i?86/x86_64 vector shifts don't truncate the shift count, so perhaps best would be a vector rotate expander that expands it into some some special insn like vectorized y ? x (32 - y) : 0 One question is whether our LROTATE_EXPR or RROTATE_EXPR is only defined for rotation counts 0 through bitsize - 1 (as is the case when we pattern detect it, do we create *ROTATE_EXPR other way?), or whether we allow arbitrary rotation counts (would assume the rotation be done always with modulo bitsize). Because if the rotation count could be arbitrary, we'd need to vectorize it as (x (y 31)) | (x ((-y) 31)) The rotation count should be 0 to bitsize - 1 similar to shifts. Richard. Jakub
Re: [PATCH, PR preprocessor/42014] Added LAST_SOURCE_COLUMN in while loop
Hi, - Original Message - | From: Marek Polacek pola...@redhat.com | | Maybe | PR preprocessor/42014 | * diagnostic.c: Print LAST_SOURCE_COLUMN as well. | ? \-- Sure! SK -- Shakthi Kannan skannan at redhat dot com
Re: More vector folding
On Thu, May 9, 2013 at 12:55 AM, Marc Glisse marc.gli...@inria.fr wrote: Hello, here are a few more changes to fold-const.c so vectors can use the existing optimizations. Note that I made fold_truth_not_expr safe for use with vector BIT_NOT_EXPR. Passes bootstrap+testsuite on x86_64-linux-gnu. 2013-05-09 Marc Glisse marc.gli...@inria.fr gcc/ * fold-const.c (fold_negate_expr): Handle vectors. (fold_truth_not_expr): Likewise. (invert_truthvalue_loc): Likewise. (fold_single_bit_test): Likewise. (fold_comparison): Likewise. (fold_ternary_loc): Likewise. gcc/testsuite/ * g++.dg/ext/vector22.C: New testcase. -- Marc Glisse Index: gcc/testsuite/g++.dg/ext/vector22.C === --- gcc/testsuite/g++.dg/ext/vector22.C (revision 0) +++ gcc/testsuite/g++.dg/ext/vector22.C (revision 0) @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options -O -fdump-tree-gimple } */ + +typedef unsigned vec __attribute__((vector_size(4*sizeof(int; + +void f(vec*a,vec*b){ + *a=(*a)?-1:(*b10); + *b=(*b)?(*a10):0; +} +void g(vec*a,vec*b){ + *a=(*a)?(*a*a):-1; + *b=(*b)?-1:(*b*b); +} +void h(vec*a){ + *a=(~*a==5); +} + +/* { dg-final { scan-tree-dump-not ~ gimple } } */ +/* { dg-final { scan-tree-dump-not VEC_COND_EXPR gimple } } */ +/* { dg-final { cleanup-tree-dump gimple } } */ Property changes on: gcc/testsuite/g++.dg/ext/vector22.C ___ Added: svn:eol-style + native Added: svn:keywords + Author Date Id Revision URL Index: gcc/fold-const.c === --- gcc/fold-const.c(revision 198726) +++ gcc/fold-const.c(working copy) @@ -519,21 +519,21 @@ fold_negate_expr (location_t loc, tree t { tree type = TREE_TYPE (t); tree tem; switch (TREE_CODE (t)) { /* Convert - (~A) to A + 1. */ case BIT_NOT_EXPR: if (INTEGRAL_TYPE_P (type)) return fold_build2_loc (loc, PLUS_EXPR, type, TREE_OPERAND (t, 0), -build_int_cst (type, 1)); +build_one_cst (type)); break; case INTEGER_CST: tem = fold_negate_const (t, type); if (TREE_OVERFLOW (tem) == TREE_OVERFLOW (t) || !TYPE_OVERFLOW_TRAPS (type)) return tem; break; case REAL_CST: @@ -3110,20 +3110,23 @@ fold_truth_not_expr (location_t loc, tre return build2_loc (loc, code, type, TREE_OPERAND (arg, 0), TREE_OPERAND (arg, 1)); } switch (code) { case INTEGER_CST: return constant_boolean_node (integer_zerop (arg), type); +case VECTOR_CST: + return fold_unary_loc (loc, BIT_NOT_EXPR, type, arg); + case TRUTH_AND_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc); return build2_loc (loc, TRUTH_OR_EXPR, type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0)), invert_truthvalue_loc (loc2, TREE_OPERAND (arg, 1))); case TRUTH_OR_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc); @@ -3195,26 +3198,28 @@ fold_truth_not_expr (location_t loc, tre return build1_loc (loc, TRUTH_NOT_EXPR, type, arg); /* ... fall through ... */ case FLOAT_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); return build1_loc (loc, TREE_CODE (arg), type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0))); case BIT_AND_EXPR: - if (!integer_onep (TREE_OPERAND (arg, 1))) + if (VECTOR_TYPE_P (type) || !integer_onep (TREE_OPERAND (arg, 1))) return NULL_TREE; return build2_loc (loc, EQ_EXPR, type, arg, build_int_cst (type, 0)); case SAVE_EXPR: - return build1_loc (loc, TRUTH_NOT_EXPR, type, arg); + return build1_loc (loc, VECTOR_TYPE_P (type) ? BIT_NOT_EXPR + : TRUTH_NOT_EXPR, +type, arg); case CLEANUP_POINT_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); return build1_loc (loc, CLEANUP_POINT_EXPR, type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0))); default: return NULL_TREE; } } @@ -3222,28 +3227,28 @@ fold_truth_not_expr (location_t loc, tre /* Return a simplified tree node for the truth-negation of ARG. This never alters ARG itself. We assume that ARG is an operation that returns a truth value (0 or 1). FIXME: one would think we would fold the result, but it causes problems with the dominator optimizer. */ tree
Re: TYPE_PRECISION for vectors
On Tue, May 7, 2013 at 11:26 PM, Marc Glisse marc.gli...@inria.fr wrote: Hello, this patch is about the use of TYPE_PRECISION for non-scalar types. For complex types, it is unused (always 0) and for vectors, the field is used to store the log of the number of elements, so in both cases we shouldn't use the macro. I tried to enforce it, see the SCALAR_TYPE_CHECK in tree.h (I assume I should remove it and the FIXME before commit), and started fixing the crashes, but there were too many and I gave up. Those fixes I did write are included in the patch, since I believe we eventually want to get there. For hashing in gimple.c, including TYPE_UNSIGNED for complex and vector seems questionable: I think that it is the same as for the inner type, which is already hashed. I didn't change that in the patch. Passes bootstrap+testsuite on x86_64-linux-gnu. Ok. Thanks, Richard. 2013-05-07 Marc Glisse marc.gli...@inria.fr gcc/ * stor-layout.c (element_precision): New function. * machmode.h (element_precision): Declare it. * tree.c (build_minus_one_cst): New function. (element_precision): Likewise. * tree.h (SCALAR_TYPE_CHECK): New macro. (build_minus_one_cst): Declare new function. (element_precision): Likewise. * fold-const.c (operand_equal_p): Use element_precision. (fold_binary_loc): Handle vector types. * convert.c (convert_to_integer): Use element_precision. * gimple.c (iterative_hash_canonical_type): Handle complex and vectors separately. gcc/c-family/ * c-common.c (vector_types_convertible_p): No TYPE_PRECISION for vectors. gcc/testsuite/ * gcc.dg/vector-shift.c: New testcase. -- Marc Glisse Index: gcc/convert.c === --- gcc/convert.c (revision 198685) +++ gcc/convert.c (working copy) @@ -348,22 +348,22 @@ convert_to_real (tree type, tree expr) fixed-point or vector; in other cases error is called. The result of this is always supposed to be a newly created tree node not in use in any existing structure. */ tree convert_to_integer (tree type, tree expr) { enum tree_code ex_form = TREE_CODE (expr); tree intype = TREE_TYPE (expr); - unsigned int inprec = TYPE_PRECISION (intype); - unsigned int outprec = TYPE_PRECISION (type); + unsigned int inprec = element_precision (intype); + unsigned int outprec = element_precision (type); /* An INTEGER_TYPE cannot be incomplete, but an ENUMERAL_TYPE can be. Consider `enum E = { a, b = (enum E) 3 };'. */ if (!COMPLETE_TYPE_P (type)) { error (conversion to incomplete type); return error_mark_node; } /* Convert e.g. (long)round(d) - lround(d). */ Index: gcc/tree.c === --- gcc/tree.c (revision 198685) +++ gcc/tree.c (working copy) @@ -1615,20 +1615,59 @@ build_one_cst (tree type) case COMPLEX_TYPE: return build_complex (type, build_one_cst (TREE_TYPE (type)), build_zero_cst (TREE_TYPE (type))); default: gcc_unreachable (); } } +/* Return a constant of arithmetic type TYPE which is the + opposite of the multiplicative identity of the set TYPE. */ + +tree +build_minus_one_cst (tree type) +{ + switch (TREE_CODE (type)) +{ +case INTEGER_TYPE: case ENUMERAL_TYPE: case BOOLEAN_TYPE: +case POINTER_TYPE: case REFERENCE_TYPE: +case OFFSET_TYPE: + return build_int_cst (type, -1); + +case REAL_TYPE: + return build_real (type, dconstm1); + +case FIXED_POINT_TYPE: + /* We can only generate 1 for accum types. */ + gcc_assert (ALL_SCALAR_ACCUM_MODE_P (TYPE_MODE (type))); + return build_fixed (type, fixed_from_double_int (double_int_minus_one, + TYPE_MODE (type))); + +case VECTOR_TYPE: + { + tree scalar = build_minus_one_cst (TREE_TYPE (type)); + + return build_vector_from_val (type, scalar); + } + +case COMPLEX_TYPE: + return build_complex (type, + build_minus_one_cst (TREE_TYPE (type)), + build_zero_cst (TREE_TYPE (type))); + +default: + gcc_unreachable (); +} +} + /* Build 0 constant of type TYPE. This is used by constructor folding and thus the constant should be represented in memory by zero(es). */ tree build_zero_cst (tree type) { switch (TREE_CODE (type)) { case INTEGER_TYPE: case ENUMERAL_TYPE: case BOOLEAN_TYPE: @@ -6921,20 +6960,33 @@ compare_tree_int (const_tree t, unsigned bool valid_constant_size_p (const_tree size) { if (! host_integerp (size, 1) || TREE_OVERFLOW (size) ||
Re: More vector folding
On Fri, 10 May 2013, Richard Biener wrote: On Thu, May 9, 2013 at 12:55 AM, Marc Glisse marc.gli...@inria.fr wrote: Hello, here are a few more changes to fold-const.c so vectors can use the existing optimizations. Note that I made fold_truth_not_expr safe for use with vector BIT_NOT_EXPR. Passes bootstrap+testsuite on x86_64-linux-gnu. 2013-05-09 Marc Glisse marc.gli...@inria.fr gcc/ * fold-const.c (fold_negate_expr): Handle vectors. (fold_truth_not_expr): Likewise. (invert_truthvalue_loc): Likewise. (fold_single_bit_test): Likewise. (fold_comparison): Likewise. (fold_ternary_loc): Likewise. gcc/testsuite/ * g++.dg/ext/vector22.C: New testcase. -- Marc Glisse Index: gcc/testsuite/g++.dg/ext/vector22.C === --- gcc/testsuite/g++.dg/ext/vector22.C (revision 0) +++ gcc/testsuite/g++.dg/ext/vector22.C (revision 0) @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options -O -fdump-tree-gimple } */ + +typedef unsigned vec __attribute__((vector_size(4*sizeof(int; + +void f(vec*a,vec*b){ + *a=(*a)?-1:(*b10); + *b=(*b)?(*a10):0; +} +void g(vec*a,vec*b){ + *a=(*a)?(*a*a):-1; + *b=(*b)?-1:(*b*b); +} +void h(vec*a){ + *a=(~*a==5); +} + +/* { dg-final { scan-tree-dump-not ~ gimple } } */ +/* { dg-final { scan-tree-dump-not VEC_COND_EXPR gimple } } */ +/* { dg-final { cleanup-tree-dump gimple } } */ Property changes on: gcc/testsuite/g++.dg/ext/vector22.C ___ Added: svn:eol-style + native Added: svn:keywords + Author Date Id Revision URL Index: gcc/fold-const.c === --- gcc/fold-const.c(revision 198726) +++ gcc/fold-const.c(working copy) @@ -519,21 +519,21 @@ fold_negate_expr (location_t loc, tree t { tree type = TREE_TYPE (t); tree tem; switch (TREE_CODE (t)) { /* Convert - (~A) to A + 1. */ case BIT_NOT_EXPR: if (INTEGRAL_TYPE_P (type)) return fold_build2_loc (loc, PLUS_EXPR, type, TREE_OPERAND (t, 0), -build_int_cst (type, 1)); +build_one_cst (type)); break; case INTEGER_CST: tem = fold_negate_const (t, type); if (TREE_OVERFLOW (tem) == TREE_OVERFLOW (t) || !TYPE_OVERFLOW_TRAPS (type)) return tem; break; case REAL_CST: @@ -3110,20 +3110,23 @@ fold_truth_not_expr (location_t loc, tre return build2_loc (loc, code, type, TREE_OPERAND (arg, 0), TREE_OPERAND (arg, 1)); } switch (code) { case INTEGER_CST: return constant_boolean_node (integer_zerop (arg), type); +case VECTOR_CST: + return fold_unary_loc (loc, BIT_NOT_EXPR, type, arg); + case TRUTH_AND_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc); return build2_loc (loc, TRUTH_OR_EXPR, type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0)), invert_truthvalue_loc (loc2, TREE_OPERAND (arg, 1))); case TRUTH_OR_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc); @@ -3195,26 +3198,28 @@ fold_truth_not_expr (location_t loc, tre return build1_loc (loc, TRUTH_NOT_EXPR, type, arg); /* ... fall through ... */ case FLOAT_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); return build1_loc (loc, TREE_CODE (arg), type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0))); case BIT_AND_EXPR: - if (!integer_onep (TREE_OPERAND (arg, 1))) + if (VECTOR_TYPE_P (type) || !integer_onep (TREE_OPERAND (arg, 1))) return NULL_TREE; return build2_loc (loc, EQ_EXPR, type, arg, build_int_cst (type, 0)); case SAVE_EXPR: - return build1_loc (loc, TRUTH_NOT_EXPR, type, arg); + return build1_loc (loc, VECTOR_TYPE_P (type) ? BIT_NOT_EXPR + : TRUTH_NOT_EXPR, +type, arg); case CLEANUP_POINT_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); return build1_loc (loc, CLEANUP_POINT_EXPR, type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0))); default: return NULL_TREE; } } @@ -3222,28 +3227,28 @@ fold_truth_not_expr (location_t loc, tre /* Return a simplified tree node for the truth-negation of ARG. This never alters ARG itself. We assume that ARG is an operation that returns a truth value (0 or 1). FIXME: one would think we would fold the result, but it causes problems with the dominator optimizer. */ tree invert_truthvalue_loc (location_t loc, tree arg) { - tree tem; - if (TREE_CODE (arg) ==
Re: [RFA/ARM/4.7] Fix PR54974: Thumb literal pools don't handle PC rounding
OK for 4.7? Ok - I did say ok if no fallout in my original review :) regards Ramana Thanks, Matt 2013-01-04 Matthew Gretton-Dann matthew.gretton-d...@linaro.org Backport from mainline. 2012-11-29 Matthew Gretton-Dann matthew.gretton-d...@linaro.org PR target/54974 * config/arm/arm.md (thumb2_pool_range, pool_range): Add comment on Thumb pool ranges. (thumb1_extendhisi2): Reduce Thumb pool range. (arm_movdi): Likewise. (thumb1_movdi_insn): Likewise. (thumb1_movsi_insn): Likewise. (pic_load_addr_unified): Likewise. (pic_load_addr_32bit): Likewise. (pic_load_addr_thumb1): Likewise. (thumb1_movhf): Likewise. (arm_movsf_soft_insn): Likewise. (thumb1_movsf_soft_insn): Likewise. (movdf_soft_insn): Likewise. (thumb1_movdf_soft_insn): Likewise. * config/arm/neon.md (*neon_movmode): Likewise. (*neon_movmode): Likwise. * config/arm/thumb2.md: (*thumb2_movsi_insn): Likewise. (*thumb2_movhi_insn): Likewise. (*thumb2_extendqisi_v6): Likewise. (*thumb2_zero_extendqisi_v6): Likewise. (*thumb2_zero_extendqisi2_v6): Likewise. * config/arm/vfp.md: (*thumb2_movsi_vfp): Likewise. (*movdi_vfp): Likewise. (*movdi_vfp_cortexa8): Likewise. (*thumb2_movsf_vfp): Likewise. (*thumb2_movdf_vfp): Likewise.
[Patch, Fortran] PR57217 - re-add type checks for TBP overriding
GCC 4.7 added some additional checks for type-bound procedure overriding. However, doing so it weakened the check whether the nonpass argument has the same type. While for normal arguments, passing the parent type to an extended type is fine, for overriding the type (of nonpass arguments) must be exactly the same as in the original type. The attached patch re-adds this check. Build and regtested on x86-64-gnu-linux. OK for the trunk and the 4.7/4.8 branches? Tobias 2013-05-10 Tobias Burnus bur...@net-b.de PR fortran/57217 * interface.c (gfc_check_typebound_override): Check whether nonpass types are identical same. 2013-05-10 Tobias Burnus bur...@net-b.de PR fortran/57217 * gfortran.dg/typebound_override_4.f90: New. * gfortran.dg/typebound_proc_6.f03: Update dg-error. diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c index 1b967fa..8f22e4c 100644 --- a/gcc/fortran/interface.c +++ b/gcc/fortran/interface.c @@ -4128,6 +4128,17 @@ gfc_check_typebound_override (gfc_symtree* proc, gfc_symtree* old) } check_type = proc_pass_arg != argpos old_pass_arg != argpos; + if (check_type + (!gfc_compare_types (proc_formal-sym-ts, old_formal-sym-ts) + || !gfc_compare_types (old_formal-sym-ts, + proc_formal-sym-ts))) + { + gfc_error (Argument type mismatch for the overriding procedure + '%s' at %L: %s shall be %s, proc-name, where, + gfc_typename (proc_formal-sym-ts), + gfc_typename (old_formal-sym-ts)); + return false; + } if (!check_dummy_characteristics (proc_formal-sym, old_formal-sym, check_type, err, sizeof(err))) { diff --git a/gcc/testsuite/gfortran.dg/typebound_override_4.f90 b/gcc/testsuite/gfortran.dg/typebound_override_4.f90 new file mode 100644 index 000..e6f9805 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/typebound_override_4.f90 @@ -0,0 +1,36 @@ +! { dg-do compile } +! +! PR fortran/57217 +! +! Contributed Salvatore Filippone +! +module base_mod + type base_type +integer :: kind + contains +procedure, pass(map) :: clone= base_clone + end type base_type +contains + subroutine base_clone(map,mapout,info) +implicit none +class(base_type), intent(inout) :: map +class(base_type), intent(inout) :: mapout +integer :: info + end subroutine base_clone +end module base_mod + +module r_mod + use base_mod + type, extends(base_type) :: r_type +real :: dat + contains +procedure, pass(map) :: clone= r_clone ! { dg-error Argument type mismatch for the overriding procedure 'clone' at .1.: CLASS.r_type. shall be CLASS.base_type. } + end type r_type +contains + subroutine r_clone(map,mapout,info) +implicit none +class(r_type), intent(inout) :: map +class(r_type), intent(inout) :: mapout +integer :: info + end subroutine r_clone +end module r_mod diff --git a/gcc/testsuite/gfortran.dg/typebound_proc_6.f03 b/gcc/testsuite/gfortran.dg/typebound_proc_6.f03 index 3a32cbc..1fe2580 100644 --- a/gcc/testsuite/gfortran.dg/typebound_proc_6.f03 +++ b/gcc/testsuite/gfortran.dg/typebound_proc_6.f03 @@ -89,7 +89,7 @@ MODULE testmod ! For corresponding dummy arguments. PROCEDURE, PASS :: corresp1 = proc_tmeint ! Ok. PROCEDURE, PASS :: corresp2 = proc_tmeintx ! { dg-error should be named 'a' } -PROCEDURE, PASS :: corresp3 = proc_tmereal ! { dg-error Type/rank mismatch in argument 'a' } +PROCEDURE, PASS :: corresp3 = proc_tmereal ! { dg-error Argument type mismatch for the overriding procedure 'corresp3' at .1.: REAL.4. shall be INTEGER.4. } END TYPE t
Re: [PATCH] PR32219, weak hidden reference segfault
Chung-Lin Tang clt...@codesourcery.com writes: +case UNSPEC: + /* Reach for a contained symbol. */ + return nonzero_address_p (XVECEXP (x, 0, 0)); I don't think this is safe. UNSPECs really are unspecified :-), so we can't assume that (unspec X) is nonzero simply because X is. Richard
[MIPS, committed] Use microMIPS-style constraints for MIPS16
This patch makes the MIPS16 length checks use the style of predicates and constraints that were added for microMIPS. Some constraints and predicates are shared between the two ISA encodings, while others are only useful for one or the other. This is supposed to be a 1:1 change. I haven't tried to catch more cases, since that should be a separate patch. For now I've kept using extended_mips16 as the attribute of choice. We could look at extending the compression attribute to MIPS16 (as Catherine's patch originally did), but that's likewise a separate patch. Tested on mips-sde-elf (with micromips multilibs locally patched out). Applied. Richard gcc/ * config/mips/mips-protos.h (m16_uimm3_b, m16_simm4_1, m16_nsimm4_1) (m16_simm5_1, m16_nsimm5_1, m16_uimm5_4, m16_nuimm5_4, m16_simm8_1) (m16_nsimm8_1, m16_uimm8_1, m16_nuimm8_1, m16_uimm8_m1_1, m16_uimm8_4) (m16_nuimm8_4, m16_simm8_8, m16_nsimm8_8): Delete. * config/mips/mips.c (m16_check_op, m16_uimm3_b, m16_simm4_1) (m16_nsimm4_1, m16_simm5_1, m16_nsimm5_1, m16_uimm5_4, m16_nuimm5_4) (m16_simm8_1, m16_nsimm8_1, m16_uimm8_1, m16_nuimm8_1, m16_uimm8_m1_1) (m16_uimm8_4, m16_nuimm8_4, m16_simm8_8, m16_nsimm8_8): Delete. * config/mips/constraints.md (Udb8, Usb5, Usb8, Usd8, Uub8, Uuw5) (Uuw8): New constraints. (Usb4): Move into alphabetical order. * config/mips/predicates.md (db8_operand, sb5_operand, sb8_operand) (sd8_operand, ub8_operand, uw8_operand): New predicates. * config/mips/mips.md (*xormode3, *xormode3_mips16): Name previously unnamed patterns. (*addmode3_mips16, *xormode3_mips16, *optabsi3_mips16) (*ashldi3_mips16, *ashrdi3_mips16, *lshrdi3_mips16) (*sltu_GPR:modeGPR2:mode_mips16) (*sleu_GPR:modeGPR2:mode_mips16): Use constraints instead of set_attr_alternative/if_then_else. Use extended_mips16 instead of specific lengths. Index: gcc/config/mips/mips-protos.h === --- gcc/config/mips/mips-protos.h 2013-05-05 10:07:31.808505873 +0100 +++ gcc/config/mips/mips-protos.h 2013-05-05 10:08:54.009030051 +0100 @@ -211,23 +211,6 @@ extern rtx mips_strip_unspec_address (rt extern void mips_move_integer (rtx, rtx, unsigned HOST_WIDE_INT); extern bool mips_legitimize_move (enum machine_mode, rtx, rtx); -extern int m16_uimm3_b (rtx, enum machine_mode); -extern int m16_simm4_1 (rtx, enum machine_mode); -extern int m16_nsimm4_1 (rtx, enum machine_mode); -extern int m16_simm5_1 (rtx, enum machine_mode); -extern int m16_nsimm5_1 (rtx, enum machine_mode); -extern int m16_uimm5_4 (rtx, enum machine_mode); -extern int m16_nuimm5_4 (rtx, enum machine_mode); -extern int m16_simm8_1 (rtx, enum machine_mode); -extern int m16_nsimm8_1 (rtx, enum machine_mode); -extern int m16_uimm8_1 (rtx, enum machine_mode); -extern int m16_nuimm8_1 (rtx, enum machine_mode); -extern int m16_uimm8_m1_1 (rtx, enum machine_mode); -extern int m16_uimm8_4 (rtx, enum machine_mode); -extern int m16_nuimm8_4 (rtx, enum machine_mode); -extern int m16_simm8_8 (rtx, enum machine_mode); -extern int m16_nsimm8_8 (rtx, enum machine_mode); - extern rtx mips_subword (rtx, bool); extern bool mips_split_move_p (rtx, rtx, enum mips_split_type); extern void mips_split_move (rtx, rtx, enum mips_split_type); Index: gcc/config/mips/mips.c === --- gcc/config/mips/mips.c 2013-05-05 10:08:50.629997872 +0100 +++ gcc/config/mips/mips.c 2013-05-05 10:08:54.015030107 +0100 @@ -3357,113 +3357,6 @@ mips_rewrite_small_data (rtx pattern) return pattern; } -/* We need a lot of little routines to check the range of MIPS16 immediate - operands. */ - -static int -m16_check_op (rtx op, int low, int high, int mask) -{ - return (CONST_INT_P (op) - IN_RANGE (INTVAL (op), low, high) - (INTVAL (op) mask) == 0); -} - -int -m16_uimm3_b (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED) -{ - return m16_check_op (op, 0x1, 0x8, 0); -} - -int -m16_simm4_1 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED) -{ - return m16_check_op (op, -0x8, 0x7, 0); -} - -int -m16_nsimm4_1 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED) -{ - return m16_check_op (op, -0x7, 0x8, 0); -} - -int -m16_simm5_1 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED) -{ - return m16_check_op (op, -0x10, 0xf, 0); -} - -int -m16_nsimm5_1 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED) -{ - return m16_check_op (op, -0xf, 0x10, 0); -} - -int -m16_uimm5_4 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED) -{ - return m16_check_op (op, -0x10 2, 0xf 2, 3); -} - -int -m16_nuimm5_4 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED) -{ - return m16_check_op (op, -0xf 2, 0x10 2, 3); -} - -int -m16_simm8_1 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED) -{ - return m16_check_op (op, -0x80, 0x7f, 0); -} - -int
[Patch:RL78] Fix hardware multiply on G13 target
Hi, The following patch fixes an issue with the inline hardware multiply used in the RL78 G13 target. The G13 target does not have a multiply instruction, but achieves this using set of data and control registers. This patch adds an additional 'nop' after control and data registers are loaded in the multiply-accumulation operation. Currently, there is only 1 'nop'. As per the data sheet, regular multiply operation: 1 nop multiply-accumulate operation: 2 nops Section 17.4.2 of G13 hardware manual: After 6, the multiply-accumulation operation finishes in one additional clock cycle. (There is a wait of at least two clock cycles after specifying the initial settings is finished (5).) Note: The multiplication fails on hardware (for certain testcases), but always works as expected using rl78-elf-run simulator. I do not think the simulator has any cycle check for nop's. Please let me know if this is OK to commit. Regards, Kaushik 2013-05-10 Kaushik Phatak kaushik.pha...@kpitcummins.com * config/rl78/rl78.md (mulsi3_g13): Add additional 'nop' required in multiply-accumulate mode diff -uprN /home/kpit/fsfsrc/gcc-4.8.0-20121219/gcc/config/rl78/rl78.md gcc/config/rl78/rl78.md --- /home/kpit/fsfsrc/gcc-4.8.0-20121219/gcc/config/rl78/rl78.md2013-01-25 16:26:27.0 +0530 +++ gcc/config/rl78/rl78.md 2013-05-10 16:57:26.0 +0530 @@ -381,6 +381,7 @@ movwax, %h2 movw0x2, ax ; MDAH nop ; mdc += mdal * mdah + nop ; Additional nop for MAC mov a, #0x40 mov !0xf00e8, a ; MDUC @@ -389,6 +390,7 @@ movwax, %H2 movw0x2, ax ; MDAH nop ; mdc += mdal * mdah + nop ; Additional nop for MAC movwax, !0xf00e0; MDCL movw%H0, ax ; end of mulsi macro
Re: [PATCH, PR 57084] Create real cgraph node during late intraprocedural devirtualization
On Thu, May 09, 2013 at 03:52:20PM +0200, Martin Jambor wrote: On Tue, May 07, 2013 at 11:43:33PM +0200, Jan Hubicka wrote: Hi, the problem in PR 57084 is that late PRE devirtualization creates a direct call to a decl fro which we only have an inlined call graph mode in the given partition. I tried to find a most universal place where to fix it because this problem is not special to type-based devirtualization and in theory can be caused by any call to a decl that is grabbed from a constructor. I think the best place is the following one-liner, because all such decls should go through canonicalize_constructor_val. Bootstrapped and tested on x86_64-linux, fixes the testcase (at -m32) and I have happened to also LTO build Mozilla Firefox with it. OK for trunk? Is this change needed for 4.8, too? It would be OK there. Although the testcase shows the problem relies on code that is new in 4.9, I think that yes, even the current devirtualization by looking into constructors can hit the bug. So I am going to commit it to 4.8 after testing on that branch. Except that there is no such thing as cgraph_get_create_real_symbol_node in 4.8. I suppose it's not worth backporting it without a bug triggering there. However, this makes me Yep, it was same situation with the PR fix I introduced cgraph_get_create_real_symbol_node for. In theory the bug exists for 4.8, too, but I do not have testcase. Lets go with bugfixes first and then I can do the renaming. Honza
Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
On 05/07/13 23:13, Teresa Johnson wrote: -- Revised patch that fixes failures encountered when enabling -freorder-blocks-and-partition, including the failure reported in PR 53743. This includes new verification code to ensure no cold blocks dominate hot blocks contributed by Steven Bosscher. Seems like a reasonable verification; presumably if we have a cold block dominating a hot block, then the block/edge frequencies are badly broken. Ah, just saw the comments for the other case where this happens. cold entry, but hot loop inside pushing over the barrier. Arguably given a cold block in the dominator graph, all its Yep, also note that sanity checking anything about frequencies is really hard. There are very many places in compiler that necesarilly need to invalidate frequencies in weird ways (at least short of rebuilding the whole profile from probabilities again). I can't really comment on the cfglayout and related stuff -- it was added at a time when I wasn't doing much with GCC and thus I don't know much about it. I think I can take a look at the cfglayout stuff. Splitting the patch would be great. Honza However, I like the changes to record if we've done partitioning and checking those instead of flag_reorder_blocks_and_partition. That's simple enough that I'd support pulling it out as a separate patch and installing immediately if that can be done so without major headaches. I think we could do something similar with the code to verify the idom of a hot block is also hot. Though looking at the implementation I wonder if it could be simplified by walking the dominator tree? I can't look at it real closely tonight though. Could you pull those two logical hunks of work out into individual patches. jeff
Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
On Fri, May 10, 2013 at 12:57 AM, Xinliang David Li wrote: On Thu, May 9, 2013 at 3:40 PM, Steven Bosscher wrote: This patch mixes up things badly from the point of what-depends-on-what, the whole approach looks wrong to me. Do you mean the 'source file dependency' or 'logical dependency'? If the former, the code can be easily refactored to remove the dependency. I don't see how the latter can be avoided as bb-partition etc does change cfg states and leads to different actions in cfg layout finalize. I mean logical dependency. cfglayout is just a representation of the CFG, and bb-partition is a code transformation. By making fixup_reorder_chain emit the note, you' ve put part of the transformation into out-of-cfglayout which is just bogus. You also don't put GCSE or loop unrolling in out-of-cfglayout, and this change is IMHO in the same category: mixing transformations into internal representations. That may be a short-term fix but it is a long-term maintenance/cleanups nightmare. Although I've only skimmed the patch, I have noted several issues with it: * 3 different changes put into a single patch: the crtl-has_bb_partition change (which looks good to me), the verification stuff, and various fixes. The patch should be submitted in 3 parts to make/testing review easier. * Emitting barriers in cfglayout mode. That's non-sense. * Fixup redirected edges that did not cross partitions before but apparently do after redirection. This is not supposed to happen in the first place, so fixing up any of this just papers over an error elsewhere in the compiler. * The fixup_reorder_chain changes I've mentioned above. Ciao! Steven
Re: RFC patch for fold-const.c's fold_indirect_ref_1 / avoid type-checking ICE with TYPE_MIN_VALUE vs. offset signess mismatches
On Fri, May 10, 2013 at 10:44 AM, Tobias Burnus tobias.bur...@physik.fu-berlin.de wrote: I run into the following issue on the Fortran-Dev branch. I get an ICE due to a tree-checking assert for fold-const.c's fold_indirect_ref_1, which has tree min_val = size_zero_node; if (type_domain TYPE_MIN_VALUE (type_domain)) min_val = TYPE_MIN_VALUE (type_domain); op01 = size_binop_loc (loc, EXACT_DIV_EXPR, op01, TYPE_SIZE_UNIT (type)); op01 = size_binop_loc (loc, PLUS_EXPR, op01, min_val); The problem is that op01 is unsigned (size_type_node) while min_val alias TYPE_MIN_VALUE is signed (gfc_array_index_type which is PTRDIFF_TYPE). TYPE_MIN_VALUE gets set via tree.c's build_range_type_1, which has: TYPE_MIN_VALUE (itype) = fold_convert (type, lowval); TYPE_MAX_VALUE (itype) = highval ? fold_convert (type, highval) : NULL; And op01 becomes a size_type_node via tree.h's fold_build_pointer_plus_loc (location_t loc, tree ptr, tree off) { return fold_build2_loc (loc, POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, fold_convert_loc (loc, sizetype, off)); I am not completely sure what's the best way to solve that, but the attached patch avoids the ICE for me. I did an all-all language bootstrap (all,ada,go,objc,obj-c++) and had no new failures with regresting. OK? Or do you have a better idea how to solve this? I think the middle-end expects unsigned sizetype in many places, so this isn't the only point where you may hit such an issue. The code has the related issue that it happily does an unsigned division on negative offsets because it fails to interpret the POINTER_PLUS_EXPR offset as signed Note that those foldings are highly suspicious (I've removed most similar foldings from elsewhere) because they are susceptible to generate out-of-bound array accesses for multi-dimensional arrays. This one seems to be bogus this way, too. That said, the function needs some major TLC :/ (I'd simply try removing that array case) Richard. Tobias
Re: More vector folding
On Fri, May 10, 2013 at 12:03 PM, Marc Glisse marc.gli...@inria.fr wrote: On Fri, 10 May 2013, Richard Biener wrote: On Thu, May 9, 2013 at 12:55 AM, Marc Glisse marc.gli...@inria.fr wrote: Hello, here are a few more changes to fold-const.c so vectors can use the existing optimizations. Note that I made fold_truth_not_expr safe for use with vector BIT_NOT_EXPR. Passes bootstrap+testsuite on x86_64-linux-gnu. 2013-05-09 Marc Glisse marc.gli...@inria.fr gcc/ * fold-const.c (fold_negate_expr): Handle vectors. (fold_truth_not_expr): Likewise. (invert_truthvalue_loc): Likewise. (fold_single_bit_test): Likewise. (fold_comparison): Likewise. (fold_ternary_loc): Likewise. gcc/testsuite/ * g++.dg/ext/vector22.C: New testcase. -- Marc Glisse Index: gcc/testsuite/g++.dg/ext/vector22.C === --- gcc/testsuite/g++.dg/ext/vector22.C (revision 0) +++ gcc/testsuite/g++.dg/ext/vector22.C (revision 0) @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options -O -fdump-tree-gimple } */ + +typedef unsigned vec __attribute__((vector_size(4*sizeof(int; + +void f(vec*a,vec*b){ + *a=(*a)?-1:(*b10); + *b=(*b)?(*a10):0; +} +void g(vec*a,vec*b){ + *a=(*a)?(*a*a):-1; + *b=(*b)?-1:(*b*b); +} +void h(vec*a){ + *a=(~*a==5); +} + +/* { dg-final { scan-tree-dump-not ~ gimple } } */ +/* { dg-final { scan-tree-dump-not VEC_COND_EXPR gimple } } */ +/* { dg-final { cleanup-tree-dump gimple } } */ Property changes on: gcc/testsuite/g++.dg/ext/vector22.C ___ Added: svn:eol-style + native Added: svn:keywords + Author Date Id Revision URL Index: gcc/fold-const.c === --- gcc/fold-const.c(revision 198726) +++ gcc/fold-const.c(working copy) @@ -519,21 +519,21 @@ fold_negate_expr (location_t loc, tree t { tree type = TREE_TYPE (t); tree tem; switch (TREE_CODE (t)) { /* Convert - (~A) to A + 1. */ case BIT_NOT_EXPR: if (INTEGRAL_TYPE_P (type)) return fold_build2_loc (loc, PLUS_EXPR, type, TREE_OPERAND (t, 0), -build_int_cst (type, 1)); +build_one_cst (type)); break; case INTEGER_CST: tem = fold_negate_const (t, type); if (TREE_OVERFLOW (tem) == TREE_OVERFLOW (t) || !TYPE_OVERFLOW_TRAPS (type)) return tem; break; case REAL_CST: @@ -3110,20 +3110,23 @@ fold_truth_not_expr (location_t loc, tre return build2_loc (loc, code, type, TREE_OPERAND (arg, 0), TREE_OPERAND (arg, 1)); } switch (code) { case INTEGER_CST: return constant_boolean_node (integer_zerop (arg), type); +case VECTOR_CST: + return fold_unary_loc (loc, BIT_NOT_EXPR, type, arg); + case TRUTH_AND_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc); return build2_loc (loc, TRUTH_OR_EXPR, type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0)), invert_truthvalue_loc (loc2, TREE_OPERAND (arg, 1))); case TRUTH_OR_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc); @@ -3195,26 +3198,28 @@ fold_truth_not_expr (location_t loc, tre return build1_loc (loc, TRUTH_NOT_EXPR, type, arg); /* ... fall through ... */ case FLOAT_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); return build1_loc (loc, TREE_CODE (arg), type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0))); case BIT_AND_EXPR: - if (!integer_onep (TREE_OPERAND (arg, 1))) + if (VECTOR_TYPE_P (type) || !integer_onep (TREE_OPERAND (arg, 1))) return NULL_TREE; return build2_loc (loc, EQ_EXPR, type, arg, build_int_cst (type, 0)); case SAVE_EXPR: - return build1_loc (loc, TRUTH_NOT_EXPR, type, arg); + return build1_loc (loc, VECTOR_TYPE_P (type) ? BIT_NOT_EXPR + : TRUTH_NOT_EXPR, +type, arg); case CLEANUP_POINT_EXPR: loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc); return build1_loc (loc, CLEANUP_POINT_EXPR, type, invert_truthvalue_loc (loc1, TREE_OPERAND (arg, 0))); default: return NULL_TREE; } } @@ -3222,28 +3227,28 @@ fold_truth_not_expr (location_t loc, tre /* Return a simplified tree node for the truth-negation of ARG. This never alters ARG itself. We assume that ARG is an operation that returns a truth value (0 or 1). FIXME:
Re: [PATCH] Dynamic dispatch of multiversioned functions and CPU mocks for code coverage.
On Thu, 9 May 2013, Sriraman Tallam wrote: Then, I plan to add the following hooks to libgcc (in a different patch) : int set_mock_cpu_is (const char *cpu); int set_mock_cpu_supports (const char *isa); int init_mock_cpu (); // Clear the values of the mock cpu. Those names are in the user's namespace; I think libgcc should only provide or use symbols in the implementation namespace. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Improve rotation by mode bitsize - 1
Hi! This is something I've noticed while working on the rotate recognizer patch I've just posted. We emit say roll %eax instead of roll $1, %eax because the former is shorter, but emit roll $31, %eax instead of the equivalent, but shorter rorl %eax The following patch let us optimize even those. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-05-09 Jakub Jelinek ja...@redhat.com * config/i386/i386.md (rotateinv): New code attr. (*rotate_insnmode3_1, *rotate_insnsi3_1_zext, *rotate_insnqi3_1_slp): Emit rorl %eax instead of roll $31, %eax, etc. Going this route you will also need to update [(set_attr type rotate1) (set (attr length_immediate) (if_then_else (and (match_operand 1 const1_operand) (ior (match_test TARGET_SHIFT1) (match_test optimize_function_for_size_p (cfun (const_string 0) (const_string *))) (set_attr mode QI)]) computing the immediate size. Why we can't canonicalize this in folding/simplify_rtx/combiner? I do not see much benefit allowing both forms of instructions in our insn streams... Honza
Re: [PATCH] Improve rotation by mode bitsize - 1
Hi! Note, the patch has been already committed. On Fri, May 10, 2013 at 03:48:50PM +0200, Jan Hubicka wrote: Going this route you will also need to update [(set_attr type rotate1) (set (attr length_immediate) (if_then_else (and (match_operand 1 const1_operand) (ior (match_test TARGET_SHIFT1) (match_test optimize_function_for_size_p (cfun (const_string 0) (const_string *))) (set_attr mode QI)]) computing the immediate size. Why we can't canonicalize this in folding/simplify_rtx/combiner? I do not see much benefit allowing both forms of instructions in our insn streams... Both directions of rotation make sense for variable length rotation, and why would the generic simplify-rtx.c code want to prefer one rotate over the other direction for constant rotation counts? That is clearly target specific preference. What we could do instead of the patch I've committed just tweak expansion of the patterns instead, assuming it is unlikely RTL optimizations materialize a constant rotation out of non-constant rotations from expansion time. Or the length_immediate attributes could be adjusted, shouldn't be too hard (just replace that (match_operand 1 const1_operand) in there with (and (match_operand 1 const_int_operand) (ior (match_operand 1 const1_operand) (match_test INTVAL (operands[1]) == GET_MODE_BITSIZE (MODEmode) - 1))) or similar (untested). Jakub
Re: [PATCH] Test case for PR55033
2013/5/10 Sebastian Huber sebastian.hu...@embedded-brains.de: gcc/testsuite/ChangeLog 2013-05-10 Sebastian Huber sebastian.hu...@embedded-brains.de PR target/55033 * gcc.target/powerpc/pr55033.c: New. --- gcc/testsuite/gcc.target/powerpc/pr55033.c | 21 + 1 files changed, 21 insertions(+), 0 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr55033.c diff --git a/gcc/testsuite/gcc.target/powerpc/pr55033.c b/gcc/testsuite/gcc.target/powerpc/pr55033.c new file mode 100644 index 000..cd8abcb --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr55033.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target powerpc_eabi_ok } */ +/* { dg-options -mcpu=8540 -msoft-float -msdata=eabi -G 8 -fno-common } */ + +void f(void); ^ I think there is missing space. :p + +struct s { + int *p; + int *q; +}; + +extern int a; + +extern const struct s c; + +const struct s c = { a, 0 }; + +void f(void) ^ Likewise. :) +{ + char buf[4] = { 0, 1, 2, 3 }; +} -- 1.7.7 Best regards, jasonwucj
[v3] Fix libstdc++/54577
Hi, this is the issue about the signatures of the erase member functions of the sequence containers. Mostly rather straightfoward stuff within the limits of the current infrastructure: the various _M_const_case are normally simple enough, I only mention the rather ugly std::vector one, required otherwise an ext_pointer testcase fails: I suppose that handling these issues in a proper way will happen together with changing the vector::pointer typedef to the conforming: typedef typename _Alloc_traits::pointer pointer; and I suppose Jonathan has the issue under control. Something similar is true for the other containers. Tested x86_64-linux. Thanks, Paolo. 2013-05-10 Paolo Carlini paolo.carl...@oracle.com PR libstdc++/54577 * include/bits/stl_deque.h (erase): Fix signatures in C++11 mode. (_Deque_iterator::_M_const_cast): Add. (deque::_M_erase): Declare. * include/bits/deque.tcc: ... and define. * include/bits/stl_list.h (erase): Fix signatures in C++11 mode. (_List_const_iterator::_M_const_cast): Add. * include/bits/list.tcc (erase (iterator)): Fix. * include/bits/stl_iterator.h (__normal_iterator::_M_const_cast): Add; include ext/cast.h. * include/bits/stl_vector.h (erase): Fix signatures in C++11 mode. (vector::_M_erase): Declare. * include/bits/stl_bvector.h: Likewise. * include/bits/vector.tcc (vector::_M_erase): Define. * include/ext/vstring.h (erase): Fix signatures in C++11 mode. * include/debug/deque: Adjust. * include/debug/list: Likewise. * include/debug/vector: Likewise. * include/profile/deque: Likewise. * include/profile/list: Likewise. * include/profile/vector: Likewise. * testsuite/util/exception/safety.h (erase_basedeque, erase_basedeque, erase_basevector): Remove. (erase_base__versa_string): Update. * testsuite/ext/vstring/modifiers/char/54577.cc: New. * testsuite/ext/vstring/modifiers/wchar_t/54577.cc: Likewise. * testsuite/23_containers/deque/modifiers/erase/54577.cc: Likewise. * testsuite/23_containers/list/modifiers/erase/54577.cc: Likewise. * testsuite/23_containers/vector/bool/modifiers/erase/54577.cc: Likewise. * testsuite/23_containers/vector/modifiers/erase/54577.cc: Likewise. * testsuite/23_containers/deque/requirements/dr438/assign_neg.cc: Adjust dg-error line numbers. * testsuite/23_containers/deque/requirements/dr438/ constructor_1_neg.cc: Likewise. * testsuite/23_containers/deque/requirements/dr438/ constructor_2_neg.cc: Likewise. * testsuite/23_containers/deque/requirements/dr438/insert_neg.cc: Likewise. * testsuite/23_containers/list/requirements/dr438/assign_neg.cc: Likewise. * testsuite/23_containers/list/requirements/dr438/ constructor_1_neg.cc: Likewise. * testsuite/23_containers/list/requirements/dr438/ constructor_2_neg.cc: Likewise. * testsuite/23_containers/list/requirements/dr438/ insert_neg.cc: Likewise. * testsuite/23_containers/vector/requirements/dr438/assign_neg.cc: Likewise. * testsuite/23_containers/vector/requirements/dr438/ constructor_1_neg.cc: Likewise. * testsuite/23_containers/vector/requirements/dr438/ constructor_2_neg.cc: Likewise. * testsuite/23_containers/vector/requirements/dr438/insert_neg.cc: Likewise. Index: include/bits/deque.tcc === --- include/bits/deque.tcc (revision 198770) +++ include/bits/deque.tcc (working copy) @@ -191,7 +191,7 @@ template typename _Tp, typename _Alloc typename deque_Tp, _Alloc::iterator deque_Tp, _Alloc:: -erase(iterator __position) +_M_erase(iterator __position) { iterator __next = __position; ++__next; @@ -214,7 +214,7 @@ template typename _Tp, typename _Alloc typename deque_Tp, _Alloc::iterator deque_Tp, _Alloc:: -erase(iterator __first, iterator __last) +_M_erase(iterator __first, iterator __last) { if (__first == __last) return __first; Index: include/bits/list.tcc === --- include/bits/list.tcc (revision 198770) +++ include/bits/list.tcc (working copy) @@ -106,10 +106,18 @@ templatetypename _Tp, typename _Alloc typename list_Tp, _Alloc::iterator list_Tp, _Alloc:: +#if __cplusplus = 201103L +erase(const_iterator __position) +#else erase(iterator __position) +#endif { iterator __ret = iterator(__position._M_node-_M_next); +#if __cplusplus = 201103L + _M_erase(__position._M_const_cast()); +#else _M_erase(__position); +#endif return __ret; } Index:
C++ PATCH for c++/55149 (VLA lambda capture)
This bug talks about ICEs trying to capture a VLA by value. We don't support that, but we can give a more helpful error. In the comments Paolo also noticed that with the new support for capture by reference, we wrongly warn about returning a reference to the captured variable. Tested x86_64-pc-linux-gnu, applying to trunk. commit e88823b454079098df61d1701ac0c7b59a96b758 Author: Jason Merrill ja...@redhat.com Date: Thu May 9 22:15:22 2013 -0400 PR c++/55149 * semantics.c (add_capture): Error rather than abort on copy capture of VLA. * typeck.c (maybe_warn_about_returning_address_of_local): Don't warn about capture proxy. diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index 3e1a0bf..d0db10a 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -9463,9 +9463,12 @@ add_capture (tree lambda, tree id, tree initializer, bool by_reference_p, type = lambda_capture_field_type (initializer, explicit_init_p); if (array_of_runtime_bound_p (type)) { + if (!by_reference_p) + error (array of runtime bound cannot be captured by copy, + only by reference); + /* For a VLA, we capture the address of the first element and the maximum index, and then reconstruct the VLA for the proxy. */ - gcc_assert (by_reference_p); tree elt = cp_build_array_ref (input_location, initializer, integer_zero_node, tf_warning_or_error); tree ctype = vla_capture_type (type); diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c index df5fc4a..b8ea555 100644 --- a/gcc/cp/typeck.c +++ b/gcc/cp/typeck.c @@ -8140,6 +8140,7 @@ maybe_warn_about_returning_address_of_local (tree retval) if (DECL_P (whats_returned) DECL_NAME (whats_returned) DECL_FUNCTION_SCOPE_P (whats_returned) + !is_capture_proxy (whats_returned) !(TREE_STATIC (whats_returned) || TREE_PUBLIC (whats_returned))) { diff --git a/gcc/testsuite/g++.dg/cpp1y/vla5.C b/gcc/testsuite/g++.dg/cpp1y/vla5.C new file mode 100644 index 000..1f6da29 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1y/vla5.C @@ -0,0 +1,8 @@ +// PR c++/55149 +// { dg-options -std=c++1y } + +void test(int n) { + int r[n]; + [r]() { return r + 0; }; + [r]() { return r + 0; }; // { dg-error captured by copy } +}
C++ PATCH for c++/57047 (ICE with constexpr)
Oddly, after you assign to a variable, you can't use its old value anymore... Tested x86_64-pc-linux-gnu, applying to trunk, 4.8, 4.7. commit 16ee0ee2c495bc475e0ba015c6b9fd4707692196 Author: jason jason@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Fri May 10 14:17:37 2013 + PR c++/57047 * semantics.c (cxx_fold_indirect_ref): Fix thinko. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@198777 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index d0db10a..3e78887 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -7643,15 +7643,17 @@ cxx_fold_indirect_ref (location_t loc, tree type, tree op0, bool *empty_base) } } } - /* *(foo *)fooarrptreturn (*fooarrptr)[0] */ + /* *(foo *)fooarrptr = (*fooarrptr)[0] */ else if (TREE_CODE (TREE_TYPE (subtype)) == ARRAY_TYPE (same_type_ignoring_top_level_qualifiers_p (type, TREE_TYPE (TREE_TYPE (subtype) { tree type_domain; tree min_val = size_zero_node; - sub = cxx_fold_indirect_ref (loc, TREE_TYPE (subtype), sub, NULL); - if (!sub) + tree newsub = cxx_fold_indirect_ref (loc, TREE_TYPE (subtype), sub, NULL); + if (newsub) + sub = newsub; + else sub = build1_loc (loc, INDIRECT_REF, TREE_TYPE (subtype), sub); type_domain = TYPE_DOMAIN (TREE_TYPE (sub)); if (type_domain TYPE_MIN_VALUE (type_domain)) diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-array-ptr8.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-array-ptr8.C new file mode 100644 index 000..ee425ea --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-array-ptr8.C @@ -0,0 +1,54 @@ +// PR c++/57047 +// { dg-require-effective-target c++11 } + +template typename +struct A; +template typename T +struct A T +{ + typedef T type; +}; +template typename T +constexpr T foo (typename A T::type __t) noexcept +{ + return static_cast T (__t); +} +template class T1, class T2 +struct B +{ + T1 t1; + T2 t2; + template class U + constexpr B (U __x, const T2 __y) : t1 (foo U (__x)), t2 (__y) {} +}; +static inline constexpr bool +fn1 (const char c) +{ + return ('0' = c) (c = '9'); +} +static inline constexpr bool +fn2 (const char c) +{ + return (('A' = c) (c = 'Z')) || (('a' = c) (c = 'z')); +} +static constexpr bool +fn3 (const char *const x) +{ + return (x[1] == '\0' x[0] == ']') ? true : (!fn1 (x[0])) ? false : fn3 (x[1]); +} +static constexpr bool +fn4 (const char *const x) +{ + return (x[0] == '\0') ? fn3 (x[1]) : fn4 (x[1]); +} +static inline constexpr bool +fn5 (const char *const x) +{ + return fn2 (x[0]) ? fn4 (x) : false; +} +struct C final +{ + constexpr C (const char *const t1) : c (fn5 (t1) ? 199 : 69) {} + unsigned c; +}; +B C, C p (a, b);
[PATCH v2] Test case for PR55033
v2: Format changes gcc/testsuite/ChangeLog 2013-05-10 Sebastian Huber sebastian.hu...@embedded-brains.de PR target/55033 * gcc.target/powerpc/pr55033.c: New. --- gcc/testsuite/gcc.target/powerpc/pr55033.c | 23 +++ 1 files changed, 23 insertions(+), 0 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr55033.c diff --git a/gcc/testsuite/gcc.target/powerpc/pr55033.c b/gcc/testsuite/gcc.target/powerpc/pr55033.c new file mode 100644 index 000..2c1835e --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr55033.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target powerpc_eabi_ok } */ +/* { dg-options -mcpu=8540 -msoft-float -msdata=eabi -G 8 -fno-common } */ + +extern void f (void); + +struct s +{ + int *p; + int *q; +}; + +extern int a; + +extern const struct s c; + +const struct s c = { a, 0 }; + +void +f (void) +{ + char buf[4] = { 0, 1, 2, 3 }; +} -- 1.7.7
Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
On Thu, May 9, 2013 at 2:42 PM, Diego Novillo dnovi...@google.com wrote: On 2013-05-08 01:13 , Teresa Johnson wrote: Somehow Rietveld didn't upload the patch properly. I've attached the patch to this email instead. Here is the description: Rietveld has turned out to be far less useful that I had hoped. If you are running ubuntu precise, the upload script is having some bad interaction with the server, which makes it to constantly reject your password. I do not recommend using Rietveld anymore. I don't really have the cycles to invest in fixing the various usability warts we've found. Sorry. Thanks for the note. The main reason I have tried to keep using Rietveld is that it sends out the patch inline in the email with the formatting preserved. I have found that cut-n-paste into a gmail window messes up the spacing. Do you know of a good way to work around this issue? -static void +void emit_barrier_after_bb (basic_block bb) { rtx barrier = emit_barrier_after (BB_END (bb)); - BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); + if (current_ir_type () == IR_RTL_CFGLAYOUT) +BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); What if the current IR is not RTL? Should we fail here? It doesn't seem like it makes sense to call this from gimple, for instance. This is called only from bb-reorder and cfgrtl, so we should only be in IR_RTL, I can add an assert to this effect. More on this change when I respond to Steven's comments. + several different possibilities. One is that there are edge weight insanities + due to optimization phases that do not properly update basic block profile + counts. The second is that the entry of the function may not be hot, because + it is entered fewer times than the number of profile training runs, but there + is a loop inside the function that causes blocks within the function to be + above the threshold for hotness. */ + if (cold_bb_count) +{ + bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS); + Move this out into its own function? Will do. + if (dom_calculated_here) +calculate_dominance_info (CDI_DOMINATORS); + + /* Keep examining hot bbs until we have either checked them all, or + re-marked all cold bbs hot. */ + while (! bbs_in_hot_partition.is_empty () + cold_bb_count) +{ + basic_block dom_bb; + + bb = bbs_in_hot_partition.pop (); + dom_bb = get_immediate_dominator (CDI_DOMINATORS, bb); + + /* If bb's immediate dominator is also hot then it is ok. */ + if (BB_PARTITION (dom_bb) != BB_COLD_PARTITION) +continue; + + /* We have a hot bb with an immediate dominator that is cold. + The dominator needs to be re-marked to hot. */ s/to hot/hot/ ok. Actually, I think s/to hot/as hot/ might sound better. Index: cfgrtl.c === --- cfgrtl.c(revision 198686) +++ cfgrtl.c(working copy) @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3. If not see #include tree.h #include hard-reg-set.h #include basic-block.h +#include bb-reorder.h You may need to modify Makefile.in to declare this new dependency. +/* Called when edge E has been redirected to a new destination, + in order to update the region crossing flag on the edge and + jump. */ + +static void +fixup_partition_crossing (edge e, basic_block target) +{ + rtx note; + + gcc_assert (e-dest == target); Then, why not just take a single argument E? Good idea, will do. +fixup_bb_partition (basic_block bb) +{ + edge e; + edge_iterator ei; + + /* Now need to make bb's pred edges non-region crossing. */ This is hard to parse. Ok, how about: /* Ensure edges to bb reflect its new partition assignment with the appropriate region-crossing flag setting. */ + /* Delete any blocks that became unreachable and weren't + already cleaned up, for example during edge forwarding + and convert_jumps_to_returns. This will expose more + opportunities for fixing the partition boundaries here. + Also, the calculation of the dominance graph during verification + will assert if there are unreachable nodes. */ + delete_unreachable_blocks (); Why not just schedule a CFG cleanup as a prerequisite to this pass? Which pass? This is called right after we try to optimize the cfg during cleanup_cfg, which is invoked numerous places. try_optimize_cfg performs a number of cfg optimizations, some of which can create unreachable blocks. I found it was much easier to clean this up in one pass at the end rather that try to detect and fix this up incrementally. A minor formatting nit. References to locals and function arguments should be done in capitals. Ok, will clean this up. Thanks! Teresa Diego. -- Teresa Johnson |
C++ PATCH for c++/57196 (wrong error with initializer-list constructor)
When I added the check for instantiation-dependent expressions, I treated statement-expressions and some qualified-ids as instantiation-dependent even when they don't involve template parameters. In this PR, that causes problems because array initialization expands into a statement-expression via build_vec_init, and a sizeof-expression is not immediately folded, and then when we go to convert the sizeof to the appropriate type for the template non-type parameter, we call uses_template_parms, which pretends we're in a template, we look inside the sizeof and see the statement-expression, and decide that it's dependent. The solution is not to pretend that we're in a template when deciding whether a template argument is dependent. Tested x86_64-pc-linux-gnu, applying to trunk. If no problems turn up, I'll apply it to 4.8 later as well. commit 32bf8cd06a9d5e2ad4abd35ce3571457a071b27d Author: jason jason@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Fri May 10 14:17:45 2013 + PR c++/57196 * pt.c (convert_template_argument): Use dependent_template_arg_p, not uses_template_parms. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@198778 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 0747de6..8f88b10 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -6368,7 +6368,8 @@ convert_template_argument (tree parm, val = error_mark_node; } } - else if (!uses_template_parms (orig_arg) !uses_template_parms (t)) + else if (!dependent_template_arg_p (orig_arg) + !uses_template_parms (t)) /* We used to call digest_init here. However, digest_init will report errors, which we don't want when complain is zero. More importantly, digest_init will try too diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist70.C b/gcc/testsuite/g++.dg/cpp0x/initlist70.C new file mode 100644 index 000..f215b9d --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/initlist70.C @@ -0,0 +1,27 @@ +// PR c++/57196 +// { dg-require-effective-target c++11 } + +#include initializer_list + +templateclass T +struct set { + set() = default; + set(std::initializer_listT){} +}; + +struct string { + string(const char*){} + ~string(){} +}; + +typedef decltype(sizeof(0)) size_t; + +template size_t struct EqHelper { }; + +int IsNullLiteralHelper(...); + +void Fn() { + EqHelpersizeof IsNullLiteralHelper(setint{1})eq1; // ok + EqHelpersizeof IsNullLiteralHelper(setstring()) eq2; // ok + EqHelpersizeof IsNullLiteralHelper(setstring{foo}) eq3; // error +} diff --git a/gcc/testsuite/g++.dg/template/crash106.C b/gcc/testsuite/g++.dg/template/crash106.C index c2d117e..ebd47bc 100644 --- a/gcc/testsuite/g++.dg/template/crash106.C +++ b/gcc/testsuite/g++.dg/template/crash106.C @@ -10,3 +10,5 @@ struct A templateT N = 0, void (A::*)() = A::fooN struct B {}; // { dg-error type|declared } B b; // { dg-error type|declaration } + +// { dg-prune-output could not convert } diff --git a/gcc/testsuite/g++.dg/template/crash112.C b/gcc/testsuite/g++.dg/template/crash112.C index 919c887..ff35764 100644 --- a/gcc/testsuite/g++.dg/template/crash112.C +++ b/gcc/testsuite/g++.dg/template/crash112.C @@ -5,7 +5,7 @@ struct A templatetypename void foo() {} }; -templatevoid (A::*)() struct B {}; // { dg-error declaration } +templatevoid (A::*)() struct B {}; templateint struct C { @@ -13,3 +13,5 @@ templateint struct C }; C0 c; + +// { dg-prune-output could not convert } diff --git a/gcc/testsuite/g++.dg/template/dependent-args1.C b/gcc/testsuite/g++.dg/template/dependent-args1.C index 0b197cf..a540e55 100644 --- a/gcc/testsuite/g++.dg/template/dependent-args1.C +++ b/gcc/testsuite/g++.dg/template/dependent-args1.C @@ -9,3 +9,5 @@ struct A templateint N, void (A::*)() = A::fooN struct B {}; Bint b; // { dg-error type/value mismatch|expected a constant|invalid type } + +// { dg-prune-output could not convert }
[PATCH] Pattern recognizer for rotates
Hi! This patch allows us to vectorize attached testcases again, rot1.c and rot3.c e.g. with -O3 -mavx, rot2.c only with -O3 -mavx2. The problem with rot2.c is that the rotation count is there vect_external_def, but when we insert some stmts into the pattern def seq (the (-j) 31), we turn it into vec_internal_def, we have right now no easy way to say, if you want to vectorize this, just push these statements before the loop as scalar and use those as external def. Well, perhaps the pattern recognizer could in that case emit the stmts before the loop directly, and if vectorization won't be successful, just let it be DCEd again. Perhaps something to be handled as a follow-up. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Not sure about if something should go into the testsuite/, and in what form (just runtime testcases with verification, or vect_shift target tests? Note I couldn't find any target supports stuff that would check if vector by vector shifts are supported). 2013-05-10 Jakub Jelinek ja...@redhat.com * tree-vectorizer.h (NUM_PATTERNS): Increment. * tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add vect_recog_rotate_pattern. (vect_recog_rotate_pattern): New function. --- gcc/tree-vectorizer.h.jj2013-04-24 12:07:12.0 +0200 +++ gcc/tree-vectorizer.h 2013-05-10 13:38:14.658985201 +0200 @@ -1005,7 +1005,7 @@ extern void vect_slp_transform_bb (basic Additional pattern recognition functions can (and will) be added in the future. */ typedef gimple (* vect_recog_func_ptr) (vecgimple *, tree *, tree *); -#define NUM_PATTERNS 10 +#define NUM_PATTERNS 11 void vect_pattern_recog (loop_vec_info, bb_vec_info); /* In tree-vectorizer.c. */ --- gcc/tree-vect-patterns.c.jj 2013-01-11 09:03:01.0 +0100 +++ gcc/tree-vect-patterns.c2013-05-10 14:59:35.756019753 +0200 @@ -50,6 +50,7 @@ static gimple vect_recog_over_widening_p tree *); static gimple vect_recog_widen_shift_pattern (vecgimple *, tree *, tree *); +static gimple vect_recog_rotate_pattern (vecgimple *, tree *, tree *); static gimple vect_recog_vector_vector_shift_pattern (vecgimple *, tree *, tree *); static gimple vect_recog_divmod_pattern (vecgimple *, @@ -64,6 +65,7 @@ static vect_recog_func_ptr vect_vect_rec vect_recog_pow_pattern, vect_recog_widen_shift_pattern, vect_recog_over_widening_pattern, + vect_recog_rotate_pattern, vect_recog_vector_vector_shift_pattern, vect_recog_divmod_pattern, vect_recog_mixed_size_cond_pattern, @@ -1446,6 +1448,218 @@ vect_recog_widen_shift_pattern (vecgimp if (dump_enabled_p ()) dump_gimple_stmt_loc (MSG_NOTE, vect_location, TDF_SLIM, pattern_stmt, 0); + + stmts-safe_push (last_stmt); + return pattern_stmt; +} + +/* Detect a rotate pattern wouldn't be otherwise vectorized: + + type a_t, b_t, c_t; + + S0 a_t = b_t r c_t; + + Input/Output: + + * STMTS: Contains a stmt from which the pattern search begins, +i.e. the shift/rotate stmt. The original stmt (S0) is replaced +with a sequence: + + S1 d_t = -c_t; + S2 e_t = d_t (B - 1); + S3 f_t = b_t c_t; + S4 g_t = b_t e_t; + S0 a_t = f_t | g_t; + +where B is element bitsize of type. + + Output: + + * TYPE_IN: The type of the input arguments to the pattern. + + * TYPE_OUT: The type of the output of this pattern. + + * Return value: A new stmt that will be used to replace the rotate +S0 stmt. */ + +static gimple +vect_recog_rotate_pattern (vecgimple *stmts, tree *type_in, tree *type_out) +{ + gimple last_stmt = stmts-pop (); + tree oprnd0, oprnd1, lhs, var, var1, var2, vectype, type, stype, def, def2; + gimple pattern_stmt, def_stmt; + enum tree_code rhs_code; + stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt); + loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo); + bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_vinfo); + enum vect_def_type dt; + optab optab1, optab2; + + if (!is_gimple_assign (last_stmt)) +return NULL; + + rhs_code = gimple_assign_rhs_code (last_stmt); + switch (rhs_code) +{ +case LROTATE_EXPR: +case RROTATE_EXPR: + break; +default: + return NULL; +} + + if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo)) +return NULL; + + lhs = gimple_assign_lhs (last_stmt); + oprnd0 = gimple_assign_rhs1 (last_stmt); + type = TREE_TYPE (oprnd0); + oprnd1 = gimple_assign_rhs2 (last_stmt); + if (TREE_CODE (oprnd0) != SSA_NAME + || TYPE_PRECISION (TREE_TYPE (lhs)) != TYPE_PRECISION (type) + || !INTEGRAL_TYPE_P (type) + || !TYPE_UNSIGNED (type)) +return NULL; + + if (!vect_is_simple_use (oprnd1, last_stmt, loop_vinfo, bb_vinfo, def_stmt, + def, dt)) +return NULL; + + if (dt != vect_internal_def + dt !=
[PATCH] Fix handle_char_store in strlen pass (PR tree-optimization/57230)
Hi! I've apparently missed one spot, where store of zero value was assumed, but not actually verified. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.8/4.7? 2013-05-10 Jakub Jelinek ja...@redhat.com PR tree-optimization/57230 * tree-ssa-strlen.c (handle_char_store): Add missing integer_zerop check. * gcc.dg/strlenopt-23.c: New test. --- gcc/tree-ssa-strlen.c.jj2013-04-26 08:49:53.0 +0200 +++ gcc/tree-ssa-strlen.c 2013-05-10 08:57:20.654523288 +0200 @@ -1703,7 +1703,7 @@ handle_char_store (gimple_stmt_iterator its length may be decreased. */ adjust_last_stmt (si, stmt, false); } - else if (si != NULL) + else if (si != NULL integer_zerop (gimple_assign_rhs1 (stmt))) { si = unshare_strinfo (si); si-length = build_int_cst (size_type_node, 0); --- gcc/testsuite/gcc.dg/strlenopt-23.c.jj 2013-05-10 09:01:27.808152595 +0200 +++ gcc/testsuite/gcc.dg/strlenopt-23.c 2013-05-10 09:02:08.042931124 +0200 @@ -0,0 +1,15 @@ +/* PR tree-optimization/57230 */ +/* { dg-do run } */ +/* { dg-options -O2 } */ + +#include strlenopt.h + +int +main () +{ + char p[] = hello world; + p[0] = (char) (strlen (p) - 1); + if (strlen (p) != 11) +abort (); + return 0; +} Jakub
[PATCH] Improve handle_char_store in strlen pass (PR tree-optimization/57230)
Hi! Related to the previously posted patch, I've noticed that on the attached testcase we don't optimize the strlen into constant, even when we easily could. Implemented thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-05-10 Jakub Jelinek ja...@redhat.com PR tree-optimization/57230 * tree-ssa-strlen.c (handle_char_store): Record length for array store from STRING_CST. * gcc.dg/strlenopt-24.c: New test. --- gcc/tree-ssa-strlen.c.jj2013-05-10 08:57:20.0 +0200 +++ gcc/tree-ssa-strlen.c 2013-05-10 10:22:03.166458997 +0200 @@ -1740,6 +1740,25 @@ handle_char_store (gimple_stmt_iterator if (si != NULL) si-writable = true; } + else if (idx == 0 + TREE_CODE (gimple_assign_rhs1 (stmt)) == STRING_CST + ssaname == NULL_TREE + TREE_CODE (TREE_TYPE (lhs)) == ARRAY_TYPE) +{ + size_t l = strlen (TREE_STRING_POINTER (gimple_assign_rhs1 (stmt))); + HOST_WIDE_INT a = int_size_in_bytes (TREE_TYPE (lhs)); + if (a 0 (unsigned HOST_WIDE_INT) a l) + { + int idx = new_addr_stridx (lhs); + if (idx != 0) + { + si = new_strinfo (build_fold_addr_expr (lhs), idx, + build_int_cst (size_type_node, l)); + set_strinfo (idx, si); + si-dont_invalidate = true; + } + } +} if (si != NULL initializer_zerop (gimple_assign_rhs1 (stmt))) { --- gcc/testsuite/gcc.dg/strlenopt-24.c.jj 2013-05-10 10:30:01.343821849 +0200 +++ gcc/testsuite/gcc.dg/strlenopt-24.c 2013-05-10 10:30:37.736621582 +0200 @@ -0,0 +1,17 @@ +/* PR tree-optimization/57230 */ +/* { dg-do run } */ +/* { dg-options -O2 -fdump-tree-strlen } */ + +#include strlenopt.h + +int +main () +{ + char p[] = hello world; + if (strlen (p) != 11) +abort (); + return 0; +} + +/* { dg-final { scan-tree-dump-times strlen \\( 0 strlen } } * +/* { dg-final { cleanup-tree-dump strlen } } */ Jakub
Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
On Fri, May 10, 2013 at 4:52 AM, Jan Hubicka hubi...@ucw.cz wrote: On 05/07/13 23:13, Teresa Johnson wrote: -- Revised patch that fixes failures encountered when enabling -freorder-blocks-and-partition, including the failure reported in PR 53743. This includes new verification code to ensure no cold blocks dominate hot blocks contributed by Steven Bosscher. Seems like a reasonable verification; presumably if we have a cold block dominating a hot block, then the block/edge frequencies are badly broken. Ah, just saw the comments for the other case where this happens. cold entry, but hot loop inside pushing over the barrier. Arguably given a cold block in the dominator graph, all its children should have their frequences scaled down to avoid that situation. Yep, also note that sanity checking anything about frequencies is really hard. There are very many places in compiler that necesarilly need to invalidate frequencies in weird ways (at least short of rebuilding the whole profile from probabilities again). Yes, as noted in the comments this was in part due to several places where counts/frequencies were not kept in sync. Rather than try to fix all of these, or do any scaling of frequencies, the partitioning code now just enforces that the partitioning is sane w.r.t. the given counts. This is done during bb partitioning. The sanity checking routine was also useful for finding places where optimization passes were splitting edges and causing hot blocks previously reached by both hot and cold blocks to become dominated by cold blocks (see comments in commit_edge_insertions in my patch), and making sure they got fixed up. But there is the issue of what we should do in the case of an infrequent but non-zero entry (marked cold by maybe_hot_count_p because its count is less than the number of training runs) that leads to a hot loop. The code I added to the partitioning routine (find_rarely_executed_basic_blocks_and_crossing_edges) will cause the entry to also be placed in the hot partition. I would argue this is the desired behavior - if the routine contains code that is very hot for, say, 1/2 its training runs, the entry and hot loop (and everything on the path in between) should be in the hot partition. I can't really comment on the cfglayout and related stuff -- it was added at a time when I wasn't doing much with GCC and thus I don't know much about it. I think I can take a look at the cfglayout stuff. Splitting the patch would be great. Thanks, that would be great. I can split the patch first. Honza However, I like the changes to record if we've done partitioning and checking those instead of flag_reorder_blocks_and_partition. That's simple enough that I'd support pulling it out as a separate patch and installing immediately if that can be done so without major headaches. Ok, thanks, will do. I think we could do something similar with the code to verify the idom of a hot block is also hot. Though looking at the implementation I wonder if it could be simplified by walking the dominator tree? I can't look at it real closely tonight though. Looking at this code again, I agree with you. It looks like it is going to walk cold bb's more than once and O(n^2) in the worst case. I will fix this (there are a couple places in the patch that do a walk to ensure that this is not violated). Could you pull those two logical hunks of work out into individual patches. Will do. The only complication with splitting out the dominance checking stuff is that there are a number of changes in the patch to ensure that we don't violate this (hot block can't be dominated by cold block). I am not sure it makes sense or will be easy to split all of these out. I think what I will do is try to pull the big related chunks of them out to a separate patch (the new verification code, the code to prevent this in the partitioning routine, and fixup_partitions), but there are going to be a few places in the other patch that do some fixups related to this (e.g. in rtl_split_edge) that I would like to leave in the larger correctness patch. Thanks, Teresa jeff -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
[PATCH] Fix up rotate expansion
Hi! Our rotate expansion if rotate optab isn't present for the respective mode looks unsafe for rotations by variable count if that count could be 0, because then it invokes right or left shift by bitsize. While on most targets the hw will probably do the right thing (it is fine if x 32 will either yield x or 0, in both cases that ored together with x 0 aka x will still yield x), perhaps gcc could try to optimize based on the fact that undefined behavior can't happen, so IMHO it is better to generate a safer sequence. Ok for trunk? 2013-05-10 Jakub Jelinek ja...@redhat.com * expmed.c (expand_shift_1): For rotations by const0_rtx just return shifted. Use (-op1) (prec - 1) as other_amount instead of prec - op1. --- gcc/expmed.c.jj 2013-05-07 10:26:46.0 +0200 +++ gcc/expmed.c2013-05-10 11:54:34.659987038 +0200 @@ -2166,7 +2166,8 @@ expand_shift_1 (enum tree_code code, enu { /* If we have been unable to open-code this by a rotation, do it as the IOR of two shifts. I.e., to rotate A -by N bits, compute (A N) | ((unsigned) A (C - N)) +by N bits, compute +(A N) | ((unsigned) A ((-N) (C - 1))) where C is the bitsize of A. It is theoretically possible that the target machine might @@ -2181,14 +2182,22 @@ expand_shift_1 (enum tree_code code, enu rtx temp1; new_amount = op1; - if (CONST_INT_P (op1)) + if (op1 == const0_rtx) + return shifted; + else if (CONST_INT_P (op1)) other_amount = GEN_INT (GET_MODE_BITSIZE (mode) - INTVAL (op1)); else - other_amount - = simplify_gen_binary (MINUS, GET_MODE (op1), -GEN_INT (GET_MODE_PRECISION (mode)), -op1); + { + other_amount + = simplify_gen_unary (NEG, GET_MODE (op1), + op1, GET_MODE (op1)); + other_amount + = simplify_gen_binary (AND, GET_MODE (op1), + other_amount, + GEN_INT (GET_MODE_PRECISION (mode) + - 1)); + } shifted = force_reg (mode, shifted); Jakub
Re: [PATCH] Multilib changes ARM RTEMS
Committed to 4.7, 4.8 and head. On 5/8/2013 5:06 AM, Sebastian Huber wrote: This patch should be applied to all open GCC branches. This patch removes the mthumb/march=armv7 since it is virtually equal to the mthumb/march=armv7-m multilib. Add variants for ARMv7-A and ARMv7-R. I tried to use the MULTILIB_REQUIRED option, but this didn't work for patterns with more than two options. gcc/ChangeLog 2013-05-08 Sebastian Huber sebastian.hu...@embedded-brains.de * config/arm/t-rtems-eabi: Remove mthumb/march=armv7 multilib. Add mthumb/march=armv7-a multilib. Add mthumb/march=armv7-r multilib. Add mthumb/march=armv7-a/mfpu=neon/mfloat-abi=hard multilib. --- gcc/config/arm/t-rtems-eabi | 51 +- 1 files changed, 45 insertions(+), 6 deletions(-) diff --git a/gcc/config/arm/t-rtems-eabi b/gcc/config/arm/t-rtems-eabi index f0e714a..d81fbf7 100644 --- a/gcc/config/arm/t-rtems-eabi +++ b/gcc/config/arm/t-rtems-eabi @@ -1,8 +1,47 @@ # Custom RTEMS EABI multilibs -MULTILIB_OPTIONS= mthumb march=armv6-m/march=armv7/march=armv7-m -MULTILIB_DIRNAMES = thumb armv6-m armv7 armv7-m -MULTILIB_EXCEPTIONS = march=armv6-m march=armv7 march=armv7-m -MULTILIB_MATCHES= -MULTILIB_EXCLUSIONS = -MULTILIB_OSDIRNAMES = +MULTILIB_OPTIONS = mthumb march=armv6-m/march=armv7-a/march=armv7-r/march=armv7-m mfpu=neon mfloat-abi=hard +MULTILIB_DIRNAMES = thumb armv6-m armv7-a armv7-r armv7-m neon hard + +# Enumeration of multilibs + +MULTILIB_EXCEPTIONS = +MULTILIB_EXCEPTIONS += mthumb/march=armv6-m/mfpu=neon/mfloat-abi=hard +MULTILIB_EXCEPTIONS += mthumb/march=armv6-m/mfpu=neon +MULTILIB_EXCEPTIONS += mthumb/march=armv6-m/mfloat-abi=hard +# MULTILIB_EXCEPTIONS += mthumb/march=armv6-m +# MULTILIB_EXCEPTIONS += mthumb/march=armv7-a/mfpu=neon/mfloat-abi=hard +MULTILIB_EXCEPTIONS += mthumb/march=armv7-a/mfpu=neon +MULTILIB_EXCEPTIONS += mthumb/march=armv7-a/mfloat-abi=hard +# MULTILIB_EXCEPTIONS += mthumb/march=armv7-a +MULTILIB_EXCEPTIONS += mthumb/march=armv7-r/mfpu=neon/mfloat-abi=hard +MULTILIB_EXCEPTIONS += mthumb/march=armv7-r/mfpu=neon +MULTILIB_EXCEPTIONS += mthumb/march=armv7-r/mfloat-abi=hard +# MULTILIB_EXCEPTIONS += mthumb/march=armv7-r +MULTILIB_EXCEPTIONS += mthumb/march=armv7-m/mfpu=neon/mfloat-abi=hard +MULTILIB_EXCEPTIONS += mthumb/march=armv7-m/mfpu=neon +MULTILIB_EXCEPTIONS += mthumb/march=armv7-m/mfloat-abi=hard +# MULTILIB_EXCEPTIONS += mthumb/march=armv7-m +MULTILIB_EXCEPTIONS += mthumb/mfpu=neon/mfloat-abi=hard +MULTILIB_EXCEPTIONS += mthumb/mfpu=neon +MULTILIB_EXCEPTIONS += mthumb/mfloat-abi=hard +# MULTILIB_EXCEPTIONS += mthumb +MULTILIB_EXCEPTIONS += march=armv6-m/mfpu=neon/mfloat-abi=hard +MULTILIB_EXCEPTIONS += march=armv6-m/mfpu=neon +MULTILIB_EXCEPTIONS += march=armv6-m/mfloat-abi=hard +MULTILIB_EXCEPTIONS += march=armv6-m +MULTILIB_EXCEPTIONS += march=armv7-a/mfpu=neon/mfloat-abi=hard +MULTILIB_EXCEPTIONS += march=armv7-a/mfpu=neon +MULTILIB_EXCEPTIONS += march=armv7-a/mfloat-abi=hard +MULTILIB_EXCEPTIONS += march=armv7-a +MULTILIB_EXCEPTIONS += march=armv7-r/mfpu=neon/mfloat-abi=hard +MULTILIB_EXCEPTIONS += march=armv7-r/mfpu=neon +MULTILIB_EXCEPTIONS += march=armv7-r/mfloat-abi=hard +MULTILIB_EXCEPTIONS += march=armv7-r +MULTILIB_EXCEPTIONS += march=armv7-m/mfpu=neon/mfloat-abi=hard +MULTILIB_EXCEPTIONS += march=armv7-m/mfpu=neon +MULTILIB_EXCEPTIONS += march=armv7-m/mfloat-abi=hard +MULTILIB_EXCEPTIONS += march=armv7-m +MULTILIB_EXCEPTIONS += mfpu=neon/mfloat-abi=hard +MULTILIB_EXCEPTIONS += mfpu=neon +MULTILIB_EXCEPTIONS += mfloat-abi=hard -- Joel Sherrill, Ph.D. Director of Research Development joel.sherr...@oarcorp.comOn-Line Applications Research Ask me about RTEMS: a free RTOS Huntsville AL 35805 Support Available(256) 722-9985
RE: [RFA/ARM/4.7] Fix PR54974: Thumb literal pools don't handle PC rounding
-Original Message- From: Ramana Radhakrishnan Sent: Friday, May 10, 2013 18:13 To: Matthew Gretton-Dann Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw; d...@canonical.com; Patch Tracking; Richard Biener; Joey Ye Subject: Re: [RFA/ARM/4.7] Fix PR54974: Thumb literal pools don't handle PC rounding OK for 4.7? Ok - I did say ok if no fallout in my original review :) Tested with Qemu Cortex-M3. No regression. Committed to 4.7. - Joey
a tiny LRA patch
Analyzing code generated for PPC with LRA, I found that sometimes call-saved registers are overused. It creates long prologs/epilogs. The following patch fixes it. The problem was in that BB frequency might be zero. Committed to trunk as rev. 198792. The same patch was committed into lra-mike branch. 2013-05-10 Vladimir Makarov vmaka...@redhat.com * lra-assigns.c (find_hard_regno_for): Add 1 to the cost of call saved registers. Index: lra-assigns.c === --- lra-assigns.c (revision 198780) +++ lra-assigns.c (working copy) @@ -598,7 +598,7 @@ find_hard_regno_for (int regno, int *cos ! df_regs_ever_live_p (hard_regno + j)) /* It needs save restore. */ hard_regno_costs[hard_regno] - += 2 * ENTRY_BLOCK_PTR-next_bb-frequency; + += 2 * ENTRY_BLOCK_PTR-next_bb-frequency + 1; priority = targetm.register_priority (hard_regno); if (best_hard_regno 0 || hard_regno_costs[hard_regno] best_cost || (hard_regno_costs[hard_regno] == best_cost
Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
On Thu, May 9, 2013 at 3:40 PM, Steven Bosscher stevenb@gmail.com wrote: On Thu, May 9, 2013 at 11:42 PM, Diego Novillo wrote: On 2013-05-08 01:13 , Teresa Johnson wrote: -static void +void emit_barrier_after_bb (basic_block bb) { rtx barrier = emit_barrier_after (BB_END (bb)); - BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); + if (current_ir_type () == IR_RTL_CFGLAYOUT) +BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); What if the current IR is not RTL? Should we fail here? It doesn't seem like it makes sense to call this from gimple, for instance. It also makes no sense calling it in IR_RTL_CFGLAYOUT mode. Barriers are meaningless in cfglayout mode. Actually, the change above ensures we can call this routine when *not* in CFGLAYOUT mode. It was previously only called when we were in CFGLAYOUT mode (while in bbpart). (This relates to a conversation we had on an earlier version of the patch back in the fall, where we discussed this issue of bbpart adding barriers while in cfglayout mode: http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02996.html). - emit_barrier_after (BB_END (jump_block)); + /* We might be in cfg layout mode, and if so, the following routine will + insert the barrier correctly. */ + emit_barrier_after_bb (jump_block); We're practically always in cfglayout mode, but oh well... Right, not always, which is why I needed to make this change. On Fri, May 10, 2013 at 5:05 AM, Steven Bosscher stevenb@gmail.com wrote: On Fri, May 10, 2013 at 12:57 AM, Xinliang David Li wrote: On Thu, May 9, 2013 at 3:40 PM, Steven Bosscher wrote: This patch mixes up things badly from the point of what-depends-on-what, the whole approach looks wrong to me. Do you mean the 'source file dependency' or 'logical dependency'? If the former, the code can be easily refactored to remove the dependency. I don't see how the latter can be avoided as bb-partition etc does change cfg states and leads to different actions in cfg layout finalize. I mean logical dependency. cfglayout is just a representation of the CFG, and bb-partition is a code transformation. By making fixup_reorder_chain emit the note, you' ve put part of the transformation into out-of-cfglayout which is just bogus. You also don't put GCSE or loop unrolling in out-of-cfglayout, and this change is IMHO in the same category: mixing transformations into internal representations. That may be a short-term fix but it is a long-term maintenance/cleanups nightmare. The main issue I had here, and why I made this change, is that we go in and out of cfglayout mode several times after bb partitioning and then out_of_cfglayout. The problem was that when we subsequently went in and out of cfglayout mode, the switch text section notes that had been inserted by bbpart were getting messed up (they were moved into the bb header when we enter cfglayout mode and then not being transferred to the correct location upon exit). I investigated trying to keep those in sync, but it is really difficult/impossible to do during cfglayout mode when they are in the header. So I simply strip them out completely on entry to cfglayout mode, and if there were any there on entry, this change ensures that they are restored in the appropriate location upon exit. I'm not sure what is a good alternative? I found a more detailed description of why I made this change in our email exchange from the fall: -- I triggered the same error in 445.gobmk once I applied the thread_prologue_and_epilogue_insns fixes. This is an assert in the dwarf CFI code that complains about a NOTE_INSN_SWITCH_TEXTS_SECTION note not being preceeded by a barrier: gcc -c -o engine/utils.o -DSPEC_CPU -DNDEBUG -DHAVE_CONFIG_H -I. -I.. -I../include -I./include -fprofile-use -freorder-blocks-and-partition -freorder-blocks -ffunction-sections -O2 engine/utils.c engine/utils.c: In function ‘visible_along_edge’: engine/utils.c:274:1: internal compiler error: in create_pseudo_cfg, at dwarf2cfi.c:2742 } ^ In this case the switch section note was inside a BB. What I found was that this was due to several phases going into and back out of cfglayout mode again. In this case it was the compgotos phase. There aren't any computed gotos, but this change occurs during cfg_layout_initialize (in try_optimize_cfg called via cleanup_cfg). Here it merged two (non-contiguous) blocks that had a single-successor/single-predecessor relationship. However, the source block was previously on the section boundary and had a SWITCH note prior. This note is put into the header of the bb when we go into cfglayout mode, and ended up inside the new merged block, which was in any case not on the new border between the hot and cold sections. The correct solution in my opinion is to strip out the SWITCH note every time we enter cfglayout mode after bbro, and then invoke insert_section_boundary_note when leaving cfglayout (if one was found on entry
Re: [Patch, Fortran] PR57217 - re-add type checks for TBP overriding
Hi Tobias, GCC 4.7 added some additional checks for type-bound procedure overriding. However, doing so it weakened the check whether the nonpass argument has the same type. While for normal arguments, passing the parent type to an extended type is fine, for overriding the type (of nonpass arguments) must be exactly the same as in the original type. in principle I think your idea to tighten up the type check by symmetrizing it is ok. However, I'm not sure if the place where you do it is the most preferable: I think it should rather go inside of check_dummy_characteristics (since any check for the dummy characteristics should include the 'strict' type check, and not only a check of type compatibility, cf. F08:12.3.2.2). Anyway, there already is a call to 'compare_type_rank' in 'check_dummy_characteristics'. So, you could either just symmetrize this, or alternatively use a double 'gfc_compare_types' and a separate check for the rank. Moreover, I think your current patch misses e.g. the assumed-type handling that is present in compare_type_rank. Cheers, Janus
Re: [PATCH, PR preprocessor/42014] Added LAST_SOURCE_COLUMN in while loop
Shakthi == Shakthi Kannan skan...@redhat.com writes: Shakthi 2013-05-10 Shakthi Kannan skan...@redhat.com Shakthi PR preprocessor/42014 Shakthi * gcc/diagnostic.c: Added LAST_SOURCE_COLUMN in while loop. You should mention the function name in there. See the GNU Coding Standards for the format. Shakthi - ,\n from %s:%d, Shakthi - LINEMAP_FILE (map), LAST_SOURCE_LINE (map)); Shakthi + ,\n from %s:%d:%d, Shakthi + LINEMAP_FILE (map), Shakthi + LAST_SOURCE_LINE (map), LAST_SOURCE_COLUMN (map)); Does this cause test suite regressions? If so then the patch needs fixes there. If not then the patch needs a new test. Also, is the column number actually correct? IIRC some things in cpp still don't get the right column number. Tom
Re: [Patch, Fortran] PR57217 - re-add type checks for TBP overriding
Hi Janus, Janus Weil wrote: in principle I think your idea to tighten up the type check by symmetrizing it is ok. I actually kind of copied it from the proc-pointer assignment check, which does the same. However, I'm not sure if the place where you do it is the most preferable: I think it should rather go inside of check_dummy_characteristics (since any check for the dummy characteristics should include the 'strict' type check, and not only a check of type compatibility, cf. F08:12.3.2.2). Okay. Maybe one can at the same time replace the rank+type check by a type and a rank check to provide a better error message, i.e. show the type for a type mismatch (showing both types)* and the rank for a rank mismatch (showing both ranks). (* best with special handling of no_arg_check) Moreover, I think your current patch misses e.g. the assumed-type handling that is present in compare_type_rank. Good point. Another reason to think about splitting the two. Would you interested updating this patch? Or should I put it again on my growing to-do list? Tobias
Re: Ping: [PATCH, PR 42371] Remove references to functions from symbol table during inlining
Hi, I thought I was hitting the ipa-inline-transform.c:263 assert with this patch when LTO-building Mozilla Firefox but it turned out it was caused by a different patch so I'm handling that there. On Thu, May 02, 2013 at 02:37:26PM +0200, Jan Hubicka wrote: 2013-03-22 Martin Jambor mjam...@suse.cz PR middle-end/42371 * ipa-prop.h (IPA_UNDESCRIBED_USE): New macro. (ipa_constant_data): New type. (ipa_jump_func): Use ipa_constant_data to hold information about constant jump functions. (ipa_get_jf_constant): Adjust to jump function type changes. (ipa_get_jf_constant_rdesc): New function. (ipa_param_descriptor): New field controlled_uses. (ipa_get_controlled_uses): New function. (ipa_set_controlled_uses): Likewise. * ipa-ref.h (ipa_find_reference): Declare. * ipa-prop.c (ipa_cst_ref_desc): New type. (ipa_print_node_jump_functions_for_edge): Adjust for jump function type changes. (ipa_set_jf_constant): Likewise. Also create reference descriptions. New parameter cs. Adjust all callers. (ipa_analyze_params_uses): Detect uncontrolled and controlled uses. (remove_described_reference): New function. (jfunc_rdesc_usable): Likewise. (try_make_edge_direct_simple_call): Decrement controlled use count, attempt to remove reference if it hits zero. (combine_controlled_uses_counters): New function. (propagate_controlled_uses): Likewise. (ipa_propagate_indirect_call_infos): Call propagate_controlled_uses. (ipa_edge_duplication_hook): Duplicate reference descriptions. (ipa_print_node_params): Print described use counter. (ipa_write_jump_function): Adjust to jump function type changes. (ipa_read_jump_function): New parameter CS, pass it to ipa_set_jf_constant. Adjust caller. (ipa_write_node_info): Stream controlled use count (ipa_read_node_info): Likewise. * cgraph.c (cgraph_mark_address_taken_node): Bail out instead of asserting. * ipa-cp.c (ipcp_discover_new_direct_edges): Decrement controlled use count. Remove cloning-added reference if it reaches zero. * ipa-ref.c (ipa_find_reference): New function. As mentioned offline, I think the patch should be extended to work on references in general, so we can do more of promoting to !TREE_ADDRESSABLE. But it can be handled incrementally. Yeah, adding that is easy. But since I add no use for that, I refrained from allocating data for them. Index: src/gcc/cgraph.c === *** src.orig/gcc/cgraph.c --- src/gcc/cgraph.c *** cgraph_remove_node (struct cgraph_node * *** 1409,1415 void cgraph_mark_address_taken_node (struct cgraph_node *node) { ! gcc_assert (!node-global.inlined_to); /* FIXME: address_taken flag is used both as a shortcut for testing whether IPA_REF_ADDR reference exists (and thus it should be set on node representing alias we take address of) and as a test whether address --- 1409,1418 void cgraph_mark_address_taken_node (struct cgraph_node *node) { ! /* Indirect inlining can figure out that all uses of the address are ! inlined. */ ! if (node-global.inlined_to) ! return; /* FIXME: address_taken flag is used both as a shortcut for testing whether IPA_REF_ADDR reference exists (and thus it should be set on node representing alias we take address of) and as a test whether address Perhaps add assert that after_inlining is set? I added asserting: gcc_assert (cfun-after_inlining); gcc_assert (node-callers-indirect_inlining_edge); Paths is OK, OK, the updated patch is below (the only change is the extra asserts), re-bootstrapped and re-tested on x86_64-linux. I will commit it on Monday if there are no objections. Thanks, Martin 2013-05-09 Martin Jambor mjam...@suse.cz PR middle-end/42371 * ipa-prop.h (IPA_UNDESCRIBED_USE): New macro. (ipa_constant_data): New type. (ipa_jump_func): Use ipa_constant_data to hold information about constant jump functions. (ipa_get_jf_constant): Adjust to jump function type changes. (ipa_get_jf_constant_rdesc): New function. (ipa_param_descriptor): New field controlled_uses. (ipa_get_controlled_uses): New function. (ipa_set_controlled_uses): Likewise. * ipa-ref.h (ipa_find_reference): Declare. * ipa-prop.c (ipa_cst_ref_desc): New type. (ipa_print_node_jump_functions_for_edge): Adjust for jump function type changes. (ipa_set_jf_constant): Likewise. Also create reference descriptions. New parameter cs. Adjust all callers. (ipa_analyze_params_uses): Detect uncontrolled and controlled uses. (remove_described_reference): New function.
Re: Add std::unordered_* C++11 allocator support
.. looks like the prettyprinting code needs again adjustments: FAIL: libstdc++-prettyprinters/cxx11.cc print uom FAIL: libstdc++-prettyprinters/cxx11.cc print uomm FAIL: libstdc++-prettyprinters/cxx11.cc print uos FAIL: libstdc++-prettyprinters/cxx11.cc print uoms Thanks, Paolo.
[PATCH] Consistent and slightly better IPA dumps
Hi, I have noticed that whereas all of the dumps in ipa-inline*, ipa-cp.c, ipa-prop.c and elsewhere dump node UIDs, function dump_cgraph_node, which is the core of call graph dumping, dumps node-symbol.uid. This unnecessarily makes it more difficult to see what has happened to a node in the graph dump or how did a node from an action ended up in the dump. The patch below changes this and dumps symbol.uid everywhere where we currently dump node.uid for the sake of consistency. There is only one exception, and that is input_node in lto-cgraph.c where we assign to the order, so I just made it explicit that we were dumping UIDs. Apart from this, the patch also dumps the symbol.order in three places where we currently only print the name, because I reckon it is useful ands some extra info about indirect edges in jump function dumping. Bootstrapped and tested on x86_64-linux without any problems. OK for trunk? Thanks, Martin 2013-05-07 Martin Jambor mjam...@suse.cz * ipa-prop.c (ipa_print_node_jump_functions): Print symbol order in header, print symbol order instead of node uid, print more information about indirect edge targets. (ipa_make_edge_direct_to_target): Print symbol order instead of node uids. (ipa_make_edge_direct_to_target): Likewise. (remove_described_reference): Likewise. (propagate_controlled_uses): Likewise. (ipa_print_node_params): Also print symbol order. (ipcp_transform_function): Print symbol order instead of node uids. * cgraph.c (cgraph_redirect_edge_call_stmt_to_callee): Likewise. (cgraph_get_create_real_symbol_node): Likewise. * ipa-cp.c (print_lattice): Likewise. (print_all_lattices): Likewise. (determine_versionability): Likewise. (initialize_node_lattices): Likewise. (estimate_local_effects): Likewise. (update_profiling_info): Likewise. (create_specialized_node): Likewise. (perhaps_add_new_callers): Likewise. (decide_about_value): Likewise. (decide_whether_version_node): Likewise. (identify_dead_nodes): Likewise. * ipa-inline-analysis.c (dump_inline_edge_summary): Likewise. (dump_inline_summary): Likewise. (estimate_node_size_and_time): Likewise. (inline_analyze_function): Likewise. * ipa-inline.c (report_inline_failed_reason): Likewise. (want_early_inline_function_p): Likewise. (edge_badness): Likewise. (update_edge_key): Likewise. (inline_small_functions): Likewise. Add dumping of order to two other dumps. * ipa-pure-const.c (pure_const_read_summary): Print symbol order instead of node uids. (propagate_pure_const): Likewise. (propagate_pure_const): Likewise. * ipa-utils.c (dump_cgraph_node_set): Likewise. * lto-cgraph.c (input_node): Explicitly specify we dump uid. * lto-symtab.c (lto_cgraph_replace_node): Print symbol order instead of node uids. * tree-pretty-print.c (dump_function_header): Likewise. * tree-sra.c (convert_callers_for_node): Dump in traditional format. Print symbol order instead of node uids. lto/ * lto-partition.c (lto_balanced_map): Print symbol order instead of node uids. Index: src/gcc/ipa-prop.c === --- src.orig/gcc/ipa-prop.c +++ src/gcc/ipa-prop.c @@ -265,34 +265,45 @@ void ipa_print_node_jump_functions (FILE *f, struct cgraph_node *node) { struct cgraph_edge *cs; - int i; - fprintf (f, Jump functions of caller %s:\n, cgraph_node_name (node)); + fprintf (f, Jump functions of caller %s/%i:\n, cgraph_node_name (node), + node-symbol.order); for (cs = node-callees; cs; cs = cs-next_callee) { if (!ipa_edge_args_info_available_for_edge_p (cs)) continue; fprintf (f, callsite %s/%i - %s/%i : \n, - xstrdup (cgraph_node_name (node)), node-uid, - xstrdup (cgraph_node_name (cs-callee)), cs-callee-uid); + xstrdup (cgraph_node_name (node)), node-symbol.order, + xstrdup (cgraph_node_name (cs-callee)), + cs-callee-symbol.order); ipa_print_node_jump_functions_for_edge (f, cs); } - for (cs = node-indirect_calls, i = 0; cs; cs = cs-next_callee, i++) + for (cs = node-indirect_calls; cs; cs = cs-next_callee) { + struct cgraph_indirect_call_info *ii; if (!ipa_edge_args_info_available_for_edge_p (cs)) continue; + ii = cs-indirect_info; + if (ii-agg_contents) + fprintf (f, indirect aggregate callsite, calling param %i, +offset HOST_WIDE_INT_PRINT_DEC , %s, +ii-param_index, ii-offset, +ii-by_ref ? by reference : by_value); + else + fprintf (f, indirect %s callsite, calling param %i, +
Re: [Patch, Fortran] PR57217 - re-add type checks for TBP overriding
Hi, However, I'm not sure if the place where you do it is the most preferable: I think it should rather go inside of check_dummy_characteristics (since any check for the dummy characteristics should include the 'strict' type check, and not only a check of type compatibility, cf. F08:12.3.2.2). Okay. Maybe one can at the same time replace the rank+type check by a type and a rank check to provide a better error message, i.e. show the type for a type mismatch (showing both types)* and the rank for a rank mismatch (showing both ranks). yes, actually I also thought that splitting it up might be a good idea. Moreover, I think your current patch misses e.g. the assumed-type handling that is present in compare_type_rank. Good point. Another reason to think about splitting the two. Right. However, the question is whether the patch will not become too involved for backporting. Would you interested updating this patch? Or should I put it again on my growing to-do list? My weekend is extremely tight, but I might find a quite hour in the evenings during the next week (unless you can beat me to it) ... Cheers, Janus
[PATCH] Use types_compatible_p in get_binfo_at_offset
Hi, PR 57084 has alerted me that IN LTO we do not devirtualize calls during ipa-cp and ipa-inline that we do in non-LTO compilations. The reason is that the type comparison in get_binfo_at_offset is failing, because the two types do not have the same TYPE_MAIN_VARIANT but TYPE_CANONICAL should be checked instead. Or better yet, call types_compatible_p which hopefully will also know what to do in these cases, which is what the patch below does. The change increases the number of type-based devirtualizations in ipa-cp and ipa-inline from 137 to 179 in 483.xalancbmk when compiles with -O2 -flto -m32 (-m32 is not significant here, it was only required to trigger the bug and I made the measurements in the same environment). Bootstrapped and tested on x86_64-linux without any issues. OK for trunk? Thanks, Martin 2013-05-10 Martin Jambor mjam...@suse.cz * tree.c (get_binfo_at_offset): Use types_compatible_p to compare types. Index: src/gcc/tree.c === --- src.orig/gcc/tree.c +++ src/gcc/tree.c @@ -11483,7 +11483,7 @@ get_binfo_at_offset (tree binfo, HOST_WI tree fld; int i; - if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type)) + if (types_compatible_p (type, expected_type)) return binfo; if (offset 0) return NULL_TREE;
Re: [PATCH] Improve rotation by mode bitsize - 1
Hi! Note, the patch has been already committed. On Fri, May 10, 2013 at 03:48:50PM +0200, Jan Hubicka wrote: Going this route you will also need to update [(set_attr type rotate1) (set (attr length_immediate) (if_then_else (and (match_operand 1 const1_operand) (ior (match_test TARGET_SHIFT1) (match_test optimize_function_for_size_p (cfun (const_string 0) (const_string *))) (set_attr mode QI)]) computing the immediate size. Why we can't canonicalize this in folding/simplify_rtx/combiner? I do not see much benefit allowing both forms of instructions in our insn streams... Both directions of rotation make sense for variable length rotation, and why would the generic simplify-rtx.c code want to prefer one rotate over the other direction for constant rotation counts? That is clearly target specific preference. It seems to me that it is not different from normalizing reg-10 into reg+(-10) we do for years (and for good reason). It is still target preference when use add and when sub to perform the arithmetic, but it makes sense to keep single canonical form of the expression both in RTL and Gimple. For example we may want to be able to prove that (rotate reg 31) == (rotatert reg 1) is true or (rotate reg 30) == (rotatert reg 2) is also true or cross jump both variants into one instruction. So it seems to me that canonicalizing constant rotations to get the operand into range 0bitsize/2 makes sense in general and will make the extra special case in i386.md unnecesary. What we could do instead of the patch I've committed just tweak expansion of the patterns instead, assuming it is unlikely RTL optimizations materialize a constant rotation out of non-constant rotations from expansion time. Or the length_immediate attributes could be adjusted, shouldn't be too hard (just replace that (match_operand 1 const1_operand) in there with (and (match_operand 1 const_int_operand) (ior (match_operand 1 const1_operand) (match_test INTVAL (operands[1]) == GET_MODE_BITSIZE (MODEmode) - 1))) or similar (untested). Yep, we should either update the pattern or add canonicalization. I still think the second variant is better... Honza
Re: Ping: [PATCH, PR 42371] Remove references to functions from symbol table during inlining
OK, the updated patch is below (the only change is the extra asserts), re-bootstrapped and re-tested on x86_64-linux. I will commit it on Monday if there are no objections. No objections from my side. Path looks OK. Thanks, Honza
Re: [PATCH] Use types_compatible_p in get_binfo_at_offset
2013-05-10 Martin Jambor mjam...@suse.cz * tree.c (get_binfo_at_offset): Use types_compatible_p to compare types. Index: src/gcc/tree.c === --- src.orig/gcc/tree.c +++ src/gcc/tree.c @@ -11483,7 +11483,7 @@ get_binfo_at_offset (tree binfo, HOST_WI tree fld; int i; - if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type)) + if (types_compatible_p (type, expected_type)) As discussed on lunch, it seems fine to me, but I am not an expert ;) I wonder about the following test few lines bellow: /* Offset 0 indicates the primary base, whose vtable contents are represented in the binfo for the derived class. */ else if (offset != 0) { tree base_binfo, found_binfo = NULL_TREE; for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++) if (TREE_TYPE (base_binfo) == TREE_TYPE (fld)) don't we want also more relaxed testing? grep TYPE.* ==.*TYPE *.c seems to show some extra dubious cases i.e. in fold-const, tree-ssa-coaelesce and other places where we most probably care only about semantical equivalence of the type... Thanks for looking into this! Honza
[PATCH] Redirect calls to non-functions to builtin_unreachable
Hi, as we discover targets of previously indirect calls in ipa-inline and ipa-cp, we sometimes figure out that the targets are not a function. One typical example is when NULL is passed as a function pointer parameter, another is when C++ member-pointer points to a virtual function and the overloaded field of the structure which can also hold pointers to non-virtual methods contains odd integer constants. I have gone through all the cases when this happens when LTO building Mozilla Firefox and verified all of them were OK and legal. Because no such call to a non-function can ever occur in a legal program, I have decided to turn them into calls of builting_unreachable, which then could help further optimizations. Because calls to builtin_unreachable is not cgraph_maybe_hot_edge_p, whereas an indirect call with an unknown destination is, this change often triggered assertion tree-ipa-inline-transform.c:263. I have discussed this with Honza and he has agreed to switch the check off when there are newly discovered direct edges. The patch has passed bootstrap and is undergoing testing now. OK for trunk if it passes? Thanks, Martin 2013-05-09 Martin Jambor mjam...@suse.cz * ipa-prop.c (ipa_make_edge_direct_to_target): Redirect calls to non-functions to builtin_unreachable. * ipa-inline-transform.c (inline_call): Do not assert estimates were correct when new direct edges were discovered. Index: src/gcc/ipa-prop.c === --- src.orig/gcc/ipa-prop.c +++ src/gcc/ipa-prop.c @@ -2200,6 +2200,7 @@ ipa_make_edge_direct_to_target (struct c { struct cgraph_node *callee; struct inline_edge_summary *es = inline_edge_summary (ie); + bool unreachable = false; if (TREE_CODE (target) == ADDR_EXPR) target = TREE_OPERAND (target, 0); @@ -2210,12 +2211,17 @@ ipa_make_edge_direct_to_target (struct c { if (dump_file) fprintf (dump_file, ipa-prop: Discovered direct call to non-function -in %s/%i.\n, +in %s/%i, making it unreachable.\n, cgraph_node_name (ie-caller), ie-caller-symbol.order); - return NULL; + target = builtin_decl_implicit (BUILT_IN_UNREACHABLE); + callee = cgraph_get_create_node (target); + unreachable = true; } + else + callee = cgraph_get_node (target); } - callee = cgraph_get_node (target); + else +callee = cgraph_get_node (target); /* Because may-edges are not explicitely represented and vtable may be external, we may create the first reference to the object in the unit. */ @@ -2252,7 +2258,7 @@ ipa_make_edge_direct_to_target (struct c - eni_size_weights.call_cost); es-call_stmt_time -= (eni_time_weights.indirect_call_cost - eni_time_weights.call_cost); - if (dump_file) + if (dump_file !unreachable) { fprintf (dump_file, ipa-prop: Discovered %s call to a known target (%s/%i - %s/%i), for stmt , Index: src/gcc/ipa-inline-transform.c === --- src.orig/gcc/ipa-inline-transform.c +++ src/gcc/ipa-inline-transform.c @@ -260,7 +260,7 @@ inline_call (struct cgraph_edge *e, bool #ifdef ENABLE_CHECKING /* Verify that estimated growth match real growth. Allow off-by-one error due to INLINE_SIZE_SCALE roudoff errors. */ - gcc_assert (!update_overall_summary || !overall_size + gcc_assert (!update_overall_summary || !overall_size || new_edges_found || abs (estimated_growth - (new_size - old_size)) = 1 /* FIXME: a hack. Edges with false predicate are accounted wrong, we should remove them from callgraph. */
[Google] Suppress message when primary module entry cannot found
Now we don't store the module info if the module is not exported or has any aux module (to compress the profile data size). Thus it's normal that a primary module entry cannot be found. This patch suppresses the messages printed when the primary module is not found. Bootstrapped and passed regression test. OK for google branch? Thanks, Dehao Index: auto-profile.c === --- auto-profile.c (revision 198751) +++ auto-profile.c (working copy) @@ -497,10 +497,7 @@ read_aux_modules (void) module.name = xstrdup (in_fnames[0]); entry = (struct afdo_module *) htab_find (module_htab, module); if (!entry) -{ - inform (0, primary module %s cannot be found., in_fnames[0]); - return; -} +return; module_infos = XCNEWVEC (struct gcov_module_info *, entry-num_aux_modules + 1); afdo_add_module (module_infos, entry, true);
Re: [PATCH] Use types_compatible_p in get_binfo_at_offset
Hi, On Fri, May 10, 2013 at 07:24:06PM +0200, Jan Hubicka wrote: 2013-05-10 Martin Jambor mjam...@suse.cz * tree.c (get_binfo_at_offset): Use types_compatible_p to compare types. Index: src/gcc/tree.c === --- src.orig/gcc/tree.c +++ src/gcc/tree.c @@ -11483,7 +11483,7 @@ get_binfo_at_offset (tree binfo, HOST_WI tree fld; int i; - if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type)) + if (types_compatible_p (type, expected_type)) As discussed on lunch, it seems fine to me, but I am not an expert ;) I wonder about the following test few lines bellow: /* Offset 0 indicates the primary base, whose vtable contents are represented in the binfo for the derived class. */ else if (offset != 0) { tree base_binfo, found_binfo = NULL_TREE; for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++) if (TREE_TYPE (base_binfo) == TREE_TYPE (fld)) don't we want also more relaxed testing? we most probably do, good catch. I'll update the patch, re-test and commit it early next week unless someone objects. grep TYPE.* ==.*TYPE *.c seems to show some extra dubious cases i.e. in fold-const, tree-ssa-coaelesce and other places where we most probably care only about semantical equivalence of the type... Yeah, examinig those is also probably a good idea. Thanks, Martin
Re: [PATCH] Redirect calls to non-functions to builtin_unreachable
On 05/10/2013 11:27 AM, Martin Jambor wrote: Hi, as we discover targets of previously indirect calls in ipa-inline and ipa-cp, we sometimes figure out that the targets are not a function. One typical example is when NULL is passed as a function pointer parameter, another is when C++ member-pointer points to a virtual function and the overloaded field of the structure which can also hold pointers to non-virtual methods contains odd integer constants. I have gone through all the cases when this happens when LTO building Mozilla Firefox and verified all of them were OK and legal. Because no such call to a non-function can ever occur in a legal program, I have decided to turn them into calls of builting_unreachable, which then could help further optimizations. Because calls to builtin_unreachable is not cgraph_maybe_hot_edge_p, whereas an indirect call with an unknown destination is, this change often triggered assertion tree-ipa-inline-transform.c:263. I have discussed this with Honza and he has agreed to switch the check off when there are newly discovered direct edges. The patch has passed bootstrap and is undergoing testing now. OK for trunk if it passes? Thanks, Martin 2013-05-09 Martin Jambor mjam...@suse.cz * ipa-prop.c (ipa_make_edge_direct_to_target): Redirect calls to non-functions to builtin_unreachable. * ipa-inline-transform.c (inline_call): Do not assert estimates were correct when new direct edges were discovered. This is good. Please install. Thanks, Jeff
Re: [Google] Suppress message when primary module entry cannot found
Is it only auto fdo that doesn't store the module info if the module is not exported or has aux modules? Note that this will prevent usage of my script that enables annotation of gcov dump info with function names, which relies on accessing the primary module info to obtain the module name (which is then used to locate the module's associated func id to func name mapping optionally emitted into the build output). Teresa On Fri, May 10, 2013 at 10:37 AM, Dehao Chen de...@google.com wrote: Now we don't store the module info if the module is not exported or has any aux module (to compress the profile data size). Thus it's normal that a primary module entry cannot be found. This patch suppresses the messages printed when the primary module is not found. Bootstrapped and passed regression test. OK for google branch? Thanks, Dehao Index: auto-profile.c === --- auto-profile.c (revision 198751) +++ auto-profile.c (working copy) @@ -497,10 +497,7 @@ read_aux_modules (void) module.name = xstrdup (in_fnames[0]); entry = (struct afdo_module *) htab_find (module_htab, module); if (!entry) -{ - inform (0, primary module %s cannot be found., in_fnames[0]); - return; -} +return; module_infos = XCNEWVEC (struct gcov_module_info *, entry-num_aux_modules + 1); afdo_add_module (module_infos, entry, true); -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [PATCH] Dynamic dispatch of multiversioned functions and CPU mocks for code coverage.
On Fri, May 10, 2013 at 6:34 AM, Joseph S. Myers jos...@codesourcery.com wrote: On Thu, 9 May 2013, Sriraman Tallam wrote: Then, I plan to add the following hooks to libgcc (in a different patch) : int set_mock_cpu_is (const char *cpu); int set_mock_cpu_supports (const char *isa); int init_mock_cpu (); // Clear the values of the mock cpu. Those names are in the user's namespace; I think libgcc should only provide or use symbols in the implementation namespace. Shall I just use __builtin prefixes for these too?, would that work? Thanks Sri -- Joseph S. Myers jos...@codesourcery.com
Re: [Google] Suppress message when primary module entry cannot found
On Fri, May 10, 2013 at 10:47 AM, Teresa Johnson tejohn...@google.com wrote: Is it only auto fdo that doesn't store the module info if the module is not exported or has aux modules? Note that this will prevent usage of my script that enables annotation of gcov dump info with function names, which relies on accessing the primary module info to obtain the module name (which is then used to locate the module's associated func id to func name mapping optionally emitted into the build output). Yes, this is auto fdo only. And gcov dump will not work on auto fdo profile because it does not use the id at all in the profile (instead, it stores the function name directly). Thanks, Dehao Teresa On Fri, May 10, 2013 at 10:37 AM, Dehao Chen de...@google.com wrote: Now we don't store the module info if the module is not exported or has any aux module (to compress the profile data size). Thus it's normal that a primary module entry cannot be found. This patch suppresses the messages printed when the primary module is not found. Bootstrapped and passed regression test. OK for google branch? Thanks, Dehao Index: auto-profile.c === --- auto-profile.c (revision 198751) +++ auto-profile.c (working copy) @@ -497,10 +497,7 @@ read_aux_modules (void) module.name = xstrdup (in_fnames[0]); entry = (struct afdo_module *) htab_find (module_htab, module); if (!entry) -{ - inform (0, primary module %s cannot be found., in_fnames[0]); - return; -} +return; module_infos = XCNEWVEC (struct gcov_module_info *, entry-num_aux_modules + 1); afdo_add_module (module_infos, entry, true); -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [Google] Suppress message when primary module entry cannot found
ok. Would be nicer if there is a way to tell this from other error cases though. David On Fri, May 10, 2013 at 11:00 AM, Dehao Chen de...@google.com wrote: On Fri, May 10, 2013 at 10:47 AM, Teresa Johnson tejohn...@google.com wrote: Is it only auto fdo that doesn't store the module info if the module is not exported or has aux modules? Note that this will prevent usage of my script that enables annotation of gcov dump info with function names, which relies on accessing the primary module info to obtain the module name (which is then used to locate the module's associated func id to func name mapping optionally emitted into the build output). Yes, this is auto fdo only. And gcov dump will not work on auto fdo profile because it does not use the id at all in the profile (instead, it stores the function name directly). Thanks, Dehao Teresa On Fri, May 10, 2013 at 10:37 AM, Dehao Chen de...@google.com wrote: Now we don't store the module info if the module is not exported or has any aux module (to compress the profile data size). Thus it's normal that a primary module entry cannot be found. This patch suppresses the messages printed when the primary module is not found. Bootstrapped and passed regression test. OK for google branch? Thanks, Dehao Index: auto-profile.c === --- auto-profile.c (revision 198751) +++ auto-profile.c (working copy) @@ -497,10 +497,7 @@ read_aux_modules (void) module.name = xstrdup (in_fnames[0]); entry = (struct afdo_module *) htab_find (module_htab, module); if (!entry) -{ - inform (0, primary module %s cannot be found., in_fnames[0]); - return; -} +return; module_infos = XCNEWVEC (struct gcov_module_info *, entry-num_aux_modules + 1); afdo_add_module (module_infos, entry, true); -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
[RFC] Make the new var decl STATIC in function dw2_output_indirect_constant_1
Hi In function dw2_output_indirect_constant_1 a new var decl is created. Only When the variable is not PUBLIC it is allocated static storage. Does anybody know why the variable is not allocated static storage by marking TREE_STATIC when it is PUBLIC? The following patch marks the STATIC flag in all cases. It can pass bootstrap and regression test on x86-64. Any comments? thanks Carrot 2013-05-09 Guozhi Wei car...@google.com * dwarf2asm.c (dw2_output_indirect_constant_1): Mark new decl STATIC. Index: dwarf2asm.c === --- dwarf2asm.c (revision 198794) +++ dwarf2asm.c (working copy) @@ -906,6 +906,7 @@ DECL_IGNORED_P (decl) = 1; DECL_INITIAL (decl) = decl; TREE_READONLY (decl) = 1; + TREE_STATIC (decl) = 1; if (TREE_PUBLIC (id)) { @@ -914,8 +915,6 @@ if (USE_LINKONCE_INDIRECT) DECL_VISIBILITY (decl) = VISIBILITY_HIDDEN; } - else -TREE_STATIC (decl) = 1; sym_ref = gen_rtx_SYMBOL_REF (Pmode, sym); assemble_variable (decl, 1, 1, 1);
Re: [Google] Suppress message when primary module entry cannot found
On Fri, May 10, 2013 at 11:32 AM, Teresa Johnson tejohn...@google.com wrote: On Fri, May 10, 2013 at 11:00 AM, Dehao Chen de...@google.com wrote: On Fri, May 10, 2013 at 10:47 AM, Teresa Johnson tejohn...@google.com wrote: Is it only auto fdo that doesn't store the module info if the module is not exported or has aux modules? Note that this will prevent usage of my script that enables annotation of gcov dump info with function names, which relies on accessing the primary module info to obtain the module name (which is then used to locate the module's associated func id to func name mapping optionally emitted into the build output). Yes, this is auto fdo only. And gcov dump will not work on auto fdo profile because it does not use the id at all in the profile (instead, it stores the function name directly). If you are trying to reduce profile sizes, it might be worthwhile to try to use the id instead of the name. Before implementing the wrapper script solution to getting function names in the gcov-info dump I first made a patch to include the function names in the gcda file. This caused a huge bloat in gcda and instrumented file sizes. In AutoFDO, we use names (file names, function names) to match the binary level profile back to source. We do this because: 1. We don't do instrumentation, thus there is no id info in the binary/profile 2. This makes the profile loosely coupled with profile collection runs For the size issue, we only stores profile for those functions with sample counts larger than a threshold. Right now, a typical profile file is around 10M~20M. Thanks, Dehao Teresa Thanks, Dehao Teresa On Fri, May 10, 2013 at 10:37 AM, Dehao Chen de...@google.com wrote: Now we don't store the module info if the module is not exported or has any aux module (to compress the profile data size). Thus it's normal that a primary module entry cannot be found. This patch suppresses the messages printed when the primary module is not found. Bootstrapped and passed regression test. OK for google branch? Thanks, Dehao Index: auto-profile.c === --- auto-profile.c (revision 198751) +++ auto-profile.c (working copy) @@ -497,10 +497,7 @@ read_aux_modules (void) module.name = xstrdup (in_fnames[0]); entry = (struct afdo_module *) htab_find (module_htab, module); if (!entry) -{ - inform (0, primary module %s cannot be found., in_fnames[0]); - return; -} +return; module_infos = XCNEWVEC (struct gcov_module_info *, entry-num_aux_modules + 1); afdo_add_module (module_infos, entry, true); -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413 -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
[PATCH 7/n, i386]: Improve memory attribute handling in vec_extract patterns
Hello! 2013-05-10 Uros Bizjak ubiz...@gmail.com * config/i386/i386.md (memory): Handle sseishft1. * config/i386/sse.md (*vec_extractv4si): Remove memory attribute. (*vec_extractv2di_1): Ditto. Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN. Uros. Index: i386.md === --- i386.md (revision 198780) +++ i386.md (working copy) @@ -615,8 +615,9 @@ !alu1,negnot,ishift1, imov,imovx,icmp,test,bitmanip, fmov,fcmp,fsgn, - sse,ssemov,ssecmp,ssecomi,ssecvt,ssecvt1,sseicvt,sselog1, - sseshuf1,sseadd1,sseiadd1,mmx,mmxmov,mmxcmp,mmxcvt) + sse,ssemov,ssecmp,ssecomi,ssecvt,ssecvt1,sseicvt, + sselog1,sseshuf1,sseadd1,sseiadd1,sseishft1, + mmx,mmxmov,mmxcmp,mmxcvt) (match_operand 2 memory_operand)) (const_string load) (and (eq_attr type icmov,ssemuladd,sse4arg) Index: sse.md === --- sse.md (revision 198780) +++ sse.md (working copy) @@ -7387,7 +7387,6 @@ } [(set_attr isa *,noavx,avx) (set_attr type sselog1,sseishft1,sseishft1) - (set_attr memory *,none,none) (set_attr prefix_extra 1,*,*) (set_attr length_immediate 1) (set_attr prefix maybe_vex,orig,vex) @@ -7446,7 +7445,6 @@ [(set_attr isa x64_sse4,*,sse2_noavx,avx,noavx,*,x64) (set_attr type sselog1,ssemov,sseishft1,sseishft1,ssemov,ssemov,imov) (set_attr length_immediate 1,*,1,1,*,*,*) - (set_attr memory *,*,none,none,*,*,*) (set_attr prefix_rex 1,*,*,*,*,*,*) (set_attr prefix_extra 1,*,*,*,*,*,*) (set_attr prefix maybe_vex,maybe_vex,orig,vex,orig,*,*)
Re: [PATCH] Use types_compatible_p in get_binfo_at_offset
Martin Jambor mjam...@suse.cz wrote: Hi, On Fri, May 10, 2013 at 07:24:06PM +0200, Jan Hubicka wrote: 2013-05-10 Martin Jambor mjam...@suse.cz * tree.c (get_binfo_at_offset): Use types_compatible_p to compare types. Index: src/gcc/tree.c === --- src.orig/gcc/tree.c +++ src/gcc/tree.c @@ -11483,7 +11483,7 @@ get_binfo_at_offset (tree binfo, HOST_WI tree fld; int i; - if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type)) + if (types_compatible_p (type, expected_type)) As discussed on lunch, it seems fine to me, but I am not an expert ;) This doesn't look good to me. With lto all types that are structurally equivalent will compare compatible this way. Probably not what you want. Yep, that was my concern, too. I wonder how to recognize the two types represent the same object. Perhaps by adding an assiciated vtable (via binfos) into the structural match while merging? Honza
Re: wide-int.[ch] review
On May 10, 2013, at 5:12 AM, Richard Biener rguent...@suse.de wrote: Somewhere we talked about naming new C++ files wide-int.cc (or was it .C or .cxx?) for the C++ library: $ find . -name \*.cc | wc -l 5605 $ find . -name \*.C | wc -l 1 :-) Go also prefers .cc.
Re: [Patch:RL78] Fix hardware multiply on G13 target
movwax, %h2 movw0x2, ax ; MDAH nop ; mdc += mdal * mdah + nop ; Additional nop for MAC mov a, #0x40 mov !0xf00e8, a ; MDUC Shouldn't the MOV after the nop add the extra cycle here? movwax, %H2 movw0x2, ax ; MDAH nop ; mdc += mdal * mdah + nop ; Additional nop for MAC movwax, !0xf00e0; MDCL movw%H0, ax Yeah, this one is probably needed.
Re: [PATCH] Use types_compatible_p in get_binfo_at_offset
Jan Hubicka hubi...@ucw.cz wrote: Martin Jambor mjam...@suse.cz wrote: Hi, On Fri, May 10, 2013 at 07:24:06PM +0200, Jan Hubicka wrote: 2013-05-10 Martin Jambor mjam...@suse.cz * tree.c (get_binfo_at_offset): Use types_compatible_p to compare types. Index: src/gcc/tree.c === --- src.orig/gcc/tree.c +++ src/gcc/tree.c @@ -11483,7 +11483,7 @@ get_binfo_at_offset (tree binfo, HOST_WI tree fld; int i; - if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type)) + if (types_compatible_p (type, expected_type)) As discussed on lunch, it seems fine to me, but I am not an expert ;) This doesn't look good to me. With lto all types that are structurally equivalent will compare compatible this way. Probably not what you want. Yep, that was my concern, too. I wonder how to recognize the two types represent the same object. Perhaps by adding an assiciated vtable (via binfos) into the structural match while merging? The existing check should work ok with lto. If not then we should figure out why we do not merge the main variants properly. Richard. Honza
sparc64*-*-rtems* should not define __svr4__
Hi sparc64*-*-rtems* ends up with __svr4__ defined. The attached patch corrects that. OK to apply? 013-05-10 Joel Sherrill joel.sherr...@oarcorp.com * config.gcc (sparc64*-*-rtems*): Use sp64-rtemself.h. RTEMS target should not have -D__svr4__ in CPP_SUBTARGET_SPEC. * config/sparc/sp64-rtemself.h: New file. -- Joel Sherrill, Ph.D. Director of Research Development joel.sherr...@oarcorp.comOn-Line Applications Research Ask me about RTEMS: a free RTOS Huntsville AL 35805 Support Available(256) 722-9985 2013-05-10 Joel Sherrill joel.sherr...@oarcorp.com * config.gcc (sparc64*-*-rtems*): Use sp64-rtemself.h RTEMS target should not have -D__svr4__ in CPP_SUBTARGET_SPEC. * config/sparc/sp64-rtemself.h: New file. Index: gcc/config/sparc/sp64-rtemself.h === --- gcc/config/sparc/sp64-rtemself.h(revision 0) +++ gcc/config/sparc/sp64-rtemself.h(working copy) @@ -0,0 +1,37 @@ +/* Definitions for rtems targeting a SPARC64 using ELF. + Copyright (C) 2013 Free Software Foundation, Inc. + Contributed by Joel Sherrill (joel.sherr...@oarcorp.com). + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. */ + +/* Target OS builtins. */ +#undef TARGET_OS_CPP_BUILTINS +#define TARGET_OS_CPP_BUILTINS() \ + do \ +{ \ + builtin_define (__rtems__); \ + builtin_define (__USE_INIT_FINI__); \ + builtin_assert (system=rtems);\ +} \ + while (0) + +/* Use the default */ +#undef LINK_GCC_C_SEQUENCE_SPEC + +/* we are not svr4 */ +#undef CPP_SUBTARGET_SPEC +#define CPP_SUBTARGET_SPEC Index: gcc/config.gcc === --- gcc/config.gcc (revision 198775) +++ gcc/config.gcc (working copy) @@ -2516,7 +2516,7 @@ tmake_file=${tmake_file} sparc/t-sparc ;; sparc64-*-rtems*) - tm_file=${tm_file} dbxelf.h elfos.h newlib-stdint.h sparc/sysv4.h sparc/sp64-elf.h sparc/rtemself.h rtems.h + tm_file=${tm_file} dbxelf.h elfos.h newlib-stdint.h sparc/sysv4.h sparc/sp64-elf.h sparc/sp64-rtemself.h rtems.h extra_options=${extra_options} tmake_file=${tmake_file} sparc/t-sparc sparc/t-rtems-64 t-rtems ;;
Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
On Fri, May 10, 2013 at 5:54 PM, Teresa Johnson wrote: The main issue I had here, and why I made this change, is that we go in and out of cfglayout mode several times after bb partitioning and then out_of_cfglayout. The problem was that when we subsequently went in and out of cfglayout mode, the switch text section notes that had been inserted by bbpart were getting messed up (they were moved into the bb header when we enter cfglayout mode and then not being transferred to the correct location upon exit). I investigated trying to keep those in sync, but it is really difficult/impossible to do during cfglayout mode when they are in the header. So I simply strip them out completely on entry to cfglayout mode, and if there were any there on entry, this change ensures that they are restored in the appropriate location upon exit. I'm not sure what is a good alternative? The problem is that the note exists at all. I'd like to see the note go away completely eventually (when we have a CFG all the way through pass_final). As a stop-gap, we should not emit the note until pass_free_cfg. Up to that point the basic blocks tell what partition they're in, and all hot block and cold blocks should be in a sequence. So during pass_free_cfg, just walk the basic blocks chain until there's a partition change, and emit the note between those two blocks. I triggered the same error in 445.gobmk once I applied the thread_prologue_and_epilogue_insns fixes. This is an assert in the dwarf CFI code that complains about a NOTE_INSN_SWITCH_TEXTS_SECTION note not being preceeded by a barrier: The problem here is that fixup_reorder_chain should force a jump for basic blocks in one partition falling through to bb-next. That should be dealt with in can_fallthru, which should return false if BB_PARTITION (target) != BB_PARTITION (src). The correct solution in my opinion is to strip out the SWITCH note every time we enter cfglayout mode after bbro, and then invoke insert_section_boundary_note when leaving cfglayout (if one was found on entry to that cfglayout mode) to reapply it. Please make the note go away completely before pass_free_cfg, and earn greater admiration than Zeus. The note always was wrong, and now you've shown it's also a problem. * Fixup redirected edges that did not cross partitions before but apparently do after redirection. This is not supposed to happen in the first place, so fixing up any of this just papers over an error elsewhere in the compiler. Looking back through the earlier email exchanges from last fall I found some discussion on this where I had found a couple places causing this to happen. Two were in thread_prologue_and_epilogue_insns: when we duplicated tail blocks or created a new block to hold a simple return, and redirected some of the edges to the new copy. In some cases this caused edges that were previously region crossing to become non-region crossing, and in other cases the reverse happened. I think there are a couple of other cases mentioned in the emails too, but I would need to do some more digging to find them all. There were a few places in the code that tried to detect this type of issue, and fix them up, but they weren't consistent and there were other places that had the same issue. I've centralized all that handling in this patch (fixup_partitions and fixup_partition_crossing, called from a few places where we redirect edges) so that it is more consistent and comprehensive. Right, I think it's good that you've centralized this code. But it seems to me that we should make sure that the hot blocks and cold blocks are still grouped (i.e. there is only one basic block B such that BB_PARTITION(B) != BB_PARTITION(B-next_bb). That is something your code doesn't handle, AFAICT. It's just one thing that's difficult to maintain, probably there are others. It's also something the partitioning verifier should check, i.e. that the basic blocks in hot and cold partitions are properly grouped. BTW1: I don't understand this comment: + /* Invoke the cleanup again once we are out of cfg layout mode + after committing the final bb layout above. This enables removal + of forwarding blocks across the hot/cold section boundary when + splitting is enabled that were necessary while in cfg layout + mode. */ + if (crtl-has_bb_partition) +cleanup_cfg (CLEANUP_EXPENSIVE); There shouldn't be any forwarder blocks in cfg layout mode. What did you need this for? BTW2: We badly need to figure out a way to create test cases for FDO... :-( Ciao! Steven
Re: [PATCH] Dynamic dispatch of multiversioned functions and CPU mocks for code coverage.
On Fri, 10 May 2013, Sriraman Tallam wrote: On Fri, May 10, 2013 at 6:34 AM, Joseph S. Myers jos...@codesourcery.com wrote: On Thu, 9 May 2013, Sriraman Tallam wrote: Then, I plan to add the following hooks to libgcc (in a different patch) : int set_mock_cpu_is (const char *cpu); int set_mock_cpu_supports (const char *isa); int init_mock_cpu (); // Clear the values of the mock cpu. Those names are in the user's namespace; I think libgcc should only provide or use symbols in the implementation namespace. Shall I just use __builtin prefixes for these too?, would that work? I'm not sure if that's a good idea for something that's actually a library function (we've previously discussed rejecting explicit declarations of __builtin_* identifiers to some extent - see bug 32455 - which would be an issue for defining library functions with such a name if we do decide in future to reject such declarations), but use __ prefixes in some form, certainly. -- Joseph S. Myers jos...@codesourcery.com
Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
There shouldn't be any forwarder blocks in cfg layout mode. What did you need this for? BTW2: We badly need to figure out a way to create test cases for FDO... :-( We have gcc.dg/tree-prof and friends. What do you need to add? Honza
Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
On Fri, May 10, 2013 at 11:10 PM, Jan Hubicka hubi...@ucw.cz wrote: There shouldn't be any forwarder blocks in cfg layout mode. What did you need this for? BTW2: We badly need to figure out a way to create test cases for FDO... :-( We have gcc.dg/tree-prof and friends. What do you need to add? Something to more easily reproduce bugs and derive test cases from them. Ciao! Steven
Re: web ICEs on subreg
On Fri, May 10, 2013 at 11:51 PM, Mike Stump wrote: - if (DF_REF_LOC (*ref) == recog_data.operand_loc[op]) - break; + { + if (DF_REF_LOC (*ref) == recog_data.operand_loc[op]) + break; + /* DF_REF_LOC can be (subreg:DI (reg:TI 5) 8) and +recog_data.operand_loc[op] can be (reg:TI 5), and the above +won't find it. */ + if (GET_CODE (*DF_REF_LOC (*ref)) == SUBREG + SUBREG_REG (*DF_REF_LOC (*ref)) == recog_data.operand_loc[op]) Congrats, you've re-invented DF_REF_REAL_LOC. Ciao! Steven
Re: web ICEs on subreg
On Fri, May 10, 2013 at 11:51 PM, Mike Stump wrote: I have a instruction pattern for my port that reads in part: [(parallel [(set (subreg:DI (match_operand:TI 0 register_operand =r) 0) (mem:DI (plus:DI (match_operand:DI 1 register_operand r) (match_operand:DI 2 const_int_operand (set (subreg:DI (match_dup 0) 8) (mem:DI (plus:DI (match_dup 1) (match_operand:DI 3 const_int_operand])] The subreg seems to give web.c fits. web is looking for the match_dup and failing to find it, due to the fact that DF_REF_LOC is of the form (subreg:DI (reg:TI 5) 8), not (reg:TI 5). If I add a case for SUBREG like in the below, we can then find and process the subreg. So, the question is, is the port completely wrong for trying to use subreg in this fashion? If not, is processing subregs in this way the right way? If yes, Ok? I don't think this port is completely wrong. I've looked at some other ports and there are comparable patterns, e.g. in m68k: ;; We need a separate DEFINE_EXPAND for u?mulsidi3 to be able to use the ;; proper matching constraint. This is because the matching is between ;; the high-numbered word of the DImode operand[0] and operand[1]. (define_expand umulsidi3 [(parallel [(set (subreg:SI (match_operand:DI 0 register_operand ) 4) (mult:SI (match_operand:SI 1 register_operand ) (match_operand:SI 2 register_operand ))) (set (subreg:SI (match_dup 0) 0) (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (match_dup 1)) (zero_extend:DI (match_dup 2))) (const_int 32])] TARGET_68020 !TUNE_68060 !TARGET_COLDFIRE ) Your web.c patch looks correct to me, but I can't approve it. Ciao! Steven
Re: web ICEs on subreg
On May 10, 2013, at 3:29 PM, Steven Bosscher stevenb@gmail.com wrote: Your web.c patch looks correct to me, but I can't approve it. Thanks. Now that you point out the DF accessor, it all makes perfect sense. :-) I've fixed one other instance that has to be as wrong for all the same reasons. Though my port doesn't trip the other case from the test suite, if my port used an IN/OUT, I bet I could make it trip it. Ok? 2013-05-10 Mike Stump mikest...@comcast.net * web.c (union_match_dups): Use DF_REF_REAL_LOC instead. Doing diffs in web.c.~1~: --- web.c.~1~ 2013-01-14 10:39:36.0 -0800 +++ web.c 2013-05-10 15:45:02.0 -0700 @@ -132,13 +132,13 @@ union_match_dups (rtx insn, struct web_e ref = type == OP_IN ? use_link : def_link; entry = type == OP_IN ? use_entry : def_entry; for (; *ref; ref++) - if (DF_REF_LOC (*ref) == recog_data.operand_loc[op]) + if (DF_REF_REAL_LOC (*ref) == recog_data.operand_loc[op]) break; if (!*ref type == OP_INOUT) { for (ref = use_link, entry = use_entry; *ref; ref++) - if (DF_REF_LOC (*ref) == recog_data.operand_loc[op]) + if (DF_REF_REAL_LOC (*ref) == recog_data.operand_loc[op]) break; } --
Re: [PATCH] Improve handle_char_store in strlen pass (PR tree-optimization/57230)
On 05/10/2013 08:49 AM, Jakub Jelinek wrote: Hi! Related to the previously posted patch, I've noticed that on the attached testcase we don't optimize the strlen into constant, even when we easily could. Implemented thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-05-10 Jakub Jelinek ja...@redhat.com PR tree-optimization/57230 * tree-ssa-strlen.c (handle_char_store): Record length for array store from STRING_CST. * gcc.dg/strlenopt-24.c: New test. OK. A bit surprised we weren't already handling this... jeff
Re: [PATCH] Fix up rotate expansion
On 05/10/2013 08:53 AM, Jakub Jelinek wrote: Hi! Our rotate expansion if rotate optab isn't present for the respective mode looks unsafe for rotations by variable count if that count could be 0, because then it invokes right or left shift by bitsize. While on most targets the hw will probably do the right thing (it is fine if x 32 will either yield x or 0, in both cases that ored together with x 0 aka x will still yield x), perhaps gcc could try to optimize based on the fact that undefined behavior can't happen, so IMHO it is better to generate a safer sequence. Ok for trunk? 2013-05-10 Jakub Jelinek ja...@redhat.com * expmed.c (expand_shift_1): For rotations by const0_rtx just return shifted. Use (-op1) (prec - 1) as other_amount instead of prec - op1. Found by inspection? Presumably the rotate was synthesized by GCC from some other set of operations. To be optimizable, we'd have to prove the original sequence triggered undefined behaviour. Seems that we ought to have a testcase, even though it probably means scanning the tree dumps to pick up the undefined behaviour. Approved with a testcase. Out of curiosity, do we recognize the original sequence created by expand_shift_1 as a rotate if they appeared in the original source? jeff
[patch] struct resources: remove unch_memory member
Hello, This unch_memory in struct resources is a left-over from RTX_UNCHANGING_P, but it looks like the change-over to MEM_READONLY_P was done incorrectly: The resource_conflicts_p code now reports conflicts for insns reading readonly memory and insns reading normal memory or no memory at all. Spotted by checking why reorg.c failed to fill some slots that my sched-deps based delay slot scheduler managed to fill. Bootstrappedtested on sparc64-unknown-linux-gnu. OK for trunk? Ciao! Steven * resource.h (struct resources): Remove unch_memory member. (CLEAR_RESOURCE): Don't clear unch_memory. * resource.c (mark_referenced_resources): Don't set it. (mark_set_resources): Likewise. (mark_target_live_regs): Don't clear it. (init_resource_info): Likewise. * reorg.c (resource_conflicts_p): Don't compare it. (redundant_insn): Don't set it. Index: resource.h === --- resource.h (revision 198737) +++ resource.h (working copy) @@ -25,14 +25,13 @@ along with GCC; see the file COPYING3. If not see /* Macro to clear all resources. */ #define CLEAR_RESOURCE(RES)\ - do { (RES)-memory = (RES)-unch_memory = (RES)-volatil = (RES)-cc = 0; \ + do { (RES)-memory = (RES)-volatil = (RES)-cc = 0; \ CLEAR_HARD_REG_SET ((RES)-regs); } while (0) /* The resources used by a given insn. */ struct resources { char memory; /* Insn sets or needs a memory location. */ - char unch_memory;/* Insn sets or needs an unchanging MEM. */ char volatil;/* Insn sets or needs a volatile memory loc. */ char cc; /* Insn sets or needs the condition codes. */ HARD_REG_SET regs; /* Which registers are set or needed. */ Index: resource.c === --- resource.c (revision 198737) +++ resource.c (working copy) @@ -240,9 +240,7 @@ mark_referenced_resources (rtx x, struct resources case MEM: /* If this memory shouldn't change, it really isn't referencing memory. */ - if (MEM_READONLY_P (x)) - res-unch_memory = 1; - else + if (! MEM_READONLY_P (x)) res-memory = 1; res-volatil |= MEM_VOLATILE_P (x); @@ -740,7 +738,6 @@ mark_set_resources (rtx x, struct resources *res, if (in_dest) { res-memory = 1; - res-unch_memory |= MEM_READONLY_P (x); res-volatil |= MEM_VOLATILE_P (x); } @@ -896,7 +893,7 @@ mark_target_live_regs (rtx insns, rtx target, stru /* We have to assume memory is needed, but the CC isn't. */ res-memory = 1; - res-volatil = res-unch_memory = 0; + res-volatil = 0; res-cc = 0; /* See if we have computed this value already. */ @@ -1145,7 +1142,6 @@ init_resource_info (rtx epilogue_insn) end_of_function_needs.cc = 0; end_of_function_needs.memory = 1; - end_of_function_needs.unch_memory = 0; CLEAR_HARD_REG_SET (end_of_function_needs.regs); if (frame_pointer_needed) Index: reorg.c === --- reorg.c (revision 198737) +++ reorg.c (working copy) @@ -276,7 +276,6 @@ static int resource_conflicts_p (struct resources *res1, struct resources *res2) { if ((res1-cc res2-cc) || (res1-memory res2-memory) - || (res1-unch_memory res2-unch_memory) || res1-volatil || res2-volatil) return 1; @@ -1542,7 +1541,6 @@ redundant_insn (rtx insn, rtx target, rtx delay_li /* Insns we pass may not set either NEEDED or SET, so merge them for simpler tests. */ needed.memory |= set.memory; - needed.unch_memory |= set.unch_memory; IOR_HARD_REG_SET (needed.regs, set.regs); /* This insn isn't redundant if it conflicts with an insn that either is
Re: cfgexpand.c patch for [was new port: msp430-elf]
Note that I had to make a few changes (fixes?) in the MI portions of gcc to avoid problems I encountered, I don't know if these changes are correct or if there are better ways to avoid those cases. Those In any case, they should best be posted in separate messages, each one with its own rationale. Here's the first of those... The patch assumes that, by definition, a partial int mode has fewer bits than an int mode of the same size, and thus truncation should be used to go from the int mode to the partial int mode. Index: gcc/cfgexpand.c === --- gcc/cfgexpand.c (revision 198591) +++ gcc/cfgexpand.c (working copy) @@ -3090,13 +3090,17 @@ expand_debug_expr (tree exp) size_t, we need to check for mis-matched modes and correct the addend. */ if (op0 op1 GET_MODE (op0) != VOIDmode GET_MODE (op1) != VOIDmode GET_MODE (op0) != GET_MODE (op1)) { - if (GET_MODE_BITSIZE (GET_MODE (op0)) GET_MODE_BITSIZE (GET_MODE (op1))) + if (GET_MODE_BITSIZE (GET_MODE (op0)) GET_MODE_BITSIZE (GET_MODE (op1)) + /* Don't try to sign-extend SImode to PSImode, for example. */ + || (GET_MODE_BITSIZE (GET_MODE (op0)) == GET_MODE_BITSIZE (GET_MODE (op1)) + GET_MODE_CLASS (GET_MODE (op0)) == MODE_PARTIAL_INT + GET_MODE_CLASS (GET_MODE (op1)) == MODE_INT)) op1 = simplify_gen_unary (TRUNCATE, GET_MODE (op0), op1, GET_MODE (op1)); else /* We always sign-extend, regardless of the signedness of the operand, because the operand is always unsigned here even if the original C expression is signed. */
Re: dwarf2cfi.c patch for [was new port: msp430-elf]
Note that I had to make a few changes (fixes?) in the MI portions of gcc to avoid problems I encountered, I don't know if these changes are correct or if there are better ways to avoid those cases. Those In any case, they should best be posted in separate messages, each one with its own rationale. Here's the second of those... Index: dwarf2cfi.c === --- dwarf2cfi.c (revision 198591) +++ dwarf2cfi.c (working copy) @@ -277,12 +277,19 @@ expand_builtin_init_dwarf_reg_sizes (tre { if (save_mode == VOIDmode) continue; wrote_return_column = true; } size = GET_MODE_SIZE (save_mode); + + /* Entries in the dwarf_reg_size_table must be big enough to hold an _Unwind_Word + even if this is bigger than reg_raw_mode. This can happen on targets where the + pointer size is larger than the integer size, and not a power-of-two. (Eg MSP430). */ + if (size GET_MODE_SIZE (targetm.unwind_word_mode ())) +size = GET_MODE_SIZE (targetm.unwind_word_mode ()); + if (offset 0) continue; emit_move_insn (adjust_address (mem, mode, offset), gen_int_mode (size, mode)); }
Re: [PATCH] Use types_compatible_p in get_binfo_at_offset
The existing check should work ok with lto. If not then we should figure out why we do not merge the main variants properly. Hmm, adding: Index: tree.c === --- tree.c (revision 198796) +++ tree.c (working copy) @@ -11572,6 +11572,12 @@ get_binfo_at_offset (tree binfo, HOST_WI if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type)) return binfo; + else + if (types_compatible_p (type, expected_type)) + { + debug_tree (TYPE_MAIN_VARIANT (type)); + debug_tree (TYPE_MAIN_VARIANT (expected_type)); + } if (offset 0) return NULL_TREE; @@ -11605,6 +11611,12 @@ get_binfo_at_offset (tree binfo, HOST_WI found_binfo = base_binfo; break; } + else + if (types_compatible_p (TREE_TYPE (base_binfo), TREE_TYPE (fld))) + { + debug_tree (TREE_TYPE (base_binfo)); + debug_tree (TREE_TYPE (fld)); + } if (!found_binfo) return NULL_TREE; binfo = found_binfo; seems to bring a lot of positives and by quick inspection most of them seem like same classes. I will try to find some time to debug this more. But glancing over the the dumps, I see many of them just have different name spaces. Do we even attempt to merge namespace_decl? How types from same namespaces in different units are supposed to match? Honza Richard. Honza
Re: [PATCH] Use types_compatible_p in get_binfo_at_offset
seems to bring a lot of positives and by quick inspection most of them seem like same classes. I will try to find some time to debug this more. But glancing over the the dumps, I see many of them just have different name spaces. Do we even attempt to merge namespace_decl? How types from same namespaces in different units are supposed to match? Just as example record_type 0x7ff129ff6bd0 X86Assembler addressable needs-constructing BLK size integer_cst 0x7ff136ee2760 type integer_type 0x7ff1370af0a8 bitsizetype constant 2368 unit size integer_cst 0x7ff136ee2780 type integer_type 0x7ff1370af000 sizetype constant 296 align 64 symtab 0 alias set -1 canonical type 0x7ff136efc498 fields field_decl 0x7ff129fe2850 D.752229 type record_type 0x7ff129ff6150 GenericAssembler needs-constructing TI size integer_cst 0x7ff13709eda0 constant 128 unit size integer_cst 0x7ff13709edc0 constant 16 align 64 symtab 0 alias set -1 canonical type 0x7ff136ec6b28 fields field_decl 0x7ff129fe2098 printer context namespace_decl 0x7ff129e45e60 JSC pointer_to_this pointer_type 0x7ff129c7f888 chain type_decl 0x7ff129fd7e60 GenericAssembler ignored TI file /abuild/jh/mozilla-central2/mozilla-central/js/src/assembler/assembler/X86Assembler.h line 140 col 7 size integer_cst 0x7ff137227400 constant 72 unit size integer_cst 0x7ff136eb50a0 constant 9 align 64 offset_align 128 offset integer_cst 0x7ff13709ed60 constant 0 bit offset integer_cst 0x7ff13709ede0 constant 0 context record_type 0x7ff129ff6bd0 X86Assembler chain field_decl 0x7ff129fe28e8 m_formatter type record_type 0x7ff129ff6b28 X86InstructionFormatter used private nonlocal BLK file /abuild/jh/mozilla-central2/mozilla-central/js/src/assembler/assembler/X86Assembler.h line 3207 col 7 size integer_cst 0x7ff136e64d00 constant 2240 unit size integer_cst 0x7ff136e64d20 constant 280 align 64 offset_align 128 offset integer_cst 0x7ff13709edc0 16 bit offset integer_cst 0x7ff13709ede0 0 context record_type 0x7ff129ff6bd0 X86Assembler chain type_decl 0x7ff129ff75c0 X86Assembler context namespace_decl 0x7ff129e45e60 JSC pointer_to_this pointer_type 0x7ff129c7fa80 chain type_decl 0x7ff129ff7508 X86Assembler record_type 0x7ff12c9e7690 X86Assembler addressable needs-constructing BLK size integer_cst 0x7ff136ee2760 type integer_type 0x7ff1370af0a8 bitsizetype constant 2368 unit size integer_cst 0x7ff136ee2780 type integer_type 0x7ff1370af000 sizetype constant 296 align 64 symtab 0 alias set -1 canonical type 0x7ff136efc498 fields field_decl 0x7ff12db56000 D.789661 type record_type 0x7ff12e6b4000 GenericAssembler needs-constructing TI size integer_cst 0x7ff13709eda0 constant 128 unit size integer_cst 0x7ff13709edc0 constant 16 align 64 symtab 0 alias set -1 canonical type 0x7ff136ec6b28 fields field_decl 0x7ff12c72e8e8 printer context namespace_decl 0x7ff1315587e8 JSC pointer_to_this pointer_type 0x7ff12c6cb738 chain type_decl 0x7ff13155f228 GenericAssembler ignored TI file /abuild/jh/mozilla-central2/mozilla-central/js/src/assembler/assembler/X86Assembler.h line 140 col 0 size integer_cst 0x7ff137227400 constant 72 unit size integer_cst 0x7ff136eb50a0 constant 9 align 64 offset_align 128 offset integer_cst 0x7ff13709ed60 constant 0 bit offset integer_cst 0x7ff13709ede0 constant 0 context record_type 0x7ff12c9e7690 X86Assembler chain field_decl 0x7ff12db56098 m_formatter type record_type 0x7ff12c9e75e8 X86InstructionFormatter used private nonlocal BLK file /abuild/jh/mozilla-central2/mozilla-central/js/src/assembler/assembler/X86Assembler.h line 3207 col 0 size integer_cst 0x7ff136e64d00 constant 2240 unit size integer_cst 0x7ff136e64d20 constant 280 align 64 offset_align 128 offset integer_cst 0x7ff13709edc0 16 bit offset integer_cst 0x7ff13709ede0 0 context record_type 0x7ff12c9e7690 X86Assembler chain type_decl 0x7ff12c673678 X86Assembler context namespace_decl 0x7ff1315587e8 JSC pointer_to_this pointer_type 0x7ff12c6cbf18 chain type_decl 0x7ff12c6732e0 X86Assembler Honza Honza Richard. Honza
Re: web ICEs on subreg
Assuming the patch has been tested on a public port, it is ok for commit. kenny On 05/10/2013 06:52 PM, Mike Stump wrote: On May 10, 2013, at 3:29 PM, Steven Bosscher stevenb@gmail.com wrote: Your web.c patch looks correct to me, but I can't approve it. Thanks. Now that you point out the DF accessor, it all makes perfect sense. :-) I've fixed one other instance that has to be as wrong for all the same reasons. Though my port doesn't trip the other case from the test suite, if my port used an IN/OUT, I bet I could make it trip it. Ok?
Re: wide-int.[ch] review
On Fri, May 10, 2013 at 5:12 AM, Richard Biener rguent...@suse.de wrote: Somewhere we talked about naming new C++ files wide-int.cc (or was it .C or .cxx?) Definitely not .C. I prefer .cc. Ian
Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
On Fri, May 10, 2013 at 2:00 PM, Steven Bosscher stevenb@gmail.com wrote: On Fri, May 10, 2013 at 5:54 PM, Teresa Johnson wrote: The main issue I had here, and why I made this change, is that we go in and out of cfglayout mode several times after bb partitioning and then out_of_cfglayout. The problem was that when we subsequently went in and out of cfglayout mode, the switch text section notes that had been inserted by bbpart were getting messed up (they were moved into the bb header when we enter cfglayout mode and then not being transferred to the correct location upon exit). I investigated trying to keep those in sync, but it is really difficult/impossible to do during cfglayout mode when they are in the header. So I simply strip them out completely on entry to cfglayout mode, and if there were any there on entry, this change ensures that they are restored in the appropriate location upon exit. I'm not sure what is a good alternative? The problem is that the note exists at all. I'd like to see the note go away completely eventually (when we have a CFG all the way through pass_final). As a stop-gap, we should not emit the note until pass_free_cfg. Up to that point the basic blocks tell what partition they're in, and all hot block and cold blocks should be in a sequence. So during pass_free_cfg, just walk the basic blocks chain until there's a partition change, and emit the note between those two blocks. Ok, let me take a stab at that. I triggered the same error in 445.gobmk once I applied the thread_prologue_and_epilogue_insns fixes. This is an assert in the dwarf CFI code that complains about a NOTE_INSN_SWITCH_TEXTS_SECTION note not being preceeded by a barrier: The problem here is that fixup_reorder_chain should force a jump for basic blocks in one partition falling through to bb-next. That should be dealt with in can_fallthru, which should return false if BB_PARTITION (target) != BB_PARTITION (src). That wasn't the issue in this case. Here there was a block that happened to be laid out at the very start of the cold section (it was jumped to from elsewhere, not reached via fall through from its layout predecessor). Thus it was preceded by a switch section note, which was put into the bb header when we entered cfglayout mode for compgoto. The note ended up in the middle of the block when we did some block combining with its cfg predecessor (not the block that preceded it in the layout chain, which was the last hot block in the reorder chain). So the issue here wasn't making sure there is no fall-through across section boundaries. There was already code to prevent this, but my patch contains a few fixes to make sure that is always happening. The correct solution in my opinion is to strip out the SWITCH note every time we enter cfglayout mode after bbro, and then invoke insert_section_boundary_note when leaving cfglayout (if one was found on entry to that cfglayout mode) to reapply it. Please make the note go away completely before pass_free_cfg, and earn greater admiration than Zeus. The note always was wrong, and now you've shown it's also a problem. Ok, I will try. =) * Fixup redirected edges that did not cross partitions before but apparently do after redirection. This is not supposed to happen in the first place, so fixing up any of this just papers over an error elsewhere in the compiler. Looking back through the earlier email exchanges from last fall I found some discussion on this where I had found a couple places causing this to happen. Two were in thread_prologue_and_epilogue_insns: when we duplicated tail blocks or created a new block to hold a simple return, and redirected some of the edges to the new copy. In some cases this caused edges that were previously region crossing to become non-region crossing, and in other cases the reverse happened. I think there are a couple of other cases mentioned in the emails too, but I would need to do some more digging to find them all. There were a few places in the code that tried to detect this type of issue, and fix them up, but they weren't consistent and there were other places that had the same issue. I've centralized all that handling in this patch (fixup_partitions and fixup_partition_crossing, called from a few places where we redirect edges) so that it is more consistent and comprehensive. Right, I think it's good that you've centralized this code. But it seems to me that we should make sure that the hot blocks and cold blocks are still grouped (i.e. there is only one basic block B such that BB_PARTITION(B) != BB_PARTITION(B-next_bb). That is something your code doesn't handle, AFAICT. It's just one thing that's difficult to maintain, probably there are others. It's also something the partitioning verifier should check, i.e. that the basic blocks in hot and cold partitions are properly grouped. Actually, there is already code that verifies this at the
Re: [PATCH v2] Test case for PR55033
2013/5/10 Sebastian Huber sebastian.hu...@embedded-brains.de: v2: Format changes gcc/testsuite/ChangeLog 2013-05-10 Sebastian Huber sebastian.hu...@embedded-brains.de PR target/55033 * gcc.target/powerpc/pr55033.c: New. --- gcc/testsuite/gcc.target/powerpc/pr55033.c | 23 +++ 1 files changed, 23 insertions(+), 0 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr55033.c diff --git a/gcc/testsuite/gcc.target/powerpc/pr55033.c b/gcc/testsuite/gcc.target/powerpc/pr55033.c new file mode 100644 index 000..2c1835e --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr55033.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target powerpc_eabi_ok } */ +/* { dg-options -mcpu=8540 -msoft-float -msdata=eabi -G 8 -fno-common } */ + +extern void f (void); + +struct s +{ + int *p; + int *q; +}; + +extern int a; + +extern const struct s c; + +const struct s c = { a, 0 }; + +void +f (void) +{ + char buf[4] = { 0, 1, 2, 3 }; +} -- 1.7.7 It seems that you forgot to include powerpc maintainer in the CC list. Let me do this for you. :) Best regards, jasonwucj