Re: [PATCH][i386]Fix PR 57756
On Tue, Oct 22, 2013 at 11:05 PM, Sriraman Tallam tmsri...@google.com wrote: This simple patch fixes the -m32 -mno-sse bugs you reported. A few more places where I did not change references to global_options. Uros/Richard: Is this ok to commit? * config/i386/i386.c (ix86_option_override_internal): Change TARGET_SSE2 to TARGET_SSE2_P (opts-...) (ix86_valid_target_attribute_tree): Change TARGET_64BIT to TARGET_64BIT_P (opts-...) Change TARGET_SSE to TARGET_SSE_P (opts-...) OK. Thanks, Uros.
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Hi Richard/Joseph, I noticed, this test case crashes on arm-eabi already witout the patch. extern void abort (void); #define test_type unsigned short #define MAGIC (unsigned short)0x102u typedef struct s{ unsigned char Prefix[1]; test_type Type; }__attribute((__packed__,__aligned__(4))) ss; volatile ss v; ss g; void __attribute__((noinline)) foo (test_type u) { v.Type = u; } test_type __attribute__((noinline)) bar (void) { return v.Type; } However when compiled with -fno-strict-volatile-bitfields it does not crash, but AFAIK the generated code for foo() violates the C++ memory model: foo: @ Function supports interworking. @ args = 0, pretend = 0, frame = 0 @ frame_needed = 0, uses_anonymous_args = 0 @ link register save eliminated. ldr r2, .L2 ldr r3, [r2] bic r3, r3, #16711680 bic r3, r3, #65280 orr r3, r3, r0, asl #8 str r3, [r2] bx lr On Intel the generated code uses unaligned access, but is OK for the memory model: foo: .LFB0: .cfi_startproc movw %di, v+1(%rip) ret Am I right, or is the code OK for the Memory model? Regards Bernd.
Re: [PATCH] Vectorizing abs(char/short/int) on x86.
Hello! Currently GCC could not vectorize abs() operation for integers on x86 with only SSE2 support. For int type, the reason is that the expand on abs() is not defined for vector type. This patch defines such an expand so that abs(int) will be vectorized with only SSE2. +(define_expand absmode2 + [(set (match_operand:VI124_AVX2_48_AVX512F 0 register_operand) + (abs:VI124_AVX2_48_AVX512F + (match_operand:VI124_AVX2_48_AVX512F 1 register_operand)))] + TARGET_SSE2 +{ + if (TARGET_SSE2 !TARGET_SSSE3) +ix86_expand_sse2_absvxsi2 (operands[0], operands[1]); + else if (TARGET_SSSE3) +emit_insn (gen_rtx_SET (VOIDmode, operands[0], +gen_rtx_ABS (MODEmode, operands[1]))); + DONE; +}) This should be written as: (define_expand absmode2 [(set (match_operand:VI124_AVX2_48_AVX512F 0 register_operand) (abs:VI124_AVX2_48_AVX512F (match_operand:VI124_AVX2_48_AVX512F 1 nonimmediate_operand)))] TARGET_SSE2 { if (!TARGET_SSSE3) { ix86_expand_sse2_absvxsi2 (operands[0], operands[1]); DONE; } }) Please note that operands[1] can be a memory operand, so your expander should either handle it (this is preferred) or load the operand to the register at the beginning of the expansion. +void +ix86_expand_sse2_absvxsi2 (rtx op0, rtx op1) This function name implies SImode operands ... please just name it ix86_expand_sse2_abs. Uros.
Re: [RFC] Isolate simplify paths with undefined behaviour
On 10/22/2013 09:00 PM, Jeff Law wrote: So I was poking at this a bit. It's trival to use infer_nonnull_range and to teach infer_nonnull_range to use the returns_nonnull attribute to pick up that return x in an appropriately decorated function implies that x is non-null. We'll need a better place to shove infer_nonnull_range so that it's available to both users. Could you keep in mind that there is considerable interest in a check_nonnull attribute which marks values (parameters, return values, maybe even struct fields) that can be NULL and need to be checked explictly prior to dereference? GCC would then warn if there is a path on which the check is missing. I don't have time at the moment to work on this, but it's on my ever-growing TODO list. :) -- Florian Weimer / Red Hat Product Security Team
Re: [PATCH, PR58805] Add missing check in stmt_local_def for tail-merge
On Tue, 22 Oct 2013, Tom de Vries wrote: Richard, This patch adds a missing check for gimple_vdef in stmt_local_def for the tail-merge pass. Bootstrapped and reg-tested on x86_64. OK for trunk, gcc-4_8-branch? Ok. Thanks, Richard.
Re: [PATCH, PR58805] Add missing check in stmt_local_def for tail-merge
On Tue, 22 Oct 2013, Jeff Law wrote: On 10/22/13 03:58, Tom de Vries wrote: Richard, This patch adds a missing check for gimple_vdef in stmt_local_def for the tail-merge pass. Bootstrapped and reg-tested on x86_64. OK for trunk, gcc-4_8-branch? Thanks, - Tom 2013-10-22 Tom de Vries t...@codesourcery.com PR tree-optimization/58805 * tree-ssa-tail-merge.c (stmt_local_def): Add gimple_vdef check. * gcc.dg/pr58805.c: New test. Doesn't this test belong in an architecture specific directory? Under what conditions can a statement have a VDEF but not be considered as having a side effect by gimple_has_side_effects? It almost seems to me that gimple_has_side_effects may need updating. You seem to misunderstand side-effect, for example *p = 1; has !gimple_has_side_effects but it has a VDEF. Likewise *p = const_call_returing_aggregate (); has !gimple_has_side_effects but it has a VDEF. side-effect is an effect that is not explicitely represented in the gimple stmt you look at. Richard.
Re: New prologue/epilogue code for i386 string functions
Jan Hubicka hubi...@ucw.cz writes: +static void +expand_set_or_movmem_prologue_epilogue_by_misaligned_moves (rtx destmem, rtx srcmem, + rtx *destptr, rtx *srcptr, + enum machine_mode mode, + rtx value, rtx vec_value, + rtx *count, + rtx *done_label, + int size, + int desired_align, + int align, + unsigned HOST_WIDE_INT *min_size, + bool dynamic_check, + bool issetmem) That's a scary prototype. Could you refactor this somehow to not need that many parameters? Perhaps this should be multiple functions. -Andi -- a...@linux.intel.com -- Speaking for myself only
Re: New prologue/epilogue code for i386 string functions
Jan Hubicka hubi...@ucw.cz writes: +static void +expand_set_or_movmem_prologue_epilogue_by_misaligned_moves (rtx destmem, rtx srcmem, + rtx *destptr, rtx *srcptr, + enum machine_mode mode, + rtx value, rtx vec_value, + rtx *count, + rtx *done_label, + int size, + int desired_align, + int align, + unsigned HOST_WIDE_INT *min_size, + bool dynamic_check, + bool issetmem) That's a scary prototype. Could you refactor this somehow to not need that many parameters? Perhaps this should be multiple functions. Well, it became worse with merging memcpy and memset code (but not by that much). The problem here is that the prologue/epilogue code does really quite a lot of things at once. It sort of naturally split into the code handling small memcpy and the branchless code copying first and last N bytes. I can split the function this way, but I think the first one will take pretty much all the existing parameters... I will try to look into this... Honza -Andi -- a...@linux.intel.com -- Speaking for myself only
Re: [PATCH] Use get_range_info in vect_recog_divmod_pattern
On Tue, 22 Oct 2013, Jakub Jelinek wrote: Hi! If VRP tells us that oprnd is always = 0 or always 0, we can generate better code for the divmode vectorization. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Testcase...? Ok with adding one (I suggest a x86 specific one and scanning the assembler dump). Thanks, Richard. 2013-10-22 Jakub Jelinek ja...@redhat.com * tree-vect-patterns.c (vect_recog_divmod_pattern): Optimize sequence based on get_range_info returned range. --- gcc/tree-vect-patterns.c.jj 2013-09-20 09:42:43.048260891 +0200 +++ gcc/tree-vect-patterns.c 2013-10-22 18:21:21.518563159 +0200 @@ -2226,20 +2226,19 @@ vect_recog_divmod_pattern (vecgimple * if (post_shift = prec) return NULL; - /* t1 = oprnd1 h* ml; */ + /* t1 = oprnd0 h* ml; */ t1 = vect_recog_temp_ssa_var (itype, NULL); def_stmt = gimple_build_assign_with_ops (MULT_HIGHPART_EXPR, t1, oprnd0, build_int_cst (itype, ml)); - append_pattern_def_seq (stmt_vinfo, def_stmt); if (add) { /* t2 = t1 + oprnd0; */ + append_pattern_def_seq (stmt_vinfo, def_stmt); t2 = vect_recog_temp_ssa_var (itype, NULL); def_stmt = gimple_build_assign_with_ops (PLUS_EXPR, t2, t1, oprnd0); - append_pattern_def_seq (stmt_vinfo, def_stmt); } else t2 = t1; @@ -2247,27 +2246,57 @@ vect_recog_divmod_pattern (vecgimple * if (post_shift) { /* t3 = t2 post_shift; */ + append_pattern_def_seq (stmt_vinfo, def_stmt); t3 = vect_recog_temp_ssa_var (itype, NULL); def_stmt = gimple_build_assign_with_ops (RSHIFT_EXPR, t3, t2, build_int_cst (itype, post_shift)); - append_pattern_def_seq (stmt_vinfo, def_stmt); } else t3 = t2; - /* t4 = oprnd0 (prec - 1); */ - t4 = vect_recog_temp_ssa_var (itype, NULL); - def_stmt - = gimple_build_assign_with_ops (RSHIFT_EXPR, t4, oprnd0, - build_int_cst (itype, prec - 1)); - append_pattern_def_seq (stmt_vinfo, def_stmt); - - /* q = t3 - t4; or q = t4 - t3; */ - q = vect_recog_temp_ssa_var (itype, NULL); - pattern_stmt - = gimple_build_assign_with_ops (MINUS_EXPR, q, d 0 ? t4 : t3, - d 0 ? t3 : t4); + double_int oprnd0_min, oprnd0_max; + int msb = 1; + if (get_range_info (oprnd0, oprnd0_min, oprnd0_max) == VR_RANGE) + { + if (!oprnd0_min.is_negative ()) + msb = 0; + else if (oprnd0_max.is_negative ()) + msb = -1; + } + + if (msb == 0 d = 0) + { + /* q = t3; */ + q = t3; + pattern_stmt = def_stmt; + } + else + { + /* t4 = oprnd0 (prec - 1); + or if we know from VRP that oprnd0 = 0 + t4 = 0; + or if we know from VRP that oprnd0 0 + t4 = -1; */ + append_pattern_def_seq (stmt_vinfo, def_stmt); + t4 = vect_recog_temp_ssa_var (itype, NULL); + if (msb != 1) + def_stmt + = gimple_build_assign_with_ops (INTEGER_CST, + t4, build_int_cst (itype, msb), + NULL_TREE); + else + def_stmt + = gimple_build_assign_with_ops (RSHIFT_EXPR, t4, oprnd0, + build_int_cst (itype, prec - 1)); + append_pattern_def_seq (stmt_vinfo, def_stmt); + + /* q = t3 - t4; or q = t4 - t3; */ + q = vect_recog_temp_ssa_var (itype, NULL); + pattern_stmt + = gimple_build_assign_with_ops (MINUS_EXPR, q, d 0 ? t4 : t3, + d 0 ? t3 : t4); + } } if (rhs_code == TRUNC_MOD_EXPR) Jakub
Re: [C++14] implement [[deprecated]].
On 10/23/2013 04:47 AM, Jason Merrill wrote: I don't think this patch should be blocked on those bugs. Sure. My point essentially was that as a *GNU* thing we could still pretend to do whatever we wanted is corner cases, etc. As a Standard feature, those are really full blown bugs. Anyway, what *really* matters is that we do work on those ;) Paolo.
Re: Doc patch RFA: Remove docs for -fno-default-inline
On Mon, Oct 21, 2013 at 5:46 PM, Ian Lance Taylor i...@google.com wrote: According to c.opt, the -fdefault-inline patch is ignored, and only exists for backward compatibility. The only use of flag_default_inline was removed by Honza here: http://gcc.gnu.org/ml/gcc-patches/2008-07/msg02104.html This patch removes the docs for -fdefault-inline. Bootstrapped on x86_64-unknown-linux-gnu. OK for mainline? Ok. Thanks, Richard. Ian 2013-10-21 Ian Lance Taylor i...@google.com * doc/invoke.texi (Option Summary): Remove -fno-default-inline. (C++ Dialect Options): Likewise. (Optimize Options): Likewise.
Re: pr37868
On Mon, Oct 21, 2013 at 9:39 PM, Mike Stump mikest...@comcast.net wrote: Concerning: 2008-11-20 Richard Guenther rguent...@suse.de PR tree-optimization/37868 * gcc.dg/torture/pr37868.c: New testcase. * gcc.c-torture/execute/pr38048-1.c: Likewise. * gcc.c-torture/execute/pr38048-2.c: Likewise. So, is there any reason why we can't make the test case portable as in the below instead of skipping on all the targets that blow on misaligned data? Ok? Clearly this was an alias analysis fix so adding a union may no longer trigger the failure mode. Please make sure to go back in time where the testcase failed and see if your adjusted testcase fails _in the same way_. Note that you don't adjust the dg-skip-if in your patch and note that actually there should be no unaligned accesses performed if you instead do typedef unsigned int unaligned_uint __attribute__((aligned(1))); ... bad_bits = ((unsigned int)-1) ^ *(1+(unaligned_uint *) x); but double-check this adjusted testcase still fails without the fix. Richard. diff --git a/gcc/testsuite/gcc.dg/torture/pr37868.c b/gcc/testsuite/gcc.dg/torture/pr37868.c index 380b089..b7ff43f 100644 --- a/gcc/testsuite/gcc.dg/torture/pr37868.c +++ b/gcc/testsuite/gcc.dg/torture/pr37868.c @@ -24,15 +24,18 @@ struct X { int main (void) { - struct X x; + union { +struct X x; +unsigned int i; + } s; unsigned int bad_bits; - x.pad = -1; - x.a = -1; - x.b = -1; - x.c = -1; + s.x.pad = -1; + s.x.a = -1; + s.x.b = -1; + s.x.c = -1; - bad_bits = ((unsigned int)-1) ^ *(1+(unsigned int *) x); + bad_bits = ((unsigned int)-1) ^ *(1+(unsigned int *) s.x); if (bad_bits != 0) abort (); return 0;
Re: [PATCH] Generate fused widening multiply-and-accumulate operations only when the widening multiply has single use
On Tue, Oct 22, 2013 at 12:01 AM, Yufeng Zhang yufeng.zh...@arm.com wrote: Hi, This patch changes the widening_mul pass to fuse the widening multiply with accumulate only when the multiply has single use. The widening_mul pass currently does the conversion regardless of the number of the uses, which can cause poor code-gen in cases like the following: typedef int ArrT [10][10]; void foo (ArrT Arr, int Idx) { Arr[Idx][Idx] = 1; Arr[Idx + 10][Idx] = 2; } On AArch64, after widening_mul, the IR is like _2 = (long unsigned int) Idx_1(D); _3 = Idx_1(D) w* 40; _5 = Arr_4(D) + _3; *_5[Idx_1(D)] = 1; _8 = WIDEN_MULT_PLUS_EXPR Idx_1(D), 40, 400; _9 = Arr_4(D) + _8; *_9[Idx_1(D)] = 2; Where the arrows point, there are redundant widening multiplies. Bootstrap successfully on x86_64. The patch passes the regtest on aarch64, arm and x86_64. OK for the trunk? if (!is_widening_mult_p (rhs1_stmt, type1, mult_rhs1, - type2, mult_rhs2)) + type2, mult_rhs2) + || !has_single_use (rhs1)) please check has_single_use first, it's the cheaper check. Ok with that change (and possibly a testcase). Thanks, Richard. Thanks, Yufeng p.s. Note that x86_64 doesn't suffer from this issue as the corresponding widening multiply accumulate op is not available on the target. gcc/ * tree-ssa-math-opts.c (convert_plusminus_to_widen): Call has_single_use () and not do the conversion if has_single_use () returns false for the multiplication result.
Re: RFC: Add of type-demotion pass
On Tue, Oct 22, 2013 at 4:27 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Oct 18, 2013 at 12:06:35PM +0200, Richard Biener wrote: You can't move type conversion out of the way in most cases as GIMPLE is stronly typed and data sources and sinks can obviously not be promoted (nor can function arguments). So you'll very likely not be able to remove the code from the optimizers, it will only maybe trigger less often. My take on the type demotion and promotion is that we badly need it and the question is just in which pass to do it. The benefit of type demotion is code canonicalization and removing unnecessary computation that e.g. only affects the upper bits that are going to be thrown away anyway, the disadvantage of type demotion of signed operations is that we need to perform them in unsigned type instead and thus we can't perform some loop optimizations based on undefined behavior etc. See e.g. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c1 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c8 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c10 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47477#c16 for some testcases where type demotion can improve generated code. If types are demoted, upper bits of constants go away, SCCVN can find equivalences between SSA_NAMEs that wouldn't be considered before, etc. Indeed demotion for this reason is good and important (we may also be able to remove the code in the frontends and in fold that shorten operations). But given the issue with signed operation type demotion, I think before loop optimizations we should only be doing type demotions that don't result in defining previously undefined behavior operations. But the demotion pass could fill in range information which maybe allows to recover parts of the undefinedness. I guess passes like forwprop, gimple-fold etc. could easily handle the easy cases, where there is a tree of has_single_use SSA_NAMEs that can be demoted, but handling a more complicated web would be harder. Say in: unsigned int a, b, c, d, e, f; unsigned char h, i, j; void foo (void) { unsigned int k = a * 2 + b + 0x1234; unsigned int l = c * 4 + d + 0x23456700; unsigned int m = e * 5 + f, n = k + l - m, o = k - l + m, p = -k + 1; h = n; i = o; j = p; } k, l, m all have multiple imm uses, but still pretty much everything in this function could be demoted to unsigned char, the two large constants could go away as additions of zero, etc. Perhaps that can be seen as little benefit, but what if the above is all s/unsigned int/unsigned long long/;s/unsigned char/unsigned int/ on 32-bit target? RTL subreg pass might help a little bit, but that is too late. Yeah, I'd like to see testcases like this with the expected outcome. For the demotion which changes undefined overflow operations to defined ones, I wonder when is the last pass that usefully makes use of that information, if e.g. we could do the full type demotion already before vectorization somewhere in the loop optimization queue, or if that is still too early. The most important user is number of iterations analysis. Where type demotion and promotion is very important is IMHO vectorization, the code we generate for mixed types vectorization is just huge and terrible. If we can help it by not computing useless upper bits, or on the other side sometimes not doing parts of computations in smaller types, which lead to all the other computations on wider types to be done with bigger vectorization factor, we could improve generated code quality. I wonder if for vectorizations we couldn't use the same thing I wrote recently for if-conversion, for bbs potentially suitable for vectorization (with the right loop form etc.), that is, if we don't do full type demotion before vectorization, check if we'd demote anything and if so, work only on the vectorization only loop copy (or create it), and then try to do some type promotion to minimize number of type sizes in the loop, see the http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47477#c16 (admittedly artificial) testcase for what I mean. After demotion, we could replace the cast of short to char and back just with and (for zero extension) or signed shift right + shift left (for sign extension), etc. And, finally, the question is if we generate good code if we just expand RTL from the demoted types (we'd better be, because user could have written his code in the narrower types from the beginning (well, C implicit promotions make that harder, but fold-const already demotes some computations that appear in a single statement), or if there are advantages of promoting some types, what algorithm to use for that, what cost model, what target hooks etc. I guess that experiments will show that not doing the promotion again will regress things. Ideally we'd promote in a target specific way, just like what
Re: [PATCH] Fix up pr58508.c testcase
On Tue, Oct 22, 2013 at 9:26 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! I've noticed that this testcase fails on i686-linux. The problem is that the vect.exp infrastructure is terrible and dg-options overrides all the required flags. So, vect/ testcases either don't have to have explicit options at all (and the way to express options is through magic naming of the tests), or one can use dg-additional-options. But, in this case you just want to pass the defaults. Regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Btw, it handles dg-additional-options just fine I think. Richard. 2013-10-22 Jakub Jelinek ja...@redhat.com * gcc.dg/vect/pr58508.c: Remove dg-options. --- gcc/testsuite/gcc.dg/vect/pr58508.c.jj 2013-10-21 09:00:49.0 +0200 +++ gcc/testsuite/gcc.dg/vect/pr58508.c 2013-10-22 14:21:16.064261179 +0200 @@ -1,5 +1,4 @@ /* { dg-do compile } */ -/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details } */ /* The GCC vectorizer generates loop versioning for the following loop Jakub
Re: [PATCH] more robust check for multiplier in addressing mode when determining the costs
On Wed, Oct 23, 2013 at 1:42 AM, Igor Shevlyakov igor.shevlya...@gmail.com wrote: I'm working on a chip with some unusual addressing modes. it does have [base_rage+msize*index_reg] and one can't omit base_reg like on x86. But when ivopts tries to calculate the costs of different addressing modes it only checks [mult*reg] to determine allowable multipliers. I modified multiplier_allowed_in_address_p to check both cases. There's no actual testcase that fails it's just my back-end generated sub-optimal code. I tested it on my tree off 4.8.1 and it fixes the problem. Rebuilt trunk for x86_64-linux-gnu. Should it be included in trunk to make check more robust? I think somebody else fixed this on trunk already. Richard. Thanks Igor 2013-10-22 Igor Shevlyakov igor.shevlya...@gmail.com * tree-ssa-loop-ivopts.c (multiplier_allowed_in_address_p ): Check both [reg+mult*reg] and [mult*reg] to determine if multiplier is allowed. Index: tree-ssa-loop-ivopts.c === --- tree-ssa-loop-ivopts.c (revision 203937) +++ tree-ssa-loop-ivopts.c (working copy) @@ -3108,16 +3108,19 @@ multiplier_allowed_in_address_p (HOST_WI { enum machine_mode address_mode = targetm.addr_space.address_mode (as); rtx reg1 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 1); - rtx addr; + rtx reg2 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 2); + rtx addr, scaled; HOST_WIDE_INT i; valid_mult = sbitmap_alloc (2 * MAX_RATIO + 1); bitmap_clear (valid_mult); - addr = gen_rtx_fmt_ee (MULT, address_mode, reg1, NULL_RTX); + scaled = gen_rtx_fmt_ee (MULT, address_mode, reg1, NULL_RTX); + addr = gen_rtx_fmt_ee (PLUS, address_mode, scaled, reg2); for (i = -MAX_RATIO; i = MAX_RATIO; i++) { - XEXP (addr, 1) = gen_int_mode (i, address_mode); - if (memory_address_addr_space_p (mode, addr, as)) + XEXP (scaled, 1) = gen_int_mode (i, address_mode); + if (memory_address_addr_space_p (mode, addr, as) || + memory_address_addr_space_p (mode, scaled, as)) bitmap_set_bit (valid_mult, i + MAX_RATIO); }
Re: [patch] Flatten tree-ssa.h
On Wed, Oct 23, 2013 at 5:21 AM, Andrew MacLeod amacl...@redhat.com wrote: When moving all the prototypes out of tree-flow.h and into individual files, the kitchen sink #include list temporarily propagated it way into the new tree-ssa.h file. tree-ssa.h is now in a position to contain only it's own prototypes. Its include list was: #include bitmap.h #include gimple.h #include gimple-ssa.h #include cgraph.h #include tree-cfg.h #include tree-phinodes.h #include ssa-iterators.h #include tree-ssanames.h #include tree-ssa-loop.h #include tree-into-ssa.h #include tree-dfa.h This patch flattens tree-ssa.h so that it no longer includes *any* of those include files. The process, mostly automated: 1 - takes all the #include's from tree-ssa.h and copies them immediately before #include tree-ssa.h in every .c file which includes it. 2 - Remove each of those include files (as well as tree-ssa.h) one at a time from the .c file (bottom up) and tries to compile it. 3 - Removes any which cause no compilation failures. This way each .c file gets only the includes it actually needed from tree-ssa.h, and often doesn't even require tree-ssa.h itself. I haven't touched any existing includes unless they formed a duplicate of those copied from tree-ssa.h. (and that happens) This touches 147 files (!!!) which included tree-ssa.h only 40 still need tree-ssa.h. tree-ssa.h had 11 include files and the average .c file ended up requiring only 3. bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK? Ok. Thanks, Richard. Andrew
Re: [PATCH] Vectorizing abs(char/short/int) on x86.
On Wed, Oct 23, 2013 at 5:11 AM, pins...@gmail.com wrote: Sent from my iPad On Oct 22, 2013, at 7:23 PM, Cong Hou co...@google.com wrote: This patch aims at PR58762. Currently GCC could not vectorize abs() operation for integers on x86 with only SSE2 support. For int type, the reason is that the expand on abs() is not defined for vector type. This patch defines such an expand so that abs(int) will be vectorized with only SSE2. For abs(char/short), type conversions are needed as the current abs() function/operation does not accept argument of char/short type. Therefore when we want to get the absolute value of a char_val using abs (char_val), it will be converted into abs ((int) char_val). It then can be vectorized, but the generated code is not efficient as lots of packings and unpackings are envolved. But if we convert (char) abs ((int) char_val) to abs (char_val), the vectorizer will be able to generate better code. Same for short. This conversion also enables vectorizing abs(char/short) operation with PABSB and PABSW instructions in SSE3. With only SSE2 support, I developed three methods to expand abs(char/short/int) seperately: 1. For 32 bit int value x, we can get abs (x) from (((signed) x (W-1)) ^ x) - ((signed) x (W-1)). This is better than max (x, -x), which needs bit masking. 2. For 16 bit int value x, we can get abs (x) from max (x, -x), as SSE2 provides PMAXSW instruction. 3. For 8 bit int value x, we can get abs (x) from min ((unsigned char) x, (unsigned char) (-x)), as SSE2 provides PMINUB instruction. The patch is pasted below. Please point out any problem in my patch and analysis. thanks, Cong diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 8a38316..e0f33ee 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,13 @@ +2013-10-22 Cong Hou co...@google.com + + PR target/58762 + * convert.c (convert_to_integer): Convert (char) abs ((int) char_val) + into abs (char_val). Also convert (short) abs ((int) short_val) + into abs (short_val). I don't like this optimization in convert. I think it should be submitted separately and should be done in tree-ssa-forwprop. Also I think you should have a generic (non x86) test case for the above optimization. I agree. Richard. Thanks, Andrew + * config/i386/i386-protos.h (ix86_expand_sse2_absvxsi2): New function. + * config/i386/i386.c (ix86_expand_sse2_absvxsi2): New function. + * config/i386/sse.md: Add SSE2 support to abs (char/int/short). + 2013-10-14 David Malcolm dmalc...@redhat.com * dumpfile.h (gcc::dump_manager): New class, to hold state diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 3ab2f3a..e85f663 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -238,6 +238,7 @@ extern void ix86_expand_mul_widen_evenodd (rtx, rtx, rtx, bool, bool); extern void ix86_expand_mul_widen_hilo (rtx, rtx, rtx, bool, bool); extern void ix86_expand_sse2_mulv4si3 (rtx, rtx, rtx); extern void ix86_expand_sse2_mulvxdi3 (rtx, rtx, rtx); +extern void ix86_expand_sse2_absvxsi2 (rtx, rtx); /* In i386-c.c */ extern void ix86_target_macros (void); diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 02cbbbd..8050e02 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -41696,6 +41696,53 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2) gen_rtx_MULT (mode, op1, op2)); } +void +ix86_expand_sse2_absvxsi2 (rtx op0, rtx op1) +{ + enum machine_mode mode = GET_MODE (op0); + rtx tmp0, tmp1; + + switch (mode) +{ + /* For 32-bit signed integer X, the best way to calculate the absolute + value of X is (((signed) X (W-1)) ^ X) - ((signed) X (W-1)). */ + case V4SImode: + tmp0 = expand_simple_binop (mode, ASHIFTRT, op1, +GEN_INT (GET_MODE_BITSIZE + (GET_MODE_INNER (mode)) - 1), +NULL, 0, OPTAB_DIRECT); + if (tmp0) + tmp1 = expand_simple_binop (mode, XOR, op1, tmp0, + NULL, 0, OPTAB_DIRECT); + if (tmp0 tmp1) + expand_simple_binop (mode, MINUS, tmp1, tmp0, + op0, 0, OPTAB_DIRECT); + break; + + /* For 16-bit signed integer X, the best way to calculate the absolute + value of X is max (X, -X), as SSE2 provides the PMAXSW insn. */ + case V8HImode: + tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0); + if (tmp0) + expand_simple_binop (mode, SMAX, op1, tmp0, op0, 0, + OPTAB_DIRECT); + break; + + /* For 8-bit signed integer X, the best way to calculate the absolute + value of X is min ((unsigned char) X, (unsigned char) (-X)), + as SSE2 provides the PMINUB insn. */ + case V16QImode: + tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0); + if (tmp0) + expand_simple_binop (V16QImode, UMIN, op1, tmp0, op0, 0, + OPTAB_DIRECT); + break; + + default: + break; +} +} + /* Expand an insert into a vector register through pinsr insn. Return
Re: [PATCH] Use get_range_info in vect_recog_divmod_pattern
On Wed, Oct 23, 2013 at 11:14:54AM +0200, Richard Biener wrote: On Tue, 22 Oct 2013, Jakub Jelinek wrote: Hi! If VRP tells us that oprnd is always = 0 or always 0, we can generate better code for the divmode vectorization. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Testcase...? Ok with adding one (I suggest a x86 specific one and scanning the assembler dump). Like this? 2013-10-23 Jakub Jelinek ja...@redhat.com * tree-vect-patterns.c (vect_recog_divmod_pattern): Optimize sequence based on get_range_info returned range. * gcc.target/i386/vect-div-1.c: New test. --- gcc/tree-vect-patterns.c.jj 2013-10-22 18:36:51.947395037 +0200 +++ gcc/tree-vect-patterns.c2013-10-23 11:28:26.211956658 +0200 @@ -2226,20 +2226,19 @@ vect_recog_divmod_pattern (vecgimple * if (post_shift = prec) return NULL; - /* t1 = oprnd1 h* ml; */ + /* t1 = oprnd0 h* ml; */ t1 = vect_recog_temp_ssa_var (itype, NULL); def_stmt = gimple_build_assign_with_ops (MULT_HIGHPART_EXPR, t1, oprnd0, build_int_cst (itype, ml)); - append_pattern_def_seq (stmt_vinfo, def_stmt); if (add) { /* t2 = t1 + oprnd0; */ + append_pattern_def_seq (stmt_vinfo, def_stmt); t2 = vect_recog_temp_ssa_var (itype, NULL); def_stmt = gimple_build_assign_with_ops (PLUS_EXPR, t2, t1, oprnd0); - append_pattern_def_seq (stmt_vinfo, def_stmt); } else t2 = t1; @@ -2247,27 +2246,57 @@ vect_recog_divmod_pattern (vecgimple * if (post_shift) { /* t3 = t2 post_shift; */ + append_pattern_def_seq (stmt_vinfo, def_stmt); t3 = vect_recog_temp_ssa_var (itype, NULL); def_stmt = gimple_build_assign_with_ops (RSHIFT_EXPR, t3, t2, build_int_cst (itype, post_shift)); - append_pattern_def_seq (stmt_vinfo, def_stmt); } else t3 = t2; - /* t4 = oprnd0 (prec - 1); */ - t4 = vect_recog_temp_ssa_var (itype, NULL); - def_stmt - = gimple_build_assign_with_ops (RSHIFT_EXPR, t4, oprnd0, - build_int_cst (itype, prec - 1)); - append_pattern_def_seq (stmt_vinfo, def_stmt); - - /* q = t3 - t4; or q = t4 - t3; */ - q = vect_recog_temp_ssa_var (itype, NULL); - pattern_stmt - = gimple_build_assign_with_ops (MINUS_EXPR, q, d 0 ? t4 : t3, - d 0 ? t3 : t4); + double_int oprnd0_min, oprnd0_max; + int msb = 1; + if (get_range_info (oprnd0, oprnd0_min, oprnd0_max) == VR_RANGE) + { + if (!oprnd0_min.is_negative ()) + msb = 0; + else if (oprnd0_max.is_negative ()) + msb = -1; + } + + if (msb == 0 d = 0) + { + /* q = t3; */ + q = t3; + pattern_stmt = def_stmt; + } + else + { + /* t4 = oprnd0 (prec - 1); +or if we know from VRP that oprnd0 = 0 +t4 = 0; +or if we know from VRP that oprnd0 0 +t4 = -1; */ + append_pattern_def_seq (stmt_vinfo, def_stmt); + t4 = vect_recog_temp_ssa_var (itype, NULL); + if (msb != 1) + def_stmt + = gimple_build_assign_with_ops (INTEGER_CST, + t4, build_int_cst (itype, msb), + NULL_TREE); + else + def_stmt + = gimple_build_assign_with_ops (RSHIFT_EXPR, t4, oprnd0, + build_int_cst (itype, prec - 1)); + append_pattern_def_seq (stmt_vinfo, def_stmt); + + /* q = t3 - t4; or q = t4 - t3; */ + q = vect_recog_temp_ssa_var (itype, NULL); + pattern_stmt + = gimple_build_assign_with_ops (MINUS_EXPR, q, d 0 ? t4 : t3, + d 0 ? t3 : t4); + } } if (rhs_code == TRUNC_MOD_EXPR) --- gcc/testsuite/gcc.target/i386/vect-div-1.c.jj 2013-10-23 11:43:49.089265027 +0200 +++ gcc/testsuite/gcc.target/i386/vect-div-1.c 2013-10-23 11:57:06.387187749 +0200 @@ -0,0 +1,43 @@ +/* { dg-do compile { target sse2 } } */ +/* { dg-options -O2 -ftree-vectorize -fno-common -msse2 } */ + +unsigned short b[1024] = { 0 }; +int a[1024] = { 0 }; + +int +f1 (int x) +{ + int i; + for (i = 0; i 1024; i++) +a[i] = (b[i] + 7) / 15; +} + +int +f2 (int x) +{ + int i; + for (i = 0; i 1024; i++) +a[i] = (b[i] + 7) % 15; +} + +int +f3 (int x) +{ + int i; + for (i = 0; i 1024; i++) +a[i] = (b[i] - 66000) / 15; +} + +int +f4 (int x) +{ + int i; + for (i = 0; i 1024; i++) +a[i] = (b[i] - 66000) % 15; +} + +/* In f1 and f2, VRP can prove the first operand of division or modulo + is always
Re: [PATCH] Use get_range_info in vect_recog_divmod_pattern
On Wed, 23 Oct 2013, Jakub Jelinek wrote: On Wed, Oct 23, 2013 at 11:14:54AM +0200, Richard Biener wrote: On Tue, 22 Oct 2013, Jakub Jelinek wrote: Hi! If VRP tells us that oprnd is always = 0 or always 0, we can generate better code for the divmode vectorization. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Testcase...? Ok with adding one (I suggest a x86 specific one and scanning the assembler dump). Like this? Yes, thanks. Richard. 2013-10-23 Jakub Jelinek ja...@redhat.com * tree-vect-patterns.c (vect_recog_divmod_pattern): Optimize sequence based on get_range_info returned range. * gcc.target/i386/vect-div-1.c: New test. --- gcc/tree-vect-patterns.c.jj 2013-10-22 18:36:51.947395037 +0200 +++ gcc/tree-vect-patterns.c 2013-10-23 11:28:26.211956658 +0200 @@ -2226,20 +2226,19 @@ vect_recog_divmod_pattern (vecgimple * if (post_shift = prec) return NULL; - /* t1 = oprnd1 h* ml; */ + /* t1 = oprnd0 h* ml; */ t1 = vect_recog_temp_ssa_var (itype, NULL); def_stmt = gimple_build_assign_with_ops (MULT_HIGHPART_EXPR, t1, oprnd0, build_int_cst (itype, ml)); - append_pattern_def_seq (stmt_vinfo, def_stmt); if (add) { /* t2 = t1 + oprnd0; */ + append_pattern_def_seq (stmt_vinfo, def_stmt); t2 = vect_recog_temp_ssa_var (itype, NULL); def_stmt = gimple_build_assign_with_ops (PLUS_EXPR, t2, t1, oprnd0); - append_pattern_def_seq (stmt_vinfo, def_stmt); } else t2 = t1; @@ -2247,27 +2246,57 @@ vect_recog_divmod_pattern (vecgimple * if (post_shift) { /* t3 = t2 post_shift; */ + append_pattern_def_seq (stmt_vinfo, def_stmt); t3 = vect_recog_temp_ssa_var (itype, NULL); def_stmt = gimple_build_assign_with_ops (RSHIFT_EXPR, t3, t2, build_int_cst (itype, post_shift)); - append_pattern_def_seq (stmt_vinfo, def_stmt); } else t3 = t2; - /* t4 = oprnd0 (prec - 1); */ - t4 = vect_recog_temp_ssa_var (itype, NULL); - def_stmt - = gimple_build_assign_with_ops (RSHIFT_EXPR, t4, oprnd0, - build_int_cst (itype, prec - 1)); - append_pattern_def_seq (stmt_vinfo, def_stmt); - - /* q = t3 - t4; or q = t4 - t3; */ - q = vect_recog_temp_ssa_var (itype, NULL); - pattern_stmt - = gimple_build_assign_with_ops (MINUS_EXPR, q, d 0 ? t4 : t3, - d 0 ? t3 : t4); + double_int oprnd0_min, oprnd0_max; + int msb = 1; + if (get_range_info (oprnd0, oprnd0_min, oprnd0_max) == VR_RANGE) + { + if (!oprnd0_min.is_negative ()) + msb = 0; + else if (oprnd0_max.is_negative ()) + msb = -1; + } + + if (msb == 0 d = 0) + { + /* q = t3; */ + q = t3; + pattern_stmt = def_stmt; + } + else + { + /* t4 = oprnd0 (prec - 1); + or if we know from VRP that oprnd0 = 0 + t4 = 0; + or if we know from VRP that oprnd0 0 + t4 = -1; */ + append_pattern_def_seq (stmt_vinfo, def_stmt); + t4 = vect_recog_temp_ssa_var (itype, NULL); + if (msb != 1) + def_stmt + = gimple_build_assign_with_ops (INTEGER_CST, + t4, build_int_cst (itype, msb), + NULL_TREE); + else + def_stmt + = gimple_build_assign_with_ops (RSHIFT_EXPR, t4, oprnd0, + build_int_cst (itype, prec - 1)); + append_pattern_def_seq (stmt_vinfo, def_stmt); + + /* q = t3 - t4; or q = t4 - t3; */ + q = vect_recog_temp_ssa_var (itype, NULL); + pattern_stmt + = gimple_build_assign_with_ops (MINUS_EXPR, q, d 0 ? t4 : t3, + d 0 ? t3 : t4); + } } if (rhs_code == TRUNC_MOD_EXPR) --- gcc/testsuite/gcc.target/i386/vect-div-1.c.jj 2013-10-23 11:43:49.089265027 +0200 +++ gcc/testsuite/gcc.target/i386/vect-div-1.c2013-10-23 11:57:06.387187749 +0200 @@ -0,0 +1,43 @@ +/* { dg-do compile { target sse2 } } */ +/* { dg-options -O2 -ftree-vectorize -fno-common -msse2 } */ + +unsigned short b[1024] = { 0 }; +int a[1024] = { 0 }; + +int +f1 (int x) +{ + int i; + for (i = 0; i 1024; i++) +a[i] = (b[i] + 7) / 15; +} + +int +f2 (int x) +{ + int i; + for (i = 0; i 1024; i++) +a[i] = (b[i] + 7) % 15; +} + +int +f3 (int x) +{ + int i; + for (i = 0; i 1024; i++) +a[i] = (b[i] - 66000) / 15; +} + +int +f4 (int x) +{ + int i; + for (i = 0; i 1024; i++) +a[i] = (b[i] -
Re: [PATCH] Fix various reassoc issues (PR tree-optimization/58791, tree-optimization/58775)
On Tue, Oct 22, 2013 at 12:47:41PM -0600, Jeff Law wrote: On 10/22/13 07:09, Jakub Jelinek wrote: I've spent over two days looking at reassoc, fixing spots where we invalidly reused SSA_NAMEs (this results in wrong-debug, as the added guality testcases show, even some ICEs (pr58791-3.c) and wrong range info for SSA_NAMEs) This is something we all need to remember, directly reusing an existing SSA_NAME for a new value and the like this is bad. It's far better to release the name back to the manager and grab a new one. For debug info quality it actually isn't just about using different SSA_NAME, but also not reusing the defining stmt; only then the code will magically try to create debug temporaries and expressions from the old dead defining stmt. -/* Returns the UID of STMT if it is non-NULL. Otherwise return 1. */ +/* Returns true if statement S1 dominates statement S2. Like + stmt_dominates_stmt_p, but uses stmt UIDs to optimize. */ -static inline unsigned -get_stmt_uid_with_default (gimple stmt) +static bool +reassoc_stmt_dominates_stmt_p (gimple s1, gimple s2) +{ + basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2); + + if (!bb1 || s1 == s2) +return true; + + if (!bb2) +return false; Maybe this was carried over from somewhere else, but that looks awful strange. When !bb1 you return true, but if !bb2 you return false. At the least it deserves a comment. bb{1,2} == NULL means a default definition of the corresponding lhs. If bb2 is NULL and bb1 is not, then s2 (assumed to be at the beginning of function) certainly doesn't dominate s1, if bb2 is not NULL and bb1 is, then s1 (assumed to be at the beginning of function) dominates s2, if both bb1 and bb2 are NULL, then both are assumed to be at the same spot, so like the s1 == s2 case. Note, the if (!bb1 || s1 == s2) return true; comes from the tree-ssa-loop-niter.c origin, if (!bb2) return false; does not. + if (bb1 == bb2) +{ + if (gimple_code (s2) == GIMPLE_PHI) +return false; + + if (gimple_code (s1) == GIMPLE_PHI) +return true; Deserves a comment. I know what's going on here, but it's easier to read it in a comment rather than recalling the all-phis-in-parallel rule and verifying this handles that correctly. Ok. + if (gimple_uid (s1) gimple_uid (s2)) +return true; + + if (gimple_uid (s1) gimple_uid (s2)) +return false; So if one (but not both) of the UIDs isn't set yet, one of these two statements will return, which seems wrong since we don't know where the statement without a UID is relative to the statement with a UID. Am I missing something? Right now we shouldn't see an unset uid here at all, unless it is a PHI, which is handled earlier. break_up_subtract_bb initializes them for all preexisting stmts, and the various places which add new stmts are responsible for setting uid to either the uid of the immediately preceeding or following stmt that has uid set (or 1 if the bb was previously empty). + gimple_stmt_iterator gsi = gsi_for_stmt (s1); + unsigned int uid = gimple_uid (s1); + for (gsi_next (gsi); !gsi_end_p (gsi); gsi_next (gsi)) +{ + gimple s = gsi_stmt (gsi); + if (gimple_uid (s) != uid) +break; + if (s == s2) +return true; +} Don't we only get here if both UIDs are zero (or the same by other means)?If so does this code even make sense? Both UIDs zero should not happen, both UIDs the same can happen, we don't renumber the whole bb upon inserting something into the bb. + +/* Insert STMT after INSERT_POINT. */ + +static void +insert_stmt_after (gimple stmt, gimple insert_point) { - return stmt ? gimple_uid (stmt) : 1; + gimple_stmt_iterator gsi; + basic_block bb; + + if (gimple_code (insert_point) == GIMPLE_PHI) +bb = gimple_bb (insert_point); + else if (!stmt_ends_bb_p (insert_point)) +{ + gsi = gsi_for_stmt (insert_point); + gimple_set_uid (stmt, gimple_uid (insert_point)); + gsi_insert_after (gsi, stmt, GSI_NEW_STMT); + return; +} + else +bb = find_fallthru_edge (gimple_bb (insert_point)-succs)-dest; + gsi = gsi_after_labels (bb); + if (gsi_end_p (gsi)) +{ + gimple_stmt_iterator gsi2 = gsi_last_bb (bb); + gimple_set_uid (stmt, + gsi_end_p (gsi2) ? 1 : gimple_uid (gsi_stmt (gsi2))); +} + else +gimple_set_uid (stmt, gimple_uid (gsi_stmt (gsi))); + gsi_insert_before (gsi, stmt, GSI_SAME_STMT); So from reading the name of the function, it's not clear that this inserts on the fallthru edge in the case where INSERT_POINT is the end of a block. And does that make sense? ISTM you'd want it on all the outgoing edges. And you have to worry about things like abnormals, critical edges, etc. Note the above isn't in any way a new code, just a cleanup of the preexisting, just earlier the thing was done in more than one place and
Re: [PATCH][buildrobot] tilepro/tilegx: fallout after tree.h refactoring (was: Re-factor inclusion of tree.h)
On Tue, 2013-10-22 08:46:13 -0400, Diego Novillo dnovi...@google.com wrote: On Tue, Oct 22, 2013 at 4:22 AM, Jan-Benedict Glaw jbg...@lug-owl.de wrote: This fixes it: 2013-10-22 Jan-Benedict Glaw jbg...@lug-owl.de * config/tilepro/tilepro.c: Include tree.h. Sure. It qualifies as obvious. Thanks. Also committed this as obvious: 2013-10-23 Jan-Benedict Glaw jbg...@lug-owl.de * config/tilegx/tilegx.c: Include tree.h. Index: gcc/config/tilegx/tilegx.c === --- gcc/config/tilegx/tilegx.c (revision 203950) +++ gcc/config/tilegx/tilegx.c (working copy) @@ -39,6 +39,7 @@ #include function.h #include dwarf2.h #include timevar.h +#include tree.h #include gimple.h #include cfgloop.h #include tilegx-builtins.h MfG, JBG -- Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481 Signature of: really soon now: an unspecified period of time, likly to the second : be greater than any reasonable definition of soon. signature.asc Description: Digital signature
Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes
eOn 22 Oct 22:55, Jeff Law wrote: On 09/17/13 02:18, Ilya Enkovich wrote: Hi, Here is a patch introducing new type and mode for bounds. It is a part of MPX ISA support patch (http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01094.html). Bootstrapped and tested on linux-x86_64. Is it OK for trunk? Thanks, Ilya -- gcc/ 2013-09-16 Ilya Enkovich ilya.enkov...@intel.com * mode-classes.def (MODE_BOUND): New. * tree.def (BOUND_TYPE): New. * genmodes.c (complete_mode): Support MODE_BOUND. (BOUND_MODE): New. (make_bound_mode): New. * machmode.h (BOUND_MODE_P): New. * stor-layout.c (int_mode_for_mode): Support MODE_BOUND. (layout_type): Support BOUND_TYPE. * tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE. * tree.c (build_int_cst_wide): Support BOUND_TYPE. (type_contains_placeholder_1): Likewise. * tree.h (BOUND_TYPE_P): New. * varasm.c (output_constant): Support BOUND_TYPE. * doc/rtl.texi (MODE_BOUND): New. Mostly OK. Just a few minor things that should be fixed or at least clarified. diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi index 1d62223..02b1214 100644 --- a/gcc/doc/rtl.texi +++ b/gcc/doc/rtl.texi @@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the @file{@var{machine}-modes.def}. @xref{Jump Patterns}, also see @ref{Condition Code}. +@findex MODE_BOUND +@item MODE_BOUND +Bound modes class. Used to represent values of pointer bounds. I can't help but feel more is needed here -- without going into the details of the MPX implementation we ought to say something about how these differ from the more normal integer modes. Drawing from the brief discussion between Richard myself earlier today should give some ideas on how to improve this. I'd probably use MODE_POINTER_BOUNDS which is a bit more descriptive. We wouldn't want someone to (for example) think this stuff relates to array bounds. Obviously a change to MODE_POINTER_BOUNDS would propagate into other places where you use BOUND without a POINTER qualification, such as BOUND_MODE_P which we'd change to POINTER_BOUNDS_MODE_P. diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def index 7207ef7..c5ea215 100644 --- a/gcc/mode-classes.def +++ b/gcc/mode-classes.def @@ -21,6 +21,7 @@ along with GCC; see the file COPYING3. If not see DEF_MODE_CLASS (MODE_RANDOM),/* other */ \ DEF_MODE_CLASS (MODE_CC),/* condition code in a register */ \ DEF_MODE_CLASS (MODE_INT), /* integer */ \ + DEF_MODE_CLASS (MODE_BOUND),/* bounds */ \ DEF_MODE_CLASS (MODE_PARTIAL_INT), /* integer with padding bits */ \ DEF_MODE_CLASS (MODE_FRACT), /* signed fractional number */ \ DEF_MODE_CLASS (MODE_UFRACT),/* unsigned fractional number */ \ Does genmodes do the right thing WRT MAX_INT_MODE and MIN_INT_MODE? I'd be more comfortable if MODE_POINTER_BOUNDS wasn't sitting between MODE_INT and MODE_PARTIAL_INT. I'm not aware of code that iterates over these things that would get confused, but ISTM putting MODE_POINTER_BOUNDS after MODE_PARTIAL_INT is marginally safer. diff --git a/gcc/tree.c b/gcc/tree.c index b469b97..bbbe16e 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -1197,6 +1197,7 @@ build_int_cst_wide (tree type, unsigned HOST_WIDE_INT low, HOST_WIDE_INT hi) case INTEGER_TYPE: case OFFSET_TYPE: +case BOUND_TYPE: if (TYPE_UNSIGNED (type)) { /* Cache 0..N */ So here you're effectively treading POINTER_BOUNDS_TYPE like an integer. I'm guessing there's a number of flags that may not be relevant for your type and which you might want to repurpose (again, I haven't looked at the entire patchset). If so, you want to be real careful here since you'll be looking at (for example) TYPE_UNSIGNED which may not have any real meaning for POINTER_BOUNDS_TYPE. Overall, it seems fairly reasonable -- the biggest concern of mine is in the last comment. Are you going to be repurposing various flag bits in the type? If so, then we have to be more careful in code like above. Jeff Thanks for review! You are right here, treating bounds as integer is wrong. Currently we do not use type flags for bounds, checking TYPE_UNSIGNED is incorrect. Bounds constant in this case is better to be treated as pointer constant when only zero value is a special case. Below is the a new version with all required changes. Ilya -- gcc/ 2013-10-23 Ilya Enkovich ilya.enkov...@intel.com * mode-classes.def (MODE_POINTER_BOUNDS): New. * tree.def (POINTER_BOUNDS_TYPE): New. * genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS. (POINTER_BOUNDS_MODE): New. (make_pointer_bounds_mode): New. * machmode.h
[v3] Fix libstdc++/58815
Hi, for details see the audit trail. Tested x86_64-linux, committed to mainline. Thanks, Paolo. // 2013-10-23 Paolo Carlini paolo.carl...@oracle.com PR libstdc++/58815 * include/decimal/decimal (decimal32::operator long long(), decimal64::operator long long(), decimal128::operator long long()): Add in c++11 mode per n3407. * testsuite/decimal/pr58815.cc: New. Index: include/decimal/decimal === --- include/decimal/decimal (revision 203951) +++ include/decimal/decimal (working copy) @@ -250,8 +250,11 @@ /// Conforming extension: Conversion from scalar decimal type. decimal32(__decfloat32 __z): __val(__z) {} -// 3.2.2.5 Conversion to integral type. (DISABLED) -//operator long long() const { return (long long)__val; } +#if __cplusplus = 201103L +// 3.2.2.5 Conversion to integral type. +// Note: explicit per n3407. +explicit operator long long() const { return (long long)__val; } +#endif // 3.2.2.6 Increment and decrement operators. decimal32 operator++() @@ -333,8 +336,11 @@ /// Conforming extension: Conversion from scalar decimal type. decimal64(__decfloat64 __z): __val(__z) {} -// 3.2.3.5 Conversion to integral type. (DISABLED) -//operator long long() const { return (long long)__val; } +#if __cplusplus = 201103L +// 3.2.3.5 Conversion to integral type. +// Note: explicit per n3407. +explicit operator long long() const { return (long long)__val; } +#endif // 3.2.3.6 Increment and decrement operators. decimal64 operator++() @@ -417,8 +423,11 @@ /// Conforming extension: Conversion from scalar decimal type. decimal128(__decfloat128 __z) : __val(__z) {} -// 3.2.4.5 Conversion to integral type. (DISABLED) -//operator long long() const { return (long long)__val; } +#if __cplusplus = 201103L +// 3.2.4.5 Conversion to integral type. +// Note: explicit per n3407. +explicit operator long long() const { return (long long)__val; } +#endif // 3.2.4.6 Increment and decrement operators. decimal128 operator++() Index: testsuite/decimal/pr58815.cc === --- testsuite/decimal/pr58815.cc(revision 0) +++ testsuite/decimal/pr58815.cc(working copy) @@ -0,0 +1,35 @@ +// { dg-options -std=gnu++11 } +// +// Copyright (C) 2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. + +// { dg-do compile } +// { dg-require-effective-target dfp } + +#include decimal/decimal + +void +test01 () +{ + std::decimal::decimal32 d32; + std::decimal::decimal64 d64; + std::decimal::decimal128 d128; + + static_castlong long(d32); + static_castlong long(d64); + static_castlong long(d128); +}
Re: [PATCH] Fix various reassoc issues (PR tree-optimization/58791, tree-optimization/58775)
On Wed, 23 Oct 2013, Jakub Jelinek wrote: On Tue, Oct 22, 2013 at 12:47:41PM -0600, Jeff Law wrote: On 10/22/13 07:09, Jakub Jelinek wrote: I've spent over two days looking at reassoc, fixing spots where we invalidly reused SSA_NAMEs (this results in wrong-debug, as the added guality testcases show, even some ICEs (pr58791-3.c) and wrong range info for SSA_NAMEs) This is something we all need to remember, directly reusing an existing SSA_NAME for a new value and the like this is bad. It's far better to release the name back to the manager and grab a new one. For debug info quality it actually isn't just about using different SSA_NAME, but also not reusing the defining stmt; only then the code will magically try to create debug temporaries and expressions from the old dead defining stmt. -/* Returns the UID of STMT if it is non-NULL. Otherwise return 1. */ +/* Returns true if statement S1 dominates statement S2. Like + stmt_dominates_stmt_p, but uses stmt UIDs to optimize. */ -static inline unsigned -get_stmt_uid_with_default (gimple stmt) +static bool +reassoc_stmt_dominates_stmt_p (gimple s1, gimple s2) +{ + basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2); + + if (!bb1 || s1 == s2) +return true; + + if (!bb2) +return false; Maybe this was carried over from somewhere else, but that looks awful strange. When !bb1 you return true, but if !bb2 you return false. At the least it deserves a comment. bb{1,2} == NULL means a default definition of the corresponding lhs. If bb2 is NULL and bb1 is not, then s2 (assumed to be at the beginning of function) certainly doesn't dominate s1, if bb2 is not NULL and bb1 is, then s1 (assumed to be at the beginning of function) dominates s2, if both bb1 and bb2 are NULL, then both are assumed to be at the same spot, so like the s1 == s2 case. Note, the if (!bb1 || s1 == s2) return true; comes from the tree-ssa-loop-niter.c origin, if (!bb2) return false; does not. + if (bb1 == bb2) +{ + if (gimple_code (s2) == GIMPLE_PHI) + return false; + + if (gimple_code (s1) == GIMPLE_PHI) + return true; Deserves a comment. I know what's going on here, but it's easier to read it in a comment rather than recalling the all-phis-in-parallel rule and verifying this handles that correctly. Ok. + if (gimple_uid (s1) gimple_uid (s2)) + return true; + + if (gimple_uid (s1) gimple_uid (s2)) + return false; So if one (but not both) of the UIDs isn't set yet, one of these two statements will return, which seems wrong since we don't know where the statement without a UID is relative to the statement with a UID. Am I missing something? Right now we shouldn't see an unset uid here at all, unless it is a PHI, which is handled earlier. break_up_subtract_bb initializes them for all preexisting stmts, and the various places which add new stmts are responsible for setting uid to either the uid of the immediately preceeding or following stmt that has uid set (or 1 if the bb was previously empty). + gimple_stmt_iterator gsi = gsi_for_stmt (s1); + unsigned int uid = gimple_uid (s1); + for (gsi_next (gsi); !gsi_end_p (gsi); gsi_next (gsi)) + { +gimple s = gsi_stmt (gsi); +if (gimple_uid (s) != uid) + break; +if (s == s2) + return true; + } Don't we only get here if both UIDs are zero (or the same by other means)?If so does this code even make sense? Both UIDs zero should not happen, both UIDs the same can happen, we don't renumber the whole bb upon inserting something into the bb. + +/* Insert STMT after INSERT_POINT. */ + +static void +insert_stmt_after (gimple stmt, gimple insert_point) { - return stmt ? gimple_uid (stmt) : 1; + gimple_stmt_iterator gsi; + basic_block bb; + + if (gimple_code (insert_point) == GIMPLE_PHI) +bb = gimple_bb (insert_point); + else if (!stmt_ends_bb_p (insert_point)) +{ + gsi = gsi_for_stmt (insert_point); + gimple_set_uid (stmt, gimple_uid (insert_point)); + gsi_insert_after (gsi, stmt, GSI_NEW_STMT); + return; +} + else +bb = find_fallthru_edge (gimple_bb (insert_point)-succs)-dest; + gsi = gsi_after_labels (bb); + if (gsi_end_p (gsi)) +{ + gimple_stmt_iterator gsi2 = gsi_last_bb (bb); + gimple_set_uid (stmt, +gsi_end_p (gsi2) ? 1 : gimple_uid (gsi_stmt (gsi2))); +} + else +gimple_set_uid (stmt, gimple_uid (gsi_stmt (gsi))); + gsi_insert_before (gsi, stmt, GSI_SAME_STMT); So from reading the name of the function, it's not clear that this inserts on the fallthru edge in the case where INSERT_POINT is the end of a block. And does that make sense? ISTM you'd want it on all the outgoing edges. And you have to worry about things like abnormals, critical edges,
Re: [v3] Fix libstdc++/58815
2013/10/23 Paolo Carlini paolo.carl...@oracle.com: Hi, for details see the audit trail. Tested x86_64-linux, committed to mainline. I suggest to initialize the decimal values in the new unit tests. If http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3407.html would be applied these initializations would become default-initializations thus leaving them uninitialized. This might trap during the static_cast. - Daniel
Re: [PATCH] Fix various reassoc issues (PR tree-optimization/58791, tree-optimization/58775)
On Wed, Oct 23, 2013 at 01:54:05PM +0200, Richard Biener wrote: I'm ok with the patch. Not sure if we want to backport it at all - we will only have wrong-debug issues (well, only). I've added some comments and one assert (that the uids are non-zero). Because of that assertion I'll bootstrap/regtest it again. 2013-10-23 Jakub Jelinek ja...@redhat.com PR tree-optimization/58775 PR tree-optimization/58791 * tree-ssa-reassoc.c (reassoc_stmt_dominates_stmt_p): New function. (insert_stmt_after): Rewritten, don't move the stmt, but really insert it. (get_stmt_uid_with_default): Remove. (build_and_add_sum): Use insert_stmt_after and reassoc_stmt_dominates_stmt_p. Fix up uid if bb contains only labels. (update_range_test): Set uid on stmts added by force_gimple_operand_gsi. Don't immediately modify statements in inter-bb optimization, just update oe-op values. (optimize_range_tests): Return bool whether any changed have been made. (update_ops): New function. (struct inter_bb_range_test_entry): New type. (maybe_optimize_range_tests): Perform statement changes here. (not_dominated_by, appears_later_in_bb, get_def_stmt, ensure_ops_are_available): Remove. (find_insert_point): Rewritten. (rewrite_expr_tree): Remove MOVED argument, add CHANGED argument, return LHS of the (new resp. old) stmt. Don't call ensure_ops_are_available, don't reuse SSA_NAMEs, recurse first instead of last, move new stmt at the right place. (linearize_expr, repropagate_negates): Don't reuse SSA_NAMEs. (negate_value): Likewise. Set uids. (break_up_subtract_bb): Initialize uids. (reassociate_bb): Adjust rewrite_expr_tree caller. (do_reassoc): Don't call renumber_gimple_stmt_uids. * gcc.dg/guality/pr58791-1.c: New test. * gcc.dg/guality/pr58791-2.c: New test. * gcc.dg/guality/pr58791-3.c: New test. * gcc.dg/guality/pr58791-4.c: New test. * gcc.dg/guality/pr58791-5.c: New test. * gcc.c-torture/compile/pr58775.c: New test. * gcc.dg/tree-ssa/reassoc-28.c: Don't scan reassoc1 dump. --- gcc/tree-ssa-reassoc.c.jj 2013-10-22 18:36:51.880395382 +0200 +++ gcc/tree-ssa-reassoc.c 2013-10-23 12:54:49.550475286 +0200 @@ -1143,12 +1143,94 @@ zero_one_operation (tree *def, enum tree while (1); } -/* Returns the UID of STMT if it is non-NULL. Otherwise return 1. */ +/* Returns true if statement S1 dominates statement S2. Like + stmt_dominates_stmt_p, but uses stmt UIDs to optimize. */ -static inline unsigned -get_stmt_uid_with_default (gimple stmt) +static bool +reassoc_stmt_dominates_stmt_p (gimple s1, gimple s2) +{ + basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2); + + /* If bb1 is NULL, it should be a GIMPLE_NOP def stmt of an (D) + SSA_NAME. Assume it lives at the beginning of function and + thus dominates everything. */ + if (!bb1 || s1 == s2) +return true; + + /* If bb2 is NULL, it doesn't dominate any stmt with a bb. */ + if (!bb2) +return false; + + if (bb1 == bb2) +{ + /* PHIs in the same basic block are assumed to be +executed all in parallel, if only one stmt is a PHI, +it dominates the other stmt in the same basic block. */ + if (gimple_code (s1) == GIMPLE_PHI) + return true; + + if (gimple_code (s2) == GIMPLE_PHI) + return false; + + gcc_assert (gimple_uid (s1) gimple_uid (s2)); + + if (gimple_uid (s1) gimple_uid (s2)) + return true; + + if (gimple_uid (s1) gimple_uid (s2)) + return false; + + gimple_stmt_iterator gsi = gsi_for_stmt (s1); + unsigned int uid = gimple_uid (s1); + for (gsi_next (gsi); !gsi_end_p (gsi); gsi_next (gsi)) + { + gimple s = gsi_stmt (gsi); + if (gimple_uid (s) != uid) + break; + if (s == s2) + return true; + } + + return false; +} + + return dominated_by_p (CDI_DOMINATORS, bb2, bb1); +} + +/* Insert STMT after INSERT_POINT. */ + +static void +insert_stmt_after (gimple stmt, gimple insert_point) { - return stmt ? gimple_uid (stmt) : 1; + gimple_stmt_iterator gsi; + basic_block bb; + + if (gimple_code (insert_point) == GIMPLE_PHI) +bb = gimple_bb (insert_point); + else if (!stmt_ends_bb_p (insert_point)) +{ + gsi = gsi_for_stmt (insert_point); + gimple_set_uid (stmt, gimple_uid (insert_point)); + gsi_insert_after (gsi, stmt, GSI_NEW_STMT); + return; +} + else +/* We assume INSERT_POINT is a SSA_NAME_DEF_STMT of some SSA_NAME, + thus if it must end a basic block, it should be a call that can + throw, or some assignment that can throw. If it throws, the LHS + of it will not be initialized though, so only valid places using + the SSA_NAME should be dominated
[PATCH] Fix PR58830, backport fix for PR57488
Boostrapped and tested on x86_64-unknonw-linux-gnu, applied. Richard. 2013-10-23 Richard Biener rguent...@suse.de Backport from mainline 2013-06-24 Richard Biener rguent...@suse.de PR tree-optimization/57488 * tree-ssa-pre.c (insert): Clear NEW sets before each iteration. * gcc.dg/torture/pr57488.c: New testcase. Index: gcc/tree-ssa-pre.c === *** gcc/tree-ssa-pre.c (revision 200237) --- gcc/tree-ssa-pre.c (working copy) *** insert (void) *** 3665,3670 --- 3666,3677 if (dump_file dump_flags TDF_DETAILS) fprintf (dump_file, Starting insert iteration %d\n, num_iterations); new_stuff = insert_aux (ENTRY_BLOCK_PTR); + + /* Clear the NEW sets before the next iteration. We have already + fully propagated its contents. */ + if (new_stuff) + FOR_ALL_BB (bb) + bitmap_set_free (NEW_SETS (bb)); } statistics_histogram_event (cfun, insert iterations, num_iterations); } Index: gcc/testsuite/gcc.dg/torture/pr57488.c === *** gcc/testsuite/gcc.dg/torture/pr57488.c (revision 0) --- gcc/testsuite/gcc.dg/torture/pr57488.c (working copy) *** *** 0 --- 1,58 + /* { dg-do run } */ + + extern void abort (void); + + int i, j, *pj = j, **ppj = pj; + int x, *px = x; + + short s, *ps = s, k; + + unsigned short u, *pu = u, **ppu = pu; + + char c, *pc = c; + + unsigned char v = 48; + + static int + bar (int p) + { + p = k; + *px = **ppu = i; + *ppj = p; + if (**ppj) + *pj = p; + return p; + } + + void __attribute__((noinline)) + foo () + { + for (; i = 3; i++) + for (; j; j--); + + u ^= bar (*pj); + + for (k = 1; k = 0; k--) + { + int l; + bar (0); + for (l = 1; l 5; l++) + { + int m; + for (m = 6; m; m--) + { + v--; + *ps = *pc; + } + } + } + } + + int + main () + { + foo (); + if (v != 0) + abort (); + return 0; + } 2013-10-23 Richard Biener rguent...@suse.de * gcc.dg/torture/pr58830.c: New testcase. Index: gcc/testsuite/gcc.dg/torture/pr58830.c === --- gcc/testsuite/gcc.dg/torture/pr58830.c (revision 0) +++ gcc/testsuite/gcc.dg/torture/pr58830.c (working copy) @@ -0,0 +1,42 @@ +/* { dg-do run } */ +/* { dg-additional-options -ftree-pre -ftree-partial-pre } */ + +extern void abort (void); + +int b, c, d, f, g, h, i, j[6], *l = b, *m, n, *o, r; +char k; + +static int +foo () +{ + char *p = k; + + for (; d; d++) +if (i) + h = 0; +else + h = c || (r = 0); + + for (f = 0; f 2; f++) +{ + unsigned int q; + *l = 0; + if (n) + *m = g; + if (g) + o = 0; + for (q = -8; q = 5; q++) + (*p)--; +} + + return 0; +} + +int +main () +{ + foo (); + if (j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[j[0] ^ (k 15)] != 0) +abort (); + return 0; +}
Re: [wide-int] Make trees more like rtxes
Richard Biener rguent...@suse.de writes: The patch does that by adding: wi::address (t) for when we want to extend tree t to addr_wide_int precision and: wi::extend (t) for when we want to extend it to max_wide_int precision. (Better names welcome.) These act just like addr_wide_int (t) and max_wide_int (t) would on current sources, except that they use the tree representation directly, so there's no copying. Good. Better names - ah well, wi::to_max_wide_int (t) and wi::to_addr_wide_int (t)? Btw, addr_wide_int is an odd name as it has at least the precision of the maximum _bit_ offset possible, right? So more like [bit_]offset_wide_int? Or just [bit_]offset_int? And then wi::to_offset (t) and wi::to_max (t)? offset_int, max_int, wi::to_offset and wi::to_max sound OK to me. Kenny? Mike? Most of the patch is mechanical and many of the wi::address (...)s and wi::extend (...)s reinstate addr_wide_int (...)s and max_wide_int (...)s from the initial implementation. Sorry for the run-around on this. One change I'd like to point out though is: @@ -7287,7 +7287,9 @@ native_encode_int (const_tree expr, unsi for (byte = 0; byte total_bytes; byte++) { int bitpos = byte * BITS_PER_UNIT; - value = wi::extract_uhwi (expr, bitpos, BITS_PER_UNIT); + /* Extend EXPR according to TYPE_SIGN if the precision isn't a whole + number of bytes. */ + value = wi::extract_uhwi (wi::extend (expr), bitpos, BITS_PER_UNIT); if (total_bytes UNITS_PER_WORD) { I think this preserves the existing trunk behaviour but I wasn't sure whether it was supposed to work like that or whether upper bits should be zero. I think the upper bits are undefined, the trunk native_interpret_int does result = double_int::from_buffer (ptr, total_bytes); return double_int_to_tree (type, result); where the call to double_int_to_tree re-extends according to the types precision and sign. wide_int_to_tree doesn't though? This is native_encode_int rather than native_interpret_int though. AIUI it's used for VIEW_CONVERT_EXPRs, so I thought the upper bits might get used. Thanks, Richard
Re: [wide-int] Make trees more like rtxes
On Wed, 23 Oct 2013, Richard Sandiford wrote: Richard Biener rguent...@suse.de writes: The patch does that by adding: wi::address (t) for when we want to extend tree t to addr_wide_int precision and: wi::extend (t) for when we want to extend it to max_wide_int precision. (Better names welcome.) These act just like addr_wide_int (t) and max_wide_int (t) would on current sources, except that they use the tree representation directly, so there's no copying. Good. Better names - ah well, wi::to_max_wide_int (t) and wi::to_addr_wide_int (t)? Btw, addr_wide_int is an odd name as it has at least the precision of the maximum _bit_ offset possible, right? So more like [bit_]offset_wide_int? Or just [bit_]offset_int? And then wi::to_offset (t) and wi::to_max (t)? offset_int, max_int, wi::to_offset and wi::to_max sound OK to me. Kenny? Mike? Most of the patch is mechanical and many of the wi::address (...)s and wi::extend (...)s reinstate addr_wide_int (...)s and max_wide_int (...)s from the initial implementation. Sorry for the run-around on this. One change I'd like to point out though is: @@ -7287,7 +7287,9 @@ native_encode_int (const_tree expr, unsi for (byte = 0; byte total_bytes; byte++) { int bitpos = byte * BITS_PER_UNIT; - value = wi::extract_uhwi (expr, bitpos, BITS_PER_UNIT); + /* Extend EXPR according to TYPE_SIGN if the precision isn't a whole + number of bytes. */ + value = wi::extract_uhwi (wi::extend (expr), bitpos, BITS_PER_UNIT); if (total_bytes UNITS_PER_WORD) { I think this preserves the existing trunk behaviour but I wasn't sure whether it was supposed to work like that or whether upper bits should be zero. I think the upper bits are undefined, the trunk native_interpret_int does result = double_int::from_buffer (ptr, total_bytes); return double_int_to_tree (type, result); where the call to double_int_to_tree re-extends according to the types precision and sign. wide_int_to_tree doesn't though? This is native_encode_int rather than native_interpret_int though. Yes, I was looking at the matched interpret variant though to see what we do. AIUI it's used for VIEW_CONVERT_EXPRs, so I thought the upper bits might get used. Yeah, that might happen, but still relying on the upper bits in any way would be brittle here. Richard.
Re: [wide-int] Make trees more like rtxes
On 10/23/2013 08:13 AM, Richard Biener wrote: On Wed, 23 Oct 2013, Richard Sandiford wrote: Richard Biener rguent...@suse.de writes: The patch does that by adding: wi::address (t) for when we want to extend tree t to addr_wide_int precision and: wi::extend (t) for when we want to extend it to max_wide_int precision. (Better names welcome.) These act just like addr_wide_int (t) and max_wide_int (t) would on current sources, except that they use the tree representation directly, so there's no copying. Good. Better names - ah well, wi::to_max_wide_int (t) and wi::to_addr_wide_int (t)? Btw, addr_wide_int is an odd name as it has at least the precision of the maximum _bit_ offset possible, right? So more like [bit_]offset_wide_int? Or just [bit_]offset_int? And then wi::to_offset (t) and wi::to_max (t)? offset_int, max_int, wi::to_offset and wi::to_max sound OK to me. Kenny? Mike? Most of the patch is mechanical and many of the wi::address (...)s and wi::extend (...)s reinstate addr_wide_int (...)s and max_wide_int (...)s from the initial implementation. Sorry for the run-around on this. One change I'd like to point out though is: @@ -7287,7 +7287,9 @@ native_encode_int (const_tree expr, unsi for (byte = 0; byte total_bytes; byte++) { int bitpos = byte * BITS_PER_UNIT; - value = wi::extract_uhwi (expr, bitpos, BITS_PER_UNIT); + /* Extend EXPR according to TYPE_SIGN if the precision isn't a whole +number of bytes. */ + value = wi::extract_uhwi (wi::extend (expr), bitpos, BITS_PER_UNIT); if (total_bytes UNITS_PER_WORD) { I think this preserves the existing trunk behaviour but I wasn't sure whether it was supposed to work like that or whether upper bits should be zero. I think the upper bits are undefined, the trunk native_interpret_int does result = double_int::from_buffer (ptr, total_bytes); return double_int_to_tree (type, result); where the call to double_int_to_tree re-extends according to the types precision and sign. wide_int_to_tree doesn't though? This is native_encode_int rather than native_interpret_int though. Yes, I was looking at the matched interpret variant though to see what we do. the wide_int_to_tree really needs to canonicalize the value before making it into a tree. the calls to tree_fits*_p (the successor to host_integer_p) depend on this being clean. Otherwise, these functions will have to clean the short integers and they get called all over the place. AIUI it's used for VIEW_CONVERT_EXPRs, so I thought the upper bits might get used. Yeah, that might happen, but still relying on the upper bits in any way would be brittle here. Richard.
Re: Patch: Add #pragma ivdep support to the ME and C FE
On Mon, 21 Oct 2013, Tobias Burnus wrote: Dear all, attached is a new version of the patch. Changes: * #pragma GCC ivdep instead of #pragma ivdep * Corrections to the error message in c-parser.c and a test case for it * New wording in the .texi and examples I am still not completely happy with the wording ? and I am open for suggestions. In the example, I played safe and mention k -m and k =m; even if k = 0 probably works. I also didn't know how to best state the reason for requiring a condition. (Internal reason: The annotation is attached to the condition - thus, it has to be present. External reason: For vectorization, there shouldn't be a branching in the body of the loop and without a condition in either the for header or in its body, one has an endless loop.) Do you have suggestions for a better wording? If not, is the patch okay for the trunk? Built and regtested (C only). [An all-language bootstrap + regtesting is underway.] Crossrefs: C review comments: http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00803.html OpenMPv4's omp simd wording, see bottom of the email at http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01194.html I don't like that you change flow_loops_find. Please instead amend tree-cfg.c:execute_build_cfg and do after loop_optimizer_init sth like FOR_EACH_LOOP (li, loop, 0) if (loop-latch) ... possibly doing after it FOR_EACH_BB (bb) for (... stmts ...) if (gimple_call_internal_p (stmt) gimple_call_internal_fn (stmt) == IFN_ANNOTATE) { warning (ignoring ...); remove it } you can of course factor this out into a function in cfgloop.c. This whole ANNOTATE stuff is only to carry information from the frontend through gimplification and until loops are first built. +/* ANNOTATE_EXPR. + Operand 0 is the expression. + Operand 1 is the annotation id, FIXME */ +DEFTREECODE (ANNOTATE_EXPR, annotate_expr, tcc_expression, 1) FIXME? Btw, not using a tree operand here is somewhat of a premature optimization I think ... Thanks, Richard.
Re: [c++-concepts] small tidbits to get it to build
Hi Ed, It looks like we did reserve assume as a keyword, but it's not being used... By any chance, did you configure without --disable-bootstrap? I think it would be a better solution to remove the unused keywords -- there were a couple of others that we grabbed for some other concepts-related work, but which aren't included in Concepts Lite. I'll apply the typeck fix. Andrew Sutton On Tue, Oct 22, 2013 at 10:02 PM, Ed Smith-Rowland 3dw...@verizon.net wrote: I had to get past two small bugs to get c++-concepts to build. Take a good look because I'm not sure if they're right. The solutions should be harmless though. Ed
Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
On Wed, Oct 23, 2013 at 9:11 AM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi Richard/Joseph, I noticed, this test case crashes on arm-eabi already witout the patch. extern void abort (void); #define test_type unsigned short #define MAGIC (unsigned short)0x102u typedef struct s{ unsigned char Prefix[1]; test_type Type; }__attribute((__packed__,__aligned__(4))) ss; volatile ss v; ss g; void __attribute__((noinline)) foo (test_type u) { v.Type = u; } test_type __attribute__((noinline)) bar (void) { return v.Type; } However when compiled with -fno-strict-volatile-bitfields it does not crash, but AFAIK the generated code for foo() violates the C++ memory model: foo: @ Function supports interworking. @ args = 0, pretend = 0, frame = 0 @ frame_needed = 0, uses_anonymous_args = 0 @ link register save eliminated. ldrr2, .L2 ldrr3, [r2] bicr3, r3, #16711680 bicr3, r3, #65280 orrr3, r3, r0, asl #8 strr3, [r2] bxlr On Intel the generated code uses unaligned access, but is OK for the memory model: foo: .LFB0: .cfi_startproc movw%di, v+1(%rip) ret Am I right, or is the code OK for the Memory model? The C++ memory model says that you may not introduce a data-race and thus you have to access Type without touching Prefix. Richard. Regards Bernd.
[c++-concepts] Specialization of concepts
This patch disallows the explicit specialization of concepts, as required by the specification. It also fixes an ICE when comparing overloads of non-template members. 2013-10-23 Andrew Sutton andrew.n.sut...@gmail.com * gcc/cp/class.c (get_member_fntemplate): New. (are_constrained_member_overloads): Only get a template declaration if the member function is, in fact, a template or temploid. * gcc/cp/pt.c (check_explicit_specialization): Do not allow explicit specializations to be declared 'concept', and do not allow an explicit specialization of a concept. * gcc/cp/decl.c (grokfndecl): Propagate the concept flag to check_explicit_specialization. Committed in r203970. Andrew bugfix-3.patch Description: Binary data
RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2
Hi, On Wed, 23 Oct 2013 14:37:43, Richard Biener wrote: The C++ memory model says that you may not introduce a data-race and thus you have to access Type without touching Prefix. Thanks. Right now I see the following priorities: 1. make -fno-strict-volatile-bitfields respect the C++ memory model = I think, I know how to do that and can prepare a patch for that. This patch should fix the recursion problem that Sandra pointed out. 2. make Sandra's -fstrict-volatile-bitfields aware of the C++ memory model. = that means passing bitregion_start/end to strict_volatile_bitfield_p() and return false if the access is outside. (this is -fstrict-volatile-bitfields=gnu) 3. another patch can add -fstrict-volatile-bitfields=aapcs later, but I don't think we have to hurry for that. Bernd.
Re: [RFC] Isolate simplify paths with undefined behaviour
On Mon, Oct 21, 2013 at 7:27 PM, Jeff Law l...@redhat.com wrote: On 10/21/13 06:19, Richard Biener wrote: I wonder why this isn't part of the regular jump-threading code - after all the opportunity is to thead to __builtin_unreachable () ;) Otherwise the question is always where you'd place this pass and whether it enables jump threading or CSE opportunities (that at least it does). In theory one could do this as part of jump threading. Doing it in jump threading would have the advantage that we could use a NULL generated in a different block than the dereference. Of course, we'd have a full path to duplicate at that point. The downside is complexity. This pass is amazingly simple and effective. The jump threading code is considerably more complex and bolting this on the side would just make it worse. From a pipeline location, if kept as a separate pass, it probably needs to be between DOM1 phi-only cprop. DOM1 does a reasonable job at propagating the NULL values into the PHI nodes to expose the optimization. And the pass tends to expose precisely the kinds of PHI nodes that phi-only cprop can optimize. VRP2/PRE/DOM2 do a good job at finding the newly exposed CSEs and jump threading opportunities. I haven't looked to see if there's anything left to optimize after DOM2, but I can certainly speculate that there would be cases exposed by the various passes that currently exist between DOM1 DOM2. That's certainly worth exploring. As you know, phase ordering will always be a concern. We already have acknowledged that we're punting some of those issues once we stopped iterating DOM. True ... Btw, -ftree-isolate-paths sounds a bit generic for isolating paths leading to undefined behavior, maybe -fisolate-errorneous-paths? (please drop 'tree-' from new options, 'tree' isn't meaningful to GCC users) The file should be named gimple-ssa-isolate-paths.c, too. + for (si = gsi_start_nondebug_bb (bb), si2 = gsi_start_nondebug_bb (duplicate); +!gsi_end_p (si) !gsi_end_p (si2); +gsi_next_nondebug (si), gsi_next_nondebug (si2)) + { + while (gimple_code (gsi_stmt (si)) == GIMPLE_LABEL) + gsi_next_nondebug (si); gsi_after_labels (bb); if (is_gimple_debug (gsi_stmt (gsi))) gsi_next_nondebug (gsi); warrants a new gsi_nondebug_after_labels () + /* SI2 is the iterator in the duplicate block and it now points +to our victim statement. */ + gimple_seq seq = NULL; + tree t = build_call_expr_loc (0, + builtin_decl_explicit (BUILT_IN_TRAP), 0); + gimplify_and_add (t, seq); + gsi_insert_before (si2, seq, GSI_SAME_STMT); please build a gimple call stmt directly instead of going through the gimplifier. + if (operand_equal_p (op, integer_zero_node, 0)) integer_zerop () + /* We've got a NULL PHI argument. Now see if the +PHI's result is dereferenced within BB. */ + FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs) + { + unsigned int j; + bool removed_edge = false; + + /* We only care about uses in BB which are assignements +with memory operands. + +We could see if any uses are as function arguments +when the callee has marked the argument as being +non-null. */ + if (gimple_bb (use_stmt) != bb + || !gimple_has_mem_ops (use_stmt) + || gimple_code (use_stmt) != GIMPLE_ASSIGN) !is_gimple_assign (), you can drop the has_mem_ops check, but ... + continue; + + /* Look at each operand for a memory reference using +LHS. */ ... you want to use walk_stmt_load_store_ops () which handles all kinds of stmts fine. + for (j = 0; j gimple_num_ops (use_stmt); j++) + { + tree op = gimple_op (use_stmt, j); + + if (op == NULL) + continue; + + while (handled_component_p (op)) + op = TREE_OPERAND (op, 0); + + if ((TREE_CODE (op) == MEM_REF + || TREE_CODE (op) == INDIRECT_REF) no more INDIRECT_REFs are in the IL. Richard. + TREE_OPERAND (op, 0) == lhs) + { + edge e = gimple_phi_arg_edge (phi, i); + duplicate = isolate_path (bb, duplicate, + e, use_stmt); + + /* When we remove an incoming edge, we need to +reprocess the Ith element. */ + next_i = i; + +
Re: [PATCH, rs6000] Fix mulv8hi3 pattern for little endian
* config/rs6000/altivec.md (mulv8hi3): Adjust for little endian. Okay. Thanks, David
Re: [RFC] Isolate simplify paths with undefined behaviour
On 10/23/13 01:58, Florian Weimer wrote: Could you keep in mind that there is considerable interest in a check_nonnull attribute which marks values (parameters, return values, maybe even struct fields) that can be NULL and need to be checked explictly prior to dereference? GCC would then warn if there is a path on which the check is missing. I don't have time at the moment to work on this, but it's on my ever-growing TODO list. :) There's a lot of overlap between this and the change which originally set me down this path -- namely a pass which would warn about potential null pointer dereference (with suitable knobs which allowed the user to control things like the non-nullness of a load of a pointer from memory). I'd like to get back to that work, but it's definitely not a 4.9 thing. Jeff
[v3] Resolve libstdc++/58850
Hi, strictly speaking this isn't a bug, is more a QoI issue, but it shows that the current minutes and hours typedefs we have got in chrono, while strictly speaking conforming, lead to surprises: /// minutes typedef durationint, ratio 60 minutes; /// hours typedef durationint, ratio3600 hours; that is, somewhat arbitrarily, while use int instead of int64_t, like we do for the others (for seconds, etc), only because on many targets, by no mean all, an int is typically 32 bits, thus the Standard requirements are fulfilled (it talks about at least 29 bits and at least 23 bits, respectively). Let's just use int64_t for all of them. Tested x86_64-linux. Thanks, Paolo. / 2013-10-23 Paolo Carlini paolo.carl...@oracle.com PR libstdc++/58850 * include/std/chrono (minutes, hours): Change typedefs to uniformly use int64_t. * testsuite/20_util/duration/arithmetic/58850.cc: New. Index: include/std/chrono === --- include/std/chrono (revision 203955) +++ include/std/chrono (working copy) @@ -524,22 +524,22 @@ { return !(__lhs __rhs); } /// nanoseconds -typedef durationint64_t, nanonanoseconds; +typedef durationint64_t, nanonanoseconds; /// microseconds -typedef durationint64_t, micro microseconds; +typedef durationint64_t, micro microseconds; /// milliseconds -typedef durationint64_t, milli milliseconds; +typedef durationint64_t, milli milliseconds; /// seconds -typedef durationint64_t seconds; +typedef durationint64_t seconds; /// minutes -typedef durationint, ratio 60 minutes; +typedef durationint64_t, ratio 60 minutes; /// hours -typedef durationint, ratio3600 hours; +typedef durationint64_t, ratio3600 hours; /// time_point templatetypename _Clock, typename _Dur Index: testsuite/20_util/duration/arithmetic/58850.cc === --- testsuite/20_util/duration/arithmetic/58850.cc (revision 0) +++ testsuite/20_util/duration/arithmetic/58850.cc (working copy) @@ -0,0 +1,42 @@ +// { dg-options -std=gnu++11 } +// { dg-require-cstdint } + +// Copyright (C) 2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. + +#include chrono +#include testsuite_hooks.h + +void test01() +{ + bool test __attribute__((unused)) = true; + using namespace std::chrono; + + typedef durationstd::int64_t, std::ratio36 * 24 * 36525 Years; + Years galactic_empire_age( 12067 ); + + VERIFY( duration_castseconds( galactic_empire_age ).count() + == duration_castminutes( galactic_empire_age ).count() * 60 ); + VERIFY( duration_castminutes( galactic_empire_age ).count() + == duration_castseconds( galactic_empire_age ).count() / 60 ); +} + +int main() +{ + test01(); + return 0; +}
[patch] Flatten tree-ssa-loops.h
Similar to tree-ssa.h, tree-ssa-loops.h became an aggregator for 3 of the tree-ssa-loop* header files. This remedies that situation. The average .c file required only 1 of the 3 includes from tree-ssa-loop.h. Bootstraps on x86_64-unknown-linux-gnu (with graphite enabled :-). Regressions are running, but I expect no issues due to this just being just an include shuffle. OK assuming no errors? Andrew * tree-ssa-loop.h: Remove include files. * gengtype.c (open_base_files): Adjust include list for gtype-desc.c. * cfgloopmanip.c: Move required includes from tree-ssa-loop.h. * graphite-clast-to-gimple.c: Likewise. * graphite-scop-detection.c: Likewise. * graphite-sese-to-poly.c: Likewise. * ipa-inline-analysis.c: Likewise. * ipa-pure-const.c: Likewise. * loop-init.c: Likewise. * passes.c: Likewise. * predict.c: Likewise. * tree-cfg.c: Likewise. * tree-cfgcleanup.c: Likewise. * tree-chrec.c: Likewise. * tree-data-ref.c: Likewise. * tree-loop-distribution.c: Likewise. * tree-parloops.c: Likewise. * tree-predcom.c: Likewise. * tree-scalar-evolution.c: Likewise. * tree-ssa-address.c: Likewise. * tree-ssa.c: Likewise. * tree-ssa-dce.c: Likewise. * tree-ssa-loop.c: Likewise. * tree-ssa-loop-im.c: Likewise. * tree-ssa-loop-ivcanon.c: Likewise. * tree-ssa-loop-ivopts.c: Likewise. * tree-ssa-loop-manip.c: Likewise. * tree-ssa-loop-niter.c: Likewise. * tree-ssa-loop-prefetch.c: Likewise. * tree-ssa-loop-unswitch.c: Likewise. * tree-ssa-reassoc.c: Likewise. * tree-vect-data-refs.c: Likewise. * tree-vect-loop.c: Likewise. * tree-vect-loop-manip.c: Likewise. * tree-vectorizer.c: Likewise. * tree-vect-stmts.c: Likewise. * tree-vrp.c: Likewise. Index: tree-ssa-loop.h === *** tree-ssa-loop.h (revision 203976) --- tree-ssa-loop.h (working copy) *** along with GCC; see the file COPYING3. *** 20,29 #ifndef GCC_TREE_SSA_LOOP_H #define GCC_TREE_SSA_LOOP_H - #include tree-ssa-loop-ivopts.h - #include tree-ssa-loop-manip.h - #include tree-ssa-loop-niter.h - /* Affine iv. */ typedef struct affine_iv_d --- 20,25 Index: gengtype.c === *** gengtype.c (revision 203976) --- gengtype.c (working copy) *** open_base_files (void) *** 1738,1744 optabs.h, libfuncs.h, debug.h, ggc.h, cgraph.h, gimple.h, gimple-ssa.h, tree-cfg.h, tree-phinodes.h, ssa-iterators.h, tree-ssanames.h, tree-ssa-loop.h, ! tree-into-ssa.h, tree-dfa.h, tree-ssa.h, reload.h, cpp-id-data.h, tree-chrec.h, except.h, output.h, cfgloop.h, target.h, ipa-prop.h, lto-streamer.h, target-globals.h, --- 1738,1745 optabs.h, libfuncs.h, debug.h, ggc.h, cgraph.h, gimple.h, gimple-ssa.h, tree-cfg.h, tree-phinodes.h, ssa-iterators.h, tree-ssanames.h, tree-ssa-loop.h, ! tree-ssa-loop-ivopts.h, tree-ssa-loop-manip.h, ! tree-ssa-loop-niter.h, tree-into-ssa.h, tree-dfa.h, tree-ssa.h, reload.h, cpp-id-data.h, tree-chrec.h, except.h, output.h, cfgloop.h, target.h, ipa-prop.h, lto-streamer.h, target-globals.h, Index: cfgloopmanip.c === *** cfgloopmanip.c (revision 203976) --- cfgloopmanip.c (working copy) *** along with GCC; see the file COPYING3. *** 26,32 #include cfgloop.h #include tree.h #include gimple.h ! #include tree-ssa-loop.h #include dumpfile.h static void copy_loops_to (struct loop **, int, --- 26,32 #include cfgloop.h #include tree.h #include gimple.h ! #include tree-ssa-loop-manip.h #include dumpfile.h static void copy_loops_to (struct loop **, int, Index: graphite-clast-to-gimple.c === *** graphite-clast-to-gimple.c (revision 203976) --- graphite-clast-to-gimple.c (working copy) *** along with GCC; see the file COPYING3. *** 38,43 --- 38,44 #include tree.h #include gimple.h #include gimple-ssa.h + #include tree-ssa-loop-manip.h #include tree-ssa-loop.h #include tree-into-ssa.h #include tree-pass.h Index: graphite-scop-detection.c === *** graphite-scop-detection.c (revision 203976) --- graphite-scop-detection.c (working copy) *** along with GCC; see the file COPYING3. *** 36,41 --- 36,43 #include gimple-ssa.h #include tree-phinodes.h #include ssa-iterators.h + #include tree-ssa-loop-manip.h + #include tree-ssa-loop-niter.h #include tree-ssa-loop.h #include tree-into-ssa.h #include tree-ssa.h Index: graphite-sese-to-poly.c === *** graphite-sese-to-poly.c (revision 203976) --- graphite-sese-to-poly.c (working copy) *** along with GCC; see the file COPYING3.
[PATCH] Fix for target/58838
PR target/58838 * config/rs6000/rs6000.md (mulsi3_internal1 and splitter): Add TARGET_32BIT final condition. (mulsi3_internal2 and splitter): Same. Index: rs6000.md === --- rs6000.md (revision 203930) +++ rs6000.md (working copy) @@ -2699,7 +2699,7 @@ (match_operand:SI 2 gpc_reg_operand r,r)) (const_int 0))) (clobber (match_scratch:SI 3 =r,r))] - + TARGET_32BIT @ mullw. %3,%1,%2 # @@ -2712,7 +2712,7 @@ (match_operand:SI 2 gpc_reg_operand )) (const_int 0))) (clobber (match_scratch:SI 3 ))] - reload_completed + TARGET_32BIT reload_completed [(set (match_dup 3) (mult:SI (match_dup 1) (match_dup 2))) (set (match_dup 0) @@ -2727,7 +2727,7 @@ (const_int 0))) (set (match_operand:SI 0 gpc_reg_operand =r,r) (mult:SI (match_dup 1) (match_dup 2)))] - + TARGET_32BIT @ mullw. %0,%1,%2 # @@ -2741,7 +2741,7 @@ (const_int 0))) (set (match_operand:SI 0 gpc_reg_operand ) (mult:SI (match_dup 1) (match_dup 2)))] - reload_completed + TARGET_32BIT reload_completed [(set (match_dup 0) (mult:SI (match_dup 1) (match_dup 2))) (set (match_dup 3)
Re: [patch] Flatten tree-ssa-loops.h
On Wed, Oct 23, 2013 at 10:27 AM, Andrew MacLeod amacl...@redhat.com wrote: Similar to tree-ssa.h, tree-ssa-loops.h became an aggregator for 3 of the tree-ssa-loop* header files. This remedies that situation. The average .c file required only 1 of the 3 includes from tree-ssa-loop.h. Bootstraps on x86_64-unknown-linux-gnu (with graphite enabled :-). Regressions are running, but I expect no issues due to this just being just an include shuffle. Looks OK. What do you think about Jan-Benedict's idea of installing a commit hook that would check that no #includes of certain headers get into other headers? We could just leave this to code reviews, but perhaps installing a hook makes it easier. Diego.
[PATCH] More threading cleanups
During testing of the generalized FSA opt I ran into a case where we tried to thread through a joiner with abnormal outgoing edges. Since we can't reliably clone the joiner and redirect just one of its outgoing edges in this case, we need to avoid even trying to thread through such blocks. While looking at that I realized we don't need the code to filter out jump threading opportunities involving joiner blocks. We can avoid the unnecessary duplication by first handling all the jump threads which do not require a joiner block, then handling the jump threads which do require a joiner block. Finally, I was able to look at a dangling AUX field problem that I've seen a few times. While I haven't seen it on the trunk, I believe it's just latent. When we thread the latch edge to an exit edge, we clear loop-header. If we had another thread path into that same loop from outside, the second thread path will be ignored because loop-header is NULL. This leaves a dangling edge AUX field, which we need to clear. All three issues are addressed by this patch. Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Installed on the trunk. commit 9e33df4427bf4e13c32d3296fcdc1ee1b1fd7bfa Author: Jeff Law l...@redhat.com Date: Wed Oct 23 07:53:46 2013 -0600 * tree-ssa-threadedge.c (thread_across_edge): Do not allow threading through joiner blocks with abnormal outgoing edges. * tree-ssa-threadupdate.c (thread_block_1): Renamed from thread_block. Add parameter JOINERS, to allow/disallow threading through joiner blocks. (thread_block): New. Call thread_block_1. (mark_threaded_blocks): Remove code to filter out certain cases of threading through joiner blocks. (thread_through_all_blocks): Document how we can have a dangling edge AUX field and clear it. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 6132515..acd6620 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,17 @@ +2013-10-23 Jeff Law l...@redhat.com + + * tree-ssa-threadedge.c (thread_across_edge): Do not allow threading + through joiner blocks with abnormal outgoing edges. + + * tree-ssa-threadupdate.c (thread_block_1): Renamed from thread_block. + Add parameter JOINERS, to allow/disallow threading through joiner + blocks. + (thread_block): New. Call thread_block_1. + (mark_threaded_blocks): Remove code to filter out certain cases + of threading through joiner blocks. + (thread_through_all_blocks): Document how we can have a dangling + edge AUX field and clear it. + 2013-10-23 Ian Lance Taylor i...@google.com * doc/invoke.texi (Option Summary): Remove -fno-default-inline. diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c index bbc4f9b..810908b 100644 --- a/gcc/tree-ssa-threadedge.c +++ b/gcc/tree-ssa-threadedge.c @@ -1040,6 +1040,16 @@ thread_across_edge (gimple dummy_cond, edge_iterator ei; bool found; +/* If E-dest has abnormal outgoing edges, then there's no guarantee + we can safely redirect any of the edges. Just punt those cases. */ +FOR_EACH_EDGE (taken_edge, ei, e-dest-succs) + if (taken_edge-flags EDGE_ABNORMAL) + { + remove_temporary_equivalences (stack); + BITMAP_FREE (visited); + return; + } + /* Look at each successor of E-dest to see if we can thread through it. */ FOR_EACH_EDGE (taken_edge, ei, e-dest-succs) { diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c index bde81a8..1027191 100644 --- a/gcc/tree-ssa-threadupdate.c +++ b/gcc/tree-ssa-threadupdate.c @@ -616,10 +616,12 @@ redirection_block_p (basic_block bb) the appropriate duplicate of BB. If NOLOOP_ONLY is true, we only perform the threading as long as it - does not affect the structure of the loops in a nontrivial way. */ + does not affect the structure of the loops in a nontrivial way. + + If JOINERS is true, then thread through joiner blocks as well. */ static bool -thread_block (basic_block bb, bool noloop_only) +thread_block_1 (basic_block bb, bool noloop_only, bool joiners) { /* E is an incoming edge into BB that we may or may not want to redirect to a duplicate of BB. */ @@ -642,7 +644,9 @@ thread_block (basic_block bb, bool noloop_only) e = loop_latch_edge (loop); vecjump_thread_edge * *path = THREAD_PATH (e); - if (path) + if (path + (((*path)[1]-type == EDGE_COPY_SRC_JOINER_BLOCK joiners) + || ((*path)[1]-type == EDGE_COPY_SRC_BLOCK !joiners))) { for (unsigned int i = 1; i path-length (); i++) { @@ -666,6 +670,11 @@ thread_block (basic_block bb, bool noloop_only) continue; vecjump_thread_edge * *path = THREAD_PATH (e); + + if (((*path)[1]-type == EDGE_COPY_SRC_JOINER_BLOCK !joiners) + || ((*path)[1]-type ==
Re: [Ada] Convert Ada front-end to automatic dependencies
Eric Botcazou ebotca...@adacore.com writes: This finally converts the Ada front-end. Tested on x86_64-suse-linux, applied on the mainline. 2013-10-13 Nicolas Roche ro...@adacore.com Eric Botcazou ebotca...@adacore.com * gcc-interface/Make-lang.in (ada/%.o): Replace individual rules with generic rule and add $(POSTCOMPILE). (ADA_DEPS): New. (.adb.o): Add @$(ADA_DEPS). (.ads.o): Likewise. (ada/a-except.o): Likewise. (ada/s-excdeb.): Likewise. (ada/s-assert.o): Likewise. (ada/a-tags.o): Likewise. (ada_generated_files): New variable. Use them as dependency order for GNAT1_ADA_OBJS and GNATBIND_OBJS. (ADA_DEPFILES): New variable. Include them. (ada_OBJS): Define. This patch broke Solaris Ada bootstrap with SHELL=/bin/ksh on Solaris 10 and 11: gcc -c -g -gnatpg -gnatwns -gnata -W -Wall -nostdinc -I- -I. -Iada -I/vol/gcc/src/hg/trunk/local/gcc/ada -I/vol/gcc/src/hg/trunk/local/gcc/ada/gcc-interface /vol/gcc/src/hg/trunk/local/gcc/ada/ada.ads -o ada/ada.o sed: -e expression #1, char 1: unterminated `s' command ada/ada.o: /bin/ksh: ^D: not found /bin/ksh: ada/1: not found /bin/ksh: : cannot execute /bin/ksh: gp | tr -d '\015' | tr '\n' ' ': not found make: *** [ada/ada.o] Error 127 If you have make print the commands in ADA_DEPS, you see case ada/ada.o in *sdefault.o);; *)a=`echo ada/ada.o | sed -e 's/.o$/.ali/'`; echo ada/ada.o: `cat $a | sed -ne s;^D \([a-z0-9_\.-]*\).*;ada/\1;gp | tr -d '\015' | tr '\n' ' '` ada//.deps/ada.Po;; esac; So this boils down to case ada/ada.o in *sdefault.o) ;; *) a=`echo ada/ada.o | sed -e 's/.o$/.ali/'`; \ echo ada/ada.o: `cat $a | sed -ne s;^D \([a-z0-9_\.-]*\).*;ada/\1;gp | tr -d '\015' | tr '\n' ' '` ada//.deps/ada.Po ;; esac; Obviously /bin/ksh (both the old ksh88 in Solaris 10 and ksh93 in Solaris 11) interpret the echo line as echo ada/ada.o: `cat $a | sed -ne s; ^D \([a-z0-9_\.-]*\).*; ada/\1; gp | tr -d '\015' | tr '\n' ' '` ada//.deps/ada.Po The following trivial patch avoids this and allows the i386-pc-solaris2.1[01] and sparc-sun-solaris2.11 bootstraps to finish. Ok for mainline? Rainer 2013-10-23 Rainer Orth r...@cebitec.uni-bielefeld.de * gcc-interface/Make-lang.in (ADA_DEPS): Fix quoting. diff --git a/gcc/ada/gcc-interface/Make-lang.in b/gcc/ada/gcc-interface/Make-lang.in --- a/gcc/ada/gcc-interface/Make-lang.in +++ b/gcc/ada/gcc-interface/Make-lang.in @@ -110,7 +110,7 @@ ADA_DEPS=case $@ in \ *sdefault.o);; \ *)a=`echo $@ | sed -e 's/.o$$/.ali/'`; \ echo $@: `cat $$a | \ -sed -ne s;^D \([a-z0-9_\.-]*\).*;ada/\1;gp | \ +sed -ne 's;^D \([a-z0-9_\.-]*\).*;ada/\1;gp' | \ tr -d '\015' | tr '\n' ' '` $(dir $@)/$(DEPDIR)/$(patsubst %.o,%.Po,$(notdir $@));; \ esac; -- - Rainer Orth, Center for Biotechnology, Bielefeld University
ping [PATCH] rewrite stack vectors]
Hi, ping ;) Trev - Forwarded message from tsaund...@mozilla.com - Date: Thu, 10 Oct 2013 14:07:22 -0400 From: tsaund...@mozilla.com To: dnovi...@google.com, gcc-patches@gcc.gnu.org Cc: Trevor Saunders tsaund...@mozilla.com Subject: [PATCH] rewrite stack vectors X-Mailer: git-send-email 1.8.4.rc3 From: Trevor Saunders tsaund...@mozilla.com Hi, This makes the implementation of stack vectors simpler and easier to use. This works by making the size of the on stack storage a template argument, so the size is embedded in the type. This allows you to implicitly convert a stack_vecT, N to a vecT, va_heap *, and it will just work. Because there's no need to support stack vectors in unions we can make them be a more normal c++ class with a constructor and destructor that are nontrivial. Trev 2013-10-08 Trevor Saunders tsaund...@mozilla.com ada/ * gcc-interface/decl.c (components_to_record): Adjust. gcc/ * df-scan.c (df_collection_rec): Adjust. (copy_defs): New constant. (copy_uses): Likewise. (copy_eq_uses): Likewise. (copy_mw): Likewise. (copy_all): Likewise. (df_insn_rescan): Adjust. (df_notes_rescan): Likewise. (df_swap_refs): Likewise. (df_sort_and_compress_refs): Likewise. (df_sort_and_compress_mws): Likewise. (df_install_refs): Likewise. (df_install_mws): Likewise. (df_refs_add_to_chains): Add flags parameter controlling which vectors are coppied. (df_bb_refs_record): Adjust. (df_record_entry_block_defs): Likewise. (df_record_exit_block_defs): Likewise. (df_refs_verify): Likewise. (df_mws_verify): Likewise. (df_insn_refs_verify): Likewise. (df_bb_verify): Likewise. * ipa-pure-const.c (finish_state): Remove. (propagate): Adjust. * tree-data-ref.c tree-ssa-alias.c tree-ssa-loop-ivcanon.c tree-ssa-threadedge.c tree-vect-loop-manip.c tree-vect-slp.c var-tracking.c Adjust. * vec.c (stack_vecs): Remove. (register_stack_vec): Likewise. (stack_vec_register_index): Likewise. (unregister_stack_vec): Likewise. * vec.h (struct va_stack): Remove. (struct vecT, A, vl_ptr): Specialize as struct vecT, va_heap, vl_ptr instead since va_heap is the only allocation strategy compatable with the vl_ptr layout. (struct vecT, va_gc, vl_ptr): Remove because it now gets an empty specialization anyway. (class stack_vec): New class. (vec_stack_alloc): Remove. (vecT, va_heap, vl_ptr::using_auto_storage): New method. diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c index 456d7ab..0debdae 100644 --- a/gcc/ada/gcc-interface/decl.c +++ b/gcc/ada/gcc-interface/decl.c @@ -6994,13 +6994,11 @@ components_to_record (tree gnu_record_type, Node_Id gnat_component_list, tree gnu_union_type, gnu_union_name; tree this_first_free_pos, gnu_variant_list = NULL_TREE; bool union_field_needs_strict_alignment = false; - vec vinfo_t, va_stack variant_types; + stack_vec vinfo_t, 16 variant_types; vinfo_t *gnu_variant; unsigned int variants_align = 0; unsigned int i; - vec_stack_alloc (vinfo_t, variant_types, 16); - if (TREE_CODE (gnu_name) == TYPE_DECL) gnu_name = DECL_NAME (gnu_name); @@ -7196,9 +7194,6 @@ components_to_record (tree gnu_record_type, Node_Id gnat_component_list, gnu_variant_list = gnu_field; } - /* We are done with the variants. */ - variant_types.release (); - /* Only make the QUAL_UNION_TYPE if there are non-empty variants. */ if (gnu_variant_list) { diff --git a/gcc/df-scan.c b/gcc/df-scan.c index f2e8ab2..aace96d 100644 --- a/gcc/df-scan.c +++ b/gcc/df-scan.c @@ -86,10 +86,10 @@ static HARD_REG_SET elim_reg_set; struct df_collection_rec { - vecdf_ref, va_stack def_vec; - vecdf_ref, va_stack use_vec; - vecdf_ref, va_stack eq_use_vec; - vecdf_mw_hardreg_ptr, va_stack mw_vec; + stack_vecdf_ref, 128 def_vec; + stack_vecdf_ref, 32 use_vec; + stack_vecdf_ref, 32 eq_use_vec; + stack_vecdf_mw_hardreg_ptr, 32 mw_vec; }; static df_ref df_null_ref_rec[1]; @@ -131,7 +131,7 @@ static void df_ref_chain_delete_du_chain (df_ref *); static void df_ref_chain_delete (df_ref *); static void df_refs_add_to_chains (struct df_collection_rec *, - basic_block, rtx); + basic_block, rtx, unsigned int); static bool df_insn_refs_verify (struct df_collection_rec *, basic_block, rtx, bool); static void df_entry_block_defs_collect (struct df_collection_rec *, bitmap); @@ -153,6 +153,14 @@ static void df_insn_info_delete (unsigned int);
Re: [PATCH, PR 57748] Check for out of bounds access
On Tue, Oct 22, 2013 at 3:50 PM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi, On Tue, 17 Sep 2013 01:09:45, Martin Jambor wrote: @@ -4773,6 +4738,8 @@ expand_assignment (tree to, tree from, b if (MEM_P (to_rtx) GET_MODE (to_rtx) == BLKmode GET_MODE (XEXP (to_rtx, 0)) != VOIDmode + bitregion_start == 0 + bitregion_end == 0 bitsize 0 (bitpos % bitsize) == 0 (bitsize % GET_MODE_ALIGNMENT (mode1)) == 0 ... I'm not sure to what extent the hunk adding tests for bitregion_start and bitregion_end being zero is connected to this issue. I do not see any of the testcases exercising that path. If it is indeed another problem, I think it should be submitted (and potentially committed) as a separate patch, preferably with a testcase. Meanwhile I am able to give an example where that code is executed with bitpos = 64, bitsize=32, bitregion_start = 32, bitregion_end = 95. Afterwards bitpos=0, bitsize=32, which is completely outside bitregion_start=32, bitregion_end=95. However this can only be seen in the debugger, as the store_field goes thru a code path that does not look at bitregion_start/end. Well that is at least extremely ugly, and I would not be sure, that I cannot come up with a sample that crashes or creates wrong code. Currently I think that maybe the best way to fix that would be this: --- gcc/expr.c2013-10-21 08:27:09.546035668 +0200 +++ gcc/expr.c2013-10-22 15:19:56.749476525 +0200 @@ -4762,6 +4762,9 @@ expand_assignment (tree to, tree from, b MEM_ALIGN (to_rtx) == GET_MODE_ALIGNMENT (mode1)) { to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT); + bitregion_start = 0; + if (bitregion_end= (unsigned HOST_WIDE_INT) bitpos) +bitregion_end -= bitpos; bitpos = 0; } Any suggestions? if bitregion_start/end are used after the adjust_address call then they have to be adjusted (or bitpos left in place). In fact why we apply byte-parts of bitpos here only if offset != 0 is weird. OTOH this code is _very_ old... what happens if you remove the whole case? Richard. Regards Bernd.
Re: [PATCH] Vectorizing abs(char/short/int) on x86.
On Tue, 22 Oct 2013, Cong Hou wrote: For abs(char/short), type conversions are needed as the current abs() function/operation does not accept argument of char/short type. Therefore when we want to get the absolute value of a char_val using abs (char_val), it will be converted into abs ((int) char_val). It then can be vectorized, but the generated code is not efficient as lots of packings and unpackings are envolved. But if we convert (char) abs ((int) char_val) to abs (char_val), the vectorizer will be able to generate better code. Same for short. ABS_EXPR has undefined overflow behavior. Thus, abs ((int) -128) is defined (and we also define the subsequent conversion of +128 to signed char, which ISO C makes implementation-defined not undefined), and converting to an ABS_EXPR on char would wrongly make it undefined. For such a transformation to be valid (in the absence of VRP saying that -128 isn't a possible value) you'd need a GIMPLE representation for ABS_EXPRoverflow:wrap, as distinct from ABS_EXPRoverflow:undefined. You don't have the option there is for some arithmetic operations of converting to a corresponding operation on unsigned types. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH, PR58805] Add missing check in stmt_local_def for tail-merge
On 22/10/13 20:50, Jeff Law wrote: On 10/22/13 03:58, Tom de Vries wrote: Richard, This patch adds a missing check for gimple_vdef in stmt_local_def for the tail-merge pass. Bootstrapped and reg-tested on x86_64. OK for trunk, gcc-4_8-branch? Thanks, - Tom 2013-10-22 Tom de Vries t...@codesourcery.com PR tree-optimization/58805 * tree-ssa-tail-merge.c (stmt_local_def): Add gimple_vdef check. * gcc.dg/pr58805.c: New test. Doesn't this test belong in an architecture specific directory? Jeff, The test-case has i386 assembly inside the asm string, but since the test-case only compiles, the assembly string is never used. I've made the string empty to make that clear. AFAIU the only requirement for this test-case is that the constraint matches the operand. I'm not sure whether 'unsigned long' always matches 'r'. I've changed this into 'void *' and 'p', which I think should always be true. Committed as below. Thanks, - Tom jeff +++ b/gcc/testsuite/gcc.dg/pr58805.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -ftree-tail-merge -fdump-tree-pre } */ + +/* Type that matches the 'p' constraint. */ +#define TYPE void * + +static inline +void bar (TYPE *r) +{ + TYPE t; + __asm__ ( : =p (t), =p (*r)); +} + +void +foo (int n, TYPE *x, TYPE *y) +{ + if (n == 0) +bar (x); + else +bar (y); +} + +/* { dg-final { scan-tree-dump-times __asm__ 2 pre} } */ +/* { dg-final { cleanup-tree-dump pre } } */
Re: [PATCH] Vectorizing abs(char/short/int) on x86.
On Wed, Oct 23, 2013 at 5:52 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Tue, 22 Oct 2013, Cong Hou wrote: For abs(char/short), type conversions are needed as the current abs() function/operation does not accept argument of char/short type. Therefore when we want to get the absolute value of a char_val using abs (char_val), it will be converted into abs ((int) char_val). It then can be vectorized, but the generated code is not efficient as lots of packings and unpackings are envolved. But if we convert (char) abs ((int) char_val) to abs (char_val), the vectorizer will be able to generate better code. Same for short. ABS_EXPR has undefined overflow behavior. Thus, abs ((int) -128) is defined (and we also define the subsequent conversion of +128 to signed char, which ISO C makes implementation-defined not undefined), and converting to an ABS_EXPR on char would wrongly make it undefined. For such a transformation to be valid (in the absence of VRP saying that -128 isn't a possible value) you'd need a GIMPLE representation for ABS_EXPRoverflow:wrap, as distinct from ABS_EXPRoverflow:undefined. You don't have the option there is for some arithmetic operations of converting to a corresponding operation on unsigned types. Note that we simply could make ABS_EXPR have defined behavior on overflow (wrapping). Of course you have to audit expanders whether they rely on undefined behavior and you also have to audit existing foldings. Undefined behavior on ABS_EXPR (INT_MIN) isn't as useful for optimization as that on plus, minus and multiply. Similar for the integer division case. Richard. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH 1/n] Add conditional compare support
+static enum rtx_code +arm_ccmode_to_code (enum machine_mode mode) +{ + switch (mode) +{ +case CC_DNEmode: + return NE; Why would you need to encode comparisons in CCmodes? That looks like a mis-design to me. +Conditional compare instruction. Operand 2 and 5 are RTLs which perform +two comparisons. Operand 1 is AND or IOR, which operates on the result of +Operand 2 and 5. Operand 0 is the result of operand 1. + +A typical @code{ccmp} pattern looks like + +@smallexample +(define_expand ccmp + [(set (match_operand 0 register_operand ) +(match_operator 1 + [(match_operator:SI 2 comparison_operator + [(match_operand:SI 3 register_operand) + (match_operand:SI 4 register_operand)]) + (match_operator:SI 5 comparison_operator + [(match_operand:SI 6 register_operand) + (match_operand:SI 7 register_operand)])]))] + + @dots{}) +@end smallexample This documentation is inadequate. What sorts of input combinations are valid? How is the middle-end expected to compose more than two compares, as you describe in the mail? What mode should the middle-end use to allocate that output register? The only thing that makes a sort of sense to me is something akin to how the old cmp + bcc patterns worked. Except that's not especially clean, and there's a reason we got rid of them. I think the only clean interface is going to be a new hook or hooks. The return value of the hook with be an rtx akin to the PTEST value returned by prepare_cmp_insn. Which is a proper input to cbranch/cstore/etc. I might think that two hooks would be appropriate because it might make the ccmp hook easier. I.e. res = targetm.cmp_hook(...); while (more) res = targetm.ccmp_hook(res, ...); For combinations that can't be implemented with ccmp, the hook can return null. +;; The first compare in this pattern is the result of a previous CCMP. +;; We can not swap it. And we only need its flag. +(define_insn *ccmp_and + [(set (match_operand 6 dominant_cc_register ) + (compare + (and:SI + (match_operator 4 expandable_comparison_operator +[(match_operand 0 dominant_cc_register ) + (match_operand:SI 1 arm_add_operand )]) + (match_operator:SI 5 arm_comparison_operator +[(match_operand:SI 2 s_register_operand + l,r,r,r,r) + (match_operand:SI 3 arm_add_operand + lPy,rI,L,rI,L)])) + (const_int 0)))] + TARGET_32BIT + * + { +static const char *const cmp2[2] = +{ + \cmp%d4\\t%2, %3\, + \cmn%d4\\t%2, #%n3\ +}; +static const char *const ite = \it\\t%d4\; +static const int cmp_idx[9] = {0, 0, 1, 0, 1}; + +if (TARGET_THUMB2) { + output_asm_insn (ite, operands); +} +output_asm_insn (cmp2[cmp_idx[which_alternative]], operands); +return \\; + } I would think simply splitting to a normal cond_exec insn post reload would be significantly cleaner than this. And for future reference, don't use '*', just use '{'. That way you don't have to escape the embedded quotes, etc. r~
Re: [PATCH, PR 57748] Check for out of bounds access, Part 2
On Tue, Oct 22, 2013 at 12:25 PM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi, On Tue, 8 Oct 2013 22:50:21, Eric Botcazou wrote: I agree, that assigning a non-BLKmode to structures with zero-sized arrays should be considered a bug. Fine, then let's apply Martin's patch, on mainline at least. That would definitely be a good move. Maybe someone should approve it? But this testcase is invalid on STRICT_ALIGNMENT platforms: xx is pointer to a type with 4-byte alignment so its value must be a multiple of 4. Then you probably win. But I still have some doubts. I had to use this silly alignment/pack(4) to circumvent this statement in compute_record_mode: /* If structure's known alignment is less than what the scalar mode would need, and it matters, then stick with BLKmode. */ if (TYPE_MODE (type) != BLKmode STRICT_ALIGNMENT ! (TYPE_ALIGN (type)= BIGGEST_ALIGNMENT || TYPE_ALIGN (type)= GET_MODE_ALIGNMENT (TYPE_MODE (type { /* If this is the only reason this type is BLKmode, then don't force containing types to be BLKmode. */ TYPE_NO_FORCE_BLK (type) = 1; SET_TYPE_MODE (type, BLKmode); } But there are at least two targets where STRICT_ALIGNMENT = 0 and SLOW_UNALIGNED_ACCESS != 0: rs6000 and alpha. This example with a byte-aligned structure will on one of these targets likely execute this code path in expand_expr_real_1/case MEM_REF: else if (SLOW_UNALIGNED_ACCESS (mode, align)) temp = extract_bit_field (temp, GET_MODE_BITSIZE (mode), 0, TYPE_UNSIGNED (TREE_TYPE (exp)), (modifier == EXPAND_STACK_PARM ? NULL_RTX : target), mode, mode); This looks wrong, but unfortunately I cannot test on these targets... Hmm, well, the condition that would be necessary to execute that code path would be STRICT_ALIGNMENT = 0 and SLOW_UNALIGNED_ACCESS != 0 for any integer mode. The only target that is close to hit this bug is rs6000: #define STRICT_ALIGNMENT 0 #define SLOW_UNALIGNED_ACCESS(MODE, ALIGN) \ (STRICT_ALIGNMENT \ || (((MODE) == SFmode || (MODE) == DFmode || (MODE) == TFmode\ || (MODE) == SDmode || (MODE) == DDmode || (MODE) == TDmode)\ (ALIGN) 32) \ || (VECTOR_MODE_P ((MODE)) (((int)(ALIGN)) VECTOR_ALIGN (MODE but, luckily this is 0 for all integer modes. So I am now convinced, there won't be any valid example with unions that executes this code path. Therefore I updated Martin's previous patch, to meet Eric's request: That is to only handle zero-sized arrays at the end of the structure. That looks backwards. Why should struct { V i; V j[0]; } have a different mode than struct { V j[0]; V i; } ? And why should the same issue not exist for struct { V a[1]; } which is also flexible enough that we support accesses beyond it via a pointer? That struct also has V2DImode. And this is all because /* If this field is the whole struct, remember its mode so that, say, we can put a double in a class into a DF register instead of forcing it to live in the stack. */ if (simple_cst_equal (TYPE_SIZE (type), DECL_SIZE (field))) mode = DECL_MODE (field); which is IMHO ok. Boot-strapped and regression-tested on x86_64-linux-gnu. Ok for trunk? Well, I'm not so sure. And if so then I'd prefer martins original patch, rejecting all zero-sized fields. But then only if you consider it a real field. Of course I blame those that added the zero-sized array extension for all this mess ... maybe we can reduce it by rejecting zero-sized arrays that are not at the end of a structure - which means we can demote them to flexible arrays with a NULL TYPE_SIZE which would completely side-step this issue in stor-layout.c. Richard. Regards Bernd.
[jit] Add missing include of diagnostic-core.h
Committed to dmalcolm/jit: gcc/jit/ * internal-api.c: Add missing include of diagnostic-core.h --- gcc/jit/ChangeLog.jit | 4 gcc/jit/internal-api.c | 1 + 2 files changed, 5 insertions(+) diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit index 5e8d0f9..8296b62 100644 --- a/gcc/jit/ChangeLog.jit +++ b/gcc/jit/ChangeLog.jit @@ -1,3 +1,7 @@ +2013-10-23 David Malcolm dmalc...@redhat.com + + * internal-api.c: Add missing include of diagnostic-core.h + 2013-10-22 David Malcolm dmalc...@redhat.com * internal-api.c (gcc::jit::context::add_error_va): Record the diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c index 639cd49..4a35f17 100644 --- a/gcc/jit/internal-api.c +++ b/gcc/jit/internal-api.c @@ -13,6 +13,7 @@ #include gimple.h #include gimple-pretty-print.h #include timevar.h +#include diagnostic-core.h #include dumpfile.h #include tree-flow.h -- 1.7.11.7
Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping
Hi, On Mon, Oct 21, 2013 at 11:00:38PM -0400, Vladimir Makarov wrote: On 13-10-21 6:56 PM, Steven Bosscher wrote: + { + bitmap_clear (need_new); + bitmap_clear (reachable); + return; + } + + for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest)); + use; + use = DF_REF_NEXT_REG (use)) You're using DF in these places. But IRA and LRA don't work with DF. After update_equiv_regs DF caches and liveness may be incorrect. You'd have to add a df_analyze call but I'm not sure how that will interact with IRA/LRA's own dataflow frameworks (e.g. w.r.t. REG_DEAD/REG_UNUSED notes). Sorry, Martin. I think Steven is right. IRA/LRA (and reload pass) creates so many changes in RTL that DF infrastructure would slow down the compiler a lot and therefore df info is not updated during RA. no need to be sorry, I absolutely anticipated comments and requests for changes. Your patch mostly uses a correct DF-info because there are few changes since updating is off. I think the reason why it works is that find_moveable_pseudos, which is called immediately before the function I'm adding, already calls df_analyze. I suppose it does not cause any trouble since the call is there for quite a few months already. Because find_moveable_pseudos will never split registers which are defined by a set from a hard register (because rtx_moveable_p returns false for a HARD_REGISTER_P), the analysis results should still be perfectly up-to-date for the purposes of my transformation. So, do you think that this could be just made more explicit by moving the call to df_analyze (and dominator calculation) out of find_moveable_pseudos to ira() as in the (bootstrapped, tested) patch below? You could move your optimization a bit up before df_clear_flags (DF_NO_INSN_RESCAN); or move this call right after your optimizations (possibly some minor df calls are needed too to restore live info for RA after your RTL changes). I tried this but got dark glibc double free and segfault errors from deep down in the call to df_analyze in find_moveable_pseudos, so I quickly chickened out. I will re-try (or even move the transformation more up front) if you or Steven reject this attempt. Thanks, Martin 2013-10-23 Martin Jambor mjam...@suse.cz PR rtl-optimization/10474 * ira.c (find_moveable_pseudos): Do not calculate dominance info nor df analysis. (interesting_dest_for_shprep): New function. (split_live_ranges_for_shrink_wrap): Likewise. (ira): Calculate dominance info and df analysis. Call split_live_ranges_for_shrink_wrap. testsuite/ * gcc.dg/pr10474.c: New testcase. * gcc.dg/ira-shrinkwrap-prep-1.c: Likewise. * gcc.dg/ira-shrinkwrap-prep-2.c: Likewise. diff --git a/gcc/ira.c b/gcc/ira.c index 203fbff..0faea8f 100644 --- a/gcc/ira.c +++ b/gcc/ira.c @@ -3989,9 +3989,6 @@ find_moveable_pseudos (void) pseudo_replaced_reg.release (); pseudo_replaced_reg.safe_grow_cleared (max_regs); - df_analyze (); - calculate_dominance_info (CDI_DOMINATORS); - i = 0; bitmap_initialize (live, 0); bitmap_initialize (used, 0); @@ -4311,7 +4308,192 @@ find_moveable_pseudos (void) regstat_free_ri (); regstat_init_n_sets_and_refs (); regstat_compute_ri (); - free_dominance_info (CDI_DOMINATORS); +} + + +/* If insn is interesting for parameter range-splitting shring-wrapping + preparation, i.e. it is a single set from a hard register to a pseudo, which + is live at CALL_DOM, return the destination. Otherwise return NULL. */ + +static rtx +interesting_dest_for_shprep (rtx insn, basic_block call_dom) +{ + rtx set = single_set (insn); + if (!set) +return NULL; + rtx src = SET_SRC (set); + rtx dest = SET_DEST (set); + if (!REG_P (src) || !HARD_REGISTER_P (src) + || !REG_P (dest) || HARD_REGISTER_P (dest) + || (call_dom !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest +return NULL; + return dest; +} + +/* Split live ranges of pseudos that are loaded from hard registers in the + first BB in a BB that dominates all non-sibling call if such a BB can be + found and is not in a loop. */ + +static void +split_live_ranges_for_shrink_wrap (void) +{ + basic_block bb, call_dom = NULL; + basic_block first = single_succ (ENTRY_BLOCK_PTR); + rtx insn, last_interesting_insn = NULL; + bitmap_head need_new, reachable; + vecbasic_block queue; + + if (!flag_shrink_wrap) +return; + + bitmap_initialize (need_new, 0); + bitmap_initialize (reachable, 0); + queue.create (n_basic_blocks); + + FOR_EACH_BB (bb) +FOR_BB_INSNS (bb, insn) + if (CALL_P (insn) !SIBLING_CALL_P (insn)) + { + if (bb == first) + { + bitmap_clear (need_new); + bitmap_clear (reachable); + queue.release (); + return; + } + + bitmap_set_bit (need_new, bb-index); + bitmap_set_bit
Re: [PATCH] Fix various reassoc issues (PR tree-optimization/58791, tree-optimization/58775)
On 10/23/13 04:35, Jakub Jelinek wrote: For debug info quality it actually isn't just about using different SSA_NAME, but also not reusing the defining stmt; only then the code will magically try to create debug temporaries and expressions from the old dead defining stmt. Joys. Something else to remember. -/* Returns the UID of STMT if it is non-NULL. Otherwise return 1. */ +/* Returns true if statement S1 dominates statement S2. Like + stmt_dominates_stmt_p, but uses stmt UIDs to optimize. */ -static inline unsigned -get_stmt_uid_with_default (gimple stmt) +static bool +reassoc_stmt_dominates_stmt_p (gimple s1, gimple s2) +{ + basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2); + + if (!bb1 || s1 == s2) +return true; + + if (!bb2) +return false; Maybe this was carried over from somewhere else, but that looks awful strange. When !bb1 you return true, but if !bb2 you return false. At the least it deserves a comment. bb{1,2} == NULL means a default definition of the corresponding lhs. Ahh, it makes much more sense now. ISTM that should be documented in the function header and in stmt_dominates_stmt_p in. Yes, I know you're not responsible for this mess, but better to get it cleaned up now since you're in there. Obviously no need for a re-bootstrap to add that comment fix. + if (gimple_uid (s1) gimple_uid (s2)) + return true; + + if (gimple_uid (s1) gimple_uid (s2)) + return false; So if one (but not both) of the UIDs isn't set yet, one of these two statements will return, which seems wrong since we don't know where the statement without a UID is relative to the statement with a UID. Am I missing something? Right now we shouldn't see an unset uid here at all, unless it is a PHI, which is handled earlier. break_up_subtract_bb initializes them for all preexisting stmts, and the various places which add new stmts are responsible for setting uid to either the uid of the immediately preceeding or following stmt that has uid set (or 1 if the bb was previously empty). OK. Might be worth a comment as well -- something as simple as only PHIs have unset UIDs here. Once that precondition is known the code is pretty easy to understand. + gimple_stmt_iterator gsi = gsi_for_stmt (s1); + unsigned int uid = gimple_uid (s1); + for (gsi_next (gsi); !gsi_end_p (gsi); gsi_next (gsi)) + { + gimple s = gsi_stmt (gsi); + if (gimple_uid (s) != uid) + break; + if (s == s2) + return true; + } Don't we only get here if both UIDs are zero (or the same by other means)?If so does this code even make sense? Both UIDs zero should not happen, both UIDs the same can happen, we don't renumber the whole bb upon inserting something into the bb. Ok. Note the above isn't in any way a new code, just a cleanup of the preexisting, just earlier the thing was done in more than one place and could mishandle the uid setting. I understand that. But I can easily see that the function is either mis-named, poorly documented, or utter crap. The question is which is it :-) Anyway, from my understanding, insert_point is always a SSA_NAME setter So let's get that documented. Again, once the precondition is understood, the code makes more sense. At this point my only concerns are getting those functions documented a bit better -- which you can do as a pre-approved followup. Being away from GCC for a long time has actually been good in some ways in that I don't have experience with much of the new bits. Thus I rely a lot more on comments, function/variable names, etc to provide some of the context. Jeff
Re: [PATCH, PR 57748] Check for out of bounds access, Part 2
Hi, On Wed, Oct 23, 2013 at 06:00:43PM +0200, Richard Biener wrote: ... ? And why should the same issue not exist for struct { V a[1]; } indeed, it does and it's simple to trigger (on x86_64): /* { dg-do run } */ #include stdlib.h typedef long long V __attribute__ ((vector_size (2 * sizeof (long long)), may_alias)); #if 1 typedef struct S {V b[1]; } P __attribute__((aligned (1))); struct __attribute__((packed)) T { char c; P s; }; #else typedef struct S {V b[1]; } P; struct T { char c; P s; }; #endif void __attribute__((noinline, noclone)) good_value (V b) { if (b[0] != 3 || b[1] != 4) __builtin_abort (); } void __attribute__((noinline, noclone)) check (P *p) { good_value (p-b[1]); } int main () { struct T *t = (struct T *) calloc (128, 1); V a = { 3, 4 }; t-s.b[1] = a; check (t-s); free (t); return 0; } While I was willing to discount the zero sized array as an insufficiently specified oddity, this seems to be much more serious and potentially common. It seems we really need to be able to detect these out-of-bounds situations and tell lower levels of expander that whatever mode you think you are expanding, it is actually BLK mode. It's uglier than I thought. Martin which is also flexible enough that we support accesses beyond it via a pointer? That struct also has V2DImode. And this is all because /* If this field is the whole struct, remember its mode so that, say, we can put a double in a class into a DF register instead of forcing it to live in the stack. */ if (simple_cst_equal (TYPE_SIZE (type), DECL_SIZE (field))) mode = DECL_MODE (field); which is IMHO ok. Boot-strapped and regression-tested on x86_64-linux-gnu. Ok for trunk? Well, I'm not so sure. And if so then I'd prefer martins original patch, rejecting all zero-sized fields. But then only if you consider it a real field. Of course I blame those that added the zero-sized array extension for all this mess ... maybe we can reduce it by rejecting zero-sized arrays that are not at the end of a structure - which means we can demote them to flexible arrays with a NULL TYPE_SIZE which would completely side-step this issue in stor-layout.c. Richard. Regards Bernd.
Re: [PATCH, PR 57748] Check for out of bounds access, Part 2
That looks backwards. Why should struct { V i; V j[0]; } have a different mode than struct { V j[0]; V i; } ? Because we support out-of-bounds access for the array in the former case and not in the latter case. And why should the same issue not exist for struct { V a[1]; } which is also flexible enough that we support accesses beyond it via a pointer? That struct also has V2DImode. And this is all because /* If this field is the whole struct, remember its mode so that, say, we can put a double in a class into a DF register instead of forcing it to live in the stack. */ if (simple_cst_equal (TYPE_SIZE (type), DECL_SIZE (field))) mode = DECL_MODE (field); which is IMHO ok. It's OK *only* if TYPE_SIZE (type) is really the size of all the objects of the type; if it isn't, i.e. if we allow access past TYPE_SIZE (type), then we cannot use the field's mode. So we need to decide where to draw the line. -- Eric Botcazou
Re: [Ada] Convert Ada front-end to automatic dependencies
The following trivial patch avoids this and allows the i386-pc-solaris2.1[01] and sparc-sun-solaris2.11 bootstraps to finish. Ok for mainline? Sure, thanks for fixing this. -- Eric Botcazou
Re: [PATCH, PR 57748] Check for out of bounds access, Part 2
While I was willing to discount the zero sized array as an insufficiently specified oddity, this seems to be much more serious and potentially common. It seems we really need to be able to detect these out-of-bounds situations and tell lower levels of expander that whatever mode you think you are expanding, it is actually BLK mode. Please let's not enter this hazardous business. We just need to draw a line somewhere and clearly state what we support in terms of out-of-bounds access. -- Eric Botcazou
[jit] Merger from trunk into jit
I've merged trunk r203980 into dmalcolm/jit, in particular bringing in the fix for the segfault seen in libbacktrace when an ICE happened in libgccjit (fix was on trunk on r203810). gcc/jit/ * internal-api.c: Update for rename of tree-flow.h to tree-cfg.h in r203320, for declaration of dump_function_to_file. * TODO.rst (segfault seen in libbacktrace): Remove - this was fixed by Ian in r203810. Conflicts: gcc/cgraph.c gcc/doc/install.texi
Re: libstdc++ Testsuite extension for sort, partial_sort, partial_sort_copy, nth_element
--- Chris, I went ahead and applied to my local trees the various tweaks I mentioned. I tested the below make check, check-debug and again both -m32/-m64. See if you can spot something you don't like, otherwise I will commit it later today. Thanks, Paolo. Index: testsuite/25_algorithms/nth_element/random_test.cc === --- testsuite/25_algorithms/nth_element/random_test.cc (revision 0) +++ testsuite/25_algorithms/nth_element/random_test.cc (working copy) @@ -0,0 +1,63 @@ +// Copyright (C) 2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. + +// { dg-options -std=gnu++11 } +// { dg-options -std=gnu++11 -DSIMULATOR_TEST { target simulator } } +// { dg-require-cstdint } + +// 25.4.2 [lib.alg.nth.element] + +#include algorithm +#include random +#include testsuite_hooks.h +#include testsuite_iterators.h +#include testsuite_containergen.h + +using __gnu_test::test_container; +using __gnu_test::random_access_iterator_wrapper; + +typedef test_containerint, random_access_iterator_wrapper Container; + +struct testNthElement +{ + templatetypename Container, typename RandomGen + void operator()(Container con, RandomGen rg) + { +bool test __attribute__((unused)) = true; + +const int size = con.end() - con.begin(); +auto dist = std::uniform_int_distribution(0, size); +const int element = dist(rg); + +std::nth_element(con.begin(), con.begin() + element, con.end()); + +if (element size) + { +for (int i = 0; i element; ++i) + VERIFY( con.val(i) = con.val(element) ); + + for (int i = element + 1; i size; ++i) + VERIFY( con.val(i) = con.val(element) ); + } + } +}; + +int +main() +{ + __gnu_test::test_containersContainer(testNthElement()); +} Index: testsuite/25_algorithms/partial_sort/random_test.cc === --- testsuite/25_algorithms/partial_sort/random_test.cc (revision 0) +++ testsuite/25_algorithms/partial_sort/random_test.cc (working copy) @@ -0,0 +1,62 @@ +// Copyright (C) 2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. + +// { dg-options -std=gnu++11 } +// { dg-options -std=gnu++11 -DSIMULATOR_TEST { target simulator } } +// { dg-require-cstdint } + +// 25.4.1.3 [lib.alg.partial.sort] + +#include algorithm +#include random +#include testsuite_hooks.h +#include testsuite_iterators.h +#include testsuite_containergen.h + +using __gnu_test::test_container; +using __gnu_test::random_access_iterator_wrapper; + +typedef test_containerint, random_access_iterator_wrapper Container; + +struct testPartialSort +{ + templatetypename Container, typename RandomGen + void operator()(Container con, RandomGen rg) + { +bool test __attribute__((unused)) = true; + +const int size = con.end() - con.begin(); +auto dist = std::uniform_int_distribution(0, size); +const int element = dist(rg); + +std::partial_sort(con.begin(), con.begin() + element, con.end()); + +VERIFY( std::is_sorted(con.begin(), con.begin() + element) ); + +if (element 0) + { +for (int i = element; i size; ++i) + VERIFY( con.val(element - 1) = con.val(i) ); + } + } +}; + +int +main() +{ + __gnu_test::test_containersContainer(testPartialSort()); +} Index: testsuite/25_algorithms/partial_sort_copy/random_test.cc === --- testsuite/25_algorithms/partial_sort_copy/random_test.cc(revision 0) +++
C++ PATCH to deal with trivial but non-callable [cd]tors
Late in the C++11 process it was decided that a constructor or destructor can be trivial but not callable; as a result, everywhere that assumed that a call to a trivial function didn't need any processing needed to be updated. This patch does that. Tested x86_64-pc-linux-gnu, applying to trunk. commit cbc14ce6b840e6311b8c580564f8c836f8dc18ae Author: Jason Merrill ja...@redhat.com Date: Tue Oct 22 16:38:01 2013 -0400 In C++11 a trivial [cd]tor might not be callable. * class.c (user_provided_p): A function deleted on its declation in the class is not user-provided. (type_build_ctor_call): Also force a ctor call if we might have a deleted or private trivial ctor. (type_build_dtor_call): New. (deduce_noexcept_on_destructors): Remove obsolete code. * cp-tree.h: Declare type_build_dtor_call. * decl.c (expand_static_init): Make sure trivial dtors are callable. (cxx_maybe_build_cleanup): Likewise. * except.c (build_throw): Likewise. * init.c (build_value_init): Handle trivial but not callable ctors. (perform_target_ctor): Make sure trivial dtor is callable. (perform_member_init): Likewise. (expand_cleanup_for_base): Likewise. (build_vec_delete_1): Likewise. (build_delete): Likewise. (push_base_cleanups): Likewise. (build_new_1): Avoid redundant error. * method.c (synthesized_method_walk): Can't ever exit early in C++11. Always process the subobject destructor. * semantics.c (finish_compound_literal): Make sure trivial dtor is callable. * typeck2.c (split_nonconstant_init): Likewise. diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 3ed73b8..cd90140 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -9273,6 +9273,9 @@ set_up_extended_ref_temp (tree decl, tree expr, vectree, va_gc **cleanups, static_aggregates = tree_cons (NULL_TREE, var, static_aggregates); } + else + /* Check whether the dtor is callable. */ + cxx_maybe_build_cleanup (var, tf_warning_or_error); } *initp = init; diff --git a/gcc/cp/class.c b/gcc/cp/class.c index c587e55..43f90d7 100644 --- a/gcc/cp/class.c +++ b/gcc/cp/class.c @@ -4674,15 +4674,8 @@ deduce_noexcept_on_destructors (tree t) if (!CLASSTYPE_METHOD_VEC (t)) return; - bool saved_nontrivial_dtor = TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t); - - /* Avoid early exit from synthesized_method_walk (c++/57645). */ - TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t) = true; - for (tree fns = CLASSTYPE_DESTRUCTORS (t); fns; fns = OVL_NEXT (fns)) deduce_noexcept_on_destructor (OVL_CURRENT (fns)); - - TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t) = saved_nontrivial_dtor; } /* Subroutine of set_one_vmethod_tm_attributes. Search base classes @@ -4884,7 +4877,8 @@ user_provided_p (tree fn) return true; else return (!DECL_ARTIFICIAL (fn) - !DECL_DEFAULTED_IN_CLASS_P (fn)); + !(DECL_INITIALIZED_IN_CLASS_P (fn) + (DECL_DEFAULTED_FN (fn) || DECL_DELETED_FN (fn; } /* Returns true iff class T has a user-provided constructor. */ @@ -5149,7 +5143,7 @@ type_has_user_declared_move_assign (tree t) } /* Nonzero if we need to build up a constructor call when initializing an - object of this class, either because it has a user-provided constructor + object of this class, either because it has a user-declared constructor or because it doesn't have a default constructor (so we need to give an error if no initializer is provided). Use TYPE_NEEDS_CONSTRUCTING when what you care about is whether or not an object can be produced by a @@ -5165,8 +5159,46 @@ type_build_ctor_call (tree t) if (TYPE_NEEDS_CONSTRUCTING (t)) return true; inner = strip_array_types (t); - return (CLASS_TYPE_P (inner) !TYPE_HAS_DEFAULT_CONSTRUCTOR (inner) - !ANON_AGGR_TYPE_P (inner)); + if (!CLASS_TYPE_P (inner) || ANON_AGGR_TYPE_P (inner)) +return false; + if (!TYPE_HAS_DEFAULT_CONSTRUCTOR (inner)) +return true; + /* A user-declared constructor might be private, and a constructor might + be trivial but deleted. */ + for (tree fns = lookup_fnfields_slot (inner, complete_ctor_identifier); + fns; fns = OVL_NEXT (fns)) +{ + tree fn = OVL_CURRENT (fns); + if (!DECL_ARTIFICIAL (fn) + || DECL_DELETED_FN (fn)) + return true; +} + return false; +} + +/* Like type_build_ctor_call, but for destructors. */ + +bool +type_build_dtor_call (tree t) +{ + tree inner; + if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t)) +return true; + inner = strip_array_types (t); + if (!CLASS_TYPE_P (inner) || ANON_AGGR_TYPE_P (inner) + || !COMPLETE_TYPE_P (inner)) +return false; + /* A user-declared destructor might be private, and a destructor might + be trivial but deleted. */ + for (tree fns = lookup_fnfields_slot (inner, complete_dtor_identifier); + fns; fns = OVL_NEXT (fns)) +{ + tree fn = OVL_CURRENT (fns); + if (!DECL_ARTIFICIAL (fn) + || DECL_DELETED_FN (fn)) +
Re: C++ PATCH to deal with trivial but non-callable [cd]tors
On 10/23/2013 08:05 PM, Jason Merrill wrote: @@ -4674,15 +4674,8 @@ deduce_noexcept_on_destructors (tree t) if (!CLASSTYPE_METHOD_VEC (t)) return; - bool saved_nontrivial_dtor = TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t); - - /* Avoid early exit from synthesized_method_walk (c++/57645). */ - TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t) = true; - for (tree fns = CLASSTYPE_DESTRUCTORS (t); fns; fns = OVL_NEXT (fns)) deduce_noexcept_on_destructor (OVL_CURRENT (fns)); - - TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t) = saved_nontrivial_dtor; } /* Subroutine of set_one_vmethod_tm_attributes. Search base classes Cool. Paolo.
Re: [2nd PING] [C++ PATCH] demangler fix (take 2)
On 10/08/2013 05:54 AM, Gary Benson wrote: diff --git a/test.c b/test.c I don't think we want to add this to the top level directory. :) Other than that, the patch is OK. Jason
[wwwdocs] Add link to jit mailing list to lists.html
OK to commit to CVS? (fwiw the pre-existing page content doesn't validate as per http://gcc.gnu.org/contribute.html#webchanges, due to issues with the include searchbox.ihtml ) Index: htdocs/lists.html === RCS file: /cvs/gcc/wwwdocs/htdocs/lists.html,v retrieving revision 1.105 diff -u -p -r1.105 lists.html --- htdocs/lists.html 25 Aug 2013 21:53:12 - 1.105 +++ htdocs/lists.html 23 Oct 2013 19:11:45 - @@ -91,6 +91,14 @@ before a href=#subscribesubscribing front end of GCC, and the corresponding runtime library. Patches to gfortran and libgfortran should go to both this list and bgcc-patches/b./li + + liba href=http://gcc.gnu.org/ml/jit/;jit/a/b + is for discussion and development of + a href=http://gcc.gnu.org/wiki/JIT;libgccjit/a, an experimental + library for implementing Just-In-Time compilation using gcc as a backend. + The list is intended for both users and developers of the library. + Patches for the jit branch should go to both this list and + bgcc-patches/b./li /ul pRead only lists:/p @@ -208,6 +216,7 @@ lists via this form:/p option value=java-prsjava-prs/option option value=java-cvsjava-cvs/option option value=fortranfortran/option + option value=jitjit/option /select br / Your e-mail address:
C++ PATCH to resolve LWG issue 1265
LWG 2165, submitted by Jonathan, argues that declaring a defaulted constructor noexcept should not be ill-formed if the implicitly- declared constructor would not be noexcept. At the Chicago meeting, CWG agreed. This patch makes it deleted instead. The second hunk adds %X for printing an exception-specification in diagnostics. Tested x86_64-pc-linux-gnu, applying to trunk. commit 785cdfa527c1481af08a31fb0f3c2489119b539c Author: Jason Merrill ja...@redhat.com Date: Tue Oct 22 16:17:31 2013 -0400 LWG 2165 * method.c (defaulted_late_check): Delete on eh-spec mismatch. (maybe_explain_implicit_delete): Explain it. diff --git a/gcc/cp/method.c b/gcc/cp/method.c index 593a4a6..594a004 100644 --- a/gcc/cp/method.c +++ b/gcc/cp/method.c @@ -1466,13 +1466,34 @@ maybe_explain_implicit_delete (tree decl) tree parms = FUNCTION_FIRST_USER_PARMTYPE (decl); tree parm_type = TREE_VALUE (parms); bool const_p = CP_TYPE_CONST_P (non_reference (parm_type)); + tree raises = NULL_TREE; + bool deleted_p = false; tree scope = push_scope (ctype); - inform (0, %q+#D is implicitly deleted because the default - definition would be ill-formed:, decl); - pop_scope (scope); + synthesized_method_walk (ctype, sfk, const_p, - NULL, NULL, NULL, NULL, true, + raises, NULL, deleted_p, NULL, false, DECL_INHERITED_CTOR_BASE (decl), parms); + if (deleted_p) + { + inform (0, %q+#D is implicitly deleted because the default + definition would be ill-formed:, decl); + synthesized_method_walk (ctype, sfk, const_p, + NULL, NULL, NULL, NULL, true, + DECL_INHERITED_CTOR_BASE (decl), parms); + } + else if (!comp_except_specs + (TYPE_RAISES_EXCEPTIONS (TREE_TYPE (decl)), + raises, ce_normal)) + inform (DECL_SOURCE_LOCATION (decl), %q#F is implicitly + deleted because its exception-specification does not + match the implicit exception-specification %qX, + decl, raises); +#ifdef ENABLE_CHECKING + else + gcc_unreachable (); +#endif + + pop_scope (scope); } input_location = loc; @@ -1782,9 +1803,10 @@ defaulted_late_check (tree fn) eh_spec, ce_normal)) { if (DECL_DEFAULTED_IN_CLASS_P (fn)) - error (function %q+D defaulted on its first declaration - with an exception-specification that differs from - the implicit declaration %q#D, fn, implicit_fn); + { + DECL_DELETED_FN (fn) = true; + eh_spec = TYPE_RAISES_EXCEPTIONS (TREE_TYPE (fn)); + } else error (function %q+D defaulted on its redeclaration with an exception-specification that differs from diff --git a/gcc/testsuite/g++.dg/cpp0x/defaulted23.C b/gcc/testsuite/g++.dg/cpp0x/defaulted23.C index 319cb39..be2fd2f 100644 --- a/gcc/testsuite/g++.dg/cpp0x/defaulted23.C +++ b/gcc/testsuite/g++.dg/cpp0x/defaulted23.C @@ -6,22 +6,32 @@ struct A A() noexcept = default; }; +A a; + struct B { - B() throw (int) = default; // { dg-error exception-specification that differs from the implicit declaration } + B() throw (int) = default; // { dg-message exception-specification } }; +B b;// { dg-error deleted } + struct C { C() throw (int) { } }; +C c; + struct D: C { D() throw (int) = default; }; +D d; + struct E { E() = default; }; + +E e; diff --git a/gcc/testsuite/g++.dg/cpp0x/defaulted43.C b/gcc/testsuite/g++.dg/cpp0x/defaulted43.C index e1c2b72..f2846fe 100644 --- a/gcc/testsuite/g++.dg/cpp0x/defaulted43.C +++ b/gcc/testsuite/g++.dg/cpp0x/defaulted43.C @@ -7,6 +7,8 @@ struct T ~T() noexcept(false) { } }; +T t; + struct A { A() noexcept; @@ -24,6 +26,8 @@ struct U ~U() noexcept(false) { } }; +U u; + struct B { B() noexcept(false); @@ -35,16 +39,22 @@ struct B B::B() noexcept(false) = default; B::~B() noexcept(false) = default; +B b; + struct V { V() noexcept(false) { } ~V() noexcept(false) { } }; +V v; + struct C { - C() noexcept = default; // { dg-error defaulted } - ~C() noexcept = default;// { dg-error defaulted } + C() noexcept = default; // { dg-message exception-specification } + ~C() noexcept = default; // { dg-message exception-specification } V v; }; + +C c;// { dg-error deleted } commit cc95ba6f007d4b15a40b2da5d9e7ee65a276b738 Author: Jason Merrill ja...@redhat.com Date: Tue Oct 22 16:15:32 2013 -0400 c-family/ * c-format.c (gcc_cxxdiag_char_table): Add %X. cp/ * error.c (eh_spec_to_string): New. (cp_printer): Use it for %X. diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c index c11d93a..f0371d3 100644 --- a/gcc/c-family/c-format.c +++ b/gcc/c-family/c-format.c @@ -721,7 +721,7 @@ static const format_char_info gcc_cxxdiag_char_table[] = /* Custom conversion specifiers. */ /* These will require a tree at runtime. */ - { ADEFKSTV,0,STD_C89,{ T89_V, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN,
Re: [wide-int] Make trees more like rtxes
On Oct 23, 2013, at 2:09 AM, Richard Biener rguent...@suse.de wrote: Good. Better names - ah well, wi::to_max_wide_int (t) and wi::to_addr_wide_int (t)? Btw, addr_wide_int is an odd name The idea was to have one type to rule them all… everything address related... bit offsets, byte offsets… Bit offset is the wrong name, as that would be a wrong type to use for a byte offset. Someone that has a bit offset type, would naturally want to /8 to get a byte offset. If one wanted a bit offset, then it should be split into two, bit offset and byte offset.
Re: [wide-int] Make trees more like rtxes
On Oct 23, 2013, at 5:00 AM, Richard Sandiford rsand...@linux.vnet.ibm.com wrote: offset_int, max_int, wi::to_offset and wi::to_max sound OK to me. Kenny? Mike? Those two names seem reasonable. to_offset should document that these are for address offsets (and address constants) exclusively.
RE: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)
Can you take a look at calls.c::special_function_p and determine if we need to do something special for spawn here? I will look into it and let you know. Any word on this? Hi Jeff, I looked into this function and from what I can tell, it is used to mark certain functions (e.g. builtin functions) as special and thus don't do special optimizations on them like a regular function. The thing is, the spawnee (the function being spawned) can be pretty much any regular function. The compiler doesn't even touch inside the function. The compiler inserts specific Cilk function calls in the spawner and transplants the function . The only major restriction I know is that the frame pointer needs to be used and that I mark as I mentioned above. Is there anything you were thinking about that I missed? Thanks, Balaji V. Iyer. jeff
Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)
On 10/23/13 13:46, Iyer, Balaji V wrote: Can you take a look at calls.c::special_function_p and determine if we need to do something special for spawn here? I will look into it and let you know. Any word on this? Hi Jeff, I looked into this function and from what I can tell, it is used to mark certain functions (e.g. builtin functions) as special and thus don't do special optimizations on them like a regular function. The thing is, the spawnee (the function being spawned) can be pretty much any regular function. The compiler doesn't even touch inside the function. The compiler inserts specific Cilk function calls in the spawner and transplants the function . The only major restriction I know is that the frame pointer needs to be used and that I mark as I mentioned above. Is there anything you were thinking about that I missed? There wasn't anything in particular I was worried about. Just a general question as to whether or not we needed to mark the spawner or spawnee as special, partiuclarly returns twice (setjmp/fork) and never returns (longjmp). Jeff
Re: [PATCH, PR58805] Add missing check in stmt_local_def for tail-merge
On 10/23/13 10:02, Tom de Vries wrote: On 22/10/13 20:50, Jeff Law wrote: On 10/22/13 03:58, Tom de Vries wrote: Richard, This patch adds a missing check for gimple_vdef in stmt_local_def for the tail-merge pass. Bootstrapped and reg-tested on x86_64. OK for trunk, gcc-4_8-branch? Thanks, - Tom 2013-10-22 Tom de Vries t...@codesourcery.com PR tree-optimization/58805 * tree-ssa-tail-merge.c (stmt_local_def): Add gimple_vdef check. * gcc.dg/pr58805.c: New test. Doesn't this test belong in an architecture specific directory? Jeff, The test-case has i386 assembly inside the asm string, but since the test-case only compiles, the assembly string is never used. I've made the string empty to make that clear. Thanks. AFAIU the only requirement for this test-case is that the constraint matches the operand. I'm not sure whether 'unsigned long' always matches 'r'. I've changed this into 'void *' and 'p', which I think should always be true. Committed as below. Excellent. Thanks again. jeff
RE: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)
-Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Wednesday, October 23, 2013 3:53 PM To: Iyer, Balaji V; r...@redhat.com; Jason Merrill (ja...@redhat.com); Aldy Hernandez (al...@redhat.com) Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++) On 10/23/13 13:46, Iyer, Balaji V wrote: Can you take a look at calls.c::special_function_p and determine if we need to do something special for spawn here? I will look into it and let you know. Any word on this? Hi Jeff, I looked into this function and from what I can tell, it is used to mark certain functions (e.g. builtin functions) as special and thus don't do special optimizations on them like a regular function. The thing is, the spawnee (the function being spawned) can be pretty much any regular function. The compiler doesn't even touch inside the function. The compiler inserts specific Cilk function calls in the spawner and transplants the function . The only major restriction I know is that the frame pointer needs to be used and that I mark as I mentioned above. Is there anything you were thinking about that I missed? There wasn't anything in particular I was worried about. Just a general question as to whether or not we needed to mark the spawner or spawnee as special, partiuclarly returns twice (setjmp/fork) and never returns (longjmp). I do check for those in the the spawnee using the check_outlined_calls function. Jeff
Re: [PATCH, PR58805] Add missing check in stmt_local_def for tail-merge
On 10/23/13 02:16, Richard Biener wrote: On Tue, 22 Oct 2013, Jeff Law wrote: On 10/22/13 03:58, Tom de Vries wrote: Richard, This patch adds a missing check for gimple_vdef in stmt_local_def for the tail-merge pass. Bootstrapped and reg-tested on x86_64. OK for trunk, gcc-4_8-branch? Thanks, - Tom 2013-10-22 Tom de Vries t...@codesourcery.com PR tree-optimization/58805 * tree-ssa-tail-merge.c (stmt_local_def): Add gimple_vdef check. * gcc.dg/pr58805.c: New test. Doesn't this test belong in an architecture specific directory? Under what conditions can a statement have a VDEF but not be considered as having a side effect by gimple_has_side_effects? It almost seems to me that gimple_has_side_effects may need updating. You seem to misunderstand side-effect, for example *p = 1; has !gimple_has_side_effects but it has a VDEF. Likewise *p = const_call_returing_aggregate (); has !gimple_has_side_effects but it has a VDEF. side-effect is an effect that is not explicitely represented in the gimple stmt you look at. So side effects as in implicit... That makes sense. Thanks for the clarification. jeff
Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes
On 10/23/13 04:57, Ilya Enkovich wrote: 2013-10-23 Ilya Enkovich ilya.enkov...@intel.com * mode-classes.def (MODE_POINTER_BOUNDS): New. * tree.def (POINTER_BOUNDS_TYPE): New. * genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS. (POINTER_BOUNDS_MODE): New. (make_pointer_bounds_mode): New. * machmode.h (POINTER_BOUNDS_MODE_P): New. * stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS. (layout_type): Support POINTER_BOUNDS_TYPE. * tree-pretty-print.c (dump_generic_node): Support POINTER_BOUNDS_TYPE. * tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE. (type_contains_placeholder_1): Likewise. * tree.h (POINTER_BOUNDS_TYPE_P): New. * varasm.c (output_constant): Support POINTER_BOUNDS_TYPE. * doc/rtl.texi (MODE_POINTER_BOUNDS): New. OK for the trunk. IIRC, there was a backend patch with conditional approval that should be good to go now (conditional upon the acceptance of the types/modes patch). Note that since I asked for a couple things to be renamed that backend patch might need tweaking. If so, make the obvious changes, post the patch (so it's recorded into the archives) and go ahead and check it into the trunk. jeff
Re: [PATCH] Fixing improper conversion from sin() to sinf() in optimization mode.
On Mon, 7 Oct 2013, Cong Hou wrote: + if (type != newtype) +break; That comparison would wrongly treat as different cases where the types differ only in one being a typedef, having qualifiers, etc. - or if in future GCC implemented proposed TS 18661-3, cases where they differ in e.g. one being float and the other _Float32 (defined as distinct types that are not compatible although they have the same representation and alignment). I think the right test here, bearing in mind the _Float32 case where types may not be compatible, is TYPE_MODE (type) != TYPE_MODE (newtype) - if the types have the same mode, they have the same set of values and so are not different in any way that matters for this optimization. OK with that change. -- Joseph S. Myers jos...@codesourcery.com
[libiberty] Fix testsuite/test-expandargv.c
I'm not sure why this did not trigger on GNU/Linux distributions, on FreeBSD 10 one does need to #include unistd.h to get to unlink() which is used by this test. With this, the test compiles on my i386-unknown-freebsd10.0 tester; without it, it doesn't. Okay? Gerald 2013-10-22 Gerald Pfeifer ger...@pfeifer.com * testsuite/test-expandargv.c: Include unistd.h Index: testsuite/test-expandargv.c === --- testsuite/test-expandargv.c (revision 203981) +++ testsuite/test-expandargv.c (working copy) @@ -40,6 +40,9 @@ #ifdef HAVE_STRING_H #include string.h #endif +#ifdef HAVE_UNISTD_H +#include unistd.h +#endif #ifndef EXIT_SUCCESS #define EXIT_SUCCESS 0
Re: [libiberty] Fix testsuite/test-expandargv.c
Ok.
Re: [PATCH] Generate fused widening multiply-and-accumulate operations only when the widening multiply has single use
Hi, Thank you both for the reviewing. I've updated the patch and also added a test (to the gcc.dg to avoid duplication). I'll commit the patch shortly. Thanks, Yufeng gcc/ * tree-ssa-math-opts.c (convert_plusminus_to_widen): Call has_single_use () and not do the conversion if has_single_use () returns false for the multiplication result. gcc/testsuite/ * gcc.dg/wmul-1.c: New test. On 10/23/13 10:42, Richard Biener wrote: On Tue, Oct 22, 2013 at 12:01 AM, Yufeng Zhangyufeng.zh...@arm.com wrote: Hi, This patch changes the widening_mul pass to fuse the widening multiply with accumulate only when the multiply has single use. The widening_mul pass currently does the conversion regardless of the number of the uses, which can cause poor code-gen in cases like the following: typedef int ArrT [10][10]; void foo (ArrT Arr, int Idx) { Arr[Idx][Idx] = 1; Arr[Idx + 10][Idx] = 2; } On AArch64, after widening_mul, the IR is like _2 = (long unsigned int) Idx_1(D); _3 = Idx_1(D) w* 40; _5 = Arr_4(D) + _3; *_5[Idx_1(D)] = 1; _8 = WIDEN_MULT_PLUS_EXPRIdx_1(D), 40, 400; _9 = Arr_4(D) + _8; *_9[Idx_1(D)] = 2; Where the arrows point, there are redundant widening multiplies. Bootstrap successfully on x86_64. The patch passes the regtest on aarch64, arm and x86_64. OK for the trunk? if (!is_widening_mult_p (rhs1_stmt,type1,mult_rhs1, -type2,mult_rhs2)) +type2,mult_rhs2) + || !has_single_use (rhs1)) please check has_single_use first, it's the cheaper check. Ok with that change (and possibly a testcase). Thanks, Richard. Thanks, Yufeng p.s. Note that x86_64 doesn't suffer from this issue as the corresponding widening multiply accumulate op is not available on the target. gcc/ * tree-ssa-math-opts.c (convert_plusminus_to_widen): Call has_single_use () and not do the conversion if has_single_use () returns false for the multiplication result. diff --git a/gcc/testsuite/gcc.dg/wmul-1.c b/gcc/testsuite/gcc.dg/wmul-1.c new file mode 100644 index 000..3e762f4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wmul-1.c @@ -0,0 +1,19 @@ +/* Not to fuse widening multiply with accumulate if the multiply has more than + one uses. + Note that for targets where pointer and int are of the same size or + widening multiply-and-accumulate is not available, this test just passes. */ + +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-widening_mul } */ + +typedef int ArrT [10][10]; + +void +foo (ArrT Arr, int Idx) +{ + Arr[Idx][Idx] = 1; + Arr[Idx + 10][Idx] = 2; +} + +/* { dg-final { scan-tree-dump-not WIDEN_MULT_PLUS_EXPR widening_mul } } */ +/* { dg-final { cleanup-tree-dump widening_mul } } */ diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index f7f8ec9..77701ae 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -2425,20 +2425,25 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, It might also appear that it would be sufficient to use the existing operands of the widening multiply, but that would limit the choice of - multiply-and-accumulate instructions. */ + multiply-and-accumulate instructions. + + If the widened-multiplication result has more than one uses, it is + probably wiser not to do the conversion. */ if (code == PLUS_EXPR (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR)) { - if (!is_widening_mult_p (rhs1_stmt, type1, mult_rhs1, - type2, mult_rhs2)) + if (!has_single_use (rhs1) + || !is_widening_mult_p (rhs1_stmt, type1, mult_rhs1, + type2, mult_rhs2)) return false; add_rhs = rhs2; conv_stmt = conv1_stmt; } else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR) { - if (!is_widening_mult_p (rhs2_stmt, type1, mult_rhs1, - type2, mult_rhs2)) + if (!has_single_use (rhs2) + || !is_widening_mult_p (rhs2_stmt, type1, mult_rhs1, + type2, mult_rhs2)) return false; add_rhs = rhs1; conv_stmt = conv2_stmt;
[wide-int] Make the main tree decompose cheaper
This patch stores two array lengths in an INTEGER_CST: the length that should be used when accessing the constant in its TYPE_PRECISION, and the one that should be used for wider precisions. This means that the main tree decompose routine is just a simple storage_ref constructor. It also means that cst_fits_shwi_p can be simplified back to its earlier form (before r203602). The patch also fixes wi::extended_tree N::get_len so that it uses the right length when a tree has the same precision as addr_wide_int. I just noticed this by inspection, I don't have a testcase. (It might be better to require address trees to be smaller than addr_wide_int though -- I can look into that if you agree. But just changing: gcc_checking_assert (TYPE_PRECISION (TREE_TYPE (t)) = N); to: gcc_checking_assert (TYPE_PRECISION (TREE_TYPE (t)) N); did trigger when I tried it in the first version of the wi::address patch.) Tested on powerpc64-linux-gnu. OK for wide-int? Thanks, Richard Index: gcc/c-family/c-common.c === --- gcc/c-family/c-common.c 2013-10-23 17:43:03.339574704 +0100 +++ gcc/c-family/c-common.c 2013-10-23 17:43:21.593731522 +0100 @@ -5463,7 +5463,7 @@ c_common_nodes_and_builtins (void) } /* This node must not be shared. */ - void_zero_node = make_int_cst (1); + void_zero_node = make_int_cst (1, 1); TREE_TYPE (void_zero_node) = void_type_node; void_list_node = build_void_list_node (); @@ -5674,7 +5674,7 @@ c_common_nodes_and_builtins (void) /* Create the built-in __null node. It is important that this is not shared. */ - null_node = make_int_cst (1); + null_node = make_int_cst (1, 1); TREE_TYPE (null_node) = c_common_type_for_size (POINTER_SIZE, 0); /* Since builtin_types isn't gc'ed, don't export these nodes. */ Index: gcc/lto-streamer-out.c === --- gcc/lto-streamer-out.c 2013-10-23 17:42:54.691500411 +0100 +++ gcc/lto-streamer-out.c 2013-10-23 17:43:05.853596301 +0100 @@ -712,6 +712,7 @@ #define visit(SIBLING) \ { int i; v = iterative_hash_host_wide_int (TREE_INT_CST_NUNITS (t), v); + v = iterative_hash_host_wide_int (TREE_INT_CST_EXT_NUNITS (t), v); for (i = 0; i TREE_INT_CST_NUNITS (t); i++) v = iterative_hash_host_wide_int (TREE_INT_CST_ELT (t, i), v); } Index: gcc/tree-core.h === --- gcc/tree-core.h 2013-10-23 17:42:54.691500411 +0100 +++ gcc/tree-core.h 2013-10-23 17:48:33.634415070 +0100 @@ -741,12 +741,25 @@ struct GTY(()) tree_base { of the field must be large enough to hold addr_space_t values. */ unsigned address_space : 8; } bits; + /* The following fields are present in tree_base to save space. The nodes using them do not require any of the flags above and so can make better use of the 4-byte sized word. */ -/* VEC length. This field is only used with TREE_VEC and - TREE_INT_CST. */ + +/* The number of HOST_WIDE_INTs in an INTEGER_CST. */ +struct { + /* The number of HOST_WIDE_INTs if the INTEGER_CST is accessed in +its native precision. */ + unsigned short unextended; + + /* The number of HOST_WIDE_INTs if the INTEGER_CST is extended to +wider precisions based on its TYPE_SIGN. */ + unsigned short extended; +} split_length; + +/* VEC length. This field is only used with TREE_VEC. */ int length; + /* SSA version number. This field is only used with SSA_NAME. */ unsigned int version; } GTY((skip())) u; Index: gcc/tree-streamer-in.c === --- gcc/tree-streamer-in.c 2013-10-23 17:42:54.691500411 +0100 +++ gcc/tree-streamer-in.c 2013-10-23 19:46:18.512517370 +0100 @@ -147,7 +147,7 @@ unpack_ts_base_value_fields (struct bitp unpack_ts_int_cst_value_fields (struct bitpack_d *bp, tree expr) { int i; - for (i = 0; i TREE_INT_CST_NUNITS (expr); i++) + for (i = 0; i TREE_INT_CST_EXT_NUNITS (expr); i++) TREE_INT_CST_ELT (expr, i) = bp_unpack_var_len_int (bp); } @@ -571,8 +571,8 @@ streamer_alloc_tree (struct lto_input_bl else if (CODE_CONTAINS_STRUCT (code, TS_INT_CST)) { unsigned HOST_WIDE_INT len = streamer_read_uhwi (ib); - result = make_int_cst (len); - TREE_INT_CST_NUNITS (result) = len; + unsigned HOST_WIDE_INT ext_len = streamer_read_uhwi (ib); + result = make_int_cst (len, ext_len); } else if (code == CALL_EXPR) { Index: gcc/tree-streamer-out.c === --- gcc/tree-streamer-out.c 2013-10-23 17:42:54.691500411 +0100 +++ gcc/tree-streamer-out.c 2013-10-23 22:06:08.864297213 +0100 @@ -124,7 +124,7 @@ pack_ts_int_cst_value_fields (struct bit int i; /*
Re: [wide-int] Make trees more like rtxes
Mike Stump mikest...@comcast.net writes: On Oct 23, 2013, at 5:00 AM, Richard Sandiford rsand...@linux.vnet.ibm.com wrote: offset_int, max_int, wi::to_offset and wi::to_max sound OK to me. Kenny? Mike? Those two names seem reasonable. to_offset should document that these are for address offsets (and address constants) exclusively. Reading this back, I realise max_int might sound too similar to INT_MAX. Maybe we could follow the current HOST_* stuff and use: offset_int, widest_int, wi::to_offset and wi::to_widest. Bah. I'm no good at naming stuff... Richard
Re: Debug functions review
On 10/23/2013 12:37 AM, Paolo Carlini wrote: Hi, François Dumont frs.dum...@gmail.com ha scritto: Hi Here is a patch to clean up a little some debug functions. I got rid of the __check_singular_aux, simply playing with __check_singular overloads was enough. I also added the missing __check_dereferenceable for safe local iterators. This is probably straightforward but I want to be sure I understand your previous message + this one: do they mean that in some cases, due to that missing 'const', we weren't catching non-dereferenceable iterators? Thus, should we also add a testcase? Paolo You are right, I am preparing a test case. However you have to know that __check_dereferenceable is simply not used for the moment. It is only because I have started using it for a debug mode evolution that I discovered the issue. François
Re: [PATCH, MPX, 2/X] Pointers Checker [1/25] Hooks
On 10/21/13 08:20, Ilya Enkovich wrote: diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 8d220f3..79bd0f9 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi +@deftypefn {Target Hook} rtx TARGET_LOAD_BOUNDS_FOR_ARG (rtx @var{slot}, rtx @var{arg}, rtx @var{slot_no}) +This hook is used to emit insn to load bounds of @var{arg} passed +in @var{slot}. In case @var{slot} is not a memory, @var{slot_no} is RTX +constant holding number of the special slot we should get bounds from. +Return loaded bounds. +@end deftypefn + +@deftypefn {Target Hook} void TARGET_STORE_BOUNDS_FOR_ARG (rtx @var{arg}, rtx @var{slot}, rtx @var{bounds}, rtx @var{slot_no}) +This hook is used to emit insn to store bounds of @var{arg} passed +in @var{slot}. In case @var{slot} is not a memory, @var{slot_no} is RTX +constant holding number of the special slot we should store bounds to. +@end deftypefn + Almost there. What I think is missing is more information about the case where SLOT is not a memory (presumably it's a reg) and how that relates to SLOT_NO. Isn't this just providing a mapping from the input argument registers to some set of bounds pointer registers? Presumably you aren't really exposing the bound registers as argument registers hence this hack? Not asking you to change anything, just trying to understand the rationale. @node Trampolines @section Trampolines for Nested Functions @cindex trampolines for nested functions @@ -10907,6 +10927,27 @@ ignored. This function should return the result of the call to the built-in function. @end deftypefn +@deftypefn {Target Hook} tree TARGET_BUILTIN_CHKP_FUNCTION (unsigned @var{fcode}) +This hook allows target to redefine built-in functions used by +Pointers Checker for code instrumentation. Hook should return +fndecl of function implementing generic builtin whose code is +passed in @var{fcode}. Currently following built-in functions are +obtained using this hook: +@code{BUILT_IN_CHKP_BNDMK}, @code{BUILT_IN_CHKP_BNDSTX}, +@code{BUILT_IN_CHKP_BNDLDX}, @code{BUILT_IN_CHKP_BNDCL}, +@code{BUILT_IN_CHKP_BNDCU}, @code{BUILT_IN_CHKP_BNDRET}, +@code{BUILT_IN_CHKP_INTERSECT}, @code{BUILT_IN_CHKP_SET_PTR_BOUNDS}, +@code{BUILT_IN_CHKP_NARROW}, @code{BUILT_IN_CHKP_ARG_BND}, +@code{BUILT_IN_CHKP_SIZEOF}, @code{BUILT_IN_CHKP_EXTRACT_LOWER}, +@code{BUILT_IN_CHKP_EXTRACT_UPPER}. +@end deftypefn +@deftypefn {Target Hook} tree TARGET_CHKP_BOUND_TYPE (void) +Return type to be used for bounds +@end deftypefn +@deftypefn {Target Hook} {enum machine_mode} TARGET_CHKP_BOUND_MODE (void) +Return mode to be used for bounds. +@end deftypefn So how am I (as a GCC developer) suppsoed to know what BNDMK, BNDLDX, BNDCU, etc mean? The names aren't particularly descriptive. Are these documented elsewhere in a follow-up patch? If not, it seems to me we need to document them here. Jeff
Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins
On 10/21/13 05:49, Ilya Enkovich wrote: Hi, This patch introduces built-in functions used by Pointers Checker and flag to enable Pointers Checker. Builtins available for user are expanded in expand_builtin. All other builtins are not allowed in expand until generic version of Pointers Cheker is implemented. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2013-10-04 Ilya Enkovich ilya.enkov...@intel.com * builtin-types.def (BT_FN_VOID_CONST_PTR): New. (BT_FN_PTR_CONST_PTR): New. (BT_FN_CONST_PTR_CONST_PTR): New. (BT_FN_PTR_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR): New. (BT_FN_VOID_PTRPTR_CONST_PTR): New. (BT_FN_VOID_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New. * chkp-builtins.def: New. * builtins.def: include chkp-builtins.def. (DEF_CHKP_BUILTIN): New. * builtins.c (expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS, BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS, BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS, BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS, BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS, BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND, BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL, BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET, BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_ARG_BND, BUILT_IN_CHKP_NARROW, BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER. * common.opt (fcheck-pointers): New. * toplev.c (process_options): Check Pointers Checker is supported. * doc/extend.texi: Document Pointers Checker built-in functions. Out of curiosity, did you consider and/or discuss with Richard whether or not to make these target-dependent or target-independent builtins? I realize it's a bit problematic with Richard being involved during the NDA portion and someone else during the review/integration portion, but that's unfortunately where we are. I don't think we've generally encouraged target-independent builtins which are only implemented on one target, though I can see the possibility that these may need enough support code in gimple and beyond that they may need to be target-independent. I also find myself wondering if mudflap can/should be reimplemented on top of these hooks (or perhaps we should remove it, I'm not aware of anyone using it in the real world). Anyway, like with the first patch, I'm trying to get a sense of some design decisions as they're important for the review process. Jeff
[PATCH, testsuite committed] Fix powerpc direct-move.h
The following fixes the gcc.target/powerpc/direct-move-*2.c executable test failures. Committed as obvious. 2013-10-23 Pat Haugen pthau...@us.ibm.com * gcc.target/powerpc/direct-move.h: Fix header for executable tests. Index: gcc/testsuite/gcc.target/powerpc/direct-move.h === --- gcc/testsuite/gcc.target/powerpc/direct-move.h(revision 203993) +++ gcc/testsuite/gcc.target/powerpc/direct-move.h(working copy) @@ -1,5 +1,7 @@ /* Test functions for direct move support. */ +#include math.h +extern void abort (void); void __attribute__((__noinline__)) copy (TYPE *a, TYPE *b) @@ -107,7 +109,7 @@ const struct test_struct test_functions[ void __attribute__((__noinline__)) test_value (TYPE a) { - size_t i; + long i; for (i = 0; i sizeof (test_functions) / sizeof (test_functions[0]); i++) { @@ -123,8 +125,7 @@ test_value (TYPE a) int main (void) { - size_t i; - long j; + long i,j; union { TYPE value; unsigned char bytes[sizeof (TYPE)];
[Patch, Fortran] PR44350 - add constraint check for BLOCK DATA
A rather simple patch, which tries to implement Fortran 2008's C1116 (see also PR for the wording). While creating the patch, I found a reject-valid issue, which is now tracked as PR fortran/58857. Build and regtested on x86-64-gnu-linux. OK for the trunk? Tobias 2013-10-23 Tobias Burnus bur...@net-b.de PR fortran/44350 * parse.c (parse_spec): Add C1116 constraint check for BLOCK DATA. 2013-10-23 Tobias Burnus bur...@net-b.de PR fortran/44350 * gfortran.dg/blockdata_8.f90: New. diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c index 512babf..e8b9885 100644 --- a/gcc/fortran/parse.c +++ b/gcc/fortran/parse.c @@ -2628,6 +2628,33 @@ loop: default: break; } + else if (gfc_current_state () == COMP_BLOCK_DATA) +/* Fortran 2008, C1116. */ +switch (st) + { +case ST_DATA_DECL: + case ST_COMMON: + case ST_DATA: + case ST_TYPE: + case ST_END_BLOCK_DATA: + case ST_ATTR_DECL: + case ST_EQUIVALENCE: + case ST_PARAMETER: + case ST_IMPLICIT: + case ST_IMPLICIT_NONE: + case ST_DERIVED_DECL: + case ST_USE: + break; + + case ST_NONE: + break; + + default: + gfc_error (%s statement is not allowed inside of BLOCK DATA at %C, + gfc_ascii_statement (st)); + reject_statement (); + break; + } /* If we find a statement that can not be followed by an IMPLICIT statement (and thus we can expect to see none any further), type the function result diff --git a/gcc/testsuite/gfortran.dg/blockdata_8.f90 b/gcc/testsuite/gfortran.dg/blockdata_8.f90 new file mode 100644 index 000..d3f9925 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/blockdata_8.f90 @@ -0,0 +1,48 @@ +! { dg-do compile } +! +! PR fortran/44350 +! +! Fortran 2008, C1116 only permits a small subset of statements in BLOCK DATA +! +! Part of the test case was contributed by Vittorio Zecca +! +module m +end module m + +BLOCK DATA valid2 + use m + implicit integer(a-z) + intrinsic :: sin + common /one/ a, c + bind(C) :: /one/ + dimension c(5) + parameter (g = 7) +END BLOCK DATA valid2 + +BLOCK DATA valid + use m + implicit none + type t +sequence + end type t + type(t), save :: x + integer :: y + real :: q + save :: y + dimension :: q(5) +! class(*) :: zz ! See PR fortran/58857 +! pointer :: zz + target :: q + volatile y + asynchronous q +END BLOCK DATA valid + +block data invalid + common x + f(x)=x ! { dg-error STATEMENT FUNCTION statement is not allowed inside of BLOCK DATA } + interface ! { dg-error INTERFACE statement is not allowed inside of BLOCK DATA } + end interface +1 format() ! { dg-error FORMAT statement is not allowed inside of BLOCK DATA } +end block invalid ! { dg-error Expecting END BLOCK DATA statement } + +! { dg-error Unexpected end of file { target *-*-* } 0 }
Re: Debug functions review
On 10/23/2013 11:22 PM, François Dumont wrote: You are right, I am preparing a test case. However you have to know that __check_dereferenceable is simply not used for the moment. It is only because I have started using it for a debug mode evolution that I discovered the issue. Ok, thanks. Now however I'm curious to know the story of __check_dereferenceable: dates back to when Doug added debug-mode and never used since? Any idea? Paolo.
AIX: Dependency problem with xlC/g++ combination
Hi! When building on gcc111, I tried to use different compilers for my build robot. This machine has IBM's XL C compiler installed, but not the C++ compiler. So I used CC=xlC, g++ was auto-detected for CXX. This led to depmode=aix, but g++ was actually used later on for calculating dependencies, totally fucking up the build directory (since it's output was wrongly parsed using aix-style.) Since some time, GCC's language is C++, so let's update dependency generation to use the proper compiler, which in turn allows to build the C parts (ie. binutils etc.) with a non-GNU C compiler. I'd like to get some comments on this patch, which works for me for stage-1 builds with gcc/g++ (on a amd64-linux box) as well as with xlC/g++ on gcc111 (the mentioned AIX machine): 2013-10-24 Jan-Benedict Glaw jbg...@lug-owl.de gcc/ * configure.ac (ZW_PROG_COMPILER_DEPENDENCIES): Use CXX instead of CC. * Makefile.in (CXXDEPMODE): Assign and change users. (CCDEPMODE): Delete. * configure: Regenerate. diff --git a/gcc/configure.ac b/gcc/configure.ac index 509..5e686db 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -974,7 +974,7 @@ AC_CONFIG_COMMANDS([gccdepdir],[ ${CONFIG_SHELL-/bin/sh} $ac_aux_dir/mkinstalldirs $lang/$DEPDIR done], [subdirs=$subdirs ac_aux_dir=$ac_aux_dir DEPDIR=$DEPDIR]) -ZW_PROG_COMPILER_DEPENDENCIES([CC]) +ZW_PROG_COMPILER_DEPENDENCIES([CXX]) AC_LANG_POP(C++) # diff --git a/gcc/Makefile.in b/gcc/Makefile.in index f0b8c5a..f519455 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -311,7 +311,7 @@ write_entries_to_file = $(shell rm -f $(2) || :) $(shell touch $(2)) \ # # Dependency tracking stuff. -CCDEPMODE = @CCDEPMODE@ +CXXDEPMODE = @CXXDEPMODE@ DEPDIR = @DEPDIR@ depcomp = $(SHELL) $(srcdir)/../depcomp @@ -1040,7 +1040,7 @@ INCLUDES = -I. -I$(@D) -I$(srcdir) -I$(srcdir)/$(@D) \ $(CLOOGINC) $(ISLINC) COMPILE.base = $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) -o $@ -ifeq ($(CCDEPMODE),depmode=gcc3) +ifeq ($(CXXDEPMODE),depmode=gcc3) # Note a subtlety here: we use $(@D) for the directory part, to make # things like the go/%.o rule work properly; but we use $(*F) for the # file part, as we just want the file part of the stem, not the entire @@ -1049,7 +1049,7 @@ COMPILE = $(COMPILE.base) -MT $@ -MMD -MP -MF $(@D)/$(DEPDIR)/$(*F).TPo POSTCOMPILE = @mv $(@D)/$(DEPDIR)/$(*F).TPo $(@D)/$(DEPDIR)/$(*F).Po else COMPILE = source='$' object='$@' libtool=no \ -DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) $(COMPILE.base) +DEPDIR=$(DEPDIR) $(CXXDEPMODE) $(depcomp) $(COMPILE.base) POSTCOMPILE = endif diff --git a/gcc/configure b/gcc/configure index bedf3b1..1e7bcb6 100755 --- a/gcc/configure +++ b/gcc/configure @@ -737,7 +737,7 @@ LDEXP_LIB EXTRA_GCC_LIBS GNAT_LIBEXC COLLECT2_LIBS -CCDEPMODE +CXXDEPMODE DEPDIR am__leading_dot CXXCPP @@ -8766,12 +8766,12 @@ ac_config_commands=$ac_config_commands depdir ac_config_commands=$ac_config_commands gccdepdir -depcc=$CC am_compiler_list= +depcc=$CXX am_compiler_list= am_depcomp=$ac_aux_dir/depcomp { $as_echo $as_me:${as_lineno-$LINENO}: checking dependency style of $depcc 5 $as_echo_n checking dependency style of $depcc... 6; } -if test ${am_cv_CC_dependencies_compiler_type+set} = set; then : +if test ${am_cv_CXX_dependencies_compiler_type+set} = set; then : $as_echo_n (cached) 6 else if test -f $am_depcomp; then @@ -8793,7 +8793,7 @@ else # directory. mkdir sub - am_cv_CC_dependencies_compiler_type=none + am_cv_CXX_dependencies_compiler_type=none if test $am_compiler_list = ; then am_compiler_list=`sed -n 's/^\([a-zA-Z0-9]*\))$/\1/p' ./depcomp` fi @@ -8838,7 +8838,7 @@ else # icc: Command line remark: option '-MP' not supported if (grep 'ignoring option' conftest.err || grep 'not supported' conftest.err) /dev/null 21; then :; else -am_cv_CC_dependencies_compiler_type=$depmode +am_cv_CXX_dependencies_compiler_type=$depmode $as_echo $as_me:$LINENO: success 5 break fi @@ -8850,15 +8850,15 @@ else cd .. rm -rf conftest.dir else - am_cv_CC_dependencies_compiler_type=none + am_cv_CXX_dependencies_compiler_type=none fi fi -{ $as_echo $as_me:${as_lineno-$LINENO}: result: $am_cv_CC_dependencies_compiler_type 5 -$as_echo $am_cv_CC_dependencies_compiler_type 6; } -if test x${am_cv_CC_dependencies_compiler_type-none} = xnone +{ $as_echo $as_me:${as_lineno-$LINENO}: result: $am_cv_CXX_dependencies_compiler_type 5 +$as_echo $am_cv_CXX_dependencies_compiler_type 6; } +if test x${am_cv_CXX_dependencies_compiler_type-none} = xnone then as_fn_error no usable dependency style found $LINENO 5 -else CCDEPMODE=depmode=$am_cv_CC_dependencies_compiler_type +else CXXDEPMODE=depmode=$am_cv_CXX_dependencies_compiler_type fi Any suggestions to better testing? Or an ACK/NACK for the overall change? (Maybe there's something I didn't
Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins
On 10/23/2013 02:41 PM, Jeff Law wrote: Out of curiosity, did you consider and/or discuss with Richard whether or not to make these target-dependent or target-independent builtins? I realize it's a bit problematic with Richard being involved during the NDA portion and someone else during the review/integration portion, but that's unfortunately where we are. I suggested that they be target independent. I suggested that there was nothing in MPX that couldn't be done generically, if slower, on non-MPX hardware. E.g. there's no reason why bounds couldn't be packed into a double-word integer, and the checking builtins completely outlined into a runtime library. I suggested that the optimization done on the bound type would help a generic mudflap replacement. r~
Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping
On Wed, Oct 23, 2013 at 6:46 PM, Martin Jambor wrote: /* Perform the second half of the transformation started in @@ -4522,7 +4704,15 @@ ira (FILE *f) allocation because of -O0 usage or because the function is too big. */ if (ira_conflicts_p) -find_moveable_pseudos (); +{ + df_analyze (); + calculate_dominance_info (CDI_DOMINATORS); + + find_moveable_pseudos (); + split_live_ranges_for_shrink_wrap (); + + free_dominance_info (CDI_DOMINATORS); +} You probably want to add another df_analyze if split_live_ranges_for_shrink_wrap makes code transformations. AFAIU find_moveable_pseudos doesn't change global liveness but your transformation might. IRA/LRA need up-to-date DF_LR results to compute allocno live ranges. Ciao! Steven
[ARM][PATCH] Fix testsuite testcase neon-vcond-[ltgt,unordered].c
Hi, arm testcases neon-vcond-ltgt.c and neon-vcond-unordered.c fails in Linaro 4.8 branch. It is not reproducable with trunk but it can happen. Both neon-vcond-ltgt.c and neon-vcond-unordered.c scans for vbsl instruction, with other vector instructions. However, as per the comment for neon_vbslmode_internal md pattern defined in neon.md, gcc can generate vbsl or vbit or vbif depending on the register allocation. Therfore, these testcases should scan for one of these three instructions instead of just vbsl. I have updated the testcases to scan vbsl or vbit or vbif now. Is this OK? Thanks, Kugan 2013-10-23 Kugan Vivekanandarajah kug...@linaro.org * gcc.target/arm/neon-vcond-ltgt.c: Scan for vbsl or vbit or vbif. * gcc.target/arm/neon-vcond-unordered.c: Scan for vbsl or vbit or vbif. diff --git a/gcc/testsuite/gcc.target/arm/neon-vcond-ltgt.c b/gcc/testsuite/gcc.target/arm/neon-vcond-ltgt.c index acb23a9..c8306e3 100644 --- a/gcc/testsuite/gcc.target/arm/neon-vcond-ltgt.c +++ b/gcc/testsuite/gcc.target/arm/neon-vcond-ltgt.c @@ -15,4 +15,4 @@ void foo (int ilast,float* w, float* w2) /* { dg-final { scan-assembler-times vcgt\\.f32\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+ 2 } } */ /* { dg-final { scan-assembler vorr\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+ } } */ -/* { dg-final { scan-assembler vbsl\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+ } } */ +/* { dg-final { scan-assembler vbsl|vbit|vbif\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+ } } */ diff --git a/gcc/testsuite/gcc.target/arm/neon-vcond-unordered.c b/gcc/testsuite/gcc.target/arm/neon-vcond-unordered.c index c3e448d..3bb67d3 100644 --- a/gcc/testsuite/gcc.target/arm/neon-vcond-unordered.c +++ b/gcc/testsuite/gcc.target/arm/neon-vcond-unordered.c @@ -16,4 +16,4 @@ void foo (int ilast,float* w, float* w2) /* { dg-final { scan-assembler vcgt\\.f32\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+ } } */ /* { dg-final { scan-assembler vcge\\.f32\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+ } } */ /* { dg-final { scan-assembler vorr\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+ } } */ -/* { dg-final { scan-assembler vbsl\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+ } } */ +/* { dg-final { scan-assembler vbsl|vbit|vbif\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+,\[\\t \]*q\[0-9\]+ } } */
Re: [wwwdocs] Add link to jit mailing list to lists.html
On Wed, 23 Oct 2013, David Malcolm wrote: OK to commit to CVS? Yes, thank you, David. (fwiw the pre-existing page content doesn't validate as per http://gcc.gnu.org/contribute.html#webchanges, due to issues with the include searchbox.ihtml ) I was going to say Go ahead and commit, and if my automated checker hits you, ignore it and I'll have a look, but the actual page on the server at http://gcc.gnu.org/lists.html actually validates since there the include... is resolved. You can invoke the validator manually be hitting the full stop after the date in the last line of the page. :-) Gerald
RE: [PATCH, PR 57748] Check for out of bounds access
On, Wed, 23 Oct 2013 17:36:41Richard Biener wrote: if bitregion_start/end are used after the adjust_address call then they have to be adjusted (or bitpos left in place). In fact why we apply byte-parts of bitpos here only if offset != 0 is weird. OTOH this code is _very_ old... what happens if you remove the whole case? Richard. If I remove that code completely... Than changes nothing ARM at -O2. and on x86 at -O2 this changes: .cfi_startproc movl k, %eax leal (%eax,%eax,2), %eax - leal ss+8(,%eax,4), %eax - movl $1, 16(%eax) + leal ss+16(,%eax,4), %eax + movl $1, 8(%eax) movl k, %eax leal (%eax,%eax,2), %eax - leal ss+8(,%eax,4), %eax - movl 16(%eax), %eax + leal ss+16(,%eax,4), %eax + movl 8(%eax), %eax cmpl $1, %eax jne .L5 xorl %eax, %eax that does not really make any difference. So removing that code looks like a nice alternative. What do you think? Bernd. patch-pr57748-3.diff Description: Binary data
Go patch committed: Use backend interface for some runtime calls
This patch from Chris Manghane changes the Go frontend to use the backend interface for a couple of runtime calls. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 46c1804a8e17 go/expressions.cc --- a/go/expressions.cc Fri Oct 18 06:25:23 2013 -0700 +++ b/go/expressions.cc Wed Oct 23 16:46:57 2013 -0700 @@ -3351,14 +3351,10 @@ return se-get_tree(context); } - static tree int_to_string_fndecl; - ret = Gogo::call_builtin(int_to_string_fndecl, - this-location(), - __go_int_to_string, - 1, - type_tree, - int_type_tree, - expr_tree); + Call_expression* i2s_expr = + Runtime::make_call(Runtime::INT_TO_STRING, this-location(), 1, + this-expr_); + ret = i2s_expr-get_tree(context); } else if (type-is_string_type() expr_type-is_slice_type()) { @@ -3408,29 +3404,18 @@ { Type* e = type-array_type()-element_type()-forwarded(); go_assert(e-integer_type() != NULL); + + Call_expression* s2a_expr; if (e-integer_type()-is_byte()) - { - tree string_to_byte_array_fndecl = NULL_TREE; - ret = Gogo::call_builtin(string_to_byte_array_fndecl, - this-location(), - __go_string_to_byte_array, - 1, - type_tree, - TREE_TYPE(expr_tree), - expr_tree); - } +s2a_expr = Runtime::make_call(Runtime::STRING_TO_BYTE_ARRAY, + this-location(), 1, this-expr_); else { go_assert(e-integer_type()-is_rune()); - tree string_to_int_array_fndecl = NULL_TREE; - ret = Gogo::call_builtin(string_to_int_array_fndecl, - this-location(), - __go_string_to_int_array, - 1, - type_tree, - TREE_TYPE(expr_tree), - expr_tree); - } + s2a_expr = Runtime::make_call(Runtime::STRING_TO_INT_ARRAY, +this-location(), 1, this-expr_); + } + ret = s2a_expr-get_tree(context); } else if ((type-is_unsafe_pointer_type() expr_type-points_to() != NULL)
Re: [c++-concepts] small tidbits to get it to build
On 10/23/2013 08:36 AM, Andrew Sutton wrote: Hi Ed, It looks like we did reserve assume as a keyword, but it's not being used... By any chance, did you configure without --disable-bootstrap? I think it would be a better solution to remove the unused keywords -- there were a couple of others that we grabbed for some other concepts-related work, but which aren't included in Concepts Lite. I'll apply the typeck fix. Andrew Sutton On Tue, Oct 22, 2013 at 10:02 PM, Ed Smith-Rowland 3dw...@verizon.net wrote: I had to get past two small bugs to get c++-concepts to build. Take a good look because I'm not sure if they're right. The solutions should be harmless though. Ed I did this: $ ../gcc_concepts/configure --prefix=/home/ed/bin_concepts --enable-languages=c,c++,lto This is pretty base bones - no special treatment configure and the branch worked pretty well. Ed
[RFC] PR 58542: const_int vs lost modes
In this pr, we have a -1 in type __int128. Since this value can be represented in a HOST_WIDE_INT, we expand this to a const_int. The expansion from tree to rtl happens in expand_builtin_atomic_store. And as with most of our builtins, we then pass off the rtl to another routine for expansion. When we fill in the blanks of the expand_operand in expand_atomic_compare_and_swap, we've forgotten that the const_int is of TImode. Then convert_modes attempts to zero-extend what it assumes is a narrower mode and we get corrupt data. For a bit I thought that the bug was in convert_modes, but honestly any change I make there merely moves the bug around. The biggest problem is that we've lost the mode. Given that we lose the mode through any of 10 stack frames, I believe it to be implausible to adjust all of the call frames in optabs.c. The real bug is that const_int doesn't carry a mode. The most isolated patch I can come up with, especially since we ought to fix this in 4.8 branch as well, is to only allow expansion of wide int modes to const_int when they're positive. Thoughts? r~ PR rtl/58542 * emit-rtl.c (immed_double_const): Use const_double for negative numbers of very wide modes. diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index b0fc846..d055f56 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -545,7 +545,10 @@ immed_double_const (HOST_WIDE_INT i0, HOST_WIDE_INT i1, enum machine_mode mode) } /* If this integer fits in one word, return a CONST_INT. */ - if ((i1 == 0 i0 = 0) || (i1 == ~0 i0 0)) + /* ??? On occasion we lose track of the original mode, and a CONST_INT gets + interpreted incorrectly for modes larger than HOST_BITS_PER_WIDE_INT. + If we only use CONST_INT for positive values, we work around that. */ + if (i1 == 0 i0 = 0) return GEN_INT (i0); /* We use VOIDmode for integers. */
Re: AIX: Dependency problem with xlC/g++ combination
On Wed, Oct 23, 2013 at 6:21 PM, Jan-Benedict Glaw jbg...@lug-owl.de wrote: Hi! When building on gcc111, I tried to use different compilers for my build robot. This machine has IBM's XL C compiler installed, but not the C++ compiler. So I used CC=xlC, g++ was auto-detected for CXX. This led to depmode=aix, but g++ was actually used later on for calculating dependencies, totally fucking up the build directory (since it's output was wrongly parsed using aix-style.) Why do you say that the C++ compiler is not installed on gcc111? That statement is incorrect. The AIX XL C++ compiler is xlC, which you mention. Updating the dependencies to C++ makes sense. - David
Re: [PATCH] Generate fused widening multiply-and-accumulate operations only when the widening multiply has single use
On 10/21/2013 03:01 PM, Yufeng Zhang wrote: This patch changes the widening_mul pass to fuse the widening multiply with accumulate only when the multiply has single use. The widening_mul pass currently does the conversion regardless of the number of the uses, which can cause poor code-gen in cases like the following: typedef int ArrT [10][10]; void foo (ArrT Arr, int Idx) { Arr[Idx][Idx] = 1; Arr[Idx + 10][Idx] = 2; } On AArch64, after widening_mul, the IR is like _2 = (long unsigned int) Idx_1(D); _3 = Idx_1(D) w* 40; _5 = Arr_4(D) + _3; *_5[Idx_1(D)] = 1; _8 = WIDEN_MULT_PLUS_EXPR Idx_1(D), 40, 400; _9 = Arr_4(D) + _8; *_9[Idx_1(D)] = 2; Where the arrows point, there are redundant widening multiplies. So they're redundant. Why does this imply poor code-gen? If a target has more than one FMA unit, then the target might be able to issue the computation for _3 and _8 in parallel. Even if the target only has one FMA unit, but the unit is pipelined, the computations could overlap. r~
[C11-atomic] Miscellaneous fixes 3/n
I've committed this patch to C11-atomic branch with further miscellaneous fixes. The testcases are expanded to cover everything I found in going through the language parts of C11 for atomics issues that can be covered through compile rather than execute tests. Corresponding compiler fixes are made for issues found (as well as some other cleanups). I think the next steps will be checking the C front-end code for particular constructs likely to need updating for atomics, in case any relevant places have been missed so far, adding execution tests, and adding support for the special floating-point handling in compound assignment. c: 2013-10-24 Joseph Myers jos...@codesourcery.com * c-decl.c (validate_proto_after_old_defn): Do not remove atomic qualifiers when compating types. (start_decl): Use c_type_promotes_to when promoting argument types. (check_bitfield_type_and_width): Don't check for atomic qualifiers here. (grokdeclarator): Check for atomic qualifiers on bit-fields here. *store_parm_decls_oldstyle): Do not remove atomic qualifiers when comparing types. Use c_type_promotes_to when promoting argument types. (finish_function) Use c_type_promotes_to when promoting argument types. * c-parser.c (c_parser_typeof_specifier): Use TYPE_ATOMIC instead of testing TYPE_QUAL_ATOMIC in TYPE_QUALS. * c-typeck.c (c_type_promotes_to): Promote atomic types to corresponding atomic types. (type_lists_compatible_p, find_anonymous_field_with_type) (convert_to_anonymous_field, convert_for_assignment): Do not remove atomic qualifiers when comparing types. testsuite: 2013-10-24 Joseph Myers jos...@codesourcery.com * gcc.dg/c11-atomic-1.c, gcc.dg/c11-atomic-3.c: Add more tests. Index: gcc/testsuite/gcc.dg/c11-atomic-1.c === --- gcc/testsuite/gcc.dg/c11-atomic-1.c (revision 203840) +++ gcc/testsuite/gcc.dg/c11-atomic-1.c (working copy) @@ -179,3 +179,79 @@ func7 (void) expressions with other pointer types. */ (void) (r ? xaip1 : (r ? xaip1 : xvp1)); } + +/* Pointer += and -= integer is valid. */ +void +func8 (void) +{ + b += 1; + b -= 2ULL; + ap += 3; +} + +/* Various other cases of simple assignment are valid (some already + tested above). */ +void +func9 (void) +{ + ap = 0; + ap = (void *) 0; + xvp1 = atp1; + atp1 = xvp1; +} + +/* Test compatibility of function types in cases where _Atomic matches + (see c11-atomic-3.c for corresponding cases where it doesn't + match). */ +void fc0a (int const); +void fc0a (int); +void fc0b (int _Atomic); +void fc0b (int _Atomic); +void fc1a (int); +void +fc1a (x) + volatile int x; +{ +} +void fc1b (_Atomic int); +void +fc1b (x) + volatile _Atomic int x; +{ +} +void +fc2a (x) + const int x; +{ +} +void fc2a (int); /* { dg-warning follows non-prototype } */ +void +fc2b (x) + _Atomic int x; +{ +} +void fc2b (_Atomic int); /* { dg-warning follows non-prototype } */ +void fc3a (int); +void +fc3a (x) + volatile short x; +{ +} +void fc3b (_Atomic int); +void +fc3b (x) + _Atomic short x; +{ +} +void +fc4a (x) + const short x; +{ +} +void fc4a (int); /* { dg-warning follows non-prototype } */ +void +fc4b (x) + _Atomic short x; +{ +} +void fc4b (_Atomic int); /* { dg-warning follows non-prototype } */ Index: gcc/testsuite/gcc.dg/c11-atomic-3.c === --- gcc/testsuite/gcc.dg/c11-atomic-3.c (revision 203840) +++ gcc/testsuite/gcc.dg/c11-atomic-3.c (working copy) @@ -64,3 +64,94 @@ func2 (void) (void) (r ? pai : pav); /* { dg-error pointer type mismatch } */ (void) (r ? pav : pai); /* { dg-error pointer type mismatch } */ } + +/* Likewise for pointer assignment. */ +void +func3 (void) +{ + pai = pi; /* { dg-error incompatible pointer type } */ + pi = pai; /* { dg-error incompatible pointer type } */ + pav = pai; /* { dg-error incompatible pointer type } */ + pai = pav; /* { dg-error incompatible pointer type } */ +} + +/* Cases that are invalid for normal assignments are just as invalid + (and should not ICE) when the LHS is atomic. */ +void +func4 (void) +{ + as = acf; /* { dg-error incompatible types } */ + apv = as; /* { dg-error incompatible types } */ + as += 1; /* { dg-error invalid operands } */ + apv -= 1; /* { dg-error pointer of type } */ + apv *= 1; /* { dg-error invalid operands } */ + apv /= 1; /* { dg-error invalid operands } */ + apv %= 1; /* { dg-error invalid operands } */ + apv = 1; /* { dg-error invalid operands } */ + apv = 1; /* { dg-error invalid operands } */ + apv = 1; /* { dg-error invalid operands } */ + apv ^= 1; /* { dg-error invalid operands } */ + apv |= 1; /* { dg-error invalid operands } */ +} + +/* We don't allow atomic bit-fields in GCC (implementation-defined + whether they are permitted).
Re: [wwwdocs] Add link to jit mailing list to lists.html
On Thu, 2013-10-24 at 01:46 +0200, Gerald Pfeifer wrote: On Wed, 23 Oct 2013, David Malcolm wrote: OK to commit to CVS? Yes, thank you, David. Thanks; committed. (fwiw the pre-existing page content doesn't validate as per http://gcc.gnu.org/contribute.html#webchanges, due to issues with the include searchbox.ihtml ) I was going to say Go ahead and commit, and if my automated checker hits you, ignore it and I'll have a look, but the actual page on the server at http://gcc.gnu.org/lists.html actually validates since there the include... is resolved. You can invoke the validator manually be hitting the full stop after the date in the last line of the page. :-) Aha! Thanks Dave
Re: [PATCH v2 2/4] Parse base classes for GTY-marked types
On Tue, 2013-10-15 at 12:45 -0600, Jeff Law wrote: On 09/24/13 11:49, David Malcolm wrote: Extend gengtype (and gtype.state reading/writing) so that it is able to parse base classes in simple cases, and only attempt to do it for GTY-marked types. * gengtype-parse.c (require_without_advance): New. (type): For GTY-marked types that are not GTY((user)), parse any base classes, requiring them to be single-inheritance, and not be templates. For non-GTY-marked types and GTY((user)), continue to skip over any C++ inheritance specification. * gengtype-state.c (state_writer::write_state_struct_type): Write base class of type (if any). (read_state_struct_type): Read base class of type (if any). * gengtype.c (new_structure): Add a base_class parameter. (create_optional_field_): Update for new parameter to new_structure. (adjust_field_rtx_def): Likewise. (adjust_field_tree_exp): Likewise. * gengtype.h (struct type): Add base_class field to the s union field. (new_structure): Add base parameter. This is OK. Sorry about belated response. I've now committed this to trunk as r204003, having doublechecked the bootstrap and testsuites. Thanks Dave