Re: [ARM][PATCH] Vectorizer generates unaligned access when -mno-unaligned-access is enabled
> we prohibited unaligned loads Sorry, meant to say "vectorization of unaligned loads". -Y
Re: [ARM][PATCH] Vectorizer generates unaligned access when -mno-unaligned-access is enabled
> As can be seen here: > http://cbuild.validation.linaro.org/build/cross-validation/gcc/207533/report-build-info.html > this has caused some regressions on armv5t targets. IMHO this is expected: with this patch we prohibited unaligned loads on all platforms with -mno-unaligned-access. Perhaps we should set vect_no_align for armv5t? -Y
PR 59918 (overacitve ipa-devirt sanity check)
Hi, this testcase triggers overactive check in record_target_from_binfo. It is confused by fact that we may have hard time not only looking for a vtable of virtual base but also of base of this virtual base that no longer has virtual flag on it. Regtsted ad comitted as obvious. Honza Index: testsuite/ChangeLog === --- testsuite/ChangeLog (revision 207591) +++ testsuite/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2014-02-06 Jan Hubicka + + PR ipa/59918 + * g++.dg/torture/pr59918.C: New testcase. + 2014-02-06 Jakub Jelinek PR target/59575 Index: testsuite/g++.dg/torture/pr59918.C === --- testsuite/g++.dg/torture/pr59918.C (revision 0) +++ testsuite/g++.dg/torture/pr59918.C (revision 0) @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +struct E { + ~E(); + virtual void f() const; +}; +struct B : E {}; +struct G : virtual B {}; +struct A { + virtual ~A(); +}; +struct J : E { + void f() const { +E *p = 0; +p->f(); + } +}; +J h; +struct I : A, G, virtual B {}; Index: ChangeLog === --- ChangeLog (revision 207591) +++ ChangeLog (working copy) @@ -1,5 +1,10 @@ 2014-02-06 Jan Hubicka + PR ipa/59918 + * ipa-devirt.c (record_target_from_binfo): Remove overactive sanity check. + +2014-02-06 Jan Hubicka + PR ipa/59469 * lto-cgraph.c (lto_output_node): Use symtab_get_symbol_partitioning_class. Index: ipa-devirt.c === --- ipa-devirt.c(revision 207588) +++ ipa-devirt.c(working copy) @@ -689,10 +689,7 @@ record_target_from_binfo (vec
Re: [PATCH] Fix PR 58960
On 30.01.2014 9:42, Andrey Belevantsev wrote: Hello, Ping. As detailed in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58960#c6, we fail to use the DF liveness info in the register pressure sensitive scheduling for the new blocks as we do not properly compute it in this case. The patch fixes this by avoiding to use the sched-pressure for the new regions, as currently these are only ia64 recovery blocks and supposed to be cold. In the case we'd get other cases of the new blocks, this may be reconsidered. The other options of computing the DF info sketched at the above link do not seem plausible for this stage. Bootstrapped and tested on ia64, also tested by Andreas Schwab on ia64 (see PR log). OK for trunk? Andrey 2013-01-30 Andrey Belevantsev PR rtl-optimization/58960 * haifa-sched.c (alloc_global_sched_pressure_data): New, factored out from ... (sched_init) ... here. (free_global_sched_pressure_data): New, factored out from ... (sched_finish): ... here. * sched-int.h (free_global_sched_pressure_data): Declare. * sched-rgn.c (nr_regions_initial): New static global. (haifa_find_rgns): Initialize it. (schedule_region): Disable sched-pressure for the newly generated regions.
Re: [ARM][PATCH] Vectorizer generates unaligned access when -mno-unaligned-access is enabled
On 07/02/14 03:54, Christophe Lyon wrote: > On 6 February 2014 10:49, Yury Gribov wrote: >> Kugan wrote: Ok if no regressions. >>> >>> Tested it on qemu for arm-none-linux-gnueabi and there is no new >>> regressions. I am sorry I didn't mention it when I posted the patch. >> >> Commited in r207533. Thanks! >> > > Hi, > > As can be seen here: > http://cbuild.validation.linaro.org/build/cross-validation/gcc/207533/report-build-info.html > this has caused some regressions on armv5t targets. Thanks for spotting this. armv5t disables unaligned access by default. I am looking into it. > > Additionally, are you sure testing this kind of thing on QEMU > sufficient? Does it really handle unaligned accesses as bus errors? > I did test this patch on a Chromebook where I reproduced the bus error with an older version of trunk and verified it was fixed. I then tested the patch for armv7-a with the latest trunk on qemu. I am not sure if qemu models the bus errors. Thanks, Kugan > Christophe. >
PR 59469
Hi, this patch fixes undefined symbol seen while compiling LLVM with LTO. The bug here is that lto-cgraph and lto-partition both contain knowledge about what symbols gets partitioned and what gets duplicated to each partition that uses them. This has got out of sync causing lto-cgraph to not set in_other_partition for symbols it believed to be duplicated while they was really partitioned. This patch unifies the logic. Bootstrapped/regtsted x86_64-linux, comitted. PR ipa/59469 * lto-cgraph.c (lto_output_node): Use symtab_get_symbol_partitioning_class. (lto_output_varpool_node): likewise. (symtab_get_symbol_partitioning_class): Move here from lto/lto-partition.c * cgraph.h (symbol_partitioning_class): Likewise. (symtab_get_symbol_partitioning_class): Declare. * lto-partition.c (symbol_class): Move to cgraph.h (get_symbol_class): Move to symtab.c (add_references_to_partition, add_symbol_to_partition_1, lto_max_map, lto_1_to_1_map, lto_balanced_map, lto_promote_cross_file_statics): Update. Index: lto-cgraph.c === --- lto-cgraph.c(revision 207588) +++ lto-cgraph.c(working copy) @@ -417,7 +417,8 @@ lto_output_node (struct lto_simple_outpu Cherry-picked nodes: These are nodes we pulled from other translation units into SET during IPA-inlining. We make them as local static nodes to prevent clashes with other local statics. */ - if (boundary_p && node->analyzed && !DECL_EXTERNAL (node->decl)) + if (boundary_p && node->analyzed + && symtab_get_symbol_partitioning_class (node) == SYMBOL_PARTITION) { /* Inline clones can not be part of boundary. gcc_assert (!node->global.inlined_to); @@ -501,8 +502,7 @@ lto_output_node (struct lto_simple_outpu bp_pack_value (&bp, node->unique_name, 1); bp_pack_value (&bp, node->address_taken, 1); bp_pack_value (&bp, tag == LTO_symtab_analyzed_node -&& !DECL_EXTERNAL (node->decl) -&& !DECL_COMDAT (node->decl) +&& symtab_get_symbol_partitioning_class (node) == SYMBOL_PARTITION && (reachable_from_other_partition_p (node, encoder) || referenced_from_other_partition_p (&node->ref_list, encoder)), 1); @@ -569,9 +569,7 @@ lto_output_varpool_node (struct lto_simp /* Constant pool initializers can be de-unified into individual ltrans units. FIXME: Alternatively at -Os we may want to avoid generating for them the local labels and share them across LTRANS partitions. */ - if (DECL_IN_CONSTANT_POOL (node->decl) - && !DECL_EXTERNAL (node->decl) - && !DECL_COMDAT (node->decl)) + if (symtab_get_symbol_partitioning_class (node) != SYMBOL_PARTITION) { bp_pack_value (&bp, 0, 1); /* used_from_other_parition. */ bp_pack_value (&bp, 0, 1); /* in_other_partition. */ Index: symtab.c === --- symtab.c(revision 207588) +++ symtab.c(working copy) @@ -1267,4 +1267,55 @@ symtab_semantically_equivalent_p (symtab bb = b; return bb == ba; } + +/* Classify symbol NODE for partitioning. */ + +enum symbol_partitioning_class +symtab_get_symbol_partitioning_class (symtab_node *node) +{ + /* Inline clones are always duplicated. + This include external delcarations. */ + cgraph_node *cnode = dyn_cast (node); + + if (DECL_ABSTRACT (node->decl)) +return SYMBOL_EXTERNAL; + + if (cnode && cnode->global.inlined_to) +return SYMBOL_DUPLICATE; + + /* Weakref aliases are always duplicated. */ + if (node->weakref) +return SYMBOL_DUPLICATE; + + /* External declarations are external. */ + if (DECL_EXTERNAL (node->decl)) +return SYMBOL_EXTERNAL; + + if (varpool_node *vnode = dyn_cast (node)) +{ + /* Constant pool references use local symbol names that can not + be promoted global. We should never put into a constant pool + objects that can not be duplicated across partitions. */ + if (DECL_IN_CONSTANT_POOL (node->decl)) + return SYMBOL_DUPLICATE; + gcc_checking_assert (vnode->definition); +} + /* Functions that are cloned may stay in callgraph even if they are unused. + Handle them as external; compute_ltrans_boundary take care to make + proper things to happen (i.e. to make them appear in the boundary but + with body streamed, so clone can me materialized). */ + else if (!cgraph (node)->definition) +return SYMBOL_EXTERNAL; + + /* Linker discardable symbols are duplicated to every use unless they are + keyed. + Keyed symbols or those. */ + if (DECL_ONE_ONLY (node->decl) + && !node->force_output + && !node->forced_by_abi + && !symtab_used_from_object_file_p (node)) +return SYMBOL_DUPLIC
[RFA][rtl-optimization/52714] Do not allow combine to create invalid RTL
As mentioned in the PR, we call try_combine with: i1 = (insn 14 13 16 2 (set (reg/v/f:SI 34 [ stack ]) (reg/f:SI 15 %sp)) pr52714.c:9 38 {*movsi_m68k2} (nil)) i2 = (insn 16 14 17 2 (set (cc0) (compare (reg/v/f:SI 34 [ stack ]) (const_int 0 [0]))) pr52714.c:10 4 {*tstsi_internal_68020_cf} (nil)) i3 = (jump_insn 17 16 18 2 (set (pc) (if_then_else (eq (cc0) (const_int 0 [0])) (label_ref 40) (pc))) pr52714.c:10 390 {beq} (int_list:REG_BR_PROB 1014 (nil)) -> 40) This is then combined into: newpat = (parallel [ (set (pc) (pc)) (set (reg/v/f:SI 34 [ stack ]) (reg/f:SI 15 %sp)) ]) This isn't a recognized insn, and it needs to be split. Since it's just a parallel of independent sets, combine tries to split it into a pair of assignments which look like: (insn 16 14 17 2 (set (pc) (pc)) pr52714.c:10 2147483647 {NOOP_MOVE} (nil)) (jump_insn 17 16 18 2 (set (reg/v/f:SI 34 [ stack ]) (reg/f:SI 15 %sp)) pr52714.c:10 38 {*movsi_m68k2} (int_list:REG_BR_PROB 1014 (nil)) -> 40) Note how the second is a JUMP_INSN, but it doesn't modify the PC. Opps. ISTM the code in combine which tries to rip apart a PARALLEL like that needs to ensure that I2 and I3 are both INSNs. The astute reader would notice that this wasn't optimized at the tree level. That's an artifact of using -O1 instead of -O2. At -O2 VRP would come in and zap that test well before it ever expanded into RTL. Verified all 3 tests in the PR are fixed (two full, one reduced). Reduced test to be added to the test suite. Bootstrap and regression test in progress on x86_64-unknown-linux-gnu. OK if that passes without incident? diff --git a/gcc/ChangeLog b/gcc/ChangeLog index e489b62..7983139 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,10 @@ +2014-02-06 Jeff Law + + PR rtl-optimization/52714 + * combine.c (try_combine): When splitting an unrecognized PARALLEL + into two independent simple sets, make sure both I3 and I2 are + INSNs (as opposed to JUMP_INSNs or CALL_INSNs). + 2014-02-06 Kyrylo Tkachov * config/arm/aarch-cost-tables.h (cortexa57_extra_costs): New table. diff --git a/gcc/combine.c b/gcc/combine.c index fd4294b..12db3a3 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -3692,7 +3692,11 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx i0, int *new_direct_jump_p, && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)), XVECEXP (newpat, 0, 1)) && ! (contains_muldiv (SET_SRC (XVECEXP (newpat, 0, 0))) -&& contains_muldiv (SET_SRC (XVECEXP (newpat, 0, 1) +&& contains_muldiv (SET_SRC (XVECEXP (newpat, 0, 1 + /* Make sure both I2 and I3 are simple INSNs, otherwise we might + create a JUMP_INSN with a PATTERN that is a simple set. */ + && GET_CODE (i2) == INSN + && GET_CODE (i3) == INSN) { rtx set0 = XVECEXP (newpat, 0, 0); rtx set1 = XVECEXP (newpat, 0, 1); diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 2cac8d2..5d6e066 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2014-02-06 Jeff Law + + PR rtl-optimization/52714 + * gcc.c-torture/compile/pr52714.c: New test. + 2014-02-06 Jakub Jelinek PR target/59575 diff --git a/gcc/testsuite/gcc.c-torture/compile/pr52714.c b/gcc/testsuite/gcc.c-torture/compile/pr52714.c new file mode 100644 index 000..03d4990 --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/compile/pr52714.c @@ -0,0 +1,25 @@ + +int __re_compile_fastmap(unsigned char *p) +{ +unsigned char **stack; +unsigned size; +unsigned avail; + +stack = __builtin_alloca(5 * sizeof(unsigned char*)); +if (stack == 0) + return -2; +size = 5; +avail = 0; + +for (;;) { + switch (*p++) { + case 0: + if (avail == size) + return -2; + stack[avail++] = p; + } +} + +return 0; +} +
Re: [PATCH] New optimize(0) versioning fix (PR target/60026, take 2)
> On Thu, Feb 06, 2014 at 10:45:23AM +0100, Richard Biener wrote: > > On Thu, 6 Feb 2014, Jakub Jelinek wrote: > > > > > On Wed, Feb 05, 2014 at 08:42:27PM +0100, Jakub Jelinek wrote: > > > > So, where do we want to do that instead? E.g. should it be e.g. in > > > > tree_versionable_function_p directly and let the inliner (if it doesn't > > > > do > > > > already) also treat optimize(0) functions that aren't always_inline as > > > > noinline? > > > > > > So, another attempt to put the && opt_for_fn (fndecl, optimize) into > > > tree_versionable_function_p also failed, because e.g. for TM (but also for > > > SIMD clones) we need to clone -O0 functions. So, can I fix PR600{6,7}2 > > > without touching tree-inline.c and fix PR60026 again separately somehow > > > else > > > as follow-up? Bootstrapped/regtested on x86_64-linux and i686-linux. > > > > That works for me. > > Here it is. Bootstrapped/regtested on x86_64-linux and i686-linux, > ok for trunk? > > 2014-02-06 Jakub Jelinek > > PR ipa/60026 > * ipa-cp.c (determine_versionability): Fail at -O0 > or __attribute__((optimize (0))) functions. > * tree-sra.c (ipa_sra_preliminary_function_checks): Likewise. > > Revert: > 2014-02-04 Jakub Jelinek > > PR ipa/60026 > * tree-inline.c (copy_forbidden): Fail for > __attribute__((optimize (0))) functions. > > --- gcc/tree-inline.c.jj 2014-02-05 16:18:50.0 +0100 > +++ gcc/tree-inline.c 2014-02-06 17:12:03.008993548 +0100 > @@ -3315,18 +3315,6 @@ copy_forbidden (struct function *fun, tr > goto fail; >} > > - tree fs_opts; > - fs_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fun->decl); > - if (fs_opts) > -{ > - struct cl_optimization *os = TREE_OPTIMIZATION (fs_opts); > - if (!os->x_optimize) > - { > - reason = G_("function %q+F compiled without optimizations"); > - goto fail; > - } > -} > - > fail: >fun->cannot_be_copied_reason = reason; >fun->cannot_be_copied_set = true; > --- gcc/tree-sra.c.jj 2014-01-03 11:40:57.0 +0100 > +++ gcc/tree-sra.c2014-02-06 17:15:43.348804011 +0100 > @@ -4900,6 +4900,13 @@ ipa_sra_preliminary_function_checks (str >return false; > } > > + if (!opt_for_fn (node->decl, optimize)) > +{ > + if (dump_file) > + fprintf (dump_file, "Function not optimized.\n"); > + return false; > +} Don't we want to check opt_for_fn (node->decl, cp) instead and arrange -fipa-cp to be false when !optimize? Otherwise it is OK. I would like to move IPA passes to check the flags on function basis instead of withing global gates - that seems only way I can imagine IPA passes to work at LTO with presnece of multiple command line options in longer run. Thanks, Honza > + >if (DECL_VIRTUAL_P (current_function_decl)) > { >if (dump_file) > --- gcc/ipa-cp.c.jj 2014-02-05 10:38:01.0 +0100 > +++ gcc/ipa-cp.c 2014-02-06 17:19:13.405671970 +0100 > @@ -430,6 +430,8 @@ determine_versionability (struct cgraph_ > reason = "not a tree_versionable_function"; >else if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE) > reason = "insufficient body availability"; > + else if (!opt_for_fn (node->decl, optimize)) > +reason = "non-optimized function"; >else if (lookup_attribute ("omp declare simd", DECL_ATTRIBUTES > (node->decl))) > { >/* Ideally we should clone the SIMD clones themselves and create > > > Jakub
Re: Allow passing arrays in registers on AArch64
Ramana Radhakrishnan writes: > On Tue, Feb 4, 2014 at 2:12 AM, Michael Hudson-Doyle > wrote: >> Ping? I'm attaching a marginally cleaner version of the test. I've had >> a look at integrating this into aapcs64.exp but got defeated in the >> end. If go-torture-execute took a list of sources as c-torture-execute >> does, then I think adding something like this to aapcs64.exp would work: >> >> # Test passing arrays by value >> set src $srcdir/$subdir/test_array.go >> if {[runtest_file_p $runtests $src]} { >> go-torture-execute [list $src $srcdir/$subdir/abitest.S >> $srcdir/$subdir/_test_array_c.c] \ >> $additional_flags >> } >> >> but it doesn't. I also think that some stuff needs to be set up before >> go-torture-execute can be called that I don't really understand and I >> also also don't know how to avoid executing this test if gccgo hasn't >> been built. > > Putting in a shell script like that is a bad idea, To be clear: I was never proposing that this shell script be committed. It was just an unambiguous way of showing how to run my test. > the dejagnu infrastructure can take care of all the transport and > target bits for this. This will fail if someone were to try testing > this on qemu while the rest of the testsuite might work. > > However what happens if in aarch64.exp > > you do a > > load_lib go-dg.exp > > and essentially run the tests similar to testsuite/go.dg/dg.exp ? I can't see how to run a test case that consists of multiple source files like that. That doesn't mean it's not possible though... > That will take care of any cross-testing issues I hope with this using > dejagnu. > > If that fails I think we should just drop the test as the go testsuite > will catch this. Please. >> >> All that said, is there any chance of getting the original ABI fix >> committed? It would be nice to have it in 4.9. > > > I cannot approve or disapprove this patch but looking at the fix and > the ABI issue under question here, I agree that this should be fixed > for 4.9 and documented in the release notes. Your latest patch should > also take Yufeng's suggestion under consideration about merging the > condition in the if block. Oh right, I forgot that I hadn't sent a patch acting on that comment. Here is one. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 16c51a8..958c667 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -1187,14 +1187,10 @@ aarch64_pass_by_reference (cumulative_args_t pcum ATTRIBUTE_UNUSED, size = (mode == BLKmode && type) ? int_size_in_bytes (type) : (int) GET_MODE_SIZE (mode); - if (type) + /* Aggregates are passed by reference based on their size. */ + if (type && AGGREGATE_TYPE_P (type)) { - /* Arrays always passed by reference. */ - if (TREE_CODE (type) == ARRAY_TYPE) - return true; - /* Other aggregates based on their size. */ - if (AGGREGATE_TYPE_P (type)) - size = int_size_in_bytes (type); + size = int_size_in_bytes (type); } /* Variable sized arguments are always returned by reference. */ Cheers, mwh > > Cheers, > Ramana > >> >> Cheers, >> mwh >> >> Michael Hudson-Doyle writes: >> >>> Richard Earnshaw writes: >>> On 17/01/14 23:56, Michael Hudson-Doyle wrote: > Ian Lance Taylor writes: > >> On Fri, Jan 17, 2014 at 11:32 AM, Michael Hudson-Doyle >> wrote: >>> >>> On 18 Jan 2014 07:50, "Yufeng Zhang" wrote: Also can you please try to add some new test(s)? It may not be that straightforward to add non-C/C++ tests, but give it a try. >>> >>> Can you give some hints? Like at least where in the tree such a test >>> would >>> go? I don't know this code at all. >> >> There is already a test in libgo, of course. >> >> I think it would be pretty hard to write a test that doesn't something >> like what libgo does. The problem is that GCC is entirely consistent >> with and without your patch. You could add a Go test that passes an >> array in gcc/testsuite/go.go-torture/execute/ easily enough, but it >> would be quite hard to add a test that doesn't pass whether or not >> your patch is applied. > > I think it would have to be a code generation test, i.e. that compiling > something like > > func second(e [2]int64) int64 { > return e[1] > } > > does not access memory or something along those lines. I'll have a look > next week. > > Cheers, > mwh > Michael, Can you have a look at how the tests in gcc.target/aarch64/aapcs64 work; it ought to be possible to write a test (though not in C) to catch this. >>> >>> Yeah, I had found those tests. It would be much easier if I could write >>> this in C :-) >>> The tests work by calling a wrapper function written in assembler -- that wrapper stores out all the registers used for parameter passing
Re: [Patch, Fortran] PR 58470: [4.9 Regression] [OOP] ICE on invalid with FINAL procedure and type extension
Hi Mikael, thanks for your comments ... >> attached is a small patch which fixes an ICE-on-invalid regression >> with finalization. In the PR, Dominique objected to the patch, but I >> think it's the correct thing to do after all. The line that I'm >> removing was added in a patch authored by Tobias and myself. I suspect >> it was added to work around some other problem in the finalization >> implementation, and there is no evidence it's actually needed. >> >> The patch regtests cleanly on x86_64-unknown-linux-gnu. Ok for trunk? >> > Wait a bit; let's try to understand the problem. > > Normally I would say calling gfc_is_finalizable here in > resolve_fl_derived0 is harmless because gfc_resolve_finalizers has been > called before in resolve_fl_derived. > BUT: > resolve_fl_derived0 recurses on its parent type, while > gfc_resolve_finalizers doesn't; and in this case we end up recursing on > type "cfml" whose finalizers haven't been resolved yet. Yes, that's more or less what happens. And the real problem is that gfc_is_finalzable already generates the finalization wrapper (via gfc_find_derived_vtab -> generate_finalizaton_wrapper) before we have checked that the finalizer is actually valid (which is what gfc_resolve_finalizers does). Once we get into gfc_resolve_finalizers, it is fooled to believe that the finalizer has already been resolved and therefore skips the checks and produces no error message. > Now whether your patch is the right thing to do... I'm a bit skeptical > about removing the one use of gfc_is_finalizable in resolve.c. Well, all others occurrences of 'gfc_is_finalizable' are in trans*, so this is the only one that comes too early. Also the fact that its return value is not used here tells you that we are not actually interested if the type is finalizable at this point. The call is bogus and should be removed, I think. > On the > other hand it is regression free... Well, finalization is new in 4.9, and Dominique has argued that the test suite may not have sufficient coverage yet. The absence of regressions may not be enough to conclude that the patch is correct. But after all I think that the patch should not hurt. After giving it some second thoughts, the only alternative I could see is this: Index: gcc/fortran/resolve.c === --- gcc/fortran/resolve.c(revision 207485) +++ gcc/fortran/resolve.c(working copy) @@ -11224,13 +11224,6 @@ gfc_resolve_finalizers (gfc_symbol* derived) gfc_finalizer* i; int my_rank; - /* Skip this finalizer if we already resolved it. */ - if (list->proc_tree) -{ - prev_link = &(list->next); - continue; -} - /* Check this exists and is a SUBROUTINE. */ if (!list->proc_sym->attr.subroutine) { It also gets rid of the ICE, but I haven't regtested it yet. Does this look better to you than the original patch? (It might give duplicate error messages in some cases?) Cheers, Janus
[PATCH] fix for PR 59691
Hello Everyone, Attached, please find a patch that will fix the issue in PR 59691. The main issue was that the Cilk library (libcilkrts) was not checking if the target has SSE support before emitting SSE instruction. This patch should fix that. Here is the ChangeLog entries: libcilkrts/ChangeLog 2014-02-06 Balaji V. Iyer * runtime/config/x86/os-unix-sysdep.c (__builtin_cpu_supports): New function. (restore_x86_fp_state): Added a check if the cpu supports the instruction before emitting it. (sysdep_save_fp_ctrl_state): Likewise. Is this Ok to install? Thanks, Balaji V. Iyer. diff --git a/libcilkrts/runtime/config/x86/os-unix-sysdep.c b/libcilkrts/runtime/config/x86/os-unix-sysdep.c index 881bc3f..b505ddf 100644 --- a/libcilkrts/runtime/config/x86/os-unix-sysdep.c +++ b/libcilkrts/runtime/config/x86/os-unix-sysdep.c @@ -2,11 +2,9 @@ * * * - * @copyright - * Copyright (C) 2009-2013, Intel Corporation + * Copyright (C) 2009-2014, Intel Corporation * All rights reserved. * - * @copyright * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: @@ -21,7 +19,6 @@ * contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * - * @copyright * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR @@ -91,6 +88,20 @@ COMMON_SYSDEP int __cilkrts_xchg(volatile int *ptr, int x) return x; } +/* + * The Intel compiler distribution assumes newer CPUs and doesn't yet support + * the __builtin_cpu_supports intrinsic added by GCC 4.8, so just return 1 in + * that environment. + * + * This declaration should generate an error when the Intel compiler adds + * supprt for the intrinsic. + */ +#ifdef __INTEL_COMPILER +static inline int __builtin_cpu_supports(const char *feature) +{ +return 1; +} +#endif /* * Restore the floating point state that is stored in a stack frame at each @@ -100,11 +111,16 @@ COMMON_SYSDEP int __cilkrts_xchg(volatile int *ptr, int x) */ void restore_x86_fp_state (__cilkrts_stack_frame *sf) { #ifdef RESTORE_X86_FP_STATE -__asm__ ( "ldmxcsr %0\n\t" - "fnclex\n\t" - "fldcw %1" - : - : "m" (sf->mxcsr), "m" (sf->fpcsr)); +if (__builtin_cpu_supports("sse")) +{ +__asm__ ("ldmxcsr %0" + : + : "m" (sf->mxcsr)); +} +__asm__ ("fnclex\n\t" + "fldcw %0" + : + : "m" (sf->fpcsr)); #endif } @@ -115,7 +131,10 @@ void sysdep_save_fp_ctrl_state(__cilkrts_stack_frame *sf) #ifdef RESTORE_X86_FP_STATE if (CILK_FRAME_VERSION_VALUE(sf->flags) >= 1) { -__asm__ ("stmxcsr %0" : "=m" (sf->mxcsr)); +if (__builtin_cpu_supports("sse")) +{ +__asm__ ("stmxcsr %0" : "=m" (sf->mxcsr)); +} __asm__ ("fnstsw %0" : "=m" (sf->fpcsr)); } #endif
Re: [google gcc-4_8][patch] Thunk section names
Patch attached. On Thu, Feb 6, 2014 at 2:18 PM, Sriraman Tallam wrote: > I sent the following patch for review for trunk commit here. Details here: > http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00328.html > > This is important for function layout for the following reason. > Without this patch, the thunk's section name is the same as the original > function's section name for which the thunk is created. This affects function > layout as it is not possible to figure out the thunk section from the name > alone. With this patch, the thunk's section name is suffixed with the mangled > name of the thunk and this solves the problem. > > Is this patch ok for google/gcc-4_8? > > Sri I sent the following patch for review for trunk commit here. Details here: http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00328.html This is important for function layout for the following reason. Without this patch, the thunk's section name is the same as the original function's section name for which the thunk is created. This affects function layout as it is not possible to figure out the thunk section from the name alone. With this patch, the thunk's section name is suffixed with the mangled name of the thunk and this solves the problem. Is this patch ok for google/gcc-4_8? Index: cp/method.c === --- cp/method.c (revision 207581) +++ cp/method.c (working copy) @@ -360,8 +360,10 @@ use_thunk (tree thunk_fndecl, bool emit_p) { resolve_unique_section (thunk_fndecl, 0, flag_function_sections); - /* Output the thunk into the same section as function. */ - DECL_SECTION_NAME (thunk_fndecl) = DECL_SECTION_NAME (function); + /* Output the thunk into the same section as function if function reordering +is not switched on. */ + if (!flag_reorder_functions) + DECL_SECTION_NAME (thunk_fndecl) = DECL_SECTION_NAME (function); } } Index: testsuite/g++.dg/thunk_section_name.C === --- testsuite/g++.dg/thunk_section_name.C (revision 0) +++ testsuite/g++.dg/thunk_section_name.C (revision 0) @@ -0,0 +1,30 @@ +/* { dg-require-named-sections "" } */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-reorder-blocks-and-partition -ffunction-sections" } */ + +class base_class_1 +{ +public: + virtual void vfn () {} +}; + +class base_class_2 +{ +public: + virtual void vfn () {} +}; + +class need_thunk_class : public base_class_1, public base_class_2 +{ +public: + virtual void vfn () {} +}; + +int main (int argc, char *argv[]) +{ + base_class_1 *c = new need_thunk_class (); + c->vfn(); + return 0; +} + +/* { dg-final { scan-assembler "\.text\._ZThn8_N16need_thunk_class3vfnEv" } } */
[google gcc-4_8][patch] Thunk section names
I sent the following patch for review for trunk commit here. Details here: http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00328.html This is important for function layout for the following reason. Without this patch, the thunk's section name is the same as the original function's section name for which the thunk is created. This affects function layout as it is not possible to figure out the thunk section from the name alone. With this patch, the thunk's section name is suffixed with the mangled name of the thunk and this solves the problem. Is this patch ok for google/gcc-4_8? Sri
[PATCH] New optimize(0) versioning fix (PR target/60026, take 2)
On Thu, Feb 06, 2014 at 10:45:23AM +0100, Richard Biener wrote: > On Thu, 6 Feb 2014, Jakub Jelinek wrote: > > > On Wed, Feb 05, 2014 at 08:42:27PM +0100, Jakub Jelinek wrote: > > > So, where do we want to do that instead? E.g. should it be e.g. in > > > tree_versionable_function_p directly and let the inliner (if it doesn't do > > > already) also treat optimize(0) functions that aren't always_inline as > > > noinline? > > > > So, another attempt to put the && opt_for_fn (fndecl, optimize) into > > tree_versionable_function_p also failed, because e.g. for TM (but also for > > SIMD clones) we need to clone -O0 functions. So, can I fix PR600{6,7}2 > > without touching tree-inline.c and fix PR60026 again separately somehow else > > as follow-up? Bootstrapped/regtested on x86_64-linux and i686-linux. > > That works for me. Here it is. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-02-06 Jakub Jelinek PR ipa/60026 * ipa-cp.c (determine_versionability): Fail at -O0 or __attribute__((optimize (0))) functions. * tree-sra.c (ipa_sra_preliminary_function_checks): Likewise. Revert: 2014-02-04 Jakub Jelinek PR ipa/60026 * tree-inline.c (copy_forbidden): Fail for __attribute__((optimize (0))) functions. --- gcc/tree-inline.c.jj2014-02-05 16:18:50.0 +0100 +++ gcc/tree-inline.c 2014-02-06 17:12:03.008993548 +0100 @@ -3315,18 +3315,6 @@ copy_forbidden (struct function *fun, tr goto fail; } - tree fs_opts; - fs_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fun->decl); - if (fs_opts) -{ - struct cl_optimization *os = TREE_OPTIMIZATION (fs_opts); - if (!os->x_optimize) - { - reason = G_("function %q+F compiled without optimizations"); - goto fail; - } -} - fail: fun->cannot_be_copied_reason = reason; fun->cannot_be_copied_set = true; --- gcc/tree-sra.c.jj 2014-01-03 11:40:57.0 +0100 +++ gcc/tree-sra.c 2014-02-06 17:15:43.348804011 +0100 @@ -4900,6 +4900,13 @@ ipa_sra_preliminary_function_checks (str return false; } + if (!opt_for_fn (node->decl, optimize)) +{ + if (dump_file) + fprintf (dump_file, "Function not optimized.\n"); + return false; +} + if (DECL_VIRTUAL_P (current_function_decl)) { if (dump_file) --- gcc/ipa-cp.c.jj 2014-02-05 10:38:01.0 +0100 +++ gcc/ipa-cp.c2014-02-06 17:19:13.405671970 +0100 @@ -430,6 +430,8 @@ determine_versionability (struct cgraph_ reason = "not a tree_versionable_function"; else if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE) reason = "insufficient body availability"; + else if (!opt_for_fn (node->decl, optimize)) +reason = "non-optimized function"; else if (lookup_attribute ("omp declare simd", DECL_ATTRIBUTES (node->decl))) { /* Ideally we should clone the SIMD clones themselves and create Jakub
Re: [MIPS] Use soft-fp for libgcc floating-point routines
Richard Sandiford writes: > libgcc/ > * configure.ac (libgcc_cv_mips_hard_float): New. > * configure: Regenerate. > * config.host (mips*-*-*): Use t-hardfp-sfdf rather than > t-softfp-sfdf for hard-float targets. > * config/mips/t-mips (LIB2_SIDITI_CONV_FUNCS): Reinstate. > (softfp_float_modes, softfp_int_modes, softfp_extensions) > (softfp_truncations, softfp_exclude_libgcc2): New. > * config/t-hardfp: New file. > * config/t-hardfp-sfdf: Likewise. > * config/hardfp.c: Likewise. This is OK. Thanks. Ian
[PATCH] Fix up __builtin_setjmp_receiver handling (PR c++/60082, take 2)
On Thu, Feb 06, 2014 at 11:00:14AM +0100, Richard Biener wrote: > Ah, so __builtin_setjmp_receiver is like setjmp in this regard > and setjmp is LEAF (it's a stmt that doesn't direct control-flow > anywhere else). So __builtin_setjmp_receiver should be LEAF as well > (and so should __builtin_setjmp_setup). > > Doesn't sound dangerous at all to me ... This passed bootstrap/regtest and fixed the timeouts too. Ok for trunk? 2014-02-06 Jakub Jelinek PR c++/60082 * tree.c (build_common_builtin_nodes): Set ECF_LEAF for __builtin_setjmp_receiver. Revert 2014-02-05 Balaji V. Iyer * g++.dg/cilk-plus/CK/catch_exc.cc: Disable test for -O1. * c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise. --- gcc/tree.c.jj 2014-02-04 10:30:15.010121513 +0100 +++ gcc/tree.c 2014-02-06 17:03:49.034656060 +0100 @@ -9980,7 +9980,7 @@ build_common_builtin_nodes (void) ftype = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE); local_define_builtin ("__builtin_setjmp_receiver", ftype, BUILT_IN_SETJMP_RECEIVER, - "__builtin_setjmp_receiver", ECF_NOTHROW); + "__builtin_setjmp_receiver", ECF_NOTHROW | ECF_LEAF); ftype = build_function_type_list (ptr_type_node, NULL_TREE); local_define_builtin ("__builtin_stack_save", ftype, BUILT_IN_STACK_SAVE, --- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc (revision 207519) +++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc (revision 207518) @@ -1,7 +1,6 @@ /* { dg-options "-fcilkplus" } */ /* { dg-do run { target i?86-*-* x86_64-*-* arm*-*-* } } */ /* { dg-options "-fcilkplus -lcilkrts" { target { i?86-*-* x86_64-*-* arm*-*-* } } } */ -/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */ #include #include --- gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(revision 207519) +++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(revision 207518) @@ -1,7 +1,6 @@ /* { dg-do run { target { i?86-*-* x86_64-*-* } } } */ /* { dg-options "-fcilkplus" } */ /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */ -/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */ #include #define DEFAULT_VALUE 30 Jakub
minor help message fix
Hi the following patch removes the 'state' print for -ftree-tree-vectorize option which does not make sense anymore. Ok for trunk? thanks, David Index: ChangeLog === --- ChangeLog (revision 207581) +++ ChangeLog (working copy) @@ -1,3 +1,7 @@ +2014-02-06 Xinliang David Li + + * opts.c (print_filtered_help): Fix help message bug. + 2014-02-06 Kyrylo Tkachov * config/arm/aarch-cost-tables.h (cortexa57_extra_costs): New table. Index: opts.c === --- opts.c (revision 207581) +++ opts.c (working copy) @@ -1060,8 +1060,11 @@ print_filtered_help (unsigned int includ "%#x", * (int *) flag_var); } else - strcat (new_help, option_enabled (i, opts) - ? _("[enabled]") : _("[disabled]")); + { + if (i != OPT_ftree_vectorize) + strcat (new_help, option_enabled (i, opts) + ? _("[enabled]") : _("[disabled]")); + } } help = new_help;
[PATCH] Fix PR49008, a typoed statement
Below patch fixes a small copy-paste mistake, reported by Hui Wu. (Whether the fix is correct, I'm not fully sure, but the code as it is looks nonsensical.) 2014-02-06 Benno Schulenberg PR target/49008 * genautomata.c (add_presence_absence): Fix copy-paste error. Index: gcc/genautomata.c === --- gcc/genautomata.c (revision 207551) +++ gcc/genautomata.c (working copy) @@ -2348,7 +2348,7 @@ for (prev_el = (presence_p ? (final_p ? dst->unit_decl->final_presence_list - : dst->unit_decl->final_presence_list) + : dst->unit_decl->presence_list) : (final_p ? dst->unit_decl->final_absence_list : dst->unit_decl->absence_list));
[PATCH] Fix PR52289, a typoed word in an error message
[Oops, had a wrong bug number in the subject line.] Below patch fixes another miswording in an error message, reported by Roland Stigge. Please apply. 2014-02-06 Benno Schulenberg PR translation/52289 * fortran/resolve.c (resolve_ordinary_assign): Fix typoed word in an error message. Index: fortran/resolve.c === --- fortran/resolve.c(revision 207551) +++ fortran/resolve.c(working copy) @@ -9218,7 +9218,7 @@ /* F2008, Section 7.2.1.2. */ if (gfc_is_coindexed (lhs) && gfc_has_ultimate_allocatable (lhs)) { - gfc_error ("Coindexed variable must not be have an allocatable ultimate " + gfc_error ("Coindexed variable must not have an allocatable ultimate " "component in assignment at %L", &lhs->where); return false; }
[PATCH] Fix PR55289, a typoed word in an error message
Below patch fixes another miswording in an error message, reported by Roland Stigge. Please apply. 2014-02-06 Benno Schulenberg PR translation/52289 * fortran/resolve.c (resolve_ordinary_assign): Fix typoed word in an error message. Index: fortran/resolve.c === --- fortran/resolve.c(revision 207551) +++ fortran/resolve.c(working copy) @@ -9218,7 +9218,7 @@ /* F2008, Section 7.2.1.2. */ if (gfc_is_coindexed (lhs) && gfc_has_ultimate_allocatable (lhs)) { - gfc_error ("Coindexed variable must not be have an allocatable ultimate " + gfc_error ("Coindexed variable must not have an allocatable ultimate " "component in assignment at %L", &lhs->where); return false; }
[jit] Add some methods to the C++ wrapper API.
Committed to branch dmalcolm/jit: gcc/jit/ * libgccjit++.h: Include rather than , since the former gets us std::ostream, and the latter may introduce startup-time overhead for constructing std::cout et al. (gccjit::context::new_child_context): New. (gccjit::context::release): New. (gccjit::context::compile): New. (gccjit::context::set_int_option): New. (gccjit::context::set_bool_option): New. --- gcc/jit/ChangeLog.jit | 11 +++ gcc/jit/libgccjit++.h | 49 - 2 files changed, 59 insertions(+), 1 deletion(-) diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit index be96ff7..6950b13 100644 --- a/gcc/jit/ChangeLog.jit +++ b/gcc/jit/ChangeLog.jit @@ -1,3 +1,14 @@ +2014-02-06 David Malcolm + + * libgccjit++.h: Include rather than , since + the former gets us std::ostream, and the latter may introduce + startup-time overhead for constructing std::cout et al. + (gccjit::context::new_child_context): New. + (gccjit::context::release): New. + (gccjit::context::compile): New. + (gccjit::context::set_int_option): New. + (gccjit::context::set_bool_option): New. + 2014-02-03 David Malcolm * libgccjit.h (struct gcc_jit_object): New. diff --git a/gcc/jit/libgccjit++.h b/gcc/jit/libgccjit++.h index 79abcd1..b7f5d4b 100644 --- a/gcc/jit/libgccjit++.h +++ b/gcc/jit/libgccjit++.h @@ -5,7 +5,7 @@ #include "libgccjit.h" -#include +#include #include / @@ -31,6 +31,18 @@ namespace gccjit context (); context (gcc_jit_context *ctxt); +gccjit::context new_child_context (); + +void release (); + +gcc_jit_result *compile (); + +void set_int_option (enum gcc_jit_int_option opt, +int value); + +void set_bool_option (enum gcc_jit_bool_option opt, + int value); + location new_location (const char *filename, int line, @@ -295,6 +307,41 @@ inline context context::acquire () inline context::context () : m_inner_ctxt (NULL) {} inline context::context (gcc_jit_context *inner) : m_inner_ctxt (inner) {} +inline gccjit::context +context::new_child_context () +{ + return context (gcc_jit_context_new_child_context (m_inner_ctxt)); +} + +inline void +context::release () +{ + gcc_jit_context_release (m_inner_ctxt); + m_inner_ctxt = NULL; +} + +inline gcc_jit_result * +context::compile () +{ + return gcc_jit_context_compile (m_inner_ctxt); +} + +inline void +context::set_int_option (enum gcc_jit_int_option opt, +int value) +{ + gcc_jit_context_set_int_option (m_inner_ctxt, opt, value); + +} + +inline void +context::set_bool_option (enum gcc_jit_bool_option opt, + int value) +{ + gcc_jit_context_set_bool_option (m_inner_ctxt, opt, value); + +} + inline location context::new_location (const char *filename, int line, -- 1.7.11.7
[PATCH] Fix typo and miswordings in three error messages
Updating a bit the Dutch translations of GCC's messages, I noticed the following mistakes in three msgids: "only displayed one" ==> "displayed only once" "none class-method" ==> "non-class method" "incorect" ==> "incorrect" Below patch fixes those. I'm not entirely sure about the second fix, but the "none" doesn't make any sense. When found okay, please apply. 2014-02-05 Benno Schulenberg * gcov.c (find_source): Fix miswording in error message. * config/i386/i386.c (ix86_handle_cconv_attribute): Likewise. (ix86_expand_sse_comi_round): Fix typo in error message. Index: gcov.c === --- gcov.c (revision 207551) +++ gcov.c (working copy) @@ -1141,7 +1141,7 @@ if (!info_emitted) { fnotice (stderr, - "(the message is only displayed one per source file)\n"); + "(the message is displayed only once per source file)\n"); info_emitted = 1; } sources[idx].file_time = 0; Index: config/i386/i386.c === --- config/i386/i386.c (revision 207551) +++ config/i386/i386.c (working copy) @@ -5446,7 +5446,7 @@ else if (is_attribute_p ("thiscall", name)) { if (TREE_CODE (*node) != METHOD_TYPE && pedantic) - warning (OPT_Wattributes, "%qE attribute is used for none class-method", + warning (OPT_Wattributes, "%qE attribute is used for non-class method", name); if (lookup_attribute ("stdcall", TYPE_ATTRIBUTES (*node))) { @@ -34230,7 +34230,7 @@ } if (INTVAL (op2) < 0 || INTVAL (op2) >= 32) { - error ("incorect comparison mode"); + error ("incorrect comparison mode"); return const0_rtx; }
Re: [Patch, Fortran] PR 58470: [4.9 Regression] [OOP] ICE on invalid with FINAL procedure and type extension
Le 06/02/2014 11:43, Janus Weil a écrit : > Hi all, > > attached is a small patch which fixes an ICE-on-invalid regression > with finalization. In the PR, Dominique objected to the patch, but I > think it's the correct thing to do after all. The line that I'm > removing was added in a patch authored by Tobias and myself. I suspect > it was added to work around some other problem in the finalization > implementation, and there is no evidence it's actually needed. > > The patch regtests cleanly on x86_64-unknown-linux-gnu. Ok for trunk? > Wait a bit; let's try to understand the problem. Normally I would say calling gfc_is_finalizable here in resolve_fl_derived0 is harmless because gfc_resolve_finalizers has been called before in resolve_fl_derived. BUT: resolve_fl_derived0 recurses on its parent type, while gfc_resolve_finalizers doesn't; and in this case we end up recursing on type "cfml" whose finalizers haven't been resolved yet. Now whether your patch is the right thing to do... I'm a bit skeptical about removing the one use of gfc_is_finalizable in resolve.c . On the other hand it is regression free... I'll think about it and try to come back to you tomorrow or so. Mikael
Re: [MIPS] Use soft-fp for libgcc floating-point routines
"Joseph S. Myers" writes: > On Tue, 4 Feb 2014, Richard Sandiford wrote: >> So here I just provide simple C functions for each operation that has >> hard-float support. We can then restrict the softfp stuff to functions >> that really do need softfp support, meaning that softfp_exclude_libgcc2 := n >> should always be OK. As an added bonus it reduces the footprint of >> libgcc_s.so >> on hard-float targets. > > Simple C functions are indeed the best thing for cases where the functions > are only part of the ABI at all because of historical accident of it being > hard to build fp-bit only for soft-float multilibs. But they're also > useful for more than just MIPS - 32-bit powerpc hard float could use them > as well, for example (it seems likely there are other cases, but powerpc > is one I know about). So I think the implementation and a makefile > fragment for them should be outside config/mips, although targets may need > to configure exactly which functions to build from this set. (For > example, e500v1 would want simple C functions for SFmode but soft-fp for > DFmode, and for __unordsf2; e500v2 would want soft-fp for __unordsf2 and > __unorddf2 but otherwise simple C functions. I don't think the initial > patch needs to build out the full configurability, as long as the files > are in the right place, outside config/mips.) > > A patch putting the files in an architecture-independent place should > probably then be reviewed by Ian as libgcc maintainer. OK, done in the patch below. I went for the same style of configuration macros as softfp, but like you say we might need to tweak it when other targets start using it. Ian: the short(ish) version is that I switched MIPS over from fpbit.c to softfp in order to handle exceptions for TFmode with -mhard-float. This fixes some c11-atomic-exec-5.c failures. The setup we now want for hard-float is: (1) conversions between SF/DF and DI/TI are provided by libgcc2.c, which unlike softfp and fpbit.c can take advantage of the FPU (2) other SF and DF functions are provided by the new functions added below (3) all functions involving TF are provided by softfp. The longer version is that, for historical reasons, hard-float MIPS provides a full set of libgcc floating-point routines, even if there's hardware support for them. For example, __addsf3 is defined even though that's just ADD.S. It used to be that all the functions in (2) were provided by fpbit.c, but it's not currently possible to have softfp provide (2) and (3) but not (1). One fix would have been to tweak the softfp_exclude_libgcc2 handling so that softfp could provide only (2) and (3). But with that setup -- and also with the old setup -- conversions to and from DImode and TImode would use fast hardware support but conversions to and from SImode wouldn't. libgcc_s.so would also be bloated with large emulated routines that would never normally be called. So the idea was instead to define simple C functions for (2) that can exploit the hardware instructions. E.g. __addsf3 does just become an ADD.S and a return instruction. >> +#elif defined (OP_eq2) >> +int FUNC (TYPE x, TYPE y) { return x == y; } >> +#elif defined (OP_ne2) >> +int FUNC (TYPE x, TYPE y) { return x != y; } >> +#elif defined (OP_ge2) >> +int FUNC (TYPE x, TYPE y) { return x >= y; } >> +#elif defined (OP_gt2) >> +int FUNC (TYPE x, TYPE y) { return x > y; } >> +#elif defined (OP_le2) >> +int FUNC (TYPE x, TYPE y) { return x <= y; } >> +#elif defined (OP_lt2) >> +int FUNC (TYPE x, TYPE y) { return x < y; } > > That's not the semantics of the comparison functions. They return an > integer value that can be compared to 0 with the original operation to get > the final result, rather than returning the result of the comparison > directly. See libgcc.texi. Gah, forgot about that sorry. Tested on mips64-linux-gnu and mipsisa64-sde-elf. OK to install? Thanks, Richard libgcc/ * configure.ac (libgcc_cv_mips_hard_float): New. * configure: Regenerate. * config.host (mips*-*-*): Use t-hardfp-sfdf rather than t-softfp-sfdf for hard-float targets. * config/mips/t-mips (LIB2_SIDITI_CONV_FUNCS): Reinstate. (softfp_float_modes, softfp_int_modes, softfp_extensions) (softfp_truncations, softfp_exclude_libgcc2): New. * config/t-hardfp: New file. * config/t-hardfp-sfdf: Likewise. * config/hardfp.c: Likewise. Index: libgcc/configure.ac === --- libgcc/configure.ac 2014-02-04 18:11:37.078922650 + +++ libgcc/configure.ac 2014-02-04 18:11:41.534950708 + @@ -292,6 +292,18 @@ EOF eval `${CC-cc} -E conftest.c | grep host_address=` rm -f conftest.c +case ${host} in +mips*-*-*) + AC_CACHE_CHECK([whether the target is hard-float], +[libgcc_cv_mips_hard_float], +[AC_COMPILE_IFELSE( +[#ifndef __mips_hard_float + #error FOO + #endif], +
[RFA] [middle-end/54041] Convert modes as needed from expand_expr
expand_expr has, for as long as I can remember, had the ability to ignore the desired mode provided by its callers and instead returning something in a completely different mode. It's always been the caller's responsibility to deal with that. For the testcase in 54041, we call expand_expr with a desired mode of SImode, but it actually returns a HImode object. This causes the assertion in convert_memory_address_addr_space to trip because the passed mode must be the same as the mode of the memory address. The fix is simple. If expand_expr returns something in the wrong mode, convert it to the desired mode. I've reviewed the resulting code for the m68k target and it looks correct to me. I've also bootstrapped and regression tested on x86_64-unknown-linux-gnu, though the new code most certainly does not trigger there. I guess if someone really wanted to be thorough, they'd test on a target where pointers and integers are different sizes. OK for the trunk? Thanks, Jeff diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 2dbab72..4c7da83 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2014-02-05 Jeff Law + + PR middle-end/54041 + * expr.c (expand_expr_addr_1): Handle expand_expr returning an + object with an undesirable mode. + 2014-02-05 Bill Schmidt * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Change diff --git a/gcc/expr.c b/gcc/expr.c index 878a51b..9609c45 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -7708,6 +7708,11 @@ expand_expr_addr_expr_1 (tree exp, rtx target, enum machine_mode tmode, modifier == EXPAND_INITIALIZER ? EXPAND_INITIALIZER : EXPAND_NORMAL); + /* expand_expr is allowed to return an object in a mode other +than TMODE. If it did, we need to convert. */ + if (tmode != GET_MODE (tmp)) + tmp = convert_modes (tmode, GET_MODE (tmp), +tmp, TYPE_UNSIGNED (TREE_TYPE (offset))); result = convert_memory_address_addr_space (tmode, result, as); tmp = convert_memory_address_addr_space (tmode, tmp, as); diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index c81a00d..283912d 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2014-02-05 Jeff Law + + PR middle-end/54041 + * gcc.target/m68k/pr54041.c: New test. + 2014-02-05 Bill Schmidt * gcc.dg/vmx/sum2s.c: New. diff --git a/gcc/testsuite/gcc.target/m68k/pr54041.c b/gcc/testsuite/gcc.target/m68k/pr54041.c new file mode 100644 index 000..645cb6d --- /dev/null +++ b/gcc/testsuite/gcc.target/m68k/pr54041.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mshort" } */ + +extern int r[]; + +int *fn(int i) +{ + return &r[i]; +} +
Re: [C++ patch] for C++/52369
OK. Jason
Re: wide-int, lto
On Nov 25, 2013, at 3:09 AM, Richard Biener wrote: > please add streamer_read/write_wi () helpers to data-streamer* > > replicating the above loop N times is too ugly. Agreed. Below is the patch to collapse the code. Thanks. diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c index 08eba48..3e02840 100644 --- a/gcc/lto-streamer-in.c +++ b/gcc/lto-streamer-in.c @@ -617,6 +617,21 @@ make_new_block (struct function *fn, unsigned int index) } +/* Read a wide-int. */ + +static widest_int +streamer_read_wi (struct lto_input_block *ib) +{ + HOST_WIDE_INT a[WIDE_INT_MAX_ELTS]; + int i; + int prec ATTRIBUTE_UNUSED = streamer_read_uhwi (ib); + int len = streamer_read_uhwi (ib); + for (i = 0; i < len; i++) +a[i] = streamer_read_hwi (ib); + return widest_int::from_array (a, len); +} + + /* Read the CFG for function FN from input block IB. */ static void @@ -726,28 +741,10 @@ input_cfg (struct lto_input_block *ib, struct data_in *data_in, loop->estimate_state = streamer_read_enum (ib, loop_estimation, EST_LAST); loop->any_upper_bound = streamer_read_hwi (ib); if (loop->any_upper_bound) - { - HOST_WIDE_INT a[WIDE_INT_MAX_ELTS]; - int i; - int prec ATTRIBUTE_UNUSED = streamer_read_uhwi (ib); - int len = streamer_read_uhwi (ib); - for (i = 0; i < len; i++) - a[i] = streamer_read_hwi (ib); - - loop->nb_iterations_upper_bound = widest_int::from_array (a, len); - } + loop->nb_iterations_upper_bound = streamer_read_wi (ib); loop->any_estimate = streamer_read_hwi (ib); if (loop->any_estimate) - { - HOST_WIDE_INT a[WIDE_INT_MAX_ELTS]; - int i; - int prec ATTRIBUTE_UNUSED = streamer_read_uhwi (ib); - int len = streamer_read_uhwi (ib); - for (i = 0; i < len; i++) - a[i] = streamer_read_hwi (ib); - - loop->nb_iterations_estimate = widest_int::from_array (a, len); - } + loop->nb_iterations_estimate = streamer_read_wi (ib); /* Read OMP SIMD related info. */ loop->safelen = streamer_read_hwi (ib); diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c index de19235..60acb42 100644 --- a/gcc/lto-streamer-out.c +++ b/gcc/lto-streamer-out.c @@ -1622,6 +1622,21 @@ output_ssa_names (struct output_block *ob, struct function *fn) } +/* Output a wide-int. */ + +static void +streamer_write_wi (struct output_block *ob, + const widest_int &w) +{ + int len = w.get_len (); + + streamer_write_uhwi (ob, w.get_precision ()); + streamer_write_uhwi (ob, len); + for (int i = 0; i < len; i++) +streamer_write_hwi (ob, w.elt (i)); +} + + /* Output the cfg. */ static void @@ -1694,26 +1709,10 @@ output_cfg (struct output_block *ob, struct function *fn) loop_estimation, EST_LAST, loop->estimate_state); streamer_write_hwi (ob, loop->any_upper_bound); if (loop->any_upper_bound) - { - int len = loop->nb_iterations_upper_bound.get_len (); - int i; - - streamer_write_uhwi (ob, loop->nb_iterations_upper_bound.get_precision ()); - streamer_write_uhwi (ob, len); - for (i = 0; i < len; i++) - streamer_write_hwi (ob, loop->nb_iterations_upper_bound.elt (i)); - } + streamer_write_wi (ob, loop->nb_iterations_upper_bound); streamer_write_hwi (ob, loop->any_estimate); if (loop->any_estimate) - { - int len = loop->nb_iterations_estimate.get_len (); - int i; - - streamer_write_uhwi (ob, loop->nb_iterations_estimate.get_precision ()); - streamer_write_uhwi (ob, len); - for (i = 0; i < len; i++) - streamer_write_hwi (ob, loop->nb_iterations_estimate.elt (i)); - } + streamer_write_wi (ob, loop->nb_iterations_estimate); /* Write OMP SIMD related info. */ streamer_write_hwi (ob, loop->safelen);
Re: Allow passing arrays in registers on AArch64
On Tue, Feb 4, 2014 at 2:12 AM, Michael Hudson-Doyle wrote: > Ping? I'm attaching a marginally cleaner version of the test. I've had > a look at integrating this into aapcs64.exp but got defeated in the > end. If go-torture-execute took a list of sources as c-torture-execute > does, then I think adding something like this to aapcs64.exp would work: > > # Test passing arrays by value > set src $srcdir/$subdir/test_array.go > if {[runtest_file_p $runtests $src]} { > go-torture-execute [list $src $srcdir/$subdir/abitest.S > $srcdir/$subdir/_test_array_c.c] \ > $additional_flags > } > > but it doesn't. I also think that some stuff needs to be set up before > go-torture-execute can be called that I don't really understand and I > also also don't know how to avoid executing this test if gccgo hasn't > been built. Putting in a shell script like that is a bad idea, the dejagnu infrastructure can take care of all the transport and target bits for this. This will fail if someone were to try testing this on qemu while the rest of the testsuite might work. However what happens if in aarch64.exp you do a load_lib go-dg.exp and essentially run the tests similar to testsuite/go.dg/dg.exp ? That will take care of any cross-testing issues I hope with this using dejagnu. If that fails I think we should just drop the test as the go testsuite will catch this. > > All that said, is there any chance of getting the original ABI fix > committed? It would be nice to have it in 4.9. I cannot approve or disapprove this patch but looking at the fix and the ABI issue under question here, I agree that this should be fixed for 4.9 and documented in the release notes. Your latest patch should also take Yufeng's suggestion under consideration about merging the condition in the if block. Cheers, Ramana > > Cheers, > mwh > > Michael Hudson-Doyle writes: > >> Richard Earnshaw writes: >> >>> On 17/01/14 23:56, Michael Hudson-Doyle wrote: Ian Lance Taylor writes: > On Fri, Jan 17, 2014 at 11:32 AM, Michael Hudson-Doyle > wrote: >> >> On 18 Jan 2014 07:50, "Yufeng Zhang" wrote: >>> >>> Also can you please try to add some new test(s)? It may not be that >>> straightforward to add non-C/C++ tests, but give it a try. >> >> Can you give some hints? Like at least where in the tree such a test >> would >> go? I don't know this code at all. > > There is already a test in libgo, of course. > > I think it would be pretty hard to write a test that doesn't something > like what libgo does. The problem is that GCC is entirely consistent > with and without your patch. You could add a Go test that passes an > array in gcc/testsuite/go.go-torture/execute/ easily enough, but it > would be quite hard to add a test that doesn't pass whether or not > your patch is applied. I think it would have to be a code generation test, i.e. that compiling something like func second(e [2]int64) int64 { return e[1] } does not access memory or something along those lines. I'll have a look next week. Cheers, mwh >>> >>> Michael, >>> >>> Can you have a look at how the tests in gcc.target/aarch64/aapcs64 work; >>> it ought to be possible to write a test (though not in C) to catch this. >> >> Yeah, I had found those tests. It would be much easier if I could write >> this in C :-) >> >>> The tests work by calling a wrapper function written in assembler -- >>> that wrapper stores out all the registers used for parameter passing and >>> then calls back into the test's validator with the register dump >>> available. Code can then check that values are passed in the places >>> expected. This gives the compiler the separation between caller and >>> callee that's needed to test this feature. >> >> Right, that much makes sense. I'm not sure I completely understand all >> the preprocessor trickery involved but the output of it seems clear >> enough. >> >>> My preferred languages for these tests would be (in approximate order) >>> c, c++, fortran, java, ada, go. That order is based on which languages >>> are tested most by users. >> >> Well of course it cannot be done in C or C++. A commenter on the PR >> said that while fortran does allow passing arrays by value, it's all >> handled in the frontend. No idea about Java. Ada has arrays by value >> but I don't know it even slightly. So it's the last one on your list >> but here's a diff that adds hack at a test in Go: >> >> diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest.S >> b/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest.S >> index 86ce7be..365cd91 100644 >> --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest.S >> +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest.S >> @@ -1,9 +1,12 @@ >> .global dumpregs >> .global myfunc >> + .global main.myfunc >> .type dumpregs,%function >>
Re: [PATCH] S/390: Throw FE_INVALID exception in the fp2int libgcc routines
I have no further comments on the patch. -- Joseph S. Myers jos...@codesourcery.com
Re: Disable accumulate-outgoing-args for Generic and Buldozers
> On Thu, Feb 06, 2014 at 06:06:41PM +0100, Jan Hubicka wrote: > > > Hi, > > > this is improved patch I am testing. The basic idea is to remove push > > > expanders for cases where we do not have push instruction anyway. > > > emit_move_insns then resorts to unconditonally call move expander > > > with push operand. I expended ix86_expand_vector_move to handle > > > it gracefully and for that I borrowed emit_move_resolve_push > > > function from expr.c since it seemed pointless to preserve > > > duplicated logic in ix86_expand_push. > > > > > > I can easily imagine that scheduling around function call sequences > > > matters, so I also updated push/pop expanders to preserve memory > > > attributes. > > > > > > Eventually I found the attributes to be blank because of logic in expr.c > > > that clears alias info when sibcall is enabled. We can now do better > > > by only disabling it in functions that actually do sibcalls. > > > > > > Bootstrap/regtest running on x86_64-linux, OK for the non-i386 parts > > > if it passes? > > > > Ping... > > The expr.[ch]/function.h/tree-tailcall.c bits are ok. > I see your changes clash with my PR60077 fix, does your patch make them > obsolete and you take care of using proper alignment info? > If so, at least the two tests from that PR's patch should be added, > but I can do that as a follow-up. Yes, this patch was made to fix gcc.target/i386/pr35767-5.c by obtaining correct alignment (and also alias) info on the store. Sorry for making you to duplicate the effort - seems I should have pinged it earlier. Honza
Re: Disable accumulate-outgoing-args for Generic and Buldozers
On Thu, Feb 06, 2014 at 06:06:41PM +0100, Jan Hubicka wrote: > > Hi, > > this is improved patch I am testing. The basic idea is to remove push > > expanders for cases where we do not have push instruction anyway. > > emit_move_insns then resorts to unconditonally call move expander > > with push operand. I expended ix86_expand_vector_move to handle > > it gracefully and for that I borrowed emit_move_resolve_push > > function from expr.c since it seemed pointless to preserve > > duplicated logic in ix86_expand_push. > > > > I can easily imagine that scheduling around function call sequences > > matters, so I also updated push/pop expanders to preserve memory attributes. > > > > Eventually I found the attributes to be blank because of logic in expr.c > > that clears alias info when sibcall is enabled. We can now do better > > by only disabling it in functions that actually do sibcalls. > > > > Bootstrap/regtest running on x86_64-linux, OK for the non-i386 parts > > if it passes? > > Ping... The expr.[ch]/function.h/tree-tailcall.c bits are ok. I see your changes clash with my PR60077 fix, does your patch make them obsolete and you take care of using proper alignment info? If so, at least the two tests from that PR's patch should be added, but I can do that as a follow-up. > > * expr.c (emit_move_resolve_push): Export; be bit more selective > > on when to clear alias set. > > * expr.h (emit_move_resolve_push): Declare. > > * function.h (struct function): Add tail_call_marked. > > * tree-tailcall.c (optimize_tail_call): Set tail_call_marked. > > * config/i386/i386-protos.h (ix86_expand_push): Remove. > > * config/i386/i386.md (TImode move expander): De not call > > ix86_expand_push. > > (FP push expanders): Preserve memory attributes. > > * config/i386/sse.md (push1): Remove. > > * config/i386/i386.c (ix86_expand_vector_move): Handle push > > operation. > > (ix86_expand_push): Remove. > > * config/i386/mmx.md (push1): Remove. Jakub
Fix issues with -enable-gather-detailed-mem-stats
Hi, seems no one cared about memory use for a while, since stats code got bit rotten. Bootstrapped/regtested x86_64-linux, will commit it shortly. Honza * ggc.h (ggc_internal_cleared_alloc): New macro. * var.h (vec_safe_copy): Handle memory stats. * omp-low.c (simd_clone_struct_alloc): Use ggc_internal_cleared_alloc. * target-globals.c (save_target_globals): Likewise. * parser.c (synthesize_implicit_template_parm): Use grow_tree_vec. Index: ggc.h === --- ggc.h (revision 207514) +++ ggc.h (working copy) @@ -146,6 +146,7 @@ extern size_t ggc_round_alloc_size (size /* Allocates cleared memory. */ extern void *ggc_internal_cleared_alloc_stat (size_t MEM_STAT_DECL) ATTRIBUTE_MALLOC; +#define ggc_internal_cleared_alloc(s) ggc_internal_cleared_alloc_stat (s MEM_STAT_INFO) /* Resize a block. */ extern void *ggc_realloc_stat (void *, size_t MEM_STAT_DECL); Index: cp/parser.c === --- cp/parser.c (revision 207514) +++ cp/parser.c (working copy) @@ -31957,7 +31957,7 @@ synthesize_implicit_template_parm (cp_p { tree& new_parms = INNERMOST_TEMPLATE_PARMS (current_template_parms); int new_parm_idx = TREE_VEC_LENGTH (new_parms); - new_parms = grow_tree_vec_stat (new_parms, new_parm_idx + 1); + new_parms = grow_tree_vec (new_parms, new_parm_idx + 1); TREE_VEC_ELT (new_parms, new_parm_idx) = parser->implicit_template_parms; } Index: vec.h === --- vec.h (revision 207514) +++ vec.h (working copy) @@ -679,9 +679,9 @@ vec_safe_truncate (vec * /* If SRC is not NULL, return a pointer to a copy of it. */ template inline vec * -vec_safe_copy (vec *src) +vec_safe_copy (vec *src CXX_MEM_STAT_INFO) { - return src ? src->copy () : NULL; + return src ? src->copy (ALONE_PASS_MEM_STAT) : NULL; } /* Copy the elements from SRC to the end of DST as if by memcpy. Index: omp-low.c === --- omp-low.c (revision 207514) +++ omp-low.c (working copy) @@ -10649,7 +10649,7 @@ simd_clone_struct_alloc (int nargs) size_t len = (sizeof (struct cgraph_simd_clone) + nargs * sizeof (struct cgraph_simd_clone_arg)); clone_info = (struct cgraph_simd_clone *) - ggc_internal_cleared_alloc_stat (len PASS_MEM_STAT); + ggc_internal_cleared_alloc (len); return clone_info; } Index: target-globals.c === --- target-globals.c(revision 207514) +++ target-globals.c(working copy) @@ -79,29 +79,21 @@ save_target_globals (void) struct target_lower_subreg lower_subreg; } *p; p = (struct target_globals_extra *) - ggc_internal_cleared_alloc_stat (sizeof (struct target_globals_extra) - PASS_MEM_STAT); + ggc_internal_cleared_alloc (sizeof (struct target_globals_extra)); g = (struct target_globals *) p; g->flag_state = &p->flag_state; - g->regs = ggc_internal_cleared_alloc_stat (sizeof (struct target_regs) -PASS_MEM_STAT); + g->regs = ggc_internal_cleared_alloc (sizeof (struct target_regs)); g->rtl = ggc_alloc_cleared_target_rtl (); g->hard_regs -= ggc_internal_cleared_alloc_stat (sizeof (struct target_hard_regs) - PASS_MEM_STAT); - g->reload = ggc_internal_cleared_alloc_stat (sizeof (struct target_reload) - PASS_MEM_STAT); - g->expmed = ggc_internal_cleared_alloc_stat (sizeof (struct target_expmed) - PASS_MEM_STAT); += ggc_internal_cleared_alloc (sizeof (struct target_hard_regs)); + g->reload = ggc_internal_cleared_alloc (sizeof (struct target_reload)); + g->expmed = ggc_internal_cleared_alloc (sizeof (struct target_expmed)); g->optabs = &p->optabs; g->libfuncs = ggc_alloc_cleared_target_libfuncs (); g->cfgloop = &p->cfgloop; - g->ira = ggc_internal_cleared_alloc_stat (sizeof (struct target_ira) - PASS_MEM_STAT); - g->ira_int = ggc_internal_cleared_alloc_stat (sizeof (struct target_ira_int) - PASS_MEM_STAT); - g->lra_int = ggc_internal_cleared_alloc_stat (sizeof (struct target_lra_int) - PASS_MEM_STAT); + g->ira = ggc_internal_cleared_alloc (sizeof (struct target_ira)); + g->ira_int = ggc_internal_cleared_alloc (sizeof (struct target_ira_int)); + g->lra_int = ggc_internal_cleared_alloc (sizeof (struct target_lra_int)); g->builtins = &p->builtins; g->gcse = &p->gcse; g->bb_reorder = &p->bb_reorder;
Re: Disable accumulate-outgoing-args for Generic and Buldozers
> Hi, > this is improved patch I am testing. The basic idea is to remove push > expanders for cases where we do not have push instruction anyway. > emit_move_insns then resorts to unconditonally call move expander > with push operand. I expended ix86_expand_vector_move to handle > it gracefully and for that I borrowed emit_move_resolve_push > function from expr.c since it seemed pointless to preserve > duplicated logic in ix86_expand_push. > > I can easily imagine that scheduling around function call sequences > matters, so I also updated push/pop expanders to preserve memory attributes. > > Eventually I found the attributes to be blank because of logic in expr.c > that clears alias info when sibcall is enabled. We can now do better > by only disabling it in functions that actually do sibcalls. > > Bootstrap/regtest running on x86_64-linux, OK for the non-i386 parts > if it passes? Ping... > > Honza > > * expr.c (emit_move_resolve_push): Export; be bit more selective > on when to clear alias set. > * expr.h (emit_move_resolve_push): Declare. > * function.h (struct function): Add tail_call_marked. > * tree-tailcall.c (optimize_tail_call): Set tail_call_marked. > * config/i386/i386-protos.h (ix86_expand_push): Remove. > * config/i386/i386.md (TImode move expander): De not call > ix86_expand_push. > (FP push expanders): Preserve memory attributes. > * config/i386/sse.md (push1): Remove. > * config/i386/i386.c (ix86_expand_vector_move): Handle push > operation. > (ix86_expand_push): Remove. > * config/i386/mmx.md (push1): Remove.
Free constructor elts in LTO merging
Hi, according to memory stats this is relatively common reason for garbage left after tree merging. Bootstrapped/regtested x86_64-linux, OK? Honza * lto/lto.c (unify_scc): Free also CONSTRUCTOR_ELTS. Index: lto/lto.c === --- lto/lto.c (revision 207515) +++ lto/lto.c (working copy) @@ -1807,8 +1807,13 @@ unify_scc (struct streamer_tree_cache_d /* Free the tree nodes from the read SCC. */ for (unsigned i = 0; i < len; ++i) { + enum tree_code code; if (TYPE_P (scc->entries[i])) num_merged_types++; + code = TREE_CODE (scc->entries[i]); + if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR) + && CONSTRUCTOR_ELTS (scc->entries[i])) + ggc_free (CONSTRUCTOR_ELTS (scc->entries[i])); ggc_free (scc->entries[i]); }
[C++ patch] for C++/52369
Hi, This bug seems already fixed on mainline. I have added two testcases, but as the C++11 one is not obvious to me, I guess this path does not qualify as obvious. Incidentally, while writing the C++11 testcase, I realized that the diagnostic was not emitted at the proper location and fixed it. OK for trunk ? 2014-02-06 Fabien Chene PR c++/52369 * cp/method.c (walk_field_subobs): improve the diagnostic locations for both REFERENCE_TYPEs and non-static const members. 2014-02-06 Fabien Chene PR c++/52369 * g++.dg/init/const10.C: New. * g++.dg/init/const11.C: New. -- Fabien Index: gcc/testsuite/g++.dg/init/const10.C === --- gcc/testsuite/g++.dg/init/const10.C (revision 0) +++ gcc/testsuite/g++.dg/init/const10.C (revision 0) @@ -0,0 +1,31 @@ +// PR C++/52369 +// { dg-do compile { target c++11 } } + +class B // { dg-message "implicitly deleted" } +{ + int const v_; // { dg-error "uninitialized" } +}; + +struct D : B {}; // { dg-error "deleted" } + +class A // { dg-message "implicitly deleted" } +{ + int& ref; // { dg-error "uninitialized" } +}; + +struct C : A {}; // { dg-error "deleted" } + +void f() +{ + D d; // { dg-error "use of deleted" } + new D; // { dg-error "use of deleted" } + D(); // { dg-error "use of deleted" } + new D(); // { dg-error "use of deleted" } + + C c; // { dg-error "use of deleted" } + new C; // { dg-error "use of deleted" } + C(); // { dg-error "use of deleted" } + new C(); // { dg-error "use of deleted" } +} + + Index: gcc/testsuite/g++.dg/init/const11.C === --- gcc/testsuite/g++.dg/init/const11.C (revision 0) +++ gcc/testsuite/g++.dg/init/const11.C (revision 0) @@ -0,0 +1,29 @@ +// PR C++/52369 +// { dg-do compile { target c++98 } } + +class B +{ + int const v_; // { dg-message "should be initialized" } +}; + +struct D : B {}; + +class A +{ + int& ref; // { dg-message "should be initialized" } +}; + +struct C : A {}; + +void f() +{ + D d; // { dg-error "uninitialized" } + new D; // { dg-error "uninitialized" } + D(); + new D(); + + C c; // { dg-error "uninitialized" } + new C; // { dg-error "uninitialized" } + C(); // { dg-error "value-initialization" } + new C(); // { dg-error "value-initialization" } +} Index: gcc/cp/method.c === --- gcc/cp/method.c (revision 207406) +++ gcc/cp/method.c (working copy) @@ -1091,15 +1091,15 @@ walk_field_subobs (tree fields, tree fnn && default_init_uninitialized_part (mem_type)) { if (diag) - error ("uninitialized non-static const member %q#D", - field); + error_at (DECL_SOURCE_LOCATION (field), + "uninitialized non-static const member %q#D", field); bad = true; } else if (TREE_CODE (mem_type) == REFERENCE_TYPE) { if (diag) - error ("uninitialized non-static reference member %q#D", - field); + error_at (DECL_SOURCE_LOCATION (field), + "uninitialized non-static reference member %q#D", field); bad = true; }
Avoid unnnecesary copying of ipa-prop's expressions
Hi, at WPA we currently read trees accessed by jump functions and then copy them to remove location that is already known to be UNKNOWN and then keep copying them for every inline clone introduced (and there are many for firefox) This patch makes us to copy only when expression really has an location in it. Bootstrapped/regtested x86_64-linux, OK? Honza * gimplify.c (expr_with_location_p): New function. (expr_without_location): Likewise. * gimplify.h (expr_without_location): Declare. * ipa-prop.c (ipa_set_jf_constant, ipa_set_jf_arith_pass_through, determine_known_aggregate_parts): Do not unshare. Index: gimplify.c === --- gimplify.c (revision 207514) +++ gimplify.c (working copy) @@ -901,6 +901,32 @@ unshare_expr_without_location (tree expr walk_tree (&expr, prune_expr_location, NULL, NULL); return expr; } + +/* Worker for expr_without_location. */ + +static tree +expr_with_location_p (tree *tp, int *walk_subtrees, void *) +{ + if (EXPR_P (*tp)) +{ + if (EXPR_LOCATION (*tp) != UNKNOWN_LOCATION) + return *tp; +} + else +*walk_subtrees = 0; + return NULL_TREE; +} + +/* Return EXPR if it has no location, otherwise make its copy + without location. */ + +tree +expr_without_location (tree expr) +{ + if (walk_tree (&expr, expr_with_location_p, NULL, NULL)) +return (unshare_expr_without_location (expr)); + return expr; +} /* WRAPPER is a code such as BIND_EXPR or CLEANUP_POINT_EXPR which can both contain statements and have a value. Assign its value to a temporary Index: gimplify.h === --- gimplify.h (revision 207514) +++ gimplify.h (working copy) @@ -62,6 +62,7 @@ extern void declare_vars (tree, gimple, extern void gimple_add_tmp_var (tree); extern tree unshare_expr (tree); extern tree unshare_expr_without_location (tree); +extern tree expr_without_location (tree); extern tree voidify_wrapper_expr (tree, tree); extern tree build_and_jump (tree *); extern enum gimplify_status gimplify_self_mod_expr (tree *, gimple_seq *, Index: ipa-prop.c === --- ipa-prop.c (revision 207514) +++ ipa-prop.c (working copy) @@ -419,11 +419,8 @@ static void ipa_set_jf_constant (struct ipa_jump_func *jfunc, tree constant, struct cgraph_edge *cs) { - constant = unshare_expr (constant); - if (constant && EXPR_P (constant)) -SET_EXPR_LOCATION (constant, UNKNOWN_LOCATION); jfunc->type = IPA_JF_CONST; - jfunc->value.constant.value = unshare_expr_without_location (constant); + jfunc->value.constant.value = expr_without_location (constant); if (TREE_CODE (constant) == ADDR_EXPR && TREE_CODE (TREE_OPERAND (constant, 0)) == FUNCTION_DECL) @@ -463,7 +460,7 @@ ipa_set_jf_arith_pass_through (struct ip tree operand, enum tree_code operation) { jfunc->type = IPA_JF_PASS_THROUGH; - jfunc->value.pass_through.operand = unshare_expr_without_location (operand); + jfunc->value.pass_through.operand = expr_without_location (operand); jfunc->value.pass_through.formal_id = formal_id; jfunc->value.pass_through.operation = operation; jfunc->value.pass_through.agg_preserved = false; @@ -1538,7 +1535,7 @@ determine_known_aggregate_parts (gimple struct ipa_agg_jf_item item; item.offset = list->offset - arg_offset; gcc_assert ((item.offset % BITS_PER_UNIT) == 0); - item.value = unshare_expr_without_location (list->constant); + item.value = expr_without_location (list->constant); jfunc->agg.items->quick_push (item); } list = list->next;
Re: [ARM][PATCH] Vectorizer generates unaligned access when -mno-unaligned-access is enabled
On 6 February 2014 10:49, Yury Gribov wrote: > Kugan wrote: >>> Ok if no regressions. >> >> Tested it on qemu for arm-none-linux-gnueabi and there is no new >> regressions. I am sorry I didn't mention it when I posted the patch. > > Commited in r207533. Thanks! > Hi, As can be seen here: http://cbuild.validation.linaro.org/build/cross-validation/gcc/207533/report-build-info.html this has caused some regressions on armv5t targets. Additionally, are you sure testing this kind of thing on QEMU sufficient? Does it really handle unaligned accesses as bus errors? Christophe.
Make gimple_get_virt_method_for_vtable O(1) and not allocating garbage
Hi, I did some memory measurements for Firefox. We seems in shape in exception of linemaps that takes about 20% of memory and I also noticed that we produce a lot of garbage by copy_tree_r. Analyzing the reasons for copy_tree_r I found two about equally important. The first one is that gimple_get_virt_method_for_vtable lookup virtual method via fold_ctor_reference and this eventually calls unshare_expr on the resulting value (that is ADDR_EXPR of FUNCTION_DECL). We make copy of ADDR_EXPR and immediately throw it away in gimple_get_virt_method_for_vtable and care only about the decl. I always considered it stupid to do linear walk of fields here, when we can do just simple O(1) lookup, but considered it SEP. This patch implements that. The constructors are fully controlled by frontend (we verify that it is vtable by DECL_VIRTUAL flag first), so I think it is safe to use special purpose lookup code. Bootstrapped/regtested x86_64-linux, OK? Honza * gimple-fold.c (gimple_get_virt_method_for_vtable): Rewrite constructor lookup to be O(1). Index: gimple-fold.c === --- gimple-fold.c (revision 207523) +++ gimple-fold.c (working copy) @@ -3179,6 +3180,8 @@ gimple_get_virt_method_for_vtable (HOST_ { tree vtable = v, init, fn; unsigned HOST_WIDE_INT size; + unsigned HOST_WIDE_INT elt_size, access_index; + tree domain_type; /* First of all double check we have virtual table. */ if (TREE_CODE (v) != VAR_DECL @@ -3202,10 +3205,30 @@ gimple_get_virt_method_for_vtable (HOST_ offset *= BITS_PER_UNIT; offset += token * size; - /* Do not pass from_decl here, we want to know even about values we can - not use and will check can_refer_decl_in_current_unit_p ourselves. */ - fn = fold_ctor_reference (TREE_TYPE (TREE_TYPE (v)), init, - offset, size, NULL); + /* Lookup the value in the constructor that is assumed to be array. + This is equivalent to + fn = fold_ctor_reference (TREE_TYPE (TREE_TYPE (v)), init, + offset, size, NULL); + but in a constant time. We expect that frontend produced a simple + array without indexed initializers. */ + + gcc_checking_assert (TREE_CODE (TREE_TYPE (init)) == ARRAY_TYPE); + domain_type = TYPE_DOMAIN (TREE_TYPE (init)); + gcc_checking_assert (integer_zerop (TYPE_MIN_VALUE (domain_type))); + elt_size = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (init; + + access_index = offset / BITS_PER_UNIT / elt_size; + gcc_checking_assert (offset % (elt_size * BITS_PER_UNIT) == 0); + + /* This code makes an assumption that there are no + indexed fileds produced by C++ FE, so we can directly index the array. */ + if (access_index < CONSTRUCTOR_NELTS (init)) +{ + fn = CONSTRUCTOR_ELT (init, access_index)->value; + gcc_checking_assert (!CONSTRUCTOR_ELT (init, access_index)->index); +} + else +fn = NULL; /* For type inconsistent program we may end up looking up virtual method in virtual table that does not contain TOKEN entries. We may overrun
Re: [PATCH][ARM] Add Cortex-A57 rtx costs table
On Thu, Feb 06, 2014 at 04:37:38PM +, Ramana Radhakrishnan wrote: > On Thu, Jan 30, 2014 at 1:45 PM, Kyrill Tkachov > wrote: > > Hi all, > > > > This patch adds the rtx costs table for Cortex-A57 and sets its issue rate > > properly in the arm backend. > > > > Tested on arm-none-eabi on a model. > > > > > > Ok for trunk? > > In my view this is OK - this is just a tuning patch for A57 and we > should be able to take it now as it is not default anywhere and > unlikely to affect anyone doing bulk rebuilds. > > However I would like an RM to ack this. Ok. > > 2014-01-30 Kyrylo Tkachov > > > > * config/arm/aarch-cost-tables.h (cortexa57_extra_costs): New table. > > Remove extra newline at end of file. > > * config/arm/arm.c (arm_cortex_a57_tune): New tuning struct. > > (arm_issue_rate): Handle cortexa57. > > * config/arm/arm-cores.def (cortex-a57): Use cortex_a57 tuning. > > (cortex-a57.cortex-a53): Likewise. Jakub
Re: [PATCH] Fix ubsan expansion of multiplication (PR rtl-optimization/60030)
On Thu, Feb 06, 2014 at 08:27:09AM -0800, Richard Henderson wrote: > > All I know is that the generated code with the ZERO_EXTEND has been larger > > than with the paradoxical subreg. But if you prefer, I can surely emit a > > ZERO_EXTEND and open a PR for GCC 5.0 that we should investigate why we > > generate worse code then. > > Yes please. Ok, I'll retest this before committing then: 2014-02-06 Jakub Jelinek PR rtl-optimization/60030 * internal-fn.c (ubsan_expand_si_overflow_mul_check): Surround lopart with paradoxical subreg before shifting it up by hprec. --- gcc/internal-fn.c.jj2014-01-29 12:43:24.0 +0100 +++ gcc/internal-fn.c 2014-02-03 10:40:57.0 +0100 @@ -646,7 +646,8 @@ ubsan_expand_si_overflow_mul_check (gimp emit_cmp_and_jump_insns (hipart, const0_rtx, GE, NULL_RTX, hmode, false, after_hipart_neg, PROB_EVEN); - tem = expand_shift (LSHIFT_EXPR, mode, lopart, hprec, NULL_RTX, 1); + tem = convert_modes (mode, hmode, lopart, 1); + tem = expand_shift (LSHIFT_EXPR, mode, tem, hprec, NULL_RTX, 1); tem = expand_simple_binop (mode, MINUS, loxhi, tem, NULL_RTX, 1, OPTAB_DIRECT); emit_move_insn (loxhi, tem); Jakub
Re: [PATCH][ARM] Add Cortex-A57 rtx costs table
On Thu, Jan 30, 2014 at 1:45 PM, Kyrill Tkachov wrote: > Hi all, > > This patch adds the rtx costs table for Cortex-A57 and sets its issue rate > properly in the arm backend. > > Tested on arm-none-eabi on a model. > > > Ok for trunk? In my view this is OK - this is just a tuning patch for A57 and we should be able to take it now as it is not default anywhere and unlikely to affect anyone doing bulk rebuilds. However I would like an RM to ack this. regards Ramana > > Thanks, > Kyrill > > 2014-01-30 Kyrylo Tkachov > > * config/arm/aarch-cost-tables.h (cortexa57_extra_costs): New table. > Remove extra newline at end of file. > * config/arm/arm.c (arm_cortex_a57_tune): New tuning struct. > (arm_issue_rate): Handle cortexa57. > * config/arm/arm-cores.def (cortex-a57): Use cortex_a57 tuning. > (cortex-a57.cortex-a53): Likewise.
Re: [PATCH] Fix ubsan expansion of multiplication (PR rtl-optimization/60030)
On 02/06/2014 08:25 AM, Jakub Jelinek wrote: > On Thu, Feb 06, 2014 at 08:23:00AM -0800, Richard Henderson wrote: >> On 02/06/2014 08:02 AM, Jakub Jelinek wrote: >>> On Thu, Feb 06, 2014 at 06:53:55AM -0800, Richard Henderson wrote: On 02/04/2014 04:40 AM, Jakub Jelinek wrote: > - tem = expand_shift (LSHIFT_EXPR, mode, lopart, hprec, NULL_RTX, 1); > + tem = gen_rtx_SUBREG (mode, lopart, 0); > + tem = expand_shift (LSHIFT_EXPR, mode, tem, hprec, NULL_RTX, 1); I would be happier with gen_lowpart rather than the explicit gen_rtx_subreg. >>> >>> I need a paradoxical subreg, gen_lowpart ICEs in that case >> >> It does? Since when? I've certainly used it for paradoxicals in the past. >> >> Oh, I see, yes, it would ICE for a multi-word paradoxical subreg. But that >> sort of thing is ... skirting the bounds of validity at best. >> >> Surely we should be able to optimize away a zero-extension in all cases? > > All I know is that the generated code with the ZERO_EXTEND has been larger > than with the paradoxical subreg. But if you prefer, I can surely emit a > ZERO_EXTEND and open a PR for GCC 5.0 that we should investigate why we > generate worse code then. Yes please. r~
Re: [PATCH] Fix ubsan expansion of multiplication (PR rtl-optimization/60030)
On Thu, Feb 06, 2014 at 08:23:00AM -0800, Richard Henderson wrote: > On 02/06/2014 08:02 AM, Jakub Jelinek wrote: > > On Thu, Feb 06, 2014 at 06:53:55AM -0800, Richard Henderson wrote: > >> On 02/04/2014 04:40 AM, Jakub Jelinek wrote: > >>> - tem = expand_shift (LSHIFT_EXPR, mode, lopart, hprec, NULL_RTX, 1); > >>> + tem = gen_rtx_SUBREG (mode, lopart, 0); > >>> + tem = expand_shift (LSHIFT_EXPR, mode, tem, hprec, NULL_RTX, 1); > >> > >> I would be happier with gen_lowpart rather than the explicit > >> gen_rtx_subreg. > > > > I need a paradoxical subreg, gen_lowpart ICEs in that case > > It does? Since when? I've certainly used it for paradoxicals in the past. > > Oh, I see, yes, it would ICE for a multi-word paradoxical subreg. But that > sort of thing is ... skirting the bounds of validity at best. > > Surely we should be able to optimize away a zero-extension in all cases? All I know is that the generated code with the ZERO_EXTEND has been larger than with the paradoxical subreg. But if you prefer, I can surely emit a ZERO_EXTEND and open a PR for GCC 5.0 that we should investigate why we generate worse code then. Jakub
Re: [PATCH] Fix ubsan expansion of multiplication (PR rtl-optimization/60030)
On 02/06/2014 08:02 AM, Jakub Jelinek wrote: > On Thu, Feb 06, 2014 at 06:53:55AM -0800, Richard Henderson wrote: >> On 02/04/2014 04:40 AM, Jakub Jelinek wrote: >>> - tem = expand_shift (LSHIFT_EXPR, mode, lopart, hprec, NULL_RTX, 1); >>> + tem = gen_rtx_SUBREG (mode, lopart, 0); >>> + tem = expand_shift (LSHIFT_EXPR, mode, tem, hprec, NULL_RTX, 1); >> >> I would be happier with gen_lowpart rather than the explicit gen_rtx_subreg. > > I need a paradoxical subreg, gen_lowpart ICEs in that case It does? Since when? I've certainly used it for paradoxicals in the past. Oh, I see, yes, it would ICE for a multi-word paradoxical subreg. But that sort of thing is ... skirting the bounds of validity at best. Surely we should be able to optimize away a zero-extension in all cases? r~
Re: [PATCH] S/390: Throw FE_INVALID exception in the fp2int libgcc routines
Here is a new version of the patch not limited to NaNs and Infs anymore. Bootstrapped and regtested on s390 with various options. -Andreas- 2014-02-06 Andreas Krebbel * config/s390/32/_fixdfdi.c: Throw invalid exception if number cannot be represented. * config/s390/32/_fixsfdi.c: Likewise. * config/s390/32/_fixtfdi.c: Likewise. * config/s390/32/_fixunsdfdi.c: Likewise. * config/s390/32/_fixunssfdi.c: Likewise. * config/s390/32/_fixunstfdi.c: Likewise. 2014-02-06 Andreas Krebbel * gcc.target/s390/fp2int1.c: New testcase. commit 080f0a28fd6f27f5d122e287a6f18481d253d433 Author: Andreas Krebbel Date: Wed Jan 29 10:13:51 2014 +0100 S/390: libgcc: Throw INVALID exception for float->int conversions to implement C99 Annex F.4. diff --git a/gcc/testsuite/gcc.target/s390/fp2int1.c b/gcc/testsuite/gcc.target/s390/fp2int1.c new file mode 100644 index 000..2176370 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/fp2int1.c @@ -0,0 +1,95 @@ +/* Test for the 32 bit fp to 64 bit int conversion routines. + + On S/390 32 bit we use our own implementations in order to be IEEE + complaint as we are with our machine instructions. These missed to + throw FE_INVALID exceptions in a bunch of cases. */ + +/* { dg-do run { target s390-*-* } } */ +/* { dg-options "-O3 -mesa" } */ +/* { dg-require-effective-target fenv_exceptions } */ + +#define _GNU_SOURCE +#include +#include +#include + +#define INFINITYf (__builtin_inff()) +#define INFINITY(__builtin_inf()) +#define INFINITYl (__builtin_infl()) +#define NANf(__builtin_nanf ("")) +#define NAN (__builtin_nan ("")) +#define NANl(__builtin_nanl ("")) + +#define TESTEXCEPT_FUNC(FUNC, TYPE_FROM, TYPE_TO) \ + TYPE_TO \ + __attribute__((noinline)) FUNC (TYPE_FROM a) \ + {\ +asm volatile ("" : : "f" (a)); \ +return (TYPE_TO)a; \ + } + +#define TESTEXCEPT(FUNC, EXCEPT, EXPECT, VALUE, TYPE_TO) \ + {\ +TYPE_TO b; \ +feclearexcept (FE_ALL_EXCEPT); \ +b = FUNC (VALUE); \ +if ((fetestexcept (EXCEPT) & (EXCEPT)) != EXPECT) \ + { \ + printf ("FAIL in line: %d\n", __LINE__);\ + abort (); \ + } \ + } + +#define TESTEXCEPT_FUNC_ALLFLOATS(FUNC, TYPE_TO) \ + TESTEXCEPT_FUNC (FUNC##_f, float, TYPE_TO); \ + TESTEXCEPT_FUNC (FUNC##_d, double, TYPE_TO); \ + TESTEXCEPT_FUNC (FUNC##_l, long double, TYPE_TO);\ + +#define TESTEXCEPT_ALLFLOATS(FUNC, EXCEPT, EXPECT, VALUE, TYPE_TO) \ + TESTEXCEPT (FUNC##_f, EXCEPT, EXPECT, VALUE##f, TYPE_TO);\ + TESTEXCEPT (FUNC##_d, EXCEPT, EXPECT, VALUE, TYPE_TO); \ + TESTEXCEPT (FUNC##_l, EXCEPT, EXPECT, VALUE##l, TYPE_TO);\ + +TESTEXCEPT_FUNC_ALLFLOATS (a, unsigned long long); +TESTEXCEPT_FUNC_ALLFLOATS (u, long long); + + +int +main () +{ + /* Prevent getting signals. */ + fedisableexcept (FE_INVALID); + + /* To unsigned long long */ + + TESTEXCEPT_ALLFLOATS (a, FE_INVALID, FE_INVALID, INFINITY, unsigned long long); + TESTEXCEPT_ALLFLOATS (a, FE_INVALID, FE_INVALID, -INFINITY, unsigned long long); + TESTEXCEPT_ALLFLOATS (a, FE_INVALID, FE_INVALID, NAN, unsigned long long); + TESTEXCEPT_ALLFLOATS (a, FE_INVALID, FE_INVALID, -NAN, unsigned long long); + + /* Negative values >-1.0 must not cause FE_INVALID. */ + TESTEXCEPT_ALLFLOATS (a, FE_INVALID, 0, -0x0.ffp0, unsigned long long); + /* -1.0 instead must. */ + TESTEXCEPT_ALLFLOATS (a, FE_INVALID, FE_INVALID, -0x1.0p+0, unsigned long long); + TESTEXCEPT_ALLFLOATS (a, FE_INVALID, 0, 0x1.0p+63, unsigned long long); + TESTEXCEPT_ALLFLOATS (a, FE_INVALID, FE_INVALID, 0x1.0p+64, unsigned long long); + + /* To signed long long */ + + TESTEXCEPT_ALLFLOATS (u, FE_INVALID, FE_INVALID, INFINITY, long long); + TESTEXCEPT_ALLFLOATS (u, FE_INVALID, FE_INVALID, -INFINITY, long long); + TESTEXCEPT_ALLFLOATS (u, FE_INVALID, FE_INVALID, NAN, long long); + TESTEXCEPT_ALLFLOATS (u, FE_INVALID, FE_INVALID, -NAN, long long); + + TESTEXCEPT_ALLFLOATS (u, FE_INVALID, 0, -0x1.0p+63, long long); + TESTEXCEPT_ALLFLOATS (u, FE_INVALID, FE_INVALID, -0x1.1p+63, long long); + TESTEXCEPT_ALLFLOATS (u,
Re: [PATCH] Fix ubsan expansion of multiplication (PR rtl-optimization/60030)
On Thu, Feb 06, 2014 at 06:53:55AM -0800, Richard Henderson wrote: > On 02/04/2014 04:40 AM, Jakub Jelinek wrote: > > - tem = expand_shift (LSHIFT_EXPR, mode, lopart, hprec, NULL_RTX, 1); > > + tem = gen_rtx_SUBREG (mode, lopart, 0); > > + tem = expand_shift (LSHIFT_EXPR, mode, tem, hprec, NULL_RTX, 1); > > I would be happier with gen_lowpart rather than the explicit gen_rtx_subreg. I need a paradoxical subreg, gen_lowpart ICEs in that case (for ppc* here lopart is SImode, I need DImode paradoxical subreg of that, so that I can shift it up by 32 bits and thus get a DImode value with the lopart bits in the upper 32 bits and zero in the rest. Jakub
Re: [PATCH] Add alloc_align and assume_aligned attributes (PR middle-end/60092)
On Thu, 6 Feb 2014, Jakub Jelinek wrote: +/* Handle a "alloc_align" attribute; arguments as in + struct attribute_spec.handler. */ + +static tree +handle_alloc_align_attribute (tree *node, tree ARG_UNUSED (name), tree args, + int, bool *no_add_attrs) +{ + unsigned arg_count = type_num_arguments (*node); + tree position = TREE_VALUE (args); + if (position && TREE_CODE (position) != IDENTIFIER_NODE + && TREE_CODE (position) != FUNCTION_DECL) +position = default_conversion (position); The FUNCTION_DECL test is only useful when there are 2+ arguments to the builtin (but it doesn't hurt). -- Marc Glisse
Re: [PATCH] Fix ARM dwarf2cfi ICE and unwind info issues (PR target/59575)
Hi Jakub, 2014-01-30 Jakub Jelinek PR target/59575 * config/arm/arm.c (emit_multi_reg_push): Add dwarf_regs_mask argument, don't record in REG_FRAME_RELATED_EXPR registers not set in that bitmask. (arm_expand_prologue): Adjust all callers. (arm_unwind_emit_sequence): Allow saved, but not important for unwind info, registers also at the lowest numbered registers side. Use gcc_assert instead of abort, and SET_SRC/SET_DEST macros instead of XEXP. * gcc.target/arm/pr59575.c: New test. Approved - please apply. Cheers Nick
[PATCH] Add alloc_align and assume_aligned attributes (PR middle-end/60092)
Hi! As discussed on IRC, this patch introduces two new attributes, so that the C library (and other headers) have a way to a) tell the compiler something about functions like aligned_alloc or memalign b) tell the compiler the alignment of pointers returned say by malloc Ok for trunk if bootstrap/regtest passes? 2014-02-06 Jakub Jelinek PR middle-end/60092 * tree-ssa-ccp.c (surely_varying_stmt_p): Don't return true if TYPE_ATTRIBUTES (gimple_call_fntype ()) contain assume_aligned or alloc_align attributes. (bit_value_alloc_assume_aligned_attribute): New function. (evaluate_stmt): Handle calls to functions with assume_aligned or alloc_align attributes. * doc/extend.texi: Document assume_aligned and alloc_align attributes. c-family/ * c-common.c (handle_alloc_align_attribute, handle_assume_aligned_attribute): New functions. (c_common_attribute_table): Add alloc_align and assume_aligned attributes. testsuite/ * gcc.dg/attr-alloc_align-1.c: New test. * gcc.dg/attr-alloc_align-2.c: New test. * gcc.dg/attr-alloc_align-3.c: New test. * gcc.dg/attr-assume_aligned-1.c: New test. * gcc.dg/attr-assume_aligned-2.c: New test. * gcc.dg/attr-assume_aligned-3.c: New test. --- gcc/c-family/c-common.c.jj 2014-02-05 10:37:58.0 +0100 +++ gcc/c-family/c-common.c 2014-02-06 15:35:15.707333771 +0100 @@ -366,6 +366,8 @@ static tree handle_warn_unused_result_at static tree handle_sentinel_attribute (tree *, tree, tree, int, bool *); static tree handle_type_generic_attribute (tree *, tree, tree, int, bool *); static tree handle_alloc_size_attribute (tree *, tree, tree, int, bool *); +static tree handle_alloc_align_attribute (tree *, tree, tree, int, bool *); +static tree handle_assume_aligned_attribute (tree *, tree, tree, int, bool *); static tree handle_target_attribute (tree *, tree, tree, int, bool *); static tree handle_optimize_attribute (tree *, tree, tree, int, bool *); static tree ignore_attribute (tree *, tree, tree, int, bool *); @@ -766,6 +768,10 @@ const struct attribute_spec c_common_att handle_omp_declare_simd_attribute, false }, { "omp declare target", 0, 0, true, false, false, handle_omp_declare_target_attribute, false }, + { "alloc_align", 1, 1, false, true, true, + handle_alloc_align_attribute, false }, + { "assume_aligned",1, 2, false, true, true, + handle_assume_aligned_attribute, false }, { NULL, 0, 0, false, false, false, NULL, false } }; @@ -8046,13 +8052,64 @@ handle_alloc_size_attribute (tree *node, if (TREE_CODE (position) != INTEGER_CST || TREE_INT_CST_HIGH (position) || TREE_INT_CST_LOW (position) < 1 - || TREE_INT_CST_LOW (position) > arg_count ) + || TREE_INT_CST_LOW (position) > arg_count) { warning (OPT_Wattributes, "alloc_size parameter outside range"); *no_add_attrs = true; return NULL_TREE; } +} + return NULL_TREE; +} + +/* Handle a "alloc_align" attribute; arguments as in + struct attribute_spec.handler. */ + +static tree +handle_alloc_align_attribute (tree *node, tree ARG_UNUSED (name), tree args, + int, bool *no_add_attrs) +{ + unsigned arg_count = type_num_arguments (*node); + tree position = TREE_VALUE (args); + if (position && TREE_CODE (position) != IDENTIFIER_NODE + && TREE_CODE (position) != FUNCTION_DECL) +position = default_conversion (position); + + if (TREE_CODE (position) != INTEGER_CST + || TREE_INT_CST_HIGH (position) + || TREE_INT_CST_LOW (position) < 1 + || TREE_INT_CST_LOW (position) > arg_count) +{ + warning (OPT_Wattributes, + "alloc_align parameter outside range"); + *no_add_attrs = true; + return NULL_TREE; +} + return NULL_TREE; +} + +/* Handle a "assume_aligned" attribute; arguments as in + struct attribute_spec.handler. */ + +static tree +handle_assume_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args, +int, bool *no_add_attrs) +{ + for (; args; args = TREE_CHAIN (args)) +{ + tree position = TREE_VALUE (args); + if (position && TREE_CODE (position) != IDENTIFIER_NODE + && TREE_CODE (position) != FUNCTION_DECL) + position = default_conversion (position); + + if (TREE_CODE (position) != INTEGER_CST) + { + warning (OPT_Wattributes, + "assume_aligned parameter not integer constant"); + *no_add_attrs = true; + return NULL_TREE; + } } return NULL_TREE; } --- gcc/tree-ssa-ccp.c.jj 2014-01-03 11:40:46.0 +0100 +++ gcc/tree-ssa-ccp.c 2014-02-06 16:16:27.260996887 +010
Re: [C,C++] integer constants in attribute arguments
On Wed, 5 Feb 2014, Andreas Schwab wrote: Marc Glisse writes: On Wed, 5 Feb 2014, Andreas Schwab wrote: Wrt. the function decl it should probably just be changed to a variable decl. Alignments on function decls are kind of exotic. Good idea. How about this patch then? LGTM. Done at r207561. 2014-02-06 Marc Glisse * g++.dg/cpp0x/constexpr-attribute2.C: Restrict to target init_priority. Test alignment of variable instead of function. -- Marc Glisse
Re: RFA: MN10300: Include saved registers in stack usage
Hi Jeff, According to our coding conventions, the ability to build with something other than gcc is still desirable. You could argue that you're unlikely to be bootstrapping on a mn103 with something other than GCC and if you're building a cross, you could start by first building gcc native. However, it's pretty easy to avoid the headaches and just provide a popcount routine. OK, here is a version of the patch with a homebrew popcount() function in it. OK to apply ? Cheers Nick Index: gcc/config/mn10300/mn10300.c === --- gcc/config/mn10300/mn10300.c(revision 207529) +++ gcc/config/mn10300/mn10300.c(working copy) @@ -741,17 +741,32 @@ F (emit_insn (x)); } +static inline unsigned int +popcount (unsigned int mask) +{ + unsigned int count = 0; + + while (mask) +{ + ++ count; + mask &= ~ (mask & - mask); +} + return count; +} + void mn10300_expand_prologue (void) { HOST_WIDE_INT size = mn10300_frame_size (); + unsigned int mask; + mask = mn10300_get_live_callee_saved_regs (NULL); + /* If we use any of the callee-saved registers, save them now. */ + mn10300_gen_multiple_store (mask); + if (flag_stack_usage_info) -current_function_static_stack_size = size; +current_function_static_stack_size = size + popcount (mask) * 4; - /* If we use any of the callee-saved registers, save them now. */ - mn10300_gen_multiple_store (mn10300_get_live_callee_saved_regs (NULL)); - if (TARGET_AM33_2 && fp_regs_to_save ()) { int num_regs_to_save = fp_regs_to_save (), i; @@ -767,6 +782,9 @@ unsigned int strategy_size = (unsigned)-1, this_strategy_size; rtx reg; + if (flag_stack_usage_info) + current_function_static_stack_size += num_regs_to_save * 4; + /* We have several different strategies to save FP registers. We can store them using SP offsets, which is beneficial if there are just a few registers to save, or we can use `a0' in
Re: [PATCH] Fix ARM dwarf2cfi ICE and unwind info issues (PR target/59575)
On Thu, Jan 30, 2014 at 8:38 PM, Jakub Jelinek wrote: > Hi! > > For -Os, apparently ARM backend sometimes decides to push (up to 8?) dummy > registers to stack in addition to the registers that actually need to be > saved, in order to avoid extra instruction to adjust stack pointer. > These registers shouldn't be mentioned for .debug_frame though (both because > it breaks assumptions of dwarf2cfi and because it doesn't make sense), they > are either call clobbered which don't need saving (r0-r3) or for r4-r7 would > be dummy if they aren't ever modified in the current function, again, there > is no point in saving them. Even for ARM unwind info I think it doesn't > make sense to mention them in the unwind info, the patch instead adds .pad > #NNN > directive after the .save to adjust sp correspondingly. > > Kyrill has kindly bootstrapped/regtested this on arm, ok for trunk? This is OK with a small comment on top of arm_unwind_emit_sequence just describing the reasoning here. I was trying to convince myself that the changes for the unwind information was safe and yes it follows similar to what you describe. regards Ramana > > 2014-01-30 Jakub Jelinek > > PR target/59575 > * config/arm/arm.c (emit_multi_reg_push): Add dwarf_regs_mask > argument, > don't record in REG_FRAME_RELATED_EXPR registers not set in that > bitmask. > (arm_expand_prologue): Adjust all callers. > (arm_unwind_emit_sequence): Allow saved, but not important for unwind > info, registers also at the lowest numbered registers side. Use > gcc_assert instead of abort, and SET_SRC/SET_DEST macros instead of > XEXP. > > * gcc.target/arm/pr59575.c: New test. > > --- gcc/config/arm/arm.c.jj 2014-01-29 10:21:11.448031616 +0100 > +++ gcc/config/arm/arm.c2014-01-29 15:32:22.246381587 +0100 > @@ -177,7 +177,7 @@ static rtx arm_expand_builtin (tree, rtx > static tree arm_builtin_decl (unsigned, bool); > static void emit_constant_insn (rtx cond, rtx pattern); > static rtx emit_set_insn (rtx, rtx); > -static rtx emit_multi_reg_push (unsigned long); > +static rtx emit_multi_reg_push (unsigned long, unsigned long); > static int arm_arg_partial_bytes (cumulative_args_t, enum machine_mode, > tree, bool); > static rtx arm_function_arg (cumulative_args_t, enum machine_mode, > @@ -19573,28 +19573,33 @@ arm_emit_strd_push (unsigned long saved_ > /* Generate and emit an insn that we will recognize as a push_multi. > Unfortunately, since this insn does not reflect very well the actual > semantics of the operation, we need to annotate the insn for the benefit > - of DWARF2 frame unwind information. */ > + of DWARF2 frame unwind information. DWARF_REGS_MASK is a subset of > + MASK for registers that should be annotated for DWARF2 frame unwind > + information. */ > static rtx > -emit_multi_reg_push (unsigned long mask) > +emit_multi_reg_push (unsigned long mask, unsigned long dwarf_regs_mask) > { >int num_regs = 0; > - int num_dwarf_regs; > + int num_dwarf_regs = 0; >int i, j; >rtx par; >rtx dwarf; >int dwarf_par_index; >rtx tmp, reg; > > + /* We don't record the PC in the dwarf frame information. */ > + dwarf_regs_mask &= ~(1 << PC_REGNUM); > + >for (i = 0; i <= LAST_ARM_REGNUM; i++) > -if (mask & (1 << i)) > - num_regs++; > +{ > + if (mask & (1 << i)) > + num_regs++; > + if (dwarf_regs_mask & (1 << i)) > + num_dwarf_regs++; > +} > >gcc_assert (num_regs && num_regs <= 16); > - > - /* We don't record the PC in the dwarf frame information. */ > - num_dwarf_regs = num_regs; > - if (mask & (1 << PC_REGNUM)) > -num_dwarf_regs--; > + gcc_assert ((dwarf_regs_mask & ~mask) == 0); > >/* For the body of the insn we are going to generate an UNSPEC in > parallel with several USEs. This allows the insn to be recognized > @@ -19660,14 +19665,13 @@ emit_multi_reg_push (unsigned long mask) >gen_rtvec (1, reg), >UNSPEC_PUSH_MULT)); > > - if (i != PC_REGNUM) > + if (dwarf_regs_mask & (1 << i)) > { > tmp = gen_rtx_SET (VOIDmode, > gen_frame_mem (SImode, stack_pointer_rtx), > reg); > RTX_FRAME_RELATED_P (tmp) = 1; > - XVECEXP (dwarf, 0, dwarf_par_index) = tmp; > - dwarf_par_index++; > + XVECEXP (dwarf, 0, dwarf_par_index++) = tmp; > } > > break; > @@ -19682,7 +19686,7 @@ emit_multi_reg_push (unsigned long mask) > > XVECEXP (par, 0, j) = gen_rtx_USE (VOIDmode, reg); > > - if (i != PC_REGNUM) > + if (dwarf_regs_mask & (1 << i)) > { > tmp > = gen_rtx_SET (VOIDmode, > @@ -20689,7 +20693,7 @@ arm_expand_pr
Re: var-tracking and s390's LM(G)
On 02/06/2014 06:48 AM, Richard Sandiford wrote: > Richard Henderson writes: >> On 02/06/2014 01:55 AM, Richard Sandiford wrote: >>> OK, I agree that's not 4.9 material. What about the other change >>> of replacing: >>> >>>REF_CFA_DEF_CFA (plus (stack_pointer_rtx) (const_int 160/96)) >>> >>> with: >>> >>>REF_CFA_ADJUST_CFA (set (stack_pointer_rtx) >>>(plus (current_cfa_base) (const_int offset))) >>> >>> ? That works on its own, but having both a REG_CFA_ADJUST_CFA that assigns >>> to stack_pointer_rtx and a REG_CFA_RESTORE for stack_pointer_rtx feels like >>> a double assignment. >> >> It does seem like it. I suppose it would be easy to suppress the RESTORE of >> the stack pointer, without changing the save at all. > > But if having a restore in the presence of a save doesn't matter, why do > we have restores for the other registers? If the idea is that we never > care what the CFI state is after the LM(G) then why not omit all of them? > > Or do you mean that REG_CFA_ADJUST_CFA would act as a REG_CFA_RESTORE too, > if there had been a previous save? Hum. Your response does make it clear that I may need more coffee. What I wrote above doesn't really make sense. r~
Re: [PATCH] Fix ARM dwarf2cfi ICE and unwind info issues (PR target/59575)
On 01/30/2014 12:38 PM, Jakub Jelinek wrote: > 2014-01-30 Jakub Jelinek > > PR target/59575 > * config/arm/arm.c (emit_multi_reg_push): Add dwarf_regs_mask argument, > don't record in REG_FRAME_RELATED_EXPR registers not set in that > bitmask. > (arm_expand_prologue): Adjust all callers. > (arm_unwind_emit_sequence): Allow saved, but not important for unwind > info, registers also at the lowest numbered registers side. Use > gcc_assert instead of abort, and SET_SRC/SET_DEST macros instead of > XEXP. > > * gcc.target/arm/pr59575.c: New test. Give the ARM guys another 24 hours to comment, but as I said in the PR I think the patch is good. r~
Re: [PATCH] Fix ubsan expansion of multiplication (PR rtl-optimization/60030)
On 02/04/2014 04:40 AM, Jakub Jelinek wrote: > - tem = expand_shift (LSHIFT_EXPR, mode, lopart, hprec, NULL_RTX, 1); > + tem = gen_rtx_SUBREG (mode, lopart, 0); > + tem = expand_shift (LSHIFT_EXPR, mode, tem, hprec, NULL_RTX, 1); I would be happier with gen_lowpart rather than the explicit gen_rtx_subreg. Ok with that change. r~
Re: [PATCH] Small var-tracking improvement (PR debug/59992)
> 2014-01-30 Jakub Jelinek > > PR debug/59992 > * var-tracking.c (adjust_mems): Before adding a SET > to amd->side_effects, adjust it's SET_SRC using > simplify_replace_fn_rtx. > > * gcc.dg/pr59992.c: New test. Ok. r~
Re: var-tracking and s390's LM(G)
Richard Henderson writes: > On 02/06/2014 01:55 AM, Richard Sandiford wrote: >> OK, I agree that's not 4.9 material. What about the other change >> of replacing: >> >>REF_CFA_DEF_CFA (plus (stack_pointer_rtx) (const_int 160/96)) >> >> with: >> >>REF_CFA_ADJUST_CFA (set (stack_pointer_rtx) >>(plus (current_cfa_base) (const_int offset))) >> >> ? That works on its own, but having both a REG_CFA_ADJUST_CFA that assigns >> to stack_pointer_rtx and a REG_CFA_RESTORE for stack_pointer_rtx feels like >> a double assignment. > > It does seem like it. I suppose it would be easy to suppress the RESTORE of > the stack pointer, without changing the save at all. But if having a restore in the presence of a save doesn't matter, why do we have restores for the other registers? If the idea is that we never care what the CFI state is after the LM(G) then why not omit all of them? Or do you mean that REG_CFA_ADJUST_CFA would act as a REG_CFA_RESTORE too, if there had been a previous save? Thanks, Richard
Re: [PATCH] PR60092 - lower posix_memalign to make align-info accessible
On Thu, 6 Feb 2014, Richard Biener wrote: > > This re-writes posix_memalign calls to > > posix_memalign (ptr, align, size); > tem = *ptr; > tem = __builtin_assume_aligned (align); > *ptr = tem; > > during CF lowering (yeah, ok ...) to make alignment info accessible > to SSA based analysis. > > I have to adjust the added alias-31.c testcase again because with > the above we end up with > > : > res_3 = *p_2(D); > posix_memalign (&q.q1, 128, 512); > _5 = MEM[(void *)&q]; > _6 = __builtin_assume_aligned (_5, 128); > MEM[(void *)&q] = _6; > posix_memalign (&q.q2, 128, 512); > _17 = res_3 + res_3; > _20 = _17 + 1; > _23 = _20 + 2; > q ={v} {CLOBBER}; > return _23; > > after early DCE. This is because DCE only has "baby" DSE built-in > and the store to MEM[(void *)&q] which it doesn't remove keeps > the rest live. DSE removes the store and the DCE following it > the rest. > > Not sure if more sophisticated lowering is wanted here. Special-casing > &... operands to posix_memalign as stated in the PR, generating > for posix_memalign (&ptr, 128, 512); > > posix_memalign (&tem, 128, 512); > reg = tem; > reg = __builtin_assume_aligned (reg, 128); > ptr = reg; > > instead would be possible (hoping for ptr to become non-address-taken). Ok, doing that was simple and avoids pessimizing the testcase. Richard.
[PATCH] PR60092 - lower posix_memalign to make align-info accessible
This re-writes posix_memalign calls to posix_memalign (ptr, align, size); tem = *ptr; tem = __builtin_assume_aligned (align); *ptr = tem; during CF lowering (yeah, ok ...) to make alignment info accessible to SSA based analysis. I have to adjust the added alias-31.c testcase again because with the above we end up with : res_3 = *p_2(D); posix_memalign (&q.q1, 128, 512); _5 = MEM[(void *)&q]; _6 = __builtin_assume_aligned (_5, 128); MEM[(void *)&q] = _6; posix_memalign (&q.q2, 128, 512); _17 = res_3 + res_3; _20 = _17 + 1; _23 = _20 + 2; q ={v} {CLOBBER}; return _23; after early DCE. This is because DCE only has "baby" DSE built-in and the store to MEM[(void *)&q] which it doesn't remove keeps the rest live. DSE removes the store and the DCE following it the rest. Not sure if more sophisticated lowering is wanted here. Special-casing &... operands to posix_memalign as stated in the PR, generating for posix_memalign (&ptr, 128, 512); posix_memalign (&tem, 128, 512); reg = tem; reg = __builtin_assume_aligned (reg, 128); ptr = reg; instead would be possible (hoping for ptr to become non-address-taken). Bootstrap / regtest on x86_64-unknown-linux-gnu pending. Thanks, Richard. 2014-02-06 Richard Biener PR middle-end/60092 * gimple-low.c (lower_builtin_posix_memalign): New function. (lower_stmt): Call it to lower posix_memalign in a way to make alignment info accessible. * gcc.dg/tree-ssa/alias-31.c: Adjust. * gcc.dg/vect/pr60092-2.c: New testcase. Index: trunk/gcc/gimple-low.c === *** trunk.orig/gcc/gimple-low.c 2014-02-06 15:06:39.013419315 +0100 --- trunk/gcc/gimple-low.c 2014-02-06 15:12:39.116394523 +0100 *** static void lower_gimple_bind (gimple_st *** 83,88 --- 83,89 static void lower_try_catch (gimple_stmt_iterator *, struct lower_data *); static void lower_gimple_return (gimple_stmt_iterator *, struct lower_data *); static void lower_builtin_setjmp (gimple_stmt_iterator *); + static void lower_builtin_posix_memalign (gimple_stmt_iterator *); /* Lower the body of current_function_decl from High GIMPLE into Low *** lower_stmt (gimple_stmt_iterator *gsi, s *** 327,338 } if (decl ! && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL ! && DECL_FUNCTION_CODE (decl) == BUILT_IN_SETJMP) { ! lower_builtin_setjmp (gsi); ! data->cannot_fallthru = false; ! return; } if (decl && (flags_from_decl_or_type (decl) & ECF_NORETURN)) --- 328,346 } if (decl ! && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL) { ! if (DECL_FUNCTION_CODE (decl) == BUILT_IN_SETJMP) ! { ! lower_builtin_setjmp (gsi); ! data->cannot_fallthru = false; ! return; ! } ! else if (DECL_FUNCTION_CODE (decl) == BUILT_IN_POSIX_MEMALIGN) ! { ! lower_builtin_posix_memalign (gsi); ! return; ! } } if (decl && (flags_from_decl_or_type (decl) & ECF_NORETURN)) *** lower_builtin_setjmp (gimple_stmt_iterat *** 771,776 --- 779,812 /* Remove the call to __builtin_setjmp. */ gsi_remove (gsi, false); } + + /* Lower calls to posix_memalign to + posix_memalign (ptr, align, size); + tem = *ptr; + tem = __builtin_assume_aligned (align); + *ptr = tem; +so we can get at the alignment of *ptr in CCP. */ + + static void + lower_builtin_posix_memalign (gimple_stmt_iterator *gsi) + { + gimple stmt = gsi_stmt (*gsi); + tree pptr = gimple_call_arg (stmt, 0); + tree align = gimple_call_arg (stmt, 1); + tree ptr = create_tmp_var (ptr_type_node, NULL); + stmt = gimple_build_assign (ptr, + fold_build2 (MEM_REF, ptr_type_node, pptr, + build_int_cst (ptr_type_node, 0))); + gsi_insert_after (gsi, stmt, GSI_NEW_STMT); + stmt = gimple_build_call (builtin_decl_implicit (BUILT_IN_ASSUME_ALIGNED), + 2, ptr, align); + gimple_call_set_lhs (stmt, ptr); + gsi_insert_after (gsi, stmt, GSI_NEW_STMT); + stmt = gimple_build_assign (fold_build2 (MEM_REF, ptr_type_node, pptr, + build_int_cst (ptr_type_node, 0)), + ptr); + gsi_insert_after (gsi, stmt, GSI_NEW_STMT); + } /* Record the variables in VARS into function FN. */ Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-31.c === *** trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/alias-31.c 2014-02-06 15:11:55.881397499 +0100 --- trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-31.c 2014-02-06 15:12:39.1173
Re: [PATCH] PR60092, basic support for posix_malloc
On Thu, 6 Feb 2014, Jakub Jelinek wrote: > On Thu, Feb 06, 2014 at 02:21:01PM +0100, Richard Biener wrote: > > + /* We marking allocated storage local, we deal with it becoming > > +global by escaping and setting of vars_contains_escaped_heap. */ > > Did you mean By marking ..., or something else? "We are marking ..." cut&pasted and fixed both cases. > > + extern int posix_memalign(void **memptr, > > + __SIZE_TYPE__ alignment, __SIZE_TYPE__ size); > > + > > + int foo (int *p) > > + { > > + int res = *p; > > + int *q; > > + posix_memalign ((void **)&q, 128, 128 * sizeof (int)); > > Do you really want to have strict aliasing violations in the testcase? > I think one has to take address of a void * variable and if you want > int *, then cast to int * afterwards. Sure, I'll fix that (it doesn't matter for GCC luckily ;)). I still expect user code to be lazy ... (which is why GCC treats all pointer types as having the same alias set) > Also, I think for posix_memalign > you really should be checking return value of the function. Yeah. Like the following - will commit after re-bootstrapping together with the 3rd followup. Richard. 2014-02-06 Richard Biener PR middle-end/60092 * builtin-types.def (BT_FN_INT_PTRPTR_SIZE_SIZE): Add. * builtins.def (BUILT_IN_POSIX_MEMALIGN): Likewise. * tree-ssa-structalias.c (find_func_aliases_for_builtin_call): Handle BUILT_IN_POSIX_MEMALIGN. (find_func_clobbers): Likewise. * tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Likewise. (call_may_clobber_ref_p_1): Likewise. * gcc.dg/tree-ssa/alias-30.c: New testcase. * gcc.dg/tree-ssa/alias-31.c: Likewise. Index: gcc/builtin-types.def === *** gcc/builtin-types.def.orig 2014-02-06 12:43:09.585012064 +0100 --- gcc/builtin-types.def 2014-02-06 12:43:34.435010353 +0100 *** DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_ *** 429,434 --- 429,435 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, BT_INT) DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, BT_I8, BT_INT) DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT, BT_VOID, BT_VOLATILE_PTR, BT_I16, BT_INT) + DEF_FUNCTION_TYPE_3 (BT_FN_INT_PTRPTR_SIZE_SIZE, BT_INT, BT_PTR_PTR, BT_SIZE, BT_SIZE) DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR, BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR) Index: gcc/builtins.def === *** gcc/builtins.def.orig 2014-02-06 12:43:09.586012064 +0100 --- gcc/builtins.def2014-02-06 15:06:39.716419267 +0100 *** DEF_GCC_BUILTIN(BUILT_IN_POPCOUN *** 755,760 --- 755,761 DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTIMAX, "popcountimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTL, "popcountl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTLL, "popcountll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST) + DEF_EXT_LIB_BUILTIN(BUILT_IN_POSIX_MEMALIGN, "posix_memalign", BT_FN_INT_PTRPTR_SIZE_SIZE, ATTR_NOTHROW_NONNULL_LEAF) DEF_GCC_BUILTIN(BUILT_IN_PREFETCH, "prefetch", BT_FN_VOID_CONST_PTR_VAR, ATTR_NOVOPS_LEAF_LIST) DEF_LIB_BUILTIN(BUILT_IN_REALLOC, "realloc", BT_FN_PTR_PTR_SIZE, ATTR_NOTHROW_LEAF_LIST) DEF_GCC_BUILTIN(BUILT_IN_RETURN, "return", BT_FN_VOID_PTR, ATTR_NORETURN_NOTHROW_LEAF_LIST) Index: gcc/tree-ssa-structalias.c === *** gcc/tree-ssa-structalias.c.orig 2014-02-06 12:43:09.597012063 +0100 --- gcc/tree-ssa-structalias.c 2014-02-06 15:07:22.987416288 +0100 *** handle_lhs_call (gimple stmt, tree lhs, *** 3982,3988 struct constraint_expr tmpc; rhsc.create (0); vi = make_heapvar ("HEAP"); ! /* We marking allocated storage local, we deal with it becoming global by escaping and setting of vars_contains_escaped_heap. */ DECL_EXTERNAL (vi->decl) = 0; vi->is_global_var = 0; --- 3982,3988 struct constraint_expr tmpc; rhsc.create (0); vi = make_heapvar ("HEAP"); ! /* We are marking allocated storage local, we deal with it becoming global by escaping and setting of vars_contains_escaped_heap. */ DECL_EXTERNAL (vi->decl) = 0; vi->is_global_var = 0; *** find_func_aliases_for_builtin_call (gimp *** 4231,4236 --- 4231,4256 lhsc.release (); return true; } + case BUILT_IN_POSIX_MEMALIGN: + { + tree ptrptr = gimple_call_arg (t, 0); + get_constraint_for (ptrptr, &lhsc); + do_deref (&lhsc); + varinfo_t vi = make_heapvar ("HEAP"); +
Re: [PATCH] PR60092, basic support for posix_malloc
Richard Biener writes: > + case BUILT_IN_POSIX_MEMALIGN: > + { > + tree ptrptr = gimple_call_arg (t, 0); > + get_constraint_for (ptrptr, &lhsc); > + do_deref (&lhsc); > + varinfo_t vi = make_heapvar ("HEAP"); > + /* We marking allocated storage local, we deal with it becoming s/We/When/? Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
Re: [PATCH] PR60092, basic support for posix_malloc
On Thu, Feb 06, 2014 at 02:21:01PM +0100, Richard Biener wrote: > + /* We marking allocated storage local, we deal with it becoming > + global by escaping and setting of vars_contains_escaped_heap. */ Did you mean By marking ..., or something else? > + extern int posix_memalign(void **memptr, > + __SIZE_TYPE__ alignment, __SIZE_TYPE__ size); > + > + int foo (int *p) > + { > + int res = *p; > + int *q; > + posix_memalign ((void **)&q, 128, 128 * sizeof (int)); Do you really want to have strict aliasing violations in the testcase? I think one has to take address of a void * variable and if you want int *, then cast to int * afterwards. Also, I think for posix_memalign you really should be checking return value of the function. Otherwise LGTM. Jakub
Re: var-tracking and s390's LM(G)
On 02/06/2014 01:55 AM, Richard Sandiford wrote: > OK, I agree that's not 4.9 material. What about the other change > of replacing: > >REF_CFA_DEF_CFA (plus (stack_pointer_rtx) (const_int 160/96)) > > with: > >REF_CFA_ADJUST_CFA (set (stack_pointer_rtx) >(plus (current_cfa_base) (const_int offset))) > > ? That works on its own, but having both a REG_CFA_ADJUST_CFA that assigns > to stack_pointer_rtx and a REG_CFA_RESTORE for stack_pointer_rtx feels like > a double assignment. It does seem like it. I suppose it would be easy to suppress the RESTORE of the stack pointer, without changing the save at all. > Personally I'd prefer to leave the REG_CFA_DEF_CFA > as-is for now and change all three (the save, the restore and the CFA > definition) at the same time. Yeah, well... > * var-tracking.c (insn_stack_adjust_offset_pre_post): Handle > REG_CFA_ADJUST_CFA. > * config/s390/s390.c (s390_add_restore_cfa_note): New function. > (s390_emit_epilogue): Use it. Looks fine to me, as far as it goes. r~
Re: [C PATCH] Use warning_at (PR c/60087)
On Thu, Feb 6, 2014 at 1:24 PM, Marek Polacek wrote: > An obvious one, use location if available. Can this be considered as > a "docfix" though, thus ok for trunk? (Not a regression.) > > Regtested/bootstrapped on x86_64-linux. Ok. Thanks, Richard. > 2014-02-06 Marek Polacek > > PR c/60087 > c-family/ > * c-common.c (warn_for_sign_compare): Call warning_at with location > instead of warning. > testsuite/ > * gcc.dg/pr60087.c: New test. > > diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c > index 007e727..50cc848 100644 > --- gcc/c-family/c-common.c > +++ gcc/c-family/c-common.c > @@ -11285,8 +11285,8 @@ warn_for_sign_compare (location_t location, >if ((mask & constant) != mask) > { > if (constant == 0) > - warning (OPT_Wsign_compare, > -"promoted ~unsigned is always non-zero"); > + warning_at (location, OPT_Wsign_compare, > + "promoted ~unsigned is always non-zero"); > else > warning_at (location, OPT_Wsign_compare, > "comparison of promoted ~unsigned with > constant"); > diff --git gcc/testsuite/gcc.dg/pr60087.c gcc/testsuite/gcc.dg/pr60087.c > index e69de29..9cdd589 100644 > --- gcc/testsuite/gcc.dg/pr60087.c > +++ gcc/testsuite/gcc.dg/pr60087.c > @@ -0,0 +1,14 @@ > +/* PR c/60087 */ > +/* { dg-do compile } */ > +/* { dg-options "-Wsign-compare" } */ > + > +void > +foo (unsigned int ui, int i) > +{ > + const unsigned char uc = 0; > + _Bool b; > + b = 0 != ~uc; /* { dg-warning "9:promoted ~unsigned is always non-zero" } > */ > + b = 2 != ~uc; /* { dg-warning "9:comparison of promoted ~unsigned with > constant" } */ > + b = uc == ~uc; /* { dg-warning "10:comparison of promoted ~unsigned with > unsigned" } */ > + b = i == ui; /* { dg-warning "9:comparison between signed and unsigned > integer expressions" } */ > +} > > Marek
[PATCH] PR60092, add C11 aligned_alloc handling
This adds a builtin for C11 aligned_alloc and support for it in the alias and alignment tracking machinery. Bootstrap and regtest in progress on x86_64-unknown-linux-gnu. Ok for trunk? Thanks, Richard. 2014-02-06 Richard Biener PR middle-end/60092 * builtins.def (DEF_C11_BUILTIN): Add. (BUILT_IN_ALIGNED_ALLOC): Likewise. * coretypes.h (enum function_class): Add function_c11_misc. * tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Handle BUILT_IN_ALIGNED_ALLOC like BUILT_IN_MALLOC. (call_may_clobber_ref_p_1): Likewise. * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Likewise. (mark_all_reaching_defs_necessary_1): Likewise. (propagate_necessity): Likewise. (eliminate_unnecessary_stmts): Likewise. * tree-ssa-ccp.c (evaluate_stmt): Handle BUILT_IN_ALIGNED_ALLOC. ada/ * gcc-interface/utils.c: Define flag_isoc11. lto/ * lto-lang.c: Define flag_isoc11. * gcc.dg/tree-ssa/alias-32.c: New testcase. * gcc.dg/vect/pr60092.c: Likewise. Index: trunk/gcc/builtins.def === *** trunk.orig/gcc/builtins.def 2014-02-06 12:46:01.085000256 +0100 --- trunk/gcc/builtins.def 2014-02-06 13:13:06.499888349 +0100 *** along with GCC; see the file COPYING3. *** 111,116 --- 111,123 DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE, \ true, true, !flag_isoc99, ATTRS, targetm.libc_has_function (function_c99_misc), true) + /* Like DEF_LIB_BUILTIN, except that the function is only a part of +the standard in C11 or above. */ + #undef DEF_C11_BUILTIN + #define DEF_C11_BUILTIN(ENUM, NAME, TYPE, ATTRS) \ + DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE, \ + true, true, !flag_isoc11, ATTRS, targetm.libc_has_function (function_c11_misc), true) + /* Like DEF_C99_BUILTIN, but for complex math functions. */ #undef DEF_C99_COMPL_BUILTIN #define DEF_C99_COMPL_BUILTIN(ENUM, NAME, TYPE, ATTRS)\ *** DEF_C99_BUILTIN(BUILT_IN_ACOSH, *** 223,228 --- 230,236 DEF_C99_BUILTIN(BUILT_IN_ACOSHF, "acoshf", BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO) DEF_C99_BUILTIN(BUILT_IN_ACOSHL, "acoshl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO) DEF_C99_C90RES_BUILTIN (BUILT_IN_ACOSL, "acosl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO) + DEF_C11_BUILTIN(BUILT_IN_ALIGNED_ALLOC, "aligned_alloc", BT_FN_PTR_SIZE_SIZE, ATTR_MALLOC_NOTHROW_LIST) DEF_LIB_BUILTIN(BUILT_IN_ASIN, "asin", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO) DEF_C99_C90RES_BUILTIN (BUILT_IN_ASINF, "asinf", BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO) DEF_C99_BUILTIN(BUILT_IN_ASINH, "asinh", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING) Index: trunk/gcc/coretypes.h === *** trunk.orig/gcc/coretypes.h 2014-01-07 10:20:00.549453955 +0100 --- trunk/gcc/coretypes.h 2014-02-06 13:10:03.472900950 +0100 *** enum function_class { *** 194,200 function_c94, function_c99_misc, function_c99_math_complex, ! function_sincos }; /* Memory model types for the __atomic* builtins. --- 194,201 function_c94, function_c99_misc, function_c99_math_complex, ! function_sincos, ! function_c11_misc }; /* Memory model types for the __atomic* builtins. Index: trunk/gcc/tree-ssa-alias.c === *** trunk.orig/gcc/tree-ssa-alias.c 2014-02-06 12:43:34.450010352 +0100 --- trunk/gcc/tree-ssa-alias.c 2014-02-06 13:14:08.669884068 +0100 *** ref_maybe_used_by_call_p_1 (gimple call, *** 1516,1521 --- 1516,1522 case BUILT_IN_FREE: case BUILT_IN_MALLOC: case BUILT_IN_POSIX_MEMALIGN: + case BUILT_IN_ALIGNED_ALLOC: case BUILT_IN_CALLOC: case BUILT_IN_ALLOCA: case BUILT_IN_ALLOCA_WITH_ALIGN: *** call_may_clobber_ref_p_1 (gimple call, a *** 1826,1831 --- 1827,1833 /* Allocating memory does not have any side-effects apart from being the definition point for the pointer. */ case BUILT_IN_MALLOC: + case BUILT_IN_ALIGNED_ALLOC: case BUILT_IN_CALLOC: case BUILT_IN_STRDUP: case BUILT_IN_STRNDUP: Index: trunk/gcc/tree-ssa-dce.c === *** trunk.orig/gcc/tree-ssa-dce.c 2014-01-07 10:20:02.520453869 +0100 --- trunk/gcc/tree-ssa-dce.c2014-02-06 13:16:20.570874987 +0100 *** mark_stmt_if_obviously_necessary (gimple *** 231,236 --- 231,237 switch (DECL_FUNCTION_CODE (callee)) { case BUILT_IN_MALLOC: + case
[PATCH] PR60092, basic support for posix_malloc
This adds posix_malloc as builtin and support for it in the alias machinery. Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk? Thanks, Richard. 2014-02-06 Richard Biener PR middle-end/60092 * builtin-types.def (BT_FN_INT_PTRPTR_SIZE_SIZE): Add. * builtins.def (BUILT_IN_POSIX_MEMALIGN): Likewise. * tree-ssa-structalias.c (find_func_aliases_for_builtin_call): Handle BUILT_IN_POSIX_MEMALIGN. (find_func_clobbers): Likewise. * tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Likewise. (call_may_clobber_ref_p_1): Likewise. * gcc.dg/tree-ssa/alias-30.c: New testcase. * gcc.dg/tree-ssa/alias-31.c: Likewise. Index: gcc/builtin-types.def === *** gcc/builtin-types.def (revision 207532) --- gcc/builtin-types.def (working copy) *** DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_ *** 429,434 --- 429,435 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, BT_INT) DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, BT_I8, BT_INT) DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT, BT_VOID, BT_VOLATILE_PTR, BT_I16, BT_INT) + DEF_FUNCTION_TYPE_3 (BT_FN_INT_PTRPTR_SIZE_SIZE, BT_INT, BT_PTR_PTR, BT_SIZE, BT_SIZE) DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR, BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR) Index: gcc/builtins.def === *** gcc/builtins.def(revision 207532) --- gcc/builtins.def(working copy) *** DEF_GCC_BUILTIN(BUILT_IN_POPCOUN *** 755,760 --- 755,761 DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTIMAX, "popcountimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTL, "popcountl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_GCC_BUILTIN(BUILT_IN_POPCOUNTLL, "popcountll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST) + DEF_EXT_LIB_BUILTIN(BUILT_IN_POSIX_MEMALIGN, "posix_memalign", BT_FN_INT_PTRPTR_SIZE_SIZE, ATTR_NOTHROW_NONNULL_LEAF) DEF_GCC_BUILTIN(BUILT_IN_PREFETCH, "prefetch", BT_FN_VOID_CONST_PTR_VAR, ATTR_NOVOPS_LEAF_LIST) DEF_LIB_BUILTIN(BUILT_IN_REALLOC, "realloc", BT_FN_PTR_PTR_SIZE, ATTR_NOTHROW_LEAF_LIST) DEF_GCC_BUILTIN(BUILT_IN_RETURN, "return", BT_FN_VOID_PTR, ATTR_NORETURN_NOTHROW_LEAF_LIST) Index: gcc/tree-ssa-structalias.c === *** gcc/tree-ssa-structalias.c (revision 207532) --- gcc/tree-ssa-structalias.c (working copy) *** find_func_aliases_for_builtin_call (gimp *** 4231,4236 --- 4231,4256 lhsc.release (); return true; } + case BUILT_IN_POSIX_MEMALIGN: + { + tree ptrptr = gimple_call_arg (t, 0); + get_constraint_for (ptrptr, &lhsc); + do_deref (&lhsc); + varinfo_t vi = make_heapvar ("HEAP"); + /* We marking allocated storage local, we deal with it becoming +global by escaping and setting of vars_contains_escaped_heap. */ + DECL_EXTERNAL (vi->decl) = 0; + vi->is_global_var = 0; + struct constraint_expr tmpc; + tmpc.var = vi->id; + tmpc.offset = 0; + tmpc.type = ADDRESSOF; + rhsc.safe_push (tmpc); + process_all_all_constraints (lhsc, rhsc); + lhsc.release (); + rhsc.release (); + return true; + } case BUILT_IN_ASSUME_ALIGNED: { tree res = gimple_call_lhs (t); *** find_func_clobbers (gimple origt) *** 4960,4965 --- 4980,4986 its argument. */ case BUILT_IN_MEMSET: case BUILT_IN_MEMSET_CHK: + case BUILT_IN_POSIX_MEMALIGN: { tree dest = gimple_call_arg (t, 0); unsigned i; Index: gcc/tree-ssa-alias.c === *** gcc/tree-ssa-alias.c(revision 207532) --- gcc/tree-ssa-alias.c(working copy) *** ref_maybe_used_by_call_p_1 (gimple call, *** 1515,1520 --- 1515,1521 /* The following builtins do not read from memory. */ case BUILT_IN_FREE: case BUILT_IN_MALLOC: + case BUILT_IN_POSIX_MEMALIGN: case BUILT_IN_CALLOC: case BUILT_IN_ALLOCA: case BUILT_IN_ALLOCA_WITH_ALIGN: *** call_may_clobber_ref_p_1 (gimple call, a *** 1838,1843 --- 1839,1854 case BUILT_IN_ALLOCA_WITH_ALIGN: case BUILT_IN_ASSUME_ALIGNED: return false; + /* But posix_memalign stores a pointer into the memory pointed to + by its first argument. */ + case BUILT_IN_POSIX_MEMALIGN: + { + tree ptrptr = gimple_call_arg (call, 0); +
Re: Is testing libgomp outside of the build tree supported?
On Wed, Feb 5, 2014 at 5:10 PM, Matthias Klose wrote: > could somebody please shed some light on how this is done? It's nice that > everybody has this kind of testing, but the only bit in the gcc sources itself > seems to be a bit bit-rot and incomplete (contrib/test_installed). Our case is similar to what Jeff and Joseph already described. I wrote a script that splits the testsuite directories in equal-sized chunks and ships them off to different machines. Each machine generates its site.exp file, and executes runtest with the list of files. This has exposed a few problems with the testsuite. There are implied dependencies that some directories have on others (e.g., using other directories header files) and the multi-files tests are not explicit about it. So, you may end up sending files needed in the same test to different machines. Diego.
[C PATCH] Use warning_at (PR c/60087)
An obvious one, use location if available. Can this be considered as a "docfix" though, thus ok for trunk? (Not a regression.) Regtested/bootstrapped on x86_64-linux. 2014-02-06 Marek Polacek PR c/60087 c-family/ * c-common.c (warn_for_sign_compare): Call warning_at with location instead of warning. testsuite/ * gcc.dg/pr60087.c: New test. diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c index 007e727..50cc848 100644 --- gcc/c-family/c-common.c +++ gcc/c-family/c-common.c @@ -11285,8 +11285,8 @@ warn_for_sign_compare (location_t location, if ((mask & constant) != mask) { if (constant == 0) - warning (OPT_Wsign_compare, -"promoted ~unsigned is always non-zero"); + warning_at (location, OPT_Wsign_compare, + "promoted ~unsigned is always non-zero"); else warning_at (location, OPT_Wsign_compare, "comparison of promoted ~unsigned with constant"); diff --git gcc/testsuite/gcc.dg/pr60087.c gcc/testsuite/gcc.dg/pr60087.c index e69de29..9cdd589 100644 --- gcc/testsuite/gcc.dg/pr60087.c +++ gcc/testsuite/gcc.dg/pr60087.c @@ -0,0 +1,14 @@ +/* PR c/60087 */ +/* { dg-do compile } */ +/* { dg-options "-Wsign-compare" } */ + +void +foo (unsigned int ui, int i) +{ + const unsigned char uc = 0; + _Bool b; + b = 0 != ~uc; /* { dg-warning "9:promoted ~unsigned is always non-zero" } */ + b = 2 != ~uc; /* { dg-warning "9:comparison of promoted ~unsigned with constant" } */ + b = uc == ~uc; /* { dg-warning "10:comparison of promoted ~unsigned with unsigned" } */ + b = i == ui; /* { dg-warning "9:comparison between signed and unsigned integer expressions" } */ +} Marek
Patch ping
Hi! I'd like to ping a few outstanding patches: - PR59575 P1 ARM dwarf2cfi ICEs fix http://gcc.gnu.org/ml/gcc-patches/2014-01/msg01997.html - PR59992 P1 var-tracking improvement http://gcc.gnu.org/ml/gcc-patches/2014-01/msg01962.html - PR60030 P1 ubsan expansion fix http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00167.html Jakub
Re: [C++ Patch/RFC] PR 60047
Hi, On 02/05/2014 10:28 PM, Jason Merrill wrote: On 02/05/2014 11:19 AM, Paolo Carlini wrote: if (vec_safe_is_empty (vbases)) /* No virtual bases to worry about. */; else if (!assign_p) { if (constexpr_p) *constexpr_p = false; *constexpr_p should be false for a constructor of a class with virtual bases, according to the standard (7.1.5p4): The definition of a constexpr constructor shall satisfy the following constraints: — the class shall not have any virtual base classes; ... So the assert in implicit_declare_fn is wrong. I guess I would fix it by checking CLASSTYPE_VBASECLASSES. Ah! It didn't occur to me that the bug could be in the gcc_assert itself. Thus, thanks, I tested on x86_64-linux the below. Paolo. // /cp 2014-02-06 Paolo Carlini PR c++/60047 * method.c (implicitly_declare_fn): A constructor of a class with virtual base classes isn't constexpr (7.1.5p4). (synthesized_method_walk): Revert PR58871 change. /testsuite 2014-02-06 Paolo Carlini PR c++/60047 * g++.dg/cpp0x/pr60047.C: New. Index: cp/method.c === --- cp/method.c (revision 207536) +++ cp/method.c (working copy) @@ -1366,7 +1366,7 @@ synthesized_method_walk (tree ctype, special_funct } vbases = CLASSTYPE_VBASECLASSES (ctype); - if (vec_safe_is_empty (vbases)) + if (vbases == NULL) /* No virtual bases to worry about. */; else if (!assign_p) { @@ -1656,10 +1656,12 @@ implicitly_declare_fn (special_function_kind kind, /* Don't bother marking a deleted constructor as constexpr. */ if (deleted_p) constexpr_p = false; - /* A trivial copy/move constructor is also a constexpr constructor. */ + /* A trivial copy/move constructor is also a constexpr constructor, + unless the class has virtual bases (7.1.5p4). */ else if (trivial_p && cxx_dialect >= cxx11 && (kind == sfk_copy_constructor - || kind == sfk_move_constructor)) + || kind == sfk_move_constructor) + && !CLASSTYPE_VBASECLASSES (type)) gcc_assert (constexpr_p); if (!trivial_p && type_has_trivial_fn (type, kind)) Index: testsuite/g++.dg/cpp0x/pr60047.C === --- testsuite/g++.dg/cpp0x/pr60047.C(revision 0) +++ testsuite/g++.dg/cpp0x/pr60047.C(working copy) @@ -0,0 +1,14 @@ +// PR c++/60047 +// { dg-do compile { target c++11 } } + +struct B { }; + +template struct A : virtual B +{ + A(); + A(const A&); +}; + +template A::A(const A&) = default; + +A a = A();
[Patch, Fortran] PR 58470: [4.9 Regression] [OOP] ICE on invalid with FINAL procedure and type extension
Hi all, attached is a small patch which fixes an ICE-on-invalid regression with finalization. In the PR, Dominique objected to the patch, but I think it's the correct thing to do after all. The line that I'm removing was added in a patch authored by Tobias and myself. I suspect it was added to work around some other problem in the finalization implementation, and there is no evidence it's actually needed. The patch regtests cleanly on x86_64-unknown-linux-gnu. Ok for trunk? Cheers, Janus 2014-02-06 Janus Weil PR fortran/58470 * resolve.c (resolve_fl_derived0): Remove unnecessary call to gfc_is_finalizable. 2014-02-06 Janus Weil PR fortran/58470 * gfortran.dg/finalize_22.f90: New. Index: gcc/fortran/resolve.c === --- gcc/fortran/resolve.c (revision 207485) +++ gcc/fortran/resolve.c (working copy) @@ -12455,10 +12455,6 @@ resolve_fl_derived0 (gfc_symbol *sym) /* Add derived type to the derived type list. */ add_dt_to_dt_list (sym); - /* Check if the type is finalizable. This is done in order to ensure that the - finalization wrapper is generated early enough. */ - gfc_is_finalizable (sym, NULL); - return true; } ! { dg-do compile } ! ! PR 58470: [4.9 Regression] [OOP] ICE on invalid with FINAL procedure and type extension ! ! Contributed by Andrew Benson module cf type :: cfml contains final :: mld end type cfml type, extends(cfml) :: cfmde end type cfmde contains subroutine mld(s) ! { dg-error "must be of type" } class(cfml), intent(inout) :: s end subroutine mld end module cf ! { dg-final { cleanup-modules "cf" } }
[Ada] Configuration options lost after processing of separate unit
this patch updates the analysis of package body stubs and subprogram body stubs to maintain the configuration options of the enclosing context in a stack-like fasion. This ensures that options pertaining to the proper body do not govern the options of the enclosing context. -- Source -- -- pack.ads package Pack is procedure Must_Fail; end Pack; -- pack.adb package body Pack is pragma Assertion_Policy (Assert => Check); package Pack_Stub is procedure Pack_Stub_Proc; end Pack_Stub; package body Pack_Stub is separate; procedure Proc_Stub; procedure Proc_Stub is separate; procedure Must_Fail is begin pragma Assert (False); null; end Must_Fail; end Pack; -- pack-pack_stub.adb separate (Pack) package body Pack_Stub is procedure Pack_Stub_Proc is begin null; end Pack_Stub_Proc; end Pack_Stub; -- pack-proc_stub.adb separate (Pack) procedure Proc_Stub is begin null; end Proc_Stub; -- config_opts.adb with Pack; use Pack; procedure Config_Opts is begin Must_Fail; end Config_Opts; -- Compilation and output -- $ gnatmake -q config_opts.adb $ ./config_opts raised SYSTEM.ASSERTIONS.ASSERT_FAILURE : pack.adb:15 Tested on x86_64-pc-linux-gnu, committed on trunk 2014-02-06 Hristian Kirtchev * sem_ch10.adb (Analyze_Package_Body_Stub): Maintain the configuration options of the enclosing context in a stack-like fasion. (Analyze_Subprogram_Body_Stub): Maintain the configuration options of the enclosing context in a stack-like fashion. Index: sem_ch10.adb === --- sem_ch10.adb(revision 207533) +++ sem_ch10.adb(working copy) @@ -1513,8 +1513,9 @@ --- procedure Analyze_Package_Body_Stub (N : Node_Id) is - Id : constant Entity_Id := Defining_Identifier (N); - Nam : Entity_Id; + Id : constant Entity_Id := Defining_Identifier (N); + Nam : Entity_Id; + Opts : Config_Switches_Type; begin -- The package declaration must be in the current declarative part @@ -1531,6 +1532,11 @@ Error_Msg_N ("duplicate or redundant stub for package", N); else + -- Retain and restore the configuration options of the enclosing + -- context as the proper body may introduce a set of its own. + + Save_Opt_Config_Switches (Opts); + -- Indicate that the body of the package exists. If we are doing -- only semantic analysis, the stub stands for the body. If we are -- generating code, the existence of the body will be confirmed @@ -1541,6 +1547,8 @@ Set_Corresponding_Spec_Of_Stub (N, Nam); Generate_Reference (Nam, Id, 'b'); Analyze_Proper_Body (N, Nam); + + Restore_Opt_Config_Switches (Opts); end if; end Analyze_Package_Body_Stub; @@ -1913,6 +1921,7 @@ procedure Analyze_Subprogram_Body_Stub (N : Node_Id) is Decl : Node_Id; + Opts : Config_Switches_Type; begin Check_Stub_Level (N); @@ -1937,11 +1946,18 @@ end loop; end if; + -- Retain and restore the configuration options of the enclosing context + -- as the proper body may introduce a set of its own. + + Save_Opt_Config_Switches (Opts); + -- Treat stub as a body, which checks conformance if there is a previous -- declaration, or else introduces entity and its signature. Analyze_Subprogram_Body (N); Analyze_Proper_Body (N, Empty); + + Restore_Opt_Config_Switches (Opts); end Analyze_Subprogram_Body_Stub; -
Re: [PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 3)
On Thu, Feb 6, 2014 at 10:45 AM, Richard Biener wrote: > On Thu, 6 Feb 2014, Jakub Jelinek wrote: > >> On Wed, Feb 05, 2014 at 08:42:27PM +0100, Jakub Jelinek wrote: >> > So, where do we want to do that instead? E.g. should it be e.g. in >> > tree_versionable_function_p directly and let the inliner (if it doesn't do >> > already) also treat optimize(0) functions that aren't always_inline as >> > noinline? >> >> So, another attempt to put the && opt_for_fn (fndecl, optimize) into >> tree_versionable_function_p also failed, because e.g. for TM (but also for >> SIMD clones) we need to clone -O0 functions. So, can I fix PR600{6,7}2 >> without touching tree-inline.c and fix PR60026 again separately somehow else >> as follow-up? Bootstrapped/regtested on x86_64-linux and i686-linux. > > That works for me. Also OK for x86. Thanks, Uros.
[Ada] Adjust documentation of Pragma Optimize_Alignment
This adjusts the documentation of Pragma Optimize_Alignment by mentioning that the pragma also has an effect on individual objects in addition to types. Tested on x86_64-pc-linux-gnu, committed on trunk 2014-02-06 Eric Botcazou * gnat_rm.texi (Pragma Optimize_Alignment): Document the effect of the pragma on individual objects. Index: gnat_rm.texi === --- gnat_rm.texi(revision 207533) +++ gnat_rm.texi(working copy) @@ -4812,13 +4812,14 @@ @noindent This is a configuration pragma which affects the choice of default alignments -for types where no alignment is explicitly specified. There is a time/space -trade-off in the selection of these values. Large alignments result in more -efficient code, at the expense of larger data space, since sizes have to be -increased to match these alignments. Smaller alignments save space, but the -access code is slower. The normal choice of default alignments (which is what -you get if you do not use this pragma, or if you use an argument of OFF), -tries to balance these two requirements. +for types and objects where no alignment is explicitly specified. There is a +time/space trade-off in the selection of these values. Large alignments result +in more efficient code, at the expense of larger data space, since sizes have +to be increased to match these alignments. Smaller alignments save space, but +the access code is slower. The normal choice of default alignments for types +and individual alignment promotions for objects (which is what you get if you +do not use this pragma, or if you use an argument of OFF), tries to balance +these two requirements. Specifying SPACE causes smaller default alignments to be chosen in two cases. First any packed record is given an alignment of 1. Second, if a size is given @@ -4848,6 +4849,10 @@ in general possible to set the alignment of such a record to one, so the pragma is ignored in this case (with a warning). +Specifying SPACE also disables individual alignment promotions for objects, +which occur when the compiler increases the alignment of a specific object +without changing the alignment of its type. + Specifying TIME causes larger default alignments to be chosen in the case of small types with sizes that are not a power of 2. For example, consider:
Re: [PATCH] Fix up __builtin_setjmp_receiver handling (PR c++/60082)
On Thu, Feb 06, 2014 at 11:00:14AM +0100, Richard Biener wrote: > Then __builtin_setjmp_setup is? No, it never returns twice through the fallthru edge. It only returns second time through the abnormal edge virtually to the ABNORMAL_DISPATCHER and from that through another abnormal edge to __builtin_setjmp_receiver. So, among other things, we don't want abnormal edge to the start of __builtin_setjmp_setup. Perhaps we could stop this lowering of __builtin_setjmp into two special builtins now, but such changes look too big for me for stage4. At expansion time we'd need to make sure to have the abnormal edges (which right now expansion recreates from scratch) point to the middle of __builtin_setjmp expansion (the non-local label we insert there). Then __builtin_setjmp would be a normal returns_twice function. But then there is Eric's argument that we also need to revisit all cfun->calls_setjmp inhibiting optimizations, at least at the tree level, and for RTL need to keep abnormal edges and kill the rest of the optimization inhibitions. Jakub
Re: [build, libgcc] Ensure libgcc_s unwinder is always used on 64-bit Solaris 10+/x86 (PR target/59788)
"Joseph S. Myers" writes: > On Tue, 4 Feb 2014, Rainer Orth wrote: > >> > The patch to libgo/ltmain.sh is now committed to mainline. >> >> I've now committed the rest, thanks. >> >> Btw., what's the procedure for syncing the toplevel ltmain.sh these >> days? AFAIK there's both the old CVS src repo as well as the new >> binutils-gdb git repo. Does the patch need to go to both? > > Yes, shared files and directories should be kept in sync between all three > repositories (the src repository is still used for newlib). It looks like > some (e.g. configure.ac) have been getting out of sync lately. Committed to both now. Thanks for the info. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[Ada] Order of evaluation between precondition and 'Old
This patch modifies the expansion of attribute 'Old to insert the generated temporary that captures the value of the prefix before routine _Postconditions. As a result, the precondition of a subprogram will be evaluated first, before any postcondition-related code is executed. -- Source -- -- evaluation_order.adb procedure Evaluation_Order is subtype Idx is Integer range 1 .. 10; type Arr is array (Integer range <>) of Integer; procedure Swap (A : in out Arr; X, Y : Idx) with Pre => X in A'Range and Y in A'Range, Post => A(X) = A(Y)'Old and A(Y) = A(X)'Old; procedure Swap (A : in out Arr; X, Y : Idx) is begin null; end Swap; A : Arr (1 .. 2) := (1 => 1, others => 0); begin Swap (A, 1, 3); end Evaluation_Order; -- Compilation and output -- $ gnatmake -q -gnata evaluation_order.adb $ ./evaluation_order raised SYSTEM.ASSERTIONS.ASSERT_FAILURE : failed precondition from evaluation_order.adb:6 Tested on x86_64-pc-linux-gnu, committed on trunk 2014-02-06 Hristian Kirtchev * exp_attr.adb (Expand_N_Attribute_Reference): Alphabetize variables. Rename variabme Tnn to Temp. Do not create a temporary if assertions are disabled. Find enclosing routine _Postconditions and insert the temporary that captures the value of the prefix before the routine. * exp_ch6.adb (Build_Postconditions_Procedure): Insert the generated _Postconditions routine before the first source declaration of the related subprogram. (Insert_After_Last_Declaration): Removed. (Insert_Before_First_Source_Declaration): New routine. Index: exp_attr.adb === --- exp_attr.adb(revision 207533) +++ exp_attr.adb(working copy) @@ -3806,9 +3806,9 @@ - when Attribute_Old => Old : declare - Tnn : constant Entity_Id := Make_Temporary (Loc, 'T', Pref); + Asn_Stm : Node_Id; Subp: Node_Id; - Asn_Stm : Node_Id; + Temp: Entity_Id; begin -- If assertions are disabled, no need to create the declaration @@ -3818,42 +3818,47 @@ return; end if; - -- Find the nearest subprogram body, ignoring _Preconditions + Temp := Make_Temporary (Loc, 'T', Pref); + -- Climb the parent chain looking for subprogram _Postconditions + Subp := N; - loop + while Present (Subp) loop +exit when Nkind (Subp) = N_Subprogram_Body + and then Chars (Defining_Entity (Subp)) = Name_uPostconditions; + Subp := Parent (Subp); -exit when Nkind (Subp) = N_Subprogram_Body - and then Chars (Defining_Entity (Subp)) /= Name_uPostconditions; end loop; - -- Insert the initialized object declaration at the start of the - -- subprogram's declarations. + -- 'Old can only appear in a postcondition, the generated body of + -- _Postconditions must be in the tree. + pragma Assert (Present (Subp)); + + -- Generate: + --Temp : constant := ; + Asn_Stm := Make_Object_Declaration (Loc, - Defining_Identifier => Tnn, + Defining_Identifier => Temp, Constant_Present=> True, Object_Definition => New_Occurrence_Of (Etype (N), Loc), Expression => Pref); - -- Push the subprogram's scope, so that the object will be analyzed - -- in that context (rather than the context of the Precondition - -- subprogram) and will have its Scope set properly. + -- Push the scope of the related subprogram where _Postcondition + -- resides as this ensures that the object will be analyzed in the + -- proper context. - if Present (Corresponding_Spec (Subp)) then -Push_Scope (Corresponding_Spec (Subp)); - else -Push_Scope (Defining_Entity (Subp)); - end if; + Push_Scope (Scope (Defining_Entity (Subp))); - if Is_Empty_List (Declarations (Subp)) then -Set_Declarations (Subp, New_List (Asn_Stm)); -Analyze (Asn_Stm); - else -Insert_Action (First (Declarations (Subp)), Asn_Stm); - end if; + -- The object declaration is inserted before the body of subprogram + -- _Postconditions. This ensures that any precondition-like actions + -- are still executed before any parameter values are captured and + -- the multiple 'Old occurrences appear in order of declaration. + Insert_Before_And_Analyze (Subp, Asn_Stm); + Pop_Scope; + -- Ensure that the prefix of attribute 'Old is valid. The check must -- be inserted
[Ada] Dependence clause with multiple parenthesis produces misleading errors
This patch modifies the analysis of aspect/pragma [Refined_]Depends to catch clauses that contain multiple parenthesis as this is not allowed by the syntax of the aspect/pragma. -- Source -- -- multi_parents.ads pragma SPARK_Mode; package Multi_Parents with Abstract_State => Input_Port is procedure Read_Port (X1, X2 : out Integer; Y : in Boolean) with Global => (In_Out => Input_Port), Depends => ((X1 => (Input_Port, Y)), -- single nesting ((X2 => (Input_Port, Y))), -- double nesting (Junk),-- junk Input_Port => Input_Port); end Multi_Parents; -- Compilation and output -- $ gcc -c multi_parents.ads multi_parents.ads:8:23: dependency clause contains extra parenthesis multi_parents.ads:9:23: dependency clause contains extra parenthesis multi_parents.ads:10:24: malformed dependency clause Tested on x86_64-pc-linux-gnu, committed on trunk 2014-02-06 Hristian Kirtchev * sem_prag.adb (Analyze_Depends_In_Decl_Part): Add local variable Expr. Flag clauses with extra parenthesis as this is not allowed by the syntax of the pragma. Code reformatting. Index: sem_prag.adb === --- sem_prag.adb(revision 207536) +++ sem_prag.adb(working copy) @@ -1597,6 +1597,7 @@ Clause : Node_Id; Errors : Nat; + Expr: Node_Id; Last_Clause : Node_Id; Subp_Decl : Node_Id; @@ -1653,72 +1654,122 @@ -- Dependency clauses appear as component associations of an aggregate - elsif Nkind (Clause) = N_Aggregate -and then Present (Component_Associations (Clause)) - then - Last_Clause := Last (Component_Associations (Clause)); + elsif Nkind (Clause) = N_Aggregate then - -- Gather all states, variables and formal parameters that the - -- subprogram may depend on. These items are obtained from the - -- parameter profile or pragma [Refined_]Global (if available). + -- The aggregate should not have an expression list because a clause + -- is always interpreted as a component association. The only way an + -- expression list can sneak in is by adding extra parenthesis around + -- the individual clauses: - Collect_Subprogram_Inputs_Outputs - (Subp_Id => Subp_Id, -Subp_Inputs => Subp_Inputs, -Subp_Outputs => Subp_Outputs, -Global_Seen => Global_Seen); + --Depends (Output => Input) -- proper form + --Depends ((Output => Input)) -- extra parenthesis - -- Ensure that the formal parameters are visible when analyzing all - -- clauses. This falls out of the general rule of aspects pertaining - -- to subprogram declarations. Skip the installation for subprogram - -- bodies because the formals are already visible. + -- Since the extra parenthesis are not allowed by the syntax of the + -- pragma, flag them now to avoid emitting misleading errors down the + -- line. - if not In_Open_Scopes (Spec_Id) then -Restore_Scope := True; -Push_Scope (Spec_Id); -Install_Formals (Spec_Id); + if Present (Expressions (Clause)) then +Expr := First (Expressions (Clause)); +while Present (Expr) loop + + -- A dependency clause surrounded by extra parenthesis appears + -- as an aggregate of component associations with an optional + -- Paren_Count set. + + if Nkind (Expr) = N_Aggregate + and then Present (Component_Associations (Expr)) + then + Error_Msg_N +("dependency clause contains extra parenthesis", Expr); + + -- Otherwise the expression is a malformed construct + + else + Error_Msg_N ("malformed dependency clause", Expr); + end if; + + Next (Expr); +end loop; + +-- Do not attempt to perform analysis of syntactically illegal +-- clauses as this will lead to misleading errors. + +return; end if; - Clause := First (Component_Associations (Clause)); - while Present (Clause) loop -Errors := Serious_Errors_Detected; + if Present (Component_Associations (Clause)) then +Last_Clause := Last (Component_Associations (Clause)); --- Normalization may create extra clauses that contain replicated --- input and output names. There is no need to reanalyze them. +-- Gather all states, variables and formal parameters that the +
Re: [PATCH] Fix up __builtin_setjmp_receiver handling (PR c++/60082)
On Thu, 6 Feb 2014, Jakub Jelinek wrote: > On Thu, Feb 06, 2014 at 10:33:54AM +0100, Richard Biener wrote: > > > Not even in call_can_make_abnormal_goto? I mean, as we split > > > __builtin_setjmp into __builtin_setjmp{_setup,_receiver} and the former > > > has abnormal edge out of it, there is no need to have another abnormal > > > edge > > > out of the receiver. > > > > longjmp has outgoing abnormal edges, setjmp has > > incoming abnormal edges. So how's the dispatcher working exactly? > > It receives all abnormal edges from all possibly longjmp calling > > stmts and dispatches them to all setjmp calls? > > I've actually removed the special __builtin_setjmp_dispatcher and replaced > it with the same ABNORMAL_DISPATCHER used for normal setjmp/returns_twice > calls. __builtin_setjmp_receiver actually is not really a returns_twice > function, it only has the abnormal predecessor edge (from > ABNORMAL_DISPATCHER) and no other predecessor edge, so it really works as a > normal call with just the requirement that it is at the start of the bb. > We just need to treat it as receiver of the abnormal edges (we do already). > It is __builtin_setjmp_setup that has both normal fallthru edge and abnormal > edge to the dispatcher. The dispatcher is expanded as nothing at all (it's > basic block is actually removed), the receiver as loading of frame pointer > etc. from somewhere and setup as saving frame pointer somewhere. > > > So why would there be extra edges from the receiver? > > That is what I'd like to change. The reason for the extra edge from the > receiver is that it is a non-leaf call. Supposedly if we set ECF_LEAF on > __builtin_setjmp_receiver, it might work, but that looks too dangerous to > me, at least right now. Ah, so __builtin_setjmp_receiver is like setjmp in this regard and setjmp is LEAF (it's a stmt that doesn't direct control-flow anywhere else). So __builtin_setjmp_receiver should be LEAF as well (and so should __builtin_setjmp_setup). Doesn't sound dangerous at all to me ... (same applies to trampoline setup and stack save/restore, _not_ to __builtin_nonlocal_goto though) > > > > stmts we add abnormal edges to. Btw, why isn't BUILT_IN_SETJMP_RECEIVER > > > > ECF_RETURNS_TWICE? > > > > > > See PR60003, #c5 in particular which really didn't work at all, and Eric's > > > preference in #c8. > > > > Hmm, but it really is ECF_RETURNS_TWICE ... > > It is not. Then __builtin_setjmp_setup is? Looks like I have to build a cross to a sjlj EH target ... Richard.
[Ada] Forbid the use of aspect/pragma SPARK_Mode in generics
The following patch modifies the analysis of generic packages and subprograms to flag aspect/pragma SPARK_Mode as illegal. -- Source -- -- in_function.ads pragma SPARK_Mode (On); generic Val : Integer; function In_Function return Integer with SPARK_Mode; -- in_function.adb function In_Function return Integeris pragma SPARK_Mode (On); begin return 1; end In_Function; -- in_procedure.ads pragma SPARK_Mode (On); generic Val : Integer; procedure In_Procedure with SPARK_Mode; -- in_procedure.adb procedure In_Procedure is pragma SPARK_Mode (On); begin null; end In_Procedure; -- in_spec.ads pragma SPARK_Mode; generic Var : Integer; package In_Spec is pragma SPARK_Mode; procedure Proc with SPARK_Mode => On; end In_Spec; -- in_spec.adb package body In_Spec is procedure Proc is begin null; end Proc; end In_Spec; -- instances.ads with In_Function; with In_Procedure; with In_Spec; package Instances is function Inst_3 is new In_Function (3); procedure Inst_4 is new In_Procedure (4); package Inst_5 is new In_Spec (5); generic Val : Integer; function Nested_Func return Integer with SPARK_Mode; generic Val : Integer; package Nested_Pack with SPARK_Mode is procedure Proc; pragma SPARK_Mode (On); end Nested_Pack; generic Val : Integer; procedure Nested_Proc with SPARK_Mode; function Inst_6 is new Nested_Func (6); package Inst_7 is new Nested_Pack (7); procedure Inst_8 is new Nested_Proc (8); end Instances; -- instances.adb package body Instances is function Nested_Func return Integer is pragma SPARK_Mode (On); begin return 1; end Nested_Func; package body Nested_Pack with SPARK_Mode is procedure Proc is pragma SPARK_Mode (On); begin null; end Proc; end Nested_Pack; procedure Nested_Proc with SPARK_Mode is begin null; end Nested_Proc; end Instances; -- Compilation and output -- $ gcc -c instances.adb instances.adb:3:07: incorrect placement of pragma "Spark_Mode" in a generic instances.adb:6:34: incorrect placement of aspect "Spark_Mode" in a generic instances.adb:8:10: incorrect placement of pragma "Spark_Mode" in a generic instances.adb:12:31: incorrect placement of aspect "Spark_Mode" on a generic instances.ads:13:45: incorrect placement of aspect "Spark_Mode" on a generic instances.ads:18:29: incorrect placement of aspect "Spark_Mode" on a generic instances.ads:20:07: incorrect placement of pragma "Spark_Mode" in a generic instances.ads:26:31: incorrect placement of aspect "Spark_Mode" on a generic in_function.ads:1:01: incorrect placement of pragma "Spark_Mode" in a generic in_function.ads:6:42: incorrect placement of aspect "Spark_Mode" on a generic in_procedure.ads:1:01: incorrect placement of pragma "Spark_Mode" in a generic in_procedure.ads:6:29: incorrect placement of aspect "Spark_Mode" on a generic in_spec.ads:1:01: incorrect placement of pragma "Spark_Mode" in a generic in_spec.ads:7:04: incorrect placement of pragma "Spark_Mode" in a generic in_spec.ads:9:24: incorrect placement of aspect "Spark_Mode" in a generic Tested on x86_64-pc-linux-gnu, committed on trunk 2014-02-06 Hristian Kirtchev * sem_ch6.adb (Analyze_Generic_Subprogram_Body): Flag illegal use of SPARK_Mode. * sem_ch12.adb (Analyze_Generic_Subprogram_Declaration): Flag illegal use of SPARK_Mode. (Instantiate_Subprogram_Body): Flag illegal use of SPARK_Mode. * sem_prag.adb (Analyze_Pragma): Code reformatting. * sem_util.adb Add with and use clause for Aspects. (Check_SPARK_Mode_In_Generic): New routine. * sem_util.ads (Check_SPARK_Mode_In_Generic): New routine. Index: sem_prag.adb === --- sem_prag.adb(revision 207535) +++ sem_prag.adb(working copy) @@ -19217,10 +19217,6 @@ Check_No_Identifiers; Check_At_Most_N_Arguments (1); -if Inside_A_Generic then - Error_Pragma ("incorrect placement of pragma% in a generic"); -end if; - -- Check the legality of the mode (no argument = ON) if Arg_Count = 1 then @@ -19233,9 +19229,15 @@ Mode_Id := Get_SPARK_Mode_Type (Mode); Context := Parent (N); +-- Packages and subprograms declared in a generic unit cannot be +-- subject to the pragma. + +if Inside_A_Generic then + Error_Pragma ("incorrect placement of pragma% in a generic"); + -- The pragma appears in a configuration pragmas file -if No (Context) then +elsif No (Context) then Check_Valid_Configuration_Pragma; if Present (SPARK_Mode_Pragma) then @@ -19258,8 +19260,7 @@
[Ada] Composite postconditions
This patch fixes the order in which the conjuncts in complex postconditions are tested. Such postconditions are expanded into individual tests, chained to internal lists when aspects are processed, and combined with other tests when building the contract checking procedure for a given subprogram. The construction of this procedure did not take into account the order in which conjuncts were processed. Executing the following: gnatmake -gnata postand.adb postand must yield: raised SYSTEM.ASSERTIONS.ASSERT_FAILURE : failed postcondition from postand.adb:3 --- procedure Postand is procedure Test (X, Y : out Integer) with Post => X /= 0 and then Y / X /= 0 and then X + Y > 2; procedure Test (X, Y : out Integer) is begin X := 0; Y := 1; end Test; X, Y : Integer; begin Test (X, Y); end Postand; Tested on x86_64-pc-linux-gnu, committed on trunk 2014-02-06 Ed Schonberg * exp_ch6.adb (Expand_Subprogram_Contract, Append_Enabled_Item): Take into account the Split_PPC flag to ensure that conjuncts in a composite postcondition aspect are tested in source order. Index: exp_ch6.adb === --- exp_ch6.adb (revision 207533) +++ exp_ch6.adb (working copy) @@ -8887,7 +8887,18 @@ List := New_List; end if; -Append (Item, List); +-- If the pragma is a conjunct in a composite postcondition, it +-- has been processed in reverse order. In the postcondition body +-- if must appear before the others. + +if Nkind (Item) = N_Pragma + and then From_Aspect_Specification (Item) + and then Split_PPC (Item) +then + Prepend (Item, List); +else + Append (Item, List); +end if; end if; end Append_Enabled_Item;
[Ada] Predicates on partial views
This patch fixes some errors in the handling of dynamic predicates applied to private types. Compiling and executing the following: gnatmake -q -gnata main main must yield: Endevour Ariane5 Failure to launch --- with gnat.io; with ada.assertions; procedure Main is package SpaceShuttles is type SpaceShuttle (Name : not null access constant String) is tagged private with Dynamic_Predicate => SpaceShuttle.name.all'length > 6; function Make (Ptr : not null access constant String) return SpaceShuttle; private type SpaceShuttle (Name : not null access constant String) is tagged null record; end SpaceShuttles; package body SpaceShuttles is function Make (Ptr : not null access constant String) return SpaceShuttle is begin return (Name => Ptr); end Make; end SpaceShuttles; use SpaceShuttles; Name : aliased constant String := "Endevour"; Endevour : SpaceShuttles.SpaceShuttle(Name'Access); Her : aliased constant String := "Ariane5"; Ariane : SpaceShuttle := Make (Her'access); begin gnat.io.Put_Line(Endevour.name.all); gnat.io.Put_Line(Ariane.name.all); declare Dud : aliased constant String := "Ariane"; Failure : SpaceShuttle := Make (Dud'access); begin null; end; exception when Ada.Assertions.Assertion_Error => gnat.io.put_line ("Failure to launch"); end Main; Tested on x86_64-pc-linux-gnu, committed on trunk 2014-02-06 Ed Schonberg * sem_ch3.adb (Process_Full_View): Fix typo in the order of parameters when propagating predicate function to full view. (Find_Type_Of_Object): Freeze base type of object type to catch premature use of discriminated private type without a full view. Index: sem_ch3.adb === --- sem_ch3.adb (revision 207533) +++ sem_ch3.adb (working copy) @@ -15772,8 +15772,12 @@ and then No (Expression (P)) then null; + + -- Here we freeze the base type of object type to catch premature use + -- of discriminated private type without a full view. + else -Insert_Actions (Obj_Def, Freeze_Entity (T, P)); +Insert_Actions (Obj_Def, Freeze_Entity (Base_Type (T), P)); end if; -- Ada 2005 AI-406: the object definition in an object declaration @@ -18675,7 +18679,7 @@ end; end if; - -- Ada 2005 AI 161: Check preelaboratable initialization consistency + -- Ada 2005 AI 161: Check preelaborable initialization consistency if Known_To_Have_Preelab_Init (Priv_T) then @@ -18737,10 +18741,16 @@ Set_Has_Inheritable_Invariants (Full_T); end if; - -- Propagate predicates to full type + -- Propagate predicates to full type, and predicate function if already + -- defined. It is not clear that this can actually happen? the partial + -- view cannot be frozen yet, and the predicate function has not been + -- built. Still it is a cheap check and seems safer to make it. if Has_Predicates (Priv_T) then - Set_Predicate_Function (Priv_T, Predicate_Function (Full_T)); + if Present (Predicate_Function (Priv_T)) then +Set_Predicate_Function (Full_T, Predicate_Function (Priv_T)); + end if; + Set_Has_Predicates (Full_T); end if; end Process_Full_View;
[Ada] Exception during finalization of controlled object
This patch corrects the transient object machinery to detect subprogram calls in constructs that have been heavily expanded. -- Source -- -- types.ads with Ada.Finalization; use Ada.Finalization; package Types is Bomb : exception; Not_Zero : exception; procedure Set_Calls (Bomb_On : Natural); function New_Id return Natural; subtype Zero is Natural range 0 .. 0; type Ctrl_Component is new Controlled with record Id : Natural := 0; Data : Zero:= 0; end record; procedure Adjust (Obj : in out Ctrl_Component); procedure Finalize (Obj : in out Ctrl_Component); procedure Initialize (Obj : in out Ctrl_Component); function Make_Component return Ctrl_Component; type Ctrl_Encapsulator is new Controlled with record Id : Natural := 0; Comp_1 : Ctrl_Component; Comp_2 : Ctrl_Component; Comp_3 : Ctrl_Component; end record; procedure Adjust (Obj : in out Ctrl_Encapsulator); procedure Finalize (Obj : in out Ctrl_Encapsulator); procedure Initialize (Obj : in out Ctrl_Encapsulator); type Encapsulator is record Id : Natural := 0; Comp_1 : Ctrl_Component; Comp_2 : Ctrl_Component; Comp_3 : Ctrl_Component; end record; type Super_Encapsulator is record Id : Natural := 0; Comp_1 : Ctrl_Encapsulator; Comp_2 : Ctrl_Encapsulator; Comp_3 : Ctrl_Encapsulator; end record; end Types; -- types.adb with Ada.Text_IO; use Ada.Text_IO; package body Types is Calls : Natural := 0; Calls_To_Bomb : Natural := 0; Id_Gen: Natural := 0; procedure Adjust (Obj : in out Ctrl_Component) is Old_Id : constant Natural := Obj.Id; New_Id : constant Natural := Old_Id * 100; begin Put_Line (" adj comp" & Old_Id'Img & " ->" & New_Id'Img); Obj.Id := New_Id; end Adjust; procedure Adjust (Obj : in out Ctrl_Encapsulator) is Old_Id : constant Natural := Obj.Id; New_Id : constant Natural := Old_Id * 100; begin Put_Line (" adj enca" & Old_Id'Img & " ->" & New_Id'Img); Obj.Id := New_Id; end Adjust; procedure Finalize (Obj : in out Ctrl_Component) is begin Put_Line (" fin comp" & Obj.Id'Img); if Obj.Data /= 0 then raise Not_Zero; end if; end Finalize; procedure Finalize (Obj : in out Ctrl_Encapsulator) is begin Put_Line (" fin enca" & Obj.Id'Img); end Finalize; procedure Initialize (Obj : in out Ctrl_Component) is begin Obj.Id := New_Id; Put_Line (" ini comp" & Obj.Id'Img); end Initialize; procedure Initialize (Obj : in out Ctrl_Encapsulator) is begin Obj.Id := New_Id; Put_Line (" ini enca" & Obj.Id'Img); end Initialize; function Make_Component return Ctrl_Component is begin if Calls = Calls_To_Bomb then raise Bomb; else Calls := Calls + 1; end if; declare Result : Ctrl_Component; begin return Result; end; end Make_Component; function New_Id return Natural is begin Id_Gen := Id_Gen + 1; return Id_Gen; end New_Id; procedure Set_Calls (Bomb_On : Natural) is begin Calls := 0; if Bomb_On >= 1 then Calls_To_Bomb := Bomb_On - 1; else Calls_To_Bomb := 0; end if; end Set_Calls; end Types; -- aggregates.adb with Ada.Finalization; use Ada.Finalization; with Ada.Text_IO; use Ada.Text_IO; with Types;use Types; procedure Aggregates is begin Put_Line ("Test 15 - slices"); Set_Calls (2); begin declare type Collection is array (Natural range 1 .. 5) of Natural; C : Collection := (others => 0); Value : Collection; begin Value (1 .. 3) := C (1 .. Ctrl_Encapsulator' (Controlled with Id => New_Id, Comp_1 => Make_Component, Comp_2 => Make_Component, Comp_3 => Make_Component).Comp_2.Data); Put_Line ("ERROR: Test 15: Bomb not raised"); end; exception when Bomb => null; when Not_Zero => Put_Line ("ERROR: Test 15: Not_Zero raised"); when others => Put_Line ("ERROR: Test 15: unexpected exception"); end; Put_Line ("Test 15 end"); end Aggregates; -- Compilation and output -- $ gnatmake -q aggregates.adb $ ./aggregates Test 15 - slices ini comp 1 adj comp 1 -> 100 fin comp 1 fin comp 100 Test 15 end Tested on x86_64-pc-linux-gnu, committed on trunk 2014-02-06 Hristian Kirtchev * exp_ch7.adb (Is_Subprogram_Call): Inspect the original tree in certain cases where a construct has been factored out and replaced by a reference to a temporary. Index: exp_ch7.adb ==
Re: var-tracking and s390's LM(G)
Richard Henderson writes: > On 02/05/2014 02:30 PM, Ulrich Weigand wrote: >> Jakub Jelinek wrote: >>> On Wed, Feb 05, 2014 at 10:26:16PM +0100, Ulrich Weigand wrote: Actually, now I think the problem originally described there is still valid: on s390 the CFA is *not* equal to the value at function entry, but biased by 96/160 bytes. So setting the SP to the CFA is wrong ... >>> >>> Such biasing happens on most of the targets, e.g. x86_64 has upon function >>> entry CFA set to %rsp + 8, i?86 to %esp + 4. >> >> Ah, but that's a different bias. In those cases it is still correct >> to *unwind* SP to the CFA, since that's the value SP needs to have >> back in the caller immediately after the call site. >> >> On s390, the call/return instructions do not modify the SP so the >> SP at function entry is equal to the SP in the caller after return, >> but neither of these is equal to the CFA. Instead, SP points to >> 96/160 bytes below the CFA ... Therefore you cannot simply >> unwind SP to the CFA. > > I was about to reply the same to Jakub, Ulrich beat me to it. > > Basically, the CFA has been defined "incorrectly" for s390, but it hasn't > mattered since they record the save of the SP. But the result is that you > can't stop recording the save of SP without also adjusting how the CFA > is defined. > > Which _can_ be done, in a way that's fully compatible with all of the existing > unwinding. But certainly shouldn't be attempted at this stage of development. OK, I agree that's not 4.9 material. What about the other change of replacing: REF_CFA_DEF_CFA (plus (stack_pointer_rtx) (const_int 160/96)) with: REF_CFA_ADJUST_CFA (set (stack_pointer_rtx) (plus (current_cfa_base) (const_int offset))) ? That works on its own, but having both a REG_CFA_ADJUST_CFA that assigns to stack_pointer_rtx and a REG_CFA_RESTORE for stack_pointer_rtx feels like a double assignment. Personally I'd prefer to leave the REG_CFA_DEF_CFA as-is for now and change all three (the save, the restore and the CFA definition) at the same time. I'm also a bit worried about handling REG_CFA_ADJUST_CFA in var-tracking.c at this stage since AIUI it used to be valid for a backend to use CFA notes to summarise the effect of several instructions, with only the last of them being marked frame-related. If we look at non-frame-related insn patterns as well as CFA notes then we could end up double-counting. But I suppose the same goes for REG_FRAME_RELATED_EXPRs and insn_stack_adjust_offset_pre_post has been using those without any known problems. FWIW here's the patch I tested. Thanks, Richard gcc/ * var-tracking.c (insn_stack_adjust_offset_pre_post): Handle REG_CFA_ADJUST_CFA. * config/s390/s390.c (s390_add_restore_cfa_note): New function. (s390_emit_epilogue): Use it. Index: gcc/var-tracking.c === --- gcc/var-tracking.c 2014-02-06 09:47:55.564375661 + +++ gcc/var-tracking.c 2014-02-06 09:47:57.259364367 + @@ -807,6 +807,14 @@ insn_stack_adjust_offset_pre_post (rtx i rtx expr = find_reg_note (insn, REG_FRAME_RELATED_EXPR, NULL_RTX); if (expr) pattern = XEXP (expr, 0); + else + { + expr = find_reg_note (insn, REG_CFA_ADJUST_CFA, NULL_RTX); + if (expr + && XEXP (expr, 0) + && SET_DEST (XEXP (expr, 0)) == stack_pointer_rtx) + pattern = XEXP (expr, 0); + } } if (GET_CODE (pattern) == SET) Index: gcc/config/s390/s390.c === --- gcc/config/s390/s390.c 2014-02-06 09:47:55.564375661 + +++ gcc/config/s390/s390.c 2014-02-06 09:48:47.378031557 + @@ -8775,6 +8775,22 @@ s390_emit_stack_tie (void) emit_insn (gen_stack_tie (mem)); } +/* INSN is the epilogue instruction that sets the stack pointer to its + final end-of-function value. Add a REG_CFA_ADJUST_CFA to INSN to + describe that final value. FP_OFFSET is the amount that needs to be + added to the current hard frame pointer or stack pointer in order to + get the final value. */ + +static void +s390_add_restore_cfa_note (rtx insn, HOST_WIDE_INT fp_offset) +{ + rtx base = frame_pointer_needed ? hard_frame_pointer_rtx : stack_pointer_rtx; + if (base != stack_pointer_rtx || fp_offset != 0) +add_reg_note (insn, REG_CFA_ADJUST_CFA, + gen_rtx_SET (VOIDmode, stack_pointer_rtx, + plus_constant (Pmode, base, fp_offset))); +} + /* Copy GPRS into FPR save slots. */ static void @@ -9316,9 +9332,7 @@ s390_emit_epilogue (bool sibcall) cfun_frame_layout.last_restore_gpr); insn = emit_insn (insn); REG_NOTES (insn) = cfa_restores; - add_reg_note (insn, REG_CFA_DEF_CFA, - plus_constant (Pmode, stack_pointer_rtx, - STACK
[Ada] Crash on processing external property
This patch modifies the mechanism that detects an enabled property to handle the case where the property appears with an expression. -- Source -- -- serial_port.ads pragma SPARK_Mode; package Serial_Port with Abstract_State => (Input_Port with External => (Async_Writers, Effective_Reads => False)) is procedure Read_Port (X1, X2 : out Integer; Y : in Boolean) with Global => (In_Out => Input_Port), Depends => (X1 => (Input_Port, Y), X2 => Input_Port, Input_Port => Input_Port); end Serial_Port; -- serial_port.adb pragma SPARK_Mode; with System.Storage_Elements; package body Serial_Port with Refined_State => (Input_Port => Port_23) is Port_23 : Integer with Volatile, Async_Writers, Effective_Reads, Address => System.Storage_Elements.To_Address (16#A11CAF0#); procedure Read_Port (X1, X2 : out Integer; Y : in Boolean) with Refined_Global => (In_Out => Port_23), Refined_Depends => (X1 => (Port_23, Y), X2 => Port_23, Port_23 => Port_23) is begin if Y then X1 := Port_23; end if; X2 := Port_23; end Read_Port; end Serial_Port; -- Compilation and output -- $ gcc -c serial_port.adb serial_port.adb:6:09: external state "Input_Port" lacks property "Effective_Reads" set by constituent "Port_23" (SPARK RM 7.2.8(3)) Tested on x86_64-pc-linux-gnu, committed on trunk 2014-02-06 Hristian Kirtchev * sem_util.adb (Has_Enabled_Property): Rename formal parameter Prop_Nam to Property. Update the comment on usage and all occurrences in the body. Add local variable Prop_Nam. When inspecting a property with an expression, the property name appears as the first choice of the component association. Index: sem_util.adb === --- sem_util.adb(revision 207533) +++ sem_util.adb(working copy) @@ -115,10 +115,10 @@ function Has_Enabled_Property (State_Id : Node_Id; - Prop_Nam : Name_Id) return Boolean; + Property : Name_Id) return Boolean; -- Subsidiary to routines Async_xxx_Enabled and Effective_xxx_Enabled. -- Determine whether an abstract state denoted by its entity State_Id has - -- enabled property Prop_Name. + -- enabled property Property. function Has_Null_Extension (T : Entity_Id) return Boolean; -- T is a derived tagged type. Check whether the type extension is null. @@ -7255,13 +7255,14 @@ function Has_Enabled_Property (State_Id : Node_Id; - Prop_Nam : Name_Id) return Boolean + Property : Name_Id) return Boolean is - Decl: constant Node_Id := Parent (State_Id); - Opt : Node_Id; - Opt_Nam : Node_Id; - Prop: Node_Id; - Props : Node_Id; + Decl : constant Node_Id := Parent (State_Id); + Opt : Node_Id; + Opt_Nam : Node_Id; + Prop : Node_Id; + Prop_Nam : Node_Id; + Props: Node_Id; begin -- The declaration of an external abstract state appears as an extension @@ -7305,7 +7306,7 @@ Prop := First (Expressions (Props)); while Present (Prop) loop - if Chars (Prop) = Prop_Nam then + if Chars (Prop) = Property then return True; end if; @@ -7316,7 +7317,9 @@ Prop := First (Component_Associations (Props)); while Present (Prop) loop - if Chars (Prop) = Prop_Nam then + Prop_Nam := First (Choices (Prop)); + + if Chars (Prop_Nam) = Property then return Is_True (Expr_Value (Expression (Prop))); end if; @@ -7326,7 +7329,7 @@ -- Single property else - return Chars (Props) = Prop_Nam; + return Chars (Props) = Property; end if; end if;
Re: [PATCH][AArch64] Specify CRC and Crypto support for Cortex-A53, A57
On 16 January 2014 18:10, Kyrill Tkachov wrote: > Hi all, > > The Cortex-A53 and Cortex-A57 cores support the CRC32 and Crypto extensions > to the ARMv8-A architecture. This patch adds that information to their > definitions in aarch64-cores.def. Both cortex-a53 and cortex-a57 can be configured without crypto support. The behavior of -mcpu=cortex-a5[37] should be to assume the presence of crypto, we will need to add support for -mcpu=core+feature in the future in order to provide a mechanism to turn off crypto. I think this patch is OK and should go in now. /Marcus
[Ada] Exception during finalization of controlled object
This patch corrects the finalization machinery to properly handle a controlled object that is initialized by an aggregate and acts as a transient. The object is now considered fully initialized after the last component assignment takes place. This avoids the finalization of uninitialized data that may lead to a Segmentation_Fault. -- Source -- -- types.ads with Ada.Finalization; use Ada.Finalization; package Types is Bomb : exception; Not_Zero : exception; procedure Set_Calls (Bomb_On : Natural); function New_Id return Natural; type Zero is new Natural range 0 .. 0; type Ctrl_Component is new Controlled with record Id : Natural := 0; Data : Zero:= 0; end record; procedure Adjust (Obj : in out Ctrl_Component); procedure Finalize (Obj : in out Ctrl_Component); procedure Initialize (Obj : in out Ctrl_Component); function Make_Component return Ctrl_Component; type Ctrl_Encapsulator is new Controlled with record Id : Natural := 0; Comp_1 : Ctrl_Component; Comp_2 : Ctrl_Component; Comp_3 : Ctrl_Component; end record; procedure Adjust (Obj : in out Ctrl_Encapsulator); procedure Finalize (Obj : in out Ctrl_Encapsulator); procedure Initialize (Obj : in out Ctrl_Encapsulator); type Encapsulator is record Id : Natural := 0; Comp_1 : Ctrl_Component; Comp_2 : Ctrl_Component; Comp_3 : Ctrl_Component; end record; type Super_Encapsulator is record Id : Natural := 0; Comp_1 : Ctrl_Encapsulator; Comp_2 : Ctrl_Encapsulator; Comp_3 : Ctrl_Encapsulator; end record; end Types; -- types.adb with Ada.Text_IO; use Ada.Text_IO; package body Types is Calls : Natural := 0; Calls_To_Bomb : Natural := 0; Id_Gen: Natural := 0; procedure Adjust (Obj : in out Ctrl_Component) is Old_Id : constant Natural := Obj.Id; New_Id : constant Natural := Old_Id * 100; begin Put_Line (" adj comp" & Old_Id'Img & " ->" & New_Id'Img); Obj.Id := New_Id; end Adjust; procedure Adjust (Obj : in out Ctrl_Encapsulator) is Old_Id : constant Natural := Obj.Id; New_Id : constant Natural := Old_Id * 100; begin Put_Line (" adj enca" & Old_Id'Img & " ->" & New_Id'Img); Obj.Id := New_Id; end Adjust; procedure Finalize (Obj : in out Ctrl_Component) is begin Put_Line (" fin comp" & Obj.Id'Img); if Obj.Data /= 0 then raise Not_Zero; end if; end Finalize; procedure Finalize (Obj : in out Ctrl_Encapsulator) is begin Put_Line (" fin enca" & Obj.Id'Img); end Finalize; procedure Initialize (Obj : in out Ctrl_Component) is begin Obj.Id := New_Id; Put_Line (" ini comp" & Obj.Id'Img); end Initialize; procedure Initialize (Obj : in out Ctrl_Encapsulator) is begin Obj.Id := New_Id; Put_Line (" ini comp" & Obj.Id'Img); end Initialize; function Make_Component return Ctrl_Component is begin if Calls = Calls_To_Bomb then raise Bomb; else Calls := Calls + 1; end if; declare Result : Ctrl_Component; begin return Result; end; end Make_Component; function New_Id return Natural is begin Id_Gen := Id_Gen + 1; return Id_Gen; end New_Id; procedure Set_Calls (Bomb_On : Natural) is begin Calls := 0; if Bomb_On >= 1 then Calls_To_Bomb := Bomb_On - 1; else Calls_To_Bomb := 0; end if; end Set_Calls; end Types; -- aggregates.adb with Ada.Finalization; use Ada.Finalization; with Ada.Text_IO; use Ada.Text_IO; with Types;use Types; procedure Aggregates is begin Put_Line ("Test 4"); Set_Calls (3); begin declare function Make_Encapsulator return Ctrl_Encapsulator is begin return Result : Ctrl_Encapsulator := (Controlled with Id => New_Id, Comp_1 => Make_Component, Comp_2 => Make_Component, Comp_3 => Make_Component); end Make_Encapsulator; Obj : Ctrl_Encapsulator; begin Obj := Make_Encapsulator; Put_Line ("ERROR: Test 4: Bomb not raised"); end; exception when Bomb => null; when Not_Zero => Put_Line ("ERROR: Test 4: Not_Zero raised"); when others => Put_Line ("ERROR: Test 4: unexpected exception"); end; Put_Line ("Test 4 end"); Put_Line ("Test 5"); Set_Calls (2); begin declare procedure Do_Nothing (Obj : Ctrl_Encapsulator) is begin null; end Do_Nothing; begin Do_Nothing (Ctrl_Encapsulator' (Controlled with Id => New_Id,
Re: [ARM][PATCH] Vectorizer generates unaligned access when -mno-unaligned-access is enabled
Kugan wrote: >> Ok if no regressions. > > Tested it on qemu for arm-none-linux-gnueabi and there is no new > regressions. I am sorry I didn’t mention it when I posted the patch. Commited in r207533. Thanks! -Y
Re: [PATCH] Fix two missed Makefile.in dependencies
On Thu, Feb 6, 2014 at 8:12 AM, Jakub Jelinek wrote: > Hi! > > I have noticed that with the .deps introduction for gcc/ we've lost > these two dependencies, which autodependency creation isn't aware of, > as the Makefile runs cat on those files and passes the content through > -D option. > > Ok for trunk? Ok. Richard. > 2014-02-06 Jakub Jelinek > > * Makefile.in (prefix.o, cppbuiltin.o): Depend on $(BASEVER). > > --- gcc/Makefile.in.jj 2014-01-28 14:03:49.0 +0100 > +++ gcc/Makefile.in 2014-02-05 13:09:26.871019299 +0100 > @@ -1925,6 +1925,7 @@ default-c.o: config/default-c.c > # Files used by all variants of C and some other languages. > > CFLAGS-prefix.o += -DPREFIX=\"$(prefix)\" -DBASEVER=$(BASEVER_s) > +prefix.o: $(BASEVER) > > # Language-independent files. > > @@ -2540,6 +2541,7 @@ PREPROCESSOR_DEFINES = \ >@TARGET_SYSTEM_ROOT_DEFINE@ > > CFLAGS-cppbuiltin.o += $(PREPROCESSOR_DEFINES) -DBASEVER=$(BASEVER_s) > +cppbuiltin.o: $(BASEVER) > > CFLAGS-cppdefault.o += $(PREPROCESSOR_DEFINES) > > > Jakub
Re: [PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 3)
On Thu, 6 Feb 2014, Jakub Jelinek wrote: > On Wed, Feb 05, 2014 at 08:42:27PM +0100, Jakub Jelinek wrote: > > So, where do we want to do that instead? E.g. should it be e.g. in > > tree_versionable_function_p directly and let the inliner (if it doesn't do > > already) also treat optimize(0) functions that aren't always_inline as > > noinline? > > So, another attempt to put the && opt_for_fn (fndecl, optimize) into > tree_versionable_function_p also failed, because e.g. for TM (but also for > SIMD clones) we need to clone -O0 functions. So, can I fix PR600{6,7}2 > without touching tree-inline.c and fix PR60026 again separately somehow else > as follow-up? Bootstrapped/regtested on x86_64-linux and i686-linux. That works for me. Thanks, Richard. > 2014-02-06 Jakub Jelinek > > PR target/60062 > * tree.h (opts_for_fn): New inline function. > (opt_for_fn): Define. > * config/i386/i386.c (ix86_function_regparm): Use > opt_for_fn (decl, optimize) instead of optimize. > > * gcc.c-torture/execute/pr60062.c: New test. > * gcc.c-torture/execute/pr60072.c: New test. > > --- gcc/tree.h.jj 2014-01-20 19:16:29.0 +0100 > +++ gcc/tree.h2014-02-05 15:54:04.681904492 +0100 > @@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var) > || TREE_ADDRESSABLE (var))); > } > > +/* Return pointer to optimization flags of FNDECL. */ > +static inline struct cl_optimization * > +opts_for_fn (const_tree fndecl) > +{ > + tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl); > + if (fn_opts == NULL_TREE) > +fn_opts = optimization_default_node; > + return TREE_OPTIMIZATION (fn_opts); > +} > + > +/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is > + the optimization level of function fndecl. */ > +#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)->x_##opt) > + > /* For anonymous aggregate types, we need some sort of name to > hold on to. In practice, this should not appear, but it should > not be harmful if it does. */ > --- gcc/config/i386/i386.c.jj 2014-02-05 10:37:36.166167116 +0100 > +++ gcc/config/i386/i386.c2014-02-05 15:56:36.779146394 +0100 > @@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type, >/* Use register calling convention for local functions when possible. */ >if (decl >&& TREE_CODE (decl) == FUNCTION_DECL > - && optimize > + /* Caller and callee must agree on the calling convention, so > + checking here just optimize means that with > + __attribute__((optimize (...))) caller could use regparm convention > + and callee not, or vice versa. Instead look at whether the callee > + is optimized or not. */ > + && opt_for_fn (decl, optimize) >&& !(profile_flag && !flag_fentry)) > { >/* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ > --- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj 2014-02-05 > 15:55:48.639388697 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr60062.c 2014-02-05 > 15:55:48.639388697 +0100 > @@ -0,0 +1,25 @@ > +/* PR target/60062 */ > + > +int a; > + > +static void > +foo (const char *p1, int p2) > +{ > + if (__builtin_strcmp (p1, "hello") != 0) > +__builtin_abort (); > +} > + > +static void > +bar (const char *p1) > +{ > + if (__builtin_strcmp (p1, "hello") != 0) > +__builtin_abort (); > +} > + > +__attribute__((optimize (0))) int > +main () > +{ > + foo ("hello", a); > + bar ("hello"); > + return 0; > +} > --- gcc/testsuite/gcc.c-torture/execute/pr60072.c.jj 2014-02-05 > 15:55:48.640388641 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr60072.c 2014-02-05 > 15:55:48.640388641 +0100 > @@ -0,0 +1,16 @@ > +/* PR target/60072 */ > + > +int c = 1; > + > +__attribute__ ((optimize (1))) > +static int *foo (int *p) > +{ > + return p; > +} > + > +int > +main () > +{ > + *foo (&c) = 2; > + return c - 2; > +} > > > Jakub > > -- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer
Re: [PATCH] Fix up __builtin_setjmp_receiver handling (PR c++/60082)
On Thu, Feb 06, 2014 at 10:33:54AM +0100, Richard Biener wrote: > > Not even in call_can_make_abnormal_goto? I mean, as we split > > __builtin_setjmp into __builtin_setjmp{_setup,_receiver} and the former > > has abnormal edge out of it, there is no need to have another abnormal edge > > out of the receiver. > > longjmp has outgoing abnormal edges, setjmp has > incoming abnormal edges. So how's the dispatcher working exactly? > It receives all abnormal edges from all possibly longjmp calling > stmts and dispatches them to all setjmp calls? I've actually removed the special __builtin_setjmp_dispatcher and replaced it with the same ABNORMAL_DISPATCHER used for normal setjmp/returns_twice calls. __builtin_setjmp_receiver actually is not really a returns_twice function, it only has the abnormal predecessor edge (from ABNORMAL_DISPATCHER) and no other predecessor edge, so it really works as a normal call with just the requirement that it is at the start of the bb. We just need to treat it as receiver of the abnormal edges (we do already). It is __builtin_setjmp_setup that has both normal fallthru edge and abnormal edge to the dispatcher. The dispatcher is expanded as nothing at all (it's basic block is actually removed), the receiver as loading of frame pointer etc. from somewhere and setup as saving frame pointer somewhere. > So why would there be extra edges from the receiver? That is what I'd like to change. The reason for the extra edge from the receiver is that it is a non-leaf call. Supposedly if we set ECF_LEAF on __builtin_setjmp_receiver, it might work, but that looks too dangerous to me, at least right now. > > > stmts we add abnormal edges to. Btw, why isn't BUILT_IN_SETJMP_RECEIVER > > > ECF_RETURNS_TWICE? > > > > See PR60003, #c5 in particular which really didn't work at all, and Eric's > > preference in #c8. > > Hmm, but it really is ECF_RETURNS_TWICE ... It is not. Jakub
Re: [PATCH][PING^3][AArch64] Specify CRC and Crypto support for Cortex-A53, A57
On 29/01/14 17:42, Kyrill Tkachov wrote: On 23/01/14 08:58, Kyrill Tkachov wrote: On 16/01/14 18:10, Kyrill Tkachov wrote: Hi all, The Cortex-A53 and Cortex-A57 cores support the CRC32 and Crypto extensions to the ARMv8-A architecture. This patch adds that information to their definitions in aarch64-cores.def. Tested aarch64-none-elf with no regressions. Ok for trunk? (or next stage1)? Thanks, Kyrill 2014-01-16 Kyrylo Tkachov * config/aarch64/aarch64-cores.def (cortex-a53): Specify CRC32 and crypto support. (cortex-a57): Likewise. (cortex-a57.cortex-a53): Likewise. Ping. Ping^2 Ping^3 http://gcc.gnu.org/ml/gcc-patches/2014-01/msg01012.html Kyrill Kyrill Kyrill
Re: [PATCH] Fix up __builtin_setjmp_receiver handling (PR c++/60082)
On Thu, 6 Feb 2014, Jakub Jelinek wrote: > On Thu, Feb 06, 2014 at 10:13:25AM +0100, Richard Biener wrote: > > > trunk? Alternatively we could special case BUILT_IN_SETJMP_RECEIVER > > > instead > > > in call_can_make_abnormal_goto, there is probably no need to have AB edge > > > out of __builtin_setjmp_receiver, we already have one out of > > > __builtin_setjmp_setup and receiver is really expanded just to load of > > > frame > > > pointer, so it doesn't jump anywhere. > > > > > > Note that this fixes both the CK tests on x86_64-linux, on i686-linux > > > neither of them timeout, but catch_exc.cc still fails: > > > FAIL: g++.dg/cilk-plus/CK/catch_exc.cc -O1 -fcilkplus execution test > > > FAIL: g++.dg/cilk-plus/CK/catch_exc.cc -O3 -fcilkplus execution test > > > FAIL: g++.dg/cilk-plus/CK/catch_exc.cc -g -O2 -fcilkplus execution test > > > > I don't like special-casing BUILT_IN_SETJMP_RECEIVER - we don't > > Not even in call_can_make_abnormal_goto? I mean, as we split > __builtin_setjmp into __builtin_setjmp{_setup,_receiver} and the former > has abnormal edge out of it, there is no need to have another abnormal edge > out of the receiver. longjmp has outgoing abnormal edges, setjmp has incoming abnormal edges. So how's the dispatcher working exactly? It receives all abnormal edges from all possibly longjmp calling stmts and dispatches them to all setjmp calls? So why would there be extra edges from the receiver? > > currently have a suitable exported stmt_starts_bb_p () (the existing > > one is too special), but similar to is_ctrl_altering_stmt we could > > add a is_ctrl_receiving_stmt that covers nonlocal labels and other > > Yes, that works for me. I wondered if insert_backedge_copies wouldn't do > harm if it inserts the assignments before normal setjmp call or other > call that returns twice. Yeah, it disrupts the CFG (unfortunately we don't verify that abnormal edge receievers start a BB, or rather that if a BB receives abnormal edges there is a stmt at the beginning that receives them). Adding such verification to gimple_verify_flow_info would be nice. > > stmts we add abnormal edges to. Btw, why isn't BUILT_IN_SETJMP_RECEIVER > > ECF_RETURNS_TWICE? > > See PR60003, #c5 in particular which really didn't work at all, and Eric's > preference in #c8. Hmm, but it really is ECF_RETURNS_TWICE ...
Re: [Patch][AArch64] Shift right pattern fix
On Mon, Feb 03, 2014 at 10:31:48AM +, Alex Velenko wrote: > Hi, > I agree to changelog change. Could this patch, please, be submitted as I > do not have rights to do so. > Kind regards, > Alex > > On 30/01/14 22:36, Marcus Shawcroft wrote: > > On 30 January 2014 15:28, Alex Velenko wrote: > >> Hi, > >> This patch fixes shift right pattern, as it failed on -O0 after shift > >> right patch. The reason was unnecessary movement of immediate value to > >> a register due to type mismatch. > >> Patch is tested not to cause any additional regressions. > >> Could someone, please, approve and commit this patch, as I do not have > >> the rights to do so? > >> > >> Kind regards, > >> Alex > >> > >> 2014-01-28 Alex Velenko > >> > >> gcc/ > >> > >> * config/aarch64/aarch64-simd.md (aarch64_ashr_simddi): Fixed. > > > > Fixed doesn't say what was changed, how about: > > > > * config/aarch64/aarch64-simd.md (aarch64_ashr_simddi): Change QI to SI. > > > > OK with that change. > > /Marcus > > > > I have committed this on Alex' behalf as revision 207531 with the following ChangeLog entry: 2014-02-06 Alex Velenko * config/aarch64/aarch64-simd.md (aarch64_ashr_simddi): Change QI to SI. Thanks, James
Re: wide-int, build system
On Jan 9, 2014, at 7:16 AM, Richard Biener wrote: > On Sat, Nov 23, 2013 at 8:20 PM, Mike Stump wrote: >> Richi has asked the we break the wide-int patch so that the individual port >> and front end maintainers can review their parts without have to go through >> the entire patch.This patch covers the build system (make). >> >> Ok? > > Needs updating (no explicit dependences for wide-int.h) but ok. Yeah, wondering around, and I can’t help but think there are a ton of extra header dependancies that are no longer needed... Anyway, here is the update I applied to the base… diff --git a/gcc/ChangeLog.wide-int b/gcc/ChangeLog.wide-int index 223fc07..3655961 100644 --- a/gcc/ChangeLog.wide-int +++ b/gcc/ChangeLog.wide-int @@ -368,11 +368,8 @@ (hash_tree): Use wide-int interfaces. (output_cfg): Likewise. * Makefile.in - (RTL_H, CFGLOOP_H, C_COMMON_H, TREE_FLOW_H): Add wide-int.h. - (REAL_H): Add signop.h. (OBJS): Add wide-int.o and wide-int-print.o. (GTFILES): Add wide-int.h and signop.h. - (wide-int.h): New rule. (TAGS): Look for .cc files too. * omp-low.c (scan_omp_1_op): Use wide-int interfaces. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 7926bd7..f53a89f 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -869,7 +869,7 @@ RTL_BASE_H = coretypes.h rtl.h rtl.def $(MACHMODE_H) reg-notes.def \ insn-notes.def $(INPUT_H) $(REAL_H) statistics.h $(VEC_H) \ $(FIXED_VALUE_H) alias.h $(HASHTAB_H) FIXED_VALUE_H = fixed-value.h $(MACHMODE_H) double-int.h -RTL_H = $(RTL_BASE_H) $(FLAGS_H) genrtl.h wide-int.h +RTL_H = $(RTL_BASE_H) $(FLAGS_H) genrtl.h READ_MD_H = $(OBSTACK_H) $(HASHTAB_H) read-md.h PARAMS_H = params.h params.def BUILTINS_DEF = builtins.def sync-builtins.def omp-builtins.def \ @@ -899,7 +899,7 @@ FUNCTION_H = function.h $(HASHTAB_H) $(TM_H) hard-reg-set.h \ EXPR_H = expr.h insn-config.h $(FUNCTION_H) $(RTL_H) $(FLAGS_H) $(TREE_H) $(MACHMODE_H) $(EMIT_RTL_H) OPTABS_H = optabs.h insn-codes.h insn-opinit.h REGS_H = regs.h $(MACHMODE_H) hard-reg-set.h -CFGLOOP_H = cfgloop.h $(BASIC_BLOCK_H) double-int.h wide-int.h \ +CFGLOOP_H = cfgloop.h $(BASIC_BLOCK_H) double-int.h \ $(BITMAP_H) sbitmap.h IPA_UTILS_H = ipa-utils.h $(TREE_H) $(CGRAPH_H) IPA_REFERENCE_H = ipa-reference.h $(BITMAP_H) $(TREE_H) @@ -914,7 +914,7 @@ TIMEVAR_H = timevar.h timevar.def INSN_ATTR_H = insn-attr.h insn-attr-common.h $(INSN_ADDR_H) INSN_ADDR_H = $(srcdir)/insn-addr.h C_COMMON_H = c-family/c-common.h c-family/c-common.def $(TREE_H) \ - $(SPLAY_TREE_H) $(CPPLIB_H) $(GGC_H) $(DIAGNOSTIC_CORE_H) wide-int.h + $(SPLAY_TREE_H) $(CPPLIB_H) $(GGC_H) $(DIAGNOSTIC_CORE_H) C_PRAGMA_H = c-family/c-pragma.h $(CPPLIB_H) C_TREE_H = c/c-tree.h $(C_COMMON_H) $(DIAGNOSTIC_H) SYSTEM_H = system.h hwint.h $(srcdir)/../include/libiberty.h \ @@ -932,7 +932,7 @@ TREE_PASS_H = tree-pass.h $(TIMEVAR_H) $(DUMPFILE_H) TREE_SSA_H = tree-ssa.h tree-ssa-operands.h \ $(BITMAP_H) sbitmap.h $(BASIC_BLOCK_H) $(GIMPLE_H) \ $(HASHTAB_H) $(CGRAPH_H) $(IPA_REFERENCE_H) \ - tree-ssa-alias.h wide-int.h + tree-ssa-alias.h PRETTY_PRINT_H = pretty-print.h $(INPUT_H) $(OBSTACK_H) TREE_PRETTY_PRINT_H = tree-pretty-print.h $(PRETTY_PRINT_H) GIMPLE_PRETTY_PRINT_H = gimple-pretty-print.h $(TREE_PRETTY_PRINT_H) @@ -941,7 +941,7 @@ DIAGNOSTIC_H = diagnostic.h $(DIAGNOSTIC_CORE_H) $(PRETTY_PRINT_H) C_PRETTY_PRINT_H = c-family/c-pretty-print.h $(PRETTY_PRINT_H) \ $(C_COMMON_H) $(TREE_H) TREE_INLINE_H = tree-inline.h -REAL_H = real.h $(MACHMODE_H) signop.h +REAL_H = real.h $(MACHMODE_H) LTO_STREAMER_H = lto-streamer.h $(LINKER_PLUGIN_API_H) $(TARGET_H) \ $(CGRAPH_H) $(VEC_H) $(HASH_TABLE_H) $(TREE_H) $(GIMPLE_H) \ $(GCOV_IO_H) $(DIAGNOSTIC_H) alloc-pool.h pointer-set.h @@ -2438,13 +2438,11 @@ CFLAGS-gengtype-parse.o += -DGENERATOR_FILE build/gengtype-parse.o: $(BCONFIG_H) gengtype-state.o build/gengtype-state.o: gengtype-state.c $(SYSTEM_H) \ - gengtype.h errors.h double-int.h version.h $(HASHTAB_H)\ - $(OBSTACK_H) $(XREGEX_H) + gengtype.h errors.h double-int.h version.h $(HASHTAB_H) $(OBSTACK_H) \ + $(XREGEX_H) gengtype-state.o: $(CONFIG_H) CFLAGS-gengtype-state.o += -DGENERATOR_FILE build/gengtype-state.o: $(BCONFIG_H) -wide-int.h: $(GTM_H) $(TREE_H) hwint.h $(OPTIONS_H)\ - $(MACHMODE_H) double-int.h dumpfile.h $(REAL_H) signop.h gengtype.o build/gengtype.o : gengtype.c $(SYSTEM_H) gengtype.h\ rtl.def insn-notes.def errors.h double-int.h version.h \ $(HASHTAB_H) $(OBSTACK_H) $(XREGEX_H)
Re: [PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 2)
On Wed, 5 Feb 2014, Jakub Jelinek wrote: > On Wed, Feb 05, 2014 at 02:20:05PM +0100, Richard Biener wrote: > > > Using !!optimize to determine if we should switch local ABI to regparm > > > convention isn't compatible with optimize attribute, as !!optimize is > > > whether the current function is being optimized, but for the ABI decisions > > > we actually need the caller and callee to agree on the calling convention. > > > > > > Fixed by looking at callee's optimize settings all the time. > > > > > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > > > Seeing a 2nd variant of such code I'd rather have an abstraction > > for this ... optimize_fn ()? opt_for_fn (fn, OPT_...)? > > Here is an updated version of the patch, unfortunately this one regressed in > quite a lot of tests. The problem is that by using opt_for_fn there > it also sets the failure reason for all functions at -O0 that don't have > the optimize attribute at all. Now copy_forbidden is used by the inliner, > where we have to handle always_inline functions, other -O0 functions we likely > don't want to inline, be it because the cmd line options are -O0 or because > the function we are considering for inlining has optimize (0), and > for versioning, where I'd say we want to never version the -O0 functions. > > So, where do we want to do that instead? E.g. should it be e.g. in > tree_versionable_function_p directly and let the inliner (if it doesn't do > already) also treat optimize(0) functions that aren't always_inline as > noinline? > > I guess even without this patch my PR60026 fix broke inlining of > __attribute__((always_inline, optimize (0))) functions. > > 2014-02-05 Jakub Jelinek > > PR target/60062 > * tree.h (opts_for_fn): New inline function. > (opt_for_fn): Define. > * config/i386/i386.c (ix86_function_regparm): Use > opt_for_fn (decl, optimize) instead of optimize. > * tree-inline.c (copy_forbidden): Use opt_for_fn. > > * gcc.c-torture/execute/pr60062.c: New test. > * gcc.c-torture/execute/pr60072.c: New test. > > --- gcc/tree.h.jj 2014-01-20 19:16:29.0 +0100 > +++ gcc/tree.h2014-02-05 15:54:04.681904492 +0100 > @@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var) > || TREE_ADDRESSABLE (var))); > } > > +/* Return pointer to optimization flags of FNDECL. */ > +static inline struct cl_optimization * > +opts_for_fn (const_tree fndecl) > +{ > + tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl); > + if (fn_opts == NULL_TREE) > +fn_opts = optimization_default_node; > + return TREE_OPTIMIZATION (fn_opts); > +} > + > +/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is > + the optimization level of function fndecl. */ > +#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)->x_##opt) > + > /* For anonymous aggregate types, we need some sort of name to > hold on to. In practice, this should not appear, but it should > not be harmful if it does. */ > --- gcc/config/i386/i386.c.jj 2014-02-05 10:37:36.166167116 +0100 > +++ gcc/config/i386/i386.c2014-02-05 15:56:36.779146394 +0100 > @@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type, >/* Use register calling convention for local functions when possible. */ >if (decl >&& TREE_CODE (decl) == FUNCTION_DECL > - && optimize > + /* Caller and callee must agree on the calling convention, so > + checking here just optimize means that with > + __attribute__((optimize (...))) caller could use regparm convention > + and callee not, or vice versa. Instead look at whether the callee > + is optimized or not. */ > + && opt_for_fn (decl, optimize) >&& !(profile_flag && !flag_fentry)) > { >/* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ > --- gcc/tree-inline.c.jj 2014-02-04 14:03:31.0 +0100 > +++ gcc/tree-inline.c 2014-02-05 15:55:34.747448968 +0100 > @@ -3315,16 +3315,10 @@ copy_forbidden (struct function *fun, tr > goto fail; >} > > - tree fs_opts; > - fs_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fun->decl); > - if (fs_opts) > + if (!opt_for_fn (fun->decl, optimize)) Maybe if (!opt_for_fn (fun->decl, optimize) && optimization_default_node->x_optimize != 0) ? But yes, copy_forbidden is a bit big hammer considering always_inline and similar stuff. I think we need to handle -O0 at all consumers selectively. > { > - struct cl_optimization *os = TREE_OPTIMIZATION (fs_opts); > - if (!os->x_optimize) > - { > - reason = G_("function %q+F compiled without optimizations"); > - goto fail; > - } > + reason = G_("function %q+F compiled without optimizations"); > + goto fail; > } > > fail: > --- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj 2014-02-05 > 15:55:48.639388697 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr60062.c
Re: [PATCH] Fix up __builtin_setjmp_receiver handling (PR c++/60082)
On Thu, Feb 06, 2014 at 10:13:25AM +0100, Richard Biener wrote: > > trunk? Alternatively we could special case BUILT_IN_SETJMP_RECEIVER instead > > in call_can_make_abnormal_goto, there is probably no need to have AB edge > > out of __builtin_setjmp_receiver, we already have one out of > > __builtin_setjmp_setup and receiver is really expanded just to load of frame > > pointer, so it doesn't jump anywhere. > > > > Note that this fixes both the CK tests on x86_64-linux, on i686-linux > > neither of them timeout, but catch_exc.cc still fails: > > FAIL: g++.dg/cilk-plus/CK/catch_exc.cc -O1 -fcilkplus execution test > > FAIL: g++.dg/cilk-plus/CK/catch_exc.cc -O3 -fcilkplus execution test > > FAIL: g++.dg/cilk-plus/CK/catch_exc.cc -g -O2 -fcilkplus execution test > > I don't like special-casing BUILT_IN_SETJMP_RECEIVER - we don't Not even in call_can_make_abnormal_goto? I mean, as we split __builtin_setjmp into __builtin_setjmp{_setup,_receiver} and the former has abnormal edge out of it, there is no need to have another abnormal edge out of the receiver. > currently have a suitable exported stmt_starts_bb_p () (the existing > one is too special), but similar to is_ctrl_altering_stmt we could > add a is_ctrl_receiving_stmt that covers nonlocal labels and other Yes, that works for me. I wondered if insert_backedge_copies wouldn't do harm if it inserts the assignments before normal setjmp call or other call that returns twice. > stmts we add abnormal edges to. Btw, why isn't BUILT_IN_SETJMP_RECEIVER > ECF_RETURNS_TWICE? See PR60003, #c5 in particular which really didn't work at all, and Eric's preference in #c8. Jakub
[PATCH] Add a couple of dialect and warning options regarding Objective-C instance variable scope
Hello, This is a patch regarding a couple of Objective-C related dialect options and warning switches. I have already submitted it a while ago but gave up after pinging a couple of times. I am now informed that should have kept pinging until I got someone's attention so I'm resending it. The patch is now against an old revision and as I stated originally it's probably not in a state that can be adopted as is. I'm sending it as is so that the implemented features can be assesed in terms of their usefulness and if they're welcome I'd be happy to make any necessary changes to bring it up-to-date, split it into smaller patches, add test-cases and anything else that is deemed necessary. Here's the relevant text from my initial message: Two of these switches are related to a feature request I submitted a while ago, Bug 56044 (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56044). I won't reproduce the entire argument here since it is available in the feature request. The relevant functionality in the patch comes in the form of two switches: -Wshadow-ivars which controls the "local declaration of ‘somevar’ hides instance variable" warning which curiously is enabled by default instead of being controlled at least by -Wshadow. The patch changes it so that this warning can be enabled and disabled specifically through -Wshadow-ivars as well as with all other shadowing-related warnings through -Wshadow. The reason for the extra switch is that, while searching through the Internet for a solution to this problem I have found out that other people are inconvenienced by this particular warning as well so it might be useful to be able to turn it off while keeping all the other shadowing-related warnings enabled. -flocal-ivars which when true, as it is by default, treats instance variables as having local scope. If false (-fno-local-ivars) instance variables must always be referred to as self->ivarname and references of ivarname resolve to the local or global scope as usual. I've also taken the opportunity of adding another switch unrelated to the above but related to instance variables: -fivar-visibility which can be set to either private, protected (the default), public and package. This sets the default instance variable visibility which normally is implicitly protected. My use-case for it is basically to be able to set it to public and thus effectively disable this visibility mechanism altogether which I find no use for and therefore have to circumvent. I'm not sure if anyone else feels the same way towards this but I figured it was worth a try. I'm attaching a preliminary patch against the current revision in case anyone wants to have a look. The changes are very small and any blatant mistakes should be immediately obvious. I have to admit to having virtually no knowledge of the internals of GCC but I have tried to keep in line with formatting guidelines and general style as well as looking up the particulars of the way options are handled in the available documentation to avoid blind copy-pasting. I have also tried to test the functionality both in my own (relatively large, or at least not too small) project and with small test programs and everything works as expected. Finallly, I tried running the tests too but these fail to complete both in the patched and unpatched version, possibly due to the way I've configured GCC. Dimitris Index: gcc/c-family/c.opt === --- gcc/c-family/c.opt (revision 199867) +++ gcc/c-family/c.opt (working copy) @@ -661,6 +661,10 @@ Wselector ObjC ObjC++ Var(warn_selector) Warning Warn if a selector has multiple methods +Wshadow-ivar +ObjC ObjC++ Var(warn_shadow_ivar) Init(-1) Warning +Warn if a local declaration hides an instance variable + Wsequence-point C ObjC C++ ObjC++ Var(warn_sequence_point) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) Warn about possible violations of sequence point rules @@ -1018,6 +1022,29 @@ fnil-receivers ObjC ObjC++ Var(flag_nil_receivers) Init(1) Assume that receivers of Objective-C messages may be nil +flocal-ivars +ObjC ObjC++ Var(flag_local_ivars) Init(1) +Allow access to instance variables as if they were local declarations within instance method implementations. + +fivar-visibility= +ObjC ObjC++ Joined RejectNegative Enum(ivar_visibility) Var(default_ivar_visibility) Init(IVAR_VISIBILITY_PROTECTED) +-fvisibility=[private|protected|public|package] Set the default symbol visibility + +Enum +Name(ivar_visibility) Type(enum ivar_visibility) UnknownError(unrecognized ivar visibility value %qs) + +EnumValue +Enum(ivar_visibility) String(private) Value(IVAR_VISIBILITY_PRIVATE) + +EnumValue +Enum(ivar_visibility) String(protected) Value(IVAR_VISIBILITY_PROTECTED) + +EnumValue +Enum(ivar_visibility) String(public) Value(IVAR_VISIBILITY_PUBLIC) + +EnumValue +Enum(ivar_visibility) String(package) Value(IVAR_VISIBILITY_P
Re: [ARM][PATCH] Vectorizer generates unaligned access when -mno-unaligned-access is enabled
On 06/02/14 01:56, Ramana Radhakrishnan wrote: > > >> >> Ramana wrote: Attached patch fixes this. Is this OK for trunk? >>> >>> How has it been tested ? >> >> I was hoping Linaro people could run their magical cbuild on it... > > Ok if no regressions. > Tested it on qemu for arm-none-linux-gnueabi and there is no new regressions. I am sorry I didn’t mention it when I posted the patch. Thanks, Kugan
Re: [PATCH] Fix up __builtin_setjmp_receiver handling (PR c++/60082)
On Thu, 6 Feb 2014, Jakub Jelinek wrote: > Hi! > > I've looked at the spawner_inline.c timeout issue, and it looks like the > problem is that __builtin_setjmp_receiver is a stmt_ends_bb_p call and > insert_backedge_copies then happily inserts statements before the call, > because it can't insert them after the call. But __builtin_setjmp_receiver > is quite special beast, it's bb is reachable only through an abnormal edge > and it pretty much relies on being at the beginning of the bb, as before the > call e.g. frame pointer value is bogus, the expansion of the call itself > fixes that up and then emits a blockage to make sure things aren't > reordered. But when outof-ssa emits something before the builtin, it isn't > reordered, but is on the wrong side of the frame pointer restoration. > > Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for > trunk? Alternatively we could special case BUILT_IN_SETJMP_RECEIVER instead > in call_can_make_abnormal_goto, there is probably no need to have AB edge > out of __builtin_setjmp_receiver, we already have one out of > __builtin_setjmp_setup and receiver is really expanded just to load of frame > pointer, so it doesn't jump anywhere. > > Note that this fixes both the CK tests on x86_64-linux, on i686-linux > neither of them timeout, but catch_exc.cc still fails: > FAIL: g++.dg/cilk-plus/CK/catch_exc.cc -O1 -fcilkplus execution test > FAIL: g++.dg/cilk-plus/CK/catch_exc.cc -O3 -fcilkplus execution test > FAIL: g++.dg/cilk-plus/CK/catch_exc.cc -g -O2 -fcilkplus execution test I don't like special-casing BUILT_IN_SETJMP_RECEIVER - we don't currently have a suitable exported stmt_starts_bb_p () (the existing one is too special), but similar to is_ctrl_altering_stmt we could add a is_ctrl_receiving_stmt that covers nonlocal labels and other stmts we add abnormal edges to. Btw, why isn't BUILT_IN_SETJMP_RECEIVER ECF_RETURNS_TWICE? Thanks, Richard. > 2014-02-06 Jakub Jelinek > > PR c++/60082 > * tree-outof-ssa.c (insert_backedge_copies): Never insert statements > before __builtin_setjmp_receiver. > > Revert > 2014-02-05 Balaji V. Iyer > > * g++.dg/cilk-plus/CK/catch_exc.cc: Disable test for -O1. > * c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise. > > --- gcc/tree-outof-ssa.c.jj 2014-01-03 11:41:01.0 +0100 > +++ gcc/tree-outof-ssa.c 2014-02-05 22:12:03.637412102 +0100 > @@ -1152,6 +1152,12 @@ insert_backedge_copies (void) > if (TREE_CODE (arg) == SSA_NAME > && SSA_NAME_DEF_STMT (arg) == last) > continue; > + /* Never insert anything before > + __builtin_setjmp_receiver, that builtin should be > + immediately after labels. */ > + if (gimple_call_builtin_p (last, > + BUILT_IN_SETJMP_RECEIVER)) > + continue; > } > > /* Create a new instance of the underlying variable of the > --- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc(revision 207519) > +++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc(revision 207518) > @@ -1,7 +1,6 @@ > /* { dg-options "-fcilkplus" } */ > /* { dg-do run { target i?86-*-* x86_64-*-* arm*-*-* } } */ > /* { dg-options "-fcilkplus -lcilkrts" { target { i?86-*-* x86_64-*-* > arm*-*-* } } } */ > -/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */ > > #include > #include > --- gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c (revision > 207519) > +++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c (revision > 207518) > @@ -1,7 +1,6 @@ > /* { dg-do run { target { i?86-*-* x86_64-*-* } } } */ > /* { dg-options "-fcilkplus" } */ > /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } > */ > -/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */ > > #include > #define DEFAULT_VALUE 30 > > Jakub > > -- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer