Re: [PATCH PR target/65058] AIX: missing extern decorations "[DS]" for functions and "[RW]" for variables
Hi David,

On 2015-02-14 at 22:05, David Edelsohn wrote:
> Hi, Michael
>
> Thanks for noticing this. This patch generally seems to be on the
> right track. The original ASM_OUTPUT_EXTERNAL code was not completely
> correct in the pedantic sense. It should use [UA] mapping class
> instead of [RW] for all non-function descriptor symbols.

One more thought: How is that dollar_inside thing seen in ASM_OUTPUT_EXTERNAL supposed to work/be used?

> This patch also needs a dg-final scan-assembler test to check for [DS]
> and [UA] to ensure that this does not regress again in the future.

Thanks for committing!

/haubi/ (the one you gave commit access already:)
Re: [patch] fix PR65048: check that jump-thread paths are still valid
On 02/13/15 16:50, Sebastian Pop wrote:
> Hi,
>
> the attached patch fixes PR65048 by checking before jump-threading that a path
> to be threaded is still valid: as the testcase shows, there may be paths that
> are not connected anymore because the cfg has changed in a previous jump-thread.
>
>   PR tree-optimization/65048
>   * tree-ssa-threadupdate.c (valid_jump_thread_path): New.
>   (thread_through_all_blocks): Call valid_jump_thread_path.
>   Remove invalid FSM jump-thread paths.
>
>   * gcc.dg/tree-ssa/ssa-dom-thread-9.c: New.
>
> The patch passed bootstrap and regression tests on x86_64-linux.
> Ok for trunk?

These kinds of situations are normally pruned out in mark_threaded_blocks.

The dumps for the FSM threads are a bit sparse -- they don't show the entire path. That makes it much harder to see what's going on.

It also appears that FSM is registering lots of duplicate paths.

grep Registering j*.dom1 | grep -v PHI | sort
  Registering FSM jump thread: (10, 12) incoming edge; (15, 3) nocopy;
  Registering FSM jump thread: (11, 12) incoming edge; (15, 16) nocopy;
  Registering FSM jump thread: (16, 3) incoming edge; (15, 16) nocopy;
  Registering FSM jump thread: (5, 10) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (5, 10) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (5, 11) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (5, 11) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (6, 14) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (6, 14) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (6, 3) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (6, 3) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (7, 10) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (7, 10) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (7, 11) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (7, 11) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (8, 14) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (8, 14) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (8, 3) incoming edge; (13, 14) nocopy;
  Registering FSM jump thread: (8, 3) incoming edge; (13, 14) nocopy;

Something seems wrong there.

Anyway, so what node precisely is not connected? Is that happening as a result of the duplicated jump threads or is it something else?

Jeff
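For readers following the thread, here is a toy sketch of the kind of connectivity check the ChangeLog entry describes: a registered path is only usable if every edge on it still exists and each edge starts where the previous one ended. This is not the code from the patch. The names `toy_edge`, `toy_edge_exists` and `toy_valid_path` are invented for illustration, and the model assumes the whole path is available as a list of (source, destination) block pairs, which the sparse dumps above do not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified edge: a (source, destination) pair of block
   indices.  GCC's real edge type carries far more state.  */
struct toy_edge
{
  int src;
  int dest;
};

/* Return true if E is present in the CFG edge list.  */
static bool
toy_edge_exists (const struct toy_edge *cfg, size_t n_cfg, struct toy_edge e)
{
  for (size_t i = 0; i < n_cfg; i++)
    if (cfg[i].src == e.src && cfg[i].dest == e.dest)
      return true;
  return false;
}

/* Return true if every edge of PATH still exists in CFG and each edge
   starts at the block where the previous edge ended.  */
static bool
toy_valid_path (const struct toy_edge *cfg, size_t n_cfg,
                const struct toy_edge *path, size_t n_path)
{
  assert (path != NULL || n_path == 0);
  for (size_t i = 0; i < n_path; i++)
    {
      /* An earlier thread may have removed or redirected this edge.  */
      if (!toy_edge_exists (cfg, n_cfg, path[i]))
        return false;
      /* Consecutive edges must share a block.  */
      if (i > 0 && path[i - 1].dest != path[i].src)
        return false;
    }
  return true;
}
```

In this model a path recorded before an earlier thread rewired the CFG fails either the existence test or the adjacency test, and would simply be dropped rather than threaded.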
Re: [Haifa Scheduler] Fix latent bug in macro-fusion/instruction grouping
On 02/11/15 02:20, James Greenhalgh wrote:
On Mon, Feb 09, 2015 at 11:16:56PM +, Jeff Law wrote:
On 02/06/15 05:24, James Greenhalgh wrote:

---
2015-02-06  James Greenhalgh

  * haifa-sched.c (recompute_todo_spec): After applying a
  replacement and cancelling a dependency, also clear the
  SCHED_GROUP_P flag.

My worry here would be that we might be clearing a SCHED_GROUP_P that had been set for some reason other than macro-fusion.

Yeah, I also had this worry. This patch tackles the problem from the other direction. If we see a SCHED_GROUP_P on an insn, treat it as a hard dependency, and don't try to rewrite it. I think this will always be "safe" but it might pessimize if the dependency breaker would have resulted in better code generation.

I don't think this gives you anything towards fixing your bug, but it clears mine.

Right. Mine was in the management of the ready queue. We allowed something with SCHED_GROUP_P to get deferred for several cycles. While it was deferred, another insn that was previously deferred became ready and fired. That messed up the scheduling group and ultimately resulted in incorrect code.

The fix was actually pretty simple: we just queue the SCHED_GROUP_P for a single cycle, then reevaluate.

I've bootstrapped and tested on x86_64-unknown-linux-gnu with no issues and given it a quick check on the problem code from before, where it has the desired impact.

Thanks,
James

---
2015-02-10  James Greenhalgh

  * haifa-sched.c (recompute_todo_spec): Treat SCHED_GROUP_P
  as forcing a HARD_DEP between instructions, thereby
  disallowing rewriting to break dependencies.

OK.

jeff
Re: [PATCH][RFA][LRA] Don't try to break down subreg expressions if insn already matches
On 02/14/15 04:23, Maxim Kuvyrkov wrote: FYI, (and not related to the core issue of this patch) The use of mult vs shift by combine is a problem that Venkat is working on, see "[RFC] Tighten memory type assumption in RTL combiner pass" . The combiner uses MULTs instead of SHIFTs for rtx'es that look like addresses, even when they are, in fact, mere logic/arithmetic operations. Right. I think I put that into my gcc-6 queue because we don't have a regression that requires this problem be fixed. If you've got a BZ that's regressing because of this issue, don't hesitate to point it out and I'll move the thread into my gcc-5 queue. jeff
Re: Ping : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.
On Sun, Feb 15, 2015 at 7:35 PM, Segher Boessenkool wrote:
> Hi Terry,
>
> I still think this is stage1 material.
>
>> +  /* Don't combine if dest contains a user specified register and i3 contains
>> +     ASM_OPERANDS, because the user specified register (same with dest) in i3
>> +     would be replaced by the src of insn which might be different with
>> +     the user's expectation.  */
>
> "Do not eliminate a register asm in an asm input" or similar? Text
> explaining why REG_USERVAR_P && HARD_REGISTER_P works here would be
> good to have, too.
>
>> +  if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P (dest)
>> +      && (GET_CODE (PATTERN (i3)) == SET
>> +          && GET_CODE (SET_SRC (PATTERN (i3))) == ASM_OPERANDS))
>> +    return 0;
>
> That works only for asms with exactly one output. You want
> extract_asm_operands.
>
>
> Segher

Thanks Segher. Patch is updated per your suggestion. Is this one ok for stage 1?

BR,
Terry

pr64818-combine-user-specified-register.patch-4
Description: Binary data
Re: [C++ PATCH] Fix constexpr C++11 handling with lambdas (PR c++/65075)
OK. Jason
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
> From: Steven Bosscher [mailto:stevenb@gmail.com]
> Sent: Tuesday, February 17, 2015 4:19 AM
> To: Thomas Preud'homme
> Cc: GCC Patches; Richard Biener
> Subject: Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop
> not possible
>
> On Mon, Feb 16, 2015 at 11:26 AM, Thomas Preud'homme wrote:
>
> >  /* Subroutine of cprop_insn that tries to propagate constants into
> > @@ -1044,40 +1042,41 @@ cprop_insn (rtx_insn *insn)
>
> > -  /* Constant propagation.  */
> > -  if (cprop_constant_p (src))
> > -    {
> > -      if (constprop_register (reg_used, src, insn))
> > +  /* Constant propagation.  */
> > +  if (src_cst && cprop_constant_p (src_cst)
> > +      && constprop_register (reg_used, src_cst, insn))
> >      {
> >        changed_this_round = changed = 1;
> >        global_const_prop_count++;
>
> The cprop_constant_p test is redundant, you only have non-NULL src_cst
> if it is a cprop_constant_p (as you test for it in find_avail_set()).

Ack.

>
>
> > @@ -1087,18 +1086,16 @@ retry:
> >                 "GLOBAL CONST-PROP: Replacing reg %d in ",
> >                 regno);
> >        fprintf (dump_file, "insn %d with constant ",
> >                 INSN_UID (insn));
> > -      print_rtl (dump_file, src);
> > +      print_rtl (dump_file, src_cst);
> >        fprintf (dump_file, "\n");
> >      }
> >    if (insn->deleted ())
> >      return 1;
> >  }
> > -  }
> > -  else if (REG_P (src)
> > -           && REGNO (src) >= FIRST_PSEUDO_REGISTER
> > -           && REGNO (src) != regno)
> > -    {
> > -      if (try_replace_reg (reg_used, src, insn))
> > +  else if (src_reg && REG_P (src_reg)
> > +           && REGNO (src_reg) >= FIRST_PSEUDO_REGISTER
> > +           && REGNO (src_reg) != regno
> > +           && try_replace_reg (reg_used, src_reg, insn))
>
> Likewise for the REG_P and ">= FIRST_PSEUDO_REGISTER" tests here (with
> the equivalent and IMHO preferable HARD_REGISTER_P test in
> find_avail_set()).

I'm not sure I follow you here. First, it seems to me that the equivalent test is rather REG_P && !HARD_REGISTER_P since here it checks if it's a pseudo register.

Then, do you mean the test can be simply removed because of the REG_P && !HARD_REGISTER_P in hash_scan_set () called indirectly by compute_hash_table () when called in one_cprop_pass () before any cprop_insn ()? Or do you mean I should move the check in find_avail_set ()?

Best regards,

Thomas
Re: [PATCH][PR tree-optimization/64823] Handle threading through blocks with PHIs, but no statements V2
On 02/16/15 14:11, Jakub Jelinek wrote:
> On Mon, Feb 16, 2015 at 02:00:32PM -0700, Jeff Law wrote:
>> --- a/gcc/tree-vrp.c
>> +++ b/gcc/tree-vrp.c
>> @@ -10176,13 +10176,20 @@ identify_jump_threads (void)
>>        /* We only care about blocks ending in a COND_EXPR.  While there
>>           may be some value in handling SWITCH_EXPR here, I doubt it's
>>           terribly important.  */
>> -      last = gsi_stmt (gsi_last_bb (bb));
>> +      last = gsi_stmt (gsi_last_nondebug_bb (bb));
>
> Isn't that just
>   last = last_stmt (bb);
> equivalent?

It is. I'll make that change and update the comment as well.

jeff
RE: [PATCH, FT32] initial support
Aha yes. Revised attached. invoke.texi now has:

These options are defined specifically for the FT32 port.

@table @gcctabopt

@item -msim
@opindex msim
Specifies that the program will be run on the simulator. This causes
an alternate runtime startup and library to be linked.
You must not use this option when generating programs that will run on
real hardware; you must provide your own runtime library for whatever
I/O functions are needed.

@end table

--
James Bowman
FTDI Open Source Liaison

From: Joseph Myers [jos...@codesourcery.com]
Sent: Tuesday, February 17, 2015 2:06 AM
To: James Bowman
Cc: gcc-patches@gcc.gnu.org
Subject: RE: [PATCH, FT32] initial support

On Mon, 16 Feb 2015, James Bowman wrote:

> I have updated the target options. Space-saving is now enabled by
> -Os. There is also a new option -msim to enable building for the
> simulator (the simulator is pending submission to gdb-binutils).

The documentation in this patch doesn't seem to have been updated for
those changes.

--
Joseph S. Myers
jos...@codesourcery.com

gcc-ft32.txt.gz
Description: gcc-ft32.txt.gz
Re: nvptx offloading patches [3/n], RFD
On Mon, Feb 16, 2015 at 10:35:30PM +0100, Richard Biener wrote: > Seeing the real format string you introduce I wonder if identifying modes > by their names wouldn't work in 99% of all cases (apart from PSImode > maybe). There are various corner cases. Plus of course sometimes insignificant, but sometimes very significant, floating mode changes. SFmode on one target might be completely different from another target. > Also for most cases we can construct the machine mode from the type. Or > where that is not possible stream the extra info that is necessary > instead. I thought we've discussed that already on IRC. E.g. decimal modes are identified only by mode and nothing else, and it doesn't look like it can be easily derived from types in many cases (spent quite some time on that). > Overall feels like a hack BTW :) can't we assign machine mode enum IDs in > a target independent way? I mean, it doesn't have to be densely > allocated? We iterate over modes, we have tons of tables indexed by modes, so if we introduce gaps, we'll make the compiler bigger and slower. If this is limited to the offloading path, like in the attached updated patch, the overhead for native LTO should be not measurable. 
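As an aside for readers, the table idea discussed above can be modelled in a few lines: the writer streams a description of each mode it used, and the reader maps each streamed mode number onto whichever of its own modes has the same description, falling back to an identity table when host and reader are the same compiler. The sketch below is a toy under stated assumptions, not the patch: `toy_mode`, `toy_identity_table` and `toy_build_table` are invented names, only three of the streamed properties are compared, and mapping unmatched modes to 0 is an arbitrary choice made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mode description.  The fields echo properties the patch
   packs per mode (class, size, precision); the struct is invented.  */
struct toy_mode
{
  int mclass;
  unsigned size;
  unsigned precision;
};

/* Native case: every streamed mode number maps to itself.  */
static void
toy_identity_table (unsigned char *table, size_t n)
{
  assert (n <= 256);
  for (size_t i = 0; i < n; i++)
    table[i] = (unsigned char) i;
}

/* Offload case: map each host mode number to the first target mode with
   the same description.  Unmatched host modes map to 0, an arbitrary
   choice for this sketch.  Returns the number of unmatched host modes.  */
static size_t
toy_build_table (const struct toy_mode *host, size_t n_host,
                 const struct toy_mode *target, size_t n_target,
                 unsigned char *table)
{
  size_t unmatched = 0;
  assert (n_host <= 256 && n_target <= 256);
  for (size_t h = 0; h < n_host; h++)
    {
      size_t t;
      for (t = 0; t < n_target; t++)
        if (host[h].mclass == target[t].mclass
            && host[h].size == target[t].size
            && host[h].precision == target[t].precision)
          break;
      if (t == n_target)
        {
          table[h] = 0;
          unmatched++;
        }
      else
        table[h] = (unsigned char) t;
    }
  return unmatched;
}
```

The lookup cost is paid once per streamed section rather than once per mode use, which is in line with the point above that restricting the translation to the offloading path should keep native LTO unaffected.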
--- gcc/passes.c.jj 2015-02-16 22:18:33.219702315 +0100 +++ gcc/passes.c2015-02-16 22:19:20.842917807 +0100 @@ -2460,6 +2460,7 @@ ipa_write_summaries_1 (lto_symtab_encode struct lto_out_decl_state *state = lto_new_out_decl_state (); state->symtab_node_encoder = encoder; + lto_output_init_mode_table (); lto_push_out_decl_state (state); gcc_assert (!flag_wpa); @@ -2581,6 +2582,7 @@ ipa_write_optimization_summaries (lto_sy lto_symtab_encoder_iterator lsei; state->symtab_node_encoder = encoder; + lto_output_init_mode_table (); lto_push_out_decl_state (state); for (lsei = lsei_start_function_in_partition (encoder); !lsei_end_p (lsei); lsei_next_function_in_partition (&lsei)) --- gcc/tree-streamer.h.jj 2015-02-16 22:18:33.222702266 +0100 +++ gcc/tree-streamer.h 2015-02-16 22:19:20.843917791 +0100 @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. #include "streamer-hooks.h" #include "lto-streamer.h" +#include "data-streamer.h" #include "hash-map.h" /* Cache of pickled nodes. Used to avoid writing the same node more @@ -91,6 +92,7 @@ void streamer_write_integer_cst (struct void streamer_write_builtin (struct output_block *, tree); /* In tree-streamer.c. 
*/ +extern unsigned char streamer_mode_table[1 << 8]; void streamer_check_handled_ts_structures (void); bool streamer_tree_cache_insert (struct streamer_tree_cache_d *, tree, hashval_t, unsigned *); @@ -119,5 +121,19 @@ streamer_tree_cache_get_hash (struct str return cache->hashes[ix]; } +static inline void +bp_pack_machine_mode (struct bitpack_d *bp, machine_mode mode) +{ + streamer_mode_table[mode] = 1; + bp_pack_enum (bp, machine_mode, 1 << 8, mode); +} + +static inline machine_mode +bp_unpack_machine_mode (struct bitpack_d *bp) +{ + return (machine_mode) + ((struct lto_input_block *) + bp->stream)->mode_table[bp_unpack_enum (bp, machine_mode, 1 << 8)]; +} #endif /* GCC_TREE_STREAMER_H */ --- gcc/lto-streamer-out.c.jj 2015-02-16 22:18:33.204702562 +0100 +++ gcc/lto-streamer-out.c 2015-02-16 22:20:06.659163066 +0100 @@ -2642,6 +2642,96 @@ produce_symtab (struct output_block *ob) } +/* Init the streamer_mode_table for output, where we collect info on what + machine_mode values have been streamed. */ +void +lto_output_init_mode_table (void) +{ + memset (streamer_mode_table, '\0', MAX_MACHINE_MODE); +} + + +/* Write the mode table. */ +static void +lto_write_mode_table (void) +{ + struct output_block *ob; + ob = create_output_block (LTO_section_mode_table); + bitpack_d bp = bitpack_create (ob->main_stream); + + /* Ensure that for GET_MODE_INNER (m) != VOIDmode we have + also the inner mode marked. */ + for (int i = 0; i < (int) MAX_MACHINE_MODE; i++) +if (streamer_mode_table[i]) + { + machine_mode m = (machine_mode) i; + if (GET_MODE_INNER (m) != VOIDmode) + streamer_mode_table[(int) GET_MODE_INNER (m)] = 1; + } + /* First stream modes that have GET_MODE_INNER (m) == VOIDmode, + so that we can refer to them afterwards. 
*/ + for (int pass = 0; pass < 2; pass++) +for (int i = 0; i < (int) MAX_MACHINE_MODE; i++) + if (streamer_mode_table[i] && i != (int) VOIDmode && i != (int) BLKmode) + { + machine_mode m = (machine_mode) i; + if ((GET_MODE_INNER (m) == VOIDmode) ^ (pass == 0)) + continue; + bp_pack_value (&bp, m, 8); + bp_pack_enum (&bp, mode_class, MAX_MODE_CLASS, GET_MODE_CLASS (m)); + bp_pack_value (&bp, GET_MODE_SIZE (m), 8); + bp_pack_value (&bp, GET_MODE_PRECISION (m), 16); + bp_pack_value (&bp, GET_MODE_INNER (m), 8); + bp_pack_value (&bp, GET_MODE_NUNITS (m),
Re: nvptx offloading patches [3/n], RFD
On February 16, 2015 10:08:12 PM CET, Jakub Jelinek wrote: >Hi! > >On Mon, Feb 09, 2015 at 11:20:00AM +0100, Richard Biener wrote: >> I think (also communicated that on IRC) we should instead try not >streaming >> machine-modes at all but generating them at stream-in time via >layout_type >> or layout_decl. > >Here is a WIP prototype for being able to stream a machine mode >description >table and streaming it back in. >In the end, I'd like to stream this out only for lto_stream_offload_p >and >stream it in only for ACCEL_COMPILER reading in when available, but >wanted >to see what it does even for native LTO. >For that it doesn't work very well, because it seems that wpa phase >doesn't stream in some sections and stream them out again, but instead >somehow copies them directly to the output object, so the mode table >isn't aware of the modes used in there that were bypassed this way. > >Anyway, the question is if for offloading we use wpa stage at all these >days >or not at all, if there is a way for ACCEL_COMPILER to differentiate >somehow between LTO sections written by the host compiler and LTO >sections >perhaps created by the offloading compiler when trying to LTO the thing >(if >it does it at all). Because obviously the host compiler written LTO >(in .gnu.offload_lto_*) would need the machine modes translated, while >LTO streamed already by the ACCEL_COMPILER (if any) generally would >already >use the offloading target machine modes and therefore should be treated >as >native lto (.gnu.lto_*). > >If we don't try to write .gnu.offload_lto_* again, I think following >patch >with additionally not calling lto_write_mode_table for >!lto_stream_offload_p >and not calling lto_input_mode_table for !ACCEL_COMPILER - instead >build >a single shared identity table - might actually work. > >Thoughts on this? Seeing the real format string you introduce I wonder if identifying modes by their names wouldn't work in 99% of all cases (apart from PSImode maybe). 
Also for most cases we can construct the machine mode from the type. Or where that is not possible stream the extra info that is necessary instead. Overall feels like a hack BTW :) can't we assign machine mode enum IDs in a target independent way? I mean, it doesn't have to be densely allocated? Richard. >Bernd/Thomas, do you plan to commit the other approved patches soon? > >--- gcc/passes.c.jj2015-02-16 20:14:09.477345693 +0100 >+++ gcc/passes.c 2015-02-16 20:26:23.659299189 +0100 >@@ -2460,6 +2460,7 @@ ipa_write_summaries_1 (lto_symtab_encode > struct lto_out_decl_state *state = lto_new_out_decl_state (); > state->symtab_node_encoder = encoder; > >+ lto_output_init_mode_table (); > lto_push_out_decl_state (state); > > gcc_assert (!flag_wpa); >@@ -2581,6 +2582,7 @@ ipa_write_optimization_summaries (lto_sy > lto_symtab_encoder_iterator lsei; > state->symtab_node_encoder = encoder; > >+ lto_output_init_mode_table (); > lto_push_out_decl_state (state); > for (lsei = lsei_start_function_in_partition (encoder); >!lsei_end_p (lsei); lsei_next_function_in_partition (&lsei)) >--- gcc/tree-streamer.h.jj 2015-02-16 20:14:09.446346202 +0100 >+++ gcc/tree-streamer.h2015-02-16 21:14:50.701615850 +0100 >@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. > > #include "streamer-hooks.h" > #include "lto-streamer.h" >+#include "data-streamer.h" > #include "hash-map.h" > > /* Cache of pickled nodes. Used to avoid writing the same node more >@@ -91,6 +92,7 @@ void streamer_write_integer_cst (struct > void streamer_write_builtin (struct output_block *, tree); > > /* In tree-streamer.c. 
*/ >+extern unsigned char streamer_mode_table[1 << 8]; > void streamer_check_handled_ts_structures (void); > bool streamer_tree_cache_insert (struct streamer_tree_cache_d *, tree, >hashval_t, unsigned *); >@@ -119,5 +121,19 @@ streamer_tree_cache_get_hash (struct str > return cache->hashes[ix]; > } > >+static inline void >+bp_pack_machine_mode (struct bitpack_d *bp, machine_mode mode) >+{ >+ streamer_mode_table[mode] = 1; >+ bp_pack_enum (bp, machine_mode, 1 << 8, mode); >+} >+ >+static inline machine_mode >+bp_unpack_machine_mode (struct bitpack_d *bp) >+{ >+ return (machine_mode) >+ ((struct lto_input_block *) >+ bp->stream)->mode_table[bp_unpack_enum (bp, machine_mode, 1 << >8)]; >+} > > #endif /* GCC_TREE_STREAMER_H */ >--- gcc/lto-streamer-out.c.jj 2015-02-16 20:14:09.046352765 +0100 >+++ gcc/lto-streamer-out.c 2015-02-16 20:26:23.665299091 +0100 >@@ -2642,6 +2642,96 @@ produce_symtab (struct output_block *ob) > } > > >+/* Init the streamer_mode_table for output, where we collect info on >what >+ machine_mode values have been streamed. */ >+void >+lto_output_init_mode_table (void) >+{ >+ memset (streamer_mode_table, '\0', MAX_MACHINE_MODE); >+} >+ >+ >+/* Write the mode table. */ >+static void >+lto_write_mode_table (void) >+{ >+ struct output_bloc
Re: [PATCH][PR tree-optimization/64823] Handle threading through blocks with PHIs, but no statements V2
On Mon, Feb 16, 2015 at 10:20:23PM +0100, Richard Biener wrote:
> On February 16, 2015 10:11:07 PM CET, Jakub Jelinek wrote:
> >On Mon, Feb 16, 2015 at 02:00:32PM -0700, Jeff Law wrote:
> >> --- a/gcc/tree-vrp.c
> >> +++ b/gcc/tree-vrp.c
> >> @@ -10176,13 +10176,20 @@ identify_jump_threads (void)
> >>        /* We only care about blocks ending in a COND_EXPR.  While there
> >>           may be some value in handling SWITCH_EXPR here, I doubt it's
> >>           terribly important.  */
> >> -      last = gsi_stmt (gsi_last_bb (bb));
> >> +      last = gsi_stmt (gsi_last_nondebug_bb (bb));
>
> And if the comment is correct then it should not even matter as the condition
> ends a basic block.

It matters, because the use is:

      if (!last
          || gimple_code (last) == GIMPLE_SWITCH
          || (gimple_code (last) == GIMPLE_COND
              && TREE_CODE (gimple_cond_lhs (last)) == SSA_NAME
              && (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (last)))
                  || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (last))))
              && (TREE_CODE (gimple_cond_rhs (last)) == SSA_NAME
                  || is_gimple_min_invariant (gimple_cond_rhs (last)))))

Thus, if a bb contains only debug statements and nothing else, the condition is false, while if, for the corresponding -g0 compile, the bb doesn't contain anything, it is true (!last).

  Jakub
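The -g versus -g0 difference described above can be reproduced in a toy model. The sketch below is not GCC code: `toy_kind`, `toy_last_any`, `toy_last_nondebug` and `toy_interesting` are invented names, a block is just an array of statement kinds, and the real condition is reduced to "no last statement, or the last statement is a conditional".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical statement kinds for a toy basic block.  */
enum toy_kind { TOY_ASSIGN, TOY_COND, TOY_DEBUG };

/* Last statement of any kind, or NULL for an empty block.  */
static const enum toy_kind *
toy_last_any (const enum toy_kind *stmts, size_t n)
{
  assert (stmts != NULL || n == 0);
  return n ? &stmts[n - 1] : NULL;
}

/* Last statement that is not a debug statement, or NULL if none.  */
static const enum toy_kind *
toy_last_nondebug (const enum toy_kind *stmts, size_t n)
{
  assert (stmts != NULL || n == 0);
  while (n > 0)
    {
      n--;
      if (stmts[n] != TOY_DEBUG)
        return &stmts[n];
    }
  return NULL;
}

/* Simplified stand-in for the quoted test: a block is interesting if it
   has no last statement or ends in a conditional.  */
static bool
toy_interesting (const enum toy_kind *last)
{
  return last == NULL || *last == TOY_COND;
}
```

With a block holding a single `TOY_DEBUG` entry, `toy_last_any` yields a non-null statement and the test fails, while an empty block (the -g0 counterpart) passes it. Skipping debug statements makes both builds take the same decision, which is the property that keeps code generation independent of -g.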
Re: [PATCH][PR tree-optimization/64823] Handle threading through blocks with PHIs, but no statements V2
On February 16, 2015 10:11:07 PM CET, Jakub Jelinek wrote:
>On Mon, Feb 16, 2015 at 02:00:32PM -0700, Jeff Law wrote:
>> --- a/gcc/tree-vrp.c
>> +++ b/gcc/tree-vrp.c
>> @@ -10176,13 +10176,20 @@ identify_jump_threads (void)
>>        /* We only care about blocks ending in a COND_EXPR.  While there
>>           may be some value in handling SWITCH_EXPR here, I doubt it's
>>           terribly important.  */
>> -      last = gsi_stmt (gsi_last_bb (bb));
>> +      last = gsi_stmt (gsi_last_nondebug_bb (bb));

And if the comment is correct then it should not even matter as the condition ends a basic block.

Richard.

>Isn't that just
>  last = last_stmt (bb);
>equivalent?
>
>  Jakub
Re: [PATCH][PR tree-optimization/64823] Handle threading through blocks with PHIs, but no statements V2
On Mon, Feb 16, 2015 at 02:00:32PM -0700, Jeff Law wrote:
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -10176,13 +10176,20 @@ identify_jump_threads (void)
>        /* We only care about blocks ending in a COND_EXPR.  While there
>           may be some value in handling SWITCH_EXPR here, I doubt it's
>           terribly important.  */
> -      last = gsi_stmt (gsi_last_bb (bb));
> +      last = gsi_stmt (gsi_last_nondebug_bb (bb));

Isn't that just
  last = last_stmt (bb);
equivalent?

  Jakub
Re: nvptx offloading patches [3/n], RFD
Hi!

On Mon, Feb 09, 2015 at 11:20:00AM +0100, Richard Biener wrote:
> I think (also communicated that on IRC) we should instead try not streaming
> machine-modes at all but generating them at stream-in time via layout_type
> or layout_decl.

Here is a WIP prototype for being able to stream a machine mode description table and streaming it back in.

In the end, I'd like to stream this out only for lto_stream_offload_p and stream it in only for ACCEL_COMPILER reading in when available, but wanted to see what it does even for native LTO. For that it doesn't work very well, because it seems that wpa phase doesn't stream in some sections and stream them out again, but instead somehow copies them directly to the output object, so the mode table isn't aware of the modes used in there that were bypassed this way.

Anyway, the question is if for offloading we use wpa stage at all these days or not at all, if there is a way for ACCEL_COMPILER to differentiate somehow between LTO sections written by the host compiler and LTO sections perhaps created by the offloading compiler when trying to LTO the thing (if it does it at all). Because obviously the host compiler written LTO (in .gnu.offload_lto_*) would need the machine modes translated, while LTO streamed already by the ACCEL_COMPILER (if any) generally would already use the offloading target machine modes and therefore should be treated as native lto (.gnu.lto_*).

If we don't try to write .gnu.offload_lto_* again, I think following patch with additionally not calling lto_write_mode_table for !lto_stream_offload_p and not calling lto_input_mode_table for !ACCEL_COMPILER - instead build a single shared identity table - might actually work.

Thoughts on this?

Bernd/Thomas, do you plan to commit the other approved patches soon?
--- gcc/passes.c.jj 2015-02-16 20:14:09.477345693 +0100 +++ gcc/passes.c2015-02-16 20:26:23.659299189 +0100 @@ -2460,6 +2460,7 @@ ipa_write_summaries_1 (lto_symtab_encode struct lto_out_decl_state *state = lto_new_out_decl_state (); state->symtab_node_encoder = encoder; + lto_output_init_mode_table (); lto_push_out_decl_state (state); gcc_assert (!flag_wpa); @@ -2581,6 +2582,7 @@ ipa_write_optimization_summaries (lto_sy lto_symtab_encoder_iterator lsei; state->symtab_node_encoder = encoder; + lto_output_init_mode_table (); lto_push_out_decl_state (state); for (lsei = lsei_start_function_in_partition (encoder); !lsei_end_p (lsei); lsei_next_function_in_partition (&lsei)) --- gcc/tree-streamer.h.jj 2015-02-16 20:14:09.446346202 +0100 +++ gcc/tree-streamer.h 2015-02-16 21:14:50.701615850 +0100 @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. #include "streamer-hooks.h" #include "lto-streamer.h" +#include "data-streamer.h" #include "hash-map.h" /* Cache of pickled nodes. Used to avoid writing the same node more @@ -91,6 +92,7 @@ void streamer_write_integer_cst (struct void streamer_write_builtin (struct output_block *, tree); /* In tree-streamer.c. 
*/ +extern unsigned char streamer_mode_table[1 << 8]; void streamer_check_handled_ts_structures (void); bool streamer_tree_cache_insert (struct streamer_tree_cache_d *, tree, hashval_t, unsigned *); @@ -119,5 +121,19 @@ streamer_tree_cache_get_hash (struct str return cache->hashes[ix]; } +static inline void +bp_pack_machine_mode (struct bitpack_d *bp, machine_mode mode) +{ + streamer_mode_table[mode] = 1; + bp_pack_enum (bp, machine_mode, 1 << 8, mode); +} + +static inline machine_mode +bp_unpack_machine_mode (struct bitpack_d *bp) +{ + return (machine_mode) + ((struct lto_input_block *) + bp->stream)->mode_table[bp_unpack_enum (bp, machine_mode, 1 << 8)]; +} #endif /* GCC_TREE_STREAMER_H */ --- gcc/lto-streamer-out.c.jj 2015-02-16 20:14:09.046352765 +0100 +++ gcc/lto-streamer-out.c 2015-02-16 20:26:23.665299091 +0100 @@ -2642,6 +2642,96 @@ produce_symtab (struct output_block *ob) } +/* Init the streamer_mode_table for output, where we collect info on what + machine_mode values have been streamed. */ +void +lto_output_init_mode_table (void) +{ + memset (streamer_mode_table, '\0', MAX_MACHINE_MODE); +} + + +/* Write the mode table. */ +static void +lto_write_mode_table (void) +{ + struct output_block *ob; + ob = create_output_block (LTO_section_mode_table); + bitpack_d bp = bitpack_create (ob->main_stream); + + /* Ensure that for GET_MODE_INNER (m) != VOIDmode we have + also the inner mode marked. */ + for (int i = 0; i < (int) MAX_MACHINE_MODE; i++) +if (streamer_mode_table[i]) + { + machine_mode m = (machine_mode) i; + if (GET_MODE_INNER (m) != VOIDmode) + streamer_mode_table[(int) GET_MODE_INNER (m)] = 1; + } + /* First stream modes that have GET_MODE_INNER (m) == VOIDmode, + so that we can refer to them afterwards. */ + for (int pass = 0; pass < 2; pass++) +for (int i = 0; i < (int
[PATCH][PR tree-optimization/64823] Handle threading through blocks with PHIs, but no statements V2
The prior version of this patch failed to bootstrap with some non-standard configure options on x86_64-unknown-linux-gnu. The problem was existing code which looked for the last statement in a block. It should have looked through non-debug insns which was a trivial change to use gsi_last_nondebug_bb in one more place. Bootstrapped and regression tested on x86_64-unknown-linux-gnu with HJ's options. Installed on the trunk. Will obviously keep my eye out for any new issues. Jeff commit 4c181d63db537424b28e5d022f6cbec53594ac8f Author: Jeff Law Date: Mon Feb 16 13:54:35 2015 -0700 PR tree-optimization/64823 * tree-vrp.c (identify_jump_threads): Handle blocks with no real statements. * tree-ssa-threadedge.c (potentially_threadable_block): Allow threading through blocks with PHIs, but no statements. (thread_through_normal_block): Distinguish between blocks where we did not process all the statements and blocks with no statements. PR tree-optimization/64823 * gcc.dg/uninit-20.c: New test. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 9ef0d8c..bbeee3f 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,13 @@ +2015-02-16 Jeff Law + + PR tree-optimization/64823 + * tree-vrp.c (identify_jump_threads): Handle blocks with no real + statements. + * tree-ssa-threadedge.c (potentially_threadable_block): Allow + threading through blocks with PHIs, but no statements. + (thread_through_normal_block): Distinguish between blocks where + we did not process all the statements and blocks with no statements. + 2015-02-16 Jakub Jelinek James Greenhalgh diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index d5769b7..06ed820 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2015-02-16 Jeff Law + + PR tree-optimization/64823 + * gcc.dg/uninit-20.c: New test. 
+ 2015-02-16 Jakub Jelinek James Greenhalgh diff --git a/gcc/testsuite/gcc.dg/uninit-20.c b/gcc/testsuite/gcc.dg/uninit-20.c new file mode 100644 index 000..12001ae --- /dev/null +++ b/gcc/testsuite/gcc.dg/uninit-20.c @@ -0,0 +1,18 @@ +/* Spurious uninitialized variable warnings, from gdb */ +/* { dg-do compile } */ +/* { dg-options "-O2 -Wuninitialized" } */ +struct os { struct o *o; }; +struct o { struct o *next; struct os *se; }; +void f(struct o *o){ + struct os *s; + if(o) s = o->se; + while(o && s == o->se){ +s++; // here `o' is non-zero and thus s is initialized +s == o->se // `?' is essential, `if' does not trigger the warning + ? (o = o->next, o ? s = o->se : 0) + : 0; + } +} + + + diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c index 4f83991..7187d06 100644 --- a/gcc/tree-ssa-threadedge.c +++ b/gcc/tree-ssa-threadedge.c @@ -110,6 +110,15 @@ potentially_threadable_block (basic_block bb) { gimple_stmt_iterator gsi; + /* Special case. We can get blocks that are forwarders, but are + not optimized away because they forward from outside a loop + to the loop header. We want to thread through them as we can + sometimes thread to the loop exit, which is obviously profitable. + the interesting case here is when the block has PHIs. */ + if (gsi_end_p (gsi_start_nondebug_bb (bb)) + && !gsi_end_p (gsi_start_phis (bb))) +return true; + /* If BB has a single successor or a single predecessor, then there is no threading opportunity. */ if (single_succ_p (bb) || single_pred_p (bb)) @@ -1281,16 +1290,32 @@ thread_through_normal_block (edge e, = record_temporary_equivalences_from_stmts_at_dest (e, stack, simplify, *backedge_seen_p); - /* If we didn't look at all the statements, the most likely reason is - there were too many and thus duplicating this block is not profitable. + /* There's two reasons STMT might be null, and distinguishing + between them is important. 
- Also note if we do not look at all the statements, then we may not - have invalidated equivalences that are no longer valid if we threaded - around a loop. Thus we must signal to our caller that this block - is not suitable for use as a joiner in a threading path. */ + First the block may not have had any statements. For example, it + might have some PHIs and unconditionally transfer control elsewhere. + Such blocks are suitable for jump threading, particularly as a + joiner block. + + The second reason would be if we did not process all the statements + in the block (because there were too many to make duplicating the + block profitable. If we did not look at all the statements, then + we may not have invalidated everything needing invalidation. Thus + we must signal to our caller that this block is not suitable for + use as a joiner in a thr
Re: [debug-early] C++ clones and limbo DIEs
On 02/12/2015 11:27 AM, Jason Merrill wrote: On 02/12/2015 01:04 PM, Aldy Hernandez wrote: On 02/10/2015 02:52 AM, Richard Biener wrote: On Fri, Feb 6, 2015 at 5:42 PM, Aldy Hernandez wrote: Of course I wonder why you need to separate handling of functions and variables The variables need to be handled earlier, else the call to analyze_functions() will remove some optimized global variables away, and we'll never see them. I believe that Jason said they were needed up-thread. variables. What breaks if you emit debug info for functions before the first analyze_functions () call? > > I also wonder why you restrict it to functions with a GIMPLE body. The functions, on the other hand, need to be handled after the second call to analyze_functions (and with a GIMPLE body) else we get far more function DIEs than mainline currently does, especially wrt C++ clones. Otherwise, we get DIEs for base constructors, complete constructors, and what-have-yous. Jason wanted fewer DIEs, more in tune with what mainline is currently doing. I think it makes sense to generate DIEs for everything defined in the TU if we don't have -feliminate-unused-debug-symbols. But since clones are artificial, emit them only if they're used. Ok, just so we're on the same page. I'm thinking that for -fNO-eliminate-unused-debug-symbols, we can iterate through FOR_EACH_DEFINED_FUNCTION before unreachable functions have been removed. There we can output all non-clones. Then for the -feliminate-unused-debug-symbols case, we can output reachable functions after the unreachable ones have been removed. Here we can also dump the clones we ignored for -fNO-eliminate-unused-debug-symbols above, since we only want to emit them if they're reachable (regardless of -feliminate-unused-debug-symbols). In either case, we always ignore those without a gimple body, otherwise we end up generating DIEs for the _ZN1AC2Ei constructor in the attached function unnecessarily.
See how the bits end up in the attached testcase:

(Oh, and we determine clonehood with DECL_ABSTRACT_ORIGIN)

Before any calls to analyze_functions()
---
Function: 'int main()' (Mangled: main)
    gimple_body=1 DECL_ABSTRACT_ORIGIN=0
Function: 'A::A(int)' (Mangled: _ZN1AC1Ei)
    gimple_body=0 DECL_ABSTRACT_ORIGIN=1
Function: 'A::A(int)' (Mangled: _ZN1AC2Ei)
    gimple_body=1 DECL_ABSTRACT_ORIGIN=1
Function: 'void foo(int)' (Mangled: _Z3fooi)
    gimple_body=1 DECL_ABSTRACT_ORIGIN=0
Function: 'int bar()' (Mangled: _Z3barv)
    gimple_body=1 DECL_ABSTRACT_ORIGIN=0
Function: 'void unreachable_func()' (Mangled: _ZL16unreachable_funcv)
    gimple_body=1 DECL_ABSTRACT_ORIGIN=0

After reachability analysis (after first call to analyze_functions())
---
Function: 'int main()' (Mangled: main)
    gimple_body=1 DECL_ABSTRACT_ORIGIN=0
Function: 'A::A(int)' (Mangled: _ZN1AC1Ei)
    gimple_body=0 DECL_ABSTRACT_ORIGIN=1
Function: 'A::A(int)' (Mangled: _ZN1AC2Ei)
    gimple_body=1 DECL_ABSTRACT_ORIGIN=1
Function: 'void foo(int)' (Mangled: _Z3fooi)
    gimple_body=1 DECL_ABSTRACT_ORIGIN=0
Function: 'int bar()' (Mangled: _Z3barv)
    gimple_body=1 DECL_ABSTRACT_ORIGIN=0

Is this what you had in mind? I can provide a patch to make things clearer.

Aldy

extern "C" void abort ();

struct A
{
  A (int);
  int a;
};

int i;

static void unreachable_func()
{
  i = 5;
}

__attribute__((noinline, noclone)) int
bar (void)
{
  return 40;
}

__attribute__((noinline, noclone)) void
foo (int x)
{
  __asm volatile ("" : : "r" (x) : "memory");
}

A::A (int x)
{
  static int p = bar ();
  foo (p);
  a = ++p;
}

int
main ()
{
  A a (42);
  if (a.a != 41)
    abort ();
}
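
The emission policy proposed in this message can be sketched as a pair of predicates. This is only an illustration in plain C, not GCC code: struct fn_info, emit_early, emit_late and count_emitted are hypothetical names, and the table mirrors the dump in this message under the assumption that unreachable_func is the only function dropped by the reachability analysis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a call-graph node; not GCC's real type.  */
struct fn_info
{
  const char *mangled;
  bool has_gimple_body;
  bool is_clone;   /* DECL_ABSTRACT_ORIGIN is set.  */
  bool reachable;  /* Survives the first analyze_functions () call.  */
};

/* Values taken from the dump; reachability is an assumption.  */
static const struct fn_info fns[] = {
  { "main",                   true,  false, true  },
  { "_ZN1AC1Ei",              false, true,  true  },
  { "_ZN1AC2Ei",              true,  true,  true  },
  { "_Z3fooi",                true,  false, true  },
  { "_Z3barv",                true,  false, true  },
  { "_ZL16unreachable_funcv", true,  false, false },
};

/* Early walk, before unreachable functions are removed.  It only emits
   anything when unused symbols are NOT being eliminated, and it skips
   clones and functions without a body.  */
static bool
emit_early (const struct fn_info *f, bool eliminate_unused)
{
  assert (f != NULL);
  return !eliminate_unused && f->has_gimple_body && !f->is_clone;
}

/* Late walk, after unreachable functions are removed.  It emits reachable
   functions with a body (clones included) that the early walk did not
   already handle.  */
static bool
emit_late (const struct fn_info *f, bool eliminate_unused)
{
  assert (f != NULL);
  if (!f->reachable || !f->has_gimple_body)
    return false;
  return !emit_early (f, eliminate_unused);
}

/* Number of functions that would get a DIE under this sketch.  */
static int
count_emitted (bool eliminate_unused)
{
  int n = 0;
  for (size_t i = 0; i < sizeof fns / sizeof fns[0]; i++)
    if (emit_early (&fns[i], eliminate_unused)
        || emit_late (&fns[i], eliminate_unused))
      n++;
  return n;
}
```

Under these assumptions, count_emitted (false) covers every function with a body, including unreachable_func, while count_emitted (true) drops unreachable_func. The body-less _ZN1AC1Ei entry is skipped in both modes, and the _ZN1AC2Ei clone is only ever emitted by the late walk.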
Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
On Mon, Feb 16, 2015 at 11:26 AM, Thomas Preud'homme wrote: > /* Subroutine of cprop_insn that tries to propagate constants into > @@ -1044,40 +1042,41 @@ cprop_insn (rtx_insn *insn) > - /* Constant propagation. */ > - if (cprop_constant_p (src)) > - { > - if (constprop_register (reg_used, src, insn)) > + /* Constant propagation. */ > + if (src_cst && cprop_constant_p (src_cst) > + && constprop_register (reg_used, src_cst, insn)) > { > changed_this_round = changed = 1; > global_const_prop_count++; The cprop_constant_p test is redundant, you only have non-NULL src_cst if it is a cprop_constant_p (as you test for it in find_avail_set()). > @@ -1087,18 +1086,16 @@ retry: >"GLOBAL CONST-PROP: Replacing reg %d in ", regno); > fprintf (dump_file, "insn %d with constant ", >INSN_UID (insn)); > - print_rtl (dump_file, src); > + print_rtl (dump_file, src_cst); > fprintf (dump_file, "\n"); > } > if (insn->deleted ()) > return 1; > } > - } > - else if (REG_P (src) > - && REGNO (src) >= FIRST_PSEUDO_REGISTER > - && REGNO (src) != regno) > - { > - if (try_replace_reg (reg_used, src, insn)) > + else if (src_reg && REG_P (src_reg) > + && REGNO (src_reg) >= FIRST_PSEUDO_REGISTER > + && REGNO (src_reg) != regno > + && try_replace_reg (reg_used, src_reg, insn)) Likewise for the REG_P and ">= FIRST_PSEUDO_REGISTER" tests here (with the equivalent and IMHO preferable HARD_REGISTER_P test in find_avail_set()). Looks good to me otherwise. Ciao! Steven
RE: [PATCH] Fix PR64980 and PR61960
again, with attachments, sorry. > > Hi, > > > this patch fixes PR64980 and PR61960 at the same time. > > The unreduced test case for PR64230 is also included, because a previous > version > of this patch caused this test to fail but the complete test suite passed > without any > indication of any problem. > > Boot-strapped and regression-tested on X86_64-unknown-linux-gnu. > > OK for trunk? > > > Thanks, > Bernd. > > > PS: Special thanks to Dominique for his help with testing of this patch. > 2015-02-16 Bernd Edlinger PR fortran/64980 PR fortran/61960 * trans-expr.c (gfc_apply_interface_mapping_to_expr): Remove mapping for component references to class objects. (gfc_conv_procedure_call): Compare the class by name. testsuite: 2015-02-16 Bernd Edlinger PR fortran/64980 PR fortran/61960 * gfortran.dg/pr61960.f90: New. * gfortran.dg/pr64230.f90: New. * gfortran.dg/pr64980.f03: New. patch-pr64980.diff Description: Binary data
[PATCH] Fix PR64980 and PR61960
Hi, this patch fixes PR64980 and PR61960 at the same time. The unreduced test case for PR64230 is also included, because a previous version of this patch caused this test to fail but the complete test suite passed without any indication of any problem. Boot-strapped and regression-tested on X86_64-unknown-linux-gnu. OK for trunk? Thanks, Bernd. PS: Special thanks to Dominique for his help with testing of this patch.
Re: [PATCH][PR tree-optimization/64823] Handle threading through blocks with PHIs, but no statements
On 02/13/15 23:12, Jack Howarth wrote: This also breaks the bootstrap on x86_64-apple-darwin14 due to a similar stage 2/3 comparison failure. Thanks. I'm pretty sure I've got the root cause of both of these failures. There's a gsi_last_bb in some existing code that really needs to be changed into gsi_last_nondebug_bb. It didn't matter before because a block ending in debug statements was never considered potentially threadable by VRP. However it matters with my change because we're allowing threading through such blocks. Strange that it didn't show up in my tests. But it's definitely a real issue. Jeff
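
To illustrate why the choice of iterator matters here, the following is a minimal sketch in plain C. It does not use GCC's statement-iterator API: struct stmt, last_stmt and last_nondebug_stmt are hypothetical stand-ins for a basic block's statement list and for what gsi_last_bb and gsi_last_nondebug_bb would land on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one statement in a basic block.  */
struct stmt
{
  const char *text;
  bool is_debug;  /* True for debug-only statements.  */
};

/* Index of the last statement of any kind, or -1 for an empty block.  */
static int
last_stmt (const struct stmt *stmts, int n)
{
  (void) stmts;
  return n > 0 ? n - 1 : -1;
}

/* Index of the last statement that is not a debug statement, or -1 if
   the block holds no such statement.  */
static int
last_nondebug_stmt (const struct stmt *stmts, int n)
{
  for (int i = n - 1; i >= 0; i--)
    if (!stmts[i].is_debug)
      return i;
  return -1;
}
```

For a block consisting of a conditional followed by a debug statement, last_stmt lands on the debug statement while last_nondebug_stmt lands on the conditional. A decision keyed on the former could therefore differ depending on whether debug statements are present, which is the kind of difference a stage 2/3 comparison is sensitive to.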
Re: OMP builtins in offloading (was: [PATCH 1/4] Add mkoffload for Intel MIC)
On Thu, Jan 08, 2015 at 16:49:40 +0100, Jakub Jelinek wrote: > BTW, today when looking at the TARGET_OPTION_NODE streaming caused > regressions, I've discovered that it is very hard to debug issues in the > offloading compiler. Would be nice if > -save-temps -v > printed enough information that it is actually possible to reproduce it, > e.g. while mkoffload command is printed, one can't cut and paste it easily, > because some env vars are required and those aren't printed in the -v dump. Currently I see all required env vars for mkoffload in the -v dump: COLLECT_GCC=... COMPILER_PATH=... .../mkoffload @... It doesn't need anything more. -- Ilya
Re: [committed] Change "Q" and "T" constraints to memory constraints
On 2015-02-16, at 11:38 AM, Richard Henderson wrote: >> >> Possibly the constant can somehow be forced into the data section where the >> relocations >> aren't a problem? > > Hmm. It looks like we might already do that. See default_select_rtx_section. Thanks, I see the problem. default_reloc_rw_mask returns 0 when not generating PIC code, so rtx went to readonly_data_section. I was thinking that pa_select_section was somehow broken. Dave -- John David Anglin dave.ang...@bell.net
Re: [PATCH] Copy over section name during cloning (PR ipa/64963)
> On Mon, Feb 16, 2015 at 07:23:33PM +0100, Jan Hubicka wrote: > > > --- gcc/cgraphclones.c.jj 2015-01-09 21:59:44.0 +0100 > > > +++ gcc/cgraphclones.c2015-02-16 14:02:16.564725881 +0100 > > > @@ -577,7 +577,7 @@ cgraph_node::create_virtual_clone (vec > >char *name; > > > > > >if (!in_lto_p) > > > -gcc_checking_assert (tree_versionable_function_p (old_decl)); > > > +gcc_checking_assert (tree_versionable_function_p (old_decl)); > > > > > >gcc_assert (local.can_change_signature || !args_to_skip); > > > > > > @@ -617,6 +617,8 @@ cgraph_node::create_virtual_clone (vec > > ABI support for this. */ > > >set_new_clone_decl_and_node_flags (new_node); > > >new_node->clone.tree_map = tree_map; > > > + if (!DECL_ONE_ONLY (old_decl)) > > > > Instead of DECL_ONE_ONLY you want to test implicit_section flag. I think > > resolving unique section with -ffunction-section is also needed. > > DECL_ONE_ONLY was the test that 4.9 has been using here: > > /* Update the properties. > Make clone visible only within this translation unit. Make sure > that is not weak also. > ??? We cannot use COMDAT linkage because there is no > ABI support for this. */ > if (DECL_ONE_ONLY (old_decl)) > DECL_SECTION_NAME (new_node->decl) = NULL; > > therefore I wanted to match the 4.9 behavior, before we try something > different incrementally. OK. implicit_section is true only for DECL_ONE_ONLY declarations and those that were named by resolve_unique_section via -ffunction-sections, so the change should be safe, but let's first go with imitating the 4.9 behaviour. Incrementally, I think we should fix this and also teach the inliner not to make code travel between user-named sections. This may become more of an issue with LTO kernel builds. Honza > > > > +new_node->set_section (this->get_section ()); > > > > No need for this->... > > Sure, can change that. > > Jakub
Re: [PATCH] Copy over section name during cloning (PR ipa/64963)
On Mon, Feb 16, 2015 at 07:23:33PM +0100, Jan Hubicka wrote: > > --- gcc/cgraphclones.c.jj 2015-01-09 21:59:44.0 +0100 > > +++ gcc/cgraphclones.c 2015-02-16 14:02:16.564725881 +0100 > > @@ -577,7 +577,7 @@ cgraph_node::create_virtual_clone (vec >char *name; > > > >if (!in_lto_p) > > -gcc_checking_assert (tree_versionable_function_p (old_decl)); > > +gcc_checking_assert (tree_versionable_function_p (old_decl)); > > > >gcc_assert (local.can_change_signature || !args_to_skip); > > > > @@ -617,6 +617,8 @@ cgraph_node::create_virtual_clone (vec > ABI support for this. */ > >set_new_clone_decl_and_node_flags (new_node); > >new_node->clone.tree_map = tree_map; > > + if (!DECL_ONE_ONLY (old_decl)) > > Instead of DECL_ONE_ONLY you want to test implicit_section flag. I think > resolving unique section with -ffunction-section is also needed. DECL_ONE_ONLY was the test that 4.9 has been using here: /* Update the properties. Make clone visible only within this translation unit. Make sure that is not weak also. ??? We cannot use COMDAT linkage because there is no ABI support for this. */ if (DECL_ONE_ONLY (old_decl)) DECL_SECTION_NAME (new_node->decl) = NULL; therefore I wanted to match the 4.9 behavior, before we try something different incrementally. > > +new_node->set_section (this->get_section ()); > > No need for this->... Sure, can change that. Jakub
[PATCH] Fix PR64748
This fixes the validation of the argument to the deviceptr clause. Bootstrapped and regtested on x86_64-unknown-linux-gnu. OK to commit to trunk? Jim diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index ceb9e1a..9f0d7af 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -10334,11 +10334,11 @@ c_parser_oacc_data_clause_deviceptr (c_parser *parser, tree list) c_parser_omp_var_list_parens() should construct a list of locations to go along with the var list. */ - if (TREE_CODE (v) != VAR_DECL) - error_at (loc, "%qD is not a variable", v); - else if (TREE_TYPE (v) == error_mark_node) + if (TREE_TYPE (v) == error_mark_node) ; - else if (!POINTER_TYPE_P (TREE_TYPE (v))) + else if ((TREE_CODE (v) != VAR_DECL || + TREE_CODE (v) != PARM_DECL) && + !POINTER_TYPE_P (TREE_TYPE (v))) error_at (loc, "%qD is not a pointer variable", v); tree u = build_omp_clause (loc, OMP_CLAUSE_MAP); diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 57dfbcc..37b4712 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -27988,11 +27988,11 @@ cp_parser_oacc_data_clause_deviceptr (cp_parser *parser, tree list) c_parser_omp_var_list_parens should construct a list of locations to go along with the var list. 
*/ - if (TREE_CODE (v) != VAR_DECL) - error_at (loc, "%qD is not a variable", v); - else if (TREE_TYPE (v) == error_mark_node) + if (TREE_TYPE (v) == error_mark_node) ; - else if (!POINTER_TYPE_P (TREE_TYPE (v))) + else if ((TREE_CODE (v) != VAR_DECL || + TREE_CODE (v) != PARM_DECL) && + !POINTER_TYPE_P (TREE_TYPE (v))) error_at (loc, "%qD is not a pointer variable", v); tree u = build_omp_clause (loc, OMP_CLAUSE_MAP); diff --git a/gcc/testsuite/c-c++-common/goacc/deviceptr-1.c b/gcc/testsuite/c-c++-common/goacc/deviceptr-1.c index 546fa82..5ec7540 100644 --- a/gcc/testsuite/c-c++-common/goacc/deviceptr-1.c +++ b/gcc/testsuite/c-c++-common/goacc/deviceptr-1.c @@ -8,27 +8,29 @@ fun1 (void) #pragma acc kernels deviceptr(u[0:4]) /* { dg-error "expected '\\\)' before '\\\[' token" } */ ; -#pragma acc data deviceptr(fun1) /* { dg-error "'fun1' is not a variable" } */ +#pragma acc data deviceptr(fun1) /* { dg-error "'fun1' is not a pointer variable" } */ + /* { dg-error "'fun1' is not a variable in 'map' clause" "fun1 is not a varialbe in map clause" { target *-*-* } 11 } */ ; #pragma acc parallel deviceptr(fun1[2:5]) - /* { dg-error "'fun1' is not a variable" "not a variable" { target *-*-* } 13 } */ - /* { dg-error "expected '\\\)' before '\\\[' token" "array" { target *-*-* } 13 } */ + /* { dg-error "'fun1' is not a pointer variable" "not a pointer variable" { target *-*-* } 14 } */ + /* { dg-error "expected '\\\)' before '\\\[' token" "array" { target *-*-* } 14 } */ + /* { dg-error "'fun1' is not a variable in 'map' clause" "fun1 is not a varialbe in map clause" { target *-*-* } 14 } */ ; int i; #pragma acc kernels deviceptr(i) /* { dg-error "'i' is not a pointer variable" } */ ; #pragma acc data deviceptr(i[0:4]) - /* { dg-error "'i' is not a pointer variable" "not a pointer variable" { target *-*-* } 21 } */ - /* { dg-error "expected '\\\)' before '\\\[' token" "array" { target *-*-* } 21 } */ + /* { dg-error "'i' is not a pointer variable" "not a pointer 
variable" { target *-*-* } 23 } */ + /* { dg-error "expected '\\\)' before '\\\[' token" "array" { target *-*-* } 23 } */ ; float fa[10]; #pragma acc parallel deviceptr(fa) /* { dg-error "'fa' is not a pointer variable" } */ ; #pragma acc kernels deviceptr(fa[1:5]) - /* { dg-error "'fa' is not a pointer variable" "not a pointer variable" { target *-*-* } 29 } */ - /* { dg-error "expected '\\\)' before '\\\[' token" "array" { target *-*-* } 29 } */ + /* { dg-error "'fa' is not a pointer variable" "not a pointer variable" { target *-*-* } 31 } */ + /* { dg-error "expected '\\\)' before '\\\[' token" "array" { target *-*-* } 31 } */ ; float *fp; @@ -44,10 +46,11 @@ fun2 (void) int i; float *fp; #pragma acc kernels deviceptr(fp,u,fun2,i,fp) - /* { dg-error "'u' undeclared" "u undeclared" { target *-*-* } 46 } */ - /* { dg-error "'fun2' is not a variable" "fun2 not a variable" { target *-*-* } 46 } */ - /* { dg-error "'i' is not a pointer variable" "i not a pointer variable" { target *-*-* } 46 } */ - /* { dg-error "'fp' appears more than once in map clauses" "fp more than once" { target *-*-* } 46 } */ + /* { dg-error "'u' undeclared" "u undeclared" { target *-*-* } 48 } */ + /* { dg-error "'fun2' is not a pointer variable" "fun2 not a pointer variable" { target *-*-* } 48 } */ + /* { dg-error "'i' is not a pointer variable" "i not a pointer variable" { target *-*-* } 48 } */ + /* { dg-error "'fp' appears more than once in map clauses" "fp more than once" { target *-*-* } 48 } */ + /* { dg-error "'fun2' is not a variable in 'map' clause" "fun2 is not a variable in map clause" { targ
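
A side note on the new condition used in both parsers: for any single value x, the expression x != A || x != B holds whenever A and B are different, so the first half of the posted test is always true and the whole condition reduces to the pointer-type check alone. The sketch below demonstrates this in plain C. The enum and function names are hypothetical stand-ins for tree codes and POINTER_TYPE_P, not GCC code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the tree codes involved.  */
enum code { CODE_VAR_DECL, CODE_PARM_DECL, CODE_FUNCTION_DECL };

/* Same shape as the posted condition:
   (code != VAR_DECL || code != PARM_DECL) && !pointer.  */
static bool
posted_condition (enum code c, bool is_pointer)
{
  return (c != CODE_VAR_DECL || c != CODE_PARM_DECL) && !is_pointer;
}

/* What the posted condition simplifies to.  */
static bool
pointer_check_only (bool is_pointer)
{
  return !is_pointer;
}

/* Returns true if the two predicates agree on every input.  */
static bool
conditions_agree (void)
{
  for (int c = CODE_VAR_DECL; c <= CODE_FUNCTION_DECL; c++)
    for (int p = 0; p <= 1; p++)
      if (posted_condition ((enum code) c, p != 0)
          != pointer_check_only (p != 0))
        return false;
  return true;
}
```

This is consistent with the updated expectations in deviceptr-1.c, where a function name such as fun1 is now reported as "not a pointer variable" rather than "not a variable".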
Re: [PATCH] Copy over section name during cloning (PR ipa/64963)
Hi, > Hi! > > As discussed in the PR, in 4.9 we used to clone DECL_SECTION_NAME > through using copy_node on the FUNCTION_DECL, and only in selected places > (e.g. when creating artificial_thunk.*, or when creating virtual clones > of DECL_ONE_ONLY functions) we used to explicitly clear DECL_SECTION_NAME. > In 5 the section name is stored in cgraph node instead, and thus not > copied by default, so we instead need to copy it over to restore previous > behavior, otherwise we break the Linux kernel and various other packages. > > Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for > trunk? > > 2015-02-16 Jakub Jelinek > James Greenhalgh > > PR ipa/64963 > * cgraphclones.c (cgraph_node::create_virtual_clone): Copy > section if not linkonce. Fix up formatting. > (cgraph_node::create_version_clone_with_body): Copy section. > * trans-mem.c (ipa_tm_create_version): Likewise. > > * gcc.dg/ipa/ipa-clone-1.c: New test. Sorry for taking so long on this. I made a similar patch yesterday but did not get around to testing it. I missed the trans-mem case, though :) > > --- gcc/cgraphclones.c.jj 2015-01-09 21:59:44.0 +0100 > +++ gcc/cgraphclones.c2015-02-16 14:02:16.564725881 +0100 > @@ -577,7 +577,7 @@ cgraph_node::create_virtual_clone (vecchar *name; > >if (!in_lto_p) > -gcc_checking_assert (tree_versionable_function_p (old_decl)); > +gcc_checking_assert (tree_versionable_function_p (old_decl)); > >gcc_assert (local.can_change_signature || !args_to_skip); > > @@ -617,6 +617,8 @@ cgraph_node::create_virtual_clone (vec ABI support for this. */ >set_new_clone_decl_and_node_flags (new_node); >new_node->clone.tree_map = tree_map; > + if (!DECL_ONE_ONLY (old_decl)) Instead of DECL_ONE_ONLY you want to test the implicit_section flag. I think resolving the unique section with -ffunction-sections is also needed. > +new_node->set_section (this->get_section ()); No need for this->... OK with these changes. > >/* Clones of global symbols or symbols with unique names are unique. 
*/ >if ((TREE_PUBLIC (old_decl) > @@ -1009,6 +1011,7 @@ cgraph_node::create_version_clone_with_b >new_version_node->externally_visible = 0; >new_version_node->local.local = 1; >new_version_node->lowered = true; > + new_version_node->set_section (this->get_section ()); >/* Clones of global symbols or symbols with unique names are unique. */ >if ((TREE_PUBLIC (old_decl) > && !DECL_EXTERNAL (old_decl) > --- gcc/trans-mem.c.jj2015-01-14 09:55:19.0 +0100 > +++ gcc/trans-mem.c 2015-02-16 12:58:01.399808815 +0100 > @@ -4967,6 +4967,7 @@ ipa_tm_create_version (struct cgraph_nod >new_node->externally_visible = old_node->externally_visible; >new_node->lowered = true; >new_node->tm_clone = 1; > + new_node->set_section (old_node->get_section ()); >get_cg_data (&old_node, true)->clone = new_node; > >if (old_node->get_availability () >= AVAIL_INTERPOSABLE) > --- gcc/testsuite/gcc.dg/ipa/ipa-clone-1.c.jj 2015-02-16 14:14:39.041625503 > +0100 > +++ gcc/testsuite/gcc.dg/ipa/ipa-clone-1.c2015-02-16 14:15:31.944760949 > +0100 > @@ -0,0 +1,20 @@ > +/* PR ipa/64693 */ > +/* { dg-do compile } */ > +/* { dg-require-named-sections "" } */ > +/* { dg-options "-O3 -fipa-cp -fipa-cp-clone -fdump-ipa-cp" } */ > + > +static int __attribute__ ((noinline, section ("test_section"))) > +foo (int arg) > +{ > + return 7 * arg; > +} > + > +int > +bar (int arg) > +{ > + return foo (5); > +} > + > +/* { dg-final { scan-assembler "test_section" } } */ > +/* { dg-final { scan-ipa-dump "Creating a specialized node of foo" "cp" } } > */ > +/* { dg-final { cleanup-ipa-dump "cp" } } */ > > Jakub
RE: [PATCH, FT32] initial support
On Mon, 16 Feb 2015, James Bowman wrote: > I have updated the target options. Space-saving is now enabled by > -Os. There is also a new option -msim to enable building for the > simulator (the simulator is pending submission to gdb-binutils). The documentation in this patch doesn't seem to have been updated for those changes. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH][3/n] Fix PR65015
On Mon, 16 Feb 2015, H.J. Lu wrote: > On Mon, Feb 16, 2015 at 6:25 AM, Richard Biener wrote: > > > > This fixes another leakage of random LTO temporary filenames into > > executables, this time via .symtab FILE entries. Removing it > > doesn't work (GNU ld adds it back) and is said to be incorrect. > > FWIW, ld.bfd will be fixed in 2.26. But it is still a good idea to > support older links. Yeah, especially as I plan to backport this series if the reporter is happy. Richard. > > So the following patch, similar to the dwarf CU DW_AT_name uses > > <artificial>. > > > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. > > > > Richard. > > > > 2015-02-16 Richard Biener > > > > PR lto/65015 > > * varasm.c (default_file_start): For LTO produced units > > emit <artificial> as file directive. > > > > Index: gcc/varasm.c > > === > > --- gcc/varasm.c(revision 220677) > > +++ gcc/varasm.c(working copy) > > @@ -7043,7 +7047,13 @@ default_file_start (void) > > fputs (ASM_APP_OFF, asm_out_file); > > > >if (targetm.asm_file_start_file_directive) > > -output_file_directive (asm_out_file, main_input_filename); > > +{ > > + /* LTO produced units have no meaningful main_input_filename. */ > > + if (in_lto_p) > > + output_file_directive (asm_out_file, "<artificial>"); > > + else > > + output_file_directive (asm_out_file, main_input_filename); > > +} > > } > > > > /* This is a generic routine suitable for use as TARGET_ASM_FILE_END > > > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Re: [PATCH, CHKP, PR target/65044] Restrict pointer bounds checker with Sanitizer
On 16 Feb 17:01, Jakub Jelinek wrote: > On Mon, Feb 16, 2015 at 06:56:45PM +0300, Ilya Enkovich wrote: > > On 16 Feb 16:31, Jakub Jelinek wrote: > > > On Mon, Feb 16, 2015 at 06:20:59PM +0300, Ilya Enkovich wrote: > > > > This patch restricts usage of Pointer Bounds Checker with Sanitizer. > > > > OK for trunk? > > > > > > There are many sanitizers, and for most of them I don't see why they would > > > be in any conflict with -mmpx, it is just -fsanitize=address and > > > -fsanitize=kernel-address. > > > So perhaps test instead if (flag_sanitize & SANITIZE_ADDRESS) != 0, and > > > better clear the flag_pointer_bounds after issuing the error, error () is > > > not a fatal function, so you need something sensible for error-recovery. > > > > > > Jakub > > > > I don't know all sanitizers in details. Code generated by some of them may > > be incorrect from checker point of view. Thus I just wanted to disable > > unexplored and untested combinations. > > Shouldn't be that hard to write a testcase and test it. > > Most of the sanitizers just add code like > if (some_condition) > __ubsan_handle_... (); > where from the POV of the program the __ubsan_* function reports or might > report some problem, and optionally abort the program. > That some_condition can be a check of the pointer value, shift count, > divisor check, etc. > > Jakub OK. With no tricky memory references this should be safe. Here is a patch to filter off Address Sanitizer only. Thanks for the review! Ilya -- gcc/ 2015-02-16 Ilya Enkovich PR target/65044 * toplev.c (process_options): Restrict Pointer Bounds Checker usage with Address Sanitizer. gcc/testsuite/ 2015-02-16 Ilya Enkovich PR target/65044 * gcc.target/i386/pr65044.c: New. 
diff --git a/gcc/testsuite/gcc.target/i386/pr65044.c b/gcc/testsuite/gcc.target/i386/pr65044.c new file mode 100644 index 000..4f318d6 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr65044.c @@ -0,0 +1,12 @@ +/* { dg-error "-fcheck-pointer-bounds is not supported with Address Sanitizer" } */ +/* { dg-do compile } */ +/* { dg-require-effective-target mpx } */ +/* { dg-options "-fcheck-pointer-bounds -mmpx -fsanitize=address" } */ + +extern int x[]; + +void +foo () +{ + x[0] = 0; +} diff --git a/gcc/toplev.c b/gcc/toplev.c index 99cf180..70eb6b6 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -1376,6 +1376,11 @@ process_options (void) { if (targetm.chkp_bound_mode () == VOIDmode) error ("-fcheck-pointer-bounds is not supported for this target"); + + if (flag_sanitize & SANITIZE_ADDRESS) + error ("-fcheck-pointer-bounds is not supported with Address Sanitizer"); + + flag_check_pointer_bounds = 0; } /* One region RA really helps to decrease the code size. */
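
The approach suggested in the review (test only the address-sanitizer bit, then clear the bounds-checker flag so that later code sees a consistent state after a non-fatal error) can be sketched outside GCC as follows. This is an illustration under stated assumptions: struct options, validate_options and the SAN_* bit values are hypothetical and do not correspond to GCC's real flag variables or constants.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical sanitizer bits; the values are made up for the sketch.  */
enum
{
  SAN_ADDRESS   = 1 << 0,
  SAN_UNDEFINED = 1 << 1
};

/* Hypothetical container for the two options involved.  */
struct options
{
  unsigned sanitize;          /* Bitmask of enabled sanitizers.  */
  bool check_pointer_bounds;  /* Bounds checker requested.  */
};

/* Reject only the combination with the address sanitizer.  The error is
   not fatal, so the conflicting option is switched off for recovery.
   Returns the number of errors reported.  */
static int
validate_options (struct options *opts)
{
  int errors = 0;

  if (opts->check_pointer_bounds && (opts->sanitize & SAN_ADDRESS) != 0)
    {
      fprintf (stderr, "pointer bounds checking is not supported "
                       "with the address sanitizer\n");
      opts->check_pointer_bounds = false;
      errors++;
    }

  return errors;
}
```

In this sketch a request that combines bounds checking with only the undefined-behaviour bit passes through unchanged, which matches the point made in the review that most sanitizers do not conflict with the checker.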
[PATCH] Copy over section name during cloning (PR ipa/64963)
Hi! As discussed in the PR, in 4.9 we used to clone DECL_SECTION_NAME through using copy_node on the FUNCTION_DECL, and only in selected places (e.g. when creating artificial_thunk.*, or when creating virtual clones of DECL_ONE_ONLY functions) we used to explicitly clear DECL_SECTION_NAME. In 5 the section name is stored in cgraph node instead, and thus not copied by default, so we instead need to copy it over to restore previous behavior, otherwise we break the Linux kernel and various other packages. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-02-16 Jakub Jelinek James Greenhalgh PR ipa/64963 * cgraphclones.c (cgraph_node::create_virtual_clone): Copy section if not linkonce. Fix up formatting. (cgraph_node::create_version_clone_with_body): Copy section. * trans-mem.c (ipa_tm_create_version): Likewise. * gcc.dg/ipa/ipa-clone-1.c: New test. --- gcc/cgraphclones.c.jj 2015-01-09 21:59:44.0 +0100 +++ gcc/cgraphclones.c 2015-02-16 14:02:16.564725881 +0100 @@ -577,7 +577,7 @@ cgraph_node::create_virtual_clone (vecclone.tree_map = tree_map; + if (!DECL_ONE_ONLY (old_decl)) +new_node->set_section (this->get_section ()); /* Clones of global symbols or symbols with unique names are unique. */ if ((TREE_PUBLIC (old_decl) @@ -1009,6 +1011,7 @@ cgraph_node::create_version_clone_with_b new_version_node->externally_visible = 0; new_version_node->local.local = 1; new_version_node->lowered = true; + new_version_node->set_section (this->get_section ()); /* Clones of global symbols or symbols with unique names are unique. 
*/ if ((TREE_PUBLIC (old_decl) && !DECL_EXTERNAL (old_decl) --- gcc/trans-mem.c.jj 2015-01-14 09:55:19.0 +0100 +++ gcc/trans-mem.c 2015-02-16 12:58:01.399808815 +0100 @@ -4967,6 +4967,7 @@ ipa_tm_create_version (struct cgraph_nod new_node->externally_visible = old_node->externally_visible; new_node->lowered = true; new_node->tm_clone = 1; + new_node->set_section (old_node->get_section ()); get_cg_data (&old_node, true)->clone = new_node; if (old_node->get_availability () >= AVAIL_INTERPOSABLE) --- gcc/testsuite/gcc.dg/ipa/ipa-clone-1.c.jj 2015-02-16 14:14:39.041625503 +0100 +++ gcc/testsuite/gcc.dg/ipa/ipa-clone-1.c 2015-02-16 14:15:31.944760949 +0100 @@ -0,0 +1,20 @@ +/* PR ipa/64693 */ +/* { dg-do compile } */ +/* { dg-require-named-sections "" } */ +/* { dg-options "-O3 -fipa-cp -fipa-cp-clone -fdump-ipa-cp" } */ + +static int __attribute__ ((noinline, section ("test_section"))) +foo (int arg) +{ + return 7 * arg; +} + +int +bar (int arg) +{ + return foo (5); +} + +/* { dg-final { scan-assembler "test_section" } } */ +/* { dg-final { scan-ipa-dump "Creating a specialized node of foo" "cp" } } */ +/* { dg-final { cleanup-ipa-dump "cp" } } */ Jakub
[C++ PATCH] Fix constexpr C++11 handling with lambdas (PR c++/65075)
Hi! If there are lambdas in C++11 constexpr return-stmts, we get implicit typedefs of the lambda types, but those are artificial and we should ignore them. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-02-16 Paolo Carlini Jakub Jelinek PR c++/65075 * constexpr.c (check_constexpr_bind_expr_vars): Allow implicit typedefs for lambda types. * g++.dg/cpp0x/pr65075.C: New test. --- gcc/cp/constexpr.c.jj 2015-02-14 00:19:49.0 +0100 +++ gcc/cp/constexpr.c 2015-02-16 15:12:06.260262857 +0100 @@ -416,7 +416,8 @@ check_constexpr_bind_expr_vars (tree t) for (tree var = BIND_EXPR_VARS (t); var; var = DECL_CHAIN (var)) if (TREE_CODE (var) == TYPE_DECL - && DECL_IMPLICIT_TYPEDEF_P (var)) + && DECL_IMPLICIT_TYPEDEF_P (var) + && !LAMBDA_TYPE_P (TREE_TYPE (var))) return false; return true; } --- gcc/testsuite/g++.dg/cpp0x/pr65075.C.jj 2015-02-16 15:09:48.405517062 +0100 +++ gcc/testsuite/g++.dg/cpp0x/pr65075.C2015-02-16 15:09:23.0 +0100 @@ -0,0 +1,17 @@ +// PR c++/65075 +// { dg-do compile { target c++11 } } + +typedef void (*E) (); +template +constexpr E +bar (bool a) +{ + return a ? []() {} : []() {}; +} + +void +foo () +{ + (bar (false)) (); + (bar (true)) (); +} Jakub
Re: [committed] Change "Q" and "T" constraints to memory constraints
On 02/14/2015 06:50 AM, John David Anglin wrote: > Possibly the constant can somehow be forced into the data section where the > relocations > aren't a problem? Hmm. It looks like we might already do that. See default_select_rtx_section. r~
Re: [PATCH] PR rtl-optimization/32219: optimizer causees wrong code in pic/hidden/weak symbol checking
On Mon, Feb 16, 2015 at 8:30 AM, Richard Henderson wrote: > On 02/16/2015 06:01 AM, H.J. Lu wrote: >> On Mon, Feb 16, 2015 at 5:25 AM, Uros Bizjak wrote: >>> Hello! >>> 2015-02-12 H.J. Lu Richard Henderson PR rtl/32219 * cgraphunit.c (cgraph_node::finalize_function): Set definition before notice_global_symbol. (varpool_node::finalize_decl): Likewise. * varasm.c (default_binds_local_p_2): Rename from default_binds_local_p_1, add weak_dominate argument. Use direct returns instead of assigning to local variable. Unify varpool and cgraph paths via symtab_node. Reject undef weak variables before testing visibility. Reorder tests for simplicity. (default_binds_local_p): Use default_binds_local_p_2. (default_binds_local_p_1): Likewise. (decl_binds_to_current_def_p): Unify varpool and cgraph paths via symtab_node. (default_elf_asm_output_external): Emit visibility when specified. >>> >>> It looks like this patch broke alphaev68-linux-gnu [1]. There are many >>> failures of the type: >>> >>> /tmp/cck7V7MR.o: In function >>> `__static_initialization_and_destruction_0(int, int)':^M >>> (.text+0x3ac): relocation truncated to fit: GPRELHIGH against symbol >>> `std::__cxx11::basic_string, >>> std::allocator >::~basic_string()@@GLIBCXX_3.4.21' defined in >>> .text section in >>> /space/uros/gcc-build/alphaev68-unknown-linux-gnu/./libstdc++-v3/src/.libs/libstdc++.so^M >>> /space/homedirs/uros/local/bin/ld: /tmp/cck7V7MR.o: gp-relative >>> relocation against dynamic symbol >> >> It could be related to: >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65064 >> >> Before this bug fix, all common symbols don't bind locally, >> which is one of PR 32219 bugs. After this fix, common >> symbols bind locally. It may cause problems on targets with >> small data sections and common symbols aren't in small >> data section: > > This is a destructor, and so obviously not a common symbol. Then it could be: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65074 -- H.J.
Re: [PATCH] PR rtl-optimization/32219: optimizer causees wrong code in pic/hidden/weak symbol checking
On 02/16/2015 06:01 AM, H.J. Lu wrote: > On Mon, Feb 16, 2015 at 5:25 AM, Uros Bizjak wrote: >> Hello! >> >>> 2015-02-12 H.J. Lu >>> Richard Henderson >>> >>> PR rtl/32219 >>> * cgraphunit.c (cgraph_node::finalize_function): Set definition >>> before notice_global_symbol. >>> (varpool_node::finalize_decl): Likewise. >>> * varasm.c (default_binds_local_p_2): Rename from >>> default_binds_local_p_1, add weak_dominate argument. Use direct >>> returns instead of assigning to local variable. Unify varpool and >>> cgraph paths via symtab_node. Reject undef weak variables before >>> testing visibility. Reorder tests for simplicity. >>> (default_binds_local_p): Use default_binds_local_p_2. >>> (default_binds_local_p_1): Likewise. >>> (decl_binds_to_current_def_p): Unify varpool and cgraph paths >>> via symtab_node. >>> (default_elf_asm_output_external): Emit visibility when specified. >> >> It looks like this patch broke alphaev68-linux-gnu [1]. There are many >> failures of the type: >> >> /tmp/cck7V7MR.o: In function >> `__static_initialization_and_destruction_0(int, int)':^M >> (.text+0x3ac): relocation truncated to fit: GPRELHIGH against symbol >> `std::__cxx11::basic_string, >> std::allocator >::~basic_string()@@GLIBCXX_3.4.21' defined in >> .text section in >> /space/uros/gcc-build/alphaev68-unknown-linux-gnu/./libstdc++-v3/src/.libs/libstdc++.so^M >> /space/homedirs/uros/local/bin/ld: /tmp/cck7V7MR.o: gp-relative >> relocation against dynamic symbol > > It could be related to: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65064 > > Before this bug fix, all common symbols don't bind locally, > which is one of PR 32219 bugs. After this fix, common > symbols bind locally. It may cause problems on targets with > small data sections and common symbols aren't in small > data section: This is a destructor, and so obviously not a common symbol. I'll have a look. r~
[Patch docs obvious] install.texi: Put aarch64 back in alphabetical order, add link
Hi, Looking at https://gcc.gnu.org/install/specific.html , aarch64*-*-* is in an odd place and isn't linked to from the top of the file. This patch fixes that by reordering the entries and adding a link from the menu at the top of the page. I've built the documentation with no new issues, and had a look in firefox to ensure the changes were sane. I've applied this patch under the obvious rule as revision r220738. Thanks, James --- 2015-02-16 James Greenhalgh * doc/install.texi (Specific): Reorder targets list to put aarch64 in alphabetical order. Add a link to aarch64*-*-* from the top menu.diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index c9e3bf1..47380a3 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -3238,6 +3238,8 @@ information have to. @ifhtml @itemize @item +@uref{#aarch64-x-x,,aarch64*-*-*} +@item @uref{#alpha-x-x,,alpha*-*-*} @item @uref{#alpha-dec-osf51,,alpha*-dec-osf5.1} @@ -3386,6 +3388,25 @@ information have to. @end html +@anchor{aarch64-x-x} +@heading aarch64*-*-* +Binutils pre 2.24 does not have support for selecting @option{-mabi} and +does not support ILP32. If it is used to build GCC 4.9 or later, GCC will +not support option @option{-mabi=ilp32}. + +To enable a workaround for the Cortex-A53 erratum number 835769 by default +(for all CPUs regardless of -mcpu option given) at configure time use the +@option{--enable-fix-cortex-a53-835769} option. This will enable the fix by +default and can be explicitly disabled during during compilation by passing the +@option{-mno-fix-cortex-a53-835769} option. Conversely, +@option{--disable-fix-cortex-a53-835769} will disable the workaround by +default. The workaround is disabled by default if neither of +@option{--enable-fix-cortex-a53-835769} or +@option{--disable-fix-cortex-a53-835769} is given at configure time. 
+ +@html + +@end html @anchor{alpha-x-x} @heading alpha*-*-* This section contains general configuration information for all @@ -3897,25 +3918,6 @@ removed and the system libunwind library will always be used. @html -@end html -@anchor{aarch64-x-x} -@heading aarch64*-*-* -Binutils pre 2.24 does not have support for selecting @option{-mabi} and -does not support ILP32. If it is used to build GCC 4.9 or later, GCC will -not support option @option{-mabi=ilp32}. - -To enable a workaround for the Cortex-A53 erratum number 835769 by default -(for all CPUs regardless of -mcpu option given) at configure time use the -@option{--enable-fix-cortex-a53-835769} option. This will enable the fix by -default and can be explicitly disabled during during compilation by passing the -@option{-mno-fix-cortex-a53-835769} option. Conversely, -@option{--disable-fix-cortex-a53-835769} will disable the workaround by -default. The workaround is disabled by default if neither of -@option{--enable-fix-cortex-a53-835769} or -@option{--disable-fix-cortex-a53-835769} is given at configure time. - -@html - @end html @anchor{x-ibm-aix}
Re: [patch] Fix invalid attributes in libstdc++
On 03/02/15 10:37 +, Iain Sandoe wrote: Hi Jonathan, On 1 Feb 2015, at 15:10, Jonathan Wakely wrote: On 01/02/15 15:08 +, Jonathan Wakely wrote: I failed to CC gcc-patches on this patch ... On 29/01/15 13:02 +, Jonathan Wakely wrote: Jakub pointed out that we have some attributes that don't use the reserved namespace, e.g. __attribute__ ((always_inline)). This is a 4.9/5 regression and the fix was pre-approved by Jakub so I've committed it to trunk. When we're back in stage1 I'll fix the TODO comments in the new tests (see PR64857) and will also rename testsuite/17_intro/headers/c++200x to .../c++2011. The new test fails on darwin (PR64883) and --enable-threads=single targets (PR64885). This is a workaround for 64883. Tested x86_64-linux, committed to trunk. the following additional tweaks provide further work-arounds. ... checked on darwin12 and darwin14. OK for trunk - thanks. I have a fixincludes patch for next stage #1. Iain diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc b/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc index 76a935e..6fc362a 100644 --- a/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc +++ b/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc @@ -26,11 +26,11 @@ // darwin headers use these, see PR 64883 # define deprecated 1 # define noreturn 1 +# define visibility 1 #endif #define packed 1 #define pure 1 #define unused 1 -#define visibility 1 #include // TODO: this is missing from #include diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc b/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc index c7ec27a..0726e3f 100644 --- a/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc +++ b/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc @@ -22,11 +22,14 @@ // Don't test 'const' and 'noreturn' because they are reserved anyway. 
#define abi_tag 1 #define always_inline 1 -#define deprecated 1 +#ifndef __APPLE__ +// darwin headers use these, see PR 64883 +# define visibility 1 +# define deprecated 1 +#endif #define packed 1 #define pure 1 #define unused 1 -#define visibility 1 #include // TODO: this is missing from #include// TODO: this is missing from diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc b/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc index 533a6f1..06bcb8e 100644 --- a/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc +++ b/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc @@ -22,11 +22,14 @@ // Don't test 'const' and 'noreturn' because they are reserved anyway. #define abi_tag 1 #define always_inline 1 -#define deprecated 1 +#ifndef __APPLE__ +// darwin headers use these, see PR 64883 +# define deprecated 1 +# define visibility 1 +#endif #define packed 1 #define pure 1 #define unused 1 -#define visibility 1 #include // TODO: this is missing from #include // TODO: this is missing from
Re: [PATCH, CHKP, PR target/65044] Restrict pointer bounds checker with Sanitizer
On Mon, Feb 16, 2015 at 06:56:45PM +0300, Ilya Enkovich wrote: > On 16 Feb 16:31, Jakub Jelinek wrote: > > On Mon, Feb 16, 2015 at 06:20:59PM +0300, Ilya Enkovich wrote: > > > This patch restricts usage of Pointer Bounds Checker with Sanitizer. OK > > > for trunk? > > > > There are many sanitizers, and for most of them I don't see why they would > > be in any conflict with -mmpx, it is just -fsanitize=address and > > -fsanitize=kernel-address. > > So perhaps test instead if (flag_sanitize & SANITIZE_ADDRESS) != 0, and > > better clear the flag_pointer_bounds after issuing the error, error () is > > not a fatal function, so you need something sensible for error-recovery. > > > > Jakub > > I don't know all sanitizers in details. Code generated by some of them may > be incorrect from checker point of view. Thus I just wanted to disable > unexplored and untested combinations. Shouldn't be that hard to write a testcase and test it. Most of the sanitizers just add code like if (some_condition) __ubsan_handle_... (); where from the POV of the program the __ubsan_* function reports or might report some problem, and optionally abort the program. That some_condition can be a check of the pointer value, shift count, divisor check, etc. Jakub
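For readers unfamiliar with the pattern Jakub describes, here is a minimal sketch in plain C of what such instrumented code might look like. It is not the code GCC actually emits: mock_ubsan_report is a hypothetical stand-in for a __ubsan_handle_* entry point, and returning 0 after a report is an arbitrary choice made only so that the sketch itself stays well defined.

```c
#include <limits.h>
#include <stdio.h>

static int report_count;

/* Hypothetical stand-in for a __ubsan_handle_* runtime function: it
   reports the problem and returns to the caller.  */
static void
mock_ubsan_report (const char *what)
{
  fprintf (stderr, "runtime error: %s\n", what);
  report_count++;
}

/* Roughly "x << n" with a shift count check in front of it.  */
static unsigned int
checked_shift_left (unsigned int x, unsigned int n)
{
  if (n >= sizeof (x) * CHAR_BIT)
    {
      mock_ubsan_report ("shift count out of range");
      return 0; /* arbitrary recovery value, an assumption of this sketch */
    }
  return x << n;
}

/* Roughly "a / b" with a divisor check in front of it.  */
static int
checked_divide (int a, int b)
{
  if (b == 0 || (a == INT_MIN && b == -1))
    {
      mock_ubsan_report ("division by zero or overflow");
      return 0; /* arbitrary recovery value, an assumption of this sketch */
    }
  return a / b;
}
```

In both helpers the added code only reads values the program already has and calls a reporting function, which is the shape of instrumentation the message above refers to.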
Re: [patch] Fix invalid attributes in libstdc++
On Tue, Feb 3, 2015 at 5:37 AM, Iain Sandoe wrote: > Hi Jonathan, > > On 1 Feb 2015, at 15:10, Jonathan Wakely wrote: > >> On 01/02/15 15:08 +, Jonathan Wakely wrote: >>> I failed to CC gcc-patches on this patch ... >>> >>> On 29/01/15 13:02 +, Jonathan Wakely wrote: Jakub pointed out that we have some attributes that don't use the reserved namespace, e.g. __attribute__ ((always_inline)). This is a 4.9/5 regression and the fix was pre-approved by Jakub so I've committed it to trunk. When we're back in stage1 I'll fix the TODO comments in the new tests (see PR64857) and will also rename testsuite/17_intro/headers/c++200x to .../c++2011. >> >> The new test fails on darwin (PR64883) and --enable-threads=single >> targets (PR64885). >> >> This is a workaround for 64883. Tested x86_64-linux, committed to >> trunk. >> > > the following additional tweaks provide further work-arounds. > ... checked on darwin12 and darwin14. > I have a fixincludes patch for next stage #1. > Iain Ping on https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00122.html > > > > >
Re: [PATCH, CHKP, PR target/65044] Restrict pointer bounds checker with Sanitizer
On 16 Feb 16:31, Jakub Jelinek wrote: > On Mon, Feb 16, 2015 at 06:20:59PM +0300, Ilya Enkovich wrote: > > This patch restricts usage of Pointer Bounds Checker with Sanitizer. OK > > for trunk? > > There are many sanitizers, and for most of them I don't see why they would > be in any conflict with -mmpx, it is just -fsanitize=address and > -fsanitize=kernel-address. > So perhaps test instead if (flag_sanitize & SANITIZE_ADDRESS) != 0, and > better clear the flag_pointer_bounds after issuing the error, error () is > not a fatal function, so you need something sensible for error-recovery. > > Jakub I don't know all the sanitizers in detail. Code generated by some of them may be incorrect from the checker's point of view, so I just wanted to disable unexplored and untested combinations. Ilya
Contents of PO file 'cpplib-5.1-b20150208.sv.po'
cpplib-5.1-b20150208.sv.po.gz Description: Binary data The Translation Project robot, in the name of your translation coordinator.
New Swedish PO file for 'cpplib' (version 5.1-b20150208)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'cpplib' has been submitted by the Swedish team of translators. The file is available at: http://translationproject.org/latest/cpplib/sv.po (This file, 'cpplib-5.1-b20150208.sv.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: http://translationproject.org/latest/cpplib/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: http://translationproject.org/domain/cpplib.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
Re: [PATCH, CHKP, PR target/65044] Restrict pointer bounds checker with Sanitizer
On Mon, Feb 16, 2015 at 06:20:59PM +0300, Ilya Enkovich wrote: > This patch restricts usage of Pointer Bounds Checker with Sanitizer. OK for > trunk? There are many sanitizers, and for most of them I don't see why they would be in any conflict with -mmpx, it is just -fsanitize=address and -fsanitize=kernel-address. So perhaps test instead if (flag_sanitize & SANITIZE_ADDRESS) != 0, and better clear the flag_pointer_bounds after issuing the error, error () is not a fatal function, so you need something sensible for error-recovery. Jakub
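The suggestion above could be sketched roughly as follows. This is a standalone mock-up, not the actual toplev.c change: the bit values, report_error and check_chkp_options are stand-ins invented for illustration, and the error wording is made up. Only the idea comes from the message: test the address sanitizer bits instead of all of flag_sanitize, and clear flag_pointer_bounds after the error because error () returns.

```c
#include <stdio.h>

/* Stand-in bit values; the real SANITIZE_* enumerators are defined by
   GCC and may differ.  */
#define SANITIZE_ADDRESS   (1u << 0)
#define SANITIZE_UNDEFINED (1u << 1)

static unsigned int flag_sanitize;  /* mock of the option variable */
static int flag_pointer_bounds;     /* mock of -fcheck-pointer-bounds */
static int error_count;

/* Mock of error (): reports the problem and returns, it is not fatal.  */
static void
report_error (const char *msg)
{
  fprintf (stderr, "error: %s\n", msg);
  error_count++;
}

/* Hypothetical helper: reject only the address sanitizers, then clear
   the checker flag so later code sees a consistent state.  */
static void
check_chkp_options (void)
{
  if (!flag_pointer_bounds)
    return;

  if ((flag_sanitize & SANITIZE_ADDRESS) != 0)
    {
      report_error ("-fcheck-pointer-bounds is not supported with "
                    "Address Sanitizer");
      flag_pointer_bounds = 0;
    }
}
```

With this shape, a combination such as the checker plus an undefined-behaviour sanitizer passes through untouched, while a second call after the error is a no-op because the flag has already been cleared.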
Re: [PATCH][3/n] Fix PR65015
On Mon, Feb 16, 2015 at 6:25 AM, Richard Biener wrote: > > This fixes another leakage of random LTO temporary filenames into > executables, this time via .symtab FILE entries. Removing it > doesn't work (GNU ld adds it back) and is said to be incorrect. FWIW, ld.bfd will be fixed in 2.26. But it is still a good idea to support older linkers. > So the following patch, similar to the dwarf CU DW_AT_name uses > . > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. > > Richard. > > 2015-02-16 Richard Biener > > PR lto/65015 > * varasm.c (default_file_start): For LTO produced units > emit as file directive. > > Index: gcc/varasm.c > === > --- gcc/varasm.c(revision 220677) > +++ gcc/varasm.c(working copy) > @@ -7043,7 +7047,13 @@ default_file_start (void) > fputs (ASM_APP_OFF, asm_out_file); > >if (targetm.asm_file_start_file_directive) > -output_file_directive (asm_out_file, main_input_filename); > +{ > + /* LTO produced units have no meaningful main_input_filename. */ > + if (in_lto_p) > + output_file_directive (asm_out_file, ""); > + else > + output_file_directive (asm_out_file, main_input_filename); > +} > } > > /* This is a generic routine suitable for use as TARGET_ASM_FILE_END -- H.J.
[PATCH, CHKP, PR target/65044] Restrict pointer bounds checker with Sanitizer
Hi, This patch restricts usage of Pointer Bounds Checker with Sanitizer. OK for trunk? Thanks, Ilya -- gcc/ 2015-02-16 Ilya Enkovich PR target/65044 * toplev.c (process_options): Restrict Pointer Bounds Checker usage with sanitizers. gcc/testsuite/ 2015-02-16 Ilya Enkovich PR target/65044 * gcc.target/i386/pr65044.c: New. diff --git a/gcc/testsuite/gcc.target/i386/pr65044.c b/gcc/testsuite/gcc.target/i386/pr65044.c new file mode 100644 index 000..79ecb04 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr65044.c @@ -0,0 +1,12 @@ +/* { dg-error "-fcheck-pointer-bounds is not supported with sanitizers" } */ +/* { dg-do compile } */ +/* { dg-require-effective-target mpx } */ +/* { dg-options "-fcheck-pointer-bounds -mmpx -fsanitize=address" } */ + +extern int x[]; + +void +foo () +{ + x[0] = 0; +} diff --git a/gcc/toplev.c b/gcc/toplev.c index 99cf180..bf987c8 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -1376,6 +1376,9 @@ process_options (void) { if (targetm.chkp_bound_mode () == VOIDmode) error ("-fcheck-pointer-bounds is not supported for this target"); + + if (flag_sanitize) + error ("-fcheck-pointer-bounds is not supported with sanitizers"); } /* One region RA really helps to decrease the code size. */
Re: [PATCH][OpenMP] Forbid usage of non-target functions in target regions
On Mon, Feb 02, 2015 at 13:10:19 +0100, Jakub Jelinek wrote: > [...] Generally, the solution if something goes > wrong during the offloading compilation should be just to give up on the > offloading to the particular offloading target (i.e. fill in the sections > libgomp reads in a way that will result in host fallback). Do you mean something like this? Bootstrapped/regtested on x86_64-linux and i686-linux. Is this patch OK for stage4 or stage1? gcc/ * collect-utils.c (do_wait, fork_execute): New argument. * collect-utils.h: Likewise. * collect2.c: Pass new argument to do_wait and fork_execute. * config/i386/intelmic-mkoffload.c: Likewise. (compile_for_target): Don't call fatal_error if compilation failed. (generate_target_descr_file, generate_target_offloadend_file) (prepare_target_image): Pass out_filename to compile_for_target. * config/nvptx/mkoffload.c: Pass new argument fork_execute. * lto-wrapper.c (num_offload_targets): New static global variable. (compile_offload_image): Return NULL if an image was not created. (compile_images_for_offload_targets): Call warning instead of fatal_error if an image was not created. (run_gcc): Do not return empty images to linker. Pass new argument to do_wait and fork_execute. libgomp/ * target.c (gomp_target_fallback): New static function. (GOMP_target): Move host fallback to the new gomp_target_fallback. Run gomp_target_fallback if tgt_fn is not present in the splay tree. 
diff --git a/gcc/collect-utils.c b/gcc/collect-utils.c index 6bbe9eb..2dce3c9 100644 --- a/gcc/collect-utils.c +++ b/gcc/collect-utils.c @@ -85,10 +85,10 @@ collect_wait (const char *prog, struct pex_obj *pex) } void -do_wait (const char *prog, struct pex_obj *pex) +do_wait (const char *prog, struct pex_obj *pex, bool non_fatal) { int ret = collect_wait (prog, pex); - if (ret != 0) + if (!non_fatal && ret != 0) fatal_error (input_location, "%s returned %d exit status", prog, ret); if (response_file && !save_temps) @@ -201,13 +201,13 @@ collect_execute (const char *prog, char **argv, const char *outname, } void -fork_execute (const char *prog, char **argv, bool use_atfile) +fork_execute (const char *prog, char **argv, bool use_atfile, bool non_fatal) { struct pex_obj *pex; pex = collect_execute (prog, argv, NULL, NULL, PEX_LAST | PEX_SEARCH, use_atfile); - do_wait (prog, pex); + do_wait (prog, pex, non_fatal); } /* Delete tempfiles. */ diff --git a/gcc/collect-utils.h b/gcc/collect-utils.h index 2b3ed44..7b3b3da 100644 --- a/gcc/collect-utils.h +++ b/gcc/collect-utils.h @@ -29,8 +29,8 @@ extern struct pex_obj *collect_execute (const char *, char **, const char *, const char *, int, bool); extern int collect_wait (const char *, struct pex_obj *); -extern void do_wait (const char *, struct pex_obj *); -extern void fork_execute (const char *, char **, bool); +extern void do_wait (const char *, struct pex_obj *, bool); +extern void fork_execute (const char *, char **, bool, bool); extern void utils_cleanup (bool); diff --git a/gcc/collect2.c b/gcc/collect2.c index b53e151..f8be7da 100644 --- a/gcc/collect2.c +++ b/gcc/collect2.c @@ -758,7 +758,7 @@ maybe_run_lto_and_relink (char **lto_ld_argv, char **object_lst, obstack_free (&temporary_obstack, temporary_firstobj); } - do_wait (prog, pex); + do_wait (prog, pex, false); pex = NULL; /* Compute memory needed for new LD arguments. 
At most number of original arguemtns @@ -803,7 +803,8 @@ maybe_run_lto_and_relink (char **lto_ld_argv, char **object_lst, /* Run the linker again, this time replacing the object files optimized by the LTO with the temporary file generated by the LTO. */ - fork_execute ("ld", out_lto_ld_argv, HAVE_GNU_LD && at_file_supplied); + fork_execute ("ld", out_lto_ld_argv, HAVE_GNU_LD && at_file_supplied, + false); post_ld_pass (true); free (lto_ld_argv); @@ -813,7 +814,7 @@ maybe_run_lto_and_relink (char **lto_ld_argv, char **object_lst, { /* Our caller is relying on us to do the link even though there is no LTO back end work to be done. */ - fork_execute ("ld", lto_ld_argv, HAVE_GNU_LD && at_file_supplied); + fork_execute ("ld", lto_ld_argv, HAVE_GNU_LD && at_file_supplied, false); post_ld_pass (false); } else @@ -1706,7 +1707,7 @@ main (int argc, char **argv) strip_argv[0] = strip_file_name; strip_argv[1] = output_file; strip_argv[2] = (char *) 0; - fork_execute ("strip", real_strip_argv, false); + fork_execute ("strip", real_strip_argv, false, false); } #ifdef COLLECT_EXPORT_LIST @@ -1792,7 +1793,7 @@ main (int argc, char **argv) /
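The host fallback idea from the ChangeLog above (run the host version when no target function is found) can be illustrated with a small standalone sketch. None of this is libgomp code: the table, mock_lookup_device_fn and mock_target are hypothetical, a linear search stands in for the splay tree lookup, and the two demo functions exist only so the behaviour can be observed.

```c
#include <stddef.h>

typedef void (*target_fn) (void *);

/* Hypothetical mapping from a host function to its offloaded version.  */
struct mock_offload_entry
{
  target_fn host_fn;
  target_fn device_fn;
};

/* Hypothetical lookup; a linear search stands in for the real lookup.  */
static target_fn
mock_lookup_device_fn (const struct mock_offload_entry *table, size_t n,
                       target_fn host_fn)
{
  for (size_t i = 0; i < n; i++)
    if (table[i].host_fn == host_fn)
      return table[i].device_fn;
  return NULL;
}

/* Run the offloaded version if one was registered, else fall back to
   the host version instead of failing.  */
static void
mock_target (const struct mock_offload_entry *table, size_t n,
             target_fn host_fn, void *data)
{
  target_fn fn = mock_lookup_device_fn (table, n, host_fn);
  if (fn == NULL)
    fn = host_fn; /* host fallback */
  fn (data);
}

/* Demo functions, distinguishable by their effect on an int.  */
static void
host_add_one (void *p)
{
  *(int *) p += 1;
}

static void
device_add_ten (void *p)
{
  *(int *) p += 10;
}
```

With a table containing host_add_one, mock_target runs device_add_ten; with an empty table, as if the offload image had not been created, the same call runs host_add_one.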
[PATCH] Fix PR65077
The following removes from PTA an optimization that assumed FP values do not carry pointers. Instead, to fix the underlying problem in PR37021, this patch adds handling of the rest of handled_components_p. Bootstrap and regtest in progress on x86_64-unknown-linux-gnu. Richard. 2015-02-16 Richard Biener PR tree-optimization/65077 * tree-ssa-structalias.c (get_constraint_for_1): Handle IMAGPART_EXPR, REALPART_EXPR and BIT_FIELD_REF. (find_func_aliases): Allow float values to carry pointers again. * gcc.dg/torture/pr65077.c: New testcase. Index: gcc/tree-ssa-structalias.c === --- gcc/tree-ssa-structalias.c (revision 220731) +++ gcc/tree-ssa-structalias.c (working copy) @@ -3492,6 +3492,9 @@ get_constraint_for_1 (tree t, vec case ARRAY_REF: case ARRAY_RANGE_REF: case COMPONENT_REF: + case IMAGPART_EXPR: + case REALPART_EXPR: + case BIT_FIELD_REF: get_constraint_for_component_ref (t, results, address_p, lhs_p); return; case VIEW_CONVERT_EXPR: @@ -4712,11 +4811,7 @@ find_func_aliases (struct function *fn, get_constraint_for (lhsop, &lhsc); - if (FLOAT_TYPE_P (TREE_TYPE (lhsop))) - /* If the operation produces a floating point result then - assume the value is not produced to transfer a pointer. 
*/ - ; - else if (code == POINTER_PLUS_EXPR) + if (code == POINTER_PLUS_EXPR) get_constraint_for_ptr_offset (gimple_assign_rhs1 (t), gimple_assign_rhs2 (t), &rhsc); else if (code == BIT_AND_EXPR Index: gcc/testsuite/gcc.dg/torture/pr65077.c === --- gcc/testsuite/gcc.dg/torture/pr65077.c (revision 0) +++ gcc/testsuite/gcc.dg/torture/pr65077.c (working copy) @@ -0,0 +1,70 @@ +/* { dg-do run } */ + +extern void abort (void); +extern void *memcpy(void *, const void *, __SIZE_TYPE__); + +typedef struct { +void *v1; +void *v2; +void *v3; +union { + void *f1; + void *f2; +} u; +} S; + + +S *getS(); +void verify_p(void *p); +double *getP(void *p); + +void memcpy_bug() +{ + S *s; + double *p = getP(0); + + if (p) { + int intSptr[sizeof(S*)/sizeof(int)]; + unsigned i = 0; + for (i = 0; i < sizeof(intSptr)/sizeof(*intSptr); ++i) { + intSptr[i] = (int) p[i]; + } + memcpy(&s, intSptr, sizeof(intSptr)); + (s)->u.f1 = p; + verify_p((s)->u.f1); + } else { + s = getS(); + } + verify_p(s->u.f1); +} + +double P[4]; + +double *getP(void *p) { +union u { + void *p; + int i[2]; +} u; +u.p = P; +P[0] = u.i[0]; +P[1] = u.i[1]; +return P; +} + +S *getS() +{ + return 0; +} + +void verify_p(void *p) +{ + if (p != P) +abort (); +} + +int main(int argc, char *argv[]) +{ +memcpy_bug(); +return 0; +} +
[PATCH][3/n] Fix PR65015
This fixes another leakage of random LTO temporary filenames into executables, this time via .symtab FILE entries. Removing it doesn't work (GNU ld adds it back) and is said to be incorrect. So the following patch, similar to the dwarf CU DW_AT_name uses . Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2015-02-16 Richard Biener PR lto/65015 * varasm.c (default_file_start): For LTO produced units emit as file directive. Index: gcc/varasm.c === --- gcc/varasm.c(revision 220677) +++ gcc/varasm.c(working copy) @@ -7043,7 +7047,13 @@ default_file_start (void) fputs (ASM_APP_OFF, asm_out_file); if (targetm.asm_file_start_file_directive) -output_file_directive (asm_out_file, main_input_filename); +{ + /* LTO produced units have no meaningful main_input_filename. */ + if (in_lto_p) + output_file_directive (asm_out_file, ""); + else + output_file_directive (asm_out_file, main_input_filename); +} } /* This is a generic routine suitable for use as TARGET_ASM_FILE_END
[PATCH] Fix PRs 65063 and 63593
Predictive commoning happens to re-use SSA names it released while there are still uses of them (oops), confusing the hell out of other code (expected). Fixed thus. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2015-02-16 Richard Biener PR tree-optimization/63593 PR tree-optimization/65063 * tree-predcom.c (execute_pred_commoning_chain): Delay removing stmts and releasing SSA names until... (execute_pred_commoning): ... after processing all chains. * gcc.dg/pr63593.c: New testcase. * gcc.dg/pr65063.c: Likewise. Index: gcc/tree-predcom.c === --- gcc/tree-predcom.c (revision 220731) +++ gcc/tree-predcom.c (working copy) @@ -1745,9 +1745,8 @@ execute_pred_commoning_chain (struct loo if (chain->combined) { /* For combined chains, just remove the statements that are used to -compute the values of the expression (except for the root one). */ - for (i = 1; chain->refs.iterate (i, &a); i++) - remove_stmt (a->stmt); +compute the values of the expression (except for the root one). +We delay this until after all chains are processed. */ } else { @@ -1811,6 +1810,21 @@ execute_pred_commoning (struct loop *loo execute_pred_commoning_chain (loop, chain, tmp_vars); } + FOR_EACH_VEC_ELT (chains, i, chain) +{ + if (chain->type == CT_INVARIANT) + ; + else if (chain->combined) + { + /* For combined chains, just remove the statements that are used to +compute the values of the expression (except for the root one). 
*/ + dref a; + unsigned j; + for (j = 1; chain->refs.iterate (j, &a); j++) + remove_stmt (a->stmt); + } +} + update_ssa (TODO_update_ssa_only_virtuals); } Index: gcc/testsuite/gcc.dg/pr63593.c === --- gcc/testsuite/gcc.dg/pr63593.c (revision 0) +++ gcc/testsuite/gcc.dg/pr63593.c (working copy) @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fno-tree-vectorize" } */ + +int in[2 * 4][4]; +int out[4]; + +void +foo (void) +{ + int sum; + int i, j, k; + for (k = 0; k < 4; k++) +{ + sum = 1; + for (j = 0; j < 4; j++) + for (i = 0; i < 4; i++) + sum *= in[i + k][j]; + out[k] = sum; +} +} Index: gcc/testsuite/gcc.dg/pr65063.c === --- gcc/testsuite/gcc.dg/pr65063.c (revision 0) +++ gcc/testsuite/gcc.dg/pr65063.c (working copy) @@ -0,0 +1,33 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -fno-tree-loop-ivcanon -fno-tree-vectorize" } */ + +static int in[8][4]; +static int out[4]; +static const int check_result[] = {0, 16, 256, 4096}; + +static inline void foo () +{ + int sum; + int i, j, k; + for (k = 0; k < 4; k++) +{ + sum = 1; + for (j = 0; j < 4; j++) + for (i = 0; i < 4; i++) + sum *= in[i + k][j]; + out[k] = sum; +} +} + +int main () +{ + int i, j, k; + for (i = 0; i < 8; i++) +for (j = 0; j < 4; j++) + in[i][j] = (i + 2) / 3; + foo (); + for (k = 0; k < 4; k++) +if (out[k] != check_result[k]) + __builtin_abort (); + return 0; +}
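The ordering change in the patch above (process every chain first, remove statements afterwards) can be shown with a toy model. This is only a sketch under simplifying assumptions: mock_chain, the name_released table and both drivers are invented for illustration, and "a chain reads a name that was already released" stands in for the real problem of released SSA names being re-used while still referenced.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS 4

/* Toy chain: refs[] holds indices into the name table, refs[0] is the
   root that is kept.  */
struct mock_chain
{
  int refs[MAX_REFS];
  size_t n_refs;
  bool combined;
};

static bool name_released[8];
static int stale_uses;

/* Phase 1: rewrite one chain.  It looks at every name it refers to and
   counts names that have already been released.  */
static void
process_chain (const struct mock_chain *chain)
{
  for (size_t i = 0; i < chain->n_refs; i++)
    if (name_released[chain->refs[i]])
      stale_uses++;
}

/* Phase 2: release everything except the root of a combined chain.  */
static void
release_chain (const struct mock_chain *chain)
{
  if (!chain->combined)
    return;
  for (size_t i = 1; i < chain->n_refs; i++)
    name_released[chain->refs[i]] = true;
}

/* Ordering after the patch: release only once all chains are done.  */
static void
process_all_deferred (const struct mock_chain *chains, size_t n)
{
  for (size_t i = 0; i < n; i++)
    process_chain (&chains[i]);
  for (size_t i = 0; i < n; i++)
    release_chain (&chains[i]);
}

/* Ordering before the patch: release right after each chain.  */
static void
process_all_eager (const struct mock_chain *chains, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      process_chain (&chains[i]);
      release_chain (&chains[i]);
    }
}
```

When two chains share a name, the eager driver lets the second chain see a name the first one already released, while the deferred driver does not.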
Re: [PATCH] PR rtl-optimization/32219: optimizer causes wrong code in pic/hidden/weak symbol checking
On Mon, Feb 16, 2015 at 5:25 AM, Uros Bizjak wrote: > Hello! > >> 2015-02-12 H.J. Lu >> Richard Henderson >> >> PR rtl/32219 >> * cgraphunit.c (cgraph_node::finalize_function): Set definition >> before notice_global_symbol. >> (varpool_node::finalize_decl): Likewise. >> * varasm.c (default_binds_local_p_2): Rename from >> default_binds_local_p_1, add weak_dominate argument. Use direct >> returns instead of assigning to local variable. Unify varpool and >> cgraph paths via symtab_node. Reject undef weak variables before >> testing visibility. Reorder tests for simplicity. >> (default_binds_local_p): Use default_binds_local_p_2. >> (default_binds_local_p_1): Likewise. >> (decl_binds_to_current_def_p): Unify varpool and cgraph paths >> via symtab_node. >> (default_elf_asm_output_external): Emit visibility when specified. > > It looks like this patch broke alphaev68-linux-gnu [1]. There are many > failures of the type: > > /tmp/cck7V7MR.o: In function > `__static_initialization_and_destruction_0(int, int)':^M > (.text+0x3ac): relocation truncated to fit: GPRELHIGH against symbol > `std::__cxx11::basic_string, > std::allocator >::~basic_string()@@GLIBCXX_3.4.21' defined in > .text section in > /space/uros/gcc-build/alphaev68-unknown-linux-gnu/./libstdc++-v3/src/.libs/libstdc++.so^M > /space/homedirs/uros/local/bin/ld: /tmp/cck7V7MR.o: gp-relative > relocation against dynamic symbol It could be related to: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65064 Before this bug fix, all common symbols don't bind locally, which is one of PR 32219 bugs. After this fix, common symbols bind locally. It may cause problems on targets with small data sections and common symbols aren't in small data section: 1. Since common symbols bind locally, backend may assume they are in small data section and lead to link-time failure. 2. Backend assume common symbols are never in small data section. 
But a definition in the small data section may override a common symbol, which still binds locally, and turn a reference to the common symbol into a reference to the small data section. This may also lead to link-time failure. Those targets can't assume common symbols are in the small data section, since that may change at link time. The most conservative solution is to make common symbols not bind locally for targets that define TARGET_IN_SMALL_DATA_P. -- H.J.
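The conservative rule proposed above might be sketched like this. It is a heavily simplified model, not varasm.c code: mock_symbol and mock_binds_local_p are hypothetical, the target_has_small_data parameter stands in for "the target defines TARGET_IN_SMALL_DATA_P", and all the other conditions the real predicate checks (visibility, weakness, PIC and so on) are left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified description of a symbol.  */
struct mock_symbol
{
  bool defined_here; /* has a definition in this translation unit */
  bool is_common;    /* is a common symbol */
};

/* Hypothetical predicate: common symbols are treated as not binding
   locally when the target has a small data section, because a
   definition elsewhere may still override them at link time.  */
static bool
mock_binds_local_p (const struct mock_symbol *sym,
                    bool target_has_small_data)
{
  if (!sym->defined_here)
    return false;
  if (sym->is_common && target_has_small_data)
    return false;
  return true;
}
```

On a target without a small data section the common symbol is still treated as binding locally in this model, which matches the behaviour described for the PR 65064 fix.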
Re: [4.8 branch] PATCH: PR middle-end/53623: [4.7/4.8 Regression] sign extension is effectively split into two x86-64 instructions
On Mon, Feb 16, 2015 at 05:44:45AM -0800, H.J. Lu wrote: > Should it be a concern for 4.8 backport? Should we also backport r215205: Probably. But, please wait for Jeff Law's approval of all of this before committing. > commit b71346c449d2b4a63985a39c4c092ecdfb37b5a0 > Author: jiwang > Date: Fri Sep 12 09:29:16 2014 + > > [Ree] Ensure inserted copy don't change the number of hard registers > > 2014-09-12 Wilco Dijkstra > > gcc/ > * ree.c (combine_reaching_defs): Ensure inserted copy don't change the > number of hard registers. > > > git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@215205 > 138bc75d-0d04-0410-961f-82ee72b054a4 > > to 4.8 and 4.9 branches? Jakub
Re: [4.8 branch] PATCH: PR middle-end/53623: [4.7/4.8 Regression] sign extension is effectively split into two x86-64 instructions
On Mon, Feb 16, 2015 at 5:24 AM, H.J. Lu wrote: > On Mon, Feb 16, 2015 at 5:18 AM, Jakub Jelinek wrote: >> On Mon, Feb 16, 2015 at 05:15:02AM -0800, H.J. Lu wrote: >>> On Mon, Feb 16, 2015 at 4:30 AM, H.J. Lu wrote: >>> > On Mon, Feb 16, 2015 at 1:35 AM, Jakub Jelinek wrote: >>> >> On Sun, Feb 15, 2015 at 12:53:39PM -0800, H.J. Lu wrote: >>> >>> This is a backport of the patch for PR middle-end/53623 plus all bug >>> >>> fixes caused by it. Tested on Linux/x86-32, Linux/x86-64 and x32. OK >>> >>> for 4.8 branch? >>> >> >>> >> What about PR64286 and PR63659, are you sure those aren't related? >>> >> I mean, they are on the 4.9 branch and I don't see why they couldn't >>> >> affect >>> >> the 4.8 backport. >>> >> >>> >> Jakub >>> > >>> > Fix for PR 63659 has been backported to 4.8 branch. I will check if >>> > fix for PR 64286 is needed. >>> > >>> > -- >>> > H.J. >>> >>> The fix for PR 64286 is an updated fix for PR 59754 which is caused by >>> the fix for PR 53623. But the testcase in the fix for PR 64286 doesn't >>> fail on 4.8 branch + my backport of the fix for PR 53623 on Haswell. >>> I suggest >>> >>> 1. We go without my current backport and backport the fix for PR 64286 >>> in a separate patch. Or >>> 2. We go without my backport minus the backport of the PR 59754 >>> fix and backport the fixes for PR 59754 plus PR 64286 in a separate patch >> >> I think keeping the branch broken is bad, even if we don't have a testcase >> that really fails, pressumably the issue is just latent. >> So I'd strongly prefer >> 3. Add the PR64286 fix to the patch being tested and commit only when it as >> whole is tested, as one commit. >> 4.9 branch backport of the PR64286 fix caused a regression on ARM64: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64286#c11 Should it be a concern for 4.8 backport? 
Should we also backport r215205: commit b71346c449d2b4a63985a39c4c092ecdfb37b5a0 Author: jiwang Date: Fri Sep 12 09:29:16 2014 + [Ree] Ensure inserted copy don't change the number of hard registers 2014-09-12 Wilco Dijkstra gcc/ * ree.c (combine_reaching_defs): Ensure inserted copy don't change the number of hard registers. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@215205 138bc75d-0d04-0410-961f-82ee72b054a4 to 4.8 and 4.9 branches? > I will do that and restart the testing. > > BTW, PR 53623 was a missed optimization bug originally. Now > it turns out that it fixed a wrong code bug. We are trying to extract > a run-time testcase from PR 64941 for trunk and branches. > > Thanks. > > -- > H.J. -- H.J.
Re: [PATCH] PR rtl-optimization/32219: optimizer causes wrong code in pic/hidden/weak symbol checking
Hello!

> 2015-02-12 H.J. Lu
> Richard Henderson
>
> PR rtl/32219
> * cgraphunit.c (cgraph_node::finalize_function): Set definition
> before notice_global_symbol.
> (varpool_node::finalize_decl): Likewise.
> * varasm.c (default_binds_local_p_2): Rename from
> default_binds_local_p_1, add weak_dominate argument. Use direct
> returns instead of assigning to local variable. Unify varpool and
> cgraph paths via symtab_node. Reject undef weak variables before
> testing visibility. Reorder tests for simplicity.
> (default_binds_local_p): Use default_binds_local_p_2.
> (default_binds_local_p_1): Likewise.
> (decl_binds_to_current_def_p): Unify varpool and cgraph paths
> via symtab_node.
> (default_elf_asm_output_external): Emit visibility when specified.

It looks like this patch broke alphaev68-linux-gnu [1]. There are many failures of the type:

/tmp/cck7V7MR.o: In function `__static_initialization_and_destruction_0(int, int)':
(.text+0x3ac): relocation truncated to fit: GPRELHIGH against symbol `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()@@GLIBCXX_3.4.21' defined in .text section in /space/uros/gcc-build/alphaev68-unknown-linux-gnu/./libstdc++-v3/src/.libs/libstdc++.so
/space/homedirs/uros/local/bin/ld: /tmp/cck7V7MR.o: gp-relative relocation against dynamic symbol _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev@@GLIBCXX_3.4.21
/space/homedirs/uros/local/bin/ld: final link failed: Nonrepresentable section on output

An example is g++.dg/torture/pr60750.C:

/space/uros/gcc-build/gcc/testsuite/g++/../../xg++ -B/space/uros/gcc-build/gcc/testsuite/g++/../../ /space/homedirs/uros/gcc-svn/trunk/gcc/testsuite/g++.dg/torture/pr60750.C -fno-diagnostics-show-caret -fdiagnostics-color=never -nostdinc++ -I/space/homedirs/uros/gcc-build/alphaev68-unknown-linux-gnu/libstdc++-v3/include/alphaev68-unknown-linux-gnu -I/space/homedirs/uros/gcc-build/alphaev68-unknown-linux-gnu/libstdc++-v3/include -I/space/homedirs/uros/gcc-svn/trunk/libstdc++-v3/libsupc++ -I/space/homedirs/uros/gcc-svn/trunk/libstdc++-v3/include/backward -I/space/homedirs/uros/gcc-svn/trunk/libstdc++-v3/testsuite/util -fmessage-length=0 -O0 -std=c++11 -L/space/uros/gcc-build/alphaev68-unknown-linux-gnu/./libstdc++-v3/src/.libs -B/space/uros/gcc-build/alphaev68-unknown-linux-gnu/./libstdc++-v3/src/.libs -L/space/uros/gcc-build/alphaev68-unknown-linux-gnu/./libstdc++-v3/src/.libs -lm -o ./pr60750.exe
/space/homedirs/uros/local/bin/ld: /tmp/cck7V7MR.o: gp-relative relocation against dynamic symbol _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev@@GLIBCXX_3.4.21
/tmp/cck7V7MR.o: In function `__static_initialization_and_destruction_0(int, int)':
(.text+0x3ac): relocation truncated to fit: GPRELHIGH against symbol `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()@@GLIBCXX_3.4.21' defined in .text section in /space/uros/gcc-build/alphaev68-unknown-linux-gnu/./libstdc++-v3/src/.libs/libstdc++.so
/space/homedirs/uros/local/bin/ld: /tmp/cck7V7MR.o: gp-relative relocation against dynamic symbol _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev@@GLIBCXX_3.4.21
/space/homedirs/uros/local/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
compiler exited with status 1

[1] https://gcc.gnu.org/ml/gcc-testresults/2015-02/msg01867.html

Uros.
Re: [4.8 branch] PATCH: PR middle-end/53623: [4.7/4.8 Regression] sign extension is effectively split into two x86-64 instructions
On Mon, Feb 16, 2015 at 5:18 AM, Jakub Jelinek wrote:
> On Mon, Feb 16, 2015 at 05:15:02AM -0800, H.J. Lu wrote:
>> On Mon, Feb 16, 2015 at 4:30 AM, H.J. Lu wrote:
>> > On Mon, Feb 16, 2015 at 1:35 AM, Jakub Jelinek wrote:
>> >> On Sun, Feb 15, 2015 at 12:53:39PM -0800, H.J. Lu wrote:
>> >>> This is a backport of the patch for PR middle-end/53623 plus all bug
>> >>> fixes caused by it. Tested on Linux/x86-32, Linux/x86-64 and x32. OK
>> >>> for 4.8 branch?
>> >>
>> >> What about PR64286 and PR63659, are you sure those aren't related?
>> >> I mean, they are on the 4.9 branch and I don't see why they couldn't affect
>> >> the 4.8 backport.
>> >>
>> >> Jakub
>> >
>> > Fix for PR 63659 has been backported to 4.8 branch. I will check if
>> > fix for PR 64286 is needed.
>> >
>> > --
>> > H.J.
>>
>> The fix for PR 64286 is an updated fix for PR 59754 which is caused by
>> the fix for PR 53623. But the testcase in the fix for PR 64286 doesn't
>> fail on 4.8 branch + my backport of the fix for PR 53623 on Haswell.
>> I suggest
>>
>> 1. We go without my current backport and backport the fix for PR 64286
>> in a separate patch. Or
>> 2. We go without my backport minus the backport of the PR 59754
>> fix and backport the fixes for PR 59754 plus PR 64286 in a separate patch
>
> I think keeping the branch broken is bad, even if we don't have a testcase
> that really fails; presumably the issue is just latent.
> So I'd strongly prefer
> 3. Add the PR64286 fix to the patch being tested and commit only when it as
> a whole is tested, as one commit.
>

I will do that and restart the testing. BTW, PR 53623 was a missed optimization bug originally. Now it turns out that it fixed a wrong code bug. We are trying to extract a run-time testcase from PR 64941 for trunk and branches.

Thanks.

--
H.J.
Re: [4.8 branch] PATCH: PR middle-end/53623: [4.7/4.8 Regression] sign extension is effectively split into two x86-64 instructions
On Mon, Feb 16, 2015 at 05:15:02AM -0800, H.J. Lu wrote:
> On Mon, Feb 16, 2015 at 4:30 AM, H.J. Lu wrote:
> > On Mon, Feb 16, 2015 at 1:35 AM, Jakub Jelinek wrote:
> >> On Sun, Feb 15, 2015 at 12:53:39PM -0800, H.J. Lu wrote:
> >>> This is a backport of the patch for PR middle-end/53623 plus all bug
> >>> fixes caused by it. Tested on Linux/x86-32, Linux/x86-64 and x32. OK
> >>> for 4.8 branch?
> >>
> >> What about PR64286 and PR63659, are you sure those aren't related?
> >> I mean, they are on the 4.9 branch and I don't see why they couldn't affect
> >> the 4.8 backport.
> >>
> >> Jakub
> >
> > Fix for PR 63659 has been backported to 4.8 branch. I will check if
> > fix for PR 64286 is needed.
> >
> > --
> > H.J.
>
> The fix for PR 64286 is an updated fix for PR 59754 which is caused by
> the fix for PR 53623. But the testcase in the fix for PR 64286 doesn't
> fail on 4.8 branch + my backport of the fix for PR 53623 on Haswell.
> I suggest
>
> 1. We go without my current backport and backport the fix for PR 64286
> in a separate patch. Or
> 2. We go without my backport minus the backport of the PR 59754
> fix and backport the fixes for PR 59754 plus PR 64286 in a separate patch

I think keeping the branch broken is bad, even if we don't have a testcase
that really fails; presumably the issue is just latent.
So I'd strongly prefer
3. Add the PR64286 fix to the patch being tested and commit only when it as
a whole is tested, as one commit.

Jakub
Re: [4.8 branch] PATCH: PR middle-end/53623: [4.7/4.8 Regression] sign extension is effectively split into two x86-64 instructions
On Mon, Feb 16, 2015 at 4:30 AM, H.J. Lu wrote:
> On Mon, Feb 16, 2015 at 1:35 AM, Jakub Jelinek wrote:
>> On Sun, Feb 15, 2015 at 12:53:39PM -0800, H.J. Lu wrote:
>>> This is a backport of the patch for PR middle-end/53623 plus all bug
>>> fixes caused by it. Tested on Linux/x86-32, Linux/x86-64 and x32. OK
>>> for 4.8 branch?
>>
>> What about PR64286 and PR63659, are you sure those aren't related?
>> I mean, they are on the 4.9 branch and I don't see why they couldn't affect
>> the 4.8 backport.
>>
>> Jakub
>
> Fix for PR 63659 has been backported to 4.8 branch. I will check if
> fix for PR 64286 is needed.
>
> --
> H.J.

The fix for PR 64286 is an updated fix for PR 59754 which is caused by
the fix for PR 53623. But the testcase in the fix for PR 64286 doesn't
fail on 4.8 branch + my backport of the fix for PR 53623 on Haswell.
I suggest

1. We go without my current backport and backport the fix for PR 64286
in a separate patch. Or
2. We go without my backport minus the backport of the PR 59754
fix and backport the fixes for PR 59754 plus PR 64286 in a separate patch

--
H.J.
Re: [4.8 branch] PATCH: PR middle-end/53623: [4.7/4.8 Regression] sign extension is effectively split into two x86-64 instructions
On Mon, Feb 16, 2015 at 1:35 AM, Jakub Jelinek wrote:
> On Sun, Feb 15, 2015 at 12:53:39PM -0800, H.J. Lu wrote:
>> This is a backport of the patch for PR middle-end/53623 plus all bug
>> fixes caused by it. Tested on Linux/x86-32, Linux/x86-64 and x32. OK
>> for 4.8 branch?
>
> What about PR64286 and PR63659, are you sure those aren't related?
> I mean, they are on the 4.9 branch and I don't see why they couldn't affect
> the 4.8 backport.
>
> Jakub

Fix for PR 63659 has been backported to 4.8 branch. I will check if
fix for PR 64286 is needed.

--
H.J.
[PATCH] Fix PR ipa/65059
Hello.

This patch is the fix that Honza attached to the PR. It was tested on x86_64-linux-pc and introduces no new regressions. The patch is pre-approved by Honza, and I'm going to install it.

Martin

From cfe7bd6b57cc6e0768fd72d27a7b222ab1136b32 Mon Sep 17 00:00:00 2001
From: mliska
Date: Mon, 16 Feb 2015 11:37:29 +0100
Subject: [PATCH] Fix PR ipa/65059.

gcc/ChangeLog:

2015-02-16 Jan Hubicka

PR ipa/65059
* ipa-comdats.c (ipa_comdats): Do not categorize thunks to external functions.
---
 gcc/ipa-comdats.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/gcc/ipa-comdats.c b/gcc/ipa-comdats.c
index ad5945f..9f43f29 100644
--- a/gcc/ipa-comdats.c
+++ b/gcc/ipa-comdats.c
@@ -328,9 +328,14 @@ ipa_comdats (void)
 
   FOR_EACH_DEFINED_SYMBOL (symbol)
     {
+      struct cgraph_node *fun;
       symbol->aux = NULL;
       if (!symbol->get_comdat_group ()
           && !symbol->alias
+          /* Thunks to external functions do not need to be categorized.  */
+          && (!(fun = dyn_cast <cgraph_node *> (symbol))
+              || !fun->thunk.thunk_p
+              || fun->function_symbol ()->definition)
           && symbol->real_symbol_p ())
         {
           tree *val = map.get (symbol);
--
2.1.2
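The condition added by the patch above can be read as a small predicate. The following is only a sketch under stated assumptions: `struct sym` and `needs_comdat_categorization` are hypothetical names that flatten the relevant properties into booleans, and they are not GCC's `symtab_node` or `cgraph_node` API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of a symbol.  GCC's symtab_node and
   cgraph_node are far richer than this; the field names are invented
   for illustration only.  */
struct sym
{
  bool has_comdat_group;
  bool is_alias;
  bool is_function;
  bool is_thunk;
  bool target_is_defined;   /* Only meaningful when is_thunk is true.  */
  bool is_real_symbol;
};

/* Mirrors the shape of the patched condition: a symbol is considered
   for categorization unless it already has a comdat group, is an
   alias, is not a real symbol, or is a thunk whose target function
   has no definition in this unit (i.e. is external).  */
static bool
needs_comdat_categorization (const struct sym *s)
{
  return !s->has_comdat_group
         && !s->is_alias
         && (!s->is_function || !s->is_thunk || s->target_is_defined)
         && s->is_real_symbol;
}
```

With this model, a thunk to an external function is skipped, while a thunk to a locally defined function, or a plain variable, is still considered.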
Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
On Mon, Feb 16, 2015 at 11:26 AM, Thomas Preud'homme wrote:
> Hi,
>
> The RTL cprop pass in GCC operates by doing a local constant/copy propagation
> first and then a global one. In the local one, if a constant cannot be
> propagated (e.g. due to constraints of the destination instruction) a copy
> propagation is done instead. However, at the global level copy propagation is
> only tried if no constant can be propagated, i.e. if a constant can be
> propagated but the constraints of the destination instruction forbid it, no
> copy propagation will be tried. This patch fixes this issue. This solves the
> redundant ldr problem in GCC32RM-439.
>

This would address https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34503#c4

I'll have a look at the patch tonight.

Ciao!
Steven
Re: [Patch, fortran] PR60898 premature release of entry symbols
On 15/02/2015 19:00, Jerry DeLisle wrote:
> On 02/15/2015 09:48 AM, Mikael Morin wrote:
>
>> [*] I have a few failing testcases (also without the patch), namely the
>> following; does this ring a bell ?
>> FAIL: gfortran.dg/erf_3.F90
>> FAIL: gfortran.dg/fmt_g0_7.f08
>> FAIL: gfortran.dg/fmt_en.f90
>> FAIL: gfortran.dg/nan_7.f90
>> FAIL: gfortran.dg/quad_2.f90
>> FAIL: gfortran.dg/quad_3.f90
>> FAIL: gfortran.dg/round_4.f90
>>
>
> fmt_g0_7.f08 is a new test that should be passing on x86-64 unless you
> have not updated scanner.c. Are these fails on x86-64? I do not see
> them here on mine.
>

On x86_64, yes.
I cleared my build tree, bootstrapped again, and the failures are gone. :-)

Mikael
Re: [C PATCH] Don't crash on null param (PR c/65066)
On Mon, Feb 16, 2015 at 12:05:06PM +0100, Marek Polacek wrote:
> The CUR_PARAM can be null at this place, so check for that.
>
> I had hoped that extra testing the original patch by running the C testsuite
> with -Wformat=2 enabled would detect such a case, but apparently not. :(
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2015-02-16 Marek Polacek
>
> PR c/65066
> * c-format.c (check_format_types): Handle null param.
>
> * gcc.dg/pr65066.c: New test.

Ok, thanks.

Jakub
[C PATCH] Don't crash on null param (PR c/65066)
The CUR_PARAM can be null at this place, so check for that.

I had hoped that extra testing of the original patch, by running the C testsuite with -Wformat=2 enabled, would detect such a case, but apparently not. :(

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-02-16 Marek Polacek

PR c/65066
* c-format.c (check_format_types): Handle null param.

* gcc.dg/pr65066.c: New test.

diff --git gcc/c-family/c-format.c gcc/c-family/c-format.c
index 2f49b2d..9d03ff0 100644
--- gcc/c-family/c-format.c
+++ gcc/c-family/c-format.c
@@ -2492,6 +2492,7 @@ check_format_types (location_t loc, format_wanted_type *types)
           && TREE_CODE (cur_type) == INTEGER_TYPE
           && warn_format_signedness
           && TYPE_UNSIGNED (wanted_type)
+          && cur_param != NULL_TREE
           && TREE_CODE (cur_param) == NOP_EXPR)
         {
           tree t = TREE_TYPE (TREE_OPERAND (cur_param, 0));
diff --git gcc/testsuite/gcc.dg/pr65066.c gcc/testsuite/gcc.dg/pr65066.c
index e69de29..883a87d 100644
--- gcc/testsuite/gcc.dg/pr65066.c
+++ gcc/testsuite/gcc.dg/pr65066.c
@@ -0,0 +1,12 @@
+/* PR c/65066 */
+/* { dg-do compile } */
+/* { dg-options "-Wformat=2" } */
+
+extern int sscanf (const char *restrict, const char *restrict, ...);
+int *a;
+
+void
+foo ()
+{
+  sscanf (0, "0x%x #", a); /* { dg-warning "expects argument of type" } */
+}

Marek
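The fix above relies on the left-to-right, short-circuit evaluation of `&&`: the null test has to appear before the operand that dereferences the pointer. A minimal sketch of that pattern follows; `struct node`, `enum node_code` and `is_nop_param` are hypothetical stand-ins, not GCC's `tree` type or its `TREE_CODE` macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's tree nodes; not the real types.  */
enum node_code { CODE_NOP_EXPR, CODE_OTHER };

struct node
{
  enum node_code code;
};

/* Returns 1 only when PARAM is non-null and is a NOP_EXPR-like node.
   The null test must come first: && evaluates its operands left to
   right and stops at the first false one, so PARAM is never
   dereferenced when it is null.  */
static int
is_nop_param (const struct node *param)
{
  return param != NULL && param->code == CODE_NOP_EXPR;
}
```

Swapping the two operands would reintroduce the crash, since the dereference would then happen before the null test.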
Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
On Mon, 16 Feb 2015, Thomas Preud'homme wrote: > Hi, > > The RTL cprop pass in GCC operates by doing a local constant/copy > propagation first and then a global one. In the local one, if a constant > cannot be propagated (eg. due to constraints of the destination > instruction) a copy propagation is done instead. However, at the global > level copy propagation is only tried if no constant can be propagated, > ie. if a constant can be propagated but the constraints of the > destination instruction forbids it, no copy propagation will be tried. > This patch fixes this issue. This solves the redundant ldr problem in > GCC32RM-439. I think Steven is more familiar with this code. Richard. > ChangeLog entries are as follows: > > *** gcc/ChangeLog *** > > 2015-01-21 Thomas Preud'homme thomas.preudho...@arm.com > > * cprop.c (find_avail_set): Return up to two sets, one whose source is > a register and one whose source is a constant. Sets are returned in > an array passed as parameter rather than as a return value. > (cprop_insn): Use a do while loop rather than a goto. Try each of the > sets returned by find_avail_set, starting with the one whose source is > a constant. > > > *** gcc/testsuite/ChangeLog *** > > 2015-01-21 Thomas Preud'homme thomas.preudho...@arm.com > > * gcc.target/arm/pr64616.c: New file. > > Following testing was done: > > Bootstrapped on x86_64 and ran the testsuite without regression > Build an arm-none-eabi cross-compilers and ran the testsuite without > regression with QEMU emulating a Cortex-M3 > Compiled CSiBE targeting x86_64 and Cortex-M3 arm-none-eabi without > regression > > diff --git a/gcc/cprop.c b/gcc/cprop.c > index 4538291..c246d4b 100644 > --- a/gcc/cprop.c > +++ b/gcc/cprop.c > @@ -815,15 +815,15 @@ try_replace_reg (rtx from, rtx to, rtx_insn *insn) >return success; > } > > -/* Find a set of REGNOs that are available on entry to INSN's block. Return > - NULL no such set is found. 
*/ > +/* Find a set of REGNOs that are available on entry to INSN's block. If > found, > + SET_RET[0] will be assigned a set with a register source and SET_RET[1] a > + set with a constant source. If not found the corresponding entry is set > to > + NULL. */ > > -static struct cprop_expr * > -find_avail_set (int regno, rtx_insn *insn) > +static void > +find_avail_set (int regno, rtx_insn *insn, struct cprop_expr *set_ret[2]) > { > - /* SET1 contains the last set found that can be returned to the caller for > - use in a substitution. */ > - struct cprop_expr *set1 = 0; > + set_ret[0] = set_ret[1] = NULL; > >/* Loops are not possible here. To get a loop we would need two sets > available at the start of the block containing INSN. i.e. we would > @@ -863,8 +863,10 @@ find_avail_set (int regno, rtx_insn *insn) > If the source operand changed, we may still use it for the next > iteration of this loop, but we may not use it for substitutions. */ > > - if (cprop_constant_p (src) || reg_not_set_p (src, insn)) > - set1 = set; > + if (cprop_constant_p (src)) > + set_ret[1] = set; > + else if (reg_not_set_p (src, insn)) > + set_ret[0] = set; > >/* If the source of the set is anything except a register, then >we have reached the end of the copy chain. */ > @@ -875,10 +877,6 @@ find_avail_set (int regno, rtx_insn *insn) >and see if we have an available copy into SRC. */ >regno = REGNO (src); > } > - > - /* SET1 holds the last set that was available and anticipatable at > - INSN. */ > - return set1; > } > > /* Subroutine of cprop_insn that tries to propagate constants into > @@ -1044,40 +1042,41 @@ cprop_insn (rtx_insn *insn) >int changed = 0, changed_this_round; >rtx note; > > -retry: > - changed_this_round = 0; > - reg_use_count = 0; > - note_uses (&PATTERN (insn), find_used_regs, NULL); > + do > +{ > + changed_this_round = 0; > + reg_use_count = 0; > + note_uses (&PATTERN (insn), find_used_regs, NULL); > > - /* We may win even when propagating constants into notes. 
*/ > - note = find_reg_equal_equiv_note (insn); > - if (note) > -find_used_regs (&XEXP (note, 0), NULL); > + /* We may win even when propagating constants into notes. */ > + note = find_reg_equal_equiv_note (insn); > + if (note) > + find_used_regs (&XEXP (note, 0), NULL); > > - for (i = 0; i < reg_use_count; i++) > -{ > - rtx reg_used = reg_use_table[i]; > - unsigned int regno = REGNO (reg_used); > - rtx src; > - struct cprop_expr *set; > + for (i = 0; i < reg_use_count; i++) > + { > + rtx reg_used = reg_use_table[i]; > + unsigned int regno = REGNO (reg_used); > + rtx src_cst = NULL, src_reg = NULL; > + struct cprop_expr *set[2]; > > - /* If the register has already been set in this block, there's > - nothi
[PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
Hi, The RTL cprop pass in GCC operates by doing a local constant/copy propagation first and then a global one. In the local one, if a constant cannot be propagated (e.g. due to constraints of the destination instruction), a copy propagation is done instead. However, at the global level copy propagation is only tried if no constant can be propagated, i.e. if a constant can be propagated but the constraints of the destination instruction forbid it, no copy propagation will be tried. This patch fixes this issue. This solves the redundant ldr problem in GCC32RM-439. ChangeLog entries are as follows: *** gcc/ChangeLog *** 2015-01-21 Thomas Preud'homme thomas.preudho...@arm.com * cprop.c (find_avail_set): Return up to two sets, one whose source is a register and one whose source is a constant. Sets are returned in an array passed as parameter rather than as a return value. (cprop_insn): Use a do while loop rather than a goto. Try each of the sets returned by find_avail_set, starting with the one whose source is a constant. *** gcc/testsuite/ChangeLog *** 2015-01-21 Thomas Preud'homme thomas.preudho...@arm.com * gcc.target/arm/pr64616.c: New file. The following testing was done: bootstrapped on x86_64 and ran the testsuite without regression; built an arm-none-eabi cross-compiler and ran the testsuite without regression with QEMU emulating a Cortex-M3; compiled CSiBE targeting x86_64 and Cortex-M3 arm-none-eabi without regression. diff --git a/gcc/cprop.c b/gcc/cprop.c index 4538291..c246d4b 100644 --- a/gcc/cprop.c +++ b/gcc/cprop.c @@ -815,15 +815,15 @@ try_replace_reg (rtx from, rtx to, rtx_insn *insn) return success; } -/* Find a set of REGNOs that are available on entry to INSN's block. Return - NULL no such set is found. */ +/* Find a set of REGNOs that are available on entry to INSN's block. If found, + SET_RET[0] will be assigned a set with a register source and SET_RET[1] a + set with a constant source. If not found the corresponding entry is set to + NULL. 
*/ -static struct cprop_expr * -find_avail_set (int regno, rtx_insn *insn) +static void +find_avail_set (int regno, rtx_insn *insn, struct cprop_expr *set_ret[2]) { - /* SET1 contains the last set found that can be returned to the caller for - use in a substitution. */ - struct cprop_expr *set1 = 0; + set_ret[0] = set_ret[1] = NULL; /* Loops are not possible here. To get a loop we would need two sets available at the start of the block containing INSN. i.e. we would @@ -863,8 +863,10 @@ find_avail_set (int regno, rtx_insn *insn) If the source operand changed, we may still use it for the next iteration of this loop, but we may not use it for substitutions. */ - if (cprop_constant_p (src) || reg_not_set_p (src, insn)) - set1 = set; + if (cprop_constant_p (src)) + set_ret[1] = set; + else if (reg_not_set_p (src, insn)) + set_ret[0] = set; /* If the source of the set is anything except a register, then we have reached the end of the copy chain. */ @@ -875,10 +877,6 @@ find_avail_set (int regno, rtx_insn *insn) and see if we have an available copy into SRC. */ regno = REGNO (src); } - - /* SET1 holds the last set that was available and anticipatable at - INSN. */ - return set1; } /* Subroutine of cprop_insn that tries to propagate constants into @@ -1044,40 +1042,41 @@ cprop_insn (rtx_insn *insn) int changed = 0, changed_this_round; rtx note; -retry: - changed_this_round = 0; - reg_use_count = 0; - note_uses (&PATTERN (insn), find_used_regs, NULL); + do +{ + changed_this_round = 0; + reg_use_count = 0; + note_uses (&PATTERN (insn), find_used_regs, NULL); - /* We may win even when propagating constants into notes. */ - note = find_reg_equal_equiv_note (insn); - if (note) -find_used_regs (&XEXP (note, 0), NULL); + /* We may win even when propagating constants into notes. 
*/ + note = find_reg_equal_equiv_note (insn); + if (note) + find_used_regs (&XEXP (note, 0), NULL); - for (i = 0; i < reg_use_count; i++) -{ - rtx reg_used = reg_use_table[i]; - unsigned int regno = REGNO (reg_used); - rtx src; - struct cprop_expr *set; + for (i = 0; i < reg_use_count; i++) + { + rtx reg_used = reg_use_table[i]; + unsigned int regno = REGNO (reg_used); + rtx src_cst = NULL, src_reg = NULL; + struct cprop_expr *set[2]; - /* If the register has already been set in this block, there's -nothing we can do. */ - if (! reg_not_set_p (reg_used, insn)) - continue; + /* If the register has already been set in this block, there's +nothing we can do. */ + if (! reg_not_set_p (reg_used, insn)) + continue; - /* Find an assignment that sets reg_used and is available -at the sta
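The fallback order described in this patch (try the set with a constant source first, then the set with a register source) can be sketched in isolation. This is a simplified model under stated assumptions, not the code in cprop.c: `struct avail_set`, `insn_accepts` and `pick_replacement` are hypothetical names, and the "constraint" is an invented 0..255 immediate range standing in for a real target's operand constraints.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; none of these names exist in cprop.c.  */
enum src_kind { SRC_REG, SRC_CONST };

struct avail_set
{
  enum src_kind kind;
  int value;            /* Register number or constant value.  */
};

/* Stand-in for the destination instruction's constraints: pretend it
   only accepts constants in the range 0..255 and any register.  */
static int
insn_accepts (const struct avail_set *s)
{
  if (s->kind == SRC_CONST)
    return s->value >= 0 && s->value <= 255;
  return 1;
}

/* CAND[0] is the set with a register source and CAND[1] the set with
   a constant source; either may be NULL.  Try the constant first and
   fall back to the register copy.  Returns the chosen set, or NULL
   when neither can be used.  */
static const struct avail_set *
pick_replacement (const struct avail_set *cand[2])
{
  for (int i = 1; i >= 0; i--)
    if (cand[i] != NULL && insn_accepts (cand[i]))
      return cand[i];
  return NULL;
}
```

The point of returning both candidates is visible in the first case below: a constant that exists but is rejected by the instruction no longer hides the usable register copy.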
Re: [PATCH, PR tree-optimization/65002] Disable SRA for functions wrongly marked as read-only
2015-02-15 20:25 GMT+03:00 Mike Stump :
> On Feb 13, 2015, at 11:25 AM, Jakub Jelinek wrote:
>>>> 2015-02-12 Ilya Enkovich
>>>>
>>>> PR tree-optimization/65002
>>>> * gcc.dg/pr65002.C: New.
>>>
>>> This test should have gone into g++.dg.
>>
>> Into g++.dg/opt or g++.dg/ipa in particular.
>
> Pre-approved if someone wants to svn mv it.

Moved it.

2015-02-16 Ilya Enkovich

* gcc.dg/pr65002.C: Move ...
* g++.dg/ipa/pr65002.C: ... here.

Thanks,
Ilya
RE: [PATCH, FT32] initial support
> From: Joseph Myers
> ...
> > +@table @gcctabopt
> > +
> > +@item -mspace
> > +@opindex mspace
> > +Enable code-size optimizations.
> > +Some of these optimizations incur a minor performance penalty.
>
> We already have -Os, so why is an architecture-specific option for this
> needed?
>
> > +A 16-bit signed constant (-32768..32767)
>
> Use @minus{} for a minus sign in Texinfo documentation, and @dots{}
> instead of literal ".." or "...".

I have updated the target options. Space-saving is now enabled by -Os. There is also a new option -msim to enable building for the simulator (the simulator is pending submission to gdb-binutils). I have fixed the Texinfo formatting.

FT32 is a new high performance 32-bit RISC core developed by FTDI for embedded applications. Support for FT32 has already been added to binutils. This patch adds FT32 support to gcc.

Please can someone review it, and if appropriate commit it, as I do not have write access to the tree. The FSF have acknowledged receipt of FTDI's copyright assignment papers.

Thanks very much.

ChangeLog entry:

2014-02-16 James Bowman

* configure.ac: FT32 target added
* libgcc/config.host: FT32 target added
* gcc/config/ft32/: FT32 target added
* libgcc/config/ft32/: FT32 target added
* gcc/doc/install.texi, invoke.texi, md.texi: FT32 details added
* gcc/doc/contrib.texi, MAINTAINERS: self added
* contrib/config-list.mk: FT32 target added
* configure: Regenerated

--
James Bowman
FTDI Open Source Liaison

Attachment: gcc-ft32.txt.gz
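For readers unfamiliar with operand constraints such as the "16-bit signed constant" quoted above, the range check behind such a constraint can be sketched as follows. `fits_signed_16` is a hypothetical helper written for illustration; it is not taken from the FT32 port, whose sources are only attached to the message and not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when VALUE is representable as a 16-bit
   signed constant, i.e. lies in -32768..32767 inclusive.  */
static bool
fits_signed_16 (long value)
{
  return value >= INT16_MIN && value <= INT16_MAX;
}
```

Both bounds are inclusive, which matches the range given in the quoted documentation line.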
Re: [4.8 branch] PATCH: PR middle-end/53623: [4.7/4.8 Regression] sign extension is effectively split into two x86-64 instructions
On Sun, Feb 15, 2015 at 12:53:39PM -0800, H.J. Lu wrote:
> This is a backport of the patch for PR middle-end/53623 plus all bug
> fixes caused by it. Tested on Linux/x86-32, Linux/x86-64 and x32. OK
> for 4.8 branch?

What about PR64286 and PR63659, are you sure those aren't related?
I mean, they are on the 4.9 branch and I don't see why they couldn't affect
the 4.8 backport.

Jakub
Re: [4.8 branch] PATCH: PR middle-end/53623: [4.7/4.8 Regression] sign extension is effectively split into two x86-64 instructions
On Sun, Feb 15, 2015 at 9:53 PM, H.J. Lu wrote: > Hi, > > This is a backport of the patch for PR middle-end/53623 plus all bug > fixes caused by it. Tested on Linux/x86-32, Linux/x86-64 and x32. OK > for 4.8 branch? Ok if nobody objects within 24h (you changed the bug to wrong-code which is why I am considering it - a missed-optimization fix this big shouldn't be backported). Richard. > Thanks. > > > H.J. > --- > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 469ee31..44bf322 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,82 @@ > +2015-02-15 H.J. Lu > + > + Backport from mainline: > + 2014-06-13 Jeff Law > + > + PR rtl-optimization/61094 > + PR rtl-optimization/61446 > + * ree.c (combine_reaching_defs): Get the mode for the copy from > + the extension insn rather than the defining insn. > + > + 2014-06-02 Jeff Law > + > + PR rtl-optimization/61094 > + * ree.c (combine_reaching_defs): Do not reextend an insn if it > + was marked as do_no_reextend. If a copy is needed to eliminate > + an extension, then mark it as do_not_reextend. > + > + 2014-02-14 Jeff Law > + > + PR rtl-optimization/60131 > + * ree.c (get_extended_src_reg): New function. > + (combine_reaching_defs): Use it rather than assuming location > + of REG. > + (find_and_remove_re): Verify first operand of extension is > + a REG before adding the insns to the copy list. > + > + 2014-01-17 Jeff Law > + > + * ree.c (combine_set_extension): Temporarily disable test for > + changing number of hard registers. > + > + 2014-01-15 Jeff Law > + > + PR tree-optimization/59747 > + * ree.c (find_and_remove_re): Properly handle case where a second > + eliminated extension requires widening a copy created for elimination > + of a prior extension. > + (combine_set_extension): Ensure that the number of hard regs needed > + for a destination register does not change when we widen it. 
> + > + 2014-01-10 Jeff Law > + > + PR middle-end/59743 > + * ree.c (combine_reaching_defs): Ensure the defining statement > + occurs before the extension when optimizing extensions with > + different source and destination hard registers. > + > + 2014-01-10 Jakub Jelinek > + > + PR rtl-optimization/59754 > + * ree.c (combine_reaching_defs): Disallow !SCALAR_INT_MODE_P > + modes in the REGNO != REGNO case. > + > + 2014-01-08 Jeff Law > + > + * ree.c (get_sub_rtx): New function, extracted from... > + (merge_def_and_ext): Here. > + (combine_reaching_defs): Use get_sub_rtx. > + > + 2014-01-07 Jeff Law > + > + PR middle-end/53623 > + * ree.c (combine_set_extension): Handle case where source > + and destination registers in an extension insn are different. > + (combine_reaching_defs): Allow source and destination > + registers in extension to be different under limited > + circumstances. > + (add_removable_extension): Remove restriction that the > + source and destination registers in the extension are the > + same. > + (find_and_remove_re): Emit a copy from the extension's > + destination to its source after the defining insn if > + the source and destination registers are different. > + > + 2013-12-12 Jeff Law > + > + * i386.md (simple LEA peephole2): Add missing mode to zero_extend > + for zero-extended MULT simple LEA pattern. 
> + > 2015-02-12 Jakub Jelinek > > Backported from mainline > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > index 372ae63..aabd6ec 100644 > --- a/gcc/config/i386/i386.md > +++ b/gcc/config/i386/i386.md > @@ -17265,7 +17265,7 @@ > && REGNO (operands[0]) == REGNO (operands[1]) > && peep2_regno_dead_p (0, FLAGS_REG)" >[(parallel [(set (match_dup 0) > - (zero_extend (ashift:SI (match_dup 1) (match_dup 2 > + (zero_extend:DI (ashift:SI (match_dup 1) (match_dup 2 > (clobber (reg:CC FLAGS_REG))])] >"operands[2] = GEN_INT (exact_log2 (INTVAL (operands[2])));") > > diff --git a/gcc/ree.c b/gcc/ree.c > index c7e106f..bc566ad 100644 > --- a/gcc/ree.c > +++ b/gcc/ree.c > @@ -327,8 +327,30 @@ combine_set_extension (ext_cand *cand, rtx curr_insn, > rtx *orig_set) > { >rtx orig_src = SET_SRC (*orig_set); >enum machine_mode orig_mode = GET_MODE (SET_DEST (*orig_set)); > - rtx new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set))); >rtx new_set; > + rtx cand_pat = PATTERN (cand->insn); > + > + /* If the extension's source/destination registers are not the same > + then we need to change the original load to reference the destination > + of the extension. Then we need to emit a copy from that destination > + to the original destination of the load