Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])
On Thu, Oct 4, 2012 at 8:57 AM, Steven Bosscher stevenb@gmail.com wrote: On Thu, Oct 4, 2012 at 5:30 AM, Vladimir Makarov vmaka...@redhat.com wrote: I was going to look at this code too, but I was interested in generating fewer points and live ranges. It is strange that in my profiles, remove_some_program_points_and_update_live_ranges takes 0.6% of compiler time on these huge tests. So I was not interested in speeding up the function, and maybe that is why you see no visible change in compilation time. Right. The compression algorithm doesn't care much about the initial number of program points, only about the number of live ranges before and after compression. I had expected a bigger effect on the number of live ranges before compression. 0.6% sounds really very different from my timings. How much time does create_start_finish_chains take for you? I don't object to the idea of the patch. I need some time to look at it (the different results on a function are a bit scary for me) and check simulator times on other tests. Understood. BTW, it would be great if you could also look at this additional patch hunk: @@ -994,8 +1044,8 @@ lra_create_live_ranges (bool all_p) curr_point = 0; point_freq_vec = VEC_alloc (int, heap, get_max_uid () * 2); lra_point_freq = VEC_address (int, point_freq_vec); - FOR_EACH_BB (bb) - process_bb_lives (bb); + FOR_EACH_BB_REVERSE (bb) + process_bb_lives (bb, curr_point); lra_live_max_point = curr_point; create_start_finish_chains (); if (lra_dump_file != NULL) I think this should result in more live ranges being merged. Here's why I think so, based on my far worse understanding of this code than yours, so forgive me if I'm Completely Wrong :-) process_bb_lives walks insns in the basic block from last to first, so say you have a basic block chain 1-2-3, and each block has 4 insns; then AFAIU the program points in block 1 will be [4,3,2,1], in block 2 they will be [8,7,6,5], and in block 3 they will be [12,11,10,9].
Say a reg is used in block 3 at point 11, and set in block 1 at point 3. Then this reg will have a live range chain [3-1],[8-5],[12-11]. If you visit the basic blocks in reverse order, the program points will be: 1:[12,11,10,9], 2:[8,7,6,5], 3:[4,3,2,1]. Now the same reg will be set at point 11 and used at point 3, and the live range chain will be just [11-3]. I'm experimenting with this extra hunk and will report back here. Ciao! Steven
Re: RFC: Using DECL_NO_LIMIT_STACK as a backend specific flag
On Wed, Oct 3, 2012 at 5:20 PM, nick clifton ni...@redhat.com wrote: Hi Ian, Can't you just keep a list of the decls for which you have issued the warning? Yes - that would work too. In fact I agree that this would be a cleaner solution in my particular case. I'll create a new patch... How many decls do you expect to be on that list? Not very many. Maybe two or three at most. But I am interested to know if there is a way for targets to add their own information to function decls (and decls in general). If not for this particular case, then for problems to come in the future. Not to the decl. But I suppose you are only looking at defined functions, so it should suffice to amend struct function. It's reasonable to allow the target to associate extra info with struct function, and we already have a way to do that via the init_machine_status target hook. Richard. Cheers Nick
Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s
On Wed, Oct 03, 2012 at 01:02:44PM +0200, Basile Starynkevitch wrote: On Wed, Oct 03, 2012 at 12:21:02PM +0300, Laurynas Biveinis wrote: Hello Basile - 2012-10-02 Basile Starynkevitch bas...@starynkevitch.net * gengtype.c (walk_type): Emit mark_hook when inside a struct of a union member. Can you send me off-list the gengtype output before and after the fix? I messed something up; the example I sent was wrong. Let's start all over again. Consider the following file [...] This is PR54809 on our bugzilla. Thanks. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***
Re: RFA: add lock_length attribute to break branch-shortening cycles
On Wed, Oct 3, 2012 at 8:22 PM, Joern Rennecke joern.renne...@embecosm.com wrote: The ARCompact architecture has some pipelining features that result in the vanilla branch shortening not always converging. Moreover, there are some short, complex branch instructions with very short offsets; replacing them with a multi-insn sequence when the offset doesn't reach makes the code significantly longer. Thus, when starting branch shortening with pessimistic assumptions, the short branches are often not used, because the pessimistic branch lengths cause the offsets to go out of range. This problem can be avoided by starting with a low estimate and working upwards. However, that makes the incidence of infinite branch shortening cycles higher, and also makes it impossible to just break out after some iteration count. To address these issues, I've made the generator programs recognize the optional lock_length attribute. To quote from the documentation added for this feature: If you define the `lock_length' attribute, branch shortening will work the other way round: it starts out assuming minimum instruction lengths and iterates from there. In addition, the value of the `lock_length' attribute does not decrease across iterations, and the value computed for the `length' attribute will be no smaller than that of the `lock_length' attribute. I miss a few things in this description: - what is the value of lock_length supposed to be? From the lock prefix it sounds like it is something unchanging, maybe even constant, thus a maximum? - the length attribute still needs to be specified when lock_length is? how do they relate? Is lock_length always smaller / bigger than length? - what happens if you have patterns with lock_length and patterns without? - what patterns does lock_length apply to? In general, optimistically attacking this kind of problem should always be better - did you try simply switching this for all targets?
It shouldn't be slower, and the only thing you need to guarantee is that during iteration you never make insn lengths smaller again. Richard. bootstrapped and regression tested on i686-pc-linux-gnu 2012-10-03 Joern Rennecke joern.renne...@embecosm.com * final.c (get_attr_length_1): Use direct recursion rather than calling get_attr_length. (get_attr_lock_length): New function. (INSN_VARIABLE_LENGTH_P): Define. (shorten_branches): Take HAVE_ATTR_lock_length into account. Don't overwrite non-delay slot insn lengths with the lengths of delay slot insns with same uid. * genattrtab.c (lock_length_str): New variable. (make_length_attrs): New parameter base. (main): Initialize lock_length_str. Generate lock_lengths attributes. * genattr.c (gen_attr): Emit declarations for lock_length attribute related functions. * doc/md.texi (node Insn Lengths): Document lock_length attribute. Index: doc/md.texi === --- doc/md.texi (revision 192036) +++ doc/md.texi (working copy) @@ -8004,6 +8004,20 @@ (define_insn jump (const_int 6)))]) @end smallexample +@cindex lock_length +Usually, branch shortening is done assuming the worst case (i.e. longest) +lengths, and then iterating (if optimizing) to smaller lengths till +no further changes occur. This does not work so well for architectures +that have very small minimum offsets and considerable jumps in instruction +lengths. + +If you define the @code{lock_length} attribute, branch shortening will +work the other way round: it starts out assuming minimum instruction +lengths and iterates from there. In addition, the value of the +@code{lock_length} attribute does not decrease across iterations, and +the value computed for the @code{length} attribute will be no smaller +than that of the @code{lock_length} attribute. + @end ifset @ifset INTERNALS @node Constant Attributes Index: final.c === --- final.c (revision 192036) +++ final.c (working copy) @@ -312,6 +312,7 @@ dbr_sequence_length (void) `insn_current_length'.
*/ static int *insn_lengths; +static char *uid_lock_length; VEC(int,heap) *insn_addresses_; @@ -447,6 +448,20 @@ get_attr_length (rtx insn) return get_attr_length_1 (insn, insn_default_length); } +#ifdef HAVE_ATTR_lock_length +int +get_attr_lock_length (rtx insn) +{ + if (uid_lock_length && insn_lengths_max_uid > INSN_UID (insn)) + return uid_lock_length[INSN_UID (insn)]; + return get_attr_length_1 (insn, insn_min_lock_length); +} +#define INSN_VARIABLE_LENGTH_P(INSN) \ + (insn_variable_length_p (INSN) || insn_variable_lock_length_p (INSN)) +#else +#define INSN_VARIABLE_LENGTH_P(INSN) (insn_variable_length_p (INSN)) +#endif + /* Obtain the current length of an insn.
Re: [PATCH] Fix up DW_TAG_formal_parameter placement
On Thu, Oct 04, 2012 at 09:42:59AM +0200, Richard Guenther wrote: This looks like the wrong place to fix things to me ... either we can fix this at the point we create the VAR_DECLs for the optimized away PARM_DECLs (or we should delay that until here?) No, that is not possible. There is no other block they could be added to (they are added to the DECL_INITIAL block), and they definitely need to be added there; they are needed for the more common case where it is not inlined. And in that case it is the right location: for a non-inlined function, the DECL_INITIAL block's BLOCK_VARS is added directly as children of the DW_TAG_subprogram. or we fix it up in dwarf2out.c (how does this fix interact with stabs and the other debuginfo formats?) We can't do that either, dwarf2out doesn't have information about whether blocks are really used (as in, any insns/stmts mention that block in INSN_BLOCK/gimple_block) or not; it is only correct to move the VAR_DECLs with PARM_DECL DECL_ORIGIN (i.e. DW_TAG_formal_parameter) up if the outer BLOCK is not referenced by any insn/stmt (i.e. if the ranges of the inner block with the VAR_DECL and the outer block are exactly the same). If the outer block has a range that is a superset of the inner block's range, then the move would invalidly say that the DW_TAG_formal_parameter is available somewhere where it is not supposed to be available. Initially I thought I'd do the moves in tree-ssa-live.c; in remove_unused_scope_block_p it has information about which blocks are used by any stmts and which are not. But it would be terribly expensive: for each VAR_DECL in a block whose BLOCK_SUPERCONTEXT wasn't originally TREE_USED before remove_unused_scope_block_p (and for such blocks up to a !TREE_USED inlined_function_outer_scope_p), it would need to search all of BLOCK_SUPERCONTEXT's BLOCK_VARS to see if the VAR_DECL isn't present there as well, and only if not, move it to the inlined_function_outer_scope_p BLOCK.
Doing it in tree-inline.c is IMHO the right spot, it is the place that creates the extra artificial BLOCK around the remapped DECL_INITIAL block and puts function arguments there. At that point we know for sure that the DECL_INITIAL block has the same ranges as the whole inline function, and it is desirable to move all arguments to the outer block, not just those that were still present in DECL_ARGUMENTS during inlining. If you want to be more specific on what is to be moved, we could either add some VAR_DECL flag bit (but that is expensive, we don't have many), or perhaps just check that DECL_CONTEXT (DECL_ORIGIN (v)) == DECL_ORIGIN (fn) (but is that ever false?). mentioning DWARF in tree-inline looks odd, unless we get rid of the other formats - something I'd of course welcome ;)) That can be fixed, I can replace the DWARF terminology with something more fuzzy. Jakub
Adjust gcc.dg/lower-subreg-1.c for SPARC
Tested on SPARC/Solaris 10, applied on the mainline. 2012-10-04 Eric Botcazou ebotca...@adacore.com PR rtl-optimization/54739 * gcc.dg/lower-subreg-1.c: Also skip on SPARC. -- Eric Botcazou Index: gcc.dg/lower-subreg-1.c === --- gcc.dg/lower-subreg-1.c (revision 192031) +++ gcc.dg/lower-subreg-1.c (working copy) @@ -1,4 +1,4 @@ -/* { dg-do compile { target { ! { mips64 || { arm*-*-* ia64-*-* spu-*-* tilegx-*-* } } } } } */ +/* { dg-do compile { target { ! { mips64 || { arm*-*-* ia64-*-* sparc*-*-* spu-*-* tilegx-*-* } } } } } */ /* { dg-options "-O -fdump-rtl-subreg1" } */ /* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && x32 } { "*" } { "" } } */ /* { dg-require-effective-target ilp32 } */
Re: opts.c, gcc.c: Plug some memory leaks - and an out-of-bounds memory access
On Wed, Oct 3, 2012 at 11:01 PM, Tobias Burnus bur...@net-b.de wrote: Found using http://scan5.coverity.com/ Built on x86-64-gnu-linux with C/C++/Fortran. I will now do an all-language build/regtest. OK when it passes? (Note on the save_string call: I reduced it by 2: the +1 in the call makes it long (out of bounds), and the +1 in temp_filename_length is not needed (but also doesn't harm), as tmp is null terminated and save_string adds another '\0' after copying len bytes.) - prefix = concat (target_sysroot_suffix, prefix, NULL); - prefix = concat (sysroot_no_trailing_dir_separator, prefix, NULL); + { + char *tmp; + tmp = concat (target_sysroot_suffix, prefix, NULL); + prefix = concat (sysroot_no_trailing_dir_separator, tmp, NULL); + free (tmp); + } prefix = concat (sysroot_no_trailing_dir_separator, target_sysroot_suffix, prefix, NULL); should be equivalent and easier to read, no? + else + prefix = concat (sysroot_no_trailing_dir_separator, prefix, NULL); + btw, we're not caring too much about memleaks in the driver ... Otherwise the patch looks ok with the above change. Thanks, Richard. Tobias
Fix -fdump-ada-spec
After changes by Sharad (Add option for dumping to stderr (issue6190057)), -fdump-ada-spec is broken, and is now a no-op. Admittedly, this is because -fdump-ada-spec is handled differently from other -fdump-* switches, so this patch fixes support for -fdump-ada-spec by using an approach similar to -fdump-go-spec, and use regular switches via c/c.opt. I've removed the handling of TDF_RAW, which was a debugging option, and never really used, so can be simply deleted. Change is mostly trivial/mechanical. Tested on x86_64-pc-linux-gnu, OK for trunk? gcc/ 2012-10-04 Arnaud Charlet char...@adacore.com * dumpfile.h, dumpfile.c: Remove TDI_ada. c-family/ 2012-10-04 Arnaud Charlet char...@adacore.com * c-ada-spec.c (print_ada_declaration): Remove handling of TDF_RAW. * c.opt (-fdump-ada-spec, -fdump-ada-spec-slim): Move switch definition out of dumpfile.h. c/ 2012-10-04 Arnaud Charlet char...@adacore.com * c-decl.c (c_write_global_declarations): Fix handling of -fdump-ada-spec*. cp/ 2012-10-04 Arnaud Charlet char...@adacore.com * decl2.c (cp_write_global_declarations): Fix handling of -fdump-ada-spec*. 
Arno -- Index: c-family/c.opt === --- c-family/c.opt (revision 192062) +++ c-family/c.opt (working copy) @@ -799,6 +799,14 @@ fdollars-in-identifiers C ObjC C++ ObjC++ Permit '$' as an identifier character +fdump-ada-spec +C ObjC C++ ObjC++ RejectNegative Var(flag_dump_ada_spec) +Write all declarations as Ada code transitively + +fdump-ada-spec-slim +C ObjC C++ ObjC++ RejectNegative Var(flag_dump_ada_spec_slim) +Write all declarations as Ada code for the given file only + felide-constructors C++ ObjC++ Var(flag_elide_constructors) Init(1) Index: c-family/c-ada-spec.c === --- c-family/c-ada-spec.c (revision 192062) +++ c-family/c-ada-spec.c (working copy) @@ -2535,7 +2535,6 @@ print_ada_declaration (pretty_printer *b int is_class = false; tree name = TYPE_NAME (TREE_TYPE (t)); tree decl_name = DECL_NAME (t); - bool dump_internal = get_dump_file_info (TDI_ada)->pflags & TDF_RAW; tree orig = NULL_TREE; if (cpp_check && cpp_check (t, IS_TEMPLATE)) @@ -2705,8 +2704,7 @@ print_ada_declaration (pretty_printer *b } else { - if (!dump_internal - && TREE_CODE (t) == VAR_DECL + if (TREE_CODE (t) == VAR_DECL && decl_name && *IDENTIFIER_POINTER (decl_name) == '_') return 0; @@ -2796,8 +2794,7 @@ print_ada_declaration (pretty_printer *b /* If this function has an entry in the dispatch table, we cannot omit it. */ - if (!dump_internal && !DECL_VINDEX (t) - && *IDENTIFIER_POINTER (decl_name) == '_') + if (!DECL_VINDEX (t) && *IDENTIFIER_POINTER (decl_name) == '_') { if (IDENTIFIER_POINTER (decl_name)[1] == '_') return 0; Index: c/c-decl.c === --- c/c-decl.c (revision 192062) +++ c/c-decl.c (working copy) @@ -10079,10 +10079,10 @@ c_write_global_declarations (void) gcc_assert (!current_scope); /* Handle -fdump-ada-spec[-slim].
*/ - if (dump_initialized_p (TDI_ada)) + if (flag_dump_ada_spec || flag_dump_ada_spec_slim) { /* Build a table of files to generate specs for */ - if (get_dump_file_info (TDI_ada)->pflags & TDF_SLIM) + if (flag_dump_ada_spec_slim) collect_source_ref (main_input_filename); else for_each_global_decl (collect_source_ref_cb); Index: cp/decl2.c === --- cp/decl2.c (revision 192062) +++ cp/decl2.c (working copy) @@ -3698,9 +3698,9 @@ cp_write_global_declarations (void) cgraph_process_same_body_aliases (); /* Handle -fdump-ada-spec[-slim] */ - if (dump_initialized_p (TDI_ada)) + if (flag_dump_ada_spec || flag_dump_ada_spec_slim) { - if (get_dump_file_info (TDI_ada)->pflags & TDF_SLIM) + if (flag_dump_ada_spec_slim) collect_source_ref (main_input_filename); else collect_source_refs (global_namespace); Index: dumpfile.c === --- dumpfile.c (revision 192062) +++ dumpfile.c (working copy) @@ -57,8 +57,7 @@ static struct dump_file_info dump_files[ 0, 0, 0, 5}, {".vcg", "tree-vcg", NULL, NULL, NULL, NULL, NULL, TDF_TREE, 0, 0, 0, 6}, - {".ads", "ada-spec", NULL, NULL, NULL, NULL, NULL, 0, 0, 0, 0, 7}, -#define FIRST_AUTO_NUMBERED_DUMP 8 +#define FIRST_AUTO_NUMBERED_DUMP 7 {NULL, "tree-all", NULL, NULL, NULL, NULL, NULL, TDF_TREE, 0, 0, 0, 0}, Index: dumpfile.h === --- dumpfile.h (revision 192062) +++ dumpfile.h (working copy) @@ -37,7 +37,6 @@ enum tree_dump_index TDI_nested,
Re: [PATCH] Fix instability of -fschedule-insn for x86
Hi, Based on this opinion, the patch is OK for mainline, if there are no ... Checked in: http://gcc.gnu.org/ml/gcc-cvs/2012-10/msg00187.html K
Re: [Ping]FW: [PATCH] Cilk Plus merging to trunk (2 of n)
On Thu, Oct 4, 2012 at 2:22 AM, Iyer, Balaji V balaji.v.i...@intel.com wrote: Hi Joseph, Did you get a chance to look at this submission? I think I have fixed all the changes you have mentioned. Is it OK for trunk? Thanks, Balaji V. Iyer. -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Iyer, Balaji V Sent: Wednesday, September 26, 2012 7:16 PM To: Joseph Myers Cc: gcc-patches@gcc.gnu.org; al...@redhat.com; r...@redhat.com; l...@redhat.com Subject: RE: [PATCH] Cilk Plus merging to trunk (2 of n) Hello Joseph, In my last patch, I forgot to add the change Richard Guenther wanted me to make. He wanted me to move the ARRAY_NOTATION_REF node from tree.def to c-family/c-common.def. Here is a new one that has this change. I am sorry for this. Here are ChangeLog entries: gcc/ChangeLog 2012-09-26 Balaji V. Iyer balaji.v.i...@intel.com * tree.h (array_notation_reduce_type): Added new enumerator. This should be moved to c-tree.h then, and ... * Makefile.in (OBJS): Added array-notation-common.o. * doc/passes.texi (Cilk Plus Transformation): Documented array notation and overall transformations for Cilk Plus. * doc/invoke.texi (C Dialect Options): Documented -fcilkplus flag. * doc/generic.texi (Storage References): Documented ARRAY_NOTATION_REF tree addition. * tree-pretty-print.c (dump_generic_node): Added ARRAY_NOTATION_REF case. ... this to c-pretty-print.c and * array-notation-common.c: New file. ... this to the c-common/ directory. Basically this should be a completely frontend-only patch. Richard. gcc/c-family/ChangeLog 2012-09-26 Balaji V. Iyer balaji.v.i...@intel.com * c-common.h (build_array_notation_expr): New function declaration. (ARRAY_NOTATION_ARRAY): Added new #define. (ARRAY_NOTATION_CHECK): Likewise. (ARRAY_NOTATION_START): Likewise. (ARRAY_NOTATION_LENGTH): Likewise. (ARRAY_NOTATION_STRIDE): Likewise. (ARRAY_NOTATION_TYPE): Likewise. * c-common.def: Added new tree ARRAY_NOTATION_REF.
* c-common.c (c_define_builtins): Added a call to initialize array notation builtin functions. (c_common_init_ts): Set ARRAY_NOTATION_REF as typed. * c.opt (-fcilkplus): Define new command line switch. gcc/c/ChangeLog 2012-09-26 Balaji V. Iyer balaji.v.i...@intel.com * c-typeck.c (convert_arguments): Added a check if tree contains array notation expressions before throwing errors or doing anything. * Make-lang.in (C_AND_OBJC_OBJS): Added c-array-notation.o. * c-parser.c (c_parser_compound_statement): Check if array notation code is used in tree, if so, then transform them into appropriate C code. (c_parser_expr_no_commas): Check if array notation is used in LHS or RHS, if so, then build array notation expression instead of regular modify. (c_parser_postfix_expression_after_primary): Added a check for colon(s) after square braces, if so then handle it like an array notation. Also, break up array notations in unary op if found. (c_parser_array_notation): New function. * c-array-notation.c: New file. gcc/testsuite/ChangeLog 2012-09-26 Balaji V. Iyer balaji.v.i...@intel.com * gcc.dg/cilk-plus/array_notation/execute/execute.exp: New script. * gcc.dg/cilk-plus/array_notation/compile/compile.exp: Likewise. * gcc.dg/cilk-plus/array_notation/errors/errors.exp: Likewise. * gcc.dg/cilk-plus/array_notation/execute/sec_implicit_ex.c: New test. * gcc.dg/cilk-plus/array_notation/execute/if_test.c: Likewise. * gcc.dg/cilk-plus/array_notation/execute/gather_scatter.c: Likewise. * gcc.dg/cilk-plus/array_notation/execute/builtin_func_double2.c: Likewise. * gcc.dg/cilk-plus/array_notation/execute/builtin_func_double.c: Likewise. * gcc.dg/cilk-plus/array_notation/execute/builtin_fn_custom.c: Likewise. * gcc.dg/cilk-plus/array_notation/execute/builtin_fn_mutating.c: Likewise. * gcc.dg/cilk-plus/array_notation/execute/array_test_ND.c: Likewise. * gcc.dg/cilk-plus/array_notation/execute/array_test2.c: Likewise. * gcc.dg/cilk-plus/array_notation/execute/array_test1.c: Likewise. 
* gcc.dg/cilk-plus/array_notation/compile/sec_implicit_ex.c: Likewise. * gcc.dg/cilk-plus/array_notation/compile/gather_scatter.c: Likewise. * gcc.dg/cilk-plus/array_notation/compile/builtin_func_double2.c: Likewise. * gcc.dg/cilk-plus/array_notation/compile/array_test_ND.c: Likewise. * gcc.dg/cilk-plus/array_notation/compile/if_test.c: Likewise. * gcc.dg/cilk-plus/array_notation/compile/builtin_func_double.c: Likewise. * gcc.dg/cilk-plus/array_notation/compile/array_test1.c:
Re: Fix -fdump-ada-spec
On Thu, Oct 4, 2012 at 10:26 AM, Arnaud Charlet char...@adacore.com wrote: After changes by Sharad (Add option for dumping to stderr (issue6190057)), -fdump-ada-spec is broken, and is now a no-op. Admittedly, this is because -fdump-ada-spec is handled differently from other -fdump-* switches, so this patch fixes support for -fdump-ada-spec by using an approach similar to -fdump-go-spec, and use regular switches via c/c.opt. I've removed the handling of TDF_RAW, which was a debugging option, and never really used, so can be simply deleted. Change is mostly trivial/mechanical. Tested on x86_64-pc-linux-gnu, OK for trunk? Much cleaner indeed. Ok, Thanks, Richard. gcc/ 2012-10-04 Arnaud Charlet char...@adacore.com * dumpfile.h, dumpfile.c: Remove TDI_ada. c-family/ 2012-10-04 Arnaud Charlet char...@adacore.com * c-ada-spec.c (print_ada_declaration): Remove handling of TDF_RAW. * c.opt (-fdump-ada-spec, -fdump-ada-spec-slim): Move switch definition out of dumpfile.h. c/ 2012-10-04 Arnaud Charlet char...@adacore.com * c-decl.c (c_write_global_declarations): Fix handling of -fdump-ada-spec*. cp/ 2012-10-04 Arnaud Charlet char...@adacore.com * decl2.c (cp_write_global_declarations): Fix handling of -fdump-ada-spec*. 
Arno
[Ada] Remaining fixes to get MINIMIZED through test suite
This patch makes two minor corrections to fix two remaining tests in the test suite that failed if run with -gnato2 forced on. There is also a minor optimization of Compile_Time_Compare which improves the results in some cases (noticed during testing, but does not have any effect on the test suite). The first problem was in the handling of folding of /= in some cases, which showed up as a bogus complaint about a length check failing. The following should compile cleanly with -gnato2 with no messages. 1. procedure Compov2 is 2. begin 3.for J in 1 .. 1 loop 4. declare 5. Dest : constant Wide_String (J .. J + 6) := "is_copy"; 6. begin 7. null; 8. end; 9.end loop; 10. end; The other case was a mistake in the handling of case expressions, which shows up as an unexpected constraint error. No simple test case is available for this problem. Tested on x86_64-pc-linux-gnu, committed on trunk 2012-10-04 Robert Dewar de...@adacore.com * checks.adb (Minimize_Eliminate_Overflow_Checks): Don't reanalyze if/case expression if nothing has changed (just reexpand). Stops case expression from generating incorrect temporary. * exp_ch4.adb (Expand_Compare_Minimize_Eliminate_Overflow): Fix cut and paste typo for range analysis in NE (not equal) case. * sem_eval.adb (Compile_Time_Compare): Small optimization to catch some more cases. * types.ads (Suppressed_Or_Checked): New subtype of Overflow_Check_Type. Index: types.ads === --- types.ads (revision 192066) +++ types.ads (working copy) @@ -737,7 +737,9 @@ subtype Minimized_Or_Eliminated is Overflow_Check_Type range Minimized .. Eliminated; - -- Definte subtypes so that clients don't need to know ordering. Note that + subtype Suppressed_Or_Checked is + Overflow_Check_Type range Suppressed .. Checked; + -- Define subtypes so that clients don't need to know ordering. Note that -- Overflow_Check_Type is not marked as an ordered enumeration type.
-- The following structure captures the state of check suppression or Index: checks.adb === --- checks.adb (revision 192066) +++ checks.adb (working copy) @@ -34,6 +34,7 @@ with Exp_Tss; use Exp_Tss; with Exp_Util; use Exp_Util; with Elists; use Elists; +with Expander; use Expander; with Eval_Fat; use Eval_Fat; with Freeze; use Freeze; with Lib; use Lib; @@ -1272,8 +1273,7 @@ Apply_Range_Check (N, Typ); end if; - elsif (Is_Record_Type (Typ) - or else Is_Private_Type (Typ)) + elsif (Is_Record_Type (Typ) or else Is_Private_Type (Typ)) and then Has_Discriminants (Base_Type (Typ)) and then Is_Constrained (Typ) then @@ -6709,10 +6709,12 @@ -- to be done in bignum mode), and the determined ranges of the operands. -- After possible rewriting of a constituent subexpression node, a call is - -- made to reanalyze the node after setting Analyzed to False. To avoid a - -- recursive call into the whole overflow apparatus, an important rule for - -- this reanalysis call is that either Do_Overflow_Check must be False, or - -- if it is set, then the overflow checking mode must be temporarily set + -- made to either reexpand the node (if nothing has changed) or reanalyze + -- the node (if it has been modified by the overflow check processing). + -- The Analyzed flag is set False before the reexpand/reanalyze. To avoid + -- a recursive call into the whole overflow apparatus, an important rule + -- for this call is that either Do_Overflow_Check must be False, or if + -- it is set, then the overflow checking mode must be temporarily set -- to Checked/Suppressed. Either step will avoid the unwanted recursion. procedure Minimize_Eliminate_Overflow_Checks @@ -6761,6 +6763,17 @@ -- range, then we must convert such operands back to the result type. -- This switch is properly set only when Bignum_Operands is False. + procedure Reexpand (C : Suppressed_Or_Checked); + -- This is called when we have not modified the node, so we do not need + -- to reanalyze it.
But we do want to reexpand it in either CHECKED + -- or SUPPRESSED mode (as indicated by the argument C) to get proper + -- expansion. It is important that we reset the mode to SUPPRESSED or + -- CHECKED, since if we leave it in MINIMIZED or ELIMINATED mode we + -- would reenter this routine recursively, which would not be good! + -- Note that this is not just an optimization: testing has shown up + -- several complex cases in which reanalyzing an already analyzed node + -- causes incorrect behavior. + function In_Result_Range return Boolean; -- Returns True
[Ada] Remaining fixes for -gnato3 (eliminated mode overflow checks)
This patch corrects a couple of errors in the handling of ELIMINATED mode overflow checking. With this patch, the entire test suite passes with -gnato3 mode forced on (there are some differences in output, but all are expected). The following three tests now work correctly. The following compiles quietly with -gnato3 and outputs TRUE 1. with Text_IO; use Text_IO; 2. procedure InGNATo3 is 3.function K (X, Z : Integer) return Boolean is 4.begin 5. return X in 1 .. Z ** 10; 6.end; 7. begin 8.Put_Line (K (1, Integer'Last)'Img); 9. end; The following test compiles with the messages shown, regardless of -gnato mode, and in particular this is now the -gnato3 output. 1. procedure whynonso3 (a : integer) is 2.x : constant := 1 + a ** 10; 1 2 non-static expression used in number declaration a is not static constant or named number (RM 4.9(5)) 3. begin 4.null; 5. end; The following test compiles quietly in -gnato3 mode 1. procedure BadAleno3 is 2.type Arr is array (Positive range <>) of Integer; 3.N : Integer := 0; 4. 5.type ARR_DEF (D3 : INTEGER) is record 6. C1 : Arr (N .. D3); 7.end record; 8. 9.A : Arr := (1, 2); 10.X : constant ARR_DEF := (1, A ); 11. begin 12.null; 13. end BadAleno3; Tested on x86_64-pc-linux-gnu, committed on trunk 2012-10-04 Robert Dewar de...@adacore.com * exp_ch4.adb (Expand_Compare_Minimize_Eliminate_Overflow): Deal with case where we get a bignum operand and cannot do a range analysis. * sem_eval.adb (Why_Not_Static): Deal with bignum operands Index: exp_ch4.adb === --- exp_ch4.adb (revision 192070) +++ exp_ch4.adb (working copy) @@ -2325,9 +2325,12 @@ Minimize_Eliminate_Overflow_Checks (Right_Opnd (N), Rlo, Rhi, Top_Level => False); - -- See if the range information decides the result of the comparison + -- See if the range information decides the result of the comparison. + -- We can only do this if we in fact have full range information (which + -- won't be the case if either operand is bignum at this stage).
- case N_Op_Compare (Nkind (N)) is + if Llo /= No_Uint and then Rlo /= No_Uint then + case N_Op_Compare (Nkind (N)) is when N_Op_Eq => if Llo = Lhi and then Rlo = Rhi and then Llo = Rlo then Set_True; @@ -2369,12 +2372,13 @@ elsif Llo > Rhi or else Lhi < Rlo then Set_True; end if; - end case; + end case; - -- All done if we did the rewrite + -- All done if we did the rewrite - if Nkind (N) not in N_Op_Compare then - return; + if Nkind (N) not in N_Op_Compare then +return; + end if; end if; -- Otherwise, time to do the comparison Index: sem_eval.adb === --- sem_eval.adb(revision 192070) +++ sem_eval.adb(working copy) @@ -37,6 +37,7 @@ with Nmake;use Nmake; with Nlists; use Nlists; with Opt; use Opt; +with Rtsfind; use Rtsfind; with Sem; use Sem; with Sem_Aux; use Sem_Aux; with Sem_Cat; use Sem_Cat; @@ -5419,10 +5420,12 @@ return; end if; - -- Type must be scalar or string type + -- Type must be scalar or string type (but allow Bignum, since this + -- is really a scalar type from our point of view in this diagnosis). if not Is_Scalar_Type (Typ) and then not Is_String_Type (Typ) + and then not Is_RTE (Typ, RE_Bignum) then Error_Msg_N (static expression must have scalar or string type @@ -5539,8 +5542,15 @@ when N_Function_Call => Why_Not_Static_List (Parameter_Associations (N)); -Error_Msg_N ("non-static function call (RM 4.9(6,18))!", N); +-- Complain about non-static function call unless we have Bignum +-- which means that the underlying expression is really some +-- scalar arithmetic operation. + +if not Is_RTE (Typ, RE_Bignum) then + Error_Msg_N ("non-static function call (RM 4.9(6,18))!", N); +end if; + when N_Parameter_Association => Why_Not_Static (Explicit_Actual_Parameter (N));
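For readers unfamiliar with this style of range-based folding: the hunks above decide a comparison at compile time from the operand ranges, and bail out when either bound is missing (the bignum case). A minimal Python sketch of that decision logic, with hypothetical names and covering only two of the six operators:

```python
def fold_compare(op, llo, lhi, rlo, rhi):
    """Decide a comparison from operand ranges [llo, lhi] and [rlo, rhi].

    Returns True/False when the ranges decide the result, or None when
    they overlap and the comparison must be done at run time.  Any bound
    may be None ("bignum operand, no range info"), in which case we give
    up, as the patched Expand_Compare_Minimize_Eliminate_Overflow does.
    """
    if None in (llo, lhi, rlo, rhi):
        return None  # no full range information: keep the comparison
    if op == "=":
        if llo == lhi == rlo == rhi:
            return True                # both operands are the same constant
        if lhi < rlo or rhi < llo:
            return False               # disjoint ranges can never be equal
    elif op == "<":
        if lhi < rlo:
            return True                # left is always below right
        if llo >= rhi:
            return False               # left is never below right
    return None
```

The real code rewrites the tree node via Set_True/Set_False; the None result here corresponds to falling through to the run-time comparison.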
Re: [SH] PR 33135 - Remove mieee option in libgcc
Oleg Endo oleg.e...@t-online.de wrote: Since the -mieee behavior has been fixed and is enabled by default on SH, the additional flags in libgcc can be removed. OK? OK. Regards, kaz
[Ada] Legality of aspects specified on a full view
In Ada 2012, certain aspects, such as Type_Invariant, can be specified on a partial view of a type, or on the full view, but not on both. This patch rejects such duplications cleanly. The command: gcc -c -gnat12 -gnata r.ads must yield: r.ads:5:32: aspect already specified in private declaration --- package R is type T is private with Type_Invariant => Non_Null (T); function Non_Null (X : T) return Boolean; private type T is new Integer with Type_Invariant => T /= 0; function Non_Null (X : T) return Boolean is (X /= 0); end R; Tested on x86_64-pc-linux-gnu, committed on trunk 2012-10-04 Ed Schonberg schonb...@adacore.com * sem_ch3.adb (Check_Duplicate_Aspects): Diagnose properly aspects that appear in the partial and the full view of a type. Index: sem_ch3.adb === --- sem_ch3.adb (revision 192066) +++ sem_ch3.adb (working copy) @@ -23,6 +23,7 @@ -- -- -- +with Aspects; use Aspects; with Atree;use Atree; with Checks; use Checks; with Debug;use Debug; @@ -14805,6 +14806,11 @@ New_Id : Entity_Id; Prev_Par : Node_Id; + procedure Check_Duplicate_Aspects; + -- Check that aspects specified in a completion have not been specified + -- already in the partial view. Type_Invariant and others can be + -- specified on either view but never on both. + procedure Tag_Mismatch; -- Diagnose a tagged partial view whose full view is untagged. -- We post the message on the full view, with a reference to @@ -14813,6 +14819,38 @@ -- so we determine the position of the error message from the -- respective slocs of both. 
+ - + -- Check_Duplicate_Aspects -- + - + procedure Check_Duplicate_Aspects is + Prev_Aspects : constant List_Id := Aspect_Specifications (Prev_Par); + Full_Aspects : constant List_Id := Aspect_Specifications (N); + F_Spec, P_Spec : Node_Id; + + begin + if Present (Prev_Aspects) and then Present (Full_Aspects) then +F_Spec := First (Full_Aspects); +while Present (F_Spec) loop + P_Spec := First (Prev_Aspects); + while Present (P_Spec) loop + if +Chars (Identifier (P_Spec)) = Chars (Identifier (F_Spec)) + then + Error_Msg_N + (aspect already specified in private declaration, + F_Spec); + Remove (F_Spec); + return; + end if; + + Next (P_Spec); + end loop; + + Next (F_Spec); +end loop; + end if; + end Check_Duplicate_Aspects; + -- -- Tag_Mismatch -- -- @@ -15022,6 +15060,10 @@ (declaration of full view must appear in private part, N); end if; +if Ada_Version = Ada_2012 then + Check_Duplicate_Aspects; +end if; + Copy_And_Swap (Prev, Id); Set_Has_Private_Declaration (Prev); Set_Has_Private_Declaration (Id);
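The nested loop in Check_Duplicate_Aspects is easy to model. Here is a rough Python equivalent, with aspect lists reduced to lists of name strings (an oversimplification: the real code compares the Chars of the aspect identifiers, posts the error on the full-view node, and stops after the first duplicate, whereas this sketch drops every duplicate it finds):

```python
def check_duplicate_aspects(prev_aspects, full_aspects):
    """Reject aspects that appear on both the partial and the full view.

    prev_aspects / full_aspects stand in for the aspect-specification
    lists of the partial view and of the completion.  Returns the
    surviving full-view aspects plus the diagnostics.
    """
    kept, errors = [], []
    for f_spec in full_aspects:
        # Inner loop over the partial view's aspects, as in the patch
        if any(p_spec == f_spec for p_spec in prev_aspects):
            errors.append(
                f"{f_spec}: aspect already specified in private declaration")
        else:
            kept.append(f_spec)
    return kept, errors
```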
[Ada]: Remove __gl_zero_cost_exceptions in the binder generated file
This variable wasn't used anymore. No functional change. Tested on x86_64-pc-linux-gnu, committed on trunk 2012-10-04 Tristan Gingold ging...@adacore.com * init.c (__gl_zero_cost_exceptions): Comment it as not used anymore. * bindgen.adb (Gen_Adainit): Do not emit Zero_Cost_Exceptions anymore. Index: bindgen.adb === --- bindgen.adb (revision 192066) +++ bindgen.adb (working copy) @@ -137,7 +137,6 @@ -- Num_Interrupt_States : Integer; -- Unreserve_All_Interrupts : Integer; -- Exception_Tracebacks : Integer; - -- Zero_Cost_Exceptions : Integer; -- Detect_Blocking : Integer; -- Default_Stack_Size: Integer; -- Leap_Seconds_Support : Integer; @@ -216,9 +215,6 @@ -- tracebacks are provided by default, so a value of zero for this -- parameter does not necessarily mean no trace backs are available. - -- Zero_Cost_Exceptions is set to one if zero cost exceptions are used for - -- this partition, and to zero if longjmp/setjmp exceptions are used. - -- Detect_Blocking indicates whether pragma Detect_Blocking is active or -- not. A value of zero indicates that the pragma is not present, while a -- value of 1 signals its presence in the partition. 
@@ -607,9 +603,6 @@ __gl_exception_tracebacks);); end if; - WBI ( Zero_Cost_Exceptions : Integer;); - WBI ( pragma Import (C, Zero_Cost_Exceptions, - __gl_zero_cost_exceptions);); WBI ( Detect_Blocking : Integer;); WBI ( pragma Import (C, Detect_Blocking, __gl_detect_blocking);); @@ -803,17 +796,6 @@ WBI ( Exception_Tracebacks := 1;); end if; - Set_String ( Zero_Cost_Exceptions := ); - - if Zero_Cost_Exceptions_Specified then -Set_String (1); - else -Set_String (0); - end if; - - Set_String (;); - Write_Statement_Buffer; - Set_String ( Detect_Blocking := ); if Detect_Blocking then Index: init.c === --- init.c (revision 192066) +++ init.c (working copy) @@ -103,12 +103,14 @@ int __gl_num_interrupt_states = 0; int __gl_unreserve_all_interrupts = 0; int __gl_exception_tracebacks = 0; -int __gl_zero_cost_exceptions = 0; int __gl_detect_blocking = 0; int __gl_default_stack_size= -1; int __gl_leap_seconds_support = 0; int __gl_canonical_streams = 0; +/* This value is not used anymore, but kept for bootstrapping purpose. */ +int __gl_zero_cost_exceptions = 0; + /* Indication of whether synchronous signal handler has already been installed by a previous call to adainit. */ int __gnat_handler_installed = 0;
[Ada] New preprocessor switch -a
This change introduces a new switch -a (all source text preserved) for gnatprep and the integrated preprocessor, causing all source text to be preserved (i.e. the if, all elsif and the else branch of a #if construct all show up in the output). This is useful to perform simple style checks on all branches. The following command must produce the shown output: $ gnatprep -a -c all_source_.adb.in all_source_text.adb $ cat all_source_text.adb procedure All_Source_Text is beGin Some_Code; --! #if Some_Condition then Do_Something; --! #else Do_Something_else; --! #end if; Some_More_Code; end All_Source_Text; Input: procedure All_Source_Text is beGin Some_Code; #if Some_Condition then Do_Something; #else Do_Something_else; #end if; Some_More_Code; end All_Source_Text; Tested on x86_64-pc-linux-gnu, committed on trunk 2012-10-04 Thomas Quinot qui...@adacore.com * prep.adb, prepcomp.adb, gprep.adb, opt.ads: New preprocessor switch -a (all source text preserved). Index: prep.adb === --- prep.adb(revision 192066) +++ prep.adb(working copy) @@ -6,7 +6,7 @@ -- -- -- B o d y -- -- -- --- Copyright (C) 2002-2010, Free Software Foundation, Inc. -- +-- Copyright (C) 2002-2012, Free Software Foundation, Inc. -- -- -- -- GNAT is free software; you can redistribute it and/or modify it under -- -- terms of the GNU General Public License as published by the Free Soft- -- @@ -292,8 +292,8 @@ Result.Value := End_String; end if; - -- Now, check the syntax of the symbol (we don't allow accented and - -- wide characters) + -- Now, check the syntax of the symbol (we don't allow accented or + -- wide characters). if Name_Buffer (1) not in 'a' .. 'z' and then Name_Buffer (1) not in 'A' .. 
'Z' @@ -356,7 +356,7 @@ begin -- Always return False when not inside an #if statement - if Pp_States.Last = Ground then + if Opt.No_Deletion or else Pp_States.Last = Ground then return False; else return Pp_States.Table (Pp_States.Last).Deleting; Index: prepcomp.adb === --- prepcomp.adb(revision 192066) +++ prepcomp.adb(working copy) @@ -6,7 +6,7 @@ -- -- -- B o d y -- -- -- --- Copyright (C) 2003-2010, Free Software Foundation, Inc. -- +-- Copyright (C) 2003-2012, Free Software Foundation, Inc. -- -- -- -- GNAT is free software; you can redistribute it and/or modify it under -- -- terms of the GNU General Public License as published by the Free Soft- -- @@ -60,6 +60,7 @@ Undef_False : Boolean:= False; Always_Blank : Boolean:= False; Comments : Boolean:= False; + No_Deletion : Boolean:= False; List_Symbols : Boolean:= False; Processed: Boolean:= False; end record; @@ -73,6 +74,7 @@ Undef_False = False, Always_Blank = False, Comments = False, + No_Deletion = False, List_Symbols = False, Processed= False); @@ -330,6 +332,16 @@ -- significant. case Sinput.Source (Token_Ptr) is + when 'a' = + + -- All source text preserved (also implies -u) + + if Name_Len = 1 then +Current_Data.No_Deletion := True; +Current_Data.Undef_False := True; +OK := True; + end if; + when 'u' = -- Undefined symbol are False @@ -581,15 +593,15 @@ -- Set the preprocessing flags according to the preprocessing data - if Current_Data.Comments and then not Current_Data.Always_Blank then + if Current_Data.Comments and not Current_Data.Always_Blank then Comment_Deleted_Lines := True; Blank_Deleted_Lines := False; - else Comment_Deleted_Lines := False; Blank_Deleted_Lines := True; end if; + No_Deletion := Current_Data.No_Deletion; Undefined_Symbols_Are_False := Current_Data.Undef_False; List_Preprocessing_Symbols := Current_Data.List_Symbols;
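To make the effect of the new flag concrete, here is a toy one-level #if processor in Python (hypothetical and far simpler than prep.adb: no boolean expressions, no #elsif, and it marks directive lines with the "--!" prefix the way gnatprep marks preprocessor lines). With preserve_all (the -a behavior) inactive branches survive in the output; without it they are deleted:

```python
def preprocess(lines, symbols, preserve_all=False):
    """Toy model of gnatprep's #if handling.

    symbols maps symbol names to booleans.  With preserve_all (the new
    -a switch) every source line is kept: directives are commented out
    with "--!" and inactive-branch code is emitted as-is, so simple
    style checks can run on all branches.
    """
    out = []
    active = True                      # inside a live branch?
    for line in lines:
        s = line.strip()
        if s.startswith("#if"):
            cond = s.split()[1]
            active = bool(symbols.get(cond))
            out.append("--! " + line if preserve_all else None)
        elif s.startswith("#else"):
            active = not active
            out.append("--! " + line if preserve_all else None)
        elif s.startswith("#end"):
            active = True
            out.append("--! " + line if preserve_all else None)
        elif active or preserve_all:
            out.append(line)
    return [l for l in out if l is not None]
```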
[Ada] Warn on Ada 2012 set membership test duplicate element
This patch adds a warning if a duplicate literal entry is found in an Ada 2012 set membership, as shown by this example: 1. pragma Ada_2012; 2. package Dupset is 3.a : integer; 4.b : character; 5.c : boolean := a in 1 | 6.2 | 7.3 | 8.1 | | warning: duplicate of value given at line 5 9.5; 10.d : boolean := b in 'a' | 11.'b' | 12.'c' | 13.'b'; | warning: duplicate of value given at line 11 14. 15.type Day is (Mon, Tue, Wed, Thu, Fri); 16.x : Day; 17.e : boolean := x in Mon | Tue | 18.Wed | Mon; | warning: duplicate of value given at line 17 19. end; Tested on x86_64-pc-linux-gnu, committed on trunk 2012-10-04 Robert Dewar de...@adacore.com * sem_res.adb (Resolve_Set_Membership): Warn on duplicates. Index: sem_res.adb === --- sem_res.adb (revision 192066) +++ sem_res.adb (working copy) @@ -7685,10 +7685,11 @@ procedure Resolve_Set_Membership is - Alt : Node_Id; + Alt : Node_Id; + Ltyp : constant Entity_Id := Etype (L); begin - Resolve (L, Etype (L)); + Resolve (L, Ltyp); Alt := First (Alternatives (N)); while Present (Alt) loop @@ -7699,11 +7700,51 @@ if not Is_Entity_Name (Alt) or else not Is_Type (Entity (Alt)) then - Resolve (Alt, Etype (L)); + Resolve (Alt, Ltyp); end if; Next (Alt); end loop; + + -- Check for duplicates for discrete case + + if Is_Discrete_Type (Ltyp) then +declare + type Ent is record + Alt : Node_Id; + Val : Uint; + end record; + + Alts : array (0 .. List_Length (Alternatives (N))) of Ent; + Nalts : Nat; + +begin + -- Loop checking duplicates. This is quadratic, but giant sets + -- are unlikely in this context so it's a reasonable choice. + + Nalts := 0; + Alt := First (Alternatives (N)); + while Present (Alt) loop + if Is_Static_Expression (Alt) +and then (Nkind_In (Alt, N_Integer_Literal, + N_Character_Literal) + or else Nkind (Alt) in N_Has_Entity) + then + Nalts := Nalts + 1; + Alts (Nalts) := (Alt, Expr_Value (Alt)); + + for J in 1 .. 
Nalts - 1 loop +if Alts (J).Val = Alts (Nalts).Val then + Error_Msg_Sloc := Sloc (Alts (J).Alt); + Error_Msg_N ("duplicate of value given#?", Alt); +end if; + end loop; + end if; + + Alt := Next (Alt); + end loop; +end; + end if; end Resolve_Set_Membership; -- Start of processing for Resolve_Membership_Op
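The quadratic scan above, kept deliberately simple because membership sets are small, can be sketched as follows (hypothetical stand-in types; GNAT stores Uint values and node slocs in the Alts array):

```python
def find_duplicates(alternatives):
    """Quadratic duplicate scan, as in Resolve_Set_Membership.

    alternatives is a list of (value, line) pairs standing in for the
    static alternatives of a membership test.  Each warning points back
    at the first occurrence, mirroring the Error_Msg_Sloc reference.
    """
    seen = []          # (value, line) of alternatives examined so far
    warnings = []
    for value, line in alternatives:
        for prev_value, prev_line in seen:
            if prev_value == value:
                warnings.append(
                    f"line {line}: duplicate of value given at line {prev_line}")
                break
        seen.append((value, line))
    return warnings
```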
[Ada] Fix value of GNAT.Command_Line.Full_Switch on invalid switch
This patch fixes the value returned by Full_Switch when the user provided an invalid long switch (instead of returning --, Full_Switch will now return --long), as in this example: with GNAT.Command_Line; use GNAT.Command_Line; with Ada.Text_IO; use Ada.Text_IO; procedure Main is begin while True loop case Getopt ("o: -long: s") is when 'o' | 's' => null; when '-' => null; when others => exit; end case; end loop; exception when GNAT.Command_Line.Invalid_Switch => Put_Line ("invalid switch: " & Full_Switch); end Main; Calling ./main --lond will now indicate that --lond is invalid, not --. Tested on x86_64-pc-linux-gnu, committed on trunk 2012-10-04 Emmanuel Briot br...@adacore.com * g-comlin.adb (Getopt): Fix value of Full_Switch returned in case of invalid switch. Index: g-comlin.adb === --- g-comlin.adb(revision 192066) +++ g-comlin.adb(working copy) @@ -39,6 +39,10 @@ package body GNAT.Command_Line is + -- General note: this entire body could use much more commenting. There + -- are large sections of uncommented code throughout, and many formal + -- parameters of local subprograms are not documented at all ??? + package CL renames Ada.Command_Line; type Switch_Parameter_Type is @@ -56,6 +60,12 @@ Extra: Character := ASCII.NUL); pragma Inline (Set_Parameter); -- Set the parameter that will be returned by Parameter below + -- + -- Extra is a character that needs to be added when reporting Full_Switch. + -- (it will in general be the switch character, for instance '-'). + -- Otherwise, Full_Switch will report 'f' instead of '-f'. In particular, + -- it needs to be set when reporting an invalid switch or handling '*'. + -- -- Parameters need to be defined ??? function Goto_Next_Argument_In_Section (Parser : Opt_Parser) return Boolean; @@ -95,9 +105,9 @@ Index_In_Switches : out Integer; Switch_Length : out Integer; Param : out Switch_Parameter_Type); - -- Return the Longest switch from Switches that at least partially - -- partially Arg. 
Index_In_Switches is set to 0 if none matches. - -- What are other parameters??? in particular Param is not always set??? + -- Return the Longest switch from Switches that at least partially matches + -- Arg. Index_In_Switches is set to 0 if none matches. What are other + -- parameters??? in particular Param is not always set??? procedure Unchecked_Free is new Ada.Unchecked_Deallocation (Argument_List, Argument_List_Access); @@ -663,17 +673,45 @@ if Index_Switches = 0 then --- Depending on the value of Concatenate, the full switch is --- a single character or the rest of the argument. +-- Find the current switch that we did not recognize. This is in +-- fact difficult because Getopt does not know explicitly about +-- short and long switches. Ideally, we would want the following +-- behavior: -End_Index := - (if Concatenate then Parser.Current_Index else Arg'Last); +-- * for short switches, with Concatenate: +--if -a is not recognized, and the command line has -daf +--we should report the invalid switch as -a. +-- * for short switches, wihtout Concatenate: +--we should report the invalid switch as -daf. + +-- * for long switches: +--if the commadn line is --long we should report --long +--as unrecongized. + +-- Unfortunately, the fact that long switches start with a +-- duplicate switch character is just a convention (so we could +-- have a long switch -long for instance). We'll still rely on +-- this convention here to try and get as helpful an error message +-- as possible. + +-- Long switch case (starting with double switch character) + +if Arg (Arg'First + 1) = Parser.Switch_Character then + End_Index := Arg'Last; + +-- Short switch case + +else + End_Index := + (if Concatenate then Parser.Current_Index else Arg'Last); +end if; + if Switches (Switches'First) = '*' then - -- Always prepend the switch character, so that users know that - -- this comes from a switch on the command line. 
This is - -- especially important when Concatenate is False, since + -- Always prepend the switch character, so that users know + -- that this comes from a switch on the command line. This + -- is especially
Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])
On Wed, Oct 3, 2012 at 5:35 PM, Steven Bosscher stevenb@gmail.com wrote: The worst result is this: Compressing live ranges: from 726174 to 64496 - 8%, pre_count 40476128, post_count 12483414 But that's still a lot better than before the patch for the same function: Compressing live ranges: from 1742569 to 73069 - 4%, pre_count 40842330, post_count 12479992 Walking basic blocks with FOR_EACH_BB_REVERSE gives: Only FOR_EACH_BB_REVERSE: Compressing live ranges: from 1742579 to 429746 - 24% pre_count 41106212, post_count 34376494 Compressing live ranges: from 1742569 to 63000 - 3% pre_count 40835340, post_count 11055747 FOR_EACH_BB_REVERSE + need_curr_point_incr: Compressing live ranges: from 726184 to 416529 - 57% pre_count 40743516, post_count 34376846 Compressing live ranges: from 726174 to 61840 - 8% pre_count 40472806, post_count 11055747 The combination of the two changes takes ~20s off the ~180s for LRA create live ranges. Ciao! Steven
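For context, the compression whose statistics are quoted here works roughly as follows (a deliberately simplified Python model: the real remove_some_program_points_and_update_live_ranges also merges adjacent points whose sets of live ranges coincide, which is where the pre_count/post_count reduction comes from):

```python
def compress_program_points(ranges):
    """Sketch of LRA's program point compression.

    ranges is a list of (start, finish) program-point pairs.  Only
    points where some live range starts or finishes can matter for
    interference, so map the used points onto a dense 0..n-1 numbering
    and rewrite the ranges accordingly.
    """
    used = sorted({p for r in ranges for p in r})       # points that matter
    renumber = {old: new for new, old in enumerate(used)}
    return [(renumber[s], renumber[f]) for s, f in ranges], len(used)
```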
Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars
Thanks for the suggestions. The attached patch changes all .-something symbol names, which I found. Build and regtested on x86-64-gnu-linux. OK for the trunk and 4.7? (.saved_dovar also occurs in 4.6; we could also backport that part to 4.6, but I am not sure whether it is needed.) We probably should also bump the .mod version (and timely also commit the patch http://gcc.gnu.org/ml/fortran/2012-04/msg00033.html). Comments? Tobias Am 04.10.2012 01:07, schrieb David Edelsohn: For C and C++, identifiers beginning with underscore and upper case letter or with two underscores are reserve to the implementation. C++ uses _Z for mangling. Maybe Fortran could prepend _F. Something beginning with an underscore seems like a much better choice, given the rules about reserved identifiers. Thanks, David On Wed, Oct 3, 2012 at 5:00 PM, Tobias Burnus bur...@net-b.de wrote: David, David Edelsohn wrote: I am not sure why you chose a period and how best to correct this. Well, in principle any name which the user cannot enter would do. (Not enter: At least not as valid Fortran identifier.) The reason for choosing . is that dotvar_name is used elsewhere in gfortran for such identifier for the string-length variable belonging to var_name, e.g. ._result in trans-decl.c. I assume the reason that it didn't pop up with those is that those are local variables, but I wouldn't be surprised if it would break elsewhere. I wonder whether @ would work, otherwise, one could also use _. The only other problem is that it will break the ABI. On the other hand, it's a rather new feature and if we bump the .mod version number, the chance that one effectively forces the user to re-compile is rather high. So far we always bumped the .mod version number as something changed. There are also some other patches pending which effectively lead to a bump in the .mod version. 
(The .mod version won't affect code which doesn't use modules such as BLAS/LAPACK or any Fortran 66/77 code, but those won't be affected by the ABI change anyway as there the name doesn't propagate as it does with modules..) Thanks for investigating the test-suite failure. Tobias 2012-10-04 Tobias Burnus bur...@net-b.de * trans-decl.c (gfc_create_string_length, create_function_arglist): Don't create a symbol which contains a dot. * trans-stmt.c (gfc_trans_simple_do, gfc_trans_do): Ditto. diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c index 910b150..f41fc8b 100644 --- a/gcc/fortran/trans-decl.c +++ b/gcc/fortran/trans-decl.c @@ -1097,9 +1097,9 @@ gfc_create_string_length (gfc_symbol * sym) /* Also prefix the mangled name. */ if (sym-module) - name = gfc_get_string (.__%s_MOD_%s, sym-module, sym-name); + name = gfc_get_string (_F_%s_MOD_%s, sym-module, sym-name); else - name = gfc_get_string (.%s, sym-name); + name = gfc_get_string (_F%s, sym-name); length = build_decl (input_location, VAR_DECL, get_identifier (name), @@ -1984,7 +1984,7 @@ create_function_arglist (gfc_symbol * sym) length = build_decl (input_location, PARM_DECL, - get_identifier (.__result), + get_identifier (_Flen__result), len_type); if (!sym-ts.u.cl-length) { @@ -2007,7 +2007,7 @@ create_function_arglist (gfc_symbol * sym) { tree len = build_decl (input_location, VAR_DECL, - get_identifier (..__result), + get_identifier (_Flen2__result), gfc_charlen_type_node); DECL_ARTIFICIAL (len) = 1; TREE_USED (len) = 1; diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c index 204f069..37fc6ee 100644 --- a/gcc/fortran/trans-stmt.c +++ b/gcc/fortran/trans-stmt.c @@ -1376,7 +1376,7 @@ gfc_trans_simple_do (gfc_code * code, stmtblock_t *pblock, tree dovar, /* Save value for do-tinkering checking. 
*/ if (gfc_option.rtcheck GFC_RTCHECK_DO) { - saved_dovar = gfc_create_var (type, .saved_dovar); + saved_dovar = gfc_create_var (type, _F_saved_dovar); gfc_add_modify_loc (loc, pblock, saved_dovar, dovar); } @@ -1581,7 +1581,7 @@ gfc_trans_do (gfc_code * code, tree exit_cond) /* Save value for do-tinkering checking. */ if (gfc_option.rtcheck GFC_RTCHECK_DO) { - saved_dovar = gfc_create_var (type, .saved_dovar); + saved_dovar = gfc_create_var (type, _F_saved_dovar); gfc_add_modify_loc (loc, block, saved_dovar, dovar); }
[RFA 1/5] New port: CR16: Remove gdb from noconfigdirs in configure.ac
Hi, This patch is one of a patch set to add a new port (National Instruments CR16) in gdb. This patch will: - Remove gdb from noconfigdirs in top-level configure.ac. - Add target-libgloss. - Make target OS independent in config.sub. OK for gcc and binutils? Regards, Kaushik 2012-10-04 Kaushik Phatak kaushik.pha...@kpitcummins.com Changelog * configure.ac (noconfigdirs): Remove gdb. Add target-libgloss. * configure: Regenerate. * config.sub: Make target OS independent --- ./gdb_src.orig/configure2012-06-28 17:20:52.0 +0530 +++ ./gdb_src/configure 2012-08-17 16:56:10.0 +0530 @@ -3447,7 +3447,7 @@ case ${target} in noconfigdirs="$noconfigdirs target-libgloss gdb" ;; cr16-*-*) -noconfigdirs="$noconfigdirs gdb" +noconfigdirs="$noconfigdirs target-libgloss" ;; d10v-*-*) noconfigdirs="$noconfigdirs target-libgloss" --- ./gdb_src.orig/configure.ac 2012-08-17 23:07:18.0 +0530 +++ ./gdb_src/configure.ac 2012-08-17 16:56:17.0 +0530 @@ -873,7 +873,7 @@ case ${target} in noconfigdirs="$noconfigdirs target-libgloss gdb" ;; cr16-*-*) -noconfigdirs="$noconfigdirs gdb" +noconfigdirs="$noconfigdirs target-libgloss" ;; d10v-*-*) noconfigdirs="$noconfigdirs target-libgloss" --- ./gdb_src.orig/config.sub 2012-04-25 21:23:25.0 +0530 +++ ./gdb_src/config.sub2012-09-24 11:26:00.0 +0530 @@ -575,7 +575,6 @@ case $basic_machine in ;; cr16 | cr16-*) basic_machine=cr16-unknown - os=-elf ;; crds | unos) basic_machine=m68k-crds
Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars
2012/10/4 Tobias Burnus bur...@net-b.de: Thanks for the suggestions. The attached patch changes all .-something symbol names, which I found. Build and regtested on x86-64-gnu-linux. OK for the trunk and 4.7? (.saved_dovar also occurs in 4.6; we could also backport that part to 4.6, but I am not sure whether it is needed.) I think at least for trunk it should be ok. We probably should also bump the .mod version (and timely also commit the patch http://gcc.gnu.org/ml/fortran/2012-04/msg00033.html). Strong agreement here :) Comments? When backporting to 4.6 and 4.7, do you intend to also bump the module version there? Does that make sense? Moreover, .__result probably goes back even further than 4.6, right? Cheers, Janus Am 04.10.2012 01:07, schrieb David Edelsohn: For C and C++, identifiers beginning with underscore and upper case letter or with two underscores are reserve to the implementation. C++ uses _Z for mangling. Maybe Fortran could prepend _F. Something beginning with an underscore seems like a much better choice, given the rules about reserved identifiers. Thanks, David On Wed, Oct 3, 2012 at 5:00 PM, Tobias Burnus bur...@net-b.de wrote: David, David Edelsohn wrote: I am not sure why you chose a period and how best to correct this. Well, in principle any name which the user cannot enter would do. (Not enter: At least not as valid Fortran identifier.) The reason for choosing . is that dotvar_name is used elsewhere in gfortran for such identifier for the string-length variable belonging to var_name, e.g. ._result in trans-decl.c. I assume the reason that it didn't pop up with those is that those are local variables, but I wouldn't be surprised if it would break elsewhere. I wonder whether @ would work, otherwise, one could also use _. The only other problem is that it will break the ABI. On the other hand, it's a rather new feature and if we bump the .mod version number, the chance that one effectively forces the user to re-compile is rather high. 
So far we always bumped the .mod version number as something changed. There are also some other patches pending which effectively lead to a bump in the .mod version. (The .mod version won't affect code which doesn't use modules such as BLAS/LAPACK or any Fortran 66/77 code, but those won't be affected by the ABI change anyway as there the name doesn't propagate as it does with modules..) Thanks for investigating the test-suite failure. Tobias
Re: [PATCH] Use __cxa_atexit on OpenBSD
Date: Wed, 3 Oct 2012 17:45:21 +0200 (CEST) From: Gerald Pfeifer ger...@pfeifer.com On Sat, 15 Sep 2012, Ian Lance Taylor wrote: 2012-09-02 Mark Kettenis kette...@openbsd.org * config.gcc (*-*-openbsd4.[3-9]|*-*-openbsd[5-9]*): Set default_use_cxa_atexit to yes. This is OK. I committed this to trunk and plan on doing so for the 4.7 branch as well (so that OpenBSD can benefit from a release branch of GCC carrying this) unless there are objections. Thanks. I won't object, although I'm not sure patching up the 4.7 branch is all that useful at this point given that it doesn't support OpenBSD/amd64. Any chance of getting the other diff that Ian approved http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01070.html committed to trunk? Thanks, Mark
Re: [Patch, Fortran, OOP] PR 54784: [4.7/4.8 Regression] wrong code in polymorphic allocation with SOURCE
On 03/10/2012 18:48, Janus Weil wrote: Hi all, here is a small patch for a wrong-code regression with polymorphic allocation. The problem is that we falsely detect the allocation variable to be a polymorphic array (although it is a scalar). For further details see the PR, in particular comment 4. Regtested on x86_64-unknown-linux-gnu. Ok for trunk and 4.7? Hello, the fix looks incomplete. Index: gcc/fortran/trans-stmt.c === --- gcc/fortran/trans-stmt.c (revision 192004) +++ gcc/fortran/trans-stmt.c (working copy) @@ -5145,7 +5145,9 @@ gfc_trans_allocate (gfc_code * code) dataref = actual->next->expr->ref; /* Make sure we go up through the reference chain to the _data reference, where the arrayspec is found. */ - while (dataref->next && dataref->next->type != REF_ARRAY) + while (!(dataref->type == REF_COMPONENT + && strcmp (dataref->u.c.component->name, "_data") == 0) + && dataref->next) this stops on the first class reference, while it seems to me that it should stop on the last. Mikael
Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars
On Thu, Oct 4, 2012 at 1:50 PM, Janus Weil ja...@gcc.gnu.org wrote: Thanks for the suggestions. The attached patch changes all .-something symbol names, which I found. Build and regtested on x86-64-gnu-linux. OK for the trunk and 4.7? (.saved_dovar also occurs in 4.6; we could also backport that part to 4.6, but I am not sure whether it is needed.) I think at least for trunk it should be ok. One more comment: Since its appearance is a bit scattered in the code, how about using a small macro which prepends the _F prefix to a given variable name? For normal identifiers in a module, the current scheme of __modulename_MOD_symbolname is probably too widely entrenched e.g. in debuggers and various interoperability toolkits to be worth changing at this point. The OOP stuff OTOH is IMHO sufficiently new that there is little harm in changing it. I was thinking about this in the beginning of the year and produced the attached document (I never sent it before as I realized I wouldn't have time to do anything about it myself in the near future). Funnily, I also came up with the idea of _F at that point, though maybe not so surprising as I also studied the g++ name mangling for inspiration. Also note that the document itself has a perhaps naive approach which does not consider backwards compatibility enough (see e.g. the above paragraph). Some related mangling PR's: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51802 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606 (not the PR itself, but the discussion in the comments) -- Janne Blomqvist Gfortran name mangling ABI == It would be nice if GFortran would have a documented, consistent name mangling ABI. This would reduce the risk of an inadvertent name clash with some other software, and make it easier for 3rd party tools such as debuggers, profilers etc. to demangle symbol names. If, and when, the ABI is broken due to the array descriptor update, we could also think about fixing this issue. 
An explicit non-goal of this is to come up with some common cross-compiler name mangling ABI, as it seems very unlikely that other compiler vendors will want to change their mangling, and mangling alone is a very small part of ABI compatibility. Another non-goal is to change the mangling of the F77 interface (lowercase, append underscore(s) depending on the compiler options). Thus the following discussion refers only to mangling F90+ names, e.g. module procedures and so forth. Rules of the road - Some names are reserved identifiers, reserved for the implementation. Mangled names should be such reserved names, in order to not clash with user-defined names. Fortran specifies that names are of the form letter + alphanum. C and C++ reserved identifiers (that is, identifiers which are reserved for use by the implementation) are - A name beginning with an underscore followed by a capital letter. - A name containing double underscore (plain C reserves only names beginning with double underscore, but C++ reserves anything with double underscores). - POSIX adds additional restrictions wrt various POSIX functionality. Thus, choosing names beginning with an underscore followed by either a capital letter or another underscore should be good. Current GFortran name mangling -- Currently the name mangling is a bit ad-hoc, with several different prefixes depending on which part of the compiler is used: - Procedure foo in module bar: __bar_MOD_foo - Runtime library: _gfortran_XXX - Runtime library internal functions (not visible if symbol visibility is supported): _gfortrani_XXX - ISO_C_BINDING functions in the library: __iso_c_binding_XXX - OOP stuff: __class_XXX - Others? 
The C++ name mangling - For inspiration, see the C++ name mangling ABI that GCC follows at http://sourcery.mentor.com/public/cxx-abi/abi.html#mangling http://sourcery.mentor.com/public/cxx-abi/abi-examples.html#mangling http://sourcery.mentor.com/public/cxx-abi/abi-mangling.html The C++ name mangling ABI, in a very simplified form, is - Everything has the prefix _Z. - names are encoded as length, name pairs. - At the end of a function symbol there is a E, followed by the type of the function arguments (in order to handle overloading). - Outside of names, characters have meaning as various flags, e.g. TI means the identifier is a typeinfo structure, or TV for a virtual table, and so on. E.g. a member function foo(void) in a class Test (Test::foo()) would thus be encoded as _ZN4Test3fooEv. (The N means it's a nested name, v at the end means the void argument). Proposed GFortran mangling -- Fortran name mangling requirements are considerably simpler than C++, due to Fortran not having function overloading (yes, Fortran has generic interfaces, which are a bit different and don't require mangling), nor templates. - Every symbol has the prefix _F (F as in
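The document breaks off here, but the encoding it is heading toward — an `_F` prefix plus length-prefixed name components, by analogy with `_ZN4Test3fooEv` — would look like this (purely illustrative; gfortran's actual mangling is the `__bar_MOD_foo` scheme listed earlier, and the proposal was never adopted):

```python
def mangle(components, prefix="_F"):
    """Length-prefixed mangling in the style sketched in the document.

    Encodes each name component as <length><name> after the proposed
    "_F" prefix, e.g. module "bar" + procedure "foo".  Length-prefixing
    makes the encoding unambiguous without a separator character, which
    is the same trick the Itanium C++ ABI uses for nested names.
    """
    return prefix + "".join(f"{len(c)}{c}" for c in components)
```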
Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars
Hi Janus, Janus Weil wrote: When backporting to 4.6 and 4.7, do you intend to also bump the module version there? Does that make sense? Probably not. The .__result and the .saved_dovar are not ABI relevant, thus, they can be changed without problems - but also not that important. The module variable is more crucial, but I think we should avoid bumping the module version. I think we should only change the trunk. Moreover, .__result probably goes back even further than 4.6, right? Presumably yes. Janus Weil wrote: One more comment: Since its appearance is a bit scattered in the code, how about using a small macro which prepends the _F prefix to a given variable name? Btw, note that we are using a double underscore scheme in other places (like __class, __vtab, __vtype, etc). I have even used an '@' in one place, namely (hidden) procedure pointer results (ppr@). Is there a need to unify all those cases? It think it would be useful to unify those. Are you volunteering? Regarding the @: GCC only has: NO_DOLLAR_IN_LABEL and NO_DOT_IN_LABEL. However, I would be also careful with the @ label. The @ has also a special meaning to assemblers. However, if the name with the @ is not publicly visible, the name does not occur in the assembler file (except as string for the debug information). Thus, there should be no real problem. Still, one should consider to change also them to the _F convention. Tobias
Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])
On Thu, Oct 4, 2012 at 11:43 AM, Steven Bosscher stevenb@gmail.com wrote: On Wed, Oct 3, 2012 at 5:35 PM, Steven Bosscher stevenb@gmail.com wrote: The worst result is this: Compressing live ranges: from 726174 to 64496 - 8%, pre_count 40476128, post_count 12483414 But that's still a lot better than before the patch for the same function: Compressing live ranges: from 1742569 to 73069 - 4%, pre_count 40842330, post_count 12479992 Walking basic blocks with FOR_EACH_BB_REVERSE gives: Only FOR_EACH_BB_REVERSE: Compressing live ranges: from 1742579 to 429746 - 24% pre_count 41106212, post_count 34376494 Compressing live ranges: from 1742569 to 63000 - 3% pre_count 40835340, post_count 11055747 FOR_EACH_BB_REVERSE + need_curr_point_incr: Compressing live ranges: from 726184 to 416529 - 57% pre_count 40743516, post_count 34376846 Compressing live ranges: from 726174 to 61840 - 8% pre_count 40472806, post_count 11055747 The combination of the two changes takes ~20s off the ~180s for LRA create live ranges. Isn't _REVERSE vs. non-_RESERVE still kind-of random order? Thus, doesn't the above show there exists an optimal order for processing which we could use? (I realize _REVERSE is a simple solution, but might there not exist a pathological case where _REVERSE is even worse than non-_REVERSE?) Richard. Ciao! Steven
Re: libgo patch committed: Use libbacktrace
Ian Lance Taylor i...@google.com writes: This patch to libgo changes it to use libbacktrace. Previously backtraces required the Go package debug/elf to register itself with the runtime during the package initialization, which only worked if the program actually imported debug/elf one way or another. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Unfortunately, this breaks all use of libgo on versions of Solaris 11 which lack strnlen: Undefined first referenced symbol in file strnlen /var/gcc/regression/trunk/10-gcc/build/i386- pc-solaris2.10/libgo/.libs/libgo.so ld: fatal: symbol referencing errors. No output written to a.out collect2: error: ld returned 1 exit status FAIL: bufio One could either try to also link libiberty into libgo.la, but that has the complication of needing to decide whether to use libiberty.a or pic/libiberty.a since libiberty is no libtool library. Alternatively, one could add another implementation of strnlen to libgo, which duplicates code. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: patch to fix
On Wed, Oct 3, 2012 at 7:15 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: The enclosed patch is the third of at least four patches that fix the problems associated with supporting integers on the target that are wider than two HOST_WIDE_INTs. While GCC claims to support OI mode, and we have two public ports that make minor use of this mode, in practice, compilation that uses OImode mode commonly gives the wrong result or ices. We have a private port of GCC for an architecture that is further down the road to needing comprehensive OImode and we have discovered that this is unusable. We have decided to fix it in a general way that so that it is most beneficial to the GCC community. It is our belief that we are just a little ahead of the X86 and the NEON and these patches will shortly be essential. The first two of these patches were primarily lexigraphical and have already been committed.They transformed the uses of CONST_DOUBLE so that it is easy to tell what the intended usage is. The underlying structures in the next two patches are very general: once they are added to the compiler, the compiler will be able to support targets with any size of integer from hosts of any size integer. The patch enclosed deals with the portable RTL parts of the compiler. The next patch, which is currently under construction deals with the tree level. However, this patch can be put on the trunk as is, and it will eleviate many, but not all of the current limitations in the rtl parts of the compiler. Some of the patch is conditional, depending on a port defining the symbol 'TARGET_SUPPORTS_WIDE_INT' to be non zero. Defining this symbol to be non zero is declaring that the port has been converted to use the new form or integer constants. However, the patch is completely backwards compatible to allow ports that do not need this immediately to convert at their leasure. 
The conversion process is not difficult, but it does require some knowledge of the port, so we are not volinteering to do this for all ports. OVERVIEW OF THE PATCH: The patch defines a new datatype, a 'wide_int' (defined in wide-int.[ch], and this datatype will be used to perform all of the integer constant math in the compiler. Externally, wide-int is very similar to double-int except that it does not have the limitation that math must be done on exactly two HOST_WIDE_INTs. Internally, a wide_int is a structure that contains a fixed sized array of HOST_WIDE_INTs, a length field and a mode. The size of the That it has a mode sounds odd to me and makes it subtly different from HOST_WIDE_INT and double-int. Maybe the patch will tell why this is so. array is determined at generation time by dividing the number of bits of the largest integer supported on the target by the number of bits in a HOST_WIDE_INT of the host. Thus, with this format, any length of integer can be supported on any host. A new rtx type is created, the CONST_WIDE_INT, which contains a garbage collected array of HOST_WIDE_INTS that is large enough to hold the constant. For the targets that define TARGET_SUPPORTS_WIDE_INT to be non zero, CONST_DOUBLES are only used to hold floating point values. If the target leaves TARGET_SUPPORTS_WIDE_INT defined as 0, CONST_WIDE_INTs are not used and CONST_DOUBLEs are as they were before. CONST_INT does not change except that it is defined to hold all constants that fit in exactly one HOST_WIDE_INT. Note that is slightly different than the current trunk. Before this patch, the TImode constant '5' could either be in a CONST_INT or CONST_DOUBLE depending on which code path was used to create it. This patch changes this so that if the constant fits in a CONST_INT then it is represented in a CONST_INT no matter how it is created. 
For the array inside a CONST_WIDE_INT, and internally in wide-int, we use a compressed form for integers that need more than one HOST_WIDE_INT. Higher elements of the array are not needed if they are just a sign extension of the elements below them. This does not imply that constants are signed or are sign extended, this is only a compression technique. While it might seem to be more esthetically pleasing to have not introduced the CONST_WIDE_INT and to have changed the representation of the CONST_INT to accomodate larger numbers, this would have both used more space and would be a time consuming change for the port maintainers. We believe that most ports can be quickly converted with the current scheme because there is just not a lot of code in the back ends that cares about large constants. Furthermore, the CONST_INT is very space efficient and even in a program that was heavy in large values, most constants would still fit in a CONST_INT. All of the parts of the rtl level that deal with CONST_DOUBLE as an now conditionally work with CONST_WIDE_INTs depending on the value of TARGET_SUPPORTS_WIDE_INT. We believe that this
Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars
Hi, One more comment: Since its appearance is a bit scattered in the code, how about using a small macro which prepends the _F prefix to a given variable name? Btw, note that we are using a double underscore scheme in other places (like __class, __vtab, __vtype, etc). I have even used an '@' in one place, namely (hidden) procedure pointer results (ppr@). Is there a need to unify all those cases? It think it would be useful to unify those. Are you volunteering? yeah, why not ;) Attached is a draft patch (not regtested), which adds a macro GFC_PREFIX (in gfortran.h) to prepend _F to the cases included in Tobias' earlier patch as well as the OOP-related stuff and procedure pointer results. It also bumps the module version. Any comments so far? (Of course the name of the macro can be debated. I just tried to keep it short for now.) Cheers, Janus mangling.diff Description: Binary data
Re: [Patch] Fix PR53397
On Tue, Oct 2, 2012 at 6:40 PM, Kumar, Venkataramanan venkataramanan.ku...@amd.com wrote: Hi Richi, (Snip) + (!cst_and_fits_in_hwi (step)) +{ + if( loop-inner != NULL) +{ + if (dump_file (dump_flags TDF_DETAILS)) +{ + fprintf (dump_file, Reference %p:\n, (void *) ref); + fprintf (dump_file, (base ); + print_generic_expr (dump_file, base, TDF_SLIM); + fprintf (dump_file, , step ); + print_generic_expr (dump_file, step, TDF_TREE); + fprintf (dump_file, )\n); No need to repeat this - all references are dumped when we gather them. (Snip) The dumping happens at record_ref which is called after these statements to record these references. When the step is invariant we return from the function without recording the references. so I thought of dumping the references here. Is there a cleaner way to dump the references at one place? Yes, call dump_mem_ref then, instead of repeating parts of its body. Richard. Regards, Venkat. -Original Message- From: Richard Guenther [mailto:rguent...@suse.de] Sent: Tuesday, October 02, 2012 5:42 PM To: Kumar, Venkataramanan Cc: gcc-patches@gcc.gnu.org Subject: Re: [Patch] Fix PR53397 On Mon, 1 Oct 2012, venkataramanan.ku...@amd.com wrote: Hi, The below patch fixes the FFT/Scimark regression caused by useless prefetch generation. This fix tries to make prefetch less aggressive by prefetching arrays in the inner loop, when the step is invariant in the entire loop nest. GCC currently tries to prefetch invariant steps when they are in the inner loop. But does not check if the step is variant in outer loops. In the scimark FFT case, the trip count of the inner loop varies by a non constant step, which is invariant in the inner loop. But the step variable is varying in outer loop. This makes inner loop trip count small (at run time varies sometimes as small as 1 iteration) Prefetching ahead x iteration when the inner loop trip count is smaller than x leads to useless prefetches. 
Flag used: -O3 -march=amdfam10 Before ** ** ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark ** ** for details. (Results can be submitted to p...@nist.gov) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 550.50 FFT Mflops:38.66(N=1024) SOR Mflops: 617.61(100 x 100) MonteCarlo: Mflops: 173.74 Sparse matmult Mflops: 675.63(N=1000, nz=5000) LU Mflops: 1246.88(M=100, N=100) After ** ** ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark ** ** for details. (Results can be submitted to p...@nist.gov) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 639.20 FFT Mflops: 479.19(N=1024) SOR Mflops: 617.61(100 x 100) MonteCarlo: Mflops: 173.18 Sparse matmult Mflops: 679.13(N=1000, nz=5000) LU Mflops: 1246.88(M=100, N=100) GCC regression make check -k passes with x86_64-unknown-linux-gnu New tests that PASS: gcc.dg/pr53397-1.c scan-assembler prefetcht0 gcc.dg/pr53397-1.c scan-tree-dump aprefetch Issued prefetch gcc.dg/pr53397-1.c (test for excess errors) gcc.dg/pr53397-2.c scan-tree-dump aprefetch loop variant step gcc.dg/pr53397-2.c scan-tree-dump aprefetch Not prefetching gcc.dg/pr53397-2.c (test for excess errors) Checked CPU2006 and polyhedron on latest AMD processor, no regressions noted. Ok to commit in trunk? regards, Venkat gcc/ChangeLog +2012-10-01 Venkataramanan Kumar venkataramanan.ku...@amd.com + + * tree-ssa-loop-prefetch.c (gather_memory_references_ref):$ + Perform non constant step prefetching in inner loop, only $ + when it is invariant in the entire loop nest. $ + * testsuite/gcc.dg/pr53397-1.c: New test case $ + Checks we are prefecthing for loop invariant steps$ + * testsuite/gcc.dg/pr53397-2.c: New test case$ + Checks we are not prefecthing for loop variant steps + Index: gcc/testsuite/gcc.dg/pr53397-1.c === --- gcc/testsuite/gcc.dg/pr53397-1.c (revision 0) +++ gcc/testsuite/gcc.dg/pr53397-1.c (revision 0) @@ -0,0 +1,28 @@ +/* Prefetching when the step is loop invariant. 
*/ + +/* { dg-do compile } */ +/* { dg-options -O3 -fprefetch-loop-arrays +-fdump-tree-aprefetch-details --param min-insn-to-prefetch-ratio=3 +--param simultaneous-prefetches=10 -fdump-tree-aprefetch-details } +*/ + + +double data[16384]; +void prefetch_when_non_constant_step_is_invariant(int step, int n) { + int a;
Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])
On Thu, Oct 4, 2012 at 1:30 PM, Richard Guenther richard.guent...@gmail.com wrote: Isn't _REVERSE vs. non-_RESERVE still kind-of random order? Not at this stage. For cfglayout mode I would answer yes, but IRA/LRA operates in cfgrtl mode, so the sequence of insns and basic blocks must match. Therefore, if you walk the basic blocks in reverse, and the insns in each basic block in reverse, you effectively work on a, let's say, reverse extended basic block (??) in that your program points are sequential across fallthrough edges. Thus, doesn't the above show there exists an optimal order for processing which we could use? There may be a smarter order: Could even walk blocks in that order if you know a priori what path through the CFG minimizes the length of the live range chains. But to know that order, you have to build the chains. So chicken-and-egg... (I realize _REVERSE is a simple solution, but might there not exist a pathological case where _REVERSE is even worse than non-_REVERSE?) Intuitively, I'm certain that _REVERSE is always better than non-_REVERSE, although I don't know how to prove that :-) Ciao! Steven
Re: Propagate profile counts during switch expansion
Hi, This patch propagates the profile counts during RTL expansion. In many cases, there is no way to determine the exact count of an edge generated during the expansion. So this patch uses some simple heuristics to estimate the edge counts but ensures that the counts of the basic blocks corresponding to the cases are (nearly the) same as at the gimple level. Bootstrapped and profile-bootstrapped on an x86_64/linux machine. OK for trunk? Index: gcc/expr.c === --- gcc/expr.c (revision 191879) +++ gcc/expr.c (working copy) @@ -154,7 +154,7 @@ static rtx do_store_flag (sepops, rtx, enum machin #ifdef PUSH_ROUNDING static void emit_single_push_insn (enum machine_mode, rtx, tree); #endif -static void do_tablejump (rtx, enum machine_mode, rtx, rtx, rtx); +static void do_tablejump (rtx, enum machine_mode, rtx, rtx, rtx, int); static rtx const_vector_from_tree (tree); static void write_complex_part (rtx, rtx, bool); @@ -10894,7 +10894,7 @@ try_casesi (tree index_type, tree index_expr, tree static void do_tablejump (rtx index, enum machine_mode mode, rtx range, rtx table_label, - rtx default_label) + rtx default_label, int default_probability) Please document default_probability. { rtx temp, vector; @@ -10910,9 +10910,17 @@ do_tablejump (rtx index, enum machine_mode mode, r the maximum value of the range. */ if (default_label) -emit_cmp_and_jump_insns (index, range, GTU, NULL_RTX, mode, 1, - default_label); +{ + emit_cmp_and_jump_insns (index, range, GTU, NULL_RTX, mode, 1, + default_label); + if (default_probability != -1) +{ + rtx jump_insn = get_last_insn(); + add_reg_note (jump_insn, REG_BR_PROB, GEN_INT (default_probability)); +} +} dojump already does this kind of logic, but it is bit more cureful: emit_cmp_and_jump_insns (op0, op1, code, size, mode, unsignedp, if_true_label); if (prob != -1 profile_status != PROFILE_ABSENT) { for (last = NEXT_INSN (last); last NEXT_INSN (last); last = NEXT_INSN (last)) if (JUMP_P (last)) break; if (last JUMP_P (last) ! 
NEXT_INSN (last) any_condjump_p (last)) { gcc_assert (!find_reg_note (last, REG_BR_PROB, 0)); add_reg_note (last, REG_BR_PROB, GEN_INT (prob)); } } What about making emit_cmp_and_jump_insns taking the probability argument and moving the code above inside? Most of other places need updating to propagate probabilities. (compare_and_jump_seq in loop-unswitch probably also can be updated) @@ -10954,7 +10962,7 @@ do_tablejump (rtx index, enum machine_mode mode, r int try_tablejump (tree index_type, tree index_expr, tree minval, tree range, - rtx table_label, rtx default_label) + rtx table_label, rtx default_label, int default_probability) Simiarly here. Index: gcc/cfgbuild.c === --- gcc/cfgbuild.c (revision 191879) +++ gcc/cfgbuild.c (working copy) @@ -533,6 +533,23 @@ find_bb_boundaries (basic_block bb) purge_dead_tablejump_edges (bb, table); } +/* If there is at least one edge in EDGES with a non-zero count, then + compute probabilities based on the existing counts. */ + +static bool +gen_probabilities_from_existing_counts ( VEC(edge,gc) *edges) { + edge e; + edge_iterator ei; + gcov_type count_sum = 0; + FOR_EACH_EDGE(e, ei, edges) +count_sum += e-count; + if (count_sum == 0) +return false; + FOR_EACH_EDGE(e, ei, edges) +e-probability = e-count * REG_BR_PROB_BASE / count_sum; + return true; +} + /* Assume that frequency of basic block B is known. Compute frequencies and probabilities of outgoing edges. */ @@ -560,7 +577,6 @@ compute_outgoing_frequencies (basic_block b) return; } } - if (single_succ_p (b)) { e = single_succ_edge (b); @@ -568,7 +584,10 @@ compute_outgoing_frequencies (basic_block b) e-count = b-count; return; } - guess_outgoing_edge_probabilities (b); + else if (!gen_probabilities_from_existing_counts (b-succs)){ +/* All outgoing edges of B have zero count. Guess probabilities. */ +guess_outgoing_edge_probabilities (b); + } Hmm, I do not quite follow logic here. basic block B is one of many basic blocks that the original BB was split from. 
It is possible that B may have some of original edges, but there may be new ones. How you can guess the outgoing probabilitie shere. Do you have an example? Also gen_probabilities_from_existing_counts could probably also work based on original edge frequencies. @@ -1664,7 +1668,7 @@ do_jump_if_equal (enum machine_mode mode, rtx op0,
Re: RFA: darwin PATCH to fix build, internal visibility
On Wed, Oct 03, 2012 at 03:26:14PM -0700, Mike Stump wrote: On Oct 3, 2012, at 12:04 PM, Jason Merrill ja...@redhat.com wrote: This patch fixes a couple of Darwin issues I noticed with a cross-compiler: 1) Adds a couple of consts to avoid const-correctness errors. 2) Treats visibility internal like hidden rather than like default. The documentation says that internal is hidden + processor-specific semantics, so treating it as just hidden makes sense to me. OK for trunk? Ok. FYI, the x86_64-apple-darwin12 testsuite results with the proposed patch are at... http://gcc.gnu.org/ml/gcc-testresults/2012-10/msg00434.html It appears that the patch should also special case the scan-assembler .internal.*Foo.methodEv tests in g++.dg/ext/visibility/pragma-override1.C and g++.dg/ext/visibility/pragma-override2.C on darwin as well... FAIL: g++.dg/ext/visibility/pragma-override1.C -std=c++98 scan-assembler .internal.*Foo.methodEv FAIL: g++.dg/ext/visibility/pragma-override1.C -std=c++11 scan-assembler .internal.*Foo.methodEv FAIL: g++.dg/ext/visibility/pragma-override2.C -std=c++98 scan-assembler .internal.*Foo.methodEv FAIL: g++.dg/ext/visibility/pragma-override2.C -std=c++11 scan-assembler .internal.*Foo.methodEv Jack
Re: patch to fix constant math
Let me talk about the mode here first. What this interface/patch provides is a facility where the constant math that is done in optimizations is done exactly the way that it would be done on the target machine. What we have now is a compiler that only does this if it convenient to do on the host. I admit that i care about this more than others right now, but if intel adds a couple of more instructions to their vector units, other people will start to really care about this issue. If you take an OImode value with the current compiler and left shift it by 250 the middle end will say that the result is 0. This is just wrong!!! What this means is that the bitsize and precision of the operations need to be carried along when doing math. when wide-int checks for overflow on the multiply or add, it is not checking the if the value overflowed on two HWIs, it is checking if the add overflowed in the mode of the types that are represented on the target. When we do shift, we are not doing a shift within two HWIs, we are truncating the shift value (if this is appropriate) according to the bitsize and shifting according the precision. I think that an argument could be made that storing the mode should be changed to an explicit precision and bitsize. (A possible other option would be to store a tree type, but this would make the usage at the rtl level very cumbersome since types are rare.) Aside from the work, you would not get much push back. But the signess is a different argument. At the rtl level, the signess is a matter of context. (you could argue that this is a mistake and i would agree, but that is an even bigger change.) But more to the point, at the tree level, there are a surprising number of places where the operation desired does not follow the sign of the types that were used to construct the constants. Furthermore, not carrying the sign is more consistent with the double int code, which as you point out carries nothing. 
As for the splitting out the patch in smaller pieces, i am all for it. I have done this twice already and i could get the const_scalar_int_p patch out quickly.But you do not get too far along that before you are still left with a big patch. I could split out wide-int.* and just commit those files with no clients as a first step. My guess is that Richard Sandiford would appreciate that because while he has carefully checked the rtl stuff, i think that the code inside wide-int is not in his comfort zone of things he would approve. As far as your btw - noticed this last night. it is an artifact of the way i produced the patch and responsible people have been sacked. However, it shows that you read the patch carefully, and i really appreciate that. i owe you a beer (not that you need another at this time of year). Kenny On 10/04/2012 08:48 AM, Richard Guenther wrote: On Wed, Oct 3, 2012 at 7:15 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: The enclosed patch is the third of at least four patches that fix the problems associated with supporting integers on the target that are wider than two HOST_WIDE_INTs. While GCC claims to support OI mode, and we have two public ports that make minor use of this mode, in practice, compilation that uses OImode mode commonly gives the wrong result or ices. We have a private port of GCC for an architecture that is further down the road to needing comprehensive OImode and we have discovered that this is unusable. We have decided to fix it in a general way that so that it is most beneficial to the GCC community. It is our belief that we are just a little ahead of the X86 and the NEON and these patches will shortly be essential. The first two of these patches were primarily lexigraphical and have already been committed.They transformed the uses of CONST_DOUBLE so that it is easy to tell what the intended usage is. 
The underlying structures in the next two patches are very general: once they are added to the compiler, the compiler will be able to support targets with any size of integer from hosts of any size integer. The patch enclosed deals with the portable RTL parts of the compiler. The next patch, which is currently under construction deals with the tree level. However, this patch can be put on the trunk as is, and it will eleviate many, but not all of the current limitations in the rtl parts of the compiler. Some of the patch is conditional, depending on a port defining the symbol 'TARGET_SUPPORTS_WIDE_INT' to be non zero. Defining this symbol to be non zero is declaring that the port has been converted to use the new form or integer constants. However, the patch is completely backwards compatible to allow ports that do not need this immediately to convert at their leasure. The conversion process is not difficult, but it does require some knowledge of the port, so we are not volinteering to do this for all ports. OVERVIEW OF THE PATCH: The patch
Profile housekeeping 6/n (-fprofile-consistency-report)
Hi, this patch implements -fprofile-consistency-report that is useful to get an statistic about what pass are major offenders in keeping profile up-to-date. For example the following is output for combine.c Pass: fnsplit (after pass) mismatched in: +16 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: fnsplit (after TODO) mismatched in: -16 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: inline (after pass) mismatched in: +197 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: inline (after TODO) mismatched in: -209 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: ccp (after TODO) mismatched in: +8 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: vrp (after pass) mismatched in: +191 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: vrp (after TODO) mismatched in: +25 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: dce (after TODO) mismatched in: -19 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: cdce (after pass) mismatched in: -1 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: cselim (after pass) mismatched in: +1 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: ifcombine(after TODO) mismatched in: +1 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: phiopt (after pass) mismatched in: -2 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: ch (after pass) mismatched in: +2 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: ch (after TODO) mismatched in: +1 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: dom (after pass) mismatched in: +89 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: dom (after TODO) mismatched in: -3 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: phicprop (after TODO) mismatched in: -6 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: dce (after 
TODO) mismatched in: -2 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: copyprop (after TODO) mismatched in: -17 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: unswitch (after pass) mismatched in: +7 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: unswitch (after TODO) mismatched in: +19 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: cunroll (after pass) mismatched in: +10 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: vrp (after pass) mismatched in: +18 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: vrp (after TODO) mismatched in: -8 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: dom (after pass) mismatched in: +14 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: dom (after TODO) mismatched in: +4 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: phicprop (after TODO) mismatched in: +1 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: cddce(after TODO) mismatched in: -1 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: expand (after pass) mismatched in: +435 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: jump (after pass) mismatched in: +6 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: cse1 (after pass) mismatched in: +1 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: cprop(after pass) mismatched in: -8 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: rtl pre (after pass) mismatched in: -1 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: cse_local(after pass) mismatched in: -7 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: ce1 (after pass) mismatched in: +5 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: loop2_init (after pass) mismatched in: +1 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: loop2_done (after 
pass) mismatched in: -1 (freqs) +0 (counts); michmatched out: +0 (freqs) +0 (counts) Pass: reload
Re: Profile housekeeping 6/n (-fprofile-consistency-report)
On Thu, Oct 4, 2012 at 4:01 PM, Jan Hubicka wrote: * doc/invoke.texi (-fprofile-consistency-report): Document. * common.opt (fprofile-consistency-report): New. * toplev.h (dump_profile_consistency_report): Declare. * toplev.c (finalize): Call dump_profile_consistency_report. * passes.c (profile_record): New global var. (check_profile_consistency): New function. (dump_profile_consistency_report): New function. (execute_one_ipa_transform_pass): Call check_profile_consistency. (execute_one_pass): Likewise. Nice. And long overdue! :-) +fprofile-consistency-report +Common Report Var(profile_report) +Report on consistency of profile Maybe make this a -d flag instead of -f? Index: passes.c +/* Hold statistic about profile consistency. */ ... I don't see why this should live in passes.c, can you please put it in a more logical place (profile.c, perhaps)? Ciao! Steven
Re: RFA: add lock_length attribute to break branch-shortening cycles
Quoting Richard Guenther richard.guent...@gmail.com: I miss a few things in this description: - what is the value of lock_length supposed to be? From the lock prefix it sounds like it is something unchanging, maybe even constant, thus a maximum? - the length attribute still needs to be specified when lock_length is? how do they relate? Is lock_length always smaller / bigger than length? I hope I have clarified this in the updated documentation: Usually, branch shortening is done assuming the worst case (i.e. longest) lengths, and then iterating (if optimizing) to smaller lengths till no further changed occur. This does not work so well for architectures that have very small minimum offsets and considerable jumps in instruction lengths. If you define the `lock_length' attribute, branch shortening will work the other way round: it starts out assuming minimum instruction lengths and iterates from there. `lock_length' specifies an instruction length value that is calculated like `length' in every iteration, but if the value at the last iteration was larger, that larger previous value will be used instead. The value computed for the `length' attribute will be no smaller than that of the `lock_length' attribute, but you may still specify a larger value, and in that case `length' can decrease in the next iteration, but not below the value of `lock_length'. The easiest way to make sure branch shortening doesn't start looping indefinitely is to define the `length' attribute to only a minimum instruction length for varying length instructions and let the `lock_length' attribute do all the heavy lifting of computing varying lengths. On the other hand, for some instruction sets you might get better shortening outcomes if you use a more complex `length' attribute and use `lock_length' only to the extent required to prevent indefinite looping. Note that `ADJUST_INSN_LENGTH' applies only to the `length' attribute. 
Because defining `lock_length' makes branch shortening start with optimistic assumptions, this means we have to see it through to the end to eliminate all out-of-range branches, thus branch shortening might take a bit longer at `-O0' than if you didn't define the `lock_length' attribute. Using `lock_length' with varying delay slots and varying length delay slot insns (all at once) is currently not supported. - what happens if you have patterns with lock_length and patterns without? - what patterns does lock_length apply to? These questions don't really make literal sense; instruction attributes are defined for all instructions. Of course, you can define the default value to 0, which means you see no effect of this attribute on a pattern unless some non-default definition applies. In general optimistically attacking this kind of problem should always be better - did you try simply switching this for all targets? No, I haven't. The boundaries to be set for branch offsets can be different when starting with optimistic assumptions than when starting with pessimistic assumptions. Moreover, there could be 'interesting' interactions with ADJUST_INSN_LENGTH. And there is the problem with handling varying delay slots. I have modified the final.c patch to reduce the scope of this issue so that lock_length should hopefully be useful for a wider range of targets. Also, ... It shouldn't be slower and the only thing you need to guarantee is that during iteration you never make insn-lengths smaller again. At -O0 it is slower because we can't finish the loop early. For all these reasons, I think it is better if each target maintainer evaluates whether the better branch shortening outweighs the longer -O0 compilation time, and addresses any issues arising if/when converting.
Bootstrapped on i686-pc-linux-gnu

Index: doc/md.texi
===
--- doc/md.texi (revision 192036)
+++ doc/md.texi (working copy)
@@ -8004,6 +8004,42 @@
 (define_insn jump
 (const_int 6)))])
 @end smallexample
+@cindex lock_length
+Usually, branch shortening is done assuming the worst case (i.e. longest)
+lengths, and then iterating (if optimizing) to smaller lengths till
+no further changes occur.  This does not work so well for architectures
+that have very small minimum offsets and considerable jumps in instruction
+lengths.
+
+If you define the @code{lock_length} attribute, branch shortening will
+work the other way round: it starts out assuming minimum instruction
+lengths and iterates from there.  @code{lock_length} specifies an instruction
+length value that is calculated like @code{length} in every iteration,
+but if the value at the last iteration was larger, that larger previous
+value will be used instead.
+The value computed for the @code{length} attribute will be no smaller
+than that of the @code{lock_length} attribute, but you may still specify
+a larger value, and in that case @code{length} can decrease in the
Re: [RFC] Make vectorizer to skip loops with small iteration estimate
So SOC cancels out in the runtime check. I still think we need two formulas - one determining if vectorization is profitable, the other specifying the threshold for the scalar path at runtime (that will generally give lower values). True, we want two values. But part of the scalar path right now is all the computation required for the alias and alignment runtime checks (because of the way all the conditions are combined). I'm not much into the details of what we account for in SOC (I suppose it's everything we insert in the preheader of the vector loop). Yes, it seems to contain everything we insert prior to the loop in unfolded form.

+  if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
+    fprintf (vect_dump, "not vectorized: estimated iteration count too small.");
+  if (vect_print_dump_info (REPORT_DETAILS))
+    fprintf (vect_dump, "not vectorized: estimated iteration count smaller than "
+	     "user specified loop bound parameter or minimum "
+	     "profitable iterations (whichever is more conservative).");

this won't work anymore btw - dumping infrastructure changed. Ah, will update that. I suppose your patch is a step in the right direction, but to really make progress we need to re-organize the loop and predicate structure produced by the vectorizer. This reminds me of what I did for string functions on x86. It gets very hard to get all the paths right when one starts to be really careful to not output too much cruft on the short paths and not consume too many registers. In fact I want to re-think this for the SSE string ops patch, so I may try to look into that incrementally. So, please update your patch, re-test and then it's ok. Thanks. I tested enabling loop_ch in early passes with -fprofile-feedback and it is SPEC neutral. Given that it improves loop count estimates, I would still like mainline doing that. I do not like these quite important estimates to be wrong most of the time. I agree. It also helps getting rid of once-rolling loops I think. I am attaching the patch for early-ch.
Will commit it tomorrow. Concerning jump threading, it would help to make some of it during early passes so the profile estimates do not get invalidated. I tried to move VRP early but now it makes the compiler hang during bootstrap. I will debug that. Btw, I added a similar check in vect_analyze_loop_operations:

  if ((LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
       && (LOOP_VINFO_INT_NITERS (loop_vinfo) < vectorization_factor))
      || ((max_niter = max_stmt_executions_int (loop)) != -1
	  && (unsigned HOST_WIDE_INT) max_niter < vectorization_factor))
    {
      if (dump_kind_p (MSG_MISSED_OPTIMIZATION))
	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
			 "not vectorized: iteration count too small.");
      if (dump_kind_p (MSG_MISSED_OPTIMIZATION))
	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
			 "not vectorized: iteration count smaller than "
			 "vectorization factor.");
      return false;
    }

maybe you simply need to update that to also consider the profile? Hmm, I am still getting familiar with the code. Later we have

  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
      && LOOP_VINFO_INT_NITERS (loop_vinfo) <= th)
    {
      if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
	fprintf (vect_dump, "not vectorized: vectorization not profitable.");
      if (vect_print_dump_info (REPORT_DETAILS))
	fprintf (vect_dump, "not vectorized: iteration count smaller than "
		 "user specified loop bound parameter or minimum "
		 "profitable iterations (whichever is more conservative).");
      return false;
    }

where th is always greater than or equal to vectorization_factor from the cost model. So this test seems redundant if the max_stmt_executions_int check was pushed down to the second conditional? Yes, sort of. The new check was supposed to be crystal clear, and even with the cost model disabled we want to not vectorize in this case. But yes, the whole cost-model stuff needs TLC. Ah yes, without the cost model we would skip it. I suppose we do not need to bother with the profile estimate in that case anyway. They are kind of part of the cost models.
* passes.c (init_optimization_passes): Schedule early CH. * tree-pass.h (pass_early_ch): Declare it. * tree-ssa-loop-ch.c (gate_early_ch): New function. (pass_early_ch): New pass. Index: passes.c === --- passes.c(revision 191852) +++ passes.c(working copy) @@ -1335,6 +1336,7 @@ init_optimization_passes (void) NEXT_PASS (pass_cleanup_eh); NEXT_PASS (pass_profile); NEXT_PASS (pass_local_pure_const);
Re: libgo patch committed: Use libbacktrace
On Thu, Oct 4, 2012 at 5:11 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Ian Lance Taylor i...@google.com writes: This patch to libgo changes it to use libbacktrace. Previously backtraces required the Go package debug/elf to register itself with the runtime during package initialization, which only worked if the program actually imported debug/elf one way or another. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Unfortunately, this breaks all use of libgo on versions of Solaris 11 which lack strnlen:

Undefined first referenced symbol in file
strnlen /var/gcc/regression/trunk/10-gcc/build/i386-pc-solaris2.10/libgo/.libs/libgo.so
ld: fatal: symbol referencing errors. No output written to a.out
collect2: error: ld returned 1 exit status
FAIL: bufio

One could either try to also link libiberty into libgo.la, but that has the complication of needing to decide whether to use libiberty.a or pic/libiberty.a since libiberty is not a libtool library. I guess I won't try to link libgo against libiberty. I just changed libbacktrace to provide its own strnlen function, like so. Bootstrapped and ran the libbacktrace testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian

2012-10-04 Ian Lance Taylor i...@google.com

	* dwarf.c: If the system header files do not declare strnlen,
	provide our own version.
Re: patch to fix
On 10/04/2012 09:17 AM, Marc Glisse wrote: On Wed, 3 Oct 2012, Mike Stump wrote: On Oct 3, 2012, at 1:47 PM, Marc Glisse marc.gli...@inria.fr wrote: did you consider making the size of wide_int a template parameter, now that we are using C++? All with a convenient typedef or macro so it doesn't show. I am asking because in vrp I do some arithmetic that requires 2*N+1 bits where N is the size of double_int. No, not really. I'd maybe answer it this way, we put in a type (singular) to support all integral constants in all languages on a port. Since we only needed 1, there was little need to templatize it. By supporting all integral constants in all languages, there is little need for more. If Ada say, wanted a 2048 bit integer, then, we just have it drop off the size it wants someplace and we would mix that in on a MAX(...) line, net result, the type we use would then directly support the needs of Ada. If vrp wanted 2x of all existing modes, we could simply change the MAX equation and essentially double it; if people need that. This comes at a cost, as the intermediate wide values are fixed size allocated (not variable); so these all would be larger. And this cost could be eliminated by having a template wide_int_<N> so only the places that need it actually use the extra size ;-) The space is not really an issue in most places since wide-ints tend to be short lived. i guess vrp is slightly different because it creates a lot at once. but then they go away. However the real question is what are you going to instantiate the template on? What we do is look at the target and determine the largest type that the target supports and build a wide int type that supports that. How are you going to do better? are you going to instantiate one for every type you see? are these going to be static or dynamic? The last line of this email seems to imply that you were planning to know that __int128 was the largest integer that any target or front end could support.
and then what do you do for the parts of the compiler that have operations that take things of two different types, like shift. The shift amount can be, and many times is, a shorter type than what is being shifted. Would these different length integers be represented with different instances from the same template? I am not a c++ programmer and so all of this is a little new to me, but given a perspective of the rest of the compiler, this does not seem like the right way to go. On Wed, 3 Oct 2012, Kenneth Zadeck wrote: i have already converted the vrp code, so i have some guess at what you are talking about. (of course correct me if i am wrong). the code that computes the range when two variables are multiplied together needs to do a multiplication that produces a result that is twice as wide as the inputs. Yes, exactly. my library is able to do that with one catch (and this is a big catch): the target has to have an integer mode that is twice as big as the mode of the operands. The issue is that wide-ints actually carry around the mode of the value in order to get the bitsize and precision of the operands (it does not have the type, because this code has to work both on the rtl and tree level and i generally do not want the signedness anyway). my current code in vrp checks to see if such a mode exists and if it does, it produces the product. if the mode does not exist, it returns bottom. What this means is that for most (many or some) targets that have a TImode, the largest thing that vrp can discover ranges for is a DImode value. We could get around this by defining the next larger mode than what the target really needs but i wonder how much mileage you are going to get out of that with really large numbers. This will be for discussion when you submit that next patch, but currently VRP handles integers the same size as double_int. In particular, it handles __int128. I would be unhappy if introducing a larger bigint type in gcc made us regress there.
You are only happy now because you do not really understand the world around you. This is not what your code does. What your code does is that if the host is a 64 bit host you can handle __int128 and if your host is a 32 bit host you can handle a __int64. If you are building a cross compiler from a 32 bit host to a 64 bit target, your pass is either going to get the wrong answer, give up, or ice. There are currently parts of gcc that do each of these three solutions and my patch gets rid of these because it does the math as the target does the math, no matter what the target is. The goal of my patch is to make gcc produce the same correct results no matter what types the target or host support. The last thing we need is some optimization knowing what the limits of either of these are and hard coding that in a set of templates that have been statically instantiated.
[C++ testcase] PR 54323
Hi, I'm adding the testcase and closing the PR. Tested x86_64-linux. Thanks, Paolo. / 2012-10-04 Paolo Carlini paolo.carl...@oracle.com

	PR c++/54323
	* g++.dg/cpp0x/pr54323.C: New.

Index: g++.dg/cpp0x/pr54323.C
===
--- g++.dg/cpp0x/pr54323.C (revision 0)
+++ g++.dg/cpp0x/pr54323.C (working copy)
@@ -0,0 +1,37 @@
+// PR c++/54323
+// { dg-do compile { target c++11 } }
+
+template<bool, typename T = void>
+struct enable_if { };
+
+template<typename T>
+struct enable_if<true, T>
+{ typedef T type; };
+
+template<template<typename> class CRTP, typename T>
+class Base
+{
+public:
+  template<template<typename> class CRTP0, typename T0, class>
+  friend int func(const Base<CRTP0, T0>& rhs);
+
+protected:
+  int n;
+};
+
+template<template<typename> class CRTP0, typename T0,
+	 class = typename enable_if<true>::type>
+int func(const Base<CRTP0, T0>& rhs)
+{
+  return rhs.n;
+}
+
+template<typename T>
+class Derived : public Base<Derived, T> {};
+
+int main()
+{
+  Derived<int> x;
+  func(x);
+  return 0;
+}
Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])
On 10/04/2012 03:24 AM, Steven Bosscher wrote: On Thu, Oct 4, 2012 at 8:57 AM, Steven Bosscher stevenb@gmail.com wrote: On Thu, Oct 4, 2012 at 5:30 AM, Vladimir Makarov vmaka...@redhat.com wrote: I was going to look at this code too, but I was interested in generating fewer points and live ranges. It is strange that in my profiles, remove_some_program_points_and_update_live_ranges takes 0.6% of compiler time on these huge tests. So I was not interested in speeding up the function, and maybe that is why you see no visible change in compilation time. Right. The compression algorithm doesn't care much about the initial number of program points, only about the number of live ranges before and after compression. I had expected a bigger effect on the number of live ranges before compression. 0.6% sounds really very different from my timings. How much time does create_start_finish_chains take for you? I don't object to the idea of the patch. I need some time to look at it (the different results on a function are a bit scary for me) and check simulator times on other tests. Understood. BTW, it would be great if you could also look at this additional patch hunk:

@@ -994,8 +1044,8 @@ lra_create_live_ranges (bool all_p)
   curr_point = 0;
   point_freq_vec = VEC_alloc (int, heap, get_max_uid () * 2);
   lra_point_freq = VEC_address (int, point_freq_vec);
-  FOR_EACH_BB (bb)
-    process_bb_lives (bb);
+  FOR_EACH_BB_REVERSE (bb)
+    process_bb_lives (bb, curr_point);
   lra_live_max_point = curr_point;
   create_start_finish_chains ();
   if (lra_dump_file != NULL)

I think this should result in more live ranges being merged. Here's why I think so, based on my far worse understanding of this code than yours, so forgive me if I'm Completely Wrong :-) No, you are not wrong. Two days ago, I worked on a patch which contains the same code.
The patch actually takes EBB into account to decrease # calls of mark_pseudo_live at the beginning of process_bb_lives and mark_pseudo_dead at the function end, and for that I needed FOR_EACH_BB_REVERSE. The patch was half baked (it did not check hard reg live changes at the end of BB to set up the right hard reg conflicts for pseudos) but it gave an idea how much I can get from this. It is not bad but not what I expected. So I stopped work on this. But we still should work on these ideas as they improve LRA speed in small steps (many small steps will create a visible effect). We can really solve the scalability problem only by using simpler but still good enough algorithms (too simple algorithms result in big code size and actually even in worse compilation times). I've been working on it and I'll send a patch soon. process_bb_lives walks insns in the basic block from last to first, so say you have a basic block chain 1-2-3, and each block has 4 insns, then AFAIU the program points in block 1 will be [4,3,2,1], in block 2 it will be [8,7,6,5], and in block 3 it will be [12,11,10,9]. Say a reg is used in block 3 at point 11, and set in block 1 at point 3. Then this reg will have a live range chain [3-1],[8-5],[12-11]. If you visit the basic blocks in reverse order, the program points will be: 1:[12,11,10,9], 2:[8,7,6,5], 3:[4,3,2,1]. Now the same reg will be set at point 11 and used at point 3, and the live range chain will be just [11-3].
Small PATCH to rs6000.c to fix cross-compiler build without gas
If configure doesn't think that the assembler supports weak symbols, rs6000.c fails to compile because ASM_WEAKEN_DECL isn't defined. So let's not use it in that case. OK for trunk?

commit 5f0878b79d1a42795aca2fabc8d70eefa2e29fa6
Author: Jason Merrill ja...@redhat.com
Date: Wed Oct 3 15:55:48 2012 -0400

	* config/rs6000/rs6000.c (rs6000_code_end): Protect the use of
	ASM_WEAKEN_DECL with #if RS6000_WEAK.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3e3d553..f4e4dec 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -28295,6 +28295,7 @@ rs6000_code_end (void)
   TREE_PUBLIC (decl) = 1;
   TREE_STATIC (decl) = 1;
+#if RS6000_WEAK
   if (USE_HIDDEN_LINKONCE)
     {
       DECL_COMDAT_GROUP (decl) = DECL_ASSEMBLER_NAME (decl);
@@ -28307,6 +28308,7 @@ rs6000_code_end (void)
       ASM_DECLARE_FUNCTION_NAME (asm_out_file, name, decl);
     }
   else
+#endif
     {
       switch_to_section (text_section);
       ASM_OUTPUT_LABEL (asm_out_file, name);
Re: patch to fix
Actually richi, this code is correct for some broken definition of correct. If all that is done is to convert the rtl parts of the compiler, then this code is the best you can do (of course an assertion that the length is not greater than 2 would be a useful addition). The code in the follow-on patch converts the insides of a tree cst to look like a const wide int, i.e. an array of HWIs. When that happens, this code looks completely different. But if you only convert the rtl level, at some point there is going to be an impedance mismatch and it is buried here. I will point out that this is the fallout of trying to split things into a bunch of smaller patches that could in theory go in separately. kenny

+/* Constructs tree in type TYPE with value given by CST.  Signedness
+   of CST is assumed to be the same as the signedness of TYPE.  */
+
+tree
+wide_int_to_tree (tree type, const wide_int cst)
+{
+  wide_int v;
+  if (TYPE_UNSIGNED (type))
+    v = cst.zext (TYPE_PRECISION (type));
+  else
+    v = cst.sext (TYPE_PRECISION (type));
+
+  return build_int_cst_wide (type, v.elt (0), v.elt (1));
+}

is surely broken. A wide-int does not fit a double-int. How are you going to fix this? Thanks, Richard. kenny
PATCH to acinclude.m4 to fix gas version detection
Recent versions of binutils seem to have started putting ' around the version number in bfd/configure.in, which was confusing gcc configure. This patch allows us to detect the version number again. OK for trunk?

commit f9ce75775fe4392ee92893c46e89e17dc31bb816
Author: Jason Merrill ja...@redhat.com
Date: Thu Oct 4 00:55:33 2012 -0400

	* acinclude.m4 (_gcc_COMPUTE_GAS_VERSION): Handle ' around
	version number.
	* configure: Regenerate.

diff --git a/gcc/acinclude.m4 b/gcc/acinclude.m4
index c24464b..d5eb4da 100644
--- a/gcc/acinclude.m4
+++ b/gcc/acinclude.m4
@@ -393,7 +393,7 @@
 for f in $gcc_cv_as_bfd_srcdir/configure \
	 $gcc_cv_as_gas_srcdir/configure \
	 $gcc_cv_as_gas_srcdir/configure.in \
	 $gcc_cv_as_gas_srcdir/Makefile.in ; do
-  gcc_cv_gas_version=`sed -n -e 's/^[[ ]]*\(VERSION=[[0-9]]*\.[[0-9]]*.*\)/\1/p' $f`
+  gcc_cv_gas_version=`sed -n -e s/^[[ ]]*VERSION='*\([[0-9]]*\.[[0-9]]*[^']*\)'*/VERSION=\1/p $f`
   if test x$gcc_cv_gas_version != x; then
     break
   fi

diff --git a/gcc/configure b/gcc/configure
index 45bba8e..2d71f7d 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -21237,7 +21237,7 @@
 for f in $gcc_cv_as_bfd_srcdir/configure \
	 $gcc_cv_as_gas_srcdir/configure \
	 $gcc_cv_as_gas_srcdir/configure.in \
	 $gcc_cv_as_gas_srcdir/Makefile.in ; do
-  gcc_cv_gas_version=`sed -n -e 's/^[ ]*\(VERSION=[0-9]*\.[0-9]*.*\)/\1/p' $f`
+  gcc_cv_gas_version=`sed -n -e s/^[[ ]]*VERSION='*\([[0-9]]*\.[[0-9]]*[^']*\)'*/VERSION=\1/p $f`
   if test x$gcc_cv_gas_version != x; then
     break
   fi
Re: RFC: LRA for x86/x86-64 [7/9] -- continuation
Hi Vlad, This message is for lra-assigns.c. Sorry for the piecemeal reviews, never sure when I'll get time... +/* This file contains a pass mostly assigning hard registers to reload + pseudos. There is no any RTL code transformation on this pass. Maybe: /* This file's main objective is to assign hard registers to reload pseudos. It also tries to allocate hard registers to other pseudos, but at a lower priority than the reload pseudos. The pass does not transform the RTL. if that's accurate. + Reload pseudos get what they need (usually) hard registers in + anyway possibly by spilling non-reload pseudos and by assignment + reload pseudos with smallest number of available hard registers + first. + + If reload pseudos can get hard registers only through spilling + other pseudos, we choose what pseudos to spill taking into account + how given reload pseudo benefits and also how other reload pseudos + not assigned yet benefit too (see function spill_for). Maybe: We must allocate a hard register to every reload pseudo. We try to increase the chances of finding a viable allocation by assigning the pseudos in order of fewest available hard registers first. If we still fail to find a hard register, we spill other (non-reload) pseudos in order to make room. assign_hard_regno_for allocates registers without spilling. spill_for does the same with spilling. Both functions use a cost model to determine the most profitable choice of hard and spill registers. + Non-reload pseudos can get hard registers too if it is possible and + improves the code. It might be possible because of spilling + non-reload pseudos on given pass. Maybe: Once we have finished allocating reload pseudos, we also try to assign registers to other (non-reload) pseudos. This is useful if hard registers were freed up by the spilling just described. + We try to assign hard registers processing pseudos by threads. The + thread contains reload and inheritance pseudos connected by copies + (move insns). 
It improves the chance to get the same hard register + to pseudos in the thread and, as the result, to remove some move + insns. Maybe: We try to assign hard registers by collecting pseudos into threads. These threads contain reload and inheritance pseudos that are connected by copies (move insns). Doing this improves the chances of pseudos in the thread getting the same hard register and, as a result, of allowing some move insns to be deleted. + When we assign hard register to a pseudo, we decrease the cost of + the hard registers for corresponding pseudos connected by copies. Maybe: When we assign a hard register to a pseudo, we decrease the cost of using the same hard register for pseudos that are connected by copies. + If two hard registers are equally good for assigning the pseudo + with hard register cost point of view, we prefer a hard register in + smaller register bank. By default, there is only one register + bank. A target can define register banks by hook + register_bank. For example, x86-64 has a few register banks: hard + regs with and without REX prefixes are in different banks. It + permits to generate smaller code as insns without REX prefix are + shorter. Maybe: If two hard registers have the same frequency-derived cost, we prefer hard registers in lower register banks. The mapping of registers to banks is controlled by the register_bank target hook. For example, x86-64 has a few register banks: hard registers with and without REX prefixes are in different banks. This permits us to generate smaller code as insns without REX prefixes are shorter. although this might change if the name of the hook changes. +/* Info about pseudo used during the assignment pass. Thread is a set + of connected reload and inheritance pseudos with the same set of + available hard reg set. Thread is a pseudo itself for other + cases. */ +struct regno_assign_info Maybe: /* Information about the thread to which a pseudo belongs. 
Threads are a set of connected reload and inheritance pseudos with the same set of available hard registers. Lone registers belong to their own threads. */ Although the condition seems to be: + (ira_class_hard_regs_num[regno_allocno_class_array[regno1]] + == ira_class_hard_regs_num[regno_allocno_class_array[regno2]])) i.e. the same _number_ of available hard regs, but not necessarily the same set. thread might be more mnemonic than regno_assign in this file, but that's bikeshed stuff.

+  for (i = FIRST_PSEUDO_REGISTER; i < max_reg_num (); i++)
+    {
+      regno_assign_info[i].first = i;
+      regno_assign_info[i].next = -1;
+      regno_assign_info[i].freq = lra_reg_info[i].freq;
+    }

Minor speedup, but it's probably worth caching max_reg_num () rather than calling it in each loop
Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc
On Oct 4, 2012, at 6:06 AM, Iain Buclaw ibuc...@ubuntu.com wrote: I would like to get a bump on this. It's been a while, and there have been quite a number of changes since the initial post that address many of the issues raised. Rather than reposting patches, someone mentioned attaching a changelog, well, here it is. Repository is still located here: https://github.com/D-Programming-GDC/GDC Would it be possible to have a renewed review? You don't ask, you post. If you have independent patches to the rest of gcc that improve it, I'd suggest posting those as separate patches and getting those in. This gives you a base onto which to slot in the front-end.
Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])
On Thu, Oct 4, 2012 at 5:31 PM, Vladimir Makarov vmaka...@redhat.com wrote: Wow. I did not have such effect. What machine do you use? I do all my testing on gcc17. Ciao! Steven
Re: patch to fix
On Thu, 4 Oct 2012, Kenneth Zadeck wrote: On 10/04/2012 09:17 AM, Marc Glisse wrote: On Wed, 3 Oct 2012, Mike Stump wrote: On Oct 3, 2012, at 1:47 PM, Marc Glisse marc.gli...@inria.fr wrote: did you consider making the size of wide_int a template parameter, now that we are using C++? All with a convenient typedef or macro so it doesn't show. I am asking because in vrp I do some arithmetic that requires 2*N+1 bits where N is the size of double_int. No, not really. I'd maybe answer it this way, we put in a type (singular) to support all integral constants in all languages on a port. Since we only needed 1, there was little need to templatize it. By supporting all integral constants in all languages, there is little need for more. If Ada say, wanted a 2048 bit integer, then, we just have it drop off the size it wants someplace and we would mix that in on a MAX(...) line, net result, the type we use would then directly support the needs of Ada. If vrp wanted 2x of all existing modes, we could simply change the MAX equation and essentially double it; if people need that. This comes at a cost, as the intermediate wide values are fixed size allocated (not variable); so these all would be larger. And this cost could be eliminated by having a template wide_int_ so only the places that need it actually use the extra size ;-) The space is not really an issue in most places since wide-ints tend to be short lived. You were the one talking of a cost. However the real question is what are you going to instantiate the template on? What we do is look at the target and determine the largest type that the target supports and build a wide int type that supports that. How are you going to do better? In a single place in tree-vrp.c in the code that evaluates multiplications, I would instantiate the template on the double (possibly +1) of the value you selected as large enough for all constants. For all the rest, your type is fine.
This will be for discussion when you submit that next patch, but currently VRP handles integers the same size as double_int. In particular, it handles __int128. I would be unhappy if introducing a larger bigint type in gcc made us regress there. You are only happy now because you do not really understand the world around you. I did not want to go into details, but let me re-phrase: I do not want to regress. Currently, hosts with a 64 bit hwi can handle VRP multiplications on __int128. If your patch introducing better big integers breaks that, that sounds bad to me, since I would expect s/double_int/wide_int/ to just work, and using wide_int<2*MAX> would just be a potential simplification of the code for later. Note that VRP is just the one case I am familiar with. Using templates should (I haven't checked) be completely trivial and help the next person who needs bigger integers for a specific purpose and doesn't want to penalize the whole compiler. If the size of wide_int is completely irrelevant and we can make it 10 times larger without thinking, I guess some numbers showing it would be great (or maybe that's common knowledge, then I guess it is fine). Now those are only some comments from an occasional contributor, not reviewer requirements, it is fine to ignore them. -- Marc Glisse
Re: [patch][lra] Improve initial program point density in lra-lives.c
On Thu, Oct 4, 2012 at 6:12 PM, Vladimir Makarov vmaka...@redhat.com wrote: 0.6% sounds really very different from my timings. How much time does create_start_finish_chains take for you? 0.65% (2.78s). Actually, I have a profile but I am not sure now that it is for PR54146. It might be for PR26854. I'll check it again to be sure. No, it looks about the same. Well, that's very strange. Maybe we measure these things differently? I just hijack a timevar, so I measure e.g. the time spent in create_start_finish_chains like so:

Index: lra-lives.c
===
--- lra-lives.c (revision 192052)
+++ lra-lives.c (working copy)
@@ -770,6 +812,7 @@ create_start_finish_chains (void)
   int i, max_regno;
   lra_live_range_t r;

+  timevar_push (TV_CPROP);
   lra_start_point_ranges = XCNEWVEC (lra_live_range_t, lra_live_max_point);
   lra_finish_point_ranges = XCNEWVEC (lra_live_range_t, lra_live_max_point);
   max_regno = max_reg_num ();
@@ -783,6 +826,7 @@ create_start_finish_chains (void)
	  lra_finish_point_ranges[r->finish] = r;
	}
     }
+  timevar_pop (TV_CPROP);
 }

 /* Rebuild LRA_START_POINT_RANGES and LRA_FINISH_POINT_RANGES after

so that I get the timings in the -ftime-report like so:

 CPROP                   :  43.14 ( 4%) usr
 integrated RA           : 200.81 (17%) usr
 LRA non-specific        :  62.18 ( 5%) usr
 LRA virtuals elimination:  61.71 ( 5%) usr
 LRA reload inheritance  :   6.41 ( 1%) usr
 LRA create live ranges  : 139.75 (13%) usr
 LRA hard reg assignment : 130.90 (11%) usr
 LRA coalesce pseudo regs:   2.45 ( 0%) usr
 reload                  :   9.09 ( 1%) usr

Crude, but efficient (tm) :-) How do you measure the time spent in that function, and in remove_some_program_points_and_update_live_ranges? Ciao! Steven
Re: patch to fix constant math
On Thu, Oct 4, 2012 at 3:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: Let me talk about the mode here first. What this interface/patch provides is a facility where the constant math that is done in optimizations is done exactly the way that it would be done on the target machine. What we have now is a compiler that only does this if it is convenient to do on the host. I admit that I care about this more than others right now, but if Intel adds a couple more instructions to their vector units, other people will start to really care about this issue. If you take an OImode value with the current compiler and left shift it by 250, the middle end will say that the result is 0. This is just wrong!!! What this means is that the bitsize and precision of the operations need to be carried along when doing math. When wide-int checks for overflow on the multiply or add, it is not checking whether the value overflowed in two HWIs, it is checking whether the add overflowed in the mode of the types that are represented on the target. When we do a shift, we are not doing a shift within two HWIs, we are truncating the shift value (if this is appropriate) according to the bitsize and shifting according to the precision. I think that an argument could be made that storing the mode should be changed to an explicit precision and bitsize. (A possible other option would be to store a tree type, but this would make the usage at the rtl level very cumbersome since types are rare.) Aside from the work, you would not get much push back. But the signedness is a different argument. At the rtl level, the signedness is a matter of context. (You could argue that this is a mistake and I would agree, but that is an even bigger change.) But more to the point, at the tree level, there are a surprising number of places where the operation desired does not follow the sign of the types that were used to construct the constants.
Furthermore, not carrying the sign is more consistent with the double-int code, which as you point out carries nothing. Well, on RTL the signedness is on the operation (you have sdiv and udiv, etc.). double-int tries to present a sign-less twos-complement entity of size 2 * HOST_BITS_PER_WIDE_INT. I think that is sensible and for obvious reasons should not change. Both tree and RTL rely on this. What we do not want is that up to TImode you get an internal representation done one way (twos-complement) and on OImode and larger you suddenly get subtly different behavior. That's a recipe for disaster. I'd like to clean up the interface to double-int some more (now with the nice C++ stuff we have). double-int should be pure twos-complement; there should be no operations on double-ints that behave differently when done signed or unsigned, instead we have signed and unsigned versions of the operations (similar to how signedness is handled on the RTL level). With some trivial C++ fu you could have a double_sint and double_uint type that would get rid of the bool sign params we have to some functions (and then you could write double_sint n using operator notation). I'd like wide-int (whatever its internal representation is) to behave exactly like double-ints with respect to precision and signedness handling. Ideally all static functions we have that operate on double-ints would be 1:1 available for wide-ints, so I can change the type of entities in an algorithm from double-ints to wide-ints (or vice versa) and do not have to change the code at all. Thus as a first step I'd like you to go over the double-int stuff, compare it to the wide-int stuff you introduce, and point out differences (changing double-ints or wide-ints to whatever is the more general concept). Now, as for 'modes' - similar to signedness, some functions that operate on double-ints take a precision argument (like the various extensions).
You can add a similar wrapper type like double_sint, but this time with a cost - a new precision member - that can be constructed from a double_int (or wide_int) and ends up specifying the desired precision (be it in terms of a mode or a type). You didn't question my suggestion to have the number of HOST_WIDE_INTs in a wide-int be a compile-time constant - was that just an oversight on your side? The consequence is that code wanting to deal with arbitrary-length wide-ints needs to be a template. As for splitting the patch into smaller pieces, I am all for it. I have done this twice already, and I could get the const_scalar_int_p patch out quickly. But you do not get far along that path before you are still left with a big patch. I could split out wide-int.* and just commit those files with no clients as a first step. My guess is that Richard Sandiford would appreciate that, because while he has carefully checked the rtl stuff, I think that the code inside wide-int is not in his comfort zone of things he would approve. As far as your btw - noticed this
Adjust gcc.dg/lto/20120723_0.c for SPARC
As noted by Martin in http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00116.html, the testcase is invalid C and cannot pass on SPARC 32-bit because of the ABI. Tested on SPARC/Solaris 10, applied on the mainline. 2012-10-04 Eric Botcazou ebotca...@adacore.com * gcc.dg/lto/20120723_0.c: Skip on SPARC 32-bit. -- Eric Botcazou

Index: gcc.dg/lto/20120723_0.c
===
--- gcc.dg/lto/20120723_0.c (revision 192073)
+++ gcc.dg/lto/20120723_0.c (working copy)
@@ -1,7 +1,9 @@
 /* Make sure that by reference and by value aggregate jump functions do not get
-   mixed up.  */
+   mixed up.
+   ??? This testcase is invalid C and can only pass on specific platforms.  */
 /* { dg-lto-do run } */
-/* { dg-lto-options {{-O3 -fno-early-inlining -flto}} } */
+/* { dg-skip-if { { sparc*-*-* } && ilp32 } { "*" } { "" } } */
+/* { dg-lto-options { {-O3 -fno-early-inlining -flto}} } */
 extern void abort (void);
Re: [PATCH] Improve var-tracking memory disambiguation with frame pointer (PR debug/54796)
On Thu, Oct 04, 2012 at 06:42:47PM +0200, Steven Bosscher wrote: On Thu, Oct 4, 2012 at 6:33 PM, Jakub Jelinek wrote: This patch fixes a few FAILs in the ix86 guality testsuite (mainly -Os) by better disambiguating sp-based VALUEs (which usually have no MEM_EXPR, and thus the alias oracle can't be used for them) from frame-pointer-based ones or global vars. Does this also help for some of the var-tracking compile-time-hog bugs? ISTR PR53958 was one with very long chains of sp-based values... Very unlikely. That PR was reported against 4.7, which doesn't have clobber_overlapping_mems at all; the patch could only affect compile times if there were huge chains of sp-based values and find_base_term appeared anywhere significantly in the profiles. That PR seems to create such huge VTA hash tables that var-tracking gives up; unfortunately, the current give-up heuristics don't estimate the compile time and memory needed accurately enough, and lowering the default of the limit (which is a param) would on the other hand make sanely sized programs trigger the no-VTA path way too often. Jakub
Re: [Patch, Fortran] Fix some memory leaks
On Thu, Oct 4, 2012 at 7:06 PM, Tobias Burnus wrote: This patch fixes some memory leaks and other issues found by http://scan5.coverity.com. Build and regtested on x86-64-linux. OK for the trunk? Yes, thanks for plugging these! Some of them have been there since day 0 :-) Ciao! Steven
Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s
On Thu, Oct 04, 2012 at 06:51:35PM +0300, Laurynas Biveinis wrote: 2012-10-03 Basile Starynkevitch bas...@starynkevitch.net * gengtype.c (walk_type): Emit mark_hook when inside a struct of a union member. This is OK. thanks, Committed revision 192092 to trunk. I believe this patch should be backported into GCC 4.7 and 4.6 Regards. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***
Re: [lra] patch to solve most scalability problems for LRA
On Thu, Oct 4, 2012 at 7:07 PM, Steven Bosscher stevenb@gmail.com wrote: On Thu, Oct 4, 2012 at 5:37 PM, Vladimir Makarov vmaka...@redhat.com wrote: The only issue now is PR54146 compilation time for IRA+LRA, although it was improved significantly. I will continue work on PR54146. But now I am going to focus on proposals from reviews. Right, there still are opportunities to improve things. (The real solution may be to stop SRA from creating so many simultaneously live pseudos in the first place...)

+  lra_simple_p
+    = (ira_use_lra_p && max_reg_num () >= (1 << 26) / last_basic_block);

I think you should use n_basic_blocks here instead of last_basic_block, in case this runs without compacting the cfg first (n_basic_blocks is the real number of basic blocks in the cfg, last_basic_block is the highest index, so last_basic_block >= n_basic_blocks). I also noticed that switching to IRA_REGION_ONE improves things when we have a large number of loops (profile points to some loop code in IRA). Note that the magic number above should be a new --param, and once we have a diagnostic flag that shows whenever we back off like this, it should notify the user of that fact (and the params we have overflown) - this just reminded me of that idea from somebody else ;) Thanks for working on this! Indeed ;) It, btw, also applies to IRA + reload ... Richard. Ciao! Steven
Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s
On Thu, Oct 4, 2012 at 7:24 PM, Basile Starynkevitch bas...@starynkevitch.net wrote: On Thu, Oct 04, 2012 at 06:51:35PM +0300, Laurynas Biveinis wrote: 2012-10-03 Basile Starynkevitch bas...@starynkevitch.net * gengtype.c (walk_type): Emit mark_hook when inside a struct of a union member. This is OK. thanks, Committed revision 192092 to trunk. I believe this patch should be backported into GCC 4.7 and 4.6 I see no reason for this unless it is a regression. Richard.
Re: [PATCH] Teach VRP to handle if ((unsigned_narrowing_cast) x != 0) similarly to if ((x & 0xffff) != 0) (PR tree-optimization/54810)
On Thu, Oct 4, 2012 at 6:31 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! This patch handles unsigned narrowing casts the same as BIT_AND_EXPR with the unsigned narrow type's max value. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Thanks, Richard. 2012-10-04 Jakub Jelinek ja...@redhat.com PR tree-optimization/54810 * tree-vrp.c (register_edge_assert_for_2): Handle NAME = (unsigned) NAME2; if (NAME cmp CST) for narrowing casts to unsigned integral type like NAME = NAME2 & CST2; if (NAME cmp CST) where CST2 is the max value of the unsigned integral type.

--- gcc/tree-vrp.c.jj	2012-09-25 14:45:48.0 +0200
+++ gcc/tree-vrp.c	2012-10-04 11:43:32.334988401 +0200
@@ -4712,6 +4712,11 @@ register_edge_assert_for_2 (tree name, e
   tree val2 = NULL_TREE;
   double_int mask = double_int_zero;
   unsigned int prec = TYPE_PRECISION (TREE_TYPE (val));
+  unsigned int nprec = prec;
+  enum tree_code rhs_code = ERROR_MARK;
+
+  if (is_gimple_assign (def_stmt))
+    rhs_code = gimple_assign_rhs_code (def_stmt);
 
   /* Add asserts for NAME cmp CST and NAME being defined
      as NAME = (int) NAME2.  */
@@ -4721,7 +4726,7 @@ register_edge_assert_for_2 (tree name, e
       && gimple_assign_cast_p (def_stmt))
     {
       name2 = gimple_assign_rhs1 (def_stmt);
-      if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt))
+      if (CONVERT_EXPR_CODE_P (rhs_code)
	  && INTEGRAL_TYPE_P (TREE_TYPE (name2))
	  && TYPE_UNSIGNED (TREE_TYPE (name2))
	  && prec == TYPE_PRECISION (TREE_TYPE (name2))
@@ -4767,8 +4772,7 @@ register_edge_assert_for_2 (tree name, e
      NAME = NAME2 >> CST2.
 
      Extract CST2 from the right shift.  */
-  if (is_gimple_assign (def_stmt)
-      && gimple_assign_rhs_code (def_stmt) == RSHIFT_EXPR)
+  if (rhs_code == RSHIFT_EXPR)
     {
       name2 = gimple_assign_rhs1 (def_stmt);
       cst2 = gimple_assign_rhs2 (def_stmt);
@@ -4840,21 +4844,37 @@ register_edge_assert_for_2 (tree name, e
   /* Add asserts for NAME cmp CST and NAME being defined as
      NAME = NAME2 & CST2.
 
-     Extract CST2 from the and.  */
+     Extract CST2 from the and.
+
+     Also handle
+     NAME = (unsigned) NAME2;
+     casts where NAME's type is unsigned and has smaller precision
+     than NAME2's type as if it was
+     NAME = NAME2 & MASK.  */
   names[0] = NULL_TREE;
   names[1] = NULL_TREE;
   cst2 = NULL_TREE;
-  if (is_gimple_assign (def_stmt)
-      && gimple_assign_rhs_code (def_stmt) == BIT_AND_EXPR)
+  if (rhs_code == BIT_AND_EXPR
+      || (CONVERT_EXPR_CODE_P (rhs_code)
+	  && TREE_CODE (TREE_TYPE (val)) == INTEGER_TYPE
+	  && TYPE_UNSIGNED (TREE_TYPE (val))
+	  && TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs1 (def_stmt)))
+	     > prec
+	  && !retval))
     {
       name2 = gimple_assign_rhs1 (def_stmt);
-      cst2 = gimple_assign_rhs2 (def_stmt);
+      if (rhs_code == BIT_AND_EXPR)
+	cst2 = gimple_assign_rhs2 (def_stmt);
+      else
+	{
+	  cst2 = TYPE_MAX_VALUE (TREE_TYPE (val));
+	  nprec = TYPE_PRECISION (TREE_TYPE (name2));
+	}
       if (TREE_CODE (name2) == SSA_NAME
	  && INTEGRAL_TYPE_P (TREE_TYPE (name2))
	  && TREE_CODE (cst2) == INTEGER_CST
	  && !integer_zerop (cst2)
-	  && prec <= HOST_BITS_PER_DOUBLE_INT
-	  && (prec > 1
+	  && nprec <= HOST_BITS_PER_DOUBLE_INT
+	  && (nprec > 1
	      || TYPE_UNSIGNED (TREE_TYPE (val))))
	{
	  gimple def_stmt2 = SSA_NAME_DEF_STMT (name2);
@@ -4881,12 +4901,12 @@ register_edge_assert_for_2 (tree name, e
	  bool valid_p = false, valn = false, cst2n = false;
	  enum tree_code ccode = comp_code;
-	  valv = tree_to_double_int (val).zext (prec);
-	  cst2v = tree_to_double_int (cst2).zext (prec);
+	  valv = tree_to_double_int (val).zext (nprec);
+	  cst2v = tree_to_double_int (cst2).zext (nprec);
	  if (!TYPE_UNSIGNED (TREE_TYPE (val)))
	    {
-	      valn = valv.sext (prec).is_negative ();
-	      cst2n = cst2v.sext (prec).is_negative ();
+	      valn = valv.sext (nprec).is_negative ();
+	      cst2n = cst2v.sext (nprec).is_negative ();
	    }
	  /* If CST2 doesn't have most significant bit set,
	     but VAL is negative, we have comparison like  */
	  if (!cst2n && valn)
	    ccode = ERROR_MARK;
	  if (cst2n)
-	    sgnbit = double_int_one.llshift (prec - 1, prec).zext (prec);
+	    sgnbit = double_int_one.llshift (nprec - 1, nprec).zext (nprec);
Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s
On Thu, Oct 04, 2012 at 07:26:23PM +0200, Richard Guenther wrote: On Thu, Oct 4, 2012 at 7:24 PM, Basile Starynkevitch bas...@starynkevitch.net wrote: On Thu, Oct 04, 2012 at 06:51:35PM +0300, Laurynas Biveinis wrote: 2012-10-03 Basile Starynkevitch bas...@starynkevitch.net * gengtype.c (walk_type): Emit mark_hook when inside a struct of a union member. This is OK. thanks, Committed revision 192092 to trunk. I believe this patch should be backported into GCC 4.7 and 4.6 I see no reason for this unless it is a regression. If GCC 4.7 will have future micro releases (like a hypothetical 4.7.3), they will have the same bug. What is the procedure to get this bug fixed in 4.7.3? (And there are plugins for 4.7 affected by this bug, http://gcc-melt.org/ for example.) Regards. -- Basile STARYNKEVITCH
Re: PING^2: [patch] pr/54508: fix incomplete debug information for class
 /* We also have to mark its parents as used.
-    (But we don't want to mark our parents' kids due to this.)  */
+    (But we don't want to mark our parent's kids due to this,
+    unless it is a class.)  */
   if (die->die_parent)
-    prune_unused_types_mark (die->die_parent, 0);
+    prune_unused_types_mark (die->die_parent,
+			     (die->die_parent->die_tag == DW_TAG_class_type
+			      || die->die_parent->die_tag == DW_TAG_structure_type
+			      || die->die_parent->die_tag == DW_TAG_union_type));

I'd suggest replacing these conditions with a call to class_scope_p(). That will also cover DW_TAG_interface_type, which might be irrelevant for this particular case, but is probably good to cover in the general case. -cary
Re: [patch][lra] Improve initial program point density in lra-lives.c
On 10/04/2012 12:56 PM, Steven Bosscher wrote: On Thu, Oct 4, 2012 at 6:12 PM, Vladimir Makarov vmaka...@redhat.com wrote: 0.6% sounds really very different from my timings. How much time does create_start_finish_chains take for you? 0.65% (2.78s). Actually, I have a profile but I am not sure now that it is for PR54146. It might be for PR26854. I'll check it again to be sure. No, it looks about the same. Well, that's very strange. Maybe we measure these things differently? I just hi-jack a timevar, so I measure e.g. the time spent in create_start_finish_chains like so:

Index: lra-lives.c
===
--- lra-lives.c (revision 192052)
+++ lra-lives.c (working copy)
@@ -770,6 +812,7 @@ create_start_finish_chains (void)
   int i, max_regno;
   lra_live_range_t r;
 
+  timevar_push (TV_CPROP);
   lra_start_point_ranges = XCNEWVEC (lra_live_range_t, lra_live_max_point);
   lra_finish_point_ranges = XCNEWVEC (lra_live_range_t, lra_live_max_point);
   max_regno = max_reg_num ();
@@ -783,6 +826,7 @@ create_start_finish_chains (void)
	  lra_finish_point_ranges[r->finish] = r;
	}
    }
+  timevar_pop (TV_CPROP);
 }
 
 /* Rebuild LRA_START_POINT_RANGES and LRA_FINISH_POINT_RANGES after

so that I get the timings in the -ftime-report like so:

 CPROP                   :  43.14 ( 4%) usr
 integrated RA           : 200.81 (17%) usr
 LRA non-specific        :  62.18 ( 5%) usr
 LRA virtuals elimination:  61.71 ( 5%) usr
 LRA reload inheritance  :   6.41 ( 1%) usr
 LRA create live ranges  : 139.75 (13%) usr
 LRA hard reg assignment : 130.90 (11%) usr
 LRA coalesce pseudo regs:   2.45 ( 0%) usr
 reload                  :   9.09 ( 1%) usr

Crude, but efficient (tm) :-) How do you measure the time spent in that function, and in remove_some_program_points_and_update_live_ranges? You use AMD and I use Intel, so it may differ from a cache point of view. Another thing is that I used gprof (-pg was used for bitmap.o, lra*.o and ira*.o). Your measurements are more accurate, I think, because they are without instrumentation, and bitmap.o takes too much time.
Bitmaps do not work well in this case because they are too big and sparse.
Re: Convert more non-GTY htab_t to hash_table.
On 10/4/12, Richard Guenther rguent...@suse.de wrote: On Tue, 2 Oct 2012, Lawrence Crowl wrote: On 10/2/12, Richard Guenther rguent...@suse.de wrote: On Mon, 1 Oct 2012, Lawrence Crowl wrote: Change more non-GTY hash tables to use the new type-safe template hash table. Constify member function parameters that can be const. Correct a couple of expressions in formerly uninstantiated templates. The new code is 0.362% faster in bootstrap, with a 99.5% confidence of being faster. Tested on x86-64. Okay for trunk? You are changing a hashtable used by fold checking, did you test with fold checking enabled? I didn't know I had to do anything beyond the normal make check. What do I do?

+/* Data structures used to maintain mapping between basic blocks and
+   copies.  */
+static hash_table <bb_copy_hasher> bb_original;
+static hash_table <bb_copy_hasher> bb_copy;

note that because hash_table has a constructor we now get global CTORs for all statics :( (and mx-protected local inits ...) The overhead for the global constructors isn't significant. Only the function-local statics have mx-protection, and that can be eliminated by making them global static. Can you please try to remove the constructor from hash_table to avoid this overhead? (as a followup - that is, don't initialize htab) The initialization avoids potential errors in calling dispose. I can do it, but I don't think the overhead (after moving the function-local statics to global) will matter, and so I prefer to keep the safety. So is the move of the statics sufficient or do you still want to remove constructors? Hm, having them in-scope where they are used is good style. Why can't they be statically initialized and put in .data? Please make it so - you know C++ enough (ISTR value-initialization is default - which means NULL for the pointer?) Zero initialization is default for static variables, but not for local or heap variables.
We can live with the uninitialized memory in some cases, and add another function to explicitly null the member in the rest of the cases. I am not convinced that extra coding is worth the performance difference, particularly as I do not expect that difference to be measureable. However we decide here, I think that work should be a separate patch, as it will certainly touch more files than the current patch. So, can we separate the issue? Richard. The cfg.c, dse.c and hash-table.h parts are ok for trunk, I'll leave the rest to respective maintainers of the pieces of the compiler. Thanks, Richard. Index: gcc/java/ChangeLog 2012-10-01 Lawrence Crowl cr...@google.com * Make-lang.in (JAVA_OBJS): Add dependence on hash-table.o. (JCFDUMP_OBJS): Add dependence on hash-table.o. (jcf-io.o): Add dependence on hash-table.h. * jcf-io.c (memoized_class_lookups): Change to use type-safe hash table. Index: gcc/c/ChangeLog 2012-10-01 Lawrence Crowl cr...@google.com * Make-lang.in (c-decl.o): Add dependence on hash-table.h. * c-decl.c (detect_field_duplicates_hash): Change to new type-safe hash table. Index: gcc/objc/ChangeLog 2012-10-01 Lawrence Crowl cr...@google.com * Make-lang.in (OBJC_OBJS): Add dependence on hash-table.o. (objc-act.o): Add dependence on hash-table.h. * objc-act.c (objc_detect_field_duplicates): Change to new type-safe hash table. Index: gcc/ChangeLog 2012-10-01 Lawrence Crowl cr...@google.com * Makefile.in (fold-const.o): Add depencence on hash-table.h. (dse.o): Likewise. (cfg.o): Likewise. * fold-const.c (fold_checksum_tree): Change to new type-safe hash table. * (print_fold_checksum): Likewise. * cfg.c (var bb_original): Likewise. * (var bb_copy): Likewise. * (var loop_copy): Likewise. * hash-table.h (template hash_table): Constify parameters for find... and remove_elt... member functions. (hash_table::empty) Correct size expression. (hash_table::clear_slot) Correct deleted entry assignment. * dse.c (var rtx_group_table): Change to new type-safe hash table. 
Index: gcc/cp/ChangeLog 2012-10-01 Lawrence Crowl cr...@google.com * Make-lang.in (class.o): Add dependence on hash-table.h. (tree.o): Likewise. (semantics.o): Likewise. * class.c (fixed_type_or_null): Change to new type-safe hash table. * tree.c (verify_stmt_tree): Likewise. (verify_stmt_tree_r): Likewise. * semantics.c (struct nrv_data): Likewise. Index: gcc/java/Make-lang.in === --- gcc/java/Make-lang.in (revision 191941) +++ gcc/java/Make-lang.in (working copy) @@ -83,10 +83,10 @@ JAVA_OBJS = java/class.o java/decl.o jav java/zextract.o java/jcf-io.o java/win32-host.o java/jcf-parse.o java/mangle.o \ java/mangle_name.o java/builtins.o java/resource.o \ java/jcf-depend.o \
Re: patch to fix constant math
On 10/04/2012 12:58 PM, Richard Guenther wrote: On Thu, Oct 4, 2012 at 3:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: Let me talk about the mode here first. What this interface/patch provides is a facility where the constant math that is done in optimizations is done exactly the way that it would be done on the target machine. What we have now is a compiler that only does this if it is convenient to do on the host. I admit that I care about this more than others right now, but if Intel adds a couple more instructions to their vector units, other people will start to really care about this issue. If you take an OImode value with the current compiler and left shift it by 250, the middle end will say that the result is 0. This is just wrong!!! What this means is that the bitsize and precision of the operations need to be carried along when doing math. When wide-int checks for overflow on the multiply or add, it is not checking whether the value overflowed in two HWIs, it is checking whether the add overflowed in the mode of the types that are represented on the target. When we do a shift, we are not doing a shift within two HWIs, we are truncating the shift value (if this is appropriate) according to the bitsize and shifting according to the precision. I think that an argument could be made that storing the mode should be changed to an explicit precision and bitsize. (A possible other option would be to store a tree type, but this would make the usage at the rtl level very cumbersome since types are rare.) Aside from the work, you would not get much push back. But the signedness is a different argument. At the rtl level, the signedness is a matter of context. (You could argue that this is a mistake and I would agree, but that is an even bigger change.) But more to the point, at the tree level, there are a surprising number of places where the operation desired does not follow the sign of the types that were used to construct the constants.
Furthermore, not carrying the sign is more consistent with the double-int code, which as you point out carries nothing. Well, on RTL the signedness is on the operation (you have sdiv and udiv, etc.). Yes, there is a complete enough set of operations to allow you to specify the signedness where this matters. double-int tries to present a sign-less twos-complement entity of size 2 * HOST_BITS_PER_WIDE_INT. I think that is sensible and for obvious reasons should not change. Both tree and RTL rely on this. What we do not want is that up to TImode you get an internal representation done one way (twos-complement) and on OImode and larger you suddenly get subtly different behavior. That's a recipe for disaster. This is the main difference between double-int and wide-int. Wide-int does the math the way the machine does it or the way the front end would expect it to be done. There is nothing about the host that is visible in the interfaces. I reiterate, our world is already bigger than 128 bits, and the intel world is likely to be soon. Double-int is stuck in a 64/128-bit world. These patches, which I admit are huge, are a way out of that box. I'd like to clean up the interface to double-int some more (now with the nice C++ stuff we have). double-int should be pure twos-complement; there should be no operations on double-ints that behave differently when done signed or unsigned, instead we have signed and unsigned versions of the operations (similar to how signedness is handled on the RTL level). With some trivial C++ fu you could have a double_sint and double_uint type that would get rid of the bool sign params we have to some functions (and then you could write double_sint n using operator notation). The problem is that size does matter. Wide-int is effectively infinite-precision twos-complement. In practice, we can get by by just looking at the bitsize and precision of the types/modes involved, and this makes the implementation faster than true infinite precision.
I went down the road trying to fix all of the places where the compiler either iced or got the wrong answer. I showed this to Sandiford and he talked me out of it. He was right, it was a rat hole. It could have been a smaller patch, but there were places where it was clearly going to take monumental work just to be able to back out and say that you had nothing. The number of places in the compiler where you compare against the largest and smallest representation of an integer is not small, and some of them are buried very deep down chains that were not designed to say "I cannot answer that question." I believe that I have all of the functionality of double-int in wide-int; it is just that the calls look different because there are not all of the interfaces that take two HWIs. As mentioned before, all of the places where the overflow is computed for the purpose of asking if this is ok in two HWIs are gone. I'd like wide-int (whatever
Re: Use conditional casting with symtab_node
On Thu, Oct 4, 2012 at 2:14 PM, Lawrence Crowl cr...@googlers.com wrote: So, Jan Hubicka requested and approved the current spelling. What now? I don't think we should hold this up. The names Jan requested seem reasonable enough. We seem to be running in circles here. Diego.
Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s
On Wed, Oct 03, 2012 at 01:02:44PM +0200, Basile Starynkevitch wrote: So I applied and I am proposing the following patch to gcc trunk 192031 (Laurynas, I did take your remarks into account) # patch to trunk

Index: gcc/gengtype.c
===
--- gcc/gengtype.c	(revision 192031)
+++ gcc/gengtype.c	(working copy)
@@ -2810,6 +2810,7 @@ walk_type (type_p t, struct walk_type_data *d)
	const char *oldval = d->val;
	const char *oldprevval1 = d->prev_val[1];
	const char *oldprevval2 = d->prev_val[2];
+	const char *struct_mark_hook = NULL;
	const int union_p = t->kind == TYPE_UNION;
	int seen_default_p = 0;
	options_p o;
@@ -2833,7 +2834,14 @@ walk_type (type_p t, struct walk_type_data *d)
	  if (!desc && strcmp (o->name, "desc") == 0
	      && o->kind == OPTION_STRING)
	    desc = o->info.string;
+	  else if (!struct_mark_hook && strcmp (o->name, "mark_hook") == 0
+		   && o->kind == OPTION_STRING)
+	    struct_mark_hook = o->info.string;
+	if (struct_mark_hook)
+	  oprintf (d->of, "%*s%s (%s));\n",
+		   d->indent, "", struct_mark_hook, oldval);

Sorry for the typo, the patch should have only one closing parenthesis. So I'm applying the following patch to trunk to correct it.

Index: gengtype.c
===
--- gengtype.c	(revision 192094)
+++ gengtype.c	(working copy)
@@ -2839,7 +2839,7 @@
	    struct_mark_hook = o->info.string;
	if (struct_mark_hook)
-	  oprintf (d->of, "%*s%s (%s));\n",
+	  oprintf (d->of, "%*s%s (%s);\n",
		   d->indent, "", struct_mark_hook, oldval);
	d->prev_val[2] = oldval;

I hope that correcting such a typo falls into the obvious patch rule, so I dare committing it right now, reusing the same ChangeLog entry. % svn commit gcc/gengtype.c Sending gcc/gengtype.c Transmitting file data . Committed revision 192095. If you feel it is wrong to correct such a typo without asking, I'll revert this obvious commit. Apologies for the typo. Cheers. -- Basile STARYNKEVITCH
Re: PING^2: [patch] pr/54508: fix incomplete debug information for class
On Oct 4, 2012, at 1:38 PM, Cary Coutant wrote:

 /* We also have to mark its parents as used.
-    (But we don't want to mark our parents' kids due to this.)  */
+    (But we don't want to mark our parent's kids due to this,
+    unless it is a class.)  */
   if (die->die_parent)
-    prune_unused_types_mark (die->die_parent, 0);
+    prune_unused_types_mark (die->die_parent,
+			     (die->die_parent->die_tag == DW_TAG_class_type
+			      || die->die_parent->die_tag == DW_TAG_structure_type
+			      || die->die_parent->die_tag == DW_TAG_union_type));

I'd suggest replacing these conditions with a call to class_scope_p(). That will also cover DW_TAG_interface_type, which might be irrelevant for this particular case, but is probably good to cover in the general case. -cary

Thanks, that makes it very simple. Here is the updated patch. Ok to commit with this change? paul

ChangeLog: 2012-10-04 Paul Koning n...@arrl.net * dwarf2out.c (prune_unused_types_mark): Mark all of parent's children if parent is a class. testsuite/ChangeLog: 2012-10-04 Paul Koning n...@arrl.net * g++.dg/debug/dwarf2/pr54508.C: New.

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c	(revision 192048)
+++ gcc/dwarf2out.c	(working copy)
@@ -21035,9 +21035,11 @@
   prune_unused_types_mark_generic_parms_dies (die);
 
   /* We also have to mark its parents as used.
-     (But we don't want to mark our parents' kids due to this.)  */
+     (But we don't want to mark our parent's kids due to this,
+     unless it is a class.)  */
   if (die->die_parent)
-    prune_unused_types_mark (die->die_parent, 0);
+    prune_unused_types_mark (die->die_parent,
+			     class_scope_p (die->die_parent));
 
   /* Mark any referenced nodes.
*/
   prune_unused_types_walk_attribs (die);

Index: testsuite/g++.dg/debug/dwarf2/pr54508.C
===
--- testsuite/g++.dg/debug/dwarf2/pr54508.C	(revision 0)
+++ testsuite/g++.dg/debug/dwarf2/pr54508.C	(revision 0)
@@ -0,0 +1,67 @@
+// PR debug/54508
+// { dg-do compile }
+// { dg-options "-g2 -dA" }
+
+// { dg-final { scan-assembler "\"cbase0\"\[ \t\]+\[#;/!|@\]+ DW_AT_name\|DW_AT_name: \"cbase\"" } }
+// { dg-final { scan-assembler "\"OPCODE0\"\[ \t\]+\[#;/!|@\]+ DW_AT_name\|DW_AT_name: \"OPCODE\"" } }
+// { dg-final { scan-assembler "\"bi0\"\[ \t\]+\[#;/!|@\]+ DW_AT_name" } }
+// { dg-final { scan-assembler "\"si0\"\[ \t\]+\[#;/!|@\]+ DW_AT_name" } }
+// { dg-final { scan-assembler "\"f10\"\[ \t\]+\[#;/!|@\]+ DW_AT_name" } }
+// { dg-final { scan-assembler "\"f20\"\[ \t\]+\[#;/!|@\]+ DW_AT_name" } }
+// { dg-final { scan-assembler-not "\"nc0\"\[ \t\]+\# DW_AT_name\|DW_AT_name: \"nc\"" } }
+
+class cbase
+
+{
+public:
+  static int si;
+  int bi;
+};
+
+class c : public cbase
+
+{
+public:
+  enum
+  {
+    OPCODE = 251
+  };
+  int i ;
+  static const char *testc (void) { return "foo"; }
+};
+
+struct s
+{
+  int f1;
+  static const char *tests (void) { return "test"; }
+};
+
+union u
+{
+  int f2;
+  double d;
+  static const char *testu (void) { return "test union"; }
+};
+
+namespace n
+{
+  const char *ntest (void) { return "test n"; }
+
+  class nc
+  {
+  public:
+    int i;
+    static int sj;
+  };
+}
+
+extern void send (int, int, const void *, int);
+
+void test (int src)
+{
+  int cookie = 1;
+  send(src, c::OPCODE, c::testc (), cookie);
+  send(src, c::OPCODE, s::tests (), cookie);
+  send(src, c::OPCODE, u::testu (), cookie);
+  send(src, c::OPCODE, n::ntest (), cookie);
+}
Re: [libbacktrace] Fix warning as error in btest.c and hence `make test`
On Thu, Oct 4, 2012 at 12:03 PM, Gerald Pfeifer ger...@pfeifer.com wrote: Running `make test` on my nightly testers I noticed the following in the log files: cc1: warnings being treated as errors /scratch/tmp/gerald/gcc-HEAD/libbacktrace/btest.c: In function 'f23': /scratch/tmp/gerald/gcc-HEAD/libbacktrace/btest.c:476: warning: 'expected' may be used uninitialized in this function gmake[3]: *** [btest-btest.o] Error 1 gmake[3]: Target `btest' not remade because of errors. Doesn't this mean that effectively the test is not executed? I don't think so, not sure what you are getting at here. I'm not sure why I even wrote case 3:, j can never take on the value 3. 2012-10-04 Gerald Pfeifer ger...@pfeifer.com * btest.c (f23): Avoid uninitialized variable warning. This is OK. Thanks. Ian
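The fix discussed above silences a "may be used uninitialized" warning by removing an unreachable case. As a sketch (a hypothetical reduction, not the actual btest.c code), the pattern and the warning-free variant look like this:

```cpp
// Hypothetical reduction of the btest.c situation: "expected" is only
// assigned inside a switch the compiler cannot prove exhaustive, so older
// GCCs warn "'expected' may be used uninitialized in this function".
// Giving it an explicit initializer (and dropping the dead "case 3:")
// removes the warning without changing behavior.
static int
expected_for (int j)
{
  int expected = 0;     // explicit init: the warning-free variant
  switch (j % 3)        // j % 3 can only be 0, 1, or 2
    {
    case 0: expected = 10; break;
    case 1: expected = 20; break;
    case 2: expected = 30; break;
    }                   // no unreachable "case 3:" needed
  return expected;
}
```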
[PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self
Hi All, Currently the gcc-{ar,nm,ranlib} utilities assume that binutils is in the PATH when invoking the wrapped binutils program. This goes against the accepted practice in GCC of finding sub-programs relative to where the GCC binaries are stored and not making assumptions about the PATH. This patch changes the gcc-{ar,nm,ranlib} utilities to do the same by factoring out some utility code for finding files from collect2.c. These functions are then leveraged to find the binutils programs. Note that similar code exists in gcc.c. Perhaps one day everything can be merged to the file-find files. Tested for Windows and GNU/Linux hosts and i686-pc-linux-gnu and arm-none-eabi targets. OK? P.S. I am not quite sure what is best for the copyrights and contributed-by comments in the file-find* files I added since that code was just moved. This patch drops the contributed-by and keeps all the copyright dates from collect2.c. 2012-10-04 Meador Inge mead...@codesourcery.com * collect2.c (main): Call find_file_set_debug. (find_a_file, add_prefix, prefix_from_env, prefix_from_string): Factor out into ... * file-find.c (New file): ... here and ... * file-find.h (New file): ... here. * gcc-ar.c (standard_exec_prefix): New variable. (standard_libexec_prefix): Ditto. (tooldir_base_prefix): Ditto. (self_exec_prefix): Ditto. (self_libexec_prefix): Ditto. (self_tooldir_prefix): Ditto. (target_version): Ditto. (path): Ditto. (target_path): Ditto. (setup_prefixes): New function. (main): Rework how wrapped programs are found. * Makefile.in (OBJS-libcommon-target): Add file-find.o. (AR_OBJS): New variable. (gcc-ar$(exeext)): Add dependency on $(AR_OBJS). (gcc-nm$(exeext)): Ditto. (gcc-ranlib$(exeext)): Ditto. (COLLECT2_OBJS): Add file-find.o. (collect2.o): Add file-find.h prerequisite. (file-find.o): New rule. 
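The relocation the patch relies on is what libiberty's make_relative_prefix() performs. A much-simplified sketch of the idea (not libiberty's actual implementation; all concrete paths are hypothetical examples):

```cpp
#include <string>

// Given where the running binary actually lives and the configure-time
// bindir and libexec prefixes, re-root the libexec prefix next to the
// real installation so the wrapped 'ar' is found without consulting PATH.
static std::string
relocate_prefix (const std::string &self,        /* /opt/gcc/bin/gcc-ar */
                 const std::string &std_bindir,  /* /usr/local/bin */
                 const std::string &std_prefix)  /* /usr/local/libexec */
{
  /* Strip the program name to get the directory gcc-ar runs from.  */
  std::string::size_type slash = self.rfind ('/');
  std::string::size_type root = std_bindir.rfind ('/');
  if (slash == std::string::npos || root == std::string::npos)
    return std_prefix;                            /* fall back to default */

  std::string actual_bindir = self.substr (0, slash);  /* /opt/gcc/bin */
  std::string bin_suffix = std_bindir.substr (root);   /* "/bin" */
  std::string std_root = std_bindir.substr (0, root);  /* /usr/local */

  /* The actual bindir must end in the same "/bin" component, and the
     configured prefix must live under the configured root.  */
  if (actual_bindir.size () > bin_suffix.size ()
      && actual_bindir.compare (actual_bindir.size () - bin_suffix.size (),
                                bin_suffix.size (), bin_suffix) == 0
      && std_prefix.compare (0, std_root.size (), std_root) == 0)
    {
      std::string actual_root
        = actual_bindir.substr (0, actual_bindir.size () - bin_suffix.size ());
      return actual_root + std_prefix.substr (std_root.size ());
    }
  return std_prefix;
}
```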
Index: gcc/gcc-ar.c === --- gcc/gcc-ar.c(revision 192099) +++ gcc/gcc-ar.c(working copy) @@ -21,21 +21,110 @@ #include config.h #include system.h #include libiberty.h +#include file-find.h #ifndef PERSONALITY #error Please set personality #endif +/* The exec prefix as derived at compile-time from --prefix. */ + +static const char standard_exec_prefix[] = STANDARD_EXEC_PREFIX; + +/* The libexec prefix as derived at compile-time from --prefix. */ + static const char standard_libexec_prefix[] = STANDARD_LIBEXEC_PREFIX; + +/* The bindir prefix as derived at compile-time from --prefix. */ + static const char standard_bin_prefix[] = STANDARD_BINDIR_PREFIX; -static const char *const target_machine = TARGET_MACHINE; +/* A relative path to be used in finding the location of tools + relative to this program. */ + +static const char *const tooldir_base_prefix = TOOLDIR_BASE_PREFIX; + +/* The exec prefix as relocated from the location of this program. */ + +static const char *self_exec_prefix; + +/* The libexec prefix as relocated from the location of this program. */ + +static const char *self_libexec_prefix; + +/* The tools prefix as relocated from the location of this program. */ + +static const char *self_tooldir_prefix; + +/* The name of the machine that is being targeted. */ + +static const char *const target_machine = DEFAULT_TARGET_MACHINE; + +/* The target version. */ + +static const char *const target_version = DEFAULT_TARGET_VERSION; + +/* The collection of target specific path prefixes. */ + +static struct path_prefix target_path; + +/* The collection path prefixes. */ + +static struct path_prefix path; + +/* The directory separator. */ + static const char dir_separator[] = { DIR_SEPARATOR, 0 }; +static void +setup_prefixes (const char *exec_path) +{ + const char *self; + + self = getenv (GCC_EXEC_PREFIX); + if (!self) +self = exec_path; + else +self = concat (self, gcc- PERSONALITY, NULL); + + /* Relocate the exec prefix. 
*/ + self_exec_prefix = make_relative_prefix (self, + standard_bin_prefix, + standard_exec_prefix); + if (self_exec_prefix == NULL) +self_exec_prefix = standard_exec_prefix; + + /* Relocate libexec prefix. */ + self_libexec_prefix = make_relative_prefix (self, + standard_bin_prefix, + standard_libexec_prefix); + if (self_libexec_prefix == NULL) +self_libexec_prefix = standard_libexec_prefix; + + + /* Build the relative path to the target-specific tool directory. */ + self_tooldir_prefix = concat (tooldir_base_prefix, target_machine, + dir_separator, NULL); + self_tooldir_prefix = concat (self_exec_prefix, target_machine, + dir_separator,
Re: Propagate profile counts during switch expansion
Hi Honza, I am addressing some of the questions you raise here. Will send an updated patch later. On Thu, Oct 4, 2012 at 6:19 AM, Jan Hubicka hubi...@ucw.cz wrote: @@ -560,7 +577,6 @@ compute_outgoing_frequencies (basic_block b) return; } } - if (single_succ_p (b)) { e = single_succ_edge (b); @@ -568,7 +584,10 @@ compute_outgoing_frequencies (basic_block b) e->count = b->count; return; } - guess_outgoing_edge_probabilities (b); + else if (!gen_probabilities_from_existing_counts (b->succs)) { +/* All outgoing edges of B have zero count. Guess probabilities. */ +guess_outgoing_edge_probabilities (b); + } Hmm, I do not quite follow the logic here. Basic block B is one of many basic blocks that the original BB was split from. It is possible that B may have some of the original edges, but there may be new ones. How can you guess the outgoing probabilities here? Do you have an example? The code reaches here only when the BB has more than two successors. I assume that this happens only when the BB is terminated by a switch statement. During conversion to RTL, the outgoing edges of this BB and their counts are untouched. So I am just using those counts to compute the edge probabilities. Also gen_probabilities_from_existing_counts could probably also work based on original edge frequencies. Just to be clear I understand it right, you want me to use the frequency instead of count since the frequency is derived from the counts in profile use builds? + + default_edge->count = default_count; + if (count) +{ + edge e; + edge_iterator ei; + FOR_EACH_EDGE (e, ei, stmt_bb->succs) +e->probability = e->count * REG_BR_PROB_BASE / count; +} Hmm, this updates the original BB containing the switch statement? The out_edges of the original BB. Of course, modulo roundoff errors, this should hold. I wonder where you got the differences and why you need this? Originally, I obtained e->probability as e->count * REG_BR_PROB_BASE / bb->count. 
During profile bootstrap, I noticed BBs whose sum of outgoing edge counts exceeded the BB's count and hence I normalized with the sum of edge counts. You are going to output new control flow and find_many_sub_basic_blocks will recompute all the counts/frequencies inside anyway? Sorry, I am lost here. I am just updating the probability of stmt_bb->succs, right? @@ -2430,7 +2523,11 @@ emit_case_nodes (rtx index, case_node_ptr node, rt unsignedp), GT, NULL_RTX, mode, unsignedp, label_rtx (node->right->code_label)); - emit_case_nodes (index, node->left, default_label, index_type); + probability = case_probability (node->right->count, + subtree_count + default_count); + add_prob_note_to_last_insn (probability); + emit_case_nodes (index, node->left, default_label, default_count, + index_type); Hmm, here you seem to be distributing the probabilities of the counts reached. What happens in the case when the edge probability needs to be distributed across nodes of the decision tree? I.e. in t(int a) { switch (a) { case 100: case 200: case 300: case 400: case 500: case 600: case 700: case 800: case 900: t(); case 101: case 202: case 303: case 404: case 505: case 606: case 707: case 808: case 909: q(); } } Ok, that's a bug in my patch. In the above example, there are two outgoing edges from the BB containing the switch but many case nodes. I would end up multiplying the counts by the number of cases that lead to a label. If I divide the counts of each case_node by the number of case nodes that jump to the same label, will that be reasonable? In the above case, if t() is called 900 times, I will assume that each of the 9 cases contributes a count of 100. Thanks, Easwaran
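The even split Easwaran proposes can be sketched as below (a hypothetical helper, not from the patch): in Honza's example, a label reached 900 times through 9 case values would be credited 100 per case node instead of 900 per node.

```cpp
// Divide a label's edge count evenly among the case nodes that jump to
// it, so a shared label's count is not multiplied by the number of
// duplicate cases when the decision tree is emitted.
static long
per_case_count (long label_edge_count, int cases_to_label)
{
  if (cases_to_label <= 0)
    return 0;                       // degenerate: no cases reach the label
  return label_edge_count / cases_to_label;
}
```

This assumes each duplicate case is equally likely, which is the best available guess when the profile only records the label's total.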
Re: RFC: LRA for x86/x86-64 [0/9]
On Tue, Oct 2, 2012 at 3:14 AM, Vladimir Makarov vmaka...@redhat.com wrote: Analogous live ranges are used in IRA as an intermediate step to build a conflict graph. Right, ira-lives.c and lra-lives.c look very much alike; the only major difference is that the object of interest in an IRA live range is an ira_object_t, and in an LRA live range it's just a regno. But the code of create_start_finish_chains, ira_rebuild_start_finish_chains, remove_some_program_points_and_update_live_ranges, and most of the live range printing functions, is almost identical between lra-lives.c and ira-lives.c. That looks like unnecessary code duplication. Do you think some of that code could be shared (esp. in a C++ world where maybe a live range can be a container for another type)? Ciao! Steven
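Steven's container idea could look something like this sketch (illustrative names, not GCC's actual types): the live range becomes a template parameterized on the object of interest, so the chain-building and compression code need only be written once.

```cpp
// A live range parameterized on the "object of interest":
// Obj would be ira_object_t for IRA and a plain regno for LRA.
template <typename Obj>
struct live_range
{
  Obj object;
  int start, finish;      // program points covered, inclusive
  live_range *next;
};

// One shared implementation of the kind of predicate both files need:
// two ranges of the same object intersect when their point intervals touch.
template <typename Obj>
static bool
ranges_intersect_p (const live_range<Obj> &a, const live_range<Obj> &b)
{
  return a.start <= b.finish && b.start <= a.finish;
}

// Instantiation for the LRA-like case, where Obj is just a register number.
static bool
regno_ranges_intersect (int s1, int f1, int s2, int f2)
{
  live_range<int> a = { 0, s1, f1, nullptr };
  live_range<int> b = { 0, s2, f2, nullptr };
  return ranges_intersect_p (a, b);
}
```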
[BZ 50356] Obvious bugfix in h8300 bits
This is a pretty obvious typo/thinko. Interestingly enough, fixing it makes absolutely no difference in the generated code for newlib ;-) One could make an argument it should just be zapped, but I'm not going to open that can of worms today. Obviously I tested this by building newlib before and after this change :-) * PR target/50356 * config/h8300/h8300.c (h8300_rtx_costs): Fix typo in CONST_INT case. Index: config/h8300/h8300.c === --- config/h8300/h8300.c(revision 192102) +++ config/h8300/h8300.c(working copy) @@ -1,6 +1,6 @@ /* Subroutines for insn-output.c for Renesas H8/300. Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, - 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 + 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc. Contributed by Steve Chamberlain (s...@cygnus.com), Jim Wilson (wil...@cygnus.com), and Doug Evans (d...@cygnus.com). @@ -1244,7 +1244,7 @@ *total = 0; return true; } - if (-4 <= n || n <= 4) + if (-4 <= n && n <= 4) { switch ((int) n) {
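The bug in a nutshell, as a standalone demonstration: with ||, the "small constant" test accepts every n, because any integer is either >= -4 or <= 4, which is why the condition was vacuously true before the fix.

```cpp
// Old, buggy form: true for ALL n (every integer satisfies at least one side).
static bool small_or  (long n) { return -4 <= n || n <= 4; }

// Fixed form: true only for n in [-4, 4].
static bool small_and (long n) { return -4 <= n && n <= 4; }
```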
Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars
Hi all, Btw, note that we are using a double underscore scheme in other places (like __class, __vtab, __vtype, etc). I have even used an '@' in one place, namely (hidden) procedure pointer results (ppr@). Is there a need to unify all those cases? I think it would be useful to unify those. Are you volunteering? yeah, why not ;) Attached is a draft patch (not regtested), which adds a macro GFC_PREFIX (in gfortran.h) to prepend _F to the cases included in Tobias' earlier patch as well as the OOP-related stuff and procedure pointer results. It also bumps the module version. Any comments so far? (Of course the name of the macro can be debated. I just tried to keep it short for now.) Unfortunately my previous patch regressed on the proc_ptr_result test cases (due to problems with implicit typing of symbols with leading underscores, which also were the reason for using a suffix instead of a prefix for proc-ptr results in the first place). So I have taken out the 'ppr' parts, leaving only Tobias' original cases and the OOP stuff, which at least should be regression-free now. There are some more double-underscore cases which one could also change into the new _F convention. Should I keep going in this direction, or should we rather restrict this to the leading dot cases for now? I guess this is a question of how much ABI breaking we are willing to take. Opinions? Cheers, Janus mangling_v2.diff Description: Binary data
Re: patch to fix
On Wed, 3 Oct 2012, Kenneth Zadeck wrote: i have already converted the vrp code, so i have some guess at where you are talking about. (of course correct me if i am wrong). the code that computes the range when two variables are multiplied together needs to do a multiplication that produces a result that is twice as wide as the inputs. my library is able to do that with one catch (and this is a big catch): the target has to have an integer mode that is twice as big as the mode of the operands. The issue is that wide-ints actually carry around the mode of the value in order to get the bitsize and precision of the operands (it does not have the type, because this code has to work on both the rtl and tree level and i generally do not want the signedness anyway). Ah, after reading the whole thread, now I understand that it is because wide_int carries a mode that it makes little sense making it a template (sorry that it took me so long when the information was in your first answer). I understand that it would be inconvenient (much longer code) to have a base_wide_int that does just the arithmetic and a wrapper that contains the mode as well. Your idea below to define dummy extra modes does bring the template idea back to the table though ;-) my current code in vrp checks to see if such a mode exists and if it does, it produces the product. if the mode does not exist, it returns bottom. What this means is that for most (many or some) targets that have a TImode, the largest thing that particular vrp pass can discover ranges for is a DImode value. We could get around this by defining the next larger mode than what the target really needs but i wonder how much mileage you are going to get out of that with really large numbers. The current wrapping multiplication code in vrp works with a pair of double_int, so it should keep working with a pair of wide_int. I see now why wide_int doesn't allow simplifying the code, but it doesn't have to break. -- Marc Glisse
Re: PING^2: [patch] pr/54508: fix incomplete debug information for class
Updated patch: there were two existing testcases that needed to be adjusted because of this fix. Ran check RUNTESTFLAGS=dwarf2.exp, no regressions. paul ChangeLog: 2012-10-04 Paul Koning n...@arrl.net * dwarf2out.c (prune_unused_types_mark): Mark all of parent's children if parent is a class. testsuite/ChangeLog: 2012-10-04 Paul Koning n...@arrl.net * g++.dg/debug/dwarf2/pr54508.C: New. * g++.dg/debug/dwarf2/localclass1.C: Expect staticfn1, staticfn2, method1 in debug output. * g++.dg/debug/dwarf2/localclass2.C: Likewise. Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c (revision 192048) +++ gcc/dwarf2out.c (working copy) @@ -21035,9 +21035,11 @@ prune_unused_types_mark_generic_parms_dies (die); /* We also have to mark its parents as used. -(But we don't want to mark our parents' kids due to this.) */ +(But we don't want to mark our parent's kids due to this, +unless it is a class.) */ if (die->die_parent) - prune_unused_types_mark (die->die_parent, 0); + prune_unused_types_mark (die->die_parent, +class_scope_p (die->die_parent)); /* Mark any referenced nodes. 
*/ prune_unused_types_walk_attribs (die); Index: gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C === --- gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C (revision 0) +++ gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C (revision 0) @@ -0,0 +1,67 @@ +// PR debug/54508 +// { dg-do compile } +// { dg-options -g2 -dA -fno-merge-debug-strings } + +// { dg-final { scan-assembler \cbase0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } +// { dg-final { scan-assembler \OPCODE0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } +// { dg-final { scan-assembler \bi0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } +// { dg-final { scan-assembler \si0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } +// { dg-final { scan-assembler \f10\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } +// { dg-final { scan-assembler \f20\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } +// { dg-final { scan-assembler-not \nc0\\[ \t\]+\# DW_AT_name } } + +class cbase + +{ +public: + static int si; +int bi; +}; + +class c : public cbase + +{ +public: + enum + { + OPCODE = 251 + }; + int i ; + static const char *testc (void) { return foo; } +}; + +struct s +{ +int f1; +static const char *tests (void) { return test; } +}; + +union u +{ +int f2; +double d; +static const char *testu (void) { return test union; } +}; + +namespace n +{ +const char *ntest (void) { return test n; } + +class nc +{ +public: +int i; +static int sj; +}; +} + +extern void send (int, int, const void *, int); + +void test (int src) +{ + int cookie = 1; + send(src, c::OPCODE, c::testc (), cookie); + send(src, c::OPCODE, s::tests (), cookie); + send(src, c::OPCODE, u::testu (), cookie); + send(src, c::OPCODE, n::ntest (), cookie); +} Index: gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C === --- gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (revision 192048) +++ gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (working copy) @@ -59,11 +59,11 @@ // { dg-final { scan-assembler foo\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler staticfn1\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler staticfn2\[^\n\r\]*DW_AT_name 
} } -// { dg-final { scan-assembler-not staticfn3\[^\n\r\]*DW_AT_name } } -// { dg-final { scan-assembler-not staticfn4\[^\n\r\]*DW_AT_name } } +// { dg-final { scan-assembler staticfn3\[^\n\r\]*DW_AT_name } } +// { dg-final { scan-assembler staticfn4\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler-not staticfn5\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler-not staticfn6\[^\n\r\]*DW_AT_name } } -// { dg-final { scan-assembler-not method1\[^\n\r\]*DW_AT_name } } +// { dg-final { scan-assembler method1\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler arg1\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler arg2\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler arg3\[^\n\r\]*DW_AT_name } } Index: gcc/testsuite/g++.dg/debug/dwarf2/localclass2.C === --- gcc/testsuite/g++.dg/debug/dwarf2/localclass2.C (revision 192048) +++ gcc/testsuite/g++.dg/debug/dwarf2/localclass2.C (working copy) @@ -59,11 +59,11 @@ // { dg-final { scan-assembler foo\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler staticfn1\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler staticfn2\[^\n\r\]*DW_AT_name } } -// { dg-final { scan-assembler-not staticfn3\[^\n\r\]*DW_AT_name } } -// { dg-final { scan-assembler-not staticfn4\[^\n\r\]*DW_AT_name } } +// { dg-final { scan-assembler staticfn3\[^\n\r\]*DW_AT_name } } +// { dg-final { scan-assembler
Re: [PATCH] PR 53528 c++/ C++11 Generalized Attribute support
On 09/20/2012 02:59 AM, Dodji Seketeli wrote: + if ((flags & ATTR_FLAG_CXX11) + && !(flags & ATTR_FLAG_TYPE_IN_PLACE + && (TREE_CODE (*node) == RECORD_TYPE + || TREE_CODE (*node) == UNION_TYPE))) + { + /* "unused" is being used as a C++11 attribute. In this mode +we prevent it from applying to types, unless it's for a +class definition. */ + warning (OPT_Wattributes, + "attribute %qE cannot be applied to a non-class type", name); + return NULL_TREE; + } I think this should now be covered by the general ignoring of attributes that appertain to type-specifiers, so we don't need to check it here. Removed. Looks to me like it's still there. +int attr_flag = attribute_takes_identifier_p (attr_id) + ? id_attr + : normal_attr; This should only apply to attributes in the gnu namespace. Jason
Re: [libbacktrace] Fix warning as error in btest.c and hence `make test`
On Thu, 4 Oct 2012, Ian Lance Taylor wrote: cc1: warnings being treated as errors /scratch/tmp/gerald/gcc-HEAD/libbacktrace/btest.c: In function 'f23': /scratch/tmp/gerald/gcc-HEAD/libbacktrace/btest.c:476: warning: 'expected' may be used uninitialized in this function gmake[3]: *** [btest-btest.o] Error 1 gmake[3]: Target `btest' not remade because of errors. Doesn't this mean that effectively the test is not executed? I don't think so, not sure what you are getting at here. With warnings being treated as errors, the warning led to "[btest-btest.o] Error 1" and so I thought the build did not actually go through. Anyway, after my patch this is now gone. :-) Thanks for the quick review! Gerald
[wwwdocs] SH 4.8 changes update
Hello, The atomic options of SH have been changed recently. The attached patch updates the 4.8 changes.html accordingly, plus some minor wording fixes. OK? Cheers, Oleg ? www_4_8_sh_changes_2.patch Index: htdocs/gcc-4.8/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v retrieving revision 1.35 diff -u -r1.35 changes.html --- htdocs/gcc-4.8/changes.html 3 Oct 2012 22:33:46 - 1.35 +++ htdocs/gcc-4.8/changes.html 4 Oct 2012 21:47:18 - @@ -255,19 +255,45 @@ liImproved support for the code__atomic/code built-in functions: ul - liMinor improvements to code generated for software atomic sequences - that are enabled by code-msoft-atomic/code./li + liA new option code-matomic-model=imodel/i/code selects the + model for the generated atomic sequences. The following models are + supported: + ul +licodesoft-gusa/codebr +Software gUSA sequences (SH3* and SH4* only). On SH4A targets this +will now also partially utilize the codemovco.l/code and +codemovli.l/code instructions. This is the default when the target +is codesh3*-*-linux*/code or codesh4*-*-linux*/code./li + +licodehard-llcs/codebr +Hardware codemovco.l/code / codemovli.l/code sequences +(SH4A only)./li + +licodesoft-tcb/codebr +Software thread control block sequences./li + +licodesoft-imask/codebr +Software interrupt flipping sequences (privileged mode only). This is +the default when the target is codesh1*-*-linux*/code or +codesh2*-*-linux*/code./li + +licodenone/codebr +Generates function calls to the respective code__atomic/code +built-in functions. This is the default for SH64 targets or when the +target is not codesh*-*-linux*/code./li + /ul/li + + liThe option code-msoft-atomic/code has been deprecated. 
It is + now an alias for code-matomic-model=soft-gusa/code./li liA new option code-mtas/code makes the compiler generate the codetas.b/code instruction for the - code__atomic_test_and_set/code built-in function./li + code__atomic_test_and_set/code built-in function regardless of the + selected atomic model./li + + liThe code__sync/code functions in codelibgcc/code now reflect + the selected atomic model when building the toolchain./li - liThe SH4A instructions codemovco.l/code and - codemovli.l/code are now supported. They are used to implement some - software atomic sequences that are enabled by code-msoft-atomic/code. - In addition to that, pure codemovco.l/code / codemovli.l/code - atomic sequences can be enabled with the new option - code-mhard-atomic/code./li /ul/li liAdded support for the codemov.b/code and codemov.w/code @@ -280,11 +306,11 @@ liImprovements to conditional branches and code that involves the T bit. A new option code-mzdcbranch/code tells the compiler to favor -zero-displacement branches. This is enabled by default for SH4 and -SH4A./li +zero-displacement branches. This is enabled by default for SH4* targets. +/li liThe codepref/code instruction will now be emitted by the -code__builtin_prefetch/code built-in function for SH3./li +code__builtin_prefetch/code built-in function for SH3* targets./li liThe codefmac/code instruction will now be emitted by the codefmaf/code standard function and the code__builtin_fmaf/code @@ -298,7 +324,7 @@ liAdded new options code-mfsrra/code and code-mfsca/code to allow the compiler using the codefsrra/code and codefsca/code -instructions on CPUs other than SH4A (where they are already enabled by +instructions on targets other than SH4A (where they are already enabled by default)./li liAdded support for the code__builtin_bswap32/code built-in function.
Re: patch to fix
There are a bunch of ways to skin the cat. 1) we can define the extra mode. 2) if we get rid of the mode inside the wide int and replace it with an explicit precision and bitsize, then we can just make the size of the buffer twice as big as the analysis of the modes indicates. 3) or we can leave your code in a form that uses 2 wide ints. my current patch (which i have not gotten working yet) changes this to use the mul_full call, but it could be changed. It is much simpler than the existing code. i do not see how templates offer any solution at all. the wide int code really needs to have something valid to indicate the length of the object, and if there is no mode that big, the code will ice. my personal feeling is that range analysis is quite useful for small integers and not so much as the values get larger. The only really large integer constants that you are likely to find in real code are encryption keys, like the dvd decoding keys, and if the key is chosen well there should be no useful optimization that you can perform on the code. If this did not work on the largest modes, no one would ever notice. i.e. i would bet that you never make a useful transformation on any integer that would not fit in an int32. However, this is your pass, and i understand the principle of never backing down. Kenny On 10/04/2012 05:06 PM, Marc Glisse wrote: On Wed, 3 Oct 2012, Kenneth Zadeck wrote: i have already converted the vrp code, so i have some guess at where you are talking about. (of course correct me if i am wrong). in the code that computes the range when two variables are multiplied together needs to do a multiplication that produces a result that is twice as wide as the inputs. my library is able to do that with one catch (and this is a big catch): the target has to have an integer mode that is twice as big as the mode of the operands. 
The issue is that wide-ints actually carry around the mode of the value in order to get the bitsize and precision of the operands (it does not have the type, because this code has to both work on the rtl and tree level and i generally do not want the signness anyway). Ah, after reading the whole thread, now I understand that it is because wide_int carries a mode that it makes little sense making it a template (sorry that it took me so long when the information was in your first answer). I understand that it would be inconvenient (much longer code) to have a base_wide_int that does just the arithmetic and a wrapper that contains the mode as well. Your idea below to define dummy extra modes does bring the template idea back to the table though ;-) my current code in vrp checks to see if such a mode exists and if it does, it produces the product. if the mode does not exist, it returns bottom. What this means is that for most (many or some) targets that have a TImode, the largest thing that particular vrp discover ranges for is a DImode value. We could get around this by defining the next larger mode than what the target really needs but i wonder how much mileage you are going to get out of that with really large numbers. The current wrapping multiplication code in vrp works with a pair of double_int, so it should keep working with a pair of wide_int. I see now why wide_int doesn't allow to simplify the code, but it doesn't have to break.
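The "twice as wide" product the discussion turns on can be illustrated concretely (a sketch of the idea only, not the wide_int or VRP code): compute the product of two 64-bit range bounds in 128 bits, then check whether it still fits the narrow type. This is exactly the point where, without a wider mode, the range computation would have to give up and return bottom. Note __int128 is a GCC/Clang extension.

```cpp
#include <cstdint>

// True if a * b fits in int64_t; the product itself is formed in the
// double-width __int128 type, so it never overflows during the check.
static bool
product_fits_int64 (int64_t a, int64_t b)
{
  __int128 wide = (__int128) a * b;     // exact double-width product
  return wide >= INT64_MIN && wide <= INT64_MAX;
}
```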
[C++ testcase] PR 53403
Hi, I'm adding the testcase and closing the PR. Tested x86_64-linux. Thanks, Paolo. / 2012-10-04 Paolo Carlini paolo.carl...@oracle.com PR c++/53403 * g++.dg/template/friend53.C Index: g++.dg/template/friend53.C === --- g++.dg/template/friend53.C (revision 0) +++ g++.dg/template/friend53.C (working copy) @@ -0,0 +1,23 @@ +// PR c++/53403 + +template <typename T> +class Foo +{ + typedef void type; + template <typename U> friend void f(); +public: + Foo() {} +}; + +template class Foo<void>; + +template <typename T> +void f() +{ + typedef Foo<void>::type type; +} + +int main() +{ + f<void>(); +}
Re: [Ping]FW: [PATCH] Cilk Plus merging to trunk (2 of n)
On Thu, 4 Oct 2012, Iyer, Balaji V wrote: Did you get a chance to look at this submission? I think I have fixed all the changes you have mentioned. Is it OK for trunk? I expect to look at the revised specification and patch later this month. My initial impression from a glance at the revised specification was that it was much improved, but some issues would probably still require further work on the specification - and once those had been resolved, then a detailed review of the patch would make sense. -- Joseph S. Myers jos...@codesourcery.com
RE: [Ping]FW: [PATCH] Cilk Plus merging to trunk (2 of n)
Hello Joseph, Please see my responses below: -Original Message- From: Joseph Myers [mailto:jos...@codesourcery.com] Sent: Thursday, October 04, 2012 7:45 PM To: Iyer, Balaji V Cc: gcc-patches@gcc.gnu.org Subject: Re: [Ping]FW: [PATCH] Cilk Plus merging to trunk (2 of n) On Thu, 4 Oct 2012, Iyer, Balaji V wrote: Did you get a chance to look at this submission? I think I have fixed all the changes you have mentioned. Is it OK for trunk? I expect to look at the revised specification and patch later this month. My initial impression from a glance at the revised specification was that it was much improved, but some issues would probably still require further work on the specification - and once those had been resolved, then a detailed review of the patch would make sense. Sure! I am not sure if you noticed, but Richard Guenther gave me some additional feedback today and I have fixed them and submitted another patch. Here is the thread for that email (http://gcc.gnu.org/ml/gcc-patches/2012-10/msg00431.html). I just wanted to let you know about this. Also, I have other parts of Cilk Plus. Can I submit them now itself or should I wait till this is reviewed? Thank you very much for helping me with this! Yours Sincerely, Balaji V. Iyer. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc
On Thu, 4 Oct 2012, Iain Buclaw wrote: The only patches to gcc proper are documentation-related and adding the D frontend / libphobos to configure and make files. I would have thought that these would typically only be included with the actual front-end? Looking back at my previous review comments, I suggested that you might need to split up c-common.[ch] so that certain parts of attribute handling could be shared with D, because duplicate code copied from elsewhere in GCC was not an appropriate implementation approach. Have you then eliminated the duplicate code in some other way that does not involve splitting up those files so code can be shared? -- Joseph S. Myers jos...@codesourcery.com
Re: PING^2: [patch] pr/54508: fix incomplete debug information for class
Index: gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C === --- gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (revision 192048) +++ gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (working copy) @@ -59,11 +59,11 @@ // { dg-final { scan-assembler foo\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler staticfn1\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler staticfn2\[^\n\r\]*DW_AT_name } } -// { dg-final { scan-assembler-not staticfn3\[^\n\r\]*DW_AT_name } } -// { dg-final { scan-assembler-not staticfn4\[^\n\r\]*DW_AT_name } } +// { dg-final { scan-assembler staticfn3\[^\n\r\]*DW_AT_name } } +// { dg-final { scan-assembler staticfn4\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler-not staticfn5\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler-not staticfn6\[^\n\r\]*DW_AT_name } } -// { dg-final { scan-assembler-not method1\[^\n\r\]*DW_AT_name } } +// { dg-final { scan-assembler method1\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler arg1\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler arg2\[^\n\r\]*DW_AT_name } } // { dg-final { scan-assembler arg3\[^\n\r\]*DW_AT_name } } The fact that these two tests were specifically checking for the absence of staticfn3 and staticfn4 leads me to believe that the current behavior is deliberate. Jakub, that change was yours (it dates back to November 2008). Are you OK with Paul's change? It seems to me that there are cases where we just want to emit the class for the context info (like a namespace, which doesn't have to be complete everywhere). Is there a way to tell the debugger that this class declaration is incomplete and that it should look elsewhere for a full definition? -cary
Re: PING^2: [patch] pr/54508: fix incomplete debug information for class
On Oct 4, 2012, at 8:26 PM, Cary Coutant wrote:

> [...]
> Is there a way to tell the debugger that this class declaration is
> incomplete and that it should look elsewhere for a full definition?

Certainly GDB does not currently do that. As it stands, it uses whatever definition it finds first, so depending on which compilation unit's symbols are read first, what you see varies unpredictably.
If there is a way to express in DWARF 2 the fact that this is an incomplete definition, GDB could presumably be changed to take advantage of that. I have no idea how hard that is.

paul
Re: PING^2: [patch] pr/54508: fix incomplete debug information for class
On Oct 4, 2012, at 8:26 PM, Cary Coutant wrote:

> [...]
> The fact that these two tests were specifically checking for the
> absence of staticfn3 and staticfn4 leads me to believe that the
> current behavior is deliberate.

The code itself (where I changed dwarf2out.c) is from 2003-02-28, by rth; it appears to be the original implementation of the -feliminate-unused-debug-types flag.
Looking at Jakub's change from 2008 and the PR it fixes (PR/27017), it looks more like a case where much more debug information was missing, and Jakub's change corrected that. Those two testcase files describe the resulting behavior, but I don't read the discussion in PR/27017 as saying that having staticfn3 omitted was specifically desired.

paul
[PATCH] PR 54789: Fix driver error when GCC_COMPARE_DEBUG is defined
This patch simplifies process_command a bit by using save_switch, and it sets the switch's known field to true.

gcc/ChangeLog:

2012-10-05  Dmitry Gorbachev  <d.g.gorbac...@gmail.com>

	PR driver/54789
	* gcc.c (process_command): Use save_switch for synthesized
	-fcompare-debug=* option; mark the switch as known.

--- gcc/gcc.c
+++ gcc/gcc.c
@@ -3961,18 +3961,12 @@ process_command (unsigned int decoded_options_count,
   if (n_infiles == last_language_n_infiles && spec_lang != 0)
     warning (0, "%<-x %s%> after last input file has no effect", spec_lang);
 
+  /* Synthesize -fcompare-debug flag from the GCC_COMPARE_DEBUG
+     environment variable.  */
   if (compare_debug == 2 || compare_debug == 3)
     {
-      alloc_switch ();
-      switches[n_switches].part1 = concat ("fcompare-debug=",
-					   compare_debug_opt,
-					   NULL);
-      switches[n_switches].args = 0;
-      switches[n_switches].live_cond = 0;
-      switches[n_switches].validated = false;
-      switches[n_switches].known = false;
-      switches[n_switches].ordering = 0;
-      n_switches++;
+      const char *opt = concat ("-fcompare-debug=", compare_debug_opt, NULL);
+      save_switch (opt, 0, NULL, false, true);
       compare_debug = 1;
     }
Re: RFC: LRA for x86/x86-64 [0/9]
On 12-10-04 4:56 PM, Steven Bosscher wrote:

> On Tue, Oct 2, 2012 at 3:14 AM, Vladimir Makarov <vmaka...@redhat.com> wrote:
>> Analogous live ranges are used in IRA as an intermediate step to build
>> a conflict graph.
>
> Right, ira-lives.c and lra-lives.c look very much alike; the only major
> difference is that the object of interest in an IRA live range is an
> ira_object_t, while in an LRA live range it's just a regno. But the code
> of create_start_finish_chains, ira_rebuild_start_finish_chains,
> remove_some_program_points_and_update_live_ranges, and most of the live
> range printing functions is almost identical between lra-lives.c and
> ira-lives.c. That looks like unnecessary code duplication. Do you think
> some of that code could be shared (esp. in a C++ world where maybe a
> live range can be a container for another type)?

Yes, C++ could help here. The simplest solution is inheritance from a common class. But it is not a high priority task now.