Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 8:57 AM, Steven Bosscher stevenb@gmail.com wrote:
 On Thu, Oct 4, 2012 at 5:30 AM, Vladimir Makarov vmaka...@redhat.com wrote:
 I was going to look at this code too but I was interesting in generation of
 less points and live ranges.  It is strange that in my profiles,
 remove_some_program_points_and_update_live_ranges takes 0.6% of compiler
 time on these huge tests.   So I was not interesting to speed up the
 function and may be therefore you have no visible change in compilation
 time.

 Right. The compression algorithm doesn't care much about the initial
 number of program points, only about the number of live ranges before
 and after compression. I had expected a bigger effect on the number of
 live ranges before compression.

 0.6% sounds really very different from my timings. How much time does
 create_start_finish_chains take for you?


 I don't object the idea of the patch.  I need some time to look at it (the
 different results on a function is a bit scary for me) and check simulator
 times on other tests.

 Understood.

BTW, it would be great if you can also look at this additional patch hunk:

@@ -994,8 +1044,8 @@ lra_create_live_ranges (bool all_p)
   curr_point = 0;
   point_freq_vec = VEC_alloc (int, heap, get_max_uid () * 2);
   lra_point_freq = VEC_address (int, point_freq_vec);
-  FOR_EACH_BB (bb)
-process_bb_lives (bb);
+  FOR_EACH_BB_REVERSE (bb)
+process_bb_lives (bb, curr_point);
   lra_live_max_point = curr_point;
   create_start_finish_chains ();
   if (lra_dump_file != NULL)

I think this should result in more live ranges being merged. Here's
why I think so, based on my far worse understanding of this code than
yours, so forgive me if I'm Completely Wrong :-)

process_bb_lives walks insns in the basic block from last to first, so
say you have a basic block chain 1-2-3, and each block has 4 insns,
then AFAIU the program points in block 1 will be [4,3,2,1], in block 2
it will be [8,7,6,5], and in block 3 it will be [12,11,10,9]. Say a
reg is used in block 3 at point 11, and set in block at point 3. Then
this reg will have a live range chain [3-1],[8-5],[12-11].

If you visit the basic blocks in reverse order, the program points
will be: 1:[12,11,10,9], 2:[8,7,6,5], 3:[4,3,2,1]. Now the same reg
will be set at point 11 and used at point 3, and the live range chain
will be just [11-3].

I'm experimenting with this extra hunk and report back here.

Ciao!
Steven


Re: RFC: Using DECL_NO_LIMIT_STACK as a backend specific flag

2012-10-04 Thread Richard Guenther
On Wed, Oct 3, 2012 at 5:20 PM, nick clifton ni...@redhat.com wrote:
 Hi Ian,


 Can't you just keep a list of the decls for which you have issued the
 warning?


 Yes - that would work too.  In fact I agree that this would be cleaner
 solution in my particular case.  I'll create a new patch...



 How many decls do you expect to be on that list?


 Not very many.  Maybe two or three at most.

 But I am interested to know if there is a way for targets to add their own
 information to function decls (and decls in general).  If not for this
 particular case, then for problems to come in the future.

To the decl not.  But I suppose you are only looking at defined functions,
so it should suffice to amend struct function.  It's reasonable to allow
the target to associate extra info with struct function and we already
have a way to do that via the init_machine_status target hook.

Richard.

 Cheers
   Nick




Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s

2012-10-04 Thread Basile Starynkevitch
On Wed, Oct 03, 2012 at 01:02:44PM +0200, Basile Starynkevitch wrote:
 On Wed, Oct 03, 2012 at 12:21:02PM +0300, Laurynas Biveinis wrote:
  Hello Basile -
  
   2012-10-02  Basile Starynkevitch  bas...@starynkevitch.net
  
   * gengtype.c (walk_type): Emit mark_hook when inside a
 struct of a union member.
  
  Can you send me off-list the gengtype output before and after the fix?
 
 I messed something, the example I did send was wrong. Let's start all over 
 again.
 
 Consider the following file 
[...]

This is PR54809 on our bugzilla.

Thanks.

-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: RFA: add lock_length attribute to break branch-shortening cycles

2012-10-04 Thread Richard Guenther
On Wed, Oct 3, 2012 at 8:22 PM, Joern Rennecke
joern.renne...@embecosm.com wrote:
 The ARCompact architecture has some pipelining features that result in
 the vanilla branch shortening not always converging.

 Moreover, there are some short, complex branch instructions with very short
 offsets; replacing them with a multi-insn sequence when the offset doesn't
 reach makes the code significantly longer.  Thus, when starting branch
 shortening with pessimistic assumptions, the short branches are often not
 used because of the pessimistic branch length causing the offsets going out
 of range.
 This problem can be avoided when starting with a low estimate and working
 upwards.  However, that makes the incidence of infinite branch shortening
 cycles higher, and also makes it impossible to just break out after some
 iteration count.

 To address these issues, I've made the generator programs recognize the
 optional lock_length attribute.

 To quote from the documentation added for this feature:

  If you define the `lock_length' attribute, branch shortening will work
 the other way round: it starts out assuming minimum instruction lengths
 and iterates from there.  In addition, the value of the `lock_length'
 attribute does not decrease across iterations, and the value computed
 for the `length' attribute will be no smaller than that of the
 `lock_length' attribute.

I miss a few things in this description:
- what is the value of lock_length supposed to be?  From the lock
  prefix it sounds like it is something unchanging, maybe even constant,
  thus a maximum?
- the length attribute still needs to be specified when lock_length is?
  how do they relate?  Is lock_length always smaller / bigger than length?
- what happens if you have patterns with lock_length and patterns without?
- what patterns does lock_length apply to?

In general optimistically attacking this kind of problem should be always
better - did you try simply switching this for all targets?  It shouldn't be
slower and the only thing you need to guarantee is that during iteration
you never make insn-lenghts smaller again.

Richard.

 bootstrapped and regression tested on i686-pc-linux-gnu

 2012-10-03  Joern Rennecke  joern.renne...@embecosm.com

 * final.c (get_attr_length_1): Use direct recursion rather than
 calling get_attr_length.
 (get_attr_lock_length): New function.
 (INSN_VARIABLE_LENGTH_P): Define.
 (shorten_branches): Take HAVE_ATTR_lock_length into account.
 Don't overwrite non-delay slot insn lengths with the lengths of
 delay slot insns with same uid.
 * genattrtab.c (lock_length_str): New variable.
 (make_length_attrs): New parameter base.
 (main): Initialize lock_length_str.
 Generate lock_lengths attributes.
 * genattr.c (gen_attr): Emit declarations for lock_length attribute
 related functions.
 * doc/md.texi (node Insn Lengths): Document lock_length attribute.

 Index: doc/md.texi
 ===
 --- doc/md.texi (revision 192036)
 +++ doc/md.texi (working copy)
 @@ -8004,6 +8004,20 @@ (define_insn jump
(const_int 6)))])
  @end smallexample

 +@cindex lock_length
 +Usually, branch shortening is done assuming the worst case (i.e. longest)
 +lengths, and then iterating (if optimizing) to smaller lengths till
 +no further changed occur.  This does not work so well for architectures
 +that have very small minimum offsets and considerable jumps in instruction
 +lengths.
 +
 +If you define the @code{lock_length} attribute, branch shortening will
 +work the other way round: it starts out assuming minimum instruction
 +lengths and iterates from there.  In addition, the value of the
 +@code{lock_length} attribute does not decrease across iterations, and
 +the value computed for the @code{length} attribute will be no smaller
 +than that of the @code{lock_length} attribute.
 +
  @end ifset
  @ifset INTERNALS
  @node Constant Attributes
 Index: final.c
 ===
 --- final.c (revision 192036)
 +++ final.c (working copy)
 @@ -312,6 +312,7 @@ dbr_sequence_length (void)
 `insn_current_length'.  */

  static int *insn_lengths;
 +static char *uid_lock_length;

  VEC(int,heap) *insn_addresses_;

 @@ -447,6 +448,20 @@ get_attr_length (rtx insn)
return get_attr_length_1 (insn, insn_default_length);
  }

 +#ifdef HAVE_ATTR_lock_length
 +int
 +get_attr_lock_length (rtx insn)
 +{
 +  if (uid_lock_length  insn_lengths_max_uid  INSN_UID (insn))
 +return uid_lock_length[INSN_UID (insn)];
 +  return get_attr_length_1 (insn, insn_min_lock_length);
 +}
 +#define INSN_VARIABLE_LENGTH_P(INSN) \
 +  (insn_variable_length_p (INSN) || insn_variable_lock_length_p (INSN))
 +#else
 +#define INSN_VARIABLE_LENGTH_P(INSN) (insn_variable_length_p (INSN))
 +#endif
 +
  /* Obtain the current length of an insn. 

Re: [PATCH] Fix up DW_TAG_formal_parameter placement

2012-10-04 Thread Jakub Jelinek
On Thu, Oct 04, 2012 at 09:42:59AM +0200, Richard Guenther wrote:
 This looks like the wrong place to fix things to me ... either we can
 fix this at the point we create the VAR_DECLs for the optimized away
 PARM_DECLs (or we should delay that until here?)

No, that is not possible.  There is no other block they could be added
to (they are added to DECL_INITIAL block), and they definitely need to
be added there, they are needed for the more common case where it is not 
inlined.
And in that case it is the right location, for non-inlined function
DECL_INITIAL block's BLOCK_VARS is added directly as children of the
DW_TAG_subprogram.

 or we fix it up
 in dwarf2out.c (how does this fix interact with stabs and the other
 debuginfo formats?

We can't do that either, dwarf2out doesn't have information whether blocks
are really used (as in, any insns/stmts mention that block in
INSN_BLOCK/gimple_block) or not, it is only correct to move the VAR_DECLs
with PARM_DECL DECL_ORIGIN (i.e. DW_TAG_formal_parameter) up if the outer
BLOCK is not referenced by any insn/stmt (i.e. if the ranges of the
inner block with the VAR_DECL and outer block are exactly the same).
If the outer block has range that is superset of the inner block's range,
then the move would invalidly say that the DW_TAG_formal_parameter
is available somewhere where it is not supposed to be available.

Initially I thought I'd do the moves in tree-ssa-live.c, in
remove_unused_scope_block_p it has information about what blocks are used
by any stmts and what are not.  But it would be terribly expensive,
for each VAR_DECL in a block where its BLOCK_SUPERCONTEXT wasn't originally
TREE_USED before remove_unused_scope_block_p (and such blocks up to a
!TREE_USED inlined_function_outer_scope_p), it would need to search
all the BLOCK_SUPERCONTEXT BLOCK_VARS to see if the VAR_DECL isn't present
there as well, and only if not, move to the inlined_function_outer_scope_p
BLOCK.

Doing it in tree-inline.c is IMHO the right spot, it is the place that
creates the extra artificial BLOCK around the remapped DECL_INITIAL block
and puts function arguments there.  At that point we know for sure that
the DECL_INITIAL block has the same ranges as the whole inline function,
and it is desirable to move all arguments to the outer block, not just
those that were still present in DECL_ARGUMENTS during inlining.

If you want to be more specific on what is to be moved, we could either
add some VAR_DECL flag bit (but that is expensive, we don't have many),
or perhaps just check that DECL_CONTEXT (DECL_ORIGIN (v)) == DECL_ORIGIN (fn)
(but is that ever false?).

  mentioning DWARF in tree-inline looks odd,
 unless we get rid of the other formats - something I'd of course
 welcome ;))

That can be fixed, I can replace the DWARF terminology with something more
fuzzy.

Jakub


Adjust gcc.dg/lower-subreg-1.c for SPARC

2012-10-04 Thread Eric Botcazou
Tested on SPARC/Solaris 10, applied on the mainline.


2012-10-04  Eric Botcazou  ebotca...@adacore.com

PR rtl-optimization/54739
* gcc.dg/lower-subreg-1.c: Also skip on SPARC.


-- 
Eric BotcazouIndex: gcc.dg/lower-subreg-1.c
===
--- gcc.dg/lower-subreg-1.c	(revision 192031)
+++ gcc.dg/lower-subreg-1.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { ! { mips64 || { arm*-*-* ia64-*-* spu-*-* tilegx-*-* } } } } } */
+/* { dg-do compile { target { ! { mips64 || { arm*-*-* ia64-*-* sparc*-*-* spu-*-* tilegx-*-* } } } } } */
 /* { dg-options -O -fdump-rtl-subreg1 } */
 /* { dg-skip-if  { { i?86-*-* x86_64-*-* }  x32 } { * } {  } } */
 /* { dg-require-effective-target ilp32 } */

Re: opts.c, gcc.c: Plug some memory leaks - and an out-of-bounds memory access

2012-10-04 Thread Richard Guenther
On Wed, Oct 3, 2012 at 11:01 PM, Tobias Burnus bur...@net-b.de wrote:
 Found using http://scan5.coverity.com/

 Build on x86-64-gnu-linux with C/C++/Fortran. I will now do an all-language
 build/regtest.
 OK when it passes?

 (Note to the save_string call: I reduced it by 2: The +1 in the call makes
 it long (out of bounds) and the +1 in temp_filename_length is not needed
 (but also doesn't harm) as tmp is null terminated and save_string adds
 another '\0' after copying len bytes.)

- prefix = concat (target_sysroot_suffix, prefix, NULL);
-  prefix = concat (sysroot_no_trailing_dir_separator, prefix, NULL);
+   {
+ char *tmp;
+ tmp = concat (target_sysroot_suffix, prefix, NULL);
+ prefix = concat (sysroot_no_trailing_dir_separator, tmp, NULL);
+ free (tmp);
+   }

prefix = concat (sysroot_no_trailing_dir_separator,
target_sysroot_suffix, prefix, NULL);

should be equivalent and easier to read, no?

+  else
+   prefix = concat (sysroot_no_trailing_dir_separator, prefix, NULL);
+

btw, we're not careing too much about memleaks in the driver ...

Otherwise the patch looks ok with the above change.

Thanks,
Richard.

 Tobias


Fix -fdump-ada-spec

2012-10-04 Thread Arnaud Charlet
After changes by Sharad (Add option for dumping to stderr (issue6190057)),
-fdump-ada-spec is broken, and is now a no-op.

Admittedly, this is because -fdump-ada-spec is handled differently from
other -fdump-* switches, so this patch fixes support for -fdump-ada-spec
by using an approach similar to -fdump-go-spec, and use regular switches
via c/c.opt. I've removed the handling of TDF_RAW, which was a debugging
option, and never really used, so can be simply deleted.

Change is mostly trivial/mechanical.

Tested on x86_64-pc-linux-gnu, OK for trunk?

gcc/

2012-10-04  Arnaud Charlet  char...@adacore.com

* dumpfile.h, dumpfile.c: Remove TDI_ada.

c-family/

2012-10-04  Arnaud Charlet  char...@adacore.com

* c-ada-spec.c (print_ada_declaration): Remove handling of TDF_RAW.
* c.opt (-fdump-ada-spec, -fdump-ada-spec-slim): Move switch definition
out of dumpfile.h.

c/

2012-10-04  Arnaud Charlet  char...@adacore.com

* c-decl.c (c_write_global_declarations): Fix handling of
-fdump-ada-spec*.

cp/

2012-10-04  Arnaud Charlet  char...@adacore.com

* decl2.c (cp_write_global_declarations): Fix handling of
-fdump-ada-spec*.

Arno
--
Index: c-family/c.opt
===
--- c-family/c.opt  (revision 192062)
+++ c-family/c.opt  (working copy)
@@ -799,6 +799,14 @@ fdollars-in-identifiers
 C ObjC C++ ObjC++
 Permit '$' as an identifier character
 
+fdump-ada-spec
+C ObjC C++ ObjC++ RejectNegative Var(flag_dump_ada_spec)
+Write all declarations as Ada code transitively
+
+fdump-ada-spec-slim
+C ObjC C++ ObjC++ RejectNegative Var(flag_dump_ada_spec_slim)
+Write all declarations as Ada code for the given file only
+
 felide-constructors
 C++ ObjC++ Var(flag_elide_constructors) Init(1)
 
Index: c-family/c-ada-spec.c
===
--- c-family/c-ada-spec.c   (revision 192062)
+++ c-family/c-ada-spec.c   (working copy)
@@ -2535,7 +2535,6 @@ print_ada_declaration (pretty_printer *b
   int is_class = false;
   tree name = TYPE_NAME (TREE_TYPE (t));
   tree decl_name = DECL_NAME (t);
-  bool dump_internal = get_dump_file_info (TDI_ada)-pflags  TDF_RAW;
   tree orig = NULL_TREE;
 
   if (cpp_check  cpp_check (t, IS_TEMPLATE))
@@ -2705,8 +2704,7 @@ print_ada_declaration (pretty_printer *b
 }
   else
 {
-  if (!dump_internal
-  TREE_CODE (t) == VAR_DECL
+  if (TREE_CODE (t) == VAR_DECL
   decl_name
   *IDENTIFIER_POINTER (decl_name) == '_')
return 0;
@@ -2796,8 +2794,7 @@ print_ada_declaration (pretty_printer *b
 
   /* If this function has an entry in the dispatch table, we cannot
 omit it.  */
-  if (!dump_internal  !DECL_VINDEX (t)
-  *IDENTIFIER_POINTER (decl_name) == '_')
+  if (!DECL_VINDEX (t)  *IDENTIFIER_POINTER (decl_name) == '_')
{
  if (IDENTIFIER_POINTER (decl_name)[1] == '_')
return 0;
Index: c/c-decl.c
===
--- c/c-decl.c  (revision 192062)
+++ c/c-decl.c  (working copy)
@@ -10079,10 +10079,10 @@ c_write_global_declarations (void)
   gcc_assert (!current_scope);
 
   /* Handle -fdump-ada-spec[-slim]. */
-  if (dump_initialized_p (TDI_ada))
+  if (flag_dump_ada_spec || flag_dump_ada_spec_slim)
 {
   /* Build a table of files to generate specs for */
-  if (get_dump_file_info (TDI_ada)-pflags  TDF_SLIM)
+  if (flag_dump_ada_spec_slim)
collect_source_ref (main_input_filename);
   else
for_each_global_decl (collect_source_ref_cb);
Index: cp/decl2.c
===
--- cp/decl2.c  (revision 192062)
+++ cp/decl2.c  (working copy)
@@ -3698,9 +3698,9 @@ cp_write_global_declarations (void)
   cgraph_process_same_body_aliases ();
 
   /* Handle -fdump-ada-spec[-slim] */
-  if (dump_initialized_p (TDI_ada))
+  if (flag_dump_ada_spec || flag_dump_ada_spec_slim)
 {
-  if (get_dump_file_info (TDI_ada)-pflags  TDF_SLIM)
+  if (flag_dump_ada_spec_slim)
collect_source_ref (main_input_filename);
   else
collect_source_refs (global_namespace);
Index: dumpfile.c
===
--- dumpfile.c  (revision 192062)
+++ dumpfile.c  (working copy)
@@ -57,8 +57,7 @@ static struct dump_file_info dump_files[
0, 0, 0, 5},
   {.vcg, tree-vcg, NULL, NULL, NULL, NULL, NULL, TDF_TREE,
0, 0, 0, 6},
-  {.ads, ada-spec, NULL, NULL, NULL, NULL, NULL, 0, 0, 0, 0, 7},
-#define FIRST_AUTO_NUMBERED_DUMP 8
+#define FIRST_AUTO_NUMBERED_DUMP 7
 
   {NULL, tree-all, NULL, NULL, NULL, NULL, NULL, TDF_TREE,
0, 0, 0, 0},
Index: dumpfile.h
===
--- dumpfile.h  (revision 192062)
+++ dumpfile.h  (working copy)
@@ -37,7 +37,6 @@ enum tree_dump_index
   TDI_nested,  

Re: [PATCH] Fix instability of -fschedule-insn for x86

2012-10-04 Thread Kirill Yukhin
Hi,

 Based on this opinion, the patch is OK for mainline, if there are no
...

Checked in: http://gcc.gnu.org/ml/gcc-cvs/2012-10/msg00187.html

K


Re: [Ping]FW: [PATCH] Cilk Plus merging to trunk (2 of n)

2012-10-04 Thread Richard Guenther
On Thu, Oct 4, 2012 at 2:22 AM, Iyer, Balaji V balaji.v.i...@intel.com wrote:
 Hi Joseph,
 Did you get a chance to look at this submission? I think I have fixed 
 all the changes you have mentioned. Is it OK for trunk?

 Thanks,

 Balaji V. Iyer.

-Original Message-
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
ow...@gcc.gnu.org] On Behalf Of Iyer, Balaji V
Sent: Wednesday, September 26, 2012 7:16 PM
To: Joseph Myers
Cc: gcc-patches@gcc.gnu.org; al...@redhat.com; r...@redhat.com;
l...@redhat.com
Subject: RE: [PATCH] Cilk Plus merging to trunk (2 of n)

Hello Joseph,
   In my last patch, I forgot to add the change Richard Guenther wanted me
to make. He wanted me to move the ARRAY_NOTATION_REF node from tree.def
to c-family/c-common.def.  Here is a new one that has this change. I am sorry 
for
this.

Here are ChangeLog entries:

gcc/ChangeLog
2012-09-26  Balaji V. Iyer  balaji.v.i...@intel.com

* tree.h (array_notation_reduce_type): Added new enumerator.

This should be moved to c-tree.h then, and ...

* Makefile.in (OBJS): Added array-notation-common.o.
* doc/passes.texi (Cilk Plus Transformation): Documented array
notation and overall transformations for Cilk Plus.
* doc/invoke.texi (C Dialect Options): Documented -fcilkplus flag.
* doc/generic.texi (Storage References): Documented
ARRAY_NOTATION_REF
tree addition.
* tree-pretty-pretty.c (dump_generic_node): Added ARRAY_NOTATION_REF
case.

... this to c-pretty-print.c and

* array-notation-common.c: New file.

... this to the c-common/ directory.

Basically this should be a completely frontend-only patch.

Richard.

gcc/c-family/ChangeLog
2012-09-26  Balaji V. Iyer  balaji.v.i...@intel.com

* c-common.h (build_array_notation_expr): New function declaration.
(ARRAY_NOTATION_ARRAY): Added new #define.
(ARRAY_NOTATION_CHECK): Likewise.
(ARRAY_NOTATION_START): Likewise.
(ARRAY_NOTATION_LENGTH): Likewise.
(ARRAY_NOTATION_STRIDE): Likewise.
(ARRAY_NOTATION_TYPE): Likewise.
* c-common.def: Added new tree ARRAY_NOTATION_REF.
* c-common.c (c_define_builtins): Added a call to initialize array
notation builtin functions.
(c_common_init_ts): Set ARRAY_NOTATION_REF as typed.
* c.opt (-fcilkplus): Define new command line switch.

gcc/c/ChangeLog
2012-09-26  Balaji V. Iyer  balaji.v.i...@intel.com

* c-typeck.c (convert_arguments): Added a check if tree contains
array notation expressions before throwing errors or doing anything.
* Make-lang.in (C_AND_OBJC_OBJS): Added c-array-notation.o.
* c-parser.c (c_parser_compound_statement): Check if array notation 
 code
is used in tree, if so, then transform them into appropriate C code.
(c_parser_expr_no_commas): Check if array notation is used in LHS or
RHS, if so, then build array notation expression instead of regular
modify.
(c_parser_postfix_expression_after_primary): Added a check for 
 colon(s)
after square braces, if so then handle it like an array notation.  
 Also,
break up array notations in unary op if found.
(c_parser_array_notation): New function.
* c-array-notation.c: New file.

gcc/testsuite/ChangeLog
2012-09-26  Balaji V. Iyer  balaji.v.i...@intel.com

* gcc.dg/cilk-plus/array_notation/execute/execute.exp: New script.
* gcc.dg/cilk-plus/array_notation/compile/compile.exp: Likewise.
* gcc.dg/cilk-plus/array_notation/errors/errors.exp: Likewise.
* gcc.dg/cilk-plus/array_notation/execute/sec_implicit_ex.c: New test.
* gcc.dg/cilk-plus/array_notation/execute/if_test.c: Likewise.
* gcc.dg/cilk-plus/array_notation/execute/gather_scatter.c: Likewise.
* gcc.dg/cilk-plus/array_notation/execute/builtin_func_double2.c:
Likewise.
* gcc.dg/cilk-plus/array_notation/execute/builtin_func_double.c:
Likewise.
* gcc.dg/cilk-plus/array_notation/execute/builtin_fn_custom.c: 
 Likewise.
* gcc.dg/cilk-plus/array_notation/execute/builtin_fn_mutating.c:
Likewise.
* gcc.dg/cilk-plus/array_notation/execute/array_test_ND.c: Likewise.
* gcc.dg/cilk-plus/array_notation/execute/array_test2.c: Likewise.
* gcc.dg/cilk-plus/array_notation/execute/array_test1.c: Likewise.
* gcc.dg/cilk-plus/array_notation/compile/sec_implicit_ex.c: Likewise.
* gcc.dg/cilk-plus/array_notation/compile/gather_scatter.c: Likewise.
* gcc.dg/cilk-plus/array_notation/compile/builtin_func_double2.c:
Likewise.
* gcc.dg/cilk-plus/array_notation/compile/array_test_ND.c: Likewise.
* gcc.dg/cilk-plus/array_notation/compile/if_test.c: Likewise.
* gcc.dg/cilk-plus/array_notation/compile/builtin_func_double.c:
Likewise.
* gcc.dg/cilk-plus/array_notation/compile/array_test1.c: 

Re: Fix -fdump-ada-spec

2012-10-04 Thread Richard Guenther
On Thu, Oct 4, 2012 at 10:26 AM, Arnaud Charlet char...@adacore.com wrote:
 After changes by Sharad (Add option for dumping to stderr (issue6190057)),
 -fdump-ada-spec is broken, and is now a no-op.

 Admittedly, this is because -fdump-ada-spec is handled differently from
 other -fdump-* switches, so this patch fixes support for -fdump-ada-spec
 by using an approach similar to -fdump-go-spec, and use regular switches
 via c/c.opt. I've removed the handling of TDF_RAW, which was a debugging
 option, and never really used, so can be simply deleted.

 Change is mostly trivial/mechanical.

 Tested on x86_64-pc-linux-gnu, OK for trunk?

Much cleaner indeed.

Ok,
Thanks,
Richard.

 gcc/

 2012-10-04  Arnaud Charlet  char...@adacore.com

 * dumpfile.h, dumpfile.c: Remove TDI_ada.

 c-family/

 2012-10-04  Arnaud Charlet  char...@adacore.com

 * c-ada-spec.c (print_ada_declaration): Remove handling of TDF_RAW.
 * c.opt (-fdump-ada-spec, -fdump-ada-spec-slim): Move switch 
 definition
 out of dumpfile.h.

 c/

 2012-10-04  Arnaud Charlet  char...@adacore.com

 * c-decl.c (c_write_global_declarations): Fix handling of
 -fdump-ada-spec*.

 cp/

 2012-10-04  Arnaud Charlet  char...@adacore.com

 * decl2.c (cp_write_global_declarations): Fix handling of
 -fdump-ada-spec*.

 Arno
 --
 Index: c-family/c.opt
 ===
 --- c-family/c.opt  (revision 192062)
 +++ c-family/c.opt  (working copy)
 @@ -799,6 +799,14 @@ fdollars-in-identifiers
  C ObjC C++ ObjC++
  Permit '$' as an identifier character

 +fdump-ada-spec
 +C ObjC C++ ObjC++ RejectNegative Var(flag_dump_ada_spec)
 +Write all declarations as Ada code transitively
 +
 +fdump-ada-spec-slim
 +C ObjC C++ ObjC++ RejectNegative Var(flag_dump_ada_spec_slim)
 +Write all declarations as Ada code for the given file only
 +
  felide-constructors
  C++ ObjC++ Var(flag_elide_constructors) Init(1)

 Index: c-family/c-ada-spec.c
 ===
 --- c-family/c-ada-spec.c   (revision 192062)
 +++ c-family/c-ada-spec.c   (working copy)
 @@ -2535,7 +2535,6 @@ print_ada_declaration (pretty_printer *b
int is_class = false;
tree name = TYPE_NAME (TREE_TYPE (t));
tree decl_name = DECL_NAME (t);
 -  bool dump_internal = get_dump_file_info (TDI_ada)-pflags  TDF_RAW;
tree orig = NULL_TREE;

if (cpp_check  cpp_check (t, IS_TEMPLATE))
 @@ -2705,8 +2704,7 @@ print_ada_declaration (pretty_printer *b
  }
else
  {
 -  if (!dump_internal
 -  TREE_CODE (t) == VAR_DECL
 +  if (TREE_CODE (t) == VAR_DECL
decl_name
*IDENTIFIER_POINTER (decl_name) == '_')
 return 0;
 @@ -2796,8 +2794,7 @@ print_ada_declaration (pretty_printer *b

/* If this function has an entry in the dispatch table, we cannot
  omit it.  */
 -  if (!dump_internal  !DECL_VINDEX (t)
 -  *IDENTIFIER_POINTER (decl_name) == '_')
 +  if (!DECL_VINDEX (t)  *IDENTIFIER_POINTER (decl_name) == '_')
 {
   if (IDENTIFIER_POINTER (decl_name)[1] == '_')
 return 0;
 Index: c/c-decl.c
 ===
 --- c/c-decl.c  (revision 192062)
 +++ c/c-decl.c  (working copy)
 @@ -10079,10 +10079,10 @@ c_write_global_declarations (void)
gcc_assert (!current_scope);

/* Handle -fdump-ada-spec[-slim]. */
 -  if (dump_initialized_p (TDI_ada))
 +  if (flag_dump_ada_spec || flag_dump_ada_spec_slim)
  {
/* Build a table of files to generate specs for */
 -  if (get_dump_file_info (TDI_ada)-pflags  TDF_SLIM)
 +  if (flag_dump_ada_spec_slim)
 collect_source_ref (main_input_filename);
else
 for_each_global_decl (collect_source_ref_cb);
 Index: cp/decl2.c
 ===
 --- cp/decl2.c  (revision 192062)
 +++ cp/decl2.c  (working copy)
 @@ -3698,9 +3698,9 @@ cp_write_global_declarations (void)
cgraph_process_same_body_aliases ();

/* Handle -fdump-ada-spec[-slim] */
 -  if (dump_initialized_p (TDI_ada))
 +  if (flag_dump_ada_spec || flag_dump_ada_spec_slim)
  {
 -  if (get_dump_file_info (TDI_ada)-pflags  TDF_SLIM)
 +  if (flag_dump_ada_spec_slim)
 collect_source_ref (main_input_filename);
else
 collect_source_refs (global_namespace);
 Index: dumpfile.c
 ===
 --- dumpfile.c  (revision 192062)
 +++ dumpfile.c  (working copy)
 @@ -57,8 +57,7 @@ static struct dump_file_info dump_files[
 0, 0, 0, 5},
{.vcg, tree-vcg, NULL, NULL, NULL, NULL, NULL, TDF_TREE,
 0, 0, 0, 6},
 -  {.ads, ada-spec, NULL, NULL, NULL, NULL, NULL, 0, 0, 0, 0, 7},
 -#define FIRST_AUTO_NUMBERED_DUMP 8
 +#define FIRST_AUTO_NUMBERED_DUMP 7

{NULL, tree-all, NULL, NULL, NULL, NULL, NULL, TDF_TREE,

[Ada] Remaining fixes to get MINIMIZED through test suite

2012-10-04 Thread Arnaud Charlet
This patch makes two minor corrections to fix two remaining tests in
the test suite that failed if run with -gnato2 forced on. There is
also a minor optimization of Compile_Time_Compare which improves
the results in some cases (noticed during testing, but does not
have any effect on the test suite).

The first problem was in handling folding of /= in some cases
which showed up as a bogus complaint about a length check failing.
The following should compile cleanly with -gnato2 with no messages.

 1. procedure Compov2 is
 2. begin
 3.for J in 1 .. 1 loop
 4.   declare
 5.  Dest   : constant Wide_String (J .. J + 6) := is_copy;
 6.   begin
 7.  null;
 8.   end;
 9.end loop;
10. end;

The other case was a mistake in handling of case expressions, which
shows up as an unexpected constraint error. No simple test case is
available for this problem.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-04  Robert Dewar  de...@adacore.com

* checks.adb (Minimize_Eliminate_Overflow_Checks): Dont reanalyze
if/case expression if nothing has changed (just reexpand). Stops
case expression from generating incorrect temporary.
* exp_ch4.adb (Expand_Compare_Minimize_Eliminate_Overflow):
Fix cut and paste typo for range analysis in NE (not equal) case.
* sem_eval.adb (Compile_Time_Compare): Small optimization to
catch some more cases.
* types.ads (Suppressed_Or_Checked): New subtype of
Overflow_Check_Type.

Index: types.ads
===
--- types.ads   (revision 192066)
+++ types.ads   (working copy)
@@ -737,7 +737,9 @@
 
subtype Minimized_Or_Eliminated is
  Overflow_Check_Type range Minimized .. Eliminated;
-   --  Definte subtypes so that clients don't need to know ordering. Note that
+   subtype Suppressed_Or_Checked is
+ Overflow_Check_Type range Suppressed .. Checked;
+   --  Define subtypes so that clients don't need to know ordering. Note that
--  Overflow_Check_Type is not marked as an ordered enumeration type.
 
--  The following structure captures the state of check suppression or
Index: checks.adb
===
--- checks.adb  (revision 192066)
+++ checks.adb  (working copy)
@@ -34,6 +34,7 @@
 with Exp_Tss;  use Exp_Tss;
 with Exp_Util; use Exp_Util;
 with Elists;   use Elists;
+with Expander; use Expander;
 with Eval_Fat; use Eval_Fat;
 with Freeze;   use Freeze;
 with Lib;  use Lib;
@@ -1272,8 +1273,7 @@
 Apply_Range_Check (N, Typ);
  end if;
 
-  elsif (Is_Record_Type (Typ)
-   or else Is_Private_Type (Typ))
+  elsif (Is_Record_Type (Typ) or else Is_Private_Type (Typ))
 and then Has_Discriminants (Base_Type (Typ))
 and then Is_Constrained (Typ)
   then
@@ -6709,10 +6709,12 @@
--  to be done in bignum mode), and the determined ranges of the operands.
 
--  After possible rewriting of a constituent subexpression node, a call is
-   --  made to reanalyze the node after setting Analyzed to False. To avoid a
-   --  recursive call into the whole overflow apparatus, and important rule for
-   --  this reanalysis call is that either Do_Overflow_Check must be False, or
-   --  if it is set, then the overflow checking mode must be temporarily set
+   --  made to either reexpand the node (if nothing has changed) or reanalyze
+   --  the node (if it has been modified by the overflow check processing).
+   --  The Analyzed_flag is set False before the reexpand/reanalyze. To avoid
+   --  a recursive call into the whole overflow apparatus, and important rule
+   --  for this call is that either Do_Overflow_Check must be False, or if
+   --  it is set, then the overflow checking mode must be temporarily set
--  to Checked/Suppressed. Either step will avoid the unwanted recursion.
 
procedure Minimize_Eliminate_Overflow_Checks
@@ -6761,6 +6763,17 @@
   --  range, then we must convert such operands back to the result type.
   --  This switch is properly set only when Bignum_Operands is False.
 
+  procedure Reexpand (C : Suppressed_Or_Checked);
+  --  This is called when we have not modifed the node, so we do not need
+  --  to reanalyze it. But we do want to reexpand it in either CHECKED
+  --  or SUPPRESSED mode (as indicated by the argument C) to get proper
+  --  expansion. It is important that we reset the mode to SUPPRESSED or
+  --  CHECKED, since if we leave it in MINIMIZED or ELIMINATED mode we
+  --  would reenter this routine recursively which would not be good!
+  --  Note that this is not just an optimization, testing has showed up
+  --  several complex cases in which renalyzing an already analyzed node
+  --  causes incorrect behavior.
+
   function In_Result_Range return Boolean;
   --  Returns True 

[Ada] Remaining fixes for -gnato3 (eliminated mode overflow checks)

2012-10-04 Thread Arnaud Charlet
This patch corrects a couple of errors in the handling of ELIMINATED mode
overflow checking. With this patch, the entire test suite passes with
-gnato3 mode forced on (there are some differences in output, but all
are expected).

The following three tests now work correctly

The following compiles quietly with -gnato3 and outputs TRUE

 1. with Text_IO; use Text_IO;
 2. procedure InGNATo3 is
 3.function K (X, Z : Integer) return Boolean is
 4.begin
 5.   return X in 1 .. Z ** 10;
 6.end;
 7. begin
 8.Put_Line (K (1, Integer'Last)'Img);
 9. end;

The following test compiles with the messages shown, regardless
of -gnato mode, and in particular this is now the -gnato3 output.

 1. procedure whynonso3 (a : integer) is
 2.x : constant := 1 + a ** 10;
   1   2
 non-static expression used in number declaration
 a is not static constant or named number (RM 4.9(5))

 3. begin
 4.null;
 5. end;

The following test compiles quietly in -gnato3 mode

 1. procedure BadAleno3 is
 2.type Arr is array (Positive range ) of Integer;
 3.N : Integer := 0;
 4.
 5.type ARR_DEF (D3 : INTEGER) is   record
 6.   C1 : Arr (N .. D3);
 7.end record;
 8.
 9.A : Arr := (1, 2);
10.X : constant ARR_DEF := (1, A );
11. begin
12.null;
13. end BadAleno3;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-04  Robert Dewar  de...@adacore.com

* exp_ch4.adb (Expand_Compare_Minimize_Eliminate_Overflow):
Deal with case where we get a bignum operand and cannot do a
range analysis.
* sem_eval.adb (Why_Not_Static): Deal with bignum operands

Index: exp_ch4.adb
===
--- exp_ch4.adb (revision 192070)
+++ exp_ch4.adb (working copy)
@@ -2325,9 +2325,12 @@
   Minimize_Eliminate_Overflow_Checks
 (Right_Opnd (N), Rlo, Rhi, Top_Level = False);
 
-  --  See if the range information decides the result of the comparison
+  --  See if the range information decides the result of the comparison.
+  --  We can only do this if we in fact have full range information (which
+  --  won't be the case if either operand is bignum at this stage).
 
-  case N_Op_Compare (Nkind (N)) is
+  if Llo /= No_Uint and then Rlo /= No_Uint then
+ case N_Op_Compare (Nkind (N)) is
  when N_Op_Eq =
 if Llo = Lhi and then Rlo = Rhi and then Llo = Rlo then
Set_True;
@@ -2369,12 +2372,13 @@
 elsif Llo  Rhi or else Lhi  Rlo then
Set_True;
 end if;
-  end case;
+ end case;
 
-  --  All done if we did the rewrite
+ --  All done if we did the rewrite
 
-  if Nkind (N) not in N_Op_Compare then
- return;
+ if Nkind (N) not in N_Op_Compare then
+return;
+ end if;
   end if;
 
   --  Otherwise, time to do the comparison
Index: sem_eval.adb
===
--- sem_eval.adb(revision 192070)
+++ sem_eval.adb(working copy)
@@ -37,6 +37,7 @@
 with Nmake;use Nmake;
 with Nlists;   use Nlists;
 with Opt;  use Opt;
+with Rtsfind;  use Rtsfind;
 with Sem;  use Sem;
 with Sem_Aux;  use Sem_Aux;
 with Sem_Cat;  use Sem_Cat;
@@ -5419,10 +5420,12 @@
 return;
  end if;
 
- --  Type must be scalar or string type
+ --  Type must be scalar or string type (but allow Bignum, since this
+ --  is really a scalar type from our point of view in this diagnosis).
 
  if not Is_Scalar_Type (Typ)
and then not Is_String_Type (Typ)
+   and then not Is_RTE (Typ, RE_Bignum)
  then
 Error_Msg_N
   (static expression must have scalar or string type  
@@ -5539,8 +5542,15 @@
 
  when N_Function_Call =
 Why_Not_Static_List (Parameter_Associations (N));
-Error_Msg_N (non-static function call (RM 4.9(6,18))!, N);
 
+--  Complain about non-static function call unless we have Bignum
+--  which means that the underlying expression is really some
+--  scalar arithmetic operation.
+
+if not Is_RTE (Typ, RE_Bignum) then
+   Error_Msg_N (non-static function call (RM 4.9(6,18))!, N);
+end if;
+
  when N_Parameter_Association =
 Why_Not_Static (Explicit_Actual_Parameter (N));
 


Re: [SH] PR 33135 - Remove mieee option in libgcc

2012-10-04 Thread Kaz Kojima
Oleg Endo oleg.e...@t-online.de wrote:
 Since the -mieee behavior has been fixed, is enabled by default on SH
 and the additional flags in libgcc can be removed.
 
 OK?

OK.

Regards,
kaz


[Ada] Legality of aspects specified on a full view

2012-10-04 Thread Arnaud Charlet
In Ada 2012, certain aaspects, such as Type_Invariant, can be specified on a
partial view of a type, or on the full view, but not in both This patch
rejects such duplications cleanly.

the command:

gcc -c -gnat12 -gnata r.ads

must yield:

r.ads:5:32: aspect already specified in private declaration

---
package R is
type T is private with Type_Invariant = Non_Null (T);
function Non_Null (X : T) return Boolean;
private
type T is new Integer with Type_Invariant = T /= 0;
function Non_Null (X : T) return Boolean is (X /= 0);
end R;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-04  Ed Schonberg  schonb...@adacore.com

* sem_ch3.adb (Check_Duplicate_Aspects): Diagnose properly
aspects that appear in the partial and the full view of a type.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 192066)
+++ sem_ch3.adb (working copy)
@@ -23,6 +23,7 @@
 --  --
 --
 
+with Aspects;  use Aspects;
 with Atree;use Atree;
 with Checks;   use Checks;
 with Debug;use Debug;
@@ -14805,6 +14806,11 @@
   New_Id   : Entity_Id;
   Prev_Par : Node_Id;
 
+  procedure Check_Duplicate_Aspects;
+  --  Check that aspects specified in a completion have not been specified
+  --  already in the partial view. Type_Invariant and others can be
+  --  specified on either view but never on both.
+
   procedure Tag_Mismatch;
   --  Diagnose a tagged partial view whose full view is untagged.
   --  We post the message on the full view, with a reference to
@@ -14813,6 +14819,38 @@
   --  so we determine the position of the error message from the
   --  respective slocs of both.
 
+  -
+  -- Check_Duplicate_Aspects --
+  -
+  procedure Check_Duplicate_Aspects is
+ Prev_Aspects   : constant List_Id := Aspect_Specifications (Prev_Par);
+ Full_Aspects   : constant List_Id := Aspect_Specifications (N);
+ F_Spec, P_Spec : Node_Id;
+
+  begin
+ if Present (Prev_Aspects) and then Present (Full_Aspects) then
+F_Spec := First (Full_Aspects);
+while Present (F_Spec) loop
+   P_Spec := First (Prev_Aspects);
+   while Present (P_Spec) loop
+  if
+Chars (Identifier (P_Spec)) = Chars (Identifier (F_Spec))
+  then
+ Error_Msg_N
+   (aspect already specified in private declaration,
+ F_Spec);
+ Remove (F_Spec);
+ return;
+  end if;
+
+  Next (P_Spec);
+   end loop;
+
+   Next (F_Spec);
+end loop;
+ end if;
+  end Check_Duplicate_Aspects;
+
   --
   -- Tag_Mismatch --
   --
@@ -15022,6 +15060,10 @@
  (declaration of full view must appear in private part, N);
 end if;
 
+if Ada_Version = Ada_2012 then
+   Check_Duplicate_Aspects;
+end if;
+
 Copy_And_Swap (Prev, Id);
 Set_Has_Private_Declaration (Prev);
 Set_Has_Private_Declaration (Id);


[Ada]: Remove __gl_zero_cost_exceptions in the binder generated file

2012-10-04 Thread Arnaud Charlet
This variable wasn't used anymore.
No functional change. 

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-04  Tristan Gingold  ging...@adacore.com

* init.c (__gl_zero_cost_exceptions): Comment it as not used
anymore.
* bindgen.adb (Gen_Adainit): Do not emit Zero_Cost_Exceptions
anymore.

Index: bindgen.adb
===
--- bindgen.adb (revision 192066)
+++ bindgen.adb (working copy)
@@ -137,7 +137,6 @@
-- Num_Interrupt_States  : Integer;
-- Unreserve_All_Interrupts  : Integer;
-- Exception_Tracebacks  : Integer;
-   -- Zero_Cost_Exceptions  : Integer;
-- Detect_Blocking   : Integer;
-- Default_Stack_Size: Integer;
-- Leap_Seconds_Support  : Integer;
@@ -216,9 +215,6 @@
--  tracebacks are provided by default, so a value of zero for this
--  parameter does not necessarily mean no trace backs are available.
 
-   --  Zero_Cost_Exceptions is set to one if zero cost exceptions are used for
-   --  this partition, and to zero if longjmp/setjmp exceptions are used.
-
--  Detect_Blocking indicates whether pragma Detect_Blocking is active or
--  not. A value of zero indicates that the pragma is not present, while a
--  value of 1 signals its presence in the partition.
@@ -607,9 +603,6 @@
  __gl_exception_tracebacks););
  end if;
 
- WBI (  Zero_Cost_Exceptions : Integer;);
- WBI (  pragma Import (C, Zero_Cost_Exceptions,  
-  __gl_zero_cost_exceptions););
  WBI (  Detect_Blocking : Integer;);
  WBI (  pragma Import (C, Detect_Blocking,  
   __gl_detect_blocking););
@@ -803,17 +796,6 @@
 WBI (  Exception_Tracebacks := 1;);
  end if;
 
- Set_String (  Zero_Cost_Exceptions := );
-
- if Zero_Cost_Exceptions_Specified then
-Set_String (1);
- else
-Set_String (0);
- end if;
-
- Set_String (;);
- Write_Statement_Buffer;
-
  Set_String (  Detect_Blocking := );
 
  if Detect_Blocking then
Index: init.c
===
--- init.c  (revision 192066)
+++ init.c  (working copy)
@@ -103,12 +103,14 @@
 int   __gl_num_interrupt_states  = 0;
 int   __gl_unreserve_all_interrupts  = 0;
 int   __gl_exception_tracebacks  = 0;
-int   __gl_zero_cost_exceptions  = 0;
 int   __gl_detect_blocking   = 0;
 int   __gl_default_stack_size= -1;
 int   __gl_leap_seconds_support  = 0;
 int   __gl_canonical_streams = 0;
 
+/* This value is not used anymore, but kept for bootstrapping purpose.  */
+int   __gl_zero_cost_exceptions  = 0;
+
 /* Indication of whether synchronous signal handler has already been
installed by a previous call to adainit.  */
 int  __gnat_handler_installed  = 0;


[Ada] New preprocessor switch -a

2012-10-04 Thread Arnaud Charlet
This change introduces a new switch -a (all source text preserved) for
gnatprep and the integrated preprocessor, causing all source text to be
preserved (i.e. the if, all elsif and the else branch of a
#if construct all show up in the output). This is useful to perform
simple style checks on all branches.

The following command must produce the shown output:
$ gnatprep -a -c all_source_.adb.in all_source_text.adb
$ cat all_source_text.adb
procedure All_Source_Text is
beGin
   Some_Code;

--! #if Some_Condition then
   Do_Something;
--! #else
   Do_Something_else;
--! #end if;

   Some_More_Code;
end All_Source_Text;

Input:
procedure All_Source_Text is
beGin
   Some_Code;

#if Some_Condition then
   Do_Something;
#else
   Do_Something_else;
#end if;

   Some_More_Code;
end All_Source_Text;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-04  Thomas Quinot  qui...@adacore.com

* prep.adb, prepcomp.adb, gprep.adb, opt.ads: New preprocessor switch
-a (all source text preserved).

Index: prep.adb
===
--- prep.adb(revision 192066)
+++ prep.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 2002-2010, Free Software Foundation, Inc. --
+--  Copyright (C) 2002-2012, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -292,8 +292,8 @@
  Result.Value := End_String;
   end if;
 
-  --  Now, check the syntax of the symbol (we don't allow accented and
-  --  wide characters)
+  --  Now, check the syntax of the symbol (we don't allow accented or
+  --  wide characters).
 
   if Name_Buffer (1) not in 'a' .. 'z'
 and then Name_Buffer (1) not in 'A' .. 'Z'
@@ -356,7 +356,7 @@
begin
   --  Always return False when not inside an #if statement
 
-  if Pp_States.Last = Ground then
+  if Opt.No_Deletion or else Pp_States.Last = Ground then
  return False;
   else
  return Pp_States.Table (Pp_States.Last).Deleting;
Index: prepcomp.adb
===
--- prepcomp.adb(revision 192066)
+++ prepcomp.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 2003-2010, Free Software Foundation, Inc. --
+--  Copyright (C) 2003-2012, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -60,6 +60,7 @@
   Undef_False  : Boolean:= False;
   Always_Blank : Boolean:= False;
   Comments : Boolean:= False;
+  No_Deletion  : Boolean:= False;
   List_Symbols : Boolean:= False;
   Processed: Boolean:= False;
end record;
@@ -73,6 +74,7 @@
   Undef_False  = False,
   Always_Blank = False,
   Comments = False,
+  No_Deletion  = False,
   List_Symbols = False,
   Processed= False);
 
@@ -330,6 +332,16 @@
--  significant.
 
case Sinput.Source (Token_Ptr) is
+  when 'a' =
+
+ --  All source text preserved (also implies -u)
+
+ if Name_Len = 1 then
+Current_Data.No_Deletion := True;
+Current_Data.Undef_False := True;
+OK := True;
+ end if;
+
   when 'u' =
 
  --  Undefined symbol are False
@@ -581,15 +593,15 @@
 
   --  Set the preprocessing flags according to the preprocessing data
 
-  if Current_Data.Comments and then not Current_Data.Always_Blank then
+  if Current_Data.Comments and not Current_Data.Always_Blank then
  Comment_Deleted_Lines := True;
  Blank_Deleted_Lines   := False;
-
   else
  Comment_Deleted_Lines := False;
  Blank_Deleted_Lines   := True;
   end if;
 
+  No_Deletion := Current_Data.No_Deletion;
   Undefined_Symbols_Are_False := Current_Data.Undef_False;
   List_Preprocessing_Symbols  := Current_Data.List_Symbols;
 

[Ada] Warn on Ada 2012 set membership test duplicate element

2012-10-04 Thread Arnaud Charlet
This patch adds a warning if a duplicate literal entry is found in
an Ada 2012 set membership, as shown by this example:

 1. pragma Ada_2012;
 2. package Dupset is
 3.a : integer;
 4.b : character;
 5.c : boolean := a in 1 |
 6.2 |
 7.3 |
 8.1 |
   |
 warning: duplicate of value given at line 5

 9.5;
10.d : boolean := b in 'a' |
11.'b' |
12.'c' |
13.'b';
   |
 warning: duplicate of value given at line 11

14.
15.type Day is (Mon, Tue, Wed, Thu, Fri);
16.x : Day;
17.e : boolean := x in Mon | Tue |
18.Wed | Mon;
 |
 warning: duplicate of value given at line 17

19. end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-04  Robert Dewar  de...@adacore.com

* sem_res.adb (Resolve_Set_Membership): Warn on duplicates.

Index: sem_res.adb
===
--- sem_res.adb (revision 192066)
+++ sem_res.adb (working copy)
@@ -7685,10 +7685,11 @@
   
 
   procedure Resolve_Set_Membership is
- Alt : Node_Id;
+ Alt  : Node_Id;
+ Ltyp : constant Entity_Id := Etype (L);
 
   begin
- Resolve (L, Etype (L));
+ Resolve (L, Ltyp);
 
  Alt := First (Alternatives (N));
  while Present (Alt) loop
@@ -7699,11 +7700,51 @@
 if not Is_Entity_Name (Alt)
   or else not Is_Type (Entity (Alt))
 then
-   Resolve (Alt, Etype (L));
+   Resolve (Alt, Ltyp);
 end if;
 
 Next (Alt);
  end loop;
+
+ --  Check for duplicates for discrete case
+
+ if Is_Discrete_Type (Ltyp) then
+declare
+   type Ent is record
+  Alt : Node_Id;
+  Val : Uint;
+   end record;
+
+   Alts  : array (0 .. List_Length (Alternatives (N))) of Ent;
+   Nalts : Nat;
+
+begin
+   --  Loop checking duplicates. This is quadratic, but giant sets
+   --  are unlikely in this context so it's a reasonable choice.
+
+   Nalts := 0;
+   Alt := First (Alternatives (N));
+   while Present (Alt) loop
+  if Is_Static_Expression (Alt)
+and then (Nkind_In (Alt, N_Integer_Literal,
+ N_Character_Literal)
+   or else Nkind (Alt) in N_Has_Entity)
+  then
+ Nalts := Nalts + 1;
+ Alts (Nalts) := (Alt, Expr_Value (Alt));
+
+ for J in 1 .. Nalts - 1 loop
+if Alts (J).Val = Alts (Nalts).Val then
+   Error_Msg_Sloc := Sloc (Alts (J).Alt);
+   Error_Msg_N (duplicate of value given#?, Alt);
+end if;
+ end loop;
+  end if;
+
+  Alt := Next (Alt);
+   end loop;
+end;
+ end if;
   end Resolve_Set_Membership;
 
--  Start of processing for Resolve_Membership_Op


[Ada] Fix value of GNAT.Command_Line.Full_Switch on invalid switch

2012-10-04 Thread Arnaud Charlet
This patch fixes the value returned by Full_Switch when the user
provided an invalid long switch (instead of return --, Full_Switch
will now return --long), as in this example:

with GNAT.Command_Line;   use GNAT.Command_Line;
with Ada.Text_IO; use Ada.Text_IO;
procedure Main is
begin
   while True loop
  case Getopt (o: -long: s) is
 when 'o' | 's' = null;
 when '-' = null;
 when others = exit;
  end case;
   end loop;
exception
   when GNAT.Command_Line.Invalid_Switch =
  Put_Line (invalid switch:   Full_Switch);
end Main;

Calling ./main --lond will now indicate that --lond is invalid,
not --.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-04  Emmanuel Briot  br...@adacore.com

* g-comlin.adb (Getopt): Fix value of Full_Switch returned in case of
invalid switch.

Index: g-comlin.adb
===
--- g-comlin.adb(revision 192066)
+++ g-comlin.adb(working copy)
@@ -39,6 +39,10 @@
 
 package body GNAT.Command_Line is
 
+   --  General note: this entire body could use much more commenting. There
+   --  are large sections of uncommented code throughout, and many formal
+   --  parameters of local subprograms are not documented at all ???
+
package CL renames Ada.Command_Line;
 
type Switch_Parameter_Type is
@@ -56,6 +60,12 @@
   Extra: Character := ASCII.NUL);
pragma Inline (Set_Parameter);
--  Set the parameter that will be returned by Parameter below
+   --
+   --  Extra is a character that needs to be added when reporting Full_Switch.
+   --  (it will in general be the switch character, for instance '-').
+   --  Otherwise, Full_Switch will report 'f' instead of '-f'. In particular,
+   --  it needs to be set when reporting an invalid switch or handling '*'.
+   --
--  Parameters need to be defined ???
 
function Goto_Next_Argument_In_Section (Parser : Opt_Parser) return Boolean;
@@ -95,9 +105,9 @@
   Index_In_Switches : out Integer;
   Switch_Length : out Integer;
   Param : out Switch_Parameter_Type);
-   --  Return the Longest switch from Switches that at least partially
-   --  partially Arg. Index_In_Switches is set to 0 if none matches.
-   --  What are other parameters??? in particular Param is not always set???
+   --  Return the Longest switch from Switches that at least partially matches
+   --  Arg. Index_In_Switches is set to 0 if none matches. What are other
+   --  parameters??? in particular Param is not always set???
 
procedure Unchecked_Free is new Ada.Unchecked_Deallocation
  (Argument_List, Argument_List_Access);
@@ -663,17 +673,45 @@
 
  if Index_Switches = 0 then
 
---  Depending on the value of Concatenate, the full switch is
---  a single character or the rest of the argument.
+--  Find the current switch that we did not recognize. This is in
+--  fact difficult because Getopt does not know explicitly about
+--  short and long switches. Ideally, we would want the following
+--  behavior:
 
-End_Index :=
-  (if Concatenate then Parser.Current_Index else Arg'Last);
+--  * for short switches, with Concatenate:
+--if -a is not recognized, and the command line has -daf
+--we should report the invalid switch as -a.
 
+--  * for short switches, wihtout Concatenate:
+--we should report the invalid switch as -daf.
+
+--  * for long switches:
+--if the commadn line is --long we should report --long
+--as unrecongized.
+
+--  Unfortunately, the fact that long switches start with a
+--  duplicate switch character is just a convention (so we could
+--  have a long switch -long for instance). We'll still rely on
+--  this convention here to try and get as helpful an error message
+--  as possible.
+
+--  Long switch case (starting with double switch character)
+
+if Arg (Arg'First + 1) = Parser.Switch_Character then
+   End_Index := Arg'Last;
+
+--  Short switch case
+
+else
+   End_Index :=
+ (if Concatenate then Parser.Current_Index else Arg'Last);
+end if;
+
 if Switches (Switches'First) = '*' then
 
-   --  Always prepend the switch character, so that users know that
-   --  this comes from a switch on the command line. This is
-   --  especially important when Concatenate is False, since
+   --  Always prepend the switch character, so that users know
+   --  that this comes from a switch on the command line. This
+   --  is especially 

Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-04 Thread Steven Bosscher
On Wed, Oct 3, 2012 at 5:35 PM, Steven Bosscher stevenb@gmail.com wrote:
 The worst result is this:
 Compressing live ranges: from 726174 to 64496 - 8%, pre_count 40476128, 
 post_count 12483414

 But that's still a lot better than before the patch for the same function:
 Compressing live ranges: from 1742569 to 73069 - 4%, pre_count 40842330, 
 post_count 12479992

Walking basic blocks with FOR_EACH_BB_REVERSE gives:

Only FOR_EACH_BB_REVERSE:
Compressing live ranges: from 1742579 to 429746 - 24% pre_count
41106212, post_count 34376494
Compressing live ranges: from 1742569 to 63000 - 3% pre_count
40835340, post_count 11055747

FOR_EACH_BB_REVERSE + need_curr_point_incr:
Compressing live ranges: from 726184 to 416529 - 57% pre_count
40743516, post_count 34376846
Compressing live ranges: from 726174 to 61840 - 8% pre_count 40472806,
post_count 11055747

The combination of the two changes takes ~20s off the ~180s for LRA
create live ranges.

Ciao!
Steven


Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-04 Thread Tobias Burnus
Thanks for the suggestions. The attached patch changes all .-something 
symbol names, which I found.


Build and regtested on x86-64-gnu-linux.
OK for the trunk and 4.7? (.saved_dovar also occurs in 4.6; we could 
also backport that part to 4.6, but I am not sure whether it is needed.)


We probably should also bump the .mod version (and timely also commit 
the patch http://gcc.gnu.org/ml/fortran/2012-04/msg00033.html). Comments?


Tobias


Am 04.10.2012 01:07, schrieb David Edelsohn:

For C and C++, identifiers beginning with underscore and upper case
letter or with two underscores are reserve to the implementation.  C++
uses _Z for mangling.

Maybe Fortran could prepend _F.  Something beginning with an
underscore seems like a much better choice, given the rules about
reserved identifiers.

Thanks, David

On Wed, Oct 3, 2012 at 5:00 PM, Tobias Burnus bur...@net-b.de wrote:

David,


David Edelsohn wrote:

I am not sure why you chose a period and how best to correct this.


Well, in principle any name which the user cannot enter would do. (Not
enter: At least not as valid Fortran identifier.)

The reason for choosing . is that dotvar_name is used elsewhere in
gfortran for such identifier for the string-length variable belonging to
var_name, e.g. ._result in trans-decl.c. I assume the reason that it
didn't pop up with those is that those are local variables, but I wouldn't
be surprised if it would break elsewhere.

I wonder whether @ would work, otherwise, one could also use _. The only
other problem is that it will break the ABI. On the other hand, it's a
rather new feature and if we bump the .mod version number, the chance that
one effectively forces the user to re-compile is rather high. So far we
always bumped the .mod version number as something changed. There are also
some other patches pending which effectively lead to a bump in the .mod
version.

(The .mod version won't affect code which doesn't use modules such as
BLAS/LAPACK or any Fortran 66/77 code, but those won't be affected by the
ABI change anyway as there the name doesn't propagate as it does with
modules..)


Thanks for investigating the test-suite failure.

Tobias


2012-10-04  Tobias Burnus  bur...@net-b.de

	* trans-decl.c (gfc_create_string_length, create_function_arglist):
	Don't create a symbol which contains a dot.
	* trans-stmt.c (gfc_trans_simple_do, gfc_trans_do): Ditto.

diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 910b150..f41fc8b 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -1097,9 +1097,9 @@ gfc_create_string_length (gfc_symbol * sym)
 
   /* Also prefix the mangled name.  */
   if (sym-module)
-	name = gfc_get_string (.__%s_MOD_%s, sym-module, sym-name);
+	name = gfc_get_string (_F_%s_MOD_%s, sym-module, sym-name);
   else
-	name = gfc_get_string (.%s, sym-name);
+	name = gfc_get_string (_F%s, sym-name);
 
   length = build_decl (input_location,
 			   VAR_DECL, get_identifier (name),
@@ -1984,7 +1984,7 @@ create_function_arglist (gfc_symbol * sym)
 
 	  length = build_decl (input_location,
 			   PARM_DECL,
-			   get_identifier (.__result),
+			   get_identifier (_Flen__result),
 			   len_type);
 	  if (!sym-ts.u.cl-length)
 	{
@@ -2007,7 +2007,7 @@ create_function_arglist (gfc_symbol * sym)
 		{
 		  tree len = build_decl (input_location,
 	 VAR_DECL,
-	 get_identifier (..__result),
+	 get_identifier (_Flen2__result),
 	 gfc_charlen_type_node);
 		  DECL_ARTIFICIAL (len) = 1;
 		  TREE_USED (len) = 1;
diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 204f069..37fc6ee 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -1376,7 +1376,7 @@ gfc_trans_simple_do (gfc_code * code, stmtblock_t *pblock, tree dovar,
   /* Save value for do-tinkering checking. */
   if (gfc_option.rtcheck  GFC_RTCHECK_DO)
 {
-  saved_dovar = gfc_create_var (type, .saved_dovar);
+  saved_dovar = gfc_create_var (type, _F_saved_dovar);
   gfc_add_modify_loc (loc, pblock, saved_dovar, dovar);
 }
 
@@ -1581,7 +1581,7 @@ gfc_trans_do (gfc_code * code, tree exit_cond)
   /* Save value for do-tinkering checking. */
   if (gfc_option.rtcheck  GFC_RTCHECK_DO)
 {
-  saved_dovar = gfc_create_var (type, .saved_dovar);
+  saved_dovar = gfc_create_var (type, _F_saved_dovar);
   gfc_add_modify_loc (loc, block, saved_dovar, dovar);
 }
 


[RFA 1/5] New port: CR16: Remove gdb from noconfigdirs in configure.ac

2012-10-04 Thread Kaushik Phatak
Hi,
This patch is one of patch set to add a new port (National Instruments CR16) in 
gdb.
This patch will,
- Remove gdb from noconfigdirs in top-level configure.ac.
- Add target-lobgloss
- Make target OS independent in config.sub

OK for gcc and binutils?

Regards,
Kaushik


2012-10-04  Kaushik Phatak  kaushik.pha...@kpitcummins.com
Changelog
* configure.ac (noconfigdirs): Remove gdb. Add target-libgloss.
* configure: Regenerate.
* config.sub: Make target OS independent

--- ./gdb_src.orig/configure2012-06-28 17:20:52.0 +0530
+++ ./gdb_src/configure 2012-08-17 16:56:10.0 +0530
@@ -3447,7 +3447,7 @@ case ${target} in
 noconfigdirs=$noconfigdirs target-libgloss gdb
 ;;
   cr16-*-*)
-noconfigdirs=$noconfigdirs gdb
+noconfigdirs=$noconfigdirs target-libgloss
 ;;
   d10v-*-*)
 noconfigdirs=$noconfigdirs target-libgloss
--- ./gdb_src.orig/configure.ac 2012-08-17 23:07:18.0 +0530
+++ ./gdb_src/configure.ac  2012-08-17 16:56:17.0 +0530
@@ -873,7 +873,7 @@ case ${target} in
 noconfigdirs=$noconfigdirs target-libgloss gdb
 ;;
   cr16-*-*)
-noconfigdirs=$noconfigdirs gdb
+noconfigdirs=$noconfigdirs target-libgloss
 ;;
   d10v-*-*)
 noconfigdirs=$noconfigdirs target-libgloss
--- ./gdb_src.orig/config.sub   2012-04-25 21:23:25.0 +0530
+++ ./gdb_src/config.sub2012-09-24 11:26:00.0 +0530
@@ -575,7 +575,6 @@ case $basic_machine in
;;
cr16 | cr16-*)
basic_machine=cr16-unknown
-   os=-elf
;;
crds | unos)
basic_machine=m68k-crds




Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-04 Thread Janus Weil
2012/10/4 Tobias Burnus bur...@net-b.de:
 Thanks for the suggestions. The attached patch changes all .-something
 symbol names, which I found.

 Build and regtested on x86-64-gnu-linux.
 OK for the trunk and 4.7? (.saved_dovar also occurs in 4.6; we could also
 backport that part to 4.6, but I am not sure whether it is needed.)

I think at least for trunk it should be ok.


 We probably should also bump the .mod version (and timely also commit the
 patch http://gcc.gnu.org/ml/fortran/2012-04/msg00033.html).

Strong agreement here :)


 Comments?

When backporting to 4.6 and 4.7, do you intend to also bump the module
version there? Does that make sense?

Moreover, .__result probably goes back even further than 4.6, right?

Cheers,
Janus



 Am 04.10.2012 01:07, schrieb David Edelsohn:

 For C and C++, identifiers beginning with underscore and upper case
 letter or with two underscores are reserve to the implementation.  C++
 uses _Z for mangling.

 Maybe Fortran could prepend _F.  Something beginning with an
 underscore seems like a much better choice, given the rules about
 reserved identifiers.

 Thanks, David

 On Wed, Oct 3, 2012 at 5:00 PM, Tobias Burnus bur...@net-b.de wrote:

 David,


 David Edelsohn wrote:

 I am not sure why you chose a period and how best to correct this.


 Well, in principle any name which the user cannot enter would do. (Not
 enter: At least not as valid Fortran identifier.)

 The reason for choosing . is that dotvar_name is used elsewhere in
 gfortran for such identifier for the string-length variable belonging to
 var_name, e.g. ._result in trans-decl.c. I assume the reason that it
 didn't pop up with those is that those are local variables, but I
 wouldn't
 be surprised if it would break elsewhere.

 I wonder whether @ would work, otherwise, one could also use _. The
 only
 other problem is that it will break the ABI. On the other hand, it's a
 rather new feature and if we bump the .mod version number, the chance
 that
 one effectively forces the user to re-compile is rather high. So far we
 always bumped the .mod version number as something changed. There are
 also
 some other patches pending which effectively lead to a bump in the .mod
 version.

 (The .mod version won't affect code which doesn't use modules such as
 BLAS/LAPACK or any Fortran 66/77 code, but those won't be affected by the
 ABI change anyway as there the name doesn't propagate as it does with
 modules..)


 Thanks for investigating the test-suite failure.

 Tobias




Re: [PATCH] Use __cxa_atexit on OpenBSD

2012-10-04 Thread Mark Kettenis
 Date: Wed, 3 Oct 2012 17:45:21 +0200 (CEST)
 From: Gerald Pfeifer ger...@pfeifer.com
 
 On Sat, 15 Sep 2012, Ian Lance Taylor wrote:
  2012-09-02  Mark Kettenis  kette...@openbsd.org
 
  * config.gcc (*-*-openbsd4.[3-9]|*-*-openbsd[5-9]*): Set
  default_use_cxa_atexit to yes.
  This is OK.
 
 I committed this to trunk and plan on doing so for the 4.7 branch
 as well (so that OpenBSD can benefit from a release branch of GCC
 carrying this) unless there are objections.

Thanks.  I won't object, although I'm not sure patching up the 4.7
branch is all that useful at this point given that it doesn't support
OpenBSD/amd64.

Any chance of getting the other diff that Ian approved

  http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01070.html

committed to trunk?

Thanks,

Mark


Re: [Patch, Fortran, OOP] PR 54784: [4.7/4.8 Regression] wrong code in polymorphic allocation with SOURCE

2012-10-04 Thread Mikael Morin
Le 03/10/2012 18:48, Janus Weil a écrit :
 Hi all,
 
 here is a small patch for a wrong-code regression with polymorphic
 allocation. The problem is that we falsely detect the allocation
 variable to be a polymorphic array (although it is a scalar). For
 further details see the PR, in particular comment 4.
 
 Regtested on x86_64-unknown-linux-gnu. Ok for trunk and 4.7?
Hello, the fix looks incomplete.

 
 Index: gcc/fortran/trans-stmt.c
 ===
 --- gcc/fortran/trans-stmt.c  (revision 192004)
 +++ gcc/fortran/trans-stmt.c  (working copy)
 @@ -5145,7 +5145,9 @@ gfc_trans_allocate (gfc_code * code)
 dataref = actual-next-expr-ref;
 /* Make sure we go up through the reference chain to
the _data reference, where the arrayspec is found.  */
 -   while (dataref-next  dataref-next-type != REF_ARRAY)
 +   while (!(dataref-type == REF_COMPONENT
 + strcmp (dataref-u.c.component-name, _data) == 0)
 +   dataref-next)
this stops on the first class reference, while it seems to me that it
should stop on the last.

Mikael


Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-04 Thread Janne Blomqvist
On Thu, Oct 4, 2012 at 1:50 PM, Janus Weil ja...@gcc.gnu.org wrote:
 Thanks for the suggestions. The attached patch changes all .-something
 symbol names, which I found.

 Build and regtested on x86-64-gnu-linux.
 OK for the trunk and 4.7? (.saved_dovar also occurs in 4.6; we could also
 backport that part to 4.6, but I am not sure whether it is needed.)

 I think at least for trunk it should be ok.

 One more comment: Since its appearance is a bit scattered in the code,
 how about using a small macro which prepends the _F prefix to a
 given variable name?

For normal identifiers in a module, the current scheme of
__modulename_MOD_symbolname is probably too widely entrenched e.g.
in debuggers and various interoperability toolkits to be worth
changing at this point. The OOP stuff OTOH is IMHO sufficiently new
that there is little harm in changing it.

I was thinking about this in the beginning of the year and produced
the attached document (I never sent it before as I realized I wouldn't
have time to do anything about it myself in the near future). Funnily,
I also came up with the idea of _F at that point, though maybe not
so surprising as I also studied the g++ name mangling for inspiration.
Also note that the document itself has a perhaps naive approach which
does not consider backwards compatibility enough (see e.g. the above
paragraph).

Some related mangling PR's:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51802

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606 (not the PR itself,
but the discussion in the comments)

-- 
Janne Blomqvist
Gfortran name mangling ABI
==

It would be nice if GFortran would have a documented, consistent name
mangling ABI. This would reduce the risk of an inadvertent name clash
with some other software, and make it easier for 3rd party tools such
as debuggers, profilers etc. to demangle symbol names.

If, and when, the ABI is broken due to the array descriptor update, we
could also think about fixing this issue.

An explicit non-goal of this is to come up with some common
cross-compiler name mangling ABI, as it seems very unlikely that other
compiler vendors will want to change their mangling, and mangling
alone is a very small part of ABI compability.

Another non-goal is to change the mangling of the F77 interface
(lowercase, append underscore(s) depending on the compiler
options). Thus the following discussion refers only to mangling F90+
names, e.g. module procedures and so forth.

Rules of the road
-

Some names are reserved identifiers, reserved for the
implementation. Mangled names should be such reserved names, in order
to not clash with user-defined names.

Fortran specifies that names are of the form letter + alphanum.

C and C++ reserved identifiers (that is, identifiers which are
reserved for use by the implementation) are

- A name beginning with an underscore followed by a capital letter.

- A name containing double underscore (plain C reserves only names
  beginning with double underscore, but C++ reserves anything with
  double underscores).

- POSIX adds additional restrictions wrt various POSIX functionality.

Thus, choosing names beginning with an underscore followed by either a
capital letter or another underscore should be good.

Current GFortran name mangling
--

Currently the name mangling is a bit ad-hoc, with several different
prefixes depending on which part of the compiler is used:

- Procedure foo in module bar: __bar_MOD_foo

- Runtime library: _gfortran_XXX

- Runtime library internal functions (not visible if symbol visibility
  is supported): _gfortrani_XXX

- ISO_C_BINDING functions in the library: __iso_c_binding_XXX

- OOP stuff: __class_XXX

- Others?


The C++ name mangling
-

For inspiration, see the C++ name mangling ABI that GCC follows at 

http://sourcery.mentor.com/public/cxx-abi/abi.html#mangling

http://sourcery.mentor.com/public/cxx-abi/abi-examples.html#mangling

http://sourcery.mentor.com/public/cxx-abi/abi-mangling.html


The C++ name mangling ABI, in a very simplified form, is

- Everything has the prefix _Z.

- names are encoded as length, name pairs.

- At the end of a function symbol there is a E, followed by the type
  of the function arguments (in order to handle overloading).

- Outside of names, characters have meaning as various flags,
  e.g. TI means the identifier is a typeinfo structure, or TV for
  a virtual table, and so on.

E.g. a member function foo(void) in a class Test (Test::foo())
would thus be encoded as _ZN4Test3fooEv. (The N means it's a
nested name, v at the end means the void argument).


Proposed GFortran mangling
--

Fortran name mangling requirements are considerably simpler than C++,
due to Fortran not having function overloading (yes, Fortran has
generic interfaces, which are a bit different and don't require
mangling), nor templates.

- Every symbol has the prefix _F (F as in 

Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-04 Thread Tobias Burnus

Hi Janus,

Janus Weil wrote:
When backporting to 4.6 and 4.7, do you intend to also bump the module 
version there? Does that make sense?


Probably not. The .__result and the .saved_dovar are not ABI relevant, 
thus, they can be changed without problems - but also not that 
important. The module variable is more crucial, but I think we should 
avoid bumping the module version. I think we should only change the trunk.



Moreover, .__result probably goes back even further than 4.6, right?


Presumably yes.

Janus Weil wrote:

One more comment: Since its appearance is a bit scattered in the code,
how about using a small macro which prepends the _F prefix to a
given variable name?

Btw, note that we are using a double underscore scheme in other places
(like __class, __vtab, __vtype, etc). I have even used an '@' in one
place, namely (hidden) procedure pointer results (ppr@). Is there a
need to unify all those cases?


It think it would be useful to unify those. Are you volunteering?

Regarding the @: GCC only has: NO_DOLLAR_IN_LABEL and NO_DOT_IN_LABEL. 
However, I would be also careful with the @ label.


The @ has also a special meaning to assemblers. However, if the name 
with the @ is not publicly visible, the name does not occur in the 
assembler file (except as string for the debug information). Thus, there 
should be no real problem. Still, one should consider to change also 
them to the _F convention.


Tobias


Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-04 Thread Richard Guenther
On Thu, Oct 4, 2012 at 11:43 AM, Steven Bosscher stevenb@gmail.com wrote:
 On Wed, Oct 3, 2012 at 5:35 PM, Steven Bosscher stevenb@gmail.com wrote:
 The worst result is this:
 Compressing live ranges: from 726174 to 64496 - 8%, pre_count 40476128, 
 post_count 12483414

 But that's still a lot better than before the patch for the same function:
 Compressing live ranges: from 1742569 to 73069 - 4%, pre_count 40842330, 
 post_count 12479992

 Walking basic blocks with FOR_EACH_BB_REVERSE gives:

 Only FOR_EACH_BB_REVERSE:
 Compressing live ranges: from 1742579 to 429746 - 24% pre_count
 41106212, post_count 34376494
 Compressing live ranges: from 1742569 to 63000 - 3% pre_count
 40835340, post_count 11055747

 FOR_EACH_BB_REVERSE + need_curr_point_incr:
 Compressing live ranges: from 726184 to 416529 - 57% pre_count
 40743516, post_count 34376846
 Compressing live ranges: from 726174 to 61840 - 8% pre_count 40472806,
 post_count 11055747

 The combination of the two changes takes ~20s off the ~180s for LRA
 create live ranges.

Isn't _REVERSE vs. non-_RESERVE still kind-of random order?  Thus, doesn't
the above show there exists an optimal order for processing which we could use?
(I realize _REVERSE is a simple solution, but might there not exist a
pathological
case where _REVERSE is even worse than non-_REVERSE?)

Richard.

 Ciao!
 Steven


Re: libgo patch committed: Use libbacktrace

2012-10-04 Thread Rainer Orth
Ian Lance Taylor i...@google.com writes:

 This patch to libgo changes it to use libbacktrace.  Previously
 backtraces required the Go package debug/elf to register itself with the
 runtime during the package initialization, which only worked if the
 program actually imported debug/elf one way or another.  Bootstrapped
 and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to
 mainline.

Unfortunately, this breaks all use of libgo on versions of Solaris  11
which lack strnlen:

Undefined   first referenced
 symbol in file
strnlen /var/gcc/regression/trunk/10-gcc/build/i386-
pc-solaris2.10/libgo/.libs/libgo.so
ld: fatal: symbol referencing errors. No output written to a.out
collect2: error: ld returned 1 exit status
FAIL: bufio

One could either try to also link libiberty into libgo.la, but that has
the complication of needing to decide whether to use libiberty.a or
pic/libiberty.a since libiberty is no libtool library.

Alternatively, one could add another implementation of strnlen to libgo,
which duplicates code.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: patch to fix

2012-10-04 Thread Richard Guenther
On Wed, Oct 3, 2012 at 7:15 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:
 The enclosed patch is the third of at least four patches that fix the
 problems associated with supporting integers on the target that are
 wider than two HOST_WIDE_INTs.

 While GCC claims to support OI mode, and we have two public ports that
 make minor use of this mode, in practice, compilation that uses OImode
 mode commonly gives the wrong result or ices.  We have a private port
 of GCC for an architecture that is further down the road to needing
 comprehensive OImode and we have discovered that this is unusable. We
 have decided to fix it in a general way that so that it is most
 beneficial to the GCC community.  It is our belief that we are just a
 little ahead of the X86 and the NEON and these patches will shortly be
 essential.

 The first two of these patches were primarily lexigraphical and have
 already been committed.They transformed the uses of CONST_DOUBLE
 so that it is easy to tell what the intended usage is.

 The underlying structures in the next two patches are very general:
 once they are added to the compiler, the compiler will be able to
 support targets with any size of integer from hosts of any size
 integer.

 The patch enclosed deals with the portable RTL parts of the compiler.
 The next patch, which is currently under construction deals with the
 tree level.  However, this patch can be put on the trunk as is, and it
 will eleviate many, but not all of the current limitations in the rtl
 parts of the compiler.

 Some of the patch is conditional, depending on a port defining the
 symbol 'TARGET_SUPPORTS_WIDE_INT' to be non zero.  Defining this
 symbol to be non zero is declaring that the port has been converted to
 use the new form or integer constants.  However, the patch is
 completely backwards compatible to allow ports that do not need this
 immediately to convert at their leasure.  The conversion process is
 not difficult, but it does require some knowledge of the port, so we
 are not volinteering to do this for all ports.

 OVERVIEW OF THE PATCH:

 The patch defines a new datatype, a 'wide_int' (defined in
 wide-int.[ch], and this datatype will be used to perform all of the
 integer constant math in the compiler.  Externally, wide-int is very
 similar to double-int except that it does not have the limitation that
 math must be done on exactly two HOST_WIDE_INTs.

 Internally, a wide_int is a structure that contains a fixed sized
 array of HOST_WIDE_INTs, a length field and a mode.  The size of the

That it has a mode sounds odd to me and makes it subtly different
from HOST_WIDE_INT and double-int.  Maybe the patch will tell
why this is so.

 array is determined at generation time by dividing the number of bits
 of the largest integer supported on the target by the number of bits
 in a HOST_WIDE_INT of the host.  Thus, with this format, any length of
 integer can be supported on any host.

 A new rtx type is created, the CONST_WIDE_INT, which contains a
 garbage collected array of HOST_WIDE_INTS that is large enough to hold
 the constant.  For the targets that define TARGET_SUPPORTS_WIDE_INT to
 be non zero, CONST_DOUBLES are only used to hold floating point
 values.  If the target leaves TARGET_SUPPORTS_WIDE_INT defined as 0,
 CONST_WIDE_INTs are not used and CONST_DOUBLEs are as they were
 before.

 CONST_INT does not change except that it is defined to hold all
 constants that fit in exactly one HOST_WIDE_INT.  Note that is slightly
 different than the current trunk.  Before this patch, the TImode
 constant '5' could either be in a CONST_INT or CONST_DOUBLE depending
 on which code path was used to create it.  This patch changes this so
 that if the constant fits in a CONST_INT then it is represented in a
 CONST_INT no matter how it is created.

 For the array inside a CONST_WIDE_INT, and internally in wide-int, we
 use a compressed form for integers that need more than one
 HOST_WIDE_INT.  Higher elements of the array are not needed if they
 are just a sign extension of the elements below them.  This does not
 imply that constants are signed or are sign extended, this is only a
 compression technique.

 While it might seem to be more esthetically pleasing to have not
 introduced the CONST_WIDE_INT and to have changed the representation
 of the CONST_INT to accomodate larger numbers, this would have both
 used more space and would be a time consuming change for the port
 maintainers.  We believe that most ports can be quickly converted with
 the current scheme because there is just not a lot of code in the back
 ends that cares about large constants.  Furthermore, the CONST_INT is
 very space efficient and even in a program that was heavy in large
 values, most constants would still fit in a CONST_INT.

 All of the parts of the rtl level that deal with CONST_DOUBLE as an
 now conditionally work with CONST_WIDE_INTs depending on the value
 of TARGET_SUPPORTS_WIDE_INT.  We believe that this 

Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-04 Thread Janus Weil
Hi,

 One more comment: Since its appearance is a bit scattered in the code,
 how about using a small macro which prepends the _F prefix to a
 given variable name?

 Btw, note that we are using a double underscore scheme in other places
 (like __class, __vtab, __vtype, etc). I have even used an '@' in one
 place, namely (hidden) procedure pointer results (ppr@). Is there a
 need to unify all those cases?


 It think it would be useful to unify those. Are you volunteering?

yeah, why not ;)

Attached is a draft patch (not regtested), which adds a macro
GFC_PREFIX (in gfortran.h) to prepend _F to the cases included in
Tobias' earlier patch as well as the OOP-related stuff and procedure
pointer results. It also bumps the module version.

Any comments so far? (Of course the name of the macro can be debated.
I just tried to keep it short for now.)

Cheers,
Janus


mangling.diff
Description: Binary data


Re: [Patch] Fix PR53397

2012-10-04 Thread Richard Guenther
On Tue, Oct 2, 2012 at 6:40 PM, Kumar, Venkataramanan
venkataramanan.ku...@amd.com wrote:
 Hi Richi,

 (Snip)
 + (!cst_and_fits_in_hwi (step))
 +{
 +  if( loop-inner != NULL)
 +{
 +  if (dump_file  (dump_flags  TDF_DETAILS))
 +{
 +  fprintf (dump_file, Reference %p:\n, (void *) ref);
 +  fprintf (dump_file, (base  );
 +  print_generic_expr (dump_file, base, TDF_SLIM);
 +  fprintf (dump_file, , step );
 +  print_generic_expr (dump_file, step, TDF_TREE);
 +  fprintf (dump_file, )\n);

 No need to repeat this - all references are dumped when we gather them.
 (Snip)

 The dumping happens at record_ref which is called after these statements to 
 record these references.

 When the step is invariant  we return from the function without recording the 
 references.

  so I thought of dumping the references here.

 Is there a cleaner way to dump the references at one place?

Yes, call dump_mem_ref then, instead of repeating parts of its body.

Richard.

 Regards,
 Venkat.



 -Original Message-
 From: Richard Guenther [mailto:rguent...@suse.de]
 Sent: Tuesday, October 02, 2012 5:42 PM
 To: Kumar, Venkataramanan
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [Patch] Fix PR53397

 On Mon, 1 Oct 2012, venkataramanan.ku...@amd.com wrote:

 Hi,

 The below patch fixes the FFT/Scimark regression caused by useless
 prefetch generation.

 This fix tries to make prefetch less aggressive by prefetching arrays
 in the inner loop, when the step is invariant in the entire loop nest.

 GCC currently tries to prefetch invariant steps when they are in the
 inner loop. But does not check if the step is variant in outer loops.

 In the scimark FFT case, the trip count of the inner loop varies by a
 non constant step, which is invariant in the inner loop.
 But the step variable is varying in outer loop. This makes inner loop
 trip count small (at run time varies sometimes as small as 1
 iteration)

 Prefetching ahead x iteration when the inner loop trip count is
 smaller than x leads to useless prefetches.

 Flag used: -O3 -march=amdfam10

 Before
 **  **
 ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
 ** for details. (Results can be submitted to p...@nist.gov) **
 **  **
 Using   2.00 seconds min time per kenel.
 Composite Score:  550.50
 FFT Mflops:38.66(N=1024)
 SOR Mflops:   617.61(100 x 100)
 MonteCarlo: Mflops:   173.74
 Sparse matmult  Mflops:   675.63(N=1000, nz=5000)
 LU  Mflops:  1246.88(M=100, N=100)


 After
 **  **
 ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
 ** for details. (Results can be submitted to p...@nist.gov) **
 **  **
 Using   2.00 seconds min time per kenel.
 Composite Score:  639.20
 FFT Mflops:   479.19(N=1024)
 SOR Mflops:   617.61(100 x 100)
 MonteCarlo: Mflops:   173.18
 Sparse matmult  Mflops:   679.13(N=1000, nz=5000)
 LU  Mflops:  1246.88(M=100, N=100)

 GCC regression make check -k passes with x86_64-unknown-linux-gnu
 New tests that PASS:

 gcc.dg/pr53397-1.c scan-assembler prefetcht0 gcc.dg/pr53397-1.c
 scan-tree-dump aprefetch Issued prefetch
 gcc.dg/pr53397-1.c (test for excess errors) gcc.dg/pr53397-2.c
 scan-tree-dump aprefetch loop variant step
 gcc.dg/pr53397-2.c scan-tree-dump aprefetch Not prefetching
 gcc.dg/pr53397-2.c (test for excess errors)


 Checked CPU2006 and polyhedron on latest AMD processor, no regressions noted.

 Ok to commit in trunk?

 regards,
 Venkat

 gcc/ChangeLog
 +2012-10-01  Venkataramanan Kumar  venkataramanan.ku...@amd.com
 +
 +   * tree-ssa-loop-prefetch.c (gather_memory_references_ref):$
 +   Perform non constant step prefetching in inner loop, only $
 +   when it is invariant in the entire loop nest.  $
 +   * testsuite/gcc.dg/pr53397-1.c: New test case $
 +   Checks we are prefecthing for loop invariant steps$
 +   * testsuite/gcc.dg/pr53397-2.c: New test case$
 +   Checks we are not prefecthing for loop variant steps
 +


 Index: gcc/testsuite/gcc.dg/pr53397-1.c
 ===
 --- gcc/testsuite/gcc.dg/pr53397-1.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/pr53397-1.c  (revision 0)
 @@ -0,0 +1,28 @@
 +/* Prefetching when the step is loop invariant.  */
 +
 +/* { dg-do compile } */
 +/* { dg-options -O3 -fprefetch-loop-arrays
 +-fdump-tree-aprefetch-details --param min-insn-to-prefetch-ratio=3
 +--param simultaneous-prefetches=10 -fdump-tree-aprefetch-details }
 +*/
 +
 +
 +double data[16384];
 +void prefetch_when_non_constant_step_is_invariant(int step, int n) {
 + int a;

Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 1:30 PM, Richard Guenther
richard.guent...@gmail.com wrote:
 Isn't _REVERSE vs. non-_RESERVE still kind-of random order?

Not at this stage. For cfglayout mode I would answer yes, but IRA/LRA
operates in cfgrtl mode, so the sequence of insns and basic blocks
must match. Therefore, if you walk the basic blocks in reverse, and
the insns in each basic block in reverse, you effectively work on a,
let's say, reverse extended basic block (??) in that your program
points are sequential across fallthrough edges.

  Thus, doesn't
 the above show there exists an optimal order for processing which we could 
 use?

There may be a smarter order: Could even walk blocks in that order if
you know a priori what path through the CFG minimizes the length of
the live range chains. But to know that order, you have to build the
chains. So chicken-and-egg...

 (I realize _REVERSE is a simple solution, but might there not exist a
 pathological case where _REVERSE is even worse than non-_REVERSE?)

Intuitively, I'm certain that _REVERSE is always better than
non-_REVERSE, although I don't know how to prove that :-)

Ciao!
Steven


Re: Propagate profile counts during switch expansion

2012-10-04 Thread Jan Hubicka
 Hi,
  This patch propagates the profile counts during RTL expansion. In
 many cases, there is no way to determine the exact count of an edge
 generated during the expansion. So this patch uses some simple
 heuristics to estimate the edge counts but ensures that the counts of
 the basic blocks corresponding to the cases are (nearly the) same as
 at the gimple level.
 
 Bootstrapped and profile-bootstrapped on an x86_64/linux machine. OK for 
 trunk?
 Index: gcc/expr.c
 ===
 --- gcc/expr.c (revision 191879)
 +++ gcc/expr.c (working copy)
 @@ -154,7 +154,7 @@ static rtx do_store_flag (sepops, rtx, enum machin
  #ifdef PUSH_ROUNDING
  static void emit_single_push_insn (enum machine_mode, rtx, tree);
  #endif
 -static void do_tablejump (rtx, enum machine_mode, rtx, rtx, rtx);
 +static void do_tablejump (rtx, enum machine_mode, rtx, rtx, rtx, int);
  static rtx const_vector_from_tree (tree);
  static void write_complex_part (rtx, rtx, bool);
 
 @@ -10894,7 +10894,7 @@ try_casesi (tree index_type, tree index_expr, tree
 
  static void
  do_tablejump (rtx index, enum machine_mode mode, rtx range, rtx table_label,
 -  rtx default_label)
 +  rtx default_label, int default_probability)

Please document default_probability.
  {
rtx temp, vector;
 
 @@ -10910,9 +10910,17 @@ do_tablejump (rtx index, enum machine_mode mode, r
   the maximum value of the range.  */
 
if (default_label)
 -emit_cmp_and_jump_insns (index, range, GTU, NULL_RTX, mode, 1,
 - default_label);
 +{
 +  emit_cmp_and_jump_insns (index, range, GTU, NULL_RTX, mode, 1,
 +   default_label);
 +  if (default_probability != -1)
 +{
 +  rtx jump_insn = get_last_insn();
 +  add_reg_note (jump_insn, REG_BR_PROB, GEN_INT 
 (default_probability));
 +}
 +}

dojump already does this kind of logic, but it is bit more cureful:
  emit_cmp_and_jump_insns (op0, op1, code, size, mode, unsignedp,
   if_true_label);
  if (prob != -1  profile_status != PROFILE_ABSENT)
{ 
  for (last = NEXT_INSN (last);
   last  NEXT_INSN (last);
   last = NEXT_INSN (last))
if (JUMP_P (last))
  break;
  if (last
   JUMP_P (last)
   ! NEXT_INSN (last)
   any_condjump_p (last))
{ 
  gcc_assert (!find_reg_note (last, REG_BR_PROB, 0));
  add_reg_note (last, REG_BR_PROB, GEN_INT (prob));
}
}

What about making emit_cmp_and_jump_insns taking the probability argument
and moving the code above inside?  Most of other places need updating to
propagate probabilities.

(compare_and_jump_seq in loop-unswitch probably also can be updated)
 @@ -10954,7 +10962,7 @@ do_tablejump (rtx index, enum machine_mode mode, r
 
  int
  try_tablejump (tree index_type, tree index_expr, tree minval, tree range,
 -   rtx table_label, rtx default_label)
 +   rtx table_label, rtx default_label, int default_probability)

Simiarly here.
 Index: gcc/cfgbuild.c
 ===
 --- gcc/cfgbuild.c (revision 191879)
 +++ gcc/cfgbuild.c (working copy)
 @@ -533,6 +533,23 @@ find_bb_boundaries (basic_block bb)
  purge_dead_tablejump_edges (bb, table);
  }
 
 +/* If there is at least one edge in EDGES with a non-zero count, then
 +   compute probabilities based on the existing counts.  */
 +
 +static bool
 +gen_probabilities_from_existing_counts ( VEC(edge,gc) *edges) {
 +  edge e;
 +  edge_iterator ei;
 +  gcov_type count_sum = 0;
 +  FOR_EACH_EDGE(e, ei, edges)
 +count_sum += e-count;
 +  if (count_sum == 0)
 +return false;
 +  FOR_EACH_EDGE(e, ei, edges)
 +e-probability = e-count * REG_BR_PROB_BASE / count_sum;
 +  return true;
 +}
 +
  /*  Assume that frequency of basic block B is known.  Compute frequencies
  and probabilities of outgoing edges.  */
 
 @@ -560,7 +577,6 @@ compute_outgoing_frequencies (basic_block b)
return;
   }
  }
 -
if (single_succ_p (b))
  {
e = single_succ_edge (b);
 @@ -568,7 +584,10 @@ compute_outgoing_frequencies (basic_block b)
e-count = b-count;
return;
  }
 -  guess_outgoing_edge_probabilities (b);
 +  else if (!gen_probabilities_from_existing_counts (b-succs)){
 +/* All outgoing edges of B have zero count. Guess probabilities.  */
 +guess_outgoing_edge_probabilities (b);
 +  }

Hmm, I do not quite follow logic here.  
basic block B is one of many basic blocks that the original BB was split from.
It is possible that B may have some of original edges, but there may be new 
ones.
How you can guess the outgoing probabilitie shere.  Do you have an example?

Also gen_probabilities_from_existing_counts could probably also work based
on original edge frequencies.
 @@ -1664,7 +1668,7 @@ do_jump_if_equal (enum machine_mode mode, rtx op0,
 

Re: RFA: darwin PATCH to fix build, internal visibility

2012-10-04 Thread Jack Howarth
On Wed, Oct 03, 2012 at 03:26:14PM -0700, Mike Stump wrote:
 On Oct 3, 2012, at 12:04 PM, Jason Merrill ja...@redhat.com wrote:
  This patch fixes a couple of Darwin issues I noticed with a cross-compiler:
  
  1) Adds a couple of consts to avoid const-correctness errors.
  2) Treats visibility internal like hidden rather than like default.  
  The documentation says that internal is hidden + processor-specific 
  semantics, so treating it as just hidden makes sense to me.
  
  OK for trunk?
 
 Ok.


FYI, the x86_64-apple-darwin12 testsuite results with the proposed patch are 
at...

http://gcc.gnu.org/ml/gcc-testresults/2012-10/msg00434.html

It appears that the patch should also special case the scan-assembler 
.internal.*Foo.methodEv
tests in g++.dg/ext/visibility/pragma-override1.C and 
g++.dg/ext/visibility/pragma-override2.C 
on darwin as well...

FAIL: g++.dg/ext/visibility/pragma-override1.C -std=c++98  scan-assembler 
.internal.*Foo.methodEv
FAIL: g++.dg/ext/visibility/pragma-override1.C -std=c++11  scan-assembler 
.internal.*Foo.methodEv
FAIL: g++.dg/ext/visibility/pragma-override2.C -std=c++98  scan-assembler 
.internal.*Foo.methodEv
FAIL: g++.dg/ext/visibility/pragma-override2.C -std=c++11  scan-assembler 
.internal.*Foo.methodEv

   Jack



Re: patch to fix constant math

2012-10-04 Thread Kenneth Zadeck

Let me talk about the mode here first.

What this interface/patch provides is a facility where the constant math 
that is done in optimizations is done exactly the way that it would be 
done on the target machine.   What we have now is a compiler that only 
does this if it convenient to do on the host.   I admit that i care 
about this more than others right now, but if intel adds a couple of 
more instructions to their vector units, other people will start to 
really care about this issue.   If you take an OImode value with the 
current compiler and left shift it by 250 the middle end will say that 
the result is 0.   This is just wrong!!!


What this means is that the bitsize and precision of the operations need 
to be carried along when doing math. when wide-int  checks for overflow 
on the multiply or add, it is not checking the if the value overflowed 
on two HWIs, it is checking if the add overflowed in the mode of the 
types that are represented on the target.   When we do shift, we are not 
doing a shift within two HWIs, we are truncating the shift value (if 
this is appropriate) according to the bitsize and shifting according the 
precision.


I think that an argument could be made that storing the mode should be 
changed to an explicit precision and bitsize.  (A possible other option 
would be to store a tree type, but this would make the usage at the rtl 
level very cumbersome since types are rare.) Aside from the work, you 
would not get much push back.


But the signess is a different argument.   At the rtl level, the signess 
is a matter of context.   (you could argue that this is a mistake and i 
would agree, but that is an even bigger change.)   But more to the 
point, at the tree level, there are a surprising number of places where 
the operation desired does not follow the sign of the types that were 
used to construct the constants.   Furthermore, not carrying the sign is 
more consistent with the double int code, which as you point out carries 
nothing.


As for the splitting out the patch in smaller pieces, i am all for it.   
I have done this twice already and i could get the const_scalar_int_p 
patch out quickly.But you do not get too far along that before you 
are still left with a big patch.   I could split out wide-int.* and just 
commit those files with no clients as a first step.   My guess is that 
Richard Sandiford would appreciate that because while he has carefully 
checked the rtl stuff, i think that the code inside wide-int is not in 
his comfort zone of things he would approve.


As far as your btw - noticed this last night.   it is an artifact of the 
way i produced the patch and responsible people have been sacked.   
However, it shows that you read the patch carefully, and i really 
appreciate that.   i owe you a beer (not that you need another at this 
time of year).


Kenny



On 10/04/2012 08:48 AM, Richard Guenther wrote:

On Wed, Oct 3, 2012 at 7:15 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:

The enclosed patch is the third of at least four patches that fix the
problems associated with supporting integers on the target that are
wider than two HOST_WIDE_INTs.

While GCC claims to support OI mode, and we have two public ports that
make minor use of this mode, in practice, compilation that uses OImode
mode commonly gives the wrong result or ices.  We have a private port
of GCC for an architecture that is further down the road to needing
comprehensive OImode and we have discovered that this is unusable. We
have decided to fix it in a general way that so that it is most
beneficial to the GCC community.  It is our belief that we are just a
little ahead of the X86 and the NEON and these patches will shortly be
essential.

The first two of these patches were primarily lexigraphical and have
already been committed.They transformed the uses of CONST_DOUBLE
so that it is easy to tell what the intended usage is.

The underlying structures in the next two patches are very general:
once they are added to the compiler, the compiler will be able to
support targets with any size of integer from hosts of any size
integer.

The patch enclosed deals with the portable RTL parts of the compiler.
The next patch, which is currently under construction deals with the
tree level.  However, this patch can be put on the trunk as is, and it
will eleviate many, but not all of the current limitations in the rtl
parts of the compiler.

Some of the patch is conditional, depending on a port defining the
symbol 'TARGET_SUPPORTS_WIDE_INT' to be non zero.  Defining this
symbol to be non zero is declaring that the port has been converted to
use the new form or integer constants.  However, the patch is
completely backwards compatible to allow ports that do not need this
immediately to convert at their leasure.  The conversion process is
not difficult, but it does require some knowledge of the port, so we
are not volinteering to do this for all ports.

OVERVIEW OF THE PATCH:

The patch 

Profile housekeeping 6/n (-fprofile-consistency-report)

2012-10-04 Thread Jan Hubicka
Hi,
this patch implements -fprofile-consistency-report that is useful to get an
statistic about what pass are major offenders in keeping profile up-to-date.

For example the following is output for combine.c
 Pass: fnsplit  (after pass) mismatched in:  +16 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: fnsplit  (after TODO) mismatched in:  -16 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: inline   (after pass) mismatched in: +197 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: inline   (after TODO) mismatched in: -209 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: ccp  (after TODO) mismatched in:   +8 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: vrp  (after pass) mismatched in: +191 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: vrp  (after TODO) mismatched in:  +25 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dce  (after TODO) mismatched in:  -19 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cdce (after pass) mismatched in:   -1 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cselim   (after pass) mismatched in:   +1 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: ifcombine(after TODO) mismatched in:   +1 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: phiopt   (after pass) mismatched in:   -2 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: ch   (after pass) mismatched in:   +2 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: ch   (after TODO) mismatched in:   +1 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dom  (after pass) mismatched in:  +89 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dom  (after TODO) mismatched in:   -3 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: phicprop (after TODO) mismatched in:   -6 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dce  (after TODO) mismatched in:   -2 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: copyprop (after TODO) mismatched in:  -17 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: unswitch (after pass) mismatched in:   +7 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: unswitch (after TODO) mismatched in:  +19 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cunroll  (after pass) mismatched in:  +10 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: vrp  (after pass) mismatched in:  +18 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: vrp  (after TODO) mismatched in:   -8 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dom  (after pass) mismatched in:  +14 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: dom  (after TODO) mismatched in:   +4 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: phicprop (after TODO) mismatched in:   +1 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cddce(after TODO) mismatched in:   -1 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: expand   (after pass) mismatched in: +435 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: jump (after pass) mismatched in:   +6 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cse1 (after pass) mismatched in:   +1 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cprop(after pass) mismatched in:   -8 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: rtl pre  (after pass) mismatched in:   -1 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: cse_local(after pass) mismatched in:   -7 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: ce1  (after pass) mismatched in:   +5 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: loop2_init   (after pass) mismatched in:   +1 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: loop2_done   (after pass) mismatched in:   -1 (freqs)   +0 
(counts); michmatched out:   +0 (freqs)   +0 (counts)
 Pass: reload 

Re: Profile housekeeping 6/n (-fprofile-consistency-report)

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 4:01 PM, Jan Hubicka wrote:
 * doc/invoke.texi (-fprofile-consistency-report): Document.
 * common.opt (fprofile-consistency-report): New.
 * toplev.h (dump_profile_consistency_report): Declare.
 * toplev.c (finalize): Call dump_profile_consistency_report.
 * passes.c (profile_record): New global var.
 (check_profile_consistency): New function.
 (dump_profile_consistency_report): New function.
 (execute_one_ipa_transform_pass): Call check_profile_consistency.
 (execute_one_pass): Likewise.


Nice. And long overdue! :-)


 +fprofile-consistency-report
 +Common Report Var(profile_report)
 +Report on consistency of profile

Maybe make this a -d flag instead of -f?


 Index: passes.c
 +/* Hold statistic about profile consistency.  */
...

I don't see why this should live in passes.c, can you please put it in
a more logical place (profile.c, perhaps)?

Ciao!
Steven


Re: RFA: add lock_length attribute to break branch-shortening cycles

2012-10-04 Thread Joern Rennecke

Quoting Richard Guenther richard.guent...@gmail.com:


I miss a few things in this description:
- what is the value of lock_length supposed to be?  From the lock
  prefix it sounds like it is something unchanging, maybe even constant,
  thus a maximum?
- the length attribute still needs to be specified when lock_length is?
  how do they relate?  Is lock_length always smaller / bigger than length?


I hope I have clarified this in the updated documentation:

 Usually, branch shortening is done assuming the worst case (i.e.
longest) lengths, and then iterating (if optimizing) to smaller lengths
till no further changed occur.  This does not work so well for
architectures that have very small minimum offsets and considerable
jumps in instruction lengths.

 If you define the `lock_length' attribute, branch shortening will work
the other way round: it starts out assuming minimum instruction lengths
and iterates from there.  `lock_length' specifies an instruction length
value that is calculated like `length' in every iteration, but if the
value at the last iteration was larger, that larger previous value will
be used instead.  The value computed for the `length' attribute will be
no smaller than that of the `lock_length' attribute, but you may still
specify a larger value, and in that case `length' can decrease in the
next iteration, but not below the value of `lock_length'.  The easiest
way to make sure branch shortening doesn't start looping indefinitely
is to define the `length' attribute to only a minimum instruction
length for varying length instructions and let the `lock_length'
attribute do all the heavy lifting of computing varying lengths.  On
the other hand, for some instruction sets you might get better
shortening outcomes if you use a more complex `length' attribute and
use `lock_length' only to the extent required to prevent indefinite
looping.  Note that `ADJUST_INSN_LENGTH' applies only to the `length'
attribute.

 Because defining `lock_length' makes branch shortening start with
optimistic assumptions, this means we have to see it through to the end
to eliminate all out-of-range branches, thus branch shortening might
take a bit longer at `-O0' than if you didn't define the `lock_length'
attribute.

 Using `lock_length' with varying delay slots and varying length delay
slot insns (all at once) is currently not supported.


- what happens if you have patterns with lock_length and patterns without?
- what patterns does lock_length apply to?


These questions don't really make literal sense; instruction attributes
are defined for all instructions.  Of course, you can define the default
value to 0, which means you see no effect of this attribute on a pattern
unless some non-default definition applies.


In general optimistically attacking this kind of problem should be always
better - did you try simply switching this for all targets?


No, I haven't.  The boundaries to be set for branch offsets can be different
when starting with optimistic assumptions than when starting with  
pessimistic assumptions.  Moreover, there could be 'interesting'  
interactions

with ADJUST_INSN_LENGTH.  And there is the problem with handling varying
delay slots.  I have modified the final.c patch to reduce the scope of this
issue so that lock_length should hopefully be useful for a wider range of
targets.

Also, ...


 It shouldn't be
slower and the only thing you need to guarantee is that during iteration
you never make insn-lenghts smaller again.


At -O0 it is slower because we can't finish the loop early.

For all these reasons, I think it is better if each target maintainer  
evaluates

if the better branch shortening weighs out the longer -O0 compilation time,
and addresses any issues arising if/when converting.

Bootstrapped on i686-pc-linux-gnu
Index: doc/md.texi
===
--- doc/md.texi (revision 192036)
+++ doc/md.texi (working copy)
@@ -8004,6 +8004,42 @@ (define_insn jump
   (const_int 6)))])
 @end smallexample
 
+@cindex lock_length
+Usually, branch shortening is done assuming the worst case (i.e. longest)
+lengths, and then iterating (if optimizing) to smaller lengths till 
+no further changed occur.  This does not work so well for architectures
+that have very small minimum offsets and considerable jumps in instruction
+lengths.
+
+If you define the @code{lock_length} attribute, branch shortening will
+work the other way round: it starts out assuming minimum instruction
+lengths and iterates from there.  @code{lock_length} specifies an instruction
+length value that is calculated like @code{length} in every iteration,
+but if the value at the last iteration was larger, that larger previous
+value will be used instead.
+The value computed for the @code{length} attribute will be no smaller
+than that of the @code{lock_length} attribute, but you may still specify
+a larger value, and in that case @code{length} can decrease in the 

Re: [RFC] Make vectorizer to skip loops with small iteration estimate

2012-10-04 Thread Jan Hubicka
  So SOC cancels out in the runtime check.
  I still think we need two formulas - one determining if vectorization is
  profitable, other specifying the threshold for scalar path at runtime (that
  will generally give lower values).
 
 True, we want two values.  But part of the scalar path right now
 is all the computation required for alias and alignment runtime checks
 (because the way all the conditions are combined).
 
 I'm not much into the details of what we account for in SOC (I suppose
 it's everything we insert in the preheader of the vector loop).

Yes, it seems contain everything we insert prior the loop in unfolded form.
 
 +  if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
 +fprintf (vect_dump, not vectorized: estimated iteration count 
 too small.);
 +  if (vect_print_dump_info (REPORT_DETAILS))
 +fprintf (vect_dump, not vectorized: estimated iteration count 
 smaller than 
 + user specified loop bound parameter or minimum 
 + profitable iterations (whichever is more 
 conservative).);
 
 this won't work anymore btw - dumping infrastructure changed.

Ah, will update that.
 
 I suppose your patch is a step in the right direction, but to really
 make progress we need to re-organize the loop and predicate structure
 produced by the vectorizer.

This reminds me what I did for string functions on x86. It gets very hard
to get all the paths right when one starts to be really cureful to not
output too much cruft on the short paths + do not consume too many registers.

In fact I want to re-think this for the SSE string ops patch, so I may try to
look into that incrementally.
 
 So, please update your patch, re-test and then it's ok.

Thanks.
  I tested enabling loop_ch in early passes with -fprofile-feedback and it is 
  SPEC
  neutral.  Given that it improves loop count estimates, I would still like 
  mainline
  doing that.  I do not like these quite important estimates to be wrong most 
  of time.
 
 I agree.  It also helps getting rid of once rolling loops I think.

I am attaching the patch for early-ch.  Will commit it tomorrow.

Concerning jump threading, it would help to make some of it during early passes
so the profile estiamte do not get invalided.  I tried to move VRP early but 
now it
makes compiler to hang during bootstrap.  I will debug that.
 
   
   Btw, I added a similar check in vect_analyze_loop_operations:
   
 if ((LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
   (LOOP_VINFO_INT_NITERS (loop_vinfo)  vectorization_factor))
 || ((max_niter = max_stmt_executions_int (loop)) != -1
  (unsigned HOST_WIDE_INT) max_niter  vectorization_factor))
   {
 if (dump_kind_p (MSG_MISSED_OPTIMIZATION))
   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
not vectorized: iteration count too small.);
 if (dump_kind_p (MSG_MISSED_OPTIMIZATION))
   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
not vectorized: iteration count smaller than 
vectorization factor.);
 return false;
   }
   
   maybe you simply need to update that to also consider the profile?
  
  Hmm, I am still getting familiar wth the code. Later we later have
if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
 LOOP_VINFO_INT_NITERS (loop_vinfo) = th)
  {
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
  fprintf (vect_dump, not vectorized: vectorization not 
   profitable.);
if (vect_print_dump_info (REPORT_DETAILS))
  fprintf (vect_dump, not vectorized: iteration count smaller than 
   user specified loop bound parameter or minimum 
   profitable iterations (whichever is more conservative).);
return false;
  }
  
  where th is always greater or equal than vectorization_factor from the cost 
  model.
  So this test seems redundant if the max_stmt_executions_int was pushed down
  to the second conditoinal?
 
 Yes, sort of.  The new check was supposed to be crystal clear, and
 even with the cost model disabled we want to not vectorize in this
 case.  But yes, the whole cost-model stuff needs TLC.

Ah yes, without cost model we would skip it.  I suppose we do not need to
brother  witht he profile estiamte in the case anyway. They are kind of aprt of
the cost models.

* passes.c (init_optimization_passes): Schedule early CH.
* tree-pass.h (pass_early_ch): Declare it.
* tree-ssa-loop-ch.c (gate_early_ch): New function.
(pass_early_ch): New pass.
Index: passes.c
===
--- passes.c(revision 191852)
+++ passes.c(working copy)
@@ -1335,6 +1336,7 @@ init_optimization_passes (void)
   NEXT_PASS (pass_cleanup_eh);
   NEXT_PASS (pass_profile);
   NEXT_PASS (pass_local_pure_const);

Re: libgo patch committed: Use libbacktrace

2012-10-04 Thread Ian Lance Taylor
On Thu, Oct 4, 2012 at 5:11 AM, Rainer Orth r...@cebitec.uni-bielefeld.de 
wrote:
 Ian Lance Taylor i...@google.com writes:

 This patch to libgo changes it to use libbacktrace.  Previously
 backtraces required the Go package debug/elf to register itself with the
 runtime during the package initialization, which only worked if the
 program actually imported debug/elf one way or another.  Bootstrapped
 and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to
 mainline.

 Unfortunately, this breaks all use of libgo on versions of Solaris  11
 which lack strnlen:

 Undefined   first referenced
  symbol in file
 strnlen 
 /var/gcc/regression/trunk/10-gcc/build/i386-
 pc-solaris2.10/libgo/.libs/libgo.so
 ld: fatal: symbol referencing errors. No output written to a.out
 collect2: error: ld returned 1 exit status
 FAIL: bufio

 One could either try to also link libiberty into libgo.la, but that has
 the complication of needing to decide whether to use libiberty.a or
 pic/libiberty.a since libiberty is no libtool library.

I guess I won't try to link libgo against libiberty.  I just changed
libbacktrace to provide its own strnlen function, like so.
Bootstrapped and ran libbacktrace testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian


2012-10-04  Ian Lance Taylor  i...@google.com

* dwarf.c: If the system header files do not declare strnlen,
provide our own version.


foo.patch
Description: Binary data


Re: patch to fix

2012-10-04 Thread Kenneth Zadeck

On 10/04/2012 09:17 AM, Marc Glisse wrote:

On Wed, 3 Oct 2012, Mike Stump wrote:

On Oct 3, 2012, at 1:47 PM, Marc Glisse marc.gli...@inria.fr wrote:
did you consider making the size of wide_int a template parameter, 
now that we are using C++? All with a convenient typedef or macro so 
it doesn't show. I am asking because in vrp I do some arithmetic 
that requires 2*N+1 bits where N is the size of double_int.


No, not really.  I'd maybe answer it this way, we put in a type 
(singular) to support all integral constants in all languages on a 
port. Since we only needed 1, there was little need to templatize 
it.  By supporting all integral constants in all languages, there is 
little need for more.  If Ada say, wanted a 2048 bit integer, then, 
we just have it drop off the size it wants someplace and we would mix 
that in on a MAX(….) line, net result, the type we use would then 
directly support the needs of Ada.  If vpr wanted 2x of all existing 
modes, we could simply change the MAX equation and essentially double 
it; if people need that.  This comes as a cost, as the intermediate 
wide values are fixed size allocated (not variable); so these all 
would be larger.


And this cost could be eliminated by having a template wide_int_ so 
only the places that need it actually use the extra size ;-)


The space is not really an issue in most places since wide-ints tend to 
be short lived.  i guess vrp is slightly different because it creates a 
lot at once.  but then they go away.


However the real question is what are you going to instantiate the 
template on?What we do is look at the target and determine the 
largest type that the target supports and build a wide int type that 
supports that.how are you going to do better?   are you going to 
instantiate one for every type you see?   are these going to be static 
or dynamic?   The last line this email seems to imply that you were 
planning to know that __int128 was the largest integer that any target 
or front end could support.


and then what do you do for the parts of the compiler that have 
operations that take things of two different types, like shift. The 
shift amount can and may times is a shorter type that what is being 
shifted. Would these different length integers be represented with 
different instances from the same template?   I am not a c++ programmer 
and so all of this is a little new to me, but given a perspective of the 
rest of the compiler, this does not seem like the right way to go.





On Wed, 3 Oct 2012, Kenneth Zadeck wrote:

i have already converted the vrp code, so i have some guess at where 
you are talking about.  (of course correct me if i am wrong).


in the code that computes the range when two variables are multiplied 
together needs to do a multiplication that produces a result that is 
twice as wide as the inputs.


Yes, exactly.

my library is able to do that with one catch (and this is a big 
catch): the target has to have an integer mode that is twice as big 
as the mode of the operands. The issue is that wide-ints actually 
carry around the mode of the value in order to get the bitsize and 
precision of the operands (it does not have the type, because this 
code has to both work on the rtl and tree level and i generally do 
not want the signness anyway).


my current code in vrp checks to see if such a mode exists and if it 
does, it produces the product.   if the mode does not exist, it 
returns bottom.   What this means is that for most (many or some) 
targets that have a TImode, the largest thing that particular vrp 
discover ranges for is a DImode value.   We could get around this by 
defining the next larger mode than what the target really needs but i 
wonder how much mileage you are going to get out of that with really 
large numbers.


This will be for discussion when you submit that next patch, but 
currently VRP handles integers the same size as double_int. In 
particular, it handles __int128. I would be unhappy if introducing a 
larger bigint type in gcc made us regress there.


You are only happy now because you do not really understand the world 
around you.This is not what your code does.   What you code does is 
that if the host is a 64 bit host you can handle __int128 and if your 
host is a 32 bit host you can handle a __int64.  If you are building a 
cross compiler from a 32 bit host to a 64 bit target, your pass is 
either going to get the wrong answer, give up, or ice.   There are 
currently parts of gcc that do each of these three solutions and my 
patch gets rid of these because it does the math as the target does the 
math, no matter that the target is.


The goal of my patch is to make gcc produce the same correct results no 
matter what types the target or host support.The last thing that we 
need to have some optimization knowing what the limits of either of 
these are and hard coding that in a set of templates that have been 
statically instantiated.





[C++ testcase] PR 54323

2012-10-04 Thread Paolo Carlini

Hi,

I'm adding the testcase and closing the PR. Tested x86_64-linux.

Thanks,
Paolo.

/
2012-10-04  Paolo Carlini  paolo.carl...@oracle.com

PR c++/54323
* g++.dg/cpp0x/pr54323.C: New.
Index: g++.dg/cpp0x/pr54323.C
===
--- g++.dg/cpp0x/pr54323.C  (revision 0)
+++ g++.dg/cpp0x/pr54323.C  (working copy)
@@ -0,0 +1,37 @@
+// PR c++/54323
+// { dg-do compile { target c++11 } }
+
+templatebool, typename T = void
+struct enable_if { };
+
+templatetypename T
+struct enable_iftrue, T
+{ typedef T type; };
+
+templatetemplatetypename class CRTP, typename T
+class Base
+{
+public:
+  templatetemplatetypename class CRTP0, typename T0, class
+  friend int func(const BaseCRTP0, T0 rhs);
+
+protected:
+  int n;
+};
+
+templatetemplatetypename class CRTP0, typename T0,
+class = typename enable_iftrue::type
+int func(const BaseCRTP0, T0 rhs)
+{
+  return rhs.n;
+}
+
+templatetypename T
+class Derived : public BaseDerived, T {};
+
+int main()
+{
+  Derivedint x;
+  func(x);
+  return 0;
+}


Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-04 Thread Vladimir Makarov

On 10/04/2012 03:24 AM, Steven Bosscher wrote:

On Thu, Oct 4, 2012 at 8:57 AM, Steven Bosscher stevenb@gmail.com wrote:

On Thu, Oct 4, 2012 at 5:30 AM, Vladimir Makarov vmaka...@redhat.com wrote:

I was going to look at this code too but I was interesting in generation of
less points and live ranges.  It is strange that in my profiles,
remove_some_program_points_and_update_live_ranges takes 0.6% of compiler
time on these huge tests.   So I was not interesting to speed up the
function and may be therefore you have no visible change in compilation
time.

Right. The compression algorithm doesn't care much about the initial
number of program points, only about the number of live ranges before
and after compression. I had expected a bigger effect on the number of
live ranges before compression.

0.6% sounds really very different from my timings. How much time does
create_start_finish_chains take for you?



I don't object the idea of the patch.  I need some time to look at it (the
different results on a function is a bit scary for me) and check simulator
times on other tests.

Understood.

BTW, it would be great if you can also look at this additional patch hunk:

@@ -994,8 +1044,8 @@ lra_create_live_ranges (bool all_p)
curr_point = 0;
point_freq_vec = VEC_alloc (int, heap, get_max_uid () * 2);
lra_point_freq = VEC_address (int, point_freq_vec);
-  FOR_EACH_BB (bb)
-process_bb_lives (bb);
+  FOR_EACH_BB_REVERSE (bb)
+process_bb_lives (bb, curr_point);
lra_live_max_point = curr_point;
create_start_finish_chains ();
if (lra_dump_file != NULL)

I think this should result in more live ranges being merged. Here's
why I think so, based on my far worse understanding of this code than
yours, so forgive me if I'm Completely Wrong :-)

No, you are not wrong.
Two days ago, I worked on patch which contains the same code.  The patch 
actually takes EBB into account to decrease # calls of mark_pseudo_live 
at the beginning of process_bb_lives and mark_pseudo_dead at the 
function end and for that I needed FOR_EACH_BB_REVERSE.  The patch was 
half baked (it did not checked hard regs live changes at the end of BB 
to set up right hard reg conflicts for pseudos) but it gave an idea how 
much I can get from this.  It is not bad but not what I expected.  So I 
stopped work on this.  But we still should work on these ideas as they 
improve LRA speed in small steps (many small steps will create a visible 
effect).


We can really solve scalability problem only by using simpler but still 
good enough algorithms (too simple algorithms result in big code size 
and actually even in worse compilation times).  I've been working on it 
and I'll send a patch soon.

process_bb_lives walks insns in the basic block from last to first, so
say you have a basic block chain 1-2-3, and each block has 4 insns,
then AFAIU the program points in block 1 will be [4,3,2,1], in block 2
it will be [8,7,6,5], and in block 3 it will be [12,11,10,9]. Say a
reg is used in block 3 at point 11, and set in block at point 3. Then
this reg will have a live range chain [3-1],[8-5],[12-11].

If you visit the basic blocks in reverse order, the program points
will be: 1:[12,11,10,9], 2:[8,7,6,5], 3:[4,3,2,1]. Now the same reg
will be set at point 11 and used at point 3, and the live range chain
will be just [11-3].






Small PATCH to rs6000.c to fix cross-compiler build without gas

2012-10-04 Thread Jason Merrill
If configure doesn't think that the assembler supports weak symbols, 
rs6000.c fails to compile because ASM_WEAKEN_DECL isn't defined.  So 
let's not use it in that case.


OK for trunk?
commit 5f0878b79d1a42795aca2fabc8d70eefa2e29fa6
Author: Jason Merrill ja...@redhat.com
Date:   Wed Oct 3 15:55:48 2012 -0400

	* config/rs6000/rs6000.c (rs6000_code_end): Protect the use of
	ASM_WEAKEN_DECL with #if RS6000_WEAK.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3e3d553..f4e4dec 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -28295,6 +28295,7 @@ rs6000_code_end (void)
   TREE_PUBLIC (decl) = 1;
   TREE_STATIC (decl) = 1;
 
+#if RS6000_WEAK
   if (USE_HIDDEN_LINKONCE)
 {
   DECL_COMDAT_GROUP (decl) = DECL_ASSEMBLER_NAME (decl);
@@ -28307,6 +28308,7 @@ rs6000_code_end (void)
   ASM_DECLARE_FUNCTION_NAME (asm_out_file, name, decl);
 }
   else
+#endif
 {
   switch_to_section (text_section);
   ASM_OUTPUT_LABEL (asm_out_file, name);


Re: patch to fix

2012-10-04 Thread Kenneth Zadeck
Actually richi, this code is correct for some broken definition of 
correct.


If all that is done is to convert the rtl parts of the compiler, then 
this code is the best you can do (of course an assertion that the length 
is not greater than 2 would be a useful addition).


The code that is in the follow on patch which converts the insides of a 
tree cst to look like a const wide int, i.e. an array of HWIs.   When 
that happens, this code looks completely different. But if you only 
convert the rtl level, at some point there is going to be an impedance 
mismatch and it is buried here.


I will point out that this is the fall out of trying to split things 
into a bunch of smaller patches that could in theory go in separately.


kenny






+/* Constructs tree in type TYPE from with value given by CST.  Signedness
+   of CST is assumed to be the same as the signedness of TYPE.  */
+
+tree
+wide_int_to_tree (tree type, const wide_int cst)
+{
+  wide_int v;
+  if (TYPE_UNSIGNED (type))
+v = cst.zext (TYPE_PRECISION (type));
+  else
+v = cst.sext (TYPE_PRECISION (type));
+
+  return build_int_cst_wide (type, v.elt (0), v.elt (1));
+}

is surely broken.  A wide-int does not fit a double-int.  How are you
going to fix this?

Thanks,
Richard.


kenny




PATCH to acinclude.m4 to fix gas version detection

2012-10-04 Thread Jason Merrill
Recent versions of binutils seem to have started putting ' around the 
version number in bfd/configure.in, which was confusing gcc configure. 
This patch allows us to detect the version number again.


OK for trunk?
commit f9ce75775fe4392ee92893c46e89e17dc31bb816
Author: Jason Merrill ja...@redhat.com
Date:   Thu Oct 4 00:55:33 2012 -0400

	* acinclude.m4 (_gcc_COMPUTE_GAS_VERSION): Handle ' around version
	number.
	* configure: Regenerate.

diff --git a/gcc/acinclude.m4 b/gcc/acinclude.m4
index c24464b..d5eb4da 100644
--- a/gcc/acinclude.m4
+++ b/gcc/acinclude.m4
@@ -393,7 +393,7 @@ for f in $gcc_cv_as_bfd_srcdir/configure \
  $gcc_cv_as_gas_srcdir/configure \
  $gcc_cv_as_gas_srcdir/configure.in \
  $gcc_cv_as_gas_srcdir/Makefile.in ; do
-  gcc_cv_gas_version=`sed -n -e 's/^[[ 	]]*\(VERSION=[[0-9]]*\.[[0-9]]*.*\)/\1/p'  $f`
+  gcc_cv_gas_version=`sed -n -e s/^[[ 	]]*VERSION='*\([[0-9]]*\.[[0-9]]*[^']*\)'*/VERSION=\1/p  $f`
   if test x$gcc_cv_gas_version != x; then
 break
   fi
diff --git a/gcc/configure b/gcc/configure
index 45bba8e..2d71f7d 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -21237,7 +21237,7 @@ for f in $gcc_cv_as_bfd_srcdir/configure \
  $gcc_cv_as_gas_srcdir/configure \
  $gcc_cv_as_gas_srcdir/configure.in \
  $gcc_cv_as_gas_srcdir/Makefile.in ; do
-  gcc_cv_gas_version=`sed -n -e 's/^[ 	]*\(VERSION=[0-9]*\.[0-9]*.*\)/\1/p'  $f`
+  gcc_cv_gas_version=`sed -n -e s/^[[ 	]]*VERSION='*\([[0-9]]*\.[[0-9]]*[^']*\)'*/VERSION=\1/p  $f`
   if test x$gcc_cv_gas_version != x; then
 break
   fi


Re: RFC: LRA for x86/x86-64 [7/9] -- continuation

2012-10-04 Thread Richard Sandiford
Hi Vlad,

This message is for lra-assigns.c.  Sorry for the piecemeal reviews,
never sure when I'll get time...

 +/* This file contains a pass mostly assigning hard registers to reload
 +   pseudos.  There is no any RTL code transformation on this pass.

Maybe:

/* This file's main objective is to assign hard registers to reload pseudos.
   It also tries to allocate hard registers to other pseudos, but at a lower
   priority than the reload pseudos.  The pass does not transform the RTL.

if that's accurate.

 +   Reload pseudos get what they need (usually) hard registers in
 +   anyway possibly by spilling non-reload pseudos and by assignment
 +   reload pseudos with smallest number of available hard registers
 +   first.
 +
 +   If reload pseudos can get hard registers only through spilling
 +   other pseudos, we choose what pseudos to spill taking into account
 +   how given reload pseudo benefits and also how other reload pseudos
 +   not assigned yet benefit too (see function spill_for).

Maybe:

   We must allocate a hard register to every reload pseudo.  We try to
   increase the chances of finding a viable allocation by assigning the
   pseudos in order of fewest available hard registers first.  If we
   still fail to find a hard register, we spill other (non-reload)
   pseudos in order to make room.

   assign_hard_regno_for allocates registers without spilling.
   spill_for does the same with spilling.  Both functions use
   a cost model to determine the most profitable choice of
   hard and spill registers.

 +   Non-reload pseudos can get hard registers too if it is possible and
 +   improves the code.  It might be possible because of spilling
 +   non-reload pseudos on given pass.

Maybe:

   Once we have finished allocating reload pseudos, we also try to
   assign registers to other (non-reload) pseudos.  This is useful
   if hard registers were freed up by the spilling just described.

 +   We try to assign hard registers processing pseudos by threads.  The
 +   thread contains reload and inheritance pseudos connected by copies
 +   (move insns).  It improves the chance to get the same hard register
 +   to pseudos in the thread and, as the result, to remove some move
 +   insns.

Maybe:

   We try to assign hard registers by collecting pseudos into threads.
   These threads contain reload and inheritance pseudos that are connected
   by copies (move insns).  Doing this improves the chances of pseudos
   in the thread getting the same hard register and, as a result,
   of allowing some move insns to be deleted.

 +   When we assign hard register to a pseudo, we decrease the cost of
 +   the hard registers for corresponding pseudos connected by copies.

Maybe:

   When we assign a hard register to a pseudo, we decrease the cost of
   using the same hard register for pseudos that are connected by copies.

 +   If two hard registers are equally good for assigning the pseudo
 +   with hard register cost point of view, we prefer a hard register in
 +   smaller register bank.  By default, there is only one register
 +   bank.  A target can define register banks by hook
 +   register_bank. For example, x86-64 has a few register banks: hard
 +   regs with and without REX prefixes are in different banks.  It
 +   permits to generate smaller code as insns without REX prefix are
 +   shorter.

Maybe:

   If two hard registers have the same frequency-derived cost,
   we prefer hard registers in lower register banks.  The mapping
   of registers to banks is controlled by the register_bank target hook.
   For example, x86-64 has a few register banks: hard registers with and
   without REX prefixes are in different banks.  This permits us
   to generate smaller code as insns without REX prefixes are shorter.

although this might change if the name of the hook changes.

 +/* Info about pseudo used during the assignment pass.  Thread is a set
 +   of connected reload and inheritance pseudos with the same set of
 +   available hard reg set.  Thread is a pseudo itself for other
 +   cases.  */
 +struct regno_assign_info

Maybe:

/* Information about the thread to which a pseudo belongs.  Threads are
   a set of connected reload and inheritance pseudos with the same set of
   available hard registers.  Lone registers belong to their own threads.  */

Although the condition seems to be:

 +  (ira_class_hard_regs_num[regno_allocno_class_array[regno1]]
 + == ira_class_hard_regs_num[regno_allocno_class_array[regno2]]))

i.e. the same _number_ of available hard regs, but not necessarily the
same set.

thread might be more mnemonic than regno_assign in this file,
but that's bikeshed stuff.

 +  for (i = FIRST_PSEUDO_REGISTER; i  max_reg_num (); i++)
 +{
 +  regno_assign_info[i].first = i;
 +  regno_assign_info[i].next = -1;
 +  regno_assign_info[i].freq = lra_reg_info[i].freq;
 +}

Minor speedup, but it's probably worth caching max_reg_num () rather than
calling it in each loop 

Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc

2012-10-04 Thread Mike Stump
On Oct 4, 2012, at 6:06 AM, Iain Buclaw ibuc...@ubuntu.com wrote:
 I would like to get a bump on this.
 
 It's been a while, and there have been quite a number of changes since
 the initial post that address many of the issues raised.  Rather than
 reposting patches, someone mentioned attaching changelog, well, here
 it is.
 
 Repository is still located here: https://github.com/D-Programming-GDC/GDC
 
 Would it be possible to have a re-newed review?

You don't ask, you post.  If you have independent patches to the rest of gcc 
that improve it, I'd suggest posting those as separate patches and getting 
those in.  This give you a base onto which to slot in the front-end.


Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 5:31 PM, Vladimir Makarov vmaka...@redhat.com wrote:

 Wow.  I did not have such effect.  What machine do you use?

I do all my testing on gcc17.

Ciao!
Steven


Re: patch to fix

2012-10-04 Thread Marc Glisse

On Thu, 4 Oct 2012, Kenneth Zadeck wrote:


On 10/04/2012 09:17 AM, Marc Glisse wrote:

On Wed, 3 Oct 2012, Mike Stump wrote:

On Oct 3, 2012, at 1:47 PM, Marc Glisse marc.gli...@inria.fr wrote:
did you consider making the size of wide_int a template parameter, now 
that we are using C++? All with a convenient typedef or macro so it 
doesn't show. I am asking because in vrp I do some arithmetic that 
requires 2*N+1 bits where N is the size of double_int.


No, not really.  I'd maybe answer it this way, we put in a type (singular) 
to support all integral constants in all languages on a port. Since we 
only needed 1, there was little need to templatize it.  By supporting all 
integral constants in all languages, there is little need for more.  If 
Ada say, wanted a 2048 bit integer, then, we just have it drop off the 
size it wants someplace and we would mix that in on a MAX(….) line, net 
result, the type we use would then directly support the needs of Ada.  If 
vpr wanted 2x of all existing modes, we could simply change the MAX 
equation and essentially double it; if people need that.  This comes as a 
cost, as the intermediate wide values are fixed size allocated (not 
variable); so these all would be larger.


And this cost could be eliminated by having a template wide_int_ so only 
the places that need it actually use the extra size ;-)


The space is not really an issue in most places since wide-ints tend to be 
short lived.


You were the one talking of a cost.

However the real question is what are you going to instantiate the template 
on?What we do is look at the target and determine the largest type that 
the target supports and build a wide int type that supports that.how are 
you going to do better?


In a single place in tree-vrp.c in the code that evaluates 
multiplications, I would instantiate the template on the double (possibly 
+1) of the value you selected as large enough for all constants. For all 
the rest, your type is fine.


This will be for discussion when you submit that next patch, but currently 
VRP handles integers the same size as double_int. In particular, it handles 
__int128. I would be unhappy if introducing a larger bigint type in gcc 
made us regress there.


You are only happy now because you do not really understand the world around 
you.


I did not want to go into details, but let me re-phrase: I do not want to 
regress. Currently, hosts with a 64 bit hwi can handle VRP multiplications 
on __int128. If your patch introducing better big integers breaks that, 
that sounds bad to me, since I would expect s/double_int/wide_int/ to just 
work, and using wide_int2*MAX would just be a potential simplification 
of the code for later.



Note that VRP is just the one case I am familiar with. Using templates 
should (I haven't checked) be completely trivial and help the next person 
who needs bigger integers for a specific purpose and doesn't want to 
penalize the whole compiler. If the size of wide_int is completely 
irrelevant and we can make it 10 times larger without thinking, I guess 
some numbers showing it would be great (or maybe that's common 
knowledge, then I guess it is fine).



Now those are only some comments from an occasional contributor, not 
reviewer requirements, it is fine to ignore them.


--
Marc Glisse


Re: [patch][lra] Improve initial program point density in lra-lives.c

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 6:12 PM, Vladimir Makarov vmaka...@redhat.com wrote:
 0.6% sounds really very different from my timings. How much time does
 create_start_finish_chains take for you?

 0.65% (2.78s).

 Actually, I have a profile but I am not sure now that it is for PR54146.
 It might be for PR26854.

 I'll check it again to be sure.

 Not it looks about the same.

Well, that's very strange. Maybe we measure these things differently?
I just hi-hack a timevar, so I measure e.g. the time spent in
create_start_finish_chains like so:

Index: lra-lives.c
===
--- lra-lives.c (revision 192052)
+++ lra-lives.c (working copy)
@@ -770,6 +812,7 @@ create_start_finish_chains (void)
   int i, max_regno;
   lra_live_range_t r;

+timevar_push (TV_CPROP);
   lra_start_point_ranges = XCNEWVEC (lra_live_range_t, lra_live_max_point);
   lra_finish_point_ranges = XCNEWVEC (lra_live_range_t, lra_live_max_point);
   max_regno = max_reg_num ();
@@ -783,6 +826,7 @@ create_start_finish_chains (void)
  lra_finish_point_ranges[r-finish] = r;
}
 }
+timevar_pop (TV_CPROP);
 }

 /* Rebuild LRA_START_POINT_RANGES and LRA_FINISH_POINT_RANGES after


so that I get the timings in the -ftime-report like so:

 CPROP   :  43.14 ( 4%) usr
 integrated RA   : 200.81 (17%) usr
 LRA non-specific:  62.18 ( 5%) usr
 LRA virtuals elimination:  61.71 ( 5%) usr
 LRA reload inheritance  :   6.41 ( 1%) usr
 LRA create live ranges  :  139.75 (13%) usr
 LRA hard reg assignment : 130.90 (11%) usr
 LRA coalesce pseudo regs:   2.45 ( 0%) usr
 reload  :   9.09 ( 1%) usr

Crude, but efficient (tm) :-)

How do you measure the time spent in that function, and in
remove_some_program_points_and_update_live_ranges?

Ciao!
Steven


Re: patch to fix constant math

2012-10-04 Thread Richard Guenther
On Thu, Oct 4, 2012 at 3:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:
 Let me talk about the mode here first.

 What this interface/patch provides is a facility where the constant math
 that is done in optimizations is done exactly the way that it would be done
 on the target machine.   What we have now is a compiler that only does this
 if it convenient to do on the host.   I admit that i care about this more
 than others right now, but if intel adds a couple of more instructions to
 their vector units, other people will start to really care about this issue.
 If you take an OImode value with the current compiler and left shift it by
 250 the middle end will say that the result is 0.   This is just wrong!!!

 What this means is that the bitsize and precision of the operations need to
 be carried along when doing math. when wide-int  checks for overflow on the
 multiply or add, it is not checking the if the value overflowed on two HWIs,
 it is checking if the add overflowed in the mode of the types that are
 represented on the target.   When we do shift, we are not doing a shift
 within two HWIs, we are truncating the shift value (if this is appropriate)
 according to the bitsize and shifting according the precision.

 I think that an argument could be made that storing the mode should be
 changed to an explicit precision and bitsize.  (A possible other option
 would be to store a tree type, but this would make the usage at the rtl
 level very cumbersome since types are rare.) Aside from the work, you would
 not get much push back.

 But the signess is a different argument.   At the rtl level, the signess is
 a matter of context.   (you could argue that this is a mistake and i would
 agree, but that is an even bigger change.)   But more to the point, at the
 tree level, there are a surprising number of places where the operation
 desired does not follow the sign of the types that were used to construct
 the constants.   Furthermore, not carrying the sign is more consistent with
 the double int code, which as you point out carries nothing.

Well, on RTL the signedness is on the operation (you have sdiv and udiv, etc.).

double-int tries to present a sign-less twos-complement entity of size
2 * HOST_BITS_PER_WIDE_INT.  I think that is sensible and for
obvious reasons should not change.  Both tree and RTL rely on this.
What we do not want is that up to TImode you get an internal representation
done one way (twos-complement) and on OImode and larger you
suddenly get subtly different behavior.  That's a recepie for desaster.

I'd like to clean up the interface to double-int some more (now with the
nice C++ stuff we have).  double-int should be pure twos-complement,
there should be no operations on double-ints that behave differently
when done signed or unsigned, instead we have signed and unsigned
versions of the operations (similar to how signedness is handled on
the RTL level).  With some trivial C++ fu you could have a
double_sint and double_uint type that would get rid of the bool
sign params we have to some functions (and then you could
write double_sint  n using operator notation).

I'd like wide-int (whatever it's internal representation is) to behave
exactly like double-ints with respect to precision and signedness
handling.  Ideally all static functions we have that operate on
double-ints would be 1:1 available for wide-ints, so I can change
the type of entities in an algorithm from double-ints to wide-ints
(or vice versa) and do not have to change the code at all.

Thus as first step I'd like you to go over the double-int stuff,
compare it to the wide-int stuff you introduce and point out
differences (changing double-ints or wide-ints to whatever is
the more general concept).

Now, as for 'modes' - similar to signedness some functions
that operate on double-ints take a precision argument (like
the various extensions).  You can add a similar wrapper
type like double_sint, but this time with a cost - a new precision
member, that can be constructed from a double_int (or wide_int)
that ends up specifying the desired precision (be it in terms
of a mode or a type).

You didn't question my suggestion to have the number of
HOST_WIDE_INTs in a wide-int be compile-time constant - was
that just an oversight on your side?  The consequence is that
code wanting to deal with arbitrary length wide-ints needs to
be a template.

 As for the splitting out the patch in smaller pieces, i am all for it.   I
 have done this twice already and i could get the const_scalar_int_p patch
 out quickly.But you do not get too far along that before you are still
 left with a big patch.   I could split out wide-int.* and just commit those
 files with no clients as a first step.   My guess is that Richard Sandiford
 would appreciate that because while he has carefully checked the rtl stuff,
 i think that the code inside wide-int is not in his comfort zone of things
 he would approve.

 As far as your btw - noticed this 

Adjust gcc.dg/lto/20120723_0.c for SPARC

2012-10-04 Thread Eric Botcazou
As noted by Martin in http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00116.html, 
the testcase is invalid C and cannot pass on SPARC 32-bit because of the ABI.

Tested on SPARC/Solaris 10, applied on the mainline.


2012-10-04  Eric Botcazou  ebotca...@adacore.com

* gcc.dg/lto/20120723_0.c: Skip on SPARC 32-bit.


-- 
Eric BotcazouIndex: gcc.dg/lto/20120723_0.c
===
--- gcc.dg/lto/20120723_0.c	(revision 192073)
+++ gcc.dg/lto/20120723_0.c	(working copy)
@@ -1,7 +1,9 @@
 /* Make sure that by reference and by value aggregate jump functions do not get
-   mixed up.  */
+   mixed up.
+   ??? This testcase is invalid C and can only pass on specific platforms.  */
 /* { dg-lto-do run } */
-/* { dg-lto-options {{-O3 -fno-early-inlining -flto}} } */
+/* { dg-skip-if  { { sparc*-*-* }  ilp32 } { * } {  } } */
+/* { dg-lto-options { {-O3 -fno-early-inlining -flto}} } */
 
 extern void abort (void);
 


Re: [PATCH] Improve var-tracking memory disambiguation with frame pointer (PR debug/54796)

2012-10-04 Thread Jakub Jelinek
On Thu, Oct 04, 2012 at 06:42:47PM +0200, Steven Bosscher wrote:
 On Thu, Oct 4, 2012 at 6:33 PM, Jakub Jelinek wrote:
  This patch fixes a few FAILs in the ix86 guality testsuite (mainly -Os),
  by better disambiguating sp based VALUEs (which usually have no MEM_EXPR
  and thus the alias Oracle can't be used for them) from frame pointer
  based ones or global vars.
 
 Does this also help for some of the var-tracking compile-time-hog
 bugs? ISTR PR53958 was one with very long chains of sp based values...

Very unlikely.  That PR has been reported against 4.7, which doesn't have
clobber_overlapping_mems at all, the patch could only affect compile times
if there were huge chains of sp based values and find_base_term appeared
anywhere significantly in the profiles.
That PR seems to create so huge VTA hash tables that var-tracking gives up,
unfortunately the current give up heuristics doesn't estimate the compile
time and compile memory needed accurately enough and lowering the limit
(which is a param)'s default would on the other side result in sane sized
programs to trigger no-VTA way too often.

Jakub


Re: [Patch, Fortran] Fix some memory leaks

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 7:06 PM, Tobias Burnus wrote:
 This patch fixes some memory leaks and other issues found by
 http://scan5.coverity.com.

 Build and regtested on x86-64-linux.
 OK for the trunk?

Yes, thanks for plugging these!
Some of them have been there since day 0 :-)

Ciao!
Steven


Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s

2012-10-04 Thread Basile Starynkevitch
On Thu, Oct 04, 2012 at 06:51:35PM +0300, Laurynas Biveinis wrote:
  2012-10-03  Basile Starynkevitch  bas...@starynkevitch.net
 
  * gengtype.c (walk_type): Emit mark_hook when inside a
struct of a union member.
 
 This is OK.

thanks, Committed revision 192092 to trunk.


I believe this patch should be backported into GCC 4.7 and 4.6

Regards.
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: [lra] patch to solve most scalability problems for LRA

2012-10-04 Thread Richard Guenther
On Thu, Oct 4, 2012 at 7:07 PM, Steven Bosscher stevenb@gmail.com wrote:
 On Thu, Oct 4, 2012 at 5:37 PM, Vladimir Makarov vmaka...@redhat.com wrote:
   The only issue now is PR54146 compilation time for IRA+LRA although it
 was improved significantly.  I will continue work on PR54146.  But now I
 am going to focus on proposals from reviews.

 Right, there still are opportunities to improve things.

 (The real solution may be to stop SRA from creating so many
 simultaneously live pseudos in the first place...)

 +  lra_simple_p
 += (ira_use_lra_p  max_reg_num () = (1  26) / last_basic_block);

 I think you should use n_basic_blocks here instead of
 last_basic_block, in case this runs without compacting the cfg first
 (n_basic_blocks is the real number of basic blocks in the cfg,
 last_basic_block is the highest index, so last_basic_block =
 n_basic_blocks).

I also noticed that switching to IRA_REGION_ONE improves things
when we have a large number of loops (profile points to some
loop code in IRA).  Note that the magic number above should
be a new --param, and once we have a diagnostic flag that
shows whenever we back off like this it should notify the
user of that fact (and the params we have overflown) - this
just reminded me of that idea from somebody else ;)

 Thanks for working on this!

Indeed ;)

It, btw, also applies to IRA + reload ...

Richard.

 Ciao!
 Steven


Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s

2012-10-04 Thread Richard Guenther
On Thu, Oct 4, 2012 at 7:24 PM, Basile Starynkevitch
bas...@starynkevitch.net wrote:
 On Thu, Oct 04, 2012 at 06:51:35PM +0300, Laurynas Biveinis wrote:
  2012-10-03  Basile Starynkevitch  bas...@starynkevitch.net
 
  * gengtype.c (walk_type): Emit mark_hook when inside a
struct of a union member.

 This is OK.

 thanks, Committed revision 192092 to trunk.


 I believe this patch should be backported into GCC 4.7 and 4.6

I see no reason for this unless it is a regression.

Richard.

 Regards.
 --
 Basile STARYNKEVITCH http://starynkevitch.net/Basile/
 email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
 8, rue de la Faiencerie, 92340 Bourg La Reine, France
 *** opinions {are only mines, sont seulement les miennes} ***


Re: [PATCH] Teach VRP to handle if ((unsigned_narrowing_cast) x != 0) similarly to if ((x 0xffff) != 0) (PR tree-optimization/54810)

2012-10-04 Thread Richard Guenther
On Thu, Oct 4, 2012 at 6:31 PM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 This patch handles unsigned narrowing casts the same as
 BIT_AND_EXPR with the unsigned narrow type's max value.

 Bootstrapped/regtested on x86_64-linux and i686-linux,
 ok for trunk?

Ok.

Thanks,
Richard.

 2012-10-04  Jakub Jelinek  ja...@redhat.com

 PR tree-optimization/54810
 * tree-vrp.c (register_edge_assert_for_2): Handle
 NAME = (unsigned) NAME2; if (NAME cmp CST) for
 narrowing casts to unsigned integral type like
 NAME = NAME2  CST2; if (NAME cmp CST) where CST2
 is the max value of the unsigned integral type.

 --- gcc/tree-vrp.c.jj   2012-09-25 14:45:48.0 +0200
 +++ gcc/tree-vrp.c  2012-10-04 11:43:32.334988401 +0200
 @@ -4712,6 +4712,11 @@ register_edge_assert_for_2 (tree name, e
tree val2 = NULL_TREE;
double_int mask = double_int_zero;
unsigned int prec = TYPE_PRECISION (TREE_TYPE (val));
 +  unsigned int nprec = prec;
 +  enum tree_code rhs_code = ERROR_MARK;
 +
 +  if (is_gimple_assign (def_stmt))
 +   rhs_code = gimple_assign_rhs_code (def_stmt);

/* Add asserts for NAME cmp CST and NAME being defined
  as NAME = (int) NAME2.  */
 @@ -4721,7 +4726,7 @@ register_edge_assert_for_2 (tree name, e
gimple_assign_cast_p (def_stmt))
 {
   name2 = gimple_assign_rhs1 (def_stmt);
 - if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt))
 + if (CONVERT_EXPR_CODE_P (rhs_code)
INTEGRAL_TYPE_P (TREE_TYPE (name2))
TYPE_UNSIGNED (TREE_TYPE (name2))
prec == TYPE_PRECISION (TREE_TYPE (name2))
 @@ -4767,8 +4772,7 @@ register_edge_assert_for_2 (tree name, e
  NAME = NAME2  CST2.

  Extract CST2 from the right shift.  */
 -  if (is_gimple_assign (def_stmt)
 -  gimple_assign_rhs_code (def_stmt) == RSHIFT_EXPR)
 +  if (rhs_code == RSHIFT_EXPR)
 {
   name2 = gimple_assign_rhs1 (def_stmt);
   cst2 = gimple_assign_rhs2 (def_stmt);
 @@ -4840,21 +4844,37 @@ register_edge_assert_for_2 (tree name, e
/* Add asserts for NAME cmp CST and NAME being defined as
  NAME = NAME2  CST2.

 -Extract CST2 from the and.  */
 +Extract CST2 from the and.
 +
 +Also handle
 +NAME = (unsigned) NAME2;
 +casts where NAME's type is unsigned and has smaller precision
 +than NAME2's type as if it was NAME = NAME2  MASK.  */
names[0] = NULL_TREE;
names[1] = NULL_TREE;
cst2 = NULL_TREE;
 -  if (is_gimple_assign (def_stmt)
 -  gimple_assign_rhs_code (def_stmt) == BIT_AND_EXPR)
 +  if (rhs_code == BIT_AND_EXPR
 + || (CONVERT_EXPR_CODE_P (rhs_code)
 +  TREE_CODE (TREE_TYPE (val)) == INTEGER_TYPE
 +  TYPE_UNSIGNED (TREE_TYPE (val))
 +  TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs1 (def_stmt)))
 + prec
 +  !retval))
 {
   name2 = gimple_assign_rhs1 (def_stmt);
 - cst2 = gimple_assign_rhs2 (def_stmt);
 + if (rhs_code == BIT_AND_EXPR)
 +   cst2 = gimple_assign_rhs2 (def_stmt);
 + else
 +   {
 + cst2 = TYPE_MAX_VALUE (TREE_TYPE (val));
 + nprec = TYPE_PRECISION (TREE_TYPE (name2));
 +   }
   if (TREE_CODE (name2) == SSA_NAME
INTEGRAL_TYPE_P (TREE_TYPE (name2))
TREE_CODE (cst2) == INTEGER_CST
!integer_zerop (cst2)
 -  prec = HOST_BITS_PER_DOUBLE_INT
 -  (prec  1
 +  nprec = HOST_BITS_PER_DOUBLE_INT
 +  (nprec  1
   || TYPE_UNSIGNED (TREE_TYPE (val
 {
   gimple def_stmt2 = SSA_NAME_DEF_STMT (name2);
 @@ -4881,12 +4901,12 @@ register_edge_assert_for_2 (tree name, e
   bool valid_p = false, valn = false, cst2n = false;
   enum tree_code ccode = comp_code;

 - valv = tree_to_double_int (val).zext (prec);
 - cst2v = tree_to_double_int (cst2).zext (prec);
 + valv = tree_to_double_int (val).zext (nprec);
 + cst2v = tree_to_double_int (cst2).zext (nprec);
   if (!TYPE_UNSIGNED (TREE_TYPE (val)))
 {
 - valn = valv.sext (prec).is_negative ();
 - cst2n = cst2v.sext (prec).is_negative ();
 + valn = valv.sext (nprec).is_negative ();
 + cst2n = cst2v.sext (nprec).is_negative ();
 }
   /* If CST2 doesn't have most significant bit set,
  but VAL is negative, we have comparison like
 @@ -4894,7 +4914,7 @@ register_edge_assert_for_2 (tree name, e
   if (!cst2n  valn)
 ccode = ERROR_MARK;
   if (cst2n)
 -   sgnbit = double_int_one.llshift (prec - 1, prec).zext (prec);
 +   sgnbit = double_int_one.llshift (nprec - 1, nprec).zext (nprec);

Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s

2012-10-04 Thread Basile Starynkevitch
On Thu, Oct 04, 2012 at 07:26:23PM +0200, Richard Guenther wrote:
 On Thu, Oct 4, 2012 at 7:24 PM, Basile Starynkevitch
 bas...@starynkevitch.net wrote:
  On Thu, Oct 04, 2012 at 06:51:35PM +0300, Laurynas Biveinis wrote:
   2012-10-03  Basile Starynkevitch  bas...@starynkevitch.net
  
   * gengtype.c (walk_type): Emit mark_hook when inside a
 struct of a union member.
 
  This is OK.
 
  thanks, Committed revision 192092 to trunk.
 
 
  I believe this patch should be backported into GCC 4.7 and 4.6
 
 I see no reason for this unless it is a regression.

If GCC 4.7 will have future micro releases, (like an hypothetical 4.7.3) they 
will have the same bug.
What is the procedure to get this bug fixed in 4.7.3?

(and there are plugins for 4.7 affected by this bug, http://gcc-melt.org/ for 
example)

Regards.
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: PING^2: [patch] pr/54508: fix incomplete debug information for class

2012-10-04 Thread Cary Coutant
   /* We also have to mark its parents as used.
 -(But we don't want to mark our parents' kids due to this.)  */
 +(But we don't want to mark our parent's kids due to this,
 +unless it is a class.)  */
   if (die-die_parent)
 -   prune_unused_types_mark (die-die_parent, 0);
 +   prune_unused_types_mark (die-die_parent,
 +(die-die_parent-die_tag == 
 DW_TAG_class_type ||
 + die-die_parent-die_tag == 
 DW_TAG_structure_type ||
 + die-die_parent-die_tag == 
 DW_TAG_union_type));

I'd suggest replacing these conditions with a call to class_scope_p().
That will also cover DW_TAG_interface_type, which might be irrelevant
for this particular case, but is probably good to cover in the general
case.

-cary


Re: [patch][lra] Improve initial program point density in lra-lives.c

2012-10-04 Thread Vladimir Makarov

On 10/04/2012 12:56 PM, Steven Bosscher wrote:

On Thu, Oct 4, 2012 at 6:12 PM, Vladimir Makarov vmaka...@redhat.com wrote:

0.6% sounds really very different from my timings. How much time does
create_start_finish_chains take for you?


0.65% (2.78s).

Actually, I have a profile but I am not sure now that it is for PR54146.
It might be for PR26854.

I'll check it again to be sure.

Not it looks about the same.

Well, that's very strange. Maybe we measure these things differently?
I just hi-hack a timevar, so I measure e.g. the time spent in
create_start_finish_chains like so:

Index: lra-lives.c
===
--- lra-lives.c (revision 192052)
+++ lra-lives.c (working copy)
@@ -770,6 +812,7 @@ create_start_finish_chains (void)
int i, max_regno;
lra_live_range_t r;

+timevar_push (TV_CPROP);
lra_start_point_ranges = XCNEWVEC (lra_live_range_t, lra_live_max_point);
lra_finish_point_ranges = XCNEWVEC (lra_live_range_t, lra_live_max_point);
max_regno = max_reg_num ();
@@ -783,6 +826,7 @@ create_start_finish_chains (void)
   lra_finish_point_ranges[r-finish] = r;
 }
  }
+timevar_pop (TV_CPROP);
  }

  /* Rebuild LRA_START_POINT_RANGES and LRA_FINISH_POINT_RANGES after


so that I get the timings in the -ftime-report like so:

  CPROP   :  43.14 ( 4%) usr
  integrated RA   : 200.81 (17%) usr
  LRA non-specific:  62.18 ( 5%) usr
  LRA virtuals elimination:  61.71 ( 5%) usr
  LRA reload inheritance  :   6.41 ( 1%) usr
  LRA create live ranges  :  139.75 (13%) usr
  LRA hard reg assignment : 130.90 (11%) usr
  LRA coalesce pseudo regs:   2.45 ( 0%) usr
  reload  :   9.09 ( 1%) usr

Crude, but efficient (tm) :-)

How do you measure the time spent in that function, and in
remove_some_program_points_and_update_live_ranges?

You use AMD and I use Intel.  So it may be different with cache point of 
view.


Another thing is that I used gprof (-pg was used for bitmap.o lra*.o and 
ira*.o).  Your measurements are more accurate, I think, because it is 
without instrumentation and bitmap.o takes too much time. Bitmap does 
not work well in this case because they are too big and sparse.




Re: Convert more non-GTY htab_t to hash_table.

2012-10-04 Thread Lawrence Crowl
On 10/4/12, Richard Guenther rguent...@suse.de wrote:
 On Tue, 2 Oct 2012, Lawrence Crowl wrote:
 On 10/2/12, Richard Guenther rguent...@suse.de wrote:
  On Mon, 1 Oct 2012, Lawrence Crowl wrote:
   Change more non-GTY hash tables to use the new type-safe
   template hash table.  Constify member function parameters that
   can be const.  Correct a couple of expressions in formerly
   uninstantiated templates.
  
   The new code is 0.362% faster in bootstrap, with a 99.5%
   confidence of being faster.
  
   Tested on x86-64.
  
   Okay for trunk?
 
  You are changing a hashtable used by fold checking, did you test
  with fold checking enabled?

 I didn't know I had to do anything beyond the normal make check.
 What do I do?

  +/* Data structures used to maintain mapping between basic blocks and
  +   copies.  */
  +static hash_table bb_copy_hasher bb_original;
  +static hash_table bb_copy_hasher bb_copy;
 
  note that because hash_table has a constructor we now get global
  CTORs for all statics :( (and mx-protected local inits ...)

 The overhead for the global constructors isn't significant.
 Only the function-local statics have mx-protection, and that can
 be eliminated by making them global static.

  Can you please try to remove the constructor from hash_table to
  avoid this overhead?  (as a followup - that is, don't initialize
  htab)

 The initialization avoids potential errors in calling dispose.
 I can do it, but I don't think the overhead (after moving the
 function-local statics to global) will matter, and so I prefer to
 keep the safety.  So is the move of the statics sufficient or do
 you still want to remove constructors?

 Hm, having them in-scope where they are used is good style.
 Why can't they be statically initialized and put in .data?
 Please make it so - you know C++ enough (ISTR value-initialization
 is default - which means NULL for the pointer?)

Zero initialization is default for static variables, but not for
local or heap variables.  We can live with the uninitialized memory
in some cases, and add another function to explicitly null the
member in the rest of the cases.  I am not convinced that extra
coding is worth the performance difference, particularly as I do
not expect that difference to be measureable.

However we decide here, I think that work should be a separate patch,
as it will certainly touch more files than the current patch.  So,
can we separate the issue?


 Richard.

  The cfg.c, dse.c and hash-table.h parts are ok for trunk, I'll
  leave the rest to respective maintainers of the pieces of the
  compiler.
 
  Thanks,
  Richard.
 
 
  Index: gcc/java/ChangeLog
 
  2012-10-01  Lawrence Crowl  cr...@google.com
 
* Make-lang.in (JAVA_OBJS): Add dependence on hash-table.o.
(JCFDUMP_OBJS): Add dependence on hash-table.o.
(jcf-io.o): Add dependence on hash-table.h.
* jcf-io.c (memoized_class_lookups): Change to use type-safe hash
  table.
 
  Index: gcc/c/ChangeLog
 
  2012-10-01  Lawrence Crowl  cr...@google.com
 
* Make-lang.in (c-decl.o): Add dependence on hash-table.h.
* c-decl.c (detect_field_duplicates_hash): Change to new type-safe
hash table.
 
  Index: gcc/objc/ChangeLog
 
  2012-10-01  Lawrence Crowl  cr...@google.com
 
* Make-lang.in (OBJC_OBJS): Add dependence on hash-table.o.
(objc-act.o): Add dependence on hash-table.h.
* objc-act.c (objc_detect_field_duplicates): Change to new type-safe
hash table.
 
  Index: gcc/ChangeLog
 
  2012-10-01  Lawrence Crowl  cr...@google.com
 
* Makefile.in (fold-const.o): Add depencence on hash-table.h.
(dse.o): Likewise.
(cfg.o): Likewise.
* fold-const.c (fold_checksum_tree): Change to new type-safe hash
  table.
* (print_fold_checksum): Likewise.
* cfg.c (var bb_original): Likewise.
* (var bb_copy): Likewise.
* (var loop_copy): Likewise.
* hash-table.h (template hash_table): Constify parameters for find...
and remove_elt... member functions.
  (hash_table::empty) Correct size expression.
  (hash_table::clear_slot) Correct deleted entry assignment.
* dse.c (var rtx_group_table): Change to new type-safe hash table.
 
  Index: gcc/cp/ChangeLog
 
  2012-10-01  Lawrence Crowl  cr...@google.com
 
* Make-lang.in (class.o): Add dependence on hash-table.h.
(tree.o): Likewise.
(semantics.o): Likewise.
* class.c (fixed_type_or_null): Change to new type-safe hash table.
* tree.c (verify_stmt_tree): Likewise.
(verify_stmt_tree_r): Likewise.
* semantics.c (struct nrv_data): Likewise.
 
 
  Index: gcc/java/Make-lang.in
  ===
  --- gcc/java/Make-lang.in (revision 191941)
  +++ gcc/java/Make-lang.in (working copy)
  @@ -83,10 +83,10 @@ JAVA_OBJS = java/class.o java/decl.o jav
 java/zextract.o java/jcf-io.o java/win32-host.o java/jcf-parse.o
  java/mangle.o \
 java/mangle_name.o java/builtins.o java/resource.o \
 java/jcf-depend.o \
  

Re: patch to fix constant math

2012-10-04 Thread Kenneth Zadeck

On 10/04/2012 12:58 PM, Richard Guenther wrote:

On Thu, Oct 4, 2012 at 3:55 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:

Let me talk about the mode here first.

What this interface/patch provides is a facility where the constant math
that is done in optimizations is done exactly the way that it would be done
on the target machine.   What we have now is a compiler that only does this
if it convenient to do on the host.   I admit that i care about this more
than others right now, but if intel adds a couple of more instructions to
their vector units, other people will start to really care about this issue.
If you take an OImode value with the current compiler and left shift it by
250 the middle end will say that the result is 0.   This is just wrong!!!

What this means is that the bitsize and precision of the operations need to
be carried along when doing math. when wide-int  checks for overflow on the
multiply or add, it is not checking the if the value overflowed on two HWIs,
it is checking if the add overflowed in the mode of the types that are
represented on the target.   When we do shift, we are not doing a shift
within two HWIs, we are truncating the shift value (if this is appropriate)
according to the bitsize and shifting according the precision.

I think that an argument could be made that storing the mode should be
changed to an explicit precision and bitsize.  (A possible other option
would be to store a tree type, but this would make the usage at the rtl
level very cumbersome since types are rare.) Aside from the work, you would
not get much push back.

But the signess is a different argument.   At the rtl level, the signess is
a matter of context.   (you could argue that this is a mistake and i would
agree, but that is an even bigger change.)   But more to the point, at the
tree level, there are a surprising number of places where the operation
desired does not follow the sign of the types that were used to construct
the constants.   Furthermore, not carrying the sign is more consistent with
the double int code, which as you point out carries nothing.

Well, on RTL the signedness is on the operation (you have sdiv and udiv, etc.).
yes, there is a complete enough set of operations that allow you to 
specify the signess where this matters.



double-int tries to present a sign-less twos-complement entity of size
2 * HOST_BITS_PER_WIDE_INT.  I think that is sensible and for
obvious reasons should not change.  Both tree and RTL rely on this.
What we do not want is that up to TImode you get an internal representation
done one way (twos-complement) and on OImode and larger you
suddenly get subtly different behavior.  That's a recepie for desaster.


This is the main difference between double-int and wide-int.Wide int 
does the math the way the machine does it or the way the front end would 
expect it to be done.There is nothing about the host that is visible 
in the interfaces.


I reiterate, our world is already bigger than 128 bits and the intel 
world is likely to be soon.   Double int is stuck in a 64/128 bit world. 
these patches, which i admit are huge, are a way out of that box.




I'd like to clean up the interface to double-int some more (now with the
nice C++ stuff we have).  double-int should be pure twos-complement,
there should be no operations on double-ints that behave differently
when done signed or unsigned, instead we have signed and unsigned
versions of the operations (similar to how signedness is handled on
the RTL level).  With some trivial C++ fu you could have a
double_sint and double_uint type that would get rid of the bool
sign params we have to some functions (and then you could
write double_sint  n using operator notation).


The problem is that size does matter.wide int is effectively 
infinite precision twos complement.In practice, we can get by by 
just looking at the bitsize and precision of the types/modes involved 
and this makes the implementation faster than true infinite precision.


I went done the road trying to fix all of the places where the compiler 
either iced or got the wrong answer.   I showed this to Sandiford and he 
talked me out of it.  He was right, it was a rat hole.  It could have 
been a smaller patch but it was there were places where it was clearly 
going to take monumental work just to be able to back out and say that 
you had nothing.The number of places in the compiler where you 
compare against the largest and smallest representation of an integer is 
not small and some of them are buried very deep down chains that were 
not designed to say i cannot answer that question.


I believe that i have all of the functionality of double int in wide 
int, it is just the calls look different because there are not all of 
the interfaces that take two HWI's.As mentioned before, all of the 
places where the overflow is computed for the purpose of asking if this 
is ok in two hwi's is gone.




I'd like wide-int (whatever 

Re: Use conditional casting with symtab_node

2012-10-04 Thread Diego Novillo
On Thu, Oct 4, 2012 at 2:14 PM, Lawrence Crowl cr...@googlers.com wrote:

 So, Jan Hubicka requested and approved the current spelling.
 What now?

I don't think we should hold this up.  The names Jan requested seem
reasonable enough.  We seem to be running in circles here.


Diego.


Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s

2012-10-04 Thread Basile Starynkevitch
On Wed, Oct 03, 2012 at 01:02:44PM +0200, Basile Starynkevitch wrote:
 So I applied and I am proposing the following patch to gcc trunk 192031
 (Laurynas, I did take your remarks into account)
 # patch to trunk
 Index: gcc/gengtype.c
 ===
 --- gcc/gengtype.c(revision 192031)
 +++ gcc/gengtype.c(working copy)
 @@ -2810,6 +2810,7 @@ walk_type (type_p t, struct walk_type_data *d)
   const char *oldval = d-val;
   const char *oldprevval1 = d-prev_val[1];
   const char *oldprevval2 = d-prev_val[2];
 + const char *struct_mark_hook = NULL;
   const int union_p = t-kind == TYPE_UNION;
   int seen_default_p = 0;
   options_p o;
 @@ -2833,7 +2834,14 @@ walk_type (type_p t, struct walk_type_data *d)
 if (!desc  strcmp (o-name, desc) == 0
  o-kind == OPTION_STRING)
   desc = o-info.string;
 +   else if (!struct_mark_hook  strcmp (o-name, mark_hook) == 0
 + o-kind == OPTION_STRING)
 + struct_mark_hook = o-info.string;
  
 + if (struct_mark_hook) 
 + oprintf (d-of, %*s%s (%s));\n,
 +  d-indent, , struct_mark_hook, oldval);

Sorry for the typo, the patch should have only one closing parenthesis.

So I'm applying the following patch to trunk to correct it.

Index: gengtype.c
===
--- gengtype.c  (revision 192094)
+++ gengtype.c  (working copy)
@@ -2839,7 +2839,7 @@
struct_mark_hook = o-info.string;
 
if (struct_mark_hook)
- oprintf (d-of, %*s%s (%s));\n,
+ oprintf (d-of, %*s%s (%s);\n,
   d-indent, , struct_mark_hook, oldval);
 
d-prev_val[2] = oldval;


I hope that correcting such a typo falls into the obvious patch rule, so I 
dare committing it
right now, reusing the same ChangeLog entry.

 % svn commit gcc/gengtype.c 
Sendinggcc/gengtype.c
Transmitting file data .
Committed revision 192095.

If you feel it is wrong to correct such a typo without asking, I'll revert this 
obvious commit.

Apologies for the typo.

Cheers.

-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: PING^2: [patch] pr/54508: fix incomplete debug information for class

2012-10-04 Thread Paul_Koning

On Oct 4, 2012, at 1:38 PM, Cary Coutant wrote:

  /* We also have to mark its parents as used.
 -(But we don't want to mark our parents' kids due to this.)  */
 +(But we don't want to mark our parent's kids due to this,
 +unless it is a class.)  */
  if (die-die_parent)
 -   prune_unused_types_mark (die-die_parent, 0);
 +   prune_unused_types_mark (die-die_parent,
 +(die-die_parent-die_tag == 
 DW_TAG_class_type ||
 + die-die_parent-die_tag == 
 DW_TAG_structure_type ||
 + die-die_parent-die_tag == 
 DW_TAG_union_type));
 
 I'd suggest replacing these conditions with a call to class_scope_p().
 That will also cover DW_TAG_interface_type, which might be irrelevant
 for this particular case, but is probably good to cover in the general
 case.
 
 -cary

Thanks, that makes it very simple.  Here is the updated patch.

Ok to commit with this change?

paul

ChangeLog:

2012-10-04  Paul Koning  n...@arrl.net

* dwarf2out.c (prune_unused_types_mark): Mark all of parent's
children if parent is a class.

testsuite/ChangeLog:

2012-10-04  Paul Koning  n...@arrl.net

* g++.dg/debug/dwarf2/pr54508.C: New.

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (revision 192048)
+++ gcc/dwarf2out.c (working copy)
@@ -21035,9 +21035,11 @@
   prune_unused_types_mark_generic_parms_dies (die);
 
   /* We also have to mark its parents as used.
-(But we don't want to mark our parents' kids due to this.)  */
+(But we don't want to mark our parent's kids due to this,
+unless it is a class.)  */
   if (die-die_parent)
-   prune_unused_types_mark (die-die_parent, 0);
+   prune_unused_types_mark (die-die_parent, 
+class_scope_p (die-die_parent));
 
   /* Mark any referenced nodes.  */
   prune_unused_types_walk_attribs (die);
Index: testsuite/g++.dg/debug/dwarf2/pr54508.C
===
--- testsuite/g++.dg/debug/dwarf2/pr54508.C (revision 0)
+++ testsuite/g++.dg/debug/dwarf2/pr54508.C (revision 0)
@@ -0,0 +1,67 @@
+// PR debug/54508
+// { dg-do compile }
+// { dg-options -g2 -dA }
+
+// { dg-final { scan-assembler \cbase0\\[ \t\]+\[#;/!|@\]+ 
DW_AT_name\|DW_AT_name: \cbase\ } }
+// { dg-final { scan-assembler \OPCODE0\\[ \t\]+\[#;/!|@\]+ 
DW_AT_name\|DW_AT_name: \OPCODE\ } }
+// { dg-final { scan-assembler \bi0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } }
+// { dg-final { scan-assembler \si0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } }
+// { dg-final { scan-assembler \f10\\[ \t\]+\[#;/!|@\]+ DW_AT_name } }
+// { dg-final { scan-assembler \f20\\[ \t\]+\[#;/!|@\]+ DW_AT_name } }
+// { dg-final { scan-assembler-not \nc0\\[ \t\]+\# 
DW_AT_name\|DW_AT_name: \nc\ } }
+
+class cbase
+
+{
+public:
+ static int si;
+int bi;
+};
+
+class c : public cbase
+
+{
+public:
+ enum
+ {
+  OPCODE = 251
+ };
+ int i ;
+ static const char *testc (void) { return foo; }
+};
+
+struct s
+{
+int f1;
+static const char *tests (void) { return test; }
+};
+
+union u
+{
+int f2;
+double d;
+static const char *testu (void) { return test union; }
+};
+
+namespace n 
+{
+const char *ntest (void) { return test n; }
+
+class nc
+{
+public:
+int i;
+static int sj;
+};
+}
+
+extern void send (int, int, const void *, int);
+
+void test (int src)
+{
+  int cookie = 1;
+  send(src, c::OPCODE, c::testc (), cookie);
+  send(src, c::OPCODE, s::tests (), cookie);
+  send(src, c::OPCODE, u::testu (), cookie);
+  send(src, c::OPCODE, n::ntest (), cookie);
+}


Re: [libbacktrace] Fix warning as error in btest.c and hence `make test`

2012-10-04 Thread Ian Lance Taylor
On Thu, Oct 4, 2012 at 12:03 PM, Gerald Pfeifer ger...@pfeifer.com wrote:
 Running `make test` on my nightly testers I noticed the following
 in the log files:

 cc1: warnings being treated as errors
 /scratch/tmp/gerald/gcc-HEAD/libbacktrace/btest.c: In function 'f23':
 /scratch/tmp/gerald/gcc-HEAD/libbacktrace/btest.c:476: warning: 'expected' 
 may be used uninitialized in this function
 gmake[3]: *** [btest-btest.o] Error 1
 gmake[3]: Target `btest' not remade because of errors.

 Doesn't this mean that effectively the test is not executed?

I don't think so, not sure what you are getting at here.  I'm not sure
why I even wrote case 3:, j can never take on the value 3.

 2012-10-04  Gerald Pfeifer  ger...@pfeifer.com

 * btest.c (f23): Avoid uninitialized variable warning.

This is OK.

Thanks.

Ian


[PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self

2012-10-04 Thread Meador Inge
Hi All,

Currently the gcc-{ar,nm,ranlib} utilities assume that binutils is in
path when invoking the wrapped binutils program.  This goes against the
accepted practice in GCC to find sub-programs relative to where the
GCC binaries are stored and to not make assumptions about the PATH.

This patch changes the gcc-{ar,nm,ranlib} utilities to do the same
by factoring out some utility code for finding files from collect2.c.
These functions are then leveraged to find the binutils programs.
Note that similar code exist in gcc.c.  Perhaps one day everything
can be merged to the file-find files.

Tested for Windows and GNU/Linux hosts and i686-pc-linux-gnu and
arm-none-eabi targets.  OK?

P.S. I am not quite sure what is best for the copyrights and contributed
by comments in the file-find* files I added since that code was just moved.
This patch drops the contributed by and keeps all the copyright dates from
collect2.c.

2012-10-04  Meador Inge  mead...@codesourcery.com

* collect2.c (main): Call find_file_set_debug.
(find_a_find, add_prefix, prefix_from_env, prefix_from_string):
Factor out into ...
* file-find.c (New file): ... here and ...
* file-find.h (New file): ... here.
* gcc-ar.c (standard_exec_prefix): New variable.
(standard_libexec_prefix): Ditto.
(tooldir_base_prefix) Ditto.
(self_exec_prefix): Ditto.
(self_libexec_prefix): Ditto.
(self_tooldir_prefix): Ditto.
(target_version): Ditto.
(path): Ditto.
(target_path): Ditto.
(setup_prefixes): New function.
(main): Rework how wrapped programs are found.
* Makefile.in (OBJS-libcommon-target): Add file-find.o.
(AR_OBJS): New variable.
(gcc-ar$(exeext)): Add dependency on $(AR_OBJS).
(gcc-nm$(exeext)): Ditto.
(gcc-ranlib(exeext)): Ditto.
(COLLECT2_OBJS): Add file-find.o.
(collect2.o): Add file-find.h prerequisite.
(file-find.o): New rule.

Index: gcc/gcc-ar.c
===
--- gcc/gcc-ar.c(revision 192099)
+++ gcc/gcc-ar.c(working copy)
@@ -21,21 +21,110 @@
 #include config.h
 #include system.h
 #include libiberty.h
+#include file-find.h
 
 #ifndef PERSONALITY
 #error Please set personality
 #endif
 
+/* The exec prefix as derived at compile-time from --prefix.  */
+
+static const char standard_exec_prefix[] = STANDARD_EXEC_PREFIX;
+
+/* The libexec prefix as derived at compile-time from --prefix.  */
+
 static const char standard_libexec_prefix[] = STANDARD_LIBEXEC_PREFIX;
+
+/* The bindir prefix as derived at compile-time from --prefix.  */
+
 static const char standard_bin_prefix[] = STANDARD_BINDIR_PREFIX;
-static const char *const target_machine = TARGET_MACHINE;
 
+/* A relative path to be used in finding the location of tools
+   relative to this program.  */
+
+static const char *const tooldir_base_prefix = TOOLDIR_BASE_PREFIX;
+
+/* The exec prefix as relocated from the location of this program.  */
+
+static const char *self_exec_prefix;
+
+/* The libexec prefix as relocated from the location of this program.  */
+
+static const char *self_libexec_prefix;
+
+/* The tools prefix as relocated from the location of this program.  */
+
+static const char *self_tooldir_prefix;
+
+/* The name of the machine that is being targeted.  */
+
+static const char *const target_machine = DEFAULT_TARGET_MACHINE;
+
+/* The target version.  */
+
+static const char *const target_version = DEFAULT_TARGET_VERSION;
+
+/* The collection of target specific path prefixes.  */
+
+static struct path_prefix target_path;
+
+/* The collection path prefixes.  */
+
+static struct path_prefix path;
+
+/* The directory separator.  */
+
 static const char dir_separator[] = { DIR_SEPARATOR, 0 };
 
+static void
+setup_prefixes (const char *exec_path)
+{
+  const char *self;
+
+  self = getenv (GCC_EXEC_PREFIX);
+  if (!self)
+self = exec_path;
+  else
+self = concat (self, gcc- PERSONALITY, NULL);
+
+  /* Relocate the exec prefix.  */
+  self_exec_prefix = make_relative_prefix (self,
+  standard_bin_prefix,
+  standard_exec_prefix);
+  if (self_exec_prefix == NULL)
+self_exec_prefix = standard_exec_prefix;
+
+  /* Relocate libexec prefix.  */
+  self_libexec_prefix = make_relative_prefix (self,
+ standard_bin_prefix,
+ standard_libexec_prefix);
+  if (self_libexec_prefix == NULL)
+self_libexec_prefix = standard_libexec_prefix;
+
+
+  /* Build the relative path to the target-specific tool directory.  */
+  self_tooldir_prefix = concat (tooldir_base_prefix, target_machine,
+   dir_separator, NULL);
+  self_tooldir_prefix = concat (self_exec_prefix, target_machine, 
+   dir_separator, 

Re: Propagate profile counts during switch expansion

2012-10-04 Thread Easwaran Raman
Hi Honza,
 I am addressing some of the questions you raise here. Will send an
updated patch later.


On Thu, Oct 4, 2012 at 6:19 AM, Jan Hubicka hubi...@ucw.cz wrote:

  @@ -560,7 +577,6 @@ compute_outgoing_frequencies (basic_block b)
 return;
}
   }
  -
 if (single_succ_p (b))
   {
 e = single_succ_edge (b);
  @@ -568,7 +584,10 @@ compute_outgoing_frequencies (basic_block b)
 e-count = b-count;
 return;
   }
  -  guess_outgoing_edge_probabilities (b);
  +  else if (!gen_probabilities_from_existing_counts (b-succs)){
  +/* All outgoing edges of B have zero count. Guess probabilities.  */
  +guess_outgoing_edge_probabilities (b);
  +  }

 Hmm, I do not quite follow logic here.
 basic block B is one of many basic blocks that the original BB was split from.
 It is possible that B may have some of original edges, but there may be new 
 ones.
 How you can guess the outgoing probabilitie shere.  Do you have an example?

The code reaches here only when the BB has more than two successor. I
assume that this happens only when the BB is terminated by a switch
statement. During conversion to RTL, the outgoing edges of this BB and
their counts ar untouched. So I am just using those counts to compute
the edge probabilities.

 Also gen_probabilities_from_existing_counts could probably also work based
 on original edge frequencies.

Just to be clear I understand it right, you want me to use the
frequency instead of count since the frequency is derived from the
counts in profile use builds?


  +
  +  default_edge-count = default_count;
  +  if (count)
  +{
  +  edge e;
  +  edge_iterator ei;
  +  FOR_EACH_EDGE (e, ei, stmt_bb-succs)
  +e-probability = e-count * REG_BR_PROB_BASE / count;
  +}

 Hmm, this updates origina BB containing the switch statement?
The out_edges of the original BB.
 Of course,
 modulo roundoff errors, this should hold.  I wonder where did you got the
 diferences and why do you need this?

Originally, I obtained e-probability as e-count * REG_BR_PROB_BASE /
bb-count. During profile bootstrap, I noticed BBs whose sum of
outgoing edge counts exceeded the BB's count and hence I normalized
with the sum of edge counts.

 You are going to output new control flow
 and find_many_sub_basic_blocks will recompute all the counts/frequencies 
 inside
 anyway?
Sorry, I am lost here. I am just updating the probability of
stmt_bb-succs right?



  @@ -2430,7 +2523,11 @@ emit_case_nodes (rtx index, case_node_ptr node, rt
unsignedp),
  GT, NULL_RTX, mode, unsignedp,
  label_rtx (node-right-code_label));
  -  emit_case_nodes (index, node-left, default_label, index_type);
  +  probability = case_probability (node-right-count,
  +  subtree_count + 
  default_count);
  +  add_prob_note_to_last_insn (probability);
  +  emit_case_nodes (index, node-left, default_label, default_count,
  +   index_type);

 Hmm, here you seem to be distributing the probabilities of the counts reached.
 What happens in the case when the edge probability needs to be distributed 
 across
 nodes of the decision tree. I.e. in
 t(int a)
 {
   switch (a)
{
  case 100:
  case 200:
  case 300:
  case 400:
  case 500:
  case 600:
  case 700:
  case 800:
  case 900:
t();
  case 101:
  case 202:
  case 303:
  case 404:
  case 505:
  case 606:
  case 707:
  case 808:
  case 909:
q();
}
 }

Ok, that's a bug in my patch. In the above example, there are two
outgoing edges from the BB containing switch but many case nodes. I
would end up multiplying the counts by the number of cases that lead
to a label. If I divide the counts of each case_node by the number of
case nodes that jump to the same label, will that be reasonable? In
the above case, if t() is called 900 times, I will assume that each of
the 9 cases contribute a count of 100.

Thanks,
Easwaran


Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-04 Thread Steven Bosscher
On Tue, Oct 2, 2012 at 3:14 AM, Vladimir Makarov vmaka...@redhat.com wrote:
   Analogous live ranges are used in IRA as intermidiate step to build a
 conflict graph.

Right, ira-lives.c and lra-lives.c look very much alike, the only
major difference is that the object of interest in an IRA live range
is an ira_object_t, and in an LRA live range it's just a regno. But
the code of create_start_finish_chains,
ira_rebuild_start_finish_chains,
remove_some_program_points_and_update_live_ranges, and most the live
range printing functions, are almost identical between lra-lives.c and
ira-lives.c. That looks like unnecessary code duplication. Do you
think some of that code be shared (esp. in a C++ world where maybe a
live range can be a container for another type)?

Ciao!
Steven


[BZ 50356] Obvious bugfix in h8300 bits

2012-10-04 Thread Jeff Law
This is a pretty obvious typo/thinko.  Interestingly enough, fixing it 
makes absolutely no difference in the generated code for newlib ;-)  One 
could make an argument it should just be zapped, but I'm not going to 
open that can of worms today.


Obviously I tested this by building newlib before and after this change :-)


* PR target/50356
* config/h8300/h8300.c (h8300_rtx_costs): Fix typo in CONST_INT case.

Index: config/h8300/h8300.c
===
--- config/h8300/h8300.c(revision 192102)
+++ config/h8300/h8300.c(working copy)
@@ -1,6 +1,6 @@
 /* Subroutines for insn-output.c for Renesas H8/300.
Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
-   2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
+   2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012
Free Software Foundation, Inc.
Contributed by Steve Chamberlain (s...@cygnus.com),
Jim Wilson (wil...@cygnus.com), and Doug Evans (d...@cygnus.com).
@@ -1244,7 +1244,7 @@
*total = 0;
return true;
  }
-   if (-4 = n || n = 4)
+   if (-4 = n  n = 4)
  {
switch ((int) n)
  {


Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-04 Thread Janus Weil
Hi all,

 Btw, note that we are using a double underscore scheme in other places
 (like __class, __vtab, __vtype, etc). I have even used an '@' in one
 place, namely (hidden) procedure pointer results (ppr@). Is there a
 need to unify all those cases?


 It think it would be useful to unify those. Are you volunteering?

 yeah, why not ;)

 Attached is a draft patch (not regtested), which adds a macro
 GFC_PREFIX (in gfortran.h) to prepend _F to the cases included in
 Tobias' earlier patch as well as the OOP-related stuff and procedure
 pointer results. It also bumps the module version.

 Any comments so far? (Of course the name of the macro can be debated.
 I just tried to keep it short for now.)

unfortunately my previous patch regressed on the proc_ptr_result test
cases (due to problems with implicit typing of symbols with leading
underscores, which also were the reason for using a suffix instead of
a prefix for proc-ptr results in the first place). So I have taken out
the 'ppr' parts, leaving only Tobias' original cases and the OOP
stuff, which at least should be regression-free now.

There are some more double-underscore cases which one could also
change into the new _F convention. Should I keep going in this
direction, or should we rather restrict this to the leading dot
cases for now? I guess this is a question of how much ABI breaking we
are willing to take. Opinions?

Cheers,
Janus


mangling_v2.diff
Description: Binary data


Re: patch to fix

2012-10-04 Thread Marc Glisse

On Wed, 3 Oct 2012, Kenneth Zadeck wrote:

i have already converted the vrp code, so i have some guess at where you are 
talking about.  (of course correct me if i am wrong).


in the code that computes the range when two variables are multiplied 
together needs to do a multiplication that produces a result that is twice as 
wide as the inputs.


my library is able to do that with one catch (and this is a big catch): the 
target has to have an integer mode that is twice as big as the mode of the 
operands. The issue is that wide-ints actually carry around the mode of the 
value in order to get the bitsize and precision of the operands (it does not 
have the type, because this code has to both work on the rtl and tree level 
and i generally do not want the signness anyway).


Ah, after reading the whole thread, now I understand that it is because 
wide_int carries a mode that it makes little sense making it a template 
(sorry that it took me so long when the information was in your first 
answer). I understand that it would be inconvenient (much longer code) to 
have a base_wide_int that does just the arithmetic and a wrapper that 
contains the mode as well.


Your idea below to define dummy extra modes does bring the template idea 
back to the table though ;-)


my current code in vrp checks to see if such a mode exists and if it does, it 
produces the product.   if the mode does not exist, it returns bottom.   What 
this means is that for most (many or some) targets that have a TImode, the 
largest thing that particular vrp discover ranges for is a DImode value.   We 
could get around this by defining the next larger mode than what the target 
really needs but i wonder how much mileage you are going to get out of that 
with really large numbers.


The current wrapping multiplication code in vrp works with a pair of 
double_int, so it should keep working with a pair of wide_int. I see now 
why wide_int doesn't allow to simplify the code, but it doesn't have to 
break.


--
Marc Glisse


Re: PING^2: [patch] pr/54508: fix incomplete debug information for class

2012-10-04 Thread Paul_Koning
Updated patch: there were two existing testcases that needed to be adjusted 
because of this fix.

Ran check RUNTESTFLAGS=dwarf2.exp, no regressions.

paul

ChangeLog:

2012-10-04  Paul Koning  n...@arrl.net

* dwarf2out.c (prune_unused_types_mark): Mark all of parent's
children if parent is a class.

testsuite/ChangeLog:

2012-10-04  Paul Koning  n...@arrl.net

* g++.dg/debug/dwarf2/pr54508.C: New.
* g++.dg/debug/dwarf2/localclass1.C: Expect staticfn1, staticfn2,
method1 in debug output.
* g++.dg/debug/dwarf2/localclass2.C: Likewise.

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (revision 192048)
+++ gcc/dwarf2out.c (working copy)
@@ -21035,9 +21035,11 @@
 prune_unused_types_mark_generic_parms_dies (die);

 /* We also have to mark its parents as used.
-(But we don't want to mark our parents' kids due to this.)  */
+(But we don't want to mark our parent's kids due to this,
+unless it is a class.)  */
 if (die-die_parent)
-   prune_unused_types_mark (die-die_parent, 0);
+   prune_unused_types_mark (die-die_parent, 
+class_scope_p (die-die_parent));

 /* Mark any referenced nodes.  */
 prune_unused_types_walk_attribs (die);
Index: gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C
===
--- gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C (revision 0)
+++ gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C (revision 0)
@@ -0,0 +1,67 @@
+// PR debug/54508
+// { dg-do compile }
+// { dg-options -g2 -dA -fno-merge-debug-strings }
+
+// { dg-final { scan-assembler \cbase0\\[ \t\]+\[#;/!|@\]+ DW_AT_name 
} }
+// { dg-final { scan-assembler \OPCODE0\\[ \t\]+\[#;/!|@\]+ DW_AT_name 
} }
+// { dg-final { scan-assembler \bi0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } }
+// { dg-final { scan-assembler \si0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } }
+// { dg-final { scan-assembler \f10\\[ \t\]+\[#;/!|@\]+ DW_AT_name } }
+// { dg-final { scan-assembler \f20\\[ \t\]+\[#;/!|@\]+ DW_AT_name } }
+// { dg-final { scan-assembler-not \nc0\\[ \t\]+\# DW_AT_name } }
+
+class cbase
+
+{
+public:
+ static int si;
+int bi;
+};
+
+class c : public cbase
+
+{
+public:
+ enum
+ {
+  OPCODE = 251
+ };
+ int i ;
+ static const char *testc (void) { return foo; }
+};
+
+struct s
+{
+int f1;
+static const char *tests (void) { return test; }
+};
+
+union u
+{
+int f2;
+double d;
+static const char *testu (void) { return test union; }
+};
+
+namespace n 
+{
+const char *ntest (void) { return test n; }
+
+class nc
+{
+public:
+int i;
+static int sj;
+};
+}
+
+extern void send (int, int, const void *, int);
+
+void test (int src)
+{
+  int cookie = 1;
+  send(src, c::OPCODE, c::testc (), cookie);
+  send(src, c::OPCODE, s::tests (), cookie);
+  send(src, c::OPCODE, u::testu (), cookie);
+  send(src, c::OPCODE, n::ntest (), cookie);
+}
Index: gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C
===
--- gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (revision 192048)
+++ gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (working copy)
@@ -59,11 +59,11 @@
 // { dg-final { scan-assembler foo\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler staticfn1\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler staticfn2\[^\n\r\]*DW_AT_name } }
-// { dg-final { scan-assembler-not staticfn3\[^\n\r\]*DW_AT_name } }
-// { dg-final { scan-assembler-not staticfn4\[^\n\r\]*DW_AT_name } }
+// { dg-final { scan-assembler staticfn3\[^\n\r\]*DW_AT_name } }
+// { dg-final { scan-assembler staticfn4\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler-not staticfn5\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler-not staticfn6\[^\n\r\]*DW_AT_name } }
-// { dg-final { scan-assembler-not method1\[^\n\r\]*DW_AT_name } }
+// { dg-final { scan-assembler method1\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler arg1\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler arg2\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler arg3\[^\n\r\]*DW_AT_name } }
Index: gcc/testsuite/g++.dg/debug/dwarf2/localclass2.C
===
--- gcc/testsuite/g++.dg/debug/dwarf2/localclass2.C (revision 192048)
+++ gcc/testsuite/g++.dg/debug/dwarf2/localclass2.C (working copy)
@@ -59,11 +59,11 @@
 // { dg-final { scan-assembler foo\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler staticfn1\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler staticfn2\[^\n\r\]*DW_AT_name } }
-// { dg-final { scan-assembler-not staticfn3\[^\n\r\]*DW_AT_name } }
-// { dg-final { scan-assembler-not staticfn4\[^\n\r\]*DW_AT_name } }
+// { dg-final { scan-assembler staticfn3\[^\n\r\]*DW_AT_name } }
+// { dg-final { scan-assembler 

Re: [PATCH] PR 53528 c++/ C++11 Generalized Attribute support

2012-10-04 Thread Jason Merrill

On 09/20/2012 02:59 AM, Dodji Seketeli wrote:

+  if ((flags  ATTR_FLAG_CXX11)
+  !(flags  ATTR_FLAG_TYPE_IN_PLACE
+   (TREE_CODE (*node) == RECORD_TYPE
+  || TREE_CODE (*node) == UNION_TYPE)))
+   {
+ /* unused is being used as a c++11 attribute.  In this mode
+we prevent it from applying to types, unless it's for a
+class defintion.  */
+ warning (OPT_Wattributes,
+  attribute %qE cannot be applied to a non-class type, name);
+ return NULL_TREE;
+   }


I think this should now be covered by the general ignoring of
attributes that appertain to type-specifiers, so we don't need to
check it here.


Removed.


Looks to me like it's still there.


+int attr_flag = attribute_takes_identifier_p (attr_id)
+  ? id_attr
+  : normal_attr;


This should only apply to attributes in the gnu namespace.

Jason



Re: [libbacktrace] Fix warning as error in btest.c and hence `make test`

2012-10-04 Thread Gerald Pfeifer
On Thu, 4 Oct 2012, Ian Lance Taylor wrote:
 cc1: warnings being treated as errors
 /scratch/tmp/gerald/gcc-HEAD/libbacktrace/btest.c: In function 'f23':
 /scratch/tmp/gerald/gcc-HEAD/libbacktrace/btest.c:476: warning: 'expected' 
 may be used uninitialized in this function
 gmake[3]: *** [btest-btest.o] Error 1
 gmake[3]: Target `btest' not remade because of errors.

 Doesn't this mean that effectively the test is not executed?
 I don't think so, not sure what you are getting at here.  

With warnings being treated as error the warning lead to
btest-btest.o] Error 1 and so I thought the build did not
actually go through.

Anyway, after my patch this is now gone. :-)  Thanks for the
quick review!

Gerald


[wwwdocs] SH 4.8 changes update

2012-10-04 Thread Oleg Endo
Hello,

The atomic options of SH have been changed recently.  The attached patch
updates the 4.8 changes.html accordingly, plus some minor wording fixes.

OK?

Cheers,
Oleg
? www_4_8_sh_changes_2.patch
Index: htdocs/gcc-4.8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.35
diff -u -r1.35 changes.html
--- htdocs/gcc-4.8/changes.html	3 Oct 2012 22:33:46 -	1.35
+++ htdocs/gcc-4.8/changes.html	4 Oct 2012 21:47:18 -
@@ -255,19 +255,45 @@
 
 liImproved support for the code__atomic/code built-in functions:
 ul
-  liMinor improvements to code generated for software atomic sequences
-  that are enabled by code-msoft-atomic/code./li
+  liA new option code-matomic-model=imodel/i/code selects the
+  model for the generated atomic sequences.  The following models are
+  supported:
+  ul
+licodesoft-gusa/codebr
+Software gUSA sequences (SH3* and SH4* only).  On SH4A targets this
+will now also partially utilize the codemovco.l/code and
+codemovli.l/code instructions.  This is the default when the target
+is codesh3*-*-linux*/code or codesh4*-*-linux*/code./li
+
+licodehard-llcs/codebr
+Hardware codemovco.l/code / codemovli.l/code sequences
+(SH4A only)./li
+
+licodesoft-tcb/codebr
+Software thread control block sequences./li
+
+licodesoft-imask/codebr
+Software interrupt flipping sequences (privileged mode only).  This is
+the default when the target is codesh1*-*-linux*/code or
+codesh2*-*-linux*/code./li
+
+licodenone/codebr
+Generates function calls to the respective code__atomic/code
+built-in functions.  This is the default for SH64 targets or when the
+target is not codesh*-*-linux*/code./li
+  /ul/li
+
+  liThe option code-msoft-atomic/code has been deprecated.  It is
+  now an alias for code-matomic-model=soft-gusa/code./li
 
   liA new option code-mtas/code makes the compiler generate
   the codetas.b/code instruction for the
-  code__atomic_test_and_set/code built-in function./li
+  code__atomic_test_and_set/code built-in function regardless of the
+  selected atomic model./li
+
+  liThe code__sync/code functions in codelibgcc/code now reflect
+  the selected atomic model when building the toolchain./li
 
-  liThe SH4A instructions codemovco.l/code and
-  codemovli.l/code are now supported.  They are used to implement some
-  software atomic sequences that are enabled by code-msoft-atomic/code.
-  In addition to that, pure codemovco.l/code / codemovli.l/code
-  atomic sequences can be enabled with the new option
-  code-mhard-atomic/code./li
 /ul/li
 
 liAdded support for the codemov.b/code and codemov.w/code
@@ -280,11 +306,11 @@
 
 liImprovements to conditional branches and code that involves the T bit.
 A new option code-mzdcbranch/code tells the compiler to favor
-zero-displacement branches.  This is enabled by default for SH4 and
-SH4A./li
+zero-displacement branches.  This is enabled by default for SH4* targets.
+/li
 
 liThe codepref/code instruction will now be emitted by the
-code__builtin_prefetch/code built-in function for SH3./li
+code__builtin_prefetch/code built-in function for SH3* targets./li
 
 liThe codefmac/code instruction will now be emitted by the
 codefmaf/code standard function and the code__builtin_fmaf/code
@@ -298,7 +324,7 @@
 
 liAdded new options code-mfsrra/code and code-mfsca/code to allow
 the compiler using the codefsrra/code and codefsca/code
-instructions on CPUs other than SH4A (where they are already enabled by
+instructions on targets other than SH4A (where they are already enabled by
 default)./li
 
 liAdded support for the code__builtin_bswap32/code built-in function.


Re: patch to fix

2012-10-04 Thread Kenneth Zadeck

There are a bunch of ways to skin the cat.

1) we can define the extra mode.
2) if we get rid of the mode inside the wide int and replace it with an 
explicit precision and bitsize, then we can just make the size of the 
buffer twice as big as the analysis of the modes indicates.
3) or we can leave your code in a form that uses 2 wide ints.   my 
current patch (which i have not gotten working yet) changes this to use 
the mul_full call, but it could be changed.   It is much simpler that 
the existing code.


i do not see how templates offer any solution at all.   the wide int 
code really needs to have something valid to indicate the length of the 
object, and if there is no mode that big, the code will ice.


my personal feeling is that range analysis is quite useful for small 
integers and not so much as the values get larger.   The only really 
large integer constants that you are likely to find in real code are 
encryption keys, like the dvd decoding keys, and if the key is chosen 
well there should be no useful optimization that you can perform on the 
code.   If this did not work on the largest modes, no one would ever 
notice.   i.e. i would bet that you never make a useful transformation 
on any integer that would not fit in an int32.


However, this is your pass, and i understand the principal of never 
back down.


Kenny

On 10/04/2012 05:06 PM, Marc Glisse wrote:

On Wed, 3 Oct 2012, Kenneth Zadeck wrote:

i have already converted the vrp code, so i have some guess at where 
you are talking about.  (of course correct me if i am wrong).


in the code that computes the range when two variables are multiplied 
together needs to do a multiplication that produces a result that is 
twice as wide as the inputs.


my library is able to do that with one catch (and this is a big 
catch): the target has to have an integer mode that is twice as big 
as the mode of the operands. The issue is that wide-ints actually 
carry around the mode of the value in order to get the bitsize and 
precision of the operands (it does not have the type, because this 
code has to both work on the rtl and tree level and i generally do 
not want the signness anyway).


Ah, after reading the whole thread, now I understand that it is 
because wide_int carries a mode that it makes little sense making it a 
template (sorry that it took me so long when the information was in 
your first answer). I understand that it would be inconvenient (much 
longer code) to have a base_wide_int that does just the arithmetic and 
a wrapper that contains the mode as well.


Your idea below to define dummy extra modes does bring the template 
idea back to the table though ;-)


my current code in vrp checks to see if such a mode exists and if it 
does, it produces the product.   if the mode does not exist, it 
returns bottom.   What this means is that for most (many or some) 
targets that have a TImode, the largest thing that particular vrp 
discover ranges for is a DImode value.   We could get around this by 
defining the next larger mode than what the target really needs but i 
wonder how much mileage you are going to get out of that with really 
large numbers.


The current wrapping multiplication code in vrp works with a pair of 
double_int, so it should keep working with a pair of wide_int. I see 
now why wide_int doesn't allow to simplify the code, but it doesn't 
have to break.






[C++ testcase] PR 53403

2012-10-04 Thread Paolo Carlini

Hi,

I'm adding the testcase and closing the PR. Tested x86_64-linux.

Thanks,
Paolo.

/
2012-10-04  Paolo Carlini  paolo.carl...@oracle.com

PR c++/53403
* g++.dg/template/friend53.C
Index: g++.dg/template/friend53.C
===
--- g++.dg/template/friend53.C  (revision 0)
+++ g++.dg/template/friend53.C  (working copy)
@@ -0,0 +1,23 @@
+// PR c++/53403
+
+template typename T
+class Foo
+{
+  typedef void type;
+  template typename U friend void f();
+public:
+  Foo() {}
+};
+
+template class Foovoid;
+
+template typename T
+void f()
+{
+  typedef Foovoid::type type;
+}
+
+int main()
+{
+  fvoid();
+}


Re: [Ping]FW: [PATCH] Cilk Plus merging to trunk (2 of n)

2012-10-04 Thread Joseph S. Myers
On Thu, 4 Oct 2012, Iyer, Balaji V wrote:

   Did you get a chance to look at this submission? I think I have 
 fixed all the changes you have mentioned. Is it OK for trunk?

I expect to look at the revised specification and patch later this month.  
My initial impression from a glance at the revised specification was that 
it was much improved, but some issues would probably still require further 
work on the specification - and once those had been resolved, then a 
detailed review of the patch would make sense.

-- 
Joseph S. Myers
jos...@codesourcery.com


RE: [Ping]FW: [PATCH] Cilk Plus merging to trunk (2 of n)

2012-10-04 Thread Iyer, Balaji V
Hello Joseph,
   Please see my responses below:

-Original Message-
From: Joseph Myers [mailto:jos...@codesourcery.com]
Sent: Thursday, October 04, 2012 7:45 PM
To: Iyer, Balaji V
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [Ping]FW: [PATCH] Cilk Plus merging to trunk (2 of n)

On Thu, 4 Oct 2012, Iyer, Balaji V wrote:

  Did you get a chance to look at this submission? I think I have fixed
 all the changes you have mentioned. Is it OK for trunk?

I expect to look at the revised specification and patch later this month.
My initial impression from a glance at the revised specification was that it 
was
much improved, but some issues would probably still require further work on the
specification - and once those had been resolved, then a detailed review of the
patch would make sense.

Sure! I am not sure if you noticed, but Richard Guenther gave me some 
additional feedback today and I have fixed them and submitted another patch. 
Here is the thread for that email 
(http://gcc.gnu.org/ml/gcc-patches/2012-10/msg00431.html). I just wanted to let 
you know about this.

Also, I have other parts of Cilk Plus. Can I submit them now itself or should I 
wait till this is reviewed?

Thank you very much for helping me with this!

Yours Sincerely,

Balaji V. Iyer.


--
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc

2012-10-04 Thread Joseph S. Myers
On Thu, 4 Oct 2012, Iain Buclaw wrote:

 The only patches to gcc proper are documentation-related and adding
 the D frontend / libphobos to configure and make files.  I would have
 thought that these would typically only be included with the actual
 front-end?

Looking back at my previous review comments, I suggested that you might 
need to split up c-common.[ch] so that certain parts of attribute handling 
could be shared with D, because duplicate code copied from elsewhere in 
GCC was not an appropriate implementation approach.  Have you then 
eliminated the duplicate code in some other way that does not involve 
splitting up those files so code can be shared?

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: PING^2: [patch] pr/54508: fix incomplete debug information for class

2012-10-04 Thread Cary Coutant
 Index: gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C
 ===
 --- gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (revision 192048)
 +++ gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (working copy)
 @@ -59,11 +59,11 @@
  // { dg-final { scan-assembler foo\[^\n\r\]*DW_AT_name } }
  // { dg-final { scan-assembler staticfn1\[^\n\r\]*DW_AT_name } }
  // { dg-final { scan-assembler staticfn2\[^\n\r\]*DW_AT_name } }
 -// { dg-final { scan-assembler-not staticfn3\[^\n\r\]*DW_AT_name } }
 -// { dg-final { scan-assembler-not staticfn4\[^\n\r\]*DW_AT_name } }
 +// { dg-final { scan-assembler staticfn3\[^\n\r\]*DW_AT_name } }
 +// { dg-final { scan-assembler staticfn4\[^\n\r\]*DW_AT_name } }
  // { dg-final { scan-assembler-not staticfn5\[^\n\r\]*DW_AT_name } }
  // { dg-final { scan-assembler-not staticfn6\[^\n\r\]*DW_AT_name } }
 -// { dg-final { scan-assembler-not method1\[^\n\r\]*DW_AT_name } }
 +// { dg-final { scan-assembler method1\[^\n\r\]*DW_AT_name } }
  // { dg-final { scan-assembler arg1\[^\n\r\]*DW_AT_name } }
  // { dg-final { scan-assembler arg2\[^\n\r\]*DW_AT_name } }
  // { dg-final { scan-assembler arg3\[^\n\r\]*DW_AT_name } }

The fact that these two tests were specifically checking for the
absence of staticfn3 and staticfn4 leads me to believe that the
current behavior is deliberate. Jakub, that change was yours (it dates
back to November 2008). Are you OK with Paul's change?

It seems to me that there are cases where we just want to emit the
class for the context info (like a namespace, which doesn't have to be
complete everywhere). Is there a way to tell the debugger that this
class declaration is incomplete and that it should look elsewhere for
a full definition?

-cary


Re: PING^2: [patch] pr/54508: fix incomplete debug information for class

2012-10-04 Thread Paul_Koning

On Oct 4, 2012, at 8:26 PM, Cary Coutant wrote:

 Index: gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C
 ===
 --- gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (revision 192048)
 +++ gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (working copy)
 @@ -59,11 +59,11 @@
 // { dg-final { scan-assembler foo\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler staticfn1\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler staticfn2\[^\n\r\]*DW_AT_name } }
 -// { dg-final { scan-assembler-not staticfn3\[^\n\r\]*DW_AT_name } }
 -// { dg-final { scan-assembler-not staticfn4\[^\n\r\]*DW_AT_name } }
 +// { dg-final { scan-assembler staticfn3\[^\n\r\]*DW_AT_name } }
 +// { dg-final { scan-assembler staticfn4\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler-not staticfn5\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler-not staticfn6\[^\n\r\]*DW_AT_name } }
 -// { dg-final { scan-assembler-not method1\[^\n\r\]*DW_AT_name } }
 +// { dg-final { scan-assembler method1\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler arg1\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler arg2\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler arg3\[^\n\r\]*DW_AT_name } }
 
 The fact that these two tests were specifically checking for the
 absence of staticfn3 and staticfn4 leads me to believe that the
 current behavior is deliberate. Jakub, that change was yours (it dates
 back to November 2008). Are you OK with Paul's change?
 
 It seems to me that there are cases where we just want to emit the
 class for the context info (like a namespace, which doesn't have to be
 complete everywhere). Is there a way to tell the debugger that this
 class declaration is incomplete and that it should look elsewhere for
 a full definition?
 
 -cary

Certainly GDB does not currently do that.  As it stands, it uses whatever 
definition it finds first, so depending on which compilation unit's symbols are 
read first, what you see varies unpredictably. If there is a way to express in 
Dwarf-2 the fact that this is an incomplete definition, GDB could presumably be 
changed to take advantage of that.  I have no idea how hard that is.

paul



Re: PING^2: [patch] pr/54508: fix incomplete debug information for class

2012-10-04 Thread Paul_Koning

On Oct 4, 2012, at 8:26 PM, Cary Coutant wrote:

 Index: gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C
 ===
 --- gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (revision 192048)
 +++ gcc/testsuite/g++.dg/debug/dwarf2/localclass1.C (working copy)
 @@ -59,11 +59,11 @@
 // { dg-final { scan-assembler foo\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler staticfn1\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler staticfn2\[^\n\r\]*DW_AT_name } }
 -// { dg-final { scan-assembler-not staticfn3\[^\n\r\]*DW_AT_name } }
 -// { dg-final { scan-assembler-not staticfn4\[^\n\r\]*DW_AT_name } }
 +// { dg-final { scan-assembler staticfn3\[^\n\r\]*DW_AT_name } }
 +// { dg-final { scan-assembler staticfn4\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler-not staticfn5\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler-not staticfn6\[^\n\r\]*DW_AT_name } }
 -// { dg-final { scan-assembler-not method1\[^\n\r\]*DW_AT_name } }
 +// { dg-final { scan-assembler method1\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler arg1\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler arg2\[^\n\r\]*DW_AT_name } }
 // { dg-final { scan-assembler arg3\[^\n\r\]*DW_AT_name } }
 
 The fact that these two tests were specifically checking for the
 absence of staticfn3 and staticfn4 leads me to believe that the
 current behavior is deliberate. Jakub, that change was yours (it dates
 back to November 2008). Are you OK with Paul's change?
 
 It seems to me that there are cases where we just want to emit the
 class for the context info (like a namespace, which doesn't have to be
 complete everywhere). Is there a way to tell the debugger that this
 class declaration is incomplete and that it should look elsewhere for
 a full definition?
 
 -cary

The code itself (where I changed dwarf2out.c) is from 2003-02-28, by rth, what 
appears to be the original implementation of the -feliminate-unused-debug-types 
flag.

Looking at Jakub's change from 2008 and PR/27017 it fixes, it looks more like a 
case of much more debug information that was missing and Jakub's change 
corrected that.  It looks like those two testcase files describe the resulting 
behavior, but I don't read the discussion in PR/27017 as saying that having 
staticfn3 omitted was specifically desired. 

paul



[PATCH] PR 54789: Fix driver error when GCC_COMPARE_DEBUG is defined

2012-10-04 Thread Dmitry Gorbachev
This patch simplifies process_command a bit by using save_switch and
sets the known switch field to true.


gcc/ChangeLog:

2012-10-05  Dmitry Gorbachev  d.g.gorbac...@gmail.com

PR driver/54789
* gcc.c (process_command): Use save_switch for synthesized
-fcompare-debug=* option; mark the switch as known.

--- gcc/gcc.c
+++ gcc/gcc.c
@@ -3961,18 +3961,12 @@ process_command (unsigned int decoded_options_count,
   if (n_infiles == last_language_n_infiles  spec_lang != 0)
 warning (0, %-x %s% after last input file has no effect, spec_lang);

+  /* Synthesize -fcompare-debug flag from the GCC_COMPARE_DEBUG
+ environment variable.  */
   if (compare_debug == 2 || compare_debug == 3)
 {
-  alloc_switch ();
-  switches[n_switches].part1 = concat (fcompare-debug=,
-  compare_debug_opt,
-  NULL);
-  switches[n_switches].args = 0;
-  switches[n_switches].live_cond = 0;
-  switches[n_switches].validated = false;
-  switches[n_switches].known = false;
-  switches[n_switches].ordering = 0;
-  n_switches++;
+  const char *opt = concat (-fcompare-debug=, compare_debug_opt, NULL);
+  save_switch (opt, 0, NULL, false, true);
   compare_debug = 1;
 }


Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-04 Thread Vladimir Makarov

On 12-10-04 4:56 PM, Steven Bosscher wrote:

On Tue, Oct 2, 2012 at 3:14 AM, Vladimir Makarov vmaka...@redhat.com wrote:

   Analogous live ranges are used in IRA as intermidiate step to build a
conflict graph.

Right, ira-lives.c and lra-lives.c look very much alike, the only
major difference is that the object of interest in an IRA live range
is an ira_object_t, and in an LRA live range it's just a regno. But
the code of create_start_finish_chains,
ira_rebuild_start_finish_chains,
remove_some_program_points_and_update_live_ranges, and most the live
range printing functions, are almost identical between lra-lives.c and
ira-lives.c. That looks like unnecessary code duplication. Do you
think some of that code be shared (esp. in a C++ world where maybe a
live range can be a container for another type)?

Yes, C++ could help here.  The simplest solution is inheritance from a 
common class.  But it is not a high priority task now.