Fix gcc.dg/lower-subreg-1.c failure (was: [C Patch]: pr52543)

2012-05-09 Thread Hans-Peter Nilsson
 From: Richard Sandiford rdsandif...@googlemail.com
 Date: Tue, 1 May 2012 16:46:38 +0200

 To repeat: as things stand, very few targets define proper rtx costs
 for SET.

IMHO it's wrong to start blaming targets when rtx_cost doesn't
take the mode in account in the first place, for the default
cost.  (Well, except for the modes-tieable subreg special-case.)
The targets where an operation in N * word_mode costs no more
than one in word_mode, if there even is one, is a minority,
let's adjust the defaults to that.

  This patch is therefore expected to prevent lower-subreg
 from running in cases where it's actually benefical.  If you see that
 happening, please check whether the rtx_costs are defined properly.

Well, for CRIS (one of the targets of the PR53176 fallout) they
are sane, basically.  Where cris_rtx_costs returns true, it
returns mostly(*) ballparkly-correct costs *where it's passed an
rtx for which there's a corresponding insn*, otherwise falling
back to the defaults.  It shouldn't have to check for validity
of the rtx asked about; core GCC already knows which insns there
are and can gate that in rtx_cost or its callers.

(*) I see a bug in that cris_rtx_costs doesn't check the mode
for extendsidi2, to return COSTS_N_INSNS (3) instead of 0
(because a sign-extending move to SImode doesn't cost more than
a move; a sign- or zero-extension is also free in an operand for
addition and multiplication).  But this isn't on the path that
lower-subreg.c takes, so only an incidental observation...

 Of course, if the costs are defined properly and lower-subreg still
 makes the wrong choice, we need to look at why.

By the way, regarding validity of rtx_cost calls:

 +++ gcc/lower-subreg.c  2012-05-01 09:46:48.473830772 +0100

 +/* Return the cost of a CODE shift in mode MODE by OP1 bits, using the
 +   rtxes in RTXES.  SPEED_P selects between the speed and size cost.  */
 +
 +static int
 +shift_cost (bool speed_p, struct cost_rtxes *rtxes, enum rtx_code code,
 +   enum machine_mode mode, int op1)
 +{
 +  PUT_MODE (rtxes-target, mode);
 +  PUT_CODE (rtxes-shift, code);
 +  PUT_MODE (rtxes-shift, mode);
 +  PUT_MODE (rtxes-source, mode);
 +  XEXP (rtxes-shift, 1) = GEN_INT (op1);
 +  SET_SRC (rtxes-set) = rtxes-shift;
 +  return insn_rtx_cost (rtxes-set, speed_p);
 +}
 +
 +/* For each X in the range [0, BITS_PER_WORD), set SPLITTING[X]
 +   to true if it is profitable to split a double-word CODE shift
 +   of X + BITS_PER_WORD bits.  SPEED_P says whether we are testing
 +   for speed or size profitability.
 +
 +   Use the rtxes in RTXES to calculate costs.  WORD_MOVE_ZERO_COST is
 +   the cost of moving zero into a word-mode register.  WORD_MOVE_COST
 +   is the cost of moving between word registers.  */
 +
 +static void
 +compute_splitting_shift (bool speed_p, struct cost_rtxes *rtxes,
 +bool *splitting, enum rtx_code code,
 +int word_move_zero_cost, int word_move_cost)
 +{

I think there should be a gating check whether the target
implements that kind of shift in that mode at all, before
checking the cost.  Not sure whether it's generally best to put
that test here, or to make the rtx_cost function return the cost
of a libcall for that mode when that happens.  Similar for the
other insns.

Isn't the below better than doing virtually the same in each
target's rtx_costs?  Not tested yet besides make cc1 and
checking that lower-subreg.c yields sane costs and that
gcc.dg/lower-subreg-1.c passes for cris-elf.  Note that
untieable SUBREGs still get a higher cost than tieable ones.

I'll test this for cris-elf, please tell me if/what other tests
and targets are required (simulator or compilefarm targets only,
please).

* rtlanal.c (rtx_cost): Adjust default cost for X with a
UNITS_PER_WORD factor for all X according to the size of
its mode, not just for SUBREGs with untieable modes.

Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   (revision 187308)
+++ gcc/rtlanal.c   (working copy)
@@ -3755,10 +3755,17 @@ rtx_cost (rtx x, enum rtx_code outer_cod
   enum rtx_code code;
   const char *fmt;
   int total;
+  int factor;
 
   if (x == 0)
 return 0;
 
+  /* A size N times larger than UNITS_PER_WORD likely needs N times as
+ many insns, taking N times as long.  */
+  factor = GET_MODE_SIZE (GET_MODE (x)) / UNITS_PER_WORD;
+  if (factor == 0)
+factor = 1;
+
   /* Compute the default costs of certain things.
  Note that targetm.rtx_costs can override the defaults.  */
 
@@ -3766,20 +3773,27 @@ rtx_cost (rtx x, enum rtx_code outer_cod
   switch (code)
 {
 case MULT:
-  total = COSTS_N_INSNS (5);
+  total = factor * COSTS_N_INSNS (5);
   break;
 case DIV:
 case UDIV:
 case MOD:
 case UMOD:
-  total = COSTS_N_INSNS (7);
+  total = factor * COSTS_N_INSNS (7);
   break;
 case USE:
   /* Used in combine.c as a 

Re: [PATCH] MIPS16: Fix truncated DWARF-2 line information

2012-05-09 Thread Maciej W. Rozycki
On Tue, 8 May 2012, Richard Sandiford wrote:

   Are you using a hard-float multilib for your -mabi=32/-mips16 Linux 
  testing?
 
 Yeah.  As an example:
 
http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg00393.html
 
 which doesn't look to bad.  Clean fortran results, which I expect
 would test the FP interworking fairly heavily.  (It's certainly
 been a source of bug fixes in the past, although I don't remember
 the results ever being terrible.)

 Yes, these look very good indeed.  Especially with QEMU that I do not 
feel terribly confident about as far as MIPS16 emulation is concerned (I 
still need to track down a single piece of real silicon supporting both 
MIPS64 and MIPS16 code at a time).

 FAOD, this is with normal MIPS libraries and mips16 executables.
 There's still no way of building mips16 multilibs out of the box.

 I've checked some notes and the issue was with MIPS16 FP PIC code (and 
therefore obviously SVR4 stubs rather than PLT) indeed.  I hope these 
pieces will get submitted eventually.

  Maciej


[PATCH] Add option for dumping to stderr (issue6190057)

2012-05-09 Thread Sharad Singhai
In response to comments, I have updated the patch to support dumps in
user provided files via the option -fdump-xxx=filename. The
filenames stdout/stderr are treated specially, and are considered
standard streams.

Also updated documentation and a testcase. Okay for trunk?

Thanks,
Sharad

2012-05-08   Sharad Singhai  sing...@google.com

* doc/invoke.texi: Add documentation for new option.
* tree-dump.c (dump_stream_p): New function.
(dump_files): Update for new field.
(dump_switch_p_1): Handle user provided filenames.
(dump_begin): Likewise.
(get_dump_file_name): Likewise.
(dump_enable_all): Add new parameter USER_FILENAME.
All callers updated.
(dump_end): Remove attribute.
* tree-pass.h (enum tree_dump_index): Add new constant.
(struct dump_file_info): Add new field USER_FILENAME.
* testsuite/g++.dg/other/dump-userfile-1.C: New test.

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 187265)
+++ doc/invoke.texi (working copy)
@@ -5322,20 +5322,24 @@ Here are some examples showing uses of these optio
 
 @item -d@var{letters}
 @itemx -fdump-rtl-@var{pass}
+@itemx -fdump-rtl-@var{pass}=@var{filename}
 @opindex d
 Says to make debugging dumps during compilation at times specified by
 @var{letters}.  This is used for debugging the RTL-based passes of the
 compiler.  The file names for most of the dumps are made by appending
 a pass number and a word to the @var{dumpname}, and the files are
-created in the directory of the output file.  Note that the pass
-number is computed statically as passes get registered into the pass
-manager.  Thus the numbering is not related to the dynamic order of
-execution of passes.  In particular, a pass installed by a plugin
-could have a number over 200 even if it executed quite early.
-@var{dumpname} is generated from the name of the output file, if
-explicitly specified and it is not an executable, otherwise it is the
-basename of the source file. These switches may have different effects
-when @option{-E} is used for preprocessing.
+created in the directory of the output file. If the
+@option{=@var{filename}} is appended to the longer form of the dump
+option then the dump is done on that file instead of numbered
+files. The filenames stdout and stderr are treated specially. Note
+that the pass number is computed statically as passes get registered
+into the pass manager.  Thus the numbering is not related to the
+dynamic order of execution of passes.  In particular, a pass installed
+by a plugin could have a number over 200 even if it executed quite
+early.  @var{dumpname} is generated from the name of the output file,
+if explicitly specified and it is not an executable, otherwise it is
+the basename of the source file. These switches may have different
+effects when @option{-E} is used for preprocessing.
 
 Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
 @option{-d} option @var{letters}.  Here are the possible
@@ -5599,6 +5603,10 @@ These dumps are defined but always produce empty f
 @opindex fdump-rtl-all
 Produce all the dumps listed above.
 
+@item -fdump-rtl-all=stderr
+@opindex fdump-rtl-all=stderr
+Produce all RTL dumps on stderr.
+
 @item -dA
 @opindex dA
 Annotate the assembler output with miscellaneous debugging information.
@@ -5719,15 +5727,19 @@ counters for each function compiled.
 
 @item -fdump-tree-@var{switch}
 @itemx -fdump-tree-@var{switch}-@var{options}
+@itemx -fdump-tree-@var{switch}-@var{options}=@var{filename}
 @opindex fdump-tree
 Control the dumping at various stages of processing the intermediate
 language tree to a file.  The file name is generated by appending a
 switch specific suffix to the source file name, and the file is
-created in the same directory as the output file.  If the
-@samp{-@var{options}} form is used, @var{options} is a list of
-@samp{-} separated options which control the details of the dump.  Not
-all options are applicable to all dumps; those that are not
-meaningful are ignored.  The following options are available
+created in the same directory as the output file. In case of
+@option{=@var{filename}}, the dump output is on the given file. Note
+that the filenames stdout and stderr are treated specially and dumps
+are done on standard streams. If the @samp{-@var{options}} form is
+used, @var{options} is a list of @samp{-} separated options which
+control the details or location of the dump.  Not all options are
+applicable to all dumps; those that are not meaningful are ignored.
+The following options are available
 
 @table @samp
 @item address
@@ -5765,9 +5777,56 @@ Enable showing the tree dump for each statement.
 Enable showing the EH region number holding each statement.
 @item scev
 Enable showing scalar evolution analysis details.
+@item slim
+Inhibit dumping of members of a scope or body of a function merely
+because 

Re: patch ping: Add static branch predict heuristic of comparing IV to loop_bound variable

2012-05-09 Thread Jakub Jelinek
On Wed, May 09, 2012 at 09:02:14AM +0800, Dehao Chen wrote:
 Sorry for the error. Here is a new patch to fix them:
 
 gcc/testsuite/ChangeLog:
 2012-05-08  Dehao Chen  de...@google.com
 
   * gcc.dg/predict-1.c: Remove the replicated text in this text.
   * gcc.dg/predict-2.c: Likewise.
   * gcc.dg/predict-3.c: Likewise.
   * gcc.dg/predict-4.c: Likewise.
   * gcc.dg/predict-5.c: Likewise.
   * gcc.dg/predict-6.c: Likewise.

Ok (you could have committed it as obvious even).

 --- gcc/ChangeLog (revision 187307)
 +++ gcc/ChangeLog (working copy)
 @@ -110,15 +110,15 @@
 
  2012-05-08  Dehao Chen  de...@google.com
 
 - * predict.c (find_qualified_ssa_name): New
 - (find_ssa_name_in_expr): New
 - (find_ssa_name_in_assign_stmt): New
 - (is_comparison_with_loop_invariant_p): New
 - (is_bound_expr_similar): New
 - (predict_iv_comparison): New
 + * predict.c (find_qualified_ssa_name): New.
 + (find_ssa_name_in_expr): New.
 + (find_ssa_name_in_assign_stmt): New.
 + (is_comparison_with_loop_invariant_p): New.
 + (is_bound_expr_similar): New.
 + (predict_iv_comparison): New.
   (predict_loops): Add heuristic for loop-nested branches that compare an
   induction variable to a loop bound variable.
 - * predict.def (PRED_LOOP_IV_COMPARE): New macro
 + * predict.def (PRED_LOOP_IV_COMPARE): New macro.
 
  2012-05-08  Uros Bizjak  ubiz...@gmail.com
 
 Index: gcc/testsuite/gcc.dg/predict-3.c
 ===
 --- gcc/testsuite/gcc.dg/predict-3.c  (revision 187307)
 +++ gcc/testsuite/gcc.dg/predict-3.c  (working copy)
 @@ -23,28 +23,3 @@
 
  /* { dg-final { scan-tree-dump-times loop iv compare heuristics:
 100.0% 4 profile_estimate} } */
  /* { dg-final { cleanup-tree-dump profile_estimate } } */
 -/* { dg-do compile } */
 -/* { dg-options -O2 -fdump-tree-profile_estimate } */
 -
 -extern int global;
 -
 -int bar(int);
 -
 -void foo (int bound)
 -{
 -  int i, ret = 0;
 -  for (i = 0; i = bound; i++)
 -{
 -  if (i  bound - 2)
 - global += bar (i);
 -  if (i = bound)
 - global += bar (i);
 -  if (i + 1  bound)
 - global += bar (i);
 -  if (i != bound)
 - global += bar (i);
 -}
 -}
 -
 -/* { dg-final { scan-tree-dump-times loop iv compare heuristics:
 100.0% 4 profile_estimate} } */
 -/* { dg-final { cleanup-tree-dump profile_estimate } } */
 Index: gcc/testsuite/gcc.dg/predict-4.c
 ===
 --- gcc/testsuite/gcc.dg/predict-4.c  (revision 187307)
 +++ gcc/testsuite/gcc.dg/predict-4.c  (working copy)
 @@ -17,22 +17,3 @@
 
  /* { dg-final { scan-tree-dump loop iv compare heuristics: 50.0%
 profile_estimate} } */
  /* { dg-final { cleanup-tree-dump profile_estimate } } */
 -/* { dg-do compile } */
 -/* { dg-options -O2 -fdump-tree-profile_estimate } */
 -
 -extern int global;
 -
 -int bar(int);
 -
 -void foo (int bound)
 -{
 -  int i, ret = 0;
 -  for (i = 0; i  10; i++)
 -{
 -  if (i  5)
 - global += bar (i);
 -}
 -}
 -
 -/* { dg-final { scan-tree-dump loop iv compare heuristics: 50.0%
 profile_estimate} } */
 -/* { dg-final { cleanup-tree-dump profile_estimate } } */
 Index: gcc/testsuite/gcc.dg/predict-1.c
 ===
 --- gcc/testsuite/gcc.dg/predict-1.c  (revision 187307)
 +++ gcc/testsuite/gcc.dg/predict-1.c  (working copy)
 @@ -25,30 +25,3 @@
 
  /* { dg-final { scan-tree-dump-times loop iv compare heuristics:
 0.0% 5 profile_estimate} } */
  /* { dg-final { cleanup-tree-dump profile_estimate } } */
 -/* { dg-do compile } */
 -/* { dg-options -O2 -fdump-tree-profile_estimate } */
 -
 -extern int global;
 -
 -int bar(int);
 -
 -void foo (int bound)
 -{
 -  int i, ret = 0;
 -  for (i = 0; i  bound; i++)
 -{
 -  if (i  bound)
 - global += bar (i);
 -  if (i = bound + 2)
 - global += bar (i);
 -  if (i  bound - 2)
 - global += bar (i);
 -  if (i + 2  bound)
 - global += bar (i);
 -  if (i == 10)
 - global += bar (i);
 -}
 -}
 -
 -/* { dg-final { scan-tree-dump-times loop iv compare heuristics:
 0.0% 5 profile_estimate} } */
 -/* { dg-final { cleanup-tree-dump profile_estimate } } */
 Index: gcc/testsuite/gcc.dg/predict-5.c
 ===
 --- gcc/testsuite/gcc.dg/predict-5.c  (revision 187307)
 +++ gcc/testsuite/gcc.dg/predict-5.c  (working copy)
 @@ -23,28 +23,3 @@
 
  /* { dg-final { scan-tree-dump-times loop iv compare heuristics:
 100.0% 4 profile_estimate} } */
  /* { dg-final { cleanup-tree-dump profile_estimate } } */
 -/* { dg-do compile } */
 -/* { dg-options -O2 -fdump-tree-profile_estimate } */
 -
 -extern int global;
 -
 -int bar (int);
 -
 -void foo (int base, int bound)
 -{
 -  int i, ret = 0;
 -  for (i = base; i = bound; i++)
 -{
 -  if (i  base)
 - global += bar (i);
 -  if (i  

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-09 Thread Andrew Pinski
On Tue, May 8, 2012 at 11:46 PM, Sharad Singhai sing...@google.com wrote:
 In response to comments, I have updated the patch to support dumps in
 user provided files via the option -fdump-xxx=filename. The
 filenames stdout/stderr are treated specially, and are considered
 standard streams.

I think - should also be treated as special (or maybe the only one
which should be treated as special).

Thanks,
Andrew Pinski


 Also updated documentation and a testcase. Okay for trunk?

 Thanks,
 Sharad

 2012-05-08   Sharad Singhai  sing...@google.com

        * doc/invoke.texi: Add documentation for new option.
        * tree-dump.c (dump_stream_p): New function.
        (dump_files): Update for new field.
        (dump_switch_p_1): Handle user provided filenames.
        (dump_begin): Likewise.
        (get_dump_file_name): Likewise.
        (dump_enable_all): Add new parameter USER_FILENAME.
        All callers updated.
        (dump_end): Remove attribute.
        * tree-pass.h (enum tree_dump_index): Add new constant.
        (struct dump_file_info): Add new field USER_FILENAME.
        * testsuite/g++.dg/other/dump-userfile-1.C: New test.

 Index: doc/invoke.texi
 ===
 --- doc/invoke.texi     (revision 187265)
 +++ doc/invoke.texi     (working copy)
 @@ -5322,20 +5322,24 @@ Here are some examples showing uses of these optio

  @item -d@var{letters}
  @itemx -fdump-rtl-@var{pass}
 +@itemx -fdump-rtl-@var{pass}=@var{filename}
  @opindex d
  Says to make debugging dumps during compilation at times specified by
  @var{letters}.  This is used for debugging the RTL-based passes of the
  compiler.  The file names for most of the dumps are made by appending
  a pass number and a word to the @var{dumpname}, and the files are
 -created in the directory of the output file.  Note that the pass
 -number is computed statically as passes get registered into the pass
 -manager.  Thus the numbering is not related to the dynamic order of
 -execution of passes.  In particular, a pass installed by a plugin
 -could have a number over 200 even if it executed quite early.
 -@var{dumpname} is generated from the name of the output file, if
 -explicitly specified and it is not an executable, otherwise it is the
 -basename of the source file. These switches may have different effects
 -when @option{-E} is used for preprocessing.
 +created in the directory of the output file. If the
 +@option{=@var{filename}} is appended to the longer form of the dump
 +option then the dump is done on that file instead of numbered
 +files. The filenames stdout and stderr are treated specially. Note
 +that the pass number is computed statically as passes get registered
 +into the pass manager.  Thus the numbering is not related to the
 +dynamic order of execution of passes.  In particular, a pass installed
 +by a plugin could have a number over 200 even if it executed quite
 +early.  @var{dumpname} is generated from the name of the output file,
 +if explicitly specified and it is not an executable, otherwise it is
 +the basename of the source file. These switches may have different
 +effects when @option{-E} is used for preprocessing.

  Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
  @option{-d} option @var{letters}.  Here are the possible
 @@ -5599,6 +5603,10 @@ These dumps are defined but always produce empty f
  @opindex fdump-rtl-all
  Produce all the dumps listed above.

 +@item -fdump-rtl-all=stderr
 +@opindex fdump-rtl-all=stderr
 +Produce all RTL dumps on stderr.
 +
  @item -dA
  @opindex dA
  Annotate the assembler output with miscellaneous debugging information.
 @@ -5719,15 +5727,19 @@ counters for each function compiled.

  @item -fdump-tree-@var{switch}
  @itemx -fdump-tree-@var{switch}-@var{options}
 +@itemx -fdump-tree-@var{switch}-@var{options}=@var{filename}
  @opindex fdump-tree
  Control the dumping at various stages of processing the intermediate
  language tree to a file.  The file name is generated by appending a
  switch specific suffix to the source file name, and the file is
 -created in the same directory as the output file.  If the
 -@samp{-@var{options}} form is used, @var{options} is a list of
 -@samp{-} separated options which control the details of the dump.  Not
 -all options are applicable to all dumps; those that are not
 -meaningful are ignored.  The following options are available
 +created in the same directory as the output file. In case of
 +@option{=@var{filename}}, the dump output is on the given file. Note
 +that the filenames stdout and stderr are treated specially and dumps
 +are done on standard streams. If the @samp{-@var{options}} form is
 +used, @var{options} is a list of @samp{-} separated options which
 +control the details or location of the dump.  Not all options are
 +applicable to all dumps; those that are not meaningful are ignored.
 +The following options are available

  @table @samp
  @item address
 @@ 

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-09 Thread Gabriel Dos Reis
On Wed, May 9, 2012 at 1:51 AM, Andrew Pinski pins...@gmail.com wrote:
 On Tue, May 8, 2012 at 11:46 PM, Sharad Singhai sing...@google.com wrote:
 In response to comments, I have updated the patch to support dumps in
 user provided files via the option -fdump-xxx=filename. The
 filenames stdout/stderr are treated specially, and are considered
 standard streams.

 I think - should also be treated as special (or maybe the only one
 which should be treated as special).

He originally wanted only stderr, so treating - only specially
would not be in
the line with his original goal.  - is equivalent to stdout, so it
is not like
we don't have the functionally with his revised patch.


 Thanks,
 Andrew Pinski


 Also updated documentation and a testcase. Okay for trunk?

 Thanks,
 Sharad

 2012-05-08   Sharad Singhai  sing...@google.com

        * doc/invoke.texi: Add documentation for new option.
        * tree-dump.c (dump_stream_p): New function.
        (dump_files): Update for new field.
        (dump_switch_p_1): Handle user provided filenames.
        (dump_begin): Likewise.
        (get_dump_file_name): Likewise.
        (dump_enable_all): Add new parameter USER_FILENAME.
        All callers updated.
        (dump_end): Remove attribute.
        * tree-pass.h (enum tree_dump_index): Add new constant.
        (struct dump_file_info): Add new field USER_FILENAME.
        * testsuite/g++.dg/other/dump-userfile-1.C: New test.

 Index: doc/invoke.texi
 ===
 --- doc/invoke.texi     (revision 187265)
 +++ doc/invoke.texi     (working copy)
 @@ -5322,20 +5322,24 @@ Here are some examples showing uses of these optio

  @item -d@var{letters}
  @itemx -fdump-rtl-@var{pass}
 +@itemx -fdump-rtl-@var{pass}=@var{filename}
  @opindex d
  Says to make debugging dumps during compilation at times specified by
  @var{letters}.  This is used for debugging the RTL-based passes of the
  compiler.  The file names for most of the dumps are made by appending
  a pass number and a word to the @var{dumpname}, and the files are
 -created in the directory of the output file.  Note that the pass
 -number is computed statically as passes get registered into the pass
 -manager.  Thus the numbering is not related to the dynamic order of
 -execution of passes.  In particular, a pass installed by a plugin
 -could have a number over 200 even if it executed quite early.
 -@var{dumpname} is generated from the name of the output file, if
 -explicitly specified and it is not an executable, otherwise it is the
 -basename of the source file. These switches may have different effects
 -when @option{-E} is used for preprocessing.
 +created in the directory of the output file. If the
 +@option{=@var{filename}} is appended to the longer form of the dump
 +option then the dump is done on that file instead of numbered
 +files. The filenames stdout and stderr are treated specially. Note
 +that the pass number is computed statically as passes get registered
 +into the pass manager.  Thus the numbering is not related to the
 +dynamic order of execution of passes.  In particular, a pass installed
 +by a plugin could have a number over 200 even if it executed quite
 +early.  @var{dumpname} is generated from the name of the output file,
 +if explicitly specified and it is not an executable, otherwise it is
 +the basename of the source file. These switches may have different
 +effects when @option{-E} is used for preprocessing.

  Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
  @option{-d} option @var{letters}.  Here are the possible
 @@ -5599,6 +5603,10 @@ These dumps are defined but always produce empty f
  @opindex fdump-rtl-all
  Produce all the dumps listed above.

 +@item -fdump-rtl-all=stderr
 +@opindex fdump-rtl-all=stderr
 +Produce all RTL dumps on stderr.
 +
  @item -dA
  @opindex dA
  Annotate the assembler output with miscellaneous debugging information.
 @@ -5719,15 +5727,19 @@ counters for each function compiled.

  @item -fdump-tree-@var{switch}
  @itemx -fdump-tree-@var{switch}-@var{options}
 +@itemx -fdump-tree-@var{switch}-@var{options}=@var{filename}
  @opindex fdump-tree
  Control the dumping at various stages of processing the intermediate
  language tree to a file.  The file name is generated by appending a
  switch specific suffix to the source file name, and the file is
 -created in the same directory as the output file.  If the
 -@samp{-@var{options}} form is used, @var{options} is a list of
 -@samp{-} separated options which control the details of the dump.  Not
 -all options are applicable to all dumps; those that are not
 -meaningful are ignored.  The following options are available
 +created in the same directory as the output file. In case of
 +@option{=@var{filename}}, the dump output is on the given file. Note
 +that the filenames stdout and stderr are treated specially and dumps
 +are done on standard streams. If the @samp{-@var{options}} 

Re: [C++ Patch and pubnames 2/2] Adjust c decl pretty printer to match demangler (issue6195056)

2012-05-09 Thread Manuel López-Ibáñez
On 9 May 2012 01:34, Cary Coutant ccout...@google.com wrote:
 A suggestion: Make dwarf_name call the demangler, and then a (new?) a
 function that converts a mangled decl to a human-readable string. In
 any case, the pretty-printer does a lot of stuff that is mostly
 useless for just printing a declaration (translation, wrapping, etc.).

 Bonus point if all GNU toolchain program use the same functions for
 demangling and undemagling (because I guess they actually don't, no?)

 I guess it cannot be so easy, so I apologize in advance for saying nonsense.

 It makes sense; and I don't think it is as complicated as it might sound.

 dwarf_name takes a tree; the demangler takes a mangled name. We don't
 have mangled names for many of the names we want to enter into the
 pubnames table.

Sorry that was a typo. My proposal is to mangle the tree (GCC already
does that) and then demangle it to a human-readable string (GCC also
does that already, no? Or at least  GDB and other programs have to do
that anyway, no? In any case, the demangling code should be shared
and, hence, consistent.)

For the names that the mangler/demangler do not support, I guess it
doesn't make sense to extend the demangler, so you will need to handle
them explicitly. You may call then the pretty-printer, since anyway
GDB and the other programs cannot demangle those names, so whatever is
used by GCC should be fine. Or implement a custom pretty-printer for
them. A custom pretty-printer will be for sure much simpler than the
diagnostics pretty-printer, since you don't have to worry about all
kinds of trees, wrapping, inheritance, i18n, etc.

Probably I am missing some details that may make the plan above
infeasible. In any case, I think your intention to make the output of
the different toolchain programs more consistent is worthwhile. If
dwarf_name has to use the pretty-printer, then I'd prefer that you
changed the demangler to match the diagnostics output, rather than the
other way around. Also, I am no maintainer, so these are merely
suggestions that you may freely ignore. :-)

Cheers,

Manuel..


Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-09 Thread Gabriel Dos Reis
On Wed, May 9, 2012 at 1:46 AM, Sharad Singhai sing...@google.com wrote:
[...]

 +@item -fdump-rtl-all=stderr
 +@opindex fdump-rtl-all=stderr

You do not need to have a separate index entry for '=stderr' or '=stdout'.
Rather, expand the description to state this in all the documentation
for -fdump-xxx=yyy.

[...]

 +/* Return non-zero iff the USER_FILENAME corresponds to stdout or
 +   stderr stream.  */
 +
 +static int
 +dump_stream_p (const char *user_filename)
 +{
 +  if (user_filename)
 +    return !strncmp (stderr, user_filename, 6) ||
 +      !strncmp (stdout, user_filename, 6);
 +  else
 +    return 0;
 +}

The name is ambiguous.
This function is testing whether its string argument designates one of
the *standard* output streams.  Name it to reflect that..
Have it take the dump state context. Also the coding convention: the binary
operator || should be on next line.  In fact the thing could be
simpler.   Instead of
testing over and over again against stderr (once in this function, then again
later), just return the corresponding standard FILE* pointer.
Also, this is a case of overuse of strncmp.  If you name the function
dump_get_standard_stream:

return strcmp(stderr, dfi-user_filename) == 0
   ? stderr
: stdcmp(stdout, dfi-use_filename)
?  stdout
: NULL;

you can simplify:

 -  name = get_dump_file_name (phase);
   dfi = get_dump_file_info (phase);
 -  stream = fopen (name, dfi-state  0 ? w : a);
 -  if (!stream)
 -    error (could not open dump file %qs: %m, name);
 +  if (dump_stream_p (dfi-user_filename))
 +    {
 +      if (!strncmp (stderr, dfi-user_filename, 6))
 +        stream = stderr;
 +      else
 +        if (!strncmp (stdout, dfi-user_filename, 6))
 +          stream = stdout;
 +        else
 +          error (unknown stream: %qs: %m, dfi-user_filename);
 +      dfi-state = 1;
 +    }
   else
 -    dfi-state = 1;
 -  free (name);
 +    {
 +      name = get_dump_file_name (phase);
 +      stream = fopen (name, dfi-state  0 ? w : a);
 +      if (!stream)
 +        error (could not open dump file %qs: %m, name);
 +      else
 +        dfi-state = 1;
 +      free (name);
 +    }

   if (flag_ptr)
     *flag_ptr = dfi-flags;
 @@ -987,35 +1017,45 @@ dump_flag_name (int phase)
    dump_begin.  */

  void
 -dump_end (int phase ATTRIBUTE_UNUSED, FILE *stream)
 +dump_end (int phase, FILE *stream)
  {
 -  fclose (stream);
 +  struct dump_file_info *dfi = get_dump_file_info (phase);
 +  if (!dump_stream_p (dfi-user_filename))
 +    fclose (stream);
  }

  /* Enable all tree dumps.  Return number of enabled tree dumps.  */

  static int
 -dump_enable_all (int flags)
 +dump_enable_all (int flags, const char *user_filename)
  {
   int ir_dump_type = (flags  (TDF_TREE | TDF_RTL | TDF_IPA));
   int n = 0;
   size_t i;

   for (i = TDI_none + 1; i  (size_t) TDI_end; i++)
 -    if ((dump_files[i].flags  ir_dump_type))
 -      {
 -        dump_files[i].state = -1;
 -        dump_files[i].flags |= flags;
 -        n++;
 -      }
 +    {
 +      if ((dump_files[i].flags  ir_dump_type))
 +        {
 +          dump_files[i].state = -1;
 +          dump_files[i].flags |= flags;
 +          n++;
 +        }
 +      if (user_filename)
 +        dump_files[i].user_filename = user_filename;
 +    }

   for (i = 0; i  extra_dump_files_in_use; i++)
 -    if ((extra_dump_files[i].flags  ir_dump_type))
 -      {
 -        extra_dump_files[i].state = -1;
 -        extra_dump_files[i].flags |= flags;
 -       n++;
 -      }
 +    {
 +      if ((extra_dump_files[i].flags  ir_dump_type))
 +        {
 +          extra_dump_files[i].state = -1;
 +          extra_dump_files[i].flags |= flags;
 +          n++;
 +        }
 +      if (user_filename)
 +        extra_dump_files[i].user_filename = user_filename;
 +    }

   return n;
  }
 @@ -1037,7 +1077,7 @@ dump_switch_p_1 (const char *arg, struct dump_file
   if (!option_value)
     return 0;

 -  if (*option_value  *option_value != '-')
 +  if (*option_value  *option_value != '-'  *option_value != '=')
     return 0;

   ptr = option_value;
 @@ -1052,17 +1092,30 @@ dump_switch_p_1 (const char *arg, struct dump_file
       while (*ptr == '-')
        ptr++;
       end_ptr = strchr (ptr, '-');
 +
       if (!end_ptr)
        end_ptr = ptr + strlen (ptr);
       length = end_ptr - ptr;

 +      if (*ptr == '=')
 +        {
 +          /* Interpret rest of the argument as a dump filename.  The
 +             user provided filename overrides generated dump names as
 +             well as other command line filenames.  */
 +          flags |= TDF_USER_FILENAME;
 +          if (dfi-user_filename)
 +            free (dfi-user_filename);
 +          dfi-user_filename = xstrdup (ptr + 1);
 +          break;
 +        }
 +
       for (option_ptr = dump_options; option_ptr-name; option_ptr++)
        if (strlen (option_ptr-name) == length
             !memcmp (option_ptr-name, ptr, length))
 -         {
 -           flags |= option_ptr-value;
 +        

Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-09 Thread Richard Guenther
On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li davi...@google.com wrote:
 To be clear, this flag is for malloc implementation (such as tcmalloc)
 with side effect unknown to the compiler. Using -fno-builtin-xxx is
 too conservative for that purpose.

I don't think that flys.  Btw, the patch also guards alloca - alloca is purely
GCC internal.

What's the unknown side-effects that are also important to preserve
for free(malloc(4))?

Richard.

 David

 On Tue, May 8, 2012 at 7:43 AM, Dehao Chen de...@google.com wrote:
 Hello,

 This patch adds a flag to guard the optimization that optimize the
 following code away:

 free (malloc (4));

 In some cases, we'd like this type of malloc/free pairs to remain in
 the optimized code.

 Tested with bootstrap, and no regression in the gcc testsuite.

 Is it ok for mainline?

 Thanks,
 Dehao

 gcc/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * common.opt (feliminate-malloc): New.
        * doc/invoke.texi: Document it.
        * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.

 gcc/testsuite/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
        as expected.

 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 187277)
 +++ gcc/doc/invoke.texi (working copy)
 @@ -360,7 +360,8 @@
  -fcx-limited-range @gol
  -fdata-sections -fdce -fdelayed-branch @gol
  -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
 --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol
 +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations @gol
 +-ffat-lto-objects @gol
  -ffast-math -ffinite-math-only -ffloat-store
 -fexcess-precision=@var{style} @gol
  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 @@ -6238,6 +6239,7 @@
  -fdefer-pop @gol
  -fdelayed-branch @gol
  -fdse @gol
 +-feliminate-malloc @gol
  -fguess-branch-probability @gol
  -fif-conversion2 @gol
  -fif-conversion @gol
 @@ -6762,6 +6764,11 @@
  Perform dead store elimination (DSE) on RTL@.
  Enabled by default at @option{-O} and higher.

 +@item -feliminate-malloc
 +@opindex feliminate-malloc
 +Eliminate unnecessary malloc/free pairs.
 +Enabled by default at @option{-O} and higher.
 +
  @item -fif-conversion
  @opindex fif-conversion
  Attempt to transform conditional jumps into branch-less equivalents.  This
 Index: gcc/testsuite/gcc.dg/free-malloc.c
 ===
 --- gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 @@ -0,0 +1,12 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fno-eliminate-malloc } */
 +/* { dg-final { scan-assembler-times malloc 2} } */
 +/* { dg-final { scan-assembler-times free 2} } */
 +
 +extern void * malloc (unsigned long);
 +extern void free (void *);
 +
 +void test ()
 +{
 +  free (malloc (10));
 +}
 Index: gcc/common.opt
 ===
 --- gcc/common.opt      (revision 187277)
 +++ gcc/common.opt      (working copy)
 @@ -1474,6 +1474,10 @@
  Common Var(flag_dce) Init(1) Optimization
  Use the RTL dead code elimination pass

 +feliminate-malloc
 +Common Var(flag_eliminate_malloc) Init(1) Optimization
 +Eliminate unnecessary malloc/free pairs
 +
  fdse
  Common Var(flag_dse) Init(1) Optimization
  Use the RTL dead store elimination pass
 Index: gcc/tree-ssa-dce.c
 ===
 --- gcc/tree-ssa-dce.c  (revision 187277)
 +++ gcc/tree-ssa-dce.c  (working copy)
 @@ -309,6 +309,8 @@
            case BUILT_IN_CALLOC:
            case BUILT_IN_ALLOCA:
            case BUILT_IN_ALLOCA_WITH_ALIGN:
 +             if (!flag_eliminate_malloc)
 +               mark_stmt_necessary (stmt, true);
              return;

            default:;


Re: [PATCH] Fix endless loop in forwprop (PR tree-optimization/53226)

2012-05-09 Thread Richard Guenther
On Tue, May 8, 2012 at 8:33 PM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 The attached testcase loops endlessly, using more and more memory.
 The problem is that the prev stmt iterator sometimes references stmts that
 remove_prop_source_from_use decides to remove, and since Michael's
 gimple seq changes that seems to be fatal.

 Fixed by not keeping an iterator, but instead marking stmts that don't need
 revisiting and restarting with first stmt that needs revisiting.  This
 assumes that new stmts will have uid 0, but I believe that is the case.

 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Using the stmt UID looks odd as you are using it as flag - why not use
one of the two stmt flags available to passes?

Btw, the other possibility would be to simply change the various combiners
to require them to update the iterator to point at the first statement that
they have inserted.  I have some TLC ideas in this area, let's see if I can
get to them.

Ok with using gimple_set_plf and friends.

Thanks,
Richard.

 2012-05-08  Jakub Jelinek  ja...@redhat.com

        PR tree-optimization/53226
        * tree-ssa-forwprop.c (ssa_forward_propagate_and_combine): Remove
        prev and prev_initialized vars, gimple_set_uid (stmt, 0) before
        processing it and gimple_set_uid (stmt, 1) if it doesn't need to be
        revisited, look for earliest stmt with uid 0 if something changed.

        * gcc.c-torture/compile/pr53226.c: New test.

 --- gcc/tree-ssa-forwprop.c.jj  2012-05-03 08:35:52.0 +0200
 +++ gcc/tree-ssa-forwprop.c     2012-05-08 18:10:19.662061709 +0200
 @@ -2677,8 +2677,7 @@ ssa_forward_propagate_and_combine (void)

   FOR_EACH_BB (bb)
     {
 -      gimple_stmt_iterator gsi, prev;
 -      bool prev_initialized;
 +      gimple_stmt_iterator gsi;

       /* Apply forward propagation to all stmts in the basic-block.
         Note we update GSI within the loop as necessary.  */
 @@ -2771,12 +2770,14 @@ ssa_forward_propagate_and_combine (void)

       /* Combine stmts with the stmts defining their operands.
         Note we update GSI within the loop as necessary.  */
 -      prev_initialized = false;
       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
        {
          gimple stmt = gsi_stmt (gsi);
          bool changed = false;

 +         /* Mark stmt as potentially needing revisiting.  */
 +         gimple_set_uid (stmt, 0);
 +
          switch (gimple_code (stmt))
            {
            case GIMPLE_ASSIGN:
 @@ -2856,18 +2857,18 @@ ssa_forward_propagate_and_combine (void)
            {
              /* If the stmt changed then re-visit it and the statements
                 inserted before it.  */
 -             if (!prev_initialized)
 +             for (; !gsi_end_p (gsi); gsi_prev (gsi))
 +               if (gimple_uid (gsi_stmt (gsi)))
 +                 break;
 +             if (gsi_end_p (gsi))
                gsi = gsi_start_bb (bb);
              else
 -               {
 -                 gsi = prev;
 -                 gsi_next (gsi);
 -               }
 +               gsi_next (gsi);
            }
          else
            {
 -             prev = gsi;
 -             prev_initialized = true;
 +             /* Stmt no longer needs to be revisited.  */
 +             gimple_set_uid (stmt, 1);
              gsi_next (gsi);
            }
        }
 --- gcc/testsuite/gcc.c-torture/compile/pr53226.c.jj    2012-05-08 
 18:07:40.007000510 +0200
 +++ gcc/testsuite/gcc.c-torture/compile/pr53226.c       2012-05-08 
 18:07:35.071029578 +0200
 @@ -0,0 +1,13 @@
 +/* PR tree-optimization/53226 */
 +
 +void
 +foo (unsigned long *x, char y, char z)
 +{
 +  int i;
 +  for (i = y; i  z; ++i)
 +    {
 +      unsigned long a = ((unsigned char) i)  63UL;
 +      unsigned long b = 1ULL  a;
 +      *x |= b;
 +    }
 +}

        Jakub


Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-09 Thread Dehao Chen
On Wed, May 9, 2012 at 4:12 PM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li davi...@google.com wrote:
 To be clear, this flag is for malloc implementation (such as tcmalloc)
 with side effect unknown to the compiler. Using -fno-builtin-xxx is
 too conservative for that purpose.

 I don't think that flys.  Btw, the patch also guards alloca - alloca is purely
 GCC internal.

 What's the unknown side-effects that are also important to preserve
 for free(malloc(4))?

Malloc implementation may record some info to a global structure, and
a program may use this free(malloc()) pair to simulate the real runs
to get some data, such as peak memory requirement.

Dehao


 Richard.

 David

 On Tue, May 8, 2012 at 7:43 AM, Dehao Chen de...@google.com wrote:
 Hello,

 This patch adds a flag to guard the optimization that optimize the
 following code away:

 free (malloc (4));

 In some cases, we'd like this type of malloc/free pairs to remain in
 the optimized code.

 Tested with bootstrap, and no regression in the gcc testsuite.

 Is it ok for mainline?

 Thanks,
 Dehao

 gcc/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * common.opt (feliminate-malloc): New.
        * doc/invoke.texi: Document it.
        * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.

 gcc/testsuite/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
        as expected.

 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 187277)
 +++ gcc/doc/invoke.texi (working copy)
 @@ -360,7 +360,8 @@
  -fcx-limited-range @gol
  -fdata-sections -fdce -fdelayed-branch @gol
  -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
 --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol
 +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations 
 @gol
 +-ffat-lto-objects @gol
  -ffast-math -ffinite-math-only -ffloat-store
 -fexcess-precision=@var{style} @gol
  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 @@ -6238,6 +6239,7 @@
  -fdefer-pop @gol
  -fdelayed-branch @gol
  -fdse @gol
 +-feliminate-malloc @gol
  -fguess-branch-probability @gol
  -fif-conversion2 @gol
  -fif-conversion @gol
 @@ -6762,6 +6764,11 @@
  Perform dead store elimination (DSE) on RTL@.
  Enabled by default at @option{-O} and higher.

 +@item -feliminate-malloc
 +@opindex feliminate-malloc
 +Eliminate unnecessary malloc/free pairs.
 +Enabled by default at @option{-O} and higher.
 +
  @item -fif-conversion
  @opindex fif-conversion
  Attempt to transform conditional jumps into branch-less equivalents.  This
 Index: gcc/testsuite/gcc.dg/free-malloc.c
 ===
 --- gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 @@ -0,0 +1,12 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fno-eliminate-malloc } */
 +/* { dg-final { scan-assembler-times malloc 2} } */
 +/* { dg-final { scan-assembler-times free 2} } */
 +
 +extern void * malloc (unsigned long);
 +extern void free (void *);
 +
 +void test ()
 +{
 +  free (malloc (10));
 +}
 Index: gcc/common.opt
 ===
 --- gcc/common.opt      (revision 187277)
 +++ gcc/common.opt      (working copy)
 @@ -1474,6 +1474,10 @@
  Common Var(flag_dce) Init(1) Optimization
  Use the RTL dead code elimination pass

 +feliminate-malloc
 +Common Var(flag_eliminate_malloc) Init(1) Optimization
 +Eliminate unnecessary malloc/free pairs
 +
  fdse
  Common Var(flag_dse) Init(1) Optimization
  Use the RTL dead store elimination pass
 Index: gcc/tree-ssa-dce.c
 ===
 --- gcc/tree-ssa-dce.c  (revision 187277)
 +++ gcc/tree-ssa-dce.c  (working copy)
 @@ -309,6 +309,8 @@
            case BUILT_IN_CALLOC:
            case BUILT_IN_ALLOCA:
            case BUILT_IN_ALLOCA_WITH_ALIGN:
 +             if (!flag_eliminate_malloc)
 +               mark_stmt_necessary (stmt, true);
              return;

            default:;


Re: Fix gcc.dg/lower-subreg-1.c failure

2012-05-09 Thread Richard Sandiford
Hans-Peter Nilsson hans-peter.nils...@axis.com writes:
 From: Richard Sandiford rdsandif...@googlemail.com
 Date: Tue, 1 May 2012 16:46:38 +0200

 To repeat: as things stand, very few targets define proper rtx costs
 for SET.

 IMHO it's wrong to start blaming targets when rtx_cost doesn't
 take the mode in account in the first place, for the default
 cost.  (Well, except for the modes-tieable subreg special-case.)
 The targets where an operation in N * word_mode costs no more
 than one in word_mode, if there even is one, is a minority,
 let's adjust the defaults to that.

I'll pass on approving or disapproving this patch, but for the record:
a factor of word_mode seems a bit too simplistic.  It's OK for moves
and logic ops, but addition of multiword modes is a bit more complicated.
Multiplication and division by multiword modes is even more so, of course.

 Of course, if the costs are defined properly and lower-subreg still
 makes the wrong choice, we need to look at why.

 By the way, regarding validity of rtx_cost calls:

 +++ gcc/lower-subreg.c  2012-05-01 09:46:48.473830772 +0100

 +/* Return the cost of a CODE shift in mode MODE by OP1 bits, using the
 +   rtxes in RTXES.  SPEED_P selects between the speed and size cost.  */
 +
 +static int
 +shift_cost (bool speed_p, struct cost_rtxes *rtxes, enum rtx_code code,
 +   enum machine_mode mode, int op1)
 +{
 +  PUT_MODE (rtxes-target, mode);
 +  PUT_CODE (rtxes-shift, code);
 +  PUT_MODE (rtxes-shift, mode);
 +  PUT_MODE (rtxes-source, mode);
 +  XEXP (rtxes-shift, 1) = GEN_INT (op1);
 +  SET_SRC (rtxes-set) = rtxes-shift;
 +  return insn_rtx_cost (rtxes-set, speed_p);
 +}
 +
 +/* For each X in the range [0, BITS_PER_WORD), set SPLITTING[X]
 +   to true if it is profitable to split a double-word CODE shift
 +   of X + BITS_PER_WORD bits.  SPEED_P says whether we are testing
 +   for speed or size profitability.
 +
 +   Use the rtxes in RTXES to calculate costs.  WORD_MOVE_ZERO_COST is
 +   the cost of moving zero into a word-mode register.  WORD_MOVE_COST
 +   is the cost of moving between word registers.  */
 +
 +static void
 +compute_splitting_shift (bool speed_p, struct cost_rtxes *rtxes,
 +bool *splitting, enum rtx_code code,
 +int word_move_zero_cost, int word_move_cost)
 +{

 I think there should be a gating check whether the target
 implements that kind of shift in that mode at all, before
 checking the cost.  Not sure whether it's generally best to put
 that test here, or to make the rtx_cost function return the cost
 of a libcall for that mode when that happens.  Similar for the
 other insns.

This has come up a few times in past discussions about rtx_cost
(as I'm sure you remember :-)).  On the one hand, I agree it might
be nice to shield the backend from invalid insns.  That would
effectively mean calling expand on each insn though, which would be
rather expensive.  Especially in a GC world.  It would also prevent
the target from being able to take things like pipeline characteristics
into account.  E.g. if you know that something takes 2 operations
that must be issued sequentially, you might want to add in an extra
(sub-COSTS_N_INSN (1)) cost compared to 2 operations that can be issued
together, even if the individual operations take the same number of cycles.

The same goes for multiplication in general.  On some targets
the speed of a multiplication can depend on the second operand,
so knowing its value is useful even if the target doesn't have
a multiplication instruction that takes constant operands.

As things stand, rtx_cost is intentionally used for more than
just valid target insns.  One of the main uses has always been
to decide whether it is better to implement multiplications by a
shift-and-add sequence, or whether to use multiplication instructions.
The associated expmed code has never tried to decide which shifts are
actually representable as matching rtl insns and which aren't.
The same goes for division-using-multiplication.

So I think this patch is using rtx_cost according to its current
interface.  If someone wants to change or restrict that interface,
than that's a separate change IMO.  And it should be done consistently
rather than in this one place.

In this case it doesn't matter anyway.  If we never see a shift
in mode M by amount X, we'll never need to make a decision about
whether to split it.

 Isn't the below better than doing virtually the same in each
 target's rtx_costs?

FWIW, MIPS, SH and x86 all used something slightly different (and more
complicated).  I imagine PowerPC and SPARC will too.  So each seems
a bit strong.

That's not an objection to the patch though.  I realise some ports do
have very regular architectures where every register is the same width
and has the same move cost.

Richard


Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-09 Thread Richard Guenther
On Wed, May 9, 2012 at 10:38 AM, Dehao Chen de...@google.com wrote:
 On Wed, May 9, 2012 at 4:12 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li davi...@google.com wrote:
 To be clear, this flag is for malloc implementation (such as tcmalloc)
 with side effect unknown to the compiler. Using -fno-builtin-xxx is
 too conservative for that purpose.

 I don't think that flys.  Btw, the patch also guards alloca - alloca is 
 purely
 GCC internal.

 What's the unknown side-effects that are also important to preserve
 for free(malloc(4))?

 Malloc implementation may record some info to a global structure, and
 a program may use this free(malloc()) pair to simulate the real runs
 to get some data, such as peak memory requirement.

So why not use an alternate interface into this special allocator for this
purpose?

 Dehao


 Richard.

 David

 On Tue, May 8, 2012 at 7:43 AM, Dehao Chen de...@google.com wrote:
 Hello,

 This patch adds a flag to guard the optimization that optimize the
 following code away:

 free (malloc (4));

 In some cases, we'd like this type of malloc/free pairs to remain in
 the optimized code.

 Tested with bootstrap, and no regression in the gcc testsuite.

 Is it ok for mainline?

 Thanks,
 Dehao

 gcc/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * common.opt (feliminate-malloc): New.
        * doc/invoke.texi: Document it.
        * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.

 gcc/testsuite/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
        as expected.

 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 187277)
 +++ gcc/doc/invoke.texi (working copy)
 @@ -360,7 +360,8 @@
  -fcx-limited-range @gol
  -fdata-sections -fdce -fdelayed-branch @gol
  -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
 --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects 
 @gol
 +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations 
 @gol
 +-ffat-lto-objects @gol
  -ffast-math -ffinite-math-only -ffloat-store
 -fexcess-precision=@var{style} @gol
  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 @@ -6238,6 +6239,7 @@
  -fdefer-pop @gol
  -fdelayed-branch @gol
  -fdse @gol
 +-feliminate-malloc @gol
  -fguess-branch-probability @gol
  -fif-conversion2 @gol
  -fif-conversion @gol
 @@ -6762,6 +6764,11 @@
  Perform dead store elimination (DSE) on RTL@.
  Enabled by default at @option{-O} and higher.

 +@item -feliminate-malloc
 +@opindex feliminate-malloc
 +Eliminate unnecessary malloc/free pairs.
 +Enabled by default at @option{-O} and higher.
 +
  @item -fif-conversion
  @opindex fif-conversion
  Attempt to transform conditional jumps into branch-less equivalents.  This
 Index: gcc/testsuite/gcc.dg/free-malloc.c
 ===
 --- gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 @@ -0,0 +1,12 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fno-eliminate-malloc } */
 +/* { dg-final { scan-assembler-times malloc 2} } */
 +/* { dg-final { scan-assembler-times free 2} } */
 +
 +extern void * malloc (unsigned long);
 +extern void free (void *);
 +
 +void test ()
 +{
 +  free (malloc (10));
 +}
 Index: gcc/common.opt
 ===
 --- gcc/common.opt      (revision 187277)
 +++ gcc/common.opt      (working copy)
 @@ -1474,6 +1474,10 @@
  Common Var(flag_dce) Init(1) Optimization
  Use the RTL dead code elimination pass

 +feliminate-malloc
 +Common Var(flag_eliminate_malloc) Init(1) Optimization
 +Eliminate unnecessary malloc/free pairs
 +
  fdse
  Common Var(flag_dse) Init(1) Optimization
  Use the RTL dead store elimination pass
 Index: gcc/tree-ssa-dce.c
 ===
 --- gcc/tree-ssa-dce.c  (revision 187277)
 +++ gcc/tree-ssa-dce.c  (working copy)
 @@ -309,6 +309,8 @@
            case BUILT_IN_CALLOC:
            case BUILT_IN_ALLOCA:
            case BUILT_IN_ALLOCA_WITH_ALIGN:
 +             if (!flag_eliminate_malloc)
 +               mark_stmt_necessary (stmt, true);
              return;

            default:;


Re: [PATCH] MIPS16: Remove DWARF-2 location information from GP accesses

2012-05-09 Thread Maciej W. Rozycki
On Tue, 8 May 2012, Richard Sandiford wrote:

  gcc-mips16-gp-pseudo-loc.patch
  Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.c
  ===
  --- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.c 2012-05-02 
  23:42:46.185566469 +0100
  +++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.c  2012-05-03 
  18:55:28.775580939 +0100
  @@ -2622,7 +2622,8 @@ mips16_gp_pseudo_reg (void)
  scan = NEXT_INSN (scan);
   
 insn = gen_load_const_gp (cfun-machine-mips16_gp_pseudo_rtx);
  -  emit_insn_after (insn, scan);
  +  insn = emit_insn_after (insn, scan);
  +  INSN_LOCATOR (insn) = 0;
   
 pop_topmost_sequence ();
   }
 
 An alternative would be to use prologue_locator, like ARM does.

 Is this instruction guaranteed to be emitted once per function only?  

 I'm not sure whether that's an improvement though, so the patch
 is OK as-is, thanks.

 Applied now, thanks.

  Maciej


Re: [PATCH] MIPS16: Remove DWARF-2 location information from GP accesses

2012-05-09 Thread Richard Sandiford
Maciej W. Rozycki ma...@codesourcery.com writes:
 On Tue, 8 May 2012, Richard Sandiford wrote:
  gcc-mips16-gp-pseudo-loc.patch
  Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.c
  ===
  --- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.c2012-05-02 
  23:42:46.185566469 +0100
  +++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.c 2012-05-03 
  18:55:28.775580939 +0100
  @@ -2622,7 +2622,8 @@ mips16_gp_pseudo_reg (void)
 scan = NEXT_INSN (scan);
   
 insn = gen_load_const_gp (cfun-machine-mips16_gp_pseudo_rtx);
  -  emit_insn_after (insn, scan);
  +  insn = emit_insn_after (insn, scan);
  +  INSN_LOCATOR (insn) = 0;
   
 pop_topmost_sequence ();
   }
 
 An alternative would be to use prologue_locator, like ARM does.

  Is this instruction guaranteed to be emitted once per function only?  

Yes.


Re: [committed] Fix lower-subreg cost calculation

2012-05-09 Thread Richard Earnshaw
On 08/05/12 22:42, Richard Sandiford wrote:
 Richard Earnshaw rearn...@arm.com writes:
 FTR, this caused
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53278
 
 Well, this really has been a brown-paper-bag patch.  Fixed as below.
 Tested on x86_64-linux-gnu and applied as obvious.
 
 Richard
 
 
 gcc/
   PR rtl-optimization/53278
   * lower-subreg.c (decompose_multiword_subregs): Remove left-over
   speed_p code from earlier patch.
 
 Index: gcc/lower-subreg.c
 ===
 --- gcc/lower-subreg.c2012-05-08 19:45:31.0 +0100
 +++ gcc/lower-subreg.c2012-05-08 19:45:31.793855523 +0100
 @@ -1487,9 +1487,7 @@ decompose_multiword_subregs (void)
FOR_EACH_BB (bb)
   {
 rtx insn;
 -   bool speed_p;
  
 -   speed_p = optimize_bb_for_speed_p (bb);
 FOR_BB_INSNS (bb, insn)
   {
 rtx pat;
 


Which begs the question as to why -wshadow didn't pick this up...

R.



Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-09 Thread Dehao Chen
On Wed, May 9, 2012 at 5:22 PM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Wed, May 9, 2012 at 10:38 AM, Dehao Chen de...@google.com wrote:
 On Wed, May 9, 2012 at 4:12 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li davi...@google.com 
 wrote:
 To be clear, this flag is for malloc implementation (such as tcmalloc)
 with side effect unknown to the compiler. Using -fno-builtin-xxx is
 too conservative for that purpose.

 I don't think that flys.  Btw, the patch also guards alloca - alloca is 
 purely
 GCC internal.

 What's the unknown side-effects that are also important to preserve
 for free(malloc(4))?

 Malloc implementation may record some info to a global structure, and
 a program may use this free(malloc()) pair to simulate the real runs
 to get some data, such as peak memory requirement.

 So why not use an alternate interface into this special allocator for this
 purpose?

There can be the following scenario:

We want to add a module to an existing app. Before implementing the
module, we want to collect some statistics on real runs. In this
scenario, we need:

* No change to the legacy code
* Optimized build for the simulation run
* Provide accurate statistical info

We want to collect data for both new module and the legacy code
without changing the later, thus we cannot use a new malloc/free
interface.

Thanks,
Dehao


 Dehao


 Richard.

 David

 On Tue, May 8, 2012 at 7:43 AM, Dehao Chen de...@google.com wrote:
 Hello,

 This patch adds a flag to guard the optimization that optimize the
 following code away:

 free (malloc (4));

 In some cases, we'd like this type of malloc/free pairs to remain in
 the optimized code.

 Tested with bootstrap, and no regression in the gcc testsuite.

 Is it ok for mainline?

 Thanks,
 Dehao

 gcc/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * common.opt (feliminate-malloc): New.
        * doc/invoke.texi: Document it.
        * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.

 gcc/testsuite/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
        as expected.

 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 187277)
 +++ gcc/doc/invoke.texi (working copy)
 @@ -360,7 +360,8 @@
  -fcx-limited-range @gol
  -fdata-sections -fdce -fdelayed-branch @gol
  -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
 --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects 
 @gol
 +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations 
 @gol
 +-ffat-lto-objects @gol
  -ffast-math -ffinite-math-only -ffloat-store
 -fexcess-precision=@var{style} @gol
  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 @@ -6238,6 +6239,7 @@
  -fdefer-pop @gol
  -fdelayed-branch @gol
  -fdse @gol
 +-feliminate-malloc @gol
  -fguess-branch-probability @gol
  -fif-conversion2 @gol
  -fif-conversion @gol
 @@ -6762,6 +6764,11 @@
  Perform dead store elimination (DSE) on RTL@.
  Enabled by default at @option{-O} and higher.

 +@item -feliminate-malloc
 +@opindex feliminate-malloc
 +Eliminate unnecessary malloc/free pairs.
 +Enabled by default at @option{-O} and higher.
 +
  @item -fif-conversion
  @opindex fif-conversion
  Attempt to transform conditional jumps into branch-less equivalents.  
 This
 Index: gcc/testsuite/gcc.dg/free-malloc.c
 ===
 --- gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 @@ -0,0 +1,12 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fno-eliminate-malloc } */
 +/* { dg-final { scan-assembler-times malloc 2} } */
 +/* { dg-final { scan-assembler-times free 2} } */
 +
 +extern void * malloc (unsigned long);
 +extern void free (void *);
 +
 +void test ()
 +{
 +  free (malloc (10));
 +}
 Index: gcc/common.opt
 ===
 --- gcc/common.opt      (revision 187277)
 +++ gcc/common.opt      (working copy)
 @@ -1474,6 +1474,10 @@
  Common Var(flag_dce) Init(1) Optimization
  Use the RTL dead code elimination pass

 +feliminate-malloc
 +Common Var(flag_eliminate_malloc) Init(1) Optimization
 +Eliminate unnecessary malloc/free pairs
 +
  fdse
  Common Var(flag_dse) Init(1) Optimization
  Use the RTL dead store elimination pass
 Index: gcc/tree-ssa-dce.c
 ===
 --- gcc/tree-ssa-dce.c  (revision 187277)
 +++ gcc/tree-ssa-dce.c  (working copy)
 @@ -309,6 +309,8 @@
            case BUILT_IN_CALLOC:
            case BUILT_IN_ALLOCA:
            case BUILT_IN_ALLOCA_WITH_ALIGN:
 +             if (!flag_eliminate_malloc)
 +               mark_stmt_necessary (stmt, 

Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-09 Thread Richard Guenther
On Wed, May 9, 2012 at 11:48 AM, Dehao Chen de...@google.com wrote:
 On Wed, May 9, 2012 at 5:22 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Wed, May 9, 2012 at 10:38 AM, Dehao Chen de...@google.com wrote:
 On Wed, May 9, 2012 at 4:12 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li davi...@google.com 
 wrote:
 To be clear, this flag is for malloc implementation (such as tcmalloc)
 with side effect unknown to the compiler. Using -fno-builtin-xxx is
 too conservative for that purpose.

 I don't think that flys.  Btw, the patch also guards alloca - alloca is 
 purely
 GCC internal.

 What's the unknown side-effects that are also important to preserve
 for free(malloc(4))?

 Malloc implementation may record some info to a global structure, and
 a program may use this free(malloc()) pair to simulate the real runs
 to get some data, such as peak memory requirement.

 So why not use an alternate interface into this special allocator for this
 purpose?

 There can be the following scenario:

 We want to add a module to an existing app. Before implementing the
 module, we want to collect some statistics on real runs. In this
 scenario, we need:

 * No change to the legacy code
 * Optimized build for the simulation run
 * Provide accurate statistical info

 We want to collect data for both new module and the legacy code
 without changing the later, thus we cannot use a new malloc/free
 interface.

So put in a optimization barrier then.  Like

free (({ void * x = malloc (4); __asm ( : +m (x)); __x; });

Btw, why can't you simply build the new module (which doesn't something
real anyway, just fake stuff) without optimization?

 Thanks,
 Dehao


 Dehao


 Richard.

 David

 On Tue, May 8, 2012 at 7:43 AM, Dehao Chen de...@google.com wrote:
 Hello,

 This patch adds a flag to guard the optimization that optimize the
 following code away:

 free (malloc (4));

 In some cases, we'd like this type of malloc/free pairs to remain in
 the optimized code.

 Tested with bootstrap, and no regression in the gcc testsuite.

 Is it ok for mainline?

 Thanks,
 Dehao

 gcc/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * common.opt (feliminate-malloc): New.
        * doc/invoke.texi: Document it.
        * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.

 gcc/testsuite/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
        as expected.

 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 187277)
 +++ gcc/doc/invoke.texi (working copy)
 @@ -360,7 +360,8 @@
  -fcx-limited-range @gol
  -fdata-sections -fdce -fdelayed-branch @gol
  -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
 --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects 
 @gol
 +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations 
 @gol
 +-ffat-lto-objects @gol
  -ffast-math -ffinite-math-only -ffloat-store
 -fexcess-precision=@var{style} @gol
  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 @@ -6238,6 +6239,7 @@
  -fdefer-pop @gol
  -fdelayed-branch @gol
  -fdse @gol
 +-feliminate-malloc @gol
  -fguess-branch-probability @gol
  -fif-conversion2 @gol
  -fif-conversion @gol
 @@ -6762,6 +6764,11 @@
  Perform dead store elimination (DSE) on RTL@.
  Enabled by default at @option{-O} and higher.

 +@item -feliminate-malloc
 +@opindex feliminate-malloc
 +Eliminate unnecessary malloc/free pairs.
 +Enabled by default at @option{-O} and higher.
 +
  @item -fif-conversion
  @opindex fif-conversion
  Attempt to transform conditional jumps into branch-less equivalents.  
 This
 Index: gcc/testsuite/gcc.dg/free-malloc.c
 ===
 --- gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 @@ -0,0 +1,12 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fno-eliminate-malloc } */
 +/* { dg-final { scan-assembler-times malloc 2} } */
 +/* { dg-final { scan-assembler-times free 2} } */
 +
 +extern void * malloc (unsigned long);
 +extern void free (void *);
 +
 +void test ()
 +{
 +  free (malloc (10));
 +}
 Index: gcc/common.opt
 ===
 --- gcc/common.opt      (revision 187277)
 +++ gcc/common.opt      (working copy)
 @@ -1474,6 +1474,10 @@
  Common Var(flag_dce) Init(1) Optimization
  Use the RTL dead code elimination pass

 +feliminate-malloc
 +Common Var(flag_eliminate_malloc) Init(1) Optimization
 +Eliminate unnecessary malloc/free pairs
 +
  fdse
  Common Var(flag_dse) Init(1) Optimization
  Use the RTL dead store elimination pass
 Index: gcc/tree-ssa-dce.c
 

Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-09 Thread Jakub Jelinek
On Wed, May 09, 2012 at 05:48:25PM +0800, Dehao Chen wrote:
  So why not use an alternate interface into this special allocator for this
  purpose?
 
 There can be the following scenario:
 
 We want to add a module to an existing app. Before implementing the
 module, we want to collect some statistics on real runs. In this
 scenario, we need:
 
 * No change to the legacy code
 * Optimized build for the simulation run
 * Provide accurate statistical info
 
 We want to collect data for both new module and the legacy code
 without changing the later, thus we cannot use a new malloc/free
 interface.

Note that even in the glibc testsuite there had to be workarounds for this
(asm barrier was used):
http://sources.redhat.com/ml/libc-alpha/2012-05/msg00138.html

Jakub


[PATCH] ARM/NEON: vld1q_dup_s64 builtin

2012-05-09 Thread Christophe Lyon

Hello,

On ARM+Neon, the expansion of vld1q_dup_s64() and vld1q_dup_u64() builtins 
currently fails to load the second vector element.

Here is a small patch to address this problem:

2012-05-07  Christophe Lyon christophe.l...@st.com

* gcc/config/arm/neon.md (neon_vld1_dup): Fix vld1q_dup_s64.

Index: gcc/config/arm/neon.md
===
--- gcc/config/arm/neon.md(revision 2659)
+++ gcc/config/arm/neon.md(revision 2660)
@@ -4203,7 +4203,7 @@
   if (GET_MODE_NUNITS (MODEmode)  2)
 return vld1.V_sz_elem\t{%e0[], %f0[]}, %A1;
   else
-return vld1.V_sz_elem\t%h0, %A1;
+return vld1.V_sz_elem\t%e0, %A1 \;vmov\t%f0, %e0;
 }
   [(set (attr neon_type)
   (if_then_else (gt (const_string V_mode_nunits) (const_string 1))

OK?

Thanks,

Christophe.



Backport: fma3 instruction generation for 'march=native' in AMD processors

2012-05-09 Thread Gopalasubramanian, Ganesh
Hello,

Below is the patch that has been committed in trunk (Revision: 187075).
We like to backport it to GCC 4.7 branch as couple of AMD processors require 
this change for fma3 instruction generation.

Bootstrapping and testing are successful. 

Is it OK to commit in GCC 4.7 branch?

Regards
Ganesh

PATCH
=

  * config/i386/driver-i386.c (host_detect_local_cpu): Reset
  has_fma4 for AMD processors with both fma3 and fma4 support.

Index: config/i386/driver-i386.c
===
--- config/i386/driver-i386.c   (revision 186897)
+++ config/i386/driver-i386.c   (working copy)
@@ -472,6 +472,8 @@
   has_abm = ecx  bit_ABM;
   has_lwp = ecx  bit_LWP;
   has_fma4 = ecx  bit_FMA4;
+  if (vendor == SIG_AMD  has_fma4  has_fma)
+   has_fma4 = 0;
   has_xop = ecx  bit_XOP;
   has_tbm = ecx  bit_TBM;
   has_lzcnt = ecx  bit_LZCNT;



Re: [RFC] improve caret diagnostics for overload failures

2012-05-09 Thread Manuel López-Ibáñez
Someone opened a bug about this: http://gcc.gnu.org/PR53289

Pinging: http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01836.html


On 29 April 2012 12:28, Manuel López-Ibáñez lopeziba...@gmail.com wrote:
 A new  version using unsigned int for the flag type. It also adds
 another use in the C FE.

 I am not asking for approval, only whether this
 approach/implementation is the way to go.

 Cheers,

 Manuel.

 On 23 April 2012 20:09, Manuel López-Ibáñez lopeziba...@gmail.com wrote:
 So, apart from the type of the flag, are there any other comments on
 the patch? Is the approach acceptable?

 On 21 April 2012 17:51, Gabriel Dos Reis g...@integrable-solutions.net 
 wrote:
 On Sat, Apr 21, 2012 at 9:42 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Sat, Apr 21, 2012 at 04:26:32PM +0200, Manuel López-Ibáñez wrote:
 On 21 April 2012 16:22, Gabriel Dos Reis g...@integrable-solutions.net 
 wrote:
  Do no use 'char' as the type of a flag.  Prefer 'unsigned int'.
 

 Thanks, good catch! Should I worry about memory here and use something 
 shorter?

 If it is a bool flag, you certainly should use bool type, which is shorter.

 It is a bit flag -- see the patch in his original message and 'enum
 diagnostic_info_flags'.


Re: [testsuite] Fix gcc.target/i386/hle-* testcases with Sun as

2012-05-09 Thread Rainer Orth
Hi Mike,

 On May 8, 2012, at 10:19 AM, Rainer Orth wrote:
 Several /gcc.target/i386/hle-*.c tests are currently failing on Solaris
 9/x86 with Sun as:
 
 FAIL: gcc.target/i386/hle-add-acq-1.c scan-assembler lock[ 
 \\n\\t]+(xacquire|.byte[ \\t]+0xf2)[ \\t\\n]+add
 
 The .s file has
 
lock;
.byte   0xf2
 
 but the scan-assembler regex currently doesn't allow for the ; (which is
 not present with gas 2.22).
 
 The patch below does just that.  Tested with the appropriate runtest
 invocation on i386-pc-solaris2.9 configured with as and gas
 respectively.
 
 Ok for mainline?

 Ok, assuming that the ; has to be there.  If it doesn't have to be

I just rechecked: this is covered by the gcc_cv_as_ix86_rep_lock_prefix
test in gcc/configure.as which fails with the Solaris (8 and) 9 native
assembler.

 there, an alternative patch might be to remove it from the port now
 instead of the patch.

Right, that's why I was asking for review rather than just installing on
my own.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] Vectorizer versioning TLC

2012-05-09 Thread Richard Guenther

This continues the series of patches cleaning up versioning / peeling
in the vectorizer.  This moves computation of whether to do a cost
model check to a central place where it is also easy to see where
it will eventually end up generated.  Apart from this re-org this
recognizes that runtime checks are pointless if we could have done
the check fully at compile-time.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-05-09  Richard Guenther  rguent...@suse.de

* tree-vectorizer.h (vect_loop_versioning): Adjust prototype.
(vect_do_peeling_for_loop_bound): Likewise.
(vect_do_peeling_for_alignment): Likewise.
* tree-vect-loop-manip.c (conservative_cost_threshold): Remove.
(vect_do_peeling_for_loop_bound): Get check_profitability and
threshold as parameters.
(vect_do_peeling_for_alignment): Likewise.
(vect_loop_versioning): Likewise.
* tree-vect-loop.c (vect_transform_loop): Compute check_profitability
and threshold here.  Control where to put the check here.

Index: gcc/tree-vectorizer.h
===
*** gcc/tree-vectorizer.h   (revision 187316)
--- gcc/tree-vectorizer.h   (working copy)
*** extern LOC vect_loop_location;
*** 807,816 
 in tree-vect-loop-manip.c.  */
  extern void slpeel_make_loop_iterate_ntimes (struct loop *, tree);
  extern bool slpeel_can_duplicate_loop_p (const struct loop *, const_edge);
! extern void vect_loop_versioning (loop_vec_info);
  extern void vect_do_peeling_for_loop_bound (loop_vec_info, tree *,
! tree, gimple_seq);
! extern void vect_do_peeling_for_alignment (loop_vec_info);
  extern LOC find_loop_location (struct loop *);
  extern bool vect_can_advance_ivs_p (loop_vec_info);
  
--- 807,816 
 in tree-vect-loop-manip.c.  */
  extern void slpeel_make_loop_iterate_ntimes (struct loop *, tree);
  extern bool slpeel_can_duplicate_loop_p (const struct loop *, const_edge);
! extern void vect_loop_versioning (loop_vec_info, unsigned int, bool);
  extern void vect_do_peeling_for_loop_bound (loop_vec_info, tree *,
!   unsigned int, bool);
! extern void vect_do_peeling_for_alignment (loop_vec_info, unsigned int, bool);
  extern LOC find_loop_location (struct loop *);
  extern bool vect_can_advance_ivs_p (loop_vec_info);
  
Index: gcc/tree-vect-loop-manip.c
===
*** gcc/tree-vect-loop-manip.c  (revision 187316)
--- gcc/tree-vect-loop-manip.c  (working copy)
*** vect_update_ivs_after_vectorizer (loop_v
*** 1853,1886 
  }
  }
  
- /* Return the more conservative threshold between the
-min_profitable_iters returned by the cost model and the user
-specified threshold, if provided.  */
- 
- static unsigned int
- conservative_cost_threshold (loop_vec_info loop_vinfo,
-int min_profitable_iters)
- {
-   unsigned int th;
-   int min_scalar_loop_bound;
- 
-   min_scalar_loop_bound = ((PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND)
-   * LOOP_VINFO_VECT_FACTOR (loop_vinfo)) - 1);
- 
-   /* Use the cost model only if it is more conservative than user specified
-  threshold.  */
-   th = (unsigned) min_scalar_loop_bound;
-   if (min_profitable_iters
-(!min_scalar_loop_bound
-   || min_profitable_iters  min_scalar_loop_bound))
- th = (unsigned) min_profitable_iters;
- 
-   if (th  vect_print_dump_info (REPORT_COST))
- fprintf (vect_dump, Profitability threshold is %u loop iterations., th);
- 
-   return th;
- }
- 
  /* Function vect_do_peeling_for_loop_bound
  
 Peel the last iterations of the loop represented by LOOP_VINFO.
--- 1853,1858 
*** conservative_cost_threshold (loop_vec_in
*** 1896,1902 
  
  void
  vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio,
!   tree cond_expr, gimple_seq cond_expr_stmt_list)
  {
tree ni_name, ratio_mult_vf_name;
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
--- 1868,1874 
  
  void
  vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio,
!   unsigned int th, bool check_profitability)
  {
tree ni_name, ratio_mult_vf_name;
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
*** vect_do_peeling_for_loop_bound (loop_vec
*** 1904,1913 
edge update_e;
basic_block preheader;
int loop_num;
-   bool check_profitability = false;
-   unsigned int th = 0;
-   int min_profitable_iters;
int max_iter;
  
if (vect_print_dump_info (REPORT_DETAILS))
  fprintf (vect_dump, === vect_do_peeling_for_loop_bound ===);
--- 1876,1884 
edge update_e;
basic_block preheader;
int loop_num;
int max_iter;
+   tree cond_expr = NULL_TREE;
+   gimple_seq 

Ping: [PATCH] Add -fdump-rtl-pass-quiet

2012-05-09 Thread Andrew Stubbs

Ping.

On 02/05/12 10:49, Andrew Stubbs wrote:

On 19/04/12 13:58, Andrew Stubbs wrote:

In the meantime, Mr Maintainers, can I commit my patch while we wait for
the new world order? I'm happy to change the option name quiet to
something else if necessary.


Ping.

I think David may have a point about 'quiet' being an inappropriate name
for this option. He proposed 'ir' which I don't like but will do the job
just fine.

Ok, with that change?

Andrew





Re: Backport: fma3 instruction generation for 'march=native' in AMD processors

2012-05-09 Thread Uros Bizjak
Hello!

 Below is the patch that has been committed in trunk (Revision: 187075).
 We like to backport it to GCC 4.7 branch as couple of AMD processors require 
 this change for fma3 instruction generation.

 Is it OK to commit in GCC 4.7 branch?

  * config/i386/driver-i386.c (host_detect_local_cpu): Reset
  has_fma4 for AMD processors with both fma3 and fma4 support.

OK.

Thanks,
Uros.


Re: [testsuite] Fix gcc.target/i386/hle-* testcases with Sun as

2012-05-09 Thread Uros Bizjak
On Wed, May 9, 2012 at 12:53 PM, Rainer Orth
r...@cebitec.uni-bielefeld.de wrote:

 Several /gcc.target/i386/hle-*.c tests are currently failing on Solaris
 9/x86 with Sun as:

 FAIL: gcc.target/i386/hle-add-acq-1.c scan-assembler lock[ 
 \\n\\t]+(xacquire|.byte[ \\t]+0xf2)[ \\t\\n]+add

 The .s file has

        lock;
        .byte   0xf2

 but the scan-assembler regex currently doesn't allow for the ; (which is
 not present with gas 2.22).

 The patch below does just that.  Tested with the appropriate runtest
 invocation on i386-pc-solaris2.9 configured with as and gas
 respectively.

 Ok for mainline?

 Ok, assuming that the ; has to be there.  If it doesn't have to be

 I just rechecked: this is covered by the gcc_cv_as_ix86_rep_lock_prefix
 test in gcc/configure.as which fails with the Solaris (8 and) 9 native
 assembler.

 there, an alternative patch might be to remove it from the port now
 instead of the patch.

 Right, that's why I was asking for review rather than just installing on
 my own.

I'd rather see that we remove semicolon in this case, but please note
that xchg with memory operand doesn't need lock prefix. If it is not
too much trouble...

Uros.


RE: Propose to add a new argument MULTILIB_REQUIRED in multilib framework

2012-05-09 Thread Terry Guo
 -Original Message-
 From: Joseph Myers [mailto:jos...@codesourcery.com]
 Sent: Thursday, May 03, 2012 8:49 PM
 To: Terry Guo
 Cc: gcc-patches@gcc.gnu.org
 Subject: RE: Propose to add a new argument MULTILIB_REQUIRED in
 multilib framework
 
 On Thu, 3 May 2012, Terry Guo wrote:
 
  2012-05-03  Terry Guo  terry@arm.com
 
  * Makefile.in (s-mlib): Add new argument MULTILIB_REQUIRED.
  * genmultilib (MULTILIB_REQUIRED): New.
  * doc/fragments.texi: Document the MULTILIB_REQUIRED.
 
 This is OK, with copyright dates on genmultilib and fragments.texi
 updated, in the absence of any build system maintainer objections
 within
 72 hours.
 

Committed into trunk with this patch at revision 187325 and copyright patch
at revision 187327.

BR,
Terry




Re: [C++ Patch] PR 53158

2012-05-09 Thread Paolo Carlini

Hi again,

On 05/08/2012 03:00 PM, Jason Merrill wrote:

On 05/07/2012 11:28 PM, Paolo Carlini wrote:

error: could not convert ‘b.main()::lambda()()’ from ‘void’ to ‘bool’


It wouldn't say operator()?

I think I'd leave that alone; it is somewhat more informative (about 
what b() expands to) and we're moving toward replacing %qE with caret 
anyway.


The patch to ocp_convert is OK.
As you may have noticed, the patchlet is still unapplied (I'm attaching 
below what I have tested and ready to get in per your approval).


Is unapplied because I was really nervous due to the wrong location 
(thus caret) of the error call, at the end of the whole condition. Now, 
I'm wondering, shall we consistently use error_at (location_of (expr), 
... for the error messages produced by the *convert* functions? The 
below quick fix makes me *much* more happy, the caret points to the 
closed round brace of the b() call. Can I trust all the exprs to come 
with an embedded location *at least* as accurate as input_location, 
normally better? In case I can do a pass through all of cvt.c etc and 
repost a largish patch...


Thanks!

Paolo.


Index: testsuite/g++.dg/cpp0x/lambda/lambda-err2.C
===
--- testsuite/g++.dg/cpp0x/lambda/lambda-err2.C (revision 0)
+++ testsuite/g++.dg/cpp0x/lambda/lambda-err2.C (revision 0)
@@ -0,0 +1,12 @@
+// PR c++/53158
+// { dg-do compile { target c++11 } }
+
+int main()
+{
+  auto a = []() { return true; };
+  auto b = []() { return a(); };  // { dg-error 'a' is not captured }
+  int c, d;
+  while (b()  c  d) // { dg-error could not convert }
+{
+}
+}
Index: cp/cvt.c
===
--- cp/cvt.c(revision 187280)
+++ cp/cvt.c(working copy)
@@ -743,6 +743,12 @@ ocp_convert (tree type, tree expr, int convtype, i
}
   if (code == BOOLEAN_TYPE)
{
+ if (TREE_CODE (intype) == VOID_TYPE)
+   {
+ error (could not convert %qE from %void% to %bool%, expr);
+ return error_mark_node;
+   }
+
  /* We can't implicitly convert a scoped enum to bool, so convert
 to the underlying type first.  */
  if (SCOPED_ENUM_P (intype)  (convtype  CONV_STATIC))
Index: cvt.c
===
--- cvt.c   (revision 187290)
+++ cvt.c   (working copy)
@@ -743,6 +743,14 @@ ocp_convert (tree type, tree expr, int convtype, i
}
   if (code == BOOLEAN_TYPE)
{
+ if (TREE_CODE (intype) == VOID_TYPE)
+   {
+ error_at (location_of (expr),
+   could not convert %qE from %void% to %bool%,
+   expr);
+ return error_mark_node;
+   }
+
  /* We can't implicitly convert a scoped enum to bool, so convert
 to the underlying type first.  */
  if (SCOPED_ENUM_P (intype)  (convtype  CONV_STATIC))


Re: [PATCH] Fix endless loop in forwprop (PR tree-optimization/53226)

2012-05-09 Thread Jakub Jelinek
On Wed, May 09, 2012 at 10:21:04AM +0200, Richard Guenther wrote:
 Using the stmt UID looks odd as you are using it as flag - why not use
 one of the two stmt flags available to passes?
 
 Btw, the other possibility would be to simply change the various combiners
 to require them to update the iterator to point at the first statement that
 they have inserted.  I have some TLC ideas in this area, let's see if I can
 get to them.
 
 Ok with using gimple_set_plf and friends.

Here is what I've committed after doing another bootstrap/regtest:

2012-05-09  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/53226
* tree-ssa-forwprop.c (ssa_forward_propagate_and_combine): Remove
prev and prev_initialized vars, gimple_set_plf (stmt, GF_PLF_1, false)
before processing it and gimple_set_plf (stmt, GF_PLF_1, true) if it
doesn't need to be revisited, look for earliest stmt with
!gimple_plf (stmt, GF_PLF_1) if something changed.

* gcc.c-torture/compile/pr53226.c: New test.

--- gcc/tree-ssa-forwprop.c.jj  2012-05-03 08:35:52.0 +0200
+++ gcc/tree-ssa-forwprop.c 2012-05-08 18:10:19.662061709 +0200
@@ -2677,8 +2677,7 @@ ssa_forward_propagate_and_combine (void)
 
   FOR_EACH_BB (bb)
 {
-  gimple_stmt_iterator gsi, prev;
-  bool prev_initialized;
+  gimple_stmt_iterator gsi;
 
   /* Apply forward propagation to all stmts in the basic-block.
 Note we update GSI within the loop as necessary.  */
@@ -2771,12 +2770,14 @@ ssa_forward_propagate_and_combine (void)
 
   /* Combine stmts with the stmts defining their operands.
 Note we update GSI within the loop as necessary.  */
-  prev_initialized = false;
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
{
  gimple stmt = gsi_stmt (gsi);
  bool changed = false;
 
+ /* Mark stmt as potentially needing revisiting.  */
+ gimple_set_plf (stmt, GF_PLF_1, false);
+
  switch (gimple_code (stmt))
{
case GIMPLE_ASSIGN:
@@ -2856,18 +2857,18 @@ ssa_forward_propagate_and_combine (void)
{
  /* If the stmt changed then re-visit it and the statements
 inserted before it.  */
- if (!prev_initialized)
+ for (; !gsi_end_p (gsi); gsi_prev (gsi))
+   if (gimple_plf (gsi_stmt (gsi), GF_PLF_1))
+ break;
+ if (gsi_end_p (gsi))
gsi = gsi_start_bb (bb);
  else
-   {
- gsi = prev;
- gsi_next (gsi);
-   }
+   gsi_next (gsi);
}
  else
{
- prev = gsi;
- prev_initialized = true;
+ /* Stmt no longer needs to be revisited.  */
+ gimple_set_plf (stmt, GF_PLF_1, true);
  gsi_next (gsi);
}
}
--- gcc/testsuite/gcc.c-torture/compile/pr53226.c.jj2012-05-08 
18:07:40.007000510 +0200
+++ gcc/testsuite/gcc.c-torture/compile/pr53226.c   2012-05-08 
18:07:35.071029578 +0200
@@ -0,0 +1,13 @@
+/* PR tree-optimization/53226 */
+
+void
+foo (unsigned long *x, char y, char z)
+{
+  int i;
+  for (i = y; i  z; ++i)
+{
+  unsigned long a = ((unsigned char) i)  63UL;
+  unsigned long b = 1ULL  a;
+  *x |= b;
+}
+}

Jakub


Fix PR53185 (vectorizer segfault)

2012-05-09 Thread Michael Matz
Hi,

the current code for strided loads can't deal with the situation when a 
prologue loop (peeling for alignment) is created after analyzing the data 
refs.  There are multiple issues (non-constant steps in DRs mainly), so 
this is a simple stop gap.

Regtesting on x86_64-linux (all langs) in progress.  Okay for trunk?


Ciao,
Michael.

PR tree-optimization/53185
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Disable
peeling when we see strided loads.

testsuite/
* gcc.dg/vect/pr53185.c: New test.

Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c   (revision 187287)
+++ tree-vect-data-refs.c   (working copy)
@@ -1507,6 +1507,17 @@ vect_enhance_data_refs_alignment (loop_v
GROUP_FIRST_ELEMENT (stmt_info) != stmt)
 continue;
 
+  /* FORNOW: Any strided load prevents peeling.  The induction
+ variable analysis will fail when the prologue loop is generated,
+and so we can't generate the new base for the pointer.  */
+  if (STMT_VINFO_STRIDE_LOAD_P (stmt_info))
+   {
+ if (vect_print_dump_info (REPORT_DETAILS))
+   fprintf (vect_dump, strided load prevents peeling);
+ do_peeling = false;
+ break;
+   }
+
   /* For invariant accesses there is nothing to enhance.  */
   if (integer_zerop (DR_STEP (dr)))
continue;
Index: testsuite/gcc.dg/vect/pr53185.c
===
--- testsuite/gcc.dg/vect/pr53185.c (revision 0)
+++ testsuite/gcc.dg/vect/pr53185.c (revision 0)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options -O3 -ftree-vectorize } */
+unsigned short a, e;
+int *b, *d;
+int c;
+extern int fn2();
+void fn1 () {
+  void *f;
+  for (;;) {
+fn2 ();
+b = f;
+e = 0;
+for (; e  a; ++e)
+  b[e] = d[e * c];
+  }
+}


Re: PR 40752: -Wconversion generates false warnings for operands not larger than target type

2012-05-09 Thread Manuel López-Ibáñez
On 7 May 2012 16:52, Paolo Carlini paolo.carl...@oracle.com wrote:
 Hi,


 On 06/11/2010 10:23 PM, Manuel López-Ibáñez wrote:

 On 11 August 2009 02:01, Joseph S. Myersjos...@codesourcery.com  wrote:

 On Tue, 11 Aug 2009, Manuel López-Ibáñez wrote:

 Modified the patch to make use of the new c-c++-common testsuite.
 Bootstrapped and regression tested on x86_64-linux-gnu.

 OK for trunk?

 I still think the warnings for these cases are mostly correct (there are
 cases where you may be able to make deductions about the range of
 possible
 values of the expression being converted) and appropriate, and if
 disabled
 should be disabled under some separate -Wno-conversion-whatever option.

 -Wno-conversion-after-promotion ?

 I am not sure what would be a good name, but I think it is worth to
 allow silencing these warnings, so suggestions are welcome.

 it looks like this PR is still open today, and I think resolving it one way
 or the other isn't much work... Thus I'm asking: shall we actually have a
 new -Wno-conversion-after-promotion? Or, alternately - the issue came up
 quite often in recent times, because other widespread compilers don't warn -
 suppress the warning only in case of conditional expressions:

 char
 foo(bool haveBar, char bar_)
 {
  return haveBar ? bar_ : 0;
 };

 What do you think?

I think Joseph has said before that if the range can be deduced to
stay within limits of the target type, then we should never warn, no
new option needed. This is exactly the case if you recurse only on
conditional expression operands and casts, and it will fix the warning
above. The other cases should be deal separately.

Cheers,

Manuel.


Re: [RFC ivopts] ARM - Make ivopts take into account whether pre and post increments are actually supported on targets.

2012-05-09 Thread Ramana Radhakrishnan
 I would like another set of eyes on the backend specific changes - I
 am currently regression testing this final version on FSF trunk.

After testing and benchmarking and getting some private feedback about
the patch, this is what I ended up committing.  I have a follow up
patch coming to adjust legitimate_address and friends for some of
these modes.

regards,
Ramana

2012-05-09  Ramana Radhakrishnan  ramana.radhakrish...@linaro.org

* tree-ssa-loop-ivopts.c (add_autoinc_candidates, get_address_cost):
Replace use of HAVE_{POST/PRE}_{INCREMENT/DECREMENT} with
USE_{LOAD/STORE}_{PRE/POST}_{INCREMENT/DECREMENT} appropriately.
* config/arm/arm.h (ARM_AUTOINC_VALID_FOR_MODE_P): New.
(USE_LOAD_POST_INCREMENT): Define.
(USE_LOAD_PRE_INCREMENT): Define.
(USE_LOAD_POST_DECREMENT): Define.
(USE_LOAD_PRE_DECREMENT): Define.
(USE_STORE_PRE_DECREMENT): Define.
(USE_STORE_PRE_INCREMENT): Define.
(USE_STORE_POST_DECREMENT): Define.
(USE_STORE_POST_INCREMENT): Define.
(arm_auto_incmodes): Add enumeration.
* config/arm/arm-protos.h (arm_autoinc_modes_ok_p): Declare.
* config/arm/arm.c (arm_autoinc_modes_ok_p): Define.




 2012-04-10  Ramana Radhakrishnan  ramana.radhakrish...@linaro.org

        * tree-ssa-loop-ivopts.c (add_autoinc_candidates, get_address_cost):
        Replace use of HAVE_{POST/PRE}_{INCREMENT/DECREMENT} with
        USE_{LOAD/STORE}_{PRE/POST}_{INCREMENT/DECREMENT} appropriately.
        * config/arm/arm.h (ARM_AUTOINC_VALID_FOR_MODE_P): New.
        (USE_LOAD_POST_INCREMENT): Define.
        (USE_LOAD_PRE_INCREMENT): Define.
        (USE_LOAD_POST_DECREMENT): Define.
        (USE_LOAD_PRE_DECREMENT): Define.
        (USE_STORE_PRE_DECREMENT): Define.
        (USE_STORE_PRE_INCREMENT): Define.
        (USE_STORE_POST_DECREMENT): Define.
        (USE_STORE_POST_INCREMENT): Define.
        (arm_auto_incmodes): Add enumeration.
        * config/arm/arm-protos.h (arm_autoinc_modes_ok_p): Declare.
        * config/arm/arm.c (arm_autoinc_modes_ok_p): Define.
        (arm_rtx_costs_1): Adjust costs for
        auto-inc modes and pre / post modify in floating point mode.
        (arm_size_rtx_costs): Likewise.


 regards,
 Ramana



 Richard.

 Richard.

 Ramana
Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 187327)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -2362,8 +2362,12 @@
   cstepi = int_cst_value (step);
 
   mem_mode = TYPE_MODE (TREE_TYPE (*use-op_p));
-  if ((HAVE_PRE_INCREMENT  GET_MODE_SIZE (mem_mode) == cstepi)
-  || (HAVE_PRE_DECREMENT  GET_MODE_SIZE (mem_mode) == -cstepi))
+  if (((USE_LOAD_PRE_INCREMENT (mem_mode)
+   || USE_STORE_PRE_INCREMENT (mem_mode))
+GET_MODE_SIZE (mem_mode) == cstepi)
+  || ((USE_LOAD_PRE_DECREMENT (mem_mode)
+  || USE_STORE_PRE_DECREMENT (mem_mode))
+  GET_MODE_SIZE (mem_mode) == -cstepi))
 {
   enum tree_code code = MINUS_EXPR;
   tree new_base;
@@ -2380,8 +2384,12 @@
   add_candidate_1 (data, new_base, step, important, IP_BEFORE_USE, use,
   use-stmt);
 }
-  if ((HAVE_POST_INCREMENT  GET_MODE_SIZE (mem_mode) == cstepi)
-  || (HAVE_POST_DECREMENT  GET_MODE_SIZE (mem_mode) == -cstepi))
+  if (((USE_LOAD_POST_INCREMENT (mem_mode)
+   || USE_STORE_POST_INCREMENT (mem_mode))
+GET_MODE_SIZE (mem_mode) == cstepi)
+  || ((USE_LOAD_POST_DECREMENT (mem_mode)
+  || USE_STORE_POST_DECREMENT (mem_mode))
+  GET_MODE_SIZE (mem_mode) == -cstepi))
 {
   add_candidate_1 (data, base, step, important, IP_AFTER_USE, use,
   use-stmt);
@@ -3315,25 +3323,29 @@
   reg0 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 1);
   reg1 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 2);
 
-  if (HAVE_PRE_DECREMENT)
+  if (USE_LOAD_PRE_DECREMENT (mem_mode) 
+ || USE_STORE_PRE_DECREMENT (mem_mode))
{
  addr = gen_rtx_PRE_DEC (address_mode, reg0);
  has_predec[mem_mode]
= memory_address_addr_space_p (mem_mode, addr, as);
}
-  if (HAVE_POST_DECREMENT)
+  if (USE_LOAD_POST_DECREMENT (mem_mode) 
+ || USE_STORE_POST_DECREMENT (mem_mode))
{
  addr = gen_rtx_POST_DEC (address_mode, reg0);
  has_postdec[mem_mode]
= memory_address_addr_space_p (mem_mode, addr, as);
}
-  if (HAVE_PRE_INCREMENT)
+  if (USE_LOAD_PRE_INCREMENT (mem_mode) 
+ || USE_STORE_PRE_DECREMENT (mem_mode))
{
  addr = gen_rtx_PRE_INC (address_mode, reg0);
  has_preinc[mem_mode]
= memory_address_addr_space_p (mem_mode, addr, as);
}
-  if (HAVE_POST_INCREMENT)
+  if (USE_LOAD_POST_INCREMENT (mem_mode) 
+ || USE_STORE_POST_INCREMENT (mem_mode))
{
  addr = 

[PATCH] Move unexecuted vect testcase

2012-05-09 Thread Richard Guenther

Committed as obvious (passes on x86_64).

Richard.

2012-05-09  Richard Guenther  rguent...@suse.de

PR tree-optimization/18437
* gfortran.dg/vect/rnflow-trs2a2.f90: Move ...
* gfortran.dg/vect/fast-math-rnflow-trs2a2.f90: ... here.


RFA: Fix detecting of in-tree MPFR 3.1.0 sources

2012-05-09 Thread Nick Clifton
Hi Guys,

http://www.mpfr.org/mpfr-current/#changes

  The current release of the MPFR library (v3.1.0) has reorganized its
  sources such the mpfr.h header file is now in a sub-directory called
  'src', rather than being at the top level.  This has broken GCC's use
  of in-tree MPFR sources.
  
  I am asking for permission to apply the patch below to fix the
  problem.  I tested it by building an i686-pc-linux-gnu toolchain on a
  machine with no MPFR libraries installed, but with a copy of the mpfr
  3.1.0 sources installed in-tree.  I also built a second toolchain with
  an in-tree copy of the mpfr 2.4.2 sources, just to make sure that the
  old paths still worked.  Both builds worked.

  OK to apply ?

Cheers
  Nick

gcc/ChangeLog
2012-05-09  Nick Clifton  ni...@redhat.com

* configure.ac (mpfr-dir): When using in-tree MPFR sources
allow for the fact that from release v3.1.0 of MPFR the source
files were moved into a src sub-directory.
* configure: Regenerate.

Index: configure.ac
===
--- configure.ac(revision 187320)
+++ configure.ac(working copy)
@@ -1289,9 +1289,16 @@
   gmplibs=-L$with_mpfr_lib $gmplibs
 fi
 if test x$with_mpfr$with_mpfr_include$with_mpfr_lib = x  test -d 
${srcdir}/mpfr; then
-  gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir $gmplibs
-  gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '$gmpinc
-  extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr 
--with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir
+  # MPFR v3.1.0 moved the sources into a src sub-directory.
+  if test -d ${srcdir}/mpfr/src; then
+gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/src/'$lt_cv_objdir $gmplibs
+gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr/src -I$$s/mpfr/src '$gmpinc
+extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr/src 
--with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/src/'$lt_cv_objdir
+  else
+gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir $gmplibs
+gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '$gmpinc
+extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr 
--with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir
+  fi
   # Do not test the mpfr version.  Assume that it is sufficient, since
   # it is in the source tree, and the library has not been built yet
   # but it would be included on the link line in the version check below


Re: [C++ Patch] PR 53158

2012-05-09 Thread Jason Merrill

On 05/09/2012 08:21 AM, Paolo Carlini wrote:

Is unapplied because I was really nervous due to the wrong location
(thus caret) of the error call, at the end of the whole condition. Now,
I'm wondering, shall we consistently use error_at (location_of (expr),
... for the error messages produced by the *convert* functions?


Sure.  Note that as an alternative to error_at (location_of (expr) you 
can just use %q+E; the + means use the location of this argument.



The
below quick fix makes me *much* more happy, the caret points to the
closed round brace of the b() call. Can I trust all the exprs to come
with an embedded location *at least* as accurate as input_location,
normally better?


I would think so, yes.  If an expression doesn't have a location, 
location_of/%q+E will just use input_location.


Jason


Re: [RFC] improve caret diagnostics for overload failures

2012-05-09 Thread Jason Merrill

On 04/29/2012 06:28 AM, Manuel López-Ibáñez wrote:

A new  version using unsigned int for the flag type. It also adds
another use in the C FE.

I am not asking for approval, only whether this
approach/implementation is the way to go.


That looks good.  I would still also adjust the caret printer to avoid 
printing the same caret twice in a row even when we haven't updated the 
caller.


Jason


Re: [C++ Patch] PR 53158

2012-05-09 Thread Manuel López-Ibáñez
On 9 May 2012 14:59, Jason Merrill ja...@redhat.com wrote:
 On 05/09/2012 08:21 AM, Paolo Carlini wrote:

 Is unapplied because I was really nervous due to the wrong location
 (thus caret) of the error call, at the end of the whole condition. Now,
 I'm wondering, shall we consistently use error_at (location_of (expr),
 ... for the error messages produced by the *convert* functions?


 Sure.  Note that as an alternative to error_at (location_of (expr) you can
 just use %q+E; the + means use the location of this argument.

This far less clear than error_at(location, whatever). And it
requires the diagnostics machinery to know about input_location. I
removed %H precisely for those reasons. Please, let's stop using +
in diagnostics and always use explicit locations.

Cheers,

Manuel.


Re: [RFC] improve caret diagnostics for overload failures

2012-05-09 Thread Manuel López-Ibáñez
On 9 May 2012 15:04, Jason Merrill ja...@redhat.com wrote:
 On 04/29/2012 06:28 AM, Manuel López-Ibáńez wrote:

 A new  version using unsigned int for the flag type. It also adds
 another use in the C FE.

 I am not asking for approval, only whether this
 approach/implementation is the way to go.


 That looks good.  I would still also adjust the caret printer to avoid
 printing the same caret twice in a row even when we haven't updated the
 caller.

I could implement that by storing the last location in the
diagnostic_context or using a static location_t in
diagnostic_show_locus. What is your preference?

Cheers,

Manuel.


Re: [C++ Patch] PR 53158

2012-05-09 Thread Paolo Carlini

Hi,

On 05/09/2012 02:59 PM, Jason Merrill wrote:

On 05/09/2012 08:21 AM, Paolo Carlini wrote:

Is unapplied because I was really nervous due to the wrong location
(thus caret) of the error call, at the end of the whole condition. Now,
I'm wondering, shall we consistently use error_at (location_of (expr),
... for the error messages produced by the *convert* functions?


Sure.  Note that as an alternative to error_at (location_of (expr) you 
can just use %q+E; the + means use the location of this argument.
A great, very concise, didn't know that (but maybe we want to make 
Manuel happier ;)

The
below quick fix makes me *much* more happy, the caret points to the
closed round brace of the b() call. Can I trust all the exprs to come
with an embedded location *at least* as accurate as input_location,
normally better?
I would think so, yes.  If an expression doesn't have a location, 
location_of/%q+E will just use input_location.

In the meanwhile I found a counterexample ;) Consider:

enum E { e1 };

E e = e1;

void bar(int a)
{
  if (e = a)
;
}

right now the location for the invalid conversion from ‘int’ to ‘E’ 
error message seems pretty good, points to the a in  if (e = a). If I do:


Index: call.c
===
--- call.c  (revision 187290)
+++ call.c  (working copy)
@@ -5704,7 +5704,7 @@ convert_like_real (conversion *convs, tree expr, t
break;
}

-  permerror (input_location, invalid conversion from %qT to %qT,
+  permerror (location_of (expr), invalid conversion from %qT to %qT,
 TREE_TYPE (expr), totype);
   if (fn)
permerror (DECL_SOURCE_LOCATION (fn),

then an horrible thing happen: the error message points to the b in bar 
in void bar(int a) !!! What the heck has that to do with the actual 
conversion two lines below?!? Isn't even pointing to the a in void 
bar(int a)!


So... I guess I could start experimenting with my idea, but we should be 
ready to fix regressions... Or we can already take from the 
counterexample a general lesson? Or maybe a subset of the conversion* 
functions, those in cvt.c?, seems first blush safer to tweak?


Paolo.


Re: PATCH: Update longlong.h from GLIBC

2012-05-09 Thread H.J. Lu
On Tue, May 8, 2012 at 4:05 AM, Andreas Jaeger a...@suse.com wrote:
 On Tuesday, May 08, 2012 11:59:34 Richard Earnshaw wrote:
 On 08/05/12 10:04, Andreas Jaeger wrote:
  On Tuesday, May 08, 2012 10:43:14 Richard Guenther wrote:
  On Mon, May 7, 2012 at 11:11 PM, H.J. Lu hongjiu...@intel.com wrote:
  Hi,
 
  I am preparing to update GLIBC longlong.h from GCC.  This patch
  updates GCC longlong.h to use a URL instead of an FSF postal address
  and  replace spaces with tab.  OK to install?
 
  Since I'd like to simply copy longlong.h from GCC release branch to
  GLIBC, Is this also OK for 4.7 branch?
 
  Why?  Does it fix anything there?
 
  It makes sharing the file between gcc and glibc easier,
 
  Andreas

 Why should glibc be depending on the GCC release branch?  Sounds like
 the tail wagging the dog.

 Ah, you discuss the release branch ;) Let HJ defend that one.


 Changing this file has quite a high potential for introducing
 regressions.  I don't think we should risk that on the release branch.

 It's only whitespace  IMO. I'm arguing for the trunk to take the change,


I will take whatever I can get.

Ian, is this OK for trunk?

Thanks.


-- 
H.J.


Re: [C++ Patch] PR 53158

2012-05-09 Thread Jason Merrill
I think the problem is you really want to use EXPR_LOC_OR_HERE rather 
than location_of; if the argument happens to be a DECL, location_of will 
give you the declaration location rather than the use location.


Jason



Re: [C++ Patch] PR 53158

2012-05-09 Thread Jason Merrill

On 05/09/2012 09:04 AM, Manuel López-Ibáñez wrote:

This far less clear than error_at(location, whatever). And it
requires the diagnostics machinery to know about input_location. I
removed %H precisely for those reasons. Please, let's stop using +
in diagnostics and always use explicit locations.


OK.

Jason


Re: [RFC] improve caret diagnostics for overload failures

2012-05-09 Thread Jason Merrill

On 05/09/2012 09:07 AM, Manuel López-Ibáñez wrote:

I could implement that by storing the last location in the
diagnostic_context or using a static location_t in
diagnostic_show_locus. What is your preference?


diagnostic_context, I guess.

Jason



Re: PR 40752: -Wconversion generates false warnings for operands not larger than target type

2012-05-09 Thread Paolo Carlini

Hi,

On 05/09/2012 02:54 PM, Manuel López-Ibáñez wrote:

On 7 May 2012 16:52, Paolo Carlinipaolo.carl...@oracle.com  wrote:

Hi,


On 06/11/2010 10:23 PM, Manuel López-Ibáñez wrote:

On 11 August 2009 02:01, Joseph S. Myersjos...@codesourcery.comwrote:

On Tue, 11 Aug 2009, Manuel López-Ibáñez wrote:


Modified the patch to make use of the new c-c++-common testsuite.
Bootstrapped and regression tested on x86_64-linux-gnu.

OK for trunk?

I still think the warnings for these cases are mostly correct (there are
cases where you may be able to make deductions about the range of
possible
values of the expression being converted) and appropriate, and if
disabled
should be disabled under some separate -Wno-conversion-whatever option.

-Wno-conversion-after-promotion ?

I am not sure what would be a good name, but I think it is worth to
allow silencing these warnings, so suggestions are welcome.

it looks like this PR is still open today, and I think resolving it one way
or the other isn't much work... Thus I'm asking: shall we actually have a
new -Wno-conversion-after-promotion? Or, alternately - the issue came up
quite often in recent times, because other widespread compilers don't warn -
suppress the warning only in case of conditional expressions:

char
foo(bool haveBar, char bar_)
{
  return haveBar ? bar_ : 0;
};

What do you think?

I think Joseph has said before that if the range can be deduced to
stay within limits of the target type, then we should never warn, no
new option needed. This is exactly the case if you recurse only on
conditional expression operands and casts, and it will fix the warning
above. The other cases should be deal separately.
Good, I even have a patch essentially ready for that, but can you point 
me to that detailed feedback? I couldn't find it when I looked for it.


Thanks
Paolo.


Re: [C++ Patch] PR 53158

2012-05-09 Thread Paolo Carlini

On 05/09/2012 03:20 PM, Jason Merrill wrote:
I think the problem is you really want to use EXPR_LOC_OR_HERE rather 
than location_of; if the argument happens to be a DECL, location_of 
will give you the declaration location rather than the use location.
Ah! Thanks a lot, now all those small details I always see in the 
diagnostics are becoming much more clear ;) Let's see what I can do...


Paolo.


Re: Fix PR53185 (vectorizer segfault)

2012-05-09 Thread Richard Guenther
On Wed, May 9, 2012 at 2:38 PM, Michael Matz m...@suse.de wrote:
 Hi,

 the current code for strided loads can't deal with the situation when a
 prologue loop (peeling for alignment) is created after analyzing the data
 refs.  There are multiple issues (non-constant steps in DRs mainly), so
 this is a simple stop gap.

 Regtesting on x86_64-linux (all langs) in progress.  Okay for trunk?

Ok.

Thanks,
Richard.


 Ciao,
 Michael.
 
        PR tree-optimization/53185
        * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Disable
        peeling when we see strided loads.

 testsuite/
        * gcc.dg/vect/pr53185.c: New test.

 Index: tree-vect-data-refs.c
 ===
 --- tree-vect-data-refs.c       (revision 187287)
 +++ tree-vect-data-refs.c       (working copy)
 @@ -1507,6 +1507,17 @@ vect_enhance_data_refs_alignment (loop_v
            GROUP_FIRST_ELEMENT (stmt_info) != stmt)
         continue;

 +      /* FORNOW: Any strided load prevents peeling.  The induction
 +         variable analysis will fail when the prologue loop is generated,
 +        and so we can't generate the new base for the pointer.  */
 +      if (STMT_VINFO_STRIDE_LOAD_P (stmt_info))
 +       {
 +         if (vect_print_dump_info (REPORT_DETAILS))
 +           fprintf (vect_dump, strided load prevents peeling);
 +         do_peeling = false;
 +         break;
 +       }
 +
       /* For invariant accesses there is nothing to enhance.  */
       if (integer_zerop (DR_STEP (dr)))
        continue;
 Index: testsuite/gcc.dg/vect/pr53185.c
 ===
 --- testsuite/gcc.dg/vect/pr53185.c     (revision 0)
 +++ testsuite/gcc.dg/vect/pr53185.c     (revision 0)
 @@ -0,0 +1,16 @@
 +/* { dg-do compile } */
 +/* { dg-options -O3 -ftree-vectorize } */
 +unsigned short a, e;
 +int *b, *d;
 +int c;
 +extern int fn2();
 +void fn1 () {
 +  void *f;
 +  for (;;) {
 +    fn2 ();
 +    b = f;
 +    e = 0;
 +    for (; e  a; ++e)
 +      b[e] = d[e * c];
 +  }
 +}


Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources

2012-05-09 Thread Richard Guenther
On Wed, May 9, 2012 at 2:52 PM, Nick Clifton ni...@redhat.com wrote:
 Hi Guys,

    http://www.mpfr.org/mpfr-current/#changes

  The current release of the MPFR library (v3.1.0) has reorganized its
  sources such the mpfr.h header file is now in a sub-directory called
  'src', rather than being at the top level.  This has broken GCC's use
  of in-tree MPFR sources.

  I am asking for permission to apply the patch below to fix the
  problem.  I tested it by building an i686-pc-linux-gnu toolchain on a
  machine with no MPFR libraries installed, but with a copy of the mpfr
  3.1.0 sources installed in-tree.  I also built a second toolchain with
  an in-tree copy of the mpfr 2.4.2 sources, just to make sure that the
  old paths still worked.  Both builds worked.

  OK to apply ?

I think we only support dropping in exactly the versions we provide
in infrastructure/ - which matches the version we require in install.texi.

Or did that change?

 Cheers
  Nick

 gcc/ChangeLog
 2012-05-09  Nick Clifton  ni...@redhat.com

        * configure.ac (mpfr-dir): When using in-tree MPFR sources
        allow for the fact that from release v3.1.0 of MPFR the source
        files were moved into a src sub-directory.
        * configure: Regenerate.

 Index: configure.ac
 ===
 --- configure.ac        (revision 187320)
 +++ configure.ac        (working copy)
 @@ -1289,9 +1289,16 @@
   gmplibs=-L$with_mpfr_lib $gmplibs
  fi
  if test x$with_mpfr$with_mpfr_include$with_mpfr_lib = x  test -d 
 ${srcdir}/mpfr; then
 -  gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir $gmplibs
 -  gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '$gmpinc
 -  extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr 
 --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir
 +  # MPFR v3.1.0 moved the sources into a src sub-directory.
 +  if test -d ${srcdir}/mpfr/src; then
 +    gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/src/'$lt_cv_objdir $gmplibs
 +    gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr/src -I$$s/mpfr/src '$gmpinc
 +    extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr/src 
 --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/src/'$lt_cv_objdir
 +  else
 +    gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir $gmplibs
 +    gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '$gmpinc
 +    extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr 
 --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir
 +  fi
   # Do not test the mpfr version.  Assume that it is sufficient, since
   # it is in the source tree, and the library has not been built yet
   # but it would be included on the link line in the version check below


Re: [PATCH] gcc/config/freebsd-spec.h: Fix building PIE executables. Link them with crt{begin,end}S.o and Scrt1.o which are PIC instead of crt{begin,end}.o and crt1.o which are not. Spec synced from g

2012-05-09 Thread Alexis Ballier
Hmm, sorry, it seems I forgot to look at MAINTAINERS and CC him...

On Tue,  8 May 2012 09:53:43 -0400
Alexis Ballier aball...@gentoo.org wrote:

 gcc/config/i386/freebsd.h: Likewise.
 ---
  gcc/config/freebsd-spec.h |9 +++--
  gcc/config/i386/freebsd.h |9 +++--
  2 files changed, 6 insertions(+), 12 deletions(-)
 
 diff --git a/gcc/config/freebsd-spec.h b/gcc/config/freebsd-spec.h
 index 770a3d1..2808582 100644
 --- a/gcc/config/freebsd-spec.h
 +++ b/gcc/config/freebsd-spec.h
 @@ -64,11 +64,8 @@ see the files COPYING3 and COPYING.RUNTIME
 respectively.  If not, see before entering `main'.  */
 
  #define FBSD_STARTFILE_SPEC \
 -  %{!shared: \
 - %{pg:gcrt1.o%s} %{!pg:%{p:gcrt1.o%s} \
 -%{!p:%{profile:gcrt1.o%s} \
 -  %{!profile:crt1.o%s \
 -   crti.o%s %{!shared:crtbegin.o%s} %{shared:crtbeginS.o%s}
 +  %{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}} \
 +   crti.o%s %{shared|pie:crtbeginS.o%s;:crtbegin.o%s}
  
  /* Provide a ENDFILE_SPEC appropriate for FreeBSD.  Here we tack on
 the magical crtend.o file (see crtstuff.c) which provides part of 
 @@ -77,7 +74,7 @@ see the files COPYING3 and COPYING.RUNTIME
 respectively.  If not, see `crtn.o'.  */
  
  #define FBSD_ENDFILE_SPEC \
 -  %{!shared:crtend.o%s} %{shared:crtendS.o%s} crtn.o%s
 +  %{shared|pie:crtendS.o%s;:crtend.o%s} crtn.o%s
  
  /* Provide a LIB_SPEC appropriate for FreeBSD as configured and as
 required by the user-land thread model.  Before __FreeBSD_version
 diff --git a/gcc/config/i386/freebsd.h b/gcc/config/i386/freebsd.h
 index 649274d..dd69e43 100644
 --- a/gcc/config/i386/freebsd.h
 +++ b/gcc/config/i386/freebsd.h
 @@ -67,11 +67,8 @@ along with GCC; see the file COPYING3.  If not see
 
  #undef   STARTFILE_SPEC
  #define STARTFILE_SPEC \
 -  %{!shared: \
 - %{pg:gcrt1.o%s} %{!pg:%{p:gcrt1.o%s} \
 -%{!p:%{profile:gcrt1.o%s} \
 -  %{!profile:crt1.o%s \
 -   crti.o%s %{!shared:crtbegin.o%s} %{shared:crtbeginS.o%s}
 +  %{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}} \
 +   crti.o%s %{shared|pie:crtbeginS.o%s;:crtbegin.o%s}
  
  /* Provide a ENDFILE_SPEC appropriate for FreeBSD.  Here we tack on
 the magical crtend.o file (see crtstuff.c) which provides part of 
 @@ -81,7 +78,7 @@ along with GCC; see the file COPYING3.  If not see
  
  #undef   ENDFILE_SPEC
  #define ENDFILE_SPEC \
 -  %{!shared:crtend.o%s} %{shared:crtendS.o%s} crtn.o%s
 +  %{shared|pie:crtendS.o%s;:crtend.o%s} crtn.o%s
  
  /* Provide a LINK_SPEC appropriate for FreeBSD.  Here we provide
 support for the special GCC options -static and -shared, which allow
 us to



Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources

2012-05-09 Thread Richard Guenther
On Wed, May 9, 2012 at 3:26 PM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Wed, May 9, 2012 at 2:52 PM, Nick Clifton ni...@redhat.com wrote:
 Hi Guys,

    http://www.mpfr.org/mpfr-current/#changes

  The current release of the MPFR library (v3.1.0) has reorganized its
  sources such the mpfr.h header file is now in a sub-directory called
  'src', rather than being at the top level.  This has broken GCC's use
  of in-tree MPFR sources.

  I am asking for permission to apply the patch below to fix the
  problem.  I tested it by building an i686-pc-linux-gnu toolchain on a
  machine with no MPFR libraries installed, but with a copy of the mpfr
  3.1.0 sources installed in-tree.  I also built a second toolchain with
  an in-tree copy of the mpfr 2.4.2 sources, just to make sure that the
  old paths still worked.  Both builds worked.

  OK to apply ?

 I think we only support dropping in exactly the versions we provide
 in infrastructure/ - which matches the version we require in install.texi.

 Or did that change?

Btw, it would probably be better to make the drop-in compiles doing
a staged install during build instead of using the build tree for use.

 Cheers
  Nick

 gcc/ChangeLog
 2012-05-09  Nick Clifton  ni...@redhat.com

        * configure.ac (mpfr-dir): When using in-tree MPFR sources
        allow for the fact that from release v3.1.0 of MPFR the source
        files were moved into a src sub-directory.
        * configure: Regenerate.

 Index: configure.ac
 ===
 --- configure.ac        (revision 187320)
 +++ configure.ac        (working copy)
 @@ -1289,9 +1289,16 @@
   gmplibs=-L$with_mpfr_lib $gmplibs
  fi
  if test x$with_mpfr$with_mpfr_include$with_mpfr_lib = x  test -d 
 ${srcdir}/mpfr; then
 -  gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir $gmplibs
 -  gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '$gmpinc
 -  extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr 
 --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir
 +  # MPFR v3.1.0 moved the sources into a src sub-directory.
 +  if test -d ${srcdir}/mpfr/src; then
 +    gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/src/'$lt_cv_objdir $gmplibs
 +    gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr/src -I$$s/mpfr/src '$gmpinc
 +    extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr/src 
 --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/src/'$lt_cv_objdir
 +  else
 +    gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir $gmplibs
 +    gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '$gmpinc
 +    extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr 
 --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'$lt_cv_objdir
 +  fi
   # Do not test the mpfr version.  Assume that it is sufficient, since
   # it is in the source tree, and the library has not been built yet
   # but it would be included on the link line in the version check below


Re: PR 53249: Multiple address modes for same address space

2012-05-09 Thread H.J. Lu
On Tue, May 8, 2012 at 4:05 PM, Richard Henderson r...@redhat.com wrote:
 On 05/06/2012 11:41 AM, Richard Sandiford wrote:

        PR middle-end/53249
        * dwarf2out.h (get_address_mode): Move declaration to...
        * rtl.h: ...here.
        * dwarf2out.c (get_address_mode): Move definition to...
        * rtlanal.c: ...here.
        * var-tracking.c (get_address_mode): Delete.
        * combine.c (find_split_point): Use get_address_mode instead of
        targetm.addr_space.address_mode.
        * cselib.c (cselib_record_sets): Likewise.
        * dse.c (canon_address, record_store): Likewise.
        * emit-rtl.c (adjust_address_1, offset_address): Likewise.
        * expr.c (move_by_pieces, emit_block_move_via_loop,
 store_by_pieces)
        (store_by_pieces_1, expand_assignment, store_expr,
 store_constructor)
        (expand_expr_real_1): Likewise.
        * ifcvt.c (noce_try_cmove_arith): Likewise.
        * optabs.c (maybe_legitimize_operand_same_code): Likewise.
        * reload.c (find_reloads): Likewise.
        * sched-deps.c (sched_analyze_1, sched_analyze_2): Likewise.
        * sel-sched-dump.c (debug_mem_addr_value): Likewise.


 ok.


 r~

I checked in this testcase.

Thanks.

-- 
H.J.
--
2012-05-09  H.J. Lu  hongjiu...@intel.com

PR middle-end/53249
* gcc.target/i386/pr53249.c: New.

diff --git a/gcc/testsuite/gcc.target/i386/pr53249.c
b/gcc/testsuite/gcc.target/i386/pr53249.c
new file mode 100644
index 000..9eab8bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr53249.c
@@ -0,0 +1,25 @@
+/* { dg-do compile { target { ! { ia32 } } } } */
+/* { dg-options -O2 -mx32 -ftls-model=initial-exec -maddress-mode=short } */
+
+struct gomp_task
+{
+  struct gomp_task *parent;
+};
+
+struct gomp_thread
+{
+  int foo1;
+  struct gomp_task *task;
+};
+
+extern __thread struct gomp_thread gomp_tls_data;
+
+void
+__attribute__ ((noinline))
+gomp_end_task (void)
+{
+  struct gomp_thread *thr = gomp_tls_data;
+  struct gomp_task *task = thr-task;
+
+  thr-task = task-parent;
+}


Re: [patch] support for multiarch systems

2012-05-09 Thread Paolo Bonzini
Il 09/05/2012 02:38, Matthias Klose ha scritto:
 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi   (revision 187271)
 +++ gcc/doc/invoke.texi   (working copy)
 @@ -6110,6 +6110,11 @@
  @file{../lib32}, or if OS libraries are present in @file{lib/@var{subdir}}
  subdirectories it prints e.g.@: @file{amd64}, @file{sparcv9} or @file{ev6}.
  
 +@item -print-multiarch
 +@opindex print-multiarch
 +Print the path to OS libraries for the selected multiarch,
 +relative to some @file{lib} subdirectory.
 +
  @item -print-prog-name=@var{program}
  @opindex print-prog-name
  Like @option{-print-file-name}, but searches for a program such as 
 @samp{cpp}.

So -print-multiarch is like --print-multi-os-directory?  What is the
difference, and where is it docuemnted?  Should it fail if multiarch
support is not compiled in?

Paolo


Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources

2012-05-09 Thread Marc Glisse

On Wed, 9 May 2012, Nick Clifton wrote:


Hi Guys,

   http://www.mpfr.org/mpfr-current/#changes

 The current release of the MPFR library (v3.1.0) has reorganized its
 sources such the mpfr.h header file is now in a sub-directory called
 'src', rather than being at the top level.  This has broken GCC's use
 of in-tree MPFR sources.

 I am asking for permission to apply the patch below to fix the
 problem.  I tested it by building an i686-pc-linux-gnu toolchain on a
 machine with no MPFR libraries installed, but with a copy of the mpfr
 3.1.0 sources installed in-tree.  I also built a second toolchain with
 an in-tree copy of the mpfr 2.4.2 sources, just to make sure that the
 old paths still worked.  Both builds worked.


Note that this is PR50461, aka PR51935, which includes a patch which was 
submitted: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01505.html


(your patch is fine as well, from my non-reviewer point of view)


On Wed, 9 May 2012, Richard Guenther wrote:


I think we only support dropping in exactly the versions we provide
in infrastructure/ - which matches the version we require in install.texi.

Or did that change?


The documentation just says 2.4.2 (or later). Even if it wasn't supported, 
why not take the patch? And it is a first step to changing the version in 
infrastructure, which will eventually have to happen.


--
Marc Glisse


Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources

2012-05-09 Thread Paolo Bonzini
Il 09/05/2012 14:52, Nick Clifton ha scritto:
 Hi Guys,
 
 http://www.mpfr.org/mpfr-current/#changes
 
   The current release of the MPFR library (v3.1.0) has reorganized its
   sources such the mpfr.h header file is now in a sub-directory called
   'src', rather than being at the top level.  This has broken GCC's use
   of in-tree MPFR sources.
   
   I am asking for permission to apply the patch below to fix the
   problem.  I tested it by building an i686-pc-linux-gnu toolchain on a
   machine with no MPFR libraries installed, but with a copy of the mpfr
   3.1.0 sources installed in-tree.  I also built a second toolchain with
   an in-tree copy of the mpfr 2.4.2 sources, just to make sure that the
   old paths still worked.  Both builds worked.
 
   OK to apply ?

Ok.  I think it's nicer for the users if we enable both builds to work.

Paolo



Re: PR 40752: -Wconversion generates false warnings for operands not larger than target type

2012-05-09 Thread Manuel López-Ibáñez
On 9 May 2012 15:21, Paolo Carlini paolo.carl...@oracle.com wrote:
 Good, I even have a patch essentially ready for that, but can you point me
 to that detailed feedback? I couldn't find it when I looked for it.

This is how I understood the following paragraph:

 I still think the warnings for these cases are mostly correct (there
 are
 cases where you may be able to make deductions about the range of
 possible
 values of the expression being converted) and appropriate, and if
 disabled
 should be disabled under some separate -Wno-conversion-whatever option.

That is,

cases where you not are able to make a deduction about ranges = the
warnings are appropriate and should be disabled only under a separate
option.

cases where you are able to deduce the ranges = warning is not appropriate

But I may have misinterpreted it.

Cheers,

Manuel.



 Thanks
 Paolo.


Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources

2012-05-09 Thread Paolo Bonzini
Il 09/05/2012 15:31, Richard Guenther ha scritto:
 Btw, it would probably be better to make the drop-in compiles doing
 a staged install during build instead of using the build tree for use.

Same for GCC while building target library, but not really easy to do...

GMP/MPFR are compiled without shared libraries which makes it simpler,
but it would also be a bit ad hoc.

Paolo


Re: PR 40752: -Wconversion generates false warnings for operands not larger than target type

2012-05-09 Thread Joseph S. Myers
On Wed, 9 May 2012, Manuel López-Ibáñez wrote:

 That is,
 
 cases where you not are able to make a deduction about ranges = the
 warnings are appropriate and should be disabled only under a separate
 option.
 
 cases where you are able to deduce the ranges = warning is not appropriate

Yes, that's correct.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources

2012-05-09 Thread Richard Guenther
On Wed, May 9, 2012 at 3:40 PM, Paolo Bonzini bonz...@gnu.org wrote:
 Il 09/05/2012 15:31, Richard Guenther ha scritto:
 Btw, it would probably be better to make the drop-in compiles doing
 a staged install during build instead of using the build tree for use.

 Same for GCC while building target library, but not really easy to do...

 GMP/MPFR are compiled without shared libraries which makes it simpler,
 but it would also be a bit ad hoc.

Sure - but support for dropping in support libraries into the bootstrap tree
is ad-hoc, too ...

 Paolo


Re: PR 40752: -Wconversion generates false warnings for operands not larger than target type

2012-05-09 Thread Paolo Carlini

On 05/09/2012 03:54 PM, Joseph S. Myers wrote:

On Wed, 9 May 2012, Manuel López-Ibáñez wrote:


That is,

cases where you not are able to make a deduction about ranges =  the
warnings are appropriate and should be disabled only under a separate
option.

cases where you are able to deduce the ranges =  warning is not appropriate

Yes, that's correct.
Ah, great, thanks. Then I should have a patchlet almost ready, simply 
tweaking the conditional expression case with get_unwidened (op1, 0), 
likewise op2, and recursion.


Paolo.


Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources

2012-05-09 Thread Paolo Bonzini
Il 09/05/2012 15:57, Richard Guenther ha scritto:
  Btw, it would probably be better to make the drop-in compiles doing
  a staged install during build instead of using the build tree for use.
 
  Same for GCC while building target library, but not really easy to do...
 
  GMP/MPFR are compiled without shared libraries which makes it simpler,
  but it would also be a bit ad hoc.
 
 Sure - but support for dropping in support libraries into the bootstrap tree
 is ad-hoc, too ...

Yes, but pretty much confined in configure.ac.  What I'm wary of is
adding ad-hoc stuff to the Makefile, which is already enough of a mess...

Paolo


Symbol table 20/many: cleanup of cgraph_remove_unreachable_nodes

2012-05-09 Thread Jan Hubicka
Hi,
this patch reworks cgraph_remove_unreachable_nodes that has grown from somewhat
hackish but simple marksweep routine into quite unmaintainable mess.  It is
because it needs to handle several special cases - extern inline functions,
virtual functions, clones and inline clone tree reshaping and these features
was not added in particularly nice way and their handling changed several
times. 

Things bacome even more messy when varpool walking was added with ipa-ref
infrastructure in 4.6 timeframe.

This patch rewrites the function to be similar to what lto-cgraph does.  I.e.
compute reachable symbols and the boundary.  I also made some fixes and made
the function to release function bodies that are no longer neccesary more
aggressively.

This, for about 10th time, made me to run into problems with OMP lowering.
OMP create declarations of the artificial functions early, but fills them with
bodies only much later.  It keeps the functions incomplette for a while,
with STRUCT_FUNCTION initialized but left empty.

This patch simply makes it to initialize those structures when they are
needed.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

* cgraph.h (cgraph_remove_unreachable_nodes): Rename to ...
(symtab_remove_unreachable_nodes): ... this one.
* ipa-cp.c (ipcp_driver): Do not remove unreachable nodes.
* omp-low.c (create_omp_child_function): Do not initialize
DECL_INITIAL and STRUCT_FUNCTION.
(remove_exit_barrier): Do not try to walk not created yet
function bodies.
(expand_omp_taskreg, lower_omp_taskreg): Initialize
the STRUCT_FUNCTION and DECL_INITIAL here.
* cgraphunit.c (ipa_passes): Update.
* cgraphclones.c (cgraph_materialize_all_clones): Update.
* ipa-inline.c (ipa_inline): Update.
* ipa.c: Include ipa-inline.h
(enqueue_cgraph_node, enqueue_varpool_node): Remove.
(enqueue_node): New function.
(process_references): Update.
(symtab_remove_unreachable_nodes): Cleanup.
* passes.c (execute_todo, execute_one_pass): Update.
Index: cgraph.h
===
--- cgraph.h(revision 187335)
+++ cgraph.h(working copy)
@@ -637,7 +637,7 @@ int compute_call_stmt_bb_frequency (tree
 void record_references_in_initializer (tree, bool);
 
 /* In ipa.c  */
-bool cgraph_remove_unreachable_nodes (bool, FILE *);
+bool symtab_remove_unreachable_nodes (bool, FILE *);
 cgraph_node_set cgraph_node_set_new (void);
 cgraph_node_set_iterator cgraph_node_set_find (cgraph_node_set,
   struct cgraph_node *);
Index: ipa-cp.c
===
--- ipa-cp.c(revision 187335)
+++ ipa-cp.c(working copy)
@@ -2445,7 +2445,6 @@ ipcp_driver (void)
   struct cgraph_2edge_hook_list *edge_duplication_hook_holder;
   struct topo_info topo;
 
-  cgraph_remove_unreachable_nodes (true,dump_file);
   ipa_check_create_node_params ();
   ipa_check_create_edge_args ();
   grow_next_edge_clone_vector ();
Index: omp-low.c
===
--- omp-low.c   (revision 187335)
+++ omp-low.c   (working copy)
@@ -1572,7 +1572,6 @@ create_omp_child_function (omp_context *
   DECL_UNINLINABLE (decl) = 1;
   DECL_EXTERNAL (decl) = 0;
   DECL_CONTEXT (decl) = NULL_TREE;
-  DECL_INITIAL (decl) = make_node (BLOCK);
 
   t = build_decl (DECL_SOURCE_LOCATION (decl),
  RESULT_DECL, NULL_TREE, void_type_node);
@@ -1605,13 +1604,6 @@ create_omp_child_function (omp_context *
   DECL_CHAIN (t) = DECL_ARGUMENTS (decl);
   DECL_ARGUMENTS (decl) = t;
 }
-
-  /* Allocate memory for the function structure.  The call to
- allocate_struct_function clobbers CFUN, so we need to restore
- it afterward.  */
-  push_struct_function (decl);
-  cfun-function_end_locus = gimple_location (ctx-stmt);
-  pop_cfun ();
 }
 
 
@@ -3241,28 +3233,31 @@ remove_exit_barrier (struct omp_region *
  unsigned ix;
 
  any_addressable_vars = 0;
- FOR_EACH_LOCAL_DECL (DECL_STRUCT_FUNCTION (child_fun), ix, decl)
-   if (TREE_ADDRESSABLE (decl))
- {
-   any_addressable_vars = 1;
-   break;
- }
- for (block = gimple_block (stmt);
-  !any_addressable_vars
-   block
-   TREE_CODE (block) == BLOCK;
-  block = BLOCK_SUPERCONTEXT (block))
+ if (DECL_STRUCT_FUNCTION (child_fun))
{
- for (local_decls = BLOCK_VARS (block);
-  local_decls;
-  local_decls = DECL_CHAIN (local_decls))
-   if (TREE_ADDRESSABLE (local_decls))
+ FOR_EACH_LOCAL_DECL (DECL_STRUCT_FUNCTION (child_fun), ix, 
decl)
+   if 

[google] Hide all uses of __float128 from Clang (issue6195066)

2012-05-09 Thread Simon Baldwin
Hide all uses of __float128 from Clang.

Brackets _GLIBCXX_USE_FLOAT128 with #ifndef __clang__.  Clang does not
currently support the __float128 builtin, and so will fail to process
libstdc++ headers that use it.

Tested for full bootstrap and dejagnu testsuite.

Okay for google/integration and google/gcc-4_7-integration branches?

Thanks.


2012-05-09   Simon Baldwin  sim...@google.com

* libstdc++-v3/acinclude.m4: Bracket _GLIBCXX_USE_FLOAT128
definition with ifndef __clang__.
* libstdc++-v3/config.h.in: Rebuild.

Index: libstdc++-v3/config.h.in
===
--- libstdc++-v3/config.h.in(revision 187148)
+++ libstdc++-v3/config.h.in(working copy)
@@ -799,8 +799,11 @@
this host. */
 #undef _GLIBCXX_USE_DECIMAL_FLOAT
 
-/* Define if __float128 is supported on this host. */
+/* Define if __float128 is supported on this host.
+   Hide all uses of __float128 from Clang.  Google ref b/6422845  */
+#ifndef __clang__
 #undef _GLIBCXX_USE_FLOAT128
+#endif
 
 /* Defined if gettimeofday is available. */
 #undef _GLIBCXX_USE_GETTIMEOFDAY
Index: libstdc++-v3/acinclude.m4
===
--- libstdc++-v3/acinclude.m4   (revision 187148)
+++ libstdc++-v3/acinclude.m4   (working copy)
@@ -2529,10 +2529,16 @@ int main()
 }
 EOF
 
+AH_VERBATIM([_GLIBCXX_USE_FLOAT128,],
+[/* Define if __float128 is supported on this host.
+   Hide all uses of __float128 from Clang.  Google ref b/6422845  */
+#ifndef __clang__
+#undef _GLIBCXX_USE_FLOAT128
+#endif])
+
 AC_MSG_CHECKING([for __float128])
 if AC_TRY_EVAL(ac_compile); then
-  AC_DEFINE(_GLIBCXX_USE_FLOAT128, 1,
-  [Define if __float128 is supported on this host.])
+  AC_DEFINE(_GLIBCXX_USE_FLOAT128, 1)
   enable_float128=yes
 else
   enable_float128=no

--
This patch is available for review at http://codereview.appspot.com/6195066


Re: [google] Hide all uses of __float128 from Clang (issue6195066)

2012-05-09 Thread Diego Novillo

On 12-05-09 08:19 , Simon Baldwin wrote:


2012-05-09   Simon Baldwinsim...@google.com

* libstdc++-v3/acinclude.m4: Bracket _GLIBCXX_USE_FLOAT128
definition with ifndef __clang__.
* libstdc++-v3/config.h.in: Rebuild.


OK.


Diego.


Re: [google] Hide all uses of __float128 from Clang (issue6195066)

2012-05-09 Thread Ollie Wild
On Wed, May 9, 2012 at 8:19 AM, Simon Baldwin sim...@google.com wrote:

 Hide all uses of __float128 from Clang.

 Brackets _GLIBCXX_USE_FLOAT128 with #ifndef __clang__.  Clang does not
 currently support the __float128 builtin, and so will fail to process
 libstdc++ headers that use it.

 Tested for full bootstrap and dejagnu testsuite.

 Okay for google/integration and google/gcc-4_7-integration branches?

Please check this into google/gcc-4_7 as well.

Ollie


[PATCH] gcc_update explicit use of svn

2012-05-09 Thread David Edelsohn
contrib/gcc_update currently uses svn explicitly to determine the
Revision and Branch instead of the $GCC_SVN shell variable used in
other instances in the script. This patch uses the shell variable in
all instances.

Thanks, David

* gcc_update: Use $GCC_SVN to retrieve branch and revision.

Index: gcc_update
===
--- gcc_update  (revision 187329)
+++ gcc_update  (working copy)
@@ -372,8 +372,8 @@
exit 1
fi

-   revision=`svn info | awk '/Revision:/ { print $2 }'`
-   branch=`svn info | sed -ne /URL:/ {
+   revision=`$GCC_SVN info | awk '/Revision:/ { print $2 }'`
+   branch=`$GCC_SVN info | sed -ne /URL:/ {
s,.*/trunk,trunk,
s,.*/branches/,,
s,.*/tags/,,


Re: [patch] support for multiarch systems

2012-05-09 Thread Matthias Klose
On 09.05.2012 15:37, Paolo Bonzini wrote:
 Il 09/05/2012 02:38, Matthias Klose ha scritto:
 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi  (revision 187271)
 +++ gcc/doc/invoke.texi  (working copy)
 @@ -6110,6 +6110,11 @@
  @file{../lib32}, or if OS libraries are present in @file{lib/@var{subdir}}
  subdirectories it prints e.g.@: @file{amd64}, @file{sparcv9} or @file{ev6}.
  
 +@item -print-multiarch
 +@opindex print-multiarch
 +Print the path to OS libraries for the selected multiarch,
 +relative to some @file{lib} subdirectory.
 +
  @item -print-prog-name=@var{program}
  @opindex print-prog-name
  Like @option{-print-file-name}, but searches for a program such as 
 @samp{cpp}.
 
 So -print-multiarch is like --print-multi-os-directory?

the former prints the part before the `:' in the MULTILIB_OSDIRNAMES, the latter
the part after the `':', e.g. ../lib32 and i386-linux-gnu.

 What is the difference, and where is it documented?

Not sure how it should be further documented.

 Should it fail if multiarch support is not compiled in?

All the -print options always succeed. I would prefer if it just prints the
empty string if it is not configured (as it does now).

  Matthias


Re: [Patch,AVR,trunk,4.7]: Implement PR53256

2012-05-09 Thread Denis Chertykov
2012/5/7 Georg-Johann Lay a...@gjlay.de:
 AVR-LibC switched from using either signal /or/ interrupt function
 attribute to using both at the same time.

 This was never documented or implemented but worked accidentally for
 some time, but results in wrong code for 4.7+

 This patch adds better documentation of these attributes and makes
 'interrupt' silently override 'signal'.

 Besides that, some more sanity checking is done for function attributes.

 ASM_DECLARE_FUNCTION_NAME just served to check isr names.
 All the checking is done in the new hook TARGET_SET_CURRENT_FUNCTION
 now so that ASM_DECLARE_FUNCTION_NAME from defaults.h can be used,
 thus some clean-up in elf.h

 Ok for trunk and 4.7?

 Johann

        PR target/53256
        * config/avr/elf.h (ASM_DECLARE_FUNCTION_NAME): Remove.
        * config/avr/avr-protos.h (avr_asm_declare_function_name): Remove.
        * config/avr/avr.h (struct machine_function): Add attributes_checked_p.
        * config/avr/avr.c (avr_asm_declare_function_name): Remove.
        (expand_prologue): Move initialization of cfun-machine-is_naked,
        is_interrupt, is_signal, is_OS_task, is_OS_main from here to...
        (avr_regs_to_save): Ditto.
        (avr_set_current_function): ...this new static function.
        (TARGET_SET_CURRENT_FUNCTION): New define.
        (avr_function_ok_for_sibcall): Use cfun-machine-is_* instead of
        checking attributes of current_function_decl.
        (signal_function_p): Rename to avr_signal_function_p.
        (interrupt_function_p): Rename to avr_interrupt_function_p.

        * doc/extend.texi (Function Attributes): Better explanation of
        'interrupt' and 'signal for AVR. Move 'ifunc' down for
        alphabetical order.


Approved.

Denis.


Re: Optimize calls to functions that return one of their arguments

2012-05-09 Thread Vladimir Makarov

On 04/28/2012 11:31 AM, Bernd Schmidt wrote:
This patch allows us to recognize that even if the argument to memcpy 
lives across the call, we can allocate it to a call-used register by 
reusing the return value of the function.


First, the patch sets the existing fn spec attribute for 
memcpy/memmove. This is translated to a new form of 
CALL_INSN_FUNCTION_USAGE, a (set (returnreg) (argreg)). This is 
recognized by IRA to adjust costs, and for communicating to 
caller-save that the register can be restored cheaply.


The optimization only triggers if the argument is passed in a 
register, which should be the case in the majority of sane ABIs. The 
effect on the new testcase:


pushq%rbx  |subq$8, %rsp
movslq%edx, %rdxmovslq%edx, %rdx
movq%rdi, %rbx 
callmemcpycallmemcpy
movq%rbx, %rax  |addq$8, %rsp
popq%rbx 
retret


Bernd, sorry for some delay.  I needed to think about the patch.

It is pretty interesting and original idea.  My only major objection to 
the patch is about treatment of ALLOCNO_CHEAP_CALLS_CROSSED_NUM.  I 
think it should be accumulated as ALLOCNO_CALLS_CROSSED_NUM. Otherwise, 
I am afraid you will have a degradation in many cases instead of 
improvement.


IRA is a regional allocator.  The first it tries to do coloring in whole 
function (seeing a whole picture), then it tries to improve allocation 
in subregions.  When you treat ALLOCNO_CHEAP_CALLS_CROSSED_NUM not 
accumulated (it means not taking subregions into account) you mislead 
allocation in the region containing subregions.


For example, if the single call is in a subregion, 
ALLOCNO_CHEAP_CALLS_CROSSED_NUM for the subregion allocno will be 1 but 
in whole program allocno will be 0.  RA in whole function will tend to 
allocate callee-saved hard register and RA in the subregion will tend to 
allocate caller-saved hard register.  That will increase a possibility 
to create additional shuffle insns on the subregion borders and as 
consequence will degrade the code.


I don't expect that this micro-optimization improves SPEC2000, but it 
will improve some tests.  So it is good to have it.  It would be really 
interesting to see the optimization impact on SPEC2000.  I think I'll do 
it for myself in a week.


So IRA part of the patch is ok for me if you modify treatment of 
ALLOCNO_CHEAP_CALLS_CROSSED_NUM as it is done for 
ALLOCNO_CALLS_CROSSED_NUM (when upper region allocnos accumulate the 
values from the corresponding allocnos from its sub-regions).




Re: [google] Hide all uses of __float128 from Clang (issue6195066)

2012-05-09 Thread Duncan Sands

Hi Simon,


Hide all uses of __float128 from Clang.

Brackets _GLIBCXX_USE_FLOAT128 with #ifndef __clang__.  Clang does not
currently support the __float128 builtin, and so will fail to process
libstdc++ headers that use it.


if one day clang gets support for this type, won't this still turn everything
off?  Is it possible to test the compiler on some small program using
__float128, and turn off use of __float128 if the compiler barfs?

Ciao, Duncan.


Re: [google] Hide all uses of __float128 from Clang (issue6195066)

2012-05-09 Thread Steven Bosscher
On Wed, May 9, 2012 at 6:08 PM, Duncan Sands baldr...@free.fr wrote:
 Hi Simon,

 Hide all uses of __float128 from Clang.

 Brackets _GLIBCXX_USE_FLOAT128 with #ifndef __clang__.  Clang does not
 currently support the __float128 builtin, and so will fail to process
 libstdc++ headers that use it.


 if one day clang gets support for this type, won't this still turn
 everything
 off?  Is it possible to test the compiler on some small program using
 __float128, and turn off use of __float128 if the compiler barfs?

Does it matter? This is for Google-internal branches only.

Ciao!
Steven


Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources

2012-05-09 Thread nick clifton

Hi Guys,


Ok.  I think it's nicer for the users if we enable both builds to work.


Thanks.  I have applied the patch, along with a reference to PR 50461 
and a credit to Paul Smith in the ChangeLog entry since he came up with 
basically the same patch as mine.


Cheers
  Nick



Re: [patch] support for multiarch systems

2012-05-09 Thread Paolo Bonzini
Il 09/05/2012 17:34, Matthias Klose ha scritto:
  So -print-multiarch is like --print-multi-os-directory?
 the former prints the part before the `:' in the MULTILIB_OSDIRNAMES, the 
 latter
 the part after the `':', e.g. ../lib32 and i386-linux-gnu.

Yes, of course.

  What is the difference, and where is it documented?
 Not sure how it should be further documented.

No idea, it is a new concept and people need to understand how it
relates to multilibbing for example, what shortcomings are addressed etc.

I went through the Debian pages (only cursorily, I admit) and I found
nothing of this.

Another question I have is related to usage of the option.  Are you
supposed to look for libraries in the multilib directories too if the
compiler is multiarch-enabled?  Or only in /lib/i386-linux-gnu?  Which
one takes priority, multiarch or multiosdir?

From the patch I can guess the intended search path is
/lib/MULTIARCH:/lib/MULTIOSDIR, but I'm not entirely sure about that and
it needs documentation.

  Should it fail if multiarch support is not compiled in?
 All the -print options always succeed. I would prefer if it just prints the
 empty string if it is not configured (as it does now).

Will the empty string be a valid output for a multiarch-enabled
compiler?  For example gcc --print-multi-os-directory and gcc
--print-multi-directory on a bi-arch x86-64 compiler will never print
the empty string.  Again, I guess the answer is no but I'm not sure.

If the answer is no, returning the empty string is fine.

If the answer is yes, and assuming the search path is
/lib/MULTIARCH:/lib/MULTIOSDIR, then programs need to know whether they
need to omit the /lib/MULTIARCH component of the search path.  A failure
status code is then required.

Paolo


Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-09 Thread Xinliang David Li
On Wed, May 9, 2012 at 1:12 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li davi...@google.com wrote:
 To be clear, this flag is for malloc implementation (such as tcmalloc)
 with side effect unknown to the compiler. Using -fno-builtin-xxx is
 too conservative for that purpose.

 I don't think that flys.  Btw, the patch also guards alloca - alloca is purely
 GCC internal.

 What's the unknown side-effects that are also important to preserve
 for free(malloc(4))?


The side effect is the user registered malloc hooks.

The pattern 'free(malloc(4)', or loops like

for (..)
 {
malloc(..); // result not used
 }

itself is not interesting. They only appear in the test code and the
problem can be worked around by using -fno-builtin-malloc. The option
is to avoid disable this and all similar malloc optimizations (in the
future) for malloc implementation with hooks.

Thanks,

David

 Richard.

 David

 On Tue, May 8, 2012 at 7:43 AM, Dehao Chen de...@google.com wrote:
 Hello,

 This patch adds a flag to guard the optimization that optimize the
 following code away:

 free (malloc (4));

 In some cases, we'd like this type of malloc/free pairs to remain in
 the optimized code.

 Tested with bootstrap, and no regression in the gcc testsuite.

 Is it ok for mainline?

 Thanks,
 Dehao

 gcc/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * common.opt (feliminate-malloc): New.
        * doc/invoke.texi: Document it.
        * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.

 gcc/testsuite/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
        as expected.

 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 187277)
 +++ gcc/doc/invoke.texi (working copy)
 @@ -360,7 +360,8 @@
  -fcx-limited-range @gol
  -fdata-sections -fdce -fdelayed-branch @gol
  -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
 --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol
 +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations 
 @gol
 +-ffat-lto-objects @gol
  -ffast-math -ffinite-math-only -ffloat-store
 -fexcess-precision=@var{style} @gol
  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 @@ -6238,6 +6239,7 @@
  -fdefer-pop @gol
  -fdelayed-branch @gol
  -fdse @gol
 +-feliminate-malloc @gol
  -fguess-branch-probability @gol
  -fif-conversion2 @gol
  -fif-conversion @gol
 @@ -6762,6 +6764,11 @@
  Perform dead store elimination (DSE) on RTL@.
  Enabled by default at @option{-O} and higher.

 +@item -feliminate-malloc
 +@opindex feliminate-malloc
 +Eliminate unnecessary malloc/free pairs.
 +Enabled by default at @option{-O} and higher.
 +
  @item -fif-conversion
  @opindex fif-conversion
  Attempt to transform conditional jumps into branch-less equivalents.  This
 Index: gcc/testsuite/gcc.dg/free-malloc.c
 ===
 --- gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 @@ -0,0 +1,12 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fno-eliminate-malloc } */
 +/* { dg-final { scan-assembler-times malloc 2} } */
 +/* { dg-final { scan-assembler-times free 2} } */
 +
 +extern void * malloc (unsigned long);
 +extern void free (void *);
 +
 +void test ()
 +{
 +  free (malloc (10));
 +}
 Index: gcc/common.opt
 ===
 --- gcc/common.opt      (revision 187277)
 +++ gcc/common.opt      (working copy)
 @@ -1474,6 +1474,10 @@
  Common Var(flag_dce) Init(1) Optimization
  Use the RTL dead code elimination pass

 +feliminate-malloc
 +Common Var(flag_eliminate_malloc) Init(1) Optimization
 +Eliminate unnecessary malloc/free pairs
 +
  fdse
  Common Var(flag_dse) Init(1) Optimization
  Use the RTL dead store elimination pass
 Index: gcc/tree-ssa-dce.c
 ===
 --- gcc/tree-ssa-dce.c  (revision 187277)
 +++ gcc/tree-ssa-dce.c  (working copy)
 @@ -309,6 +309,8 @@
            case BUILT_IN_CALLOC:
            case BUILT_IN_ALLOCA:
            case BUILT_IN_ALLOCA_WITH_ALIGN:
 +             if (!flag_eliminate_malloc)
 +               mark_stmt_necessary (stmt, true);
              return;

            default:;


Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-09 Thread Andrew Pinski
On Wed, May 9, 2012 at 9:32 AM, Xinliang David Li davi...@google.com wrote:
 On Wed, May 9, 2012 at 1:12 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li davi...@google.com wrote:
 To be clear, this flag is for malloc implementation (such as tcmalloc)
 with side effect unknown to the compiler. Using -fno-builtin-xxx is
 too conservative for that purpose.

 I don't think that flys.  Btw, the patch also guards alloca - alloca is 
 purely
 GCC internal.

 What's the unknown side-effects that are also important to preserve
 for free(malloc(4))?


 The side effect is the user registered malloc hooks.


IIRC future versions of glibc will have those malloc hooks removed.
Because of this problem.

Thanks,
Andrew Pinski



 The pattern 'free(malloc(4)', or loops like

 for (..)
  {
    malloc(..); // result not used
  }

 itself is not interesting. They only appear in the test code and the
 problem can be worked around by using -fno-builtin-malloc. The option
 is to avoid disable this and all similar malloc optimizations (in the
 future) for malloc implementation with hooks.

 Thanks,

 David

 Richard.

 David

 On Tue, May 8, 2012 at 7:43 AM, Dehao Chen de...@google.com wrote:
 Hello,

 This patch adds a flag to guard the optimization that optimize the
 following code away:

 free (malloc (4));

 In some cases, we'd like this type of malloc/free pairs to remain in
 the optimized code.

 Tested with bootstrap, and no regression in the gcc testsuite.

 Is it ok for mainline?

 Thanks,
 Dehao

 gcc/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * common.opt (feliminate-malloc): New.
        * doc/invoke.texi: Document it.
        * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.

 gcc/testsuite/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
        as expected.

 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 187277)
 +++ gcc/doc/invoke.texi (working copy)
 @@ -360,7 +360,8 @@
  -fcx-limited-range @gol
  -fdata-sections -fdce -fdelayed-branch @gol
  -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
 --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects 
 @gol
 +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations 
 @gol
 +-ffat-lto-objects @gol
  -ffast-math -ffinite-math-only -ffloat-store
 -fexcess-precision=@var{style} @gol
  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 @@ -6238,6 +6239,7 @@
  -fdefer-pop @gol
  -fdelayed-branch @gol
  -fdse @gol
 +-feliminate-malloc @gol
  -fguess-branch-probability @gol
  -fif-conversion2 @gol
  -fif-conversion @gol
 @@ -6762,6 +6764,11 @@
  Perform dead store elimination (DSE) on RTL@.
  Enabled by default at @option{-O} and higher.

 +@item -feliminate-malloc
 +@opindex feliminate-malloc
 +Eliminate unnecessary malloc/free pairs.
 +Enabled by default at @option{-O} and higher.
 +
  @item -fif-conversion
  @opindex fif-conversion
  Attempt to transform conditional jumps into branch-less equivalents.  This
 Index: gcc/testsuite/gcc.dg/free-malloc.c
 ===
 --- gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 @@ -0,0 +1,12 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fno-eliminate-malloc } */
 +/* { dg-final { scan-assembler-times malloc 2} } */
 +/* { dg-final { scan-assembler-times free 2} } */
 +
 +extern void * malloc (unsigned long);
 +extern void free (void *);
 +
 +void test ()
 +{
 +  free (malloc (10));
 +}
 Index: gcc/common.opt
 ===
 --- gcc/common.opt      (revision 187277)
 +++ gcc/common.opt      (working copy)
 @@ -1474,6 +1474,10 @@
  Common Var(flag_dce) Init(1) Optimization
  Use the RTL dead code elimination pass

 +feliminate-malloc
 +Common Var(flag_eliminate_malloc) Init(1) Optimization
 +Eliminate unnecessary malloc/free pairs
 +
  fdse
  Common Var(flag_dse) Init(1) Optimization
  Use the RTL dead store elimination pass
 Index: gcc/tree-ssa-dce.c
 ===
 --- gcc/tree-ssa-dce.c  (revision 187277)
 +++ gcc/tree-ssa-dce.c  (working copy)
 @@ -309,6 +309,8 @@
            case BUILT_IN_CALLOC:
            case BUILT_IN_ALLOCA:
            case BUILT_IN_ALLOCA_WITH_ALIGN:
 +             if (!flag_eliminate_malloc)
 +               mark_stmt_necessary (stmt, true);
              return;

            default:;


Re: Fix gcc.dg/lower-subreg-1.c failure

2012-05-09 Thread Hans-Peter Nilsson
 From: Richard Sandiford rdsandif...@googlemail.com
 Date: Wed, 9 May 2012 11:14:49 +0200

 Hans-Peter Nilsson hans-peter.nils...@axis.com writes:
  From: Richard Sandiford rdsandif...@googlemail.com
  Date: Tue, 1 May 2012 16:46:38 +0200
 
  To repeat: as things stand, very few targets define proper rtx costs
  for SET.
 
  IMHO it's wrong to start blaming targets when rtx_cost doesn't
  take the mode in account in the first place, for the default
  cost.  (Well, except for the modes-tieable subreg special-case.)
  The targets where an operation in N * word_mode costs no more
  than one in word_mode, if there even is one, is a minority,
  let's adjust the defaults to that.
 
 I'll pass on approving or disapproving this patch, but for the record:
 a factor of word_mode seems a bit too simplistic.

I'd say it's the right level: simplistic enough for the default,
not to mention now linear, without being plainly ignorant as now.

 It's OK for moves
 and logic ops, but addition of multiword modes is a bit more complicated.

How about (same factor) factor*COSTS_N_INSNS (1)*3/2 to account
for carry?  Or is 2*factor a better default?  Further
improvements are welcome, but I see the patch as a strict
improvement and I hope it will not be shot down by requests for
perfection - at least not without detailing said perfection.

 Multiplication and division by multiword modes is even more
 so, of course.

Suggestions are welcome, but in the absence of that, I'd say any
factor larger than one is is a good start, like in the patch.

  I think there should be a gating check whether the target
  implements that kind of shift in that mode at all, before
  checking the cost.  Not sure whether it's generally best to put
  that test here, or to make the rtx_cost function return the cost
  of a libcall for that mode when that happens.  Similar for the
  other insns.
 
 This has come up a few times in past discussions about rtx_cost
 (as I'm sure you remember :-)).  On the one hand, I agree it might
 be nice to shield the backend from invalid insns.  That would
 effectively mean calling expand on each insn though, which would be
 rather expensive.

No, nothing that complicated.  I'm thinking of just basically
checking that there's an operation in that mode, like:

if (direct_optab_handler (code_to_optab [GET_CODE (x)], mode)
== CODE_FOR_nothing)
  {
... return tabled default cost; for libcall or open-code ...
  }

Restricting the validity-gating to checking that the mode is
valid for the operation wouldn't interfere with fancy pipeline
speculative use.

 So I think this patch is using rtx_cost according to its current
 interface.

The interface use previously ignored the mode for most uses
(QED), so that's not completely correct. ;)

 If someone wants to change or restrict that interface,
 than that's a separate change IMO.  And it should be done consistently
 rather than in this one place.
 
 In this case it doesn't matter anyway.  If we never see a shift
 in mode M by amount X, we'll never need to make a decision about
 whether to split it.

If it's never used, then I don't mind it being wrong if that
simplifies the computation. :)

  Isn't the below better than doing virtually the same in each
  target's rtx_costs?
 
 FWIW, MIPS, SH and x86 all used something slightly different (and more
 complicated).  I imagine PowerPC and SPARC will too.  So each seems
 a bit strong.
 
 That's not an objection to the patch though.  I realise some ports do
 have very regular architectures where every register is the same width
 and has the same move cost.

Hence the default should follow a very regular model...

brgds, H-P


Re: RFA: PR target/53120, constraint modifier + on operand tied by matching-constraint, 0.

2012-05-09 Thread DJ Delorie

OK to apply ?

Ok.  Thanks!


Re: [patch] support for multiarch systems

2012-05-09 Thread Matthias Klose
On 09.05.2012 18:29, Paolo Bonzini wrote:
 Il 09/05/2012 17:34, Matthias Klose ha scritto:
 So -print-multiarch is like --print-multi-os-directory?
 the former prints the part before the `:' in the MULTILIB_OSDIRNAMES, the 
 latter
 the part after the `':', e.g. ../lib32 and i386-linux-gnu.
 
 Yes, of course.
 
 What is the difference, and where is it documented?
 Not sure how it should be further documented.
 
 No idea, it is a new concept and people need to understand how it
 relates to multilibbing for example, what shortcomings are addressed etc.
 
 I went through the Debian pages (only cursorily, I admit) and I found
 nothing of this.

these are referenced from the http://wiki.debian.org/Multiarch/Tuples
https://wiki.ubuntu.com/MultiarchSpec#Filesystem_layout
http://err.no/debian/amd64-multiarch-3

http://wiki.debian.org/Multiarch/TheCaseForMultiarch describes use cases for
multiarch, and why Debian thinks that the existing approaches are not sufficient
(having name collisions for different architectures or ad hoc names for new
architectures like libx32).  That may be contentious within the Linux community,
but I would like to avoid this kind of discussion here.

 Another question I have is related to usage of the option.  Are you
 supposed to look for libraries in the multilib directories too if the
 compiler is multiarch-enabled?

Debian/Ubuntu always use /lib for the default MULTIOSDIR, and this needs to be
looked up in the future too. The use of locations like ../lib32 will be faded
out over time, but I don't see any harm not to have looked up as well.

 Or only in /lib/i386-linux-gnu?  Which
 one takes priority, multiarch or multiosdir?
 
 From the patch I can guess the intended search path is
 /lib/MULTIARCH:/lib/MULTIOSDIR, but I'm not entirely sure about that and
 it needs documentation.

yes, this is the intended search path. I assume such kind of documentation
shouldn't go into invoke.texi.

 Should it fail if multiarch support is not compiled in?
 All the -print options always succeed. I would prefer if it just prints the
 empty string if it is not configured (as it does now).
 
 Will the empty string be a valid output for a multiarch-enabled
 compiler?  For example gcc --print-multi-os-directory and gcc
 --print-multi-directory on a bi-arch x86-64 compiler will never print
 the empty string.  Again, I guess the answer is no but I'm not sure.
 
 If the answer is no, returning the empty string is fine.

The answer is no. If multiarch is enabled, then the string is always non-empty
and should return a multiarch tuple as defined in
http://wiki.debian.org/Multiarch/Tuples. Anything else should be considered an
error.

 If the answer is yes, and assuming the search path is
 /lib/MULTIARCH:/lib/MULTIOSDIR, then programs need to know whether they
 need to omit the /lib/MULTIARCH component of the search path.

Unrelated, but why would programs hard code this path and not use something like
this?

gcc -v -E - /dev/null 21 | grep ^LIBRARY_PATH

  Matthias


[doc] Fix xref in extend.texi

2012-05-09 Thread Manuel López-Ibáñez
It is strange that a paragraph talking about GCC pragmas refers to the
section in CPP talking about #indent and other such directives (no
#pragma there). Wouldn't it better if it pointed to the section about
#pragmas?

OK?

2012-05-09  Manuel López-Ibáñez  m...@gcc.gnu.org

* doc/extend.texi (Function Attributes): Point xref to section
about Pragmas.


Index: doc/extend.texi
===
--- doc/extend.texi (revision 187299)
+++ doc/extend.texi (working copy)
@@ -4102,8 +4102,7 @@
 found convenient to use @code{__attribute__} to achieve a natural
 attachment of attributes to their corresponding declarations, whereas
 @code{#pragma GCC} is of use for constructs that do not naturally form
-part of the grammar.  @xref{Other Directives,,Miscellaneous
-Preprocessing Directives, cpp, The GNU C Preprocessor}.
+part of the grammar.  @xref{Pragmas,,Pragmas Accepted by GCC}.

 @node Attribute Syntax
 @section Attribute Syntax


[PATCH, i386]: Fix PR 44141, Redundant loads and stores generated for AMD bdver1 target

2012-05-09 Thread Uros Bizjak
Hello!

Attached patch avoids a deficiency in reload, where reload gives up on
handling subregs of pseudos (please see the PR  [1] for explanation by
Ulrich). The patch simply avoids generating V4SF moves with V4SF
subregs of V2DF values unless really necessary (i.e. moving SSE2 modes
without SSE2 enabled, which shouldn't happen anyway). With patched
gcc, expand pass emits (unaligned) moves in their original mode, and
this mode is kept until asm is generated. The asm instruction is
chosen according to the mode of insn pattern, and the mode is
calculated using various influencing conditions.

2012-05-09  Uros Bizjak  ubiz...@gmail.com

PR target/44141
* config/i386/i386.c (ix86_expand_vector_move_misalign): Do not handle
128 bit vectors specially for TARGET_AVX.  Emit sse2_movupd and
sse_movupd RTXes for TARGET_AVX, TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL
or when optimizing for size.
* config/i386/sse.md (*movmode_internal): Remove
TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling from asm output code.
Calculate mode attribute according to optimize_function_for_size_p
and TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL flag.
(*sse_movussemodesuffixavxsizesuffix): Choose asm template
depending on the mode of the instruction.  Calculate mode attribute
according to optimize_function_for_size_p, TARGET_SSE_TYPELESS_STORES
and TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL flags.
(*sse2_movdquavxsizesuffix): Ditto.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}.

The patch also fixes the testcase from  the PR.

Patch will be committed to mainline SVN.

[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141#c16

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 187286)
+++ config/i386/sse.md  (working copy)
@@ -449,8 +449,6 @@
   (misaligned_operand (operands[0], MODEmode)
  || misaligned_operand (operands[1], MODEmode)))
return vmovupd\t{%1, %0|%0, %1};
- else if (TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-   return %vmovaps\t{%1, %0|%0, %1};
  else
return %vmovapd\t{%1, %0|%0, %1};
 
@@ -460,8 +458,6 @@
   (misaligned_operand (operands[0], MODEmode)
  || misaligned_operand (operands[1], MODEmode)))
return vmovdqu\t{%1, %0|%0, %1};
- else if (TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-   return %vmovaps\t{%1, %0|%0, %1};
  else
return %vmovdqa\t{%1, %0|%0, %1};
 
@@ -475,19 +471,21 @@
   [(set_attr type sselog1,ssemov,ssemov)
(set_attr prefix maybe_vex)
(set (attr mode)
-   (cond [(match_test TARGET_AVX)
+   (cond [(and (eq_attr alternative 1,2)
+   (match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL))
+(if_then_else
+   (match_test GET_MODE_SIZE (MODEmode)  16)
+   (const_string V8SF)
+   (const_string V4SF))
+  (match_test TARGET_AVX)
 (const_string sseinsnmode)
-  (ior (ior (match_test optimize_function_for_size_p (cfun))
-(not (match_test TARGET_SSE2)))
+  (ior (and (eq_attr alternative 1,2)
+(match_test optimize_function_for_size_p (cfun)))
(and (eq_attr alternative 2)
 (match_test TARGET_SSE_TYPELESS_STORES)))
 (const_string V4SF)
-  (eq (const_string MODEmode) (const_string V4SFmode))
-(const_string V4SF)
-  (eq (const_string MODEmode) (const_string V2DFmode))
-(const_string V2DF)
  ]
- (const_string TI)))])
+ (const_string sseinsnmode)))])
 
 (define_insn sse2_movq128
   [(set (match_operand:V2DI 0 register_operand =x)
@@ -597,11 +595,33 @@
  [(match_operand:VF 1 nonimmediate_operand xm,x)]
  UNSPEC_MOVU))]
   TARGET_SSE  !(MEM_P (operands[0])  MEM_P (operands[1]))
-  %vmovussemodesuffix\t{%1, %0|%0, %1}
+{
+  switch (get_attr_mode (insn))
+{
+case MODE_V8SF:
+case MODE_V4SF:
+  return %vmovups\t{%1, %0|%0, %1};
+default:
+  return %vmovussemodesuffix\t{%1, %0|%0, %1};
+}
+}
   [(set_attr type ssemov)
(set_attr movu 1)
(set_attr prefix maybe_vex)
-   (set_attr mode MODE)])
+   (set (attr mode)
+   (cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
+(if_then_else
+   (match_test GET_MODE_SIZE (MODEmode)  16)
+   (const_string V8SF)
+   (const_string V4SF))
+  (match_test TARGET_AVX)
+(const_string MODE)
+  (ior (match_test optimize_function_for_size_p (cfun))
+   (and (eq_attr alternative 1)
+(match_test 

Re: [doc] Fix xref in extend.texi

2012-05-09 Thread Gabriel Dos Reis
On Wed, May 9, 2012 at 12:50 PM, Manuel López-Ibáñez
lopeziba...@gmail.com wrote:
 It is strange that a paragraph talking about GCC pragmas refers to the
 section in CPP talking about #indent and other such directives (no
 #pragma there). Wouldn't it better if it pointed to the section about
 #pragmas?

yes.  patch ok.

 OK?

 2012-05-09  Manuel López-Ibáñez  m...@gcc.gnu.org

        * doc/extend.texi (Function Attributes): Point xref to section
 about Pragmas.


 Index: doc/extend.texi
 ===
 --- doc/extend.texi     (revision 187299)
 +++ doc/extend.texi     (working copy)
 @@ -4102,8 +4102,7 @@
  found convenient to use @code{__attribute__} to achieve a natural
  attachment of attributes to their corresponding declarations, whereas
  @code{#pragma GCC} is of use for constructs that do not naturally form
 -part of the grammar.  @xref{Other Directives,,Miscellaneous
 -Preprocessing Directives, cpp, The GNU C Preprocessor}.
 +part of the grammar.  @xref{Pragmas,,Pragmas Accepted by GCC}.

  @node Attribute Syntax
  @section Attribute Syntax


Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-09 Thread Richard Guenther
On Wed, May 9, 2012 at 6:32 PM, Xinliang David Li davi...@google.com wrote:
 On Wed, May 9, 2012 at 1:12 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li davi...@google.com wrote:
 To be clear, this flag is for malloc implementation (such as tcmalloc)
 with side effect unknown to the compiler. Using -fno-builtin-xxx is
 too conservative for that purpose.

 I don't think that flys.  Btw, the patch also guards alloca - alloca is 
 purely
 GCC internal.

 What's the unknown side-effects that are also important to preserve
 for free(malloc(4))?


 The side effect is the user registered malloc hooks.

 The pattern 'free(malloc(4)', or loops like

 for (..)
  {
    malloc(..); // result not used
  }

 itself is not interesting. They only appear in the test code and the
 problem can be worked around by using -fno-builtin-malloc. The option
 is to avoid disable this and all similar malloc optimizations (in the
 future) for malloc implementation with hooks.

What is then interesting?  The above are all the same as if optimization
figured out the storage is not used, no?

Richard.

 Thanks,

 David

 Richard.

 David

 On Tue, May 8, 2012 at 7:43 AM, Dehao Chen de...@google.com wrote:
 Hello,

 This patch adds a flag to guard the optimization that optimize the
 following code away:

 free (malloc (4));

 In some cases, we'd like this type of malloc/free pairs to remain in
 the optimized code.

 Tested with bootstrap, and no regression in the gcc testsuite.

 Is it ok for mainline?

 Thanks,
 Dehao

 gcc/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * common.opt (feliminate-malloc): New.
        * doc/invoke.texi: Document it.
        * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.

 gcc/testsuite/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
        as expected.

 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 187277)
 +++ gcc/doc/invoke.texi (working copy)
 @@ -360,7 +360,8 @@
  -fcx-limited-range @gol
  -fdata-sections -fdce -fdelayed-branch @gol
  -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
 --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects 
 @gol
 +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations 
 @gol
 +-ffat-lto-objects @gol
  -ffast-math -ffinite-math-only -ffloat-store
 -fexcess-precision=@var{style} @gol
  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 @@ -6238,6 +6239,7 @@
  -fdefer-pop @gol
  -fdelayed-branch @gol
  -fdse @gol
 +-feliminate-malloc @gol
  -fguess-branch-probability @gol
  -fif-conversion2 @gol
  -fif-conversion @gol
 @@ -6762,6 +6764,11 @@
  Perform dead store elimination (DSE) on RTL@.
  Enabled by default at @option{-O} and higher.

 +@item -feliminate-malloc
 +@opindex feliminate-malloc
 +Eliminate unnecessary malloc/free pairs.
 +Enabled by default at @option{-O} and higher.
 +
  @item -fif-conversion
  @opindex fif-conversion
  Attempt to transform conditional jumps into branch-less equivalents.  This
 Index: gcc/testsuite/gcc.dg/free-malloc.c
 ===
 --- gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 @@ -0,0 +1,12 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fno-eliminate-malloc } */
 +/* { dg-final { scan-assembler-times malloc 2} } */
 +/* { dg-final { scan-assembler-times free 2} } */
 +
 +extern void * malloc (unsigned long);
 +extern void free (void *);
 +
 +void test ()
 +{
 +  free (malloc (10));
 +}
 Index: gcc/common.opt
 ===
 --- gcc/common.opt      (revision 187277)
 +++ gcc/common.opt      (working copy)
 @@ -1474,6 +1474,10 @@
  Common Var(flag_dce) Init(1) Optimization
  Use the RTL dead code elimination pass

 +feliminate-malloc
 +Common Var(flag_eliminate_malloc) Init(1) Optimization
 +Eliminate unnecessary malloc/free pairs
 +
  fdse
  Common Var(flag_dse) Init(1) Optimization
  Use the RTL dead store elimination pass
 Index: gcc/tree-ssa-dce.c
 ===
 --- gcc/tree-ssa-dce.c  (revision 187277)
 +++ gcc/tree-ssa-dce.c  (working copy)
 @@ -309,6 +309,8 @@
            case BUILT_IN_CALLOC:
            case BUILT_IN_ALLOCA:
            case BUILT_IN_ALLOCA_WITH_ALIGN:
 +             if (!flag_eliminate_malloc)
 +               mark_stmt_necessary (stmt, true);
              return;

            default:;


[PATCH, i386]: Some further TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL cleanups

2012-05-09 Thread Uros Bizjak
Hello!

Practically no functional change.

2012-05-09  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c (*movdf_internal_rex64): Remove
TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling from asm output code.
Calculate mode attribute according to
TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL flag.
(*movdf_internal): Ditto.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.
Index: i386.md
===
--- i386.md (revision 187347)
+++ i386.md (working copy)
@@ -2953,8 +2953,7 @@
   switch (get_attr_mode (insn))
{
case MODE_V2DF:
- if (!TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-   return %vmovapd\t{%1, %0|%0, %1};
+ return %vmovapd\t{%1, %0|%0, %1};
case MODE_V4SF:
  return %vmovaps\t{%1, %0|%0, %1};
 
@@ -3032,7 +3031,8 @@
  movaps encodes one byte shorter.  */
   (eq_attr alternative 8)
 (cond
-  [(match_test optimize_function_for_size_p (cfun))
+  [(ior (match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
+(match_test optimize_function_for_size_p (cfun)))
  (const_string V4SF)
(match_test TARGET_SSE_PARTIAL_REG_DEPENDENCY)
  (const_string V2DF)
@@ -3094,8 +3094,7 @@
   switch (get_attr_mode (insn))
{
case MODE_V2DF:
- if (!TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-   return %vmovapd\t{%1, %0|%0, %1};
+ return %vmovapd\t{%1, %0|%0, %1};
case MODE_V4SF:
  return %vmovaps\t{%1, %0|%0, %1};
 
@@ -3167,7 +3166,8 @@
  movaps encodes one byte shorter.  */
   (eq_attr alternative 6,10)
 (cond
-  [(match_test optimize_function_for_size_p (cfun))
+  [(ior (match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
+(match_test optimize_function_for_size_p (cfun)))
  (const_string V4SF)
(match_test TARGET_SSE_PARTIAL_REG_DEPENDENCY)
  (const_string V2DF)


[C++ Patch] PR 53158 (EXPR_LOC_OR_HERE version)

2012-05-09 Thread Paolo Carlini

Hi again,

thus the below is what I booted and tested. As you can see I had to 
tweak a bit an existing testcase, for which the location is now a bit 
smaller, 3 chars, in fact is still not perfect, but arguably a tad 
better. Otherwise no issues. Ah, I'm touching also a c-common.c 
function, in fact all the other warning* functions in the same file 
already use the EXPR_LOC_OR_HERE on the expr argument idea.


Tested x86_64-linux.

Thanks,
Paolo.

//
/cp
2012-05-09  Paolo Carlini  paolo.carl...@oracle.com

PR c++/53158
* cvt.c (ocp_convert): Error out early for void - bool conversions.
* typeck.c (decay_conversion): Use error_at.
* call.c (build_integral_nontype_arg_conv, convert_like_real,
convert_arg_to_ellipsis, perform_implicit_conversion_flags,
initialize_reference): Likewise.
* cvt.c (warn_ref_binding): Add location_t parameter.
(cp_convert_to_pointer, convert_to_reference, ocp_convert,
convert_to_void, ): Use error_at and warning_at.

/c-family
2012-05-09  Paolo Carlini  paolo.carl...@oracle.com

PR c++/53158
* c-common.c (warnings_for_convert_and_check): Use warning_at.

/testsuite
2012-05-09  Paolo Carlini  paolo.carl...@oracle.com

PR c++/53158
* g++.dg/cpp0x/lambda/lambda-err2.C: New.
* g++.dg/parse/error26.C: Tweak dg-error column number.

Index: c-family/c-common.c
===
--- c-family/c-common.c (revision 187335)
+++ c-family/c-common.c (working copy)
@@ -2329,6 +2329,8 @@ conversion_warning (tree type, tree expr)
 void
 warnings_for_convert_and_check (tree type, tree expr, tree result)
 {
+  location_t loc = EXPR_LOC_OR_HERE (expr);
+
   if (TREE_CODE (expr) == INTEGER_CST
(TREE_CODE (type) == INTEGER_TYPE
   || TREE_CODE (type) == ENUMERAL_TYPE)
@@ -2344,8 +2346,8 @@ warnings_for_convert_and_check (tree type, tree ex
   /* This detects cases like converting -129 or 256 to
  unsigned char.  */
   if (!int_fits_type_p (expr, c_common_signed_type (type)))
-warning (OPT_Woverflow,
- large integer implicitly truncated to unsigned type);
+warning_at (loc, OPT_Woverflow,
+   large integer implicitly truncated to unsigned type);
   else
 conversion_warning (type, expr);
 }
@@ -2357,16 +2359,16 @@ warnings_for_convert_and_check (tree type, tree ex
(TREE_CODE (TREE_TYPE (expr)) != INTEGER_TYPE
   || TYPE_PRECISION (TREE_TYPE (expr))
   != TYPE_PRECISION (type)))
-   warning (OPT_Woverflow,
-overflow in implicit constant conversion);
+   warning_at (loc, OPT_Woverflow,
+   overflow in implicit constant conversion);
 
   else
conversion_warning (type, expr);
 }
   else if ((TREE_CODE (result) == INTEGER_CST
|| TREE_CODE (result) == FIXED_CST)  TREE_OVERFLOW (result))
-warning (OPT_Woverflow,
- overflow in implicit constant conversion);
+warning_at (loc, OPT_Woverflow,
+   overflow in implicit constant conversion);
   else
 conversion_warning (type, expr);
 }
Index: testsuite/g++.dg/parse/error26.C
===
--- testsuite/g++.dg/parse/error26.C(revision 187335)
+++ testsuite/g++.dg/parse/error26.C(working copy)
@@ -4,7 +4,7 @@
 void foo()
 {
   if (({int c[2];})) ; // { dg-error 7:ISO C.. forbids 7 }
-  // { dg-error 20:could not convert 20 { target *-*-* } 6 }
+  // { dg-error 17:could not convert 17 { target *-*-* } 6 }
 }
 
 void bar()
Index: testsuite/g++.dg/cpp0x/lambda/lambda-err2.C
===
--- testsuite/g++.dg/cpp0x/lambda/lambda-err2.C (revision 0)
+++ testsuite/g++.dg/cpp0x/lambda/lambda-err2.C (revision 0)
@@ -0,0 +1,12 @@
+// PR c++/53158
+// { dg-do compile { target c++11 } }
+
+int main()
+{
+  auto a = []() { return true; };
+  auto b = []() { return a(); };  // { dg-error 'a' is not captured }
+  int c, d;
+  while (b()  c  d) // { dg-error could not convert }
+{
+}
+}
Index: cp/typeck.c
===
--- cp/typeck.c (revision 187335)
+++ cp/typeck.c (working copy)
@@ -1822,6 +1822,7 @@ decay_conversion (tree exp, tsubst_flags_t complai
 {
   tree type;
   enum tree_code code;
+  location_t loc = EXPR_LOC_OR_HERE (exp);
 
   type = TREE_TYPE (exp);
   if (type == error_mark_node)
@@ -1853,7 +1854,7 @@ decay_conversion (tree exp, tsubst_flags_t complai
   if (code == VOID_TYPE)
 {
   if (complain  tf_error)
-   error (void value not ignored as it ought to be);
+   error_at (loc, void value not ignored as it ought to be);
   return error_mark_node;
 }
   if (invalid_nonstatic_memfn_p (exp, complain))
@@ -1882,7 

Re: Symbol table 19/many: cleanup varpool/front-end/varasm interactions

2012-05-09 Thread David Edelsohn
Jan,

This patch is causing a bootstrap failure on AIX.  It generates linker
errors that TOC symbols are not emitted:

ld: 0711-317 ERROR: Undefined symbol: LC..0
ld: 0711-317 ERROR: Undefined symbol: LC..1
ld: 0711-317 ERROR: Undefined symbol: LC..2
ld: 0711-317 ERROR: Undefined symbol: LC..3
ld: 0711-317 ERROR: Undefined symbol: LC..4

Although pLinux uses the same code, GCC is able to bootstrap on Linux.
 The function in question is

gcc/config/rs6000/rs6000.c:output_toc()

Do you have any suggestions on what is missing in the label allocation
that causes your cgraph change to believe the symbols are
unreferenced?

I have opened PR bootstrap/53300.

Thanks, David


Re: [PATCH] Optimize byte_from_pos, pos_from_bit

2012-05-09 Thread Eric Botcazou
 This optimizes byte_from_pos and pos_from_bit by noting that we operate
 on sizes whose computations have no intermediate (or final) overflow.
 This is the single patch necessary to get Ada to bootstrap and test
 with TYPE_IS_SIZETYPE removed.  Rather than amending size_binop
 (my original plan) I chose to optimize the above two commonly used
 accessors.

 Conveniently normalize_offset can be re-written to use pos_from_bit
 instead of inlinig it.  I also took the liberty to document the
 functions (sic).

Nice, thanks.  Could you add a blurb, in the head comment of the first function 
in which you operate under the no-overflow assumption, stating this fact and 
why this is necessary (an explicit mention of Ada isn't forbidden ;-), as well 
as a cross-reference to it in the head comment of the other function(s).

 Any comments?  Would you like different factoring of the optimization
 (I considered adding a byte_from_bitpos)?  Any idea why
 byte_from_pos is using TRUNC_DIV_EXPR (only positive offsets?) and
 pos_from_bit FLOOR_DIV_EXPR (also negative offsets?) - that seems
 inconsistent, and we fold FLOOR_DIV_EXPR of unsigned types (sizetype)
 to TRUNC_DIV_EXPR anyways.

I think that a bit position is treated as non-negative, so bitpos can use 
TRUNC_DIV_EXPR in byte_from_pos and you need to use FLOOR_MOD_EXPR to get a 
non-negative *pbitpos in pos_from_bit.  Very likely obsolete now indeed.

-- 
Eric Botcazou


Re: [PATCH] Remove TYPE_IS_SIZETYPE

2012-05-09 Thread Eric Botcazou
 This removes the TYPE_IS_SIZETYPE macro and all its uses (by
 assuming it returns zero and applying trivial folding).  Sizes
 and bitsizes can still be treat specially by means of knowing
 what the values represent and by means of using helper functions
 that assume you are dealing with sizes (in particular size_binop
 and friends and bit_from_pos, byte_from_pos or pos_from_bit).

Fine with me, if you add the blurb I talked about in the other reply.

 Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages
 including Ada with the patch optimizing bute_from_pos and pos_from_bit

Results on our internal testsuite are clean on x86-64 and almost clean on x86, 
an exception being:

package t is
type x (m : natural) is record
s : string (1 .. m);
r : natural;
b : boolean;
end record;
for x'alignment use 4;

pragma Pack (x);
end t;

Without the patches, compiling the package with -gnatR3 yields:

Representation information for unit t (spec)


for x'Object_Size use 17179869248;
for x'Value_Size use  ((#1 + 8) * 8) ;
for x'Alignment use 4;
for x use record
   m at  0 range  0 .. 30;
   s at  4 range  0 ..  ((#1 * 8))  - 1;
   r at bit offset (((#1 + 4) * 8))  size in bits = 31
   b at bit offset #1 + 7) * 8) + 7))  size in bits = 1
end record;

With the patches, this yields:

Representation information for unit t (spec)


for x'Object_Size use 17179869248;
for x'Value_Size use  (((#1 + 7) + 1) * 8) ;
for x'Alignment use 4;
for x use record
   m at  0 range  0 .. 30;
   s at  4 range  0 ..  ((#1 * 8))  - 1;
   r at bit offset (((#1 + 4) * 8))  size in bits = 31
   b at bit offset #1 + 7) * 8) + 7))  size in bits = 1
end record;

so we have lost a simple folding for x'Value_Size (TYPE_ADA_SIZE field).

 2012-05-08  Richard Guenther  rguent...@suse.de

   ada/
   * gcc-interface/cuintp.c (UI_From_gnu): Remove TYPE_IS_SIZETYPE use.

OK, modulo the formatting:

 Index: trunk/gcc/ada/gcc-interface/cuintp.c
 ===
 *** trunk.orig/gcc/ada/gcc-interface/cuintp.c 2011-04-11 17:01:30.0
 +0200 --- trunk/gcc/ada/gcc-interface/cuintp.c2012-05-07
 16:43:43.497218058 +0200 *** UI_From_gnu (tree Input)
 *** 178,186 
 if (host_integerp (Input, 0))
   return UI_From_Int (TREE_INT_CST_LOW (Input));
 else if (TREE_INT_CST_HIGH (Input)  0
 ! TYPE_UNSIGNED (gnu_type)
 ! !(TREE_CODE (gnu_type) == INTEGER_TYPE
 !  TYPE_IS_SIZETYPE (gnu_type)))
   return No_Uint;
   #endif

 --- 178,184 
 if (host_integerp (Input, 0))
   return UI_From_Int (TREE_INT_CST_LOW (Input));
 else if (TREE_INT_CST_HIGH (Input)  0
 ! TYPE_UNSIGNED (gnu_type))
   return No_Uint;
   #endif

 TYPE_UNSIGNED (gnu_type)) on the same line.

-- 
Eric Botcazou


[PATCH, i386]: Fix PR 52908 - xop-mul-1:f9 miscompiled on bulldozer (-mxop)

2012-05-09 Thread Uros Bizjak
Hello!

Attached patch fixes PR 52908. There is no need to generate fake
multiply instructions after reload, we can expand directly to MAC
instructions. This approach even produced better assembly for a couple
of testcases in gcc.target/i386 testsuite.

2012-05-09  Uros Bizjak  ubiz...@gmail.com

PR target/52908
* config/i386/sse.md (vec_widen_smult_hi_v4si): Expand using
xop_pmacsdqh insn pattern instead of xop_mulv2div2di3_high.
(vec_widen_smult_lo_v4si): Expand using xop_pmacsdql insn pattern
instead of xop_mulv2div2di3_low.
(xop_pmacsdql): Fix vec_select selector.
(xop_pmacsdqh): Ditto.
(xop_mulv2div2di3_low): Remove insn_and_split pattern.
(xop_mulv2div2di3_high): Ditto.

testsuite/ChangeLog:

PR target/52908
* gcc.target/i386/xop-imul32widen-vector.c: Update scan-assembler
directive to Scan for vpmuldq, not vpmacsdql.

Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}, and
tested on XOP target by Venkataramanan.

Patch was committed to mainline SVN. The version, attached to the PR
should be backported to other release branches, but another volunteer
should do the backport, since I don't have access to XOP target.

Also, please note that XOP horizontal add/subtract instructions (and
possibly others) have vec_select parallel RTX in wrong endiannes.

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 187347)
+++ config/i386/sse.md  (working copy)
@@ -5748,11 +5748,15 @@
 
   if (TARGET_XOP)
 {
+  rtx t3 = gen_reg_rtx (V2DImode);
+
   emit_insn (gen_sse2_pshufd_1 (t1, op1, GEN_INT (0), GEN_INT (2),
GEN_INT (1), GEN_INT (3)));
   emit_insn (gen_sse2_pshufd_1 (t2, op2, GEN_INT (0), GEN_INT (2),
GEN_INT (1), GEN_INT (3)));
-  emit_insn (gen_xop_mulv2div2di3_high (operands[0], t1, t2));
+  emit_move_insn (t3, CONST0_RTX (V2DImode));
+
+  emit_insn (gen_xop_pmacsdqh (operands[0], t1, t2, t3));
   DONE;
 }
 
@@ -5777,11 +5781,15 @@
 
   if (TARGET_XOP)
 {
+  rtx t3 = gen_reg_rtx (V2DImode);
+
   emit_insn (gen_sse2_pshufd_1 (t1, op1, GEN_INT (0), GEN_INT (2),
GEN_INT (1), GEN_INT (3)));
   emit_insn (gen_sse2_pshufd_1 (t2, op2, GEN_INT (0), GEN_INT (2),
GEN_INT (1), GEN_INT (3)));
-  emit_insn (gen_xop_mulv2div2di3_low (operands[0], t1, t2));
+  emit_move_insn (t3, CONST0_RTX (V2DImode));
+
+  emit_insn (gen_xop_pmacsdql (operands[0], t1, t2, t3));
   DONE;
 }
 
@@ -9792,11 +9800,11 @@
  (sign_extend:V2DI
   (vec_select:V2SI
(match_operand:V4SI 1 nonimmediate_operand %x)
-   (parallel [(const_int 1) (const_int 3)])))
+   (parallel [(const_int 0) (const_int 2)])))
  (sign_extend:V2DI
   (vec_select:V2SI
(match_operand:V4SI 2 nonimmediate_operand xm)
-   (parallel [(const_int 1) (const_int 3)]
+   (parallel [(const_int 0) (const_int 2)]
 (match_operand:V2DI 3 nonimmediate_operand x)))]
   TARGET_XOP
   vpmacsdql\t{%3, %2, %1, %0|%0, %1, %2, %3}
@@ -9810,93 +9818,17 @@
  (sign_extend:V2DI
   (vec_select:V2SI
(match_operand:V4SI 1 nonimmediate_operand %x)
-   (parallel [(const_int 0) (const_int 2)])))
+   (parallel [(const_int 1) (const_int 3)])))
  (sign_extend:V2DI
   (vec_select:V2SI
(match_operand:V4SI 2 nonimmediate_operand xm)
-   (parallel [(const_int 0) (const_int 2)]
+   (parallel [(const_int 1) (const_int 3)]
 (match_operand:V2DI 3 nonimmediate_operand x)))]
   TARGET_XOP
   vpmacsdqh\t{%3, %2, %1, %0|%0, %1, %2, %3}
   [(set_attr type ssemuladd)
(set_attr mode TI)])
 
-;; We don't have a straight 32-bit parallel multiply and extend on XOP, so
-;; fake it with a multiply/add.  In general, we expect the define_split to
-;; occur before register allocation, so we have to handle the corner case where
-;; the target is the same as operands 1/2
-(define_insn_and_split xop_mulv2div2di3_low
-  [(set (match_operand:V2DI 0 register_operand =x)
-   (mult:V2DI
- (sign_extend:V2DI
-   (vec_select:V2SI
- (match_operand:V4SI 1 register_operand %x)
- (parallel [(const_int 1) (const_int 3)])))
- (sign_extend:V2DI
-   (vec_select:V2SI
- (match_operand:V4SI 2 nonimmediate_operand xm)
- (parallel [(const_int 1) (const_int 3)])]
-  TARGET_XOP
-  #
-   reload_completed
-  [(set (match_dup 0)
-   (match_dup 3))
-   (set (match_dup 0)
-   (plus:V2DI
-(mult:V2DI
- (sign_extend:V2DI
-  (vec_select:V2SI
-   (match_dup 1)
-   (parallel [(const_int 1) (const_int 3)])))
- 

[patch] Fix LTO regression in Ada

2012-05-09 Thread Eric Botcazou
Hi,

this is a regression present on mainline and 4.7 branch.  On the attached 
testcase, the compiler aborts in LTO mode with:

eric@atlantis:~/build/gcc/native32 gcc/xgcc -Bgcc -S lto11.adb -O -flto
+===GNAT BUG DETECTED==+
| 4.8.0 20120506 (experimental) [trunk revision 187216] (i586-suse-linux)
| tree code 'call_expr' is not supported in LTO streams   

The problem is that the Ada compiler started to use DECL_ORIGINAL_TYPE in 4.7.x 
and the type in this field can have arbitrary expressions as TYPE_SIZE, for 
example expressions with CALL_EXPRs.  Now the type is both not gimplified and 
streamed in LTO mode, so the CALL_EXPRs are sent to the streamer as-is.

The immediate solution would be not to stream DECL_ORIGINAL_TYPE (and clear it 
in free_lang_data_in_decl), but this yields a regression in C++ with -flto -g 
(ICE in splice_child_die).  Therefore, the patch implements the alternate 
solution of gimplifying DECL_ORIGINAL_TYPE.

Bootstrapped/regtested on x86_64-suse-linux, OK for mainline and 4.7 branch?


2012-05-09  Eric Botcazou  ebotca...@adacore.com

* gimplify.c (gimplify_decl_expr): For a TYPE_DECL, gimplify the
DECL_ORIGINAL_TYPE if it is present.


2012-05-09  Eric Botcazou  ebotca...@adacore.com

* gnat.dg/lto11.ad[sb]: New test.


-- 
Eric Botcazou
-- { dg-do compile }
-- { dg-options -flto { target lto } }

with Ada.Streams; use Ada.Streams;

package body Lto11 is

   procedure Write
 (S : not null access Root_Stream_Type'Class;
  V : Vector)
   is
  subtype M_SEA is Stream_Element_Array (1 .. V'Size / Stream_Element'Size);
  Bytes : M_SEA;
  for Bytes'Address use V'Address;
  pragma Import (Ada, Bytes);
   begin
  Ada.Streams.Write (S.all, Bytes);
   end;

end Lto11;
with Ada.Streams; use Ada.Streams;

package Lto11 is

   type Vector is array (Positive range ) of Float;

   procedure Write (S : not null access Root_Stream_Type'Class; V : Vector);

end Lto11;
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index c4792e6..f69e773 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1448,6 +1448,11 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
!TYPE_SIZES_GIMPLIFIED (TREE_TYPE (decl)))
 gimplify_type_sizes (TREE_TYPE (decl), seq_p);
 
+  if (TREE_CODE (decl) == TYPE_DECL
+   DECL_ORIGINAL_TYPE (decl)
+   !TYPE_SIZES_GIMPLIFIED (DECL_ORIGINAL_TYPE (decl)))
+gimplify_type_sizes (DECL_ORIGINAL_TYPE (decl), seq_p);
+
   if (TREE_CODE (decl) == VAR_DECL  !DECL_EXTERNAL (decl))
 {
   tree init = DECL_INITIAL (decl);


Re: [RFC] PR 53063 encode group options in .opt files

2012-05-09 Thread Manuel López-Ibáñez
On 9 May 2012 00:38, Joseph S. Myers jos...@codesourcery.com wrote:
 On Wed, 9 May 2012, Manuel López-Ibáñez wrote:

 which looks correct to me. However, the build fails because now
 options.h requires input.h which requires line-map.h, which is not
 included when building for example libgcc. options.h is included by
 tm.h, so it basically appears everywhere.

 Any suggestions how to fix this?

 options.h already has some #if !defined(IN_LIBGCC2) 
 !defined(IN_TARGET_LIBS)  !defined(IN_RTS) conditionals, so you could
 arrange for some more such conditionals to be generated.

This works great for libgcc, however, ada's runtime library is
compiled with IN_GCC and includes options.h through tm.h, so it
breaks:

../../xgcc -B../../ -c -DIN_GCC  -g -O2 -W -Wall  \
-iquote /home/manuel/test2/src/gcc \
 -iquote . -iquote .. -iquote ../.. -iquote
/home/manuel/test2/src/gcc/ada -iquote /home/manuel/test2/src/gcc
-I/home/manuel/test2/src/gcc/../include \
../rts/targext.c -o targext.o
In file included from ../../options.h:3716:0,
 from ../../tm.h:19,
 from ../rts/targext.c:50:
/home/manuel/test2/src/gcc/input.h:25:22: fatal error: line-map.h: No
such file or directory
 #include line-map.h
  ^

That tm.h includes the whole options.h is bad, but I don't want to
mess with it. So I am instead only including input.h if options.h was
not included from tm.h. This requires surprisingly few changes.

The other major issue is that if -Wunused is enabled by -Wall, it has
to be enabled by handle_generated_option, so I need to modify every
front-end that does that . In the future, we should also generate
per-FE _auto function, but I didn't want to make this patch even
larger.

Bootstrapped and regression tested.

OK?

2012-05-09  Manuel López-Ibáñez  m...@gcc.gnu.org

PR 53063
gcc/
* doc/options.texi (EnabledBy): Document
* opts.c: Include opts.h and options.h before tm.h.
(finish_options): Do not handle some sub-options here...
(common_handle_option): ... instead call common_handle_option_auto here.
* optc-gen.awk: Handle EnabledBy.
* opth-gen.awk: Declare common_handle_option_auto.
* common.opt (Wuninitialized): Use EnabledBy. Delete Init.
(Wmaybe-uninitialized): Likewise.
(Wunused-but-set-variable): Likewise.
(Wunused-function): Likewise.
(Wunused-label): Likewise.
(Wunused-value): Likewise.
(Wunused-variable): Likewise.
* opt-read.awk: Create opt_numbers array.
ada/
* gcc-interface/misc.c (gnat_parse_file): Move before ...
(gnat_handle_option): ... this. Use handle_generated_option.
c-family/
* c-opts.c (c_common_handle_option): Use handle_generated_option
to enable sub-options.
fortran/
* options.c: Include diagnostics.h instead of
diagnostics-core.h.
(set_Wall): Do not see warn_unused here.
(gfc_handle_option): Set it here using handle_generated_option.


group-options-2.diff
Description: Binary data


Re: Bug 53289 - unnecessary repetition of caret diagnostics

2012-05-09 Thread Gabriel Dos Reis
On Wed, May 9, 2012 at 4:02 PM, Manuel López-Ibáñez
lopeziba...@gmail.com wrote:
 Simple enough. Bootstrapped and regression tested.

 The output for the example in the PR is now:

 /home/manuel/caret-overload.C:6:6: error: no matching function for
 call to ‘g(int)’
   g(1);
      ^
 /home/manuel/caret-overload.C:6:6: note: candidate is:
 /home/manuel/caret-overload.C:2:18: note: templateclass T typename
 T::type g(T)
  typename T::type g(T);
                  ^
 /home/manuel/caret-overload.C:2:18: note:   template argument
 deduction/substitution failed:
 /home/manuel/caret-overload.C: In substitution of ‘templateclass T
 typename T::type g(T) [with T = int]’:
 /home/manuel/caret-overload.C:6:6:   required from here
 /home/manuel/caret-overload.C:2:18: error: ‘int’ is not a class,
 struct, or union type

 OK?

OK.  Thanks,

-- Gaby


Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs

2012-05-09 Thread Xinliang David Li
On Wed, May 9, 2012 at 11:21 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Wed, May 9, 2012 at 6:32 PM, Xinliang David Li davi...@google.com wrote:
 On Wed, May 9, 2012 at 1:12 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li davi...@google.com 
 wrote:
 To be clear, this flag is for malloc implementation (such as tcmalloc)
 with side effect unknown to the compiler. Using -fno-builtin-xxx is
 too conservative for that purpose.

 I don't think that flys.  Btw, the patch also guards alloca - alloca is 
 purely
 GCC internal.

 What's the unknown side-effects that are also important to preserve
 for free(malloc(4))?


 The side effect is the user registered malloc hooks.

 The pattern 'free(malloc(4)', or loops like

 for (..)
  {
    malloc(..); // result not used
  }

 itself is not interesting. They only appear in the test code and the
 problem can be worked around by using -fno-builtin-malloc. The option
 is to avoid disable this and all similar malloc optimizations (in the
 future) for malloc implementation with hooks.

 What is then interesting?  The above are all the same as if optimization
 figured out the storage is not used, no?

Compiler may choose to convert malloc call into alloca or coalesce
multiple malloc calls -- but this may not always be the best choice
given that more advanced locality based heap allocation is possible.
Nothing like this exist yet, but the point is that it is convenient to
have a way to keep malloc call while keeping its aliasing property.

David


 Richard.

 Thanks,

 David

 Richard.

 David

 On Tue, May 8, 2012 at 7:43 AM, Dehao Chen de...@google.com wrote:
 Hello,

 This patch adds a flag to guard the optimization that optimize the
 following code away:

 free (malloc (4));

 In some cases, we'd like this type of malloc/free pairs to remain in
 the optimized code.

 Tested with bootstrap, and no regression in the gcc testsuite.

 Is it ok for mainline?

 Thanks,
 Dehao

 gcc/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * common.opt (feliminate-malloc): New.
        * doc/invoke.texi: Document it.
        * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it.

 gcc/testsuite/ChangeLog
 2012-05-08  Dehao Chen  de...@google.com

        * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working
        as expected.

 Index: gcc/doc/invoke.texi
 ===
 --- gcc/doc/invoke.texi (revision 187277)
 +++ gcc/doc/invoke.texi (working copy)
 @@ -360,7 +360,8 @@
  -fcx-limited-range @gol
  -fdata-sections -fdce -fdelayed-branch @gol
  -fdelete-null-pointer-checks -fdevirtualize -fdse @gol
 --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects 
 @gol
 +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations 
 @gol
 +-ffat-lto-objects @gol
  -ffast-math -ffinite-math-only -ffloat-store
 -fexcess-precision=@var{style} @gol
  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 @@ -6238,6 +6239,7 @@
  -fdefer-pop @gol
  -fdelayed-branch @gol
  -fdse @gol
 +-feliminate-malloc @gol
  -fguess-branch-probability @gol
  -fif-conversion2 @gol
  -fif-conversion @gol
 @@ -6762,6 +6764,11 @@
  Perform dead store elimination (DSE) on RTL@.
  Enabled by default at @option{-O} and higher.

 +@item -feliminate-malloc
 +@opindex feliminate-malloc
 +Eliminate unnecessary malloc/free pairs.
 +Enabled by default at @option{-O} and higher.
 +
  @item -fif-conversion
  @opindex fif-conversion
  Attempt to transform conditional jumps into branch-less equivalents.  
 This
 Index: gcc/testsuite/gcc.dg/free-malloc.c
 ===
 --- gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/free-malloc.c  (revision 0)
 @@ -0,0 +1,12 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fno-eliminate-malloc } */
 +/* { dg-final { scan-assembler-times malloc 2} } */
 +/* { dg-final { scan-assembler-times free 2} } */
 +
 +extern void * malloc (unsigned long);
 +extern void free (void *);
 +
 +void test ()
 +{
 +  free (malloc (10));
 +}
 Index: gcc/common.opt
 ===
 --- gcc/common.opt      (revision 187277)
 +++ gcc/common.opt      (working copy)
 @@ -1474,6 +1474,10 @@
  Common Var(flag_dce) Init(1) Optimization
  Use the RTL dead code elimination pass

 +feliminate-malloc
 +Common Var(flag_eliminate_malloc) Init(1) Optimization
 +Eliminate unnecessary malloc/free pairs
 +
  fdse
  Common Var(flag_dse) Init(1) Optimization
  Use the RTL dead store elimination pass
 Index: gcc/tree-ssa-dce.c
 ===
 --- gcc/tree-ssa-dce.c  (revision 187277)
 +++ gcc/tree-ssa-dce.c  (working copy)
 @@ -309,6 +309,8 @@
            case 

[C++ Patch] PR 53301

2012-05-09 Thread Paolo Carlini

Hi,

shame on me. I think the patch almost qualifies as obvious. Tested 
x86_64-linux.


Thanks,
Paolo.

//

/cp
2012-05-10  Paolo Carlini  paolo.carl...@oracle.com

PR c++/53301
* decl.c (check_default_argument): Fix typo (POINTER_TYPE_P
instead of TYPE_PTR_P) in zero-as-null-pointer-constant warning.

/testsuite
2012-05-10  Paolo Carlini  paolo.carl...@oracle.com

PR c++/53301
* g++.dg/warn/Wzero-as-null-pointer-constant-6.C: New.

Index: testsuite/g++.dg/warn/Wzero-as-null-pointer-constant-6.C
===
--- testsuite/g++.dg/warn/Wzero-as-null-pointer-constant-6.C(revision 0)
+++ testsuite/g++.dg/warn/Wzero-as-null-pointer-constant-6.C(revision 0)
@@ -0,0 +1,6 @@
+// PR c++/53301
+// { dg-options -Wzero-as-null-pointer-constant }
+
+class x { public: x(int v) {} };
+
+void foo(const x = 0);
Index: cp/decl.c
===
--- cp/decl.c   (revision 187335)
+++ cp/decl.c   (working copy)
@@ -10619,7 +10619,7 @@ check_default_argument (tree decl, tree arg)
 
   if (warn_zero_as_null_pointer_constant
c_inhibit_evaluation_warnings == 0
-   (POINTER_TYPE_P (decl_type) || TYPE_PTR_TO_MEMBER_P (decl_type))
+   (TYPE_PTR_P (decl_type) || TYPE_PTR_TO_MEMBER_P (decl_type))
null_ptr_cst_p (arg)
!NULLPTR_TYPE_P (TREE_TYPE (arg)))
 {


Go patch committed: Add -fgo-pkgpath option

2012-05-09 Thread Ian Lance Taylor
This patch to the Go frontend adds a new command line option:
-fgo-pkgpath.  This permits setting the string that is returned by the
PkgPath() method of reflect.Type for types defined in the package being
compiled.  It also sets the prefix used for symbol names.  This largely
replaces the existing -fgo-prefix option, although I retained
-fgo-prefix for backward compatibility.  This will permit libgo to have
PkgPath values that are the same as the ones returned by the other Go
compiler.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline and 4.7 branch.

Ian


2012-05-09  Ian Lance Taylor  i...@google.com

* lang.opt: Add -fgo-pkgpath.
* go-lang.c (go_pkgpath): New static variable.
(go_prefix): New static variable.
(go_langhook_init): Pass go_pkgpath and go_prefix to
go_create_gogo.
(go_langhook_handle_option): Handle -fgo-pkgpath.  Change
-fgo-prefix handling to just set go_prefix.
* go-c.h (go_set_prefix): Don't declare.
(go_create_gogo): Add pkgpath and prefix to declaration.
* go-gcc.cc (Gcc_backend::global_variable): Change unique_prefix
to pkgpath.  Don't include the package name in the asm name.
* gccgo.texi (Invoking gccgo): Document -fgo-pkgpath.  Update the
docs for -fgo-prefix.


Index: gcc/go/lang.opt
===
--- gcc/go/lang.opt	(revision 187020)
+++ gcc/go/lang.opt	(working copy)
@@ -53,6 +53,10 @@ fgo-optimize-
 Go Joined RejectNegative
 -fgo-optimize-type	Turn on optimization passes in the frontend
 
+fgo-pkgpath=
+Go Joined RejectNegative
+-fgo-pkgpath=string	Set Go package path
+
 fgo-prefix=
 Go Joined RejectNegative
 -fgo-prefix=string	Set package-specific prefix for exported Go names
Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc	(revision 187020)
+++ gcc/go/go-gcc.cc	(working copy)
@@ -271,7 +271,7 @@ class Gcc_backend : public Backend
 
   Bvariable*
   global_variable(const std::string package_name,
-		  const std::string unique_prefix,
+		  const std::string pkgpath,
 		  const std::string name,
 		  Btype* btype,
 		  bool is_external,
@@ -1281,7 +1281,7 @@ Gcc_backend::non_zero_size_type(tree typ
 
 Bvariable*
 Gcc_backend::global_variable(const std::string package_name,
-			 const std::string unique_prefix,
+			 const std::string pkgpath,
 			 const std::string name,
 			 Btype* btype,
 			 bool is_external,
@@ -1310,9 +1310,9 @@ Gcc_backend::global_variable(const std::
 {
   TREE_PUBLIC(decl) = 1;
 
-  std::string asm_name(unique_prefix);
+  std::string asm_name(pkgpath);
   asm_name.push_back('.');
-  asm_name.append(var_name);
+  asm_name.append(name);
   SET_DECL_ASSEMBLER_NAME(decl, get_identifier_from_string(asm_name));
 }
   TREE_USED(decl) = 1;
Index: gcc/go/gccgo.texi
===
--- gcc/go/gccgo.texi	(revision 187020)
+++ gcc/go/gccgo.texi	(working copy)
@@ -157,14 +157,32 @@ compile time.
 When linking, specify a library search directory, as with
 @command{gcc}.
 
+@item -fgo-pkgpath=@var{string}
+@cindex @option{-fgo-pkgpath}
+Set the package path to use.  This sets the value returned by the
+PkgPath method of reflect.Type objects.  It is also used for the names
+of globally visible symbols.  The argument to this option should
+normally be the string that will be used to import this package after
+it has been installed; in other words, a pathname within the
+directories specified by the @option{-I} option.
+
 @item -fgo-prefix=@var{string}
 @cindex @option{-fgo-prefix}
+An alternative to @option{-fgo-pkgpath}.  The argument will be
+combined with the package name from the source file to produce the
+package path.  If @option{-fgo-pkgpath} is used, @option{-fgo-prefix}
+will be ignored.
+
 Go permits a single program to include more than one package with the
-same name.  This option is required to make this work with
-@command{gccgo}.  The argument to this option may be any string.  Each
-package with the same name must use a distinct @option{-fgo-prefix}
-option.  The argument is typically the full path under which the
-package will be installed, as that must obviously be unique.
+same name in the @code{package} clause in the source file, though
+obviously the two packages must be imported using different pathnames.
+In order for this to work with @command{gccgo}, either
+@option{-fgo-pkgpath} or @option{-fgo-prefix} must be specified when
+compiling a package.
+
+Using either @option{-fgo-pkgpath} or @option{-fgo-prefix} disables
+the special treatment of the @code{main} package and permits that
+package to be imported like any other.
 
 @item -frequire-return-statement
 @itemx -fno-require-return-statement
Index: gcc/go/gofrontend/gogo.cc

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-09 Thread Sharad Singhai
Thanks for your suggestions/comments. I have updated the patch and
documentation. It supports the following usage:

gcc  -fdump-tree-all=tree.dump -fdump-tree-pre=stdout
-fdump-rtl-ira=ira.dump

Here all tree dumps except the PRE are output into tree.dump, PRE dump
goes to stdout and the IRA dump goes to ira.dump.

Thanks,
Sharad

2012-05-09   Sharad Singhai  sing...@google.com

* doc/invoke.texi: Add documentation for the new option.
* tree-dump.c (dump_get_standard_stream): New function.
(dump_files): Update for new field.
(dump_switch_p_1): Handle dump filenames.
(dump_begin): Likewise.
(get_dump_file_name): Likewise.
(dump_end): Remove attribute.
(dump_enable_all): Add new parameter FILENAME.
All callers updated.
(enable_rtl_dump_file):
* tree-pass.h (enum tree_dump_index): Add new constant.
(struct dump_file_info): Add new field FILENAME.
* testsuite/g++.dg/other/dump-filename-1.C: New test.

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 187265)
+++ doc/invoke.texi (working copy)
@@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio

 @item -d@var{letters}
 @itemx -fdump-rtl-@var{pass}
+@itemx -fdump-rtl-@var{pass}=@var{filename}
 @opindex d
 Says to make debugging dumps during compilation at times specified by
 @var{letters}.  This is used for debugging the RTL-based passes of the
 compiler.  The file names for most of the dumps are made by appending
 a pass number and a word to the @var{dumpname}, and the files are
-created in the directory of the output file.  Note that the pass
-number is computed statically as passes get registered into the pass
-manager.  Thus the numbering is not related to the dynamic order of
-execution of passes.  In particular, a pass installed by a plugin
-could have a number over 200 even if it executed quite early.
-@var{dumpname} is generated from the name of the output file, if
-explicitly specified and it is not an executable, otherwise it is the
-basename of the source file. These switches may have different effects
-when @option{-E} is used for preprocessing.
+created in the directory of the output file. If the
+@option{=@var{filename}} is appended to the longer form of the dump
+option then the dump is done on that file instead of numbered
+files. Note that the pass number is computed statically as passes get
+registered into the pass manager.  Thus the numbering is not related
+to the dynamic order of execution of passes.  In particular, a pass
+installed by a plugin could have a number over 200 even if it executed
+quite early.  @var{dumpname} is generated from the name of the output
+file, if explicitly specified and it is not an executable, otherwise
+it is the basename of the source file. These switches may have
+different effects when @option{-E} is used for preprocessing.

 Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
 @option{-d} option @var{letters}.  Here are the possible
@@ -5719,15 +5722,18 @@ counters for each function compiled.

 @item -fdump-tree-@var{switch}
 @itemx -fdump-tree-@var{switch}-@var{options}
+@itemx -fdump-tree-@var{switch}-@var{options}=@var{filename}
 @opindex fdump-tree
 Control the dumping at various stages of processing the intermediate
 language tree to a file.  The file name is generated by appending a
 switch specific suffix to the source file name, and the file is
-created in the same directory as the output file.  If the
-@samp{-@var{options}} form is used, @var{options} is a list of
-@samp{-} separated options which control the details of the dump.  Not
-all options are applicable to all dumps; those that are not
-meaningful are ignored.  The following options are available
+created in the same directory as the output file. In case of
+@option{=@var{filename}} option, the dump is output on the given file
+name instead.  If the @samp{-@var{options}} form is used,
+@var{options} is a list of @samp{-} separated options which control
+the details or location of the dump.  Not all options are applicable
+to all dumps; those that are not meaningful are ignored.  The
+following options are available

 @table @samp
 @item address
@@ -5765,9 +5771,46 @@ Enable showing the tree dump for each statement.
 Enable showing the EH region number holding each statement.
 @item scev
 Enable showing scalar evolution analysis details.
+@item slim
+Inhibit dumping of members of a scope or body of a function merely
+because that scope has been reached.  Only dump such items when they
+are directly reachable by some other path.  When dumping pretty-printed
+trees, this option inhibits dumping the bodies of control structures.
+@item =@var{filename}
+Instead of using an auto generated dump file name, use the given file
+name. The file names @file{stdout} and @file{stderr} are treated
+specially and are considered 

[h8300] increase dwarf address size

2012-05-09 Thread DJ Delorie

H8/300 cpus have a larger-than-64k address space, despite 16-bit
pointers.  OK to apply?  Ok for 4.7 branch?

See also http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48231

* config/h8300/h8300.h (DWARF2_ADDR_SIZE): Define as 4 bytes.

Index: h8300.h
===
--- h8300.h (revision 187362)
+++ h8300.h (working copy)
@@ -126,12 +126,13 @@ extern const char * const *h8_reg_names;
 
 /* The return address is pushed on the stack.  */
 #define INCOMING_RETURN_ADDR_RTX   gen_rtx_MEM (Pmode, gen_rtx_REG (Pmode, 
STACK_POINTER_REGNUM))
 #define INCOMING_FRAME_SP_OFFSET   (POINTER_SIZE / 8)
 
 #define DWARF_CIE_DATA_ALIGNMENT   2
+#define DWARF2_ADDR_SIZE   4
 
 /* Define this if addresses of constant functions
shouldn't be put through pseudo regs where they can be cse'd.
Desirable on machines where ordinary constants are expensive
but a CALL with constant address is cheap.
 


Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-09 Thread Xinliang David Li
Bummer.  I was thinking to reserve '=' for selective  dumping:

-fdump-tree-pre=func_list_regexp

I guess this can be achieved via @

-fdump-tree-pre@func_list

-fdump-tree-pre=file_name@func_list


Another issue -- I don't think the current precedence rule is correct.
Consider that -fopt-info=2 will be mapped to

-fdump-tree-all-transform-verbose2=stderr
-fdump-rtl-all-transform-verbose2=stderr

then

the current precedence rule will cause surprise when the following is used

-fopt-info -fdump-tree-pre

The PRE dump will be emitted to stderr which is not what user wants.
In short, special streams should be treated as 'weak' the same way as
your previous implementation.

thanks,

David



On Wed, May 9, 2012 at 4:56 PM, Sharad Singhai sing...@google.com wrote:
 Thanks for your suggestions/comments. I have updated the patch and
 documentation. It supports the following usage:

 gcc  -fdump-tree-all=tree.dump -fdump-tree-pre=stdout
 -fdump-rtl-ira=ira.dump

 Here all tree dumps except the PRE are output into tree.dump, PRE dump
 goes to stdout and the IRA dump goes to ira.dump.

 Thanks,
 Sharad

 2012-05-09   Sharad Singhai  sing...@google.com

        * doc/invoke.texi: Add documentation for the new option.
        * tree-dump.c (dump_get_standard_stream): New function.
        (dump_files): Update for new field.
        (dump_switch_p_1): Handle dump filenames.
        (dump_begin): Likewise.
        (get_dump_file_name): Likewise.
        (dump_end): Remove attribute.
        (dump_enable_all): Add new parameter FILENAME.
        All callers updated.
        (enable_rtl_dump_file):
        * tree-pass.h (enum tree_dump_index): Add new constant.
        (struct dump_file_info): Add new field FILENAME.
        * testsuite/g++.dg/other/dump-filename-1.C: New test.

 Index: doc/invoke.texi
 ===
 --- doc/invoke.texi     (revision 187265)
 +++ doc/invoke.texi     (working copy)
 @@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio

  @item -d@var{letters}
  @itemx -fdump-rtl-@var{pass}
 +@itemx -fdump-rtl-@var{pass}=@var{filename}
  @opindex d
  Says to make debugging dumps during compilation at times specified by
  @var{letters}.  This is used for debugging the RTL-based passes of the
  compiler.  The file names for most of the dumps are made by appending
  a pass number and a word to the @var{dumpname}, and the files are
 -created in the directory of the output file.  Note that the pass
 -number is computed statically as passes get registered into the pass
 -manager.  Thus the numbering is not related to the dynamic order of
 -execution of passes.  In particular, a pass installed by a plugin
 -could have a number over 200 even if it executed quite early.
 -@var{dumpname} is generated from the name of the output file, if
 -explicitly specified and it is not an executable, otherwise it is the
 -basename of the source file. These switches may have different effects
 -when @option{-E} is used for preprocessing.
 +created in the directory of the output file. If the
 +@option{=@var{filename}} is appended to the longer form of the dump
 +option then the dump is done on that file instead of numbered
 +files. Note that the pass number is computed statically as passes get
 +registered into the pass manager.  Thus the numbering is not related
 +to the dynamic order of execution of passes.  In particular, a pass
 +installed by a plugin could have a number over 200 even if it executed
 +quite early.  @var{dumpname} is generated from the name of the output
 +file, if explicitly specified and it is not an executable, otherwise
 +it is the basename of the source file. These switches may have
 +different effects when @option{-E} is used for preprocessing.

  Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
  @option{-d} option @var{letters}.  Here are the possible
 @@ -5719,15 +5722,18 @@ counters for each function compiled.

  @item -fdump-tree-@var{switch}
  @itemx -fdump-tree-@var{switch}-@var{options}
 +@itemx -fdump-tree-@var{switch}-@var{options}=@var{filename}
  @opindex fdump-tree
  Control the dumping at various stages of processing the intermediate
  language tree to a file.  The file name is generated by appending a
  switch specific suffix to the source file name, and the file is
 -created in the same directory as the output file.  If the
 -@samp{-@var{options}} form is used, @var{options} is a list of
 -@samp{-} separated options which control the details of the dump.  Not
 -all options are applicable to all dumps; those that are not
 -meaningful are ignored.  The following options are available
 +created in the same directory as the output file. In case of
 +@option{=@var{filename}} option, the dump is output on the given file
 +name instead.  If the @samp{-@var{options}} form is used,
 +@var{options} is a list of @samp{-} separated options which control
 +the details or location of the dump.  Not all 

[Patch / RFC] Improving more locations for binary expressions

2012-05-09 Thread Paolo Carlini

Hi,

I'm looking into making more progress on those locations, for another 
class of cases. Consider:


struct T { };

T foo();

void bar(int a, int b)
{
  if (foo()  a  b)
;
}

thus, in this case, we have a class type T, instead of void. The error 
message is:


a.cc: In function ‘void bar(int, int)’:
a.cc:7:20: error: no match for ‘operator’ (operand types are ‘T’ and 
‘bool’)

   if (foo()  a  b)
^
a.cc:7:20: note: candidate is:
a.cc:7:20: note: operator(bool, bool) built-in
   if (foo()  a  b)
^
a.cc:7:20: note:   no known conversion for argument 1 from ‘T’ to ‘bool’

in case my message ends up garbled, the carets do not point to  
(column 13), two times point to b (column 20), which is obviously wrong. 
In other terms, all the columns are 20, all wrong.


Now, a first step toward fixing this is improving on the work I did some 
days ago on cp_parser_binary_expression: make sure that the location we 
are eventually passing to build_x_binary_op is always, provably, the 
location of the operator. It seems to me that In order to achieve that, 
the right thing to do is always handling tree_type and loc (the location 
of the operator) together, per the attached patch_parser (the 
initialization of loc to input_location is just to prevent uninitialized 
warnings)


Then, with patch_parser applied (it passes testing), we end up with a 
correct first caret for the testcase above, that for no match, but the 
second one remains wrong. What happens is that the location is correct 
up to the print_z_candidates call. It gets a correct location as 
argument, and prints the candidate is message with the right column, 
13. Then it calls print_z_candidate without passing a location, which, 
in turn:


- Prints the built-in message with input_location as location, 
which is 20, wrong.
- Then calls print_conversion_rejection passing location_of 
(candidate-fn) which is again 20, again wrong.


Now, if I change print_z_candidate to get the location_t from 
print_z_candidates and use it for both the last two outputs, I finally get:


a.cc: In function ‘void bar(int, int)’:
a.cc:7:13: error: no match for ‘operator’ (operand types are ‘T’ and 
‘bool’)

   if (foo()  a  b)
 ^
a.cc:7:13: note: candidate is:
a.cc:7:13: note: operator(bool, bool) built-in
a.cc:7:13: note:   no known conversion for argument 1 from ‘T’ to ‘bool’

again, in case the text is garbled, all the locations seems finally 
right to me, and, interestingly there is no caret at the end, which 
seems Ok.


Now, what I suspect we are doing wrong here, is that we are not handling 
separately, in terms of locations, the *builtin* overloads, which should 
simply use always the location passed from upstream (as I did in my 
quick experiment) from all the other overloads, for which the current 
code seems fine. Seems right? The patch for this second half of the 
issue seems now also rather simple to me.


Thanks,
Paolo.

///

Index: parser.c
===
--- parser.c(revision 187362)
+++ parser.c(working copy)
@@ -1621,6 +1621,8 @@ typedef struct cp_parser_expression_stack_entry
   enum tree_code tree_type;
   /* Precedence of the binary operation we are parsing.  */
   enum cp_parser_prec prec;
+  /* Location of the binary operation we are parsing.  */
+  location_t loc;
 } cp_parser_expression_stack_entry;
 
 /* The stack for storing partial expressions.  We only need NUM_PREC_VALUES
@@ -7277,7 +7279,7 @@ cp_parser_binary_expression (cp_parser* parser, bo
   cp_parser_expression_stack_entry *sp = stack[0];
   tree lhs, rhs;
   cp_token *token;
-  location_t loc;
+  location_t loc = input_location;
   enum tree_code tree_type, lhs_type, rhs_type;
   enum cp_parser_prec new_prec, lookahead_prec;
   tree overload;
@@ -7290,7 +7292,6 @@ cp_parser_binary_expression (cp_parser* parser, bo
 {
   /* Get an operator token.  */
   token = cp_lexer_peek_token (parser-lexer);
-  loc = token-location;
 
   if (warn_cxx0x_compat
token-type == CPP_RSHIFT
@@ -7320,6 +7321,7 @@ cp_parser_binary_expression (cp_parser* parser, bo
 
  get_rhs:
   tree_type = binops_by_token[token-type].tree_type;
+  loc = token-location;
 
   /* We used the operator token.  */
   cp_lexer_consume_token (parser-lexer);
@@ -7349,6 +7351,7 @@ cp_parser_binary_expression (cp_parser* parser, bo
 stack overflows.  */
  sp-prec = prec;
  sp-tree_type = tree_type;
+ sp-loc = loc;
  sp-lhs = lhs;
  sp-lhs_type = lhs_type;
  sp++;
@@ -7370,6 +7373,7 @@ cp_parser_binary_expression (cp_parser* parser, bo
  --sp;
  prec = sp-prec;
  tree_type = sp-tree_type;
+ loc = sp-loc;
  rhs = lhs;
  rhs_type = lhs_type;
  lhs = sp-lhs;



Re: [Patch / RFC] Improving more locations for binary expressions

2012-05-09 Thread Miles Bader
Paolo Carlini paolo.carl...@oracle.com writes:
 in case my message ends up garbled, the carets do not point to 
 (column 13), two times point to b (column 20), which is obviously
 wrong. In other terms, all the columns are 20, all wrong.

The new caret support does seem to have revealed a bunch of places
where the column info in error messages was pretty screwy... I guess
nobody paid much attention to it before... :]

Should these get reported as bugzilla bugs...?

-miles

-- 
The trouble with most people is that they think with their hopes or
fears or wishes rather than with their minds.  -- Will Durant