Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Richard Biener
On Wed, 28 Aug 2013, Michael Matz wrote:

> Hi,
> 
> On Wed, 21 Aug 2013, Richard Biener wrote:
> 
> > I also fail to see why threads should not work here.  Maybe simply 
> > annotate gcc with openmp?
> 
> Threads simply don't work here, because the whole streamer infrastructure 
> (or anything else in GCC for that matter) isn't thread safe (you'd have to 
> have multiple streamer objects, multiple SCC finder objects, and you'd 
> have to audit everything for not using any other shared resources).

Hm, yeah, of course.

> Fork-fire-forget is really a much simpler choice here IMO; no worries 
> about shared resources, less debug hassle.

It might be not as cheap as it is on Linux hosts on other hosts of
course.  Also I'd rather try to avoid I/O than solving the issue
by parallelizing it.  Of course we can always come back to this
kind of hack later.

Richard.


Re: wide-int branch now up for public comment and review

2013-08-29 Thread Richard Biener
On Wed, 28 Aug 2013, Mike Stump wrote:

> On Aug 28, 2013, at 3:22 AM, Richard Biener  wrote:
> > Btw, rtl.h still wastes space with
> > 
> > struct GTY((variable_size)) hwivec_def {
> >  int num_elem; /* number of elements */
> >  HOST_WIDE_INT elem[1];
> > };
> > 
> > struct GTY((chain_next ("RTX_NEXT (&%h)"),
> >chain_prev ("RTX_PREV (&%h)"), variable_size)) rtx_def {
> > ...
> >  /* The first element of the operands of this rtx.
> > The number of operands and their types are controlled
> > by the `code' field, according to rtl.def.  */
> >  union u {
> >rtunion fld[1];
> >HOST_WIDE_INT hwint[1];
> >struct block_symbol block_sym;
> >struct real_value rv;
> >struct fixed_value fv;
> >struct hwivec_def hwiv;
> >  } GTY ((special ("rtx_def"), desc ("GET_CODE (&%0)"))) u;
> > };
> > 
> > there are 32bits available before the union.  If you don't use
> > those for num_elem then all wide-ints will at least take as
> > much space as DOUBLE_INTs originally took - and large ints
> > that would have required DOUBLE_INTs in the past will now
> > require more space than before.  Which means your math
> > motivating the 'num_elem' encoding stuff is wrong.  With
> > moving 'num_elem' before u you can even re-use the hwint
> > field in the union as the existing double-int code does
> > (which in fact could simply do the encoding trick in the
> > old CONST_DOUBLE scheme, similar for the tree INTEGER_CST
> > container).
> 
> So, HOST_WIDE_INT is likely 64 bits, and likely is 64 bit aligned.  The 
> base (stuff before the union) is 32 bits.  There is a 32 bit gap, even 
> if not used before the HOST_WIDE_INT elem.  We place the num_elem is 
> this gap.

No, you don't.  You place num_elem 64bit aligned _after_ the gap.
And you have another 32bit gap, as you say, before elem.

> Even if the field were removed, the size would not change, 
> nor the placement of elem.  So, short of packing, a 32-bit HWI host or 
> going with a 32-bit type instead of a HOST_WIDE_INT, I'm not sure I 
> follow you?  I tend to discount 32-bit hosted compilers as a thing of 
> the past.

Me, too.  On 32bit hosts nothing would change as 'u' is 32bit aligned
there (ok, on 32bit hosts putting num_elem before 'u' would actually
increase memory usage - but as you say, 32bit hosted compilers are
a thing of the past ;)).

Richard.


[PATCH] Fix PR57685

2013-08-29 Thread Richard Biener

This fixes PR57685 - exponential behavior in VRP assert expression
insertion.  The fix below cuts off in the simplest possible way.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-08-28  Richard Biener  

PR tree-optimization/57685
* tree-vrp.c (register_edge_assert_for_1): Recurse only for
single-use operands to avoid exponential complexity.

* gcc.dg/torture/pr57685.c: New testcase.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c  (revision 202050)
--- gcc/tree-vrp.c  (working copy)
*** register_edge_assert_for_1 (tree op, enu
*** 5410,5419 
   && gimple_assign_rhs_code (op_def) == BIT_IOR_EXPR))
  {
/* Recurse on each operand.  */
!   retval |= register_edge_assert_for_1 (gimple_assign_rhs1 (op_def),
!   code, e, bsi);
!   retval |= register_edge_assert_for_1 (gimple_assign_rhs2 (op_def),
!   code, e, bsi);
  }
else if (gimple_assign_rhs_code (op_def) == BIT_NOT_EXPR
   && TYPE_PRECISION (TREE_TYPE (gimple_assign_lhs (op_def))) == 1)
--- 5410,5423 
   && gimple_assign_rhs_code (op_def) == BIT_IOR_EXPR))
  {
/* Recurse on each operand.  */
!   tree op0 = gimple_assign_rhs1 (op_def);
!   tree op1 = gimple_assign_rhs2 (op_def);
!   if (TREE_CODE (op0) == SSA_NAME
! && has_single_use (op0))
!   retval |= register_edge_assert_for_1 (op0, code, e, bsi);
!   if (TREE_CODE (op1) == SSA_NAME
! && has_single_use (op1))
!   retval |= register_edge_assert_for_1 (op1, code, e, bsi);
  }
else if (gimple_assign_rhs_code (op_def) == BIT_NOT_EXPR
   && TYPE_PRECISION (TREE_TYPE (gimple_assign_lhs (op_def))) == 1)
Index: gcc/testsuite/gcc.dg/torture/pr57685.c
===
*** gcc/testsuite/gcc.dg/torture/pr57685.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr57685.c  (working copy)
***
*** 0 
--- 1,15 
+ /* { dg-do compile } */
+ 
+ unsigned f(void)
+ {
+   unsigned a;
+   int b, c, d, e;
+ 
+   for(c = 27; c < 40; c++)
+ b |= d |= b;
+ 
+   if(b)
+ a = e;
+ 
+   return a;
+ }


Re: Make some comdats implicitly hidden

2013-08-29 Thread Jan Hubicka
Paolo,
there seems to be one extra issue about this patch. It causes quite a twist in 
libstdc++ exported symbols.
It is purpose of the patch to remove those that are going to be generated in 
user programs, too.
I am however bit confused about bad array. Perhaps it is an optimization 
difference dragging it in?
Does the changes look sensible?

Honza


21 added symbols 
0
_ZNSt20bad_array_new_lengthD2Ev
std::bad_array_new_length::~bad_array_new_length()
version status: compatible
CXXABI_1.3.8
type: function
status: added

1
__cxa_throw_bad_array_length
version status: compatible
CXXABI_1.3.8
type: function
status: added

2
_ZNKSt20bad_array_new_length4whatEv
std::bad_array_new_length::what() const
version status: compatible
CXXABI_1.3.8
type: function
status: added

3
_ZNSt20bad_array_new_lengthD0Ev
std::bad_array_new_length::~bad_array_new_length()
version status: compatible
CXXABI_1.3.8
type: function
status: added

4
_ZSt14get_unexpectedv
std::get_unexpected()
version status: compatible
GLIBCXX_3.4.20
type: function
status: added

5
_ZTVSt16bad_array_length
vtable for std::bad_array_length
version status: compatible
CXXABI_1.3.8
type: object
type size: 40
status: added

6
_ZTSSt16bad_array_length
typeinfo name for std::bad_array_length
version status: compatible
CXXABI_1.3.8
type: object
type size: 21
status: added

7
_ZNSt16bad_array_lengthD2Ev
std::bad_array_length::~bad_array_length()
version status: compatible
CXXABI_1.3.8
type: function
status: added

8
_ZSt15get_new_handlerv
std::get_new_handler()
version status: compatible
GLIBCXX_3.4.20
type: function
status: added

9
CXXABI_1.3.8
version status: compatible
type: object
type size: 0
status: added

10
_ZTVSt20bad_array_new_length
vtable for std::bad_array_new_length
version status: compatible
CXXABI_1.3.8
type: object
type size: 40
status: added

11
_ZTSSt20bad_array_new_length
typeinfo name for std::bad_array_new_length
version status: compatible
CXXABI_1.3.8
type: object
type size: 25
status: added

12
GLIBCXX_3.4.20
version status: compatible
type: object
type size: 0
status: added

13
_ZTISt20bad_array_new_length
typeinfo for std::bad_array_new_length
version status: compatible
CXXABI_1.3.8
type: object
type size: 24
status: added

14
_ZSt13get_terminatev
std::get_terminate()
version status: compatible
GLIBCXX_3.4.20
type: function
status: added

15
_ZNSt16bad_array_lengthD0Ev
std::bad_array_length::~bad_array_length()
version status: compatible
CXXABI_1.3.8
type: function
status: added

16
__cxa_throw_bad_array_new_length
version status: compatible
CXXABI_1.3.8
type: function
status: added

17
_ZNSt16bad_array_lengthD1Ev
std::bad_array_length::~bad_array_length()
version status: compatible
CXXABI_1.3.8
type: function
status: added

18
_ZTISt16bad_array_length
typeinfo for std::bad_array_length
version status: compatible
CXXABI_1.3.8
type: object
type size: 24
status: added

19
_ZNSt20bad_array_new_lengthD1Ev
std::bad_array_new_length::~bad_array_new_length()
version status: compatible
CXXABI_1.3.8
type: function
status: added

20
_ZNKSt16bad_array_length4whatEv
std::bad_array_length::what() const
version status: compatible
CXXABI_1.3.8
type: function
status: added


17 missing symbols 
0
_ZN9__gnu_cxx18stdio_sync_filebufIwSt11char_traitsIwEE6xsputnEPKwl
__gnu_cxx::stdio_sync_filebuf 
>::xsputn(wchar_t const*, long)
version status: unversioned
type: function
status: subtracted

1
_ZN9__gnu_cxx18stdio_sync_filebufIwSt11char_traitsIwEED0Ev
__gnu_cxx::stdio_sync_filebuf 
>::~stdio_sync_filebuf()
version status: unversioned
type: function
status: subtracted

2
_ZNKSt5ctypeIcE8do_widenEc
std::ctype::do_widen(char) const
version status: unversioned
type: function
status: subtracted

3
_ZN9__gnu_cxx18stdio_sync_filebufIcSt11char_traitsIcEE6xsgetnEPcl
__gnu_cxx::stdio_sync_filebuf >::xsgetn(char*, 
long)
version status: unversioned
type: function
status: subtracted

4
_ZNSt15basic_stringbufIcSt11char_traitsIcESaIcEED0Ev
std::basic_stringbuf, std::allocator 
>::~basic_stringbuf()
version status: unversioned
type: function
status: subtracted

5
_ZNSt15basic_stringbufIwSt11char_traitsIwESaIwEED0Ev
std::basic_stringbuf, 
std::allocator >::~basic_stringbuf()
version status: unversioned
type: function
status: subtracted

6
_ZNK10__cxxabiv117__pbase_type_info15__pointer_catchEPKS0_PPvj
__cxxabiv1::__pbase_type_info::__pointer_catch(__cxxabiv1::__pbase_type_info 
const*, void**, unsigned int) const
version status: unversioned
type: function
status: subtracted

7
_ZNKSt5ctypeIcE8do_widenEPKcS2_Pc
std::ctype::do_widen(char const*, char const*, char*) const
version status: unversioned
type: function
status: subtracted

8
_ZNSt15basic_stringbufIwSt11char_traitsIwESaIwEED1Ev
std::basic_stringbuf, 
std::allocator >::~basic_stringbuf()
version status: unversioned
type: function
status: subtracted

9
_ZN9__gnu_cxx18stdio_sync_filebufIcSt11char_traitsIcEED1Ev
__gnu_cxx::stdio_sync_filebuf 
>::~stdio_sync_filebuf()
version status: unversioned
type: function
status: su

Re: [PATCH]. Fix HAVE_SYS_SDT_H for cross-compilation

2013-08-29 Thread Christian Bruel
Hello Bill and Jakub

On 08/22/2013 07:47 PM, Jakub Jelinek wrote:
> On Thu, Aug 22, 2013 at 09:39:48AM -0500, Bill Schmidt wrote:
>> Hi Christian and Jakub,
>>
>> I'm curious whether there was ever any resolution for:
>> http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01124.html.
>

Sorry for not having sent a follow up for this.

The problem is that configure was checking for cross features in the
host root dir instead of the cross root dir.

This SDT failure was only the visible part of the problem while building
a Canadian Cross linux hosted GCC, as  we could as well silently test
for different cross/target runtime features :-).

I fixed this problem  by fixing the usage of with_build_sysroot while
checking system features with target_header_dir when host != build.
Checked for legacy issue with various bare or hosted SH4 compilers in
various environments (linux, mingwn, cygwin)

Comments ? does this is seems reasonable to push to trunk ?

Cheers

Christian


2012-12-21  Christian Bruel  

   * configure.ac: Set target_header_dir for with-build-sysroot.
   * configure: Regenerate.

Index: gcc/configure
===
--- gcc/configure	(revision 202068)
+++ gcc/configure	(working copy)
@@ -27011,6 +27011,8 @@ if test x$host != x$target || test "x$TARGET_SYSTE
   else
 target_header_dir="${with_sysroot}${native_system_header_dir}"
   fi
+elif test x$host != x$build && test "x$with_build_sysroot" != "x"; then
+  target_header_dir="${with_build_sysroot}${native_system_header_dir}"
 else
   target_header_dir=${native_system_header_dir}
 fi
Index: gcc/configure.ac
===
--- gcc/configure.ac	(revision 202068)
+++ gcc/configure.ac	(working copy)
@@ -4822,6 +4822,8 @@ if test x$host != x$target || test "x$TARGET_SYSTE
   else
 target_header_dir="${with_sysroot}${native_system_header_dir}"
   fi
+elif test x$host != x$build && test "x$with_build_sysroot" != "x"; then
+  target_header_dir="${with_build_sysroot}${native_system_header_dir}"
 else
   target_header_dir=${native_system_header_dir}
 fi


Re: [PATCH 1/2] fix PR49847 ICE-on-HAVE_cc0 regression from PR50780 changes

2013-08-29 Thread Mikael Pettersson
Jeff Law writes:
 > >* cse.c (fold_rtx) : If prev_insn_cc0 is NULL
 > >don't call equiv_constant on it.
 > I can't help but feel something different needs to be done here.   It's 
 > always been the case that those two insns are expected to be contiguous 
 > and there's going to be code all over the place that makes that 
 > assumption and that model is what every GCC developer has expected for 
 > the last 20+ years.
 > 
 > What precisely about Richi's patch causes the setter and user to end up 
 > in different blocks?  I'm guessing we now consider the comparison itself 
 > as potentially trapping and thus we end the block immediately?  The 
 > consumer then appears in the next block?

Yes, r180192 makes tree-eh.c (stmt_could_throw_1_p) dive further into
the expression, apparently finding a potentially trapping FP comparison,
which causes the code generation difference.

gcc-4.5 generated:

.type   _Z1ff, @function
_Z1ff:
.LFB0:
.cfi_startproc
link.w %fp,#0
.cfi_def_cfa 14, 8
ftst.s 8(%fp)
fsge %d0
extb.l %d0
neg.l %d0
unlk %fp
.cfi_offset 14, -8
rts
.cfi_endproc

while gcc >= 4.7 (with the ICE-preventing patch applied) generate:

.type   _Z1ff, @function
_Z1ff:
.LFB0:
.cfi_startproc
.cfi_personality 0,__gxx_personality_v0
.cfi_lsda 0,.LLSDA0
link.w %fp,#0
.cfi_offset 14, -8
.cfi_def_cfa 14, 8
.LEHB0:
.LEHE0:
.LEHB1:
ftst.s 8(%fp)
.LEHE1:
fsge %d0
neg.b %d0
and.l #255,%d0
jra .L1
.L3:
move.l %d0,-(%sp)
jsr __cxa_begin_catch
addq.l #4,%sp
.LEHB2:
jsr __cxa_end_catch
.L1:
.LEHE2:
unlk %fp
rts
.cfi_endproc

plus a lot of exception table data.  (Note the label before the 'fsge'.)

 >   Alternately (and probably better) 
 > would be to convert the remaining stragglers and drop the old cc0 
 > support entirely.

I'd love to see that happen, but it would take someone like me a lot of time
to complete.  (I did try an incremental conversion of just the FP stuff 2-3
years ago, but it didn't work well.  Now I'd probably rewrite m68k.md and
everything cc0-related in m68k.c from scratch.)

/Mikael


Re: wide-int branch now up for public comment and review

2013-08-29 Thread Richard Biener
On Wed, 28 Aug 2013, Kenneth Zadeck wrote:

> 
> > > >Note that the bits above the precision are not defined and the
> > > >algorithms used here are careful not to depend on their value.  In
> > > >particular, values that come in from rtx constants may have random
> > > >bits.
> > Which is a red herring.  It should be fixed.  I cannot even believe
> > that sentence given the uses of CONST_DOUBLE_LOW/CONST_DOUBLE_HIGH
> > or INTVAL/UINTVAL.  I don't see accesses masking out 'undefined' bits
> > anywhere.
> > 
> I can agree with you that this could be fixed.   But it is not necessary to
> fix it.   The rtl level and most of the tree level has existed for a long time
> by doing math within the precision.
> 
> you do not see the masking at the rtl level because the masking is not
> necessary.if you never look beyond the precision you just do not care.
> There is the issue with divide and mod and quite frankly the code on the trunk
> scares me to death.   my code at the rtl level makes sure that everything is
> clean since i can see if it is a udiv or a div that is enough info to clean
> the value up.
> 
> > > I have a feeling I'm rehashing a past debate, sorry, but rtx constants
> > > can't
> > > have random bits.  The upper bits must be a sign extension of the value.
> > > There's exactly one valid rtx for each (value, mode) pair.  If you saw
> > > something different then that sounds like a bug.  The rules should already
> > > be fairly well enforced though, since something like (const_int 128) --
> > > or (const_int 256) -- will not match a QImode operand.
> > See.  We're saved ;)
> this is richard's theory.   There is a block of code at wide-int.cc:175 that
> is ifdefed out that checks to see if the rtl consts are canonical.   if you
> bring it in, building the libraries fails very quickly.   The million dollar
> question is this the only bug or is this the first bug of a 1000 bugs.   Your
> comment about not seeing any masking cuts both ways.   There is very little
> code on the way in if you create ints with GEN_INT that makes sure someone has
> done the right thing on that side.   So my guess is that there are a lot of
> failures and a large number of them will be in the ports.

Well, clearly RTL code _expects_ constants to be properly extended.  I
can reproduce the bug and I simply guess that's a matter of mismatching
modes here (or performing an implicit truncation without properly
extending the constant).

at /space/rguenther/src/svn/wide-int/gcc/combine.c:10086
10086  GEN_INT 
(count));
(gdb) l
10081
10082 mask_rtx = GEN_INT (nonzero_bits (varop, GET_MODE 
(varop)));
10083
10084 mask_rtx
10085   = simplify_const_binary_operation (code, 
result_mode, mask_rtx,
10086  GEN_INT 
(count));

uses of GEN_INT are frowned upon ... for exactly that reason - the
mask_rtx is not a proper RTL constant for SImode.

Btw, all this isn't a reason to not have a well-defined wide-int
representation.  You just have (as you have for trees) to properly
canonize the representation at the time you convert from RTL
constants to wide-int constants.

In the long run we want a uniform representation of constants
so we can do zero-copying - but it looks like we now have
three different representations - the tree one (sign or zero
extended dependent on sign), RTL (garbage as you show) and
wide-int (always sign-extended).

That's why I was looking at at least matching what tree does
(because tree constants _are_ properly canonized).

Richard.


Re: Make some comdats implicitly hidden

2013-08-29 Thread Paolo Carlini

Hi,

On 08/29/2013 10:11 AM, Jan Hubicka wrote:

Paolo,
there seems to be one extra issue about this patch. It causes quite a twist in 
libstdc++ exported symbols.
It is purpose of the patch to remove those that are going to be generated in 
user programs, too.
I am however bit confused about bad array. Perhaps it is an optimization 
difference dragging it in?
I'm also confused because your patch isn't already in (at least I can't 
find such a ChangeLog in svn) and what you attached, as long as 
bad_array_* is concerned is just what we currently have, before the 
patch. Check point: 21 added symbols.


In general, of course, inlining alone can make a difference, if for 
example you inline less and a pattern in the linker map isn't tight 
enough some symbols can become inadvertently exported.


*But*, outside bad_array* I can see a lot of SUBTRACTED, those would be 
very bad regressions of course. check_abi cannot pass.


In the past, when we (library people) had special issues with 
versioning, like we had to patch a mistake we made, Jakub promptly 
helped, I would recommend referring to him too, if you are seeing 
something weird, but the general rule as far as testing is concerned is 
very simple: *never* ignore check_abi failures, make sure additional 
symbols are exported only at the current ABI version, never add symbols 
at older versions, never remove symbols.


Paolo.




Re: [PATCH] PR58143/58227 wrong code at -O3

2013-08-29 Thread Richard Biener
On Wed, Aug 28, 2013 at 11:24 PM, Bernd Edlinger
 wrote:
> The lim pass (aka loop invariant motion) can move conditional expressions 
> with possible undefined behavior out of the if statement inside a loop which 
> may cause the loop optimization to silently generate wrong code as PR 
> tree-optimization/58143 and PR tree-optimization/58227 demonstrate.
>
> This patch prevents lim from moving code that might cause undefined behavior.
>
> This triggered a minor regression in gcc.target/i386/pr53397-1.c:
> Here lim used to move the expression "2*step" out of the loop, but this may 
> cause undefined behavior on case of overflow, I propose to resolve this by 
> adding -fno-strict-overflow. The test case looks pretty constructed anyway.
>
> The patch was boot-strapped and regression tested on i686-pc-linux-gnu and
> X86_64-linux-gnu
>
> OK for trunk?

2013-08-28  Bernd Edlinger  

PR tree-optimization/58143
PR tree-optimization/58227
Prevent lim from moving conditional expressions
if that could trigger undefined behavior.
* gimple.h (gimple_could_trap_p_2): New function.
* gimple.c (gimple_could_trap_p_2): New function.
(gimple_could_trap_p_1): Call gimple_could_trap_p_2.
* tree-ssa-loop-im.c (movement_possibility): Likewise.

Please fold the overall comment into the movement_possibility changelog.

Can you instead of introducing gimple_could_trap_p_2 simply change the
signature and name of gimple_could_trap_p_1 please?  It is only called
three times.
A proper name could be gimple_could_trap_or_invoke_undefined_p.

That you need to change the two testcases makes me nervous (also please
use -fwrapv, not -fno-strict-overflow) - did you check what happens for them?

Thanks,
Richard.



> Regards
> Bernd Edlinger


Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Richard Biener
On Wed, Aug 28, 2013 at 4:09 PM, Teresa Johnson  wrote:
> On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
>  wrote:
>> On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson  wrote:
>>> On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson  wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor  wrote:
> Hi,
>
> On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
>> On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor  wrote:
>> > On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
>> >> This patch ports messages to the new dump framework,
>> >
>> > It would be great this new framework was documented somewhere.  I lost
>> > track of what was agreed it would be and from the uses in the
>> > vectorizer I was never quite sure how to utilize it in other passes.
>>
>> Cc'ing Sharad who implemented this - Sharad, is this documented on a
>> wiki or elsewhere?
>
> Thanks
>
>>
>> >
>> > I'd also like to point out two other minor things inline:
>> >
>> > [...]
>> >
>> >> 2013-08-06  Teresa Johnson  
>> >> Dehao Chen  
>> >>
>> >> * dumpfile.c (dump_loc): Add column number to output, make 
>> >> newlines
>> >> consistent.
>> >> * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
>> >> OPTGROUP_ALL.
>> >> * ipa-inline-transform.c (clone_inlined_nodes):
>> >> (cgraph_node_opt_info): New function.
>> >> (cgraph_node_call_chain): Ditto.
>> >> (dump_inline_decision): Ditto.
>> >> (inline_call): Invoke dump_inline_decision.
>> >> * doc/invoke.texi: Document optall -fopt-info flag.
>> >> * profile.c (read_profile_edge_counts): Use new dump 
>> >> framework.
>> >> (compute_branch_probabilities): Ditto.
>> >> * passes.c (pass_manager::register_one_dump_file): Use 
>> >> OPTGROUP_OTHER
>> >> when pass not in any opt group.
>> >> * value-prof.c (check_counter): Use new dump framework.
>> >> (find_func_by_funcdef_no): Ditto.
>> >> (check_ic_target): Ditto.
>> >> * coverage.c (get_coverage_counts): Ditto.
>> >> (coverage_init): Setup new dump framework.
>> >> * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
>> >> * ipa-inline.h (is_in_ipa_inline): Declare.
>> >>
>> >> * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
>> >> * testsuite/gcc.dg/pr26570.c: Ditto.
>> >> * testsuite/gcc.dg/pr32773.c: Ditto.
>> >> * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
>> >>
>> >
>> > [...]
>> >
>> >> Index: ipa-inline-transform.c
>> >> ===
>> >> --- ipa-inline-transform.c  (revision 201461)
>> >> +++ ipa-inline-transform.c  (working copy)
>> >> @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
>> >> bool d
>> >>  }
>> >>
>> >>
>> >> +#define MAX_INT_LENGTH 20
>> >> +
>> >> +/* Return NODE's name and profile count, if available.  */
>> >> +
>> >> +static const char *
>> >> +cgraph_node_opt_info (struct cgraph_node *node)
>> >> +{
>> >> +  char *buf;
>> >> +  size_t buf_size;
>> >> +  const char *bfd_name = lang_hooks.dwarf_name (node->symbol.decl, 
>> >> 0);
>> >> +
>> >> +  if (!bfd_name)
>> >> +bfd_name = "unknown";
>> >> +
>> >> +  buf_size = strlen (bfd_name) + 1;
>> >> +  if (profile_info)
>> >> +buf_size += (MAX_INT_LENGTH + 3);
>> >> +
>> >> +  buf = (char *) xmalloc (buf_size);
>> >> +
>> >> +  strcpy (buf, bfd_name);
>> >> +
>> >> +  if (profile_info)
>> >> +sprintf (buf, "%s ("HOST_WIDEST_INT_PRINT_DEC")", buf, 
>> >> node->count);
>> >> +  return buf;
>> >> +}
>> >
>> > I'm not sure if output of this function is aimed only at the user or
>> > if it is supposed to be used by gcc developers as well.  If the
>> > latter, an incredibly useful thing is to also dump node->symbol.order
>> > too.  We usually dump it after "/" sign separating it from node name.
>> > It is invaluable when examining decisions in C++ code where you can
>> > have lots of clones of a node (and also because existing dumps print
>> > it, it is easy to combine them).
>>
>> The output is useful for both power users doing performance tuning of
>> their application, and by gcc developers. Adding the id is not so
>> useful for the former, but I agree that it is very useful for compiler
>> developers. In fact, in the google branch version we emit more verbose
>> information (the lipo module id and the funcdef_no) to help uniquely
>> identify the routines and to aid in post-processing by humans and
>> tools. S

Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Richard Biener
On Wed, Aug 28, 2013 at 5:20 PM, Xinliang David Li  wrote:
> On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson  wrote:
>> On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
>>  wrote:
>>> On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson  wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson  
 wrote:
> On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor  wrote:
>> Hi,
>>
>> On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
>>> On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor  wrote:
>>> > On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
>>> >> This patch ports messages to the new dump framework,
>>> >
>>> > It would be great this new framework was documented somewhere.  I lost
>>> > track of what was agreed it would be and from the uses in the
>>> > vectorizer I was never quite sure how to utilize it in other passes.
>>>
>>> Cc'ing Sharad who implemented this - Sharad, is this documented on a
>>> wiki or elsewhere?
>>
>> Thanks
>>
>>>
>>> >
>>> > I'd also like to point out two other minor things inline:
>>> >
>>> > [...]
>>> >
>>> >> 2013-08-06  Teresa Johnson  
>>> >> Dehao Chen  
>>> >>
>>> >> * dumpfile.c (dump_loc): Add column number to output, make 
>>> >> newlines
>>> >> consistent.
>>> >> * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
>>> >> OPTGROUP_ALL.
>>> >> * ipa-inline-transform.c (clone_inlined_nodes):
>>> >> (cgraph_node_opt_info): New function.
>>> >> (cgraph_node_call_chain): Ditto.
>>> >> (dump_inline_decision): Ditto.
>>> >> (inline_call): Invoke dump_inline_decision.
>>> >> * doc/invoke.texi: Document optall -fopt-info flag.
>>> >> * profile.c (read_profile_edge_counts): Use new dump 
>>> >> framework.
>>> >> (compute_branch_probabilities): Ditto.
>>> >> * passes.c (pass_manager::register_one_dump_file): Use 
>>> >> OPTGROUP_OTHER
>>> >> when pass not in any opt group.
>>> >> * value-prof.c (check_counter): Use new dump framework.
>>> >> (find_func_by_funcdef_no): Ditto.
>>> >> (check_ic_target): Ditto.
>>> >> * coverage.c (get_coverage_counts): Ditto.
>>> >> (coverage_init): Setup new dump framework.
>>> >> * ipa-inline.c (inline_small_functions): Set 
>>> >> is_in_ipa_inline.
>>> >> * ipa-inline.h (is_in_ipa_inline): Declare.
>>> >>
>>> >> * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
>>> >> * testsuite/gcc.dg/pr26570.c: Ditto.
>>> >> * testsuite/gcc.dg/pr32773.c: Ditto.
>>> >> * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
>>> >>
>>> >
>>> > [...]
>>> >
>>> >> Index: ipa-inline-transform.c
>>> >> ===
>>> >> --- ipa-inline-transform.c  (revision 201461)
>>> >> +++ ipa-inline-transform.c  (working copy)
>>> >> @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
>>> >> bool d
>>> >>  }
>>> >>
>>> >>
>>> >> +#define MAX_INT_LENGTH 20
>>> >> +
>>> >> +/* Return NODE's name and profile count, if available.  */
>>> >> +
>>> >> +static const char *
>>> >> +cgraph_node_opt_info (struct cgraph_node *node)
>>> >> +{
>>> >> +  char *buf;
>>> >> +  size_t buf_size;
>>> >> +  const char *bfd_name = lang_hooks.dwarf_name (node->symbol.decl, 
>>> >> 0);
>>> >> +
>>> >> +  if (!bfd_name)
>>> >> +bfd_name = "unknown";
>>> >> +
>>> >> +  buf_size = strlen (bfd_name) + 1;
>>> >> +  if (profile_info)
>>> >> +buf_size += (MAX_INT_LENGTH + 3);
>>> >> +
>>> >> +  buf = (char *) xmalloc (buf_size);
>>> >> +
>>> >> +  strcpy (buf, bfd_name);
>>> >> +
>>> >> +  if (profile_info)
>>> >> +sprintf (buf, "%s ("HOST_WIDEST_INT_PRINT_DEC")", buf, 
>>> >> node->count);
>>> >> +  return buf;
>>> >> +}
>>> >
>>> > I'm not sure if output of this function is aimed only at the user or
>>> > if it is supposed to be used by gcc developers as well.  If the
>>> > latter, an incredibly useful thing is to also dump node->symbol.order
>>> > too.  We usually dump it after "/" sign separating it from node name.
>>> > It is invaluable when examining decisions in C++ code where you can
>>> > have lots of clones of a node (and also because existing dumps print
>>> > it, it is easy to combine them).
>>>
>>> The output is useful for both power users doing performance tuning of
>>> their application, and by gcc developers. Adding the id is not so
>>> useful for the former, but I agree that it is very useful for compiler
>>> developers. In fact, in the google branch v

Re: [PATCH 1/2] Convert symtab, cgraph and varpool nodes into a real class hierarchy

2013-08-29 Thread Richard Biener
On Wed, Aug 28, 2013 at 7:02 PM, Mike Stump  wrote:
> On Aug 28, 2013, at 2:34 AM, Richard Biener  
> wrote:
>> Huh?  Why should wide-int need to be marked GTY at all?!
>
> Can I answer with a question?  Why would nb_iter_bound be GTYed?

Because it references a GIMPLE stmt which resides in GC memory.

Why would dw_val_struct be GTYed?

Because it resides in GC memory?

The first makes little sense to me.  The second, well, it is used to
generate debug information…  When the clients no longer want to GTY,
we could remove it.  wide_ints seem like they should not make it to
the PCH file.

I don't think this is PCH related.

Richard.


Re: opt-info message change for vectorizer

2013-08-29 Thread Richard Biener
On Wed, Aug 28, 2013 at 8:14 PM, Xinliang David Li  wrote:
> Fixed as requested. I don't like the extra newline either, but I will
> leave that to Teresa.
>
> basic3.c:8:foo: note: loop vectorized
>
> basic3.c:8:foo: note: loop versioned for vectorization because of
> possible aliasing
>
> basic3.c:8:foo: note: loop peeled for vectorization to enhance alignment
>
> basic3.c:8:foo: note: loop with 7 iterations completely unrolled
>
> basic3.c:5:foo: note: loop with 7 iterations completely unrolled
>
>
> Is this version ok after testing?

- "Vectorized basic-block\n");
+ "Basic block is vectorized\n");

lower case

Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c (revision 201751)
+++ tree-vect-data-refs.c (working copy)
@@ -37,7 +37,7 @@ along with GCC; see the file COPYING3.
 #include "tree-scalar-evolution.h"
 #include "tree-vectorizer.h"
 #include "diagnostic-core.h"
-
+#include 
 /* Need to include rtl.h, expr.h, etc. for optabs.  */
 #include "expr.h"
 #include "optabs.h"
@@ -1393,6 +1393,8 @@ vect_enhance_data_refs_alignment (loop_v

   supportable_dr_alignment = vect_supportable_dr_alignment (dr, true);
   do_peeling = vector_alignment_reachable_p (dr);
+  if (getenv("NOPEEL"))
+do_peeling = false;
   if (do_peeling)
 {
   if (known_alignment_for_access_p (dr))

unrelated change

@@ -261,12 +262,20 @@ dump_loc (int dump_kind, FILE *dfile, so
   if (dump_kind)
 {
   if (LOCATION_LOCUS (loc) > BUILTINS_LOCATION)
-fprintf (dfile, "\n%s:%d: note: ", LOCATION_FILE (loc),
- LOCATION_LINE (loc));
+{
+  if (current_function_decl)
+fprintf (dfile, "\n%s:%d:%s: note: ", LOCATION_FILE (loc),
+ LOCATION_LINE (loc),
+ gimple_decl_printable_name (current_function_decl, 1));
+  else
+fprintf (dfile, "\n%s:%d: note: ", LOCATION_FILE (loc),
+ LOCATION_LINE (loc));
+}
   else if (current_function_decl)
-fprintf (dfile, "\n%s:%d: note: ",
+fprintf (dfile, "\n%s:%d:%s: note: ",
  DECL_SOURCE_FILE (current_function_decl),
- DECL_SOURCE_LINE (current_function_decl));
+ DECL_SOURCE_LINE (current_function_decl),
+ gimple_decl_printable_name (current_function_decl, 1));
 }
 }

please not with this change (I oppose to it).

Ok with the change and not committing the hunks above.

Thanks,
Richard.








> thanks,
>
> David
>
> On Wed, Aug 28, 2013 at 2:45 AM, Richard Biener
>  wrote:
>> On Tue, Aug 27, 2013 at 10:30 PM, Xinliang David Li  
>> wrote:
>>> If this is the convention, we should probably have another patch to
>>> fix all the existing opt-info messages.
>>
>> Yes please.
>>
>> Also ...
>>
>>
>> b.c:16:A::foo: note: Loop is vectorized
>>
>> "loop vectorized"
>>
>>
>> b.c:16:A::foo: note: Loop is versioned to remove aliases for 
>> vectorization
>>
>> "loop versioned for vectorization because of possible aliasing"
>>
>> b.c:16:A::foo: note: Loop is peeled to enhance alignment for 
>> vectorization
>>
>> "loop peeled for vectorization to enhance alignment"
>>
>> b.c:16:A::foo: note: Completely unroll loop 6 times
>>
>> maybe "loop with 6 iterations completely unrolled"
>>
>>
>> b.c:12:A::foo: note: Completely unroll loop 6 times
>>
>>
>> I hate the excessive vertical spacing as well.
>>
>> Richard.
>>
>> Ok after testing?
>>
>> thanks,
>>
>> David



Re: [PATCH] -mcmodel=large -fpic TLS GD and LD support gcc + binutils (PR target/58067)

2013-08-29 Thread Jakub Jelinek
On Wed, Aug 28, 2013 at 08:30:19AM -0700, H.J. Lu wrote:
> On Wed, Aug 28, 2013 at 2:37 AM, Jakub Jelinek  wrote:
> > On Wed, Aug 14, 2013 at 09:03:24AM +0200, Uros Bizjak wrote:
> >> The implementation for x86 is technically OK, but I wonder if these
> >> sequences should be documented in some authoritative document about
> >> TLS relocations. The "ELF Handling For Thread-Local Storage" document
> >> [1] doesn't mention various code models fo x86_64, so I was not able
> >> to cross-check the implementaton vs. documentation.
> >>
> >> [1] http://www.akkadia.org/drepper/tls.pdf
> >
> > Ping, are the patches ok for gcc trunk and binutils trunk?
> > Uli has kindly updated the docs some time ago.
> >
> 
> Linker change is OK with testcases for GD and LD.

Ok, here is what I've committed in the end, after testing on x86_64-linux
and x86_64-linux -> x86_64-nacl.

There is one unfortunate thing, in the tlsgd8.s test, that while the
transition GD->IE happens successfully, the PLT slot for __tls_get_addr
isn't removed (nothing jumps to it or takes its address, but still
it is emitted).  But, if I tweak tlsgd5a.s (the normal model GD->IE
transition test), so that it has call __tls_get_addr@plt rather than
call __tls_get_addr, I get the same behavior, so this isn't really a bug
in the new code, but preexisting issue (minor, sure), that might be
desirable to be improved eventually.

2013-08-29  Jakub Jelinek  

* elf64-x86-64.c (elf_x86_64_check_tls_transition): Allow
64-bit -mcmodel=large -fpic TLS GD and LD sequences.
(elf_x86_64_relocate_section): Handle -mcmodel=large -fpic
TLS GD and LD sequences in GD->LE, GD->IE and LD->LE transitions.
ld/testsuite/
* ld-x86-64/x86-64.exp: Add tlsld3, tlsgd7 and tlsgd8 tests.
* ld-x86-64/tlspic1.s: Add -mcmodel=large -fpic TLS GD and LD
sequences.
* ld-x86-64/tlspic.dd: Adjusted.
* ld-x86-64/tlspic.rd: Adjusted.
* ld-x86-64/tlspic-nacl.rd: Adjusted.
* ld-x86-64/tlsld3.dd: New test.
* ld-x86-64/tlsld3.s: New file.
* ld-x86-64/tlsgd7.dd: New test.
* ld-x86-64/tlsgd7.s: New file.
* ld-x86-64/tlsgd8.dd: New test.
* ld-x86-64/tlsgd8.s: New file.

--- bfd/elf64-x86-64.c.jj   2013-08-28 17:58:32.549408509 +0200
+++ bfd/elf64-x86-64.c  2013-08-28 17:33:13.738626828 +0200
@@ -1089,6 +1089,7 @@ elf_x86_64_check_tls_transition (bfd *ab
 {
   unsigned int val;
   unsigned long r_symndx;
+  bfd_boolean largepic = FALSE;
   struct elf_link_hash_entry *h;
   bfd_vma offset;
   struct elf_x86_64_link_hash_table *htab;
@@ -1126,16 +1127,32 @@ elf_x86_64_check_tls_transition (bfd *ab
 can transit to different access model.  For 32bit, only
leaq foo@tlsgd(%rip), %rdi
.word 0x; rex64; call __tls_get_addr
-can transit to different access model.  */
+can transit to different access model.  For largepic
+we also support:
+   leaq foo@tlsgd(%rip), %rdi
+   movabsq $__tls_get_addr@pltoff, %rax
+   addq $rbx, %rax
+   call *%rax.  */
 
  static const unsigned char call[] = { 0x66, 0x66, 0x48, 0xe8 };
  static const unsigned char leaq[] = { 0x66, 0x48, 0x8d, 0x3d };
 
- if ((offset + 12) > sec->size
- || memcmp (contents + offset + 4, call, 4) != 0)
+ if ((offset + 12) > sec->size)
return FALSE;
 
- if (ABI_64_P (abfd))
+ if (memcmp (contents + offset + 4, call, 4) != 0)
+   {
+ if (!ABI_64_P (abfd)
+ || (offset + 19) > sec->size
+ || offset < 3
+ || memcmp (contents + offset - 3, leaq + 1, 3) != 0
+ || memcmp (contents + offset + 4, "\x48\xb8", 2) != 0
+ || memcmp (contents + offset + 14, "\x48\x01\xd8\xff\xd0", 5)
+!= 0)
+   return FALSE;
+ largepic = TRUE;
+   }
+ else if (ABI_64_P (abfd))
{
  if (offset < 4
  || memcmp (contents + offset - 4, leaq, 4) != 0)
@@ -1153,16 +1170,31 @@ elf_x86_64_check_tls_transition (bfd *ab
  /* Check transition from LD access model.  Only
leaq foo@tlsld(%rip), %rdi;
call __tls_get_addr
-can transit to different access model.  */
+can transit to different access model.  For largepic
+we also support:
+   leaq foo@tlsld(%rip), %rdi
+   movabsq $__tls_get_addr@pltoff, %rax
+   addq $rbx, %rax
+   call *%rax.  */
 
  static const unsigned char lea[] = { 0x48, 0x8d, 0x3d };
 
  if (offset < 3 || (offset + 9) > sec->size)
return FALSE;
 
- if (memcmp (contents + offset - 3, lea, 3) != 0
- || 0xe8 != *(contents + offset + 4))
+ if (memcmp (contents + offset - 3, lea, 3) != 

[PATCH] RTEMS: Add LEON3/SPARC multilibs

2013-08-29 Thread Sebastian Huber
Recently support for LEON3 specific instructions were added to GCC.
Make this support available for RTEMS.

This patch should be committed to GCC 4.9.

gcc/ChangeLog
2013-08-29  Sebastian Huber  

* config/sparc/t-rtems: Add leon3 multilibs.
---
 gcc/config/sparc/t-rtems |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/sparc/t-rtems b/gcc/config/sparc/t-rtems
index 63d0217..f1a3d84 100644
--- a/gcc/config/sparc/t-rtems
+++ b/gcc/config/sparc/t-rtems
@@ -17,6 +17,6 @@
 # .
 #
 
-MULTILIB_OPTIONS = msoft-float mcpu=v8
-MULTILIB_DIRNAMES = soft v8
+MULTILIB_OPTIONS = msoft-float mcpu=v8/mcpu=leon3
+MULTILIB_DIRNAMES = soft v8 leon3
 MULTILIB_MATCHES = msoft-float=mno-fpu
-- 
1.7.7



[PATCH] Fix PR57287 (again)

2013-08-29 Thread Richard Biener

This makes sure we cleanup at least some of the spurious copies
that can be introduced for example by jump-threading and that confuse
uninitialized var analysis.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-08-29  Richard Biener  

PR middle-end/57287
* tree-ssa-copy.c (may_propagate_copy): Allow propagating
of default defs that appear in abnormal PHI nodes.

* gcc.dg/pr57287-2.c: New testcase.

Index: gcc/tree-ssa-copy.c
===
*** gcc/tree-ssa-copy.c (revision 202068)
--- gcc/tree-ssa-copy.c (working copy)
*** may_propagate_copy (tree dest, tree orig
*** 60,66 
  
/* If ORIG flows in from an abnormal edge, it cannot be propagated.  */
if (TREE_CODE (orig) == SSA_NAME
!   && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (orig))
  return false;
  
/* If DEST is an SSA_NAME that flows from an abnormal edge, then it
--- 60,72 
  
/* If ORIG flows in from an abnormal edge, it cannot be propagated.  */
if (TREE_CODE (orig) == SSA_NAME
!   && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (orig)
!   /* If it is the default definition and an automatic variable then
!  we can though and it is important that we do to avoid
!uninitialized regular copies.  */
!   && !(SSA_NAME_IS_DEFAULT_DEF (orig)
!  && (SSA_NAME_VAR (orig) == NULL_TREE
!  || TREE_CODE (SSA_NAME_VAR (orig)) == VAR_DECL)))
  return false;
  
/* If DEST is an SSA_NAME that flows from an abnormal edge, then it
Index: gcc/testsuite/gcc.dg/pr57287-2.c
===
*** gcc/testsuite/gcc.dg/pr57287-2.c(revision 0)
--- gcc/testsuite/gcc.dg/pr57287-2.c(working copy)
***
*** 0 
--- 1,35 
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -Wall" } */
+ 
+ #include 
+ 
+ struct node
+ {
+   struct node *next;
+   char *name;
+ } *list;
+ 
+ struct node *list;
+ struct node *head (void);
+ 
+ sigjmp_buf *bar (void);
+ 
+ int baz (void)
+ {
+   struct node *n;
+   int varseen = 0;
+ 
+   list = head ();
+   for (n = list; n; n = n->next)
+ {
+   if (!varseen)
+   varseen = 1;
+ 
+   sigjmp_buf *buf = bar ();  /* { dg-bogus "may be used uninitialized" "" 
} */
+   __sigsetjmp (*buf, 1);
+ }
+ 
+   if (!varseen)
+ return 0;
+   return 1;
+ }


Re: [PATCH i386 2/8] [AVX512] Add mask registers.

2013-08-29 Thread Kirill Yukhin
Hello,

> Testing in progress, is it ok for trunk if pass?
I forgot to add clobber to split of andn,
so testing fail. Fixed.
Updated patch in the bottom.

Testing:
  1. Bootstrap pass.
  2. make check shows no regressions.
  3. Spec 2000 & 2006 build show no regressions both with and without -mavx512f 
option.
  4. Spec 2000 & 2006 run shows no stability regressions without -mavx512f 
option.

Is it ok for trunk?

Thanks, K

---
 gcc/config/i386/constraints.md |   8 +-
 gcc/config/i386/i386.c |  38 --
 gcc/config/i386/i386.h |  38 --
 gcc/config/i386/i386.md| 288 ++---
 gcc/config/i386/predicates.md  |   9 ++
 5 files changed, 318 insertions(+), 63 deletions(-)

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 28e626f..92e0c05 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -19,7 +19,7 @@
 
 ;;; Unused letters:
 ;;; B H   T
-;;;   h jk
+;;;   h j
 
 ;; Integer register constraints.
 ;; It is not necessary to define 'r' here.
@@ -78,6 +78,12 @@
  "TARGET_80387 || TARGET_FLOAT_RETURNS_IN_80387 ? FP_SECOND_REG : NO_REGS"
  "Second from top of 80387 floating-point stack (@code{%st(1)}).")
 
+(define_register_constraint "k" "TARGET_AVX512F ? MASK_EVEX_REGS : NO_REGS"
+"@internal Any mask register that can be used as predicate, i.e. k1-k7.")
+
+(define_register_constraint "Yk" "TARGET_AVX512F ? MASK_REGS : NO_REGS"
+"@internal Any mask register.")
+
 ;; Vector registers (also used for plain floating point nowadays).
 (define_register_constraint "y" "TARGET_MMX ? MMX_REGS : NO_REGS"
  "Any MMX register.")
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d05dbf0..4e9bac0 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2032,6 +2032,9 @@ enum reg_class const regclass_map[FIRST_PSEUDO_REGISTER] =
   EVEX_SSE_REGS, EVEX_SSE_REGS, EVEX_SSE_REGS, EVEX_SSE_REGS,
   EVEX_SSE_REGS, EVEX_SSE_REGS, EVEX_SSE_REGS, EVEX_SSE_REGS,
   EVEX_SSE_REGS, EVEX_SSE_REGS, EVEX_SSE_REGS, EVEX_SSE_REGS,
+  /* Mask registers.  */
+  MASK_REGS, MASK_EVEX_REGS, MASK_EVEX_REGS, MASK_EVEX_REGS,
+  MASK_EVEX_REGS, MASK_EVEX_REGS, MASK_EVEX_REGS, MASK_EVEX_REGS,
 };
 
 /* The "default" register map used in 32bit mode.  */
@@ -2047,6 +2050,7 @@ int const dbx_register_map[FIRST_PSEUDO_REGISTER] =
   -1, -1, -1, -1, -1, -1, -1, -1,  /* extended SSE registers */
   -1, -1, -1, -1, -1, -1, -1, -1,   /* AVX-512 registers 16-23*/
   -1, -1, -1, -1, -1, -1, -1, -1,   /* AVX-512 registers 24-31*/
+  93, 94, 95, 96, 97, 98, 99, 100,  /* Mask registers */
 };
 
 /* The "default" register map used in 64bit mode.  */
@@ -2062,6 +2066,7 @@ int const dbx64_register_map[FIRST_PSEUDO_REGISTER] =
   25, 26, 27, 28, 29, 30, 31, 32,  /* extended SSE registers */
   67, 68, 69, 70, 71, 72, 73, 74,   /* AVX-512 registers 16-23 */
   75, 76, 77, 78, 79, 80, 81, 82,   /* AVX-512 registers 24-31 */
+  118, 119, 120, 121, 122, 123, 124, 125, /* Mask registers */
 };
 
 /* Define the register numbers to be used in Dwarf debugging information.
@@ -2129,6 +2134,7 @@ int const svr4_dbx_register_map[FIRST_PSEUDO_REGISTER] =
   -1, -1, -1, -1, -1, -1, -1, -1,  /* extended SSE registers */
   -1, -1, -1, -1, -1, -1, -1, -1,   /* AVX-512 registers 16-23*/
   -1, -1, -1, -1, -1, -1, -1, -1,   /* AVX-512 registers 24-31*/
+  93, 94, 95, 96, 97, 98, 99, 100,  /* Mask registers */
 };
 
 /* Define parameter passing and return registers.  */
@@ -4219,8 +4225,13 @@ ix86_conditional_register_usage (void)
 
   /* If AVX512F is disabled, squash the registers.  */
   if (! TARGET_AVX512F)
-for (i = FIRST_EXT_REX_SSE_REG; i < LAST_EXT_REX_SSE_REG; i++)
-  fixed_regs[i] = call_used_regs[i] = 1, reg_names[i] = "";
+{
+  for (i = FIRST_EXT_REX_SSE_REG; i < LAST_EXT_REX_SSE_REG; i++)
+   fixed_regs[i] = call_used_regs[i] = 1, reg_names[i] = "";
+
+  for (i = FIRST_MASK_REG; i < LAST_MASK_REG; i++)
+   fixed_regs[i] = call_used_regs[i] = 1, reg_names[i] = "";
+}
 }
 
 
@@ -33889,10 +33900,12 @@ ix86_preferred_reload_class (rtx x, reg_class_t 
regclass)
 return regclass;
 
   /* Force constants into memory if we are loading a (nonzero) constant into
- an MMX or SSE register.  This is because there are no MMX/SSE instructions
- to load from a constant.  */
+ an MMX, SSE or MASK register.  This is because there are no MMX/SSE/MASK
+ instructions to load from a constant.  */
   if (CONSTANT_P (x)
-  && (MAYBE_MMX_CLASS_P (regclass) || MAYBE_SSE_CLASS_P (regclass)))
+  && (MAYBE_MMX_CLASS_P (regclass)
+ || MAYBE_SSE_CLASS_P (regclass)
+ || MAYBE_MASK_CLASS_P (regclass)))
 return NO_REGS;
 
   /* Prefer SSE regs only, if we can use them for math.  */
@@ -33996,10 +34009,11 @@ ix86_secondary_reload (bool in_p, rtx x, reg_class_t 
rclass,
 
   /* QImode spills from non-QI re

Re: Make some comdats implicitly hidden

2013-08-29 Thread Jan Hubicka
M
> Hi,
> 
> On 08/29/2013 10:11 AM, Jan Hubicka wrote:
> >Paolo,
> >there seems to be one extra issue about this patch. It causes quite a twist 
> >in libstdc++ exported symbols.
> >It is purpose of the patch to remove those that are going to be generated in 
> >user programs, too.
> >I am however bit confused about bad array. Perhaps it is an optimization 
> >difference dragging it in?
> I'm also confused because your patch isn't already in (at least I
> can't find such a ChangeLog in svn) and what you attached, as long
> as bad_array_* is concerned is just what we currently have, before
> the patch. Check point: 21 added symbols.

I see, I tought that adding a symbol will make ABI check to fail, so I got
sidetracked by trying to understand why we need the bad array.
> 
> In general, of course, inlining alone can make a difference, if for
> example you inline less and a pattern in the linker map isn't tight
> enough some symbols can become inadvertently exported.
> 
> *But*, outside bad_array* I can see a lot of SUBTRACTED, those would
> be very bad regressions of course. check_abi cannot pass.

I see, it is purpose of the patch to cause regressions like this :)
What I am tracking is the following testcase:

jan@linux-9ure:~> cat t.C
__attribute__ ((noinline))
inline void t(void)
{
 asm("fsin");
}
void q(void);

main()
{
  q();
  t();
}
jan@linux-9ure:~> cat t2.C
__attribute__ ((noinline))
inline void t(void)
{
 asm("fsin");
}

void q(void)
{
  t();
}
jan@linux-9ure:~> gcc -O2 t2.C -fPIC --shared -o libt.so 
jan@linux-9ure:~> gcc -O2 t.C -lt -L.

Now I have a shared library and main binary but defining function t():

jan@linux-9ure:~> objdump -d a.out | grep fsin
  400720:   d9 fe   fsin   
jan@linux-9ure:~> objdump -d libt.so | grep fsin
 720:   d9 fe   fsin   

Both define it in dynamic symbol table:

jan@linux-9ure:~> objdump -t a.out | grep _Z1tv
00400720  wF .text  0003  _Z1tv
jan@linux-9ure:~> objdump -t libt.so | grep _Z1tv
0720  wF .text  0003  _Z1tv

and libt has dynamic relocation on call to the function:
jan@linux-9ure:~> objdump -R libt.so | grep _Z1tv
00201020 R_X86_64_JUMP_SLOT  _Z1tv
jan@linux-9ure:~> objdump -R a.out | grep _Z1tv
jan@linux-9ure:~> 

Both versions of t() (in libt and a.out) are known to be semantically
equivalent.  My patch makes an observation that is does not matter to what copy
of t() we call.  So it makes both symbols hidden.  This makes them resovled at
link-time rather than at dynamic link-time.  Thinks should still work as
previously, just we will dispatch at runtime into both copies. I believe it is
little cost to pay for getting cheaper dynamic linking and better codegen (for
locally bound calls).

The symbol disappear from the external symbol table, but this transformation
ought to be valid since the same would happen if the symbol was fully inlined
or its uses optimized out.

This optimization will disable itself when you take address of t() so 
comparing addresses works as expected. Also the optimization won't take
place when you key a template since then the users of the library do not
necesarily need to emit their own copy of the code.

My longer term vision is to turn as many relocations as possible from external
to IP relative. This by itself will help (i.e. from libreoffice, this patch
cuts down important percentage of symbols). If something like Mike Homey's 
elfhack
http://glandium.org/blog/?p=1177
ever land into official dynamic linker (I think it should), we will see a lot
smaller binaries.
> 
> In the past, when we (library people) had special issues with
> versioning, like we had to patch a mistake we made, Jakub promptly
> helped, I would recommend referring to him too, if you are seeing
> something weird, but the general rule as far as testing is concerned
> is very simple: *never* ignore check_abi failures, make sure
> additional symbols are exported only at the current ABI version,
> never add symbols at older versions, never remove symbols.

So my belief is that it is safe to drop those symbols from libstdc++. Every
program/DSO using them have to define its own copy of those symbols, so I
believe removing them from libstdc++ won't cause issues.

Honza
> 
> Paolo.
> 


Re: wide-int branch now up for public comment and review

2013-08-29 Thread Kenneth Zadeck

On 08/29/2013 04:42 AM, Richard Biener wrote:

On Wed, 28 Aug 2013, Kenneth Zadeck wrote:


Note that the bits above the precision are not defined and the
algorithms used here are careful not to depend on their value.  In
particular, values that come in from rtx constants may have random
bits.

Which is a red herring.  It should be fixed.  I cannot even believe
that sentence given the uses of CONST_DOUBLE_LOW/CONST_DOUBLE_HIGH
or INTVAL/UINTVAL.  I don't see accesses masking out 'undefined' bits
anywhere.


I can agree with you that this could be fixed.   But it is not necessary to
fix it.   The rtl level and most of the tree level has existed for a long time
by doing math within the precision.

you do not see the masking at the rtl level because the masking is not
necessary.if you never look beyond the precision you just do not care.
There is the issue with divide and mod and quite frankly the code on the trunk
scares me to death.   my code at the rtl level makes sure that everything is
clean since i can see if it is a udiv or a div that is enough info to clean
the value up.


I have a feeling I'm rehashing a past debate, sorry, but rtx constants
can't
have random bits.  The upper bits must be a sign extension of the value.
There's exactly one valid rtx for each (value, mode) pair.  If you saw
something different then that sounds like a bug.  The rules should already
be fairly well enforced though, since something like (const_int 128) --
or (const_int 256) -- will not match a QImode operand.

See.  We're saved ;)

this is richard's theory.   There is a block of code at wide-int.cc:175 that
is ifdefed out that checks to see if the rtl consts are canonical.   if you
bring it in, building the libraries fails very quickly.   The million dollar
question is this the only bug or is this the first bug of a 1000 bugs.   Your
comment about not seeing any masking cuts both ways.   There is very little
code on the way in if you create ints with GEN_INT that makes sure someone has
done the right thing on that side.   So my guess is that there are a lot of
failures and a large number of them will be in the ports.

Well, clearly RTL code _expects_ constants to be properly extended.  I
can reproduce the bug and I simply guess that's a matter of mismatching
modes here (or performing an implicit truncation without properly
extending the constant).

 at /space/rguenther/src/svn/wide-int/gcc/combine.c:10086
10086  GEN_INT
(count));
(gdb) l
10081
10082 mask_rtx = GEN_INT (nonzero_bits (varop, GET_MODE
(varop)));
10083
10084 mask_rtx
10085   = simplify_const_binary_operation (code,
result_mode, mask_rtx,
10086  GEN_INT
(count));

uses of GEN_INT are frowned upon ... for exactly that reason - the
mask_rtx is not a proper RTL constant for SImode.
over time, the GEN_INTs will go away at the portable rtl level  as more 
of the code is transitioned to use wide-int.The port story is not so 
good.   Any port that uses TI or beyond will likely evolve to using 
wide-int for the math and unifying the cases between CONST_INT and 
CONST_WIDE_INT.  (the wide-int constructor from rtl takes either and the 
constructor to rtl looks at the constant.) But a port like the mips that 
has no TI will likely never change.


Btw, all this isn't a reason to not have a well-defined wide-int
representation.  You just have (as you have for trees) to properly
canonize the representation at the time you convert from RTL
constants to wide-int constants.
The wide-int representation is completely well defined.This is what 
i cannot see why you do not understand.   You do not need to look above 
the precision so there is nothing undefined!!   I know this bothers 
you very badly and i will agree that if it is easy to clean up rtl to 
match this, then it would simplify the code somewhat.But it is not 
undefined or random.   Furthermore, cleaning it will not change the abi 
so if rtl gets cleaner, we can change this.



In the long run we want a uniform representation of constants
so we can do zero-copying - but it looks like we now have
three different representations - the tree one (sign or zero
extended dependent on sign), RTL (garbage as you show) and
wide-int (always sign-extended).
You wanted this to look more like double-int, now that it does, you now 
to see the flaws in it.Do i ever get to win?   I do not think anyone 
would have been served well by trying to make trees have only signed 
constants in the way that double-int did.  I will admit that the 
implementation would be cleaner if we could just change everything else 
in the compiler to match a super clean wide-int design.   I took the 
other approach and bottled as much of the ugliness behind the a api and 
tried to keep the rest of the compiler looking mostly the way that it does.


To a first approxima

[4.8] 3 backports

2013-08-29 Thread Jakub Jelinek
Hi!

I've bootstrapped/regtested on x86_64-linux and i686-linux these 3 backports
(last patch is actually combined from 3 separate commits), all applied
cleanly to 4.8 branch.

Ok for 4.8?

Jakub
2013-08-29  Jakub Jelinek  

Backported from mainline
2013-05-27  Richard Biener  

PR tree-optimization/57343
* tree-ssa-loop-niter.c (number_of_iterations_ne_max): Do not
use multiple_of_p if not TYPE_OVERFLOW_UNDEFINED.
(number_of_iterations_cond): Do not build the folded tree.

* gcc.dg/torture/pr57343.c: New testcase.

--- gcc/tree-ssa-loop-niter.c   (revision 199356)
+++ gcc/tree-ssa-loop-niter.c   (revision 199357)
@@ -552,10 +552,18 @@ number_of_iterations_ne_max (mpz_t bnd,
 {
   double_int max;
   mpz_t d;
+  tree type = TREE_TYPE (c);
   bool bnds_u_valid = ((no_overflow && exit_must_be_taken)
   || mpz_sgn (bnds->below) >= 0);
 
-  if (multiple_of_p (TREE_TYPE (c), c, s))
+  if (integer_onep (s)
+  || (TREE_CODE (c) == INTEGER_CST
+ && TREE_CODE (s) == INTEGER_CST
+ && tree_to_double_int (c).mod (tree_to_double_int (s),
+TYPE_UNSIGNED (type),
+EXACT_DIV_EXPR).is_zero ())
+  || (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (c))
+ && multiple_of_p (type, c, s)))
 {
   /* If C is an exact multiple of S, then its value will be reached before
 the induction variable overflows (unless the loop is exited in some
@@ -572,16 +580,15 @@ number_of_iterations_ne_max (mpz_t bnd,
  the whole # of iterations analysis will fail).  */
   if (!no_overflow)
 {
-  max = double_int::mask (TYPE_PRECISION (TREE_TYPE (c))
-- tree_low_cst (num_ending_zeros (s), 1));
+  max = double_int::mask (TYPE_PRECISION (type)
+ - tree_low_cst (num_ending_zeros (s), 1));
   mpz_set_double_int (bnd, max, true);
   return;
 }
 
   /* Now we know that the induction variable does not overflow, so the loop
  iterates at most (range of type / S) times.  */
-  mpz_set_double_int (bnd, double_int::mask (TYPE_PRECISION (TREE_TYPE (c))),
- true);
+  mpz_set_double_int (bnd, double_int::mask (TYPE_PRECISION (type)), true);
 
   /* If the induction variable is guaranteed to reach the value of C before
  overflow, ... */
@@ -1311,7 +1318,8 @@ number_of_iterations_cond (struct loop *
 }
 
   /* If the loop exits immediately, there is nothing to do.  */
-  if (integer_zerop (fold_build2 (code, boolean_type_node, iv0->base, 
iv1->base)))
+  tree tem = fold_binary (code, boolean_type_node, iv0->base, iv1->base);
+  if (tem && integer_zerop (tem))
 {
   niter->niter = build_int_cst (unsigned_type_for (type), 0);
   niter->max = double_int_zero;
--- gcc/testsuite/gcc.dg/torture/pr57343.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr57343.c  (revision 199357)
@@ -0,0 +1,22 @@
+/* { dg-do run } */
+
+int c = 0;
+
+int
+main ()
+{
+  int i, f = 1;
+  for (i = 0; i < 5; i++)
+{
+  --c;
+  unsigned char h = c * 100;
+  if (h == 0)
+   {
+ f = 0;
+ break;
+   }
+}
+  if (f != 1)
+__builtin_abort ();
+  return 0;
+}
2013-08-29  Jakub Jelinek  

Backported from mainline
2013-05-27  Richard Biener  

PR tree-optimization/57396
* tree-affine.c (double_int_constant_multiple_p): Properly
return false for val == 0 and div != 0.

* gfortran.fortran-torture/execute/pr57396.f90: New testcase.

--- gcc/tree-affine.c   (revision 199349)
+++ gcc/tree-affine.c   (revision 199350)
@@ -736,11 +736,10 @@ free_affine_expand_cache (struct pointer
 }
 
 /* If VAL != CST * DIV for any constant CST, returns false.
-   Otherwise, if VAL != 0 (and hence CST != 0), and *MULT_SET is true,
-   additionally compares CST and MULT, and if they are different,
-   returns false.  Finally, if neither of these two cases occur,
-   true is returned, and if CST != 0, CST is stored to MULT and
-   MULT_SET is set to true.  */
+   Otherwise, if *MULT_SET is true, additionally compares CST and MULT,
+   and if they are different, returns false.  Finally, if neither of these
+   two cases occur, true is returned, and CST is stored to MULT and MULT_SET
+   is set to true.  */
 
 static bool
 double_int_constant_multiple_p (double_int val, double_int div,
@@ -749,7 +748,13 @@ double_int_constant_multiple_p (double_i
   double_int rem, cst;
 
   if (val.is_zero ())
-return true;
+{
+  if (*mult_set && !mult->is_zero ())
+   return false;
+  *mult_set = true;
+  *mult = double_int_zero;
+  return true;
+}
 
   if (div.is_zero ())
 return false;
--- gcc/testsuite/gfortran.fortran-torture/execute/pr57396.f90  (revision 0)
+++ gcc/testsuite/gfortran.fortran-torture/execute/pr57396.f90  (revision 
199350)
@@ -0,0 +1,33 @@
+module testmod
+  impli

Re: [RFA] Type inheritance graph analysis & speculative devirtualization, part 4/7, ODR at LTO time

2013-08-29 Thread Jan Hubicka
Richard, Jason,
I would apprechiate your opinion on this patch.  It blocks all the code that 
makes
use of ipa-devirt post LTO streaming.

The main part that I would like to know your opinion on is the ODR rule 
implementation
by vtable comparsion (it looks obvious but I got it wrong once already) and the 
idea
of choosing type leader out of all (presumably equivalent) ODR types and 
killing the
other BINFOs.

I will also gather statistics on size of the ODR query cache size on the bigger
beasts Martin or me can build and will set up some linear limit on it.  For
firefox it takes approximately (num_methods+num_differet_obj_type_refs)*1.3.  I
am thinking about seeting limit to cca 100 instead of 1.3 and simply start
loosing devirtualization oppurtunities if it is met. I do not think sane
sourcebases will ever hit it.

Thank you,
Honza
> Hi,
> this patch makes inheritance graph builder to work on LTO.  Nothing but
> ODR violation warnings and dump file is produced, yet.
> 
> The main change is to make types_same_for_odr and ODR hasher to use
> assembler name of virtual tables of the type (if present) to establish type
> equality. This is result of discussion with Jason at 
> http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00826.html
> Jason, does the implemenetation seem sane?  
> 
> Multiple tree types can now be representations of single ODR type.  There is
> new add_type_duplicate that output warnings on obvious mismatches and records
> variant in odr_type->types array.  Those are used for debug output (it may
> be interesting to get more of the unified by Richard's merging).
> 
> Trickier part is the code that merges BINFOs of ODR equivalent types.  This is
> important since we do not want to keep external vtables around when internal
> exists (external version may refer to external static methods that are 
> internal
> to our LTO unit). With ODR violations it may lead to wrong devirtualization,
> but that is the case previously anyway.
> 
> Bootstrapped/regtested x86_64-linux, OK?
> 
> Honza
> 
>   * Makefile.in (ipa-devirt.o): Add dependency on diagnostic.h
>   * ipa-devirt.c: Include diganostic.h
>   (odr_type_d): Add types and types_set.
>   (hash_type_name): Work for types with vtables during LTO.
>   (odr_hasher::remove): Fix comment; destroy types_set.
>   (add_type_duplicate): New function,
>   (get_odr_type): Use it.
>   (dump_type_inheritance_graph): Dump type duplicates.
>   * ipa.c (symtab_remove_unreachable_nodes): Build type inheritance
>   graph.
>   * tree.c (types_same_for_odr): Give exact answers on types with
>   virtual tables.

> Index: Makefile.in
> ===
> *** Makefile.in   (revision 201836)
> --- Makefile.in   (working copy)
> *** ipa.o : ipa.c $(CONFIG_H) $(SYSTEM_H) co
> *** 2948,2954 
>  $(LTO_STREAMER_H) $(DATA_STREAMER_H)
>   ipa-devirt.o : ipa-devirt.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) 
> $(CGRAPH_H) \
>  $(GIMPLE_H) $(TARGET_H) $(GGC_H) pointer-set.h \
> !$(IPA_UTILS_H) $(HASH_TABLE_H) 
>   ipa-prop.o : ipa-prop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
>  langhooks.h $(GGC_H) $(TARGET_H) $(CGRAPH_H) $(IPA_PROP_H) 
> $(DIAGNOSTIC_H) \
>  $(TREE_FLOW_H) $(TM_H) $(TREE_PASS_H) $(FLAGS_H) $(TREE_H) \
> --- 2948,2954 
>  $(LTO_STREAMER_H) $(DATA_STREAMER_H)
>   ipa-devirt.o : ipa-devirt.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) 
> $(CGRAPH_H) \
>  $(GIMPLE_H) $(TARGET_H) $(GGC_H) pointer-set.h \
> !$(IPA_UTILS_H) $(HASH_TABLE_H) $(DIAGNOSTIC_H)
>   ipa-prop.o : ipa-prop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
>  langhooks.h $(GGC_H) $(TARGET_H) $(CGRAPH_H) $(IPA_PROP_H) 
> $(DIAGNOSTIC_H) \
>  $(TREE_FLOW_H) $(TM_H) $(TREE_PASS_H) $(FLAGS_H) $(TREE_H) \
> Index: ipa-devirt.c
> ===
> *** ipa-devirt.c  (revision 201836)
> --- ipa-devirt.c  (working copy)
> *** along with GCC; see the file COPYING3.
> *** 116,121 
> --- 116,122 
>   #include "tree-pretty-print.h"
>   #include "ipa-utils.h"
>   #include "gimple.h"
> + #include "diagnostic.h"
>   
>   /* The node of type inheritance graph.  For each type unique in
>  One Defintion Rule (ODR) sense, we produce one node linking all 
> *** along with GCC; see the file COPYING3.
> *** 123,136 
>   
>   struct GTY(()) odr_type_d
>   {
> -   /* Unique ID indexing the type in odr_types array.  */
> -   int id;
> /* leader type.  */
> tree type;
> /* All bases.  */
> vec GTY((skip)) bases;
> /* All derrived types with virtual methods seen in unit.  */
> vec GTY((skip)) derived_types;
> /* Is it in anonymous namespace? */
> bool anonymous_namespace;
>   };
> --- 124,143 
>   
>   struct GTY(()) odr_type_d
>   {
> /* leader type.  */
> tree type;
> /* All bases.  */
> vec GTY((skip)) bases;
> /

Re: Make some comdats implicitly hidden

2013-08-29 Thread Paolo Carlini

Hi,

On 08/29/2013 02:19 PM, Jan Hubicka wrote:
So my belief is that it is safe to drop those symbols from libstdc++. 
Every program/DSO using them have to define its own copy of those 
symbols, so I believe removing them from libstdc++ won't cause issues.
Really, you should check with Jakub before proceeding. I the change it's 
Ok with him, it's Ok with me too (the other library maintainers should 
be in CC however). At minimum the baselines would need updating.


If I can ask you a personal courtesy, please don't commit anything 
causing regressions, like check_abi failing, either a complete change, 
or nothing.


Thanks!
Paolo.



Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Jan Hubicka
Jakub,
I am adding you to CC since I put my current toughts on LTO and debug info
in here.
> > Fork-fire-forget is really a much simpler choice here IMO; no worries 
> > about shared resources, less debug hassle.
> 
> It might be not as cheap as it is on Linux hosts on other hosts of
> course.  Also I'd rather try to avoid I/O than solving the issue

I still have some items on list here
 1) avoid function sections to be decompressed by WPA
(this won't cause much compile time improvements as decompression is
 well bellow 10% of runtime)
 2) put variable initializers into named sections just as function bodies
are.
Seeing Martin's systemtaps of firefox/gimp/inkscape, to my surprise the
initializers are actually about as big as the text segment.  While
it seems bit wasteful to pust single integer_cst there (and we can
special case this), it seems that there is a promise for vtables
and other stuff.

To make devirt work, we will need to load vtables into memory (or
invent representation to stream them other way that would be similarly
big). Still we will avoid need to load them in 5000 copies and merge
them.
 3) I think good part of function/partitioning overhead is because abstract
origin streaming is utterly broken.

Currently we can have DECL_ABSTRACT_ORIGIN on a function.  This I can now
track by used_as_abstract_origin flag and I can stream those functions
into partitins using them.

This is still wrong for multitude of reasons

1) we really want DECL_INITIAL tree of the functions used as abstract
   origins in the form before any gimple optimizations happened on them.
   (that is when debug hook is called)
   This is not what happens - we stream the tree as it looks during
   TLO streaming time - i.e. after early optimizations.

   I think we may just (at a time calling the debug hook) duplicate 
DECL_INITIAL
   same way we duplicate decls for save_function_body and saving it 
elsewhere.
   Making this tree to be abstract origin of the offline copy of the 
function itself.

2) dwarf2out doesn't really the DECL_INITIAL tree so it does something 
useful
   only when it is already there. 
   It can simply call cgraph_get_body when it needs the DECL_INITIAL, but it
   doesn't becuase push_cfun causes ICE.
   If we really can't push_cfun from middle of RTL queueu, I suppose I can
   just save it elsewhere

3) It is not only toplevel decl that has origin, but all local vars in the
   function.

   I think this goes terribly wrong - these decls are not indexable so they
   are stored into function section of every function referring to them.
   They are then read in many duplicates and never merged with the 
DECL_INITIAL
   tree of the actual abstract origin. For some reason dwarf2out doesn't
   seem to ICE, but I also do not see how this can produce working debug.
   Moreover I think the duplicates contribute to our current debug info
   size problems with LTO.

   If we solve 1) as discussed by above (i.e. by having separate
   block trees for functions that are abstract origins), we can then solve 
3)
   by streaming those into global decl stream and make 
cross-function_context
   tree references to become global.

4) Of course after early inlining function may need abstract origins from
   multiple other functions.  I do not track this at all.
   May be easy to just collect a vector of functions that are needed into
   cgraph_node.

Of course solving 1)-4) is bit of early debug info without actually going to
stream the dwarf dies, but by using the BLOCK trees as a temporary 
representation.
Incrementally we can have this saved BLOCK tree to be a dwarf DIE and have
origins to point to them instead of decls.

To get resonable streaming performance it would be nice to have way to get
abstract origin references cross-partition that debug info can accomplish.

Said that, I now have the fork() patch in all my trees and enjoy 50% faster
WPA times.  I changed my mind about claim that stremaing should be disk bound -
it is hard to hope for disk boundness for something that should fit in cache.

We went down from 5GB to 2GB of streaming for Firefox that is good.  But we will
see again 4GB once Martin's code layout work will land.  I think it is from good
part because of the origin fun above.

Honza

> by parallelizing it.  Of course we can always come back to this
> kind of hack later.
> 
> Richard.


Re: Make some comdats implicitly hidden

2013-08-29 Thread Jan Hubicka
> Hi,
> 
> On 08/29/2013 02:19 PM, Jan Hubicka wrote:
> >So my belief is that it is safe to drop those symbols from
> >libstdc++. Every program/DSO using them have to define its own
> >copy of those symbols, so I believe removing them from libstdc++
> >won't cause issues.
> Really, you should check with Jakub before proceeding. I the change
> it's Ok with him, it's Ok with me too (the other library maintainers
> should be in CC however). At minimum the baselines would need
> updating.
> 
> If I can ask you a personal courtesy, please don't commit anything
> causing regressions, like check_abi failing, either a complete
> change, or nothing.

No worries. I do not intend to commit the patch until the check_abi issue is
discussed and solved.

(check_abi tends to be easy to miss when one checks test results by hand, but I
am trying to force myself to always use compare_tests, so hopefully I won't do
that in future)

Note that currently you will get the same abi_check failure if you try to LTO
libstdc++.so.  It would be really nice if we started to do that eventually.

Honza


Re: Make some comdats implicitly hidden

2013-08-29 Thread Jakub Jelinek
On Thu, Aug 29, 2013 at 02:47:46PM +0200, Paolo Carlini wrote:
> On 08/29/2013 02:19 PM, Jan Hubicka wrote:
> >So my belief is that it is safe to drop those symbols from
> >libstdc++. Every program/DSO using them have to define its own
> >copy of those symbols, so I believe removing them from libstdc++
> >won't cause issues.
> Really, you should check with Jakub before proceeding. I the change
> it's Ok with him, it's Ok with me too (the other library maintainers
> should be in CC however). At minimum the baselines would need
> updating.

I'm very nervous about removing any exported symbols, apps could dlsym them
or whatever.  So, if the compiler makes those hidden, either there should be
an option to restore the old behavior and libstdc++ should use it, or we
need some renaming/alias hacks to restore those.

Jakub


Re: [ubsan] Use pointer map instead of hash table.

2013-08-29 Thread Marek Polacek
On Wed, Aug 28, 2013 at 03:37:31PM +0200, Jakub Jelinek wrote:
> On Wed, Aug 28, 2013 at 03:30:46PM +0200, Marek Polacek wrote:
> > >From a quick look, it doesn't seem like we do.  (I was searching for
> > something about pointer_map in ggc* and gen* files.)
> 
> If we can't make it GC aware easily, I'm afraid we need to revert that change 
> to
> pointer_map.  Now, the question is if the if_marked stuff can be easily done
> for the previous implementation with C++ish hash table, or if we should just
> use something similar to what tree.c uses for value_expr_for_decl and
> similar (of course, with minor adjustments, because we want to map from
> TYPE_UID rather than DECL_UID of the key to tree).  Or with pointer_map we'd
> need to push both the keys (types) and what they map to into (decls) into
> some GC vector in addition to the pointer_map.

This is an attempt to convert the pointer_map into C style hash map;
it mimics value_expr_for_decl stuff in tree.[ch].  I've built a bunch of
build_distinct_type_copy's that are in fact unused and in the final assembly
I see only the really used ones, so hopefully it works ;).

Tested x86_64-linux, ran bootstrap-ubsan and stage2/3 comparison looks
ok.  Though, I'd appreciate if either of you could take a peek at it
before I commit it.  Thanks,

2013-08-29  Marek Polacek  

* Makefile.in (ubsan.o): Add HASH_TABLE_H and gt-ubsan.h
dependencies.  Remove pointer-set.h dependency.
* ubsan.c: Convert to C style hash table.

--- gcc/Makefile.in.mp  2013-08-29 14:24:49.839578625 +0200
+++ gcc/Makefile.in 2013-08-29 14:54:39.354258737 +0200
@@ -2273,7 +2273,7 @@ tsan.o : $(CONFIG_H) $(SYSTEM_H) $(TREE_
intl.h cfghooks.h output.h options.h $(C_COMMON_H) tsan.h asan.h \
tree-ssa-propagate.h
 ubsan.o : ubsan.c ubsan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
-   output.h coretypes.h $(TREE_H) $(CGRAPH_H) pointer-set.h \
+   output.h coretypes.h $(TREE_H) $(CGRAPH_H) $(HASH_TABLE_H) gt-ubsan.h \
toplev.h $(C_COMMON_H)
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
$(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
--- gcc/ubsan.c.mp  2013-08-29 14:24:49.840578629 +0200
+++ gcc/ubsan.c 2013-08-29 14:24:54.848597440 +0200
@@ -24,39 +24,71 @@ along with GCC; see the file COPYING3.
 #include "tree.h"
 #include "cgraph.h"
 #include "gimple.h"
+#include "hash-table.h"
 #include "pointer-set.h"
 #include "output.h"
 #include "toplev.h"
 #include "ubsan.h"
 #include "c-family/c-common.h"
 
-/* Map a TYPE to an ubsan type descriptor VAR_DECL for that type.  */
-static pointer_map *typedesc_map;
+/* Map from a tree to a VAR_DECL tree.  */
 
-/* Insert DECL as the VAR_DECL for TYPE in the TYPEDESC_MAP.  */
+struct GTY(()) tree_type_map {
+  struct tree_map_base type;
+  unsigned int hash;
+  tree decl;
+};
+
+#define tree_type_map_eq tree_map_base_eq
+#define tree_type_map_marked_p tree_map_base_marked_p
+
+static GTY ((if_marked ("tree_type_map_marked_p"), param_is (struct 
tree_type_map)))
+ htab_t decl_tree_for_type;
+
+/* Initialize the hash table.  */
 
 static void
-insert_decl_for_type (tree decl, tree type)
+init_hash_table (void)
 {
-  *typedesc_map->insert (type) = decl;
+  decl_tree_for_type = htab_create_ggc (512, tree_decl_map_hash,
+   tree_decl_map_eq, 0);
 }
 
-/* Find the VAR_DECL for TYPE in TYPEDESC_MAP.  If TYPE does not
-   exist in the map, return NULL_TREE, otherwise, return the VAR_DECL
-   we found.  */
+/* Lookup a VAR_DECL for TYPE, and return it if we find one.  */
 
-static tree
-lookup_decl_for_type (tree type)
+tree
+decl_for_type_lookup (tree type)
 {
-  /* If the pointer map is not initialized yet, create it now.  */
-  if (typedesc_map == NULL)
+  /* If the hash table is not initialized yet, create it now.  */
+  if (decl_tree_for_type == NULL)
 {
-  typedesc_map = new pointer_map;
+  init_hash_table ();
   /* That also means we don't have to bother with the lookup.  */
   return NULL_TREE;
 }
-  tree *t = typedesc_map->contains (type);
-  return t ? *t : NULL_TREE;
+
+  struct tree_type_map *h, in;
+  in.type.from = type;
+
+  h = (struct tree_type_map *)
+  htab_find_with_hash (decl_tree_for_type, &in, TYPE_UID (type));
+  return h ? h->decl : NULL_TREE;
+}
+
+/* Insert a mapping TYPE->DECL in the VAR_DECL for type hashtable.  */
+
+void
+decl_for_type_insert (tree type, tree decl)
+{
+  struct tree_type_map *h;
+  void **slot;
+
+  h = ggc_alloc_tree_type_map ();
+  h->type.from = type;
+  h->decl = decl;
+  slot = htab_find_slot_with_hash (decl_tree_for_type, h, TYPE_UID (type),
+  INSERT);
+  *(struct tree_type_map **) slot = h;
 }
 
 /* Helper routine, which encodes a value in the pointer_sized_int_node.
@@ -226,7 +258,7 @@ ubsan_type_descriptor (tree type)
   /* See through any typedefs.  */
   type = TYPE_MAIN_VARIANT (type);
 
-  tree decl = lookup_decl_for_type (type);
+  tree decl = decl_for_typ

[PATCH] Fix PR58246

2013-08-29 Thread Richard Biener

The following fixes PR58246 by fixing a dominator check.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-08-29  Richard Biener  

PR tree-optimization/58246
* tree-ssa-dce.c (mark_aliased_reaching_defs_necessary_1): Properly
handle the dominance check inside a basic-block.

* gcc.dg/torture/pr58246.c: New testcase.

Index: gcc/tree-ssa-dce.c
===
*** gcc/tree-ssa-dce.c  (revision 202068)
--- gcc/tree-ssa-dce.c  (working copy)
*** mark_aliased_reaching_defs_necessary_1 (
*** 574,579 
--- 574,584 
  in the references (gcc.c-torture/execute/pr42142.c).
  The simplest way is to check if the kill dominates
  the use.  */
+  /* But when both are in the same block we cannot
+ easily tell whether we came from a backedge
+ unless we decide to compute stmt UIDs
+ (see PR58246).  */
+  && (basic_block) data != gimple_bb (def_stmt)
   && dominated_by_p (CDI_DOMINATORS, (basic_block) data,
  gimple_bb (def_stmt))
   && operand_equal_p (ref->ref, lhs, 0))
Index: gcc/testsuite/gcc.dg/torture/pr58246.c
===
*** gcc/testsuite/gcc.dg/torture/pr58246.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr58246.c  (working copy)
***
*** 0 
--- 1,21 
+ /* { dg-do run } */
+ 
+ extern void abort (void);
+ 
+ int a, b; 
+ 
+ int main ()
+ {
+   int t[2] = {1,1};
+ 
+   for (a = 0; a < 2; a++)
+ {
+   b ^= t[a];
+   t[a] = t[1] = 0;
+ }
+ 
+   if (b != 1)
+ abort ();
+ 
+   return 0;
+ }


Re: [4.8] 3 backports

2013-08-29 Thread Richard Biener
On Thu, 29 Aug 2013, Jakub Jelinek wrote:

> Hi!
> 
> I've bootstrapped/regtested on x86_64-linux and i686-linux these 3 backports
> (last patch is actually combined from 3 separate commits), all applied
> cleanly to 4.8 branch.
> 
> Ok for 4.8?

Ok.

Thanks,
Richard.


Re: Make some comdats implicitly hidden

2013-08-29 Thread Jan Hubicka
> On Thu, Aug 29, 2013 at 02:47:46PM +0200, Paolo Carlini wrote:
> > On 08/29/2013 02:19 PM, Jan Hubicka wrote:
> > >So my belief is that it is safe to drop those symbols from
> > >libstdc++. Every program/DSO using them have to define its own
> > >copy of those symbols, so I believe removing them from libstdc++
> > >won't cause issues.
> > Really, you should check with Jakub before proceeding. I the change
> > it's Ok with him, it's Ok with me too (the other library maintainers
> > should be in CC however). At minimum the baselines would need
> > updating.
> 
> I'm very nervous about removing any exported symbols, apps could dlsym them
> or whatever.  So, if the compiler makes those hidden, either there should be
> an option to restore the old behavior and libstdc++ should use it, or we
> need some renaming/alias hacks to restore those.

I can add an option to disable it ( would -fno-private-inlines work?) that will
disable this transformation to hidden as well as LTO comdat localization when
symbol is PREVAIL_EXT?

Of course if we are affraid to remove the symbols from libstdc++, I wonder
how safe we are about APIs of other C++ libraries.  Do you think people dlsym
inline functions and comdat vtables from dlopened DSOs? 

Honza
> 
>   Jakub


Re: [ubsan] Use pointer map instead of hash table.

2013-08-29 Thread Jakub Jelinek
On Thu, Aug 29, 2013 at 02:56:58PM +0200, Marek Polacek wrote:
> --- gcc/Makefile.in.mp2013-08-29 14:24:49.839578625 +0200
> +++ gcc/Makefile.in   2013-08-29 14:54:39.354258737 +0200
> @@ -2273,7 +2273,7 @@ tsan.o : $(CONFIG_H) $(SYSTEM_H) $(TREE_
> intl.h cfghooks.h output.h options.h $(C_COMMON_H) tsan.h asan.h \
> tree-ssa-propagate.h
>  ubsan.o : ubsan.c ubsan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
> -   output.h coretypes.h $(TREE_H) $(CGRAPH_H) pointer-set.h \
> +   output.h coretypes.h $(TREE_H) $(CGRAPH_H) $(HASH_TABLE_H) gt-ubsan.h \
> toplev.h $(C_COMMON_H)
>  tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
> $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
> --- gcc/ubsan.c.mp2013-08-29 14:24:49.840578629 +0200
> +++ gcc/ubsan.c   2013-08-29 14:24:54.848597440 +0200
> @@ -24,39 +24,71 @@ along with GCC; see the file COPYING3.
>  #include "tree.h"
>  #include "cgraph.h"
>  #include "gimple.h"
> +#include "hash-table.h"

Why not #include "hashtab.h" instead (and corresponding change in
Makefile.in ?

>  #include "pointer-set.h"
>  #include "output.h"
>  #include "toplev.h"
>  #include "ubsan.h"
>  #include "c-family/c-common.h"
>  
> -/* Map a TYPE to an ubsan type descriptor VAR_DECL for that type.  */
> -static pointer_map *typedesc_map;
> +/* Map from a tree to a VAR_DECL tree.  */
>  
> -/* Insert DECL as the VAR_DECL for TYPE in the TYPEDESC_MAP.  */
> +struct GTY(()) tree_type_map {
> +  struct tree_map_base type;
> +  unsigned int hash;

You use TYPE_UID as hash, and never use hash field, so why you are adding
it?

> +#define tree_type_map_eq tree_map_base_eq
> +#define tree_type_map_marked_p tree_map_base_marked_p
> +
> +static GTY ((if_marked ("tree_type_map_marked_p"), param_is (struct 
> tree_type_map)))
> + htab_t decl_tree_for_type;
> +
> +/* Initialize the hash table.  */
>  
>  static void
> -insert_decl_for_type (tree decl, tree type)
> +init_hash_table (void)
>  {
> -  *typedesc_map->insert (type) = decl;
> +  decl_tree_for_type = htab_create_ggc (512, tree_decl_map_hash,
> + tree_decl_map_eq, 0);

tree_type_map_hash and tree_type_map_eq here, right?
You could also just manually inline init_hash_table into the single caller, 
after
all, it is just one statement.
512 might be too much, how many types we put there typically?  10, 20?

> -/* Find the VAR_DECL for TYPE in TYPEDESC_MAP.  If TYPE does not
> -   exist in the map, return NULL_TREE, otherwise, return the VAR_DECL
> -   we found.  */
> +/* Lookup a VAR_DECL for TYPE, and return it if we find one.  */
>  
> -static tree
> -lookup_decl_for_type (tree type)
> +tree
> +decl_for_type_lookup (tree type)

Why are you dropping the static here?

> +/* Insert a mapping TYPE->DECL in the VAR_DECL for type hashtable.  */
> +
> +void
> +decl_for_type_insert (tree type, tree decl)

Again, can't this be static?

Otherwise looks good.

Jakub


Re: [PATCH ARM]Refine scaled address expression on ARM

2013-08-29 Thread Richard Earnshaw
On 28/08/13 08:00, bin.cheng wrote:
> Hi,
> 
> This patch refines scaled address expression on ARM.  It supports
> "base+index*scale" in arm_legitimate_address_outer_p.  It also tries to
> legitimize "base + index * scale + offset" with "reg <- base + offset;  reg
> + index * scale" by introducing thumb2_legitimize_address.  For now function
> thumb2_legitimize_address is a kind of placeholder and just does the
> mentioned transformation by calling to try_multiplier_address.  Hoping we
> can improve it in the future.
> 
> With this patch:
> 1) "base+index*scale" is recognized.

That's because (PLUS (REG) (MULT (REG) (CONST))) is not canonical form.
 So this shouldn't be necessary.  Can you identify where this
non-canoncial form is being generated?

R.

> 2) PR57540 is fixed.
> 
> Bootstrapped and Tested on A15.  Is it OK?
> 
> Thanks.
> Bin
> 
> 2013-08-28  Bin Cheng  
> 
>   * config/arm/arm.c (arm_legitimate_address_outer_p):
>   Support addressing mode like "base + index * scale".
>   (try_multiplier_address): New function.
>   (arm_legitimize_address): Call try_multiplier_address.
> 
> 
> 6-arm-scaled_address-20130828.txt
> 
> 
> Index: gcc/config/arm/arm.c
> ===
> --- gcc/config/arm/arm.c  (revision 200774)
> +++ gcc/config/arm/arm.c  (working copy)
> @@ -5931,7 +5931,9 @@ arm_legitimate_address_outer_p (enum machine_mode
>   && arm_legitimate_index_p (mode, xop1, outer, strict_p))
>  || (!strict_p && will_be_in_index_register (xop1
> || (arm_address_register_rtx_p (xop1, strict_p)
> -   && arm_legitimate_index_p (mode, xop0, outer, strict_p)));
> +   && arm_legitimate_index_p (mode, xop0, outer, strict_p))
> +   || (arm_address_register_rtx_p (xop0, strict_p)
> +   && arm_legitimate_index_p (mode, xop1, outer, strict_p)));
>  }
>  
>  #if 0
> @@ -6652,6 +6654,106 @@ legitimize_tls_address (rtx x, rtx reg)
>  }
>  }
>  
> +/* Try to find address expression like base + index * scale + offset
> +   in X.  If we find one, force base + offset into register and
> +   construct new expression reg + index * scale; return the new
> +   address expression if it's valid.  Otherwise return X.  */
> +static rtx
> +try_multiplier_address (rtx x, enum machine_mode mode ATTRIBUTE_UNUSED)
> +{
> +  rtx tmp, base_reg, new_rtx;
> +  rtx base = NULL_RTX, index = NULL_RTX, scale = NULL_RTX, offset = NULL_RTX;
> +
> +  gcc_assert (GET_CODE (x) == PLUS);
> +
> +  /* Try to find and record base/index/scale/offset in X. */
> +  if (GET_CODE (XEXP (x, 1)) == MULT)
> +{
> +  tmp = XEXP (x, 0);
> +  index = XEXP (XEXP (x, 1), 0);
> +  scale = XEXP (XEXP (x, 1), 1);
> +  if (GET_CODE (tmp) != PLUS)
> + return x;
> +
> +  base = XEXP (tmp, 0);
> +  offset = XEXP (tmp, 1);
> +}
> +  else
> +{
> +  tmp = XEXP (x, 0);
> +  offset = XEXP (x, 1);
> +  if (GET_CODE (tmp) != PLUS)
> + return x;
> +
> +  base = XEXP (tmp, 0);
> +  scale = XEXP (tmp, 1);
> +  if (GET_CODE (base) == MULT)
> + {
> +   tmp = base;
> +   base = scale;
> +   scale = tmp;
> + }
> +  if (GET_CODE (scale) != MULT)
> + return x;
> +
> +  index = XEXP (scale, 0);
> +  scale = XEXP (scale, 1);
> +}
> +
> +  if (CONST_INT_P (base))
> +{
> +  tmp = base;
> +  base = offset;
> +  offset = tmp;
> +}
> +
> +  if (CONST_INT_P (index))
> +{
> +  tmp = index;
> +  index = scale;
> +  scale = tmp;
> +}
> +
> +  /* ARM only supports constant scale in address.  */
> +  if (!CONST_INT_P (scale))
> +return x;
> +
> +  if (GET_MODE (base) != SImode || GET_MODE (index) != SImode)
> +return x;
> +
> +  /* Only register/constant are allowed in each part.  */
> +  if (!symbol_mentioned_p (base)
> +  && !symbol_mentioned_p (offset)
> +  && !symbol_mentioned_p (index)
> +  && !symbol_mentioned_p (scale))
> +{
> +  /* Force "base+offset" into register and construct
> +  "register+index*scale".  Return the new expression
> +  only if it's valid.  */
> +  tmp = gen_rtx_PLUS (SImode, base, offset);
> +  base_reg = force_reg (SImode, tmp);
> +  tmp = gen_rtx_fmt_ee (MULT, SImode, index, scale);
> +  new_rtx = gen_rtx_PLUS (SImode, base_reg, tmp);
> +  return new_rtx;
> +}
> +
> +  return x;
> +}
> +
> +/* Try machine-dependent ways of modifying an illegitimate Thumb2 address
> +   to be legitimate.  If we find one, return the new address.
> +
> +   TODO: legitimize_address for Thumb2.  */
> +static rtx
> +thumb2_legitimize_address (rtx x, rtx orig_x ATTRIBUTE_UNUSED,
> +enum machine_mode mode)
> +{
> +  if (GET_CODE (x) == PLUS)
> +return try_multiplier_address (x, mode);
> +
> +  return x;
> +}
> +
>  /* Try machine-dependent ways of modifying an illegitimate address
> 

Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Michael Matz
Hi,

On Thu, 29 Aug 2013, Richard Biener wrote:

> > Fork-fire-forget is really a much simpler choice here IMO; no worries 
> > about shared resources, less debug hassle.
> 
> It might be not as cheap as it is on Linux hosts on other hosts of
> course.

Sure.  Don't use it there then.  Not a reason for not having the 
improvement on linux.

> Also I'd rather try to avoid I/O than solving the issue by parallelizing 
> it.

Of course.  There's always something still better.

> Of course we can always come back to this kind of hack later.

For 4.9 latest, if we don't have anything nicer by then.  OTOH we could 
also remove Honzas patch when and if something better comes around ;)


Ciao,
Michael.


Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Teresa Johnson
On Wed, Aug 28, 2013 at 9:07 AM, Teresa Johnson  wrote:
> On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson  wrote:
>> On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
>>  wrote:
>>> On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson  wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson  
 wrote:
> On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor  wrote:
>> Hi,
>>
>> On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
>>> On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor  wrote:
>>> > On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
>>> >> This patch ports messages to the new dump framework,
>>> >
>>> > It would be great this new framework was documented somewhere.  I lost
>>> > track of what was agreed it would be and from the uses in the
>>> > vectorizer I was never quite sure how to utilize it in other passes.
>>>
>>> Cc'ing Sharad who implemented this - Sharad, is this documented on a
>>> wiki or elsewhere?
>>
>> Thanks
>>
>>>
>>> >
>>> > I'd also like to point out two other minor things inline:
>>> >
>>> > [...]
>>> >
>>> >> 2013-08-06  Teresa Johnson  
>>> >> Dehao Chen  
>>> >>
>>> >> * dumpfile.c (dump_loc): Add column number to output, make 
>>> >> newlines
>>> >> consistent.
>>> >> * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
>>> >> OPTGROUP_ALL.
>>> >> * ipa-inline-transform.c (clone_inlined_nodes):
>>> >> (cgraph_node_opt_info): New function.
>>> >> (cgraph_node_call_chain): Ditto.
>>> >> (dump_inline_decision): Ditto.
>>> >> (inline_call): Invoke dump_inline_decision.
>>> >> * doc/invoke.texi: Document optall -fopt-info flag.
>>> >> * profile.c (read_profile_edge_counts): Use new dump 
>>> >> framework.
>>> >> (compute_branch_probabilities): Ditto.
>>> >> * passes.c (pass_manager::register_one_dump_file): Use 
>>> >> OPTGROUP_OTHER
>>> >> when pass not in any opt group.
>>> >> * value-prof.c (check_counter): Use new dump framework.
>>> >> (find_func_by_funcdef_no): Ditto.
>>> >> (check_ic_target): Ditto.
>>> >> * coverage.c (get_coverage_counts): Ditto.
>>> >> (coverage_init): Setup new dump framework.
>>> >> * ipa-inline.c (inline_small_functions): Set 
>>> >> is_in_ipa_inline.
>>> >> * ipa-inline.h (is_in_ipa_inline): Declare.
>>> >>
>>> >> * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
>>> >> * testsuite/gcc.dg/pr26570.c: Ditto.
>>> >> * testsuite/gcc.dg/pr32773.c: Ditto.
>>> >> * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
>>> >>
>>> >
>>> > [...]
>>> >
>>> >> Index: ipa-inline-transform.c
>>> >> ===
>>> >> --- ipa-inline-transform.c  (revision 201461)
>>> >> +++ ipa-inline-transform.c  (working copy)
>>> >> @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
>>> >> bool d
>>> >>  }
>>> >>
>>> >>
>>> >> +#define MAX_INT_LENGTH 20
>>> >> +
>>> >> +/* Return NODE's name and profile count, if available.  */
>>> >> +
>>> >> +static const char *
>>> >> +cgraph_node_opt_info (struct cgraph_node *node)
>>> >> +{
>>> >> +  char *buf;
>>> >> +  size_t buf_size;
>>> >> +  const char *bfd_name = lang_hooks.dwarf_name (node->symbol.decl, 
>>> >> 0);
>>> >> +
>>> >> +  if (!bfd_name)
>>> >> +bfd_name = "unknown";
>>> >> +
>>> >> +  buf_size = strlen (bfd_name) + 1;
>>> >> +  if (profile_info)
>>> >> +buf_size += (MAX_INT_LENGTH + 3);
>>> >> +
>>> >> +  buf = (char *) xmalloc (buf_size);
>>> >> +
>>> >> +  strcpy (buf, bfd_name);
>>> >> +
>>> >> +  if (profile_info)
>>> >> +sprintf (buf, "%s ("HOST_WIDEST_INT_PRINT_DEC")", buf, 
>>> >> node->count);
>>> >> +  return buf;
>>> >> +}
>>> >
>>> > I'm not sure if output of this function is aimed only at the user or
>>> > if it is supposed to be used by gcc developers as well.  If the
>>> > latter, an incredibly useful thing is to also dump node->symbol.order
>>> > too.  We usually dump it after "/" sign separating it from node name.
>>> > It is invaluable when examining decisions in C++ code where you can
>>> > have lots of clones of a node (and also because existing dumps print
>>> > it, it is easy to combine them).
>>>
>>> The output is useful for both power users doing performance tuning of
>>> their application, and by gcc developers. Adding the id is not so
>>> useful for the former, but I agree that it is very useful for compiler
>>> developers. In fact, in the google branch vers

Re: [ubsan] Use pointer map instead of hash table.

2013-08-29 Thread Marek Polacek
On Thu, Aug 29, 2013 at 03:05:59PM +0200, Jakub Jelinek wrote:
> On Thu, Aug 29, 2013 at 02:56:58PM +0200, Marek Polacek wrote:
> > --- gcc/Makefile.in.mp  2013-08-29 14:24:49.839578625 +0200
> > +++ gcc/Makefile.in 2013-08-29 14:54:39.354258737 +0200
> > @@ -2273,7 +2273,7 @@ tsan.o : $(CONFIG_H) $(SYSTEM_H) $(TREE_
> > intl.h cfghooks.h output.h options.h $(C_COMMON_H) tsan.h asan.h \
> > tree-ssa-propagate.h
> >  ubsan.o : ubsan.c ubsan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
> > -   output.h coretypes.h $(TREE_H) $(CGRAPH_H) pointer-set.h \
> > +   output.h coretypes.h $(TREE_H) $(CGRAPH_H) $(HASH_TABLE_H) gt-ubsan.h \
> > toplev.h $(C_COMMON_H)
> >  tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
> > $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
> > --- gcc/ubsan.c.mp  2013-08-29 14:24:49.840578629 +0200
> > +++ gcc/ubsan.c 2013-08-29 14:24:54.848597440 +0200
> > @@ -24,39 +24,71 @@ along with GCC; see the file COPYING3.
> >  #include "tree.h"
> >  #include "cgraph.h"
> >  #include "gimple.h"
> > +#include "hash-table.h"
> 
> Why not #include "hashtab.h" instead (and corresponding change in
> Makefile.in ?

Fixed.

> >  #include "pointer-set.h"
> >  #include "output.h"
> >  #include "toplev.h"
> >  #include "ubsan.h"
> >  #include "c-family/c-common.h"
> >  
> > -/* Map a TYPE to an ubsan type descriptor VAR_DECL for that type.  */
> > -static pointer_map *typedesc_map;
> > +/* Map from a tree to a VAR_DECL tree.  */
> >  
> > -/* Insert DECL as the VAR_DECL for TYPE in the TYPEDESC_MAP.  */
> > +struct GTY(()) tree_type_map {
> > +  struct tree_map_base type;
> > +  unsigned int hash;
> 
> You use TYPE_UID as hash, and never use hash field, so why you are adding
> it?

Removed the hash field.

> > +#define tree_type_map_eq tree_map_base_eq
> > +#define tree_type_map_marked_p tree_map_base_marked_p
> > +
> > +static GTY ((if_marked ("tree_type_map_marked_p"), param_is (struct 
> > tree_type_map)))
> > + htab_t decl_tree_for_type;
> > +
> > +/* Initialize the hash table.  */
> >  
> >  static void
> > -insert_decl_for_type (tree decl, tree type)
> > +init_hash_table (void)
> >  {
> > -  *typedesc_map->insert (type) = decl;
> > +  decl_tree_for_type = htab_create_ggc (512, tree_decl_map_hash,
> > +   tree_decl_map_eq, 0);
> 
> tree_type_map_hash and tree_type_map_eq here, right?

Yeah.  Fixed.

> You could also just manually inline init_hash_table into the single caller, 
> after
> all, it is just one statement.

Done.

> 512 might be too much, how many types we put there typically?  10, 20?

Of course, 10 should be enough.

> > -/* Find the VAR_DECL for TYPE in TYPEDESC_MAP.  If TYPE does not
> > -   exist in the map, return NULL_TREE, otherwise, return the VAR_DECL
> > -   we found.  */
> > +/* Lookup a VAR_DECL for TYPE, and return it if we find one.  */
> >  
> > -static tree
> > -lookup_decl_for_type (tree type)
> > +tree
> > +decl_for_type_lookup (tree type)
> 
> Why are you dropping the static here?

:( No clue.  Fixed.

> > +/* Insert a mapping TYPE->DECL in the VAR_DECL for type hashtable.  */
> > +
> > +void
> > +decl_for_type_insert (tree type, tree decl)
> 
> Again, can't this be static?

Fixed.

> Otherwise looks good.

Thanks, will give it another bootstrap-ubsan round and if that's fine,
I'll commit the following.

2013-08-29  Marek Polacek  

* Makefile.in (ubsan.o): Add HASHAB_H and gt-ubsan.h
dependencies.  Remove pointer-set.h dependency.
* ubsan.c: Convert to C style hash table.

--- gcc/Makefile.in.mp  2013-08-29 14:24:49.839578625 +0200
+++ gcc/Makefile.in 2013-08-29 15:13:16.538323950 +0200
@@ -2273,7 +2273,7 @@ tsan.o : $(CONFIG_H) $(SYSTEM_H) $(TREE_
intl.h cfghooks.h output.h options.h $(C_COMMON_H) tsan.h asan.h \
tree-ssa-propagate.h
 ubsan.o : ubsan.c ubsan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
-   output.h coretypes.h $(TREE_H) $(CGRAPH_H) pointer-set.h \
+   output.h coretypes.h $(TREE_H) $(CGRAPH_H) $(HASHTAB_H) gt-ubsan.h \
toplev.h $(C_COMMON_H)
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
$(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
--- gcc/ubsan.c.mp  2013-08-29 14:24:49.840578629 +0200
+++ gcc/ubsan.c 2013-08-29 15:22:42.368445186 +0200
@@ -24,39 +24,63 @@ along with GCC; see the file COPYING3.
 #include "tree.h"
 #include "cgraph.h"
 #include "gimple.h"
+#include "hashtab.h"
 #include "pointer-set.h"
 #include "output.h"
 #include "toplev.h"
 #include "ubsan.h"
 #include "c-family/c-common.h"
 
-/* Map a TYPE to an ubsan type descriptor VAR_DECL for that type.  */
-static pointer_map *typedesc_map;
+/* Map from a tree to a VAR_DECL tree.  */
 
-/* Insert DECL as the VAR_DECL for TYPE in the TYPEDESC_MAP.  */
+struct GTY(()) tree_type_map {
+  struct tree_map_base type;
+  tree decl;
+};
 
-static void
-insert_decl_for_type (tree decl, tree type)
-{
-  *typedesc_map->insert (type) = decl;
-}
+#define tree_type_map_eq tree_map_base_e

Re: wide-int branch now up for public comment and review

2013-08-29 Thread Kenneth Zadeck

On 08/28/2013 05:08 AM, Richard Biener wrote:

On Sun, 25 Aug 2013, Mike Stump wrote:


On Aug 25, 2013, at 12:26 AM, Richard Sandiford  
wrote:

(2) Adding a new namespace, wi, for the operators.  So far this
just contains the previously-static comparison functions
and whatever else was needed to avoid cross-dependencies
between wi and wide_int_ro (except for the debug routines).

It seems reasonable; I don't see anything I object to.  Seems like most of the 
time, the code is shorter (though, you use wi, which is fairly short).  It 
doesn't seem any more complex, though, knowing how to spell the operation 
wide_int:: v wi:: is confusing on the client side.  I'm torn between this and 
the nice things that come with the patch.


(3) Removing the comparison member functions and using the static
ones everywhere.

I've love to have richi weigh in (or someone else that wants to play the
role of C++ coding expert)?  I'd defer to them?

Yeah - wi::lt (a, b) is much better than a.lt (b) IMHO.  It mimics how
the standard library works.


The idea behind using a namespace rather than static functions
is that it makes it easier to separate the core, tree and rtx bits.

Being able to separate core, tree and rtx bits gets a +1 in my book.  I
do understand the beauty of this.

Now, if you look back in discussions I wanted a storage
abstraction anyway.  Basically the interface is

class wide_int_storage
{
   int precision ();
   int len ();
   element_t get (unsigned);
   void set (unsigned, element_t);
};

and wide_int is then templated like

template 
class wide_int : public storage
{
};

where RTX / tree storage classes provide read-only access to their
storage and a rvalue integer rep to its value.

You can look at my example draft implementation I posted some
months ago.  But I'll gladly wiggle on the branch to make it
more like above (easy step one: don't access the wide-int members
directly but via accessor functions)
you are of course welcome to do this.   But there were two questions 
that i never got an answer to.


1) how does doing this help in any way.   Are their clients that will 
use this?


2) isn't this just going to slow wide-int down?  It seems clear that 
there is a real performance cost in that now there is going to be an 
extra step of indirection on each access to the underlying data structures.


I will point out we have cut down on the copying by sharing the 
underlying value from trees or rtl when the value is short lived. But 
given that most constants only take a single hwi to represent, getting 
rid of the copying seems like a very small payback for the increase in 
access costs.



IMO wide-int.h shouldn't know about trees and rtxes, and all routines
related to them should be in tree.h and rtl.h instead.  But using
static functions means that you have to declare everything in one place.
Also, it feels odd for wide_int to be both an object and a home
of static functions that don't always operate on wide_ints, e.g. when
comparing a CONST_INT against 16.

Indeed - in my sample the wide-int-rtx-storage and wide-int-tree-storage
storage models were declared in rtl.h and tree.h and wide-int.h did
know nothing about them.

this could be done for us too, it has nothing to do with the storage model.

Yes, though, does wi feel odd being a home for comparing a CONST_INT and
16?  :-)


I realise I'm probably not being helpful here.

Iterating on how we want to code to look like is reasonable.  Prettying
it up where it needs it, is good.

Indeed, if the code is as you like, and as richi likes, we'll then our
mission is just about complete.  :-)  For this patch, I'd love to defer
to richi (or someone that has a stronger opinion than I do) to say,
better, worse?

The comparisons?  Better.

richard s is making these static.

Thanks,
Richard.




Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Teresa Johnson
On Thu, Aug 29, 2013 at 3:05 AM, Richard Biener
 wrote:
 Does
 'make newlines consitent' avoid all the spurious vertical spacing I see 
 with
 -fopt-info?
>>>
>>> Well, it helps get us there. The problem was that before, since
>>> dump_loc was not consistently emitting newlines, the calls had to emit
>>> their own newlines manually in the string to ensure there was a
>>> newline at all. I was thinking that once this is fixed I could go back
>>> and clean up all those calls by removing the newlines in the string. I
>>> could split this part into a separate patch and do both at once.
>>>
>>> However, after thinking about this some more this morning, I am
>>> wondering whether it is better to remove the newline emission
>>> completely from dump_loc and rely on the caller to put the newline in
>>> the string. The reason is that there are 2 high level interfaces to
>>> the new dump infrastructure, dump_printf() and dump_printf_loc(). Only
>>> the latter invokes dump_loc and gets the newline at the start of the
>>> message. The typical usage seems to be to start a message via
>>> dump_printf_loc, and then use dump_printf to emit parts of the message
>>> (thus not requiring a newline), but I think it may lead to problems to
>>> rely on this assumption.
>>>
>>> So if you agree, I will simply remove the newline altogether from
>>> dump_loc, and ensure that all clients of dump_printf/dump_printf_loc
>>> include a newline char as appropriate in the string they pass.
>>
>>
>> As a helper function, dump_loc should not blindly emit new line as it
>> has no context.  I have tried to remove it, and push the newline to
>> higher level helpers -- it mostly works, but the vectorizer verbose
>> messages need serious clean up -- most of them assume that
>> dump_printf_loc does not end with new line, so that the expression
>> dump can follow in the same line (the message texts need clean up too
>> -- i do not like the === === in info messages).
>
> I know, but we should really do that cleanup.

I can work on this and will send a separate patch.
Teresa

-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Richard Biener
On Thu, 29 Aug 2013, Jan Hubicka wrote:

> Jakub,
> I am adding you to CC since I put my current toughts on LTO and debug info
> in here.
> > > Fork-fire-forget is really a much simpler choice here IMO; no worries 
> > > about shared resources, less debug hassle.
> > 
> > It might be not as cheap as it is on Linux hosts on other hosts of
> > course.  Also I'd rather try to avoid I/O than solving the issue
> 
> I still have some items on list here
>  1) avoid function sections to be decompressed by WPA
> (this won't cause much compile time improvements as decompression is
>  well bellow 10% of runtime)

still low-hanging

finally get a LTO section header!  (with a flag telling whether the
section is compressed)

>  2) put variable initializers into named sections just as function bodies
> are.
> Seeing Martin's systemtaps of firefox/gimp/inkscape, to my surprise the
> initializers are actually about as big as the text segment.  While
> it seems bit wasteful to pust single integer_cst there (and we can
> special case this), it seems that there is a promise for vtables
> and other stuff.
> 
> To make devirt work, we will need to load vtables into memory (or
> invent representation to stream them other way that would be similarly
> big). Still we will avoid need to load them in 5000 copies and merge
> them.
>  3) I think good part of function/partitioning overhead is because abstract
> origin streaming is utterly broken.
> 
> Currently we can have DECL_ABSTRACT_ORIGIN on a function.  This I can now
> track by used_as_abstract_origin flag and I can stream those functions
> into partitins using them.
> 
> This is still wrong for multitude of reasons
> 
> 1) we really want DECL_INITIAL tree of the functions used as abstract
>origins in the form before any gimple optimizations happened on them.
>(that is when debug hook is called)
>This is not what happens - we stream the tree as it looks during
>TLO streaming time - i.e. after early optimizations.
> 
>I think we may just (at a time calling the debug hook) duplicate 
> DECL_INITIAL
>same way we duplicate decls for save_function_body and saving it 
> elsewhere.
>Making this tree to be abstract origin of the offline copy of the 
> function itself.
> 
> 2) dwarf2out doesn't really the DECL_INITIAL tree so it does something 
> useful
>only when it is already there. 
>It can simply call cgraph_get_body when it needs the DECL_INITIAL, but 
> it
>doesn't becuase push_cfun causes ICE.
>If we really can't push_cfun from middle of RTL queueu, I suppose I can
>just save it elsewhere
> 
> 3) It is not only toplevel decl that has origin, but all local vars in the
>function.
> 
>I think this goes terribly wrong - these decls are not indexable so 
> they
>are stored into function section of every function referring to them.
>They are then read in many duplicates and never merged with the 
> DECL_INITIAL
>tree of the actual abstract origin. For some reason dwarf2out doesn't
>seem to ICE, but I also do not see how this can produce working debug.
>Moreover I think the duplicates contribute to our current debug info
>size problems with LTO.
> 
>If we solve 1) as discussed by above (i.e. by having separate
>block trees for functions that are abstract origins), we can then 
> solve 3)
>by streaming those into global decl stream and make 
> cross-function_context
>tree references to become global.
> 
> 4) Of course after early inlining function may need abstract origins from
>multiple other functions.  I do not track this at all.
>May be easy to just collect a vector of functions that are needed into
>cgraph_node.
> 
> Of course solving 1)-4) is bit of early debug info without actually going 
> to
> stream the dwarf dies, but by using the BLOCK trees as a temporary 
> representation.
> Incrementally we can have this saved BLOCK tree to be a dwarf DIE and have
> origins to point to them instead of decls.
> 
> To get resonable streaming performance it would be nice to have way to get
> abstract origin references cross-partition that debug info can accomplish.

Most of the abstract origin stuff is dropped on the floor by streaming
because you cannot really stream that stuff.  And yes, we need early
debug info to generate the offline abstract origin copy of later inlined
functions, and yes, we have to handle streaming / referencing those in
some way.  But OTOH abstract origins are an optimization for debug info
size, so we can as well not have them.

> Said that, I now have the fork() patch in all my trees and enjoy 50% faster
> WPA times.  I changed my mind about claim that stremaing should be disk bound 
> -
> it is hard to hope for disk boundness for something that sho

Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Richard Biener
Ok.

Richard.

On Thu, Aug 29, 2013 at 3:18 PM, Teresa Johnson  wrote:
> On Wed, Aug 28, 2013 at 9:07 AM, Teresa Johnson  wrote:
>> On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson  wrote:
>>> On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
>>>  wrote:
 On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson  
 wrote:
> On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson  
> wrote:
>> On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor  wrote:
>>> Hi,
>>>
>>> On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor  wrote:
 > On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
 >> This patch ports messages to the new dump framework,
 >
 > It would be great this new framework was documented somewhere.  I 
 > lost
 > track of what was agreed it would be and from the uses in the
 > vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?
>>>
>>> Thanks
>>>

 >
 > I'd also like to point out two other minor things inline:
 >
 > [...]
 >
 >> 2013-08-06  Teresa Johnson  
 >> Dehao Chen  
 >>
 >> * dumpfile.c (dump_loc): Add column number to output, make 
 >> newlines
 >> consistent.
 >> * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
 >> OPTGROUP_ALL.
 >> * ipa-inline-transform.c (clone_inlined_nodes):
 >> (cgraph_node_opt_info): New function.
 >> (cgraph_node_call_chain): Ditto.
 >> (dump_inline_decision): Ditto.
 >> (inline_call): Invoke dump_inline_decision.
 >> * doc/invoke.texi: Document optall -fopt-info flag.
 >> * profile.c (read_profile_edge_counts): Use new dump 
 >> framework.
 >> (compute_branch_probabilities): Ditto.
 >> * passes.c (pass_manager::register_one_dump_file): Use 
 >> OPTGROUP_OTHER
 >> when pass not in any opt group.
 >> * value-prof.c (check_counter): Use new dump framework.
 >> (find_func_by_funcdef_no): Ditto.
 >> (check_ic_target): Ditto.
 >> * coverage.c (get_coverage_counts): Ditto.
 >> (coverage_init): Setup new dump framework.
 >> * ipa-inline.c (inline_small_functions): Set 
 >> is_in_ipa_inline.
 >> * ipa-inline.h (is_in_ipa_inline): Declare.
 >>
 >> * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
 >> * testsuite/gcc.dg/pr26570.c: Ditto.
 >> * testsuite/gcc.dg/pr32773.c: Ditto.
 >> * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 >>
 >
 > [...]
 >
 >> Index: ipa-inline-transform.c
 >> ===
 >> --- ipa-inline-transform.c  (revision 201461)
 >> +++ ipa-inline-transform.c  (working copy)
 >> @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
 >> bool d
 >>  }
 >>
 >>
 >> +#define MAX_INT_LENGTH 20
 >> +
 >> +/* Return NODE's name and profile count, if available.  */
 >> +
 >> +static const char *
 >> +cgraph_node_opt_info (struct cgraph_node *node)
 >> +{
 >> +  char *buf;
 >> +  size_t buf_size;
 >> +  const char *bfd_name = lang_hooks.dwarf_name (node->symbol.decl, 
 >> 0);
 >> +
 >> +  if (!bfd_name)
 >> +bfd_name = "unknown";
 >> +
 >> +  buf_size = strlen (bfd_name) + 1;
 >> +  if (profile_info)
 >> +buf_size += (MAX_INT_LENGTH + 3);
 >> +
 >> +  buf = (char *) xmalloc (buf_size);
 >> +
 >> +  strcpy (buf, bfd_name);
 >> +
 >> +  if (profile_info)
 >> +sprintf (buf, "%s ("HOST_WIDEST_INT_PRINT_DEC")", buf, 
 >> node->count);
 >> +  return buf;
 >> +}
 >
 > I'm not sure if output of this function is aimed only at the user or
 > if it is supposed to be used by gcc developers as well.  If the
 > latter, an incredibly useful thing is to also dump node->symbol.order
 > too.  We usually dump it after "/" sign separating it from node name.
 > It is invaluable when examining decisions in C++ code where you can
 > have lots of clones of a node (and also because existing dumps print
 > it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
>>>

Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Jan Hubicka
> > Said that, I now have the fork() patch in all my trees and enjoy 50% faster
> > WPA times.  I changed my mind about claim that stremaing should be disk 
> > bound -
> > it is hard to hope for disk boundness for something that should fit in 
> > cache.
> 
> It should at least limit its fork rate according to -flto=N or jobserver.
It limits forks to -flto=N.
If the patch seems resonable, I will look into posiblity of adding my jobserver 
client
based on GNU make code.

I also think with -flto we want wrapper to figure out number of threads and 
suppy
default =N (i.e. nonparallel lto would be -flto=0).  Most people don't want to 
worry
about =n/=jobserv parameters and those few projects that don't want to start 
too many
processes to not explode in memory use can get their build machinery right.

Honza
> 
> > We went down from 5GB to 2GB of streaming for Firefox that is good.  But we 
> > will
> > see again 4GB once Martin's code layout work will land.  I think it is from 
> > good
> > part because of the origin fun above.
> 
> Ugh.
> 
> Richard.


[PATCH] Support dumping type bindings and 'mutable' qualifier in lambda diagnostics.

2013-08-29 Thread Adam Butcher
* error.c (dump_lambda_function): New function, dependent on ...
(maybe_dump_template_bindings): ... this new function, factored out of
...
(dump_function_decl): ... here.  Updated to early-out with call to
dump_lambda_function after determining template bindings.
---
Reworked as requested.  Okay to commit?
---
 gcc/cp/error.c | 60 ++
 1 file changed, 39 insertions(+), 21 deletions(-)

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index c82a0ce..1edfa0b 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -1363,6 +1363,39 @@ find_typenames (tree t)
   return ft.typenames;
 }
 
+static void
+maybe_dump_template_bindings (cxx_pretty_printer *pp,
+ tree t, tree template_parms, tree template_args,
+ int flags)
+{
+  if (template_parms != NULL_TREE && template_args != NULL_TREE
+  && !(flags & TFF_NO_TEMPLATE_BINDINGS))
+{
+  vec *typenames = find_typenames (t);
+  pp_cxx_whitespace (pp);
+  pp_cxx_left_bracket (pp);
+  pp_cxx_ws_string (pp, M_("with"));
+  pp_cxx_whitespace (pp);
+  dump_template_bindings (pp, template_parms, template_args, typenames);
+  pp_cxx_right_bracket (pp);
+}
+}
+
+static void
+dump_lambda_function (cxx_pretty_printer *pp,
+ tree fn, tree template_parms, tree template_args,
+ int flags)
+{
+  /* A lambda's signature is essentially its "type".  */
+  dump_type (pp, DECL_CONTEXT (fn), flags);
+  if (!(TYPE_QUALS (class_of_this_parm (TREE_TYPE (fn))) & TYPE_QUAL_CONST))
+{
+  pp->padding = pp_before;
+  pp_c_ws_string (pp, "mutable");
+}
+  maybe_dump_template_bindings (pp, fn, template_parms, template_args, flags);
+}
+
 /* Pretty print a function decl. There are several ways we want to print a
function declaration. The TFF_ bits in FLAGS tells us how to behave.
As error can only apply the '#' flag once to give 0 and 1 for V, there
@@ -1379,15 +1412,6 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, int 
flags)
   int show_return = flags & TFF_RETURN_TYPE || flags & TFF_DECL_SPECIFIERS;
   int do_outer_scope = ! (flags & TFF_UNQUALIFIED_NAME);
   tree exceptions;
-  vec *typenames = NULL;
-
-  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
-{
-  /* A lambda's signature is essentially its "type", so defer.  */
-  gcc_assert (LAMBDA_TYPE_P (DECL_CONTEXT (t)));
-  dump_type (pp, DECL_CONTEXT (t), flags);
-  return;
-}
 
   flags &= ~(TFF_UNQUALIFIED_NAME | TFF_TEMPLATE_NAME);
   if (TREE_CODE (t) == TEMPLATE_DECL)
@@ -1409,10 +1433,13 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, int 
flags)
{
  template_parms = DECL_TEMPLATE_PARMS (tmpl);
  t = tmpl;
- typenames = find_typenames (t);
}
 }
 
+  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
+return dump_lambda_function (pp, t, template_parms, template_args,
+flags);
+
   fntype = TREE_TYPE (t);
   parmtypes = FUNCTION_FIRST_USER_PARMTYPE (t);
 
@@ -1476,17 +1503,8 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, int 
flags)
   if (show_return)
dump_type_suffix (pp, TREE_TYPE (fntype), flags);
 
-  /* If T is a template instantiation, dump the parameter binding.  */
-  if (template_parms != NULL_TREE && template_args != NULL_TREE
- && !(flags & TFF_NO_TEMPLATE_BINDINGS))
-   {
- pp_cxx_whitespace (pp);
- pp_cxx_left_bracket (pp);
- pp_cxx_ws_string (pp, M_("with"));
- pp_cxx_whitespace (pp);
- dump_template_bindings (pp, template_parms, template_args, typenames);
- pp_cxx_right_bracket (pp);
-   }
+  maybe_dump_template_bindings (pp, t, template_parms, template_args,
+   flags);
 }
   else if (template_args)
 {
-- 
1.8.4



Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Teresa Johnson
On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
 wrote:
 New patch below that removes this global variable, and also outputs
 the node->symbol.order (in square brackets after the function name so
 as to not clutter it). Inline messages with profile data look look:

 test.c:8:3: note: foobar [0] (9000) inlined into foo [2] (1000)
 with call count 9000 (via inline instance bar [3] (9000))
>>>
>>> Ick.  This looks both redundant and cluttered.  This is supposed to be
>>> understandable by GCC users, not only GCC developers.
>>
>> The main part that is only useful/understandable to gcc developers is
>> the node->symbol.order in square brackes, requested by Martin. One
>> possibility is that I could put that part under a param, disabled by
>> default. We have something similar on the google branches that emits
>> LIPO module info in the message, enabled via a param.
>
> But we have _dump files_ for that.  That's the developer-consumed
> form of opt-info.  -fopt-info is purely user sugar and for usual translation
> units it shouldn't exceed a single terminal full of output.

But as a developer I don't want to have to parse lots of dump files
for a summary of the major optimizations performed (e.g. inlining,
unrolling) for an application, unless I am diving into the reasons for
why or why not one of those optimizations occurred in a particular
location. I really do want a summary emitted to stderr so that it is
easily searchable/summarizable for the app as a whole.

For example, some of the apps I am interested in have thousands of
input files, and trying to collect and parse dump files for each and
every one is overwhelming (it probably would be even if my input files
numbered in the hundreds). What has been very useful is having these
high level summary messages of inlines and unrolls emitted to stderr
by -fopt-info. Then it is easy to search and sort by hotness to get a
feel for things like what inlines are missing when moving to a new
compiler, or compiling a new version of the source, for example. Then
you know which files to focus on and collect dump files for.

>
>> I'd argue that the other information (the profile counts, emitted only
>> when using -fprofile-use, and the inline call chains) are useful if
>> you want to understand whether and how critical inlines are occurring.
>> I think this is the type of information that users focused on
>> optimizations, as well as gcc developers, want when they use
>> -fopt-info. Otherwise it is difficult to make sense of the inline
>> information.
>
> Well, I doubt that inline information is interesting to users unless we are
> able to aggressively filter it to what users are interested in.  Which IMHO
> isn't possible - users are interested in "I have not inlined this even though
> inlining would severely improve performance" which would indicate a bug
> in the heuristics we can reliably detect and thus it wouldn't be there.

I have interacted with users who are aware of optimizations such as
inlining and unrolling and want to look at that information to
diagnose performance differences when refactoring code or using a new
compiler version. I also think inlining (especially cross-module) is
one example of an optimization that is still being tuned, and user
reports of performance issues related to that have been useful.

I really think that the two groups of people who will find -fopt-info
useful are gcc developers and savvy performance-hungry users. For the
former group the additional info is extremely useful. For the latter
group some of the extra information may not be required (although a
call count is useful for those using profile feedback), but IMO is not
unreasonable.

Teresa


-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [PATCH] Support dumping type bindings and 'mutable' qualifier in lambda diagnostics.

2013-08-29 Thread Gabriel Dos Reis
On Thu, Aug 29, 2013 at 9:20 AM, Adam Butcher  wrote:
> * error.c (dump_lambda_function): New function, dependent on ...
> (maybe_dump_template_bindings): ... this new function, factored out of
> ...
> (dump_function_decl): ... here.  Updated to early-out with call to
> dump_lambda_function after determining template bindings.
> ---
> Reworked as requested.  Okay to commit?

Document the new functions.
Use pp->translate_string ("with") instead of
pp_cxx_ws_string (pp, M_("with")).

 OK to commit with those changes.

Thanks,
-- Gaby

>
> +static void
> +maybe_dump_template_bindings (cxx_pretty_printer *pp,
> + tree t, tree template_parms, tree template_args,
> + int flags)
> +{
> +  if (template_parms != NULL_TREE && template_args != NULL_TREE
> +  && !(flags & TFF_NO_TEMPLATE_BINDINGS))
> +{
> +  vec *typenames = find_typenames (t);
> +  pp_cxx_whitespace (pp);
> +  pp_cxx_left_bracket (pp);
> +  pp_cxx_ws_string (pp, M_("with"));
> +  pp_cxx_whitespace (pp);
> +  dump_template_bindings (pp, template_parms, template_args, typenames);
> +  pp_cxx_right_bracket (pp);
> +}
> +}
> +
> +static void
> +dump_lambda_function (cxx_pretty_printer *pp,
> + tree fn, tree template_parms, tree template_args,
> + int flags)
> +{
> +  /* A lambda's signature is essentially its "type".  */
> +  dump_type (pp, DECL_CONTEXT (fn), flags);
> +  if (!(TYPE_QUALS (class_of_this_parm (TREE_TYPE (fn))) & TYPE_QUAL_CONST))
> +{
> +  pp->padding = pp_before;
> +  pp_c_ws_string (pp, "mutable");
> +}
> +  maybe_dump_template_bindings (pp, fn, template_parms, template_args, 
> flags);
> +}
> +
>  /* Pretty print a function decl. There are several ways we want to print a
> function declaration. The TFF_ bits in FLAGS tells us how to behave.
> As error can only apply the '#' flag once to give 0 and 1 for V, there
> @@ -1379,15 +1412,6 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, 
> int flags)
>int show_return = flags & TFF_RETURN_TYPE || flags & TFF_DECL_SPECIFIERS;
>int do_outer_scope = ! (flags & TFF_UNQUALIFIED_NAME);
>tree exceptions;
> -  vec *typenames = NULL;
> -
> -  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
> -{
> -  /* A lambda's signature is essentially its "type", so defer.  */
> -  gcc_assert (LAMBDA_TYPE_P (DECL_CONTEXT (t)));
> -  dump_type (pp, DECL_CONTEXT (t), flags);
> -  return;
> -}
>
>flags &= ~(TFF_UNQUALIFIED_NAME | TFF_TEMPLATE_NAME);
>if (TREE_CODE (t) == TEMPLATE_DECL)
> @@ -1409,10 +1433,13 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, 
> int flags)
> {
>   template_parms = DECL_TEMPLATE_PARMS (tmpl);
>   t = tmpl;
> - typenames = find_typenames (t);
> }
>  }
>
> +  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
> +return dump_lambda_function (pp, t, template_parms, template_args,
> +flags);
> +
>fntype = TREE_TYPE (t);
>parmtypes = FUNCTION_FIRST_USER_PARMTYPE (t);
>
> @@ -1476,17 +1503,8 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, 
> int flags)
>if (show_return)
> dump_type_suffix (pp, TREE_TYPE (fntype), flags);
>
> -  /* If T is a template instantiation, dump the parameter binding.  */
> -  if (template_parms != NULL_TREE && template_args != NULL_TREE
> - && !(flags & TFF_NO_TEMPLATE_BINDINGS))
> -   {
> - pp_cxx_whitespace (pp);
> - pp_cxx_left_bracket (pp);
> - pp_cxx_ws_string (pp, M_("with"));
> - pp_cxx_whitespace (pp);
> - dump_template_bindings (pp, template_parms, template_args, 
> typenames);
> - pp_cxx_right_bracket (pp);
> -   }
> +  maybe_dump_template_bindings (pp, t, template_parms, template_args,
> +   flags);
>  }
>else if (template_args)
>  {
> --
> 1.8.4
>


[C++ Patch] PR 51424

2013-08-29 Thread Paolo Carlini

Hi,

thus I have this simple patch which at least catches pure 
self-delegation (no cycles). Better than nothing, I would say, given its 
simplicity ;)


At first I thought I would put the check in expand_default_init but then 
I noticed that in case of, eg, virtual bases, the simple pattern 
matching (which looks inside CALL_EXPRs) would miss some cases, like 
when the self-delegation is only in the derived type, not in the base.


Tested x86_64-linux.

Thanks!
Paolo.

/
/cp
2013-08-29  Paolo Carlini  

PR c++/51424
* cp-tree.h (LOOKUP_DELEGATING_CONS): Add.
* init.c (perform_target_ctor): Use it.
* call.c (build_special_member_call): Diagnose self-delegating
constructors.

/testsuite
2013-08-29  Paolo Carlini  

PR c++/51424
* g++.dg/cpp0x/dc8.C: New.
* g++.dg/template/meminit1.C: Adjust.
Index: cp/call.c
===
--- cp/call.c   (revision 202071)
+++ cp/call.c   (working copy)
@@ -7442,6 +7442,14 @@ build_special_member_call (tree instance, tree nam
   if (allocated != NULL)
 release_tree_vector (allocated);
 
+  if ((complain & tf_error)
+  && (flags & LOOKUP_DELEGATING_CONS)
+  && name == complete_ctor_identifier 
+  && TREE_CODE (ret) == CALL_EXPR
+  && (DECL_ABSTRACT_ORIGIN (TREE_OPERAND (CALL_EXPR_FN (ret), 0))
+ == current_function_decl))
+error ("constructor delegates to itself");
+
   return ret;
 }
 
Index: cp/cp-tree.h
===
--- cp/cp-tree.h(revision 202071)
+++ cp/cp-tree.h(working copy)
@@ -4509,6 +4509,8 @@ enum overload_flags { NO_SPECIAL = 0, DTOR_FLAG, T
 #define LOOKUP_NO_RVAL_BIND (LOOKUP_EXPLICIT_TMPL_ARGS << 1)
 /* Used by case_conversion to disregard non-integral conversions.  */
 #define LOOKUP_NO_NON_INTEGRAL (LOOKUP_NO_RVAL_BIND << 1)
+/* Used for delegating constructors in order to diagnose self-delegation.  */
+#define LOOKUP_DELEGATING_CONS (LOOKUP_NO_NON_INTEGRAL << 1)
 
 #define LOOKUP_NAMESPACES_ONLY(F)  \
   (((F) & LOOKUP_PREFER_NAMESPACES) && !((F) & LOOKUP_PREFER_TYPES))
Index: cp/init.c
===
--- cp/init.c   (revision 202071)
+++ cp/init.c   (working copy)
@@ -500,8 +500,9 @@ perform_target_ctor (tree init)
   tree decl = current_class_ref;
   tree type = current_class_type;
 
-  finish_expr_stmt (build_aggr_init (decl, init, LOOKUP_NORMAL,
- tf_warning_or_error));
+  finish_expr_stmt (build_aggr_init (decl, init,
+LOOKUP_NORMAL|LOOKUP_DELEGATING_CONS,
+tf_warning_or_error));
   if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type))
 {
   tree expr = build_delete (type, decl, sfk_complete_destructor,
Index: testsuite/g++.dg/cpp0x/dc8.C
===
--- testsuite/g++.dg/cpp0x/dc8.C(revision 0)
+++ testsuite/g++.dg/cpp0x/dc8.C(working copy)
@@ -0,0 +1,66 @@
+// PR c++/51424
+// { dg-do compile { target c++11 } }
+
+template 
+struct S
+{
+  S() : S() {}  // { dg-error "delegates to itself" }
+  S(int x) : S(x) {}// { dg-error "delegates to itself" }
+};
+
+struct B1
+{
+  B1() : B1() {}// { dg-error "delegates to itself" }
+  B1(int y) : B1(y) {}  // { dg-error "delegates to itself" }
+};
+
+struct V1 : virtual B1
+{
+  V1() : B1() {}
+  V1(int x) : B1(x) {}
+};
+
+struct B2
+{
+  B2() : B2() {}// { dg-error "delegates to itself" }
+  B2(int y) : B2(y) {}  // { dg-error "delegates to itself" }
+};
+
+struct V2 : virtual B2
+{
+  V2() : V2() {}// { dg-error "delegates to itself" }
+  V2(int x) : V2(x) {}  // { dg-error "delegates to itself" }
+};
+
+struct B3
+{
+  B3() {}
+  B3(int y) {}
+};
+
+struct V3 : virtual B3
+{
+  V3() : V3() {}// { dg-error "delegates to itself" }
+  V3(int x) : V3(x) {}  // { dg-error "delegates to itself" }
+};
+
+struct CE1
+{
+  constexpr CE1() : CE1() {}// { dg-error "delegates to itself" }
+  constexpr CE1(int x) : CE1(x) {}  // { dg-error "delegates to itself" }
+};
+
+struct CEB2
+{
+  constexpr CEB2() : CEB2() {}// { dg-error "delegates to itself" }
+  constexpr CEB2(int x) : CEB2(x) {}  // { dg-error "delegates to itself" }
+};
+
+struct CE2 : CEB2
+{
+  constexpr CE2() : CEB2() {}
+  constexpr CE2(int x) : CEB2(x) {}
+};
+
+S s1;
+S s2(1);
Index: testsuite/g++.dg/template/meminit1.C
===
--- testsuite/g++.dg/template/meminit1.C(revision 202068)
+++ testsuite/g++.dg/template/meminit1.C(working copy)
@@ -3,6 +3,6 @@ template 
 struct S
 {
   S() : S() {} // { dg-message "delegating constructors" }
-};
+}; // { dg-error "delegates to itself" "" { target *-*-* } 5 }
 
 S s;


Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Teresa Johnson
On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson  wrote:
>> The inline stuff should be split and re-sent, it's non-obvious to me (extra
>> function parameters are not documented for example).  I'd rather have
>> inline_and_report_call () for example instead of an extra bool parameter.
>> But let's iterate over this once it's split out.
>
> Ok, I will send this separately. I guess we could have a separate
> interface inline_and_report_call that is a wrapper around inline_call
> and simply invokes the dumper. Note that flatten_function will need to
> conditionally call one of the two interfaces based on the value of its
> bool early parameter though.

Here is the split out patch. I realized why I need the extra bool
parameter to indicate early inlining: the early inliner messages are
emitted only under the more verbose MSG_NOTE, so we need to know
whether we are in early or ipa inlining in dump_inline_decision. I
have documented the additional parameter.

Thanks,
Teresa

2013-08-29  Teresa Johnson  
Dehao Chen  

* ipa-inline.c (recursive_inlining): New inline_call parameter.
(inline_small_functions): Ditto.
(flatten_function): Ditto.
(ipa_inline): Ditto.
(inline_always_inline_functions): Ditto.
(early_inline_small_functions): Ditto.
* ipa-inline.h: Ditto.
* ipa-inline-transform.c (cgraph_node_opt_info): New function.
(cgraph_node_call_chain): Ditto.
(dump_inline_decision): Ditto.
(inline_call): Invoke dump_inline_decision, new parameter.

Index: ipa-inline.c
===
--- ipa-inline.c(revision 202059)
+++ ipa-inline.c(working copy)
@@ -1342,7 +1342,7 @@ recursive_inlining (struct cgraph_edge *edge,
   reset_edge_growth_cache (curr);
}

-  inline_call (curr, false, new_edges, &overall_size, true);
+  inline_call (curr, false, new_edges, &overall_size, true, false);
   lookup_recursive_calls (node, curr->callee, heap);
   n++;
 }
@@ -1757,7 +1757,8 @@ inline_small_functions (void)
fprintf (dump_file, " Peeling recursion with depth %i\n", depth);

  gcc_checking_assert (!callee->global.inlined_to);
- inline_call (edge, true, &new_indirect_edges, &overall_size, true);
+ inline_call (edge, true, &new_indirect_edges, &overall_size, true,
+   false);
  if (flag_indirect_inlining)
add_new_edges_to_heap (edge_heap, new_indirect_edges);

@@ -1879,7 +1880,7 @@ flatten_function (struct cgraph_node *node, bool e
 xstrdup (cgraph_node_name (callee)),
 xstrdup (cgraph_node_name (e->caller)));
   orig_callee = callee;
-  inline_call (e, true, NULL, NULL, false);
+  inline_call (e, true, NULL, NULL, false, early);
   if (e->callee != orig_callee)
orig_callee->symbol.aux = (void *) node;
   flatten_function (e->callee, early);
@@ -2017,7 +2018,7 @@ ipa_inline (void)
   inline_summary (node->callers->caller)->size);
}

- inline_call (node->callers, true, NULL, NULL, true);
+ inline_call (node->callers, true, NULL, NULL, true, false);
  if (dump_file)
fprintf (dump_file,
 " Inlined into %s which now has %i size\n",
@@ -2089,7 +2090,7 @@ inline_always_inline_functions (struct cgraph_node
fprintf (dump_file, "  Inlining %s into %s (always_inline).\n",
 xstrdup (cgraph_node_name (e->callee)),
 xstrdup (cgraph_node_name (e->caller)));
-  inline_call (e, true, NULL, NULL, false);
+  inline_call (e, true, NULL, NULL, false, true);
   inlined = true;
 }
   if (inlined)
@@ -2141,7 +2142,7 @@ early_inline_small_functions (struct cgraph_node *
fprintf (dump_file, " Inlining %s into %s.\n",
 xstrdup (cgraph_node_name (callee)),
 xstrdup (cgraph_node_name (e->caller)));
-  inline_call (e, true, NULL, NULL, true);
+  inline_call (e, true, NULL, NULL, true, true);
   inlined = true;
 }

Index: ipa-inline.h
===
--- ipa-inline.h(revision 202059)
+++ ipa-inline.h(working copy)
@@ -229,7 +229,8 @@ void compute_inline_parameters (struct cgraph_node
 bool speculation_useful_p (struct cgraph_edge *e, bool anticipate_inlining);

 /* In ipa-inline-transform.c  */
-bool inline_call (struct cgraph_edge *, bool, vec *,
int *, bool);
+bool inline_call (struct cgraph_edge *, bool, vec *, int *,
+  bool, bool);
 unsigned int inline_transform (struct cgraph_node *);
 void clone_inlined_nodes (struct cgraph_edge *e, bool, bool, int *);

Index: ipa-inline-transform.c
===
--- ipa-inline-transform.c  (revision

[PATCH 4/6] Implement is_a_helper <>::test specializations for various gimple types

2013-08-29 Thread David Malcolm
* gimple.h (is_a_helper ::test): New.
(is_a_helper ::test): New.
(is_a_helper ::test): New.
(is_a_helper ::test): New.
---
 gcc/gimple.h | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/gcc/gimple.h b/gcc/gimple.h
index 43573a1..e2cd383 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1722,6 +1722,21 @@ gimple_has_ops (const_gimple g)
   return gimple_code (g) >= GIMPLE_COND && gimple_code (g) <= GIMPLE_RETURN;
 }
 
+template <>
+template <>
+inline bool
+is_a_helper ::test (const_gimple gs)
+{
+  return gimple_has_ops (gs);
+}
+
+template <>
+template <>
+inline bool
+is_a_helper ::test (gimple gs)
+{
+  return gimple_has_ops (gs);
+}
 
 /* Return true if GIMPLE statement G has memory operands.  */
 
@@ -1731,6 +1746,21 @@ gimple_has_mem_ops (const_gimple g)
   return gimple_code (g) >= GIMPLE_ASSIGN && gimple_code (g) <= GIMPLE_RETURN;
 }
 
+template <>
+template <>
+inline bool
+is_a_helper ::test (const_gimple gs)
+{
+  return gimple_has_mem_ops (gs);
+}
+
+template <>
+template <>
+inline bool
+is_a_helper ::test (gimple gs)
+{
+  return gimple_has_mem_ops (gs);
+}
 
 /* Return the set of USE operands for statement G.  */
 
-- 
1.7.11.7



[PATCH 2/6] Hand-written port of various accessors within gimple.h

2013-08-29 Thread David Malcolm
* gimple.h (gimple_use_ops): Port from union to usage of
dyn_cast.
(gimple_set_use_ops): Port from union to usage of as_a.
(gimple_set_vuse): Likewise.
(gimple_set_vdef): Likewise.
(gimple_call_internal_fn): Port from union to a static_cast,
given that the type has already been asserted.
(gimple_omp_body_ptr): Port from unchecked union usage to
a static_cast.
(gimple_omp_set_body): Likewise.
---
 gcc/gimple.h | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/gcc/gimple.h b/gcc/gimple.h
index d7ea2e4..1ee6238 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1433,9 +1433,11 @@ gimple_has_mem_ops (const_gimple g)
 static inline struct use_optype_d *
 gimple_use_ops (const_gimple g)
 {
-  if (!gimple_has_ops (g))
+  const gimple_statement_with_ops *ops_stmt =
+dyn_cast  (g);
+  if (!ops_stmt)
 return NULL;
-  return g->gsops.opbase.use_ops;
+  return ops_stmt->use_ops;
 }
 
 
@@ -1444,8 +1446,9 @@ gimple_use_ops (const_gimple g)
 static inline void
 gimple_set_use_ops (gimple g, struct use_optype_d *use)
 {
-  gcc_gimple_checking_assert (gimple_has_ops (g));
-  g->gsops.opbase.use_ops = use;
+  gimple_statement_with_ops *ops_stmt =
+as_a  (g);
+  ops_stmt->use_ops = use;
 }
 
 
@@ -1522,8 +1525,9 @@ gimple_vdef_ptr (gimple g)
 static inline void
 gimple_set_vuse (gimple g, tree vuse)
 {
-  gcc_gimple_checking_assert (gimple_has_mem_ops (g));
-  g->gsmembase.vuse = vuse;
+  gimple_statement_with_memory_ops *mem_ops_stmt =
+as_a  (g);
+  mem_ops_stmt->vuse = vuse;
 }
 
 /* Set the single VDEF operand of the statement G.  */
@@ -1531,8 +1535,9 @@ gimple_set_vuse (gimple g, tree vuse)
 static inline void
 gimple_set_vdef (gimple g, tree vdef)
 {
-  gcc_gimple_checking_assert (gimple_has_mem_ops (g));
-  g->gsmembase.vdef = vdef;
+  gimple_statement_with_memory_ops *mem_ops_stmt =
+as_a  (g);
+  mem_ops_stmt->vdef = vdef;
 }
 
 
@@ -2188,7 +2193,7 @@ static inline enum internal_fn
 gimple_call_internal_fn (const_gimple gs)
 {
   gcc_gimple_checking_assert (gimple_call_internal_p (gs));
-  return gs->gimple_call.u.internal_fn;
+  return static_cast  (gs)->u.internal_fn;
 }
 
 
@@ -3860,7 +3865,7 @@ gimple_debug_source_bind_set_value (gimple dbg, tree 
value)
 static inline gimple_seq *
 gimple_omp_body_ptr (gimple gs)
 {
-  return &gs->omp.body;
+  return &static_cast  (gs)->body;
 }
 
 /* Return the body for the OMP statement GS.  */
@@ -3876,7 +3881,7 @@ gimple_omp_body (gimple gs)
 static inline void
 gimple_omp_set_body (gimple gs, gimple_seq body)
 {
-  gs->omp.body = body;
+  static_cast  (gs)->body = body;
 }
 
 
-- 
1.7.11.7



[PATCH 1/6] Convert gimple types from a union to a C++ class hierarchy

2013-08-29 Thread David Malcolm
* coretypes.h (union gimple_statement_d): Remove declaration.
(gimple): Convert from being a "union gimple_statement_d *"
to a "struct gimple_statement_base *".
(const_gimple): Likewise (with "const").
* ggc.h (ggc_alloc_cleared_gimple_statement_d_stat): Replace
with...
(ggc_alloc_cleared_gimple_statement_stat): ...this.
* gimple-pretty-print.c (debug): Change parameter from a
"gimple_statement_d &" to a "gimple_statement_base &".
(debug): Change parameter from a "gimple_statement_d *" to
a "gimple_statement_base *".
* gimple-pretty-print.h (debug): Update declarations as above.
* gimple.c (gimple_alloc_stat): Update for renaming of
ggc_alloc_cleared_gimple_statement_d_stat to
ggc_alloc_cleared_gimple_statement_stat.
* Makefile.in (GIMPLE_H): Add dep on is-a.h.
* gimple.h: Include "is-a.h"
(struct gimple statement_base): Convert to GTY((user)).
(gimple_statement_with_ops_base): Convert to a subclass of
gimple_statement_base, dropping initial "gsbase" field.
(gimple_statement_with_ops): Convert to a subclass of
gimple_statement_with_ops_base, dropping initial "opbase" field.
(gimple_statement_with_memory_ops_base): Likewise.
(gimple_statement_with_memory_ops): Convert to a subclass of
public gimple_statement_with_memory_ops_base, dropping initial
"membase" field.
(gimple_statement_call): Likewise.
(gimple_statement_omp): Convert to a subclass of
gimple_statement_base, dropping initial "gsbase" field.
(gimple_statement_bind): Likewise.
(gimple_statement_catch): Likewise.
(gimple_statement_eh_filter): Likewise.
(gimple_statement_eh_else): Likewise.
(gimple_statement_eh_mnt): Likewise.
(gimple_statement_phi): Likewise.
(gimple_statement_eh_ctrl): Likewise.
(gimple_statement_try): Likewise.
(gimple_statement_wce): Likewise.
(gimple_statement_asm): Convert to a subclass of
gimple_statement_with_memory_ops_base, dropping initial
"membase" field.
(gimple_statement_omp_critical): Convert to a subclass of
gimple_statement_omp, dropping initial "omp" field.
(gimple_statement_omp_for): Likewise.
(gimple_statement_omp_parallel): Likewise.
(gimple_statement_omp_task): Convert to a subclass of
gimple_statement_omp_parallel, dropping initial "par" field.
(gimple_statement_omp_sections): Convert to a subclass of
gimple_statement_omp, dropping initial "omp" field.
(gimple_statement_omp_continue): Convert to a subclass of
gimple_statement_base, dropping initial "gsbase" field.
(gimple_statement_omp_single): Convert to a subclass of
gimple_statement_omp, dropping initial "omp" field.
(gimple_statement_omp_atomic_load): Convert to a subclass of
gimple_statement_base, dropping initial "gsbase" field.
(gimple_statement_omp_atomic_store): Convert to a subclass of
gimple_statement_base, dropping initial "gsbase" field.
(gimple_statement_transaction): Convert to a subclass of
gimple_statement_with_memory_ops_base, dropping initial "gsbase"
field.
(union gimple_statement_d): Remove.
* system.h (CONST_CAST_GIMPLE): Update to use
"struct gimple_statement_base *" rather than
"union gimple_statement_d *".
* tree-ssa-ccp.c (gimple_htab): Convert underlying type from
gimple_statement_d to gimple_statement_base.
---
 gcc/Makefile.in   |   2 +-
 gcc/coretypes.h   |   5 +-
 gcc/ggc.h |   6 +-
 gcc/gimple-pretty-print.c |   4 +-
 gcc/gimple-pretty-print.h |   4 +-
 gcc/gimple.c  |   2 +-
 gcc/gimple.h  | 183 ++
 gcc/system.h  |   2 +-
 gcc/tree-ssa-ccp.c|   2 +-
 9 files changed, 84 insertions(+), 126 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 387b60f..321c591 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -882,7 +882,7 @@ BASIC_BLOCK_H = basic-block.h $(PREDICT_H) $(VEC_H) 
$(FUNCTION_H) \
cfg-flags.def cfghooks.h
 GIMPLE_H = gimple.h gimple.def gsstruct.def pointer-set.h $(VEC_H) \
$(GGC_H) $(BASIC_BLOCK_H) $(TREE_H) tree-ssa-operands.h \
-   tree-ssa-alias.h $(INTERNAL_FN_H) $(HASH_TABLE_H)
+   tree-ssa-alias.h $(INTERNAL_FN_H) $(HASH_TABLE_H) is-a.h
 TRANS_MEM_H = trans-mem.h
 GCOV_IO_H = gcov-io.h gcov-iov.h auto-host.h
 COVERAGE_H = coverage.h $(GCOV_IO_H)
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index bff8f5c..5ced7cb 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -58,9 +58,8 @@ typedef const struct rtvec_def *const_rtvec;
 union tree_node;
 typedef union tree_node *tree;
 typedef const union tree_node *const_tree;
-union gimple_stat

[PATCH 0/6] Convert gimple to a C++ class hierarchy

2013-08-29 Thread David Malcolm
The various gimple types are currently implemented using a hand-coded
C inheritance scheme, with a "union gimple_statement_d" holding the
various possible structs for a statement.

The following series of patches convert it to a C++ class
hierarchy, using the existing structs, eliminating the union. The
"gimple" typedef changes from being a
  (union gimple_statement_d *)
to being a:
  (struct gimple_statement_base *)

There are no virtual functions in the new code: the sizes of the various
structs are unchanged.

It makes use of "is-a.h", using the as_a  template function to
perform downcasts, which are checked (via gcc_checking_assert) in an
ENABLE_CHECKING build, and are simple casts in an unchecked build,
albeit it in an inlined function rather than a macro.

For example, one can write:

  gimple_statement_phi *phi =
as_a  (gsi_stmt (gsi));

and then directly access the fields of the phi, as a phi.  The existing
accessor functions in gimple.h become somewhat redundant in this
scheme, but are preserved.

I believe this is a superior scheme to the C implementation:

  * We can get closer to compile-time type-safety, checking the gimple
code once and downcasting with an as_a, then directly accessing
fields, rather than going through accessor functions that check
each time.  In some places we may want to replace a "gimple" with
a subclass e.g. phis are always of the phi subclass, to get full
compile-time type-safety.

  * This scheme is likely to be easier for newbies to understand.
  
  * Currently in gdb, dereferencing a gimple leads to screenfuls of text,
showing all the various union values.  With this, you get just the base
class, and can cast it to the appropriate subclass.

  * We're working directly with the language constructs, rather than
rolling our own, and thus other tools can better understand the
code.  For example, you can see Doxygen's rendering of the gimple
class hierarchy (in the modified code) here:

http://fedorapeople.org/~dmalcolm/gcc/2013-08-28/doxygen/html/structgimple__statement__base.html

The names of the structs are rather verbose.  I would prefer to also
rename them all to eliminate the "_statement" component:
  "gimple_statement_base" -> "gimple_base"
  "gimple_statement_phi"  -> "gimple_phi"
  "gimple_statement_omp"  -> "gimple_omp"
etc, but I didn't do this to mimimize the patch size.  But if the core
maintainers are up for that, I can redo the patch series with that
change also.

The patch is in 6 parts; all of them are needed together.

  * Patch 1 of 6: This patch adds inheritance to the various gimple
types, eliminating the initial baseclass fields, and eliminating the
union gimple_statement_d.   All the types remain structs.  They
become marked with GTY((user)).

  * Patch 2 of 6: This patch ports various accessor functions within
gimple.h to the new scheme.

  * Patch 3 of 6: This patch is autogenerated by "refactor_gimple.py"
from https://github.com/davidmalcolm/gcc-refactoring-scripts
There is a test suite "test_refactor_gimple.py" which may give a
clearer idea of the changes that the script makes (and add
confidence that it's doing the right thing).
The patch converts code of the form:
  {
GIMPLE_CHECK (gs, SOME_CODE);
gimple_subclass_get/set_some_field (gs, value);
  }
to code of this form:
  {
some_subclass *stmt = as_a  (gs);
stmt->some_field = value;
  }
It also autogenerates specializations of 
is_a_helper ::test
equivalent to a GIMPLE_CHECK() for use by is_a and as_a.

  * Patch 4 of 6: This patch implement further specializations of
is_a_helper ::test, for gimple_has_ops and gimple_has_mem_ops.

  * Patch 5 of 6: This patch does the rest of porting from union access
to subclass access (all the fiddly places that the script in patch 3
couldn't handle).

  * Patch 6 of 6: This patch adds manual implementation of the GTY
hooks, written by heavily-editing the autogenerated gtype-desc.c
code from an earlier incarnation.   It implements the equivalent
of chain_next for gimple statements.

Successfully bootstrapped and tested on x86_64-unknown-linux-gnu: all
testcases show the same results as an unpatched build (relative to
r202029).

OK for trunk?

 gcc/Makefile.in   |2 +-
 gcc/coretypes.h   |5 +-
 gcc/ggc.h |6 +-
 gcc/gimple-iterator.c |   72 +--
 gcc/gimple-pretty-print.c |6 +-
 gcc/gimple-pretty-print.h |4 +-
 gcc/gimple-streamer-in.c  |   19 +-
 gcc/gimple-streamer-out.c |2 +-
 gcc/gimple.c  |  819 +++-
 gcc/gimple.h  | 1545 +
 gcc/gimplify.c|4 +-
 gcc/system.h  |2 +-
 gcc/tree-cfg.c|6 +-
 gcc/tree-inline.c |2 +-
 gcc/tree-phinodes.c   |   39 +-
 gcc/tree-ssa-ccp.c|2 

[PATCH 6/6] Add manual GTY hooks

2013-08-29 Thread David Malcolm
* gimple.c (gt_ggc_mx (gimple)): New, as required by GTY((user)).
(gt_pch_nx (gimple)): Likewise.
(gt_pch_nx (gimple, gt_pointer_operator, void *)): Likewise.
* gimple.h  (gt_ggc_mx (gimple)): Declare.
(gt_pch_nx (gimple)): Declare.
(gt_pch_nx (gimple, gt_pointer_operator, void *)): Declare.
* tree-cfg.c (ggc_mx (gimple&)): Remove declaration, as this
collides with the function that GTY((user)) expects.
(gt_ggc_mx (edge_def *)): Replace call to gt_ggc_mx on the
gimple with gt_ggc_mx_gimple_statement_base: in the
pre-GTY((user)) world, "gt_ggc_mx" was the function to be called
on a possibly NULL pointed to check if needed marking and if so
to traverse its fields.  In the GTY((user)) world, "gt_ggc_mx"
is the function to be called on non-NULL objects immediately *after*
they have been marked: it does not mark the object itself.
(gt_pch_nx (gimple&)): Remove declaration.
(gt_pch_nx (edge_def *)): Update as per the mx hook.
---
 gcc/gimple.c   | 743 +
 gcc/gimple.h   |   6 +
 gcc/tree-cfg.c |   6 +-
 3 files changed, 751 insertions(+), 4 deletions(-)

diff --git a/gcc/gimple.c b/gcc/gimple.c
index 1ad36d1..dd99cda 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -4338,4 +4338,747 @@ build_type_cast (tree to_type, gimple op, enum ssa_mode 
mode)
   return build_type_cast (to_type, gimple_assign_lhs (op), mode);
 }
 
+void
+gt_ggc_mx (gimple gs)
+{
+  gimple x = gs;
+  /* Emulation of the "chain_next" GTY attribute.
+
+ gs has already been marked.
+ Iterate the chain of next statements, marking until we reach one that
+ has already been marked, or NULL.   */
+  gimple xlimit = gs->next;
+  while (ggc_test_and_set_mark (xlimit))
+xlimit = xlimit->next;
+
+  /* All of the statements within the half-open interval [x..xlimit) have
+ just been marked.  Iterate through the list, visiting their fields.  */
+  while (x != xlimit)
+{
+  gt_ggc_m_15basic_block_def (x->bb);
+  switch (gimple_statement_structure (&((*x
+   {
+   case GSS_BASE:
+ break;
+   case GSS_WITH_OPS:
+ {
+   gimple_statement_with_ops *stmt
+ = static_cast  (x);
+   size_t num = (size_t)(stmt->num_ops);
+   for (size_t i = 0; i != num; i++)
+ gt_ggc_m_9tree_node (stmt->op[i]);
+ }
+ break;
+   case GSS_WITH_MEM_OPS_BASE:
+ break;
+   case GSS_WITH_MEM_OPS:
+ {
+   gimple_statement_with_memory_ops *stmt
+ = static_cast  (x);
+   size_t num = (size_t)(stmt->num_ops);
+   for (size_t i = 0; i != num; i++)
+ gt_ggc_m_9tree_node (stmt->op[i]);
+ }
+ break;
+   case GSS_CALL:
+ {
+   gimple_statement_call *stmt
+ = static_cast  (x);
+   gt_ggc_m_15bitmap_head_def (stmt->call_used.vars);
+   gt_ggc_m_15bitmap_head_def (stmt->call_clobbered.vars);
+   switch (stmt->subcode & GF_CALL_INTERNAL)
+ {
+ case 0:
+   gt_ggc_m_9tree_node (stmt->u.fntype);
+   break;
+ case GF_CALL_INTERNAL:
+   break;
+ default:
+   break;
+ }
+   size_t num = (size_t)(stmt->num_ops);
+   for (size_t i = 0; i != num; i++)
+ gt_ggc_m_9tree_node (stmt->op[i]);
+ }
+ break;
+   case GSS_OMP:
+ {
+   gimple_statement_omp *stmt
+ = static_cast  (x);
+   gt_ggc_mx_gimple_statement_base (stmt->body);
+ }
+ break;
+   case GSS_BIND:
+ {
+gimple_statement_bind *stmt
+ = static_cast  (x);
+   gt_ggc_m_9tree_node (stmt->vars);
+   gt_ggc_m_9tree_node (stmt->block);
+   gt_ggc_mx_gimple_statement_base (stmt->body);
+ }
+ break;
+   case GSS_CATCH:
+ {
+   gimple_statement_catch *stmt
+ = static_cast  (x);
+   gt_ggc_m_9tree_node (stmt->types);
+   gt_ggc_mx_gimple_statement_base (stmt->handler);
+ }
+ break;
+   case GSS_EH_FILTER:
+ {
+   gimple_statement_eh_filter *stmt
+ = static_cast  (x);
+   gt_ggc_m_9tree_node (stmt->types);
+   gt_ggc_mx_gimple_statement_base (stmt->failure);
+ }
+ break;
+   case GSS_EH_MNT:
+ {
+   gimple_statement_eh_mnt *stmt
+ = static_cast  (x);
+   gt_ggc_m_9tree_node (stmt->fndecl);
+ }
+ break;
+   case GSS_EH_ELSE:
+ {
+   gimple_statement_eh_else*stmt
+ = static_cast  (x);
+   gt_ggc_mx_gimple_statement_base (stmt->n_body);
+   gt_ggc_mx_gimple_statement_base (stmt->e_body);
+ }
+ br

[PATCH 5/6] Port various places from union access to subclass access.

2013-08-29 Thread David Malcolm
* gimple-streamer-in.c (input_gimple_stmt): Port from union
access to use of as_a.
* gimple.c (gimple_build_asm_1): Likewise.
(gimple_build_try): Likewise.  Also, return a specific subclass
rather than just gimple.
(gimple_build_resx): Port from union access to use of as_a.
(gimple_build_eh_dispatch): Likewise.
(gimple_build_omp_for): Likewise.  Also, convert allocation of iter
now that gengtype no longer provides a typed allocator function.
(gimple_copy): Likewise.
* gimple.h (gimple_build_try): Return a specific subclass rather
than just gimple.
* gimplify.c (gimplify_cleanup_point_expr): Replace union access
with subclass access by making use of new return type of
gimple_build_try.
* tree-phinodes.c: (allocate_phi_node): Return a
"gimple_statement_phi *" rather than just a gimple.
(resize_phi_node): Likewise.
(make_phi_node): Replace union access with subclass access by
making use of new return type of allocate_phi_node.
(reserve_phi_args_for_new_edge): Replace union access with as_a.
(remove_phi_arg_num): Accept a "gimple_statement_phi *" rather
than just a gimple.
(remove_phi_args): Update for change to remove_phi_arg_num.
---
 gcc/gimple-streamer-in.c | 11 -
 gcc/gimple.c | 58 ++--
 gcc/gimple.h |  3 ++-
 gcc/gimplify.c   |  4 ++--
 gcc/tree-phinodes.c  | 37 --
 5 files changed, 66 insertions(+), 47 deletions(-)

diff --git a/gcc/gimple-streamer-in.c b/gcc/gimple-streamer-in.c
index fc0b20a..f89f42e 100644
--- a/gcc/gimple-streamer-in.c
+++ b/gcc/gimple-streamer-in.c
@@ -126,13 +126,14 @@ input_gimple_stmt (struct lto_input_block *ib, struct 
data_in *data_in,
 case GIMPLE_ASM:
   {
/* FIXME lto.  Move most of this into a new gimple_asm_set_string().  */
+   gimple_statement_asm *asm_stmt = as_a  (stmt);
tree str;
-   stmt->gimple_asm.ni = streamer_read_uhwi (ib);
-   stmt->gimple_asm.no = streamer_read_uhwi (ib);
-   stmt->gimple_asm.nc = streamer_read_uhwi (ib);
-   stmt->gimple_asm.nl = streamer_read_uhwi (ib);
+   asm_stmt->ni = streamer_read_uhwi (ib);
+   asm_stmt->no = streamer_read_uhwi (ib);
+   asm_stmt->nc = streamer_read_uhwi (ib);
+   asm_stmt->nl = streamer_read_uhwi (ib);
str = streamer_read_string_cst (data_in, ib);
-   stmt->gimple_asm.string = TREE_STRING_POINTER (str);
+   asm_stmt->string = TREE_STRING_POINTER (str);
   }
   /* Fallthru  */
 
diff --git a/gcc/gimple.c b/gcc/gimple.c
index aceb1b0..1ad36d1 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -626,21 +626,22 @@ static inline gimple
 gimple_build_asm_1 (const char *string, unsigned ninputs, unsigned noutputs,
 unsigned nclobbers, unsigned nlabels)
 {
-  gimple p;
+  gimple_statement_asm *p;
   int size = strlen (string);
 
   /* ASMs with labels cannot have outputs.  This should have been
  enforced by the front end.  */
   gcc_assert (nlabels == 0 || noutputs == 0);
 
-  p = gimple_build_with_ops (GIMPLE_ASM, ERROR_MARK,
-ninputs + noutputs + nclobbers + nlabels);
+  p = as_a  (
+gimple_build_with_ops (GIMPLE_ASM, ERROR_MARK,
+  ninputs + noutputs + nclobbers + nlabels));
 
-  p->gimple_asm.ni = ninputs;
-  p->gimple_asm.no = noutputs;
-  p->gimple_asm.nc = nclobbers;
-  p->gimple_asm.nl = nlabels;
-  p->gimple_asm.string = ggc_alloc_string (string, size);
+  p->ni = ninputs;
+  p->no = noutputs;
+  p->nc = nclobbers;
+  p->nl = nlabels;
+  p->string = ggc_alloc_string (string, size);
 
   if (GATHER_STATISTICS)
 gimple_alloc_sizes[(int) gimple_alloc_kind (GIMPLE_ASM)] += size;
@@ -752,14 +753,14 @@ gimple_build_eh_else (gimple_seq n_body, gimple_seq 
e_body)
KIND is either GIMPLE_TRY_CATCH or GIMPLE_TRY_FINALLY depending on
whether this is a try/catch or a try/finally respectively.  */
 
-gimple
+gimple_statement_try *
 gimple_build_try (gimple_seq eval, gimple_seq cleanup,
  enum gimple_try_flags kind)
 {
-  gimple p;
+  gimple_statement_try *p;
 
   gcc_assert (kind == GIMPLE_TRY_CATCH || kind == GIMPLE_TRY_FINALLY);
-  p = gimple_alloc (GIMPLE_TRY, 0);
+  p = as_a  (gimple_alloc (GIMPLE_TRY, 0));
   gimple_set_subcode (p, kind);
   if (eval)
 gimple_try_set_eval (p, eval);
@@ -789,8 +790,10 @@ gimple_build_wce (gimple_seq cleanup)
 gimple
 gimple_build_resx (int region)
 {
-  gimple p = gimple_build_with_ops (GIMPLE_RESX, ERROR_MARK, 0);
-  p->gimple_eh_ctrl.region = region;
+  gimple_statement_eh_ctrl *p =
+as_a  (
+  gimple_build_with_ops (GIMPLE_RESX, ERROR_MARK, 0));
+  p->region = region;
   return p;
 }
 
@@ -837,8 +840,10 @@ gimple_build_switch (tree index, tree default_label, 
vec args)
 gimple
 gim

[PATCH 3/6] Autogenerated conversion of gimple to C++

2013-08-29 Thread David Malcolm
This patch is 110KB in size, so to avoid mailing-list size limits I've uploaded 
it to:

http://dmalcolm.fedorapeople.org/gcc/large-patches/a89d361b4f95dd216e1d29cb44fbaf90372c48b8-0003-Autogenerated-conversion-of-gimple-to-C.patch

The ChangeLog entry and diffstat follow:

gcc/

Patch autogenerated by refactor_gimple.py from
https://github.com/davidmalcolm/gcc-refactoring-scripts
revision 26aabff11750d29e6129930a426242a564538d1a

* gimple-iterator.c (update_bb_for_stmts): Update for conversion of
gimple types to a true class hierarchy.
(update_call_edge_frequencies): Likewise.
(gsi_insert_seq_nodes_before): Likewise.
(gsi_insert_seq_nodes_after): Likewise.
(gsi_split_seq_after): Likewise.
(gsi_set_stmt): Likewise.
(gsi_split_seq_before): Likewise.
(gsi_remove): Likewise.
* gimple-pretty-print.c (dump_gimple_debug): Likewise.
* gimple-streamer-in.c (input_gimple_stmt): Likewise.
* gimple-streamer-out.c (output_gimple_stmt): Likewise.
* gimple.c (gimple_set_code): Likewise.
(gimple_alloc_stat): Likewise.
(gimple_set_subcode): Likewise.
(gimple_build_call_internal_1): Likewise.
(gimple_check_failed): Likewise.
(gimple_call_flags): Likewise.
(gimple_set_bb): Likewise.
* gimple.h (is_a_helper  (gimple)): New.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (gimple)): Likewise.
(is_a_helper  (const_gimple)): Likewise.
(is_a_helper  (const_gimple)): Likewise.
(is_a_helper  (const_gimple)): Likewise.
(is_a_helper  (const_gimple)): Likewise.
(is_a_helper  (const_gimple)):
Likewise.
(is_a_helper  (const_gimple)):
Likewise.
(is_a_helper  (const_gimple)):
Likewise.
(is_a_helper 
(const_gimple)): Likewise.
(is_a_helper  (const_gimple)):
Likewise.
(is_a_helper  (const_gimple)):
Likewise.
(is_a_helper  (const_gimple)):
Likewise.
(is_a_helper  (const_gimple)):
Likewise.
(is_a_helper  (const_gimple)):
Likewise.
(is_a_helper  (const_gimple)):
Likewise.
(is_a_helper  (const_gimple)):
Likewise.
(is_a_helper  (const_gimple)): Likewise.
(is_a_helper  (const_gimple)):
Likewise.
(gimple_seq_last): Update for conversion of gimple types to a true
class hierarchy.
(gimple_seq_set_last): Likewise.
(gimple_code): Likewise.
(gimple_bb): Likewise.
(gimple_block): Likewise.
(gimple_set_block): Likewise.
(gimple_location): Likewise.
(gimple_location_ptr): Likewise.
(gimple_set_location): Likewise.
(gimple_no_warning_p): Likewise.
(gimple_set_no_warning): Likewise.
(gimple_set_visited): Likewise.
(gimple_visited_p): Likewise.
(gimple_set_plf): Likewise.
(gimple_plf): Likewise.
(gimple_set_uid): Likewise.
(gimple_uid): Likewise.
(gimple_init_singleton): Likewise.
(gimple_modified_p): Likewise.
(gimple_set_modified): Likewise.
(gimple_expr_code): Likewise.
(gimple_has_volatile_ops): Likewise.
(gimple_set_has_volatile_ops): Likewise.
(gimple_omp_subcode): Likewise.
(gimple_omp_set_subcode): Likewise.
(gimple_omp_return_set_nowait): Likewise.
(gimple_omp_section_set_last): Likewise.
(gimple_omp_parallel_set_combined_p): Likewise.
(gimple_omp_atomic_set_need_value): Likewise.
(gimple_num_ops): Likewise.
(gimple_set_num_ops): Likewise.
(gimple_assign_nontemporal_move_p): Likewise.
(gimple_assign_set_nontemporal_move): Likewise.
(gimple_assign_rhs_code): Likewise.
(gimple_assign_set_rhs_code): Likewise.
(gimple_call_internal_p): Likewise.
(gimple_call_set_tail): Likewise.
(gimple_call_tail_p): Likewise.
(gimple_call_set_return_slot_opt): Likewise.
(gimple_call_return_slot_opt_p): Likewise.
(gimple_call_set_from_thunk): Likewise.
   

Re: [PATCH] Support dumping type bindings and 'mutable' qualifier in lambda diagnostics.

2013-08-29 Thread Adam Butcher

On 29.08.2013 16:25, Gabriel Dos Reis wrote:
On Thu, Aug 29, 2013 at 9:20 AM, Adam Butcher  
wrote:
* error.c (dump_lambda_function): New function, dependent on 
...
(maybe_dump_template_bindings): ... this new function, 
factored out of

...
(dump_function_decl): ... here.  Updated to early-out with 
call to

dump_lambda_function after determining template bindings.
---
Reworked as requested.  Okay to commit?


Document the new functions.
Use pp->translate_string ("with") instead of
pp_cxx_ws_string (pp, M_("with")).

Done.  In documenting 'maybe_dump_template_bindings' and reviewing it 
again I'm not sure it's got the right name.  It wraps 
'dump_template_bindings' in "[with " and "]".  So it does more than just 
filter a call to 'dump_template_bindings'.


Any suggestions?  What do you think of 'maybe_dump_with_clause' or 
something similar?


Cheers,
Adam



Thanks,
-- Gaby



+static void
+maybe_dump_template_bindings (cxx_pretty_printer *pp,
+ tree t, tree template_parms, tree 
template_args,

+ int flags)
+{
+  if (template_parms != NULL_TREE && template_args != NULL_TREE
+  && !(flags & TFF_NO_TEMPLATE_BINDINGS))
+{
+  vec *typenames = find_typenames (t);
+  pp_cxx_whitespace (pp);
+  pp_cxx_left_bracket (pp);
+  pp_cxx_ws_string (pp, M_("with"));
+  pp_cxx_whitespace (pp);
+  dump_template_bindings (pp, template_parms, template_args, 
typenames);

+  pp_cxx_right_bracket (pp);
+}
+}
+
+static void
+dump_lambda_function (cxx_pretty_printer *pp,
+ tree fn, tree template_parms, tree 
template_args,

+ int flags)
+{
+  /* A lambda's signature is essentially its "type".  */
+  dump_type (pp, DECL_CONTEXT (fn), flags);
+  if (!(TYPE_QUALS (class_of_this_parm (TREE_TYPE (fn))) & 
TYPE_QUAL_CONST))

+{
+  pp->padding = pp_before;
+  pp_c_ws_string (pp, "mutable");
+}
+  maybe_dump_template_bindings (pp, fn, template_parms, 
template_args, flags);

+}
+
 /* Pretty print a function decl. There are several ways we want to 
print a
function declaration. The TFF_ bits in FLAGS tells us how to 
behave.
As error can only apply the '#' flag once to give 0 and 1 for V, 
there
@@ -1379,15 +1412,6 @@ dump_function_decl (cxx_pretty_printer *pp, 
tree t, int flags)
   int show_return = flags & TFF_RETURN_TYPE || flags & 
TFF_DECL_SPECIFIERS;

   int do_outer_scope = ! (flags & TFF_UNQUALIFIED_NAME);
   tree exceptions;
-  vec *typenames = NULL;
-
-  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
-{
-  /* A lambda's signature is essentially its "type", so defer.  
*/

-  gcc_assert (LAMBDA_TYPE_P (DECL_CONTEXT (t)));
-  dump_type (pp, DECL_CONTEXT (t), flags);
-  return;
-}

   flags &= ~(TFF_UNQUALIFIED_NAME | TFF_TEMPLATE_NAME);
   if (TREE_CODE (t) == TEMPLATE_DECL)
@@ -1409,10 +1433,13 @@ dump_function_decl (cxx_pretty_printer *pp, 
tree t, int flags)

{
  template_parms = DECL_TEMPLATE_PARMS (tmpl);
  t = tmpl;
- typenames = find_typenames (t);
}
 }

+  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
+return dump_lambda_function (pp, t, template_parms, 
template_args,

+flags);
+
   fntype = TREE_TYPE (t);
   parmtypes = FUNCTION_FIRST_USER_PARMTYPE (t);

@@ -1476,17 +1503,8 @@ dump_function_decl (cxx_pretty_printer *pp, 
tree t, int flags)

   if (show_return)
dump_type_suffix (pp, TREE_TYPE (fntype), flags);

-  /* If T is a template instantiation, dump the parameter 
binding.  */

-  if (template_parms != NULL_TREE && template_args != NULL_TREE
- && !(flags & TFF_NO_TEMPLATE_BINDINGS))
-   {
- pp_cxx_whitespace (pp);
- pp_cxx_left_bracket (pp);
- pp_cxx_ws_string (pp, M_("with"));
- pp_cxx_whitespace (pp);
- dump_template_bindings (pp, template_parms, template_args, 
typenames);

- pp_cxx_right_bracket (pp);
-   }
+  maybe_dump_template_bindings (pp, t, template_parms, 
template_args,

+   flags);
 }
   else if (template_args)
 {
--
1.8.4





Re: [Google] Refine hot caller heuristic

2013-08-29 Thread Easwaran Raman
On Tue, Aug 20, 2013 at 9:35 PM, Xinliang David Li  wrote:
> Do you need to guard the jump function access with check if
> (ipa_node_params_vector.exists ())?
I believe it is not necessary since, for example, ipa_analyze_node
calls ipa_check_create_node_params that calls create. But I see it is
used in ipa-inline-analysis.c everywhere. So I have added a check and
conservatively return false.

>
> Ideally, useful_cold_callee should be folded into the inline hints
> estimation.  Question about the heuristic: why filtering out
> PASS_THROUGH parameter cases completely? Passing 'this' parameter in
> many cases can result in good PRE opportunities.  Why not skip the
> unknown type?

The rationale is it is useful to inline bar into foo in the snippet below:

void foo ()
{
  A a;
  bar(&a);
  ...
}

Capturing this requires UNKNOWN and KNOWN_TYPE jump functions. I have
changed the check accordingly. I have attached the new patch.

- Easwaran

> David
>
> On Tue, Aug 20, 2013 at 12:26 PM, Easwaran Raman  wrote:
>> The current hot caller heuristic simply promotes edges whose caller is
>> hot. This patch does the following:
>> * Turn it off for applications with large footprint since the size
>> increase hurts them
>> * Be more selective by considering arguments to callee when the
>> heuristic is enabled.
>>
>> This performs well on internal benchmarks. Ok for google/4_8 branch if
>> all tests pass?
>>
>> - Easwaran


hot_caller.patch
Description: Binary data


Re: [Google] Refine hot caller heuristic

2013-08-29 Thread Easwaran Raman
On Wed, Aug 21, 2013 at 6:47 AM, Teresa Johnson  wrote:
>> +/* Knob to control hot-caller heuristic. 0 means it is turned off, 1 means
>> +   it is always applied, and 2 means it is applied only if the footprint is
>> +   smaller than PARAM_HOT_CALLER_CODESIZE_THRESHOLD.  */
>>  DEFPARAM (PARAM_INLINE_HOT_CALLER,
>> "inline-hot-caller",
>> "Consider cold callsites for inlining if caller contains hot code",
>> +   2, 0, 2)
>> +
>> +/* The maximum code size estimate under which hot caller heuristic is
>> +   applied.  */
>> +DEFPARAM(PARAM_HOT_CALLER_CODESIZE_THRESHOLD,
>> + "hot-caller-codesize-threshold",
>> + "Maximum profile-based code size footprint estimate for "
>> + "hot caller heuristic  ",
>> + 1, 0, 0)
>
> Out of curiousity, how sensitive is performance to the value of this
> parameter? I.e. is there a clear cutoff for the codes that benefit
> from disabling this inlining vs those that benefit from enabling it?
>
> Also, have you tried spec2006? I remember that the codesize of the gcc
> benchmark was above the larger 15000 threshold I use for tuning down
> unrolling/peeling, and I needed to refine my heuristics to identify
> profitable loops to unroll/peel even in the case of large codesize.
> I'm not sure if there are more benchmarks that will be above the
> smaller 10K threshold.
The cutoff works well for internal benchmarks, but there is a wide
range to set the value. I ran spec2006 and gcc benchmark does well.
However, there are two c++ benchmarks (which fall below the cutoff)
that degrades if I exclude PASS_THRU. I'll investigate that later.

>
>> +
>> +DEFPARAM (PARAM_INLINE_USEFUL_COLD_CALLEE,
>> +   "inline-useful-cold-callee",
>> +   "Consider cold callsites for inlining if caller contains hot code",
>> 1, 0, 1)
>
> The description of this param is wrong (it is the same as the
> description of PARAM_INLINE_HOT_CALLER). It should probably be
> something like
> "Only consider cold callsites for inlining if analysis finds
> optimization opportunities"
Thanks. Fixed.

>
>>
>>  /* Limit of iterations of early inliner.  This basically bounds number of
>> Index: gcc/ipa-inline.c
>> ===
>> --- gcc/ipa-inline.c  (revision 201768)
>> +++ gcc/ipa-inline.c  (working copy)
>> @@ -528,12 +528,60 @@ big_speedup_p (struct cgraph_edge *e)
>>return false;
>>  }
>>
>> +/* Returns true if callee of edge E is considered useful to inline
>> +   even if it is cold. A callee is considered useful if there is at
>> +   least one argument of pointer type that is not a pass-through.  */
>
> Can you expand this comment a bit to add why such arguments indicate
> useful inlining?
Added comments.

- Easwaran
>
> Thanks,
> Teresa
>
>> +
>> +static inline bool
>> +useful_cold_callee (struct cgraph_edge *e)
>> +{
>> +  gimple call = e->call_stmt;
>> +  int n, arg_num = gimple_call_num_args (call);
>> +  struct ipa_edge_args *args = IPA_EDGE_REF (e);
>> +
>> +  for (n = 0; n < arg_num; n++)
>> +{
>> +  tree arg = gimple_call_arg (call, n);
>> +  if (POINTER_TYPE_P (TREE_TYPE (arg)))
>> +{
>> +   struct ipa_jump_func *jfunc = ipa_get_ith_jump_func (args, n);
>> +   if (jfunc->type != IPA_JF_PASS_THROUGH)
>> +return true;
>> +}
>> +}
>> +  return false;
>> +}
>> +
>> +/* Returns true if hot caller heuristic should be used.  */
>> +
>> +static inline bool
>> +enable_hot_caller_heuristic (void)
>> +{
>> +
>> +  gcov_working_set_t *ws = NULL;
>> +  int size_threshold = PARAM_VALUE (PARAM_HOT_CALLER_CODESIZE_THRESHOLD);
>> +  int num_counters = 0;
>> +  int param_inline_hot_caller = PARAM_VALUE (PARAM_INLINE_HOT_CALLER);
>> +
>> +  if (param_inline_hot_caller == 0)
>> +return false;
>> +  else if (param_inline_hot_caller == 1)
>> +return true;
>> +
>> +  ws = find_working_set(PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE));
>> +  if (!ws)
>> +return false;
>> +  num_counters = ws->num_counters;
>> +  return num_counters <= size_threshold;
>> +
>> +}
>>  /* Returns true if an edge or its caller are hot enough to
>> be considered for inlining.  */
>>
>>  static bool
>>  edge_hot_enough_p (struct cgraph_edge *edge)
>>  {
>> +  static bool use_hot_caller_heuristic = enable_hot_caller_heuristic ();
>>if (cgraph_maybe_hot_edge_p (edge))
>>  return true;
>>
>> @@ -543,9 +591,17 @@ edge_hot_enough_p (struct cgraph_edge *edge)
>>if (flag_auto_profile && edge->callee->count == 0
>>&& edge->callee->max_bb_count > 0)
>>  return false;
>> -  if (PARAM_VALUE (PARAM_INLINE_HOT_CALLER)
>> -  && maybe_hot_count_p (NULL, edge->caller->max_bb_count))
>> -return true;
>> +  if (use_hot_caller_heuristic)
>> +{
>> +  struct cgraph_node *where = edge->caller;
>> +  if (maybe_hot_count_p (NULL, where->max_bb_count))
>> +{
>> +  if (PARAM_VALUE (PARAM_INLINE_USEFUL_COLD_CALLEE))
>> +return useful_cold_c

Re: [Patch 2/2] Localization problem in regex

2013-08-29 Thread Tim Shen
> Fixed.
>
> It's not fully tested. I just run the 28_regex testcases. I'll do a
> full test before committing.

Committed.


-- 
Tim Shen


changelog
Description: Binary data


regex-locale.patch
Description: Binary data


Re: [PATCH] Fix illegal cast to rtx (*insn_gen_fn) (rtx, ...)

2013-08-29 Thread Oleg Endo
On Wed, 2013-08-07 at 21:24 +0200, Oleg Endo wrote:
> On Wed, 2013-08-07 at 15:08 -0400, Michael Meissner wrote:
> > On Tue, Aug 06, 2013 at 11:45:40PM +0200, Oleg Endo wrote:
> > > On Mon, 2013-08-05 at 13:25 -1000, Richard Henderson wrote:
> > > > On 08/05/2013 12:32 PM, Oleg Endo wrote:
> > > > > Thanks, committed as rev 201513.
> > > > > 4.8 also has the same problem.  The patch applies on 4.8 branch 
> > > > > without
> > > > > problems and make all-gcc works.
> > > > > OK for 4.8, too?
> > > > 
> > > > Hum.  I suppose so, since it's relatively self-contained.  I suppose the
> > > > out-of-tree openrisc port will thank us...
> > > 
> > > Maybe it's better to wait for a while and collect follow up patches such
> > > as the rs6000 one.
> > 
> > The tree right now is broken for the powerpc.  I would prefer to get patches
> > installed ASAP rather than waiting for additional ports.
> 
> I've just committed the PPC fix for trunk.  Sorry for the delay.
> I haven't committed anything related to this issue on the 4.8 branch
> yet.  I'll do that next week if nothing else comes up.

Sorry for the delay.  I've just backported the 2 patches to 4.8.
Tested with 'make all-gcc' for SH and PPC cross compilers.
Committed as rev 202083.

Cheers,
Oleg
Index: gcc/expr.c
===
--- gcc/expr.c	(revision 202080)
+++ gcc/expr.c	(working copy)
@@ -119,7 +119,7 @@
   int reverse;
 };
 
-static void move_by_pieces_1 (rtx (*) (rtx, ...), enum machine_mode,
+static void move_by_pieces_1 (insn_gen_fn, machine_mode,
 			  struct move_by_pieces_d *);
 static bool block_move_libcall_safe_for_call_parm (void);
 static bool emit_block_move_via_movmem (rtx, rtx, rtx, unsigned, unsigned, HOST_WIDE_INT);
@@ -128,7 +128,7 @@
 static rtx clear_by_pieces_1 (void *, HOST_WIDE_INT, enum machine_mode);
 static void clear_by_pieces (rtx, unsigned HOST_WIDE_INT, unsigned int);
 static void store_by_pieces_1 (struct store_by_pieces_d *, unsigned int);
-static void store_by_pieces_2 (rtx (*) (rtx, ...), enum machine_mode,
+static void store_by_pieces_2 (insn_gen_fn, machine_mode,
 			   struct store_by_pieces_d *);
 static tree clear_storage_libcall_fn (int);
 static rtx compress_float_constant (rtx, rtx);
@@ -1043,7 +1043,7 @@
to make a move insn for that mode.  DATA has all the other info.  */
 
 static void
-move_by_pieces_1 (rtx (*genfun) (rtx, ...), enum machine_mode mode,
+move_by_pieces_1 (insn_gen_fn genfun, machine_mode mode,
 		  struct move_by_pieces_d *data)
 {
   unsigned int size = GET_MODE_SIZE (mode);
@@ -2657,7 +2657,7 @@
to make a move insn for that mode.  DATA has all the other info.  */
 
 static void
-store_by_pieces_2 (rtx (*genfun) (rtx, ...), enum machine_mode mode,
+store_by_pieces_2 (insn_gen_fn genfun, machine_mode mode,
 		   struct store_by_pieces_d *data)
 {
   unsigned int size = GET_MODE_SIZE (mode);
Index: gcc/recog.h
===
--- gcc/recog.h	(revision 202080)
+++ gcc/recog.h	(working copy)
@@ -256,8 +256,58 @@
 
 typedef int (*insn_operand_predicate_fn) (rtx, enum machine_mode);
 typedef const char * (*insn_output_fn) (rtx *, rtx);
-typedef rtx (*insn_gen_fn) (rtx, ...);
 
+struct insn_gen_fn
+{
+  typedef rtx (*f0) (void);
+  typedef rtx (*f1) (rtx);
+  typedef rtx (*f2) (rtx, rtx);
+  typedef rtx (*f3) (rtx, rtx, rtx);
+  typedef rtx (*f4) (rtx, rtx, rtx, rtx);
+  typedef rtx (*f5) (rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*f6) (rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*f7) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*f8) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*f9) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*f10) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*f11) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*f12) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*f13) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*f14) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*f15) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*f16) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+
+  typedef f0 stored_funcptr;
+
+  rtx operator () (void) const { return ((f0)func) (); }
+  rtx operator () (rtx a0) const { return ((f1)func) (a0); }
+  rtx operator () (rtx a0, rtx a1) const { return ((f2)func) (a0, a1); }
+  rtx operator () (rtx a0, rtx a1, rtx a2) const { return ((f3)func) (a0, a1, a2); }
+  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3) const { return ((f4)func) (a0, a1, a2, a3); }
+  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4) const { return ((f5)func) (a0, a1, a2, a3, a4); }
+  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5) const { return ((f6)func) (a0, a1, a2, a3, a4

Re: [PATCH] Support dumping type bindings and 'mutable' qualifier in lambda diagnostics.

2013-08-29 Thread Adam Butcher
* error.c (dump_lambda_function): New function, dependent on ...
(maybe_dump_with_clause): ... this new function, factored out of ...
(subst_to_string): ... here and ...
(dump_function_decl): ... here.  Updated to early-out with call to
dump_lambda_function after determining template bindings.
---
Hi Gaby,

On 29.08.2013 18:04, Adam Butcher wrote:
> On 29.08.2013 16:25, Gabriel Dos Reis wrote:
> >
> > Document the new functions.
> > Use pp->translate_string ("with") instead of
> > pp_cxx_ws_string (pp, M_("with")).
> >
> Done.  In documenting 'maybe_dump_template_bindings' and reviewing
> it again I'm not sure it's got the right name.  It wraps
> 'dump_template_bindings' in "[with " and "]".  So it does more than
> just filter a call to 'dump_template_bindings'.
>
> Any suggestions?  What do you think of 'maybe_dump_with_clause' or
> something similar?
>
Here's a diff with the name change (though I'm not all that happy with
it) and the docs.

I've also reimplemented subst_to_string in terms of the new function
as it was duplicating much of the code from dump_function_decl (the
only downside of this is some unnecessary tests in the subst_to_string
case but I think they should get optimized away if it were inlined --
we're dealing with diagnostics formatting here anyway so performance
is probably not a big factor).

Cheers,
Adam
---
 gcc/cp/error.c | 73 --
 1 file changed, 46 insertions(+), 27 deletions(-)

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index c82a0ce..3d1e173 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -1363,6 +1363,47 @@ find_typenames (tree t)
   return ft.typenames;
 }
 
+/* Output the "[with ...]" clause for a template instantiation T iff
+   TEMPLATE_PARMS, TEMPLATE_ARGS and FLAGS are suitable.  T may be NULL if
+   formatting a deduction/substitution diagnostic rather than an
+   instantiation.  */
+
+static void
+maybe_dump_with_clause (cxx_pretty_printer *pp,
+   tree t, tree template_parms, tree template_args,
+   int flags)
+{
+  if (template_parms != NULL_TREE && template_args != NULL_TREE
+  && !(flags & TFF_NO_TEMPLATE_BINDINGS))
+{
+  vec *typenames = t ? find_typenames (t) : NULL;
+  pp_cxx_whitespace (pp);
+  pp_cxx_left_bracket (pp);
+  pp->translate_string ("with");
+  pp_cxx_whitespace (pp);
+  dump_template_bindings (pp, template_parms, template_args, typenames);
+  pp_cxx_right_bracket (pp);
+}
+}
+
+/* Dump the lambda function FN including its 'mutable' qualifier and any
+   template bindings.  */
+
+static void
+dump_lambda_function (cxx_pretty_printer *pp,
+ tree fn, tree template_parms, tree template_args,
+ int flags)
+{
+  /* A lambda's signature is essentially its "type".  */
+  dump_type (pp, DECL_CONTEXT (fn), flags);
+  if (!(TYPE_QUALS (class_of_this_parm (TREE_TYPE (fn))) & TYPE_QUAL_CONST))
+{
+  pp->padding = pp_before;
+  pp_c_ws_string (pp, "mutable");
+}
+  maybe_dump_with_clause (pp, fn, template_parms, template_args, flags);
+}
+
 /* Pretty print a function decl. There are several ways we want to print a
function declaration. The TFF_ bits in FLAGS tells us how to behave.
As error can only apply the '#' flag once to give 0 and 1 for V, there
@@ -1379,15 +1420,6 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, int 
flags)
   int show_return = flags & TFF_RETURN_TYPE || flags & TFF_DECL_SPECIFIERS;
   int do_outer_scope = ! (flags & TFF_UNQUALIFIED_NAME);
   tree exceptions;
-  vec *typenames = NULL;
-
-  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
-{
-  /* A lambda's signature is essentially its "type", so defer.  */
-  gcc_assert (LAMBDA_TYPE_P (DECL_CONTEXT (t)));
-  dump_type (pp, DECL_CONTEXT (t), flags);
-  return;
-}
 
   flags &= ~(TFF_UNQUALIFIED_NAME | TFF_TEMPLATE_NAME);
   if (TREE_CODE (t) == TEMPLATE_DECL)
@@ -1409,10 +1441,12 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, int 
flags)
{
  template_parms = DECL_TEMPLATE_PARMS (tmpl);
  t = tmpl;
- typenames = find_typenames (t);
}
 }
 
+  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
+return dump_lambda_function (pp, t, template_parms, template_args, flags);
+
   fntype = TREE_TYPE (t);
   parmtypes = FUNCTION_FIRST_USER_PARMTYPE (t);
 
@@ -1476,17 +1510,7 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, int 
flags)
   if (show_return)
dump_type_suffix (pp, TREE_TYPE (fntype), flags);
 
-  /* If T is a template instantiation, dump the parameter binding.  */
-  if (template_parms != NULL_TREE && template_args != NULL_TREE
- && !(flags & TFF_NO_TEMPLATE_BINDINGS))
-   {
- pp_cxx_whitespace (pp);
- pp_cxx_left_bracket (pp);
- pp_cxx_ws_string (pp, M_("with"));
- pp_cxx_whitespace (pp);
- dump_t

Re: [PATCH] Fix illegal cast to rtx (*insn_gen_fn) (rtx, ...)

2013-08-29 Thread Jakub Jelinek
On Thu, Aug 29, 2013 at 08:45:33PM +0200, Oleg Endo wrote:
> Sorry for the delay.  I've just backported the 2 patches to 4.8.
> Tested with 'make all-gcc' for SH and PPC cross compilers.
> Committed as rev 202083.

Please fix the overly long lines as a follow-up.

> +struct insn_gen_fn
> +{
> +  typedef rtx (*f0) (void);
> +  typedef rtx (*f1) (rtx);
> +  typedef rtx (*f2) (rtx, rtx);
> +  typedef rtx (*f3) (rtx, rtx, rtx);
> +  typedef rtx (*f4) (rtx, rtx, rtx, rtx);
> +  typedef rtx (*f5) (rtx, rtx, rtx, rtx, rtx);
> +  typedef rtx (*f6) (rtx, rtx, rtx, rtx, rtx, rtx);
> +  typedef rtx (*f7) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> +  typedef rtx (*f8) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> +  typedef rtx (*f9) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> +  typedef rtx (*f10) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> +  typedef rtx (*f11) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> +  typedef rtx (*f12) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, 
> rtx);
> +  typedef rtx (*f13) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, 
> rtx, rtx);
> +  typedef rtx (*f14) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, 
> rtx, rtx, rtx);
> +  typedef rtx (*f15) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, 
> rtx, rtx, rtx, rtx);
> +  typedef rtx (*f16) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, 
> rtx, rtx, rtx, rtx, rtx);
> +
> +  typedef f0 stored_funcptr;
> +
> +  rtx operator () (void) const { return ((f0)func) (); }
> +  rtx operator () (rtx a0) const { return ((f1)func) (a0); }
> +  rtx operator () (rtx a0, rtx a1) const { return ((f2)func) (a0, a1); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2) const { return ((f3)func) (a0, 
> a1, a2); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3) const { return ((f4)func) 
> (a0, a1, a2, a3); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4) const { return 
> ((f5)func) (a0, a1, a2, a3, a4); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5) const { 
> return ((f6)func) (a0, a1, a2, a3, a4, a5); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6) 
> const { return ((f7)func) (a0, a1, a2, a3, a4, a5, a6); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> rtx a7) const { return ((f8)func) (a0, a1, a2, a3, a4, a5, a6, a7); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> rtx a7, rtx a8) const { return ((f9)func) (a0, a1, a2, a3, a4, a5, a6, a7, 
> a8); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> rtx a7, rtx a8, rtx a9) const { return ((f10)func) (a0, a1, a2, a3, a4, a5, 
> a6, a7, a8, a9); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> rtx a7, rtx a8, rtx a9, rtx a10) const { return ((f11)func) (a0, a1, a2, a3, 
> a4, a5, a6, a7, a8, a9, a10); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> rtx a7, rtx a8, rtx a9, rtx a10, rtx a11) const { return ((f12)func) (a0, a1, 
> a2, a3, a4, a5, a6, a7, a8, a9, a10, a11); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> rtx a7, rtx a8, rtx a9, rtx a10, rtx a11, rtx a12) const { return ((f13)func) 
> (a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> rtx a7, rtx a8, rtx a9, rtx a10, rtx a11, rtx a12, rtx a13) const { return 
> ((f14)func) (a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> rtx a7, rtx a8, rtx a9, rtx a10, rtx a11, rtx a12, rtx a13, rtx a14) const { 
> return ((f15)func) (a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, 
> a13, a14); }
> +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> rtx a7, rtx a8, rtx a9, rtx a10, rtx a11, rtx a12, rtx a13, rtx a14, rtx a15) 
> const { return ((f16)func) (a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, 
> a12, a13, a14, a15); }
> +
> +  // This is for compatibility of code that invokes functions like
> +  //   (*funcptr) (arg)
> +  insn_gen_fn operator * (void) const { return *this; }
> +
> +  // The wrapped function pointer must be public and there must not be any
> +  // constructors.  Otherwise the insn_data_d struct initializers generated
> +  // by genoutput.c will result in static initializer functions, which 
> defeats
> +  // the purpose of the generated insn_data_d array.
> +  stored_funcptr func;
> +};

Jakub


Re: [PATCH] Fix illegal cast to rtx (*insn_gen_fn) (rtx, ...)

2013-08-29 Thread Oleg Endo
On Thu, 2013-08-29 at 20:51 +0200, Jakub Jelinek wrote:
> On Thu, Aug 29, 2013 at 08:45:33PM +0200, Oleg Endo wrote:
> > Sorry for the delay.  I've just backported the 2 patches to 4.8.
> > Tested with 'make all-gcc' for SH and PPC cross compilers.
> > Committed as rev 202083.
> 
> Please fix the overly long lines as a follow-up.
> 

In m original mail
http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01315.html
I wrote:

* I don't know whether it's really needed to properly format the code of
class insn_gen_fn.  After reading the first two or three overloads
(which do fit into 80 columns) one gets the idea and so I guess nobody
is going to read that stuff completely anyway.

Nobody commented on it and after Richard's OK to the patch I assumed
it's fine that way as an exception.

Of course I'll do it if you insist :)

Cheers,
Oleg


> > +struct insn_gen_fn
> > +{
> > +  typedef rtx (*f0) (void);
> > +  typedef rtx (*f1) (rtx);
> > +  typedef rtx (*f2) (rtx, rtx);
> > +  typedef rtx (*f3) (rtx, rtx, rtx);
> > +  typedef rtx (*f4) (rtx, rtx, rtx, rtx);
> > +  typedef rtx (*f5) (rtx, rtx, rtx, rtx, rtx);
> > +  typedef rtx (*f6) (rtx, rtx, rtx, rtx, rtx, rtx);
> > +  typedef rtx (*f7) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> > +  typedef rtx (*f8) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> > +  typedef rtx (*f9) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> > +  typedef rtx (*f10) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
> > +  typedef rtx (*f11) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, 
> > rtx);
> > +  typedef rtx (*f12) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, 
> > rtx, rtx);
> > +  typedef rtx (*f13) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, 
> > rtx, rtx, rtx);
> > +  typedef rtx (*f14) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, 
> > rtx, rtx, rtx, rtx);
> > +  typedef rtx (*f15) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, 
> > rtx, rtx, rtx, rtx, rtx);
> > +  typedef rtx (*f16) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, 
> > rtx, rtx, rtx, rtx, rtx, rtx);
> > +
> > +  typedef f0 stored_funcptr;
> > +
> > +  rtx operator () (void) const { return ((f0)func) (); }
> > +  rtx operator () (rtx a0) const { return ((f1)func) (a0); }
> > +  rtx operator () (rtx a0, rtx a1) const { return ((f2)func) (a0, a1); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2) const { return ((f3)func) (a0, 
> > a1, a2); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3) const { return 
> > ((f4)func) (a0, a1, a2, a3); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4) const { return 
> > ((f5)func) (a0, a1, a2, a3, a4); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5) const { 
> > return ((f6)func) (a0, a1, a2, a3, a4, a5); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6) 
> > const { return ((f7)func) (a0, a1, a2, a3, a4, a5, a6); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> > rtx a7) const { return ((f8)func) (a0, a1, a2, a3, a4, a5, a6, a7); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> > rtx a7, rtx a8) const { return ((f9)func) (a0, a1, a2, a3, a4, a5, a6, a7, 
> > a8); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> > rtx a7, rtx a8, rtx a9) const { return ((f10)func) (a0, a1, a2, a3, a4, a5, 
> > a6, a7, a8, a9); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> > rtx a7, rtx a8, rtx a9, rtx a10) const { return ((f11)func) (a0, a1, a2, 
> > a3, a4, a5, a6, a7, a8, a9, a10); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> > rtx a7, rtx a8, rtx a9, rtx a10, rtx a11) const { return ((f12)func) (a0, 
> > a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> > rtx a7, rtx a8, rtx a9, rtx a10, rtx a11, rtx a12) const { return 
> > ((f13)func) (a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> > rtx a7, rtx a8, rtx a9, rtx a10, rtx a11, rtx a12, rtx a13) const { return 
> > ((f14)func) (a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> > rtx a7, rtx a8, rtx a9, rtx a10, rtx a11, rtx a12, rtx a13, rtx a14) const 
> > { return ((f15)func) (a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, 
> > a12, a13, a14); }
> > +  rtx operator () (rtx a0, rtx a1, rtx a2, rtx a3, rtx a4, rtx a5, rtx a6, 
> > rtx a7, rtx a8, rtx a9, rtx a10, rtx a11, rtx a12, rtx a13, rtx a14, rtx 
> > a15) const { return ((f16)func) (a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, 
> > a10, a11, a12, a13, a14, a15); }
> > +
> > +  // This is for compatibility of code that invokes functions like
> > +  //   (*funcptr) (arg)
> > +  insn_gen_fn operator * (void) const { return *this; }
> > +

gcc_update fix build error

2013-08-29 Thread Mike Stump
This fixes a build error in trunk.  Before:

# cat gcc/REVISION 
[trunk
trunk revision 202083]

and after:

$ cat gcc/REVISION 
[trunk revision 202083]

$ svn info
Path: .
Working Copy Root Path: /Users/mrs/net/gcc
URL: svn+ssh://m...@gcc.gnu.org/svn/gcc/trunk
Relative URL: ^/trunk
Repository Root: svn+ssh://m...@gcc.gnu.org/svn/gcc
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 202083
Node Kind: directory
Schedule: normal
Last Changed Author: timshen
Last Changed Rev: 202082
Last Changed Date: 2013-08-29 11:33:07 -0700 (Thu, 29 Aug 2013)


The Relative URL was messing it up.  This is with svn:

$ svn --version
svn, version 1.8.1 (r1503906)

which required an upgrade on the old repository.

I checked this in.


2013-08-29  Mike Stump  

* gcc_update (configure): Update to handle svn 1.8.1.

Index: contrib/gcc_update
===
--- contrib/gcc_update  (revision 202083)
+++ contrib/gcc_update  (working copy)
@@ -385,7 +385,7 @@ case $vcs_type in
fi
 
revision=`$GCC_SVN info | awk '/Revision:/ { print $2 }'`
-   branch=`$GCC_SVN info | sed -ne "/URL:/ {
+   branch=`$GCC_SVN info | sed -ne "/^URL:/ {
s,.*/trunk,trunk,
s,.*/branches/,,
s,.*/tags/,,



Re: wide-int branch now up for public comment and review

2013-08-29 Thread Mike Stump
On Aug 29, 2013, at 12:36 AM, Richard Biener  wrote:
> On Wed, 28 Aug 2013, Mike Stump wrote:
> 
>> On Aug 28, 2013, at 3:22 AM, Richard Biener  wrote:
>>> Btw, rtl.h still wastes space with
>>> 
>>> struct GTY((variable_size)) hwivec_def {
>>> int num_elem; /* number of elements */
>>> HOST_WIDE_INT elem[1];
>>> };
>>> 
>>> struct GTY((chain_next ("RTX_NEXT (&%h)"),
>>>   chain_prev ("RTX_PREV (&%h)"), variable_size)) rtx_def {
>>> ...
>>> /* The first element of the operands of this rtx.
>>>The number of operands and their types are controlled
>>>by the `code' field, according to rtl.def.  */
>>> union u {
>>>   rtunion fld[1];
>>>   HOST_WIDE_INT hwint[1];
>>>   struct block_symbol block_sym;
>>>   struct real_value rv;
>>>   struct fixed_value fv;
>>>   struct hwivec_def hwiv;
>>> } GTY ((special ("rtx_def"), desc ("GET_CODE (&%0)"))) u;
>>> };
>>> 
>>> there are 32bits available before the union.  If you don't use
>>> those for num_elem then all wide-ints will at least take as
>>> much space as DOUBLE_INTs originally took - and large ints
>>> that would have required DOUBLE_INTs in the past will now
>>> require more space than before.  Which means your math
>>> motivating the 'num_elem' encoding stuff is wrong.  With
>>> moving 'num_elem' before u you can even re-use the hwint
>>> field in the union as the existing double-int code does
>>> (which in fact could simply do the encoding trick in the
>>> old CONST_DOUBLE scheme, similar for the tree INTEGER_CST
>>> container).
>> 
>> So, HOST_WIDE_INT is likely 64 bits, and likely is 64 bit aligned.  The 
>> base (stuff before the union) is 32 bits.  There is a 32 bit gap, even 
>> if not used before the HOST_WIDE_INT elem.  We place the num_elem is 
>> this gap.
> 
> No, you don't.  You place num_elem 64bit aligned _after_ the gap.
> And you have another 32bit gap, as you say, before elem.

Ah, ok, I get it, thanks for the explanation.  This removes the second gap 
creator and puts the field into the gap before the u union.

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index ce40347..143f298 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -594,7 +594,7 @@ immed_wide_int_const (const wide_int &v, enum machine_mode 
mode)
 /* It is so tempting to just put the mode in here.  Must control
myself ... */
 PUT_MODE (value, VOIDmode);
-HWI_PUT_NUM_ELEM (CONST_WIDE_INT_VEC (value), len);
+CWI_PUT_NUM_ELEM (value, len);
 
 for (i = 0; i < len; i++)
   CONST_WIDE_INT_ELT (value, i) = v.elt (i);
diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
index 3620bd6..8dad9f6 100644
--- a/gcc/print-rtl.c
+++ b/gcc/print-rtl.c
@@ -616,7 +616,7 @@ print_rtx (const_rtx in_rtx)
 case CONST_WIDE_INT:
   if (! flag_simple)
fprintf (outfile, " ");
-  hwivec_output_hex (outfile, CONST_WIDE_INT_VEC (in_rtx));
+  cwi_output_hex (outfile, in_rtx);
   break;
 #endif
 
diff --git a/gcc/read-rtl.c b/gcc/read-rtl.c
index 707ef3f..c198b5b 100644
--- a/gcc/read-rtl.c
+++ b/gcc/read-rtl.c
@@ -1352,7 +1352,6 @@ read_rtx_code (const char *code_name)
   read_name (&name);
   validate_const_wide_int (name.string);
   {
-   hwivec hwiv;
const char *s = name.string;
int len;
int index = 0;
@@ -1377,7 +1376,6 @@ read_rtx_code (const char *code_name)
 
return_rtx = const_wide_int_alloc (wlen);
 
-   hwiv = CONST_WIDE_INT_VEC (return_rtx);
while (pos > 0)
  {
 #if HOST_BITS_PER_WIDE_INT == 64
@@ -1385,13 +1383,13 @@ read_rtx_code (const char *code_name)
 #else
sscanf (s + pos, "%8" HOST_WIDE_INT_PRINT "x", &wi);
 #endif
-   XHWIVEC_ELT (hwiv, index++) = wi;
+   CWI_ELT (return_rtx, index++) = wi;
pos -= gs;
  }
strncpy (buf, s, gs - pos);
buf [gs - pos] = 0;
sscanf (buf, "%" HOST_WIDE_INT_PRINT "x", &wi);
-   XHWIVEC_ELT (hwiv, index++) = wi;
+   CWI_ELT (return_rtx, index++) = wi;
/* TODO: After reading, do we want to canonicalize with:
   value = lookup_const_wide_int (value); ? */
   }
diff --git a/gcc/rtl.c b/gcc/rtl.c
index 074e425..b913d0d 100644
--- a/gcc/rtl.c
+++ b/gcc/rtl.c
@@ -225,18 +225,18 @@ rtx_alloc_stat (RTX_CODE code MEM_STAT_DECL)
   return rtx_alloc_stat_v (code PASS_MEM_STAT, 0);
 }
 
-/* Write the wide constant OP0 to OUTFILE.  */
+/* Write the wide constant X to OUTFILE.  */
 
 void
-hwivec_output_hex (FILE *outfile, const_hwivec op0)
+cwi_output_hex (FILE *outfile, const_rtx x)
 {
-  int i = HWI_GET_NUM_ELEM (op0);
+  int i = CWI_GET_NUM_ELEM (x);
   gcc_assert (i > 0);
-  if (XHWIVEC_ELT (op0, i-1) == 0)
+  if (CWI_ELT (x, i-1) == 0)
 fprintf (outfile, "0x");
-  fprintf (outfile, HOST_WIDE_INT_PRINT_HEX, XHWIVEC_ELT (op0, --i));
+  fprintf (outfile, HOST_WIDE_INT_PRINT_HEX, CWI_ELT (x, --i));
   while (--i >= 0)
-fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX, XHWIVEC_ELT (op0, i));
+fprintf (outfile, HOST_WIDE_INT_

Re: [PATCH] Support dumping type bindings and 'mutable' qualifier in lambda diagnostics.

2013-08-29 Thread Gabriel Dos Reis
On Thu, Aug 29, 2013 at 1:51 PM, Adam Butcher  wrote:
> * error.c (dump_lambda_function): New function, dependent on ...
> (maybe_dump_with_clause): ... this new function, factored out of ...
> (subst_to_string): ... here and ...
> (dump_function_decl): ... here.  Updated to early-out with call to
> dump_lambda_function after determining template bindings.
> ---
> Hi Gaby,
>
> On 29.08.2013 18:04, Adam Butcher wrote:
>> On 29.08.2013 16:25, Gabriel Dos Reis wrote:
>> >
>> > Document the new functions.
>> > Use pp->translate_string ("with") instead of
>> > pp_cxx_ws_string (pp, M_("with")).
>> >
>> Done.  In documenting 'maybe_dump_template_bindings' and reviewing
>> it again I'm not sure it's got the right name.  It wraps
>> 'dump_template_bindings' in "[with " and "]".  So it does more than
>> just filter a call to 'dump_template_bindings'.
>>
>> Any suggestions?  What do you think of 'maybe_dump_with_clause' or
>> something similar?
>>
> Here's a diff with the name change (though I'm not all that happy with
> it) and the docs.

call it dump_substitution.  We don't need the 'maybe_' prefix since
an empty substitution is the identity transformation.

>
> I've also reimplemented subst_to_string in terms of the new function
> as it was duplicating much of the code from dump_function_decl (the
> only downside of this is some unnecessary tests in the subst_to_string
> case but I think they should get optimized away if it were inlined --
> we're dealing with diagnostics formatting here anyway so performance
> is probably not a big factor).

Patch OK with that change.  Thanks!

>
> Cheers,
> Adam
> ---
>  gcc/cp/error.c | 73 
> --
>  1 file changed, 46 insertions(+), 27 deletions(-)
>
> diff --git a/gcc/cp/error.c b/gcc/cp/error.c
> index c82a0ce..3d1e173 100644
> --- a/gcc/cp/error.c
> +++ b/gcc/cp/error.c
> @@ -1363,6 +1363,47 @@ find_typenames (tree t)
>return ft.typenames;
>  }
>
> +/* Output the "[with ...]" clause for a template instantiation T iff
> +   TEMPLATE_PARMS, TEMPLATE_ARGS and FLAGS are suitable.  T may be NULL if
> +   formatting a deduction/substitution diagnostic rather than an
> +   instantiation.  */
> +
> +static void
> +maybe_dump_with_clause (cxx_pretty_printer *pp,
> +   tree t, tree template_parms, tree template_args,
> +   int flags)
> +{
> +  if (template_parms != NULL_TREE && template_args != NULL_TREE
> +  && !(flags & TFF_NO_TEMPLATE_BINDINGS))
> +{
> +  vec *typenames = t ? find_typenames (t) : NULL;
> +  pp_cxx_whitespace (pp);
> +  pp_cxx_left_bracket (pp);
> +  pp->translate_string ("with");
> +  pp_cxx_whitespace (pp);
> +  dump_template_bindings (pp, template_parms, template_args, typenames);
> +  pp_cxx_right_bracket (pp);
> +}
> +}
> +
> +/* Dump the lambda function FN including its 'mutable' qualifier and any
> +   template bindings.  */
> +
> +static void
> +dump_lambda_function (cxx_pretty_printer *pp,
> + tree fn, tree template_parms, tree template_args,
> + int flags)
> +{
> +  /* A lambda's signature is essentially its "type".  */
> +  dump_type (pp, DECL_CONTEXT (fn), flags);
> +  if (!(TYPE_QUALS (class_of_this_parm (TREE_TYPE (fn))) & TYPE_QUAL_CONST))
> +{
> +  pp->padding = pp_before;
> +  pp_c_ws_string (pp, "mutable");
> +}
> +  maybe_dump_with_clause (pp, fn, template_parms, template_args, flags);
> +}
> +
>  /* Pretty print a function decl. There are several ways we want to print a
> function declaration. The TFF_ bits in FLAGS tells us how to behave.
> As error can only apply the '#' flag once to give 0 and 1 for V, there
> @@ -1379,15 +1420,6 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, 
> int flags)
>int show_return = flags & TFF_RETURN_TYPE || flags & TFF_DECL_SPECIFIERS;
>int do_outer_scope = ! (flags & TFF_UNQUALIFIED_NAME);
>tree exceptions;
> -  vec *typenames = NULL;
> -
> -  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
> -{
> -  /* A lambda's signature is essentially its "type", so defer.  */
> -  gcc_assert (LAMBDA_TYPE_P (DECL_CONTEXT (t)));
> -  dump_type (pp, DECL_CONTEXT (t), flags);
> -  return;
> -}
>
>flags &= ~(TFF_UNQUALIFIED_NAME | TFF_TEMPLATE_NAME);
>if (TREE_CODE (t) == TEMPLATE_DECL)
> @@ -1409,10 +1441,12 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, 
> int flags)
> {
>   template_parms = DECL_TEMPLATE_PARMS (tmpl);
>   t = tmpl;
> - typenames = find_typenames (t);
> }
>  }
>
> +  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
> +return dump_lambda_function (pp, t, template_parms, template_args, 
> flags);
> +
>fntype = TREE_TYPE (t);
>parmtypes = FUNCTION_FIRST_USER_PARMTYPE (t);
>
> @@ -1476,17 +1510,7 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, 
> int flags)
>   

Re: PR 58148 patch

2013-08-29 Thread François Dumont
Indeed, I check the Standard about const_pointer, so here is another 
attempt.


Tested under Linux x86_64.

2013-08-29  François Dumont  

PR libstdc++/58148
* include/debug/functions.h (__foreign_iterator_aux4): Use
sequence const_pointer as common type to compare pointers. Add a
fallback overload in case pointers cannot be cast to sequence
const_pointer.
* testsuite/23_containers/vector/modifiers/insert/58148.cc: New.

Ok to commit ?

François


On 08/28/2013 10:50 PM, Paolo Carlini wrote:

Hi,

On 08/28/2013 09:30 PM, François Dumont wrote:

- std::addressof(*(__it._M_get_sequence()->_M_base().begin(
+ &(*(__it._M_get_sequence()->_M_base().begin(
I'm not convinced that you can avoid these std::addressof: it seems to 
me that the value_type can still have an overloaded operator&


Paolo.



Index: include/debug/functions.h
===
--- include/debug/functions.h	(revision 201966)
+++ include/debug/functions.h	(working copy)
@@ -36,7 +36,7 @@
 #include // for __addressof and addressof
 #if __cplusplus >= 201103L
 # include 		  // for less and greater_equal
-# include 			  // for common_type
+# include 			  // for is_lvalue_reference and __and_
 #endif
 #include 
 
@@ -172,17 +172,16 @@
 }
 
 #if __cplusplus >= 201103L
+  // Default implementation.
   template
+	   typename _InputIterator>
 inline bool
 __foreign_iterator_aux4(const _Safe_iterator<_Iterator, _Sequence>& __it,
 			_InputIterator __other,
-			_PointerType1, _PointerType2)
+			typename _Sequence::const_pointer,
+			typename _Sequence::const_pointer)
 {
-  typedef typename std::common_type<_PointerType1,
-	_PointerType2>::type _PointerType;
+  typedef typename _Sequence::const_pointer _PointerType;
   constexpr std::less<_PointerType> __l{};
   constexpr std::greater_equal<_PointerType> __ge{};
 
@@ -192,7 +191,16 @@
 		  std::addressof(*(__it._M_get_sequence()->_M_base().end()
    - 1)) + 1));
 }
-			  
+
+  // Fallback when address type cannot be implicitely casted to sequence
+  // const_pointer.
+  template
+inline bool
+__foreign_iterator_aux4(const _Safe_iterator<_Iterator, _Sequence>&,
+			_InputIterator, ...)
+{ return true; }
+
   template
 inline bool
 __foreign_iterator_aux3(const _Safe_iterator<_Iterator, _Sequence>& __it,
@@ -223,7 +231,7 @@
 			std::false_type)
 { return true; }
 #endif
-			   
+
   /** Checks that iterators do not belong to the same sequence. */
   template
 inline bool
Index: testsuite/23_containers/vector/modifiers/insert/58148.cc
===
--- testsuite/23_containers/vector/modifiers/insert/58148.cc	(revision 0)
+++ testsuite/23_containers/vector/modifiers/insert/58148.cc	(revision 0)
@@ -0,0 +1,35 @@
+// Copyright (C) 2013 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++11" }
+// { dg-do compile }
+
+#include 
+
+void
+test01()
+{
+  std::vector v;
+  char c = 'a';
+  v.insert(v.begin(), &c, &c);
+}
+
+int main()
+{
+  test01();
+  return 0;
+}


Re: [C++ Patch] PR 51424

2013-08-29 Thread Jason Merrill

On 08/29/2013 11:24 AM, Paolo Carlini wrote:

+  if ((complain & tf_error)
+  && (flags & LOOKUP_DELEGATING_CONS)
+  && name == complete_ctor_identifier
+  && TREE_CODE (ret) == CALL_EXPR
+  && (DECL_ABSTRACT_ORIGIN (TREE_OPERAND (CALL_EXPR_FN (ret), 0))
+ == current_function_decl))
+error ("constructor delegates to itself");


How about doing this check in perform_target_ctor instead, so you don't 
need another LOOKUP flag?


Jason



Re: [PATCH 6/6] Add manual GTY hooks

2013-08-29 Thread Steven Bosscher
On Thu, Aug 29, 2013 at 6:20 PM, David Malcolm  wrote:
> * gimple.c (gt_ggc_mx (gimple)): New, as required by GTY((user)).
> (gt_pch_nx (gimple)): Likewise.

GIMPLE isn't supposed to end up in a PCH. Can you please make this
function simply call gcc_unreachable()?

FWIW 1: I really think all these hand-written markers aren't a good
idea, we should really figure out a way to have automatic marker
function generators, something less complex than gengtype, of course.
But to have all these calls to the type-mangled marker functions
(gt_pch_n_9tree_node, etc.) is going to be a problem in the long term.

It seems to me that the first step in all these conversions to
hand-written markers should be to make gengtype spit out the marker
functions *without* the type name mangling, i.e. all marker functions
should just be gt_ggc_mx(type) / gt_pch_nx(type).

Ciao!
Steven


Re: Go patch committed: Set TREE_PUBLIC reliably

2013-08-29 Thread Uros Bizjak
On Thu, Aug 29, 2013 at 2:49 AM, Ian Lance Taylor  wrote:
> Uros tracked down a problem with the section used for immutable structs:
> the value of compute_reloc_for_constant would change between the time
> immutable_struct_set_init would call it and the time that
> get_variable_section would call it, for the same decl and the same decl
> initializer.  He identified the problem: TREE_PUBLIC was set in the
> latter case but not the former.  This patch fixes the problem.
> Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
> Committed to mainline and 4.8 branch.

I can confirm that there is no invalid section flags warning in the
libgo testsuite on alpha.

Thanks,
Uros.


[PATCH, committed] Support dumping type bindings and 'mutable' qualifier in lambda diagnostics.

2013-08-29 Thread Adam Butcher
From: abutcher 

* error.c (dump_lambda_function): New function, dependent on ...
(dump_substitution): ... this new function, factored out of ...
(subst_to_string): ... here and ...
(dump_function_decl): ... here.  Updated to early-out with call to
dump_lambda_function after determining template bindings.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@202087 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/cp/ChangeLog |  8 +++
 gcc/cp/error.c   | 73 +++-
 2 files changed, 54 insertions(+), 27 deletions(-)

diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index bf49198..f848b81 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,11 @@
+2013-08-29  Adam Butcher  
+
+   * error.c (dump_lambda_function): New function, dependent on ...
+   (dump_substitution): ... this new function, factored out of ...
+   (subst_to_string): ... here and ...
+   (dump_function_decl): ... here.  Updated to early-out with call to
+   dump_lambda_function after determining template bindings.
+
 2013-08-28  Paolo Carlini  
 
PR c++/58255
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index c82a0ce..af71000 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -1363,6 +1363,47 @@ find_typenames (tree t)
   return ft.typenames;
 }
 
+/* Output the "[with ...]" clause for a template instantiation T iff
+   TEMPLATE_PARMS, TEMPLATE_ARGS and FLAGS are suitable.  T may be NULL if
+   formatting a deduction/substitution diagnostic rather than an
+   instantiation.  */
+
+static void
+dump_substitution (cxx_pretty_printer *pp,
+   tree t, tree template_parms, tree template_args,
+   int flags)
+{
+  if (template_parms != NULL_TREE && template_args != NULL_TREE
+  && !(flags & TFF_NO_TEMPLATE_BINDINGS))
+{
+  vec *typenames = t ? find_typenames (t) : NULL;
+  pp_cxx_whitespace (pp);
+  pp_cxx_left_bracket (pp);
+  pp->translate_string ("with");
+  pp_cxx_whitespace (pp);
+  dump_template_bindings (pp, template_parms, template_args, typenames);
+  pp_cxx_right_bracket (pp);
+}
+}
+
+/* Dump the lambda function FN including its 'mutable' qualifier and any
+   template bindings.  */
+
+static void
+dump_lambda_function (cxx_pretty_printer *pp,
+ tree fn, tree template_parms, tree template_args,
+ int flags)
+{
+  /* A lambda's signature is essentially its "type".  */
+  dump_type (pp, DECL_CONTEXT (fn), flags);
+  if (!(TYPE_QUALS (class_of_this_parm (TREE_TYPE (fn))) & TYPE_QUAL_CONST))
+{
+  pp->padding = pp_before;
+  pp_c_ws_string (pp, "mutable");
+}
+  dump_substitution (pp, fn, template_parms, template_args, flags);
+}
+
 /* Pretty print a function decl. There are several ways we want to print a
function declaration. The TFF_ bits in FLAGS tells us how to behave.
As error can only apply the '#' flag once to give 0 and 1 for V, there
@@ -1379,15 +1420,6 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, int 
flags)
   int show_return = flags & TFF_RETURN_TYPE || flags & TFF_DECL_SPECIFIERS;
   int do_outer_scope = ! (flags & TFF_UNQUALIFIED_NAME);
   tree exceptions;
-  vec *typenames = NULL;
-
-  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
-{
-  /* A lambda's signature is essentially its "type", so defer.  */
-  gcc_assert (LAMBDA_TYPE_P (DECL_CONTEXT (t)));
-  dump_type (pp, DECL_CONTEXT (t), flags);
-  return;
-}
 
   flags &= ~(TFF_UNQUALIFIED_NAME | TFF_TEMPLATE_NAME);
   if (TREE_CODE (t) == TEMPLATE_DECL)
@@ -1409,10 +1441,12 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, int 
flags)
{
  template_parms = DECL_TEMPLATE_PARMS (tmpl);
  t = tmpl;
- typenames = find_typenames (t);
}
 }
 
+  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
+return dump_lambda_function (pp, t, template_parms, template_args, flags);
+
   fntype = TREE_TYPE (t);
   parmtypes = FUNCTION_FIRST_USER_PARMTYPE (t);
 
@@ -1476,17 +1510,7 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, int 
flags)
   if (show_return)
dump_type_suffix (pp, TREE_TYPE (fntype), flags);
 
-  /* If T is a template instantiation, dump the parameter binding.  */
-  if (template_parms != NULL_TREE && template_args != NULL_TREE
- && !(flags & TFF_NO_TEMPLATE_BINDINGS))
-   {
- pp_cxx_whitespace (pp);
- pp_cxx_left_bracket (pp);
- pp_cxx_ws_string (pp, M_("with"));
- pp_cxx_whitespace (pp);
- dump_template_bindings (pp, template_parms, template_args, typenames);
- pp_cxx_right_bracket (pp);
-   }
+  dump_substitution (pp, t, template_parms, template_args, flags);
 }
   else if (template_args)
 {
@@ -2950,12 +2974,7 @@ subst_to_string (tree p)
 
   reinit_cxx_pp ();
   dump_template_decl (cxx_pp, TREE_PURPOSE (p), flags);
-  pp_cxx_whitespace (cxx

Re: [PATCH, committed] Support dumping type bindings and 'mutable' qualifier in lambda diagnostics.

2013-08-29 Thread Gabriel Dos Reis
On Thu, Aug 29, 2013 at 3:57 PM, Adam Butcher  wrote:
> From: abutcher 
>
> * error.c (dump_lambda_function): New function, dependent on ...
> (dump_substitution): ... this new function, factored out of ...
> (subst_to_string): ... here and ...
> (dump_function_decl): ... here.  Updated to early-out with call to
> dump_lambda_function after determining template bindings.

Please go ahead and commit it.  You do have SVN commit access, right?

-- Gaby

>
> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@202087 
> 138bc75d-0d04-0410-961f-82ee72b054a4
> ---
>  gcc/cp/ChangeLog |  8 +++
>  gcc/cp/error.c   | 73 
> +++-
>  2 files changed, 54 insertions(+), 27 deletions(-)
>
> diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
> index bf49198..f848b81 100644
> --- a/gcc/cp/ChangeLog
> +++ b/gcc/cp/ChangeLog
> @@ -1,3 +1,11 @@
> +2013-08-29  Adam Butcher  
> +
> +   * error.c (dump_lambda_function): New function, dependent on ...
> +   (dump_substitution): ... this new function, factored out of ...
> +   (subst_to_string): ... here and ...
> +   (dump_function_decl): ... here.  Updated to early-out with call to
> +   dump_lambda_function after determining template bindings.
> +
>  2013-08-28  Paolo Carlini  
>
> PR c++/58255
> diff --git a/gcc/cp/error.c b/gcc/cp/error.c
> index c82a0ce..af71000 100644
> --- a/gcc/cp/error.c
> +++ b/gcc/cp/error.c
> @@ -1363,6 +1363,47 @@ find_typenames (tree t)
>return ft.typenames;
>  }
>
> +/* Output the "[with ...]" clause for a template instantiation T iff
> +   TEMPLATE_PARMS, TEMPLATE_ARGS and FLAGS are suitable.  T may be NULL if
> +   formatting a deduction/substitution diagnostic rather than an
> +   instantiation.  */
> +
> +static void
> +dump_substitution (cxx_pretty_printer *pp,
> +   tree t, tree template_parms, tree template_args,
> +   int flags)
> +{
> +  if (template_parms != NULL_TREE && template_args != NULL_TREE
> +  && !(flags & TFF_NO_TEMPLATE_BINDINGS))
> +{
> +  vec *typenames = t ? find_typenames (t) : NULL;
> +  pp_cxx_whitespace (pp);
> +  pp_cxx_left_bracket (pp);
> +  pp->translate_string ("with");
> +  pp_cxx_whitespace (pp);
> +  dump_template_bindings (pp, template_parms, template_args, typenames);
> +  pp_cxx_right_bracket (pp);
> +}
> +}
> +
> +/* Dump the lambda function FN including its 'mutable' qualifier and any
> +   template bindings.  */
> +
> +static void
> +dump_lambda_function (cxx_pretty_printer *pp,
> + tree fn, tree template_parms, tree template_args,
> + int flags)
> +{
> +  /* A lambda's signature is essentially its "type".  */
> +  dump_type (pp, DECL_CONTEXT (fn), flags);
> +  if (!(TYPE_QUALS (class_of_this_parm (TREE_TYPE (fn))) & TYPE_QUAL_CONST))
> +{
> +  pp->padding = pp_before;
> +  pp_c_ws_string (pp, "mutable");
> +}
> +  dump_substitution (pp, fn, template_parms, template_args, flags);
> +}
> +
>  /* Pretty print a function decl. There are several ways we want to print a
> function declaration. The TFF_ bits in FLAGS tells us how to behave.
> As error can only apply the '#' flag once to give 0 and 1 for V, there
> @@ -1379,15 +1420,6 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, 
> int flags)
>int show_return = flags & TFF_RETURN_TYPE || flags & TFF_DECL_SPECIFIERS;
>int do_outer_scope = ! (flags & TFF_UNQUALIFIED_NAME);
>tree exceptions;
> -  vec *typenames = NULL;
> -
> -  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
> -{
> -  /* A lambda's signature is essentially its "type", so defer.  */
> -  gcc_assert (LAMBDA_TYPE_P (DECL_CONTEXT (t)));
> -  dump_type (pp, DECL_CONTEXT (t), flags);
> -  return;
> -}
>
>flags &= ~(TFF_UNQUALIFIED_NAME | TFF_TEMPLATE_NAME);
>if (TREE_CODE (t) == TEMPLATE_DECL)
> @@ -1409,10 +1441,12 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, 
> int flags)
> {
>   template_parms = DECL_TEMPLATE_PARMS (tmpl);
>   t = tmpl;
> - typenames = find_typenames (t);
> }
>  }
>
> +  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
> +return dump_lambda_function (pp, t, template_parms, template_args, 
> flags);
> +
>fntype = TREE_TYPE (t);
>parmtypes = FUNCTION_FIRST_USER_PARMTYPE (t);
>
> @@ -1476,17 +1510,7 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, 
> int flags)
>if (show_return)
> dump_type_suffix (pp, TREE_TYPE (fntype), flags);
>
> -  /* If T is a template instantiation, dump the parameter binding.  */
> -  if (template_parms != NULL_TREE && template_args != NULL_TREE
> - && !(flags & TFF_NO_TEMPLATE_BINDINGS))
> -   {
> - pp_cxx_whitespace (pp);
> - pp_cxx_left_bracket (pp);
> - pp_cxx_ws_string (pp, M_("with"));
> - pp_cxx_whitespace (pp);
> - 

Re: [PATCH, committed] Support dumping type bindings and 'mutable' qualifier in lambda diagnostics.

2013-08-29 Thread Adam Butcher

On 29.08.2013 21:59, Gabriel Dos Reis wrote:
On Thu, Aug 29, 2013 at 3:57 PM, Adam Butcher  
wrote:

From: abutcher 

* error.c (dump_lambda_function): New function, dependent on 
...
(dump_substitution): ... this new function, factored out of 
...

(subst_to_string): ... here and ...
(dump_function_decl): ... here.  Updated to early-out with 
call to

dump_lambda_function after determining template bindings.


Please go ahead and commit it.  You do have SVN commit access, right?

Yes, this is the committed patch.  I was under the impression that I 
should ping the list with the commit once pushed.


Adam


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@202087 
138bc75d-0d04-0410-961f-82ee72b054a4

---
 gcc/cp/ChangeLog |  8 +++
 gcc/cp/error.c   | 73 
+++-

 2 files changed, 54 insertions(+), 27 deletions(-)

diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index bf49198..f848b81 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,11 @@
+2013-08-29  Adam Butcher  
+
+   * error.c (dump_lambda_function): New function, dependent on 
...
+   (dump_substitution): ... this new function, factored out of 
...

+   (subst_to_string): ... here and ...
+   (dump_function_decl): ... here.  Updated to early-out with 
call to

+   dump_lambda_function after determining template bindings.
+
 2013-08-28  Paolo Carlini  

PR c++/58255
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index c82a0ce..af71000 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -1363,6 +1363,47 @@ find_typenames (tree t)
   return ft.typenames;
 }

+/* Output the "[with ...]" clause for a template instantiation T 
iff
+   TEMPLATE_PARMS, TEMPLATE_ARGS and FLAGS are suitable.  T may be 
NULL if

+   formatting a deduction/substitution diagnostic rather than an
+   instantiation.  */
+
+static void
+dump_substitution (cxx_pretty_printer *pp,
+   tree t, tree template_parms, tree template_args,
+   int flags)
+{
+  if (template_parms != NULL_TREE && template_args != NULL_TREE
+  && !(flags & TFF_NO_TEMPLATE_BINDINGS))
+{
+  vec *typenames = t ? find_typenames (t) : NULL;
+  pp_cxx_whitespace (pp);
+  pp_cxx_left_bracket (pp);
+  pp->translate_string ("with");
+  pp_cxx_whitespace (pp);
+  dump_template_bindings (pp, template_parms, template_args, 
typenames);

+  pp_cxx_right_bracket (pp);
+}
+}
+
+/* Dump the lambda function FN including its 'mutable' qualifier 
and any

+   template bindings.  */
+
+static void
+dump_lambda_function (cxx_pretty_printer *pp,
+ tree fn, tree template_parms, tree 
template_args,

+ int flags)
+{
+  /* A lambda's signature is essentially its "type".  */
+  dump_type (pp, DECL_CONTEXT (fn), flags);
+  if (!(TYPE_QUALS (class_of_this_parm (TREE_TYPE (fn))) & 
TYPE_QUAL_CONST))

+{
+  pp->padding = pp_before;
+  pp_c_ws_string (pp, "mutable");
+}
+  dump_substitution (pp, fn, template_parms, template_args, flags);
+}
+
 /* Pretty print a function decl. There are several ways we want to 
print a
function declaration. The TFF_ bits in FLAGS tells us how to 
behave.
As error can only apply the '#' flag once to give 0 and 1 for V, 
there
@@ -1379,15 +1420,6 @@ dump_function_decl (cxx_pretty_printer *pp, 
tree t, int flags)
   int show_return = flags & TFF_RETURN_TYPE || flags & 
TFF_DECL_SPECIFIERS;

   int do_outer_scope = ! (flags & TFF_UNQUALIFIED_NAME);
   tree exceptions;
-  vec *typenames = NULL;
-
-  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
-{
-  /* A lambda's signature is essentially its "type", so defer.  
*/

-  gcc_assert (LAMBDA_TYPE_P (DECL_CONTEXT (t)));
-  dump_type (pp, DECL_CONTEXT (t), flags);
-  return;
-}

   flags &= ~(TFF_UNQUALIFIED_NAME | TFF_TEMPLATE_NAME);
   if (TREE_CODE (t) == TEMPLATE_DECL)
@@ -1409,10 +1441,12 @@ dump_function_decl (cxx_pretty_printer *pp, 
tree t, int flags)

{
  template_parms = DECL_TEMPLATE_PARMS (tmpl);
  t = tmpl;
- typenames = find_typenames (t);
}
 }

+  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
+return dump_lambda_function (pp, t, template_parms, 
template_args, flags);

+
   fntype = TREE_TYPE (t);
   parmtypes = FUNCTION_FIRST_USER_PARMTYPE (t);

@@ -1476,17 +1510,7 @@ dump_function_decl (cxx_pretty_printer *pp, 
tree t, int flags)

   if (show_return)
dump_type_suffix (pp, TREE_TYPE (fntype), flags);

-  /* If T is a template instantiation, dump the parameter 
binding.  */

-  if (template_parms != NULL_TREE && template_args != NULL_TREE
- && !(flags & TFF_NO_TEMPLATE_BINDINGS))
-   {
- pp_cxx_whitespace (pp);
- pp_cxx_left_bracket (pp);
- pp_cxx_ws_string (pp, M_("with"));
- pp_cxx_whitespace (pp);
- dump_template_bindings (pp, template_parms, templa

Re: [PATCH, committed] Support dumping type bindings and 'mutable' qualifier in lambda diagnostics.

2013-08-29 Thread Gabriel Dos Reis
On Thu, Aug 29, 2013 at 4:06 PM, Adam Butcher  wrote:
> On 29.08.2013 21:59, Gabriel Dos Reis wrote:
>>
>> On Thu, Aug 29, 2013 at 3:57 PM, Adam Butcher 
>> wrote:
>>>
>>> From: abutcher 
>>>
>>> * error.c (dump_lambda_function): New function, dependent on ...
>>> (dump_substitution): ... this new function, factored out of ...
>>> (subst_to_string): ... here and ...
>>> (dump_function_decl): ... here.  Updated to early-out with call
>>> to
>>> dump_lambda_function after determining template bindings.
>>
>>
>> Please go ahead and commit it.  You do have SVN commit access, right?
>>
> Yes, this is the committed patch.  I was under the impression that I should
> ping the list with the commit once pushed.

yes, you are right; my confusion.


>
> Adam
>
>
>
>>> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@202087
>>> 138bc75d-0d04-0410-961f-82ee72b054a4
>>> ---
>>>  gcc/cp/ChangeLog |  8 +++
>>>  gcc/cp/error.c   | 73
>>> +++-
>>>  2 files changed, 54 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
>>> index bf49198..f848b81 100644
>>> --- a/gcc/cp/ChangeLog
>>> +++ b/gcc/cp/ChangeLog
>>> @@ -1,3 +1,11 @@
>>> +2013-08-29  Adam Butcher  
>>> +
>>> +   * error.c (dump_lambda_function): New function, dependent on ...
>>> +   (dump_substitution): ... this new function, factored out of ...
>>> +   (subst_to_string): ... here and ...
>>> +   (dump_function_decl): ... here.  Updated to early-out with call
>>> to
>>> +   dump_lambda_function after determining template bindings.
>>> +
>>>  2013-08-28  Paolo Carlini  
>>>
>>> PR c++/58255
>>> diff --git a/gcc/cp/error.c b/gcc/cp/error.c
>>> index c82a0ce..af71000 100644
>>> --- a/gcc/cp/error.c
>>> +++ b/gcc/cp/error.c
>>> @@ -1363,6 +1363,47 @@ find_typenames (tree t)
>>>return ft.typenames;
>>>  }
>>>
>>> +/* Output the "[with ...]" clause for a template instantiation T iff
>>> +   TEMPLATE_PARMS, TEMPLATE_ARGS and FLAGS are suitable.  T may be NULL
>>> if
>>> +   formatting a deduction/substitution diagnostic rather than an
>>> +   instantiation.  */
>>> +
>>> +static void
>>> +dump_substitution (cxx_pretty_printer *pp,
>>> +   tree t, tree template_parms, tree template_args,
>>> +   int flags)
>>> +{
>>> +  if (template_parms != NULL_TREE && template_args != NULL_TREE
>>> +  && !(flags & TFF_NO_TEMPLATE_BINDINGS))
>>> +{
>>> +  vec *typenames = t ? find_typenames (t) : NULL;
>>> +  pp_cxx_whitespace (pp);
>>> +  pp_cxx_left_bracket (pp);
>>> +  pp->translate_string ("with");
>>> +  pp_cxx_whitespace (pp);
>>> +  dump_template_bindings (pp, template_parms, template_args,
>>> typenames);
>>> +  pp_cxx_right_bracket (pp);
>>> +}
>>> +}
>>> +
>>> +/* Dump the lambda function FN including its 'mutable' qualifier and any
>>> +   template bindings.  */
>>> +
>>> +static void
>>> +dump_lambda_function (cxx_pretty_printer *pp,
>>> + tree fn, tree template_parms, tree template_args,
>>> + int flags)
>>> +{
>>> +  /* A lambda's signature is essentially its "type".  */
>>> +  dump_type (pp, DECL_CONTEXT (fn), flags);
>>> +  if (!(TYPE_QUALS (class_of_this_parm (TREE_TYPE (fn))) &
>>> TYPE_QUAL_CONST))
>>> +{
>>> +  pp->padding = pp_before;
>>> +  pp_c_ws_string (pp, "mutable");
>>> +}
>>> +  dump_substitution (pp, fn, template_parms, template_args, flags);
>>> +}
>>> +
>>>  /* Pretty print a function decl. There are several ways we want to print
>>> a
>>> function declaration. The TFF_ bits in FLAGS tells us how to behave.
>>> As error can only apply the '#' flag once to give 0 and 1 for V,
>>> there
>>> @@ -1379,15 +1420,6 @@ dump_function_decl (cxx_pretty_printer *pp, tree
>>> t, int flags)
>>>int show_return = flags & TFF_RETURN_TYPE || flags &
>>> TFF_DECL_SPECIFIERS;
>>>int do_outer_scope = ! (flags & TFF_UNQUALIFIED_NAME);
>>>tree exceptions;
>>> -  vec *typenames = NULL;
>>> -
>>> -  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
>>> -{
>>> -  /* A lambda's signature is essentially its "type", so defer.  */
>>> -  gcc_assert (LAMBDA_TYPE_P (DECL_CONTEXT (t)));
>>> -  dump_type (pp, DECL_CONTEXT (t), flags);
>>> -  return;
>>> -}
>>>
>>>flags &= ~(TFF_UNQUALIFIED_NAME | TFF_TEMPLATE_NAME);
>>>if (TREE_CODE (t) == TEMPLATE_DECL)
>>> @@ -1409,10 +1441,12 @@ dump_function_decl (cxx_pretty_printer *pp, tree
>>> t, int flags)
>>> {
>>>   template_parms = DECL_TEMPLATE_PARMS (tmpl);
>>>   t = tmpl;
>>> - typenames = find_typenames (t);
>>> }
>>>  }
>>>
>>> +  if (DECL_NAME (t) && LAMBDA_FUNCTION_P (t))
>>> +return dump_lambda_function (pp, t, template_parms, template_args,
>>> flags);
>>> +
>>>fntype = TREE_TYPE (t);
>>>parmtypes = FUNCTION_FIRST_USER_PARMTYPE (t);
>>>
>>> @@ -1476,1

wide-int branch: fixed mips regression

2013-08-29 Thread Kenneth Zadeck

Fixed

FAIL: gcc.dg/fixed-point/int-warning.c  (test for warnings, line 12)

on


 mips64-unknown-linux-gnu

Index: gcc/tree.c
===
--- gcc/tree.c	(revision 202088)
+++ gcc/tree.c	(working copy)
@@ -8531,11 +8531,11 @@ bool
 int_fits_type_p (const_tree c, const_tree type)
 {
   tree type_low_bound, type_high_bound;
-  bool ok_for_low_bound, ok_for_high_bound, unsc;
+  bool ok_for_low_bound, ok_for_high_bound;
   wide_int wc, wd;
+  signop sgn_c = TYPE_SIGN (TREE_TYPE (c));
 
   wc = c;
-  unsc = TYPE_UNSIGNED (TREE_TYPE (c));
 
 retry:
   type_low_bound = TYPE_MIN_VALUE (type);
@@ -8555,17 +8555,17 @@ retry:
   if (type_low_bound && TREE_CODE (type_low_bound) == INTEGER_CST)
 {
   wd = type_low_bound;
-  if (unsc != TYPE_UNSIGNED (TREE_TYPE (type_low_bound)))
+  if (sgn_c != TYPE_SIGN (TREE_TYPE (type_low_bound)))
 	{
-	  int c_neg = (!unsc && wc.neg_p ());
-	  int t_neg = (unsc && wd.neg_p ());
+	  int c_neg = (sgn_c == SIGNED && wc.neg_p ());
+	  int t_neg = (sgn_c == UNSIGNED && wd.neg_p ());
 
 	  if (c_neg && !t_neg)
 	return false;
 	  if ((c_neg || !t_neg) && wc.ltu_p (wd))
 	return false;
 	}
-  else if (wc.lt_p (wd, TYPE_SIGN (TREE_TYPE (type_low_bound
+  else if (wc.lt_p (wd, sgn_c))
 	return false;
   ok_for_low_bound = true;
 }
@@ -8576,17 +8576,17 @@ retry:
   if (type_high_bound && TREE_CODE (type_high_bound) == INTEGER_CST)
 {
   wd = type_high_bound;
-  if (unsc != TYPE_UNSIGNED (TREE_TYPE (type_high_bound)))
+  if (sgn_c != TYPE_SIGN (TREE_TYPE (type_high_bound)))
 	{
-	  int c_neg = (!unsc && wc.neg_p ());
-	  int t_neg = (unsc && wd.neg_p ());
+	  int c_neg = (sgn_c == SIGNED && wc.neg_p ());
+	  int t_neg = (sgn_c == UNSIGNED && wd.neg_p ());
 
 	  if (t_neg && !c_neg)
 	return false;
 	  if ((t_neg || !c_neg) && wc.gtu_p (wd))
 	return false;
 	}
-  else if (wc.gt_p (wd, TYPE_SIGN (TREE_TYPE (type_high_bound
+  else if (wc.gt_p (wd, sgn_c))
 	return false;
   ok_for_high_bound = true;
 }
@@ -8600,7 +8600,7 @@ retry:
   /* Perform some generic filtering which may allow making a decision
  even if the bounds are not constant.  First, negative integers
  never fit in unsigned types, */
-  if (TYPE_UNSIGNED (type) && !unsc && wc.neg_p ())
+  if (TYPE_UNSIGNED (type) && sgn_c == SIGNED && wc.neg_p ())
 return false;
 
   /* Second, narrower types always fit in wider ones.  */
@@ -8608,7 +8608,7 @@ retry:
 return true;
 
   /* Third, unsigned integers with top bit set never fit signed types.  */
-  if (! TYPE_UNSIGNED (type) && unsc && wc.neg_p ())
+  if (!TYPE_UNSIGNED (type) && sgn_c == UNSIGNED && wc.neg_p ())
 return false;
 
   /* If we haven't been able to decide at this point, there nothing more we
Index: gcc/fold-const.c
===
--- gcc/fold-const.c	(revision 202088)
+++ gcc/fold-const.c	(working copy)
@@ -1690,7 +1690,7 @@ fold_convert_const_int_from_fixed (tree
   /* Given a fixed-point constant, make new constant with new type,
  appropriately sign-extended or truncated.  */
   t = force_fit_type (type, wide_int::from_double_int (temp, 
-		   TYPE_PRECISION (type)), 
+		   HOST_BITS_PER_DOUBLE_INT), 
 		  -1,
 		  (temp.is_negative ()
 		   && (TYPE_UNSIGNED (type)


[patch, fortran] PR 56519: Reject impure intrinsic subroutines in DO CONCURRENT

2013-08-29 Thread Thomas Koenig
Hello world,

the attached patch rejects impure subroutines, such as RANDOM_NUMBER,
within DO CONCURRENT.

Regression-tested.  OK for trunk?

Regards

Thomas

2013-08-29  Thomas Koenig  

PR fortran/PR56519
* gfortran.h:  Declare gfc_do_concurrent_flag as extern.
* resolve.c:  Rename do_concurrent_flag to gfc_do_concurrent_flag.
and make non-static.
(resolve_function):  Use gfc_do_concurrent_flag instead of
do_concurrent_flag.
(pure_subroutine):  Likewise.
(resolve_code):  Likewise.
(resolve_types):  Likewise.
* intrinsic.c (gfc_intrinsic_sub_interface):  Raise error for
non-pure intrinsic subroutines within DO CONCURRENT.

2013-08-29  Thomas Koenig  

PR fortran/PR56519
* gfortran.dg/do_concurrent_3.f90:  New test case.
Index: gfortran.h
===
--- gfortran.h	(Revision 201996)
+++ gfortran.h	(Arbeitskopie)
@@ -2846,6 +2846,7 @@ gfc_expr *gfc_expr_to_initialize (gfc_expr *);
 bool gfc_type_is_extensible (gfc_symbol *);
 bool gfc_resolve_intrinsic (gfc_symbol *, locus *);
 bool gfc_explicit_interface_required (gfc_symbol *, char *, int);
+extern int gfc_do_concurrent_flag;
 
 
 /* array.c */
Index: intrinsic.c
===
--- intrinsic.c	(Revision 201996)
+++ intrinsic.c	(Arbeitskopie)
@@ -4397,6 +4397,13 @@ gfc_intrinsic_sub_interface (gfc_code *c, int erro
   c->resolved_sym->attr.elemental = isym->elemental;
 }
 
+  if (gfc_do_concurrent_flag && !isym->pure)
+{
+  gfc_error ("Subroutine call to intrinsic '%s' in DO CONCURRENT "
+		 "block at %L is not PURE", name, &c->loc);
+  return MATCH_ERROR;
+}
+
   if (gfc_pure (NULL) && !isym->pure)
 {
   gfc_error ("Subroutine call to intrinsic '%s' at %L is not PURE", name,
Index: resolve.c
===
--- resolve.c	(Revision 201996)
+++ resolve.c	(Arbeitskopie)
@@ -60,7 +60,7 @@ static code_stack *cs_base = NULL;
 /* Nonzero if we're inside a FORALL or DO CONCURRENT block.  */
 
 static int forall_flag;
-static int do_concurrent_flag;
+int gfc_do_concurrent_flag;
 
 /* True when we are resolving an expression that is an actual argument to
a procedure.  */
@@ -2986,11 +2986,11 @@ resolve_function (gfc_expr *expr)
 		 forall_flag == 2 ? "mask" : "block");
 	  t = false;
 	}
-  else if (do_concurrent_flag)
+  else if (gfc_do_concurrent_flag)
 	{
 	  gfc_error ("Reference to non-PURE function '%s' at %L inside a "
 		 "DO CONCURRENT %s", name, &expr->where,
-		 do_concurrent_flag == 2 ? "mask" : "block");
+		 gfc_do_concurrent_flag == 2 ? "mask" : "block");
 	  t = false;
 	}
   else if (gfc_pure (NULL))
@@ -3059,7 +3059,7 @@ pure_subroutine (gfc_code *c, gfc_symbol *sym)
   if (forall_flag)
 gfc_error ("Subroutine call to '%s' in FORALL block at %L is not PURE",
 	   sym->name, &c->loc);
-  else if (do_concurrent_flag)
+  else if (gfc_do_concurrent_flag)
 gfc_error ("Subroutine call to '%s' in DO CONCURRENT block at %L is not "
 	   "PURE", sym->name, &c->loc);
   else if (gfc_pure (NULL))
@@ -9629,7 +9629,7 @@ resolve_code (gfc_code *code, gfc_namespace *ns)
 {
   frame.current = code;
   forall_save = forall_flag;
-  do_concurrent_save = do_concurrent_flag;
+  do_concurrent_save = gfc_do_concurrent_flag;
 
   if (code->op == EXEC_FORALL)
 	{
@@ -9663,9 +9663,9 @@ resolve_code (gfc_code *code, gfc_namespace *ns)
 		 to transform the SELECT TYPE into ASSOCIATE first.  */
 	  break;
 case EXEC_DO_CONCURRENT:
-	  do_concurrent_flag = 1;
+	  gfc_do_concurrent_flag = 1;
 	  gfc_resolve_blocks (code->block, ns);
-	  do_concurrent_flag = 2;
+	  gfc_do_concurrent_flag = 2;
 	  break;
 	case EXEC_OMP_WORKSHARE:
 	  omp_workshare_save = omp_workshare_flag;
@@ -9684,7 +9684,7 @@ resolve_code (gfc_code *code, gfc_namespace *ns)
   if (code->op != EXEC_COMPCALL && code->op != EXEC_CALL_PPC)
 	t = gfc_resolve_expr (code->expr1);
   forall_flag = forall_save;
-  do_concurrent_flag = do_concurrent_save;
+  gfc_do_concurrent_flag = do_concurrent_save;
 
   if (!gfc_resolve_expr (code->expr2))
 	t = false;
@@ -14404,7 +14404,7 @@ resolve_types (gfc_namespace *ns)
 }
 
   forall_flag = 0;
-  do_concurrent_flag = 0;
+  gfc_do_concurrent_flag = 0;
   gfc_check_interfaces (ns);
 
   gfc_traverse_ns (ns, resolve_values);
! { dg-do compile }
! PR 56519 - flag impure intrinsic subroutine calls
! within DO CONCURRENT
program main
  implicit none
  integer :: i
  real :: array(123), val

  do concurrent (i = 1:123)
 call random_number (val) ! { dg-error "is not PURE" }
 array(i) = val
  end do
end program main


Re: [PATCH]: Fix PR middle-end/56382 -- Only move MODE_COMPLEX_FLOAT by parts if we can create pseudos

2013-08-29 Thread Steven Bosscher
On Thu, Aug 29, 2013 at 2:14 AM, John David Anglin wrote:
> As expected, your patch doesn't fix the PR.

Hmm, unfortunate. The reason why I proposed it is because it would
revert to the way this code worked before http://gcc.gnu.org/r104371

The idea was to make "force" false, and let the normal back-up plans
work after failure in emit_move_via_integer.


> The bug lies in using "emit_move_complex_parts" when we can't create
> pseudos.
> There are several places in the functions that it calls where "gen_reg_rtx"
> can be
> called (e.g., "store_bit_field_using_insv").  In debugging, I found that
> fixing these
> didn't help.  In the end, it fails to find a way move the complex parts.
>
> In gcc.c-torture/compile/pr55921.c, we have two register operands so try_int
> can be true.  "emit_move_via_integer" is successful in this case.  Your
> patch might be correct.
>
> I'm not sure that can_create_pseudo_p is that big a hammer as it appears
> emit_move_complex is mainly called prior to reload.  The proposed change
> only affects the code when we can't create pseudos.  Even then, we fall back
> to emit_move_complex_parts.
>
> As you say, some other check might be more appropriate to determine
> whether a call to gen_reg_rtx might be needed in emit_move_complex_parts.
> For the PA, it would be something like "GET_MODE_BITSIZE (mode) >
> BITS_PER_WORD".
> But, we still need to check can_create_pseudo_p as we probably still want to
> use emit_move_complex_parts before reload.

I haven't tried to see what goes on in the compiler, but it feels like
your.patch just fixes a symptom. Might be just reload behaving like
reload, but it just seems to me that the problem is not that you
cannot create new pseudos at the point of the ICE. The test on whether
the optab exists is obviously not enough, you somehow need to make
sure that the move doesn't require new registers. That's something I
would expect reload to check: Can the target make the move I'm asking
for without introducing new registers...

I realize I'm not being very helpful... just thinking out loud ;-)

Ciao!
Steven


Re: [PATCH]: Fix PR middle-end/56382 -- Only move MODE_COMPLEX_FLOAT by parts if we can create pseudos

2013-08-29 Thread John David Anglin

On 29-Aug-13, at 6:52 PM, Steven Bosscher wrote:


Can the target make the move I'm asking
for without introducing new registers...


Yes, it can execute extract and insert instructions using the same  
source and target register.

Of course, this clobbers the source.

Dave
--
John David Anglin   dave.ang...@bell.net





Re: [PATCH]: Fix PR middle-end/56382 -- Only move MODE_COMPLEX_FLOAT by parts if we can create pseudos

2013-08-29 Thread John David Anglin

On 29-Aug-13, at 6:52 PM, Steven Bosscher wrote:


I haven't tried to see what goes on in the compiler, but it feels like
your.patch just fixes a symptom. Might be just reload behaving like
reload, but it just seems to me that the problem is not that you
cannot create new pseudos at the point of the ICE. The test on whether
the optab exists is obviously not enough, you somehow need to make
sure that the move doesn't require new registers. That's something I
would expect reload to check: Can the target make the move I'm asking
for without introducing new registers...


I believe that reload does not handle complex moves.  The PA backend  
only
has reload patterns for integer and floating point moves, and the  
documentation

indicates that the backend doesn't have to provide complex support.

Maybe I'm missing something, but I don't think I have a way to trigger  
a reload
pattern for this case in the backend.  There are no reloads patterns  
for insert

and extract operations.  The middle is supposed to handle it.

Dave
--
John David Anglin   dave.ang...@bell.net





Eliminate vectorizer analysis side effects

2013-08-29 Thread Xinliang David Li
I was debugging a runtime failure of SPEC06 xalancbmk built with LIPO.
Using -fdisable- option pinpoints the problem in slp vectorize
pass on a particular function. dbgcnt support is added to to track
down the individual BB, but it  fails even when the dbg count is set
to 0.

It turns out that no BB was actually vectorized for that function, but
turning on/off slp-vectorize does make a difference in generated code
-- the only difference between the good and bad case is stack layout.
 The problem is  in the alignment analysis phase -- which
speculatively changes the base declaration's alignment regardless
whether the vectorization transformation will be performed or not
later.

The attached patch fixes the problem. Testing is ok. Ok for trunk?

thanks,

David
Index: ChangeLog
===
--- ChangeLog   (revision 202088)
+++ ChangeLog   (working copy)
@@ -1,5 +1,17 @@
 2013-08-29  Xinliang David Li  
 
+   * tree-vect-data-refs.c (vect_compute_data_ref_alignment):
+   Delay base decl alignment adjustment.
+   * tree-vectorizer.c (ensure_base_alignment): New function.
+   (vectorize_loops): Add dbg_cnt support. Perform alignment
+   adjustment.
+   (execute_vect_slp): Ditto.
+   * dbgcnt.def: New debug counter.
+   * tree-data-ref.h: New fields.
+   * Makefile: New dependency.
+
+2013-08-29  Xinliang David Li  
+
* loop-unroll.c (report_unroll_peel): Minor message
change.
* tree-vect-loop-manip.c (vect_do_peeling_for_alignment):
Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c   (revision 202088)
+++ tree-vect-data-refs.c   (working copy)
@@ -763,15 +763,10 @@ vect_compute_data_ref_alignment (struct
   dump_generic_expr (MSG_NOTE, TDF_SLIM, ref);
 }
 
-  DECL_ALIGN (base) = TYPE_ALIGN (vectype);
-  DECL_USER_ALIGN (base) = 1;
+  DR_BASE_DECL (dr) = base;
+  DR_BASE_MISALIGNED (dr) = true;
 }
 
-  /* At this point we assume that the base is aligned.  */
-  gcc_assert (base_aligned
- || (TREE_CODE (base) == VAR_DECL
- && DECL_ALIGN (base) >= TYPE_ALIGN (vectype)));
-
   /* If this is a backward running DR then first access in the larger
  vectype actually is N-1 elements before the address in the DR.
  Adjust misalign accordingly.  */
Index: dbgcnt.def
===
--- dbgcnt.def  (revision 202088)
+++ dbgcnt.def  (working copy)
@@ -172,6 +172,8 @@ DEBUG_COUNTER (pre_insn)
 DEBUG_COUNTER (treepre_insert)
 DEBUG_COUNTER (tree_sra)
 DEBUG_COUNTER (eipa_sra)
+DEBUG_COUNTER (vect_loop)
+DEBUG_COUNTER (vect_slp)
 DEBUG_COUNTER (sched2_func)
 DEBUG_COUNTER (sched_block)
 DEBUG_COUNTER (sched_func)
Index: tree-vectorizer.c
===
--- tree-vectorizer.c   (revision 202088)
+++ tree-vectorizer.c   (working copy)
@@ -68,6 +68,7 @@ along with GCC; see the file COPYING3.
 #include "tree-pass.h"
 #include "hash-table.h"
 #include "tree-ssa-propagate.h"
+#include "dbgcnt.h"
 
 /* Loop or bb location.  */
 LOC vect_location;
@@ -279,6 +280,37 @@ note_simd_array_uses (hash_table  datarefs;
+  struct data_reference *dr;
+  unsigned int i;
+
+ if (loop_vinfo)
+datarefs = LOOP_VINFO_DATAREFS (loop_vinfo);
+  else
+datarefs = BB_VINFO_DATAREFS (bb_vinfo);
+
+  FOR_EACH_VEC_ELT (datarefs, i, dr)
+{
+  tree base_decl = DR_BASE_DECL (dr);
+  if (base_decl && DR_BASE_MISALIGNED (dr))
+{
+  gimple stmt = DR_STMT (dr);
+  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  DECL_ALIGN (base_decl) = TYPE_ALIGN (vectype);
+  DECL_USER_ALIGN (base_decl) = 1;
+  DR_BASE_MISALIGNED (dr) = false;
+}
+}
+}
+
+
 /* Function vectorize_loops.
 
Entry point to loop vectorization phase.  */
@@ -331,10 +363,14 @@ vectorize_loops (void)
if (!loop_vinfo || !LOOP_VINFO_VECTORIZABLE_P (loop_vinfo))
  continue;
 
+if (!dbg_cnt (vect_loop))
+ break;
+
 if (LOCATION_LOCUS (vect_location) != UNKNOWN_LOC
&& dump_enabled_p ())
   dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
"loop vectorized\n");
+ensure_base_alignment (loop_vinfo, NULL);
vect_transform_loop (loop_vinfo);
num_vectorized_loops++;
/* Now that the loop has been vectorized, allow it to be unrolled
@@ -431,6 +467,7 @@ static unsigned int
 execute_vect_slp (void)
 {
   basic_block bb;
+  bb_vec_info bb_vinfo;
 
   init_stmt_vec_info_vec ();
 
@@ -438,8 +475,12 @@ execute_vect_slp (void)
 {
   vect_location = find_bb_location (bb);
 
-  if (vect_slp_analyze_bb (bb))
+  if ((bb_vinfo = vect_slp_analyze_bb (bb)))
 {
+ if (!d

Re: Fix PR middle-end/57370

2013-08-29 Thread Easwaran Raman
Richard,
Do you want me to commit everything but the  insert_stmt_after hunk
now? There are more similar failures reported in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57393 and I have attached
the updated patch there. Shall I send that for review? Apart from the
debug statement issue, almost all the bugs are due to dependence
violation because certain newly inserted statements do not have the
right UID. Instead of trying to catch all of them, will it be better
if I check if the stmt has a proper uid (non-zero if it is not the
first stmt) and assign a sensible value at the point where it is used
(find_insert_point and appears_later_in_bb) instead of where the stmt
is created? I think that would be less brittle.

- Easwaran



On Tue, Aug 27, 2013 at 3:35 AM, Richard Biener
 wrote:
> On Thu, Aug 1, 2013 at 1:34 AM, Easwaran Raman  wrote:
>> I have a new patch that supersedes this. The new patch also fixes PR
>> tree-optimization/57393 and PR tree-optimization/58011. Bootstraps and
>> no test regression on x86_64/linux. Ok for trunk?
>>
>> 2013-07-31  Easwaran Raman  
>>
>> PR middle-end/57370
>> * tree-ssa-reassoc.c (build_and_add_sum): Fix UID assignment and reset
>> of debug statements that cause inconsistent IR.
>
> Missing ChangeLog entry for the insert_stmt_after hunk which I do not like
> at all.  The other hunks are ok, but we need to work harder to preserve
> debug stmts - simply removing all is not going to fly.
>
> Richard.
>
>>
>> testsuite/ChangeLog:
>> 2013-07-31  Easwaran Raman  
>>
>> PR middle-end/57370
>> PR tree-optimization/57393
>> PR tree-optimization/58011
>> * gfortran.dg/reassoc_12.f90: New testcase.
>> * gcc.dg/tree-ssa/reassoc-31.c: New testcase.
>> * gcc.dg/tree-ssa/reassoc-31.c: New testcase.
>>
>>
>> On Fri, Jul 12, 2013 at 7:57 AM, Easwaran Raman  wrote:
>>> Ping.
>>>
>>> On Thu, Jun 27, 2013 at 10:15 AM, Easwaran Raman  wrote:
 A newly generated statement in build_and_add_sum  function of
 tree-ssa-reassoc.c has to be assigned the UID of its adjacent
 statement. In one instance, it was assigned the wrong uid (of an
 earlier phi statement) which messed up the IR and caused the test
 program to hang. Bootstraps and no test regressions on x86_64/linux.
 Ok for trunk?

 Thanks,
 Easwaran


 2013-06-27  Easwaran Raman  

 PR middle-end/57370
 * tree-ssa-reassoc.c (build_and_add_sum): Do not use the UID of a 
 phi
 node for a non-phi gimple statement.

 testsuite/ChangeLog:
 2013-06-27  Easwaran Raman  

 PR middle-end/57370
 * gfortran.dg/reassoc_12.f90: New testcase.


 Index: gcc/testsuite/gfortran.dg/reassoc_12.f90
 ===
 --- gcc/testsuite/gfortran.dg/reassoc_12.f90 (revision 0)
 +++ gcc/testsuite/gfortran.dg/reassoc_12.f90 (revision 0)
 @@ -0,0 +1,74 @@
 +! { dg-do compile }
 +! { dg-options "-O2 -ffast-math" }
 +! PR middle-end/57370
 +
 + SUBROUTINE xb88_lr_adiabatic_lda_calc(e_ndrho_ndrho_ndrho, &
 +   grad_deriv,npoints, sx)
 +IMPLICIT REAL*8 (t)
 +INTEGER, PARAMETER :: dp=8
 +REAL(kind=dp), DIMENSION(1:npoints) :: e_ndrho_ndrho_ndrho, &
 +   e_ndrho_ndrho_rho
 +  DO ii=1,npoints
 +  IF( grad_deriv >= 2 .OR. grad_deriv == -2 ) THEN
 +t1425 = t233 * t557
 +t1429 = beta * t225
 +t1622 = t327 * t1621
 +t1626 = t327 * t1625
 +t1632 = t327 * t1631
 +t1685 = t105 * t1684
 +t2057 = t1636 + t8 * (t2635 + t3288)
 +  END IF
 +  IF( grad_deriv >= 3 .OR. grad_deriv == -3 ) THEN
 +t5469 = t5440 - t5443 - t5446 - t5449 - &
 +t5451 - t5454 - t5456 + t5459  - &
 +t5462 + t5466 - t5468
 +t5478 = 0.240e2_dp * t1616 * t973 * t645 * t1425
 +t5489 = 0.16e2_dp * t1429 * t1658
 +t5531 = 0.160e2_dp * t112 * t1626
 +t5533 = 0.160e2_dp * t112 * t1632
 +t5537 = 0.160e2_dp * t112 * t1622
 +t5541 = t5472 - t5478 - t5523 + t5525 + &
 +t5531 + t5533 + t5535 + t5537 + &
 +t5540
 +t5565 = t112 * t1685
 +t5575 = t5545 - t5548 + t5551 + t5553 - &
 +t5558 + t5560 - t5562 + t5564 - &
 +0.80e1_dp * t5565 + t5568 + t5572 + &
 +t5574
 +t5611 = t5579 - t5585 + t5590 - t5595 + &
 +t5597 - t5602 + t5604 + t5607 + &
 +t5610
 +t5613 = t5469 + t5541 + t5575 + t5611
 +t6223 = t

Re: [Google] Refine hot caller heuristic

2013-08-29 Thread Xinliang David Li
Ok.  Do consider generalize it with 1) more inline hints (how the
parameters are used in callee); and 2) more parameter type (such as
addr_k, non_null_k, range_k etc you have proposed before) in the
future.

thanks,

David

On Thu, Aug 29, 2013 at 11:24 AM, Easwaran Raman  wrote:
> On Tue, Aug 20, 2013 at 9:35 PM, Xinliang David Li  wrote:
>> Do you need to guard the jump function access with check if
>> (ipa_node_params_vector.exists ())?
> I believe it is not necessary since, for example, ipa_analyze_node
> calls ipa_check_create_node_params that calls create. But I see it is
> used in ipa-inline-analysis.c everywhere. So I have added a check and
> conservatively return false.
>
>>
>> Ideally, useful_cold_callee should be folded into the inline hints
>> estimation.  Question about the heuristic: why filtering out
>> PASS_THROUGH parameter cases completely? Passing 'this' parameter in
>> many cases can result in good PRE opportunities.  Why not skip the
>> unknown type?
>
> The rationale is it is useful to inline bar into foo in the snippet below:
>
> void foo ()
> {
>   A a;
>   bar(&a);
>   ...
> }
>
> Capturing this requires UNKNOWN and KNOWN_TYPE jump functions. I have
> changed the check accordingly. I have attached the new patch.
>
> - Easwaran
>
>> David
>>
>> On Tue, Aug 20, 2013 at 12:26 PM, Easwaran Raman  wrote:
>>> The current hot caller heuristic simply promotes edges whose caller is
>>> hot. This patch does the following:
>>> * Turn it off for applications with large footprint since the size
>>> increase hurts them
>>> * Be more selective by considering arguments to callee when the
>>> heuristic is enabled.
>>>
>>> This performs well on internal benchmarks. Ok for google/4_8 branch if
>>> all tests pass?
>>>
>>> - Easwaran


Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Andi Kleen
On Thu, Aug 29, 2013 at 03:58:45PM +0200, Jan Hubicka wrote:
> > > Said that, I now have the fork() patch in all my trees and enjoy 50% 
> > > faster
> > > WPA times.  I changed my mind about claim that stremaing should be disk 
> > > bound -
> > > it is hard to hope for disk boundness for something that should fit in 
> > > cache.
> > 
> > It should at least limit its fork rate according to -flto=N or jobserver.
> It limits forks to -flto=N.
> If the patch seems resonable, I will look into posiblity of adding my 
> jobserver client
> based on GNU make code.
> 
> I also think with -flto we want wrapper to figure out number of threads and 
> suppy
> default =N (i.e. nonparallel lto would be -flto=0).  Most people don't want 
> to worry
> about =n/=jobserv parameters and those few projects that don't want to start 
> too many
> processes to not explode in memory use can get their build machinery right.
> 

Job server should do that already. You get whatever the user specifies with -j
on the top level make. That's imho the right area to control this.

The only problem is we need to work around the jobserver pipe bug first
I suspect this may need a change in make :-/

-Andi


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-08-29 Thread Teresa Johnson
On Wed, Aug 28, 2013 at 11:20 AM, Teresa Johnson  wrote:
> On Wed, Aug 28, 2013 at 9:58 AM, Jan Hubicka  wrote:
>> Hi,
>> with Martin we did bit of progress on analyzing the problems.  We now have
>> COMDAT profile merging for FDO
>
>  Great! Is this the LTO merging you were talking about in an earlier
> message, or the gcov runtime fix (that would presumably not be
> lto-specific)?
>
>> and we also noticed that forks can make your
>> basic blocks appear never executed even though they are executed every run:
>> the fork is accounted as 3 independnet runs of the program.  First run
>> is until fork, the second 2 runs are both variant.
>>
>> I have patch to track this.  Moreover vforks seems to produce repeated
>> merging of results.
>
> Aha, does this explain the gimp issue as well?
>
>>
>> These two factors seems to help to riddle enought the firefox profiles
>> so it took us weeks to understand what happens.
>>> + if (changed)
>>> +{
>>> +  /* Edge forwarding in particular can cause hot blocks 
>>> previously
>>> + reached by both hot and cold blocks to become dominated 
>>> only
>>> + by cold blocks. This will cause the verification
>>> below to fail,
>>> + and lead to now cold code in the hot section. This is not 
>>> easy
>>> + to detect and fix during edge forwarding, and in some 
>>> cases
>>> + is only visible after newly unreachable blocks are 
>>> deleted,
>>> + which will be done in fixup_partitions.  */
>>> +  fixup_partitions ();
>>
>> Is it really necessary to run this from internal loop of the cfgcleanup?
>
> The reason I added it here is that just below there is a call to
> verify_flow_info, and that will fail with the new verification.
>
>> It seems
>> you will play back and forth game where edge forwarding will remove your 
>> fallthru
>> and you will be re-adding it?
>
> fixup_partitions will not add new fall through edges. (It may invoke
> force_nonfallthru to do the opposite.) So there shouldn't be any
> ping-ponging effect.
>
>>
>> I would wait for cfgcleanup to finish its job (I don't really think it needs 
>> to be
>> iterative) and then do the fixup possibly cleaning up when the blocks was 
>> repoisitoined
>> (I suppose it is only case where the code above introduces new cfgcleanup 
>> oppurtunities).
>
> As noted above, I can't do this due to the call to verify_flow_info
> for each iteration. One option is to move both down outside the loop.
>
>>
>>> +  /* Walk the preds/succs and check if there is at least one already
>>> + marked hot. Keep track of the most frequent pred/succ so that we
>>> + can mark it hot if we don't find one.  */
>>> +  FOR_EACH_EDGE (e, ei, edges)
>>> +{
>>> +  basic_block reach_bb = walk_up ? e->src : e->dest;
>>> +
>>> +  if (e->flags & EDGE_DFS_BACK)
>>> +continue;
>>> +
>>> +  if (BB_PARTITION (reach_bb) != BB_COLD_PARTITION)
>>> +  {
>>> +found = true;
>>> +break;
>>> +  }
>>> +  if (e->probability > highest_probability)
>>> +highest_probability = e->probability;
>>
>> When doing predecestor walk if you have two predecestors, one executing 
>> 10
>> times and getting to the block with probability 1%, you want to chose it over
>> block executing once and getting to you with probability 100%.
>>
>> You probably want to look for most likely predecestor here.  You need to look
>> for highest e->count and if they are all 0 for highest EDGE_FREQUENCY and for
>> maximal probability only if all fails?
>
> Yes, thanks, let me do that.

New patch that addresses this is included below.

Thanks,
Teresa

>
>>
>>> +}
>>> +
>>> +  /* If bb is reached by (or reaches, in the case of !WALK_UP) another 
>>> hot
>>> + block (or unpartitioned, e.g. the entry block) then it is ok. If 
>>> not,
>>> + then the most frequent pred (or succ) needs to be adjusted.  In 
>>> the
>>> + case where multiple preds/succs have the same probability (e.g. a
>>> + 50-50 branch), then both will be adjusted.  */
>>> +  if (found)
>>> +continue;
>>> +
>>> +  FOR_EACH_EDGE (e, ei, edges)
>>> +{
>>> +  if (e->flags & EDGE_DFS_BACK)
>>> +continue;
>>> +  if (e->probability < highest_probability)
>>> +continue;
>>
>> Again for predecestor walk you need to wind down the bit crazy logic 
>> described above.
>>> Index: predict.c
>>> ===
>>> --- predict.c   (revision 202021)
>>> +++ predict.c   (working copy)
>>> @@ -241,6 +241,22 @@ probably_never_executed_bb_p (struct function *fun
>>>return false;
>>>  }
>>>
>>> +
>>> +/* Return true in case edge E is probably never executed.  */
>>> +
>>> +bool
>>> +probably_never_executed_edge_p (struct function *fun, edge