RE: [PATCH]Construct canonical scaled address expression in IVOPT

2013-09-24 Thread bin.cheng


 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: Monday, September 23, 2013 8:08 PM
 To: Bin Cheng
 Cc: GCC Patches
 Subject: Re: [PATCH]Construct canonical scaled address expression in IVOPT
 
 On Fri, Sep 20, 2013 at 12:00 PM, bin.cheng bin.ch...@arm.com wrote:
  Hi,
  For now IVOPT constructs scaled address expression in the form of
  scaled*index and checks whether backend supports it. The problem is
  the address expression is invalid on ARM, causing scaled expression
  disabled in IVOPT on ARM.  This patch fixes the IVOPT part by
  constructing rtl address expression like index*scaled+base.
 
 -  addr = gen_rtx_fmt_ee (MULT, address_mode, reg1, NULL_RTX);
 +  /* Construct address expression in the canonical form of
 +     base + index * scale and call memory_address_addr_space_p to see
 +     whether it is allowed by target processors.  */
 +  index = gen_rtx_fmt_ee (MULT, address_mode, reg1, NULL_RTX);
    for (i = -MAX_RATIO; i <= MAX_RATIO; i++)
      {
 -      XEXP (addr, 1) = gen_int_mode (i, address_mode);
 +      if (i == 1)
 +        continue;
 +
 +      XEXP (index, 1) = gen_int_mode (i, address_mode);
 +      addr = gen_rtx_fmt_ee (PLUS, address_mode, index, reg1);
        if (memory_address_addr_space_p (mode, addr, as))
          bitmap_set_bit (valid_mult, i + MAX_RATIO);
 
 so you now build reg1*scale+reg1 - which checks if offset and scale work
 at the same time (and with the same value - given this is really
 reg1*(scale+1)).  It might be that this was implicitly assumed (or that no
 targets allow scale but no offset), but it's a change that requires an
 audit of behavior on more targets.
So literally, the function multiplier_allowed_in_address_p should check
whether the multiplier is allowed in any kind of addressing mode, like
[reg*scale + offset] and [reg*scale + reg].  Apparently it's infeasible to
check every possibility for each architecture, so is it OK if we at least
check the index form, and fall back to checking addr if the index check
fails?  By any kind of addressing mode, I mean the modes supported by the
function get_address_cost, i.e., of the form [base] + [off] + [var] +
(reg|reg*scale).
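
A small standalone sketch of the address shapes in question (illustrative
only, not from the patch; whether the scaled form is actually selected
depends entirely on the target and on multiplier_allowed_in_address_p):

/* A loop whose array access naturally maps to a scaled address of the
   form base + i * 4.  On ARM the scaled index is only valid together
   with a base register, which is what this thread is about.  */
int
sum_ints (int *base, int n)
{
  int s = 0;
  for (int i = 0; i < n; i++)
    s += base[i];    /* address: base + i * sizeof (int) */
  return s;
}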

 
 The above also builds more RTX waste which you can fix by re-using the
 PLUS by building it up-front similar to the multiplication.
Yes, this can be fixed.

 You also miss the opportunity to have scale == 1 denote as to whether
 reg1 + reg2 is valid.  I would expect that many targets support
 reg1 * scale + constant-offset but not many reg1 * scale + reg2.
I thought scale == 1 is unnecessary because the addressing mode degrades
into reg or reg+reg.  Moreover, the calls of multiplier_allowed_in_address_p
in both get_address_cost and get_computation_cost_at have a scale other
than 1.

 
 So no, the helper now checks sth completely different.  What's the problem
 with arm supporting reg1 * scale?  Why shouldn't it be able to handle the
 implicit zero offset?

As Richard clarified, ARM does not support a scaled addressing mode without
a base register.

Thanks.
bin





Re: [PATCH] Bug fix: *var and MEM[(const int *)var] (var has int* type) are not treated as the same data ref.

2013-09-24 Thread Jakub Jelinek
Hi!

On Mon, Sep 23, 2013 at 05:26:13PM -0700, Cong Hou wrote:

Missing ChangeLog entry.

 --- gcc/testsuite/gcc.dg/alias-14.c (revision 0)
 +++ gcc/testsuite/gcc.dg/alias-14.c (revision 0)

Vectorizer tests should go into gcc.dg/vect/ instead, or, if they are
for a single target (but there is no reason why this should be a single
target), into gcc.target/<target>/.

 --- gcc/fold-const.c (revision 202662)
 +++ gcc/fold-const.c (working copy)
 @@ -2693,8 +2693,9 @@ operand_equal_p (const_tree arg0, const_
               && operand_equal_p (TYPE_SIZE (TREE_TYPE (arg0)),
                                   TYPE_SIZE (TREE_TYPE (arg1)), flags)))
           && types_compatible_p (TREE_TYPE (arg0), TREE_TYPE (arg1))
 -         && (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg0, 1)))
 -             == TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg1, 1))))
 +         && types_compatible_p (
 +              TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg0, 1))),
 +              TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg1, 1))))
           && OP_SAME (0) && OP_SAME (1));

This looks wrong.  types_compatible_p will happily return true, say, for
unsigned long and unsigned long long types on x86_64, because they are
both integral types with the same precision, but the second argument of
MEM_REF contains aliasing information, where the distinction between the
two is important.
So, while an == comparison of the main variants is too strict,
types_compatible_p is too weak, so I guess you need to write a new
predicate that will either handle the == plus a few special cases that are
safe to handle, or look at what exactly we use the type of the second
MEM_REF argument for and check those properties.  We certainly need
get_deref_alias_set_1 and get_deref_alias_set to return the same values
for both types, but whether that is the only information we are using,
I'm not sure; CCing Richard.
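
A standalone sketch of that distinction (illustrative only, not from the
patch; the exact folding is an assumption about what -O2 -fstrict-aliasing
may do):

/* On LP64 x86_64, unsigned long and unsigned long long have the same
   size and precision, so types_compatible_p treats them as compatible,
   yet TBAA keeps them in distinct alias sets.  Under strict aliasing
   the store through q need not be seen by the re-read of *p.  */
unsigned long
f (unsigned long *p, unsigned long long *q)
{
  *p = 1;
  *q = 2;        /* distinct alias set from *p */
  return *p;     /* may be folded to 1 */
}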

Jakub


Re: [C++1y] [PATCH 3/4] ... canonical type workaround and refactoring

2013-09-24 Thread Adam Butcher

On 23.09.2013 08:15, Adam Butcher wrote:

On 22.09.2013 18:57, Adam Butcher wrote:

The following solves the canonical type issue in the general case
(pointers and refs) and makes it equivalent to the explicit template
case -- in terms of canonical type at least.

   for (tree t = TREE_TYPE (TREE_VALUE (p)); t; t = TREE_CHAIN (t))
     TYPE_CANONICAL (t) = t;


The above is insufficient; the traversal doesn't handle function
pointers.  Currently, to get my local testcases passing, I have the
following workaround that abuses find_type_usage.  I intend for this
to be replaced with a better solution but it will at least mean that
people can start experimenting with this feature now (as they appear
to be doing http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58500).  This
supports that testcase also.



Shall I push the patch below to trunk as an intermediate workaround 
whilst I get to refactoring to support on-the-fly template parm 
synthesis?


On the subject of on-the-fly synthesis: I haven't started yet but I'm 
thinking to trigger in 'cp_parser_simple_type_specifier' when 
'current_binding_level->kind == sk_function_parms'.  I can foresee a 
potential issue with packs in that, upon reading the 'auto' (potentially 
nested in some other type), a plain template parm will be synthesized; 
but it may need to be a pack parm type if '...' is found later.


My initial testcase is:

  template <typename T> struct X { T m(int, float); };

  auto f(X<auto>, auto (X<auto>::*) (auto...))
  {
     char* s = "warn";
  }

  int main()
  {
     X<char> x;
     f(x, &X<void>::m);
  }

where f should be translated as similar to:

  template <typename A1, typename A2, typename A3,
            typename... A4>
  auto f(X<A1>, A2 (X<A3>::*) (A4...))
  {
     char* s = "warn";
  }


In the case of something like:

  auto f(X<auto>...)

the translation would need to be:

  template <typename... A>
  auto f(X<A>...)


I'm thinking that getting that to happen correctly might be tricky (but 
haven't tried yet).  The 'auto' would trigger plain template parameter 
synthesis.  Perhaps a 'could_be_parameter_pack_p' on the 
template_type_parm?  Though I don't know how the following could be 
handled as implicit


  template <typename T, typename... A>
  auto f(X<T>, A...)

It would not be possible to infer which of the template parms to make 
the pack.


  auto f(X<auto>, auto...)

Probably multiple generic-type pack expansions should be forbidden.

I'll see what happens when I get there but any guidance/thoughts you 
have on the subject will be valuable.


Cheers,
Adam



Workaround implicit function template parameter canonical type 
issue.


* parser.c (add_implicit_template_parms): Workaround to fix up
canonical type references left over from before substitution of
'auto'.  Better solution needed but this makes test cases functional.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index f3133f3..4171476 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -28987,6 +28987,25 @@ add_implicit_template_parms (cp_parser *parser, size_t expect_count,
   if (!generic_type_ptr)
continue;

+  /* Make the canonical type of each part of the parameter distinct.
+     FIXME: A better solution is needed for this.  Not least the abuse of
+     find_type_usage.  Need foreach_type or similar for proper mutable
+     access.  If something like this does turn out to be necessary then the
+     find_type_usage loop above can be replaced by a foreach_type that fixes
+     up the canonical types on the way to finding the 'auto'.  */
+
+  struct helper { static bool fixup_canonical_types (tree t) {
+ t = TREE_TYPE (t);
+ if (!t)
+   return false;
+ if (is_auto_or_concept (t))
+   return true;
+ TYPE_CANONICAL (t) = t;
+ return false;
+  }};
+  find_type_usage (TREE_VALUE (p), (bool (*)(const_tree))
+  helper::fixup_canonical_types);
+
   ++synth_count;

   tree synth_id = make_generic_type_name ();


Re: [PATCH] Bug fix: *var and MEM[(const int *)var] (var has int* type) are not treated as the same data ref.

2013-09-24 Thread Richard Biener
On Tue, 24 Sep 2013, Jakub Jelinek wrote:

 Hi!
 
 On Mon, Sep 23, 2013 at 05:26:13PM -0700, Cong Hou wrote:
 
 Missing ChangeLog entry.
 
  --- gcc/testsuite/gcc.dg/alias-14.c (revision 0)
  +++ gcc/testsuite/gcc.dg/alias-14.c (revision 0)
 
 Vectorizer tests should go into gcc.dg/vect/ instead, or, if they are
 for a single target (but there is no reason why this should be a single
 target), into gcc.target/target/.
 
  --- gcc/fold-const.c (revision 202662)
  +++ gcc/fold-const.c (working copy)
  @@ -2693,8 +2693,9 @@ operand_equal_p (const_tree arg0, const_
                && operand_equal_p (TYPE_SIZE (TREE_TYPE (arg0)),
                                    TYPE_SIZE (TREE_TYPE (arg1)), flags)))
            && types_compatible_p (TREE_TYPE (arg0), TREE_TYPE (arg1))
  -         && (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg0, 1)))
  -             == TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg1, 1))))
  +         && types_compatible_p (
  +              TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg0, 1))),
  +              TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg1, 1))))
            && OP_SAME (0) && OP_SAME (1));
 
 This looks wrong.  types_compatible_p will happily return true say
 for unsigned long and unsigned long long types on x86_64, because
 they are both integral types with the same precision, but the second
 argument of MEM_REF contains aliasing information, where the distinction
 between the two is important.
 So, while == comparison of main variant is too strict, types_compatible_p
 is too weak, so I guess you need to write a new predicate that will either
 handle the == and a few special cases that are safe to be handled, or
 look for what exactly we use the type of the second MEM_REF argument
 and check those properties.  We certainly need that
 get_deref_alias_set_1 and get_deref_alias_set return the same values
 for both the types, but whether that is the only information we are using,
 not sure, CCing Richard.

Using TYPE_MAIN_VARIANT is exactly correct - this is the best we
can do that will work with all frontends.  TYPE_MAIN_VARIANT
guarantees that the alias-sets stay the same:

  /* If the innermost reference is a MEM_REF that has a
     conversion embedded treat it like a VIEW_CONVERT_EXPR above,
     using the memory access type for determining the alias-set.  */
  if (TREE_CODE (inner) == MEM_REF
      && TYPE_MAIN_VARIANT (TREE_TYPE (inner))
         != TYPE_MAIN_VARIANT
              (TREE_TYPE (TREE_TYPE (TREE_OPERAND (inner, 1)))))
    return get_deref_alias_set (TREE_OPERAND (inner, 1));

so we cannot change the compatibility checks without touching the
alias-set deriving code.  For the testcase in question we have
MEM[(const int *)_7] vs. MEM[(int *)_7] and unfortunately pointer
and reference types are not variant types.

We also cannot easily resort to pointed-to types as our all-beloved
ref-all qualification is on the pointer type rather than on the
pointed-to type.

But yes, we could implement a more complicated predicate

bool
alias_ptr_types_compatible_p (const_tree t1, const_tree t2)
{
  t1 = TYPE_MAIN_VARIANT (t1);
  t2 = TYPE_MAIN_VARIANT (t2);
  if (t1 == t2)
return true;

  if (TYPE_REF_CAN_ALIAS_ALL (t1)
  || TYPE_REF_CAN_ALIAS_ALL (t2))
return false;

  return (TYPE_MAIN_VARIANT (TREE_TYPE (t1))
  == TYPE_MAIN_VARIANT (TREE_TYPE (t2)));
}

Note that the fold-const code in question is

  return ((TYPE_SIZE (TREE_TYPE (arg0)) == TYPE_SIZE (TREE_TYPE (arg1))
           || (TYPE_SIZE (TREE_TYPE (arg0))
               && TYPE_SIZE (TREE_TYPE (arg1))
               && operand_equal_p (TYPE_SIZE (TREE_TYPE (arg0)),
                                   TYPE_SIZE (TREE_TYPE (arg1)), flags)))
          && types_compatible_p (TREE_TYPE (arg0), TREE_TYPE (arg1))
          && (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg0, 1)))
              == TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg1, 1))))
          && OP_SAME (0) && OP_SAME (1));

which you may notice uses types_compatible_p on the reference type
which is at least suspicious as that can let through reference trees
that will end up using different alias sets due to the stricter
check in get_alias_set.

So in addition to alias_ptr_types_compatible_p we may want to have

bool
reference_alias_ptr_types_compatible_p (const_tree t1, const_tree t2)
{
...
}

abstracting this away for the actual reference trees (also noting
that reference_alias_ptr_type isn't 1:1 following what get_alias_set
does either).

I will give this a try.

Richard.


[PATCH, ARM] Fix assembly scan test.

2013-09-24 Thread Yvan Roux
Hi,

this patch fixes the scan-assembler pattern of
gcc.target/arm/atomic-comp-swap-release-acquire.c, which didn't allow
register aliases and failed when LRA is enabled, because 'ip' is then
used in the ldaex instruction.

Thanks,
Yvan

2013-09-24  Yvan Roux  yvan.r...@linaro.org

* gcc.target/arm/atomic-comp-swap-release-acquire.c: Correct
scan-assembler pattern to support register aliases.


fix-asm-scan.diff
Description: Binary data


Re: [PATCH, RTL] Prepare ARM build with LRA

2013-09-24 Thread Yvan Roux
Ping

On 16 September 2013 10:57, Yvan Roux yvan.r...@linaro.org wrote:
 Adding Eric and Steven in the loop as it is RTL related.

 Thanks
 Yvan

 On 11 September 2013 21:08, Yvan Roux yvan.r...@linaro.org wrote:
 Here is the new patch discussed in the other thread.

 Thanks
 Yvan

 2013-09-11  Yvan Roux  yvan.r...@linaro.org
 Vladimir Makarov  vmaka...@redhat.com

 * rtlanal.c (lsb_bitfield_op_p): New predicate for bitfield 
 operations
 from the least significant bit.
 (strip_address_mutations): Add bitfield operations handling.
 (shift_code_p): New predicate for shifting operations.
 (must_be_index_p): Add shifting operations handling.
 (set_address_index): Likewise.


Re: [PATCH, ARM] Fix assembly scan test.

2013-09-24 Thread Ramana Radhakrishnan

On 09/24/13 09:27, Yvan Roux wrote:

Hi,

this patch fixes the scan-assembler pattern of
gcc.target/arm/atomic-comp-swap-release-acquire.c, which didn't allow
register aliases and failed when LRA is enabled, because 'ip' is then
used in the ldaex instruction.



Ok -

The changelog could just read : Adjust expected output.

or some other words to that effect.

Ramana


Thanks,
Yvan

2013-09-24  Yvan Roux  yvan.r...@linaro.org

 * gcc.target/arm/atomic-comp-swap-release-acquire.c: Correct
 	scan-assembler pattern to support register aliases.






RE: [PR58463][Backport to 4.8] Adjust dumping for ref nodes

2013-09-24 Thread Paulo Matos
Ping on this patch.

Note I don't have write access.

Paulo Matos


 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: 23 September 2013 09:00
 To: Paulo Matos
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [PR58463][Backport to 4.8] Adjust dumping for ref nodes
 
 On Fri, Sep 20, 2013 at 5:39 PM, Paulo Matos pma...@broadcom.com wrote:
 
  Can we please backport this to 4.8 (it will fix PR58463)?
  I attach Richard's patch to master, let me know if instead I should create
 one specific for 4.8.1. I tested this against 4.8 branch and everything looks
 fine.
 
 Ok.
 
 Thanks,
 Richard.
 
 
  2013-03-27  Richard Biener  rguent...@suse.de
 
  PR tree-optimization/56716
  * tree-ssa-structalias.c (perform_var_substitution): Adjust
  dumping for ref nodes.
 
  Modified:
  trunk/gcc/ChangeLog
  trunk/gcc/tree-ssa-structalias.c
 
  Paulo Matos
 
 



Re: [PATCH] Bug fix: *var and MEM[(const int *)var] (var has int* type) are not treated as the same data ref.

2013-09-24 Thread Richard Biener
On Tue, 24 Sep 2013, Richard Biener wrote:

 On Tue, 24 Sep 2013, Jakub Jelinek wrote:
 
  Hi!
  
  On Mon, Sep 23, 2013 at 05:26:13PM -0700, Cong Hou wrote:
  
  Missing ChangeLog entry.
  
   --- gcc/testsuite/gcc.dg/alias-14.c (revision 0)
   +++ gcc/testsuite/gcc.dg/alias-14.c (revision 0)
  
  Vectorizer tests should go into gcc.dg/vect/ instead, or, if they are
  for a single target (but there is no reason why this should be a single
  target), into gcc.target/target/.
  
   --- gcc/fold-const.c (revision 202662)
   +++ gcc/fold-const.c (working copy)
   @@ -2693,8 +2693,9 @@ operand_equal_p (const_tree arg0, const_
                 && operand_equal_p (TYPE_SIZE (TREE_TYPE (arg0)),
                                     TYPE_SIZE (TREE_TYPE (arg1)), flags)))
             && types_compatible_p (TREE_TYPE (arg0), TREE_TYPE (arg1))
   -         && (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg0, 1)))
   -             == TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg1, 1))))
   +         && types_compatible_p (
   +              TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg0, 1))),
   +              TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg1, 1))))
             && OP_SAME (0) && OP_SAME (1));
  
  This looks wrong.  types_compatible_p will happily return true say
  for unsigned long and unsigned long long types on x86_64, because
  they are both integral types with the same precision, but the second
  argument of MEM_REF contains aliasing information, where the distinction
  between the two is important.
  So, while == comparison of main variant is too strict, types_compatible_p
  is too weak, so I guess you need to write a new predicate that will either
  handle the == and a few special cases that are safe to be handled, or
  look for what exactly we use the type of the second MEM_REF argument
  and check those properties.  We certainly need that
  get_deref_alias_set_1 and get_deref_alias_set return the same values
  for both the types, but whether that is the only information we are using,
  not sure, CCing Richard.
 
 Using TYPE_MAIN_VARIANT is exactly correct - this is the best we
 can do that will work with all frontends.  TYPE_MAIN_VARIANT
 guarantees that the alias-sets stay the same:
 
    /* If the innermost reference is a MEM_REF that has a
       conversion embedded treat it like a VIEW_CONVERT_EXPR above,
       using the memory access type for determining the alias-set.  */
    if (TREE_CODE (inner) == MEM_REF
        && TYPE_MAIN_VARIANT (TREE_TYPE (inner))
           != TYPE_MAIN_VARIANT
                (TREE_TYPE (TREE_TYPE (TREE_OPERAND (inner, 1)))))
      return get_deref_alias_set (TREE_OPERAND (inner, 1));
 
 so we cannot change the compatibility checks without touching the
 alias-set deriving code.  For the testcase in question we have
  MEM[(const int *)_7] vs. MEM[(int *)_7] and unfortunately pointer
 and reference types are not variant types.
 
 We also cannot easily resort to pointed-to types as our all-beloved
 ref-all qualification is on the pointer type rather than on the
 pointed-to type.
 
 But yes, we could implement a more complicated predicate
 
 bool
 alias_ptr_types_compatible_p (const_tree t1, const_tree t2)
 {
   t1 = TYPE_MAIN_VARIANT (t1);
   t2 = TYPE_MAIN_VARIANT (t2);
   if (t1 == t2)
 return true;
 
   if (TYPE_REF_CAN_ALIAS_ALL (t1)
   || TYPE_REF_CAN_ALIAS_ALL (t2))
 return false;
 
   return (TYPE_MAIN_VARIANT (TREE_TYPE (t1))
   == TYPE_MAIN_VARIANT (TREE_TYPE (t2)));
 }
 
 Note that the fold-const code in question is
 
    return ((TYPE_SIZE (TREE_TYPE (arg0)) == TYPE_SIZE (TREE_TYPE (arg1))
             || (TYPE_SIZE (TREE_TYPE (arg0))
                 && TYPE_SIZE (TREE_TYPE (arg1))
                 && operand_equal_p (TYPE_SIZE (TREE_TYPE (arg0)),
                                     TYPE_SIZE (TREE_TYPE (arg1)), flags)))
            && types_compatible_p (TREE_TYPE (arg0), TREE_TYPE (arg1))
            && (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg0, 1)))
                == TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg1, 1))))
            && OP_SAME (0) && OP_SAME (1));
 
 which you may notice uses types_compatible_p on the reference type
 which is at least suspicious as that can let through reference trees
 that will end up using different alias sets due to the stricter
 check in get_alias_set.
 
 So in addition to alias_ptr_types_compatible_p we may want to have
 
 bool
 reference_alias_ptr_types_compatible_p (const_tree t1, const_tree t2)
 {
 ...
 }
 
 abstracting this away for the actual reference trees (also noting
 that reference_alias_ptr_type isn't 1:1 following what get_alias_set
 does either).
 
 I will give this a try.

The following is an untested (well, the testcase from PR58513 is now
vectorized) patch doing that refactoring.

Richard.

Index: gcc/tree.c
===
*** gcc/tree.c  (revision 202859)
--- gcc/tree.c  (working copy)
*** mem_ref_offset 

Re: [gomp4] Dumping gimple for offload.

2013-09-24 Thread Richard Biener
On Mon, Sep 23, 2013 at 3:29 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
 Hi,

 I've rebased my patch.
 Is it ok for gomp4

Passing through is_omp looks bad - please find a more suitable place
to attach this meta-information.  From a quick look you only need it to
produce an alternate section name, thus consider assigning the section
name in a different place.

Richard.


 2013/9/13 Ilya Tocar tocarip.in...@gmail.com:
 Hi,

 I'm working on dumping gimple for omp pragma target stuff into
 gnu.target_lto_ sections.
 I've tried to reuse current lto infrastructure as much as possible.

 Could you please take a look at attached patch?


 ---
  gcc/ipa-inline-analysis.c |   2 +-
  gcc/ipa-profile.c |   2 +-
  gcc/ipa-prop.c|   4 +-
  gcc/ipa-pure-const.c  |   2 +-
  gcc/ipa-reference.c   |   2 +-
  gcc/lto-cgraph.c  |  22 +++--
  gcc/lto-opts.c|   2 +-
  gcc/lto-section-out.c |  14 ++-
  gcc/lto-streamer-out.c| 215 
 +++---
  gcc/lto-streamer.c|   6 +-
  gcc/lto-streamer.h|  13 +--
  gcc/lto/lto.c |   2 +-
  gcc/passes.c  |   3 +-
  gcc/passes.def|   2 +
  gcc/timevar.def   |   2 +
  gcc/tree-pass.h   |   2 +
  16 files changed, 237 insertions(+), 58 deletions(-)

 diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
 index ba6221e..ea3fc90 100644
 --- a/gcc/ipa-inline-analysis.c
 +++ b/gcc/ipa-inline-analysis.c
 @@ -4023,7 +4023,7 @@ inline_write_summary (void)
 }
  }
    streamer_write_char_stream (ob->main_stream, 0);
 -  produce_asm (ob, NULL);
 +  produce_asm (ob, NULL, false);
destroy_output_block (ob);

if (optimize  !flag_ipa_cp)
 diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
 index 424e4a6..b16ba6c 100644
 --- a/gcc/ipa-profile.c
 +++ b/gcc/ipa-profile.c
 @@ -247,7 +247,7 @@ ipa_profile_write_summary (void)
 streamer_write_uhwi_stream (ob->main_stream, histogram[i]->time);
 streamer_write_uhwi_stream (ob->main_stream, histogram[i]->size);
  }
 -  lto_destroy_simple_output_block (ob);
 +  lto_destroy_simple_output_block (ob, false);
  }

  /* Deserialize the ipa info for lto.  */
 diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
 index c09ec2f..69603c9 100644
 --- a/gcc/ipa-prop.c
 +++ b/gcc/ipa-prop.c
 @@ -4234,7 +4234,7 @@ ipa_prop_write_jump_functions (void)
  ipa_write_node_info (ob, node);
  }
    streamer_write_char_stream (ob->main_stream, 0);
 -  produce_asm (ob, NULL);
 +  produce_asm (ob, NULL, false);
destroy_output_block (ob);
  }

 @@ -4409,7 +4409,7 @@ ipa_prop_write_all_agg_replacement (void)
 write_agg_replacement_chain (ob, node);
  }
    streamer_write_char_stream (ob->main_stream, 0);
 -  produce_asm (ob, NULL);
 +  produce_asm (ob, NULL, false);
destroy_output_block (ob);
  }

 diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
 index 55b679d..d6bbd52 100644
 --- a/gcc/ipa-pure-const.c
 +++ b/gcc/ipa-pure-const.c
 @@ -988,7 +988,7 @@ pure_const_write_summary (void)
 }
  }

 -  lto_destroy_simple_output_block (ob);
 +  lto_destroy_simple_output_block (ob, false);
  }


 diff --git a/gcc/ipa-reference.c b/gcc/ipa-reference.c
 index e6f19fd..0593c77 100644
 --- a/gcc/ipa-reference.c
 +++ b/gcc/ipa-reference.c
 @@ -1022,7 +1022,7 @@ ipa_reference_write_optimization_summary (void)
   }
}
BITMAP_FREE (ltrans_statics);
 -  lto_destroy_simple_output_block (ob);
 +  lto_destroy_simple_output_block (ob, false);
splay_tree_delete (reference_vars_to_consider);
  }

 diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
 index 952588d..831e74d 100644
 --- a/gcc/lto-cgraph.c
 +++ b/gcc/lto-cgraph.c
 @@ -690,7 +690,7 @@ output_outgoing_cgraph_edges (struct cgraph_edge *edge,
  /* Output the part of the cgraph in SET.  */

  static void
 -output_refs (lto_symtab_encoder_t encoder)
 +output_refs (lto_symtab_encoder_t encoder, bool is_omp)
  {
lto_symtab_encoder_iterator lsei;
struct lto_simple_output_block *ob;
 @@ -719,7 +719,7 @@ output_refs (lto_symtab_encoder_t encoder)

    streamer_write_uhwi_stream (ob->main_stream, 0);

 -  lto_destroy_simple_output_block (ob);
 +  lto_destroy_simple_output_block (ob, is_omp);
  }

  /* Add NODE into encoder as well as nodes it is cloned from.
 @@ -878,7 +878,7 @@ compute_ltrans_boundary (lto_symtab_encoder_t in_encoder)
  /* Output the part of the symtab in SET and VSET.  */

  void
 -output_symtab (void)
 +output_symtab (bool is_omp)
  {
struct cgraph_node *node;
struct lto_simple_output_block *ob;
 @@ -907,9 +907,15 @@ output_symtab (void)
  {
symtab_node node = lto_symtab_encoder_deref (encoder, i);
        if (cgraph_node *cnode = dyn_cast <cgraph_node> (node))
 -lto_output_node (ob, cnode, encoder);
 +   {
 + if (!is_omp || lookup_attribute (omp declare target,
 + DECL_ATTRIBUTES 
 

Re: [PATCH] Fix typo in check for null

2013-09-24 Thread Richard Biener
On Mon, Sep 23, 2013 at 4:03 PM, Paulo Matos pma...@broadcom.com wrote:
 This is a patch for master, where in number_of_loops we want to check whether
 the loops structure (returned by loops_for_fn) is non-null, but instead we are
 checking fn, the function argument.  I haven't opened a bug report, please let
 me know if I need to do that before submitting patches.

Patch is ok.

Thanks,
Richard.

 2013-09-23 Paulo Matos pma...@broadcom.com

   * gcc/cfgloop.h (number_of_loops): Fix typo in check for null

 Paulo Matos




Re: [PATCH i386 3/8] [AVX512] [1/n] Add AVX-512 patterns: VF iterator extended.

2013-09-24 Thread Kirill Yukhin
Hello,
On 18 Sep 11:17, Kirill Yukhin wrote:
 Hello,
 On 13 Sep 14:28, Kirill Yukhin wrote:
  Hello,
  On 09 Sep 15:11, Kirill Yukhin wrote:
   Hello,
   On 06 Sep 17:41, Kirill Yukhin wrote:
Hello,

PING.
   PING.
  PING.
 PING
PING.

--
Thanks, K


[PATCH, LRA, AARCH64] Switching LRA on for AArch64

2013-09-24 Thread Yvan Roux
Hi,

The following patch switches LRA on for AArch64.  The patch introduces
an undocumented option -mlra to use LRA instead of reload, for testing
purposes.  Please notice that this patch is dependent on the one
submitted in the thread below:
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00805.html

Thanks,
Yvan

2013-09-24  Yvan Roux  yvan.r...@linaro.org

* config/aarch64/aarch64.opt (mlra): New option.
* config/aarch64/aarch64.c (aarch64_lra_p): New function.
(TARGET_LRA_P): Define.


aarch64-lra.diff
Description: Binary data


[PATCH]Fix computation of offset in ivopt

2013-09-24 Thread bin.cheng
Hi,
This patch fixes two minor bugs in the computation of offsets in IVOPT.
1) Consider the example below:
#define MAX 100
struct tag
{
  int i;
  int j;
};
struct tag arr[MAX];

int foo (int len)
{
  int i = 0;
  for (; i < len; i++)
  {
    /* access arr[i].j */
  }
}

Without this patch, the offset computed by strip_offset_1 for the address
arr[i].j is ZERO, which is apparently wrong.
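
A standalone check of that claim (illustrative only; the concrete value
sizeof (int) is an assumption about typical targets):

/* The byte offset of field j inside struct tag is non-zero, which is
   what strip_offset_1 should report for the address arr[i].j.  */
#include <stddef.h>
#include <assert.h>

struct tag { int i; int j; };

int
main (void)
{
  assert (offsetof (struct tag, j) != 0);   /* typically sizeof (int) */
  return 0;
}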

2) Consider the example below:
//...
  <bb 15>:
  KeyIndex_66 = KeyIndex_194 + 4294967295;
  if (KeyIndex_66 != 0)
    goto <bb 16>;
  else
    goto <bb 18>;

  <bb 16>:

  <bb 17>:
  # KeyIndex_194 = PHI <KeyIndex_66(16), KeyIndex_180(73)>
  _62 = KeyIndex_194 + 1073741823;
  _63 = _62 * 4;
  _64 = pretmp_184 + _63;
  _65 = *_64;
  if (_65 == 0)
    goto <bb 15>;
  else
    goto <bb 71>;
//...

There are iv use and candidate like:

use 1
  address
  in statement _65 = *_64;

  at position *_64
  type handletype *
  base pretmp_184 + ((sizetype) KeyIndex_180 + 1073741823) * 4
  step 4294967292
  base object (void *) pretmp_184
  related candidates 

candidate 6
  var_before ivtmp.16
  var_after ivtmp.16
  incremented before use 1
  type unsigned int
  base (unsigned int) (pretmp_184 + (sizetype) KeyIndex_180 * 4)
  step 4294967292
  base object (void *) pretmp_184
Candidate 6 is related to use 1

In function get_computation_cost_at for use 1 using cand 6, ubase and cbase
are:
pretmp_184 + ((sizetype) KeyIndex_180 + 1073741823) * 4
pretmp_184 + (sizetype) KeyIndex_180 * 4

The cstepi computed in HOST_WIDE_INT is 0xfffffffffffffffc, while the offset
computed in TYPE (utype) is 0xfffffffc.  Though they both stand for the value
-4 in different precisions, the statement offset -= ratio * cstepi returns
0x100000000, which is wrong.
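
The mismatch can be reproduced with plain integers (a standalone sketch;
the patch's use of sext_hwi is the GCC counterpart of the cast below):

#include <stdio.h>
#include <stdint.h>

int
main (void)
{
  uint64_t offset = 0xfffffffcULL;   /* -4 in 32-bit precision */
  int64_t cstepi = -4;               /* -4 in 64-bit precision, 0xfffffffffffffffc */

  /* Without sign extension: 0xfffffffc - (-4) wraps to 0x100000000.  */
  printf ("%#llx\n", (unsigned long long) (offset - (uint64_t) cstepi));

  /* Sign-extend the 32-bit value first: the difference is 0, as expected.  */
  int64_t sext = (int64_t) (int32_t) offset;
  printf ("%#llx\n", (unsigned long long) (sext - cstepi));
  return 0;
}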

Tested on x86_64 and arm.  Is it OK?

Thanks.
bin


2013-09-24  Bin Cheng  bin.ch...@arm.com

* tree-ssa-loop-ivopts.c (strip_offset_1): Count
DECL_FIELD_BIT_OFFSET when computing offset for COMPONENT_REF.
	(get_computation_cost_at): Sign extend offset.

Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 200774)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -2133,19 +2133,28 @@ strip_offset_1 (tree expr, bool inside_addr, bool
   break;
 
 case COMPONENT_REF:
-  if (!inside_addr)
-   return orig_expr;
+  {
+   tree field;
+   HOST_WIDE_INT boffset = 0;
+   if (!inside_addr)
+ return orig_expr;
 
-  tmp = component_ref_field_offset (expr);
-  if (top_compref
-      && cst_and_fits_in_hwi (tmp))
-   {
- /* Strip the component reference completely.  */
- op0 = TREE_OPERAND (expr, 0);
- op0 = strip_offset_1 (op0, inside_addr, top_compref, off0);
- *offset = off0 + int_cst_value (tmp);
- return op0;
-   }
+   field = TREE_OPERAND (expr, 1);
+   if (DECL_FIELD_BIT_OFFSET (field)
+       && cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))
+ boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field));
+
+   tmp = component_ref_field_offset (expr);
+   if (top_compref
+       && cst_and_fits_in_hwi (tmp))
+ {
+   /* Strip the component reference completely.  */
+   op0 = TREE_OPERAND (expr, 0);
+   op0 = strip_offset_1 (op0, inside_addr, top_compref, off0);
+   *offset = off0 + int_cst_value (tmp) + boffset / BITS_PER_UNIT;
+   return op0;
+ }
+  }
   break;
 
 case ADDR_EXPR:
@@ -4133,6 +4142,9 @@ get_computation_cost_at (struct ivopts_data *data,
 bitmap_clear (*depends_on);
 }
 
+  /* Sign-extend offset if utype has lower precision than HOST_WIDE_INT.  */
+  offset = sext_hwi (offset, TYPE_PRECISION (utype));
+
   /* If we are after the increment, the value of the candidate is higher by
  one iteration.  */
   if (stmt_is_after_inc)


Re: RFC: patch to build GCC for arm with LRA

2013-09-24 Thread Yvan Roux
Hi,

 Fair enough - we should just fix the test and move on.

Done.

 I would suggest in addition a transitional command-line option to switch
 between LRA and reload as a temporary measure so that folks can do some more
 experimenting for AArch32.

I've a patch which fixes the REG_NOTE issues; it's still under
validation, but it seems to work.  I'll add the -mlra option to switch
between LRA and reload, but my question is: will LRA be enabled by
default or not, as we still have the Thumb issue?  Maybe switching LRA
on only in ARM mode is a good trade-off, but in that case what is the
best way to do it?  (I'll run a full validation once we agree on that
point.)

Thanks,
Yvan


Re: Fix PR middle-end/57393

2013-09-24 Thread Richard Biener
On Mon, Sep 23, 2013 at 5:56 PM, Easwaran Raman era...@google.com wrote:
 Ping.


 On Mon, Sep 16, 2013 at 4:42 PM, Easwaran Raman era...@google.com wrote:
 There are two separate root causes for the problem reported in PR
 57393. This patch attempts to fix both.

 First is due to newly created stmts that have the default UID of 0
 which are compared with statements with valid UIDs leading to broken
 dependences. As discussed in an earlier thread, I check the UIDs
 before using them and ensure stmts have a valid UID. In the worst
 case, this reassigns UIDs for the entire BB containing the stmts in
 question.

 The second is due to debug stmts being out of sync with the IR after
 reassociation. I think the right fix is to create debug temps before
 actual reassociation to decouple them from the SSA variables involved
 in reassociation.

 This bootstraps in x86_64 and I am running the tests. Ok for trunk?

In the end I like the scheduling code less and less - I believe it will become
a major compile-time hog.  Still,

+  gimple temp = gsi_stmt (gsi);
+  uid = gimple_uid (temp);

no need for 'temp', just write gimple_uid (gsi_stmt (gsi))

+}
+  if (iters >= MAX_UNASSIGNED_UIDS || uid == 0)
+renumber_gimple_stmt_uids_in_blocks (bb, 1);

so this will renumber uids whenever we have trailing zero-UID stmts at the
end of a basic-block?

+  else
+{
+  for (gsi_prev (gsi); !gsi_end_p (gsi); gsi_prev (gsi))
+{
+  stmt = gsi_stmt (gsi);
+  if (gimple_uid (stmt) > 0)
+break;
+  gimple_set_uid (stmt, uid);
+}

while this fills zero-UIDs from the found UID.  A comment on the re-fill
strategy would be nice here.

@@ -2901,6 +2930,9 @@ find_insert_point (gimple stmt, gimple dep_stmt)
   gimple insert_stmt = stmt;
   if (dep_stmt == NULL)
 return stmt;
+  ensure_valid_uid (stmt);
+  ensure_valid_uid (dep_stmt);
+
   if (gimple_uid (insert_stmt) == gimple_uid (dep_stmt)

vertical space oddity - move the blank line before the first
ensure_valid_uid () call.

+{
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  swap_ops_for_binary_stmt (ops, len - 3, stmt);
+  regenerate_debug_stmts (SSA_NAME_DEF_STMT (rhs1), len - 3);
+}

I'm not sure this is the right place for this.  Also note my previous
comment that it
is wrong to re-use SSA names for different values because of associated info
that can now be wrong.  Yes, it's a pre-existing bug but you end up exposing it.

I believe the fix is in the places that adjust a stmts rhs1/rhs2.  Yes, reassoc
is a complete mess with regard to this as it modifies the IL for its analysis
purposes.

We'll have interesting times dealing with all this now that we preserve
value-range information for SSA names (which definitely is garbled by
reassoc now).

We absolutely have to do something about this for 4.9 and I think I
shouldn't have approved the original scheduling work :/  Which complicates
matters more, similar to the || and && reassoc work, and makes the needed
rewrite a much more complex task.

Bah.

Richard.

 Thanks,
 Easwaran

 2013-09-16  Easwaran Raman  era...@google.com

 PR middle-end/57393
 * tree-ssa-reassoc.c (get_stmt_uid_with_default): Remove.
 (build_and_add_sum): Do not set UID of newly created statements.
 (ensure_valid_uid): New function,
 (find_insert_point): called here before UIDs are compared.
 (insert_stmt_after): Do not reset debug statements.
 (regenerate_debug_stmts): New function,
 (reassociate_bb): use here.

 testsuite/ChangeLog:
 2013-09-16  Easwaran Raman  era...@google.com

 PR middle-end/57393
 * gcc.dg/tree-ssa/reassoc-32.c: New testcase.
 * gcc.dg/tree-ssa/reassoc-33.c: New testcase.
 * gcc.dg/tree-ssa/reassoc-34.c: New testcase.


Re: Fix PR middle-end/57393

2013-09-24 Thread Jakub Jelinek
On Tue, Sep 24, 2013 at 11:26:16AM +0200, Richard Biener wrote:
 +  if (iters >= MAX_UNASSIGNED_UIDS || uid == 0)
 +renumber_gimple_stmt_uids_in_blocks (bb, 1);
 
 so this will renumber uids whenever we have trailing zero-UID stmts at the
 end of a basic-block?

Trailing zero-UIDs at the end of a basic block should be trivially handlable
without renumbering the whole basic block.
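
A minimal sketch of that idea on a plain array instead of a GIMPLE
sequence (names and details are illustrative, not from any posted patch):

/* Give trailing zero UIDs the value of the last assigned UID by walking
   backwards from the end, without renumbering the whole block.  */
static void
fill_trailing_uids (unsigned *uid, int n)
{
  int i = n - 1;
  while (i >= 0 && uid[i] == 0)   /* skip the trailing zero UIDs */
    i--;
  unsigned last = (i >= 0) ? uid[i] : 1;   /* fall back to 1 if none set */
  for (int j = i + 1; j < n; j++)
    uid[j] = last;
}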

Jakub


Re: RFA: Store the REG_BR_PROB probability directly as an int

2013-09-24 Thread Andreas Schwab
Richard Sandiford rdsandif...@googlemail.com writes:

 REG_BR_PROB notes are stored as:

   (expr_list:REG_BR_PROB (const_int prob) chain)

 but a full const_int rtx seems a bit heavyweight when all we want is
 a plain int.  This patch uses:

   (int_list:REG_BR_PROB prob chain)

 instead.

I think you left out the handling of INT_LIST in eliminate_regs_1.  This
lets me finish the build:

diff --git a/gcc/reload1.c b/gcc/reload1.c
index 7a82c07..41f1aa8 100644
--- a/gcc/reload1.c
+++ b/gcc/reload1.c
@@ -2576,6 +2576,7 @@ eliminate_regs_1 (rtx x, enum machine_mode mem_mode, rtx 
insn,
 case ADDR_VEC:
 case ADDR_DIFF_VEC:
 case RETURN:
+case INT_LIST:
   return x;
 
 case REG:

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


Re: gimple build interface

2013-09-24 Thread Richard Biener
On Mon, Sep 23, 2013 at 7:16 PM, Andrew MacLeod amacl...@redhat.com wrote:
 On 09/23/2013 01:05 PM, David Malcolm wrote:

 On Mon, 2013-09-23 at 12:21 -0400, Andrew MacLeod wrote:

 On 09/20/2013 04:08 AM, Richard Biener wrote:

 On Thu, Sep 19, 2013 at 6:56 PM, Andrew MacLeod amacl...@redhat.com
 wrote:

 On 09/19/2013 09:24 AM, Andrew MacLeod wrote:

 I think this is of most use to ssa passes that need to construct code
 snippets, so I propose we make this ssa specific and put it in
 tree-ssa.c
 (renaming it ssa_build_assign),  *OR* we could leave it general
 purpose and
 put it in its own set of files, gimple-ssa-build.[ch] or something
 that
 crosses the border between the two representations.

 I'd also suggest that the final optional parameter be changed to tree
 *lhs
 = NULL_TREE,  which would allow the caller to specify the LHS if they
 want,
 otherwise make_ssa_name would be called. If we want to leave it
 supporting
 both gimple and ssa, then anyone from gimple land could pass in a
 gimple LHS
 variable thus avoiding the call to make_ssa_name

 Thoughts?
 Andrew

 Anyway, here is a patch which does that and a bit more.  I didn't
 rename
 build_assign() to ssa_build_assign()..   even though those are the only
 kind
 actually created right now.   we can leave that for the day someone
 actually
 decides to flush this interface out, and maybe we'll want to pass in
 gimple_tmps and call them from front ends or other places... then it
 would
 have to be renamed again. So I just left it as is for the moment, but
 that
 could be changed.

 I also moved gimple_replace_lhs() to tree-ssa.c and renamed it
 ssa_replace_lhs(). It calls insert_debug_temp_for_var_def() from
 tree-ssa.c
 and that only works with the immediate use operands.. so that is an SSA
 specific routine, which makes this one SSA specific as well.

 Those 2 changes allow tree-ssa.h to no longer be included, it is
 replaced
 with tree-flow.h.   Some preliminary work to enable removing immediate
 use
 routines out of tree-flow.h include:

 struct count_ptr_d, count_ptr_derefs(), count_uses_and_derefs() also
 get
 moved to tree-ssa.c since those are also require the immediate use
 mechanism, and thus is also SSA dependent.

 This bootstraps on x86_64-unknown-linux-gnu and has no new regressions.
 OK?

 Can you move the builders to asan.c please?  From a quick glance it
 seems
 to have various issues so it shouldn't be used (I wonder who approved
 them
 in the end ... maybe it was even me).

 ssa_replace_lhs sounds odd (a 'SSA' has a lhs?), but maybe it's just me.
 I'd have chosen gimple_replace_ssa_lhs?

 That sounds better.  done.

 And I also think a separate file for those builders is probably best...
 here's a patch with those changes.  New files called
 gimple-builder.[ch]... Then Diego can eventually do whatever his
 grand vision for them is.  I minimized the includes.

 Bootstraps and rerunning tests.  OK?

 Did you forget to attach the patch?

 Of course not!  :-P

 Oh how I hate mondays.

Old patch attached?

Richard.

 Andrew


Re: [ping][PATCH][1 of 2] Add value range info to SSA_NAME for zero sign extension elimination in RTL

2013-09-24 Thread Richard Biener
On Mon, Sep 23, 2013 at 10:34 PM, Eric Botcazou ebotca...@adacore.com wrote:
 I have committed it for you (rev 202831), with a few modifications
 (ChangeLog formatting, typos).
 Here is what I have committed:

 2013-09-23  Kugan Vivekanandarajah  kug...@linaro.org

 * gimple-pretty-print.c (dump_ssaname_info): New function.
 (dump_gimple_phi): Call it.
 (pp_gimple_stmt_1): Likewise.
 * tree-core.h (tree_ssa_name): New union ssa_name_info_type field.
 (range_info_def): Declare.
 * tree-pretty-print.c (pp_double_int): New function.
 (dump_generic_node): Call it.
 * tree-pretty-print.h (pp_double_int): Declare.
 * tree-ssa-alias.c (dump_alias_info): Check pointer type.
 * tree-ssanames.h (range_info_def): New structure.
 (value_range_type): Move definition here.
 (set_range_info, value_range_type, duplicate_ssa_name_range_info):
 Declare.
 * tree-ssanames.c (make_ssa_name_fn): Check pointer type at
 initialization.
 (set_range_info): New function.
 (get_range_info): Likewise.
 (duplicate_ssa_name_range_info): Likewise.
 (duplicate_ssa_name_fn): Check pointer type and call
 duplicate_ssa_name_range_info.
 * tree-ssa-copy.c (fini_copy_prop): Likewise.
 * tree-vrp.c (value_range_type): Remove definition, now in
 tree-ssanames.h.
 (vrp_finalize): Call set_range_info to update value range of
 SSA_NAMEs.
 * tree.h (SSA_NAME_PTR_INFO): Macro changed to access via union.
 (SSA_NAME_RANGE_INFO): New macro.

 Nice patch, but the formatting is totally wrong wrt spaces, please reformat
 using 2-space indentation and 8-space TABs, as already used in the files.

 The patch has also introduced 2 regressions in Ada:

 === acats tests ===
 FAIL:   c37211b
 FAIL:   c37211c

 === acats Summary ===
 # of expected passes2318
 # of unexpected failures2


 Program received signal SIGSEGV, Segmentation fault.
 vrp_finalize () at /home/eric/svn/gcc/gcc/tree-vrp.c:9458
 9458  if (POINTER_TYPE_P (TREE_TYPE (name))
 (gdb) bt

I'm testing a trivial patch to fix that.

Richard.

 #0  vrp_finalize () at /home/eric/svn/gcc/gcc/tree-vrp.c:9458
 #1  execute_vrp () at /home/eric/svn/gcc/gcc/tree-vrp.c:9583
 #2  (anonymous namespace)::pass_vrp::execute (this=<optimized out>)
 at /home/eric/svn/gcc/gcc/tree-vrp.c:9673
 #3  0x00c52c9a in execute_one_pass (pass=pass@entry=0x22e2210)
 at /home/eric/svn/gcc/gcc/passes.c:2201
 #4  0x00c52e76 in execute_pass_list (pass=0x22e2210)
 at /home/eric/svn/gcc/gcc/passes.c:2253
 #5  0x00c52e88 in execute_pass_list (pass=0x22e04d0)
 at /home/eric/svn/gcc/gcc/passes.c:2254
 #6  0x009b9c49 in expand_function (node=0x76d12e40)
 at /home/eric/svn/gcc/gcc/cgraphunit.c:1750
 #7  0x009bbc17 in expand_all_functions ()
 at /home/eric/svn/gcc/gcc/cgraphunit.c:1855
 #8  compile () at /home/eric/svn/gcc/gcc/cgraphunit.c:2192
 #9  0x009bc1fa in finalize_compilation_unit ()
 at /home/eric/svn/gcc/gcc/cgraphunit.c:2269
 #10 0x006681b5 in gnat_write_global_declarations ()
 at /home/eric/svn/gcc/gcc/ada/gcc-interface/utils.c:5630
 #11 0x00d4577d in compile_file ()
 at /home/eric/svn/gcc/gcc/toplev.c:560
 #12 0x00d4750a in do_compile () at
 /home/eric/svn/gcc/gcc/toplev.c:1891
 #13 toplev_main (argc=14, argv=0x7fffdca8)
 at /home/eric/svn/gcc/gcc/toplev.c:1967
 #14 0x76f2a23d in __libc_start_main () from /lib64/libc.so.6
 #15 0x00635381 in _start () at ../sysdeps/x86_64/elf/start.S:113
 (gdb) p name
 $1 = (tree) 0x0


 --
 Eric Botcazou


Re: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-09-24 Thread Richard Biener
On Tue, Sep 24, 2013 at 4:52 AM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:
 Hi,

 with the attached patch the read-side of the out of bounds accesses are fixed.
 There is a single new test case pr57748-3.c that is derived from Martin's 
 test case.
 The test case does only test the read access and does not depend on part 1 of 
 the
 patch.

 This patch was boot-strapped and regression tested on 
 x86_64-unknown-linux-gnu.

 Additionally I generated eCos and an eCos-application (on ARMv5 using packed
 structures) with an arm-eabi cross compiler, and looked for differences in the
 disassembled code with and without this patch, but there were none.

 OK for trunk?

Index: gcc/expr.c
===
--- gcc/expr.c  (revision 202778)
+++ gcc/expr.c  (working copy)
@@ -9878,7 +9878,7 @@
   modifier != EXPAND_STACK_PARM
  ? target : NULL_RTX),
 VOIDmode,
-modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier);
+EXPAND_MEMORY);

/* If the bitfield is volatile, we want to access it in the
   field's mode, not the computed mode.

context suggests that we may arrive with EXPAND_STACK_PARM here
which is a correctness modifier (see its docs).  But I'm not too familiar
with the details of the various expand modifiers, Eric may be though.

That said, I still believe that fixing the misalign path in expand_assignment
would be better than trying to avoid it.  For this testcase the issue is
again that expand_assignment passes the wrong mode/target to the
movmisalign optab.
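
For reference, the kind of access under discussion looks roughly like this
(a simplified sketch, not the actual PR 57748 test cases):

/* A packed struct forces a misaligned int field; reading it must go
   through the misaligned-access path (e.g. the movmisalign optab) on
   strict-alignment targets such as ARMv5.  */
struct S
{
  char c;
  int i;
} __attribute__ ((packed));

int
read_i (struct S *p)
{
  return p->i;   /* misaligned load on strict-alignment targets */
}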

Richard.


 Regards
 Bernd.


Re: [ping][PATCH][1 of 2] Add value range info to SSA_NAME for zero sign extension elimination in RTL

2013-09-24 Thread Kugan

On 24/09/13 19:23, Richard Biener wrote:

On Mon, Sep 23, 2013 at 10:34 PM, Eric Botcazou ebotca...@adacore.com wrote:

I have committed it for you (rev 202831), with a few modifications
(ChangeLog formatting, typos).
Here is what I have committed:

2013-09-23  Kugan Vivekanandarajah  kug...@linaro.org

 * gimple-pretty-print.c (dump_ssaname_info): New function.
 (dump_gimple_phi): Call it.
 (pp_gimple_stmt_1): Likewise.
 * tree-core.h (tree_ssa_name): New union ssa_name_info_type field.
 (range_info_def): Declare.
 * tree-pretty-print.c (pp_double_int): New function.
 (dump_generic_node): Call it.
 * tree-pretty-print.h (pp_double_int): Declare.
 * tree-ssa-alias.c (dump_alias_info): Check pointer type.
 * tree-ssanames.h (range_info_def): New structure.
 (value_range_type): Move definition here.
 (set_range_info, value_range_type, duplicate_ssa_name_range_info):
 Declare.
 * tree-ssanames.c (make_ssa_name_fn): Check pointer type at
 initialization.
 (set_range_info): New function.
 (get_range_info): Likewise.
 (duplicate_ssa_name_range_info): Likewise.
 (duplicate_ssa_name_fn): Check pointer type and call
 duplicate_ssa_name_range_info.
 * tree-ssa-copy.c (fini_copy_prop): Likewise.
 * tree-vrp.c (value_range_type): Remove definition, now in
 tree-ssanames.h.
 (vrp_finalize): Call set_range_info to update value range of
 SSA_NAMEs.
 * tree.h (SSA_NAME_PTR_INFO): Macro changed to access via union.
 (SSA_NAME_RANGE_INFO): New macro.


Nice patch, but the formatting is totally wrong wrt spaces, please reformat
using 2-space indentation and 8-space TABs, as already used in the files.



I am looking at everything and will send a patch to fix that.


The patch has also introduced 2 regressions in Ada:

 === acats tests ===
FAIL:   c37211b
FAIL:   c37211c

 === acats Summary ===
# of expected passes2318
# of unexpected failures2



I am sorry I missed this as I didn't test Ada.  I wrongly assumed that all
the frontends are enabled by default.




Program received signal SIGSEGV, Segmentation fault.
vrp_finalize () at /home/eric/svn/gcc/gcc/tree-vrp.c:9458
9458  if (POINTER_TYPE_P (TREE_TYPE (name))
(gdb) bt


I'm testing a trivial patch to fix that.

I think the return value of ssa_name () (i.e. name) can be NULL and it
has to be checked for NULL.  In tree-vrp.c it is not checked in some
other places related to debugging.  In other places (e.g. in
tree-ssa-pre.c) there are checks.


Thanks for looking into it and I will wait for your fix.

Thanks,
Kugan



Richard.


#0  vrp_finalize () at /home/eric/svn/gcc/gcc/tree-vrp.c:9458
#1  execute_vrp () at /home/eric/svn/gcc/gcc/tree-vrp.c:9583
#2  (anonymous namespace)::pass_vrp::execute (this=<optimized out>)
 at /home/eric/svn/gcc/gcc/tree-vrp.c:9673
#3  0x00c52c9a in execute_one_pass (pass=pass@entry=0x22e2210)
 at /home/eric/svn/gcc/gcc/passes.c:2201
#4  0x00c52e76 in execute_pass_list (pass=0x22e2210)
 at /home/eric/svn/gcc/gcc/passes.c:2253
#5  0x00c52e88 in execute_pass_list (pass=0x22e04d0)
 at /home/eric/svn/gcc/gcc/passes.c:2254
#6  0x009b9c49 in expand_function (node=0x76d12e40)
 at /home/eric/svn/gcc/gcc/cgraphunit.c:1750
#7  0x009bbc17 in expand_all_functions ()
 at /home/eric/svn/gcc/gcc/cgraphunit.c:1855
#8  compile () at /home/eric/svn/gcc/gcc/cgraphunit.c:2192
#9  0x009bc1fa in finalize_compilation_unit ()
 at /home/eric/svn/gcc/gcc/cgraphunit.c:2269
#10 0x006681b5 in gnat_write_global_declarations ()
 at /home/eric/svn/gcc/gcc/ada/gcc-interface/utils.c:5630
#11 0x00d4577d in compile_file ()
 at /home/eric/svn/gcc/gcc/toplev.c:560
#12 0x00d4750a in do_compile () at
/home/eric/svn/gcc/gcc/toplev.c:1891
#13 toplev_main (argc=14, argv=0x7fffdca8)
 at /home/eric/svn/gcc/gcc/toplev.c:1967
#14 0x76f2a23d in __libc_start_main () from /lib64/libc.so.6
#15 0x00635381 in _start () at ../sysdeps/x86_64/elf/start.S:113
(gdb) p name
$1 = (tree) 0x0


--
Eric Botcazou




Re: [PATCH]Construct canonical scaled address expression in IVOPT

2013-09-24 Thread Richard Biener
On Tue, Sep 24, 2013 at 8:20 AM, bin.cheng bin.ch...@arm.com wrote:


 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: Monday, September 23, 2013 8:08 PM
 To: Bin Cheng
 Cc: GCC Patches
 Subject: Re: [PATCH]Construct canonical scaled address expression in IVOPT

 On Fri, Sep 20, 2013 at 12:00 PM, bin.cheng bin.ch...@arm.com wrote:
  Hi,
  For now IVOPT constructs scaled address expression in the form of
  scaled*index and checks whether backend supports it. The problem is
  the address expression is invalid on ARM, causing scaled expression
  disabled in IVOPT on ARM.  This patch fixes the IVOPT part by
  constructing rtl address expression like index*scaled+base.

  -  addr = gen_rtx_fmt_ee (MULT, address_mode, reg1, NULL_RTX);
  +  /* Construct address expression in the canonical form of
  +     base + index * scale and call memory_address_addr_space_p to see
  +     whether it is allowed by target processors.  */
  +  index = gen_rtx_fmt_ee (MULT, address_mode, reg1, NULL_RTX);
     for (i = -MAX_RATIO; i <= MAX_RATIO; i++)
       {
  -      XEXP (addr, 1) = gen_int_mode (i, address_mode);
  +      if (i == 1)
  +        continue;
  +
  +      XEXP (index, 1) = gen_int_mode (i, address_mode);
  +      addr = gen_rtx_fmt_ee (PLUS, address_mode, index, reg1);
         if (memory_address_addr_space_p (mode, addr, as))
           bitmap_set_bit (valid_mult, i + MAX_RATIO);

 so you now build reg1*scale+reg1 - which checks if offset and scale work
 at
 the same time (and with the same value - given this is really
 reg1*(scale+1)).
 It might be that this was implicitely assumed (or that no targets allow
 scale
 but no offset), but it's a change that requires audit of behavior on more
 targets.
 So literally, function multiplier_allowed_in_address_p should check whether
 multiplier is allowed in any kind of addressing mode, like [reg*scale +
 offset] and [reg*scale + reg].

Or even [reg*scale] (not sure about that).  But yes, at least reg*scale + offset
and reg*scale + reg.

   Apparently it's infeasible to check every
 possibility for each architecture, is it ok we at least check index, then
 addr if index is failed?  By any kind of addressing modes, I mean
 modes supported by function get_address_cost,  i.e., in form of [base] +
 [off] + [var] + (reg|reg*scale).

I suppose so, but it of course depends on what IVOPTs uses the answer
for in the end.  Apparently it doesn't distinguish between the various cases
even though TARGET_MEM_REF does support all the variants in question
(reg * scale, reg * scale + reg, reg * scale + const, reg * scale +
reg + const).

So the better answer may be to teach the costs about the differences?
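
For illustration, here are C accesses whose addresses correspond to those
variants (a sketch only; whether each form survives to RTL depends on the
target's legitimate addresses and on the cost model discussed here):

int
load_scaled (int *base, long idx)
{
  return base[idx];        /* base + idx * 4 */
}

int
load_scaled_const (int *base, long idx)
{
  return base[idx + 25];   /* base + idx * 4 + 100 */
}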

  The above also builds more RTX waste which you can fix by re-using the
  PLUS by building it up-front similar to the multiplication.
 Yes, this can be fixed.

  You also miss the opportunity to have scale == 1 denote as to whether
  reg1 + reg2 is valid.  I would expect that many targets support
  reg1 * scale + constant-offset but not many reg1 * scale + reg2.
 I thought scale == 1 is unnecessary because the addressing mode degrades into
 reg or reg+reg.  Moreover, calls of multiplier_allowed_in_address_p in
 both get_address_cost and get_computation_cost_at have scale other than 1.

Ok.


 So no, the helper now checks sth completely different.  What's the problem
 with arm supporting reg1 * scale?  Why shouldn't it being able to handle
 the
 implicit zero offset?

 As Richard clarified, ARM does not support scaled addressing mode without
 base register.

I see.

Richard.


Re: [PR58463][Backport to 4.8] Adjust dumping for ref nodes

2013-09-24 Thread Richard Biener
On Tue, Sep 24, 2013 at 10:42 AM, Paulo Matos pma...@broadcom.com wrote:
 Ping on this patch.

 Note I don't have write access.

Please get yourself write access (I suppose you do have a copyright assignment),
you can name me as sponsor.

Richard.

 Paulo Matos


 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: 23 September 2013 09:00
 To: Paulo Matos
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [PR58463][Backport to 4.8] Adjust dumping for ref nodes

 On Fri, Sep 20, 2013 at 5:39 PM, Paulo Matos pma...@broadcom.com wrote:
 
  Can we please backport this to 4.8 (it will fix PR58463)?
  I attach Richard's patch to master, let me know if instead I should create
 one specific for 4.8.1. I tested this against 4.8 branch and everything looks
 fine.

 Ok.

 Thanks,
 Richard.

 
  2013-03-27  Richard Biener  rguent...@suse.de
 
  PR tree-optimization/56716
  * tree-ssa-structalias.c (perform_var_substitution): Adjust
  dumping for ref nodes.
 
  Modified:
  trunk/gcc/ChangeLog
  trunk/gcc/tree-ssa-structalias.c
 
  Paulo Matos
 
 



Re: [PATCH]Fix computation of offset in ivopt

2013-09-24 Thread Richard Biener
On Tue, Sep 24, 2013 at 11:13 AM, bin.cheng bin.ch...@arm.com wrote:
 Hi,
 This patch fix two minor bugs when computing offset in IVOPT.
 1) Considering below example:
 #define MAX 100
 struct tag
 {
   int i;
   int j;
 }
 struct tag arr[MAX]

 int foo (int len)
 {
   int i = 0;
   for (; i < len; i++)
   {
 access arr[i].j;
   }
 }

 Without this patch, the offset computed by strip_offset_1 for address
 arr[i].j is ZERO, which is apparently not.

 2) Considering below example:
 //...
   bb 15:
   KeyIndex_66 = KeyIndex_194 + 4294967295;
   if (KeyIndex_66 != 0)
 goto bb 16;
   else
 goto bb 18;

   bb 16:

   bb 17:
   # KeyIndex_194 = PHI KeyIndex_66(16), KeyIndex_180(73)
   _62 = KeyIndex_194 + 1073741823;
   _63 = _62 * 4;
   _64 = pretmp_184 + _63;
   _65 = *_64;
   if (_65 == 0)
 goto bb 15;
   else
 goto bb 71;
 //...

 There are iv use and candidate like:

 use 1
   address
   in statement _65 = *_64;

   at position *_64
   type handletype *
   base pretmp_184 + ((sizetype) KeyIndex_180 + 1073741823) * 4
   step 4294967292
   base object (void *) pretmp_184
   related candidates

 candidate 6
   var_before ivtmp.16
   var_after ivtmp.16
   incremented before use 1
   type unsigned int
   base (unsigned int) (pretmp_184 + (sizetype) KeyIndex_180 * 4)
   step 4294967292
   base object (void *) pretmp_184
 Candidate 6 is related to use 1

 In function get_computation_cost_at for use 1 using cand 6, ubase and cbase
 are:
 pretmp_184 + ((sizetype) KeyIndex_180 + 1073741823) * 4
 pretmp_184 + (sizetype) KeyIndex_180 * 4

 The cstepi computed in HOST_WIDE_INT is :  0xfffc, while offset
 computed in TYPE(utype) is : 0xfffc.  Though they both stand for value
 -4 in different precision, statement offset -= ratio * cstepi returns
 0x1, which is wrong.

 Tested on x86_64 and arm.  Is it OK?

+   field = TREE_OPERAND (expr, 1);
+   if (DECL_FIELD_BIT_OFFSET (field)
+       && cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))
+ boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field));
+
+   tmp = component_ref_field_offset (expr);
+   if (top_compref
+       && cst_and_fits_in_hwi (tmp))
+ {
+   /* Strip the component reference completely.  */
+   op0 = TREE_OPERAND (expr, 0);
+   op0 = strip_offset_1 (op0, inside_addr, top_compref, off0);
+   *offset = off0 + int_cst_value (tmp) + boffset / BITS_PER_UNIT;
+   return op0;
+ }

the failure paths seem mangled, that is, if cst_and_fits_in_hwi is false
for either offset part you may end up doing half accounting and not
stripping.

Btw, DECL_FIELD_BIT_OFFSET is always non-NULL.  I suggest to
rewrite to

 if (!inside_addr)
   return orig_expr;

 tmp = component_ref_field_offset (expr);
 field = TREE_OPERAND (expr, 1);
 if (top_compref
     && cst_and_fits_in_hwi (tmp)
     && cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))
{
  ...
}

note that this doesn't really handle overflows correctly as

+   *offset = off0 + int_cst_value (tmp) + boffset / BITS_PER_UNIT;

may still overflow.

@@ -4133,6 +4142,9 @@ get_computation_cost_at (struct ivopts_data *data,
 bitmap_clear (*depends_on);
 }

+  /* Sign-extend offset if utype has lower precision than HOST_WIDE_INT.  */
+  offset = sext_hwi (offset, TYPE_PRECISION (utype));
+

offset is computed elsewhere in difference_cost and the bug to me seems that
it is unsigned.  sign-extending it here is odd at least (and the extension
should probably happen at sizetype precision, not that of utype).
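
To make the precision issue concrete, here is a small standalone sketch (an
illustration of the report above, not code from the patch; it assumes a
64-bit HOST_WIDE_INT and a 32-bit utype, and the local sext helper plays the
role sext_hwi is used for):

#include <stdio.h>
#include <stdint.h>

/* Sign-extend VAL from PREC bits, using unsigned arithmetic to avoid
   shifting into the sign bit.  */
static int64_t
sext (uint64_t val, int prec)
{
  uint64_t sign_bit = (uint64_t) 1 << (prec - 1);
  return (int64_t) ((val ^ sign_bit) - sign_bit);
}

int
main (void)
{
  int64_t cstepi = -4;            /* -4 in 64-bit precision       */
  int64_t offset = 0xfffffffcLL;  /* -4 kept in 32-bit precision  */
  int64_t ratio = 1;

  /* Mixed precisions: the subtraction no longer cancels.  */
  printf ("without sign extension: %lld\n",
          (long long) (offset - ratio * cstepi));

  /* Extending offset to full precision first gives the expected 0.  */
  printf ("with sign extension   : %lld\n",
          (long long) (sext ((uint64_t) offset, 32) - ratio * cstepi));
  return 0;
}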

Richard.


 Thanks.
 bin


 2013-09-24  Bin Cheng  bin.ch...@arm.com

 * tree-ssa-loop-ivopts.c (strip_offset_1): Count
 DECL_FIELD_BIT_OFFSET when computing offset for COMPONENT_REF.
 (get_computation_cost_at): Sign extend offset.


Re: [ping][PATCH][1 of 2] Add value range info to SSA_NAME for zero sign extension elimination in RTL

2013-09-24 Thread Richard Biener
On Tue, 24 Sep 2013, Richard Biener wrote:

 On Mon, Sep 23, 2013 at 10:34 PM, Eric Botcazou ebotca...@adacore.com wrote:
  I have committed it for you (rev 202831), with a few modifications
  (ChangeLog formatting, typos).
  Here is what I have committed:
 
  2013-09-23  Kugan Vivekanandarajah  kug...@linaro.org
 
  * gimple-pretty-print.c (dump_ssaname_info): New function.
  (dump_gimple_phi): Call it.
  (pp_gimple_stmt_1): Likewise.
  * tree-core.h (tree_ssa_name): New union ssa_name_info_type field.
  (range_info_def): Declare.
  * tree-pretty-print.c (pp_double_int): New function.
  (dump_generic_node): Call it.
  * tree-pretty-print.h (pp_double_int): Declare.
  * tree-ssa-alias.c (dump_alias_info): Check pointer type.
  * tree-ssanames.h (range_info_def): New structure.
  (value_range_type): Move definition here.
  (set_range_info, value_range_type, duplicate_ssa_name_range_info):
  Declare.
  * tree-ssanames.c (make_ssa_name_fn): Check pointer type at
  initialization.
  (set_range_info): New function.
  (get_range_info): Likewise.
  (duplicate_ssa_name_range_info): Likewise.
  (duplicate_ssa_name_fn): Check pointer type and call
  duplicate_ssa_name_range_info.
  * tree-ssa-copy.c (fini_copy_prop): Likewise.
  * tree-vrp.c (value_range_type): Remove definition, now in
  tree-ssanames.h.
  (vrp_finalize): Call set_range_info to update value range of
  SSA_NAMEs.
  * tree.h (SSA_NAME_PTR_INFO): Macro changed to access via union.
  (SSA_NAME_RANGE_INFO): New macro.
 
  Nice patch, but the formatting is totally wrong wrt spaces, please reformat
  using 2-space indentation and 8-space TABs, as already used in the files.
 
  The patch has also introduced 2 regressions in Ada:
 
  === acats tests ===
  FAIL:   c37211b
  FAIL:   c37211c
 
  === acats Summary ===
  # of expected passes            2318
  # of unexpected failures        2
 
 
  Program received signal SIGSEGV, Segmentation fault.
  vrp_finalize () at /home/eric/svn/gcc/gcc/tree-vrp.c:9458
  9458  if (POINTER_TYPE_P (TREE_TYPE (name))
  (gdb) bt
 
 I'm testing a trivial patch to fix that.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-09-24  Richard Biener  rguent...@suse.de

* tree-vrp.c (vrp_finalize): Check for SSA name presence.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 202860)
+++ gcc/tree-vrp.c  (working copy)
@@ -9455,7 +9455,8 @@ vrp_finalize (void)
 {
   tree name = ssa_name (i);
 
-  if (POINTER_TYPE_P (TREE_TYPE (name))
+  if (!name
+ || POINTER_TYPE_P (TREE_TYPE (name))
   || (vr_value[i]->type == VR_VARYING)
   || (vr_value[i]->type == VR_UNDEFINED))
 continue;


Re: [PATCH]Construct canonical scaled address expression in IVOPT

2013-09-24 Thread Richard Earnshaw
On 24/09/13 11:12, Richard Biener wrote:
 Or even [reg*scale] (not sure about that).  But yes, at least reg*scale + 
 offset
 and reg*scale + reg.

I can't conceive of a realistic case where one would want to scale the
base address.  Scaling the offset is fine, but never the base.  So
reg*scale+offset seems meaningless.

Base + Offset * Scale on the other hand makes much more sense.

Of course, this is all about terminology, so if you regard an immediate,
such as a symbol as an offset, then perhaps you have something close to
the original term, but I think then you've inverted things, since your
symbol is really the base, not the offset.

R.



Re: [PATCH, ARM, LRA] Prepare ARM build with LRA

2013-09-24 Thread Eric Botcazou
 Here is the new patch discussed in the other thread.
 
 Thanks
 Yvan
 
 2013-09-11  Yvan Roux  yvan.r...@linaro.org
 Vladimir Makarov  vmaka...@redhat.com
 
   * rtlanal.c (lsb_bitfield_op_p): New predicate for bitfield operations
   from the least significant bit.
   (strip_address_mutations): Add bitfield operations handling.
   (shift_code_p): New predicate for shifting operations.
   (must_be_index_p): Add shifting operations handling.
   (set_address_index): Likewise.

+/* Return true if X is a sign_extract or zero_extract from the least
+   significant bit.  */
+
+static bool
+lsb_bitfield_op_p (rtx x)
+{
+  if (GET_RTX_CLASS (GET_CODE (x)) == RTX_BITFIELD_OPS)
+{
+  enum machine_mode mode = GET_MODE(x);
+  unsigned HOST_WIDE_INT len = INTVAL (XEXP (x, 1));
+  HOST_WIDE_INT pos = INTVAL (XEXP (x, 2));
+
+  return (pos == (BITS_BIG_ENDIAN ? GET_MODE_PRECISION (mode) - len : 0));

It seems strange to use the destination mode to decide whether this is the LSB 
of the source.

+/* Return true if X is a shifting operation.  */
+
+static bool
+shift_code_p (rtx x)
+{
+  return (GET_CODE (x) == ASHIFT
+  || GET_CODE (x) == ASHIFTRT
+  || GET_CODE (x) == LSHIFTRT
+  || GET_CODE (x) == ROTATE
+  || GET_CODE (x) == ROTATERT);
+}

ROTATE and ROTATERT aren't really shifting operations though, so are they 
really needed here?

-- 
Eric Botcazou


Re: expand_expr tweaks to fix PR57134

2013-09-24 Thread Alan Modra
On Fri, Sep 13, 2013 at 12:37:20PM +0930, Alan Modra wrote:
   PR middle-end/57586
   * stmt.c (expand_asm_operands): Call expand_expr with
   EXPAND_MEMORY for output operands that disallow regs.  Don't
   use EXPAND_WRITE on inout operands.

Ping?

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH]Construct canonical scaled address expression in IVOPT

2013-09-24 Thread Richard Biener
On Tue, Sep 24, 2013 at 12:36 PM, Richard Earnshaw rearn...@arm.com wrote:
 On 24/09/13 11:12, Richard Biener wrote:
 Or even [reg*scale] (not sure about that).  But yes, at least reg*scale + 
 offset
 and reg*scale + reg.

 I can't conceive of a realistic case where one would want to scale the
 base address.  Scaling the offset is fine, but never the base.  So
 reg*scale+offset seems meaningless.

 Base + Offset * Scale on the other hand makes much more sense.

 Of course, this is all about terminology, so if you regard an immediate,
 such as a symbol as an offset, then perhaps you have something close to
 the original term, but I think then you've inverted things, since your
 symbol is really the base, not the offset.

Sure, this can't be the complete memory address - but that doesn't seem to be
what IVOPTs is testing.  Your example of $SYM + offset * scale is a good one.
IVOPTs is merely asking whether it can use a scaled offset.
That the present test doesn't work for arm hints at that it doesn't build up
the correct RTL for the test, but also since this works for other targets which
probably also don't only have a scaled offset without a base it isn't all that
clear what a) the code's desire is, and b) the correct solution is without
regressing on
other targets.

Btw, it should be reasonably possible to compute the whole
multiplier_allowed_in_address_p table for all primary and secondary archs
(simply build cross-cc1) and compare the results before / after a patch
candidate.  Querying both reg * scale and reg + reg * scale if the first
fails sounds like a good solution to me.

Richard.

 R.



Re: expand_expr tweaks to fix PR57134

2013-09-24 Thread Richard Biener
On Tue, Sep 24, 2013 at 12:43 PM, Alan Modra amo...@gmail.com wrote:
 On Fri, Sep 13, 2013 at 12:37:20PM +0930, Alan Modra wrote:
   PR middle-end/57586
   * stmt.c (expand_asm_operands): Call expand_expr with
   EXPAND_MEMORY for output operands that disallow regs.  Don't
   use EXPAND_WRITE on inout operands.

 Ping?

Ok.

Thanks,
Richard.

 --
 Alan Modra
 Australia Development Lab, IBM


[PATCH] Refactor type handling in get_alias_set, fix PR58513

2013-09-24 Thread Richard Biener

As noted in PR58513 the alias type comparison in operand_equal_p for
MEM_REF and TARGET_MEM_REF is overly restrictive.  The following
adds a new alias_ptr_types_compatible_p helper for a less conservative
comparison and refactors get_alias_set and reference_alias_ptr_type
to share code and behave in a more similar manner.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2013-09-24  Richard Biener  rguent...@suse.de

PR middle-end/58513
* tree.c (reference_alias_ptr_type): Move ...
* alias.c (reference_alias_ptr_type): ... here and implement
in terms of the new reference_alias_ptr_type_1.
(ref_all_alias_ptr_type_p): New helper.
(get_deref_alias_set_1): Drop flag_strict_aliasing here,
use ref_all_alias_ptr_type_p.
(get_deref_alias_set): Add flag_strict_aliasing check here.
(reference_alias_ptr_type_1): New function, split out from ...
(get_alias_set): ... here.
(alias_ptr_types_compatible_p): New function.
* alias.h (reference_alias_ptr_type): Declare.
(alias_ptr_types_compatible_p): Likewise.
* tree.h (reference_alias_ptr_type): Remove.
* fold-const.c (operand_equal_p): Use alias_ptr_types_compatible_p
to compare MEM_REF alias types.

* g++.dg/vect/pr58513.cc: New testcase.

Index: gcc/tree.c
===
*** gcc/tree.c.orig 2013-09-24 10:48:25.0 +0200
--- gcc/tree.c  2013-09-24 10:48:27.646770499 +0200
*** mem_ref_offset (const_tree t)
*** 4263,4286 
return tree_to_double_int (toff).sext (TYPE_PRECISION (TREE_TYPE (toff)));
  }
  
- /* Return the pointer-type relevant for TBAA purposes from the
-gimple memory reference tree T.  This is the type to be used for
-the offset operand of MEM_REF or TARGET_MEM_REF replacements of T.  */
- 
- tree
- reference_alias_ptr_type (const_tree t)
- {
-   const_tree base = t;
-   while (handled_component_p (base))
- base = TREE_OPERAND (base, 0);
-   if (TREE_CODE (base) == MEM_REF)
- return TREE_TYPE (TREE_OPERAND (base, 1));
-   else if (TREE_CODE (base) == TARGET_MEM_REF)
- return TREE_TYPE (TMR_OFFSET (base)); 
-   else
- return build_pointer_type (TYPE_MAIN_VARIANT (TREE_TYPE (base)));
- }
- 
  /* Return an invariant ADDR_EXPR of type TYPE taking the address of BASE
 offsetted by OFFSET units.  */
  
--- 4263,4268 
Index: gcc/tree.h
===
*** gcc/tree.h.orig 2013-09-24 10:48:25.0 +0200
--- gcc/tree.h  2013-09-24 10:48:27.662770694 +0200
*** extern tree build_simple_mem_ref_loc (lo
*** 4345,4351 
  #define build_simple_mem_ref(T)\
build_simple_mem_ref_loc (UNKNOWN_LOCATION, T)
  extern double_int mem_ref_offset (const_tree);
- extern tree reference_alias_ptr_type (const_tree);
  extern tree build_invariant_address (tree, tree, HOST_WIDE_INT);
  extern tree constant_boolean_node (bool, tree);
  extern tree div_if_zero_remainder (enum tree_code, const_tree, const_tree);
--- 4345,4350 
Index: gcc/alias.c
===
*** gcc/alias.c.orig2013-09-24 10:48:25.0 +0200
--- gcc/alias.c 2013-09-24 10:55:15.949616280 +0200
*** component_uses_parent_alias_set (const_t
*** 547,552 
--- 547,564 
  }
  }
  
+ 
+ /* Return whether the pointer-type T effective for aliasing may
+access everything and thus the reference has to be assigned
+alias-set zero.  */
+ 
+ static bool
+ ref_all_alias_ptr_type_p (const_tree t)
+ {
+   return (TREE_CODE (TREE_TYPE (t)) == VOID_TYPE
+ || TYPE_REF_CAN_ALIAS_ALL (t));
+ }
+ 
  /* Return the alias set for the memory pointed to by T, which may be
 either a type or an expression.  Return -1 if there is nothing
 special about dereferencing T.  */
*** component_uses_parent_alias_set (const_t
*** 554,564 
  static alias_set_type
  get_deref_alias_set_1 (tree t)
  {
-   /* If we're not doing any alias analysis, just assume everything
-  aliases everything else.  */
-   if (!flag_strict_aliasing)
- return 0;
- 
/* All we care about is the type.  */
if (! TYPE_P (t))
  t = TREE_TYPE (t);
--- 566,571 
*** get_deref_alias_set_1 (tree t)
*** 566,573 
/* If we have an INDIRECT_REF via a void pointer, we don't
   know anything about what that might alias.  Likewise if the
   pointer is marked that way.  */
!   if (TREE_CODE (TREE_TYPE (t)) == VOID_TYPE
!   || TYPE_REF_CAN_ALIAS_ALL (t))
  return 0;
  
return -1;
--- 573,579 
/* If we have an INDIRECT_REF via a void pointer, we don't
   know anything about what that might alias.  Likewise if the
   pointer is marked that way.  */
!   if (ref_all_alias_ptr_type_p (t))
  return 0;
  
return -1;
*** 

Re: Ping patch to enable *.cc files in gengtype

2013-09-24 Thread Basile Starynkevitch
On Fri, Sep 20, 2013 at 05:56:22PM +0200, Jakub Jelinek wrote:
 On Fri, Sep 20, 2013 at 05:52:38PM +0200, Basile Starynkevitch wrote:
  On Fri, Sep 20, 2013 at 09:53:10AM -0400, Diego Novillo wrote:
   On 2013-09-16 04:19 , Basile Starynkevitch wrote:
   Hello all,
   
   I'm pinging the patch (of september 2nd) on
   http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00036.html
   
   
    gcc/ChangeLog entry
   
   2013-09-16  Basile Starynkevitch  bas...@starynkevitch.net
   
* gengtype.c (file_rules): Added rule for *.cc files.
(get_output_file_with_visibility): Give fatal message when no
rules found.
   
   OK.
  
  Thanks for the review and the approval. I just commited  revision 202782.
  
  BTW, I believe this patch should be back ported, at least to next GCC 4.8.2
  (and perhaps even to 4.7). What do you think?
 
 Why?  There are no *.cc files in 4.8.x and I don't see why this patch would
 be desirable for the release branches.


Because 4.8 (and even 4.7) plugins may have .cc files, and (when they have 
GTY-ed data) they run gengtype when being built.

In general, I still do find that naming .c a C++ file is very confusing 
(especially for newbies). BTW, some other compilers don't like that 
(in particular Clang issues a warning when compiling a C++ file named .c, 
so Clang issues a lot of warnings when compiling GCC)


Of course, one might say that no recent C++ formal standard states that 
C++ source files should have a .cc or .cxx or .cpp or .C extension 
(but I am not sure of this), but it has been common practice for several 
decades (probably since the origin of C++).

IMHO plugins for GCC don't have any reason to follow the GCC 
(obscure and undocumented) bad & crazy habit of naming .c a C++ file 
(that a plain C compiler won't even compile!).

So I believe that the plugin infrastructure should accept - and even mandate, 
or at least strongly advise - that plugin files should be named *.cc (or *.cpp).

And since GCC 4.8 (and even 4.7) requires a C++ compiler, their plugins 
should generally be coded in C++ not in C. And such a plugin C++ source 
file has absolutely no reason to be named *.c (the reason why GCC still 
has *.c files is some obscure version-control reason; one might be 
tempted to think that if the version control wants *.c files for C++ source, 
that version control is broken and should be replaced by some more 
fancy thing...)


Not accepting plugin source files named .cc for C++ source code is IMHO a bug.
And some plugins (those with GTY-ed data) do use gengtype during their build.

Cheers.
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: [PATCH, ARM, LRA] Prepare ARM build with LRA

2013-09-24 Thread Yvan Roux
Hi Eric,

Thanks for the review.

 +/* Return true if X is a sign_extract or zero_extract from the least
 +   significant bit.  */
 +
 +static bool
 +lsb_bitfield_op_p (rtx x)
 +{
 +  if (GET_RTX_CLASS (GET_CODE (x)) == RTX_BITFIELD_OPS)
 +{
 +  enum machine_mode mode = GET_MODE(x);
 +  unsigned HOST_WIDE_INT len = INTVAL (XEXP (x, 1));
 +  HOST_WIDE_INT pos = INTVAL (XEXP (x, 2));
 +
 +  return (pos == (BITS_BIG_ENDIAN ? GET_MODE_PRECISION (mode) - len : 0));

 It seems strange to use the destination mode to decide whether this is the LSB
 of the source.

Indeed, I think it has to be the mode of loc, but I just wonder if it
is not always the same, as in the doc it is written that mode m is the
same as the mode that would be used for loc if it were a register.


 +/* Return true if X is a shifting operation.  */
 +
 +static bool
 +shift_code_p (rtx x)
 +{
 +  return (GET_CODE (x) == ASHIFT
 +  || GET_CODE (x) == ASHIFTRT
 +  || GET_CODE (x) == LSHIFTRT
 +  || GET_CODE (x) == ROTATE
 +  || GET_CODE (x) == ROTATERT);
 +}

 ROTATE and ROTATERT aren't really shifting operations though, so are they
 really needed here?

According to gcc internals, ROTATE and ROTATERT are similar to the
shifting operations, but to be more accurate maybe we can rename
shift_code_p to shift_and_rotate_code_p.  Rotations are used in arm
address calculation, and thus need to be handled in must_be_index_p
and set_address_index.

Thanks,
Yvan


Re: New GCC options for loop vectorization

2013-09-24 Thread Richard Biener
On Mon, Sep 23, 2013 at 10:37 PM, Xinliang David Li davi...@google.com wrote:
 Thanks. I modified the patch so that the max allowed peel iterations
 can be specified via a parameter. Testing on going. Ok for trunk ?

+DEFPARAM(PARAM_VECT_MAX_PEELING_FOR_ALIGNMENT,
+ vect-max-peeling-for-alignment,
+ Max number of loop peels to enhancement alignment of data
references in a loop,
+ -1, 0, 0)

I believe this doesn't allow --param vect-max-peeling-for-alignment=-1 as you
specify 0 as the minimum.

So, ok with changing minimum and maximum to -1 (double check that this works).

Thanks,
Richard.



 David

 On Mon, Sep 23, 2013 at 4:31 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Wed, Sep 18, 2013 at 10:21 PM, Xinliang David Li davi...@google.com 
 wrote:
 On Tue, Sep 17, 2013 at 1:20 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li davi...@google.com 
 wrote:
 On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li davi...@google.com 
 wrote:
 On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li 
 davi...@google.com wrote:
 Currently -ftree-vectorize turns on both loop and slp vectorizations,
 but there is no simple way to turn on loop vectorization alone. The
 logic for default O3 setting is also complicated.

 In this patch, two new options are introduced:

 1) -ftree-loop-vectorize

 This option is used to turn on loop vectorization only. option
 -ftree-slp-vectorize also becomes a first class citizen, and no funny
 business of Init(2) is needed.  With this change, -ftree-vectorize
 becomes a simple alias to -ftree-loop-vectorize +
 -ftree-slp-vectorize.

 For instance, to turn on only slp vectorize at O3, the old way is:

  -O3 -fno-tree-vectorize -ftree-slp-vectorize

 With the new change it becomes:

 -O3 -fno-tree-loop-vectorize


 To turn on only loop vectorize at O2, the old way is

 -O2 -ftree-vectorize -fno-slp-vectorize

 The new way is

 -O2 -ftree-loop-vectorize



 2) -ftree-vect-loop-peeling

 This option is used to turn on/off loop peeling for alignment.  In the
 long run, this should be folded into the cheap cost model proposed by
 Richard.  This option is also useful in scenarios where peeling can
 introduce runtime problems:
 http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html  which happens to be
 common in practice.



 Patch attached. Compiler boostrapped. Ok after testing?

 I'd like you to split 1) and 2), mainly because I agree on 1) but not 
 on 2).

 Ok. Can you also comment on 2) ?

 I think we want to decide how granular we want to control the vectorizer
 and using which mechanism.  My cost-model re-org makes
 ftree-vect-loop-version a no-op (basically removes it), so 2) looks like
 a step backwards in this context.

 Using cost model to do a coarse grain control/configuration is
 certainly something we want, but having a fine grain control is still
 useful.


 So, can you summarize what pieces (including versioning) of the 
 vectorizer
 you'd want to be able to disable separately?

 Loop peeling seems to be the main one. There is also a correctness
 issue related. For instance, the following code is common in practice,
 but loop peeling wrongly assumes initial base-alignment and generates
 aligned mov instruction after peeling, leading to SEGV.  Peeling is
 not something we can blindly turned on -- even when it is on, there
 should be a way to turn it off explicitly:

 char a[1];

 void foo(int n)
 {
   int* b = (int*)(a+n);
   int i = 0;
   for (; i < 1000; ++i)
 b[i] = 1;
 }

 int main(int argn, char** argv)
 {
   foo(argn);
 }

 But that's just a bug that should be fixed (looking into it).

  Just disabling peeling for
 alignment may get you into the versioning for alignment path (and thus
 an unvectorized loop at runtime).

 This is not true for targets supporting mis-aligned access. I have not
 seen a case where alignment-driven loop versioning happens on x86.

Also it's known that the alignment peeling
 code needs some serious TLC (its outcome depends on the order of DRs,
 and the cost model it uses leaves something to be desired as we cannot
 distinguish between unaligned load and store costs).

 Yet another reason to turn it off as it is not effective anyways?

 As said I'll disable all remains of -ftree-vect-loop-version with the cost 
 model
 patch because it wasn't guarding versioning for aliasing but only 
 versioning
 for alignment.

 We have to be consistent here - if we add a way to disable peeling for
 alignment then we certainly don't want to remove the ability to disable
 versioning for alignment, no?

 We already have the ability to turn off versioning -- via --param. It
 is a more natural way to fine tune a pass instead of introducing a -f
 option. For this reason, your planned deprecation of the option is a
 good 

Re: [PATCH, PR 57748] Check for out of bounds access

2013-09-24 Thread Richard Biener
On Tue, Sep 17, 2013 at 2:08 PM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:
 On Tue, 17 Sep 2013 12:45:40, Richard Biener wrote:

 On Tue, Sep 17, 2013 at 12:00 PM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Sun, Sep 15, 2013 at 6:55 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
 Hello Richard,

 attached is my second attempt at fixing PR 57748. This time the movmisalign
 path is completely removed and a similar bug in the read handling of 
 misaligned
 structures with a non-BLKmode is fixed too. There are several new test 
 cases for the
 different possible failure modes.

 This patch was boot-strapped and regression tested on 
 x86_64-unknown-linux-gnu
 and i686-pc-linux-gnu.

 Additionally I generated eCos and an eCos-application (on ARMv5 using 
 packed
 structures) with an arm-eabi cross compiler, and looked for differences in 
 the
 disassembled code with and without this patch, but there were none.

 OK for trunk?

 I agree that the existing movmisaling path that you remove is simply bogus, 
 so
 removing it looks fine to me. Can you give rationale to

 @@ -4773,6 +4738,8 @@ expand_assignment (tree to, tree from, b
  if (MEM_P (to_rtx)
      && GET_MODE (to_rtx) == BLKmode
      && GET_MODE (XEXP (to_rtx, 0)) != VOIDmode
  +   && bitregion_start == 0
  +   && bitregion_end == 0
      && bitsize > 0
      && (bitpos % bitsize) == 0
      && (bitsize % GET_MODE_ALIGNMENT (mode1)) == 0

 OK, as already said, I think it could be dangerous to set bitpos=0 without
 considering bitregion_start/end, but I think it may be possible that this
 cannot happen, because if bitsize is a multiple of ALIGNMENT, and
 bitpos is a multiple of bitsize, we probably do not have a bit-field at all.
 And of course I have no test case that fails without this hunk.
 Maybe it would be better to add an assertion here like:

   {
   gcc_assert (bitregion_start == 0 && bitregion_end == 0);
   to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT);
   bitpos = 0;
   }

 and especially to

 @@ -9905,7 +9861,7 @@ expand_expr_real_1 (tree exp, rtx target
  && modifier != EXPAND_STACK_PARM
 ? target : NULL_RTX),
 VOIDmode,
 - modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier);
 + EXPAND_MEMORY);

 /* If the bitfield is volatile, we want to access it in the
 field's mode, not the computed mode.

 which AFAIK makes memory expansion of loads/stores from/to registers
 change (fail? go through stack memory?) - see handling of non-MEM return
 values from that expand_expr call.

 I wanted to make the expansion of MEM_REF and TARGET_MEM_REF
 not go thru the final misalign handling, which is guarded by
 if (modifier != EXPAND_WRITE && modifier != EXPAND_MEMORY && ...

 What we want here is most likely EXPAND_MEMORY, which returns a
 memory context if possible.

 Could you specify more explicitly what you mean by handling of non-MEM 
 return
 values from that expand_expr call, then I could try finding test cases for
 that.


 In particular this seems to disable all movmisalign handling for MEM_REFs
 wrapped in component references which looks wrong. I was playing with

 typedef long long V
 __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));

 struct S { long long a[11]; V v; }__attribute__((aligned(8),packed)) ;
 struct S a, *b = &a;
 V v, w;

 int main()
 {
 v = b->v;
 b->v = w;
 return 0;
 }

 (use -fno-common) and I see that we use unaligned stores too often
 (even with a properly aligned MEM).

 The above at least shows movmisalign opportunities wrapped in component-refs.

 hmm, interesting. This does not compile differently with or without this 
 patch.

 I have another observation, regarding the testcase pr50444.c:

 method:
 .LFB4:
 .cfi_startproc
 movq    32(%rdi), %rax
 testq   %rax, %rax
 jne     .L7
 addl    $1, 16(%rdi)
 movl    $3, %eax
 movq    %rax, 32(%rdi)
 movdqu  16(%rdi), %xmm0
 pxor    (%rdi), %xmm0
 movdqu  %xmm0, 40(%rdi)

 here the first movdqu could as well be movdqa, because 16+rdi is 128-bit 
 aligned.
 In the ctor method a movdqa is used, but the SRA is very pessimistic and 
 generates
 an unaligned MEM_REF. Also this example does not compile any differently with 
 this patch.


 That is, do you see anything break with just removing the movmisalign path?
 I'd rather install that (with the new testcases that then pass) separately 
 as
 this is a somewhat fragile area and being able to more selectively
 bisect/backport
 would be nice.

 No, I think that is a good idea.

 Attached the first part of the patch, that does only remove the movmisalign 
 path.

 Should I apply this one after regression testing?

It seems you already have...?

Richard.

 Bernd.

 Thanks,
 Richard.

 Regards
 Bernd.


Re: gimple build interface

2013-09-24 Thread Andrew MacLeod

On 09/24/2013 05:50 AM, Richard Biener wrote:



Did you forget to attach the patch?


Of course not!  :-P

Oh how I hate mondays.

Old patch attached?

Richard.


Errr. no.  that last one has gimple-builder.[ch] and the 
ssa_replace_lhs renamed to gimple_replace_ssa_lhs..


I'll also wait for Tromey's auto dependency patch, and then just add the 
.o to the list to be built in Makefile.in.


Andrew


Re: gimple build interface

2013-09-24 Thread Richard Biener
On Tue, Sep 24, 2013 at 1:31 PM, Andrew MacLeod amacl...@redhat.com wrote:
 On 09/24/2013 05:50 AM, Richard Biener wrote:


 Did you forget to attach the patch?

 Of course not!  :-P

 Oh how I hate mondays.

 Old patch attached?

 Richard.


 Errr. no.  that last one has gimple-builder.[ch] and the ssa_replace_lhs
 renamed to gimple_replace_ssa_lhs..

 I'll also wait for Tromey's auto dependency patch, and then just add the .o
 to the list to be built in Makefile.in.

Ah, stupid firefox saving as gimp(1).patch.

This looks ok.
Thanks,
Richard.

 Andrew


Re: [PATCH]Construct canonical scaled address expression in IVOPT

2013-09-24 Thread Bin.Cheng
On Tue, Sep 24, 2013 at 6:12 PM, Richard Biener
richard.guent...@gmail.com wrote:
 On Tue, Sep 24, 2013 at 8:20 AM, bin.cheng bin.ch...@arm.com wrote:


 -Original Message-

 Or even [reg*scale] (not sure about that).  But yes, at least reg*scale + 
 offset
 and reg*scale + reg.

   Apparently it's infeasible to check every
 possibility for each architecture, is it ok we at least check index, then
 addr if index is failed?  By any kind of addressing modes, I mean
 modes supported by function get_address_cost,  i.e., in form of [base] +
 [off] + [var] + (reg|reg*scale).

 I suppose so, but it of course depends on what IVOPTs uses the answer
 for in the end.  Appearantly it doesn't distinguish between the various cases
 even though TARGET_MEM_REF does support all the variants in question
 (reg * scale, reg * scale + reg, reg * scale + const, reg * scale +
 reg + const).

 So the better answer may be to teach the costs about the differences?
Ideally, IVOPT should be aware whether scaling is allowed in every
kind of addressing mode and account for the cost of the multiplier
accordingly.  For the current code, there are two scenarios here:
1) If the target supports reg*scale+reg, but not reg*scale, IVOPT
considers the multiplier not allowed in any addressing mode and
accounts for the multiplier with a high cost.  This is the problem
arm is having.
2) If the target supports reg*scale, but not some kind of addressing
mode (say reg*scale+reg), IVOPT still constructs various scaled
addressing modes in get_address_cost and depends on address_cost to
compute the correct cost for that addressing expression.  I think this
happens to work even though IVOPT doesn't know reg*scale+reg is
actually not supported.


 The above also builds more RTX waste which you can fix by re-using the
 PLUS
 by building it up-front similar to the multiplication.  You also miss the
 Yes, this can be fixed.

 opportunity to have scale == 1 denote as to whether reg1 + reg2 is valid.
 I
 would expect that many targets support reg1 * scale + constant-offset but
 not many reg1 * scale + reg2.
 I thought scale==1 is unnecessary because the addressing mode degrades into
 reg or reg+reg.  Moreover, calls of multiplier_allowed_in_address_p in
 both get_address_cost and get_computation_cost_at have scale other than 1.

 Ok.


 So no, the helper now checks sth completely different.  What's the problem
 with arm supporting reg1 * scale?  Why shouldn't it being able to handle
 the
 implicit zero offset?

 As Richard clarified, ARM does not support scaled addressing mode without
 base register.

 I see.

Also from the newer comments:

 Btw, it should be reasonably possible to compute the whole
 multiplier_allowed_in_address_p table for all primary and secondary archs
 (simply build cross-cc1) and compare the results before / after a patch
 candidate.  Querying both reg * scale and reg + reg * scale if the first
 fails sounds like a good solution to me.
I take this as meaning we should make a minimal change by checking reg + reg *
scale if reg * scale fails, right?

Thanks.
bin
-- 
Best Regards.


[PATCH][RFC] Remove quadratic loop with component_uses_parent_alias_set

2013-09-24 Thread Richard Biener

Eric, the current way component_uses_parent_alias_set is called from
get_alias_set causes the reference tree chain to be walked O(n^2)
in case there is any DECL_NONADDRESSABLE_P or TYPE_NONALIASED_COMPONENT
reference in the tree chain.  Also the comment above
component_uses_parent_alias_set doesn't seem to match behavior
in get_alias_set.  get_alias_set ends up using exactly the type of 
the parent
of the DECL_NONADDRESSABLE_P / TYPE_NONALIASED_COMPONENT reference
instead of the alias set provided by the object at the heart of T

I also fail to see why component_uses_parent_alias_set invokes
get_alias_set (is that to make 'a.c' in struct { char c; } a;
use the alias set of 'a' instead of the alias set of 'char'?
Or is it for correctness reasons in case there is a ref-all
indirect ref at the base?  For the former I believe it doesn't
work because it only checks the non-outermost type).

So, the following patch removes the quadraticness by returning
the parent of the innermost non-addressable (or alias-set zero)
reference component instead of just a flag.  That changes behavior
for the alias-set zero case but still doesn't match the overall
function comment.

The only other use besides from get_alias_set is to set
MEM_KEEP_ALIAS_SET_P - whatever that is exactly for, I can
see a single non-obvious use in store_constructor_field
(didn't bother to lookup how callers specify the alias_set argument).
In

if (MEM_P (to_rtx) && !MEM_KEEP_ALIAS_SET_P (to_rtx)
    && DECL_NONADDRESSABLE_P (field))
  {
to_rtx = copy_rtx (to_rtx);
MEM_KEEP_ALIAS_SET_P (to_rtx) = 1;
  }

we can just drop the MEM_KEEP_ALIAS_SET_P check (well, in case
MEM_KEEP_ALIAS_SET_P is dropped).  And for the following call

store_constructor_field (to_rtx, bitsize, bitpos, mode,
 value, cleared,
 get_alias_set (TREE_TYPE (field)));

pass MEM_ALIAS_SET instead of get_alias_set (TREE_TYPE (field)) for
DECL_NONADDRESSABLE_P (field), similar for the array handling in
the other use.

Btw, checks like

 if (!MEM_KEEP_ALIAS_SET_P (to_rtx) && MEM_ALIAS_SET (to_rtx) != 0)
set_mem_alias_set (to_rtx, alias_set);

seem to miss MEM_ALIAS_SET (to_rtx) being ALIAS_SET_MEMORY_BARRIER?
Or alias_set being zero / ALIAS_SET_MEMORY_BARRIER?

Anyway, the patch nearly preserves behavior for the emit-rtl.c use.

Comments?

Thanks,
Richard.

Index: gcc/alias.c
===
--- gcc/alias.c (revision 202865)
+++ gcc/alias.c (working copy)
@@ -510,41 +510,49 @@ objects_must_conflict_p (tree t1, tree t
alias set used by the component, but we don't have per-FIELD_DECL
assignable alias sets.  */
 
-bool
+tree
 component_uses_parent_alias_set (const_tree t)
 {
-  while (1)
+  const_tree non_addressable = NULL_TREE;
+  while (handled_component_p (t))
 {
-  /* If we're at the end, it vacuously uses its own alias set.  */
-  if (!handled_component_p (t))
-   return false;
-
   switch (TREE_CODE (t))
{
case COMPONENT_REF:
  if (DECL_NONADDRESSABLE_P (TREE_OPERAND (t, 1)))
-   return true;
+   non_addressable = t;
  break;
 
case ARRAY_REF:
case ARRAY_RANGE_REF:
  if (TYPE_NONALIASED_COMPONENT (TREE_TYPE (TREE_OPERAND (t, 0
-   return true;
+   non_addressable = t;
  break;
 
case REALPART_EXPR:
case IMAGPART_EXPR:
  break;
 
-   default:
+   case BIT_FIELD_REF:
+   case VIEW_CONVERT_EXPR:
  /* Bitfields and casts are never addressable.  */
- return true;
+ non_addressable = t;
+ break;
+
+   default:
+ gcc_unreachable ();
}
 
-  t = TREE_OPERAND (t, 0);
   if (get_alias_set (TREE_TYPE (t)) == 0)
-   return true;
+   non_addressable = t;
+
+  t = TREE_OPERAND (t, 0);
 }
+ 
+  if (non_addressable)
+return TREE_OPERAND (non_addressable, 0);
+
+  return NULL_TREE;
 }
 
 
@@ -645,14 +653,11 @@ reference_alias_ptr_type_1 (tree *t)
   (TREE_TYPE (TREE_TYPE (TREE_OPERAND (inner, 1))
 return TREE_TYPE (TREE_OPERAND (inner, 1));
 
-  /* Otherwise, pick up the outermost object that we could have a pointer
- to, processing conversions as above.  */
-  /* ???  Ick, this is worse than quadratic!  */
-  while (component_uses_parent_alias_set (*t))
-{
-  *t = TREE_OPERAND (*t, 0);
-  STRIP_NOPS (*t);
-}
+  /* Otherwise, pick up the outermost object that we could have
+ a pointer to.  */
+  tree tem = component_uses_parent_alias_set (*t);
+  if (tem)
+*t = tem;
 
   return NULL_TREE;
 }
Index: gcc/alias.h
===
--- gcc/alias.h (revision 202865)
+++ gcc/alias.h (working copy)
@@ -33,7 +33,7 @@ extern alias_set_type get_alias_set 

Re: [PATCH]Construct canonical scaled address expression in IVOPT

2013-09-24 Thread Richard Biener
On Tue, Sep 24, 2013 at 1:40 PM, Bin.Cheng amker.ch...@gmail.com wrote:
 On Tue, Sep 24, 2013 at 6:12 PM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Tue, Sep 24, 2013 at 8:20 AM, bin.cheng bin.ch...@arm.com wrote:


 -Original Message-

 Or even [reg*scale] (not sure about that).  But yes, at least reg*scale + 
 offset
 and reg*scale + reg.

   Apparently it's infeasible to check every
 possibility for each architecture, is it ok we at least check index, then
 addr if index is failed?  By any kind of addressing modes, I mean
 modes supported by function get_address_cost,  i.e., in form of [base] +
 [off] + [var] + (reg|reg*scale).

 I suppose so, but it of course depends on what IVOPTs uses the answer
 for in the end.  Appearantly it doesn't distinguish between the various cases
 even though TARGET_MEM_REF does support all the variants in question
 (reg * scale, reg * scale + reg, reg * scale + const, reg * scale +
 reg + const).

 So the better answer may be to teach the costs about the differences?
 Ideally, IVOPT should be aware whether scaling is allowed in every
 kind of addressing modes and account cost of multiplier accordingly.
 For current code, there are two scenarios here
 1) If target supports reg*scale+reg, but not reg*scale, in this case,
 IVOPT considers multiplier is not allowed in any addressing mode and
 account multiplier with high cost.  This is the problem arm having.
 2) If target supports reg*scale, but not some kind of addressing mode
 (saying reg*scale+reg), in this case, IVOPT still constructs various
 scaled addressing mode in get_address_cost and depends on address_cost
 to compute correct cost for that addressing expression.  I think this
 happens to work even IVOPT doesn't know reg*scale+reg is actually
 not supported.


 The above also builds more RTX waste which you can fix by re-using the
 PLUS
 by building it up-front similar to the multiplication.  You also miss the
 Yes, this can be fixed.

 opportunity to have scale == 1 denote as to whether reg1 + reg2 is valid.
 I
 would expect that many targets support reg1 * scale + constant-offset but
 not many reg1 * scale + reg2.
 I thought scale==1 is unnecessary because the addressing mode degrades into
 reg or reg+reg.  Moreover, calls of multiplier_allowed_in_address_p in
 both get_address_cost and get_computation_cost_at have scale other than 1.

 Ok.


 So no, the helper now checks sth completely different.  What's the problem
 with arm supporting reg1 * scale?  Why shouldn't it being able to handle
 the
 implicit zero offset?

 As Richard clarified, ARM does not support scaled addressing mode without
 base register.

 I see.

 Also from the newer comments:

 Btw, it should be reasonably possible to compute the whole
 multiplier_allowed_in_address_p table for all primary and secondary archs
 (simply build cross-cc1) and compare the results before / after a patch
 candidate.  Querying both reg * scale and reg + reg * scale if the first
 fails sounds like a good solution to me.
 I take this as we should do minimal change by checking reg + reg *
 scale if reg * scale is failed,  right?

Yes, you can share a single RTL expression for all this and I think
querying reg + reg * scale first makes sense (then fallback to reg * scale
for compatibility).

Richard.

 Thanks.
 bin
 --
 Best Regards.


Re: [Patch] match_results::format and regex_replace

2013-09-24 Thread Ed Smith-Rowland

On 09/23/2013 10:09 PM, Tim Shen wrote:

On Sun, Sep 22, 2013 at 4:20 PM, Paolo Carlini paolo.carl...@oracle.com wrote:

If testing goes well patch is Ok to commit.

Tested under -m32 and -m64 and committed :)

I'll learn how locale in glibc works.

Thank you all!



Thank *you*!

regex has been dogging us for years.

I, for one, hope you hang around after GSOC.  There are lots of library 
components coming up for C++2014 and other TS. ;-)


Ed



Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-09-24 Thread Jan Hubicka
 Hi Honza,
 
 I am finally getting back to working on this after a few weeks of
 working on some other priorities.

I am also trying to return to this, so good timing ;)
Martin has got smaller C++ programs (Inkscape) to not touch the cold segment
during startup with FDO (w/o partitioning). Firefox still does; I think
the problem is lost samples due to different linker decisions even with LTO
(i.e. the linker picks an object from a .a library at profile-generate time
that it never passes later).

I plan to look into that today.
 
 Did you mean to commit the above change? I see that it went in as part
 of r202258 but doesn't show up in the ChangeLog entry for that
 revision.

Yes, I meant to check it in, but did not mean to do so w/o a ChangeLog.  I will
fix that.
 
 
  In other cases it was mostly loop unrolling in combination with jump 
  threading. So
  I modified my script to separately report when failure happens for test 
  trained
  once and test trained hundred times.
 
 Thanks for the linker script. I reproduced your results. I looked at a
 couple cases. The first was one that failed after 1 training run only
 (2910-2.c). It was due to jump threading, which you noted was a
 problem. For this one I think we can handle it in the partitioning,
 since there is an FDO insanity that we could probably treat more
 conservatively when splitting.

We should fix the roundoff issues - when I was introducing the
frequency/probability/count system I made an assumption that parts of programs
with very low counts do not matter, since they are not part of hot spot (and I
cared only about the hot spot).  Now we care about identifying unlikely
executed spots and we need to fix this.
 
 I looked at one that failed after 100 as well (20031204-1.c). In this
 case, it was due to expansion which was creating multiple branches/bbs
 from a logical OR and guessing incorrectly on how to assign the
 counts:
 
  if (octets == 4  (*cp == ':' || *cp == '\0')) {
 
 The (*cp == ':' || *cp == '\0') part looked like the following going
 into RTL expansion:
 
   [20031204-1.c : 31:33] _29 = _28 == 58;
   [20031204-1.c : 31:33] _30 = _28 == 0;
   [20031204-1.c : 31:33] _31 = _29 | _30;
   [20031204-1.c : 31:18] if (_31 != 0)
 goto bb 16;
   else
 goto bb 19;
 
 where the result of the OR was always true, so bb 16 had a count of
 100 and bb 19 a count of 0. When it was expanded, the expanded version
 of the above turned into 2 bbs with a branch in between. Both
 comparisons were done in the first bb, but the first bb checked
 whether the result of the *cp == '\0' compare was true, and if not
 branched to the check for whether the *cp == ':' compare was true. It
 gave the branch to the second check against ':' a count of 0, so that
 bb got a count of 0 and was split out, and put the count of 100 on the
 fall through assuming the compare with '\0' always evaluated to true.
 In reality, this OR condition was always true because *cp was ':', not
 '\0'. Therefore, the count of 0 on the second block with the check for
 ':' was incorrect, we ended up trying to execute it, and failed.
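
 A C-level sketch of the control flow just described (illustrative only, not
 the actual RTL the expander emits):

 static int
 or_test (const char *cp)
 {
   int is_colon = (*cp == ':');
   int is_nul = (*cp == '\0');

   if (is_nul)      /* the expander guessed this test is the one that
                       succeeds, so its successor got the whole count of 100 */
     return 1;
   if (is_colon)    /* this block got a count of 0 and was split out as cold,
                       yet it is the path actually taken when *cp == ':'     */
     return 1;
   return 0;
 }

 The zero count on the second test is thus an artifact of how the single
 branch on the OR result was re-expanded, not of the measured profile.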
 
 Presumably we had the correct profile data for both blocks, but the
 accuracy was reduced when the OR was represented as a logical
 computation with a single branch. We could change the expansion code
 to do something different, e.g. treat as a 50-50 branch. But we would
 still end up with integer truncation issues when there was a single
 training run. But that could be dealt with conservatively in the
 bbpart code as I suggested for the jump threading issue above. I.e. a
 cold block with incoming non-cold edges conservatively not marked cold
 for splitting.
 
 
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/2422-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/2910-2.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20020413-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20030903-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20031204-1.c
  FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20031204-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20060420-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20060905-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120427-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120427-2.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120808-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20121108-1.c
  FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20121108-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920501-6.c
  FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920501-6.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920726-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/981001-1.c
  FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/981001-1.c
  FAIL1 

Re: [PATCH] Fix typo in check for null

2013-09-24 Thread Marek Polacek
On Mon, Sep 23, 2013 at 02:03:02PM +, Paulo Matos wrote:
 This is a patch for master, where in number_of_loops, we want to check if the 
 loops (returned by loops_for_fn) is non-null but instead we are using fn, 
 which is the function argument. I haven't opened a bug report, please let me 
 know if I need to do that before submitting patches.
 
 2013-09-23 Paulo Matos pma...@broadcom.com
 
   * gcc/cfgloop.h (number_of_loops): Fix typo in check for null

The CL should have two spaces (twice): between the date and the name,
and between the name and the e-mail. Plus please drop the gcc/ prefix and add a full stop at
the end of the sentence.

Marek


Re: [PATCH][ARM][testsuite] Add effective target check for arm conditional execution

2013-09-24 Thread Kyrill Tkachov

On 13/09/13 16:25, Kyrill Tkachov wrote:

Hi all,

gcc.target/arm/minmax_minus.c is really only valid when we have
conditional execution available, that is non Thumb1-only targets. I've
added an effective target check for that and used it in the test so that
it does not get run and give a false negative when testing Thumb1 targets.

Ok for trunk?

Thanks,
Kyrill

2013-09-13  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * lib/target-supports.exp (check_effective_target_arm_cond_exec):
  New procedure.
  * gcc.target/arm/minmax_minus.c: Check for cond_exec target.

Ping.



Re: patch to canonize small wide-ints.

2013-09-24 Thread Richard Biener
On Tue, 17 Sep 2013, Kenneth Zadeck wrote:

 Richi,
 
 This patch canonizes the bits above the precision for wide ints with types or
 modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.
 
 I expect that most of the changes in rtl.h will go away.   in particular, when
 we decide that we can depend on richard's patch to clean up rtl constants,
 then the only thing that will be left will be the addition of the
 TARGET_SUPPORTS_WIDE_INT test.
 
 I do believe that there is one more conserved force in the universe than what
 physicist's generally consider: it is uglyness.  There is a lot of truth and
 beauty in the patch but in truth there is a lot of places where the uglyness
 is just moved someplace else.
 
 in the pushing the ugly around dept, trees and wide-ints are not canonized the
 same way.I spent several days going down the road where it tried to have
 them be the same, but it got very ugly having 32 bit unsigned int csts have
 the upper 32 bits set.   So now wide_int_to_tree and the wide-int constructors
 from tree-cst are now more complex.
 
 i think that i am in favor of this patch, especially in conjunction with
 richards cleanup, but only mildly.
 
 There is also some cleanup where richard wanted the long lines addressed.
 
 Ok to commit to the wide-int branch?

Looks good to me.

I'll be doing a separate review of the to/from tree parts when I
find time to do that, but that's unrelated to this patch.

Thanks,
Richard.


RE: [PATCH] Fix typo in check for null

2013-09-24 Thread Paulo Matos
 -Original Message-
 From: Marek Polacek [mailto:pola...@redhat.com]
 Sent: 24 September 2013 13:57
 To: Paulo Matos
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] Fix typo in check for null
 
 On Mon, Sep 23, 2013 at 02:03:02PM +, Paulo Matos wrote:
  This is a patch for master, where in number_of_loops, we want to check if
 the loops (returned by loops_for_fn) is non-null but instead we are using fn,
 which is the function argument. I haven't opened a bug report, please let me
 know if I need to do that before submitting patches.
 
  2013-09-23 Paulo Matos pma...@broadcom.com
 
* gcc/cfgloop.h (number_of_loops): Fix typo in check for null
 
 The CL should have two spaces (twice) between the date and the
 date and the e-mail, plus please drop the gcc/ prefix and add a full stop at
 the end of the sentence.
 
   Marek


Thanks for the comments Marek, will fix it.

2013-09-24  Paulo Matos pma...@broadcom.com

  * cfgloop.h (number_of_loops): Fix typo in check for null.


RE: [PATCH] Fix typo in check for null

2013-09-24 Thread Paulo Matos

 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: 24 September 2013 10:03
 To: Paulo Matos
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] Fix typo in check for null
 
 On Mon, Sep 23, 2013 at 4:03 PM, Paulo Matos pma...@broadcom.com wrote:
  This is a patch for master, where in number_of_loops, we want to check if
 the loops (returned by loops_for_fn) is non-null but instead we are using fn,
 which is the function argument. I haven't opened a bug report, please let me
 know if I need to do that before submitting patches.
 
 Patch is ok.
 
 Thanks,
 Richard.

Will submit once I get a reply from the write access request.

Cheers,
Paulo


Re: patch to canonize small wide-ints.

2013-09-24 Thread Richard Biener
On Tue, 24 Sep 2013, Kenneth Zadeck wrote:

 On 09/24/2013 09:39 AM, Richard Biener wrote:
  On Tue, 17 Sep 2013, Kenneth Zadeck wrote:
  
   Richi,
   
   This patch canonizes the bits above the precision for wide ints with types
   or
   modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.
   
   I expect that most of the changes in rtl.h will go away.   in particular,
   when
   we decide that we can depend on richard's patch to clean up rtl constants,
   then the only thing that will be left will be the addition of the
   TARGET_SUPPORTS_WIDE_INT test.
   
   I do believe that there is one more conserved force in the universe than
   what
   physicist's generally consider: it is uglyness.  There is a lot of truth
   and
   beauty in the patch but in truth there is a lot of places where the
   uglyness
   is just moved someplace else.
   
   in the pushing the ugly around dept, trees and wide-ints are not canonized
   the
   same way.I spent several days going down the road where it tried to
   have
   them be the same, but it got very ugly having 32 bit unsigned int csts
   have
   the upper 32 bits set.   So now wide_int_to_tree and the wide-int
   constructors
   from tree-cst are now more complex.
   
   i think that i am in favor of this patch, especially in conjunction with
   richards cleanup, but only mildly.
   
   There is also some cleanup where richard wanted the long lines addressed.
   
   Ok to commit to the wide-int branch?
  Looks good to me.
  
  I'll be doing a separate review of the to/from tree parts when I
  find time to do that, but that's unrelated to this patch.
  
  Thanks,
  Richard.
 as i said on irc, after i check in a cleaned up version of this (for ppc) i
 will change the rep for unsigned tree-csts so that they have an extra block of
 zeros if their top bit is set.   this will clean that up somewhat.

Yes, this will give wide-ints a sign (as I originally proposed).  For
RTL constants we're going to lie and make all constants negative
that have their MSB set.

Richard.


Re: [PATCH] Fix typo in check for null

2013-09-24 Thread Marek Polacek
On Tue, Sep 24, 2013 at 01:44:30PM +, Paulo Matos wrote:
 Thanks for the comments Marek, will fix it.
 
 2013-09-24  Paulo Matos pma...@broadcom.com

 ^^
Two spaces between name and e-mail.  Sorry for nitpicking like
this.  Thanks!

Marek


Re: patch to canonize small wide-ints.

2013-09-24 Thread Kenneth Zadeck

On 09/24/2013 09:39 AM, Richard Biener wrote:

On Tue, 17 Sep 2013, Kenneth Zadeck wrote:


Richi,

This patch canonizes the bits above the precision for wide ints with types or
modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.

I expect that most of the changes in rtl.h will go away.   in particular, when
we decide that we can depend on richard's patch to clean up rtl constants,
then the only thing that will be left will be the addition of the
TARGET_SUPPORTS_WIDE_INT test.

I do believe that there is one more conserved force in the universe than what
physicist's generally consider: it is uglyness.  There is a lot of truth and
beauty in the patch but in truth there is a lot of places where the uglyness
is just moved someplace else.

in the pushing the ugly around dept, trees and wide-ints are not canonized the
same way.I spent several days going down the road where it tried to have
them be the same, but it got very ugly having 32 bit unsigned int csts have
the upper 32 bits set.   So now wide_int_to_tree and the wide-int constructors
from tree-cst are now more complex.

i think that i am in favor of this patch, especially in conjunction with
richards cleanup, but only mildly.

There is also some cleanup where richard wanted the long lines addressed.

Ok to commit to the wide-int branch?

Looks good to me.

I'll be doing a separate review of the to/from tree parts when I
find time to do that, but that's unrelated to this patch.

Thanks,
Richard.
as i said on irc, after i check in a cleaned up version of this (for 
ppc) i will change the rep for unsigned tree-csts so that they have an 
extra block of zeros if their top bit is set.   this will clean that up 
somewhat.


thanks

kenny


Re: patch to canonize small wide-ints.

2013-09-24 Thread Kenneth Zadeck

On 09/24/2013 09:51 AM, Richard Biener wrote:

On Tue, 24 Sep 2013, Kenneth Zadeck wrote:


On 09/24/2013 09:39 AM, Richard Biener wrote:

On Tue, 17 Sep 2013, Kenneth Zadeck wrote:


Richi,

This patch canonizes the bits above the precision for wide ints with types
or
modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.

I expect that most of the changes in rtl.h will go away.   in particular,
when
we decide that we can depend on richard's patch to clean up rtl constants,
then the only thing that will be left will be the addition of the
TARGET_SUPPORTS_WIDE_INT test.

I do believe that there is one more conserved force in the universe than
what
physicist's generally consider: it is uglyness.  There is a lot of truth
and
beauty in the patch but in truth there is a lot of places where the
uglyness
is just moved someplace else.

in the pushing the ugly around dept, trees and wide-ints are not canonized
the
same way.I spent several days going down the road where it tried to
have
them be the same, but it got very ugly having 32 bit unsigned int csts
have
the upper 32 bits set.   So now wide_int_to_tree and the wide-int
constructors
from tree-cst are now more complex.

i think that i am in favor of this patch, especially in conjunction with
richards cleanup, but only mildly.

There is also some cleanup where richard wanted the long lines addressed.

Ok to commit to the wide-int branch?

Looks good to me.

I'll be doing a separate review of the to/from tree parts when I
find time to do that, but that's unrelated to this patch.

Thanks,
Richard.

as i said on irc, after i check in a cleaned up version of this (for ppc) i
will change the rep for unsigned tree-csts so that they have an extra block of
zeros if their top bit is set.   this will clean that up somewhat.

Yes, this will give wide-ints a sign (as I originally proposed).  For
RTL constants we're going to lie and make all constants negative
that have their MSB set.

Richard.
It only sort of does. Anyway, it seems to be mostly consistent. As I 
said on irc, I am pleasantly surprised at the lack of damage to the 
ports by assuming that the values are always canonized.   Two large 
public ports and my private port required no changes.
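
To make the proposed representation concrete (an illustration only, not code
from the branch, assuming 64-bit HWI blocks): under a sign-extended canonical
form, an unsigned constant whose most significant bit is set needs one extra
all-zero block so it still reads back as the intended positive value.

#include <stdio.h>
#include <stdint.h>

int
main (void)
{
  uint64_t cst = 0x8000000000000000ull;  /* unsigned constant, MSB set */

  /* One block, read back with sign extension: comes out negative.
     (The cast relies on the usual two's-complement wrap-around.)  */
  int64_t one_block[1] = { (int64_t) cst };

  /* Two blocks with a zero high block: reads back as the positive value.  */
  int64_t two_blocks[2] = { (int64_t) cst, 0 };

  printf ("one block : %lld\n", (long long) one_block[0]);
  printf ("two blocks: high = %lld, low = 0x%llx\n",
          (long long) two_blocks[1],
          (unsigned long long) two_blocks[0]);
  return 0;
}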


RE: [PATCH] Fix typo in check for null

2013-09-24 Thread Paulo Matos
 -Original Message-
 From: Marek Polacek [mailto:pola...@redhat.com]
 Sent: 24 September 2013 14:52
 To: Paulo Matos
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] Fix typo in check for null
 
 On Tue, Sep 24, 2013 at 01:44:30PM +, Paulo Matos wrote:
  Thanks for the comments Marek, will fix it.
 
  2013-09-24  Paulo Matos pma...@broadcom.com
 
^^
 Two spaces between name and e-mail.  Sorry for nitpicking like
 this.  Thanks!
 
   Marek

It's fine. :)

2013-09-24  Paulo Matos  pma...@broadcom.com

  * cfgloop.h (number_of_loops): Fix typo in check for null.


Re: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-09-24 Thread Eric Botcazou
 Index: gcc/expr.c
 ===
 --- gcc/expr.c  (revision 202778)
 +++ gcc/expr.c  (working copy)
 @@ -9878,7 +9878,7 @@
  && modifier != EXPAND_STACK_PARM
   ? target : NULL_RTX),
  VOIDmode,
 -modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier);
 +EXPAND_MEMORY);
 
 /* If the bitfield is volatile, we want to access it in the
field's mode, not the computed mode.
 
 context suggests that we may arrive with EXPAND_STACK_PARM here
 which is a correctness modifier (see its docs).  But I'm not too familiar
 with the details of the various expand modifiers, Eric may be though.

Yes, this change looks far too bold and is presumably papering over the 
underlying issue...

 That said, I still believe that fixing the misalign path in
 expand_assignment would be better than trying to avoid it.  For this
 testcase the issue is again that expand_assignment passes the wrong
 mode/target to the
 movmisalign optab.

...then let's just fix the movmisalign stuff.

-- 
Eric Botcazou


Re: [patch, libgfortran, configure] Cross-compile support for libgfortran

2013-09-24 Thread Richard Earnshaw
On 23/09/13 18:43, Steve Ellcey wrote:
 On Mon, 2013-09-23 at 16:26 +0100, Marcus Shawcroft wrote:
 On 4 June 2013 20:49, Steve Ellcey sell...@mips.com wrote:
 This patch allows me to build libgfortran for a cross-compiling toolchain
 using newlib.  Currently the checks done by AC_CHECK_FUNCS_ONCE fail with
 my toolchain because the compile/link fails due to the configure script not
 using the needed linker script in the link command.  The check for 
 with_newlib
 is how libjava deals with the problem and it fixes my build problems.

 My only concern is defining HAVE_STRTOLD, strtold exists in my newlib but
 I am not sure if that is true for all newlib builds.  I didn't see any
 flags that I could easily use to check for long double support in the
 libgfortran configure.ac, but it seems to assume that the type exists.

 OK to checkin?

 This patch breaks systems where long double is wider than double.  The
 newlib implementation provides strtold for systems where double is as
 wide as long double but not on systems where long double is wider.

 I don't see any issues with AC_CHECK_FUNC_ONCE when cross configuring
 aarch64 toolchains but I would have thought that fixing the link test
 issue you encountered would be preferable to hard coding assumptions
 in the configure script?

 Cheers
 /Marcus
 
 AC_CHECK_FUNC_ONCE may work for aarch64 but I don't think there is
 anyway to make it work generally for all platforms.  In the libjava
 configure.ac file is the comments:
 

I think it would be preferable to check whether link tests do something
sane and only fall back to hard-coding choices when that is not the
case.  As you've currently written things, we do it the other way
around.  Historically that's caused problems as newlib has evolved.





[PATCH, LRA] Remove REG_DEAD and REG_UNUSED notes.

2013-09-24 Thread Yvan Roux
Hi,

This patch removes REG_DEAD and REG_UNUSED notes in update_inc_notes,
as it is what the function is supposed to do (see the comments) and as
keeping these notes produce some failures, at least on ARM.

Thanks,
Yvan

2013-09-24  Yvan Roux  yvan.r...@linaro.org

* lra.c (update_inc_notes): Remove all REG_DEAD and REG_UNUSED notes.


lra-reg-notes.diff
Description: Binary data


Re: [PATCH, ARM, LRA] Prepare ARM build with LRA

2013-09-24 Thread Eric Botcazou
 Indeed, I think it has to be the mode of loc, but I just wonder if it
 is not always the same, as in the doc it is written that mode m is the
 same as the mode that would be used for loc if it were a register.

So can we assert that we have a REG here and use GET_MODE (XEXP (x, 0))?  Or 
else return false if we don't have a REG.

 According to gcc internals, ROTATE and ROTATERT are similar to the
 shifting operations, but to be more accurate maybe we can rename
 shift_code_p to shift_and_rotate_code_p.  Rotations are used in arm
 address calculation, and thus need to be handled in must_be_index_p
 and set_address_index.

Egad.  I guess I just wanted to see it written down. :-)

Btw, are you sure that you don't need to modify must_be_index_p instead?

/* Return true if X must be an index rather than a base.  */

static bool
must_be_index_p (rtx x)
{
  return GET_CODE (x) == MULT || GET_CODE (x) == ASHIFT;
}

and call it from set_address_index?
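
If rotates are to be treated like shifts here, presumably the predicate
would simply grow the two rotate codes; a sketch of that shape
(illustrative only, not the actual patch):

static bool
must_be_index_p (rtx x)
{
  return (GET_CODE (x) == MULT
	  || GET_CODE (x) == ASHIFT
	  /* Rotated index registers appear in ARM addresses.  */
	  || GET_CODE (x) == ROTATE
	  || GET_CODE (x) == ROTATERT);
}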

-- 
Eric Botcazou


Re: [PATCH, LRA] Remove REG_DEAD and REG_UNUSED notes.

2013-09-24 Thread Eric Botcazou
 This patch removes REG_DEAD and REG_UNUSED notes in update_inc_notes,
 as it is what the function is supposed to do (see the comments) and as
 keeping these notes produce some failures, at least on ARM.

The description is too terse.  In the RTL middle-end, you shouldn't have to 
manually deal with the REG_DEAD and REG_UNUSED notes (unlike REG_EQUAL and 
REG_EQUIV notes), as the DF framework is supposed to do it for you.

-- 
Eric Botcazou


Re: [PATCH, ARM, LRA] Prepare ARM build with LRA

2013-09-24 Thread Yvan Roux
 So can we assert that we have a REG here and use GET_MODE (XEXP (x, 0))?  Or
 else return false if we don't have a REG.

I'm currently testing the patch with the modification below

+/* Return true if X is a sign_extract or zero_extract operation that
+   extracts bits starting at the least significant bit.  */
+static bool
+lsb_bitfield_op_p (rtx x)
+{
+  if (GET_RTX_CLASS (GET_CODE (x)) == RTX_BITFIELD_OPS)
+{
+  enum machine_mode mode = GET_MODE(XEXP (x, 0));
+  unsigned HOST_WIDE_INT len = INTVAL (XEXP (x, 1));
+  HOST_WIDE_INT pos = INTVAL (XEXP (x, 2));
+
+  return (pos == (BITS_BIG_ENDIAN ? GET_MODE_PRECISION (mode) - len : 0));
+}
+  return false;
+}


 According to gcc internals, ROTATE and ROTATERT are similar to the
 shifting operations, but to be more accurate maybe we can rename
 shift_code_p to shift_and_rotate_code_p.  Rotations are used in arm
 address calculation, and thus need to be handled in must_be_index_p
 and set_address_index.

 Egad.  I guess I just wanted to see it written down. :-)

Sorry, I'm not sure I understand well, it means that you prefer the
shift_and _rotate_code_p notation, right ?

 Btw, are you sure that you don't need to modify must_be_index_p instead?

 /* Return true if X must be an index rather than a base.  */

 static bool
 must_be_index_p (rtx x)
 {
   return GET_CODE (x) == MULT || GET_CODE (x) == ASHIFT;
 }

 and call it from set_address_index?

Yes, if we don't modify must_be_index_p we'll have failures when
enabling LRA on ARM.  You can find an history if the patch in the
thread named RFC: patch to build GCC for arm with LRA which started
more or less here :

http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00526.html

Thanks,
Yvan

 --
 Eric Botcazou


Re: [PATCH, ARM, LRA] Prepare ARM build with LRA

2013-09-24 Thread Eric Botcazou
 Sorry, I'm not sure I understand well, it means that you prefer the
 shift_and _rotate_code_p notation, right ?

Let's do the following in addition to the lsb_bitfield_op_p thing:
  1. Replace the LO_SUM test in set_address_base by a call to must_be_base_p,
  2. Replace the MULT || ASHIFT test in set_address_index by a call to 
must_be_index_p,
  3. Add the new cases to must_be_index_p directly, with a comment saying that 
there are e.g. for the ARM.

-- 
Eric Botcazou


Re: [patch] Cleanup tree-ssa-ter.c exports

2013-09-24 Thread Andrew MacLeod

On 09/16/2013 10:42 AM, Andrew MacLeod wrote:

On 09/16/2013 04:55 AM, Richard Biener wrote:
On Fri, Sep 13, 2013 at 9:15 PM, Andrew MacLeod amacl...@redhat.com 
wrote:



OK, a slightly different take..
I realized that I should be adding tree-outof-ssa.h to handle the 3 
exports
from tree-outof-ssa.c that are in ssaexpand.h...  In fact, by far 
the most
sensible thing to do is to simply rename tree-outof-ssa.c to 
ssaexpand.c.

This actually resolves a number of warts... And is_replaceable_p() very
naturally fits in ssaexpand.c now...

what do you think of this option? :-)  and svn rename preserves all the
tree-outof-ssa.c history...
I don't like the new name for tree-outof-ssa.c, it matches less to 
its contents.


I'd say either keep ssaexpand.h and tree-outof-ssa.c as-is or rename
ssaexpand.h to tree-outof-ssa.h.  I prefer the latter.


I as well.  ssaexpand.h -> tree-outof-ssa.h it is.



The rest of the changes look ok to me, but watch out for odd whitespace
changes:

+static inline bool
+ter_is_replaceable_p (gimple stmt)
+{
+
+  if (ssa_is_replaceable_p (stmt))

spurious vertical space.



bah, where'd that come from :-P.

I'll check this approach in after running it through the gauntlet
again.


Andrew
Huh, still in my queue here.  For the record, here's the patch I was
going to apply, with the changes again (without the Makefile.in
dependencies); once I re-verify with a new bootstrap and testsuite runs
I'll check it in.


Andrew


	* tree-ssa-live.h (find_replaceable_exprs, dump_replaceable_exprs): Move
	prototypes to...
	* tree-ssa-ter.h: New File.  Move prototypes here.
	* tree-flow.h (stmt_is_replaceable_p): Remove prototype.
	* tree-outof-ssa.h: New. Rename ssaexpand.h, include tree-ssa-ter.h.
	* tree-outof-ssa.c (ssa_is_replaceable_p): New.  Refactor common bits
	from is_replaceable_p. 
	* tree-ssa-ter.c (is_replaceable_p, stmt_is_replaceable_p): Delete.
	(ter_is_replaceable_p): New.  Use new refactored ssa_is_replaceable_p.
	(process_replaceable): Use ter_is_replaceable_p.
	(find_replaceable_in_bb): Use ter_is_replaceable_p.
	* expr.c (stmt_is_replaceable_p): Relocate from tree-ssa-ter.c.  Use
	newly refactored ssa_is_replaceable_p.
	* cfgexpand.c: Include tree-outof-ssa.h.
	* ssaexpand.h: Delete.

Index: tree-ssa-live.h
===
*** tree-ssa-live.h	(revision 202699)
--- tree-ssa-live.h	(working copy)
*** make_live_on_entry (tree_live_info_p liv
*** 325,334 
  /* From tree-ssa-coalesce.c  */
  extern var_map coalesce_ssa_name (void);
  
- 
- /* From tree-ssa-ter.c  */
- extern bitmap find_replaceable_exprs (var_map);
- extern void dump_replaceable_exprs (FILE *, bitmap);
- 
- 
  #endif /* _TREE_SSA_LIVE_H  */
--- 325,328 
Index: tree-ssa-ter.h
===
*** tree-ssa-ter.h	(revision 0)
--- tree-ssa-ter.h	(revision 0)
***
*** 0 
--- 1,26 
+ /* Header file for tree-ssa-ter.c exports.
+Copyright (C) 2013 Free Software Foundation, Inc.
+ 
+ This file is part of GCC.
+ 
+ GCC is free software; you can redistribute it and/or modify it under
+ the terms of the GNU General Public License as published by the Free
+ Software Foundation; either version 3, or (at your option) any later
+ version.
+ 
+ GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+ WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+  for more details.
+ 
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3.  If not see
+ http://www.gnu.org/licenses/.  */
+ 
+ #ifndef GCC_TREE_SSA_TER_H
+ #define GCC_TREE_SSA_TER_H
+ 
+ extern bitmap find_replaceable_exprs (var_map);
+ extern void dump_replaceable_exprs (FILE *, bitmap);
+ 
+ #endif /* GCC_TREE_SSA_TER_H */
Index: tree-flow.h
===
*** tree-flow.h	(revision 202699)
--- tree-flow.h	(working copy)
*** bool fixup_noreturn_call (gimple stmt);
*** 683,691 
  /* In ipa-pure-const.c  */
  void warn_function_noreturn (tree);
  
- /* In tree-ssa-ter.c  */
- bool stmt_is_replaceable_p (gimple);
- 
  /* In tree-parloops.c  */
  bool parallelized_function_p (tree);
  
--- 683,688 
Index: tree-outof-ssa.h
===
*** tree-outof-ssa.h	(revision 202699)
--- tree-outof-ssa.h	(working copy)
*** along with GCC; see the file COPYING3.  
*** 22,27 
--- 22,28 
  #define _SSAEXPAND_H 1
  
  #include tree-ssa-live.h
+ #include tree-ssa-ter.h
  
  /* This structure (of which only a singleton SA exists) is used to
 pass around information between the outof-SSA functions, cfgexpand
*** get_gimple_for_ssa_name (tree exp)
*** 71,79 
return NULL;
  }
  
! /* In tree-outof-ssa.c.  */
! void finish_out_of_ssa 

Re: [ping][PATCH][1 of 2] Add value range info to SSA_NAME for zero sign extension elimination in RTL

2013-09-24 Thread Christophe Lyon
On 23 September 2013 22:34, Eric Botcazou ebotca...@adacore.com wrote:
 Nice patch, but the formatting is totally wrong wrt spaces, please reformat
 using 2-space indentation and 8-space TABs, as already used in the files.


Sorry for missing this problem when committing Kugan's patch.

I have just committed the attached patch, which I hope fixes all the
spaces/indentation issues introduced.

Christophe.

2013-09-24  Christophe Lyon  christophe.l...@linaro.org

* gimple-pretty-print.c: Various whitespace tweaks.
* tree-core.h: Likewise.
* tree-pretty-print.c: Likewise.
* tree-ssa-alias.c: Likewise.
* tree-ssa-copy.c: Likewise.
* tree-ssanames.c: Likewise.
* tree-ssanames.h: Likewise.
* tree-vrp.c: Likewise.
Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 202868)
+++ gcc/tree-vrp.c  (revision 202869)
@@ -9451,48 +9451,48 @@ vrp_finalize (void)
 
   /* Set value range to non pointer SSA_NAMEs.  */
   for (i  = 0; i  num_vr_values; i++)
-   if (vr_value[i])
-{
-  tree name = ssa_name (i);
+if (vr_value[i])
+  {
+   tree name = ssa_name (i);
 
   if (!name
  || POINTER_TYPE_P (TREE_TYPE (name))
-  || (vr_value[i]-type == VR_VARYING)
-  || (vr_value[i]-type == VR_UNDEFINED))
-continue;
-
-  if ((TREE_CODE (vr_value[i]-min) == INTEGER_CST)
-   (TREE_CODE (vr_value[i]-max) == INTEGER_CST))
-{
-  if (vr_value[i]-type == VR_RANGE)
-set_range_info (name,
-tree_to_double_int (vr_value[i]-min),
-tree_to_double_int (vr_value[i]-max));
-  else if (vr_value[i]-type == VR_ANTI_RANGE)
-{
-  /* VR_ANTI_RANGE ~[min, max] is encoded compactly as
- [max + 1, min - 1] without additional attributes.
- When min value  max value, we know that it is
- VR_ANTI_RANGE; it is VR_RANGE otherwise.  */
-
- /* ~[0,0] anti-range is represented as
- range.  */
-  if (TYPE_UNSIGNED (TREE_TYPE (name))
-   integer_zerop (vr_value[i]-min)
-   integer_zerop (vr_value[i]-max))
-set_range_info (name,
-double_int_one,
-double_int::max_value
-(TYPE_PRECISION (TREE_TYPE (name)), true));
-  else
-set_range_info (name,
-tree_to_double_int (vr_value[i]-max)
-+ double_int_one,
-tree_to_double_int (vr_value[i]-min)
-- double_int_one);
-}
-}
-}
+ || (vr_value[i]-type == VR_VARYING)
+ || (vr_value[i]-type == VR_UNDEFINED))
+   continue;
+
+   if ((TREE_CODE (vr_value[i]-min) == INTEGER_CST)
+(TREE_CODE (vr_value[i]-max) == INTEGER_CST))
+ {
+   if (vr_value[i]-type == VR_RANGE)
+ set_range_info (name,
+ tree_to_double_int (vr_value[i]-min),
+ tree_to_double_int (vr_value[i]-max));
+   else if (vr_value[i]-type == VR_ANTI_RANGE)
+ {
+   /* VR_ANTI_RANGE ~[min, max] is encoded compactly as
+  [max + 1, min - 1] without additional attributes.
+  When min value  max value, we know that it is
+  VR_ANTI_RANGE; it is VR_RANGE otherwise.  */
+
+   /* ~[0,0] anti-range is represented as
+  range.  */
+   if (TYPE_UNSIGNED (TREE_TYPE (name))
+integer_zerop (vr_value[i]-min)
+integer_zerop (vr_value[i]-max))
+ set_range_info (name,
+ double_int_one,
+ double_int::max_value
+ (TYPE_PRECISION (TREE_TYPE (name)), true));
+   else
+ set_range_info (name,
+ tree_to_double_int (vr_value[i]-max)
+ + double_int_one,
+ tree_to_double_int (vr_value[i]-min)
+ - double_int_one);
+ }
+ }
+  }
 
   /* Free allocated memory.  */
   for (i = 0; i  num_vr_values; i++)
Index: gcc/tree-pretty-print.c
===
--- gcc/tree-pretty-print.c (revision 202868)
+++ gcc/tree-pretty-print.c (revision 202869)
@@ -1063,8 +1063,8 @@ dump_generic_node (pretty_printer *buffe
  pp_string (buffer, B); /* pseudo-unit */
}
   else
-pp_double_int (buffer, tree_to_double_int (node),
-   TYPE_UNSIGNED (TREE_TYPE (node)));
+   

Re: patch to canonize small wide-ints.

2013-09-24 Thread Richard Sandiford
Richard Biener rguent...@suse.de writes:
 On Tue, 24 Sep 2013, Kenneth Zadeck wrote:
 On 09/24/2013 09:39 AM, Richard Biener wrote:
  On Tue, 17 Sep 2013, Kenneth Zadeck wrote:
  
   Richi,
   
   This patch canonizes the bits above the precision for wide ints with 
   types
   or
   modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.
   
   I expect that most of the changes in rtl.h will go away.   in particular,
   when
   we decide that we can depend on richard's patch to clean up rtl 
   constants,
   then the only thing that will be left will be the addition of the
   TARGET_SUPPORTS_WIDE_INT test.
   
   I do believe that there is one more conserved force in the universe than
   what
   physicist's generally consider: it is uglyness.  There is a lot of truth
   and
   beauty in the patch but in truth there is a lot of places where the
   uglyness
   is just moved someplace else.
   
   in the pushing the ugly around dept, trees and wide-ints are not 
   canonized
   the
   same way.I spent several days going down the road where it tried to
   have
   them be the same, but it got very ugly having 32 bit unsigned int csts
   have
   the upper 32 bits set.   So now wide_int_to_tree and the wide-int
   constructors
   from tree-cst are now more complex.
   
   i think that i am in favor of this patch, especially in conjunction with
   richards cleanup, but only mildly.
   
   There is also some cleanup where richard wanted the long lines addressed.
   
   Ok to commit to the wide-int branch?
  Looks good to me.
  
  I'll be doing a separate review of the to/from tree parts when I
  find time to do that, but that's unrelated to this patch.
  
  Thanks,
  Richard.
 as i said on irc, after i check in a cleaned up version of this (for ppc) i
 will change the rep for unsiged tree-csts so that they have an extra block of
 zeros if their top bit is set.   this will clean that up somewhat.

 Yes, this will give wide-ints a sign (as I originally proposed).

No, nothing changes on that front.  An N-bit wide_int has no sign.
It looks the same whether it came from a signed tree constant,
an unsigned tree constant, or an rtx (which also has no sign).

All the posted patch does is change the internal representation of
excess bits, which are operationally don't-cares.  The semantics are
just the same as before.  And the point of the follow-up change that
Kenny mentioned is to avoid a copy when treating an N-bit INTEGER_CST
as a wider-than-N wide_int in cases where we're extending according to
TYPE_SIGN.  I.e. it's purely an optimisation.  The resulting wide_int
storage is exactly the same as before, we just do less work to get it.

Thanks,
Richard


Re: [PATCH, ARM, LRA] Prepare ARM build with LRA

2013-09-24 Thread Yvan Roux
Thanks Eric, here is the new patch, validation is ongoing for ARM.

Yvan

2013-09-24  Yvan Roux  yvan.r...@linaro.org
Vladimir Makarov  vmaka...@redhat.com

* rtlanal.c (lsb_bitfield_op_p): New predicate for bitfield operations
from the least significant bit.
(strip_address_mutations): Add bitfield operations handling.
(must_be_index_p): Add shifting and rotate operations handling.
(set_address_base): Use must_be_base_p predicate.
(set_address_index):Use must_be_index_p predicate.



On 24 September 2013 17:26, Eric Botcazou ebotca...@adacore.com wrote:
 Sorry, I'm not sure I understand well, it means that you prefer the
 shift_and _rotate_code_p notation, right ?

 Let's do the following in addition to the lsb_bitfield_op_p thing:
   1. Replace the LO_SUM test in set_address_base by a call to must_be_base_p,
   2. Replace the MULT || ASHIFT test in set_address_index by a call to
 must_be_index_p,
   3. Add the new cases to must_be_index_p directly, with a comment saying that
 there are e.g. for the ARM.

 --
 Eric Botcazou


rtlanal.patch
Description: Binary data


Re: [v3] More noexcept -- 6th

2013-09-24 Thread Paolo Carlini

Hi,

On 9/24/13 11:13 AM, Marc Glisse wrote:

Hello,

bootstrap+testsuite ok. I think all container iterators are done, but 
not the containers themselves.

Ok, thanks.

Paolo.


[PATCH] Set expr loc safely (PR c++/58516)

2013-09-24 Thread Marek Polacek
I admit I haven't spent much time on this, but it seems we should just
check whether we can set the expr location before actually setting it...

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2013-09-24  Marek Polacek  pola...@redhat.com

PR c++/58516
cp/
* semantics.c (finish_transaction_stmt): Set location only when the
expression can have location.

testsuite/
* g++.dg/tm/pr58516.C: New test.

--- gcc/cp/semantics.c.mp   2013-09-24 17:24:59.907548551 +0200
+++ gcc/cp/semantics.c  2013-09-24 17:25:04.251564960 +0200
@@ -5199,7 +5199,9 @@ finish_transaction_stmt (tree stmt, tree
 {
   tree body = build_must_not_throw_expr (TRANSACTION_EXPR_BODY (stmt),
 noex);
-  SET_EXPR_LOCATION (body, EXPR_LOCATION (TRANSACTION_EXPR_BODY (stmt)));
+  /* This may occur when the STATEMENT_LIST is empty.  */
+  if (CAN_HAVE_LOCATION_P (body))
+SET_EXPR_LOCATION (body, EXPR_LOCATION (TRANSACTION_EXPR_BODY (stmt)));
   TREE_SIDE_EFFECTS (body) = 1;
   TRANSACTION_EXPR_BODY (stmt) = body;
 }
--- gcc/testsuite/g++.dg/tm/pr58516.C.mp2013-09-24 17:27:08.859055542 
+0200
+++ gcc/testsuite/g++.dg/tm/pr58516.C   2013-09-24 17:28:29.829354635 +0200
@@ -0,0 +1,7 @@
+// { dg-do compile }
+// { dg-options -std=c++0x -fgnu-tm }
+
+void foo()
+{
+  __transaction_atomic noexcept(false) {}
+}

Marek


Re: [PATCH, ARM, LRA] Prepare ARM build with LRA

2013-09-24 Thread Richard Sandiford
Eric Botcazou ebotca...@adacore.com writes:
 Sorry, I'm not sure I understand well, it means that you prefer the
 shift_and _rotate_code_p notation, right ?

 Let's do the following in addition to the lsb_bitfield_op_p thing:
   1. Replace the LO_SUM test in set_address_base by a call to must_be_base_p,
   2. Replace the MULT || ASHIFT test in set_address_index by a call to 
 must_be_index_p,
   3. Add the new cases to must_be_index_p directly, with a comment saying 
 that 
 there are e.g. for the ARM.

FWIW, I'd prefer to keep it as-is, since must_be_base_p (x) and
must_be_index_p (x) don't imply that we should look specifically at
XEXP (x, 0) (rather than just X, or XEXP (x, 1), etc.).  I think it's
better to keep the code tests and the associated XEXPs together.

Thanks,
Richard


Re: [PATCH, LRA] Remove REG_DEAD and REG_UNUSED notes.

2013-09-24 Thread Yvan Roux
 The description is too terse.  In the RTL middle-end, you shouldn't have to
 manually deal with the REG_DEAD and REG_UNUSED notes (unlike REG_EQUAL and
 REG_EQUIV notes), as the DF framework is supposed to do it for you.

Sorry for that.  The description of the LRA function update_inc_notes
explicitly says that it removes all the REG_DEAD and REG_UNUSED notes,
but the code didn't do it, and after the LRA pass we indeed still have
those kinds of notes, which wasn't the case with reload.  For
instance, with the code below:

int f(int x)
{
  return (x >> (sizeof (x) * __CHAR_BIT__ - 1)) ? -1 : 1;
}

we used to have that kind of insns on ARM:

(insn 8 3 27 2 (parallel [
(set (reg:CC 100 cc)
(compare:CC (reg:SI 0 r0 [ x ])
(const_int 0 [0])))
(set (reg/v:SI 112 [ x ])
(reg:SI 0 r0 [ x ]))
]) cmp-lra.c:4 205 {*movsi_compare0}
 (expr_list:REG_DEAD (reg:SI 0 r0 [ x ])
(expr_list:REG_UNUSED (reg:CC 100 cc)
(nil
(insn 27 8 28 2 (set (reg:CC 100 cc)
(compare:CC (reg/v:SI 112 [ x ])
(const_int 0 [0]))) cmp-lra.c:4 228 {*arm_cmpsi_insn}
 (expr_list:REG_DEAD (reg/v:SI 112 [ x ])
(nil)))
(note 28 27 17 2 NOTE_INSN_DELETED)
(insn 17 28 20 2 (set (reg/i:SI 0 r0)
(if_then_else:SI (ge (reg:CC 100 cc)
(const_int 0 [0]))
(const_int 1 [0x1])
(const_int -1 [0x]))) cmp-lra.c:5 249
{*movsicc_insn}
 (expr_list:REG_DEAD (reg:CC 100 cc)
(nil)))
(insn 20 17 0 2 (use (reg/i:SI 0 r0)) cmp-lra.c:5 -1
 (nil))

with the notes still present after LRA, the post-reload pass just
drops insns 8 and 27, and the generated code no longer contains the
comparison...

Hope it makes more sense now.

Thanks,
Yvan


Re: [PATCH] Set expr loc safely (PR c++/58516)

2013-09-24 Thread Paolo Carlini

Hi,

On 9/24/13 11:35 AM, Marek Polacek wrote:

I admit I haven't spent much time on this, but it seems we should just
check whether we can set the expr location before actually setting it...

Regtested/bootstrapped on x86_64-linux, ok for trunk?

Two minor nits (assuming the general idea makes sense; at first blush it does).

First, solicited by the ChangeLog entry too and given that non-null 
expression trees can always have locations, I wonder if we shouldn't 
check EXPR_P instead, it seems more direct, unless we really fear that 
body could be NULL_TREE. I would also phrase in a negative form the 
comment before the check, like This may not be true when...


Second, I'm pretty sure about this one, I think we are standardizing on 
c++11 for testcases, by now c++0x is essentially legacy.


Paolo.


Re: [PATCH i386 3/8] [AVX512] [1/n] Add AVX-512 patterns: VF iterator extended.

2013-09-24 Thread Richard Henderson
On 08/27/2013 11:37 AM, Kirill Yukhin wrote:
 Hello,
 
 This patch is still far too large.

 I think you should split it up based on every single mode iterator that
 you need to add or change.
 
 Problem is that some iterators are depend on each other, so patches are
 not going to be tiny.
 
 Here is 1st one. It extends VF iterator - biggest impact I believe
 
 Is it Ok?
 
 Testing:
   1. Bootstrap pass.
   2. make check shows no regressions.
   3. Spec 2000 & 2006 build shows no regressions both with and without 
 -mavx512f option.
   4. Spec 2000 & 2006 run shows no stability regressions without -mavx512f 
 option.


Ok.


r~


Re: [PATCH][ARM][testsuite] Add effective target check for arm conditional execution

2013-09-24 Thread Mike Stump
On Sep 13, 2013, at 8:25 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 gcc.target/arm/minmax_minus.c is really only valid when we have conditional 
 execution available, that is non Thumb1-only targets. I've added an effective 
 target check for that and used it in the test so that it does not get run and 
 give a false negative when testing Thumb1 targets.
 
 Ok for trunk?

Ok.  I was hoping the arm folks would review it, usually they are very 
responsive.  I'm fine with target maintainers reviewing patches like this.

wide-int: Re: patch to canonize small wide-ints.

2013-09-24 Thread Kenneth Zadeck

I just committed the patch to do this as revision 202871.

there are two changes from the earlier patch:

1) The addition of frag in loop-doloop.c.  This fixes an rtl
canonization bug, but it is unique to the branch.  It did not cause a
problem on x86 but did on ppc.
2) The code in rtl.h for converting rtl to wide-ints is now much
simpler.  I now rely on richard's patch, which has been merged into
this branch, to canonize the rtl constants.  I have an assert to make
sure that canonization is never violated.


I did not change the tree-cst constructor as richard and I had talked
about on irc.  My next patch will be to change the rep of tree-cst so
that unsigned types that have precisions of some multiple of HBPWI and
have their top bit set will just carry an extra block of zeros.  That
will force a rewrite of this constructor.
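
As a rough sketch of what that representation change means (hypothetical
helper, not actual tree.c code): an unsigned constant whose top bit is
set gets one extra all-zero block, so that sign-extending the block
array still reproduces the unsigned value.

static unsigned int
blocks_for_unsigned_hwi (unsigned HOST_WIDE_INT val, HOST_WIDE_INT blocks[2])
{
  blocks[0] = (HOST_WIDE_INT) val;
  if ((HOST_WIDE_INT) val < 0)	/* Top bit set: append a zero block.  */
    {
      blocks[1] = 0;
      return 2;
    }
  return 1;
}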


Kenny

On 09/24/2013 12:06 PM, Richard Sandiford wrote:

Richard Biener rguent...@suse.de writes:

On Tue, 24 Sep 2013, Kenneth Zadeck wrote:

On 09/24/2013 09:39 AM, Richard Biener wrote:

On Tue, 17 Sep 2013, Kenneth Zadeck wrote:


Richi,

This patch canonizes the bits above the precision for wide ints with types
or
modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.

I expect that most of the changes in rtl.h will go away.   in particular,
when
we decide that we can depend on richard's patch to clean up rtl constants,
then the only thing that will be left will be the addition of the
TARGET_SUPPORTS_WIDE_INT test.

I do believe that there is one more conserved force in the universe than
what
physicist's generally consider: it is uglyness.  There is a lot of truth
and
beauty in the patch but in truth there is a lot of places where the
uglyness
is just moved someplace else.

in the pushing the ugly around dept, trees and wide-ints are not canonized
the
same way.I spent several days going down the road where it tried to
have
them be the same, but it got very ugly having 32 bit unsigned int csts
have
the upper 32 bits set.   So now wide_int_to_tree and the wide-int
constructors
from tree-cst are now more complex.

i think that i am in favor of this patch, especially in conjunction with
richards cleanup, but only mildly.

There is also some cleanup where richard wanted the long lines addressed.

Ok to commit to the wide-int branch?

Looks good to me.

I'll be doing a separate review of the to/from tree parts when I
find time to do that, but that's unrelated to this patch.

Thanks,
Richard.

as i said on irc, after i check in a cleaned up version of this (for ppc) i
will change the rep for unsiged tree-csts so that they have an extra block of
zeros if their top bit is set.   this will clean that up somewhat.

Yes, this will give wide-ints a sign (as I originally proposed).

No, nothing changes on that front.  An N-bit wide_int has no sign.
It looks the same whether it came from a signed tree constant,
an unsigned tree constant, or an rtx (which also has no sign).

All the posted patch does is change the internal representation of
excess bits, which are operationally don't-cares.  The semantics are
just the same as before.  And the point of the follow-up change that
Kenny mentioned is to avoid a copy when treating an N-bit INTEGER_CST
as a wider-than-N wide_int in cases where we're extending according to
TYPE_SIGN.  I.e. it's purely an optimisation.  The resulting wide_int
storage is exactly the same as before, we just do less work to get it.

Thanks,
Richard


Index: gcc/tree.c
===
--- gcc/tree.c	(revision 202811)
+++ gcc/tree.c	(working copy)
@@ -1112,7 +1112,6 @@ force_fit_type (tree type, const wide_in
 	  || (overflowable  0  sign == SIGNED))
 	{
 	  wide_int tmp = wide_int::from (cst, TYPE_PRECISION (type), sign);
-	  wi::clear_undef (tmp, sign);
 	  int l = tmp.get_len ();
 	  tree t = make_int_cst (l);
 	  if (l  1)
@@ -1205,11 +1204,11 @@ wide_int_to_tree (tree type, const wide_
 }
 
   wide_int cst = wide_int::from (pcst, prec, sgn);
-  /* The following call makes sure that all tree-cst's are canonical.
- i.e. it really does sign or zero extend the top block of the
- value if the precision of the type is not an even multiple of the
- size of an HWI.  */
-  wi::clear_undef (cst, sgn);
+  int len = int (cst.get_len ());
+  int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  bool recanonize = sgn == UNSIGNED
+    && (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT == len
+    && small_prec;
 
   switch (TREE_CODE (type))
 {
@@ -1291,18 +1290,31 @@ wide_int_to_tree (tree type, const wide_
   t = TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix);
   if (t)
 	{
-	  /* Make sure no one is clobbering the shared constant.  */
+	  /* Make sure no one is clobbering the shared constant.  We
+	 must be careful here because tree-csts and wide-ints are
+	 not canonicalized in the same way.  */
 	  gcc_assert (TREE_TYPE (t) == type);
-	  gcc_assert 

Re: RFA: Store the REG_BR_PROB probability directly as an int

2013-09-24 Thread Richard Sandiford
Andreas Schwab sch...@suse.de writes:
 Richard Sandiford rdsandif...@googlemail.com writes:

 REG_BR_PROB notes are stored as:

   (expr_list:REG_BR_PROB (const_int prob) chain)

 but a full const_int rtx seems a bit heavweight when all we want is
 a plain int.  This patch uses:

   (int_list:REG_BR_PROB prob chain)

 instead.

 I think you left out the handling of INT_LIST in eliminate_regs_1.  This
 lets me finish the build:

Sorry for the breakage.  I think we need to handle INT_LIST in the same way
as INSN_LIST though, and eliminate in XEXP (x, 1).

How about the attached?  Testing in progress...

Thanks,
Richard


gcc/
* cse.c (count_reg_usage): Handle INT_LIST.
* lra-eliminations.c (lra_eliminate_regs_1): Likewise.
* reginfo.c (reg_scan_mark_refs): Likewise.
* reload1.c (eliminate_regs_1): Likewise.

Index: gcc/cse.c
===
--- gcc/cse.c   2013-09-24 18:29:49.308378918 +0100
+++ gcc/cse.c   2013-09-24 18:29:49.460380403 +0100
@@ -6739,6 +6739,7 @@ count_reg_usage (rtx x, int *counts, rtx
   return;
 
 case INSN_LIST:
+case INT_LIST:
   gcc_unreachable ();
 
 default:
Index: gcc/lra-eliminations.c
===
--- gcc/lra-eliminations.c  2013-09-24 18:29:49.308378918 +0100
+++ gcc/lra-eliminations.c  2013-09-24 18:29:49.461380412 +0100
@@ -471,6 +471,7 @@ lra_eliminate_regs_1 (rtx x, enum machin
   /* ... fall through ...  */
 
 case INSN_LIST:
+case INT_LIST:
   /* Now do eliminations in the rest of the chain. If this was
 an EXPR_LIST, this might result in allocating more memory than is
 strictly needed, but it simplifies the code.  */
Index: gcc/reginfo.c
===
--- gcc/reginfo.c   2013-09-24 18:29:49.309378928 +0100
+++ gcc/reginfo.c   2013-09-24 18:29:49.462380422 +0100
@@ -1075,6 +1075,7 @@ reg_scan_mark_refs (rtx x, rtx insn)
   break;
 
 case INSN_LIST:
+case INT_LIST:
   if (XEXP (x, 1))
reg_scan_mark_refs (XEXP (x, 1), insn);
   break;
Index: gcc/reload1.c
===
--- gcc/reload1.c   2013-09-24 18:29:49.311378947 +0100
+++ gcc/reload1.c   2013-09-24 18:29:49.463380432 +0100
@@ -2776,6 +2776,7 @@ eliminate_regs_1 (rtx x, enum machine_mo
   /* ... fall through ...  */
 
 case INSN_LIST:
+case INT_LIST:
   /* Now do eliminations in the rest of the chain.  If this was
 an EXPR_LIST, this might result in allocating more memory than is
 strictly needed, but it simplifies the code.  */
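
For illustration, a consumer reading the probability attached to a jump
under the two representations might look like the hypothetical snippet
below (not part of the patch; find_reg_note, XEXP and XINT are the usual
rtl accessors):

static int
branch_prob_from_note (rtx insn)
{
  rtx note = find_reg_note (insn, REG_BR_PROB, NULL_RTX);
  if (!note)
    return REG_BR_PROB_BASE / 2;	/* no note: just guess 50-50 */
  /* Old form: (expr_list:REG_BR_PROB (const_int prob) chain)
       => INTVAL (XEXP (note, 0))
     New form: (int_list:REG_BR_PROB prob chain)
       => XINT (note, 0)  */
  return XINT (note, 0);
}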


Re: [PATCH]Fix computation of offset in ivopt

2013-09-24 Thread Oleg Endo
On Tue, 2013-09-24 at 12:31 +0200, Richard Biener wrote:
 On Tue, Sep 24, 2013 at 11:13 AM, bin.cheng bin.ch...@arm.com wrote:
  Hi,
  This patch fix two minor bugs when computing offset in IVOPT.
  1) Considering below example:
  #define MAX 100
  struct tag
  {
int i;
int j;
  }
  struct tag arr[MAX]
 
  int foo (int len)
  {
int i = 0;
for (; i  len; i++)
{
  access arr[i].j;
}
  }
 
  Without this patch, the offset computed by strip_offset_1 for address
  arr[i].j is ZERO, which is apparently not.
 
  2) Considering below example:
  //...
bb 15:
KeyIndex_66 = KeyIndex_194 + 4294967295;
if (KeyIndex_66 != 0)
  goto bb 16;
else
  goto bb 18;
 
bb 16:
 
bb 17:
# KeyIndex_194 = PHI KeyIndex_66(16), KeyIndex_180(73)
_62 = KeyIndex_194 + 1073741823;
_63 = _62 * 4;
_64 = pretmp_184 + _63;
_65 = *_64;
if (_65 == 0)
  goto bb 15;
else
  goto bb 71;
  //...
 
  There are iv use and candidate like:
 
  use 1
address
in statement _65 = *_64;
 
at position *_64
type handletype *
base pretmp_184 + ((sizetype) KeyIndex_180 + 1073741823) * 4
step 4294967292
base object (void *) pretmp_184
related candidates
 
  candidate 6
var_before ivtmp.16
var_after ivtmp.16
incremented before use 1
type unsigned int
base (unsigned int) (pretmp_184 + (sizetype) KeyIndex_180 * 4)
step 4294967292
base object (void *) pretmp_184
  Candidate 6 is related to use 1
 
  In function get_computation_cost_at for use 1 using cand 6, ubase and cbase
  are:
  pretmp_184 + ((sizetype) KeyIndex_180 + 1073741823) * 4
  pretmp_184 + (sizetype) KeyIndex_180 * 4
 
  The cstepi computed in HOST_WIDE_INT is :  0xfffc, while offset
  computed in TYPE(utype) is : 0xfffc.  Though they both stand for value
  -4 in different precision, statement offset -= ratio * cstepi returns
  0x1, which is wrong.
 
  Tested on x86_64 and arm.  Is it OK?
 
 +   field = TREE_OPERAND (expr, 1);
 +   if (DECL_FIELD_BIT_OFFSET (field)
 +       && cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))
 + boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field));
 +
 +   tmp = component_ref_field_offset (expr);
 +   if (top_compref
 +       && cst_and_fits_in_hwi (tmp))
 + {
 +   /* Strip the component reference completely.  */
 +   op0 = TREE_OPERAND (expr, 0);
 +   op0 = strip_offset_1 (op0, inside_addr, top_compref, off0);
 +   *offset = off0 + int_cst_value (tmp) + boffset / BITS_PER_UNIT;
 +   return op0;
 + }
 
 the failure paths seem mangled, that is, if cst_and_fits_in_hwi is false
 for either offset part you may end up doing half accounting and not
 stripping.
 
 Btw, DECL_FIELD_BIT_OFFSET is always non-NULL.  I suggest to
 rewrite to
 
  if (!inside_addr)
return orig_expr;
 
  tmp = component_ref_field_offset (expr);
  field = TREE_OPERAND (expr, 1);
  if (top_compref
       && cst_and_fits_in_hwi (tmp)
       && cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))
 {
   ...
 }
 
 note that this doesn't really handle overflows correctly as
 
 +   *offset = off0 + int_cst_value (tmp) + boffset / BITS_PER_UNIT;
 
 may still overflow.
 
 @@ -4133,6 +4142,9 @@ get_computation_cost_at (struct ivopts_data *data,
  bitmap_clear (*depends_on);
  }
 
 +  /* Sign-extend offset if utype has lower precision than HOST_WIDE_INT.  */
 +  offset = sext_hwi (offset, TYPE_PRECISION (utype));
 +
 
 offset is computed elsewhere in difference_cost and the bug to me seems that
 it is unsigned.  sign-extending it here is odd at least (and the extension
 should probably happen at sizetype precision, not that of utype).
 

After reading overflow and ivopt, I was wondering whether
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55190 is somehow related.

Cheers,
Oleg
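
For reference, the sign extension being discussed is the usual shift
trick; a sketch modelled on hwint.h's sext_hwi (illustrative only).
With prec == 32, an offset like 0xfffffffc comes back as -4, which is
what the mixed-precision arithmetic above expects.

static inline HOST_WIDE_INT
sext_hwi_sketch (HOST_WIDE_INT src, unsigned int prec)
{
  if (prec == HOST_BITS_PER_WIDE_INT)
    return src;
  int shift = HOST_BITS_PER_WIDE_INT - prec;
  return (src << shift) >> shift;	/* arithmetic right shift */
}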



[PATCH v2 1/4] Ignore access-control keywords when parsing fields.

2013-09-24 Thread David Malcolm
Classes containing access-control keywords such as public:
confuse struct_field_seq, leading it to call consume_until_eos
i.e. ignore text until after the next semicolon.

This leads to the first field after an access-control keyword
being ignored by gengtype.  This can be seen in:
  http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01532.html
where the autogenerated marking function erroneously omitted the
traversal of the callees field of cgraph_node *.

This patch fixes up struct_field_seq to gracefully ignore such
keywords, thus fixing gengtype so that it does not erroneously omit
fields of such a class.
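
A hypothetical example of the kind of class that hit this (the names are
made up, and GTY is stubbed out so the sketch stands alone):

#define GTY(x)			/* stand-in for gengtype's marker macro */
typedef struct tree_node *tree;

class GTY(()) example_with_access_specifier
{
public:
  tree first_field;	/* previously skipped by the generated marker */
  tree second_field;	/* marking only resumed from here onwards */
};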

* gengtype-parse.c (struct_field_seq): Ignore access-control
keywords (public: etc).
---
 gcc/gengtype-parse.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/gcc/gengtype-parse.c b/gcc/gengtype-parse.c
index 68d372e..e5204c1 100644
--- a/gcc/gengtype-parse.c
+++ b/gcc/gengtype-parse.c
@@ -733,6 +733,17 @@ struct_field_seq (void)
 {
   ty = type (opts, true);
 
+      /* Ignore access-control keywords ("public:" etc).  */
+      while (!ty && token () == IGNORABLE_CXX_KEYWORD)
+	{
+	  const char *keyword = advance ();
+	  if (strcmp (keyword, "public:") != 0
+	      && strcmp (keyword, "private:") != 0
+	      && strcmp (keyword, "protected:") != 0)
+	    break;
+	  ty = type (&opts, true);
+	}
+
   if (!ty || token () == ':')
{
  consume_until_eos ();
-- 
1.7.11.7



[PATCH v2 0/4] Support some cases of inheritance in gengtype

2013-09-24 Thread David Malcolm
Here's an updated version of the patch series.

Changes since v1 of the patch series:

  * Patch 1/4: this one is new: I noticed that the public: specifier
within class cgraph_node confused gengtype, leading it to erroneously
omit the first field within the class.  This patch fixes this bug.

  * Patch 2/4: this is the updated parser support for base classes.  In
the previous version of the patch, it only attempted to parse the
base classes of GTY()-marked types.  In this version of the patch it
only attempts to pass the bases of GTY()-marked types that are *not*
marked with GTY((user)) i.e. it ignores GTY((user)) types.

  * Patch 3/4: this version incorporates suggestions by Michael Matz:
set_gc_used_type now visits base classes, and all inherited fields,
not just those in the class itself, and fixes the stylistic nits he
noted.   walk_subclasses now only writes out cases for concrete
subclasses i.e. those with a tag GTY option.  Similarly, I updated
USED_BY_TYPED_GC_P so that it emits allocators etc for subclasses
with a tag GTY option.  I didn't fix the O(N^2) in walk_subclasses
(it doesn't seem to affect the speed of the build).

  * Patch 4/4: this is new: it adds documentation to gty.texi about
how desc and tag can be used for class hierarchies.

Successfully bootstrapped and regtested together with the symtab patches
posted earlier, which port symtab/cgraph/varpool to use this machinery.

OK for trunk?


David Malcolm (4):
  Ignore access-control keywords when parsing fields.
  Parse base classes for GTY-marked types
  Handle simple inheritance in gengtype.
  Add documentation about gengtype and inheritance

 gcc/doc/gty.texi |  52 +++
 gcc/gengtype-parse.c |  63 ---
 gcc/gengtype-state.c |   2 +
 gcc/gengtype.c   | 117 +--
 gcc/gengtype.h   |  12 +-
 5 files changed, 226 insertions(+), 20 deletions(-)

-- 
1.7.11.7



[PATCH v2 4/4] Add documentation about gengtype and inheritance

2013-09-24 Thread David Malcolm
gcc/
* doc/gty.texi (GTY Options): Add note about inheritance to
description of desc and tag.
(Inheritance and GTY): New.
---
 gcc/doc/gty.texi | 52 
 1 file changed, 52 insertions(+)

diff --git a/gcc/doc/gty.texi b/gcc/doc/gty.texi
index 82e8e4f..090f6a6 100644
--- a/gcc/doc/gty.texi
+++ b/gcc/doc/gty.texi
@@ -87,6 +87,7 @@ data members.
 
 @menu
 * GTY Options:: What goes inside a @code{GTY(())}.
+* Inheritance and GTY:: Adding GTY to a class hierarchy.
 * User GC::Adding user-provided GC marking routines.
 * GGC Roots::   Making global variables GGC roots.
 * Files::   How the generated files work.
@@ -234,6 +235,10 @@ In this example, the value of BINDING_HAS_LEVEL_P when 
applied to a
 mechanism will treat the field @code{level} as being present and if 0,
 will treat the field @code{scope} as being present.
 
+The @code{desc} and @code{tag} options can also be used for inheritance
+to denote which subclass an instance is.  See @ref{Inheritance and GTY}
+for more information.
+
 @findex param_is
 @findex use_param
 @item param_is (@var{type})
@@ -470,6 +475,53 @@ fields is completely handled by user-provided routines.  
See section
 @ref{User GC} for details on what functions need to be provided.
 @end table
 
+@node Inheritance and GTY
+@section Support for inheritance
+gengtype has some support for simple class hierarchies.  You can use
+this to have gengtype autogenerate marking routines, provided:
+
+@itemize @bullet
+@item
+There must be a concrete base class, with a discriminator expression
+that can be used to identify which subclass an instance is.
+@item
+Only single inheritance is used.
+@item
+None of the classes within the hierarchy are templates.
+@end itemize
+
+If your class hierarchy does not fit in this pattern, you must use
+@ref{User GC} instead.
+
+The base class and its discriminator must be identified using the ``desc''
+option.  Each concrete subclass must use the ``tag'' option to identify
+which value of the discriminator it corresponds to.
+
+@smallexample
+class GTY((desc("%h.kind"), tag("0"))) example_base
+@{
+public:
+  int kind;
+  tree a;
+@};
+
+class GTY((tag("1"))) some_subclass : public example_base
+@{
+public:
+  tree b;
+@};
+
+class GTY((tag("2"))) some_other_subclass : public example_base
+@{
+public:
+  tree c;
+@};
+@end smallexample
+
+The generated marking routines for the above will contain a ``switch''
+on ``kind'', visiting all appropriate fields.  For example, if kind is
+2, it will cast to ``some_other_subclass'' and visit fields a and c.
+
 @node User GC
 @section Support for user-provided GC marking routines
 @cindex user gc
-- 
1.7.11.7



[PATCH v2 2/4] Parse base classes for GTY-marked types

2013-09-24 Thread David Malcolm
Extend gengtype (and gtype.state reading/writing) so that it is able to
parse base classes in simple cases, and only attempt to do it for
GTY-marked types.

* gengtype-parse.c (require_without_advance): New.
(type): For GTY-marked types that are not GTY((user)), parse any
base classes, requiring them to be single-inheritance, and not
be templates.  For non-GTY-marked types and GTY((user)),
continue to skip over any C++ inheritance specification.
* gengtype-state.c (state_writer::write_state_struct_type):
Write base class of type (if any).
(read_state_struct_type): Read base class of type (if any).
* gengtype.c (new_structure): Add a base_class parameter.
(create_optional_field_): Update for new parameter to
new_structure.
(adjust_field_rtx_def): Likewise.
(adjust_field_tree_exp): Likewise.
* gengtype.h (struct type): Add base_class field to the s
union field.
(new_structure): Add base parameter.
---
 gcc/gengtype-parse.c | 50 +-
 gcc/gengtype-state.c |  2 ++
 gcc/gengtype.c   | 15 ---
 gcc/gengtype.h   |  4 +++-
 4 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/gcc/gengtype-parse.c b/gcc/gengtype-parse.c
index e5204c1..31d493a 100644
--- a/gcc/gengtype-parse.c
+++ b/gcc/gengtype-parse.c
@@ -165,6 +165,21 @@ require (int t)
   return v;
 }
 
+/* As per require, but do not advance.  */
+static const char *
+require_without_advance (int t)
+{
+  int u = token ();
+  const char *v = T.value;
+  if (u != t)
+    {
+      parse_error ("expected %s, have %s",
+		   print_token (t, 0), print_token (u, v));
+      return 0;
+    }
+  return v;
+}
+
 /* If the next token does not have one of the codes T1 or T2, report a
parse error; otherwise return the token's value.  */
 static const char *
@@ -829,6 +844,7 @@ type (options_p *optsp, bool nested)
 case STRUCT:
 case UNION:
   {
+   type_p base_class = NULL;
options_p opts = 0;
/* GTY annotations follow attribute syntax
   GTY_BEFORE_ID is for union/struct declarations
@@ -868,16 +884,39 @@ type (options_p *optsp, bool nested)
opts = gtymarker_opt ();
  }
 
+   bool is_user_gty = opts_have (opts, "user");
+
if (token () == ':')
  {
-   /* Skip over C++ inheritance specification.  */
-   while (token () != '{')
- advance ();
+   if (is_gty  !is_user_gty)
+   if (is_gty && !is_user_gty)
+   /* For GTY-marked types that are not user, parse some C++
+  inheritance specifications.
+  We require single-inheritance from a non-template type.  */
+   advance ();
+   const char *basename = require (ID);
+   /* This may be either an access specifier, or the base name.  */
+   if (0 == strcmp (basename, "public")
+       || 0 == strcmp (basename, "protected")
+       || 0 == strcmp (basename, "private"))
+ basename = require (ID);
+   base_class = find_structure (basename, TYPE_STRUCT);
+   if (!base_class)
+ parse_error ("unrecognized base class: %s", basename);
+   require_without_advance ('{');
+ }
+   else
+ {
+   /* For types lacking GTY-markings, skip over C++ inheritance
+  specification (and thus avoid having to parse e.g. template
+  types).  */
+   while (token () != '{')
+ advance ();
+ }
  }
 
if (is_gty)
  {
-   bool is_user_gty = opts_have (opts, "user");
if (token () == '{')
  {
pair_p fields;
@@ -900,7 +939,8 @@ type (options_p *optsp, bool nested)
return create_user_defined_type (s, lexer_line);
  }
 
-   return new_structure (s, kind, lexer_line, fields, opts);
+   return new_structure (s, kind, lexer_line, fields, opts,
+ base_class);
  }
  }
else if (token () == '{')
diff --git a/gcc/gengtype-state.c b/gcc/gengtype-state.c
index ba7948a..54e4287 100644
--- a/gcc/gengtype-state.c
+++ b/gcc/gengtype-state.c
@@ -957,6 +957,7 @@ state_writer::write_state_struct_type (type_p current)
 {
   write_state_struct_union_type (current, struct);
   write_state_type (current-u.s.lang_struct);
+  write_state_type (current-u.s.base_class);
 }
 
 /* Write a GTY user-defined struct type.  */
@@ -1613,6 +1614,7 @@ read_state_struct_type (type_p type)
   read_state_options ((type-u.s.opt));
   read_state_lang_bitmap ((type-u.s.bitmap));
   read_state_type ((type-u.s.lang_struct));
+  read_state_type ((type-u.s.base_class));
 }
   else
 {
diff --git 

[PATCH v2 3/4] Handle simple inheritance in gengtype.

2013-09-24 Thread David Malcolm
Treat GTY structs that have a desc as being the root of an inheritance
hierarchy.  Generate a switch on desc within the marking function with
cases for each subclass, visiting all fields of the type (including
inherited ones).

Don't create marking functions for subclasses, instead using the base
class marking functions.  Use walk_type on them within walk_subclasses
to generate the case within the switch for handling the tag, directly
walking all fields of the type.
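
Using the example hierarchy from patch 4/4's documentation, the
generated base-class marker has roughly the following shape (a
hypothetical rendering; the exact names and formatting follow gengtype's
usual conventions rather than anything quoted here):

void
gt_ggc_mx_example_base (void *x_p)
{
  example_base *x = (example_base *) x_p;
  if (ggc_test_and_set_mark (x))
    switch (x->kind)
      {
      case 0:
	gt_ggc_m_9tree_node ((*x).a);	/* a plain example_base */
	break;
      case 1:
	{
	  some_subclass *sub = static_cast <some_subclass *> (x);
	  gt_ggc_m_9tree_node ((*sub).a);	/* inherited field */
	  gt_ggc_m_9tree_node ((*sub).b);
	}
	break;
      case 2:
	{
	  some_other_subclass *sub = static_cast <some_other_subclass *> (x);
	  gt_ggc_m_9tree_node ((*sub).a);	/* inherited field */
	  gt_ggc_m_9tree_node ((*sub).c);
	}
	break;
      }
}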

* gengtype-parse.c (opts_have): Drop static so that
we can use this from gengtype.c.
* gengtype.c (set_gc_used_type): Mark any base class as used;
update field traversal to visit inherited fields.
(output_mangled_typename):  Convert references to classes within
an inheritance hierarchy to reference the ultimate base class,
since only it will have gt_ functions.
(get_string_option): New.
(walk_subclasses): New.
(walk_type): Treat GTY structs that have a desc as being the
root of an inheritance hierarchy.  Generate a switch on it
within the marking function which walks all subclasses, adding
cases for them via walk_subclasses.  For subclasses, visit all
fields of the type (including inherited ones).
(write_func_for_structure): Don't write fns for subclasses, only
for the ultimate base class within an inheritance hierarchy.
Subclasses-marking will be handled by the base class marking
functions.
(write_types): Likewise.
(write_local_func_for_structure): Likewise.
(USED_BY_TYPED_GC_P): Emit allocators for subclasses that have
a tag option (and are thus concrete subclasses).
(write_root): Use the marker function for the ultimate base class.
* gengtype.h (FOR_ALL_INHERITED_FIELDS): New.
(opts_have): Add declaration.
---
 gcc/gengtype-parse.c |   2 +-
 gcc/gengtype.c   | 102 ---
 gcc/gengtype.h   |   8 
 3 files changed, 105 insertions(+), 7 deletions(-)

diff --git a/gcc/gengtype-parse.c b/gcc/gengtype-parse.c
index 31d493a..f480503 100644
--- a/gcc/gengtype-parse.c
+++ b/gcc/gengtype-parse.c
@@ -793,7 +793,7 @@ struct_field_seq (void)
 
 /* Return true if OPTS contain the option named STR.  */
 
-static bool
+bool
 opts_have (options_p opts, const char *str)
 {
   for (options_p opt = opts; opt; opt = opt-next)
diff --git a/gcc/gengtype.c b/gcc/gengtype.c
index a6dc221..4224a5e 100644
--- a/gcc/gengtype.c
+++ b/gcc/gengtype.c
@@ -1532,7 +1532,11 @@ set_gc_used_type (type_p t, enum gc_used_enum level, 
type_p param[NUM_PARAM],
 	process_gc_options (t->u.s.opt, level, &dummy, &dummy, &dummy, &dummy,
 			    &dummy2);
 
-	for (f = t->u.s.fields; f; f = f->next)
+	if (t->u.s.base_class)
+	  set_gc_used_type (t->u.s.base_class, level, param,
+			    allow_undefined_types);
+
+	FOR_ALL_INHERITED_FIELDS (t, f)
  {
int maybe_undef = 0;
int pass_param = 0;
@@ -2534,6 +2538,11 @@ output_mangled_typename (outf_p of, const_type_p t)
   case TYPE_LANG_STRUCT:
   case TYPE_USER_STRUCT:
{
+	  /* For references to classes within an inheritance hierarchy,
+	     only ever reference the ultimate base class, since only
+	     it will have gt_ functions.  */
+	  while (t->u.s.base_class)
+	    t = t->u.s.base_class;
 	  const char *id_for_tag = filter_type_name (t->u.s.tag);
 	  oprintf (of, "%lu%s", (unsigned long) strlen (id_for_tag),
 		   id_for_tag);
@@ -2596,6 +2605,44 @@ output_escaped_param (struct walk_type_data *d, const 
char *param,
}
 }
 
+const char *
+get_string_option (options_p opt, const char *key)
+{
+  for (; opt; opt = opt->next)
+    if (strcmp (opt->name, key) == 0)
+      return opt->info.string;
+  return NULL;
+}
+
+static void
+walk_subclasses (type_p base, struct walk_type_data *d)
+{
+  for (type_p sub = structures; sub != NULL; sub = sub->next)
+    {
+      if (sub->u.s.base_class == base)
+	{
+	  const char *type_tag = get_string_option (sub->u.s.opt, "tag");
+	  if (type_tag)
+	    {
+	      oprintf (d->of, "%*scase %s:\n", d->indent, "", type_tag);
+	      d->indent += 2;
+	      oprintf (d->of, "%*s{\n", d->indent, "");
+	      d->indent += 2;
+	      oprintf (d->of, "%*s%s *sub = static_cast <%s *> (x);\n",
+		       d->indent, "", sub->u.s.tag, sub->u.s.tag);
+	      const char *old_val = d->val;
+	      d->val = "(*sub)";
+	      walk_type (sub, d);
+	      d->val = old_val;
+	      d->indent -= 2;
+	      oprintf (d->of, "%*s}\n", d->indent, "");
+	      oprintf (d->of, "%*sbreak;\n", d->indent, "");
+	      d->indent -= 2;
+	    }
+	  walk_subclasses (sub, d);
+	}
+    }
+}
 
 /* Call D->PROCESS_FIELD for every field (or subfield) 

Re: [PATCH] Set expr loc safely (PR c++/58516)

2013-09-24 Thread Marek Polacek
On Tue, Sep 24, 2013 at 12:00:41PM -0500, Paolo Carlini wrote:
 Hi,
 
 On 9/24/13 11:35 AM, Marek Polacek wrote:
 I admit I haven't spent much time on this, but it seems we should just
 check whether we can set the expr location before actually setting it...
 
 Regtested/bootstrapped on x86_64-linux, ok for trunk?
 Two minor nits (assuming the general idea makes sense, first blush it does).
 
 First, solicited by the ChangeLog entry too and given that non-null
 expression trees can always have locations, I wonder if we shouldn't
 check EXPR_P instead, it seems more direct, unless we really fear
 that body could be NULL_TREE.

Well, seems build_must_not_throw_expr won't return NULL_TREE, and given:
#define CAN_HAVE_LOCATION_P(NODE) ((NODE) && EXPR_P (NODE))
I think EXPR_P should do ;).

 I would also phrase in a negative form
 the comment before the check, like This may not be true when...

Fixed.
 
 Second, I'm pretty sure about this one, I think we are standardizing
 on c++11 for testcases, by now c++0x is essentially legacy.

Sure, adjusted.  Thanks.

Regtested on x86_64-linux.

2013-09-24  Marek Polacek  pola...@redhat.com

PR c++/58516
cp/
* semantics.c (finish_transaction_stmt): Check for EXPR_P before
setting the expr location.

testsuite/
* g++.dg/tm/pr58516.C: New test.

--- gcc/cp/semantics.c.mp   2013-09-24 17:24:59.907548551 +0200
+++ gcc/cp/semantics.c  2013-09-24 19:14:18.590639066 +0200
@@ -5199,7 +5199,9 @@ finish_transaction_stmt (tree stmt, tree
 {
   tree body = build_must_not_throw_expr (TRANSACTION_EXPR_BODY (stmt),
 noex);
-  SET_EXPR_LOCATION (body, EXPR_LOCATION (TRANSACTION_EXPR_BODY (stmt)));
+  /* This may not be true when the STATEMENT_LIST is empty.  */
+  if (EXPR_P (body))
+SET_EXPR_LOCATION (body, EXPR_LOCATION (TRANSACTION_EXPR_BODY (stmt)));
   TREE_SIDE_EFFECTS (body) = 1;
   TRANSACTION_EXPR_BODY (stmt) = body;
 }
--- gcc/testsuite/g++.dg/tm/pr58516.C.mp2013-09-24 17:27:08.859055542 
+0200
+++ gcc/testsuite/g++.dg/tm/pr58516.C   2013-09-24 19:13:36.958487511 +0200
@@ -0,0 +1,7 @@
+// { dg-do compile }
+// { dg-options -std=c++11 -fgnu-tm }
+
+void foo()
+{
+  __transaction_atomic noexcept(false) {}
+}

Marek


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-09-24 Thread Jan Hubicka
 
 I looked at one that failed after 100 as well (20031204-1.c). In this
 case, it was due to expansion which was creating multiple branches/bbs
 from a logical OR and guessing incorrectly on how to assign the
 counts:
 
   if (octets == 4 && (*cp == ':' || *cp == '\0')) {
 
 The (*cp == ':' || *cp == '\0') part looked like the following going
 into RTL expansion:
 
   [20031204-1.c : 31:33] _29 = _28 == 58;
   [20031204-1.c : 31:33] _30 = _28 == 0;
   [20031204-1.c : 31:33] _31 = _29 | _30;
   [20031204-1.c : 31:18] if (_31 != 0)
 goto bb 16;
   else
 goto bb 19;
 
 where the result of the OR was always true, so bb 16 had a count of
 100 and bb 19 a count of 0. When it was expanded, the expanded version
 of the above turned into 2 bbs with a branch in between. Both
 comparisons were done in the first bb, but the first bb checked
 whether the result of the *cp == '\0' compare was true, and if not
 branched to the check for whether the *cp == ':' compare was true. It
 gave the branch to the second check against ':' a count of 0, so that
 bb got a count of 0 and was split out, and put the count of 100 on the
 fall through assuming the compare with '\0' always evaluated to true.
 In reality, this OR condition was always true because *cp was ':', not
 '\0'. Therefore, the count of 0 on the second block with the check for
 ':' was incorrect, we ended up trying to execute it, and failed.

I see, we produce:
;; if (_26 != 0)  

(insn 94 93 95 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:QI 107 [ D.2184 ])
(const_int 0 [0]))) a.c:31 -1
 (nil))

(insn 95 94 96 (set (reg:QI 122 [ D.2186 ])
(eq:QI (reg:CCZ 17 flags)
(const_int 0 [0]))) a.c:31 -1
 (nil)) 

(insn 96 95 97 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:QI 122 [ D.2186 ])
(const_int 0 [0]))) a.c:31 -1
 (nil))

(jump_insn 97 96 98 (set (pc)
(if_then_else (ne (reg:CCZ 17 flags)
(const_int 0 [0]))
(label_ref 100)
(pc))) a.c:31 -1
 (expr_list:REG_BR_PROB (const_int 6100 [0x17d4])
(nil)))
 
(insn 98 97 99 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:QI 108 [ D.2186 ])
(const_int 0 [0]))) a.c:31 -1 
 (nil)) 
 
(jump_insn 99 98 100 (set (pc)
(if_then_else (eq (reg:CCZ 17 flags)
(const_int 0 [0]))
(label_ref 0)
(pc))) a.c:31 -1
 (expr_list:REG_BR_PROB (const_int 3900 [0xf3c])
(nil)))

(code_label 100 99 0 14  [0 uses])

That is because we TER together _26 = _25 | _24 and if (_26 != 0)

First, I think the logic of do_jump should really be moved to trees.  It is not
doing anything that cannot be adequately represented by gimple.

I am not that certain we want to move it before profiling though.
 
 Presumably we had the correct profile data for both blocks, but the
 accuracy was reduced when the OR was represented as a logical
 computation with a single branch. We could change the expansion code
 to do something different, e.g. treat as a 50-50 branch. But we would
 still end up with integer truncation issues when there was a single
 training run. But that could be dealt with conservatively in the

Yep, but it is still better than what we have now - if the test above was
in a hot part of the program (i.e. not executed just once), we will end up
optimizing the second conditional for size.

So I think it is a do_jump bug to not distribute probabilities across the two
conditionals introduced.
 bbpart code as I suggested for the jump threading issue above. I.e. a
 cold block with incoming non-cold edges conservatively not marked cold
 for splitting.

Yep, we can probably do that, but we ought to fix the individual cases
above at least for a reasonable number of runs.

Will you look into logic of do_jump or shall I try to dive in?

Honza


Re: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-09-24 Thread Martin Jambor
Hi,

On Tue, Sep 24, 2013 at 12:02:17PM +0200, Richard Biener wrote:
 On Tue, Sep 24, 2013 at 4:52 AM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
  Hi,
 
  with the attached patch the read-side of the out of bounds accesses are 
  fixed.
  There is a single new test case pr57748-3.c that is derived from Martin's 
  test case.
  The test case does only test the read access and does not depend on part 1 
  of the
  patch.
 
  This patch was boot-strapped and regression tested on 
  x86_64-unknown-linux-gnu.
 
  Additionally I generated eCos and an eCos-application (on ARMv5 using packed
  structures) with an arm-eabi cross compiler, and looked for differences in 
  the
  disassembled code with and without this patch, but there were none.
 
  OK for trunk?
 
 Index: gcc/expr.c
 ===
 --- gcc/expr.c  (revision 202778)
 +++ gcc/expr.c  (working copy)
 @@ -9878,7 +9878,7 @@
   && modifier != EXPAND_STACK_PARM
   ? target : NULL_RTX),
  VOIDmode,
 -modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier);
 +EXPAND_MEMORY);
 
 /* If the bitfield is volatile, we want to access it in the
field's mode, not the computed mode.
 
 context suggests that we may arrive with EXPAND_STACK_PARM here
 which is a correctness modifier (see its docs).  But I'm not too familiar
 with the details of the various expand modifiers, Eric may be though.
 
 That said, I still believe that fixing the misalign path in expand_assignment
 would be better than trying to avoid it.  For this testcase the issue is
 again that expand_assignment passes the wrong mode/target to the
 movmisalign optab.

Perhaps I'm stating the obvious, but this is supposed to address a
separate bug in the expansion of the RHS (as opposed to the first bug
which was in the expansion of the LHS), and so we would have to make
expand_assignment to also examine potential misalignments in the RHS,
which it so far does not do, despite its complexity.

Having said that, I also dislike the patch, but I am quite convinced
that if we allow non-BLK structures with zero sized arrays, the fix
will be ugly - but I'll be glad to be shown I've been wrong.

Martin


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-09-24 Thread Teresa Johnson
On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka hubi...@ucw.cz wrote:

 I looked at one that failed after 100 as well (20031204-1.c). In this
 case, it was due to expansion which was creating multiple branches/bbs
 from a logical OR and guessing incorrectly on how to assign the
 counts:

   if (octets == 4 && (*cp == ':' || *cp == '\0')) {

 The (*cp == ':' || *cp == '\0') part looked like the following going
 into RTL expansion:

   [20031204-1.c : 31:33] _29 = _28 == 58;
   [20031204-1.c : 31:33] _30 = _28 == 0;
   [20031204-1.c : 31:33] _31 = _29 | _30;
   [20031204-1.c : 31:18] if (_31 != 0)
 goto bb 16;
   else
 goto bb 19;

 where the result of the OR was always true, so bb 16 had a count of
 100 and bb 19 a count of 0. When it was expanded, the expanded version
 of the above turned into 2 bbs with a branch in between. Both
 comparisons were done in the first bb, but the first bb checked
 whether the result of the *cp == '\0' compare was true, and if not
 branched to the check for whether the *cp == ':' compare was true. It
 gave the branch to the second check against ':' a count of 0, so that
 bb got a count of 0 and was split out, and put the count of 100 on the
 fall through assuming the compare with '\0' always evaluated to true.
 In reality, this OR condition was always true because *cp was ':', not
 '\0'. Therefore, the count of 0 on the second block with the check for
 ':' was incorrect, we ended up trying to execute it, and failed.

 I see, we produce:
 ;; if (_26 != 0)

 (insn 94 93 95 (set (reg:CCZ 17 flags)
 (compare:CCZ (reg:QI 107 [ D.2184 ])
 (const_int 0 [0]))) a.c:31 -1
  (nil))

 (insn 95 94 96 (set (reg:QI 122 [ D.2186 ])
 (eq:QI (reg:CCZ 17 flags)
 (const_int 0 [0]))) a.c:31 -1
  (nil))

 (insn 96 95 97 (set (reg:CCZ 17 flags)
 (compare:CCZ (reg:QI 122 [ D.2186 ])
 (const_int 0 [0]))) a.c:31 -1
  (nil))

 (jump_insn 97 96 98 (set (pc)
 (if_then_else (ne (reg:CCZ 17 flags)
 (const_int 0 [0]))
 (label_ref 100)
 (pc))) a.c:31 -1
  (expr_list:REG_BR_PROB (const_int 6100 [0x17d4])
 (nil)))

 (insn 98 97 99 (set (reg:CCZ 17 flags)
 (compare:CCZ (reg:QI 108 [ D.2186 ])
 (const_int 0 [0]))) a.c:31 -1
  (nil))

 (jump_insn 99 98 100 (set (pc)
 (if_then_else (eq (reg:CCZ 17 flags)
 (const_int 0 [0]))
 (label_ref 0)
 (pc))) a.c:31 -1
  (expr_list:REG_BR_PROB (const_int 3900 [0xf3c])
 (nil)))

 (code_label 100 99 0 14  [0 uses])

 That is because we TER together _26 = _25 | _24 and if (_26 != 0)

 First, I think the logic of do_jump should really be moved to trees.  It is not
 doing anything that cannot be adequately represented by gimple.

 I am not that certain we want to move it before profiling though.

 Presumably we had the correct profile data for both blocks, but the
 accuracy was reduced when the OR was represented as a logical
 computation with a single branch. We could change the expansion code
 to do something different, e.g. treat as a 50-50 branch. But we would
 still end up with integer truncation issues when there was a single
 training run. But that could be dealt with conservatively in the

 Yep, but it is still better than what we have now - if the test above was
 in hot part of program (i.e. not executed once), we will end up optimizing
 the second conditional for size.

 So I think it is a do_jump bug that it does not distribute probabilities across
 the two conditionals introduced.
 bbpart code as I suggested for the jump threading issue above. I.e. a
 cold block with incoming non-cold edges conservatively not marked cold
 for splitting.

 Yep, we can probably do that, but we ought to fix the individual cases
 above at least for a reasonable number of runs.

I made this change and it removed a few of the failures.

I looked at another case that still failed with 1 train run but passed
with 100. It turned out to be another truncation issue exposed by RTL
expansion, where we created some control flow for a memset builtin
which was in a block with an execution count of 1. Some of the blocks
got frequencies less than half the original block, so the count was
rounded down or truncated to 0. I noticed that in this case (as well
as the jump threading case I fixed by looking for non-zero incoming
edges in partitioning) that the bb frequency was non-zero.

Why not just have probably_never_executed_bb_p simply return false
when bb->frequency is non-zero (right now it does the opposite -
returns true when bb->frequency is 0)? Making this change removed a
bunch of other failures. With this change as well, there are only 3
cases that still fail with 1 train run that pass with 100. Need to
look at those.
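
For reference, a sketch of the proposed tweak (simplified; the real predicate in
predict.c also consults function-level hints, so treat this as an illustration
rather than the final patch):

/* Consider a block never executed only when neither its profile count
   nor its guessed frequency gives any evidence of execution, so counts
   truncated to zero by round-off no longer mark the block cold.  */
static bool
probably_never_executed_bb_p_sketch (basic_block bb)
{
  if (bb->frequency != 0)
    return false;
  return bb->count == 0;
}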


 Will you look into logic of do_jump or shall I try to dive in?

I can take a look, but probably won't have a chance until late this
week. If you don't get to it before then I will 

[wwwdocs] svnwrite.html -- extend documentation of gcc.gnu.org accounts

2013-09-24 Thread Gerald Pfeifer
Caroline made me realize that we should improve our documentation
around gcc.gnu.org accounts.

This is a first step in that direction.

(As a side effect, wrap the e-mail address in <div></div> to avoid
undesired line breaks.)

Applied.

Gerald

Index: svnwrite.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svnwrite.html,v
retrieving revision 1.27
diff -u -3 -p -r1.27 svnwrite.html
--- svnwrite.html   18 Aug 2013 20:59:25 -  1.27
+++ svnwrite.html   24 Sep 2013 18:20:48 -
@@ -411,9 +411,10 @@ like "Merge from mainline" or similar.</
 <hr />
 <h2><a name="account">Tips&amp;Tricks around your account</a></h2>
 
-<p>If you ever need to change the address e-mail to
-<i>username</i>@gcc.gnu.org is forwarded to, you can easily do so
-using</p>
+<p>Your gcc.gnu.org account also receives e-mail (and is what you
+use for Bugzilla).  If you ever need to change the address e-mail to
+<i>username</i>@gcc.gnu.org is forwarded to, you can easily
+do so using</p>
 <blockquote><pre>
 ssh <i>username</i>@gcc.gnu.org email mynewaddr...@example.com
 </pre></blockquote>


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-09-24 Thread Teresa Johnson
Rong - can you answer the questions below on the comdat patch?


On Tue, Sep 24, 2013 at 5:31 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi Honza,

 I am finally getting back to working on this after a few weeks of
 working on some other priorities.

 I am also trying to return to this, so good timming ;)
 Martin has got smaller C++ programs (Inkscape) to not touch cold segment
 during the startup with FDO (w/o partitioning). Firefox still does, I think
 the problem is lost samples due to different linker decisions even with LTO
 (i.e. the linker picks an object from a .a library at profile-generate time
 that it never passes later).

 I plan to look into that today.

 Did you mean to commit the above change? I see that it went in as part
 of r202258 but doesn't show up in the ChangeLog entry for that
 revision.

 Yes, I meant to check it in, but did not mean to do so w/o a ChangeLog.  I will
 fix that.

Should the same fix be applied to probably_never_executed_edge_p?


 
  In other cases it was mostly loop unrolling in combination with jump 
  threading. So
  I modified my script to separately report when failure happens for test 
  trained
  once and test trained hundred times.

 Thanks for the linker script. I reproduced your results. I looked at a
 couple cases. The first was one that failed after 1 training run only
 (2910-2.c). It was due to jump threading, which you noted was a
 problem. For this one I think we can handle it in the partitioning,
 since there is an FDO insanity that we could probably treat more
 conservatively when splitting.

 We should fix the roundoff issues - when I was introducing the
 frequency/probability/count system I made an assumptions that parts of 
 programs
 with very low counts do not matter, since they are not part of hot spot (and I
 cared only about the hot spot).  Now we care about identifying unlikely
 executed spots and we need to fix this.

 I looked at one that failed after 100 as well (20031204-1.c). In this
 case, it was due to expansion which was creating multiple branches/bbs
 from a logical OR and guessing incorrectly on how to assign the
 counts:

   if (octets == 4 && (*cp == ':' || *cp == '\0')) {

 The (*cp == ':' || *cp == '\0') part looked like the following going
 into RTL expansion:

   [20031204-1.c : 31:33] _29 = _28 == 58;
   [20031204-1.c : 31:33] _30 = _28 == 0;
   [20031204-1.c : 31:33] _31 = _29 | _30;
   [20031204-1.c : 31:18] if (_31 != 0)
 goto bb 16;
   else
 goto bb 19;

 where the result of the OR was always true, so bb 16 had a count of
 100 and bb 19 a count of 0. When it was expanded, the expanded version
 of the above turned into 2 bbs with a branch in between. Both
 comparisons were done in the first bb, but the first bb checked
 whether the result of the *cp == '\0' compare was true, and if not
 branched to the check for whether the *cp == ':' compare was true. It
 gave the branch to the second check against ':' a count of 0, so that
 bb got a count of 0 and was split out, and put the count of 100 on the
 fall through assuming the compare with '\0' always evaluated to true.
 In reality, this OR condition was always true because *cp was ':', not
 '\0'. Therefore, the count of 0 on the second block with the check for
 ':' was incorrect, we ended up trying to execute it, and failed.

 Presumably we had the correct profile data for both blocks, but the
 accuracy was reduced when the OR was represented as a logical
 computation with a single branch. We could change the expansion code
 to do something different, e.g. treat as a 50-50 branch. But we would
 still end up with integer truncation issues when there was a single
 training run. But that could be dealt with conservatively in the
 bbpart code as I suggested for the jump threading issue above. I.e. a
 cold block with incoming non-cold edges conservatively not marked cold
 for splitting.

 
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/2422-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/2910-2.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20020413-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20030903-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20031204-1.c
  FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20031204-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20060420-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20060905-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120427-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120427-2.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120808-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20121108-1.c
  FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20121108-1.c
  FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920501-6.c
  FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920501-6.c
  FAIL1 

Re: [PATCH]: Fix use of __builtin_eh_pointer in EH_ELSE

2013-09-24 Thread Richard Henderson
On 09/03/2013 07:08 AM, Tristan Gingold wrote:
 Hi,
 
 The field state-ehp_region wasn't updated before lowering constructs in the 
 eh
 path of EH_ELSE.  As a consequence, __builtin_eh_pointer is lowered to 0 (or
 possibly to a wrong region number) in this path.
 
 The only user of EH_ELSE looks to be trans-mem.c:lower_transaction, and the
 consequence of that is a memory leak.
 
 Furthermore, according to calls.c:flags_from_decl_or_type, the 
 transaction_pure
 attribute must be set on the function type, not on the function declaration.
 Hence the change to declare __builtin_eh_pointer.
 (I don't understand the guard condition to set the attribute for it in tree.c.
  Why is 'builtin_decl_explicit_p (BUILT_IN_TM_LOAD_1)' needed in addition to
  flag_tm ?)

Clearly these are totally unrelated and should not be in the same patch.

The BUILT_IN_TM_LOAD_1 thing looks like it might have something to do with
non-C front ends, which don't all create the builtins.

 diff --git a/gcc/tree-eh.c b/gcc/tree-eh.c
 index 6ffbd26..86ee77e 100644
 --- a/gcc/tree-eh.c
 +++ b/gcc/tree-eh.c
 @@ -1093,8 +1093,12 @@ lower_try_finally_nofallthru (struct leh_state *state,
  
 if (tf->may_throw)
    {
  +   eh_region prev_ehp_region = state->ehp_region;
  +
  finally = gimple_eh_else_e_body (eh_else);
  +   state->ehp_region = tf->region;
  lower_eh_constructs_1 (state, finally);
  +   state->ehp_region = prev_ehp_region;

This doesn't look right.

Does it work to save and restore ehp_region in the two places that
we currently set it instead?


r~


Re: [PATCH 0/3] Support some cases of inheritance in gengtype; use it for symtab

2013-09-24 Thread David Malcolm
On Mon, 2013-09-23 at 13:19 +0200, Richard Biener wrote:
 On Fri, Sep 20, 2013 at 4:05 PM, David Malcolm dmalc...@redhat.com wrote:
  There have been various discussions about how to handle inheritance
  within ggc and PCH: whether to extend gengtype to support C++ syntax
  such as templates, or to require people to use GTY((user)) and manually
  write field-traversal.
 
  I've attempted to introduce class hierarchies in a few places recently,
  for example for the symtab types:
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00668.html
 
  and the gimple types:
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01788.html
 
  In each case the GTY((user)) is always the most painful part: it
  requires ugly hand-written code which must be kept up-to-date with
  changes to the underlying types, lest we run into
  difficult-to-track-down crashes inside the GC and PCH.
 
  I got sick of writing and debugging these GTY((user)) traversal
  functions, for IMHO quite modest use of C++ (no templates etc), so I had
  a go at implementing support for inheritance in gengtype, as seen in the
  following patch series.
 
  The key compromise I'm introducing, as I see it, is to require the user
  to opt in to the support for the inheritance, on a class-by-class
  basis, and not attempt to support all of C++, but only single
  inheritance, without templates.
 
 Please make sure to update doc/gty.texi with this feature and its
 restrictions.

Thanks; I've added documentation in the latest version of the patches:
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01803.html

  The idea is that if you add a GTY marker to a subclass, you are
  opting in to gengtype, and on any restrictions on the base class that
  gengtype needs to impose to avoid having to deal with all of C++ syntax:
  specifically, no multiple inheritance, and the base class can't be a
  template.
 
  If those restrictions are too much, e.g. you have something like:
 
 struct alloc_pool_hasher : typed_noop_remove <alloc_pool_descriptor>
 
  then don't GTY-mark the class, and gengtype won't attempt to parse the
  base class (as per the current parser).
 
 Such restrictions are bad - does gengtype at least diagnose the case where
 a bad base class is used?

I'm sorry if my initial email was unclear.  The point I was trying to
make is that we already have restrictions: we currently can't use any
kind of inheritance with GTY, unless we use GTY((user)) and do things
by hand.  This patch makes things less restrictive, in that some forms
of inheritance can now be used without needing to use GTY((user)).

The example I picked is one that is already in the source tree, and
which doesn't use GTY.  I chose that particular example as an example of
inheritance within the source tree, to illustrate that adding this
parser support to gengtype doesn't break this existing code: it doesn't
introduce a restriction to the use of C++ in the code as a whole.  The
restriction merely applies to the feature itself, and can easily be
avoided (e.g. by not using GTY, or by using GTY((user))).

To answer your question, in the example above, if one adds a GTY marker
to alloc_pool_hasher:

struct GTY(()) alloc_pool_hasher : typed_noop_remove <alloc_pool_descriptor>

and adds alloc-pool.c to GTFILES so that it's present in gtyp-input.list
and thus gets parsed by gengtype, then one gets this error:

../../src/gcc/alloc-pool.c:86: parse error: expected '{', have '<'

So yes: gengtype will fail immediately with an error message.

Note that you can still either omit GTY, or use GTY((user)) for such
cases, fixing the error message. (however, fwiw, it's meaningless to try
to GTY-mark this particular class, given that it's a traits type, not
holding any data itself).

 My initial worry about C++-ifying gengtype was exactly because of such
 arbitrary restrictions. 

Note that gengtype already has an arbitrary restriction: no inheritance,
unless you use GTY((user)).

This patch simply loosens the restriction somewhat, by allowing some
inheritance.  If code still does not fit within the looser restrictions, that
code can continue to use GTY((user)), as before.
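
To illustrate what opting in looks like, here is a sketch of the supported
single-inheritance pattern, with hypothetical type and tag names (the real
syntax and its restrictions are in the gty.texi changes referenced above):

/* A GTY-marked base class carrying a discriminator field and a "desc"
   string, and a derived class selected by a matching "tag"; gengtype
   then generates the traversal code instead of requiring hand-written
   GTY((user)) markers.  */
struct GTY((desc ("%h.kind"), tag ("EX_BASE"))) example_base
{
  int kind;      /* discriminator referenced by the desc string */
  tree payload;
};

struct GTY((tag ("EX_DERIVED"))) example_derived : public example_base
{
  tree extra;    /* walked by the generated marking functions */
};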

 Isn't it possible to remove the restriction by
 requiring to GTY annotate the subclassing itself?  Like
 
 struct alloc_pool_hasher : typed_noop_remove <alloc_pool_descriptor>
 GTY((...))
 
 with whatever information required to parse the class given explicitely as
 arguments of the GTY?  Like for the single inheritance case with a descriptor
 you'd give out a tag to identify all types participating in the
 inheritance group

FWIW it's meaningless for this particular class to be GTY-marked: as
mentioned above, it's a bundle of traits for use elsewhere, and doesn't
actually hold any data.

But speaking to the general point about templates: I don't have a need
to support automated generation of traversal functions for templates, we
can use GTY((user)) for that if we need to.

In contrast, I'm regularly running into a need for doing it for
non-template base 

Re: [PATCH] Refactor type handling in get_alias_set, fix PR58513

2013-09-24 Thread H.J. Lu
On Tue, Sep 24, 2013 at 4:00 AM, Richard Biener rguent...@suse.de wrote:

 As noted in PR58513 the alias type comparison in operand_equal_p for
 MEM_REF and TARGET_MEM_REF is overly restrictive.  The following
 adds a new alias_ptr_types_compatible_p helper for a less conservative
 comparison and refactors get_alias_set and reference_alias_ptr_type
 to share code and behave in a more similar manner.

 Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

 Richard.

 2013-09-24  Richard Biener  rguent...@suse.de

 PR middle-end/58513
 * tree.c (reference_alias_ptr_type): Move ...
 * alias.c (reference_alias_ptr_type): ... here and implement
 in terms of the new reference_alias_ptr_type_1.
 (ref_all_alias_ptr_type_p): New helper.
 (get_deref_alias_set_1): Drop flag_strict_aliasing here,
 use ref_all_alias_ptr_type_p.
 (get_deref_alias_set): Add flag_strict_aliasing check here.
 (reference_alias_ptr_type_1): New function, split out from ...
 (get_alias_set): ... here.
 (alias_ptr_types_compatible_p): New function.
 * alias.h (reference_alias_ptr_type): Declare.
 (alias_ptr_types_compatible_p): Likewise.
 * tree.h (reference_alias_ptr_type): Remove.
 * fold-const.c (operand_equal_p): Use alias_ptr_types_compatible_p
 to compare MEM_REF alias types.


This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58521


-- 
H.J.


Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-09-24 Thread Wei Mi
This is the updated patch2.
Changed:
1. For cmp/test with rip-relative addressing mem operand, don't group
insns. Bulldozer also doesn't support fusion for cmp/test with both
displacement MEM and immediate operand, while m_CORE_ALL doesn't
support fusion for cmp/test with MEM and immediate operand. I simply
chose to use the more stringent constraint here (m_CORE_ALL's
constraint).
2. Add Bulldozer back and merge TARGET_FUSE_CMP_AND_BRANCH_64 and
TARGET_FUSE_CMP_AND_BRANCH_32.

bootstrap and regression pass. ok for trunk?
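
Since the haifa-sched.c hunks are not included in the (truncated) diff below,
here is a rough sketch of the grouping idea described in the ChangeLog.  The
hook names are taken from the ChangeLog; their exact placement under targetm
and the details of the real try_group_insn may differ:

/* Glue a conditional jump to the preceding flag-setting instruction
   with SCHED_GROUP_P so the scheduler keeps the pair adjacent and the
   hardware can macro-fuse it.  */
static void
try_group_insn_sketch (rtx insn)
{
  rtx prev;

  if (!targetm.sched.macro_fusion_p
      || !targetm.sched.macro_fusion_p ()
      || !any_condjump_p (insn))
    return;

  prev = prev_nonnote_nondebug_insn (insn);
  if (prev == NULL_RTX
      || BLOCK_FOR_INSN (prev) != BLOCK_FOR_INSN (insn)
      || !targetm.sched.macro_fusion_pair_p (prev, insn))
    return;

  SCHED_GROUP_P (insn) = 1;
}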

2013-09-24  Wei Mi  w...@google.com

* gcc/config/i386/i386.c (rip_relative_addr_p): New Function.
(ix86_macro_fusion_p): Ditto.
(ix86_macro_fusion_pair_p): Ditto.
* gcc/config/i386/i386.h: Add new tune features about macro-fusion.
* gcc/config/i386/x86-tune.def (DEF_TUNE): Ditto.
* gcc/doc/tm.texi: Generated.
* gcc/doc/tm.texi.in: Ditto.
* gcc/haifa-sched.c (try_group_insn): New Function.
(group_insns_for_macro_fusion): Ditto.
(sched_init): Call group_insns_for_macro_fusion.
* gcc/sched-rgn.c (add_branch_dependences): Keep insns in
a SCHED_GROUP at the end of a BB so that they retain their location.
* gcc/target.def: Add two hooks: macro_fusion_p and
macro_fusion_pair_p.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1fd3f60..4a04778 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -24856,6 +24856,167 @@ ia32_multipass_dfa_lookahead (void)
 }
 }

+/* Extracted from ix86_print_operand_address. Check whether ADDR is a
+   rip-relative address.  */
+
+static bool
+rip_relative_addr_p (rtx addr)
+{
+  struct ix86_address parts;
+  rtx base, index, disp;
+  int ok;
+
+  if (GET_CODE (addr) == UNSPEC && XINT (addr, 1) == UNSPEC_VSIBADDR)
+{
+  ok = ix86_decompose_address (XVECEXP (addr, 0, 0), parts);
+  parts.index = XVECEXP (addr, 0, 1);
+}
+  else if (GET_CODE (addr) == UNSPEC && XINT (addr, 1) == UNSPEC_LEA_ADDR)
+ok = ix86_decompose_address (XVECEXP (addr, 0, 0), parts);
+  else
+ok = ix86_decompose_address (addr, parts);
+
+  gcc_assert (ok);
+  base = parts.base;
+  index = parts.index;
+  disp = parts.disp;
+
+  if (TARGET_64BIT && !base && !index)
+{
+  rtx symbol = disp;
+
+  if (GET_CODE (disp) == CONST
+      && GET_CODE (XEXP (disp, 0)) == PLUS
+      && CONST_INT_P (XEXP (XEXP (disp, 0), 1)))
+   symbol = XEXP (XEXP (disp, 0), 0);
+
+  if (GET_CODE (symbol) == LABEL_REF
+ || (GET_CODE (symbol) == SYMBOL_REF
+         && SYMBOL_REF_TLS_MODEL (symbol) == 0))
+   return true;
+}
+  if (flag_pic && !base && !index)
+{
+  if (GET_CODE (disp) == CONST
+      && GET_CODE (XEXP (disp, 0)) == UNSPEC
+      && (XINT (XEXP (disp, 0), 1) == UNSPEC_PCREL
+          || XINT (XEXP (disp, 0), 1) == UNSPEC_GOTPCREL
+          || (TARGET_64BIT
+              && XINT (XEXP (disp, 0), 1) == UNSPEC_GOTNTPOFF)))
+   return true;
+}
+  return false;
+}
+
+/* Return true if target platform supports macro-fusion.  */
+
+static bool
+ix86_macro_fusion_p ()
+{
+  if (TARGET_FUSE_CMP_AND_BRANCH)
+return true;
+  else
+return false;
+}
+
+/* Check whether current microarchitecture support macro fusion
+   for insn pair CONDGEN + CONDJMP. Refer to
+   Intel Architectures Optimization Reference Manual. */
+
+static bool
+ix86_macro_fusion_pair_p (rtx condgen, rtx condjmp)
+{
+  rtx src, dest;
+  rtx single_set = single_set (condgen);
+  enum rtx_code ccode;
+  rtx compare_set = NULL_RTX, test_if, cond;
+  rtx alu_set = NULL_RTX, addr = NULL_RTX;
+
+  if (get_attr_type (condgen) != TYPE_TEST
+      && get_attr_type (condgen) != TYPE_ICMP
+      && get_attr_type (condgen) != TYPE_INCDEC
+      && get_attr_type (condgen) != TYPE_ALU)
+return false;
+
+  if (single_set == NULL_RTX
+      && !TARGET_FUSE_ALU_AND_BRANCH)
+return false;
+
+  if (single_set != NULL_RTX)
+compare_set = single_set;
+  else
+{
+  int i;
+  rtx pat = PATTERN (condgen);
+  for (i = 0; i  XVECLEN (pat, 0); i++)
+   if (GET_CODE (XVECEXP (pat, 0, i)) == SET)
+ {
+   rtx set_src = SET_SRC (XVECEXP (pat, 0, i));
+   if (GET_CODE (set_src) == COMPARE)
+ compare_set = XVECEXP (pat, 0, i);
+   else
+ alu_set = XVECEXP (pat, 0, i);
+ }
+}
+  if (compare_set == NULL_RTX)
+return false;
+  src = SET_SRC (compare_set);
+  if (GET_CODE (src) != COMPARE)
+return false;
+
+  /* Macro-fusion for cmp/test MEM-IMM + conditional jmp is not
+ supported.  */
+  if ((MEM_P (XEXP (src, 0))
+       && CONST_INT_P (XEXP (src, 1)))
+      || (MEM_P (XEXP (src, 1))
+          && CONST_INT_P (XEXP (src, 0))))
+return false;
+
+  /* No fusion for RIP-relative address.  */
+  if (MEM_P (XEXP (src, 0)))
+addr = XEXP (XEXP (src, 0), 0);
+  else if (MEM_P (XEXP (src, 1)))
+addr = XEXP (XEXP (src, 1), 0);
+  if (addr  

Re: RFA: Store the REG_BR_PROB probability directly as an int

2013-09-24 Thread Andreas Schwab
Richard Sandiford rdsandif...@googlemail.com writes:

 Sorry for the breakage.  I think we need to handle INT_LIST in the same way
 as INSN_LIST though, and eliminate in XEXP (x, 1).

 How about the attached?  Testing in progress...

Works for me as well.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


[gomp4] Taskgroup and cancellation compiler fixes

2013-09-24 Thread Jakub Jelinek
Hi!

This patch:
1) defers expansion of taskgroup into GOMP_taskgroup_start and
   GOMP_taskgroup_end until omplower/ompexp, mainly so that e.g. invalid
   nesting can be diagnosed (e.g. #pragma omp cancel * inside of
   #pragma omp taskgroup nested in some other construct)
2) diagnoses structured block restrictions also for #pragma omp target{,data}
   and #pragma omp teams and taskgroup
3) fixes a bug, where cancellation of sections, for and parallel/task
   jumped to after the dtors rather than before them, because then some
   variables won't be properly destructed

Will commit tomorrow to gomp-4_0-branch unless somebody finds issues with
this.
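
For readers less familiar with the construct, a minimal example (hypothetical,
not one of the new testcases; should_stop and do_chunk are made-up helpers) of
the nesting this patch lowers and diagnoses, a cancel of the enclosing
taskgroup from inside a task:

/* Hypothetical helpers, only to give the example some shape.  */
extern int should_stop (int i);
extern void do_chunk (int i);

void
work (void)
{
  #pragma omp parallel
  #pragma omp single
  #pragma omp taskgroup
  {
    int i;
    for (i = 0; i < 64; i++)
      {
        #pragma omp task firstprivate (i)
        {
          if (should_stop (i))
            {
              #pragma omp cancel taskgroup
            }
          do_chunk (i);
        }
      }
  }
}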

2013-09-24  Jakub Jelinek  ja...@redhat.com

* omp-low.c (lower_omp_sections, lower_omp_for, lower_omp_taskreg):
Emit ctx-cancel_label before destructors.

* gimple-pretty-print.c (dump_gimple_omp_block,
pp_gimple_stmt_1): Handle GIMPLE_OMP_TASKGROUP.
* tree-nested.c (convert_nonlocal_reference_stmt,
convert_local_reference_stmt, convert_gimple_call): Likewise.
* tree-cfg.c (make_edges): Likewise.
* gimple.h (gimple_build_omp_taskgroup): New prototype.
(gimple_has_substatement): Handle GIMPLE_OMP_TASKGROUP.
(CASE_GIMPLE_OMP): Likewise.
* gimplify.c (is_gimple_stmt, gimplify_expr): Handle OMP_TASKGROUP.
* omp-low.c (check_omp_nesting_restrictions): Warn if #pragma omp
cancel is used in nowait loop or sections construct.
(scan_omp_1_stmt, expand_omp_synch, expand_omp, lower_omp_1): Handle
GIMPLE_OMP_TASKGROUP.
(diagnose_sb_1, diagnose_sb_2): Likewise.  Handle GIMPLE_OMP_TARGET
and GIMPLE_OMP_TEAMS.
(lower_omp_taskgroup): New function.
* tree-inline.c (remap_gimple_stmt, estimate_num_insns): Handle
GIMPLE_OMP_TASKGROUP.
* gimple-low.c (lower_stmt): Likewise.
* tree.h (OMP_TASKGROUP_BODY): Define.
* tree.def (OMP_TASKGROUP): New tree.
* tree-pretty-print.c (dump_generic_node): Handle OMP_TASKGROUP.
* gimple.c (gimple_build_omp_taskgroup): New function.
(walk_gimple_stmt, gimple_copy): Handle GIMPLE_OMP_TASKGROUP.
* gimple.def (GIMPLE_OMP_TASKGROUP): New GIMPLE code.
c-family/
* c-common.h (c_finish_omp_taskgroup): New prototype.
* c-omp.c (c_finish_omp_taskgroup): New function.
c/
* c-parser.c (c_parser_omp_taskgroup): Return tree.
Don't call c_begin_omp_taskgroup.
(c_parser_omp_construct): Adjust caller.
* c-typeck.c (c_begin_omp_taskgroup, c_finish_omp_taskgroup): Remove.
* c-tree.h (c_begin_omp_taskgroup, c_finish_omp_taskgroup): Remove.
cp/
* parser.c (cp_parser_omp_taskgroup): Return tree.  Use
c_finish_omp_taskgroup.
(cp_parser_omp_construct): Adjust caller.
* cp-array-notation.c (expand_array_notation_exprs): Handle
OMP_TASKGROUP.
* pt.c (tsubst_expr): Handle OMP_TASKGROUP.
* semantics.c (finish_omp_taskgroup): Remove.
* cp-tree.h (finish_omp_taskgroup): Remove.
testsuite/
* g++.dg/gomp/target-1.C: New test.
* g++.dg/gomp/target-2.C: New test.
* g++.dg/gomp/teams-1.C: New test.
* g++.dg/gomp/taskgroup-1.C: New test.
* gcc.dg/gomp/teams-1.c: New test.
* gcc.dg/gomp/taskgroup-1.c: New test.
* gcc.dg/gomp/target-1.c: New test.
* gcc.dg/gomp/target-2.c: New test.
* c-c++-common/gomp/cancel-1.c: New test.

--- gcc/gimple-pretty-print.c.jj2013-09-13 16:48:21.0 +0200
+++ gcc/gimple-pretty-print.c   2013-09-23 15:06:23.857832390 +0200
@@ -1360,8 +1360,8 @@ dump_gimple_omp_sections (pretty_printer
 }
 }
 
-/* Dump a GIMPLE_OMP_{MASTER,ORDERED,SECTION} tuple on the pretty_printer
-   BUFFER.  */
+/* Dump a GIMPLE_OMP_{MASTER,TASKGROUP,ORDERED,SECTION} tuple on the
+   pretty_printer BUFFER.  */
 
 static void
 dump_gimple_omp_block (pretty_printer *buffer, gimple gs, int spc, int flags)
@@ -1376,6 +1376,9 @@ dump_gimple_omp_block (pretty_printer *b
    case GIMPLE_OMP_MASTER:
  pp_string (buffer, "#pragma omp master");
  break;
+   case GIMPLE_OMP_TASKGROUP:
+ pp_string (buffer, "#pragma omp taskgroup");
+ break;
    case GIMPLE_OMP_ORDERED:
  pp_string (buffer, "#pragma omp ordered");
  break;
@@ -2131,6 +2134,7 @@ pp_gimple_stmt_1 (pretty_printer *buffer
   break;
 
 case GIMPLE_OMP_MASTER:
+case GIMPLE_OMP_TASKGROUP:
 case GIMPLE_OMP_ORDERED:
 case GIMPLE_OMP_SECTION:
   dump_gimple_omp_block (buffer, gs, spc, flags);
--- gcc/tree-nested.c.jj2013-09-13 16:49:05.0 +0200
+++ gcc/tree-nested.c   2013-09-23 15:06:23.857832390 +0200
@@ -1309,6 +1309,7 @@ convert_nonlocal_reference_stmt (gimple_
 
 case GIMPLE_OMP_SECTION:
 case GIMPLE_OMP_MASTER:
+case GIMPLE_OMP_TASKGROUP:
 case GIMPLE_OMP_ORDERED:
   walk_body 

[gomp4] Taskgroup library support

2013-09-24 Thread Jakub Jelinek
Hi!

This implements taskgroups in the library and their cancellation.
The implementation has been pretty straightforward, though I had to
consolidate some operations from {gomp_barrier_handle_tasks, GOMP_taskwait}
and the new GOMP_taskgroup_end to new inlines, because it became
non-maintainable.
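
For orientation, a sketch of the new per-taskgroup bookkeeping.  The field
names are guesses based on the description here and the ChangeLog below, so
the actual libgomp definition may differ:

/* Roughly what GOMP_taskgroup_start pushes and GOMP_taskgroup_end waits
   on and pops; tasks created inside the region are queued here in
   addition to their parent's queue.  */
struct gomp_taskgroup
{
  struct gomp_taskgroup *prev;   /* enclosing taskgroup, if any */
  struct gomp_task *children;    /* queue of tasks in this group */
  bool in_taskgroup_wait;        /* a thread is in GOMP_taskgroup_end */
  bool cancelled;                /* "#pragma omp cancel taskgroup" seen */
  size_t num_children;           /* tasks not yet finished */
};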

In addition to this, the patch disallows plain discarding of tasks for which
we've already run the copy constructors; those will be executed and will be
cancelled only if they encounter a cancellation point.  There are omp-lang
discussions about whether the standard should be changed so that the
copy ctors would be run only in the task's outlined body, not earlier.

And, lastly, this patch adds various extra cancellation testcase that
revealed the omp-low.c issue fixed in the previous patch.

Will commit tomorrow unless somebody complains.

2013-09-24  Jakub Jelinek  ja...@redhat.com

* parallel.c (GOMP_cancellation_point, GOMP_cancel): Handle
GIMPLE_CANCEL_TASKGROUP cancellation.
* libgomp.h (struct gomp_task): Add next_taskgroup, prev_taskgroup,
taskgroup and copy_ctors_done fields.
(struct gomp_taskgroup): New type.
* task.c (gomp_init_task): Initialize copy_ctors_done and taskgroup
fields.
(GOMP_task): Don't start a new thread also if it's taskgroup has
been cancelled.  Set copy_ctors_done field if needed.  Initialize
taskgroup field.  If copy_ctors_done and already cancelled, don't
discard the task.  If taskgroup is non-NULL, enqueue the task
into taskgroup queue.
(gomp_task_run_pre, gomp_task_run_post_remove_parent,
gomp_task_run_post_remove_taskgroup): New inline functions.
(gomp_barrier_handle_tasks, GOMP_taskwait): Use them.
(GOMP_taskgroup_start, GOMP_taskgroup_end): Implement taskgroup
support.
* testsuite/libgomp.c++/cancel-parallel-1.C: New test.
* testsuite/libgomp.c++/cancel-parallel-2.C: New test.
* testsuite/libgomp.c++/cancel-parallel-3.C: New test.
* testsuite/libgomp.c++/cancel-for-1.C: New test.
* testsuite/libgomp.c++/cancel-for-1.C: New test.
* testsuite/libgomp.c++/cancel-taskgroup-1.C: New test.
* testsuite/libgomp.c++/cancel-taskgroup-2.C: New test.
* testsuite/libgomp.c++/cancel-taskgroup-3.C: New test.
* testsuite/libgomp.c++/cancel-test.h: New file.
* testsuite/libgomp.c++/cancel-sections-1.C: New test.
* testsuite/libgomp.c++/taskgroup-1.C: New test.
* testsuite/libgomp.c/cancel-taskgroup-1.c: New test.
* testsuite/libgomp.c/cancel-taskgroup-2.c: New test.
* testsuite/libgomp.c/taskgroup-1.c: New test.
* testsuite/libgomp.c/cancel-parallel-3.c (do_some_work): Use
void return type.

--- libgomp/parallel.c.jj   2013-09-24 12:52:53.271887599 +0200
+++ libgomp/parallel.c  2013-09-24 13:10:29.345564211 +0200
@@ -147,7 +147,8 @@ GOMP_cancellation_point (int which)
   if (!gomp_cancel_var)
 return false;
 
-  struct gomp_team *team = gomp_thread ()->ts.team;
+  struct gomp_thread *thr = gomp_thread ();
+  struct gomp_team *team = thr->ts.team;
   if (which & (GOMP_CANCEL_LOOP | GOMP_CANCEL_SECTIONS))
 {
   if (team == NULL)
@@ -156,10 +157,11 @@ GOMP_cancellation_point (int which)
 }
   else if (which & GOMP_CANCEL_TASKGROUP)
 {
-  /* FIXME: Check if current taskgroup has been cancelled,
-then fallthru into the GOMP_CANCEL_PARALLEL case,
-because if the current parallel has been cancelled,
-all tasks should be cancelled too.  */
+  if (thr->task->taskgroup && thr->task->taskgroup->cancelled)
+   return true;
+  /* FALLTHRU into the GOMP_CANCEL_PARALLEL case,
+as #pragma omp cancel parallel also cancels all explicit
+tasks.  */
 }
   if (team)
  return gomp_team_barrier_cancelled (team->barrier);
@@ -176,7 +178,8 @@ GOMP_cancel (int which, bool do_cancel)
   if (!do_cancel)
 return ialias_call (GOMP_cancellation_point) (which);
 
-  struct gomp_team *team = gomp_thread ()->ts.team;
+  struct gomp_thread *thr = gomp_thread ();
+  struct gomp_team *team = thr->ts.team;
   if (which & (GOMP_CANCEL_LOOP | GOMP_CANCEL_SECTIONS))
 {
   /* In orphaned worksharing region, all we want to cancel
@@ -187,7 +190,12 @@ GOMP_cancel (int which, bool do_cancel)
 }
   else if (which & GOMP_CANCEL_TASKGROUP)
 {
-  /* FIXME: Handle taskgroup cancellation.  */
+  if (thr->task->taskgroup && !thr->task->taskgroup->cancelled)
+   {
+ gomp_mutex_lock (team->task_lock);
+ thr->task->taskgroup->cancelled = true;
+ gomp_mutex_unlock (team->task_lock);
+   }
   return true;
 }
   team->team_cancelled = 1;
--- libgomp/libgomp.h.jj2013-09-24 12:52:53.274887599 +0200
+++ libgomp/libgomp.h   2013-09-24 13:10:29.344564253 +0200
@@ -253,6 +253,8 @@ enum gomp_task_kind
   GOMP_TASK_TIED
 };
 
+struct gomp_taskgroup;
+
 /* This 

Re: [PATCH, LRA] Remove REG_DEAD and REG_UNUSED notes.

2013-09-24 Thread Steven Bosscher
On Tue, Sep 24, 2013 at 5:03 PM, Eric Botcazou wrote:
 This patch removes REG_DEAD and REG_UNUSED notes in update_inc_notes,
 as it is what the function is supposed to do (see the comments) and as
 keeping these notes produce some failures, at least on ARM.

 The description is too terse.  In the RTL middle-end, you shouldn't have to
 manually deal with the REG_DEAD and REG_UNUSED notes (unlike REG_EQUAL and
 REG_EQUIV notes), as the DF framework is supposed to do it for you.


Unfortunately LRA uses its own DF framework.

Ciao!
Steven


[PATCH, PR 58441] Fix location where libvtv puts its headers

2013-09-24 Thread Caroline Tice
The following patch updates where libvtv installs its header files
(fixing PR 58441).  This is a trivial change, and affects only libvtv,
so I am going to go ahead and commit it.

-- Caroline Tice
cmt...@google.com

2013-09-24  Caroline Tice  cmt...@google.com

* Makefile.am:  Change libvtv_includedir to the directory used by
the other libraries rather than the top include directory.
* Makefile.in: Regenerated.


libvtv-headers.patch
Description: Binary data


Re: [PATCH, LRA] Remove REG_DEAD and REG_UNUSED notes.

2013-09-24 Thread Mike Stump
On Sep 24, 2013, at 12:23 PM, Steven Bosscher stevenb@gmail.com wrote:
 On Tue, Sep 24, 2013 at 5:03 PM, Eric Botcazou wrote:
 This patch removes REG_DEAD and REG_UNUSED notes

 DF framework is supposed to do it for you.

 Unfortunately LRA uses its own DF framework.

Is that a bug, or a feature?


Re: [PATCH] Bug fix: *var and MEM[(const int *)var] (var has int* type) are not treated as the same data ref.

2013-09-24 Thread Cong Hou
Nice fix! I noticed that this patch has already been committed to trunk.

Thank you very much, Richard!



Cong

On Tue, Sep 24, 2013 at 1:49 AM, Richard Biener rguent...@suse.de wrote:
 On Tue, 24 Sep 2013, Richard Biener wrote:

 On Tue, 24 Sep 2013, Jakub Jelinek wrote:

  Hi!
 
  On Mon, Sep 23, 2013 at 05:26:13PM -0700, Cong Hou wrote:
 
  Missing ChangeLog entry.
 
   --- gcc/testsuite/gcc.dg/alias-14.c (revision 0)
   +++ gcc/testsuite/gcc.dg/alias-14.c (revision 0)
 
  Vectorizer tests should go into gcc.dg/vect/ instead, or, if they are
  for a single target (but there is no reason why this should be a single
  target), into gcc.target/target/.
 
   --- gcc/fold-const.c (revision 202662)
   +++ gcc/fold-const.c (working copy)
   @@ -2693,8 +2693,9 @@ operand_equal_p (const_tree arg0, const_
operand_equal_p (TYPE_SIZE (TREE_TYPE (arg0)),
   TYPE_SIZE (TREE_TYPE (arg1)), flags)))
   types_compatible_p (TREE_TYPE (arg0), TREE_TYPE (arg1))
   -   (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg0, 1)))
   -  == TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg1, 1
   +   types_compatible_p (
   +   TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg0, 1))),
   +   TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg1, 1
   OP_SAME (0)  OP_SAME (1));
 
  This looks wrong.  types_compatible_p will happily return true say
  for unsigned long and unsigned long long types on x86_64, because
  they are both integral types with the same precision, but the second
  argument of MEM_REF contains aliasing information, where the distinction
  between the two is important.
  So, while == comparison of main variant is too strict, types_compatible_p
  is too weak, so I guess you need to write a new predicate that will either
  handle the == and a few special cases that are safe to be handled, or
  look for what exactly we use the type of the second MEM_REF argument
  and check those properties.  We certainly need that
  get_deref_alias_set_1 and get_deref_alias_set return the same values
  for both the types, but whether that is the only information we are using,
  not sure, CCing Richard.

 Using TYPE_MAIN_VARIANT is exactly correct - this is the best we
 can do that will work with all frontends.  TYPE_MAIN_VARIANT
 guarantees that the alias-sets stay the same:

   /* If the innermost reference is a MEM_REF that has a
  conversion embedded treat it like a VIEW_CONVERT_EXPR above,
  using the memory access type for determining the alias-set.  */
   if (TREE_CODE (inner) == MEM_REF
       && TYPE_MAIN_VARIANT (TREE_TYPE (inner))
          != TYPE_MAIN_VARIANT
             (TREE_TYPE (TREE_TYPE (TREE_OPERAND (inner, 1)))))
     return get_deref_alias_set (TREE_OPERAND (inner, 1));

 so we cannot change the compatibility checks without touching the
 alias-set deriving code.  For the testcase in question we have
 MEM[(const int *)_7] vs. MEM[(int *)_7] and unfortunately pointer
 and reference types are not variant types.

 We also cannot easily resort to pointed-to types as our all-beloved
 ref-all qualification is on the pointer type rather than on the
 pointed-to type.

 But yes, we could implement a more complicated predicate

 bool
 alias_ptr_types_compatible_p (const_tree t1, const_tree t2)
 {
   t1 = TYPE_MAIN_VARIANT (t1);
   t2 = TYPE_MAIN_VARIANT (t2);
   if (t1 == t2)
 return true;

   if (TYPE_REF_CAN_ALIAS_ALL (t1)
   || TYPE_REF_CAN_ALIAS_ALL (t2))
 return false;

   return (TYPE_MAIN_VARIANT (TREE_TYPE (t1))
   == TYPE_MAIN_VARIANT (TREE_TYPE (t2)));
 }

 Note that the fold-const code in question is

    return ((TYPE_SIZE (TREE_TYPE (arg0)) == TYPE_SIZE (TREE_TYPE (arg1))
             || (TYPE_SIZE (TREE_TYPE (arg0))
                 && TYPE_SIZE (TREE_TYPE (arg1))
                 && operand_equal_p (TYPE_SIZE (TREE_TYPE (arg0)),
                                     TYPE_SIZE (TREE_TYPE (arg1)), flags)))
            && types_compatible_p (TREE_TYPE (arg0), TREE_TYPE (arg1))
            && (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg0, 1)))
                == TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (arg1, 1))))
            && OP_SAME (0) && OP_SAME (1));

 which you may notice uses types_compatible_p on the reference type
 which is at least suspicious as that can let through reference trees
 that will end up using different alias sets due to the stricter
 check in get_alias_set.

 So in addition to alias_ptr_types_compatible_p we may want to have

 bool
 reference_alias_ptr_types_compatible_p (const_tree t1, const_tree t2)
 {
 ...
 }

 abstracting this away for the actual reference trees (also noting
 that reference_alias_ptr_type isn't 1:1 following what get_alias_set
 does either).
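
For concreteness, one plausible body for that second helper (a sketch only;
the refactoring posted below may differ in detail):

/* Compare the alias-relevant pointer types of two reference trees by
   deriving them the same way get_alias_set does and reusing the
   pointer-type predicate sketched above.  */
bool
reference_alias_ptr_types_compatible_p (tree ref1, tree ref2)
{
  return alias_ptr_types_compatible_p (reference_alias_ptr_type (ref1),
                                       reference_alias_ptr_type (ref2));
}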

 I will give this a try.

 The following is an untested (well, the testcase from PR58513 is now
 vectorized) patch doing that refactoring.

 Richard.

 Index: gcc/tree.c
 

Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3

2013-09-24 Thread Michael Meissner
This patch adds the initial support for putting DI, DF, and SF values in the
upper registers (traditional Altivec registers) using the -mupper-regs-df and
-mupper-regs-sf switches.  Those switches will not be enabled by default until
the rest of the changes are made.  This patch passes the bootstrap test and
make check test.  I tested all of the targets I tested previously (power4-8,
G4/G5, SPE, cell, e5500/e5600, and paired floating point), and all machines
generate the same code.  Is it ok to install this patch?

[gcc]
2013-09-24  Michael Meissner  meiss...@linux.vnet.ibm.com

* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Allow
DFmode, DImode, and SFmode in the upper VSX registers based on the
-mupper-regs-{df,sf} flags.  Fix wu constraint to be ALTIVEC_REGS
if -mpower8-vector.  Combine -mvsx-timode handling with the rest
of the VSX register handling.

* config/rs6000/rs6000.md (f32_lv): Use %x0 for VSX regsters.
(f32_sv): Likewise.
(zero_extendsidi2_lfiwzx): Add support for loading into the
Altivec registers with -mpower8-vector.  Use wu/wv constraints to
only do VSX memory options on Altivec registers.
(extendsidi2_lfiwax): Likewise.
(extendsfdf2_fpr): Likewise.
(movmode_hardfloat, SF/SD modes): Likewise.
(movmode_hardfloat32, DF/DD modes): Likewise.
(movmode_hardfloat64, DF/DD modes): Likewise.
(movdi_internal64): Likewise.

[gcc/testsuite]
2013-09-24  Michael Meissner  meiss...@linux.vnet.ibm.com

* gcc.target/powerpc/p8vector-ldst.c: New test for -mupper-regs-sf
and -mupper-regs-df.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 202855)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1626,19 +1626,28 @@ rs6000_hard_regno_mode_ok (int regno, en
 
   /* VSX registers that overlap the FPR registers are larger than for non-VSX
  implementations.  Don't allow an item to be split between a FP register
- and an Altivec register.  */
-  if (VECTOR_MEM_VSX_P (mode))
+ and an Altivec register.  Allow TImode in all VSX registers if the user
+ asked for it.  */
+  if (TARGET_VSX && VSX_REGNO_P (regno)
+      && (VECTOR_MEM_VSX_P (mode)
+          || (TARGET_VSX_SCALAR_FLOAT && mode == SFmode)
+          || (TARGET_VSX_SCALAR_DOUBLE && (mode == DFmode || mode == DImode))
+          || (TARGET_VSX_TIMODE && mode == TImode)))
 {
   if (FP_REGNO_P (regno))
return FP_REGNO_P (last_regno);
 
   if (ALTIVEC_REGNO_P (regno))
-   return ALTIVEC_REGNO_P (last_regno);
-}
+   {
+ if (mode == SFmode && !TARGET_UPPER_REGS_SF)
+   return 0;
 
-  /* Allow TImode in all VSX registers if the user asked for it.  */
-  if (mode == TImode && TARGET_VSX_TIMODE && VSX_REGNO_P (regno))
-return 1;
+ if ((mode == DFmode || mode == DImode) && !TARGET_UPPER_REGS_DF)
+   return 0;
+
+ return ALTIVEC_REGNO_P (last_regno);
+   }
+}
 
   /* The GPRs can hold any mode, but values bigger than one register
  cannot go past R31.  */
@@ -2413,7 +2422,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
 
   if (TARGET_P8_VECTOR)
 {
-  rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+  rs6000_constraints[RS6000_CONSTRAINT_wu] = ALTIVEC_REGS;
   rs6000_constraints[RS6000_CONSTRAINT_wy]
= rs6000_constraints[RS6000_CONSTRAINT_ww]
= (TARGET_UPPER_REGS_SF) ? VSX_REGS : FLOAT_REGS;
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 202846)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -314,13 +314,13 @@ (define_mode_attr real_value_to_target [
 (define_mode_attr f32_lr [(SF "f")       (SD "wz")])
 (define_mode_attr f32_lm [(SF "m")       (SD "Z")])
 (define_mode_attr f32_li [(SF "lfs%U1%X1 %0,%1") (SD "lfiwzx %0,%y1")])
-(define_mode_attr f32_lv [(SF "lxsspx %0,%y1")   (SD "lxsiwzx %0,%y1")])
+(define_mode_attr f32_lv [(SF "lxsspx %x0,%y1")  (SD "lxsiwzx %x0,%y1")])
 
 ; Definitions for store from 32-bit fpr register
 (define_mode_attr f32_sr [(SF "f")        (SD "wx")])
 (define_mode_attr f32_sm [(SF "m")        (SD "Z")])
 (define_mode_attr f32_si [(SF "stfs%U0%X0 %1,%0") (SD "stfiwx %1,%y0")])
-(define_mode_attr f32_sv [(SF "stxsspx %1,%y0")   (SD "stxsiwzx %1,%y0")])
+(define_mode_attr f32_sv [(SF "stxsspx %x1,%y0")  (SD "stxsiwzx %x1,%y0")])
 
 ; Definitions for 32-bit fpr direct move
 (define_mode_attr f32_dm [(SF wn) (SD wm)])
@@ -541,7 +541,7 @@ (define_split
   )
 
 (define_insn "*zero_extendsidi2_lfiwzx"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wz,!wm")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,??wm,!wz,!wu")

Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-09-24 Thread H.J. Lu
On Tue, Sep 24, 2013 at 12:06 PM, Wei Mi w...@google.com wrote:
 This is the updated patch2.
 Changed:
 1. For cmp/test with rip-relative addressing mem operand, don't group
 insns. Bulldozer also doesn't support fusion for cmp/test with both
 displacement MEM and immediate operand, while m_CORE_ALL doesn't
 support fusion for cmp/test with MEM and immediate operand. I simplify
 choose to use the more stringent constraint here (m_CORE_ALL's
 constraint).
 2. Add Budozer back and merge TARGET_FUSE_CMP_AND_BRANCH_64 and
 TARGET_FUSE_CMP_AND_BRANCH_32.

 bootstrap and regression pass. ok for trunk?

 2013-09-24  Wei Mi  w...@google.com

 * gcc/config/i386/i386.c (rip_relative_addr_p): New Function.
 (ix86_macro_fusion_p): Ditto.
 (ix86_macro_fusion_pair_p): Ditto.
 * gcc/config/i386/i386.h: Add new tune features about macro-fusion.
 * gcc/config/i386/x86-tune.def (DEF_TUNE): Ditto.
 * gcc/doc/tm.texi: Generated.
 * gcc/doc/tm.texi.in: Ditto.
 * gcc/haifa-sched.c (try_group_insn): New Function.
 (group_insns_for_macro_fusion): Ditto.
 (sched_init): Call group_insns_for_macro_fusion.
 * gcc/sched-rgn.c (add_branch_dependences): Keep insns in
 a SCHED_GROUP at the end of BB to remain their location.
 * gcc/target.def: Add two hooks: macro_fusion_p and
 macro_fusion_pair_p.

 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 1fd3f60..4a04778 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -24856,6 +24856,167 @@ ia32_multipass_dfa_lookahead (void)
  }
  }

 +/* Extracted from ix86_print_operand_address. Check whether ADDR is a
 +   rip-relative address.  */
 +
 +static bool
 +rip_relative_addr_p (rtx addr)
 +{
 +  struct ix86_address parts;
 +  rtx base, index, disp;
 +  int ok;
 +
 +  if (GET_CODE (addr) == UNSPEC  XINT (addr, 1) == UNSPEC_VSIBADDR)
 +{
 +  ok = ix86_decompose_address (XVECEXP (addr, 0, 0), parts);
 +  parts.index = XVECEXP (addr, 0, 1);
 +}
 +  else if (GET_CODE (addr) == UNSPEC  XINT (addr, 1) == UNSPEC_LEA_ADDR)
 +ok = ix86_decompose_address (XVECEXP (addr, 0, 0), parts);
 +  else
 +ok = ix86_decompose_address (addr, parts);
 +
 +  gcc_assert (ok);
 +  base = parts.base;
 +  index = parts.index;
 +  disp = parts.disp;
 +
 +  if (TARGET_64BIT  !base  !index)
 +{
 +  rtx symbol = disp;
 +
 +  if (GET_CODE (disp) == CONST
 +  GET_CODE (XEXP (disp, 0)) == PLUS
 +  CONST_INT_P (XEXP (XEXP (disp, 0), 1)))
 +   symbol = XEXP (XEXP (disp, 0), 0);
 +
 +  if (GET_CODE (symbol) == LABEL_REF
 + || (GET_CODE (symbol) == SYMBOL_REF
 +  SYMBOL_REF_TLS_MODEL (symbol) == 0))
 +   return true;
 +}
 +  if (flag_pic  !base  !index)
 +{
 +  if (GET_CODE (disp) == CONST
 +  GET_CODE (XEXP (disp, 0)) == UNSPEC
 +  (XINT (XEXP (disp, 0), 1) == UNSPEC_PCREL
 + || XINT (XEXP (disp, 0), 1) == UNSPEC_GOTPCREL
 + || (TARGET_64BIT
 +  XINT (XEXP (disp, 0), 1) == UNSPEC_GOTNTPOFF)))
 +   return true;
 +}
 +  return false;
 +}
 +

It doesn't look right.  IP relative address is only possible
with TARGET_64BIT and

1. base == pc. Or
2. UNSPEC_PCREL, UNSPEC_GOTPCREL, and
UNSPEC_GOTNTPOFF.


-- 
H.J.


Re: [PATCH] Set expr loc safely (PR c++/58516)

2013-09-24 Thread Jason Merrill

OK.

Jason


Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-09-24 Thread Jan Hubicka
  +  gcc_assert (ok);
  +  base = parts.base;
  +  index = parts.index;
  +  disp = parts.disp;
  +
  +  if (TARGET_64BIT  !base  !index)
  +{
  +  rtx symbol = disp;
  +
  +  if (GET_CODE (disp) == CONST
  +  GET_CODE (XEXP (disp, 0)) == PLUS
  +  CONST_INT_P (XEXP (XEXP (disp, 0), 1)))
  +   symbol = XEXP (XEXP (disp, 0), 0);
  +
  +  if (GET_CODE (symbol) == LABEL_REF
  + || (GET_CODE (symbol) == SYMBOL_REF
  +  SYMBOL_REF_TLS_MODEL (symbol) == 0))
  +   return true;
  +}
  +  if (flag_pic  !base  !index)
  +{
  +  if (GET_CODE (disp) == CONST
  +  GET_CODE (XEXP (disp, 0)) == UNSPEC
  +  (XINT (XEXP (disp, 0), 1) == UNSPEC_PCREL
  + || XINT (XEXP (disp, 0), 1) == UNSPEC_GOTPCREL
  + || (TARGET_64BIT
  +  XINT (XEXP (disp, 0), 1) == UNSPEC_GOTNTPOFF)))
  +   return true;
  +}
  +  return false;
  +}
  +
 
 It doesn't look right.  IP relative address is only possible
 with TARGET_64BIT and
 
 1. base == pc. Or
 2. UUNSPEC_PCREL,  UNSPEC_GOTPCREL, and
 NSPEC_GOTNTPOFF.

Target 64bit should be tested above.  We however output RIP addresses
also for basic symbol references, i.e. when the base is a symbol address,
such as in:
int a;
int t()
{
  return a;
}

memory_address_length already contains logic to figure out if there is IP
relative addressing going on (I am not sure it is completely accurate either).
Better to break it out to a common predicate and perhaps unify with what
ix86_print_operand_address is doing.

Honza
 
 
 -- 
 H.J.

