Re: [DOC PATCH] Rewrite docs for inline asm

2014-05-05 Thread dw



+A combination that works in most places is a newline to break the
+line, plus a tab character to move to the instruction field (written
+as "\n\t").

Will everyone know what an instruction field is?  I'm not sure it's
that common of a term.

Hmm.  I brought that text across unchanged from the original text. I
know what it means, but only because I've programmed in languages that
require it.  I haven't seen an assembler that cares, but my cross
platform experience is weak.

I don't want to get into the business of describing how to format and
write assembler code.  That's (well) beyond the scope of this doc.
What would you say to:

A combination that works in most places is a newline to break the
line, plus a tab character (written as "\n\t").

Yep, that sounds good.


Changed.


+operands. To separate the classes of operands, you use colons. Basic
+@code{asm} statements contain no colons. (So, for example,
+@code{asm("int $3")} is Basic @code{asm}, and @code{asm("int $3" : )}
is
+Extended @code{asm}. @pxref{Basic Asm}.)

What does "classes of operands refer to"?  This does not seem to be
evident from the context.

Looking at it in context:

asm [volatile] ( AssemblerTemplate : [OutputOperands] [ : [InputOperands] [ :
[Clobbers] ] ] )

Looking at where the colons fall, they separate OutputOperands from
InputOperands and InputOperands from Clobbers.  Observation suggests that
makes "OutputOperands", "InputOperands" and "Clobbers" the classes in
question.

On the other hand, I already knew what I meant.  If this is unclear,
what would you suggest?

I also guessed what this meant, still I am wondering whether there
might be a better term for this than classes.


While I can think of some alternatives ("groupings?"), none of them seem 
any clearer than "classes."  asm is doing something unusual here, so 
none of the familiar terms apply.



+GCC's optimizer sometimes discards @code{asm} statements if it
determines
+that it has no need for the output variables.

How about just saying "GCC", not referring to the optimizer here?

I'm pretty sure it only happens when running the optimizer.

There is just no such thing as "the GCC optimizer".  There are many
optimizations that occur at different phases of translation, and
individual passes, just not one clearly demarked and defined optimizer.
That was behind my comment.


I have modified this (and all other occurrences) to be plural.  That 
seems to be consistent with the surrounding text when it uses this term.



+@code{DoCheck} routine. By omitting the @code{volatile} qualifier when
it
+isn't needed you allow the optimizer to produce the most efficient code
+possible.

How about "Only including ...when it is really necessary allows the
optimizer", that is, a more positive ("included" instead of "omitting")
approach?

Hmm.  How strongly do you feel about this one?  I'm ok with the current
text.

The current one text is okay, I just think making this more "active"
and defaulting to _not_ using it as oppoed to assume usage and omitting
in case is a bit better.


The problem is, I'm not convinced that's the way *users* approach it.  
In my (still limited) experience, people use volatile simply because 
they aren't quite sure what it does.  They used it "just to be safe."  
And that's true:  At worst, including it when it isn't needed produces 
inefficient code.  Omitting it when it is needed produces incorrect 
code.  So it -is- safer.


But I want to break people from that mindset.  They should understand 
what it does and use it appropriately.  That's why this 'volatile' 
section is so long.  I'm hoping that explaining and exampling and subtle 
phrasing will move people in the right direction.


Let's queue this with the other "Joseph" changes. 


I'm sorry if I'm leaving a mess for you guys to clean up.  But even with 
these bits unchanged, it's WAY better than it was.



+@headitem Modifier @tab Description @tab Operand @tab masm=att @tab
masm=intel

@code{masm...} ?

You think?  It's not really code.  How about @option?

Yes!  That was me being stuck with too much web page work, where
we only have .


Changed.


Given their involvement from the technical side, I'd suggest that
Andrew or Richard commit the patch after the last final tweaks from
your side.


So, at this point I have:

- Made the modifications described above.
- Updated the web pages on LGS.
- Produced the diff you requested of the changes since the last patch: 
http://www.LimeGreenSocks/gcc/g02.zip

- Produced a new .patch file: http://www.LimeGreenSocks/gcc/extend08.zip

Unless I hear otherwise, I will post the updated patch (with a corrected 
changelog) on this thread ~24 hours from this post.


It can then be committed as per usual.

dw


RE: [PATCH ARM] Improve ARM memset inlining

2014-05-05 Thread bin.cheng


> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of bin.cheng
> Sent: Monday, May 05, 2014 3:21 PM
> To: Richard Earnshaw
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH ARM] Improve ARM memset inlining
> 
> Hi Richard,  Thanks for reviewing.  I embedded answers to your comments,
> also updated the patch.
> 
> > -Original Message-
> > From: Richard Earnshaw
> > Sent: Friday, May 02, 2014 10:00 PM
> > To: Bin Cheng
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH ARM] Improve ARM memset inlining
> >
> > On 30/04/14 03:52, bin.cheng wrote:
> > > Hi,
> > > This patch expands small memset calls into direct memory set
> > > instructions by introducing "setmemsi" pattern.  For processors
> > > without NEON support, it expands memset using general store
> > > instruction.  For example, strd for 4-bytes aligned addresses.  For
> > > processors with NEON support, it expands memset using neon
> > > instructions like vstr and miscellaneous vst1.* instructions for
> > > both
> aligned
> > and unaligned cases.
> > >
> > > This patch depends on
> > > http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01923.html otherwise
> > > vst1.64 will be generated for 32-bit aligned memory unit.
> > >
> > > There is also one leftover work of this patch:  Since vst1.*
> > > instructions only support post-increment addressing mode, the
> > > inlined memset for unaligned neon cases should be like:
> > >   vmov.i32   q8, #...
> > >   vst1.8 {q8}, [r3]!
> > >   vst1.8 {q8}, [r3]!
> > >   vst1.8 {q8}, [r3]!
> > >   vst1.8 {q8}, [r3]
> >
> > Other than for zero, I'd expect the vmov to be vmov.i8 to move an
> arbitrary
> I just used vmov.i32 as an example.  The element size is actually
calculated by
> function neon_valid_immediate which works as expected I think.
> 
> > byte value into all lanes in a vector.  After that, if the alignment
> > is
> known to
> > be more than 8-bit, I'd expect the vst1 instructions (with the
> > exception
> of the
> > last store if the length is not a multiple of the alignment) to use
> >
> > vst1. {reg}, [addr-reg :]!
> >
> > Hence, for 16-bit aligned data, we want
> >
> > vst1.16 {q8}, [r3:16]!
> Did I miss something important?  It seems to me the explicit alignment
notes
> supported are 64/128/256.  So what do you mean by 16 bits alignment here?
> 
> >
> > > But for now, gcc can't do this and below code is generated:
> > >   vmov.i32   q8, #...
> > >   vst1.8 {q8}, [r3]
> > >   addr2,   r3,  #16
> > >   addr3,   r2,  #16
> > >   vst1.8 {q8}, [r2]
> > >   vst1.8 {q8}, [r3]
> > >   addr2,   r3,  #16
> > >   vst1.8 {q8}, [r2]
> > >
> > > I investigated this issue.  The root cause lies in rtx cost returned
> > > by ARM backend.  Anyway, I think this is another issue and should be
> > > fixed in separated patch.
> > >
> > > Bootstrap and reg-test on cortex-a15, with or without neon support.
> > > Is it OK?
> > >
> >
> > Some more comments inline.
> >
> > > Thanks,
> > > bin
> > >
> > >
> > > 2014-04-29  Bin Cheng  
> > >
> > >   PR target/55701
> > >   * config/arm/arm.md (setmem): New pattern.
> > >   * config/arm/arm-protos.h (struct tune_params): New field.
> > >   (arm_gen_setmem): New prototype.
> > >   * config/arm/arm.c (arm_slowmul_tune): Initialize new field.
> > >   (arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune): Ditto.
> > >   (arm_9e_tune, arm_v6t2_tune, arm_cortex_tune): Ditto.
> > >   (arm_cortex_a8_tune, arm_cortex_a7_tune): Ditto.
> > >   (arm_cortex_a15_tune, arm_cortex_a53_tune): Ditto.
> > >   (arm_cortex_a57_tune, arm_cortex_a5_tune): Ditto.
> > >   (arm_cortex_a9_tune, arm_cortex_a12_tune): Ditto.
> > >   (arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune): Ditto.
> > >   (arm_const_inline_cost): New function.
> > >   (arm_block_set_max_insns): New function.
> > >   (arm_block_set_straight_profit_p): New function.
> > >   (arm_block_set_vect_profit_p): New function.
> > >   (arm_block_set_unaligned_vect): New function.
> > >   (arm_block_set_aligned_vect): New function.
> > >   (arm_block_set_unaligned_straight): New function.
> > >   (arm_block_set_aligned_straight): New function.
> > >   (arm_block_set_vect, arm_gen_setmem): New functions.
> > >
> > > gcc/testsuite/ChangeLog
> > > 2014-04-29  Bin Cheng  
> > >
> > >   PR target/55701
> > >   * gcc.target/arm/memset-inline-1.c: New test.
> > >   * gcc.target/arm/memset-inline-2.c: New test.
> > >   * gcc.target/arm/memset-inline-3.c: New test.
> > >   * gcc.target/arm/memset-inline-4.c: New test.
> > >   * gcc.target/arm/memset-inline-5.c: New test.
> > >   * gcc.target/arm/memset-inline-6.c: New test.
> > >   * gcc.target/arm/memset-inline-7.c: New test.
> > >   * gcc.target/arm/memset-inline-8.c: New test.
> > >   * gcc.target/arm/memset-inline-9.c: New test.
> > >
> > >
> > > j1328-20140429.txt
> > >
> > >
> > > Index: gcc/config/arm/arm.c
> > >
> >
> 

Re: [GOOGLE] handle TYPE_PACK_EXPANSION in lipo_cmp_type

2014-05-05 Thread Xinliang David Li
ok. Also needed in google/gcc-4_9 branch.

David

On Mon, May 5, 2014 at 8:40 PM, Dehao Chen  wrote:
> This patch handles TYPE_PACK_EXPANSION in lipo_cmp_type.
>
> testing on going. OK for google-4_8?
>
> Thanks,
> Dehao
>
> Index: gcc/l-ipo.c
>
> ===
>
> --- gcc/l-ipo.c (revision 209226)
>
> +++ gcc/l-ipo.c (working copy)
>
> @@ -676,6 +676,7 @@ lipo_cmp_type (tree t1, tree t2)
>
>  case POINTER_TYPE:
>
>  case REFERENCE_TYPE:
>
>  case COMPLEX_TYPE:
>
> +case TYPE_PACK_EXPANSION:
>
>return lipo_cmp_type (TREE_TYPE (t1), TREE_TYPE (t2));
>
>  case ARRAY_TYPE:
>
>return (TYPE_DOMAIN (t1) == NULL || TYPE_DOMAIN (t2) == NULL


[GOOGLE] handle TYPE_PACK_EXPANSION in lipo_cmp_type

2014-05-05 Thread Dehao Chen
This patch handles TYPE_PACK_EXPANSION in lipo_cmp_type.

testing on going. OK for google-4_8?

Thanks,
Dehao

Index: gcc/l-ipo.c

===

--- gcc/l-ipo.c (revision 209226)

+++ gcc/l-ipo.c (working copy)

@@ -676,6 +676,7 @@ lipo_cmp_type (tree t1, tree t2)

 case POINTER_TYPE:

 case REFERENCE_TYPE:

 case COMPLEX_TYPE:

+case TYPE_PACK_EXPANSION:

   return lipo_cmp_type (TREE_TYPE (t1), TREE_TYPE (t2));

 case ARRAY_TYPE:

   return (TYPE_DOMAIN (t1) == NULL || TYPE_DOMAIN (t2) == NULL


Go patch committed: Use backend interface for array length

2014-05-05 Thread Ian Lance Taylor
This patch by Chris Manghane changes the Go frontend to use the backend
interface for the length of an array.  Bootstrapped and ran Go testsuite
on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r f4135c695c00 go/types.cc
--- a/go/types.cc	Mon May 05 20:10:20 2014 -0400
+++ b/go/types.cc	Mon May 05 21:00:16 2014 -0400
@@ -245,7 +245,7 @@
   return ptype == NULL ? NULL : ptype->points_to();
 }
 
-// Return whether this is an open array type.
+// Return whether this is a slice type.
 
 bool
 Type::is_slice_type() const
@@ -5839,14 +5839,85 @@
   gogo->add_statement(s);
 }
 
-// Get a tree for the length of a fixed array.  The length may be
+// Get the backend representation of the fields of a slice.  This is
+// not declared in types.h so that types.h doesn't have to #include
+// backend.h.
+//
+// We use int for the count and capacity fields.  This matches 6g.
+// The language more or less assumes that we can't allocate space of a
+// size which does not fit in int.
+
+static void
+get_backend_slice_fields(Gogo* gogo, Array_type* type, bool use_placeholder,
+			 std::vector* bfields)
+{
+  bfields->resize(3);
+
+  Type* pet = Type::make_pointer_type(type->element_type());
+  Btype* pbet = (use_placeholder
+		 ? pet->get_backend_placeholder(gogo)
+		 : pet->get_backend(gogo));
+  Location ploc = Linemap::predeclared_location();
+
+  Backend::Btyped_identifier* p = &(*bfields)[0];
+  p->name = "__values";
+  p->btype = pbet;
+  p->location = ploc;
+
+  Type* int_type = Type::lookup_integer_type("int");
+
+  p = &(*bfields)[1];
+  p->name = "__count";
+  p->btype = int_type->get_backend(gogo);
+  p->location = ploc;
+
+  p = &(*bfields)[2];
+  p->name = "__capacity";
+  p->btype = int_type->get_backend(gogo);
+  p->location = ploc;
+}
+
+// Get a tree for the type of this array.  A fixed array is simply
+// represented as ARRAY_TYPE with the appropriate index--i.e., it is
+// just like an array in C.  A slice is a struct with three
+// fields: a data pointer, the length, and the capacity.
+
+Btype*
+Array_type::do_get_backend(Gogo* gogo)
+{
+  if (this->length_ == NULL)
+{
+  std::vector bfields;
+  get_backend_slice_fields(gogo, this, false, &bfields);
+  return gogo->backend()->struct_type(bfields);
+}
+  else
+{
+  Btype* element = this->get_backend_element(gogo, false);
+  Bexpression* len = this->get_backend_length(gogo);
+  return gogo->backend()->array_type(element, len);
+}
+}
+
+// Return the backend representation of the element type.
+
+Btype*
+Array_type::get_backend_element(Gogo* gogo, bool use_placeholder)
+{
+  if (use_placeholder)
+return this->element_type_->get_backend_placeholder(gogo);
+  else
+return this->element_type_->get_backend(gogo);
+}
+
+// Return the backend representation of the length. The length may be
 // computed using a function call, so we must only evaluate it once.
 
-tree
-Array_type::get_length_tree(Gogo* gogo)
+Bexpression*
+Array_type::get_backend_length(Gogo* gogo)
 {
   go_assert(this->length_ != NULL);
-  if (this->length_tree_ == NULL_TREE)
+  if (this->blength_ == NULL)
 {
   Numeric_constant nc;
   mpz_t val;
@@ -5854,8 +5925,8 @@
 	{
 	  if (mpz_sgn(val) < 0)
 	{
-	  this->length_tree_ = error_mark_node;
-	  return this->length_tree_;
+	  this->blength_ = gogo->backend()->error_expression();
+	  return this->blength_;
 	}
 	  Type* t = nc.type();
 	  if (t == NULL)
@@ -5863,9 +5934,8 @@
 	  else if (t->is_abstract())
 	t = t->make_non_abstract_type();
   Btype* btype = t->get_backend(gogo);
-  Bexpression* iexpr =
-  gogo->backend()->integer_constant_expression(btype, val);
-	  this->length_tree_ = expr_to_tree(iexpr);
+  this->blength_ =
+	gogo->backend()->integer_constant_expression(btype, val);
 	  mpz_clear(val);
 	}
   else
@@ -5873,97 +5943,15 @@
 	  // Make up a translation context for the array length
 	  // expression.  FIXME: This won't work in general.
 	  Translate_context context(gogo, NULL, NULL, NULL);
-	  tree len = this->length_->get_tree(&context);
-	  if (len != error_mark_node)
-	{
-	  Type* int_type = Type::lookup_integer_type("int");
-	  tree int_type_tree = type_to_tree(int_type->get_backend(gogo));
-	  len = convert_to_integer(int_type_tree, len);
-	  len = save_expr(len);
-	}
-	  this->length_tree_ = len;
-	}
-}
-  return this->length_tree_;
-}
-
-// Get the backend representation of the fields of a slice.  This is
-// not declared in types.h so that types.h doesn't have to #include
-// backend.h.
-//
-// We use int for the count and capacity fields.  This matches 6g.
-// The language more or less assumes that we can't allocate space of a
-// size which does not fit in int.
-
-static void
-get_backend_slice_fields(Gogo* gogo, Array_type* type, bool use_placeholder,
-			 std::vector* bfields)
-{
-  bfields->resize(3);
-
-  Type* pet = Type::make_pointer_type(type->e

Re: [SH, committed] add missing function* argument in sh_optimize_sett_clrt::execute

2014-05-05 Thread Trevor Saunders
On Sat, May 03, 2014 at 10:24:31PM +0200, Oleg Endo wrote:
> Hi,
> 
> I've committed the attached patch as r210040, which reinstates the RTL
> pass.
> 
> Trevor, could you please double check that you haven't missed any other
> cases in your pass refactoring patches?

I have no clue how this happened, I remember changing it... :(

Anyway I did another round of greping and just finished looking over a
set of config-list.mk results, and didn't find anything.

thanks and sorry~

Trev

> 
> Cheers,
> Oleg
> 
> gcc/ChangeLog:
>   * config/sh/sh_optimize_sett_clrt.cc (sh_optimize_sett_clrt::execute):
>   Add missing function* argument.

> Index: gcc/config/sh/sh_optimize_sett_clrt.cc
> ===
> --- gcc/config/sh/sh_optimize_sett_clrt.cc(revision 210027)
> +++ gcc/config/sh/sh_optimize_sett_clrt.cc(working copy)
> @@ -79,8 +79,8 @@
>  public:
>sh_optimize_sett_clrt (gcc::context* ctx, const char* name);
>virtual ~sh_optimize_sett_clrt (void);
> -  virtual bool gate (function *);
> -  virtual unsigned int execute (void);
> +  virtual bool gate (function*);
> +  virtual unsigned int execute (function* fun);
>  
>  private:
>static const pass_data default_pass_data;
> @@ -161,13 +161,13 @@
>  }
>  
>  bool
> -sh_optimize_sett_clrt::gate (function *)
> +sh_optimize_sett_clrt::gate (function*)
>  {
>return optimize > 0;
>  }
>  
>  unsigned int
> -sh_optimize_sett_clrt::execute (void)
> +sh_optimize_sett_clrt::execute (function* fun)
>  {
>unsigned int ccr0 = INVALID_REGNUM;
>unsigned int ccr1 = INVALID_REGNUM;
> @@ -205,7 +205,7 @@
>// Look for insns that set the ccreg to a constant value and see if it can
>// be optimized.
>basic_block bb;
> -  FOR_EACH_BB_REVERSE_FN (bb, cfun)
> +  FOR_EACH_BB_REVERSE_FN (bb, fun)
>  for (rtx next_i, i = NEXT_INSN (BB_HEAD (bb));
>i != NULL_RTX && i != BB_END (bb); i = next_i)
>{



signature.asc
Description: Digital signature


Go patch committed: Use backend interface for slice construction

2014-05-05 Thread Ian Lance Taylor
This patch by Chris Manghane changes the Go frontend to use the backend
interface for slice construction.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian


2014-05-05  Chris Manghane  

* go-gcc.cc (Gcc_backend::implicit_variable): Rename from
gc_root_variable.  Add name and is_constant parameters.


Index: gcc/go/gofrontend/expressions.h
===
--- gcc/go/gofrontend/expressions.h	(revision 209819)
+++ gcc/go/gofrontend/expressions.h	(working copy)
@@ -1305,7 +1305,7 @@ class Unary_expression : public Expressi
   Unary_expression(Operator op, Expression* expr, Location location)
 : Expression(EXPRESSION_UNARY, location),
   op_(op), escapes_(true), create_temp_(false), is_gc_root_(false),
-  expr_(expr), issue_nil_check_(false)
+  is_slice_init_(false), expr_(expr), issue_nil_check_(false)
   { }
 
   // Return the operator.
@@ -1344,6 +1344,15 @@ class Unary_expression : public Expressi
 this->is_gc_root_ = true;
   }
 
+  // Record that this is an address expression of a slice value initializer,
+  // which is mutable if the values are not copied to the heap.
+  void
+  set_is_slice_init()
+  {
+go_assert(this->op_ == OPERATOR_AND);
+this->is_slice_init_ = true;
+  }
+
   // Apply unary opcode OP to UNC, setting NC.  Return true if this
   // could be done, false if not.  Issue errors for overflow.
   static bool
@@ -1427,6 +1436,11 @@ class Unary_expression : public Expressi
   // special struct composite literal that is mutable when addressed, meaning
   // it cannot be represented as an immutable_struct in the backend.
   bool is_gc_root_;
+  // True if this is an address expression for a slice value with an immutable
+  // initializer.  The initializer for a slice's value pointer has an array
+  // type, meaning it cannot be represented as an immutable_struct in the
+  // backend.
+  bool is_slice_init_;
   // The operand.
   Expression* expr_;
   // Whether or not to issue a nil check for this expression if its address
Index: gcc/go/gofrontend/backend.h
===
--- gcc/go/gofrontend/backend.h	(revision 210087)
+++ gcc/go/gofrontend/backend.h	(working copy)
@@ -536,11 +536,16 @@ class Backend
 		 bool address_is_taken, Location location,
 		 Bstatement** pstatement) = 0;
 
-  // Create a GC root variable. TYPE is the __go_gc_root_list struct described
-  // in Gogo::register_gc_vars.  INIT is the composite literal consisting of a
-  // pointer to the next GC root and the global variables registered.
+  // Create an implicit variable that is compiler-defined.  This is used when
+  // generating GC root variables and storing the values of a slice constructor.
+  // NAME is the name of the variable, either gc# for GC roots or C# for slice
+  // initializers.  TYPE is the type of the implicit variable with an initial
+  // value INIT.  IS_CONSTANT is true if the implicit variable should be treated
+  // like it is immutable.  For slice initializers, if the values must be copied
+  // to the heap, the variable IS_CONSTANT.
   virtual Bvariable*
-  gc_root_variable(Btype* type, Bexpression* init) = 0;
+  implicit_variable(const std::string& name, Btype* type, Bexpression* init,
+		bool is_constant) = 0;
 
   // Create a named immutable initialized data structure.  This is
   // used for type descriptors, map descriptors, and function
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc	(revision 210087)
+++ gcc/go/gofrontend/expressions.cc	(working copy)
@@ -4113,20 +4113,47 @@ Unary_expression::do_get_tree(Translate_
 	}
 	}
 
-  if (this->is_gc_root_)
-	{
-	  // Build a decl for a GC root variable.  GC roots are mutable, so they
-	  // cannot be represented as an immutable_struct in the backend.
-	  Bvariable* gc_root = gogo->backend()->gc_root_variable(btype, bexpr);
-	  bexpr = gogo->backend()->var_expression(gc_root, loc);
+  static unsigned int counter;
+  char buf[100];
+  if (this->is_gc_root_ || this->is_slice_init_)
+	{
+	  bool copy_to_heap = false;
+	  if (this->is_gc_root_)
+	{
+	  // Build a decl for a GC root variable.  GC roots are mutable, so
+	  // they cannot be represented as an immutable_struct in the
+	  // backend.
+	  static unsigned int root_counter;
+	  snprintf(buf, sizeof buf, "gc%u", root_counter);
+	  ++root_counter;
+	}
+	  else
+	{
+	  // Build a decl for a slice value initializer.  An immutable slice
+	  // value initializer may have to be copied to the heap if it
+	  // contains pointers in a non-constant context.
+	  snprintf(buf, sizeof buf, "C%u", counter);
+	  ++counter;
+
+	  Array_type* at = this->expr_->type()->array_type();
+	  go_assert(at != NULL);
+
+	  // If we are n

Go patch committed: Handle recursive pointer types in indirection

2014-05-05 Thread Ian Lance Taylor
This patch by Chris Manghane change the Go frontend to move the handling
of indirecting through recursive pointer types to the backend.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian


2014-05-05  Chris Manghane  

* go-gcc.cc (Gcc_backend::indirect_expression): Add btype
parameter.
(Gcc_backend::temporary_variable): Check for erroneous function.


Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc	(revision 209941)
+++ gcc/go/go-gcc.cc	(working copy)
@@ -229,7 +229,7 @@ class Gcc_backend : public Backend
   var_expression(Bvariable* var, Location);
 
   Bexpression*
-  indirect_expression(Bexpression* expr, bool known_valid, Location);
+  indirect_expression(Btype*, Bexpression* expr, bool known_valid, Location);
 
   Bexpression*
   named_constant_expression(Btype* btype, const std::string& name,
@@ -1147,14 +1147,26 @@ Gcc_backend::var_expression(Bvariable* v
 // An expression that indirectly references an expression.
 
 Bexpression*
-Gcc_backend::indirect_expression(Bexpression* expr, bool known_valid,
- Location location)
+Gcc_backend::indirect_expression(Btype* btype, Bexpression* expr,
+ bool known_valid, Location location)
 {
+  tree expr_tree = expr->get_tree();
+  tree type_tree = btype->get_tree();
+  if (expr_tree == error_mark_node || type_tree == error_mark_node)
+return this->error_expression();
+
+  // If the type of EXPR is a recursive pointer type, then we
+  // need to insert a cast before indirecting.
+  tree target_type_tree = TREE_TYPE(TREE_TYPE(expr_tree));
+  if (VOID_TYPE_P(target_type_tree))
+expr_tree = fold_convert_loc(location.gcc_location(),
+ build_pointer_type(type_tree), expr_tree);
+
   tree ret = build_fold_indirect_ref_loc(location.gcc_location(),
- expr->get_tree());
+ expr_tree);
   if (known_valid)
 TREE_THIS_NOTRAP(ret) = 1;
-  return tree_to_expr(ret);
+  return this->make_expression(ret);
 }
 
 // Return an expression that declares a constant named NAME with the
@@ -2406,17 +2418,18 @@ Gcc_backend::temporary_variable(Bfunctio
 Location location,
 Bstatement** pstatement)
 {
+  gcc_assert(function != NULL);
+  tree decl = function->get_tree();
   tree type_tree = btype->get_tree();
   tree init_tree = binit == NULL ? NULL_TREE : binit->get_tree();
-  if (type_tree == error_mark_node || init_tree == error_mark_node)
+  if (type_tree == error_mark_node
+  || init_tree == error_mark_node
+  || decl == error_mark_node)
 {
   *pstatement = this->error_statement();
   return this->error_variable();
 }
 
-  gcc_assert(function != NULL);
-  tree decl = function->get_tree();
-
   tree var;
   // We can only use create_tmp_var if the type is not addressable.
   if (!TREE_ADDRESSABLE(type_tree))
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc	(revision 209983)
+++ gcc/go/gofrontend/gogo.cc	(working copy)
@@ -5210,7 +5210,10 @@ Function::return_value(Gogo* gogo, Named
   Bvariable* bvar = no->get_backend_variable(gogo, named_function);
   Bexpression* val = gogo->backend()->var_expression(bvar, location);
   if (no->result_var_value()->is_in_heap())
-val = gogo->backend()->indirect_expression(val, true, location);
+	{
+	  Btype* bt = no->result_var_value()->type()->get_backend(gogo);
+	  val = gogo->backend()->indirect_expression(bt, val, true, location);
+	}
   vals[i] = val;
 }
   return gogo->backend()->return_statement(this->fndecl_, vals, location);
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc	(revision 210084)
+++ gcc/go/gofrontend/expressions.cc	(working copy)
@@ -760,16 +760,24 @@ Var_expression::do_get_tree(Translate_co
 			  context->function());
   bool is_in_heap;
   Location loc = this->location();
+  Btype* btype;
+  Gogo* gogo = context->gogo();
   if (this->variable_->is_variable())
-is_in_heap = this->variable_->var_value()->is_in_heap();
+{
+  is_in_heap = this->variable_->var_value()->is_in_heap();
+  btype = this->variable_->var_value()->type()->get_backend(gogo);
+}
   else if (this->variable_->is_result_variable())
-is_in_heap = this->variable_->result_var_value()->is_in_heap();
+{
+  is_in_heap = this->variable_->result_var_value()->is_in_heap();
+  btype = this->variable_->result_var_value()->type()->get_backend(gogo);
+}
   else
 go_unreachable();
 
   Bexpression* ret = context->backend()->var_expression(bvar, loc);
   if (is_in_heap)
-ret = context->backend()->indirect_expression(ret, true, loc);
+ret = context->backend()->indirect_expression(btype, ret, true, loc);
   return expr_to_tree(ret);
 }
 
@

[PATCH] Move shared attributes to common location

2014-05-05 Thread Iain Buclaw
Hi,

This patch (changelog entry needs to be written up) moves the minimal
set of attributes to accommodate the needs of builtins out of each
respective front-end and into a common area shared between each.

Defines lhd_common_attribute_table and lhd_format_attribute_table to
provide as defaults for the COMMON_ATTRIBUTE_TABLE and
FORMAT_ATTRIBUTE_TABLE langhooks respectively.  All attribute handlers
in LTO; Java, and most in Ada have been moved/merged together.
Attributes not required for builtins.def, but in use by Ada have been
left in their own language attribute table.  And C family languages
still override these two langhooks with their own common and format
attribute tables defined in c-common.c

langhooks.c seems the most appropriate place to put these, but they
could just as easily go in a new source file.

Regards
Iain.
diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index c81ab00..93e84e1 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -86,62 +86,20 @@ tree gnat_raise_decls[(int) LAST_REASON_CODE + 1];
 tree gnat_raise_decls_ext[(int) LAST_REASON_CODE + 1];
 
 /* Forward declarations for handlers of attributes.  */
-static tree handle_const_attribute (tree *, tree, tree, int, bool *);
-static tree handle_nothrow_attribute (tree *, tree, tree, int, bool *);
-static tree handle_pure_attribute (tree *, tree, tree, int, bool *);
-static tree handle_novops_attribute (tree *, tree, tree, int, bool *);
-static tree handle_nonnull_attribute (tree *, tree, tree, int, bool *);
-static tree handle_sentinel_attribute (tree *, tree, tree, int, bool *);
-static tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
-static tree handle_leaf_attribute (tree *, tree, tree, int, bool *);
-static tree handle_malloc_attribute (tree *, tree, tree, int, bool *);
-static tree handle_type_generic_attribute (tree *, tree, tree, int, bool *);
 static tree handle_vector_size_attribute (tree *, tree, tree, int, bool *);
 static tree handle_vector_type_attribute (tree *, tree, tree, int, bool *);
 
-/* Fake handler for attributes we don't properly support, typically because
-   they'd require dragging a lot of the common-c front-end circuitry.  */
-static tree fake_attribute_handler  (tree *, tree, tree, int, bool *);
-
-/* Table of machine-independent internal attributes for Ada.  We support
-   this minimal set of attributes to accommodate the needs of builtins.  */
+/* Table of machine-independent internal attributes for Ada.  */
 const struct attribute_spec gnat_internal_attribute_table[] =
 {
   /* { name, min_len, max_len, decl_req, type_req, fn_type_req, handler,
affects_type_identity } */
-  { "const",0, 0,  true,  false, false, handle_const_attribute,
-false },
-  { "nothrow",  0, 0,  true,  false, false, handle_nothrow_attribute,
-false },
-  { "pure", 0, 0,  true,  false, false, handle_pure_attribute,
-false },
-  { "no vops",  0, 0,  true,  false, false, handle_novops_attribute,
-false },
-  { "nonnull",  0, -1, false, true,  true,  handle_nonnull_attribute,
-false },
-  { "sentinel", 0, 1,  false, true,  true,  handle_sentinel_attribute,
-false },
-  { "noreturn", 0, 0,  true,  false, false, handle_noreturn_attribute,
-false },
-  { "leaf", 0, 0,  true,  false, false, handle_leaf_attribute,
-false },
-  { "malloc",   0, 0,  true,  false, false, handle_malloc_attribute,
-false },
-  { "type generic", 0, 0,  false, true, true, handle_type_generic_attribute,
-false },
-
   { "vector_size",  1, 1,  false, true, false,  handle_vector_size_attribute,
 false },
   { "vector_type",  0, 0,  false, true, false,  handle_vector_type_attribute,
 false },
   { "may_alias",0, 0, false, true, false, NULL, false },
 
-  /* ??? format and format_arg are heavy and not supported, which actually
- prevents support for stdio builtins, which we however declare as part
- of the common builtins.def contents.  */
-  { "format", 3, 3,  false, true,  true,  fake_attribute_handler, false },
-  { "format_arg", 1, 1,  false, true,  true,  fake_attribute_handler, false },
-
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -6115,303 +6073,6 @@ install_builtin_attributes (void)
 #undef DEF_ATTR_TREE_LIST
 }
 
-/* Handle a "const" attribute; arguments as in
-   struct attribute_spec.handler.  */
-
-static tree
-handle_const_attribute (tree *node, tree ARG_UNUSED (name),
-			tree ARG_UNUSED (args), int ARG_UNUSED (flags),
-			bool *no_add_attrs)
-{
-  if (TREE_CODE (*node) == FUNCTION_DECL)
-TREE_READONLY (*node) = 1;
-  else
-*no_add_attrs = true;
-
-  return NULL_TREE;
-}
-
-/* Handle a "nothrow" attribute; arguments as in
-   struct attribute_spec.handler.  */
-
-static tree
-handle_nothrow_attribute (tree *node, tree ARG_UNUSED (name),
-			  tree ARG_UNUSED (args), int ARG_UNUSED (flags),
-			  bool *no_add_attrs)
-{
-  if 

Go patch committed: Use backend interface for type info expressions

2014-05-05 Thread Ian Lance Taylor
This patch by Chris Manghane changes the Go frontend to use the backend
interface for type info expressions.  Bootstrapped and ran Go testsuite
on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 1a6ec84147d3 go/expressions.cc
--- a/go/expressions.cc	Mon May 05 15:05:17 2014 -0400
+++ b/go/expressions.cc	Mon May 05 17:08:57 2014 -0400
@@ -14095,9 +14095,13 @@
 default:
   go_unreachable();
 }
-  tree val_type_tree = type_to_tree(this->type()->get_backend(gogo));
-  go_assert(val_type_tree != error_mark_node);
-  return build_int_cstu(val_type_tree, val);
+  mpz_t cst;
+  mpz_init_set_ui(cst, val);
+  Btype* int_btype = this->type()->get_backend(gogo);
+  Bexpression* ret =
+gogo->backend()->integer_constant_expression(int_btype, cst);
+  mpz_clear(cst);
+  return expr_to_tree(ret);
 }
 
 // Dump ast representation for a type info expression.


Re: [C++ Patch] PR 60999

2014-05-05 Thread Paolo Carlini

Hi,

On 05/05/2014 10:48 PM, Jason Merrill wrote:

On 05/05/2014 12:25 PM, Paolo Carlini wrote:

Good, but is it Ok to use uses_template_parms for that? A few days ago I
struggled to find something simpler ready to use, to no avail (well,
this is probably well known to you, but there are surprisingly few
places in pt.c where we explain either in comments or in obvious code
that we are handling full (vs partial) specializations).


+/* True if the given class type is a template or a partial 
specialization.  */

+#define CLASSTYPE_IS_TEMPLATE_OR_PARTIAL_SPECIALIZATION(NODE) \
+  (CLASSTYPE_TEMPLATE_INFO (NODE)\
+   && uses_template_parms (NODE))


I think this would be true for a non-template member class of a 
template class.  I think we want something like


CLASSTYPE_TEMPLATE_INFO (NODE)
&& PRIMARY_TEMPLATE_P (CLASSTYPE_TI_TEMPLATE (NODE))
&& uses_template_parms (INNERMOST_TEMPLATE_ARGS (CLASSTYPE_TI_ARGS 
(NODE))
Thanks. The latter works almost fine testsuite-wise, besides the 
"famous" nsdmi-template7.C, for which we definitely want the predicate 
to return true for y, because the unnamed non-type template parameter 
could have a name used for its initialization:


struct A {
template
struct B {
struct C {
int x = 0;
  double y = x;
} c;
};
};
int main() {
A::B<>();
}

what happens instead is that PRIMARY_TEMPLATE_P (CLASSTYPE_TI_TEMPLATE 
(NODE)) is false. Would the below make sense then?


Thanks,
Paolo.

///
Index: cp/cp-tree.h
===
--- cp/cp-tree.h(revision 210083)
+++ cp/cp-tree.h(working copy)
@@ -3160,6 +3160,11 @@ more_aggr_init_expr_args_p (const aggr_init_expr_a
&& !CLASSTYPE_USE_TEMPLATE (NODE) \
&& PRIMARY_TEMPLATE_P (CLASSTYPE_TI_TEMPLATE (NODE)))
 
+/* True if the given class type is a template or a partial specialization.  */
+#define CLASSTYPE_IS_TEMPLATE_OR_PARTIAL_SPECIALIZATION(NODE)  \
+  (CLASSTYPE_TEMPLATE_INFO (NODE)   \
+   && uses_template_parms (INNERMOST_TEMPLATE_ARGS (CLASSTYPE_TI_ARGS (NODE
+
 /* The name used by the user to name the typename type.  Typically,
this is an IDENTIFIER_NODE, and the same as the DECL_NAME on the
corresponding TYPE_DECL.  However, this may also be a
Index: cp/pt.c
===
--- cp/pt.c (revision 210083)
+++ cp/pt.c (working copy)
@@ -462,9 +462,12 @@ maybe_begin_member_template_processing (tree decl)
   bool nsdmi = TREE_CODE (decl) == FIELD_DECL;
 
   if (nsdmi)
-decl = (CLASSTYPE_TEMPLATE_INFO (DECL_CONTEXT (decl))
-   ? CLASSTYPE_TI_TEMPLATE (DECL_CONTEXT (decl))
-   : NULL_TREE);
+{
+  tree ctx = DECL_CONTEXT (decl);
+  decl = (CLASSTYPE_IS_TEMPLATE_OR_PARTIAL_SPECIALIZATION (ctx)
+ ? CLASSTYPE_TI_TEMPLATE (ctx)
+ : NULL_TREE);
+}
 
   if (inline_needs_template_parms (decl, nsdmi))
 {
Index: testsuite/g++.dg/cpp0x/nsdmi-template10.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-template10.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-template10.C   (working copy)
@@ -0,0 +1,14 @@
+// PR c++/60999
+// { dg-do compile { target c++11 } }
+
+struct B
+{
+  template
+  struct A;
+
+  template
+  struct A<1, M>
+  {
+int X = M;
+  };
+};
Index: testsuite/g++.dg/cpp0x/nsdmi-template9.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-template9.C(revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-template9.C(working copy)
@@ -0,0 +1,16 @@
+// PR c++/60999
+// { dg-do compile { target c++11 } }
+
+template 
+struct foo
+{
+};
+  
+template<>
+struct foo
+{
+  static constexpr int code = 42;
+  unsigned int bar = static_cast(code);
+};
+  
+foo a;


Re: [wide-int] Handle zero-precision INTEGER_CSTs again

2014-05-05 Thread Richard Sandiford
Richard Biener  writes:
> On Mon, May 5, 2014 at 4:51 PM, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
>>> On Mon, May 5, 2014 at 12:54 PM, Richard Sandiford
>>>  wrote:
 Richard Biener  writes:
> On Fri, May 2, 2014 at 9:19 PM, Richard Sandiford
>  wrote:
>> I'd hoped the days of zero-precision INTEGER_CSTs were behind us after
>> Richard's patch to remove min amd max values from zero-width bitfields,
>> but a boostrap-ubsan showed otherwise.  One source is in:
>>
>>   null_pointer_node = build_int_cst (build_pointer_type
>> (void_type_node), 0);
>>
>> if no_target, since the precision of the type comes from ptr_mode,
>> which remains the default VOIDmode.  Maybe that's a bug, but setting
>> it to an arbitrary nonzero value would also be dangerous.
>
> Can you explain what 'no_target' should be?  ptr_mode should never be
> VOIDmode.

 Sorry, meant "no_backend" rather than "no_target".  See do_compile.
>>>
>>> Ok.  So we do
>>>
>>>   /* This must be run always, because it is needed to compute the FP
>>>  predefined macros, such as __LDBL_MAX__, for targets using non
>>>  default FP formats.  */
>>>   init_adjust_machine_modes ();
>>>
>>>   /* Set up the back-end if requested.  */
>>>   if (!no_backend)
>>> backend_init ();
>>>
>>> where I think that init_adjust_machine_modes should initialize the
>>> {byte,word,ptr}_mode globals.  Move from init_emit_once.

init_adjust_machine_modes is an auto-generated function so I ended up
using a new function instead.  Tested on x86_64-linux-gnu.  OK to install?

> So I don't think we want this patch.  Instead stor-layout should
> ICE on zero-precision integer/pointer types.

 What should happen for void_zero_node?
>>>
>>> Not sure what that beast is supposed to be or why it should be
>>> of INTEGER_CST kind (it's not even initialized in any meaningful
>>> way).
>>>
>>> That said, the wide-int code shouldn't be slowed down by
>>> precision == 0 checks.  We should never ever reach wide-int
>>> with such a constant.
>>
>> void_zero_node is used for ubsan too, and survives into gimple.
>> I did hit this in real tests, it wasn't just theoretical.
>
> Ugh - for what does it use that ... :/
>
> Please remember how to trigger those issues and I'll happily have
> a look after the merge.

At the time it was just a normal bootstrap-ubsan, but that was
before the zero-precision patch.  Probably the best way of
checking for zero-precision tree constants is to put an assert
for nonzero precisions in:

inline unsigned int
wi::int_traits ::get_precision (const_tree tcst)
{
  return TYPE_PRECISION (TREE_TYPE (tcst));
}

and:

template 
inline wi::extended_tree ::extended_tree (const_tree t)
  : m_t (t)
{
  gcc_checking_assert (TYPE_PRECISION (TREE_TYPE (t)) <= N);
}

and then bootstrap-ubsan.  But like Marek says, the ubsan code uses
void_zero_node by name.

Thanks,
Richard


gcc/
* emit-rtl.c (init_derived_machine_modes): New functionm, split
out from...
(init_emit_once): ...here.
* rtl.h (init_derived_machine_modes): Declare.
* toplev.c (do_compile): Call it even if no_backend.

Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c  2014-05-03 20:18:49.157107743 +0100
+++ gcc/emit-rtl.c  2014-05-05 17:44:53.579038259 +0100
@@ -5620,6 +5620,30 @@ init_emit_regs (void)
 }
 }
 
+/* Initialize global machine_mode variables.  */
+
+void
+init_derived_machine_modes (void)
+{
+  byte_mode = VOIDmode;
+  word_mode = VOIDmode;
+
+  for (enum machine_mode mode = GET_CLASS_NARROWEST_MODE (MODE_INT);
+   mode != VOIDmode;
+   mode = GET_MODE_WIDER_MODE (mode))
+{
+  if (GET_MODE_BITSIZE (mode) == BITS_PER_UNIT
+ && byte_mode == VOIDmode)
+   byte_mode = mode;
+
+  if (GET_MODE_BITSIZE (mode) == BITS_PER_WORD
+ && word_mode == VOIDmode)
+   word_mode = mode;
+}
+
+  ptr_mode = mode_for_size (POINTER_SIZE, GET_MODE_CLASS (Pmode), 0);
+}
+
 /* Create some permanent unique rtl objects shared between all functions.  */
 
 void
@@ -5643,36 +5667,6 @@ init_emit_once (void)
   reg_attrs_htab = htab_create_ggc (37, reg_attrs_htab_hash,
reg_attrs_htab_eq, NULL);
 
-  /* Compute the word and byte modes.  */
-
-  byte_mode = VOIDmode;
-  word_mode = VOIDmode;
-  double_mode = VOIDmode;
-
-  for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT);
-   mode != VOIDmode;
-   mode = GET_MODE_WIDER_MODE (mode))
-{
-  if (GET_MODE_BITSIZE (mode) == BITS_PER_UNIT
- && byte_mode == VOIDmode)
-   byte_mode = mode;
-
-  if (GET_MODE_BITSIZE (mode) == BITS_PER_WORD
- && word_mode == VOIDmode)
-   word_mode = mode;
-}
-
-  for (mode = GET_CLASS_NARROWEST_MODE (MODE_FLOAT);
-   mode != VOIDmode;
-   mode = GET_MODE_WIDER_MODE (mode))
-{
-  i

Re: [PATCH, rs6000] Use new __builtin_pack_longdouble within libgcc's ibm-ldouble.c

2014-05-05 Thread Peter Bergner
On Mon, 2014-05-05 at 13:33 -0500, Peter Bergner wrote:
> Currently, the IBM long double routines in libgcc use a union to construct
> a long double from two double values.  This causes horrific code generation
> that copies the two double from the FP registers over to GPRs and back
> again, giving us two loads and two stores, which leads to two load-hit-store
> hazzards.  The following patch makes use of the new __builtin_pack_longdouble
> builtin to construct the long double giving us at worse, one or two fmrs.
> 
> Is this ok for mainline once my bootstrap and regtesting are complete?

Ok, bootstrapping and regtesting have completed with no regressions.

Peter




Re: [C++ Patch] PR 60999

2014-05-05 Thread Jason Merrill

On 05/05/2014 12:25 PM, Paolo Carlini wrote:

Good, but is it Ok to use uses_template_parms for that? A few days ago I
struggled to find something simpler ready to use, to no avail (well,
this is probably well known to you, but there are surprisingly few
places in pt.c where we explain either in comments or in obvious code
that we are handling full (vs partial) specializations).



+/* True if the given class type is a template or a partial specialization.  */
+#define CLASSTYPE_IS_TEMPLATE_OR_PARTIAL_SPECIALIZATION(NODE)  \
+  (CLASSTYPE_TEMPLATE_INFO (NODE)\
+   && uses_template_parms (NODE))


I think this would be true for a non-template member class of a template 
class.  I think we want something like


CLASSTYPE_TEMPLATE_INFO (NODE)
&& PRIMARY_TEMPLATE_P (CLASSTYPE_TI_TEMPLATE (NODE))
&& uses_template_parms (INNERMOST_TEMPLATE_ARGS (CLASSTYPE_TI_ARGS (NODE))


Re: [Patch, testsuite, mips] Fix two failing MIPS tests.

2014-05-05 Thread Richard Sandiford
Steve Ellcey  writes:
> On Sat, 2014-05-03 at 07:52 +0100, Richard Sandiford wrote:
>
>> > Old:
>> >
>> >li  $5,305397760# 0x1234
>> >addiu   $4,$5,1
>> >addiu   $5,$5,-1
>> >
>> > New:
>> >
>> >li  $5,305332224# 0x1233
>> >ori $5,$5,0x
>> >addiu   $4,$5,2
>> 
>> This isn't as good though -- $4 now depends on two previous
>> instructions.  Do you know which specific change was responsible?
>> 
>> Thanks,
>> Richard
>
> A quick follow up.  The code in question is:
>
> void f () { g (0x12340001, 0x1233); }
>
> And if I swap the arguments I get something more like the old code.
> But if I run an older GCC on that new swapped code then I get something
> like the new code.  So while the old code may have been better for this
> example, the new code is better if I swap the arguments around.

OK, in that case it sounds like the optimisation wasn't working as well
as it was supposed to.  But I think we should fix that rather than change
the expected output.  AFAICT the new output means the const-anchor
optimisation isn't kicking in.

Thanks,
Richard





Re: Ping: RFA: speeding up dg-extract-results.sh

2014-05-05 Thread Richard Sandiford
Jakub Jelinek  writes:
> On Sat, May 03, 2014 at 08:34:07AM +0100, Richard Sandiford wrote:
>> Ping for this old patch.  I didn't ping it before because I can imagine
>> it wasn't a good idea at that stage in 4.8.  But I've been using it locally
>> since and it really does make a big difference when testing lots of
>> multilibs.
>
> Have you tested that it creates bitwise identical output given the same
> input from the old script on a couple of regtests (e.g. your 5 hour mips
> case with many multilibs, and say common x86_64 one)?

Not bitwise, because there were some deliberate differences (see the
list in the quoted email).  But I did a diff and the only differences
were the expected ones.

Thanks,
Richard


Ping [patch, toplevel] configure nios2-elf libraries to build with -mno-gpopt

2014-05-05 Thread Sandra Loosemore

Ping!

http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01618.html

-Sandra


Re: [C PATCH] Don't reject valid code with _Alignas (PR c/61053)

2014-05-05 Thread Marek Polacek
On Mon, May 05, 2014 at 10:27:03PM +0200, Marek Polacek wrote:
> In this PR the issue is that we reject (valid) code such as
> _Alignas (long long) long long foo;
> with -m32, because we trip this condition:
> 
>alignas_align = 1U << declspecs->align_log;
>if (alignas_align < TYPE_ALIGN_UNIT (type))
>  {
>if (name)
>  error_at (loc, "%<_Alignas%> specifiers cannot reduce "
>"alignment of %qE", name);
> 
> and error later on, since alignas_align is 4 (correct, see PR52023 for
> why), but TYPE_ALIGN_UNIT of long long is 8.  I think TYPE_ALIGN_UNIT
> is wrong here as that won't give us minimal alignment required.
> In c_sizeof_or_alignof_type we already have the code to compute such
> minimal alignment so I just moved the code to a separate function
> and used that instead of TYPE_ALIGN_UNIT.
> 
> Note that the test is run only on i?86 and x86_64, because we can't (?)
> easily determine which target requires what alignment.
> 
> Regtested/bootstrapped on x86_64-unknown-linux-gnu and
> powerpc64-unknown-linux-gnu, ok for trunk?

I should add that I checked what clang does on my testcase, and with this
patch both compilers accept and reject the same sort of testcases,
both with -m32 and -m64.

Marek


[C PATCH] Don't reject valid code with _Alignas (PR c/61053)

2014-05-05 Thread Marek Polacek
In this PR the issue is that we reject (valid) code such as
_Alignas (long long) long long foo;
with -m32, because we trip this condition:

   alignas_align = 1U << declspecs->align_log;
   if (alignas_align < TYPE_ALIGN_UNIT (type))
 {
   if (name)
 error_at (loc, "%<_Alignas%> specifiers cannot reduce "
   "alignment of %qE", name);

and error later on, since alignas_align is 4 (correct, see PR52023 for
why), but TYPE_ALIGN_UNIT of long long is 8.  I think TYPE_ALIGN_UNIT
is wrong here as that won't give us minimal alignment required.
In c_sizeof_or_alignof_type we already have the code to compute such
minimal alignment so I just moved the code to a separate function
and used that instead of TYPE_ALIGN_UNIT.

Note that the test is run only on i?86 and x86_64, because we can't (?)
easily determine which target requires what alignment.

Regtested/bootstrapped on x86_64-unknown-linux-gnu and
powerpc64-unknown-linux-gnu, ok for trunk?

2014-05-05  Marek Polacek  

PR c/61053
c-family/
* c-common.c (min_align_of_type): New function factored out from...
(c_sizeof_or_alignof_type): ...here.
* c-common.h (min_align_of_type): Declare.
c/
* c-decl.c (grokdeclarator): Use min_align_of_type instead of
TYPE_ALIGN_UNIT.
testsuite/
* gcc.dg/pr61053.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 0ad955d..6d4440e 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -4931,6 +4931,26 @@ c_common_get_alias_set (tree t)
   return -1;
 }
 
+/* Return the least alignment required for type TYPE.  */
+
+unsigned int
+min_align_of_type (tree type)
+{
+  unsigned int align = TYPE_ALIGN (type);
+  align = MIN (align, BIGGEST_ALIGNMENT);
+#ifdef BIGGEST_FIELD_ALIGNMENT
+  align = MIN (align, BIGGEST_FIELD_ALIGNMENT);
+#endif
+  unsigned int field_align = align;
+#ifdef ADJUST_FIELD_ALIGN
+  tree field = build_decl (UNKNOWN_LOCATION, FIELD_DECL, NULL_TREE,
+  type);
+  field_align = ADJUST_FIELD_ALIGN (field, field_align);
+#endif
+  align = MIN (align, field_align);
+  return align / BITS_PER_UNIT;
+}
+
 /* Compute the value of 'sizeof (TYPE)' or '__alignof__ (TYPE)', where
the IS_SIZEOF parameter indicates which operator is being applied.
The COMPLAIN flag controls whether we should diagnose possibly
@@ -5009,21 +5029,7 @@ c_sizeof_or_alignof_type (location_t loc,
size_int (TYPE_PRECISION (char_type_node)
  / BITS_PER_UNIT));
   else if (min_alignof)
-   {
- unsigned int align = TYPE_ALIGN (type);
- align = MIN (align, BIGGEST_ALIGNMENT);
-#ifdef BIGGEST_FIELD_ALIGNMENT
- align = MIN (align, BIGGEST_FIELD_ALIGNMENT);
-#endif
- unsigned int field_align = align;
-#ifdef ADJUST_FIELD_ALIGN
- tree field = build_decl (UNKNOWN_LOCATION, FIELD_DECL, NULL_TREE,
-  type);
- field_align = ADJUST_FIELD_ALIGN (field, field_align);
-#endif
- align = MIN (align, field_align);
- value = size_int (align / BITS_PER_UNIT);
-   }
+   value = size_int (min_align_of_type (type));
   else
value = size_int (TYPE_ALIGN_UNIT (type));
 }
diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h
index 57b7dce..d34d2bb 100644
--- gcc/c-family/c-common.h
+++ gcc/c-family/c-common.h
@@ -758,6 +758,7 @@ extern tree c_wrap_maybe_const (tree, bool);
 extern tree c_save_expr (tree);
 extern tree c_common_truthvalue_conversion (location_t, tree);
 extern void c_apply_type_quals_to_decl (int, tree);
+extern unsigned int min_align_of_type (tree);
 extern tree c_sizeof_or_alignof_type (location_t, tree, bool, bool, int);
 extern tree c_alignof_expr (location_t, tree);
 /* Print an error message for invalid operands to arith operation CODE.
diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index 6e7c589..9f7419a 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -5931,7 +5931,7 @@ grokdeclarator (const struct c_declarator *declarator,
   else if (declspecs->align_log != -1)
{
  alignas_align = 1U << declspecs->align_log;
- if (alignas_align < TYPE_ALIGN_UNIT (type))
+ if (alignas_align < min_align_of_type (type))
{
  if (name)
error_at (loc, "%<_Alignas%> specifiers cannot reduce "
diff --git gcc/testsuite/gcc.dg/pr61053.c gcc/testsuite/gcc.dg/pr61053.c
index e69de29..e3db534 100644
--- gcc/testsuite/gcc.dg/pr61053.c
+++ gcc/testsuite/gcc.dg/pr61053.c
@@ -0,0 +1,75 @@
+/* PR c/61053 */
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+_Alignas (char) char cc;
+_Alignas (short int) char cs;
+_Alignas (int) char ci;
+_Alignas (long int) char cl;
+_Alignas (long long int) char cll;
+_Alignas (float) char cf;
+_Alignas (double) char cd;
+_Alignas (long double) char cld;
+
+_Alignas (char) shor

Re: [PATCH] Remove "keep_aligning" from get_inner_reference

2014-05-05 Thread Mike Stump
On May 5, 2014, at 1:00 AM, Richard Biener  wrote:
> And can we remove TYPE_ALIGN_OK as followup?  (ISTR it's used
> by obj-c/c++ as well, but I can't find such use)

I didn’t find any current hint…  the only landmine I found was:

./gcc/config/pa/pa.c:1878:/* Using TYPE_ALIGN_OK is rather 
conservative as
./gcc/config/pa/pa.c:1880:align = (TYPE_ALIGN_OK (type) ? 
TYPE_ALIGN (type)

Re: [DOC PATCH] Rewrite docs for inline asm

2014-05-05 Thread Gerald Pfeifer
[ Oh no!  I wrote this on a plane two days ago and now it got stuck 
  in my outbox.  Let's get this patch in before any further delays
  happen. ]

On Sun, 27 Apr 2014, dw wrote:
> The goal of this patch was to rewrite section 6.41.  That's no simple 
> task, since 6.41 was ~10 very full pages.  Unfortunately the current 
> text is so random, I couldn't think of any way to update it piecemeal.

Understood.  Let's see that we can get an update committed soon.
We can always improve on it further later on, which then will be
a lot easier to do, review, and get pushed.

> A number of your concerns seem to stem from reviewing this text only by 
> looking at the patch.  I think your questions might answered by looking 
> at the actual generated output.

Right, I just did not want to delay this any further.

> Since reg vars wasn't really the focus of this patch, I mostly left the 
> existing text.  It may look new in the patch because it got moved into 
> this new menu.  I don't know why this was originally written this way 
> (sounds like laziness).  "Register" and "Variables" make more sense to 
> me.  That said, unless you feel strongly, I'd suggest doing this in my 
> upcoming RegVars patch.

Okay.

> As for the larger issue of "ISO C" vs "ISO C90", that's a tougher 
> question. Now that I've undone this change, all the inline asm stuff 
> says "ISO C," which is what the original text used.  "ISO C90" and "ISO 
> C99" are sprinkled throughout the other sections in extend.texi, and I'm 
> reluctant to make wholesale changes to the rest of the file as part of 
> this patch.

Let's leave the asm stuff at "ISO C" unless anyone has concerns
about this.

 +By definition, a Basic @code{asm} statement is one with no operands.
 +@code{asm} statements that contain one or more colons (used to
 delineate
 +operands) are considered to be Extended (for example, @code{asm("int
 $3")}
 +is Basic, and @code{asm("int $3" : )} is Extended). @xref{Extended
 Asm}.
>> At this point the reader does not yet know the concept of those
>> colons, does she?  So that can be seen as a bit confusing.
> I understand what you mean, however I don't see a practical solution 
> here. Describing the use of colons in Basic asm, only to follow it by 
> saying "now that you understand all that, none of it applies here" seems 
> like it would be even more confusing.
> 
> Saying that colons are part of Extended asm, and providing a link to 
> Extended asm which does describe them seems to me the best compromise.

That'd work for me.

> What's more, while a sequential reading of the docs might experience the 
> problem you describe, does anyone actually read the docs from beginning 
> to end?  "At this point" is an undefined term.

I've seen both uses of our documentation.  This is not a biggie, I
am fine keeping it as is (or going with the compromise you suggest
above).

 +This is a literal string that specifies the assembler code. The string
 can
>> Just "assembler code", without "the", would be my take, but let's see
>> what native speakers so.
> "The" sounds right to me.  Alternate might be "thine?"  Maybe not.

Let's keep it as is, then.

 +You may place multiple assembler instructions together in a single
 @code{asm}
 +string, separated by the characters normally used in assembly code for
 the
 +system.
>> Might "by the same characters normally...the target system" be more 
>> clear?
> Hmm.  Not to me.  I'm not sure what "same" would mean in this context.  
> The same as what?

Actually, reading this again after a couple of days, I'm not sure
what triggered my comment.  Now this works just fine.

 +A combination that works in most places is a newline to break the
 +line, plus a tab character to move to the instruction field (written
 +as "\n\t").
>> Will everyone know what an instruction field is?  I'm not sure it's
>> that common of a term.
> Hmm.  I brought that text across unchanged from the original text. I 
> know what it means, but only because I've programmed in languages that 
> require it.  I haven't seen an assembler that cares, but my cross 
> platform experience is weak.
> 
> I don't want to get into the business of describing how to format and 
> write assembler code.  That's (well) beyond the scope of this doc.  
> What would you say to:
> 
> A combination that works in most places is a newline to break the
> line, plus a tab character (written as "\n\t").

Yep, that sounds good.

> If it seems to you I am trying to make Basic asm sound bad, I am. There 
> are so many things that "look right" when written in Basic asm that just 
> won't work, I'm trying my best to discourage its use.

Fair enough. :-)

 +Since GCC does not parse the AssemblerInstructions, it has no
 visibility of
>> ^
>> Something wrong here.
> AssemblerInstructions is the first parameter to the Basic asm.

Right.  I think I was expectin

Re: [PATCH] Error for invalid args to -fsanitize (PR driver/61065)

2014-05-05 Thread Jeff Law

On 05/05/14 08:13, Marek Polacek wrote:

It seems it's better to error out when someone passes bogus option to
-fsanitize rather than only give a warning.  Error gives non-zero exit
status so with this patch the behavior is friendlier to configure
scripts.

Bootstrapped on x86_64-linux, ok for trunk?

2014-05-05  Marek Polacek  

PR driver/61065
* opts.c (common_handle_option): Call error_at instead of warning_at.

OK.
Jeff



Re: [PATCH 0/3] Compile-time gimple checking, without typedefs

2014-05-05 Thread Trevor Saunders
On Mon, May 05, 2014 at 01:44:06PM -0600, Jeff Law wrote:
> On 05/05/14 11:37, Richard Biener wrote:
> >
> >Well, I hope that Andrew doesn't do without a namespace (and I still
> >don't believe in what he tries to achieve without laying proper ground-work
> >throughout the compiler).  With a namespace gimple we can use
> >gimple::stmt.
> namespaces, while nice, aren't going to solve all these issues.  While I
> think we can get a good separation between gimple and the rest of the world,
> I suspect namespaces aren't going to help much with the statement vs
> expression vs type issues.
> 
> Ultimately I suspect we're not going to have too many places where we can
> stick a "using namespace gimple-whatever", but time will tell.

yeah, gcc::gimple::stmt::switch seems a bit excessive ;)

> >Agreed on that, btw.  But switch_ can't be the answer either.  Maybe
> >swidch (similar do klass) or swjdch.  Or swtch.  I like swtch the best
> >(similar to stmt).
> As David pointed out there's several others that map to keywords.  I'd
> rather set a standard here across the project so that we don't have folks
> using gto for goto, others using goto_, _goto, whatever.  While swtch works
> well, I don't think the other examples work nearly as well.  Thus some kind
> of prefix/suffix seems better to me (though I'm sure my eyes will bleed as a
> result of looking at those objects).

personally I've become desensitized to cammel case, but how about
switch_stmt, goto_stmt etc? imho its not that bad to be a little
explicit these are statement types, but yet its pretty short.

Trev

> 
> jeff


signature.asc
Description: Digital signature


Re: [PATCH 0/3] Compile-time gimple checking, without typedefs

2014-05-05 Thread Jakub Jelinek
On Mon, May 05, 2014 at 01:44:06PM -0600, Jeff Law wrote:
> On 05/05/14 11:37, Richard Biener wrote:
> >
> >Well, I hope that Andrew doesn't do without a namespace (and I still
> >don't believe in what he tries to achieve without laying proper ground-work
> >throughout the compiler).  With a namespace gimple we can use
> >gimple::stmt.
> namespaces, while nice, aren't going to solve all these issues.
> While I think we can get a good separation between gimple and the
> rest of the world, I suspect namespaces aren't going to help much
> with the statement vs expression vs type issues.
> 
> Ultimately I suspect we're not going to have too many places where
> we can stick a "using namespace gimple-whatever", but time will
> tell.
> 
> >Agreed on that, btw.  But switch_ can't be the answer either.  Maybe
> >swidch (similar do klass) or swjdch.  Or swtch.  I like swtch the best
> >(similar to stmt).
> As David pointed out there's several others that map to keywords.
> I'd rather set a standard here across the project so that we don't
> have folks using gto for goto, others using goto_, _goto, whatever.
> While swtch works well, I don't think the other examples work nearly
> as well.  Thus some kind of prefix/suffix seems better to me (though
> I'm sure my eyes will bleed as a result of looking at those
> objects).

But the prefix can be as short as e.g. "g" (for gimple), so gtry, ggoto,
gassign, gcall.

Jakub


Fix PR ipa/60965 (placement new wrt ipa-devirt)

2014-05-05 Thread Jan Hubicka
Hi,
this patch fixes unfortunate thinko in get_class_context that invalidates
transitions from non-POD type to a polymorphic type that may happen by virtue
of placement new and apparently breaks openJDK and Qt. I really tried to keep
placement new in mind when implementing ipa-devirt, so hope there are no
similar negaitve surprises.

The patch disables about 200 out of 3 devirtualizations happening in
Firefox.

I have commited it to 4.9 and will commit to mainline once testing finishes.

Honza

PR ipa/60965
* g++.dg/ipa/devirt-31.C: New testcase.
* g++.dg/ipa/devirt-11.C: Adjust testcase.
* ipa-devirt.c (get_class_context): Allow POD to change to non-POD.


Index: testsuite/g++.dg/ipa/devirt-31.C
===
--- testsuite/g++.dg/ipa/devirt-31.C(revision 0)
+++ testsuite/g++.dg/ipa/devirt-31.C(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-options "-O2 -std=c++11 -fdump-ipa-inline"  } */
+#include 
+
+class EmbeddedObject {
+public:
+  virtual int val() { return 2; }
+};
+
+class Container {
+  alignas(EmbeddedObject) char buffer[sizeof(EmbeddedObject)];
+public:
+  EmbeddedObject *obj() { return (EmbeddedObject*)buffer; }
+  Container() { new (buffer) EmbeddedObject(); }
+};
+
+Container o;
+
+int main()
+{
+  __builtin_printf("%d\n", o.obj()->val());
+}
+/* { dg-final { scan-ipa-dump-not "__builtin_unreachable"  "inline"  } } */
+/* { dg-final { cleanup-ipa-dump "inline" } } */
Index: testsuite/g++.dg/ipa/devirt-11.C
===
--- testsuite/g++.dg/ipa/devirt-11.C(revision 210049)
+++ testsuite/g++.dg/ipa/devirt-11.C(working copy)
@@ -45,5 +45,5 @@ bar ()
 /* While inlining function called once we should devirtualize a new call to fn2
and two to fn3. While doing so the new symbol for fn2 needs to be
introduced.  */
-/* { dg-final { scan-ipa-dump-times "Discovered a virtual call to a known 
target" 3 "inline"  } } */
+/* { dg-final { scan-ipa-dump-times "Discovered a virtual call to a known 
target" 1 "inline"  } } */
 /* { dg-final { cleanup-ipa-dump "inline" } } */
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 210049)
+++ ipa-devirt.c(working copy)
@@ -987,6 +987,17 @@ give_up:
   context->outer_type = expected_type;
   context->offset = 0;
   context->maybe_derived_type = true;
+  context->maybe_in_construction = true;
+  /* POD can be changed to an instance of a polymorphic type by
+ placement new.  Here we play safe and assume that any
+ non-polymorphic type is POD.  */
+  if ((TREE_CODE (type) != RECORD_TYPE
+   || !TYPE_BINFO (type)
+   || !polymorphic_type_binfo_p (TYPE_BINFO (type)))
+  && (TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST
+ || (offset + tree_to_uhwi (TYPE_SIZE (expected_type)) <=
+ tree_to_uhwi (TYPE_SIZE (type)
+return true;
   return false;
 }
 


Re: [PATCH 0/3] Compile-time gimple checking, without typedefs

2014-05-05 Thread Jeff Law

On 05/05/14 11:37, Richard Biener wrote:


Well, I hope that Andrew doesn't do without a namespace (and I still
don't believe in what he tries to achieve without laying proper ground-work
throughout the compiler).  With a namespace gimple we can use
gimple::stmt.
namespaces, while nice, aren't going to solve all these issues.  While I 
think we can get a good separation between gimple and the rest of the 
world, I suspect namespaces aren't going to help much with the statement 
vs expression vs type issues.


Ultimately I suspect we're not going to have too many places where we 
can stick a "using namespace gimple-whatever", but time will tell.



Agreed on that, btw.  But switch_ can't be the answer either.  Maybe
swidch (similar do klass) or swjdch.  Or swtch.  I like swtch the best
(similar to stmt).
As David pointed out there's several others that map to keywords.  I'd 
rather set a standard here across the project so that we don't have 
folks using gto for goto, others using goto_, _goto, whatever.  While 
swtch works well, I don't think the other examples work nearly as well. 
 Thus some kind of prefix/suffix seems better to me (though I'm sure my 
eyes will bleed as a result of looking at those objects).


jeff


Re: [Patch, testsuite, mips] Fix two failing MIPS tests.

2014-05-05 Thread Steve Ellcey
On Sat, 2014-05-03 at 07:52 +0100, Richard Sandiford wrote:

> > Old:
> >
> > li  $5,305397760# 0x1234
> > addiu   $4,$5,1
> > addiu   $5,$5,-1
> >
> > New:
> >
> > li  $5,305332224# 0x1233
> > ori $5,$5,0x
> > addiu   $4,$5,2
> 
> This isn't as good though -- $4 now depends on two previous
> instructions.  Do you know which specific change was responsible?
> 
> Thanks,
> Richard

A quick follow up.  The code in question is:

void f () { g (0x12340001, 0x1233); }

And if I swap the arguments I get something more like the old code.
But if I run an older GCC on that new swapped code then I get something
like the new code.  So while the old code may have been better for this
example, the new code is better if I swap the arguments around.

Steve Ellcey
sell...@mips.com



Re: [PATCH 0/3] Compile-time gimple checking, without typedefs

2014-05-05 Thread Jeff Law

On 05/05/14 12:37, David Malcolm wrote:


FWIW I assumed the ick from Jeff was in relation to CamelCase.

Yes.  That is a personal preference, of course.


Note that it's not just GIMPLE_SWITCH that has this problem when using
this naming convention; I believe the list of codes that would have
class names clashing with C++ keywords would be:

GIMPLE_GOTO
GIMPLE_SWITCH
GIMPLE_RETURN
GIMPLE_CATCH
GIMPLE_TRY

IMHO, I think trying to eliminate vowels or similar would be confusing
(e.g. how does one respell "try"? [1]) and that standardizing on a
trailing underscore for all of these seems to me to be cleaner and
simpler.
Seems reasonable to me (trailing underscore).  Or we could keep the 
gimple_ prefix.


Jeff


Re: Ping: RFA: speeding up dg-extract-results.sh

2014-05-05 Thread Jakub Jelinek
On Sat, May 03, 2014 at 08:34:07AM +0100, Richard Sandiford wrote:
> Ping for this old patch.  I didn't ping it before because I can imagine
> it wasn't a good idea at that stage in 4.8.  But I've been using it locally
> since and it really does make a big difference when testing lots of
> multilibs.

Have you tested that it creates bitwise identical output given the same
input from the old script on a couple of regtests (e.g. your 5 hour mips
case with many multilibs, and say common x86_64 one)?

Jakub


Go patch committed: Use backend interface for sink expressions

2014-05-05 Thread Ian Lance Taylor
This patch by Chris Manghane changes the Go frontend to use the backend
interface for sink expressions.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r e59376bb96c4 go/expressions.cc
--- a/go/expressions.cc	Mon May 05 14:49:01 2014 -0400
+++ b/go/expressions.cc	Mon May 05 14:53:13 2014 -0400
@@ -931,7 +931,7 @@
  public:
   Sink_expression(Location location)
 : Expression(EXPRESSION_SINK, location),
-  type_(NULL), var_(NULL_TREE)
+  type_(NULL), bvar_(NULL)
   { }
 
  protected:
@@ -959,7 +959,7 @@
   // The type of this sink variable.
   Type* type_;
   // The temporary variable we generate.
-  tree var_;
+  Bvariable* bvar_;
 };
 
 // Return the type of a sink expression.
@@ -987,13 +987,24 @@
 tree
 Sink_expression::do_get_tree(Translate_context* context)
 {
-  if (this->var_ == NULL_TREE)
+  Location loc = this->location();
+  Gogo* gogo = context->gogo();
+  if (this->bvar_ == NULL)
 {
   go_assert(this->type_ != NULL && !this->type_->is_sink_type());
+  Named_object* fn = context->function();
+  go_assert(fn != NULL);
+  Bfunction* fn_ctx = fn->func_value()->get_or_make_decl(gogo, fn);
   Btype* bt = this->type_->get_backend(context->gogo());
-  this->var_ = create_tmp_var(type_to_tree(bt), "blank");
-}
-  return this->var_;
+  Bstatement* decl;
+  this->bvar_ =
+	gogo->backend()->temporary_variable(fn_ctx, context->bblock(), bt, NULL,
+	false, loc, &decl);
+  Bexpression* var_ref = gogo->backend()->var_expression(this->bvar_, loc);
+  var_ref = gogo->backend()->compound_expression(decl, var_ref, loc);
+  return expr_to_tree(var_ref);
+}
+  return expr_to_tree(gogo->backend()->var_expression(this->bvar_, loc));
 }
 
 // Ast dump for sink expression.


Re: [patch] change specific int128 -> generic intN

2014-05-05 Thread DJ Delorie

> After tomorrow, this should be:
> 
> MAX_BITSIZE_MODE_ANY_INT, not 128.

Heh.

Ok, after tomorrow, assume my patch has that instead :-)


Re: [patch] change specific int128 -> generic intN

2014-05-05 Thread Mike Stump
On May 4, 2014, at 2:09 PM, DJ Delorie  wrote:
> Here's an independent change that removes the decimal table and
> replaces it with generated hex values.  I included the relevent output
> of gcc -E -dM also.

> +  /* Allows bit sizes up to 128 bits.  */
> +#define PBOH_SZ (128/4+4)

After tomorrow, this should be:

MAX_BITSIZE_MODE_ANY_INT, not 128.

> +  char value[PBOH_SZ];

:-)


Go patch committed: Use backend interface for set-and-use temps

2014-05-05 Thread Ian Lance Taylor
This patch by Chris Manghane changes the Go frontend to use the backend
interface for set-and-use temporary variables.  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 9ec9e72945e4 go/expressions.cc
--- a/go/expressions.cc	Mon May 05 13:54:25 2014 -0400
+++ b/go/expressions.cc	Mon May 05 14:40:12 2014 -0400
@@ -890,16 +890,16 @@
 tree
 Set_and_use_temporary_expression::do_get_tree(Translate_context* context)
 {
+  Location loc = this->location();
+  Gogo* gogo = context->gogo();
   Bvariable* bvar = this->statement_->get_backend_variable(context);
-  tree var_tree = var_to_tree(bvar);
-  tree expr_tree = this->expr_->get_tree(context);
-  if (var_tree == error_mark_node || expr_tree == error_mark_node)
-return error_mark_node;
-  Location loc = this->location();
-  return build2_loc(loc.gcc_location(), COMPOUND_EXPR, TREE_TYPE(var_tree),
-		build2_loc(loc.gcc_location(), MODIFY_EXPR, void_type_node,
-			   var_tree, expr_tree),
-		var_tree);
+  Bexpression* var_ref = gogo->backend()->var_expression(bvar, loc);
+
+  Bexpression* bexpr = tree_to_expr(this->expr_->get_tree(context));
+  Bstatement* set = gogo->backend()->assignment_statement(var_ref, bexpr, loc);
+  var_ref = gogo->backend()->var_expression(bvar, loc);
+  Bexpression* ret = gogo->backend()->compound_expression(set, var_ref, loc);
+  return expr_to_tree(ret);
 }
 
 // Dump.


Re: [PATCH 0/3] Compile-time gimple checking, without typedefs

2014-05-05 Thread David Malcolm
On Mon, 2014-05-05 at 19:37 +0200, Richard Biener wrote:
> On Mon, May 5, 2014 at 6:01 PM, Jeff Law  wrote:
> > On 05/05/14 02:22, Richard Biener wrote:
> >>>
> >>>
> >>> Also, AIUI, Andrew is looking at introducing concepts of gimple types and
> >>> gimple expressions, so "gimple" may no longer imply a *statement*.
> >>>
> >>> Alternatively, we could make the base class be just "gimple" (which would
> >>> be more consistent with the names of the accessor functions).
> >>
> >>
> >> I strongly prefer to name it 'gimple', not 'gimple_stmt'.  Because it's
> >> less
> >> to type, and because it will make all other types shorter as well.  And
> >> because
> >> 'gimple' _is_ a stmt right now, so gimple_stmt is redundant.  Same applies
> >> to gimple_stmt_with_FOO, just make it gimple_with_FOO.
> >>
> >> I understand the namespace issue, but we don't have a namespace right now.
> >> Also gimple::gimple works just fine, no?
> >
> > But this approach is going to be inconsistent with Andrew's work, right?
> > ISTM we'd end up with something like...
> >
> > So statements would be "gimple"
> > types would be "gimple_type"
> > expressions would be "gimple_expr"
> 
> Well, I hope that Andrew doesn't do without a namespace (and I still
> don't believe in what he tries to achieve without laying proper ground-work
> throughout the compiler).  With a namespace gimple we can use
> gimple::stmt.
> 
> > It's a bit of bikeshedding, but I'd prefer "gimple_stmt".  If you really
> > feel strongly about it, I'll go along without objection, but it seems wrong
> > to me.
> 
> Less typing.  But yes, it's bikeshedding.  Still gimple_stmt is
> redundant information in almost all context.  'stmt' was opposed to
> elsewhere so we settle on 'gimple' (which is already existing).
> IMHO not changing things is the best bikeshedding argument.
> 
> >>
> >>> There's also the "bargain basement" namespaces approach, where we don't
> >>> have an implicit "gimple" namespace, but just *pretend* we do, and rename
> >>> the base type to "stmt", with e.g. "gimple_statement_phi" becoming just
> >>> "phi". ["gimple_switch" would need to become "switch_", say, to avoid the
> >>> reserved word].
> >>
> >>
> >> Ick (for the 'switch' case ... CamelCase anyone? :)).
> >
> > :-)  Please, no
> 
> Agreed on that, btw.  But switch_ can't be the answer either.  Maybe
> swidch (similar do klass) or swjdch.  Or swtch.  I like swtch the best
> (similar to stmt).

FWIW I assumed the ick from Jeff was in relation to CamelCase.

Note that it's not just GIMPLE_SWITCH that has this problem when using
this naming convention; I believe the list of codes that would have
class names clashing with C++ keywords would be:

   GIMPLE_GOTO
   GIMPLE_SWITCH
   GIMPLE_RETURN
   GIMPLE_CATCH
   GIMPLE_TRY

IMHO, I think trying to eliminate vowels or similar would be confusing
(e.g. how does one respell "try"? [1]) and that standardizing on a
trailing underscore for all of these seems to me to be cleaner and
simpler.

But, yeah, we're in bikeshedding territory here :)

Dave

[1]  e.g. in patch 32 of the series which converts a dozen or so
"gimple" vars in tree-eh.c to be explicitly one of these.  "trie" would
make me think of a data structure, rather than a "try" statement, and,
errr "trigh" would make me think of Battlestar Galactica's XO :)



[PATCH, rs6000] Use new __builtin_pack_longdouble within libgcc's ibm-ldouble.c

2014-05-05 Thread Peter Bergner
Currently, the IBM long double routines in libgcc use a union to construct
a long double from two double values.  This causes horrific code generation
that copies the two double from the FP registers over to GPRs and back
again, giving us two loads and two stores, which leads to two load-hit-store
hazzards.  The following patch makes use of the new __builtin_pack_longdouble
builtin to construct the long double giving us at worse, one or two fmrs.

Is this ok for mainline once my bootstrap and regtesting are complete?

Peter


libgcc/
* config/rs6000/ibm-ldouble.c (typedef union longDblUnion): Delete.
(pack_ldouble): New function.
(__gcc_qadd): Use it.
(__gcc_qmul): Likewise.
(__gcc_qdiv): Likewise.
(__gcc_qneg): Likewise.
(__gcc_stoq): Likewise.
(__gcc_dtoq): Likewise.


Index: libgcc/config/rs6000/ibm-ldouble.c
===
--- libgcc/config/rs6000/ibm-ldouble.c  (revision 210073)
+++ libgcc/config/rs6000/ibm-ldouble.c  (working copy)
@@ -87,18 +87,29 @@ __asm__ (".symver __gcc_qadd,_xlqadd@GCC
 ".symver .__gcc_qdiv,._xlqdiv@GCC_3.4");
 #endif
 
-typedef union
-{
-  long double ldval;
-  double dval[2];
-} longDblUnion;
+/* Combine two 'double' values into one 'long double' and return the result.  
*/
+static inline long double
+pack_ldouble (double dh, double dl)
+{
+#if defined (_SOFT_FLOAT) || defined (__NO_FPRS__)
+  union
+  {
+long double ldval;
+double dval[2];
+  } x;
+  x.dval[0] = dh;
+  x.dval[1] = dl;
+  return x.ldval;
+#else
+  return __builtin_pack_longdouble (dh, dl);
+#endif
+}
 
 /* Add two 'long double' values and return the result. */
 long double
 __gcc_qadd (double a, double aa, double c, double cc)
 {
-  longDblUnion x;
-  double z, q, zz, xh;
+  double xh, xl, z, q, zz;
 
   z = a + c;
 
@@ -109,12 +120,12 @@ __gcc_qadd (double a, double aa, double
   z = cc + aa + c + a;
   if (nonfinite (z))
return z;
-  x.dval[0] = z;  /* Will always be DBL_MAX.  */
+  xh = z;  /* Will always be DBL_MAX.  */
   zz = aa + cc;
   if (fabs(a) > fabs(c))
-   x.dval[1] = a - z + c + zz;
+   xl = a - z + c + zz;
   else
-   x.dval[1] = c - z + a + zz;
+   xl = c - z + a + zz;
 }
   else
 {
@@ -129,10 +140,9 @@ __gcc_qadd (double a, double aa, double
   if (nonfinite (xh))
return xh;
 
-  x.dval[0] = xh;
-  x.dval[1] = z - xh + zz;
+  xl = z - xh + zz;
 }
-  return x.ldval;
+  return pack_ldouble (xh, xl);
 }
 
 long double
@@ -148,8 +158,7 @@ static double fmsub (double, double, dou
 long double
 __gcc_qmul (double a, double b, double c, double d)
 {
-  longDblUnion z;
-  double t, tau, u, v, w;
+  double xh, xl, t, tau, u, v, w;
   
   t = a * c;   /* Highest order double term.  */
 
@@ -173,16 +182,15 @@ __gcc_qmul (double a, double b, double c
   /* Construct long double result.  */
   if (nonfinite (u))
 return u;
-  z.dval[0] = u;
-  z.dval[1] = (t - u) + tau;
-  return z.ldval;
+  xh = u;
+  xl = (t - u) + tau;
+  return pack_ldouble (xh, xl);
 }
 
 long double
 __gcc_qdiv (double a, double b, double c, double d)
 {
-  longDblUnion z;
-  double s, sigma, t, tau, u, v, w;
+  double xh, xl, s, sigma, t, tau, u, v, w;
   
   t = a / c;/* highest order double term */
   
@@ -219,9 +227,9 @@ __gcc_qdiv (double a, double b, double c
   /* Construct long double result.  */
   if (nonfinite (u))
 return u;
-  z.dval[0] = u;
-  z.dval[1] = (t - u) + tau;
-  return z.ldval;
+  xh = u;
+  xl = (t - u) + tau;
+  return pack_ldouble (xh, xl);
 }
 
 #if defined (_SOFT_DOUBLE) && defined (__LONG_DOUBLE_128__)
@@ -248,11 +256,7 @@ extern int __gedf2 (double, double);
 long double
 __gcc_qneg (double a, double aa)
 {
-  longDblUnion x;
-
-  x.dval[0] = -a;
-  x.dval[1] = -aa;
-  return x.ldval;
+  return pack_ldouble (-a, -aa);
 }
 
 /* Compare two 'long double' values for equality.  */
@@ -292,24 +296,14 @@ strong_alias (__gcc_qge, __gcc_qgt);
 long double
 __gcc_stoq (float a)
 {
-  longDblUnion x;
-
-  x.dval[0] = (double) a;
-  x.dval[1] = 0.0;
-
-  return x.ldval;
+  return pack_ldouble ((double) a, 0.0);
 }
 
 /* Convert double to long double.  */
 long double
 __gcc_dtoq (double a)
 {
-  longDblUnion x;
-
-  x.dval[0] = a;
-  x.dval[1] = 0.0;
-
-  return x.ldval;
+  return pack_ldouble (a, 0.0);
 }
 
 /* Convert long double to single.  */




Re: RFA: speeding up dg-extract-results.sh

2014-05-05 Thread Jeff Law

On 05/05/14 12:14, Mike Stump wrote:

On May 5, 2014, at 9:08 AM, Jeff Law  wrote:

It would mean a bit of pain for initial system bootstraps by the
distributors,


I don’t expect any pain there.  System distributors usually have a
distribution for the last release that includes binaries for all,
including python, tcl and gcc.  They merely use these to then build
the new software.  I don’t know of any that don’t start with binaries
someplace.  :-)
I'm referring to new system bringups -- ie, there is no prior release to 
build from.  There's a couple of those in progress right now.  Package 
interdependencies are a real problem, but as I stated, this in case 
python is limited to the testsuite, which can be disabled during the 
initial building phases.


jeff


Re: RFA: speeding up dg-extract-results.sh

2014-05-05 Thread Mike Stump
On May 5, 2014, at 9:08 AM, Jeff Law  wrote:
> It would mean a bit of pain for initial system bootstraps by the distributors,

I don’t expect any pain there.  System distributors usually have a distribution 
for the last release that includes binaries for all, including python, tcl and 
gcc.  They merely use these to then build the new software.  I don’t know of 
any that don’t start with binaries someplace.  :-)


Re: [wide-int] Handle zero-precision INTEGER_CSTs again

2014-05-05 Thread Marek Polacek
On Mon, May 05, 2014 at 07:32:31PM +0200, Richard Biener wrote:
> > void_zero_node is used for ubsan too, and survives into gimple.
> > I did hit this in real tests, it wasn't just theoretical.
> 
> Ugh - for what does it use that ... :/

It's used like this:
t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node);
so if condition T isn't true, nothing should happen.  I remember that
initially I used something similar to void_zero_node, but that
resulted in ICEs.

Marek


Go patch committed: Use backend interface for struct field offsets

2014-05-05 Thread Ian Lance Taylor
This patch by Chris Manghane changes the Go frontend to use the backend
interface for struct field offsets.  Bootstrapped and ran Go testsuite
on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r d87cfc273ffc go/expressions.cc
--- a/go/expressions.cc	Mon May 05 13:21:55 2014 -0400
+++ b/go/expressions.cc	Mon May 05 13:39:45 2014 -0400
@@ -14815,28 +14815,27 @@
 tree
 Struct_field_offset_expression::do_get_tree(Translate_context* context)
 {
-  tree type_tree = type_to_tree(this->type_->get_backend(context->gogo()));
-  if (type_tree == error_mark_node)
-return error_mark_node;
-
-  tree val_type_tree = type_to_tree(this->type()->get_backend(context->gogo()));
-  go_assert(val_type_tree != error_mark_node);
-
   const Struct_field_list* fields = this->type_->fields();
-  tree struct_field_tree = TYPE_FIELDS(type_tree);
   Struct_field_list::const_iterator p;
+  unsigned i = 0;
   for (p = fields->begin();
p != fields->end();
-   ++p, struct_field_tree = DECL_CHAIN(struct_field_tree))
-{
-  go_assert(struct_field_tree != NULL_TREE);
-  if (&*p == this->field_)
-	break;
-}
+   ++p, ++i)
+if (&*p == this->field_)
+  break;
   go_assert(&*p == this->field_);
 
-  return fold_convert_loc(BUILTINS_LOCATION, val_type_tree,
-			  byte_position(struct_field_tree));
+  Gogo* gogo = context->gogo();
+  Btype* btype = this->type_->get_backend(gogo);
+
+  size_t offset = gogo->backend()->type_field_offset(btype, i);
+  mpz_t offsetval;
+  mpz_init_set_ui(offsetval, offset);
+  Type* uptr_type = Type::lookup_integer_type("uintptr");
+  Expression* ret = Expression::make_integer(&offsetval, uptr_type,
+	 Linemap::predeclared_location());
+  mpz_clear(offsetval);
+  return ret->get_tree(context);
 }
 
 // Dump ast representation for a struct field offset expression.


[PATCH][4/n] Handle TODO_verify_rtl_sharing from TODO_verify_il

2014-05-05 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-05-05  Richard Biener  

* passes.c (execute_function_todo): Don't reset TODO_verify_ssa
from last_verified if update_ssa ran.  Move TODO_verify_rtl_sharing
under the TODO_verify_il umbrella.

Index: gcc/passes.c
===
--- gcc/passes.c(revision 210072)
+++ gcc/passes.c(working copy)
@@ -1743,7 +1743,6 @@ execute_function_todo (function *fn, voi
 {
   unsigned update_flags = flags & TODO_update_ssa_any;
   update_ssa (update_flags);
-  cfun->last_verified &= ~TODO_verify_ssa;
 }
 
   if (flag_tree_pta && (flags & TODO_rebuild_alias))
@@ -1791,9 +1790,9 @@ execute_function_todo (function *fn, voi
  if (current_loops
  && loops_state_satisfies_p (LOOP_CLOSED_SSA))
verify_loop_closed_ssa (false);
+ if (cfun->curr_properties & PROP_rtl)
+   verify_rtl_sharing ();
}
-  if (flags & TODO_verify_rtl_sharing)
-   verify_rtl_sharing ();
 
   /* Make sure verifiers don't change dominator state.  */
   gcc_assert (dom_info_state (fn, CDI_DOMINATORS) == pre_verify_state);


Re: Fix various x86 tests for --with-arch=bdver3 --with-cpu=bdver3

2014-05-05 Thread Uros Bizjak
On Mon, May 5, 2014 at 6:44 PM, Joseph S. Myers  wrote:

>> These are due to TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL tuning flag.
>> Currently, this flag applies to all vector sizes (128, 256 and 512
>> bits), but I guess it is effective only for 128 bit sizes. Can you
>> please review usage of this flag in i386/sse.md?
>
> Indeed, the optimization as described in
>  is purely
> about reducing code size, and is irrelevant in VEX-prefixed cases.
> Thus, this patch adds  == 16 conditionals in relevant cases
> (some cases already had such conditionals or otherwise wouldn't be
> used for larger vectors).
>
> Tested with no regressions for x86_64-linux-gnu (--with-arch=bdver3
> --with-cpu=bdver3, where it fixes most of the remaining scan-assembler
> test failures).  OK to commit?
>
> 2014-05-05  Joseph Myers  
>
> * config/i386/sse.md (*mov_internal)
> (*_loadu)
> (*_loaddqu)
> (_andnot3, 3, *andnot3)
> (*3, *andnot3)
> (3): Only consider
> TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL for modes of size 16.

This is OK for mainline with a slight change below.

> Index: gcc/config/i386/sse.md
> ===
> --- gcc/config/i386/sse.md  (revision 209980)
> +++ gcc/config/i386/sse.md  (working copy)
> @@ -758,7 +758,8 @@
>[(set_attr "type" "sselog1,ssemov,ssemov")
> (set_attr "prefix" "maybe_vex")
> (set (attr "mode")
> -   (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
> +   (cond [(and (match_test " == 16")
> +   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
>  (const_string "")
>(and (match_test " == 16")
> (and (eq_attr "alternative" "2")

Please merge the changed first and the second conditional to:

(cond [(and (match_test " == 16")
   (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
 (and (eq_attr "alternative" "2")
  (match_test "TARGET_SSE_TYPELESS_STORES"
 (const_string "")

Thanks,
Uros.


Re: [PATCH 0/3] Compile-time gimple checking, without typedefs

2014-05-05 Thread Richard Biener
On Mon, May 5, 2014 at 6:01 PM, Jeff Law  wrote:
> On 05/05/14 02:22, Richard Biener wrote:
>>>
>>>
>>> Also, AIUI, Andrew is looking at introducing concepts of gimple types and
>>> gimple expressions, so "gimple" may no longer imply a *statement*.
>>>
>>> Alternatively, we could make the base class be just "gimple" (which would
>>> be more consistent with the names of the accessor functions).
>>
>>
>> I strongly prefer to name it 'gimple', not 'gimple_stmt'.  Because it's
>> less
>> to type, and because it will make all other types shorter as well.  And
>> because
>> 'gimple' _is_ a stmt right now, so gimple_stmt is redundant.  Same applies
>> to gimple_stmt_with_FOO, just make it gimple_with_FOO.
>>
>> I understand the namespace issue, but we don't have a namespace right now.
>> Also gimple::gimple works just fine, no?
>
> But this approach is going to be inconsistent with Andrew's work, right?
> ISTM we'd end up with something like...
>
> So statements would be "gimple"
> types would be "gimple_type"
> expressions would be "gimple_expr"

Well, I hope that Andrew doesn't do without a namespace (and I still
don't believe in what he tries to achieve without laying proper ground-work
throughout the compiler).  With a namespace gimple we can use
gimple::stmt.

> It's a bit of bikeshedding, but I'd prefer "gimple_stmt".  If you really
> feel strongly about it, I'll go along without objection, but it seems wrong
> to me.

Less typing.  But yes, it's bikeshedding.  Still gimple_stmt is
redundant information in almost all context.  'stmt' was opposed to
elsewhere so we settle on 'gimple' (which is already existing).
IMHO not changing things is the best bikeshedding argument.

>>
>>> There's also the "bargain basement" namespaces approach, where we don't
>>> have an implicit "gimple" namespace, but just *pretend* we do, and rename
>>> the base type to "stmt", with e.g. "gimple_statement_phi" becoming just
>>> "phi". ["gimple_switch" would need to become "switch_", say, to avoid the
>>> reserved word].
>>
>>
>> Ick (for the 'switch' case ... CamelCase anyone? :)).
>
> :-)  Please, no

Agreed on that, btw.  But switch_ can't be the answer either.  Maybe
swidch (similar do klass) or swjdch.  Or swtch.  I like swtch the best
(similar to stmt).

Richard.

> Jeff


Re: [wide-int] Handle zero-precision INTEGER_CSTs again

2014-05-05 Thread Richard Biener
On Mon, May 5, 2014 at 4:51 PM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Mon, May 5, 2014 at 12:54 PM, Richard Sandiford
>>  wrote:
>>> Richard Biener  writes:
 On Fri, May 2, 2014 at 9:19 PM, Richard Sandiford
  wrote:
> I'd hoped the days of zero-precision INTEGER_CSTs were behind us after
> Richard's patch to remove min amd max values from zero-width bitfields,
> but a boostrap-ubsan showed otherwise.  One source is in:
>
>   null_pointer_node = build_int_cst (build_pointer_type
> (void_type_node), 0);
>
> if no_target, since the precision of the type comes from ptr_mode,
> which remains the default VOIDmode.  Maybe that's a bug, but setting
> it to an arbitrary nonzero value would also be dangerous.

 Can you explain what 'no_target' should be?  ptr_mode should never be
 VOIDmode.
>>>
>>> Sorry, meant "no_backend" rather than "no_target".  See do_compile.
>>
>> Ok.  So we do
>>
>>   /* This must be run always, because it is needed to compute the FP
>>  predefined macros, such as __LDBL_MAX__, for targets using non
>>  default FP formats.  */
>>   init_adjust_machine_modes ();
>>
>>   /* Set up the back-end if requested.  */
>>   if (!no_backend)
>> backend_init ();
>>
>> where I think that init_adjust_machine_modes should initialize the
>> {byte,word,ptr}_mode globals.  Move from init_emit_once.
>>
>> But why do we even run into uses of null_pointer_node for no_backend?
>> For sure for -fsyntax-only we can't really omit backend init as IMHO
>> no parser piece is decoupled enough to never call target hooks or so.
>>
>> Thus, omit less of the initialization for no_backend.
>
> OK, but I don't think the wide-int merge should be held up for clean-ups
> like that.  At the moment the tree & gimple leve do use 0 precision, and
> although I think we both agree it'd be better if they didn't, let's do
> one thing at a time.

Fair enough.

 So I don't think we want this patch.  Instead stor-layout should
 ICE on zero-precision integer/pointer types.
>>>
>>> What should happen for void_zero_node?
>>
>> Not sure what that beast is supposed to be or why it should be
>> of INTEGER_CST kind (it's not even initialized in any meaningful
>> way).
>>
>> That said, the wide-int code shouldn't be slowed down by
>> precision == 0 checks.  We should never ever reach wide-int
>> with such a constant.
>
> void_zero_node is used for ubsan too, and survives into gimple.
> I did hit this in real tests, it wasn't just theoretical.

Ugh - for what does it use that ... :/

Please remember how to trigger those issues and I'll happily have
a look after the merge.

Thanks,
Richard.

> Thanks,
> Richard


Go patch committed: Use backend interface for bound method expressions

2014-05-05 Thread Ian Lance Taylor
This patch from Chris Manghane changes the Go frontend to use the
backend interface for bound method expressions.  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r e828f77c32b7 go/expressions.cc
--- a/go/expressions.cc	Thu May 01 15:18:08 2014 -0400
+++ b/go/expressions.cc	Mon May 05 12:54:14 2014 -0400
@@ -6736,12 +6736,9 @@
   Expression* ret = Expression::make_struct_composite_literal(st, vals, loc);
   ret = Expression::make_heap_expression(ret, loc);
 
-  tree ret_tree = ret->get_tree(context);
+  // See whether the expression or any embedded pointers are nil.
 
   Expression* nil_check = NULL;
-
-  // See whether the expression or any embedded pointers are nil.
-
   Expression* expr = this->expr_;
   if (this->method_->field_indexes() != NULL)
 {
@@ -6764,26 +6761,19 @@
 	nil_check = Expression::make_binary(OPERATOR_OROR, nil_check, n, loc);
 }
 
+  Bexpression* bme = tree_to_expr(ret->get_tree(context));
   if (nil_check != NULL)
 {
-  tree nil_check_tree = nil_check->get_tree(context);
-  Expression* crash_expr =
-	context->gogo()->runtime_error(RUNTIME_ERROR_NIL_DEREFERENCE, loc);
-  tree crash = crash_expr->get_tree(context);
-  if (ret_tree == error_mark_node
-	  || nil_check_tree == error_mark_node
-	  || crash == error_mark_node)
-	return error_mark_node;
-
-  ret_tree = fold_build2_loc(loc.gcc_location(), COMPOUND_EXPR,
- TREE_TYPE(ret_tree),
- build3_loc(loc.gcc_location(), COND_EXPR,
-	void_type_node, nil_check_tree,
-	crash, NULL_TREE),
- ret_tree);
-}
-
-  return ret_tree;
+  Gogo* gogo = context->gogo();
+  Expression* crash =
+	gogo->runtime_error(RUNTIME_ERROR_NIL_DEREFERENCE, loc);
+  Bexpression* bcrash = tree_to_expr(crash->get_tree(context));
+  Btype* btype = ret->type()->get_backend(gogo);
+  Bexpression* bcheck = tree_to_expr(nil_check->get_tree(context));
+  bme = gogo->backend()->conditional_expression(btype, bcheck, bcrash,
+		bme, loc);
+}
+  return expr_to_tree(bme);
 }
 
 // Dump ast representation of a bound method expression.


Re: [PATCH 2/2, x86] Add palignr support for AVX2.

2014-05-05 Thread Evgeny Stupachenko
Assuming first part of the patch is committed. Is the following patch
ok? It passes bootstrap and make check.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 91f6f21..475448e 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -42808,6 +42808,7 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d)
 }

 static bool expand_vec_perm_vpshufb2_vpermq (struct expand_vec_perm_d *d);
+static bool expand_vec_perm_palignr (struct expand_vec_perm_d *d);

 /* A subroutine of ix86_expand_vec_perm_builtin_1.  Try to instantiate D
in a single instruction.  */
@@ -42943,6 +42944,10 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
   if (expand_vec_perm_vpermil (d))
 return true;

+  /* Try palignr on one operand.  */
+  if (d->one_operand_p && expand_vec_perm_palignr (d))
+return true;
+
   /* Try the SSSE3 pshufb or XOP vpperm or AVX2 vperm2i128,
  vpshufb, vpermd, vpermps or vpermq variable permutation.  */
   if (expand_vec_perm_pshufb (d))
@@ -43040,22 +43045,36 @@ expand_vec_perm_palignr (struct expand_vec_perm_d *d)
   else
 return false;

-  min = 2 * nelt, max = 0;
-  for (i = 0; i < nelt; ++i)
+  /* For a rotaion permutation with one operand like: {5 6 7 0 1 2 3 4}
+ PALIGNR is better than any other permutaion expand.
+ Check for a rotation in permutation.  */
+  if (d->one_operand_p)
 {
-  unsigned e = d->perm[i];
-  if (e < min)
-   min = e;
-  if (e > max)
-   max = e;
+  min = d->perm[0];
+  for (i = 1; i < nelt; ++i)
+   if (d->perm[i] != ((min + i) & (nelt - 1)))
+ return false;
 }
-  if (min == 0 || max - min >= nelt)
-return false;
+  /* For a 2 operand permutaion we check if elements fit within one vector.  */
+  else
+{
+  min = 2 * nelt, max = 0;
+  for (i = 0; i < nelt; ++i)
+   {
+ unsigned e = d->perm[i];
+ if (e < min)
+   min = e;
+ if (e > max)
+   max = e;
+   }
+  if (min == 0 || max - min >= nelt)
+   return false;

-  /* Given that we have SSSE3, we know we'll be able to implement the
- single operand permutation after the palignr with pshufb.  */
-  if (d->testing_p)
-return true;
+  /* Given that we have SSSE3, we know we'll be able to implement the
+single operand permutation after the palignr with pshufb.  */
+  if (d->testing_p)
+   return true;
+}

   dcopy = *d;
   shift = GEN_INT (min * GET_MODE_BITSIZE (GET_MODE_INNER (d->vmode)));
@@ -43089,6 +43108,14 @@ expand_vec_perm_palignr (struct expand_vec_perm_d *d)
 }


+  /* For a rotaion permutation with one operand like: {5 6 7 0 1 2 3 4}
+ the rest single operand permutation is just move.  */
+  if (d->one_operand_p)
+{
+  emit_move_insn (d->target, gen_lowpart (d->vmode, target));
+  return true;
+}
+
   dcopy.op0 = dcopy.op1 = gen_lowpart (d->vmode, target);
   dcopy.one_operand_p = true;


On Wed, Apr 30, 2014 at 6:25 PM, Evgeny Stupachenko  wrote:
> On Tue, Apr 29, 2014 at 9:39 PM, Richard Henderson  wrote:
>> On 04/29/2014 10:13 AM, Evgeny Stupachenko wrote:
>>> +  /* For a rotaion permutation with one operand like: {5 6 7 0 1 2 3 4}
>>> + PALIGNR is better than PSHUFB.  Check for a rotation in permutation.  
>>> */
>>> +  for (i = 0; i < nelt; ++i)
>>> +if d->perm[(i + 1) & (nelt - 1)] - d->perm[i])) & (nelt - 1)) != 1)
>>> +  return false;
>>> +
>>> +  min = d->perm[0];
>>
>> Why are you running this loop NELT times instead of NELT-1 like I suggested?
>> Why is that test expression so complicated?
>>
>> Obviously d->perm[0] is the anchor and everything else can be computed 
>> relative
>> to that.  I'd expect no more than
>>
>>   min = d->perm[0];
>>   for (i = 1; i < nelt; ++i)
>> if (d->perm[i] != ((min + i) & (nelt - 1)))
>>   return false;
>
> Agree on this. The loop is less complicated.
>
>>
>> Now that I think of it,
>>
>>> +  /* PALIGNR of 2 128-bits registers takes only 1 instrucion.
>>> + Requires SSSE3.  */
>>> +  if (GET_MODE_SIZE (d->vmode) == 16)
>>> +{
>>> +  if (!TARGET_SSSE3)
>>> +   return false;
>>> +}
>>> +  /* PALIGNR of 2 256-bits registers on AVX2 costs only 2 instructions:
>>> + PERM and PALIGNR.  It is more profitable than 2 PSHUFB and PERM.  */
>>> +  else if (GET_MODE_SIZE (d->vmode) == 32)
>>> +{
>>> +  if (!TARGET_AVX2)
>>> +   return false;
>>> +}
>>> +  else
>>> +return false;
>>> +
>>> +  if (!d->one_operand_p)
>>> +return false;
>>
>> The comments are wrong.  Move the operand_p test to the top,
>> where it should be, and they're more obviously wrong:
>>
>>   "must have one operand"
>>   "palignr of two operands..."
>>
>> I think your comment
>>
>>   /* For a rotaion permutation with one operand like: {5 6 7 0 1 2 3 4}
>>  we want to use PALIGNR.  */
>>
>> belongs up there instead of those two incorrect comments.
>>
>
> What I mean in the comments
>   16 bytes case:
> For 1 operan

Re: [PATCH] pedantic warning behavior when casting void* to ptr-to-func, 4.8 and 4.9

2014-05-05 Thread Daniel Gutson
On Mon, May 5, 2014 at 11:08 AM, Richard Biener
 wrote:
> On Mon, Apr 28, 2014 at 8:14 PM, Jason Merrill  wrote:
>> Applied, thanks.  Sorry for the delay.
>
> It caused PR61066 on the branch.
>
> Richard.
>
>> Jason

It seems some tests didn't run when I tested the patch, sorry about that.
I'll try to run all the tests and update them as needed.
I will post the new patch as soon as possible.
Sorry for the inconveniences.

   Daniel.


Re: [PATCH 1/2, x86] Add palignr support for AVX2.

2014-05-05 Thread Evgeny Stupachenko
Is the following patch ok? It passes bootstrap and make check.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 88142a8..91f6f21 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -42807,6 +42807,8 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d)
   return true;
 }

+static bool expand_vec_perm_vpshufb2_vpermq (struct expand_vec_perm_d *d);
+
 /* A subroutine of ix86_expand_vec_perm_builtin_1.  Try to instantiate D
in a single instruction.  */

@@ -42946,6 +42948,10 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
   if (expand_vec_perm_pshufb (d))
 return true;

+  /* Try the AVX2 vpshufb.  */
+  if (expand_vec_perm_vpshufb2_vpermq (d))
+return true;
+
   /* Try the AVX512F vpermi2 instructions.  */
   rtx vec[64];
   enum machine_mode mode = d->vmode;
@@ -43004,7 +43010,7 @@ expand_vec_perm_pshuflw_pshufhw (struct
expand_vec_perm_d *d)
 }

 /* A subroutine of ix86_expand_vec_perm_builtin_1.  Try to simplify
-   the permutation using the SSSE3 palignr instruction.  This succeeds
+   the permutation using the SSSE3/AVX2 palignr instruction.  This succeeds
when all of the elements in PERM fit within one vector and we merely
need to shift them down so that a single vector permutation has a
chance to succeed.  */
@@ -43015,14 +43021,26 @@ expand_vec_perm_palignr (struct expand_vec_perm_d *d)
   unsigned i, nelt = d->nelt;
   unsigned min, max;
   bool in_order, ok;
-  rtx shift, target;
+  rtx shift, shift1, target, tmp;
   struct expand_vec_perm_d dcopy;

-  /* Even with AVX, palignr only operates on 128-bit vectors.  */
-  if (!TARGET_SSSE3 || GET_MODE_SIZE (d->vmode) != 16)
+  /* SSSE3 is required to apply PALIGNR on 16 bytes operands.  */
+  if (GET_MODE_SIZE (d->vmode) == 16)
+{
+  if (!TARGET_SSSE3)
+   return false;
+}
+  /* AVX2 is required to apply PALIGNR on 32 bytes operands.  */
+  else if (GET_MODE_SIZE (d->vmode) == 32)
+{
+  if (!TARGET_AVX2)
+   return false;
+}
+  /* Other sizes are not supported.  */
+  else
 return false;

-  min = nelt, max = 0;
+  min = 2 * nelt, max = 0;
   for (i = 0; i < nelt; ++i)
 {
   unsigned e = d->perm[i];
@@ -43041,9 +43059,35 @@ expand_vec_perm_palignr (struct expand_vec_perm_d *d)

   dcopy = *d;
   shift = GEN_INT (min * GET_MODE_BITSIZE (GET_MODE_INNER (d->vmode)));
-  target = gen_reg_rtx (TImode);
-  emit_insn (gen_ssse3_palignrti (target, gen_lowpart (TImode, d->op1),
- gen_lowpart (TImode, d->op0), shift));
+  shift1 = GEN_INT ((min - nelt / 2) *
+  GET_MODE_BITSIZE (GET_MODE_INNER (d->vmode)));
+
+  if (GET_MODE_SIZE (d->vmode) != 32)
+{
+  target = gen_reg_rtx (TImode);
+  emit_insn (gen_ssse3_palignrti (target, gen_lowpart (TImode, d->op1),
+ gen_lowpart (TImode, d->op0), shift));
+}
+  else
+{
+  target = gen_reg_rtx (V2TImode);
+  tmp = gen_reg_rtx (V4DImode);
+  emit_insn (gen_avx2_permv2ti (tmp,
+   gen_lowpart (V4DImode, d->op0),
+   gen_lowpart (V4DImode, d->op1),
+   GEN_INT (33)));
+  if (min < nelt / 2)
+emit_insn (gen_avx2_palignrv2ti (target,
+gen_lowpart (V2TImode, tmp),
+gen_lowpart (V2TImode, d->op0),
+shift));
+  else
+   emit_insn (gen_avx2_palignrv2ti (target,
+gen_lowpart (V2TImode, d->op1),
+gen_lowpart (V2TImode, tmp),
+shift1));
+}
+

   dcopy.op0 = dcopy.op1 = gen_lowpart (d->vmode, target);
   dcopy.one_operand_p = true;

On Tue, Apr 29, 2014 at 1:03 AM, Richard Henderson  wrote:
> On 04/28/2014 01:43 PM, Evgeny Stupachenko wrote:
>> Agree on checks:
>>
>>   /* PALIGNR of 2 128-bits registers takes only 1 instrucion.
>>  Requires SSSE3.  */
>>   if (GET_MODE_SIZE (d->vmode) == 16)
>> {
>>   if(!TARGET_SSSE3)
>> return false;
>> }
>>   /* PALIGNR of 2 256-bits registers on AVX2 costs only 2 instructions:
>>  PERM and PALIGNR.  It is more profitable than 2 PSHUFB and PERM.  */
>>   else if (GET_MODE_SIZE (d->vmode) == 32)
>> {
>>   if(!TARGET_AVX2)
>> return false;
>> }
>>   else
>> return false;
>
> Thanks, much better.
>
>
> r~


Re: Fix various x86 tests for --with-arch=bdver3 --with-cpu=bdver3

2014-05-05 Thread Joseph S. Myers
On Wed, 2 Apr 2014, Uros Bizjak wrote:

> These are due to TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL tuning flag.
> Currently, this flag applies to all vector sizes (128, 256 and 512
> bits), but I guess it is effective only for 128 bit sizes. Can you
> please review usage of this flag in i386/sse.md?

Indeed, the optimization as described in
 is purely
about reducing code size, and is irrelevant in VEX-prefixed cases.
Thus, this patch adds  == 16 conditionals in relevant cases
(some cases already had such conditionals or otherwise wouldn't be
used for larger vectors).

Tested with no regressions for x86_64-linux-gnu (--with-arch=bdver3
--with-cpu=bdver3, where it fixes most of the remaining scan-assembler
test failures).  OK to commit?

2014-05-05  Joseph Myers  

* config/i386/sse.md (*mov_internal)
(*_loadu)
(*_loaddqu)
(_andnot3, 3, *andnot3)
(*3, *andnot3)
(3): Only consider
TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL for modes of size 16.

Index: gcc/config/i386/sse.md
===
--- gcc/config/i386/sse.md  (revision 209980)
+++ gcc/config/i386/sse.md  (working copy)
@@ -758,7 +758,8 @@
   [(set_attr "type" "sselog1,ssemov,ssemov")
(set_attr "prefix" "maybe_vex")
(set (attr "mode")
-   (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+   (cond [(and (match_test " == 16")
+   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
 (const_string "")
   (and (match_test " == 16")
(and (eq_attr "alternative" "2")
@@ -967,7 +968,8 @@
(set_attr "ssememalign" "8")
(set_attr "prefix" "maybe_vex")
(set (attr "mode")
-   (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+   (cond [(and (match_test " == 16")
+   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
 (const_string "")
   (match_test "TARGET_AVX")
 (const_string "")
@@ -1089,7 +1091,8 @@
  (const_string "1")))
(set_attr "prefix" "maybe_vex")
(set (attr "mode")
-   (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+   (cond [(and (match_test " == 16")
+   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
 (const_string "")
   (match_test "TARGET_AVX")
 (const_string "")
@@ -2377,7 +2380,8 @@
(set_attr "type" "sselog")
(set_attr "prefix" "orig,maybe_evex")
(set (attr "mode")
-   (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+   (cond [(and (match_test " == 16")
+   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
 (const_string "")
   (match_test "TARGET_AVX")
 (const_string "")
@@ -2449,7 +2453,8 @@
(set_attr "type" "sselog")
(set_attr "prefix" "orig,maybe_evex")
(set (attr "mode")
-   (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+   (cond [(and (match_test " == 16")
+   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
 (const_string "")
   (match_test "TARGET_AVX")
 (const_string "")
@@ -2513,7 +2518,8 @@
(set_attr "type" "sselog")
(set_attr "prefix" "orig,vex")
(set (attr "mode")
-   (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+   (cond [(and (match_test " == 16")
+   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
 (const_string "V4SF")
   (match_test "TARGET_AVX")
 (const_string "")
@@ -2600,7 +2606,8 @@
(set_attr "type" "sselog")
(set_attr "prefix" "orig,vex")
(set (attr "mode")
-   (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+   (cond [(and (match_test " == 16")
+   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
 (const_string "V4SF")
   (match_test "TARGET_AVX")
 (const_string "")
@@ -9047,7 +9054,8 @@
(const_string "*")))
(set_attr "prefix" "")
(set (attr "mode")
-   (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+   (cond [(and (match_test " == 16")
+   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
 (const_string "")
   (match_test "TARGET_AVX2")
 (const_string "")
@@ -9140,7 +9148,8 @@
(const_string "*")))
(set_attr "prefix" "")
(set (attr "mode")
-   (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+   (cond [(and (match_test " == 16")
+   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
 (const_string "")
   (match_test "TARGET_AVX2")
 (const_string "")

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [C++ Patch] PR 60999

2014-05-05 Thread Paolo Carlini

Hi,

On 05/05/2014 03:32 PM, Jason Merrill wrote:

On 05/03/2014 06:55 PM, Paolo Carlini wrote:

On 05/03/2014 11:14 PM, Jason Merrill wrote:

Do you want CLASSTYPE_IS_TEMPLATE here?



Uhm, you mean something simple like the attached? Because certainly I
tried it and didn't fully work (nsdmi-template7.C?) and for sure we
would regress on the below (which we should probably also add to
testsuite, I had it two days ago in my experiments...).


Ah, yes, CLASSTYPE_IS_TEMPLATE is false for partial specializations. 
What we want is a predicate to check for a class template or partial 
specialization.


Good, but is it Ok to use uses_template_parms for that? A few days ago I 
struggled to find something simpler ready to use, to no avail (well, 
this is probably well known to you, but there are surprisingly few 
places in pt.c where we explain either in comments or in obvious code 
that we are handling full (vs partial) specializations).


(Well, minor improvement of the below, I think it would be ok to call 
dependent_type_p directly, but processing_template_decl must be 1 around 
it anyway)


Paolo.
Index: cp-tree.h
===
--- cp-tree.h   (revision 210068)
+++ cp-tree.h   (working copy)
@@ -3160,6 +3160,11 @@ more_aggr_init_expr_args_p (const aggr_init_expr_a
&& !CLASSTYPE_USE_TEMPLATE (NODE) \
&& PRIMARY_TEMPLATE_P (CLASSTYPE_TI_TEMPLATE (NODE)))
 
+/* True if the given class type is a template or a partial specialization.  */
+#define CLASSTYPE_IS_TEMPLATE_OR_PARTIAL_SPECIALIZATION(NODE)  \
+  (CLASSTYPE_TEMPLATE_INFO (NODE)\
+   && uses_template_parms (NODE))
+
 /* The name used by the user to name the typename type.  Typically,
this is an IDENTIFIER_NODE, and the same as the DECL_NAME on the
corresponding TYPE_DECL.  However, this may also be a
Index: pt.c
===
--- pt.c(revision 210068)
+++ pt.c(working copy)
@@ -462,9 +462,12 @@ maybe_begin_member_template_processing (tree decl)
   bool nsdmi = TREE_CODE (decl) == FIELD_DECL;
 
   if (nsdmi)
-decl = (CLASSTYPE_TEMPLATE_INFO (DECL_CONTEXT (decl))
-   ? CLASSTYPE_TI_TEMPLATE (DECL_CONTEXT (decl))
-   : NULL_TREE);
+{
+  tree ctx = DECL_CONTEXT (decl);
+  decl = (CLASSTYPE_IS_TEMPLATE_OR_PARTIAL_SPECIALIZATION (ctx)
+ ? CLASSTYPE_TI_TEMPLATE (ctx)
+ : NULL_TREE);
+}
 
   if (inline_needs_template_parms (decl, nsdmi))
 {
@@ -11519,8 +11522,8 @@ tsubst (tree t, tree args, tsubst_flags_t complain
  r = instantiate_alias_template (tmpl, gen_args, complain);
}
   else if (DECL_CLASS_SCOPE_P (decl)
-  && CLASSTYPE_TEMPLATE_INFO (DECL_CONTEXT (decl))
-  && uses_template_parms (DECL_CONTEXT (decl)))
+  && (CLASSTYPE_IS_TEMPLATE_OR_PARTIAL_SPECIALIZATION
+  (DECL_CONTEXT (decl
{
  tree tmpl = most_general_template (DECL_TI_TEMPLATE (decl));
  tree gen_args = tsubst (DECL_TI_ARGS (decl), args, complain, in_decl);


[Patch] Minor fixes for regtesting gfortran with -flto

2014-05-05 Thread Dominique Dhumieres

With the following patch, gfortran can be regtested with -flto
with no failure, but pr54852 and pr60061.

OK for trunk?

Dominique

2014-05-05  Dominique d'Humieres 

* gfortran.dg/gfortran.dg/bind_c_array_params_2.f90:
Adjust regexp for -flto.
* gfortran.dg/gfortran.dg/pr48636-2.f90: Likewise.
* gfortran.dg/pr52835.f90: Likewise.

--- ../_clean/gcc/testsuite/gfortran.dg/bind_c_array_params_2.f90   
2013-01-25 09:57:48.0 +0100
+++ gcc/testsuite/gfortran.dg/bind_c_array_params_2.f90 2014-02-04 
19:24:06.0 +0100
@@ -16,7 +16,7 @@ integer :: aa(4,4)
 call test(aa)
 end
 
-! { dg-final { scan-assembler-times "myBindC" 1 { target { ! { hppa*-*-hpux* } 
} } } }
-! { dg-final { scan-assembler-times "myBindC,%r2" 1 { target { hppa*-*-hpux* } 
} } }
+! { dg-final { scan-assembler-times "call\[^\n\r\]*myBindC" 1 { target { ! { 
hppa*-*-hpux* } } } } }
+! { dg-final { scan-assembler-times "call\[^\n\r\]*myBindC,%r2" 1 { target { 
hppa*-*-hpux* } } } }
 ! { dg-final { scan-tree-dump-times "test \\\(&parm\\." 1 "original" } }
 ! { dg-final { cleanup-tree-dump "original" } }
--- ../_clean/gcc/testsuite/gfortran.dg/pr48636-2.f90   2012-11-07 
18:36:11.0 +0100
+++ gcc/testsuite/gfortran.dg/pr48636-2.f90 2014-02-04 17:33:00.0 
+0100
@@ -33,6 +33,6 @@ program main
   print *,x
 end program main
 
-! { dg-final { scan-ipa-dump "Creating a specialized node of bar/\[0-9\]*\\." 
"cp" } }
+! { dg-final { scan-ipa-dump "Creating a specialized node of 
\[^\n\r\]*bar/\[0-9\]*\\." "cp" } }
 ! { dg-final { scan-ipa-dump-times "Aggregate 
replacements\[^=\]*=\[^=\]*=\[^=\]*=\[^=\]*=\[^=\]*=\[^=\]*=\[^=\]*=\[^=\]*=\[^=\]*="
 2 "cp" } }
 ! { dg-final { cleanup-ipa-dump "cp" } }
--- ../_clean/gcc/testsuite/gfortran.dg/pr52835.f90 2012-04-03 
10:50:54.0 +0200
+++ gcc/testsuite/gfortran.dg/pr52835.f90   2014-02-04 17:18:15.0 
+0100
@@ -12,5 +12,5 @@ subroutine foo (x, y, z, n)
   end do
 end subroutine
 
-! { dg-final { scan-tree-dump "bar " "optimized" } }
+! { dg-final { scan-tree-dump "bar\[ _\]" "optimized" } }
 ! { dg-final { cleanup-tree-dump "optimized" } }


Re: [PATCH] gengtype: Support explicit pointers in template arguments

2014-05-05 Thread Jeff Law

On 04/30/14 21:07, David Malcolm wrote:

Currently, gengtype does not support template arguments that are
explicitly pointers, such as:
   static GTY(()) vec test_gimple; giving this
error:
   ../../src/gcc/gimple-expr.c:902: parse error: expected a string constant or 
',', have '*' instead
requiring them to be typedefs.

This patch removes this restriction, supporting up to a single pointer
in each template argument.

It also adds support for inheritance, allowing:
   static GTY(()) vec *test_phis;
where the generated gt-FOO.c file contains:

   void
   gt_ggc_mx (struct gimple_statement_phi *& x)
   {
 if (x)
   gt_ggc_mx_gimple_statement_base ((void *) x);
   }

i.e. handling the subclass by calling the marking function for the base
class.

Doing so uncovered an issue within write_user_func_for_structure_ptr,
which would unconditionally write out a func.  This would lead to
gt-FOO.c containing duplicate functions e.g.:
gtype-desc.c:280: multiple definition of `gt_ggc_mx(gimple_statement_base*&)'
if more than one GTY template type referenced such a type; for example like 
this:

   static GTY(()) vec *test_old_style_gimple;
   static GTY(()) vec *test_new_style_gimple;

where "gimple" is just an alias for "gimple_statement_base *".  This
could be worked around by ensuring that the source tree consistently
uses vec<> of just one kind, but for robustness the patch also makes
write_user_func_for_structure_ptr idempotent, so it only writes out the
func once for each (struct, which-of-ggc/pch) combination.

Motivation:
In the "Compile-time gimple-checking" discussion, Richi asked me to look
at eliminating the const_gimple typedef in favor of having "gimple" be
the struct, thus making the pointer to a gimple be explicit in the type:
   http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01520.html
as in:
   http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01632.html

[...] And I never liked the
const_ typedef variants that were introduced.  The main reason for
them was probably to avoid all the churn of replacing the tree pointer
typedef with a tree (non-pointer) typedef.  The mistake was to
repeat that for 'gimple' ...


I'm attempting a patch for that, but in the meantime, this patch adds
the needed support to gengtype.

Successfully bootstrapped®tested on x86_64-unknown-linux-gnu (Fedora
20).

OK for trunk?
Basically OK.  Per the wide-int folks request, please hold off until the 
wide-int merge is done.


If after wide-int, the only changes are trivial, then consider the patch 
pre-approved (post the final version here so that its archived).


If after wide-int merges the changes to this patch are non-trivial, then 
repost the RFA.


Obviously, you get to exercise some judgement as to what constitutes 
trivial vs non-trivial.


Jeff


Re: [Patch, testsuite, mips] Fix two failing MIPS tests.

2014-05-05 Thread Steve Ellcey
On Sat, 2014-05-03 at 07:52 +0100, Richard Sandiford wrote:

> > Old:
> >
> > li  $5,305397760# 0x1234
> > addiu   $4,$5,1
> > addiu   $5,$5,-1
> >
> > New:
> >
> > li  $5,305332224# 0x1233
> > ori $5,$5,0x
> > addiu   $4,$5,2
> 
> This isn't as good though -- $4 now depends on two previous
> instructions.  Do you know which specific change was responsible?
> 
> Thanks,
> Richard

It was this one:

2014-04-29  James Greenhalgh  

* calls.c (initialize_argument_information): Always treat
PUSH_ARGS_REVERSED as 1, simplify code accordingly.
(expand_call): Likewise.
(emit_library_call_calue_1): Likewise.
* expr.c (PUSH_ARGS_REVERSED): Do not define.
(emit_push_insn): Always treat PUSH_ARGS_REVERSED as 1, simplify
code accordingly.

Steve Ellcey
sell...@mips.com



Re: RFA: speeding up dg-extract-results.sh

2014-05-05 Thread Jeff Law

On 02/13/14 02:18, Richard Sandiford wrote:


 http://gcc.gnu.org/ml/gcc-testresults/2014-02/msg00025.html

the script takes just over 5 hours to produce the gcc.log file.

Good grief...



This patch tries to reduce that by providing an alternative single-script
version.  I was torn between Python and Tcl, but given how most people
tend to react to Tcl, I thought I'd better go for Python.  I wouldn't
mind rewriting it in Tcl if that seems better though, not least because
expect is already a prerequisite.
Can't argue with that reasoning.  There's been talk of rewriting the 
testsuite drivers to use python rather than tcl under the hood.  I'd be 
surprised if anyone in our community really prefers tcl over python.


It would mean a bit of pain for initial system bootstraps by the 
distributors, but those are relatively rare occurrences and they can 
always seed that process by building without testing to get the initial 
tools, when are then used to build python, then rebuild gcc to get the 
testing done.  I don't see this as being a major issue.





Python isn't yet required and I'm pretty sure this script needs 2.6
or later.  I'm also worried that the seek/tell stuff might not work on
Windows.  The patch therefore gets dg-extract-results.sh to check the
environment first and call into the python version if possible,
otherwise it falls back on the current approach.  This also means
that the patch is contained entirely within contrib/.  If this does
indeed not work on Windows then we should either fix the python code
(obviously preferred) or get dg-extract-results.sh to skip it on
Windows for now.

Kai can probably test an answer questions about the Windows space.


I checked that the outputs were otherwise identical for a set of
mips64-linux-gnu, mipsisa64-sde-elf and x86_64-linux-gnu runs.  I also
reran the acats tests with some nobbled testcases in order to test the
failure paths there.

Also bootstrapped & regression-tested on x86_64-linux-gnu.  OK to install?




Thanks,
Richard


contrib/
* dg-extract-results.py: New file.
* dg-extract-results.sh: Use it if the environment seems suitable.

OK by me.

jeff



Re: [PATCH 0/3] Compile-time gimple checking, without typedefs

2014-05-05 Thread Jeff Law

On 05/05/14 02:22, Richard Biener wrote:


Also, AIUI, Andrew is looking at introducing concepts of gimple types and
gimple expressions, so "gimple" may no longer imply a *statement*.

Alternatively, we could make the base class be just "gimple" (which would
be more consistent with the names of the accessor functions).


I strongly prefer to name it 'gimple', not 'gimple_stmt'.  Because it's less
to type, and because it will make all other types shorter as well.  And because
'gimple' _is_ a stmt right now, so gimple_stmt is redundant.  Same applies
to gimple_stmt_with_FOO, just make it gimple_with_FOO.

I understand the namespace issue, but we don't have a namespace right now.
Also gimple::gimple works just fine, no?
But this approach is going to be inconsistent with Andrew's work, right? 
 ISTM we'd end up with something like...


So statements would be "gimple"
types would be "gimple_type"
expressions would be "gimple_expr"

It's a bit of bikeshedding, but I'd prefer "gimple_stmt".  If you really 
feel strongly about it, I'll go along without objection, but it seems 
wrong to me.






There's also the "bargain basement" namespaces approach, where we don't
have an implicit "gimple" namespace, but just *pretend* we do, and rename
the base type to "stmt", with e.g. "gimple_statement_phi" becoming just
"phi". ["gimple_switch" would need to become "switch_", say, to avoid the
reserved word].


Ick (for the 'switch' case ... CamelCase anyone? :)).

:-)  Please, no

Jeff


Re: Add call_fusage_contains_non_callee_clobbers hook

2014-05-05 Thread Vladimir Makarov

On 2014-04-24, 3:12 AM, Tom de Vries wrote:

On 23-04-14 17:10, Richard Sandiford wrote:

FWIW I think this should be a plain bool rather than a function,
like delay_sched2 etc.



Vladimir,

I've reimplemented the hook using DEFHOOKPOD instead of DEFHOOK, to make
it a plain bool.

OK for trunk?



Yes, it is ok if it was not approved yet and sorry for the delay with 
the answer.



2013-04-29  Radovan Obradovic  
 Tom de Vries  

 * target.def (call_fusage_contains_non_callee_clobbers): New
DEFHOOKPOD.
 * doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous
Register
 Hooks to @menu.
 (@node Miscellaneous Register Hooks): New node.
 (@hook TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS): New hook.
 * doc/tm.texi: Regenerate.







Re: [PATCH, PR60738] More LRA split for regno conflicting with single reg class operand

2014-05-05 Thread Vladimir Makarov

On 2014-04-25, 11:35 PM, Wei Mi wrote:

Hi,

This patch is to address the missing optimization reported in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60738

Now in process_single_reg_class_operands, any allocno A conflicting
with a single reg class operand B is marked to never use the reg class
in IRA. This is non-optimal when A is in a hot region while B is in a
cold region. The patch allows A to use the register in the single reg
class if only the hotness difference between A and B is large enough.
The patch also extends lra_split to make sure A is splitted in the
code region for B instead of being spilled.

bootstrap and regression test are ok for x86_64-linux-gnu. Is it ok for trunk?




Thanks for working on this Wei.

The patch is ok for the trunk with the change you proposed to resolve 
Steven's concern and with resolving the changelog entries pitfalls below:


It is better to add the name of the parameter to make easier a future 
search when the parameter was introduced (e.g. "params.h 
(LRA_SPLIT_FREQ_RATIO): New param.").  Adding '#include "params.h"' is 
not in the changelog too.


And you should remove unfinished comment (see below).

For the future, it would be nice if such patches are benchmarked too 
(e.g. on some SPEC) in order to be sure that it works not worse in most 
cases (it is very hard to predict effect of such changes).



ChangeLog:

2014-04-25  Wei Mi  

 PR rtl-optimization/60738
 * params.h: New param.
 * params.def: Ditto.
 * lra-constraints.c (need_for_split_p): Let more
 cases to do lra-split.
 * ira-lives.c (process_single_reg_class_operands):
 Avoid to add single reg class into conflict hardreg set in
 some cases.


...


Index: lra-constraints.c
===
--- lra-constraints.c   (revision 209253)
+++ lra-constraints.c   (working copy)
@@ -129,6 +129,7 @@
  #include "ira.h"
  #include "rtl-error.h"
  #include "lra-int.h"
+#include "params.h"

  /* Value of LRA_CURR_RELOAD_NUM at the beginning of BB of the current
 insn.  Remember that LRA_CURR_RELOAD_NUM is the number of emitted
@@ -4632,8 +4633,13 @@ static bitmap_head ebb_global_regs;
  static inline bool
  need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno)
  {
+  int freq;
+  rtx last_use_insn;
int hard_regno = regno < FIRST_PSEUDO_REGISTER ? regno : 
reg_renumber[regno];

+  last_use_insn = skip_usage_debug_insns (usage_insns[regno].insns);
+  freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (last_use_insn));
+
lra_assert (hard_regno >= 0);
return ((TEST_HARD_REG_BIT (potential_reload_hard_regs, hard_regno)
/* Don't split eliminable hard registers, otherwise we can
@@ -4653,25 +4659,27 @@ need_for_split_p (HARD_REG_SET potential
&& (regno >= FIRST_PSEUDO_REGISTER
|| ! TEST_HARD_REG_BIT (call_used_reg_set, regno)
|| usage_insns[regno].calls_num == calls_num)
-  /* We need at least 2 reloads to make pseudo splitting
- profitable.  We should provide hard regno splitting in
- any case to solve 1st insn scheduling problem when
- moving hard register definition up might result in
- impossibility to find hard register for reload pseudo of
- small register class.  */
-  && (usage_insns[regno].reloads_num
-  + (regno < FIRST_PSEUDO_REGISTER ? 0 : 3) < reloads_num)
-  && (regno < FIRST_PSEUDO_REGISTER
-  /* For short living pseudos, spilling + inheritance can
- be considered a substitution for splitting.
- Therefore we do not splitting for local pseudos.  It
- decreases also aggressiveness of splitting.  The
- minimal number of references is chosen taking into
- account that for 2 references splitting has no sense
- as we can just spill the pseudo.  */
-  || (regno >= FIRST_PSEUDO_REGISTER
-  && lra_reg_info[regno].nrefs > 3
-  && bitmap_bit_p (&ebb_global_regs, regno
+  /* If

 ^
I guess it is a leftover from the final patch editing.  It should be 
removed too.



+  && ((LRA_SPLIT_FREQ_RATIO * freq < lra_reg_info[regno].freq)
+  /* We need at least 2 reloads to make pseudo splitting
+ profitable.  We should provide hard regno splitting in
+ any case to solve 1st insn scheduling problem when
+ moving hard register definition up might result in
+ impossibility to find hard register for reload pseudo of
+ small register class.  */
+  || ((usage_insns[regno].reloads_num
+   + (regno < FIRST_PSEUDO_REGISTER ? 0 : 3) < reloads_num)
+  && (regno < FIRST_PSEUDO_REGISTER
+  /* For short living pseudos, spi

[Patch] Minor cleanup for the gfortran test suite.

2014-05-05 Thread Dominique Dhumieres

If no one objects, I'll commit the following patch in a couple of days

2014-05-05  Dominique d'Humieres 

* gfortran.dg/list_read_12.f90: Delete the file.
* gfortran.dg/gfortran.dg/vect/fast-math-real8-pr40801.f90:
Cleanup module yomphy0.

--- ../_clean/gcc/testsuite/gfortran.dg/list_read_12.f902014-03-15 
16:19:35.0 +0100
+++ gcc/testsuite/gfortran.dg/list_read_12.f90  2014-04-17 10:27:58.0 
+0200
@@ -7,5 +7,6 @@ close(99)
 
 open(99, access='sequential', form='formatted')
 read(99, *, iostat=ios) i
+close(99, status="delete")
 if (ios /= 0) call abort
 end
--- ../_clean/gcc/testsuite/gfortran.dg/vect/fast-math-real8-pr40801.f90
2012-05-15 13:41:32.0 +0200
+++ gcc/testsuite/gfortran.dg/vect/fast-math-real8-pr40801.f90  2013-10-21 
23:51:38.0 +0200
@@ -35,3 +35,4 @@ ENDIF
 END SUBROUTINE ACCONV
 
 ! { dg-final { cleanup-tree-dump "vect" } }
+! { dg-final { cleanup-modules "yomphy0" } }

Dominique


Re: [wide-int] Handle zero-precision INTEGER_CSTs again

2014-05-05 Thread Richard Sandiford
Richard Biener  writes:
> On Mon, May 5, 2014 at 12:54 PM, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
>>> On Fri, May 2, 2014 at 9:19 PM, Richard Sandiford
>>>  wrote:
 I'd hoped the days of zero-precision INTEGER_CSTs were behind us after
 Richard's patch to remove min amd max values from zero-width bitfields,
 but a boostrap-ubsan showed otherwise.  One source is in:

   null_pointer_node = build_int_cst (build_pointer_type
 (void_type_node), 0);

 if no_target, since the precision of the type comes from ptr_mode,
 which remains the default VOIDmode.  Maybe that's a bug, but setting
 it to an arbitrary nonzero value would also be dangerous.
>>>
>>> Can you explain what 'no_target' should be?  ptr_mode should never be
>>> VOIDmode.
>>
>> Sorry, meant "no_backend" rather than "no_target".  See do_compile.
>
> Ok.  So we do
>
>   /* This must be run always, because it is needed to compute the FP
>  predefined macros, such as __LDBL_MAX__, for targets using non
>  default FP formats.  */
>   init_adjust_machine_modes ();
>
>   /* Set up the back-end if requested.  */
>   if (!no_backend)
> backend_init ();
>
> where I think that init_adjust_machine_modes should initialize the
> {byte,word,ptr}_mode globals.  Move from init_emit_once.
>
> But why do we even run into uses of null_pointer_node for no_backend?
> For sure for -fsyntax-only we can't really omit backend init as IMHO
> no parser piece is decoupled enough to never call target hooks or so.
>
> Thus, omit less of the initialization for no_backend.

OK, but I don't think the wide-int merge should be held up for clean-ups
like that.  At the moment the tree & gimple leve do use 0 precision, and
although I think we both agree it'd be better if they didn't, let's do
one thing at a time.

>>> So I don't think we want this patch.  Instead stor-layout should
>>> ICE on zero-precision integer/pointer types.
>>
>> What should happen for void_zero_node?
>
> Not sure what that beast is supposed to be or why it should be
> of INTEGER_CST kind (it's not even initialized in any meaningful
> way).
>
> That said, the wide-int code shouldn't be slowed down by
> precision == 0 checks.  We should never ever reach wide-int
> with such a constant.

void_zero_node is used for ubsan too, and survives into gimple.
I did hit this in real tests, it wasn't just theoretical.

Thanks,
Richard


Re: [patch, libgfortran] PR52539 I/O: Wrong result for UTF-8/UCS-4 list-directed and namelist

2014-05-05 Thread Dominique Dhumieres
As posted in the PR, neither the patch in the PR nor the one in this thread
fixes the test for powerpc-apple-darwin9.

Cheers,

Dominique


[PATCH] Error for invalid args to -fsanitize (PR driver/61065)

2014-05-05 Thread Marek Polacek
It seems it's better to error out when someone passes bogus option to
-fsanitize rather than only give a warning.  Error gives non-zero exit
status so with this patch the behavior is friendlier to configure
scripts.

Bootstrapped on x86_64-linux, ok for trunk?

2014-05-05  Marek Polacek  

PR driver/61065
* opts.c (common_handle_option): Call error_at instead of warning_at.

diff --git gcc/opts.c gcc/opts.c
index 3c214f0..f15852d 100644
--- gcc/opts.c
+++ gcc/opts.c
@@ -1495,9 +1495,9 @@ common_handle_option (struct gcc_options *opts,
}
 
if (! found)
- warning_at (loc, 0,
- "unrecognized argument to -fsanitize= option: %q.*s",
- (int) len, p);
+ error_at (loc,
+   "unrecognized argument to -fsanitize= option: %q.*s",
+   (int) len, p);
 
if (comma == NULL)
  break;

Marek


Re: [PATCH] pedantic warning behavior when casting void* to ptr-to-func, 4.8 and 4.9

2014-05-05 Thread Richard Biener
On Mon, Apr 28, 2014 at 8:14 PM, Jason Merrill  wrote:
> Applied, thanks.  Sorry for the delay.

It caused PR61066 on the branch.

Richard.

> Jason


Re: [C++ Patch] PR 60999

2014-05-05 Thread Jason Merrill

On 05/03/2014 06:55 PM, Paolo Carlini wrote:

On 05/03/2014 11:14 PM, Jason Merrill wrote:

Do you want CLASSTYPE_IS_TEMPLATE here?



Uhm, you mean something simple like the attached? Because certainly I
tried it and didn't fully work (nsdmi-template7.C?) and for sure we
would regress on the below (which we should probably also add to
testsuite, I had it two days ago in my experiments...).


Ah, yes, CLASSTYPE_IS_TEMPLATE is false for partial specializations. 
What we want is a predicate to check for a class template or partial 
specialization.


Jason



Re: [PATCH] Fix PR39246: -Wuninitialized for partially initialized complex

2014-05-05 Thread Richard Biener
On Sun, May 4, 2014 at 10:32 AM, Thomas Preud'homme
 wrote:
> Warning for use of a partially initialized complex is displayed on the wrong 
> line (the one where the partial initialization is done) and is not consistent 
> between target (or even within a given target, for instance between hardfloat 
> and softfloat on arm-none-eabi target). This patch correctly retains the line 
> location when expanding a complex move, stop warning when a COMPLEX_EXPR is 
> constructed but instead warn when it is used, and for cases where a branch is 
> involved, get the line location from the phi argument involved as a fallback. 
> This fixes PR39246.
>
> The ChangeLog are as follows:
>
> *** gcc/ChangeLog ***
>
> 2014-05-04  Thomas Preud'homme  
>
> PR middle-end/39246
> * tree-complex.c (expand_complex_move): Keep line info when expanding
> complex move.

This part and the corresponding testcase adjustment is ok.  You can
commit it separately.

> * tree-ssa-uninit.c (warn_uninit): New argument. Ignore assignment of
> complex expression. Use new argument to display correct location for
> values coming from phi statement.
> (warn_uninitialized_vars): Adapt to new signature of warn_uninit.
> (warn_uninitialized_phi): Pass location of phi argument to 
> warn_uninit.
> * tree-ssa.c (ssa_undefined_value_p): For SSA_NAME initialized by a
> COMPLEX_EXPR, recurse on each part of the COMPLEX_EXPR.

This change will also affect PRE in a bogus way.  It also affects
tree-complex.c lowering process, eventually in a similar pessimizing way.

So please do the change local to tree-ssa-uninit.c.

Thanks,
Richard.

>
> *** gcc/testsuite/ChangeLog ***
>
> 2014-05-04  Thomas Preud'homme  
>
> * gcc.dg/uninit-13.c: Move warning on the actual source line where the
> uninitialized complex is used.
> * gcc.dg/uninit-17.c: New test to check partial initialization of
> complex with branches.
> * gcc.dg/uninit-17-O0.c: Likewise.
>
> Bootstrapped on x86_64-linux-gnu with no regression in the testsuite. Also 
> tested a cross-build of gcc for arm-none-eabi target and run the regression 
> testsuite in qemu for cortex-m3 without any regression and the uninit-13.c 
> test fixed.
>
> Ok for stage1? Patch with proper formatting in attachment.
>
> diff --git a/gcc/testsuite/gcc.dg/uninit-13.c 
> b/gcc/testsuite/gcc.dg/uninit-13.c
> index 631e8de..5e88a8a 100644
> --- a/gcc/testsuite/gcc.dg/uninit-13.c
> +++ b/gcc/testsuite/gcc.dg/uninit-13.c
> @@ -5,6 +5,6 @@ typedef _Complex float C;
>  C foo()
>  {
>C f;
> -  __imag__ f = 0;  /* { dg-warning "is used" "unconditional" } */
> -  return f;
> +  __imag__ f = 0;
> +  return f;/* { dg-warning "is used" "unconditional" } */
>  }
> diff --git a/gcc/testsuite/gcc.dg/uninit-17-O0.c 
> b/gcc/testsuite/gcc.dg/uninit-17-O0.c
> new file mode 100644
> index 000..0eaef05
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/uninit-17-O0.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Wuninitialized" } */
> +
> +typedef _Complex float C;
> +C foo(int cond)
> +{
> +  C f;
> +  __imag__ f = 0;
> +  if (cond)
> +{
> +  __real__ f = 1;
> +  return f;
> +}
> +  return f;
> +}
> diff --git a/gcc/testsuite/gcc.dg/uninit-17.c 
> b/gcc/testsuite/gcc.dg/uninit-17.c
> new file mode 100644
> index 000..8a95f15
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/uninit-17.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -Wuninitialized" } */
> +
> +typedef _Complex float C;
> +C foo(int cond)
> +{
> +  C f;
> +  __imag__ f = 0;
> +  if (cond)
> +{
> +  __real__ f = 1;
> +  return f;
> +}
> +  return f;/* { dg-warning "may be used" "unconditional" } */
> +}
> diff --git a/gcc/tree-complex.c b/gcc/tree-complex.c
> index a97aaf9..2286e2e 100644
> --- a/gcc/tree-complex.c
> +++ b/gcc/tree-complex.c
> @@ -831,12 +831,15 @@ expand_complex_move (gimple_stmt_iterator *gsi, tree 
> type)
>  {
>tree x;
>gimple t;
> +  location_t loc;
>
> +  loc = gimple_location (stmt);
>r = extract_component (gsi, rhs, 0, false);
>i = extract_component (gsi, rhs, 1, false);
>
>x = build1 (REALPART_EXPR, inner_type, unshare_expr (lhs));
>t = gimple_build_assign (x, r);
> +  gimple_set_location (t, loc);
>gsi_insert_before (gsi, t, GSI_SAME_STMT);
>
>if (stmt == gsi_stmt (*gsi))
> @@ -849,6 +852,7 @@ expand_complex_move (gimple_stmt_iterator *gsi, tree type)
> {
>   x = build1 (IMAGPART_EXPR, inner_type, unshare_expr (lhs));
>   t = gimple_build_assign (x, i);
> + gimple_set_location (t, loc);
>   gsi_insert_before (gsi, t, GSI_SAME_STMT);
>
>   stmt = gsi_stmt (*gsi);
> diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c
> index ae251cc..4310e7e 100644
> --- a/gcc/tree-ssa-uninit.c
> +++ b/gcc/tree-ssa-uninit.c
> @@ -123,17 

Re: [PATCH] Clean up and extend VRP edge-assertion code

2014-05-05 Thread Patrick Palka
On Mon, May 5, 2014 at 8:47 AM, Richard Biener
 wrote:
> On Mon, May 5, 2014 at 4:45 AM, Patrick Palka  wrote:
>> On Sun, May 4, 2014 at 12:20 PM, Patrick Palka  wrote:
>>> This patch causes a latent test failure in gcc.dg/uninit-pred-9_b.c
>>> due to an oversight in the tree-ssa-uninit code for which I will send
>>> a fix shortly.
>>
>> Never mind, I spoke too soon..  I haven't been able to solve this properly.
>
> It would be good to split the refactoring from extending the functionality.
> Also each of the cases you add should be accompanied by a testcase.
>
> Thanks,
> Richard.

Ok, will do.


Re: [C PATCH] Better column info for initializers (PR c/60139)

2014-05-05 Thread Marek Polacek
On Thu, May 01, 2014 at 11:55:38PM +, Joseph S. Myers wrote:
> On Fri, 25 Apr 2014, Marek Polacek wrote:
> 
> > Another minor fix: use loc instead of input_location.  Also add
> > missing OPT_Wpedantic. 
> 
> Why do you say it's missing?  A default pedwarn generally means "this is 
> dubious code, not allowed by ISO C, but we don't want to make it a hard 
> error" (and the trend is more to making such things hard errors, rather 
> than limiting them to -pedantic).
> 
> For the *first* such change ("not computable at load time"), I think it's 
> already conditional on (pedantic && !flag_isoc99), based on the setting of 
> require_constant_elements, and so using OPT_Wpedantic is appropriate.  But 
> I think the second represents an undesirable change to how the compiler 
> behaves - code such as
> 
> int f();
> int a = 1 ? 1 : f();
> 
> *should* be diagnosed by default rather than needing -pedantic.  If you 

Yeah, this case is even now diagnosted by default (for this we warn in
digest_init).

> think this case also only affects for form of diagnostics that were 
> already only given if pedantic, please explain further why this is the 
> case.

Right.  I added OPT_Wpedantic to second pedwarn_init ("initializer
element is not a constant expression"), because this warning was
output only when using -pedantic and C89, but it didn't say "[-Wpedantic]",
thus it looked like the warning is enabled by default.  The reason
is that this pedwarn is guarded by require_constant_elements, which
is, as you said, dependent on (pedantic && !flag_isoc99) in
start_init.  Just for the record, this pedwarn warns e.g. on

double sin (double);
void
fn (int *p)
{
  double d[] = { sin (1.0) };
}

So the patch doesn't suppress any warnings, it only adds
"[-Wpedantic]" where it should be (there might be other cases where 
"[-Wpedantic]" is missing).

Marek


Re: [PATCH] Clean up and extend VRP edge-assertion code

2014-05-05 Thread Richard Biener
On Mon, May 5, 2014 at 4:45 AM, Patrick Palka  wrote:
> On Sun, May 4, 2014 at 12:20 PM, Patrick Palka  wrote:
>> This patch causes a latent test failure in gcc.dg/uninit-pred-9_b.c
>> due to an oversight in the tree-ssa-uninit code for which I will send
>> a fix shortly.
>
> Never mind, I spoke too soon..  I haven't been able to solve this properly.

It would be good to split the refactoring from extending the functionality.
Also each of the cases you add should be accompanied by a testcase.

Thanks,
Richard.


Re: [wide-int] Handle zero-precision INTEGER_CSTs again

2014-05-05 Thread Richard Biener
On Mon, May 5, 2014 at 12:54 PM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Fri, May 2, 2014 at 9:19 PM, Richard Sandiford
>>  wrote:
>>> I'd hoped the days of zero-precision INTEGER_CSTs were behind us after
>>> Richard's patch to remove min amd max values from zero-width bitfields,
>>> but a boostrap-ubsan showed otherwise.  One source is in:
>>>
>>>   null_pointer_node = build_int_cst (build_pointer_type (void_type_node), 
>>> 0);
>>>
>>> if no_target, since the precision of the type comes from ptr_mode,
>>> which remains the default VOIDmode.  Maybe that's a bug, but setting
>>> it to an arbitrary nonzero value would also be dangerous.
>>
>> Can you explain what 'no_target' should be?  ptr_mode should never be
>> VOIDmode.
>
> Sorry, meant "no_backend" rather than "no_target".  See do_compile.

Ok.  So we do

  /* This must be run always, because it is needed to compute the FP
 predefined macros, such as __LDBL_MAX__, for targets using non
 default FP formats.  */
  init_adjust_machine_modes ();

  /* Set up the back-end if requested.  */
  if (!no_backend)
backend_init ();

where I think that init_adjust_machine_modes should initialize the
{byte,word,ptr}_mode globals.  Move from init_emit_once.

But why do we even run into uses of null_pointer_node for no_backend?
For sure for -fsyntax-only we can't really omit backend init as IMHO
no parser piece is decoupled enough to never call target hooks or so.

Thus, omit less of the initialization for no_backend.

>> So I don't think we want this patch.  Instead stor-layout should
>> ICE on zero-precision integer/pointer types.
>
> What should happen for void_zero_node?

Not sure what that beast is supposed to be or why it should be
of INTEGER_CST kind (it's not even initialized in any meaningful
way).

That said, the wide-int code shouldn't be slowed down by
precision == 0 checks.  We should never ever reach wide-int
with such a constant.

Richard.

> Thanks,
> Richard


[PATCH] config-list.mk: `show' target to show all considered targets

2014-05-05 Thread Jan-Benedict Glaw
Hi!

I'd like to install this patch, which would help me to run the build
robot (http://toolchain.lug-owl.de/buildbot/):

2014-05-05  Jan-Benedict Glaw  

contrib/
* config-list.mk (show): New target.

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index ddf24a2..cea0a94 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -80,8 +80,10 @@ LIST = aarch64-elf aarch64-linux-gnu \
 LOGFILES = $(patsubst %,log/%-make.out,$(LIST))
 all: $(LOGFILES)
 config: $(LIST)
+show:
+   @echo $(LIST)
 
-.PHONY: make-log-dir all config
+.PHONY: make-log-dir all config show
 
 empty=
 


Ok?

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: 17:45 <@Eimann> Hrm, das E90 hat keinen Lebenszeit Call-Time 
Counter mehr
the second  : 17:46 <@jbglaw> Eimann: Wofür braucht man das?
  17:46 <@jbglaw> Eimann: Für mich ist an 'nem Handy wichtig, daß 
ich mein
  Gegeüber hören kann. Und daß mein Gegenüber mich 
versteht...
  17:47 <@KrisK> jbglaw: was du meinst ist wodka.
  17:47 <@KrisK> jbglaw: es klingelt und man hört stimmen


signature.asc
Description: Digital signature


Re: [PATCH Ping v2] Extend -fstack-protector-strong to cover calls with return slot

2014-05-05 Thread Florian Weimer

On 02/03/2014 10:05 AM, Florian Weimer wrote:

On 01/17/2014 11:26 AM, Florian Weimer wrote:

On 01/08/2014 03:57 PM, Florian Weimer wrote:


What about the attached version?  It still does not exactly match your
original suggestion because gimple_call_lhs (stmt) can be NULL_TREE if
the result is ignored and this case needs instrumentation, as you
explained, so I use the function return type in the aggregate_value_p
check.

Testing is still under way, but looks good so far.  I'm bootstrapping
with BOOT_CFLAGS="-O2 -g -fstack-protector-strong" with Ada enabled, for
additional coverage.


Testing passed without new regressions.  Is this okay for trunk?


Ping?  Thanks.


Now that 4.9 is released, I'd like to propose again this fix for inclusion:



--
Florian Weimer / Red Hat Product Security Team


RE: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-05-05 Thread Thomas Preud'homme
I found a way to improve the function find_bswap/find_bswap_or_nop
and reduce its size. Please hold for the review, I will post an updated
version as soon as I finish testing.

Best regards,

Thomas Preud'homme





Re: [wide-int] Handle zero-precision INTEGER_CSTs again

2014-05-05 Thread Richard Sandiford
Richard Biener  writes:
> On Fri, May 2, 2014 at 9:19 PM, Richard Sandiford
>  wrote:
>> I'd hoped the days of zero-precision INTEGER_CSTs were behind us after
>> Richard's patch to remove min amd max values from zero-width bitfields,
>> but a boostrap-ubsan showed otherwise.  One source is in:
>>
>>   null_pointer_node = build_int_cst (build_pointer_type (void_type_node), 0);
>>
>> if no_target, since the precision of the type comes from ptr_mode,
>> which remains the default VOIDmode.  Maybe that's a bug, but setting
>> it to an arbitrary nonzero value would also be dangerous.
>
> Can you explain what 'no_target' should be?  ptr_mode should never be
> VOIDmode.

Sorry, meant "no_backend" rather than "no_target".  See do_compile.

> So I don't think we want this patch.  Instead stor-layout should
> ICE on zero-precision integer/pointer types.

What should happen for void_zero_node?

Thanks,
Richard


Re: [PATCH] Fix PR c++/59366 A friend function template defined in a class is found without ADL

2014-05-05 Thread Momchil Velikov

Bootstrapped/regtested on x86_64-linux-gnu.

One test, gcc/testsuite/g++.old-deja/g++.pt/friend5.C, breaks, as it should,
and was fixed.

~chill



[PATCH] Fix PR c++/59366 A friend function template defined in a class is found without ADL

2014-05-05 Thread Momchil Velikov

Hello,

A friend function, defined in a class and not declared outside should
be hidden from ordinary name lookup and only found by argument-dependent
lookup, like:

struct S
{
   friend void f() {}
   friend void g(const S &) {}
};

int
main()
{
   f(); // error
   g(S()); // correct, found by ADL
}

GCC correctly handles this case, but fails for function templates like:

struct S
{
   template friend void f(T) {}
};

int
main()
{
   f(1); // should be an error, GCC succeeds
   f(S()); // correct, found by ADL
}

~chill

diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index 374cd0f..b374181 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,9 @@
+2014-05-05  Momchil Velikov  
+
+	PR c++/59366
+	* name-lookup.c (pushdecl_maybe_friend_1): Hide friend functions
+	and function templates, declared only in the class.
+
 2014-05-03  Paolo Carlini  
 
 	PR c++/58582
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index d900560..8441047 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -924,6 +924,29 @@ pushdecl_maybe_friend_1 (tree x, bool is_friend)
   if (DECL_DECLARES_FUNCTION_P (t))
 	check_default_args (t);
 
+  if (TREE_CODE (x) == FUNCTION_DECL || DECL_FUNCTION_TEMPLATE_P (x))
+{
+	  if (is_friend)
+	{
+	  if (t == x && !flag_friend_injection)
+		{
+		  /* This is a new friend declaration of a function or
+		 a function template, so hide it from ordinary
+		 function lookup.  */
+		  DECL_ANTICIPATED (t) = 1;
+		  DECL_HIDDEN_FRIEND_P (t) = 1;
+		}
+	}
+	  else if (t != x && t != error_mark_node)
+	{
+	  /* This is a non-friend re-declaration of a possibly
+		 hidden function or a function template, so don't hide
+		 it. */
+	  DECL_ANTICIPATED (t) = 0;
+	  DECL_HIDDEN_FRIEND_P (t) = 0;
+	}
+	}
+
   if (t != x || DECL_FUNCTION_TEMPLATE_P (t))
 	return t;
 
@@ -983,16 +1006,6 @@ pushdecl_maybe_friend_1 (tree x, bool is_friend)
 	}
 	}
 
-  if (TREE_CODE (x) == FUNCTION_DECL
-	  && is_friend
-	  && !flag_friend_injection)
-	{
-	  /* This is a new declaration of a friend function, so hide
-	 it from ordinary function lookup.  */
-	  DECL_ANTICIPATED (x) = 1;
-	  DECL_HIDDEN_FRIEND_P (x) = 1;
-	}
-
   /* This name is new in its binding level.
 	 Install the new declaration and return it.  */
   if (namespace_bindings_p ())
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 3b613d9..dbd63bd 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,10 @@
+2014-05-05  Momchil Velikov  
+
+	PR c++/59366
+	* g++.dg/template/friend56.C: New
+	* g++.old-deja/g++.pt/friend5.C (main): Fix testcase. The friend
+	functions `f` should be found only by ADL.
+
 2014-05-03  Paolo Carlini  
 
 	PR c++/58582
diff --git a/gcc/testsuite/g++.dg/template/friend56.C b/gcc/testsuite/g++.dg/template/friend56.C
new file mode 100644
index 000..7077d5e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/friend56.C
@@ -0,0 +1,21 @@
+// PR c++/59366
+// { dg-do compile }
+template void f(T);
+
+struct S
+{
+  template friend void f(T) {}
+  template friend void g(T) {}
+  template friend void h(T) {}
+};
+
+template void h(T);
+
+int
+main ()
+{
+  f(1);
+  g(1); // { dg-error "'g' was not declared in this scope" }
+  g(S());
+  h(1);
+}
diff --git a/gcc/testsuite/g++.old-deja/g++.pt/friend5.C b/gcc/testsuite/g++.old-deja/g++.pt/friend5.C
index 3feeb68..edb9d62 100644
--- a/gcc/testsuite/g++.old-deja/g++.pt/friend5.C
+++ b/gcc/testsuite/g++.old-deja/g++.pt/friend5.C
@@ -14,5 +14,5 @@ class C
 
 int main()
 {
-  f(7);
+  f(C());
 }



GCC 4.8.3 Status Report (2014-05-05)

2014-05-05 Thread Richard Biener

Status
==

After releasing GCC 4.9.0 it is now time to think about a GCC 4.8.3
release.  The branch remains in release-branch mode for now until
we do a first release candidate somewhen next week.  This means you
have about a week to do backports of important regression fixes - now
that GCC 4.9.0 is out you should become more careful of what you
backport, avoiding any risk if you can point people to GCC 4.9.0
as a workaround.  After GCC 4.8.3 is out the 4.7 branch will also
see a last release before it is closed.


Quality Data


Priority  #   Change from last report
---   ---
P10
P2  100+  30
P3   42-  31
---   ---
Total   142-   1


Previous Report
===

http://gcc.gnu.org/ml/gcc/2013-10/msg00170.html


The next report will be sent by the one doing GCC 4.8.3 RC1


[PATCH][3/n] Make TODO_verify_il handle TODO_verify_flow_info

2014-05-05 Thread Richard Biener

This tackles TODO_verify_flow_info, the last existing bit.  Similar
to others we have to avoid running after IPA passes as they leave
CFG bits to be fixed up by the fixup_cfg pass invocations.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

The next bit will be removing of all the TODO_ flags and their
use.  Missing is clever scheduling of loop verification.

Richard.

2014-05-05  Richard Biener  

* passes.c (execute_function_todo): Move TODO_verify_flow under
the TODO_verify_ul umbrella.

Index: gcc/passes.c
===
--- gcc/passes.c(revision 209928)
+++ gcc/passes.c(working copy)
@@ -1783,11 +1783,11 @@ execute_function_todo (function *fn, voi
/* IPA passes leave stmts to be fixed up, so make sure to
   not verify SSA operands whose verifier will choke on that.  */
verify_ssa (true, !from_ipa_pass);
-   }
-  if (flags & TODO_verify_flow)
-   verify_flow_info ();
-  if (flags & TODO_verify_il)
-   {
+ /* IPA passes leave basic-blocks unsplit, so make sure to
+not trip on that.  */
+ if ((cfun->curr_properties & PROP_cfg)
+ && !from_ipa_pass)
+   verify_flow_info ();
  if (current_loops
  && loops_state_satisfies_p (LOOP_CLOSED_SSA))
verify_loop_closed_ssa (false);



Re: [AArch64/ARM 3/3] Add execution tests of ARM ZIP Intrinsics

2014-05-05 Thread Christophe Lyon
This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61062


On 30 April 2014 11:18, Ramana Radhakrishnan  wrote:
> On Thu, Mar 27, 2014 at 10:53 AM, Alan Lawrence  wrote:
>> Final patch adds new tests of the ARM ZIP Intrinsics (subsuming the
>> autogenerated ones in testsuite/gcc.target/arm/neon/), that also check the
>> execution results, reusing the test bodies introduced into AArch64 in the
>> first patch.
>>
>> All tests passing on arm-none-eabi.
>
> This is OK - thanks,
>
> Ramana
>
>>
>> gcc/testsuite/ChangeLog:
>> 2012-03-27  Alan Lawrence  
>>
>> * gcc.target/arm/simd/simd.exp: New file.
>> * gcc.target/arm/simd/vzipqf32_1.c: New file.
>> * gcc.target/arm/simd/vzipqp16_1.c: New file.
>> * gcc.target/arm/simd/vzipqp8_1.c: New file.
>> * gcc.target/arm/simd/vzipqs16_1.c: New file.
>> * gcc.target/arm/simd/vzipqs32_1.c: New file.
>> * gcc.target/arm/simd/vzipqs8_1.c: New file.
>> * gcc.target/arm/simd/vzipqu16_1.c: New file.
>> * gcc.target/arm/simd/vzipqu32_1.c: New file.
>> * gcc.target/arm/simd/vzipqu8_1.c: New file.
>> * gcc.target/arm/simd/vzipf32_1.c: New file.
>> * gcc.target/arm/simd/vzipp16_1.c: New file.
>> * gcc.target/arm/simd/vzipp8_1.c: New file.
>> * gcc.target/arm/simd/vzips16_1.c: New file.
>> * gcc.target/arm/simd/vzips32_1.c: New file.
>> * gcc.target/arm/simd/vzips8_1.c: New file.
>> * gcc.target/arm/simd/vzipu16_1.c: New file.
>> * gcc.target/arm/simd/vzipu32_1.c: New file.
>> * gcc.target/arm/simd/vzipu8_1.c: New file.
>> diff --git a/gcc/testsuite/gcc.target/arm/simd/simd.exp
>> b/gcc/testsuite/gcc.target/arm/simd/simd.exp
>> new file mode 100644
>> index 000..746429d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/simd/simd.exp
>> @@ -0,0 +1,35 @@
>> +# Copyright (C) 1997-2014 Free Software Foundation, Inc.
>> +
>> +# This program is free software; you can redistribute it and/or modify
>> +# it under the terms of the GNU General Public License as published by
>> +# the Free Software Foundation; either version 3 of the License, or
>> +# (at your option) any later version.
>> +#
>> +# This program is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with GCC; see the file COPYING3.  If not see
>> +# .
>> +
>> +# GCC testsuite that uses the `dg.exp' driver.
>> +
>> +# Exit immediately if this isn't an ARM target.
>> +if ![istarget arm*-*-*] then {
>> +  return
>> +}
>> +
>> +# Load support procs.
>> +load_lib gcc-dg.exp
>> +
>> +# Initialize `dg'.
>> +dg-init
>> +
>> +# Main loop.
>> +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
>> +   "" ""
>> +
>> +# All done.
>> +dg-finish
>> diff --git a/gcc/testsuite/gcc.target/arm/simd/vzipf32_1.c
>> b/gcc/testsuite/gcc.target/arm/simd/vzipf32_1.c
>> new file mode 100644
>> index 000..efaa96e
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/simd/vzipf32_1.c
>> @@ -0,0 +1,12 @@
>> +/* Test the `vzipf32' ARM Neon intrinsic.  */
>> +
>> +/* { dg-do run } */
>> +/* { dg-require-effective-target arm_neon_ok } */
>> +/* { dg-options "-save-temps -O1 -fno-inline" } */
>> +/* { dg-add-options arm_neon } */
>> +
>> +#include "arm_neon.h"
>> +#include "../../aarch64/simd/vzipf32.x"
>> +
>> +/* { dg-final { scan-assembler-times "vuzp\.32\[ \t\]+\[dD\]\[0-9\]+,
>> ?\[dD\]\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n" 1 } } */
>> +/* { dg-final { cleanup-saved-temps } } */
>> diff --git a/gcc/testsuite/gcc.target/arm/simd/vzipp16_1.c
>> b/gcc/testsuite/gcc.target/arm/simd/vzipp16_1.c
>> new file mode 100644
>> index 000..4154333
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/simd/vzipp16_1.c
>> @@ -0,0 +1,12 @@
>> +/* Test the `vzipp16' ARM Neon intrinsic.  */
>> +
>> +/* { dg-do run } */
>> +/* { dg-require-effective-target arm_neon_ok } */
>> +/* { dg-options "-save-temps -O1 -fno-inline" } */
>> +/* { dg-add-options arm_neon } */
>> +
>> +#include "arm_neon.h"
>> +#include "../../aarch64/simd/vzipp16.x"
>> +
>> +/* { dg-final { scan-assembler-times "vzip\.16\[ \t\]+\[dD\]\[0-9\]+,
>> ?\[dD\]\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n" 1 } } */
>> +/* { dg-final { cleanup-saved-temps } } */
>> diff --git a/gcc/testsuite/gcc.target/arm/simd/vzipp8_1.c
>> b/gcc/testsuite/gcc.target/arm/simd/vzipp8_1.c
>> new file mode 100644
>> index 000..9fe2384
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/simd/vzipp8_1.c
>> @@ -0,0 +1,12 @@
>> +/* Test the `vzipp8' ARM Neon intrinsic.  */
>> +
>> +/* { dg-do run } */
>> +/* { dg-require-effective-target arm_neon_ok } */
>> +/* { dg-options "-save-temps -O1 -fno-inline" } */
>> +/* { dg-add-optio

Re: Changes for if-convert to recognize simple conditional reduction.

2014-05-05 Thread Yuri Rumyantsev
No, this is quite different issue related to safety of load/stores
which are on branched paths, i.e. not always executed and may trap.
Note that we don't have such  issue for HSW which have masked
load/stores.

Yuri.

2014-05-05 11:44 GMT+04:00 Richard Biener :
> On Wed, Apr 30, 2014 at 5:50 PM, Yuri Rumyantsev  wrote:
>> Thanks a lot Richard for you review.
>> I did all proposed changes, checked that bootstrap and regression
>> testing did not show new failures.
>> Updated patch is attached.
>
> As said, this is ok for checkin.
>
> Thanks,
> Richard.
>
>> Best regards.
>> Yuri.
>>
>> 2014-04-30 16:40 GMT+04:00 Richard Biener :
>>> On Tue, Apr 29, 2014 at 4:29 PM, Yuri Rumyantsev  wrote:
 2014-04-28 16:16 GMT+04:00 Richard Biener :
> On Thu, Apr 17, 2014 at 3:09 PM, Yuri Rumyantsev  
> wrote:
>> Hi All,
>>
>> We implemented enhancement for if-convert phase to recognize the
>> simplest conditional reduction and to transform it vectorizable form,
>> e.g. statement
>> if (A[i] != 0) num+= 1; will be recognized.
>> A new test-case is also provided.
>>
>> Bootstrapping and regression testing did not show any new failures.
>
> Clever.  Can you add a testcase with a non-constant but invariant
> reduction value and one with a variable reduction value as well?
>
 [Yuri]
 I added another testcase to demonstrate ability of new algorithm, i.e.
 it transforms   if (a[i] != 0)   sum += a[i];
 to unconditional form without check on zero. Note also that any checks
 on non-reduction operand were deleted.

> +  if (!(is_cond_scalar_reduction (arg_0, &reduc, &op0, &op1)
> +   || is_cond_scalar_reduction (arg_1, &reduc, &op0, &op1)))
>
> Actually one of the args should be defined by a PHI node in the
> loop header and the PHI result should be the PHI arg on the
> latch edge, so I'd pass both PHI args to the predicate and do
> the decision on what the reduction op is there (you do that
> anyway).  The pattern matching is somewhat awkward
>
 [Yuri]
 I changed prototype of 'is_cond_scalar_reduction'  and now we have
 only one call:
 +  if (!is_cond_scalar_reduction (phi, &reduc, &op0, &op1))

> +  /* Consider only conditional reduction.  */
> +  bb = gimple_bb (stmt);
> +  if (!bb_has_predicate (bb))
> +return false;
> +  if (is_true_predicate (bb_predicate (bb)))
> +return false;
>
> should be replaced by matching the PHI structure
>
> loop-header:
>   reduc_1 = PHI <..., reduc_2>
>   ...
>   if (..)
> reduc_3 = ...
>   reduc_2 = PHI 
>
 [Yuri]
In fact, I re-wrote this function completely as you proposed using
 PHI structure matching.

> +  lhs = gimple_assign_lhs (stmt);
> +  if (TREE_CODE (lhs) != SSA_NAME)
> +return false;
>
> always true, in fact lhs == arg.
>
 [Yuri]
 Fixed.

> +  if (SSA_NAME_VAR (lhs) == NULL)
> +return false;
>
 [Yuri]
 Deleted.
> no need to check that (or later verify SSA_NAME_VAR equivalency), not
> sure why you think you need that.
>
> +  if (!single_imm_use (lhs, &use, &use_stmt))
> +return false;
> +  if (gimple_code (use_stmt) != GIMPLE_PHI)
> +return false;
>
> checking has_single_use (arg) is enough.  The above is error-prone
> wrt debug statements.
>
 [Yuri] Only proposed check is used:
 +  if (!has_single_use (lhs))
 +return false;

> +  if (reduction_op == PLUS_EXPR &&
> +  TREE_CODE (r_op2) == SSA_NAME)
>
> && goes to the next line
>
 [Yuri]
 Fixed.

> +  if (TREE_CODE (r_op2) != INTEGER_CST && TREE_CODE (r_op2) != REAL_CST)
> +return false;
>
> any reason for this check?  The vectorizer can cope with
> loop invariant non-constant values as well (at least).
>
 [Yuri]
 This checks were deleted, i.e. any operand is acceptable.

> +  /* Right operand is constant, check that left operand is equal to lhs. 
>  */
> +  if (SSA_NAME_VAR (lhs) !=  SSA_NAME_VAR (r_op1))
> +return false;
>
> see above - that looks weird.
>
 [Yuri]
 This code was deleted.
> Note that I think you may introduce undefined overflow in
> unconditionally executing the increment.  So you need to
> make sure to re-write the increment in unsigned arithmetic
> (for integral types, that is).
 [Yuri]
 I did not catch your point: if we have
 if (cond) s += val;
 it will be transformed to
 s += (cond? val: 0)
 which looks like completely equivalent to original stmt.
>>>
>>> Ah indeed.
>>>
>
> Thanks,
> Richard.
>
>
>
>> Is it OK for trunk?
>>
>> gcc/ChangeLog:
>> 2014-04-17  Yuri Rumyantsev  
>>
>> * tree-if-conv.c (is_cond_scalar_reduct

Re: [wide-int] Handle zero-precision INTEGER_CSTs again

2014-05-05 Thread Richard Biener
On Fri, May 2, 2014 at 9:19 PM, Richard Sandiford
 wrote:
> I'd hoped the days of zero-precision INTEGER_CSTs were behind us after
> Richard's patch to remove min amd max values from zero-width bitfields,
> but a boostrap-ubsan showed otherwise.  One source is in:
>
>   null_pointer_node = build_int_cst (build_pointer_type (void_type_node), 0);
>
> if no_target, since the precision of the type comes from ptr_mode,
> which remains the default VOIDmode.  Maybe that's a bug, but setting
> it to an arbitrary nonzero value would also be dangerous.

Can you explain what 'no_target' should be?  ptr_mode should never be
VOIDmode.  So I don't think we want this patch.  Instead stor-layout should
ICE on zero-precision integer/pointer types.

Richard.

>
> The other use is in the C/C++ void_zero_node, which is specifically
> defined as a VOID_TYPE, zero-precision INTEGER_CST.  This is used by the
> ubsan code in some ?: tests, as well as by other places in the frontend
> proper.  At least the ubsan use persists until gimple.
>
> This patch therefore restores the wide-int handling for zero precision,
> for which the length must be 1 and the single HWI must be zero.
> I've tried to wrap up most of the dependencies in two new functions,
> wi::blocks_needed (a function version of the .cc BLOCKS_NEEDED, now also
> used in the header) and wi::excess_bits, so that it'll be easier to
> remove the handling again if zero precision ever goes away.
> There are some remaining, easily-greppable cases that check directly
> for a precision of 0 though.
>
> The combination of this and the other patches allows boostrap-ubsan to
> complete.  There are a lot of extra testsuite failures compared to a
> normal bootstrap, but they don't look related to wide-int.
>
> Tested on x86_64-linux-gnu and powerpc64-linux-gnu.  OK to install?
>
> Thanks,
> Richard
>
>
> Index: gcc/wide-int.cc
> ===
> --- gcc/wide-int.cc 2014-05-02 16:28:07.657847935 +0100
> +++ gcc/wide-int.cc 2014-05-02 16:28:09.560842845 +0100
> @@ -48,8 +48,6 @@ static const HOST_WIDE_INT zeros[WIDE_IN
>  #define HALF_INT_MASK (((HOST_WIDE_INT) 1 << HOST_BITS_PER_HALF_WIDE_INT) - 
> 1)
>
>  #define BLOCK_OF(TARGET) ((TARGET) / HOST_BITS_PER_WIDE_INT)
> -#define BLOCKS_NEEDED(PREC) \
> -  (PREC ? (((PREC) + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT) : 
> 1)
>  #define SIGN_MASK(X) ((HOST_WIDE_INT) (X) < 0 ? -1 : 0)
>
>  /* Return the value a VAL[I] if I < LEN, otherwise, return 0 or -1
> @@ -69,20 +67,20 @@ safe_uhwi (const HOST_WIDE_INT *val, uns
>  static unsigned int
>  canonize (HOST_WIDE_INT *val, unsigned int len, unsigned int precision)
>  {
> -  unsigned int blocks_needed = BLOCKS_NEEDED (precision);
> +  unsigned int blocks_needed = wi::blocks_needed (precision);
>HOST_WIDE_INT top;
>int i;
>
>if (len > blocks_needed)
>  len = blocks_needed;
>
> +  if (wi::excess_bits (len, precision) > 0)
> +val[len - 1] = sext_hwi (val[len - 1], precision % 
> HOST_BITS_PER_WIDE_INT);
>if (len == 1)
>  return len;
>
>top = val[len - 1];
> -  if (len * HOST_BITS_PER_WIDE_INT > precision)
> -val[len - 1] = top = sext_hwi (top, precision % HOST_BITS_PER_WIDE_INT);
> -  if (top != 0 && top != (HOST_WIDE_INT)-1)
> +  if (top != 0 && top != (HOST_WIDE_INT) -1)
>  return len;
>
>/* At this point we know that the top is either 0 or -1.  Find the
> @@ -134,7 +132,7 @@ wi::from_buffer (const unsigned char *bu
>
>/* We have to clear all the bits ourself, as we merely or in values
>   below.  */
> -  unsigned int len = BLOCKS_NEEDED (precision);
> +  unsigned int len = wi::blocks_needed (precision);
>HOST_WIDE_INT *val = result.write_val ();
>for (unsigned int i = 0; i < len; ++i)
>  val[i] = 0;
> @@ -180,7 +178,7 @@ wi::to_mpz (const wide_int_ref &x, mpz_t
>  {
>int len = x.get_len ();
>const HOST_WIDE_INT *v = x.get_val ();
> -  int excess = len * HOST_BITS_PER_WIDE_INT - x.get_precision ();
> +  unsigned int excess = wi::excess_bits (len, x.get_precision ());
>
>if (wi::neg_p (x, sgn))
>  {
> @@ -306,7 +304,8 @@ wi::force_to_size (HOST_WIDE_INT *val, c
>unsigned int xlen, unsigned int xprecision,
>unsigned int precision, signop sgn)
>  {
> -  unsigned int blocks_needed = BLOCKS_NEEDED (precision);
> +  unsigned int blocks_needed = wi::blocks_needed (precision);
> +  unsigned int xblocks_needed = wi::blocks_needed (xprecision);
>unsigned int len = blocks_needed < xlen ? blocks_needed : xlen;
>for (unsigned i = 0; i < len; i++)
>  val[i] = xval[i];
> @@ -318,11 +317,11 @@ wi::force_to_size (HOST_WIDE_INT *val, c
>/* Expanding.  */
>if (sgn == UNSIGNED)
> {
> - if (small_xprecision && len == BLOCKS_NEEDED (xprecision))
> + if (small_xprecision && len == xblocks_needed)
> val[len - 1] = zext_hwi (val[len - 1], small_xprecision);
>   

Re: [PATCH] VRP: simplify assert location check

2014-05-05 Thread Richard Biener
On Sun, May 4, 2014 at 5:03 AM, Patrick Palka  wrote:
> We can check if any assertions were found by simply inspecting the
> need_assert_for bitmap.  This eliminates the need to pass all these
> bools around everywhere.
>
> 2014-05-03  Patrick Palka  
>
> * tree-vrp.c (register_edge_assert_for_1): Change return type
> to void.
> (register_edge_assert_for, find_conditional_asserts,
> find_switch_asserts, find_assert_locations): Likewise.
> (insert_range_assertions): Inspect the need_assert_for bitmap.
>
> Bootstrap and regtest on x86_64-unknown-linux-gnu in progress.

Ok if that passes.

Thanks,
Richard.

> ---
>  gcc/tree-vrp.c | 138 
> -
>  1 file changed, 48 insertions(+), 90 deletions(-)
>
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index 0d18b42..3d24f8e 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -4803,7 +4803,7 @@ masked_increment (double_int val, double_int mask, 
> double_int sgnbit,
> Invert the condition COND if INVERT is true.
> Return true if an assertion is registered.  */
>
> -static bool
> +static void
>  register_edge_assert_for_1 (tree name, edge e, gimple_stmt_iterator bsi,
> unsigned int limit, enum tree_code cond_code,
> tree cond_op0, tree cond_op1, bool invert)
> @@ -4811,29 +4811,25 @@ register_edge_assert_for_1 (tree name, edge e, 
> gimple_stmt_iterator bsi,
>tree val;
>enum tree_code comp_code, def_rhs_code;
>gimple def_stmt;
> -  bool retval = false;
>
>if (limit == 0)
> -return false;
> +return;
>
>/* We only care about SSA_NAMEs.  */
>if (TREE_CODE (name) != SSA_NAME)
> -return false;
> +return;
>
>if (!extract_code_and_val_from_cond_with_ops (name, cond_code,
> cond_op0,
> cond_op1,
> invert, &comp_code, &val))
> -return false;
> +return;
>
>/* Only register an ASSERT_EXPR if NAME was found in the sub-graph
>   reachable from E.  */
>if (live_on_edge (e, name)
>&& !has_single_use (name))
> -{
> -  register_new_assert_for (name, name, comp_code, val, NULL, e, bsi);
> -  retval = true;
> -}
> +register_new_assert_for (name, name, comp_code, val, NULL, e, bsi);
>
>/* In the case of NAME <= CST and NAME being defined as
>   NAME = (unsigned) NAME2 + CST2 we can assert NAME2 >= -CST2
> @@ -4894,8 +4890,6 @@ register_edge_assert_for_1 (tree name, edge e, 
> gimple_stmt_iterator bsi,
> }
>
>   register_new_assert_for (name3, tmp, comp_code, val, NULL, e, bsi);
> -
> - retval = true;
> }
>
>/* If name2 is used later, create an ASSERT_EXPR for it.  */
> @@ -4925,8 +4919,6 @@ register_edge_assert_for_1 (tree name, edge e, 
> gimple_stmt_iterator bsi,
> }
>
>   register_new_assert_for (name2, tmp, comp_code, val, NULL, e, bsi);
> -
> - retval = true;
> }
>  }
>
> @@ -4964,7 +4956,6 @@ register_edge_assert_for_1 (tree name, edge e, 
> gimple_stmt_iterator bsi,
>   cst = int_const_binop (code, val, cst);
>   register_new_assert_for (name2, name2, comp_code, cst,
>NULL, e, bsi);
> - retval = true;
> }
> }
>  }
> @@ -5028,8 +5019,6 @@ register_edge_assert_for_1 (tree name, edge e, 
> gimple_stmt_iterator bsi,
>
>   register_new_assert_for (name2, tmp, new_comp_code, cst, NULL,
>e, bsi);
> -
> - retval = true;
> }
> }
>
> @@ -5108,7 +5097,6 @@ register_edge_assert_for_1 (tree name, edge e, 
> gimple_stmt_iterator bsi,
>
>   register_new_assert_for (name2, tmp, new_comp_code, new_val,
>NULL, e, bsi);
> - retval = true;
> }
> }
>
> @@ -5129,8 +5117,7 @@ register_edge_assert_for_1 (tree name, edge e, 
> gimple_stmt_iterator bsi,
>   && TREE_CODE (TREE_TYPE (val)) == INTEGER_TYPE
>   && TYPE_UNSIGNED (TREE_TYPE (val))
>   && TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs1 (def_stmt)))
> -> prec
> - && !retval))
> +> prec))
> {
>   name2 = gimple_assign_rhs1 (def_stmt);
>   if (rhs_code == BIT_AND_EXPR)
> @@ -5358,7 +5345,6 @@ register_edge_assert_for_1 (tree name, edge e, 
> gimple_stmt_iterator bsi,
>
> register_new_assert_for (names[i], tmp, LE_EXPR,
>  new_val, NULL, e, bsi);
> -   retval = true;
>   }
> }
> }
> @@ -5370,7 +5356,7 @@ register_edge_assert_for_1 (tree name, edge e, 
> gimple_stmt_iterator bsi,
>
>def_

Re: [PATCH 0/3] Compile-time gimple checking, without typedefs

2014-05-05 Thread Richard Biener
On Fri, May 2, 2014 at 11:56 PM, David Malcolm  wrote:
> This patch series demonstrates a way of reimplementing the 89-patch series:
>   "[PATCH 00/89] Compile-time gimple-checking"
>  http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01148.html
>
> whilst avoiding introducing a pair of "gimple_foo/const_gimple_foo" typedefs
> for each subclass.
>
> It eliminates the "gimple" and "const_gimple" typedefs,
> renaming "gimple_statement_base" to "gimple_stmt", giving types:
>   "gimple_stmt *" and "const gimple_stmt *"
> thoughout the middle-end.  The rest of the gimple statement classes are
> renamed, converting the various
>   gimple_statement_with_FOO
> to:
>   gimple_stmt_with_FOO
> and the remainder:
>   gimple_statement_SOME_SUBCLASS
> to just:
>   gimple_SOME_SUBCLASS
>
> The idea is then to reimplement the earlier patch series, porting many of
> these:
>   gimple_stmt *something
> to point to some more concrete subclass; I've done this for GIMPLE_SWITCH.

Thanks for doing this.

> It requires two patches that I've already posted separately:
>
>   (A): "[PATCH] gengtype: Support explicit pointers in template arguments":
>   http://gcc.gnu.org/ml/gcc-patches/2014-05/msg3.html
>(which apparently will need reworking after wide-int is merged;
> oh well).

I'll look at this after the wide-int merge (which hopefully happens soon...)

>   (B): "[PATCH 19/89] Const-correctness of gimple_call_builtin_p":
>   http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01194.html
>(I have a separate bootstrap®rtest in progress for just this one,
> which appears to be pre-approved per Jeff's earlier comments).
>
> Of the 3 patches in the series itself:
>
>   Patch 1:
> This one is handwritten: it renames the gimple statement classes in
> their declarations, in the docs, in the selftests and renames explicit
> uses of *subclasses*.
>
>   Patch 2:
>  This one is autogenerated: it does the mass conversion of "gimple" to
>  "gimple_stmt *" and "const_gimple" to "const gimple_stmt *" throughout
>  the code.
>
>  The conversion script is at:
>
> https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/rename_gimple.py
>
>  It's not a real C++ parser, but it has some smarts for handling
>  awkward cases like:
> gimple def0, def2;
>  which must become:
> gimple_stmt *def0, *def2;
>^ note the second '*' character.
>
>  You can see the selftests for the conversion script at:
>
> https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/test_rename_gimple.py
>
>   Patch 3:
> This one is a port of the previously-posted:
>   "[PATCH 02/89] Introduce gimple_switch and use it in various places"
>  http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01154.html
> to the new approach.  As per an earlier revision:
>   http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01549.html
> it *eliminates* the
>   stmt->as_a_gimple_switch ()
> and
>   stmt->dyn_cast_gimple_switch ()
> casting methods from the base class in favor of direct usage of is-a.h:
>   as_a  (stmt)
> and
>   dyn_cast  (stmt)
>
> This version of the patch also eliminates the
>gimple_switch / const_gimple_switch
> typedefs from the initial patch series in favor of "gimple_switch"
> being the name of the subclass.

I wonder if, when we intoduce a gimple namespace, we can write

 dyn_cast  (stmt)

by means of ADL?  Thus, imagine just doing

namespace gimple {

 ... gimple.h contents ...
}

using gimple::gimple;
... a few more? ...

typedef gimple::gimple gimple;

And then

  gimple *stmt;
  gimple::switch *s = dyn_cast  (stmt);

(yeah, awkward 'switch' reserved keyword as you say below ...)

Thus, for all the new explicit sub-types you introduce use the namespace
syntax.  Can ADL work in our favors here with the way dyn_cast and
friends work?  Thus, do we really need to write dyn_cast
 (stmt)?

No, I'm not asking you to do that now (but a limited form of a gimple namespace
would be nice to have - using the "old-style" 'gimple' type avoids changing too
much code I suppose).

Just sth that you don't run out of work ;)

> Hopefully porting this first patch in the series gives a sense of what
> the end-result would look like.
>
> There are a few possible variations on the names we could pick in this
> approach.
>
> I deliberately renamed "gimple" to "gimple_stmt" since IIRC Andrew MacLeod
> had suggested something like this, for the potential of making "gimple"
> be a namespace. That said, using namespaces may be controversial, and
> would need further gengtype support, which may be tricky (i.e. fully teach
> gengtype about C++ namespaces, which may be a gengtype hack too far).
> [I'd much prefer to avoid C++ namespaces in the core code, mostly because
> of gengtype].
>
> Also, AIUI, Andrew is looking at introducing concepts of gimple types and
> gimple expr

[PATCH] Update libstdc++ baseline symbols for ia64

2014-05-05 Thread Andreas Schwab
Committed to trunk.

Andreas.

* config/abi/post/ia64-linux-gnu/baseline_symbols.txt
(CXXABI_1.3.9): Remove __float128 symbols.

Index: config/abi/post/ia64-linux-gnu/baseline_symbols.txt
===
--- config/abi/post/ia64-linux-gnu/baseline_symbols.txt (revision 210061)
+++ config/abi/post/ia64-linux-gnu/baseline_symbols.txt (working copy)
@@ -2642,7 +2642,6 @@
 OBJECT:16:_ZTId@@CXXABI_1.3
 OBJECT:16:_ZTIe@@CXXABI_1.3
 OBJECT:16:_ZTIf@@CXXABI_1.3
-OBJECT:16:_ZTIg@@CXXABI_1.3.9
 OBJECT:16:_ZTIh@@CXXABI_1.3
 OBJECT:16:_ZTIi@@CXXABI_1.3
 OBJECT:16:_ZTIj@@CXXABI_1.3
@@ -3170,7 +3169,6 @@
 OBJECT:2:_ZTSd@@CXXABI_1.3
 OBJECT:2:_ZTSe@@CXXABI_1.3
 OBJECT:2:_ZTSf@@CXXABI_1.3
-OBJECT:2:_ZTSg@@CXXABI_1.3.9
 OBJECT:2:_ZTSh@@CXXABI_1.3
 OBJECT:2:_ZTSi@@CXXABI_1.3
 OBJECT:2:_ZTSj@@CXXABI_1.3
@@ -3204,7 +3202,6 @@
 OBJECT:32:_ZTIPKd@@CXXABI_1.3
 OBJECT:32:_ZTIPKe@@CXXABI_1.3
 OBJECT:32:_ZTIPKf@@CXXABI_1.3
-OBJECT:32:_ZTIPKg@@CXXABI_1.3.9
 OBJECT:32:_ZTIPKh@@CXXABI_1.3
 OBJECT:32:_ZTIPKi@@CXXABI_1.3
 OBJECT:32:_ZTIPKj@@CXXABI_1.3
@@ -3224,7 +3221,6 @@
 OBJECT:32:_ZTIPd@@CXXABI_1.3
 OBJECT:32:_ZTIPe@@CXXABI_1.3
 OBJECT:32:_ZTIPf@@CXXABI_1.3
-OBJECT:32:_ZTIPg@@CXXABI_1.3.9
 OBJECT:32:_ZTIPh@@CXXABI_1.3
 OBJECT:32:_ZTIPi@@CXXABI_1.3
 OBJECT:32:_ZTIPj@@CXXABI_1.3
@@ -3272,7 +3268,6 @@
 OBJECT:3:_ZTSPd@@CXXABI_1.3
 OBJECT:3:_ZTSPe@@CXXABI_1.3
 OBJECT:3:_ZTSPf@@CXXABI_1.3
-OBJECT:3:_ZTSPg@@CXXABI_1.3.9
 OBJECT:3:_ZTSPh@@CXXABI_1.3
 OBJECT:3:_ZTSPi@@CXXABI_1.3
 OBJECT:3:_ZTSPj@@CXXABI_1.3
@@ -3575,7 +3570,6 @@
 OBJECT:4:_ZTSPKd@@CXXABI_1.3
 OBJECT:4:_ZTSPKe@@CXXABI_1.3
 OBJECT:4:_ZTSPKf@@CXXABI_1.3
-OBJECT:4:_ZTSPKg@@CXXABI_1.3.9
 OBJECT:4:_ZTSPKh@@CXXABI_1.3
 OBJECT:4:_ZTSPKi@@CXXABI_1.3
 OBJECT:4:_ZTSPKj@@CXXABI_1.3

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [wide-int] out-of-range set_bit in java

2014-05-05 Thread Richard Biener
On Fri, May 2, 2014 at 5:20 PM, Richard Sandiford
 wrote:
> I locally tried adding an assertion to the wide-int version of set_bit
> to make sure that the bit number was in range.  It triggers for this
> code in boehm.c:mark_reference_fields (quoting trunk version):
>
>   /* First word in object corresponds to most significant byte of
>  bitmap.
>
>  In the case of a multiple-word record, we set pointer
>  bits for all words in the record. This is conservative, but the
>  size_words != 1 case is impossible in regular java code. */
>   for (i = 0; i < size_words; ++i)
> *mask = (*mask).set_bit (ubit - count - i - 1);
>
>   if (count >= ubit - 2)
> *pointer_after_end = 1;
>
> if count + i + 1 >= ubit.
>
> AIUI the lower 2 bits are used for something else:
>
>   /* Bottom two bits for bitmap mark type are 01.  */
>   mask = mask.set_bit (0);
>   value = double_int_to_tree (value_type, mask);
>
> which is why the pointer_after_end condition checks for count >= ubit - 2.
> We never actually use the mask if pointer_after_end is true, so this
> patch puts the set_bit in an else branch.
>
> On face value it looks like the condition should be:
>
>   count + size_words > ubit - 2
>
> instead, but it'd go without saying that I don't really understand this code.
>
> Tested on x86_64-linux-gnu and powerpc64-linux-gnu for wide-int.
> OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/java/
> * boehm.c (mark_reference_fields): Don't update the mask when
> setting pointer_after_end.
>
> Index: gcc/java/boehm.c
> ===
> --- gcc/java/boehm.c2014-01-13 15:05:22.543887284 +
> +++ gcc/java/boehm.c2014-05-02 16:08:25.500760537 +0100
> @@ -101,17 +101,17 @@ mark_reference_fields (tree field,
>
>   *last_set_index = count;
>
> - /* First word in object corresponds to most significant byte of
> -bitmap.
> -
> -In the case of a multiple-word record, we set pointer
> -bits for all words in the record. This is conservative, but the
> -size_words != 1 case is impossible in regular java code. */
> - for (i = 0; i < size_words; ++i)
> -   *mask = wi::set_bit (*mask, ubit - count - i - 1);
> -
>   if (count >= ubit - 2)
> *pointer_after_end = 1;
> + else
> +   /* First word in object corresponds to most significant byte of
> +  bitmap.
> +
> +  In the case of a multiple-word record, we set pointer
> +  bits for all words in the record. This is conservative, but the
> +  size_words != 1 case is impossible in regular java code. */
> +   for (i = 0; i < size_words; ++i)
> + *mask = wi::set_bit (*mask, ubit - count - i - 1);
>
>   /* If we saw a non-reference field earlier, then we can't
>  use the count representation.  We keep track of that in
>


Re: [AArch64] Fix integer vabs intrinsics

2014-05-05 Thread Richard Biener
On Fri, May 2, 2014 at 12:39 PM, Richard Earnshaw  wrote:
> On 02/05/14 11:28, James Greenhalgh wrote:
>> On Fri, May 02, 2014 at 10:29:06AM +0100, pins...@gmail.com wrote:
>>>
>>>
 On May 2, 2014, at 2:21 AM, James Greenhalgh  
 wrote:

> On Fri, May 02, 2014 at 10:00:15AM +0100, Andrew Pinski wrote:
> On Fri, May 2, 2014 at 1:48 AM, James Greenhalgh
>  wrote:
>>
>> Hi,
>>
>> Unlike the mid-end's concept of an ABS_EXPR, which treats overflow as
>> undefined/impossible, the neon intrinsics vabs intrinsics should behave 
>> as
>> the hardware. That is to say, the pseudo-code sequence:
>
>
> Only for signed integer types.  You should be able to use an unsigned
> integer type here instead.

 If anything, I think that puts us in a worse position.
>>>
>>> Not if you cast it back.
>>>
>>>
 The issue that
 inspires this patch is that GCC will happily fold:

  t1 = ABS_EXPR (x)
  t2 = GE_EXPR (t1, 0)

 to

  t2 = TRUE

 Surely an unsigned integer type is going to suffer the same fate? 
 Certainly I
 can imagine somewhere in the compiler there being a fold path for:
>>>
>>> Yes but if add a cast from the unsigned type to the signed type gcc does not
>>> optimize that. If it does it is a bug since the overflow is defined there.
>>
>> I'm not sure I understand, are you saying I want to fold to:
>>
>>   t1 = VIEW_CONVERT_EXPR (x, unsigned)
>>   t2 = ABS_EXPR (t1)
>>   t3 = VIEW_CONVERT_EXPR (t2, signed)
>>
>> Surely ABS_EXPR (unsigned) is a nop, and the two VIEW_CONVERTs cancel each
>> other out leading to an overall NOP? It might just be Friday morning and a
>> lack of coffee talking, but I think I need you to spell this one out to
>> me in big letters!
>>
>
> I agree.  I think what you need is a type widening so that you get
>
> t1 = VEC_WIDEN (x)
> t2 = ABS_EXPR (t1)
> t3 = VEC_NARROW (t2)
>
> This then guarantees that the ABS expression cannot be undefined.  I'm
> less sure, however about the narrow causing a change in 'sign'.  Has it
> just punted the problem?  Maybe you need

Another option is to allow ABS_EXPR to have a TYPE_UNSIGNED
result type, thus do abs(int) -> unsigned (what we have as absu_hwi).
That is, have an ABS_EXPR that doesn't have the undefined issue
(at expense of optimization in case the result is immediately casted
back to signed)

Richard.

>
> t1 = VEC_WIDEN (x)
> t2 = ABS_EXPR (t1)
> t3 = VIEW_CONVERT_EXPR (x, unsigned)
> t4 = VEC_NARROW (t3)
> t5 = VIEW_CONVERT_EXPR (t4, signed)
>
> !!!
>
> How you capture this into RTL during expand, though, is another thing.
>
> R.
>

  (unsigned >= 0) == TRUE

>>
>>  a = vabs_s8 (vdup_n_s8 (-128));
>>  assert (a >= 0);
>>
>> does not hold. As in hardware
>>
>>  abs (-128) == -128
>>
>> Folding vabs intrinsics to an ABS_EXPR is thus a mistake, and we should 
>> avoid
>> it. In fact, we have to be even more careful than that, and keep the 
>> integer
>> vabs intrinsics as an unspec in the back end.
>
> No it is not.  The mistake is to use signed integer types here.  Just
> add a conversion to an unsigned integer vector and it will work
> correctly.
> In fact the ABS rtl code is not undefined for the overflow.

 Here we are covering ourselves against a seperate issue. For 
 auto-vectorized
 code we want the SABD combine patterns to kick in whenever sensible. For
 intrinsics code, in the case where vsub_s8 (x, y) would cause an underflow:

  vabs_s8 (vsub_s8 (x, y)) != vabd_s8 (x, y)

 So in this case, the combine would be erroneous. Likewise SABA.
>>>
>>> This sounds like it would problematic for unsigned types  and not just for
>>> vabs_s8 with vsub_s8. So I think you should be using unspec for vabd_s8
>>> instead. Since in rtl overflow and underflow is defined to be wrapping.
>>
>> There are no vabs_u8/vabd_u8 so I don't see how we can reach this point
>> with unsigned types. Further, I have never thought of RTL having signed
>> and unsigned types, just a bag of bits. We'll want to use unspec for the
>> intrinsic version of vabd_s8 - but we'll want to specify the
>>
>>   (abs (minus (reg) (reg)))
>>
>> behaviour so that auto-vectorized code can pick it up.
>>
>> So in the end we'll have these patterns:
>>
>>   (abs
>> (abs (reg)))
>>
>>   (intrinsic_abs
>> (unspec [(reg)] UNSPEC_ABS))
>>
>>   (abd
>> (abs (minus (reg) (reg
>>
>>   (intrinsic_abd
>> (unspec [(reg) (reg)] UNSPEC_ABD))
>>
>>   (aba
>> (plus (abs (minus (reg) (reg))) (reg)))
>>
>>   (intrinsic_aba
>> (plus (unspec [(reg) (reg)] UNSPEC_ABD) (reg)))
>>
>> which should give us reasonable auto-vectorized code without triggering any
>> of the issues mapping the semantics of the instructions to intrinsics.
>>
>> Thanks,
>> James
>>
>>>
>>> Thanks,
>>> Andrew Pinski
>>>

 Thanks,
 James
>>>
>>
>
>


Re: [PATCH] Remove "keep_aligning" from get_inner_reference

2014-05-05 Thread Richard Biener
On Fri, May 2, 2014 at 8:18 AM, Jeff Law  wrote:
> On 04/22/14 03:25, Eric Botcazou wrote:
>>>
>>> Sure, and thanks again for your help.
>>
>>
>> Thanks!
>>
>>> I was not able to find any difference on the generated code with
>>> or without that patch.
>>
>>
>> Yes, my gut feeling is that TYPE_ALIGN_OK is really obsolete now.  It is
>> set
>> in a single place in the compiler
>> (gcc-interface/decl.c:gnat_to_gnu_entity):
>>
>>/* Tell the middle-end that objects of tagged types are guaranteed
>> to
>>  be properly aligned.  This is necessary because conversions to
>> the
>>  class-wide type are translated into conversions to the root type,
>>  which can be less aligned than some of its derived types.  */
>>if (Is_Tagged_Type (gnat_entity)
>>   || Is_Class_Wide_Equivalent_Type (gnat_entity))
>> TYPE_ALIGN_OK (gnu_type) = 1;
>>
>> but we changed the way these conversions are done some time ago.
>
> So does this remove the last concern around Bernd's patch?

And can we remove TYPE_ALIGN_OK as followup?  (ISTR it's used
by obj-c/c++ as well, but I can't find such use)

Thanks,
Richard.

> Jeff
>


Re: RFA: Fix possible typo in lto-cgraph.c

2014-05-05 Thread Richard Biener
On Thu, May 1, 2014 at 4:50 PM, Richard Sandiford
 wrote:
> bootstrap-asan failed on wide-int from what looks like a typo in
> compute_ltrans_boundary.  The first loop uses the function-wide "node"
> variable while the second loop uses a local "vnode" variable.  The problem
> was that the second loop also had a reference to "node".
>
> The patch below gets me past the boostrap failure on x86_64-linux-gnu.
> OK to install?

Indeed looks like a typo.  Thus, ok.

Note that a more convincing thing than regular bootstrap/test is to do
a simple LTO bootstrap (--with-build-config=bootstrap-lto --enable-languages=c).
That excercises the LTO machinery a lot more than the few testcases we have.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> * lto-cgraph.c (compute_ltrans_boundary): Make node variables local
> to their respective blocks.  Fix inadvertent use of "node".
>
> Index: gcc/lto-cgraph.c
> ===
> --- gcc/lto-cgraph.c2014-04-28 15:33:50.239606407 +0100
> +++ gcc/lto-cgraph.c2014-05-01 15:39:42.317120147 +0100
> @@ -770,7 +770,6 @@ add_references (lto_symtab_encoder_t enc
>  lto_symtab_encoder_t
>  compute_ltrans_boundary (lto_symtab_encoder_t in_encoder)
>  {
> -  struct cgraph_node *node;
>struct cgraph_edge *edge;
>int i;
>lto_symtab_encoder_t encoder;
> @@ -785,7 +784,7 @@ compute_ltrans_boundary (lto_symtab_enco
>for (lsei = lsei_start_function_in_partition (in_encoder);
> !lsei_end_p (lsei); lsei_next_function_in_partition (&lsei))
>  {
> -  node = lsei_cgraph_node (lsei);
> +  struct cgraph_node *node = lsei_cgraph_node (lsei);
>add_node_to (encoder, node, true);
>lto_set_symtab_encoder_in_partition (encoder, node);
>add_references (encoder, &node->ref_list);
> @@ -809,7 +808,7 @@ compute_ltrans_boundary (lto_symtab_enco
>if (DECL_ABSTRACT_ORIGIN (vnode->decl))
> {
>   varpool_node *origin_node
> - = varpool_get_node (DECL_ABSTRACT_ORIGIN (node->decl));
> +   = varpool_get_node (DECL_ABSTRACT_ORIGIN (vnode->decl));
>   lto_set_symtab_encoder_in_partition (encoder, origin_node);
> }
>  }
> @@ -836,7 +835,7 @@ compute_ltrans_boundary (lto_symtab_enco
>for (lsei = lsei_start_function_in_partition (encoder);
> !lsei_end_p (lsei); lsei_next_function_in_partition (&lsei))
>  {
> -  node = lsei_cgraph_node (lsei);
> +  struct cgraph_node *node = lsei_cgraph_node (lsei);
>for (edge = node->callees; edge; edge = edge->next_callee)
> {
>   struct cgraph_node *callee = edge->callee;
>


Re: Changes for if-convert to recognize simple conditional reduction.

2014-05-05 Thread Richard Biener
On Wed, Apr 30, 2014 at 5:50 PM, Yuri Rumyantsev  wrote:
> Thanks a lot Richard for you review.
> I did all proposed changes, checked that bootstrap and regression
> testing did not show new failures.
> Updated patch is attached.

As said, this is ok for checkin.

Thanks,
Richard.

> Best regards.
> Yuri.
>
> 2014-04-30 16:40 GMT+04:00 Richard Biener :
>> On Tue, Apr 29, 2014 at 4:29 PM, Yuri Rumyantsev  wrote:
>>> 2014-04-28 16:16 GMT+04:00 Richard Biener :
 On Thu, Apr 17, 2014 at 3:09 PM, Yuri Rumyantsev  
 wrote:
> Hi All,
>
> We implemented enhancement for if-convert phase to recognize the
> simplest conditional reduction and to transform it vectorizable form,
> e.g. statement
> if (A[i] != 0) num+= 1; will be recognized.
> A new test-case is also provided.
>
> Bootstrapping and regression testing did not show any new failures.

 Clever.  Can you add a testcase with a non-constant but invariant
 reduction value and one with a variable reduction value as well?

>>> [Yuri]
>>> I added another testcase to demonstrate ability of new algorithm, i.e.
>>> it transforms   if (a[i] != 0)   sum += a[i];
>>> to unconditional form without check on zero. Note also that any checks
>>> on non-reduction operand were deleted.
>>>
 +  if (!(is_cond_scalar_reduction (arg_0, &reduc, &op0, &op1)
 +   || is_cond_scalar_reduction (arg_1, &reduc, &op0, &op1)))

 Actually one of the args should be defined by a PHI node in the
 loop header and the PHI result should be the PHI arg on the
 latch edge, so I'd pass both PHI args to the predicate and do
 the decision on what the reduction op is there (you do that
 anyway).  The pattern matching is somewhat awkward

>>> [Yuri]
>>> I changed prototype of 'is_cond_scalar_reduction'  and now we have
>>> only one call:
>>> +  if (!is_cond_scalar_reduction (phi, &reduc, &op0, &op1))
>>>
 +  /* Consider only conditional reduction.  */
 +  bb = gimple_bb (stmt);
 +  if (!bb_has_predicate (bb))
 +return false;
 +  if (is_true_predicate (bb_predicate (bb)))
 +return false;

 should be replaced by matching the PHI structure

 loop-header:
   reduc_1 = PHI <..., reduc_2>
   ...
   if (..)
 reduc_3 = ...
   reduc_2 = PHI 

>>> [Yuri]
>>>In fact, I re-wrote this function completely as you proposed using
>>> PHI structure matching.
>>>
 +  lhs = gimple_assign_lhs (stmt);
 +  if (TREE_CODE (lhs) != SSA_NAME)
 +return false;

 always true, in fact lhs == arg.

>>> [Yuri]
>>> Fixed.
>>>
 +  if (SSA_NAME_VAR (lhs) == NULL)
 +return false;

>>> [Yuri]
>>> Deleted.
 no need to check that (or later verify SSA_NAME_VAR equivalency), not
 sure why you think you need that.

 +  if (!single_imm_use (lhs, &use, &use_stmt))
 +return false;
 +  if (gimple_code (use_stmt) != GIMPLE_PHI)
 +return false;

 checking has_single_use (arg) is enough.  The above is error-prone
 wrt debug statements.

>>> [Yuri] Only proposed check is used:
>>> +  if (!has_single_use (lhs))
>>> +return false;
>>>
 +  if (reduction_op == PLUS_EXPR &&
 +  TREE_CODE (r_op2) == SSA_NAME)

 && goes to the next line

>>> [Yuri]
>>> Fixed.
>>>
 +  if (TREE_CODE (r_op2) != INTEGER_CST && TREE_CODE (r_op2) != REAL_CST)
 +return false;

 any reason for this check?  The vectorizer can cope with
 loop invariant non-constant values as well (at least).

>>> [Yuri]
>>> This checks were deleted, i.e. any operand is acceptable.
>>>
 +  /* Right operand is constant, check that left operand is equal to lhs.  
 */
 +  if (SSA_NAME_VAR (lhs) !=  SSA_NAME_VAR (r_op1))
 +return false;

 see above - that looks weird.

>>> [Yuri]
>>> This code was deleted.
 Note that I think you may introduce undefined overflow in
 unconditionally executing the increment.  So you need to
 make sure to re-write the increment in unsigned arithmetic
 (for integral types, that is).
>>> [Yuri]
>>> I did not catch your point: if we have
>>> if (cond) s += val;
>>> it will be transformed to
>>> s += (cond? val: 0)
>>> which looks like completely equivalent to original stmt.
>>
>> Ah indeed.
>>

 Thanks,
 Richard.



> Is it OK for trunk?
>
> gcc/ChangeLog:
> 2014-04-17  Yuri Rumyantsev  
>
> * tree-if-conv.c (is_cond_scalar_reduction): New function.
> (convert_scalar_cond_reduction): Likewise.
> (predicate_scalar_phi): Add recognition and transformation
> of simple conditioanl reduction to be vectorizable.
>
> gcc/testsuite/ChangeLog:
> 2014-04-17  Yuri Rumyantsev  
>
> * gcc.dg/cond-reduc.c: New test.
>>>
>>> New patch is added which includes also new test to detect
>>> vectorization of conditional reduction

Re: [PATCH] Add a couple of dialect and warning options regarding Objective-C instance variable scope

2014-05-05 Thread Dimitris Papavasiliou

Ping!

On 04/28/2014 01:35 PM, Dimitris Papavasiliou wrote:

On 04/25/2014 07:50 PM, Mike Stump wrote:

On Apr 25, 2014, at 9:34 AM, Dimitris Papavasiliou
wrote:


--Wreturn-type -Wsequence-point -Wshadow @gol
+-Wreturn-type -Wsequence-point -Wshadow -Wshadow-ivar @gol


This has to be -Wno-shadow-ivar, we document the non-default…


+@item -Wshadow-ivar @r{(Objective-C only)}


Likewise.


+ /* Check wheter the local variable hides the instance variable. */


spelling, whether...


Fixed these.


+ a = private; /* { dg-warning "hides instance variable" "" { xfail
*-*-* } } */
+ a = protected; /* { dg-warning "hides instance variable" "" { xfail
*-*-* } } */
+ a = public; /* { dg-warning "hides instance variable" "" { xfail
*-*-* } } */


No, we don’t expect failures. We makes the compiler do what we wants
or it gets the hose again. Then, we expect it to be perfect. If you
won’t want warning, and non are produces, then just remove the /* …
*/, or write /* no warning */.


I've fixed these as per your request. For the record though, this form
of test seems to be fairly common in the test suites as this output
indicates:

dimitris@debian:~/sandbox/gcc-build$ find ../gcc-source/gcc/testsuite/
-name "*.c" -o -name "*.C" -o -name "*.m" | xargs grep "xfail \*-\*-\*"
| wc -l
354

Many of these seem to be in error or warning messages which are expected
not to show up. In any case if the messages do show up they'll trigger
the excessive messages test so I suppose that's enough.


Also, synth up the ChnageLogs… :-), they are trivial enough.


Done.


And, just pop them all into one patch (cd ..; svn diff), 3 is 3x the
work for me.


Attached.


Once we resolve the 3 warning tests above, this will be ok.


Actually, there were a few more { xfail *-*-* } in the other test cases.
I've removed these as well.

Dimitris





RE: [PATCH ARM]Handle REG addressing mode in output_move_neon explicitly

2014-05-05 Thread bin.cheng


> -Original Message-
> From: Richard Earnshaw
> Sent: Thursday, May 01, 2014 10:03 PM
> To: Bin Cheng
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH ARM]Handle REG addressing mode in
> output_move_neon explicitly
> 
> On 29/04/14 04:02, bin.cheng wrote:
> > Hi,
> > Function output_move_neon now generates vld1.64 for memory ref like
> > "dx <- [r1:SI]", this is bogus because it requires at least 64-bit
> > alignment for 32-bit aligned memory ref.  It works now because GCC
> > doesn't generate such insns in the first place, but things are going
> > to change if memset/memcpy calls are inlined by using neon instructions.
> >
> 
> V[LD/ST]1.64 only need to be 64-bit aligned if strict alignment is
enabled.  We
> normally assume that not to be the case.  The exception to this is when an
theoretically, this doesn't make the problem go away, right?

> explicit alignment check is used in the address expression (the :64
suffix),
> which causes the address to be checked for strict alignment at all times.
> 
> Do you have a testcase?
I can't provide a test case without the memset inlining patch.

Thanks,
bin