Re: [PATCH v2] Destroy arguments for _Cilk_spawn calling in the child (PR 80038)

2017-05-02 Thread Andreas Schwab
This could be related to --enable-checking=release:

In file included from ../../gcc/c-family/c-common.h:26:0,
 from ../../gcc/c-family/cilk.c:28:
../../gcc/c-family/cilk.c: In function 'bool cilk_set_spawn_marker(location_t, 
tree)':
../../gcc/tree.h:901:42: error: 'tree_check2' was not declared in this scope
 CALL_EXPR, AGGR_INIT_EXPR)->base.u.bits.unsigned_flag)
  ^
../../gcc/c-family/cilk.c:113:9: note: in expansion of macro 'EXPR_CILK_SPAWN'
 EXPR_CILK_SPAWN (fcall) = 1;
 ^
../../gcc/tree.h:901:42: error: 'tree_check2' was not declared in this scope
 CALL_EXPR, AGGR_INIT_EXPR)->base.u.bits.unsigned_flag)
  ^
../../gcc/c-family/cilk.c:115:9: note: in expansion of macro 'EXPR_CILK_SPAWN'
 EXPR_CILK_SPAWN (TREE_OPERAND (fcall, 1)) = 1;
 ^

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH v2] Destroy arguments for _Cilk_spawn calling in the child (PR 80038)

2017-05-02 Thread Xi Ruoyao
On 2017-05-02 09:16 +0200, Andreas Schwab wrote:

> This could be related to --enable-checking=release:
> 
> In file included from ../../gcc/c-family/c-common.h:26:0,
>  from ../../gcc/c-family/cilk.c:28:
> ../../gcc/c-family/cilk.c: In function 'bool 
> cilk_set_spawn_marker(location_t, tree)':
> ../../gcc/tree.h:901:42: error: 'tree_check2' was not declared in this scope
>  CALL_EXPR, AGGR_INIT_EXPR)->base.u.bits.unsigned_flag)
>   ^
> ../../gcc/c-family/cilk.c:113:9: note: in expansion of macro 'EXPR_CILK_SPAWN'
>  EXPR_CILK_SPAWN (fcall) = 1;
>  ^
> ../../gcc/tree.h:901:42: error: 'tree_check2' was not declared in this scope
>  CALL_EXPR, AGGR_INIT_EXPR)->base.u.bits.unsigned_flag)
>   ^
> ../../gcc/c-family/cilk.c:115:9: note: in expansion of macro 'EXPR_CILK_SPAWN'
>  EXPR_CILK_SPAWN (TREE_OPERAND (fcall, 1)) = 1;
>  ^
> 
> Andreas.
> 

Sorry T_T.  I've made a stupid mistake in tree.h.

Let's apply the following patch, and alert the RM when backporting r247446.

2017-05-02 Xi Ruoyao 

* tree.h (EXPR_CILK_SPAWN): Use macro TREE_CHECK2 instead of
function tree_check2.
---
 gcc/tree.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/tree.h b/gcc/tree.h
index 3bca90a..fdaa7af 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -897,8 +897,8 @@ extern void omp_clause_range_check_failed (const_tree, 
const char *, int,
 /* If this is true, we should insert a __cilk_detach call just before
this function call.  */
 #define EXPR_CILK_SPAWN(NODE) \
-  (tree_check2 (NODE, __FILE__, __LINE__, __FUNCTION__, \
-CALL_EXPR, AGGR_INIT_EXPR)->base.u.bits.unsigned_flag)
+  (TREE_CHECK2 (NODE, CALL_EXPR, \
+AGGR_INIT_EXPR)->base.u.bits.unsigned_flag)
 
 /* In a RESULT_DECL, PARM_DECL and VAR_DECL, means that it is
passed by invisible reference (and the TREE_TYPE is a pointer to the true
-- 
2.7.1

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University
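
For readers following the thread, here is a minimal sketch of why the accessor has to go through the checking macro rather than call the checker function directly. Everything in it (the toy_ names, the TOY_ENABLE_CHECKING switch, the two-field node) is a hypothetical stand-in rather than GCC's real tree.h. It assumes only what the error output above shows: the checker function is not declared when tree checking is disabled, so a macro that names it directly cannot compile in such a build.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for tree codes and tree nodes (hypothetical, simplified).  */
enum toy_code { TOY_CALL_EXPR, TOY_AGGR_INIT_EXPR, TOY_VAR_DECL };

struct toy_node
{
  enum toy_code code;
  unsigned spawn_flag : 1;
};

#ifdef TOY_ENABLE_CHECKING
/* The checker function exists only in checking builds.  */
static struct toy_node *
toy_check2 (struct toy_node *n, const char *file, int line,
            enum toy_code c1, enum toy_code c2)
{
  (void) file;
  (void) line;
  assert (n->code == c1 || n->code == c2);
  return n;
}
#define TOY_CHECK2(N, C1, C2) \
  (toy_check2 ((N), __FILE__, __LINE__, (C1), (C2)))
#else
/* Without checking, the macro degrades to its node argument.  */
#define TOY_CHECK2(N, C1, C2) (N)
#endif

/* Written against the macro, so it compiles in both configurations.  */
#define TOY_EXPR_SPAWN(N) \
  (TOY_CHECK2 (N, TOY_CALL_EXPR, TOY_AGGR_INIT_EXPR)->spawn_flag)

/* Set the spawn flag on a call node and return the stored value.  */
static unsigned
toy_mark_spawn (struct toy_node *call)
{
  assert (call != NULL);
  TOY_EXPR_SPAWN (call) = 1;
  return TOY_EXPR_SPAWN (call);
}
```

Building the sketch with and without -DTOY_ENABLE_CHECKING exercises both expansions. Replacing TOY_CHECK2 with a direct toy_check2 call inside TOY_EXPR_SPAWN would break the build without the define, which is the same class of failure as the one reported.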



[Ada] Compiler loop on use of a faulty object in an address clause.

2017-05-02 Thread Arnaud Charlet
This patch fixes a loop in the compiler when an address clause is used to
specify an overlay, and the overlaid object has an illegal object declaration
in which the expression is a premature reference to the object itself.

Compiling p.ads must yield:

  p.ads:3:33: object "Nowhere" cannot be used before end of its declaration

---
with System; use System;
package P is
  Nowhere : constant Address := Nowhere;
  Thing : Integer;
  for Thing'Address use Nowhere;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-05-02  Ed Schonberg  

* sem_ch8.adb (Premature_Usage): If the premature usage of
an entity is as the expression in its own object declaration,
rewrite the reference as Any_Id to prevent cascaded errors or
compiler loops when such an entity is used in an address clause.

Index: sem_ch8.adb
===
--- sem_ch8.adb (revision 247461)
+++ sem_ch8.adb (working copy)
@@ -8562,6 +8562,14 @@
   else
  Error_Msg_N
("object& cannot be used before end of its declaration!", N);
+
+ --  If the premature reference appears as the expression in its own
+ --  declaration, rewrite it to prevent compiler loops in subsequent
+ --  uses of this mangled declaration in address clauses.
+
+ if Nkind (Parent (N)) = N_Object_Declaration then
+Set_Entity (N, Any_Id);
+ end if;
   end if;
end Premature_Usage;
 


Re: [PATCH] x86: vpermil2p{s,d} have no commutative operands

2017-05-02 Thread Jan Beulich
>>> On 01.05.17 at 11:09,  wrote:
> On Fri, Apr 28, 2017 at 4:51 PM, Jan Beulich  wrote:
>> While either of the last two operands can be in memory, they can't be
>> swapped.
>>
>> gcc/
>> 2017-04-28  Jan Beulich  
>>
>> * config/i386/sse.md (xop_vpermil23): Use alternatives.
> 
> Please write a more descriptive ChangeLog entry, e.g. "Do not allow
> operand swapping, add (x,x,m,x,n) alternative."
> 
> OK with above change for mainline and release branches.

Considering the most recent status announcement I understand
that I should not apply this to branches/gcc-7-branch until after
7.1 went out. Please correct me if that's wrong.

I'm also not sure whether with your reply you also meant to
cover branches/gcc-6-branch (and possibly even older, albeit
personally I don't care about 5.x anymore).

Jan



[Ada] Spurious error on aspect of a discriminated protected type

2017-05-02 Thread Arnaud Charlet
This patch fixes a spurious conformance error on the occurrence of the
discriminant of a protected type in the expression for an aspect of the type,
when the type and its body appear within a subprogram body.
The check that the expression has the same visibility at the freeze point of
the type and at the end of the current declarative list may have to examine
two different entities which result from analysis and expansion steps at
the freeze point and after analysis of the body and construction of the
corresponding protected subprograms.

The following must compile quietly:

---
with System;
procedure Aspect_Bug is
   subtype Data is Integer range 1 .. 10;

   protected type Event (Ceiling : System.Priority)
      with Priority => Ceiling   --  Ceiling priority defined for each object
   is
      entry Wait (D : out Data);
      procedure Signal (D : in Data);
   private
      Current   : Data;  --  Event data declaration
      Signalled : Boolean := False;
   end Event;

   protected body Event is
      entry Wait (D : out Data) when Signalled is
      begin
         D := Current;
         Signalled := False;
      end Wait;

      procedure Signal (D : in Data) is
      begin
         Current   := D;
         Signalled := True;
      end Signal;
   end Event;

   It : Event (15);
begin
   null;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-05-02  Ed Schonberg  

* sem_ch6.adb (Fully_Conformant_Expressions): Two entity
references are fully conformant if they are both expansions
of the discriminant of a protected type, within one of the
protected operations. One occurrence may be expanded into a
constant declaration while the other is an input parameter to
the corresponding generated subprogram.

Index: sem_ch6.adb
===
--- sem_ch6.adb (revision 247461)
+++ sem_ch6.adb (working copy)
@@ -8770,6 +8770,16 @@
 and then Ekind (Entity (E1)) = E_Discriminant
 and then Ekind (Entity (E2)) = E_In_Parameter)
 
+ --  The discriminant of a protected type is transformed into
+ --  a local constant and then into a parameter of a protected
+ --  operation.
+
+ or else (Ekind (Entity (E1)) = E_Constant
+ and then Ekind (Entity (E2)) = E_In_Parameter
+ and then Present (Discriminal_Link (Entity (E1)))
+ and then Discriminal_Link (Entity (E1)) =
+  Discriminal_Link (Entity (E2)))
+
  --  AI12-050: The loop variables of quantified expressions
  --  match if they have the same identifier, even though they
  --  are different entities.


[Ada] Compile-time warnings for uninitialized null-excluding components

2017-05-02 Thread Arnaud Charlet
This patch adds an enhancement for detecting and warning about constraint errors
in aggregate types with uninitialized null-excluding components at compile-time.
All composite types without aggregate initialization will now be recursively
checked for such null-excluding components without default initialization and
extended information about the constraint error will be shown to the user.


-- Source --


--  main.adb

with Types; use Types;

procedure Main is
   Obj_1  : Named_Ptr;   --  OK
   pragma Unused (Obj_1);

   Obj_2  : Named_NE_Ptr;   --  ERROR
   pragma Unused (Obj_2);

   Obj_3  : Anon_Array;  --  OK
   Obj_4  : Anon_NE_Array;   --  ERROR
   Obj_5  : Named_Array; --  OK
   Obj_6  : Named_NE_Array;  --  ERROR
   Obj_7  : Named_Inc;   --  OK
   pragma Unused (Obj_7);

   Obj_8  : Named_NE_Inc;   --  ERROR
   pragma Unused (Obj_8);

   Obj_9  : Named_Priv;  --  OK
   Obj_10 : Named_NE_Priv ;  --  ERROR
   Obj_11 : Priv_1;  --  OK
   pragma Unused (Obj_11);

   Obj_12 : Priv_2;  --  OK
   Obj_13 : Priv_3;  --  ERROR
   Obj_14 : Priv_4;  --  OK
   Obj_15 : Priv_5;  --  ERROR
   Obj_16 : Priv_6;  --  OK
   Obj_17 : Priv_7;  --  ERROR
   Obj_18 : Priv_8;  --  ERROR
   Obj_19 : Priv_9;  --  ERROR
   Obj_20 : Priv_10; --  ERROR
   Obj_21 : Prot_1;  --  OK
   Obj_22 : Prot_2;  --  ERROR
   Obj_23 : Prot_3;  --  ERROR
   Obj_24 : Prot_4;  --  ERROR
   Obj_25 : Prot_5;  --  ERROR
   Obj_26 : Rec_1;   --  ERROR
   Obj_27 : Rec_2;   --  ERROR
   Obj_28 : Rec_3;   --  ERROR
   Obj_29 : Rec_4;   --  ERROR
   Obj_30 : Rec_5;   --  ERROR
   Obj_31 : Rec_6;   --  ERROR
   Obj_32 : Rec_7;   --  ERROR
   Obj_33 : Rec_8;   --  ERROR
   Obj_34 : Rec_9;   --  OK
   Obj_35 : Rec_10;  --  ERROR
   Obj_36 : Rec_11;  --  ERROR
   Obj_37 : Rec_12;  --  ERROR
   Obj_38 : Rec_13;  --  ERROR
   Obj_39 : Tag_1;   --  ERROR
   Obj_40 : Tag_2;   --  ERROR
   Obj_41 : Tag_3;   --  ERROR
   Obj_42 : Tag_4;   --  ERROR
   Obj_43 : Task_1;  --  OK
   Obj_44 : Named_Rec_Array; --  ERROR
   Obj_45 : Named_NE_Array_Array; -- ERROR
   Obj_46 : Rec_14;   --  ERROR
   Obj_47 : array (1 .. 2) of Rec_14; --  ERROR
begin
   null;
end Main;

--  types.ads

package Types is

   --  Composite  - array [sub]type, concurrent, incomplete, private, record,
   --               string literal subtype

   --  Concurrent - protected [sub]type, task [sub]type
   --  Incomplete - incomplete [sub]type
   --  Private    - [limited] private [sub]type, record [sub]type with private
   --  Record     - class-wide [sub]type, record [sub]type [with private]

   --
   -- Simple types --
   --

   --  Access

   type Named_Ptr is access Integer;
   type Named_NE_Ptr is not null access Integer;

   --  Arrays

   --type Rec_4;
   type Anon_Array  is array (1 .. 2) of access Integer;
   type Anon_NE_Array   is array (1 .. 2) of not null access Integer;
   --type Named_Rec_Array is array (1 .. 2) of Rec_4;
   type Named_Array is array (1 .. 2) of Named_Ptr;
   type Named_NE_Array  is array (1 .. 2) of Named_NE_Ptr;

   --  Incomplete

   type Named_Inc;
   type Named_NE_Inc;

   type Named_Inc is access Integer;
   type Named_NE_Inc is not null access Integer;

   --  Private

   type Named_Priv is private;
   type Named_NE_Priv is private;

   ---
   -- Complex types --
   ---

   --  Private

   type Priv_1 is private;
   type Priv_2 is private;
   type Priv_3 is private;
   type Priv_4 is private;
   type Priv_5 is private;
   type Priv_6 is private;
   type Priv_7 is private;
   type Priv_8 is private;
   type Priv_9 is private;
   type Priv_10 is limited private;

   --  Protected

   protected type Prot_1 is
   end Prot_1;

   protected type Prot_2 is
   private
  Comp_1 : Named_Ptr;
  Comp_A : Named_NE_Ptr;
   end Prot_2;

   protected type Prot_3 is
   private
  Comp_1 : Anon_Array;
  Comp_2 : Anon_NE_Array;
   end Prot_3;

   protected type Prot_4 is
   private
  Comp_1 : Named_Array;
  Comp_2 : Named_NE_Ptr;
   end Prot_4;

   protected type Prot_5 is
   private
  Comp_1 : Named_Priv;
  Comp_2 : Named_NE_Priv;
   end Prot_5;

   --  Record

   type Rec_1 is record
  Comp_1 : Named_Ptr;
  Comp_2 : Named_NE_Ptr;
   end record;

   type Rec_2 is record
  Comp_1 : Anon_Array;
  Comp_2 : Anon_NE_Array;
   end record;

   type Rec_3 is record
  Comp_1 : Named_Array;
  Comp_2 : Named_NE_Ptr;
   end record;

   type Rec_4 is record
  Comp_1 : Named_Priv;
  Comp_2 : Named_NE_Priv;
   end record;
   type Named_Rec_Array is array (1 .. 2) of Rec_4;

   type Rec_5 is record
  Comp : Rec_1;
   end record;

 

[Ada] Optimization of fixed/fixed operations with compatible 'smalls.

2017-05-02 Thread Arnaud Charlet
If the Small values of the fixed-point types involved in a division operation
have common factors, it is possible to simplify the resulting expression,
which will involve numerators and denominators of the corresponding Small
values of the types, as those values are integer literals.

The following must execute quietly:

   gcc -c -gnatDG ec.adb
   grep "30 / 30" ec.adb.dg


with S;
with Interfaces; use Interfaces;

procedure Ec is
   Signal_Denominator : constant := 400;
   type Signal_Type is delta 1.0 / Signal_Denominator
 range -65_536.0 / Signal_Denominator .. 65_535.0 / Signal_Denominator;
   for Signal_Type'Small use 1.0 / Signal_Denominator;

   Delay_30 : constant := 30;

   type Filter_30_Input_Type is delta 1.0 / Signal_Denominator
 range 0.0 .. Signal_Type'Last;
   for Filter_30_Input_Type'Small use 1.0 / Signal_Denominator;
   type Filter_30_Output_Type is delta 1.0 / Signal_Denominator / Delay_30 
 range 0.0 .. Signal_Type'Last;
   for Filter_30_Output_Type'Small use 1.0 / Signal_Denominator / Delay_30;

   type Sum_30_Index_Type is mod Delay_30;

   package Sum_30_Filter is new
 S (Input_Type => Filter_30_Input_Type,
 Output_Type => Filter_30_Output_Type,
 Filter_Index_Type => Sum_30_Index_Type);
begin
   null;
end Ec;
---
generic
   type Input_Type is delta <>;
   type Output_Type is delta <>;
   type Filter_Index_Type is mod <>;

package S is
   type Filter_Type is private;

   function Process_Sample (This   : in out Filter_Type;
Sample : in Input_Type)
   return Output_Type;

private
   type Sum_Filter_Array_Type is array (Filter_Index_Type) of Input_Type;
   type Filter_Type is record
  Samples : Sum_Filter_Array_Type;
  Index : Filter_Index_Type;
  Last_Output : Output_Type;
   end record;
end S;
---
package body S is

   function Process_Sample (This   : in out Filter_Type;
Sample : in Input_Type)
   return Output_Type is
  --  Initialize output as y(n-1).
  Output : Output_Type := This.Last_Output;

  type Filter_Delay_Type is delta 1.0 range 1.0 .. 65_535.0;
  D : constant Filter_Delay_Type :=
Filter_Delay_Type (Filter_Index_Type'Last -
 Filter_Index_Type'First) + 1.0;
   begin
  --  Compute y(n) = y(n-1) + (x(n) - x(n-D)) / D, where D = filter delay.
  Output := Output + (Sample - This.Samples (This.Index)) / D;

  return Output;
   end Process_Sample;
end S;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-05-02  Ed Schonberg  

* exp_fixd.adb (Expand_Divide_Fixed_By_Fixed_Giving_Fixed):
Simplify the expression for a fixed-fixed division to remove
divisions by constants whenever possible, as an optimization
for restricted targets.

Index: exp_fixd.adb
===
--- exp_fixd.adb(revision 247461)
+++ exp_fixd.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2017, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -2008,6 +2008,31 @@
 
   else
  Do_Divide_Fixed_Fixed (N);
+
+ --  A focused optimization: if after constant folding the
+ --  expression is of the form:  T ((Exp * D) / D), where D is
+ --  a static constant, return  T (Exp). This form will show up
+ --  when D is the denominator of the static expression for the
+ --  'small of fixed-point types involved. This transformation
+ --  removes a division that may be expensive on some targets.
+
+ if Nkind (N) = N_Type_Conversion
+   and then Nkind (Expression (N)) = N_Op_Divide
+ then
+declare
+   Num : constant Node_Id := Left_Opnd  (Expression (N));
+   Den : constant Node_Id := Right_Opnd (Expression (N));
+
+begin
+   if Nkind (Den) = N_Integer_Literal
+ and then Nkind (Num) = N_Op_Multiply
+ and then Nkind (Right_Opnd (Num)) = N_Integer_Literal
+ and then Intval (Den) = Intval (Right_Opnd (Num))
+   then
+  Rewrite (Expression (N), Left_Opnd (Num));
+   end if;
+end;
+ end if;
   end if;
end Expand_Divide_Fixed_By_Fixed_Giving_Fixed;
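
As an illustration of the shape test in the hunk above, the following sketch applies the same rewrite, (Exp * D) / D to Exp for identical integer literals D, to a miniature expression tree. The struct, the K_ kinds and the function name are hypothetical and only loosely mirror Nkind, Left_Opnd, Right_Opnd and Intval; this is not how GNAT represents nodes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature expression tree, not GNAT's node representation.  */
enum kind { K_VAR, K_LIT, K_MUL, K_DIV };

struct expr
{
  enum kind kind;
  long value;                 /* meaningful only for K_LIT */
  struct expr *left, *right;  /* meaningful only for K_MUL and K_DIV */
};

/* Return X if E has the shape (X * D) / D with two integer literals of
   equal value; otherwise return E unchanged.  */
static struct expr *
simplify_mul_div (struct expr *e)
{
  assert (e != NULL);
  if (e->kind == K_DIV
      && e->right->kind == K_LIT
      && e->left->kind == K_MUL
      && e->left->right->kind == K_LIT
      && e->left->right->value == e->right->value)
    return e->left->left;
  return e;
}
```

Like the patch, the sketch matches only the literal on the right-hand side of the multiplication, so an expression written as (D * Exp) / D is left alone.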
 


[Ada] Crash on extended return of indefinite object

2017-05-02 Thread Arnaud Charlet
This patch suppresses the generation of a discriminant check when the
associated type is a constrained subtype created for an unconstrained nominal
type. The discriminant check is not needed because the subtype has the correct
discriminants by construction.


-- Source --


--  types.ads

package Types is
   type Priv (<>) is tagged private;
   function Create (Val : Integer) return Priv;

private
   type Priv (Discr : Integer) is tagged null record;
end Types;

--  types.adb

package body Types is
   function Create (Val : Integer) return Priv is
   begin
  return Priv'(Discr => Val);
   end Create;
end Types;

--  main.adb

with Types; use Types;

procedure Main is
   function Create_Any return Priv'Class is
   begin
  return Result : Priv := Create (1234);
   end Create_Any;

   Obj : constant Priv'Class := Create_Any;
begin 
   null; 
end Main;

-
-- Compilation --
-

$ gcc -c main.adb

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-05-02  Hristian Kirtchev  

* checks.adb (Apply_Constraint_Check): Do not apply
a discriminant check when the associated type is a constrained
subtype created for an unconstrained nominal type.

Index: checks.adb
===
--- checks.adb  (revision 247466)
+++ checks.adb  (working copy)
@@ -1355,8 +1355,13 @@
 
 Apply_Range_Check (N, Typ);
 
+ --  Do not install a discriminant check for a constrained subtype
+ --  created for an unconstrained nominal type because the subtype
+ --  has the correct constraints by construction.
+
  elsif Has_Discriminants (Base_Type (Desig_Typ))
-and then Is_Constrained (Desig_Typ)
+   and then Is_Constrained (Desig_Typ)
+   and then not Is_Constr_Subt_For_U_Nominal (Desig_Typ)
  then
 Apply_Discriminant_Check (N, Typ);
  end if;


[Ada] Warning on library-level objects that require dynamic allocation

2017-05-02 Thread Arnaud Charlet
When restriction No_Implicit_Heap_Allocation is active, the compiler rejects
a protected type that includes private components of dynamic size. This patch
extends the corresponding warning to the declaration of discriminated objects.

Given the following gnat.adc file:

   pragma profile (Ravenscar);

compiling p.adb must yield:

p.adb:13:04: warning: in instantiation at a-cbhama.ads:448
p.adb:13:04: warning: component "TC" of non-static size will violate
restriction No_Implicit_Heap_Allocation
p.adb:19:04: violation of restriction "no_implicit_heap_allocations"
p.adb:19:04: from profile "ravenscar" at gnat.adc:14

---
with Ada.Containers.Bounded_Hashed_Maps;
with Ada.Text_IO;
with Ada.Strings;
with Ada.Strings.Hash;
--  package body Flight_Data.Hash with
-- SPARK_Mode
--  is
package body P is
   subtype GUFI is String (1 .. 36); --key
   subtype Flight_ID is Integer range 1 ..5000;  --element

   function eq (Left, Right : Flight_ID) return Boolean is (Left = Right);
   package Flight_Maps is new Ada.Containers.Bounded_Hashed_Maps
  (Key_Type=> GUFI,
   Element_Type=> Flight_Id,
   Hash=> Ada.Strings.Hash,
   Equivalent_Keys => "=");
   use Flight_Maps;
   The_Hash_Table : Map (Capacity => 2000,
 Modulus  => Flight_Maps.Default_Modulus (2000));

   procedure Go is

  Cur : Cursor;
  My_Gufi : GUFI := GUFI'(others => 'a');
   begin

  Include(The_Hash_Table, My_GUFI, 12);
  Cur := Find(The_Hash_Table, My_GUFI);
  Ada.Text_IO.Put_Line (Flight_ID'Image(Element(Cur)));
   end Go;
end P;
---
with Ada.Containers.Formal_Hashed_Maps;
with Ada.Text_IO;
with Ada.Strings;
with Ada.Strings.Hash;
package P is
   procedure Go;
end P;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-05-02  Ed Schonberg  

* exp_ch9.adb (Discriminated_Size): Moved to sem_util.
* sem_util.ads, sem_util.adb (Discriminated_Size): Predicate moved
here from exp_ch9, to recognize objects whose creation requires
dynamic allocation, so that the proper warning can be emitted
when restriction No_Implicit_Heap_Allocation is in effect.
* sem_ch3.adb (Analyze_Object_Declaration): Use Discriminated_Size
to emit proper warning when an object that requires dynamic
allocation is declared.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 247468)
+++ sem_ch3.adb (working copy)
@@ -3133,6 +3133,9 @@
 
 when N_Derived_Type_Definition =>
Derived_Type_Declaration (T, N, T /= Def_Id);
+   if Ekind (T) /= E_Void and then Has_Predicates (T) then -- 
+  Set_Has_Predicates (Def_Id);
+   end if;
 
 when N_Enumeration_Type_Definition =>
Enumeration_Type_Declaration (T, Def);
@@ -3588,6 +3591,11 @@
 
   Prev_Entity : Entity_Id := Empty;
 
+  procedure Check_Dynamic_Object (Typ : Entity_Id);
+  --  A library-level object with non-static discriminant constraints may
+  --  require dynamic allocation. The declaration is illegal if the
+  --  profile includes the restriction No_Implicit_Heap_Allocations.
+
   procedure Check_For_Null_Excluding_Components
 (Obj_Typ  : Entity_Id;
  Obj_Decl : Node_Id);
@@ -3614,6 +3622,45 @@
 
   --  Any other relevant delayed aspects on object declarations ???
 
+  procedure Check_Dynamic_Object (Typ : Entity_Id) is
+ Comp : Entity_Id;
+ Obj_Type : Entity_Id;
+
+  begin
+ Obj_Type := Typ;
+ if Is_Private_Type (Obj_Type)
+and then Present (Full_View (Obj_Type))
+ then
+Obj_Type := Full_View (Obj_Type);
+ end if;
+
+ if Known_Static_Esize (Obj_Type) then
+return;
+ end if;
+
+ if Restriction_Active (No_Implicit_Heap_Allocations)
+   and then Expander_Active
+   and then Has_Discriminants (Obj_Type)
+ then
+Comp := First_Component (Obj_Type);
+while Present (Comp) loop
+   if Known_Static_Esize (Etype (Comp)) then
+  null;
+
+   elsif not Discriminated_Size (Comp)
+ and then Comes_From_Source (Comp)
+   then
+  Error_Msg_NE ("component& of non-static size will violate "
+& "restriction No_Implicit_Heap_Allocation?", N, Comp);
+
+   elsif Is_Record_Type (Etype (Comp)) then
+  Check_Dynamic_Object (Etype (Comp));
+   end if;
+   Next_Component (Comp);
+end loop;
+ end if;
+  end Check_Dynamic_Object;
+
   -
   -- Check_For_Null_Excluding_Components --
   -
@@ -4068,6 +4115,10 @@
 Object_Definition (N));
   end if;
 
+  if Is_Li

[Ada] Ceiling priorities off by one on Linux

2017-05-02 Thread Arnaud Charlet
This patch fixes a bug in which ceiling priorities were off
by one, so that Program_Error is raised for a ceiling violation when a
task calls a protected operation and the priority of the task is equal
to the ceiling priority of the protected object. Program_Error should be
raised only if the priority of the task is greater, not greater or
equal. No test available; too nondeterministic, and requires root
privileges.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-05-02  Bob Duff  

* s-taprop-linux.adb (Prio_To_Linux_Prio): New function to correctly
compute the Linux priority from the Ada priority. Call this everywhere
required. In particular, the previous version was not doing this
computation when setting the ceiling priority in various places. It
was just converting to C.int, which results in a ceiling that is off
by 1.
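
The off-by-one can be modelled in a few lines. This is a simplified sketch under stated assumptions: the function names are made up for illustration, the only facts taken from the patch are the mapping of Ada priorities 0 .. 98 to Linux priorities 1 .. 99 and the missing mapping on the ceiling, and the comparison stands in for whatever component actually performs the ceiling check at run time.

```c
#include <assert.h>
#include <stdbool.h>

/* Map an Ada priority in 0 .. 98 to a Linux priority in 1 .. 99.  */
static int
prio_to_linux_prio (int ada_prio)
{
  assert (ada_prio >= 0 && ada_prio <= 98);
  return ada_prio + 1;
}

/* A ceiling violation occurs only when the caller is strictly above the
   ceiling; equal priorities are allowed.  */
static bool
ceiling_violation (int task_linux_prio, int ceiling_linux_prio)
{
  return task_linux_prio > ceiling_linux_prio;
}

/* Model of the old behaviour: the task priority is mapped, but the ceiling
   is passed through as a plain integer conversion.  */
static bool
violation_with_unmapped_ceiling (int task_ada_prio, int ceiling_ada_prio)
{
  return ceiling_violation (prio_to_linux_prio (task_ada_prio),
                            ceiling_ada_prio);
}

/* Model of the fixed behaviour: both sides use the same mapping.  */
static bool
violation_with_mapped_ceiling (int task_ada_prio, int ceiling_ada_prio)
{
  return ceiling_violation (prio_to_linux_prio (task_ada_prio),
                            prio_to_linux_prio (ceiling_ada_prio));
}
```

With a task at Ada priority 10 and a ceiling of 10, the unmapped model reports a violation (11 > 10) while the mapped model does not (11 > 11 is false), which matches the symptom described in the message.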

Index: s-taprop-linux.adb
===
--- s-taprop-linux.adb  (revision 247472)
+++ s-taprop-linux.adb  (working copy)
@@ -38,7 +38,7 @@
 --  Turn off polling, we do not want ATC polling to take place during tasking
 --  operations. It causes infinite loops and other problems.
 
-with Interfaces.C;
+with Interfaces.C; use Interfaces; use type Interfaces.C.int;
 
 with System.Task_Info;
 with System.Tasking.Debug;
@@ -60,7 +60,6 @@
 
use System.Tasking.Debug;
use System.Tasking;
-   use Interfaces.C;
use System.OS_Interface;
use System.Parameters;
use System.OS_Primitives;
@@ -111,14 +110,6 @@
--  Constant to indicate that the thread identifier has not yet been
--  initialized.
 
-   function geteuid return Integer;
-   pragma Import (C, geteuid, "geteuid");
-   pragma Warnings (Off, "non-static call not allowed in preelaborated unit");
-   Superuser : constant Boolean := geteuid = 0;
-   pragma Warnings (On, "non-static call not allowed in preelaborated unit");
-   --  True if we are running as 'root'. On Linux, ceiling priorities work only
-   --  in that case, so if this is False, we ignore Locking_Policy = 'C'.
-

-- Local Packages --

@@ -170,17 +161,52 @@
procedure Abort_Handler (signo : Signal);
 
function GNAT_pthread_condattr_setup
- (attr : access pthread_condattr_t) return int;
-   pragma Import (C,
- GNAT_pthread_condattr_setup, "__gnat_pthread_condattr_setup");
+ (attr : access pthread_condattr_t) return C.int;
+   pragma Import
+ (C, GNAT_pthread_condattr_setup, "__gnat_pthread_condattr_setup");
 
+   function Prio_To_Linux_Prio (Prio : Any_Priority) return C.int is
+ (C.int (Prio) + 1);
+   --  Convert Ada priority to Linux priority. Priorities are 1 .. 99 on
+   --  GNU/Linux, so we map 0 .. 98 to 1 .. 99.
+
+   function Get_Ceiling_Support return Boolean;
+   --  Get the value of the Ceiling_Support constant (see below).
+   --  ???For now, we're returning True only if running as superuser,
+   --  and ignore capabilities.
+
+   function Get_Ceiling_Support return Boolean is
+  Ceiling_Support : Boolean := False;
+   begin
+  if Locking_Policy = 'C' then
+ declare
+function geteuid return Integer;
+pragma Import (C, geteuid, "geteuid");
+Superuser : constant Boolean := geteuid = 0;
+ begin
+if Superuser then
+   Ceiling_Support := True;
+end if;
+ end;
+  end if;
+
+  return Ceiling_Support;
+   end Get_Ceiling_Support;
+
+   pragma Warnings (Off, "non-static call not allowed in preelaborated unit");
+   Ceiling_Support : constant Boolean := Get_Ceiling_Support;
+   pragma Warnings (On, "non-static call not allowed in preelaborated unit");
+   --  True if the locking policy is Ceiling_Locking, and the current process
+   --  has permission to use this policy. The process has permission if it is
+   --  running as 'root', or if the capability was set by the setcap command,
+   --  as in "sudo /sbin/setcap cap_sys_nice=ep exe_file". If it doesn't have
+   --  permission, then a request for Ceiling_Locking is ignored.
+
type RTS_Lock_Ptr is not null access all RTS_Lock;
 
-   function Init_Mutex
- (L : RTS_Lock_Ptr; Prio : Any_Priority)
- return Interfaces.C.int;
-   --  Initialize the mutex L. If the locking policy is Ceiling_Locking, then
-   --  set the ceiling to Prio.
+   function Init_Mutex (L : RTS_Lock_Ptr; Prio : Any_Priority) return C.int;
+   --  Initialize the mutex L. If Ceiling_Support is True, then set the ceiling
+   --  to Prio. Returns 0 for success, or ENOMEM for out-of-memory.
 
---
-- Abort_Handler --
@@ -190,7 +216,7 @@
   pragma Unreferenced (signo);
 
   Self_Id : constant Task_Id := Self;
-  Result  : Interfaces.C.int;
+  Result  : C.int;
   Old_Set : aliased sigset_t;
 
begin
@@ -272,30 +298,26 @@
-- Init_Mutex --

 
-   function Init_Mute

Re: [PATCH] x86: vpermil2p{s,d} have no commutative operands

2017-05-02 Thread Uros Bizjak
On Tue, May 2, 2017 at 10:26 AM, Jan Beulich  wrote:
 On 01.05.17 at 11:09,  wrote:
>> On Fri, Apr 28, 2017 at 4:51 PM, Jan Beulich  wrote:
>>> While either of the last two operands can be in memory, they can't be
>>> swapped.
>>>
>>> gcc/
>>> 2017-04-28  Jan Beulich  
>>>
>>> * config/i386/sse.md (xop_vpermil23): Use alternatives.
>>
>> Please write a more descriptive ChangeLog entry, e.g. "Do not allow
>> operand swapping, add (x,x,m,x,n) alternative."
>>
>> OK with above change for mainline and release branches.
>
> Considering the most recent status announcement I understand
> that I should not apply this to branches/gcc-7-branch until after
> 7.1 went out. Please correct me if that's wrong.

Yes, release rules still apply, so please wait until gcc-7 branch is
opened again.

> I'm also not sure whether with your reply you also meant to
> cover branches/gcc-6-branch (and possibly even older, albeit
> personally I don't care about 5.x anymore).

Your call, if you think the patch should be applied to older branches.
Otherwise, branches survived without the patch until now, so I guess
it is not that critical.

Uros.


[Ada] Move variable-length components to last position in record types

2017-05-02 Thread Arnaud Charlet
This optimizes the layout of some record types declared in the runtime, but
only in the private part of the spec or in the body, hence no API changes.

There is also a similar change to the Xr_Tabls support package.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-05-02  Eric Botcazou  

* g-forstr.ads (Data): Move Format component last.
* g-forstr.adb ("+"): Adjust for above change.
* g-rewdat.ads (Buffer): Move Buffer, Current, Pattern and Value last.
* g-sechas.ads (Context): Move Key last.
* g-socket.ads (Service_Entry_Type): Move Aliases last.
* s-fileio.adb (Temp_File_Record): Move Name last.
* s-regexp.adb (Regexp_Value): Move Case_Sensitive last.
* xr_tabls.ads (Project_File): Move Src_Dir and Obj_Dir last.
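
A rough C analogue may help readers unfamiliar with discriminated records. The message does not spell out the motivation; presumably the benefit is that fixed-size components keep offsets that do not depend on the discriminant once the variable-length ones come last. The struct and helper below are hypothetical and use a C flexible array member, which the language requires to be the last member, to show the same layout idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical analogue of a record with one length-dependent component.
   All fixed-size members come first, so their offsets are compile-time
   constants regardless of symbol_length.  */
struct declaration
{
  int is_parameter;
  char decl_type;
  size_t symbol_length;  /* plays the role of the discriminant */
  char symbol[];         /* variable-length part, placed last */
};

/* Allocate a declaration holding a copy of SYMBOL (not NUL-terminated).
   Returns NULL if allocation fails; the caller frees the result.  */
static struct declaration *
make_declaration (const char *symbol, char decl_type)
{
  size_t len = strlen (symbol);
  struct declaration *d = malloc (sizeof *d + len);

  if (d == NULL)
    return NULL;
  d->is_parameter = 0;
  d->decl_type = decl_type;
  d->symbol_length = len;
  memcpy (d->symbol, symbol, len);
  return d;
}
```

In C this ordering is mandatory, whereas in Ada it is a layout choice left to the programmer and the compiler, so the analogy only goes as far as the offsets of the fixed-size members.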

Index: xr_tabls.ads
===
--- xr_tabls.ads(revision 247461)
+++ xr_tabls.ads(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 1998-2014, Free Software Foundation, Inc. --
+--  Copyright (C) 1998-2017, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -292,12 +292,11 @@
 
 private
type Project_File (Src_Dir_Length, Obj_Dir_Length : Natural) is record
-  Src_Dir : String (1 .. Src_Dir_Length);
-  Src_Dir_Index : Integer;
-
-  Obj_Dir: String (1 .. Obj_Dir_Length);
+  Src_Dir_Index  : Integer;
   Obj_Dir_Index  : Integer;
   Last_Obj_Dir_Start : Natural;
+  Src_Dir: String (1 .. Src_Dir_Length);
+  Obj_Dir: String (1 .. Obj_Dir_Length);
end record;
 
type Project_File_Ptr is access all Project_File;
@@ -364,7 +363,6 @@
 
type Declaration_Record (Symbol_Length : Natural) is record
   Key  : Cst_String_Access;
-  Symbol   : String (1 .. Symbol_Length);
   Decl : Reference;
   Is_Parameter : Boolean := False; -- True if entity is subprog param
   Decl_Type: Character;
@@ -374,6 +372,7 @@
   Match: Boolean := False;
   Par_Symbol   : Declaration_Reference := null;
   Next : Declaration_Reference := null;
+  Symbol   : String (1 .. Symbol_Length);
end record;
--  The lists of referenced (Body_Ref, Ref_Ref and Modif_Ref) are
--  kept unsorted until the results needs to be printed. This saves
Index: g-sechas.ads
===
--- g-sechas.ads(revision 247461)
+++ g-sechas.ads(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 2009-2016, Free Software Foundation, Inc. --
+--  Copyright (C) 2009-2017, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -208,14 +208,14 @@
   --  KL is 0 for a normal hash context, > 0 for HMAC
 
   type Context (KL : Key_Length := 0) is record
- Key : Stream_Element_Array (1 .. KL);
- --  HMAC key
-
  H_State : Hash_State.State (0 .. State_Words - 1) := Initial_State;
  --  Function-specific state
 
  M_State : Message_State (Block_Length);
  --  Function-independent state (block buffer)
+
+ Key : Stream_Element_Array (1 .. KL);
+ --  HMAC key
   end record;
 
   Initial_Context : constant Context (KL => 0) := (others => <>);
Index: g-rewdat.ads
===
--- g-rewdat.ads(revision 247461)
+++ g-rewdat.ads(working copy)
@@ -5,7 +5,7 @@
 --  --
 -- S p e c  --
 --  --
---Copyright (C) 2014, Free Software Foundation, Inc.--
+--   Copyright (C) 2014-2017, Free Software Foundation, Inc.--
 --  --
 -- GNAT is free software;  you can  redistribute it  a

[Ada] Dispatching calls to renamed equality

2017-05-02 Thread Arnaud Charlet
This patch fixes a bug in the handling of primitive operations that involve
renamings of equality. The placement of the primitive in the dispatch table
depends on whether the operation overrides an existing operation, is an
explicit renaming, or is inherited by a type extension.

Executing testz88 must yield:

--- TestZ88 - Test equality for simple tagged types.
-- Simple Equality Test
-- Classwide Equality Test
-- Renamed Classwide Equality Test
--- TestZ88 Passed

---
with Text_IO; use Text_IO;
procedure TestZ88 is
   --
   -- TestZ88.Pkg - Test dispatching equality.
   --
   -- Edit History:
   --
   --  1/14/93 - RLB - Updated to use 'With Null'.
   --  5/28/93 - RLB - Changed to 'With Null Record'.
   --  8/13/96 - RLB - Added clarifying comments.

   Passed : Boolean := True;

   package A_Pack is
  type A is tagged record
 I : Integer;
  end record;

  function My_Eq (L, R : A) return Boolean renames "=";
  -- This is a dispatching renames, as it is primitive for
  -- A.  A renames somewhere else is not dispatching.

   end A_Pack;

   package B_Pack is
  type B is new A_Pack.A with null record;
   end B_Pack;

   A_Var : A_Pack.A := (I => 10);
   B_Var : B_Pack.B := (I => 5);

begin
   Put_Line ("--- TestZ88 - Test equality for simple tagged types.");

   declare
  use A_Pack, B_Pack;
  A_Var2 : A := (I => 20);
  A_Var3 : A := (I => 20);
  B_Var2 : B := (I => 5);
   begin
  -- Test simple equality.
  Put_Line ("-- Simple Equality Test");
  if A_Var = A_Var2 then
 Put_Line ("** Simple Equality Failed (1)");
 Passed := False;
  end if;
  if A_Var /= A_Var3 then
 null;
  else
 Put_Line ("** Simple Equality Failed (2)");
 Passed := False;
  end if;
  if A_Var2 /= A_Var3 then
 Put_Line ("** Simple Equality Failed (3)");
 Passed := False;
  end if;
  if B_Var = B_Var2 then
 null;
  else
 Put_Line ("** Simple Equality Failed (4)");
 Passed := False;
  end if;
   end;

   declare
  use A_Pack, B_Pack;
  A_Var2 : A := (I => 20);
  B_Var2 : B := (I => 5);

  procedure Class_EQ
(P1, P2 : A'Class;
 Result : Boolean;
 Key: Character)
  is
  -- Compare P1 = P2; the result ought to be Result.
  -- Use Key to produce error messages.
  begin
 if P1 = P2 then
if Result then
   null;
else
   Put_Line ("** Wrong result from equality (" & Key & ')');
   Passed := False;
end if;
 else
if Result then
   Put_Line ("** Wrong result from equality (" & Key & ')');
   Passed := False;
end if;
 end if;
 -- Now, try a boolean expression:
 if (P1 /= P2) = Result then
Put_Line ("** Wrong result from inequality (" & Key & ')');
Passed := False;
 end if;
  exception
 when Constraint_Error =>
Put_Line ("** Constraint_Error raised (" & Key & ')');
Passed := False;
  end Class_EQ;

   begin
  -- Test classwide equality.
  Put_Line ("-- Classwide Equality Test");
  Class_EQ (A_Var, A_Var, True, 'A');
  Class_EQ (A_Var, A_Var2, False, 'B');
  Class_EQ (B_Var, B_Var2, True, 'C');
  Class_EQ
(A_Var,
 B_Var,
 False,
 'D'); -- Different tags always return false.
  A_Var2.I := 5;
  Class_EQ (B_Var, A_Var2, False, 'E'); -- Different tags always return
  -- false, even when the values match.
   end;

   declare
  use A_Pack, B_Pack;
  A_Var2 : A := (I => 20);
  B_Var2 : B := (I => 5);

  procedure Renamed_Class_EQ
(P1, P2  : A'Class;
 Result, Exc : Boolean;
 Key : Character)
  is
  -- Compare P1 = P2; the result ought to be Result, unless Exc
  -- is true, where it ought to raise an exception.
  -- Use Key to produce error messages.
  begin
 begin
if My_Eq
(P1,
 P2)
then -- Note this is legal because it is dispatching.
   if Result or (not Exc) then
  null;
   else
  Put_Line ("** Wrong result from equality (" & Key & ')');
  Passed := False;
   end if;
else
   if Result or Exc then
  Put_Line ("** Wrong result from equality (" & Key & ')');
  Passed := False;
   end if;
end if;
 exception
when Constraint_Error =>
   if Exc then
  null;
   else
  Put_Line ("** Constraint_Error raised (" & Key & ')');
  Passed := False;
   end if;
 end;

 begin
-- Now, try a boolean expression:
if My

[Ada] GNAT option to treat run-time exception warnings as errors

2017-05-02 Thread Arnaud Charlet
This patch adds a gnatmake compilation flag that treats certain warnings as
errors, similar to -gnatwe. The new flag, -gnatwE, applies only to warnings
about run-time exceptions being raised, so a compile-time error is reported
only in those cases.
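
As a rough illustration of the behaviour described above, the following C++
sketch models the decision as a small stand-alone function. The names
WarningMode and counts_as_error are hypothetical and exist only for this
example; the actual change is in the Ada sources listed in the ChangeLog.

```cpp
// Hypothetical model of the -gnatwe / -gnatwE behaviour; not GNAT code.
enum class WarningMode { Normal, TreatAsError, TreatRunTimeAsError };

// Returns true when a warning should be reported as an error.
// is_run_time_exception marks warnings of the form
// "... will be raised at run time".
bool counts_as_error(WarningMode mode, bool is_run_time_exception) {
    switch (mode) {
    case WarningMode::TreatAsError:
        return true;                    // -gnatwe: every warning
    case WarningMode::TreatRunTimeAsError:
        return is_run_time_exception;   // -gnatwE: run-time exceptions only
    case WarningMode::Normal:
        return false;
    }
    return false;
}
```

Under this model, an unassigned-variable warning stays a warning with
-gnatwE, which matches the warn_only.adb output shown further down.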


-- Source --


--  runtime_error.adb

procedure Runtime_Error is
  A : array (1..3) of Integer := (others => 0);
  B : Integer;
begin
   B := A (4);
   declare
  C : Integer;
   begin
  B := C;
   end;
end;

--  warn_only.adb

procedure Warn_Only is
  A : Integer;
  B : Integer := A;
begin
   null;
end;


-- Compilation and output --


& gnatmake -f -q -gnatwE runtime_error.adb
& gnatmake -f -q runtime_error.adb
& gnatmake -f -q -gnatwe runtime_error.adb
& gnatmake -f -q -gnatwE warn_only.adb
& gnatmake -f -q warn_only.adb
& gnatmake -f -q -gnatwe warn_only.adb
runtime_error.adb:5:12: warning: value not in range of subtype of
   "Standard.Integer" defined at line 2
runtime_error.adb:5:12: "Constraint_Error" would have been raised at run time
runtime_error.adb:7:07: warning: variable "C" is read but never assigned
gnatmake: "runtime_error.adb" compilation error
runtime_error.adb:5:12: warning: value not in range of subtype of
   "Standard.Integer" defined at line 2
runtime_error.adb:5:12: warning: "Constraint_Error" will be raised at run time
runtime_error.adb:7:07: warning: variable "C" is read but never assigned
runtime_error.adb:5:12: warning: value not in range of subtype of
   "Standard.Integer" defined at line 2
runtime_error.adb:5:12: warning: "Constraint_Error" will be raised at run time
runtime_error.adb:7:07: warning: variable "C" is read but never assigned
gnatmake: "runtime_error.adb" compilation error
warn_only.adb:2:03: warning: variable "A" is read but never assigned
warn_only.adb:2:03: warning: variable "A" is read but never assigned
warn_only.adb:2:03: warning: variable "A" is read but never assigned
gnatmake: "warn_only.adb" compilation error

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-05-02  Justin Squirek  

* errout.adb (Set_Msg_Text): Add a case to switch the message
type when the character '[' is detected signifying a warning
about a run-time exception.
* opt.ads: Add a new Warning_Mode value for new switch
* switch-b.adb (Scan_Binder_Switches): Add case for the binder
to handle new warning mode
* usage.adb (Usage): Add usage entry for -gnatwE
* warnsw.adb (Set_Warning_Switch): Add case for the new switch

Index: usage.adb
===
--- usage.adb   (revision 247461)
+++ usage.adb   (working copy)
@@ -488,6 +488,7 @@
Write_Line ("etreat all warnings (but not info) as errors");
Write_Line (".e   turn on every optional info/warning " &
   "(no exceptions)");
+   Write_Line ("Etreat all run time warnings as errors");
Write_Line ("f+   turn on warnings for unreferenced formal");
Write_Line ("F*   turn off warnings for unreferenced formal");
Write_Line (".f   turn on warnings for suspicious Subp'Access");
Index: warnsw.adb
===
--- warnsw.adb  (revision 247461)
+++ warnsw.adb  (working copy)
@@ -532,6 +532,9 @@
  when 'e' =>
 Warning_Mode:= Treat_As_Error;
 
+ when 'E' =>
+Warning_Mode:= Treat_Run_Time_As_Error;
+
  when 'f' =>
 Check_Unreferenced_Formals  := True;
 
Index: errout.adb
===
--- errout.adb  (revision 247463)
+++ errout.adb  (working copy)
@@ -3097,6 +3097,17 @@
 --  '[' (will be/would have been raised at run time)
 
 when '[' =>
+
+   --  Switch the message from a warning to an error if the flag
+   --  -gnatwE is specified to treat run-time exception warnings
+   --  as errors.
+
+   if Is_Warning_Msg
+ and then Warning_Mode = Treat_Run_Time_As_Error
+   then
+  Is_Warning_Msg := False;
+   end if;
+
if Is_Warning_Msg then
   Set_Msg_Str ("will be raised at run time");
else
Index: switch-b.adb
===
--- switch-b.adb(revision 247461)
+++ switch-b.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 2001-2016, Free Software Foundation, Inc. --

[Ada] Missing error on T'Enum_Rep with no parameter

2017-05-02 Thread Arnaud Charlet
This patch fixes a bug in which the compiler fails to give an error on
T'Enum_Rep, where T is a type. If X is an object of enumeration type T,
then X'Enum_Rep and T'Enum_Rep(X) are allowed, but not T'Enum_Rep.

The following test must get an error.

enum_val_test.adb:4:21: prefix of "Enum_Rep" attribute must be discrete object

procedure Enum_Val_Test is
   type Color is (Red, Orange, Yellow);
   Current_Index : Color := Orange;
   Rep : Integer := Color'Enum_Rep; -- Illegal!
begin
   null;
end Enum_Val_Test;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-05-02  Bob Duff  

* sem_attr.adb (Attribute_Enum_Rep): Disallow T'Enum_Rep.

Index: sem_attr.adb
===
--- sem_attr.adb(revision 247479)
+++ sem_attr.adb(working copy)
@@ -3763,13 +3763,23 @@
   --
 
   when Attribute_Enum_Rep =>
+ --  T'Enum_Rep (X) case
+
  if Present (E1) then
 Check_E1;
 Check_Discrete_Type;
 Resolve (E1, P_Base_Type);
 
- elsif not Is_Discrete_Type (Etype (P)) then
-Error_Attr_P ("prefix of % attribute must be of discrete type");
+ --  X'Enum_Rep case.  X must be an object or enumeration literal, and
+ --  it must be of a discrete type.
+
+ elsif not ((Is_Object_Reference (P)
+   or else (Is_Entity_Name (P)
+  and then Ekind (Entity (P)) =
+ E_Enumeration_Literal))
+and then Is_Discrete_Type (Etype (P)))
+ then
+Error_Attr_P ("prefix of % attribute must be discrete object");
  end if;
 
  Set_Etype (N, Universal_Integer);


[Ada] Bug in handling of library-level freeze actions

2017-05-02 Thread Arnaud Charlet
This patch fixes an error in the handling of freeze actions generated for
a generic package that is a compilation unit, whose entities carry iterable
aspects.

The following must compile quietly:

---
generic
   type Data_Type (<>) is limited private;
package Data_Streams is

   type Root_Data_Stream_Type is abstract tagged limited null record;

   function Has_Data (Stream : Root_Data_Stream_Type)
 return Boolean is abstract;

   function Consume (Stream : not null access Root_Data_Stream_Type)
 return Data_Type is abstract;


   generic
  type Data_Stream_Type is new Root_Data_Stream_Type with private;
   package Add_Iteration is

  type Iterable_Data_Stream_Type is new Data_Stream_Type with private
with Iterable => (First => First, Next => Next,
  Has_Element => Has_Element, Element => Element);

  type Cursor is private;

  function First (Stream : Iterable_Data_Stream_Type) return Cursor;

  function Next (
Stream   : Iterable_Data_Stream_Type;
Position : Cursor
  ) return Cursor;

  function Has_Element (
Stream   : Iterable_Data_Stream_Type;
Position : Cursor
  ) return Boolean;

  function Element (
Stream   : Iterable_Data_Stream_Type;
Position : Cursor
  ) return Data_Type;

   private

  type Reference_Type (Stream : not null access Iterable_Data_Stream_Type)
is null record;

  type Iterable_Data_Stream_Type is new Data_Stream_Type with record
 Self : Reference_Type (Iterable_Data_Stream_Type'Access);
  end record;

 type Cursor is null record;

   end Add_Iteration;

end Data_Streams;
---
package body Data_Streams is

   package body Add_Iteration is

  function First (Stream : Iterable_Data_Stream_Type) return Cursor is
  begin
 return (null record);
  end First;
   
  function Has_Element (
Stream   : Iterable_Data_Stream_Type;
Position : Cursor
  ) return Boolean is
  begin
 return Has_Data (Stream);
  end Has_Element;
   
  function Next (
Stream   : Iterable_Data_Stream_Type;
Position : Cursor
  ) return Cursor is
  begin
 return (null record);
  end Next;
   
  function Element (
Stream   : Iterable_Data_Stream_Type;
Position : Cursor
  ) return Data_Type is
  begin
 return Consume (Stream.Self.Stream);
  end Element;

   end Add_Iteration;

end Data_Streams;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-05-02  Ed Schonberg  

* exp_util.adb (Insert_Library_Level_Action): Use proper scope
to analyze generated actions.  If the main unit is a body,
the required scope is that of the corresponding unit declaration.

Index: exp_util.adb
===
--- exp_util.adb(revision 247461)
+++ exp_util.adb(working copy)
@@ -7491,8 +7491,10 @@
   Aux : constant Node_Id := Aux_Decls_Node (Cunit (Main_Unit));
 
begin
-  Push_Scope (Cunit_Entity (Main_Unit));
-  --  ??? should this be Current_Sem_Unit instead of Main_Unit?
+  Push_Scope (Cunit_Entity (Current_Sem_Unit));
+  --  And not Main_Unit as previously. If the main unit is a body,
+  --  the scope needed to analyze the actions is the entity of the
+  --  corresponding declaration.
 
   if No (Actions (Aux)) then
  Set_Actions (Aux, New_List (N));


Re: [PATCH][GCC][AArch64] Fix subreg bug in scalar copysign

2017-05-02 Thread Tamar Christina
Ping

From: gcc-patches-ow...@gcc.gnu.org on behalf of Tamar Christina
Sent: Wednesday, March 15, 2017 4:04:35 PM
To: GCC Patches
Cc: nd; James Greenhalgh; Richard Earnshaw; Marcus Shawcroft
Subject: [PATCH][GCC][AArch64] Fix subreg bug in scalar copysign

Hi All,

This fixes a bug in the scalar version of copysign where, due to a subreg, we
were generating less-than-efficient code.

This patch replaces

  return x * __builtin_copysignf (150.0f, y);

which used to generate

adrpx1, .LC1
mov x0, 2147483648
ins v3.d[0], x0
ldr s2, [x1, #:lo12:.LC1]
bsl v3.8b, v1.8b, v2.8b
fmuls0, s0, s3
ret

.LC1:
.word   1125515264

with
mov x0, 1125515264
moviv2.2s, 0x80, lsl 24
fmovd3, x0
bit v3.8b, v1.8b, v2.8b
fmuls0, s0, s3
ret

removing the incorrect ins.
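
For readers less familiar with the bit-select instructions, here is a minimal
C++ sketch of what the sequence computes: copysign is a bitwise select between
the magnitude bits of one operand and the sign bit of the other. The helper
names (bits_of, float_of, copysign_by_mask, scaled) are invented for this
illustration; the mask 0x80000000 corresponds to the movi above, and
1125515264 is the bit pattern of 150.0f.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a float as its 32-bit IEEE-754 pattern (assumes 32-bit float).
static std::uint32_t bits_of(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

static float float_of(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Take the sign bit from y and every other bit from x.
float copysign_by_mask(float x, float y) {
    const std::uint32_t sign_mask = 0x80000000u;
    return float_of((bits_of(x) & ~sign_mask) | (bits_of(y) & sign_mask));
}

// The expression from the test case, written with the helper above.
float scaled(float x, float y) {
    return x * copysign_by_mask(150.0f, y);
}
```

This is only a model of the arithmetic, not of the code the backend emits.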

Regression tested on aarch64-none-linux-gnu with no regressions.

OK for trunk?

Thanks,
Tamar

gcc/
2017-03-15  Tamar Christina  

* config/aarch64/aarch64.md
(copysignsf3): Fix mask generation.


Re: [PATCH] PR libstdc++/80553 don't allow destroying non-destructible types

2017-05-02 Thread Jonathan Wakely

On 28/04/17 13:56 +0100, Jonathan Wakely wrote:

We optimize _Destroy and _Destroy_n to do nothing when the type has a
trivial destructor, which means we do nothing (instead of giving an
error) when trying to destroy types with deleted destructors.


I wonder if this optimisation should even exist. The compiler should
be able to optimise away a loop that just calls trivial destructors,
without help from the library.




[Ada] Reimplement layout of partially constrained derived untagged types

2017-05-02 Thread Arnaud Charlet
The layout done in gigi for partially constrained derived untagged types,
that is to say discriminated record types derived from an untagged parent
type with constraints, is done independently of that of their parent type,
which is a bit annoying since the layouts must be compatible if there is no
representation clause on the derived type.  This works so far but will break
if components are reordered in record types based on whether their length
depends on a discriminant.

This patch changes that by reusing the machinery already present in gigi
to deduce the layout of a record subtype from that of its base type and
additional constraints.

The effect is visible on the following package with -gnatd.v and -gnatR1:

package Q is

  type Base (D : Positive; B : Boolean) is record
S : String (1 .. D);
I : Integer;
  end record;

  type Derived (B : Boolean) is new Base (D => 16, B => B);

end Q;

for which a compatible layout with reordered components is produced:

for Base'Object_Size use 17179869280;
for Base'Value_Size use ??;
for Base'Alignment use 4;
for Base use record
   D at  0 range  0 .. 31;
   B at  4 range  0 ..  7;
   S at 12 range  0 .. ??;
   I at  8 range  0 .. 31;
end record;

for Derived'Size use 224;
for Derived'Alignment use 4;
for Derived use record
   B at  4 range  0 ..  7;
   D at  0 range  0 .. 31;
   B at  4 range  0 ..  7;
   S at 12 range  0 .. 127;
   I at  8 range  0 .. 31;
end record;

which guarantees that the following procedure runs correctly:

with Q; use Q;

procedure P is

  procedure Inner (B : Base) is
  begin
if B.I /= 1 then
  raise Program_Error;
end if;
  end;

  D : Derived (True);

begin
  D.I := 1;
  Inner (Base (D));
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-05-02  Eric Botcazou  

* einfo.ads (Corresponding_Record_Component): New alias
for Node21 used for E_Component and E_Discriminant.
* einfo.adb (Corresponding_Record_Component): New function.
(Set_Corresponding_Record_Component): New procedure.
(Write_Field21_Name): Handle Corresponding_Record_Component.
* sem_ch3.adb (Inherit_Component): Set
Corresponding_Record_Component for every component in
the untagged case.  Clear it afterwards for non-girder
discriminants.
* gcc-interface/decl.c (gnat_to_gnu_entity)
: For a derived untagged type with discriminants
and constraints, apply the constraints to the layout of the
parent type to deduce the layout.
(field_is_aliased): Delete.
(components_to_record): Test DECL_ALIASED_P directly.
(annotate_rep): Check that fields are present except for
an extension.
(create_field_decl_from): Add DEBUG_INFO_P
parameter and pass it in recursive and other calls.  Add guard
for the manual CSE on the size.
(is_stored_discriminant): New predicate.
(copy_and_substitute_in_layout): Consider only
stored discriminants and check that original fields are present
in the old type.  Deal with derived types.  Adjust call to
create_variant_part_from.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 247481)
+++ sem_ch3.adb (working copy)
@@ -18147,6 +18147,7 @@
 
  if not Is_Tagged then
 Set_Original_Record_Component (New_C, New_C);
+Set_Corresponding_Record_Component (New_C, Old_C);
  end if;
 
  --  Set the proper type of an access discriminant
@@ -18245,6 +18246,7 @@
  and then Original_Record_Component (Corr_Discrim) = Old_C
then
   Set_Original_Record_Component (Discrim, New_C);
+  Set_Corresponding_Record_Component (Discrim, Empty);
end if;
 
Next_Discriminant (Discrim);
Index: einfo.adb
===
--- einfo.adb   (revision 247480)
+++ einfo.adb   (working copy)
@@ -185,6 +185,7 @@
--Scalar_RangeNode20
 
--Accept_Address  Elist21
+   --Corresponding_Record_Component  Node21
--Default_Expr_Function   Node21
--Discriminant_Constraint Elist21
--Interface_Name  Node21
@@ -950,6 +951,12 @@
   return Node18 (Id);
end Corresponding_Protected_Entry;
 
+   function Corresponding_Record_Component (Id : E) return E is
+   begin
+  pragma Assert (Ekind_In (Id, E_Component, E_Discriminant));
+  return Node21 (Id);
+   end Corresponding_Record_Component;
+
function Corresponding_Record_Type (Id : E) return E is
begin
   pragma Assert (Is_Concurrent_Type (Id));
@@ -4083,6 +4090,12 @@
   Set_Node18 (Id, V);
end Set_Corresponding_Protected_Entry;
 
+   procedure Set_Corresponding_Record_Component (Id : E; V : E) is
+   begin
+  pragma Assert (Ekind_In (Id, E_Comp

[PATCH] Fix PR80591

2017-05-02 Thread Richard Biener

I am testing reversal of r246809 which was bogus.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-05-02  Richard Biener  

PR tree-optimization/80591
Revert
2017-04-10  Richard Biener  

* tree-ssa-structalias.c (find_func_aliases): Properly handle
asm inputs.

* gcc.dg/torture/pr80591.c: New testcase.

Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 247368)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -4944,14 +4944,14 @@ find_func_aliases (struct function *fn,
make_escape_constraint (build_fold_addr_expr (op));
 
  /* The asm may read global memory, so outputs may point to
-any global or escaped memory.  */
+any global memory.  */
  if (op)
{
  auto_vec lhsc;
  struct constraint_expr rhsc, *lhsp;
  unsigned j;
  get_constraint_for (op, &lhsc);
- rhsc.var = escaped_id;
+ rhsc.var = nonlocal_id;
  rhsc.offset = 0;
  rhsc.type = SCALAR;
  FOR_EACH_VEC_ELT (lhsc, j, lhsp)
Index: gcc/testsuite/gcc.dg/torture/pr80591.c
===
--- gcc/testsuite/gcc.dg/torture/pr80591.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr80591.c  (working copy)
@@ -0,0 +1,20 @@
+/* PR tree-optimization/80591 */
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } "-flto" } */
+/* { dg-additional-options "-fdump-tree-optimized" } */
+
+static inline __attribute__((always_inline)) int *
+foo (void)
+{
+  __UINTPTR_TYPE__ sp;
+  asm ("" : "=r" (sp));
+  return (int *) sp;
+}
+
+void
+bar (void)
+{
+  foo ()[0] += 26;
+}
+
+/* { dg-final { scan-tree-dump "\\+ 26;" "optimized" } } */


[Patch, testsuite] Fix bogus pr78138.c failure for avr

2017-05-02 Thread Senthil Kumar Selvaraj
Hi,

  The trivial patch below fixes a bogus testsuite failure
  (gcc.dg/pr78138.c) for the avr target.

  The declaration for memcpy had the size parameter declared as 
  unsigned long. For avr, __SIZE_TYPE__ is unsigned int, and 
  this caused a builtin-declaration-mismatch warning, resulting
  in a couple of FAILs.

  The patch fixes that by typedef'ing __SIZE_TYPE__ to size_t and
  using size_t as the type for memcpy's third parameter.
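
  To illustrate why a hard-coded unsigned long can mismatch, this small C++
  sketch checks the relationship the fix relies on. It assumes a
  GCC-compatible compiler, since __SIZE_TYPE__ is a compiler-predefined macro
  rather than standard C or C++; my_size_t and size_type_is_unsigned_long are
  names made up for the example.

```cpp
#include <type_traits>

// Spell the type the same way the patched test does.
__extension__ typedef __SIZE_TYPE__ my_size_t;

// The result type of sizeof is the target's size type, whatever its width.
static_assert(std::is_same<my_size_t, decltype(sizeof(0))>::value,
              "__SIZE_TYPE__ matches the type of sizeof");

// True only on targets where the size type happens to be unsigned long;
// where it is unsigned int, a hard-coded unsigned long would not match.
constexpr bool size_type_is_unsigned_long() {
    return std::is_same<my_size_t, unsigned long>::value;
}
```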

  Committed to trunk as obvious.

Regards
Senthil

gcc/testsuite/ChangeLog

2017-05-02  Senthil Kumar Selvaraj  

* gcc.dg/pr78138.c: Use __SIZE_TYPE__ instead of
unsigned long.

Index: gcc.dg/pr78138.c
===
--- gcc.dg/pr78138.c(revision 247481)
+++ gcc.dg/pr78138.c(working copy)
@@ -5,7 +5,9 @@
 
 char d [5];
 
-void* memcpy (void*, const void*, unsigned long);
+__extension__ typedef __SIZE_TYPE__ size_t;
+
+void* memcpy (void*, const void*, size_t);
 extern char* strcpy (char*, const char*);
 
 void f (int i, int j)



Update ipa-cp to new time metrics

2017-05-02 Thread Jan Hubicka
Hi,
this patch makes ipa-cp use nonspecialized time as the base for decisions about
cloning.  I wonder about the capping: we may want to use sreals further in the
code, because time differences can be large (with profile feedback).  But I
guess this can be done incrementally; the main point of the patch is to update
the interfaces from ipa-analysis.
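
As a rough sketch of the capping mentioned above: the benefit of a clone is
the nonspecialized time minus the specialized time, clamped to a fixed ceiling
before it is used as an integer. The function below is a stand-alone
illustration using double in place of sreal; capped_time_benefit and its
parameter names are hypothetical, and the 65535 ceiling is the one used in
perform_estimation_of_a_value in the patch.

```cpp
#include <algorithm>

// Stand-alone illustration; GCC itself uses sreal rather than double.
int capped_time_benefit(double nonspecialized_time, double specialized_time) {
    const double cap = 65535.0;
    double benefit = nonspecialized_time - specialized_time;
    // Clamp before converting so very large differences do not overflow.
    return static_cast<int>(std::min(benefit, cap));
}
```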

Bootstrapped/regtested x86_64-linux, OK?

Honza

* ipa-cp.c (perform_estimation_of_a_value): Drop base_time parameter;
update use of estimate_ipcp_clone_size_and_time.
(estimate_local_effects): Update use of
estimate_ipcp_clone_size_and_time and perform_estimation_of_a_value.
* ipa-inline.h (estimate_ipcp_clone_size_and_time): Update prototype.
* ipa-inline-analysis.c (estimate_ipcp_clone_size_and_time):
Return nonspecialized time.
Index: ipa-cp.c
===
--- ipa-cp.c(revision 247436)
+++ ipa-cp.c(working copy)
@@ -2792,16 +2792,16 @@ static void
 perform_estimation_of_a_value (cgraph_node *node, vec known_csts,
   vec known_contexts,
   vec known_aggs_ptrs,
-  sreal base_time, int removable_params_cost,
+  int removable_params_cost,
   int est_move_cost, ipcp_value_base *val)
 {
   int size, time_benefit;
-  sreal time;
+  sreal time, base_time;
   inline_hints hints;
 
   estimate_ipcp_clone_size_and_time (node, known_csts, known_contexts,
 known_aggs_ptrs, &size, &time,
-&hints);
+&base_time, &hints);
   base_time -= time;
   if (base_time > 65535)
 base_time = 65535;
@@ -2836,15 +2836,14 @@ estimate_local_effects (struct cgraph_no
   vec known_aggs;
   vec known_aggs_ptrs;
   bool always_const;
-  sreal base_time = inline_summaries->get (node)->time.to_int ();
   int removable_params_cost;
 
   if (!count || !ipcp_versionable_function_p (node))
 return;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
-fprintf (dump_file, "\nEstimating effects for %s/%i, base_time: %f.\n",
-node->name (), node->order, base_time.to_double ());
+fprintf (dump_file, "\nEstimating effects for %s/%i.\n",
+node->name (), node->order);
 
   always_const = gather_context_independent_values (info, &known_csts,
&known_contexts, 
&known_aggs,
@@ -2857,14 +2856,15 @@ estimate_local_effects (struct cgraph_no
 {
   struct caller_statistics stats;
   inline_hints hints;
-  sreal time;
+  sreal time, base_time;
   int size;
 
   init_caller_stats (&stats);
   node->call_for_symbol_thunks_and_aliases (gather_caller_stats, &stats,
  false);
   estimate_ipcp_clone_size_and_time (node, known_csts, known_contexts,
-known_aggs_ptrs, &size, &time, &hints);
+known_aggs_ptrs, &size, &time,
+&base_time, &hints);
   time -= devirt_bonus;
   time -= hint_time_bonus (hints);
   time -= removable_params_cost;
@@ -2877,20 +2877,20 @@ estimate_local_effects (struct cgraph_no
   if (size <= 0 || node->local.local)
{
  info->do_clone_for_all_contexts = true;
- base_time = time;
 
  if (dump_file)
fprintf (dump_file, " Decided to specialize for all "
 "known contexts, code not going to grow.\n");
}
-  else if (good_cloning_opportunity_p (node, (base_time - time).to_int (),
+  else if (good_cloning_opportunity_p (node,
+  MAX ((base_time - time).to_int (),
+   65536),
   stats.freq_sum, stats.count_sum,
   size))
{
  if (size + overall_size <= max_new_size)
{
  info->do_clone_for_all_contexts = true;
- base_time = time;
  overall_size += size;
 
  if (dump_file)
@@ -2926,7 +2926,7 @@ estimate_local_effects (struct cgraph_no
 
  int emc = estimate_move_cost (TREE_TYPE (val->value), true);
  perform_estimation_of_a_value (node, known_csts, known_contexts,
-known_aggs_ptrs, base_time,
+known_aggs_ptrs,
 removable_params_cost, emc, val);
 
  if (dump_file && (dump_flags & TDF_DETAILS))
@@ -2961,7 +2961,7 @@ estimate_local_effects (struct cgraph_no
{
  known_contexts[i] = val->value;
  perform_estimation_of_a_value (node, known_csts, known_contexts,
-

RE: [PATCH][x86] Add missing intrinsics for ADD[SD,SS] and SUB[SD,SS]

2017-05-02 Thread Peryt, Sebastian
Hi,
Can you please commit it for me?

Thanks,
Sebastian

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Monday, May 1, 2017 11:28 AM
To: Peryt, Sebastian 
Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com
Subject: Re: [PATCH][x86] Add missing intrinsics for ADD[SD,SS] and SUB[SD,SS]

On Thu, Apr 27, 2017 at 10:22 AM, Peryt, Sebastian  
wrote:
> Hi,
>
> This patch adds missing intrinsics for ADDSD, ADDSS, SUBSD and SUBSS 
> instructions.
>
> gcc/
> * config/i386/avx512fintrin.h (_mm_mask_add_round_sd,
> _mm_maskz_add_round_sd, _mm_mask_add_round_ss,
> _mm_maskz_add_round_ss, _mm_mask_sub_round_sd,
> _mm_maskz_sub_round_sd, _mm_mask_sub_round_ss,
> _mm_maskz_sub_round_ss, _mm_mask_add_sd,
> _mm_maskz_add_sd, _mm_mask_add_ss, _mm_maskz_add_ss,
> _mm_mask_sub_sd, _mm_maskz_sub_sd, _mm_mask_sub_ss,
> _mm_maskz_sub_ss): New intrinsics.
> * config/i386/i386-builtin-types.def 
> (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT,
> V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): New function type aliases.
> * config/i386/i386-builtin.def (__builtin_ia32_addsd_mask_round,
> __builtin_ia32_addss_mask_round, __builtin_ia32_subsd_mask_round,
> __builtin_ia32_subss_mask_round): New builtins.
> * config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT,
> V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): Handle new types.
> * config/i386/sse.md (_vm3): 
> Renamed to  ...
> (_vm3): ... this.
> (v\t{%2, %1, 
> %0|%0, %1, %2}): Changed to ...
> (v\t{%2, %1, 
> %0|%0, %1, %2}): ... this.
>
> gcc/testsuite/
> * gcc.target/i386/avx512f-vaddsd-1.c (_mm_mask_add_sd,
> _mm_maskz_add_sd, _mm_mask_add_round_sd,
> _mm_maskz_add_round_sd): Test new intrinsics.
> * gcc.target/i386/avx512f-vaddsd-2.c: New.
> * gcc.target/i386/avx512f-vaddss-1.c (_mm_mask_add_ss,
> _mm_maskz_add_ss, _mm_mask_add_round_ss,
> _mm_maskz_add_round_ss): Test new intrinsics.
> * gcc.target/i386/avx512f-vaddss-2.c: New.
> * gcc.target/i386/avx512f-vsubsd-1.c (_mm_mask_sub_sd,
> _mm_maskz_sub_sd, _mm_mask_sub_round_sd,
> _mm_maskz_sub_round_sd): Test new intrinsics.
> * gcc.target/i386/avx512f-vsubsd-2.c: New.
> * gcc.target/i386/avx512f-vsubss-1.c (_mm_mask_sub_ss,
> _mm_maskz_sub_ss, _mm_mask_sub_round_ss,
> _mm_maskz_sub_round_ss): Test new intrinsics.
> * gcc.target/i386/avx512f-vsubss-2.c: New.
> * gcc.target/i386/avx-1.c (__builtin_ia32_addsd_mask_round,
> __builtin_ia32_addss_mask_round, __builtin_ia32_subsd_mask_round,
> __builtin_ia32_subss_mask_round): Test new builtins.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.
> * gcc.target/i386/sse-14.c (_mm_maskz_add_round_sd,
> _mm_maskz_add_round_ss, _mm_maskz_sub_round_sd,
> _mm_maskz_sub_round_ss, _mm_mask_add_round_sd,
> _mm_mask_add_round_ss, _mm_mask_sub_round_sd,
> _mm_mask_sub_round_ss): Test new intrinsics.
> * gcc.target/i386/testround-1.c: Ditto.
>
> Is it ok for trunk?

OK.

Thanks,
Uros.


Re: [PATCH] PR libstdc++/80553 don't allow destroying non-destructible types

2017-05-02 Thread Jonathan Wakely

On 02/05/17 10:16 +0100, Jonathan Wakely wrote:

On 28/04/17 13:56 +0100, Jonathan Wakely wrote:

We optimize _Destroy and _Destroy_n to do nothing when the type has a
trivial destructor, which means we do nothing (instead of giving an
error) when trying to destroy types with deleted destructors.


I wonder if this optimisation should even exist. The compiler should
be able to optimise away a loop that just calls trivial destructors,
without help from the library.


The compiler can indeed do that optimisation, even for destructors
like ~T() { } that are empty, but not trivial according to the
language rules. The libstdc++ optimisation does make a difference at
-O0 though. If we get any more bugs in that code I think we should
just remove it, and let the compiler do the right thing.
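
To make the trade-off concrete, here is a minimal C++ sketch of a destroy
helper that keeps the trivial-destructor shortcut but still rejects
non-destructible types. It is not the libstdc++ implementation; the sketch
namespace, destroy_impl, Counted and destroy_three are names invented for
this example.

```cpp
#include <iterator>
#include <memory>
#include <new>
#include <type_traits>

namespace sketch {

// Trivial destructor: there is nothing to run.
template <typename Iter>
void destroy_impl(Iter, Iter, std::true_type) {}

// Non-trivial destructor: call it on each element.
template <typename Iter>
void destroy_impl(Iter first, Iter last, std::false_type) {
    typedef typename std::iterator_traits<Iter>::value_type T;
    for (; first != last; ++first)
        std::addressof(*first)->~T();
}

template <typename Iter>
void destroy(Iter first, Iter last) {
    typedef typename std::iterator_traits<Iter>::value_type T;
    // Checked on every path, so the trivial shortcut cannot hide a
    // deleted or inaccessible destructor.
    static_assert(std::is_destructible<T>::value,
                  "value type must be destructible");
    destroy_impl(first, last, std::is_trivially_destructible<T>());
}

}  // namespace sketch

struct Counted {
    static int destroyed;
    ~Counted() { ++destroyed; }
};
int Counted::destroyed = 0;

// Builds three objects in raw storage, destroys them, returns the count.
int destroy_three() {
    alignas(Counted) unsigned char storage[3 * sizeof(Counted)];
    Counted* p = reinterpret_cast<Counted*>(storage);
    for (int i = 0; i < 3; ++i)
        new (p + i) Counted();
    Counted::destroyed = 0;
    sketch::destroy(p, p + 3);
    return Counted::destroyed;
}
```

Dropping the true_type overload would leave only the loop, which is the
variant that relies on the compiler to remove empty destructor calls.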




[Patch AArch64] Do not increase data alignment at -Os and with -fconserve-stack.

2017-05-02 Thread Ramana Radhakrishnan
We unnecessarily align data to 8-byte boundaries even when -Os is
specified. This patch brings the logic in the AArch64 backend more in line
with the ARM backend and helps reduce image size in a few places.
Caught by an internal report on the size of rodata sections being high
with aarch64 gcc.
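
The intended rule can be modelled outside GCC as a small function. This is
only a sketch: expand_alignment, TypeKind and the 64-bit word size are
stand-ins chosen for the example, not the backend's definitions.

```cpp
// Stand-alone model of the alignment rule; not GCC code.
enum class TypeKind { Scalar, Array, Union, Record };

const unsigned kBitsPerWord = 64;  // assumed word size for this example

// Raise the alignment of aggregates to a full word, but only when
// 'enabled' is true (for instance, when not optimizing for size).
unsigned expand_alignment(bool enabled, TypeKind kind, unsigned align_bits) {
    bool aggregate = kind == TypeKind::Array || kind == TypeKind::Union
                     || kind == TypeKind::Record;
    return (enabled && align_bits < kBitsPerWord && aggregate)
               ? kBitsPerWord
               : align_bits;
}
```

With enabled set to false, every type keeps the alignment it was given,
which is where the size saving comes from.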


* config/aarch64/aarch64.h (AARCH64_EXPAND_ALIGNMENT): New.
  (DATA_ALIGNMENT): Update to use AARCH64_EXPAND_ALIGNMENT.
  (LOCAL_ALIGNMENT): Update to use AARCH64_EXPAND_ALIGNMENT.

Bootstrapped and regression tested on aarch64-none-linux-gnu with no 
regressions.


Ok to commit ?


cheers
Ramana

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index e4fb96f..95907b2 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -98,14 +98,24 @@
 && (ALIGN) < BITS_PER_WORD)\
? BITS_PER_WORD : ALIGN)
 
-#define DATA_ALIGNMENT(EXP, ALIGN) \
-  ((((ALIGN) < BITS_PER_WORD)  \
-&& (TREE_CODE (EXP) == ARRAY_TYPE  \
-   || TREE_CODE (EXP) == UNION_TYPE\
-   || TREE_CODE (EXP) == RECORD_TYPE)) \
-   ? BITS_PER_WORD : (ALIGN))
-
-#define LOCAL_ALIGNMENT(EXP, ALIGN) DATA_ALIGNMENT(EXP, ALIGN)
+/* Align definitions of arrays, unions and structures so that
+   initializations and copies can be made more efficient.  This is not
+   ABI-changing, so it only affects places where we can see the
+   definition.  Increasing the alignment tends to introduce padding,
+   so don't do this when optimizing for size/conserving stack space.  */
+#define AARCH64_EXPAND_ALIGNMENT(COND, EXP, ALIGN) \
+  (((COND) && ((ALIGN) < BITS_PER_WORD)  \
+&& (TREE_CODE (EXP) == ARRAY_TYPE  \
+   || TREE_CODE (EXP) == UNION_TYPE\
+   || TREE_CODE (EXP) == RECORD_TYPE)) ? BITS_PER_WORD : (ALIGN))
+
+/* Align global data.  */
+#define DATA_ALIGNMENT(EXP, ALIGN) \
+  AARCH64_EXPAND_ALIGNMENT (!optimize_size, EXP, ALIGN)
+
+/* Similarly, make sure that objects on the stack are sensibly aligned.  */
+#define LOCAL_ALIGNMENT(EXP, ALIGN)\
+  AARCH64_EXPAND_ALIGNMENT (!flag_conserve_stack, EXP, ALIGN)
 
 #define STRUCTURE_SIZE_BOUNDARY8
 


Re: Update ipa-cp to new time metrics

2017-05-02 Thread Richard Biener
On Tue, May 2, 2017 at 11:33 AM, Jan Hubicka  wrote:
> Hi,
> this patch makes ipa-cp to use nonspecialized time as a base for decision 
> about
> cloning.  I wonder about the capping - we perhaps want to use sreals further 
> in
> the code because time differences can be large (with profile feedback). But I
> guess this can be done incrementally - main point of the patch is to update
> interfaces from ipa-analysis.
>
> Bootstrapped/regtested x86_64-linux, OK?

Ok.

Richard.

> Honza
>
> * ipa-cp.c (perform_estimation_of_a_value): Drop base_time parameter;
> update use of estimate_ipcp_clone_size_and_time.
> (estimate_local_effects): Update use of
> estimate_ipcp_clone_size_and_time and perform_estimation_of_a_value.
> * ipa-inline.h (estimate_ipcp_clone_size_and_time): Update prototype.
> * ipa-inline-analysis.c (estimate_ipcp_clone_size_and_time):
> Return nonspecialized time.
> Index: ipa-cp.c
> ===
> --- ipa-cp.c(revision 247436)
> +++ ipa-cp.c(working copy)
> @@ -2792,16 +2792,16 @@ static void
>  perform_estimation_of_a_value (cgraph_node *node, vec known_csts,
>vec known_contexts,
>vec known_aggs_ptrs,
> -  sreal base_time, int removable_params_cost,
> +  int removable_params_cost,
>int est_move_cost, ipcp_value_base *val)
>  {
>int size, time_benefit;
> -  sreal time;
> +  sreal time, base_time;
>inline_hints hints;
>
>estimate_ipcp_clone_size_and_time (node, known_csts, known_contexts,
>  known_aggs_ptrs, &size, &time,
> -&hints);
> +&base_time, &hints);
>base_time -= time;
>if (base_time > 65535)
>  base_time = 65535;
> @@ -2836,15 +2836,14 @@ estimate_local_effects (struct cgraph_no
>vec known_aggs;
>vec known_aggs_ptrs;
>bool always_const;
> -  sreal base_time = inline_summaries->get (node)->time.to_int ();
>int removable_params_cost;
>
>if (!count || !ipcp_versionable_function_p (node))
>  return;
>
>if (dump_file && (dump_flags & TDF_DETAILS))
> -fprintf (dump_file, "\nEstimating effects for %s/%i, base_time: %f.\n",
> -node->name (), node->order, base_time.to_double ());
> +fprintf (dump_file, "\nEstimating effects for %s/%i.\n",
> +node->name (), node->order);
>
>always_const = gather_context_independent_values (info, &known_csts,
> &known_contexts, &known_aggs,
> @@ -2857,14 +2856,15 @@ estimate_local_effects (struct cgraph_no
>  {
>struct caller_statistics stats;
>inline_hints hints;
> -  sreal time;
> +  sreal time, base_time;
>int size;
>
>init_caller_stats (&stats);
>node->call_for_symbol_thunks_and_aliases (gather_caller_stats, &stats,
>   false);
>estimate_ipcp_clone_size_and_time (node, known_csts, known_contexts,
> -known_aggs_ptrs, &size, &time, &hints);
> +known_aggs_ptrs, &size, &time,
> +&base_time, &hints);
>time -= devirt_bonus;
>time -= hint_time_bonus (hints);
>time -= removable_params_cost;
> @@ -2877,20 +2877,20 @@ estimate_local_effects (struct cgraph_no
>if (size <= 0 || node->local.local)
> {
>   info->do_clone_for_all_contexts = true;
> - base_time = time;
>
>   if (dump_file)
> fprintf (dump_file, " Decided to specialize for all "
>  "known contexts, code not going to grow.\n");
> }
> -  else if (good_cloning_opportunity_p (node, (base_time - time).to_int (),
> +  else if (good_cloning_opportunity_p (node,
> +  MAX ((base_time - time).to_int (),
> +   65536),
>stats.freq_sum, stats.count_sum,
>size))
> {
>   if (size + overall_size <= max_new_size)
> {
>   info->do_clone_for_all_contexts = true;
> - base_time = time;
>   overall_size += size;
>
>   if (dump_file)
> @@ -2926,7 +2926,7 @@ estimate_local_effects (struct cgraph_no
>
>   int emc = estimate_move_cost (TREE_TYPE (val->value), true);
>   perform_estimation_of_a_value (node, known_csts, known_contexts,
> -known_aggs_ptrs, base_time,
> +known_aggs_ptrs,
>  removable_param

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-05-02 Thread Jakub Jelinek
On Mon, Apr 24, 2017 at 03:15:11PM +0200, Allan Sandfeld Jensen wrote:
> Okay, I have tried that, and I also made it more obvious how the intrinsics
> can become non-immediate shifts.
> 

> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index b58f5050db0..b9406550fc5 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,10 @@
> +2017-04-22  Allan Sandfeld Jensen  
> +
> + * config/i386/emmintrin.h (_mm_slli_*, _mm_srli_*):
> + Use vector intrinsics instead of builtins.
> + * config/i386/avx2intrin.h (_mm256_slli_*, _mm256_srli_*):
> + Use vector intrinsics instead of builtins.
> +
>  2017-04-21  Uros Bizjak  
>  
>   * config/i386/i386.md (*extzvqi_mem_rex64): Move above *extzv.
> diff --git a/gcc/config/i386/avx2intrin.h b/gcc/config/i386/avx2intrin.h
> index 82f170a3d61..64ba52b244e 100644
> --- a/gcc/config/i386/avx2intrin.h
> +++ b/gcc/config/i386/avx2intrin.h
> @@ -665,13 +665,6 @@ _mm256_slli_si256 (__m256i __A, const int __N)
>  
>  extern __inline __m256i
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> -_mm256_slli_epi16 (__m256i __A, int __B)
> -{
> -  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
> -}
> -
> -extern __inline __m256i
> -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm256_sll_epi16 (__m256i __A, __m128i __B)
>  {
>return (__m256i)__builtin_ia32_psllw256((__v16hi)__A, (__v8hi)__B);
> @@ -679,9 +672,11 @@ _mm256_sll_epi16 (__m256i __A, __m128i __B)
>  
>  extern __inline __m256i
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> -_mm256_slli_epi32 (__m256i __A, int __B)
> +_mm256_slli_epi16 (__m256i __A, int __B)
>  {
> -  return (__m256i)__builtin_ia32_pslldi256 ((__v8si)__A, __B);
> +  if (__builtin_constant_p(__B))
> +return ((unsigned int)__B < 16) ? (__m256i)((__v16hi)__A << __B) : _mm256_setzero_si256();
> +  return _mm256_sll_epi16(__A, _mm_cvtsi32_si128(__B));
>  }

The formatting is wrong: missing spaces between function names and the
opening (, and too long lines.  Also, you've removed some builtin uses like
__builtin_ia32_psllwi256 above, but haven't removed those builtins from the
compiler (unlike the intrinsics, the builtins are not supported and can be
removed).  But I guess the primary question is for Uros: do we
want to handle this in the *intrin.h headers and thus increase the size
of those (and their parsing time etc.), or do we want to handle this
in the target folders (tree as well as gimple one), where we'd convert
e.g. __builtin_ia32_psllwi256 to the shift if the shift count is constant?
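
For reference, a scalar sketch of the per-lane result that either approach has
to preserve when the count is constant.  It models one 16-bit lane only and
mirrors the "(unsigned int)__B < 16" guard in the quoted patch; the function
name model_slli_epi16_lane is hypothetical and not part of any header.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 16-bit lane of a logical left shift by a constant
   count.  Counts outside 0..15 yield zero, matching the guard in the
   quoted patch; in-range counts are a plain shift truncated to 16 bits.  */
uint16_t
model_slli_epi16_lane (uint16_t lane, int count)
{
  if ((unsigned int) count >= 16)
    return 0;
  return (uint16_t) ((unsigned int) lane << count);
}
```

A plain C shift by 16 or more on a 16-bit vector element would not give zero
by itself, which is why the guard is needed wherever the conversion ends up
being done.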

Jakub


Re: [PATCH] adding missing LTO to some warning options (PR 78606)

2017-05-02 Thread Tom de Vries

On 05/01/2017 08:05 PM, Martin Sebor wrote:

On 04/30/2017 02:02 PM, Tom de Vries wrote:

On 01/10/2017 11:16 PM, Martin Sebor wrote:

+  __builtin_sprintf (d, "%32s", "x");   /* { dg-warning "directive writing 32 bytes into a region of size 12" "-Wformat-length" { xfail *-*-* } } */


This xpasses for me on an older system:
...
XPASS: gcc.dg/pr78768.c -Wformat-overflow (test for warnings, line 11)
...

The mechanism is as follows:
- the system doesn't have linker plugin support, and sets
  HAVE_LTO_PLUGIN to 0
- consequently, opts.c:finish_options() sets x_flag_fat_lto_objects to 1
  in cc1
- cgraphunit.c:symbol_table::compile() does not return
  here:
  ...
 /* Do nothing else if any IPA pass found errors or if we are just
streaming LTO.  */
 if (seen_error ()
|| (!in_lto_p && flag_lto && !flag_fat_lto_objects))
  {
timevar_pop (TV_CGRAPHOPT);
return;
  }
  ...
  and ends up calling expand_all_functions, which calls
  pass_sprintf_length
- the warning is generated by cc1
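
The mechanism above can be sketched with a small boolean model of the quoted
condition.  The function model_stops_after_ipa is hypothetical; its parameters
simply mirror the GCC globals named in the snippet.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the early-return test quoted from symbol_table::compile.
   Returns true when cc1 stops after streaming LTO and therefore never
   reaches expand_all_functions (and the sprintf pass).  */
bool
model_stops_after_ipa (bool seen_error, bool in_lto_p, bool flag_lto,
                       bool flag_fat_lto_objects)
{
  return seen_error || (!in_lto_p && flag_lto && !flag_fat_lto_objects);
}
```

With slim objects (flag_fat_lto_objects == 0) the model returns true for a
-flto compile, so no warning comes from cc1; with fat objects it returns false,
compilation continues, and the xfail-ed warning appears.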

Maybe the test needs:
...
/* { dg-require-linker-plugin "" } */
...
?


That seems possible.  IIUC, without linker plugin support the
pass will run when the ordinary object file is created during
the first stage of compilation.  I don't have a system to confirm
it on but if pr78768.c xfails when you add the directive I'd say
go ahead and commit the fix as obvious.


Done.

Thanks,
- Tom
Require linker plugin for pr78768.c

The test-case has an xfail-ed line.  For linkers without plugin support, that
line happens to xpass.  Require linker with plugin support, such that the line
is no longer xpass-ing, but unsupported.

2017-05-01  Tom de Vries  

	* gcc.dg/pr78768.c: Require linker plugin.

---
 gcc/testsuite/ChangeLog| 4 
 gcc/testsuite/gcc.dg/pr78768.c | 1 +
 2 files changed, 5 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/pr78768.c b/gcc/testsuite/gcc.dg/pr78768.c
index 68d717a..b6cda47 100644
--- a/gcc/testsuite/gcc.dg/pr78768.c
+++ b/gcc/testsuite/gcc.dg/pr78768.c
@@ -2,6 +2,7 @@
by -flto
   { dg-do link }
   { dg-require-effective-target lto }
+  { dg-require-linker-plugin "" }
   { dg-options "-O2 -Walloca-larger-than=10 -Wformat -Wformat-overflow -flto" } */
 
 int main (void)


Re: [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-05-02 Thread JonY
On 05/01/2017 11:31 AM, Uros Bizjak wrote:
> On Thu, Apr 27, 2017 at 10:04 AM, Daniel Santos wrote:
>> All of patches are concerned with 64-bit Microsoft ABI functions that call
>> System V ABI function which clobbers RSI, RDI and XMM6-15 and are aimed at
>> improving performance and .text size of Wine 64. I had previously submitted
>> these as separate patch sets, but have combined them for simplicity. (Does
>> this make the ChangeLogs too big? Please let me know if you want me to break
>> these back apart.) Below are the included patchsets and a summary of changes
>> since the previous post(s):
> 
> Well, the ChangeLog is acceptable.
> 
> I have comments on how new RTX patterns are generated and checked
> (patches 9/12 and 11/12). Other patches look good to me, so after
> issues with 9/12 and 11/12 are resolved, I think the patch set is
> ready to go.
> 
> After the above issue is addressed, I propose to move forward by
> committing the patchset, and resolve any possible issues later. There
> are just too many code paths in the stack frame construction and
> teardown to notice all possible interactions between new and old code.
> It looks that existing code won't be affected without activating new
> option, so we can be a bit less cautious with the patchset. An
> important part is thus a comprehensive added test suite, which seems
> to pass.
> 
> I also assume that Cygwin and MinGW people agree with the patch and
> the functionality itself.
> 
> Uros.
> 

Cygwin and MinGW do not use SysV/MS transitions directly in their own
code, so the changes should be OK.







Re: [PATCH] handle sprintf(d, "%s", ...) in gimple-ssa-sprintf.c

2017-05-02 Thread Richard Biener
On Fri, Apr 28, 2017 at 6:35 PM, Jeff Law  wrote:
> On 04/25/2017 09:55 PM, Martin Sebor wrote:
>>
>> On 04/25/2017 04:05 PM, Jeff Law wrote:
>>>
>>> On 04/21/2017 03:33 PM, Martin Sebor wrote:

 Bug 77671 - missing -Wformat-overflow warning on sprintf overflow
 with "%s", is caused by gimple-fold.c transforming s{,n}printf
 calls with a plain "%s" format string into strcpy regardless of
 whether they are valid well before the -Wformat-overflow warning
 has had a chance to validate them.

 The attached patch moves the transformation from gimple-fold.c
 into the gimple-ssa-sprintf.c pass, thus allowing the calls to
 be validated.  Only valid calls (i.e., those that do not overflow
 the destination) are transformed.
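
 A run-time analogue of that rule, as a rough sketch only: the pass itself
 works on GIMPLE at compile time, and the helper names below are
 hypothetical.  The point is just that the "%s" call is rewritten to strcpy
 only after the length has been checked against the destination size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if sprintf (dst, "%s", src) would fit in DST_SIZE bytes,
   counting the terminating NUL.  */
bool
model_percent_s_fits (size_t dst_size, const char *src)
{
  int needed = snprintf (NULL, 0, "%s", src);   /* bytes excluding the NUL */
  return needed >= 0 && (size_t) needed < dst_size;
}

/* Only a call that fits is replaced by the cheaper strcpy; an overflowing
   call is left alone so that it can be diagnosed.  */
bool
model_copy_if_valid (char *dst, size_t dst_size, const char *src)
{
  if (!model_percent_s_fits (dst_size, src))
    return false;
  strcpy (dst, src);
  return true;
}
```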

 Martin

 gcc-77671.diff


 commit 2cd98763984ffb93606ee96ad658733b4c95c1e4
 Author: Martin Sebor
 Date:   Wed Apr 12 18:36:26 2017 -0600

  PR middle-end/77671 - missing -Wformat-overflow warning on
 sprintf overflow
  with %s
   gcc/ChangeLog:
   PR middle-end/77671
  * gimple-fold.c (gimple_fold_builtin_sprintf): Make extern.
  (gimple_fold_builtin_snprintf): Same.
  * gimple-fold.h (gimple_fold_builtin_sprintf): Declare.
  (gimple_fold_builtin_snprintf): Same.
  * gimple-ssa-sprintf.c (get_format_string): Correct the
 detection
  of character types.
  (is_call_safe): New function.
  (try_substitute_return_value): Call it.
  (try_simplify_call): New function.
  (pass_sprintf_length::handle_gimple_call): Call it.
   gcc/testsuite/ChangeLog:
   PR middle-end/77671
  * gcc.dg/tree-ssa/builtin-sprintf-7.c: New test.
  * gcc.dg/tree-ssa/builtin-sprintf-8.c: New test.
  * gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust.
  * gcc.dg/tree-ssa/builtin-sprintf-warn-2.c: Adjust.
  * gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Adjust.
>>>
>>> I assume this went through the usual regression testing cycle?  I'm a
>>> bit surprised nothing failed due to moving the transformation later in
>>> the pipeline.
>>
>>
>> It did.  There was one regression I neglected to mention.  A test
>> exercising -fexec-charset (bug 20110) fails because the sprintf
>> pass assumes the execution character set is the same as the host
>> character set.  With -fexec-charset set to EBCDIC it gets just
>> as confused as the -Wformat warning does (this is subject of
>> the still unresolved bug 20110).  I've opened bug 80523 for
>> the new problem and I'm testing a patch that handles it.
>>
>>
 diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
 index 2474391..07d6897 100644
 --- a/gcc/gimple-ssa-sprintf.c
 +++ b/gcc/gimple-ssa-sprintf.c
 @@ -373,7 +373,16 @@ get_format_string (tree format, location_t *ploc)
 if (TREE_CODE (format) != STRING_CST)
   return NULL;
   -  if (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (format))) !=
 char_type_node)
 +  tree type = TREE_TYPE (format);
 +  if (TREE_CODE (type) == ARRAY_TYPE)
 +type = TREE_TYPE (type);
 +
 +  /* Can't just test that:
 +   TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (format))) !=
 char_type_node
 + See bug 79062.  */
 +  if (TREE_CODE (type) != INTEGER_TYPE
 +  || TYPE_MODE (type) != TYPE_MODE (char_type_node)
 +  || TYPE_PRECISION (type) != TYPE_PRECISION (char_type_node))
   {
 /* Wide format string.  */
 return NULL;
>>>
>>> In the referenced BZ Richi mentions how fold-const.c checks for this
>>> case and also hints that you might want to look at tree-ssa-strlen as
>>> well.  That seems wise.  It also seems wise to factor that check and use
>>> it from all the identified locations.  I don't like that this uses a
>>> different check than what's in fold-const.c.
>>
>>
>> I think what I did comes from tree-ssa-strlen.c (Richi's other
>> suggestion in bug 79062).  In either case I don't fully understand
>> why the existing code doesn't work.
>>
>>> It's also not clear if this change is a requirement to address 77671 or
>>> not.  If so ISTM that we fix 79062 first, then 77671.  If not we can fix
>>> them independently.  In both cases the fix for 79062 can be broken out
>>> into its own patch.
>>
>>
>> I suppose that makes sense.  The hunk above doesn't fully fix
>> 79062.  It just lets the sprintf return value optimization take
>> place with -flto.
>>
 @@ -3278,31 +3287,21 @@ get_destination_size (tree dest)
 return HOST_WIDE_INT_M1U;
   }
   -/* Given a suitable result RES of a call to a formatted output
 function
 -   described by INFO, substitute the result for the return value of
 -   the call.  T

Re: Drop Z from X + Z < Y + Z

2017-05-02 Thread Richard Biener
On Fri, Apr 28, 2017 at 10:23 PM, Marc Glisse  wrote:
> On Fri, 28 Apr 2017, Richard Biener wrote:
>
>> On Fri, Apr 28, 2017 at 1:35 PM, Marc Glisse  wrote:
>>>
>>> Hello,
>>>
>>> surprisingly, this did not cause any Wstrict-overflow failure. Some of it
>>> sounds more like reassoc's job, but it is convenient to handle simple
>>> cases
>>> in match.pd. I think we could wait until there are reports of regressions
>>> related to register pressure before adding single_use tests.
>>>
>>> For a std::vector v, we now simplify v.size()==v.capacity() to a
>>> single comparison (long)finish==(long)end_storage (I guess I could still
>>> try
>>> to drop the casts to consider it really done). Handling
>>> v.size()>> choice to use unsigned types. I may still be able to remove the
>>> divisions,
>>> I'll see if I can sprinkle some 'convert' in recent transformations.
>>>
>>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>>
>>
>> +(for cmp (eq ne minus)
>>
>> Fat fingered 'minus' (in all places) or did you want to get fancy?
>> (the transforms
>> look valid even for cmp == minus)  Maybe adjust comments to reflect this.
>
>
> I started with just comparisons and then noticed that minus worked as well.
> I'll adjust the comment and rename cmp to op as suggested by Jakub. I may
> have to separate the minus case when I add converts in a future patch.
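
A small sketch of why eq, ne and minus can share one pattern while the
ordering comparisons need more care.  It uses wrapping unsigned arithmetic;
the helper names are hypothetical, and the assumption that the ordering case
relies on undefined signed overflow is not stated explicitly in this thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* With wrapping arithmetic, adding the same Z to both operands preserves
   equality ...  */
bool
eq_preserved (unsigned x, unsigned y, unsigned z)
{
  return ((x + z) == (y + z)) == (x == y);
}

/* ... and the difference ...  */
bool
minus_preserved (unsigned x, unsigned y, unsigned z)
{
  return ((x + z) - (y + z)) == (x - y);
}

/* ... but not necessarily the ordering: one side may wrap while the
   other does not.  */
bool
lt_preserved (unsigned x, unsigned y, unsigned z)
{
  return ((x + z) < (y + z)) == (x < y);
}
```

For x = UINT_MAX, y = 0, z = 1 the first two helpers return true while
lt_preserved returns false, since x + z wraps to 0.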
>
>> There are a few related cases in fold-const.c, namely X +- Y CMP X -> Y
>> CMP 0,
>> some of them also handling POINTER_PLUS_EXPR.  So I wonder if you can
>> handle pointer_plus like plus and maybe move those fold-const.c patterns.
>
>
> You already added (T)(P + A) - (T)(P + B) -> (T)A - (T)B, so I'll have to
> take care to avoid redundancy. Since it was meant for pointer_plus more than
> plus, you used convert, not convert?, and it is currently not redundant with
> my patch. If I integrate pointer_plus in the same transformation, I will
> probably have to use :C instead of :c, and we will get 2 dead paths
> (operand_equal_p on the first argument of one pointer_plus and the second
> argument of another pointer_plus), it might be easier as a separate pattern.

Yes.   Just spotted the few patterns in fold-const.c when searching for things
doing the transform you add.

Richard.

> --
> Marc Glisse


Re: [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-05-02 Thread Kai Tietz
2017-05-02 12:21 GMT+02:00 JonY <10wa...@gmail.com>:
> On 05/01/2017 11:31 AM, Uros Bizjak wrote:
>> On Thu, Apr 27, 2017 at 10:04 AM, Daniel Santos wrote:
>>> All of patches are concerned with 64-bit Microsoft ABI functions that call
>>> System V ABI function which clobbers RSI, RDI and XMM6-15 and are aimed at
>>> improving performance and .text size of Wine 64. I had previously submitted
>>> these as separate patch sets, but have combined them for simplicity. (Does
>>> this make the ChangeLogs too big? Please let me know if you want me to break
>>> these back apart.) Below are the included patchsets and a summary of changes
>>> since the previous post(s):
>>
>> Well, the ChangeLog is acceptable.
>>
>> I have comments on how new RTX patterns are generated and checked
>> (patches 9/12 and 11/12). Other patches look good to me, so after
>> issues with 9/12 and 11/12 are resolved, I think the patch set is
>> ready to go.
>>
>> After the above issue is addressed, I propose to move forward by
>> committing the patchset, and resolve any possible issues later. There
>> are just too many code paths in the stack frame construction and
>> teardown to notice all possible interactions between new and old code.
>> It looks that existing code won't be affected without activating new
>> option, so we can be a bit less cautious with the patchset. An
>> important part is thus a comprehensive added test suite, which seems
>> to pass.
>>
>> I also assume that Cygwin and MinGW people agree with the patch and
>> the functionality itself.
>>
>> Uros.
>>
>
> Cygwin and MinGW do not use SysV/MS transitions directly in their own
> code, so the changes should be OK.
>
>
>

Right, and Wine people will tell, if something doesn't work for them.
So ok for me too.

Kai


[PATCH, GCC/ARM, Stage 1] Enable Purecode for ARMv8-M Baseline

2017-05-02 Thread Prakhar Bahuguna
This patch adds support for purecode to ARMv8-M Baseline, in addition to the
existing support for ARMv7-M and ARMv8-M Mainline.

gcc/ChangeLog:

2017-01-11  Prakhar Bahuguna  
Andre Simoes Dias Vieira  

* config/arm/arm.md (movsi): Change TARGET_32BIT to TARGET_HAVE_MOVT.
(movt splitter): Likewise.
* config/arm/arm.c (arm_option_check_internal): Change arm_arch_thumb2
to TARGET_HAVE_MOVT, and merge with -mslow-flash-data check.
(const_ok_for_arm): Change else to else if (TARGET_THUMB2) and add else
block for Thumb-1 with MOVT.
(thumb2_legitimate_address_p): Move code block ...
(can_avoid_literal_pool_for_label_p): ... into this new function.
(thumb1_legitimate_address_p): Add check for TARGET_HAVE_MOVT and
literal pool.
(thumb_legitimate_constant_p): Add conditional on TARGET_HAVE_MOVT

doc/ChangeLog:

2017-01-11  Prakhar Bahuguna  
Andre Simoes Dias Vieira  

* invoke.texi (-mpure-code): Change "ARMv7-M targets" for
"Thumb-only targets with the MOVT instruction".

testsuite/ChangeLog:

2017-01-11  Prakhar Bahuguna  
Andre Simoes Dias Vieira  

* gcc.target/arm/pure-code/pure-code.exp: Add conditional for
check_effective_target_arm_thumb1_movt_ok.

Testing done: Ran regression tests for arm-none-eabi targeting Cortex-M23, both
with and without -mpure-code.

Okay for stage 1?

-- 

Prakhar Bahuguna
From a77336404fdbdc9ed9b836e4e164803915aa3b22 Mon Sep 17 00:00:00 2001
From: Prakhar Bahuguna 
Date: Wed, 15 Mar 2017 10:25:03 +
Subject: [PATCH] Enable Purecode for ARMv8-M Baseline.

---
 gcc/config/arm/arm.c   | 77 ++
 gcc/config/arm/arm.md  |  6 +-
 gcc/doc/invoke.texi|  3 +-
 .../gcc.target/arm/pure-code/pure-code.exp |  5 +-
 4 files changed, 57 insertions(+), 34 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index d719020dcde..1088895e7e5 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2833,16 +2833,15 @@ arm_option_check_internal (struct gcc_options *opts)
   flag_pic = 0;
 }
 
-  /* We only support -mslow-flash-data on armv7-m targets.  */
-  if (target_slow_flash_data
-  && ((!(arm_arch7 && !arm_arch_notm) && !arm_arch7em)
- || (TARGET_THUMB1_P (flags) || flag_pic || TARGET_NEON)))
-error ("-mslow-flash-data only supports non-pic code on armv7-m targets");
-
-  /* We only support pure-code on Thumb-2 M-profile targets.  */
-  if (target_pure_code
-  && (!arm_arch_thumb2 || arm_arch_notm || flag_pic || TARGET_NEON))
-error ("-mpure-code only supports non-pic code on armv7-m targets");
+  /* We only support -mpure-code and -mslow-flash-data on Thumb-only targets
+ with MOVT.  */
+  if ((target_pure_code || target_slow_flash_data)
+  && (!TARGET_HAVE_MOVT || arm_arch_notm || flag_pic || TARGET_NEON))
+{
+  const char *flag = (target_pure_code ? "-mpure-code" : "-mslow-flash-data");
+  error ("%s only supports non-pic code on Thumb-only targets with the "
+"MOVT instruction", flag);
+}
 
 }
 
@@ -4077,7 +4076,7 @@ const_ok_for_arm (HOST_WIDE_INT i)
   || (i & ~0xfc03) == 0))
return TRUE;
 }
-  else
+  else if (TARGET_THUMB2)
 {
   HOST_WIDE_INT v;
 
@@ -4093,6 +4092,14 @@ const_ok_for_arm (HOST_WIDE_INT i)
   if (i == v)
return TRUE;
 }
+  else if (TARGET_HAVE_MOVT)
+{
+  /* Thumb-1 Targets with MOVT.  */
+  if (i > 0x)
+   return FALSE;
+  else
+   return TRUE;
+}
 
   return FALSE;
 }
@@ -7736,6 +7743,32 @@ arm_legitimate_address_outer_p (machine_mode mode, rtx x, RTX_CODE outer,
   return 0;
 }
 
+/* Return true if we can avoid creating a constant pool entry for x.  */
+bool
+can_avoid_literal_pool_for_label_p (rtx x)
+{
+  /* Normally we can assign constant values to target registers without
+ the help of constant pool.  But there are cases we have to use constant
+ pool like:
+ 1) assign a label to register.
+ 2) sign-extend a 8bit value to 32bit and then assign to register.
+
+ Constant pool access in format:
+ (set (reg r0) (mem (symbol_ref (".LC0"))))
+ will cause the use of literal pool (later in function arm_reorg).
+ So here we mark such format as an invalid format, then the compiler
+ will adjust it into:
+ (set (reg r0) (symbol_ref (".LC0")))
+ (set (reg r0) (mem (reg r0))).
+ No extra register is required, and (mem (reg r0)) won't cause the use
+ of literal pools.  */
+  if (arm_disable_literal_pool && GET_CODE (x) == SYMBOL_REF
+  && CONSTANT_POOL_ADDRESS_P (x))
+return 1;
+  return 0;
+}
+
+
 /* Return nonzero if X is a valid Thumb-2 address operand.  */
 static int
 thumb2_legitimate_address_p (machine_mode mode, rtx x, int strict_p)
@@ -7799,23 +7832,7 @@ thu

Re: [PATCH] Optimize in VRP loads from constant arrays (take 2)

2017-05-02 Thread Richard Biener
On Sat, 29 Apr 2017, Jakub Jelinek wrote:

> On Fri, Apr 21, 2017 at 04:42:03PM +0200, Richard Biener wrote:
> > On April 21, 2017 4:06:59 PM GMT+02:00, Jakub Jelinek wrote:
> > >This patch attempts to implement the optimization mentioned in
> > >http://gcc.gnu.org/ml/gcc-patches/2017-02/msg00945.html
> > >If we know a (relatively small) value range for ARRAY_REF index
> > >in constant array load and the values in the array for all the indexes
> > >in the array are the same (and constant), we can just replace the load
> > >with the constant.
> > 
> > But this should be done during propagation (and thus can even produce a
> > range rather than just a constant).
> 
> True, I've changed the implementation to do that.
> Except that it is only useful during propagation for integral or pointer
> types, for anything else (and for pointers when not just doing NULL vs.
> non-NULL vr) it is also performed during simplification (in that case
> it requires operand_equal_p values).
> 
> The patch also newly computes range for the index and range from the array
> bounds (taking into account arrays at struct end) and intersects them,
> plus takes into account that some iterators might have a couple of zero
> low-order bits (as computed by ccp).
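
A minimal source-level example of the kind of load under discussion.  The
array and function names are made up for illustration; whether a given GCC
version actually folds this load is not something the sketch asserts.

```c
#include <assert.h>

/* Constant array whose first four elements are all equal.  */
static const int table[8] = { 5, 5, 5, 5, 1, 2, 3, 4 };

/* The index is known to lie in [0, 3], and table[0..3] all hold 5, so a
   range-aware optimizer could replace the load with the constant 5.  */
int
load_with_known_range (unsigned i)
{
  return table[i & 3];
}
```

If the elements in the index range differed but were all integers, the
propagation-time variant could still derive a value range for the result
(for example [1, 4] for an index in [4, 7]) instead of a single constant.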

Hmm, while I can see how this helps I'd rather not do this in this way.
(see PR80533 and my followup with a testcase which shows
array_at_struct_end_p is wrong)  Ideally we'd add asserts for array
indices instead.  Thus

  i_3 = ASSERT_EXPR 
  .. = a[i_3];

which of course needs adjustment to -Warray-bounds to look at the
range of the original SSA name (and in loops even that might degrade...).

Was this necessary to get any reasonable results?

> > Also much of the fold_const_aggregate_ref work can be shared for all 
> > indices.
> 
> Maybe.  Is that required for the patch, or can it be done incrementally?

Incrementally.

Still given the high cost of get_array_ctor_element_at_index which
does linear searching I'd add a few early-outs like

  base = get_base_address (rhs);
  ctor = ctor_for_folding (base);
  if (! ctor)
return NULL_TREE;

I'd restructure the patch quite differently: using for_each_index on the
ref, gather an array of index pointers (bail out on something unhandled).
Then I'd see if I have interesting ranges for them, if not, bail out.
Also compute the size product of all ranges and test that against
PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS.  Then store the minimum range
value in the index places (temporarily) and use get_base_ref_and_extent to
get at the constant "starting" offset.  From there iterate using
the remembered indices (remember the ref tree as well via for_each_index),
directly adjusting the constant offset so you can feed
fold_ctor_reference the constant offsets of all elements that need to
be considered.  As optimization fold_ctor_reference would know how
to start from the "last" offset (much refactoring would need to be
done here given nested ctors and multiple indices I guess).

That said, restricting to a single index and open-coding
for_each_index intermingled with index range handling makes the code
quite hard to follow.

For large ctors we probably need to do something about
get_array_ctor_element_at_index, but given the way we lay out
ctors (idx being optional plus ranges) doing better than a linear search
might be tricky...

Thanks,
Richard.

> > >Shall I introduce a param for the size of the range to consider?
> > 
> > I think so.  Eventually we can pre-compute/cache a "range tree" for a
> > Constant initializer?
> 
> param introduced.
> 
> > That said I'm happy with moving it to propagation time and considering ranges
> > and leaving compile-time optimization for future work (with possibly
> > increasing the number of elements to consider)
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Note, the patch doesn't handle the case when the constant load is aggregate,
> where fold_const_aggregate_ref_1 returns a CONSTRUCTOR.  CONSTRUCTOR is not
> gimple min invariant, and when not empty can't be in the IL anyway, so I was
> thinking we could perhaps in that case just modify the rhs to use a constant
> index (e.g. vr0.min) instead of the rhs with variable index.  It didn't work
> because operand_equal_p doesn't handle non-empty CONSTRUCTORs (they compare
> unequal even if they have the same elements).
> 
> 2017-04-29  Jakub Jelinek  
> 
>   * params.def (PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS): New.
>   * tree.c (array_at_struct_end_p): Return false if ref is STRING_CST.
>   * tree-vrp.c: Include gimplify.h.
>   (load_index): New variable.
>   (vrp_load_valueize, extract_range_from_load): New functions.
>   (extract_range_from_assignment, simplify_stmt_using_ranges): Use
>   extract_range_from_load.
>   (stmt_interesting_for_vrp): Return true for loads with handled
>   component rhs.
> 
>   * gcc.dg/tree-ssa/vrp113.c: New test.
> 
> --- gcc/params.def.jj 2017-03-19 11:57:24.0 +0100
> +

Re: [PATCH] Improve switchconv optimization (PR tree-optimization/79472)

2017-05-02 Thread Richard Biener
On Sat, 29 Apr 2017, Jakub Jelinek wrote:

> On Wed, Feb 15, 2017 at 12:51:30PM +0100, Jakub Jelinek wrote:
> > On Wed, Feb 15, 2017 at 12:46:44PM +0100, Richard Biener wrote:
> > > >> Possibly, but for GCC 8.
> > > > 
> > > > To both this switchconv patch and the potential improvement for loading
> > > > from const arrays (can create an enhancement PR for that), or just the
> > > > latter?
> > > 
> > > Both I think - the patch is pretty big.
> > 
> > Ok, I'll queue the patch for GCC8 then.
> 
> If the tree-vrp.c change makes it in, is this patch ok for trunk too now
> that we are in stage1?  Bootstrapped/regtested on x86_64-linux and
> i686-linux on top of the tree-vrp.c change (without the tree-vrp.c change
> vrp40.c regresses).

Ok.

Thanks,
Richard.

> 2017-04-29  Jakub Jelinek  
> 
>   PR tree-optimization/79472
>   * tree-switch-conversion.c (struct switch_conv_info): Add
>   contiguous_range and default_case_nonstandard fields.
>   (collect_switch_conv_info): Compute contiguous_range and
>   default_case_nonstandard fields, don't clear final_bb if
>   contiguous_range and only the default case doesn't have the required
>   structure.
>   (check_all_empty_except_final): Set default_case_nonstandard instead
>   of failing if contiguous_range and the default case doesn't have empty
>   block.
>   (check_final_bb): Add SWTCH argument, don't fail if contiguous_range
>   and only the default case doesn't have the required constants.  Skip
>   virtual phis.
>   (gather_default_values): Skip virtual phis.  Allow non-NULL CASE_LOW
>   if default_case_nonstandard.
>   (build_constructors): Build constant 1 just once.  Assert that default
>   values aren't inserted in between cases if contiguous_range.  Skip
>   virtual phis.
>   (build_arrays): Skip virtual phis.
>   (prune_bbs): Add DEFAULT_BB argument, don't remove that bb.
>   (fix_phi_nodes): Don't add e2f phi arg if default_case_nonstandard.
>   Handle virtual phis.
>   (gen_inbound_check): Handle default_case_nonstandard case.
>   (process_switch): Adjust check_final_bb caller.  Call
>   gather_default_values with the first non-default case instead of
>   default case if default_case_nonstandard.
> 
>   * gcc.dg/tree-ssa/cswtch-3.c: New test.
>   * gcc.dg/tree-ssa/cswtch-4.c: New test.
>   * gcc.dg/tree-ssa/cswtch-5.c: New test.
> 
> --- gcc/tree-switch-conversion.c.jj   2017-02-14 14:54:08.020975500 +0100
> +++ gcc/tree-switch-conversion.c  2017-02-14 17:09:01.162826954 +0100
> @@ -592,6 +592,14 @@ struct switch_conv_info
>   dump file, if there is one.  */
>const char *reason;
>  
> +  /* True if default case is not used for any value between range_min and
> + range_max inclusive.  */
> +  bool contiguous_range;
> +
> +  /* True if default case does not have the required shape for other case
> + labels.  */
> +  bool default_case_nonstandard;
> +
>/* Parameters for expand_switch_using_bit_tests.  Should be computed
>   the same way as in expand_case.  */
>unsigned int uniq;
> @@ -606,8 +614,9 @@ collect_switch_conv_info (gswitch *swtch
>unsigned int branch_num = gimple_switch_num_labels (swtch);
>tree min_case, max_case;
>unsigned int count, i;
> -  edge e, e_default;
> +  edge e, e_default, e_first;
>edge_iterator ei;
> +  basic_block first;
>  
>memset (info, 0, sizeof (*info));
>  
> @@ -616,8 +625,8 @@ collect_switch_conv_info (gswitch *swtch
>   Collect the bits we can deduce from the CFG.  */
>info->index_expr = gimple_switch_index (swtch);
>info->switch_bb = gimple_bb (swtch);
> -  info->default_bb =
> -label_to_block (CASE_LABEL (gimple_switch_default_label (swtch)));
> +  info->default_bb
> += label_to_block (CASE_LABEL (gimple_switch_default_label (swtch)));
>e_default = find_edge (info->switch_bb, info->default_bb);
>info->default_prob = e_default->probability;
>info->default_count = e_default->count;
> @@ -625,17 +634,54 @@ collect_switch_conv_info (gswitch *swtch
>  if (e != e_default)
>info->other_count += e->count;
>  
> +  /* Get upper and lower bounds of case values, and the covered range.  */
> +  min_case = gimple_switch_label (swtch, 1);
> +  max_case = gimple_switch_label (swtch, branch_num - 1);
> +
> +  info->range_min = CASE_LOW (min_case);
> +  if (CASE_HIGH (max_case) != NULL_TREE)
> +info->range_max = CASE_HIGH (max_case);
> +  else
> +info->range_max = CASE_LOW (max_case);
> +
> +  info->contiguous_range = true;
> +  tree last = CASE_HIGH (min_case) ? CASE_HIGH (min_case) : info->range_min;
> +  for (i = 2; i < branch_num; i++)
> +{
> +  tree elt = gimple_switch_label (swtch, i);
> +  wide_int w = last;
> +  if (w + 1 != CASE_LOW (elt))
> + {
> +   info->contiguous_range = false;
> +   break;
> + }
> +  last = CASE_HIGH (elt) ? CASE_HIGH (elt) : CASE_LOW (elt);
> +}
> 
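For readers following the contiguous_range computation in the hunk above, here is a minimal stand-alone sketch of the same idea in plain C.  It is not the GCC code: struct case_range and ranges_contiguous_p are hypothetical names, and plain long values stand in for the wide_int arithmetic used in the patch.  It assumes the case ranges are already sorted and non-overlapping, as switch labels are.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a case label: covers low..high inclusive.
   A single-value case has low == high.  */
struct case_range
{
  long low;
  long high;
};

/* Return true if the sorted, non-overlapping ranges leave no gap between
   the first low and the last high, i.e. the default label is never taken
   for a value inside that interval.  */
bool
ranges_contiguous_p (const struct case_range *cases, size_t n)
{
  assert (n > 0);
  long last = cases[0].high;
  for (size_t i = 1; i < n; i++)
    {
      /* A gap exists if the next case does not start right after the
         previous one ended.  */
      if (last + 1 != cases[i].low)
        return false;
      last = cases[i].high;
    }
  return true;
}
```

With cases 1, 2..4 and 5 the function returns true; with cases 1 and 3..4 it returns false because the value 2 would go to the default label.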

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-05-02 Thread Allan Sandfeld Jensen
On Tuesday 02 May 2017, Jakub Jelinek wrote:
> On Mon, Apr 24, 2017 at 03:15:11PM +0200, Allan Sandfeld Jensen wrote:
> > Okay, I have tried that, and I also made it more obvious how the
> > intrinsics can become non-immediate shift.
> > 
> > 
> > diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> > index b58f5050db0..b9406550fc5 100644
> > --- a/gcc/ChangeLog
> > +++ b/gcc/ChangeLog
> > @@ -1,3 +1,10 @@
> > +2017-04-22  Allan Sandfeld Jensen  
> > +
> > +   * config/i386/emmintrin.h (_mm_slli_*, _mm_srli_*):
> > +   Use vector intrinstics instead of builtins.
> > +   * config/i386/avx2intrin.h (_mm256_slli_*, _mm256_srli_*):
> > +   Use vector intrinstics instead of builtins.
> > +
> > 
> >  2017-04-21  Uros Bizjak  
> >  
> > * config/i386/i386.md (*extzvqi_mem_rex64): Move above *extzv.
> > 
> > diff --git a/gcc/config/i386/avx2intrin.h b/gcc/config/i386/avx2intrin.h
> > index 82f170a3d61..64ba52b244e 100644
> > --- a/gcc/config/i386/avx2intrin.h
> > +++ b/gcc/config/i386/avx2intrin.h
> > @@ -665,13 +665,6 @@ _mm256_slli_si256 (__m256i __A, const int __N)
> > 
> >  extern __inline __m256i
> >  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> > 
> > -_mm256_slli_epi16 (__m256i __A, int __B)
> > -{
> > -  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
> > -}
> > -
> > -extern __inline __m256i
> > -__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> > 
> >  _mm256_sll_epi16 (__m256i __A, __m128i __B)
> >  {
> >  
> >return (__m256i)__builtin_ia32_psllw256((__v16hi)__A, (__v8hi)__B);
> > 
> > @@ -679,9 +672,11 @@ _mm256_sll_epi16 (__m256i __A, __m128i __B)
> > 
> >  extern __inline __m256i
> >  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> > 
> > -_mm256_slli_epi32 (__m256i __A, int __B)
> > +_mm256_slli_epi16 (__m256i __A, int __B)
> > 
> >  {
> > 
> > -  return (__m256i)__builtin_ia32_pslldi256 ((__v8si)__A, __B);
> > +  if (__builtin_constant_p(__B))
> > +    return ((unsigned int)__B < 16) ? (__m256i)((__v16hi)__A << __B) : _mm256_setzero_si256();
> > +  return _mm256_sll_epi16(__A, _mm_cvtsi32_si128(__B));
> > 
> >  }
> 
> The formatting is wrong, missing spaces before function names and opening
> (, too long lines.  Also, you've removed some builtin uses like
> __builtin_ia32_psllwi256 above, but haven't removed those builtins from the
> compiler (unlike the intrinsics, the builtins are not supported and can be
> removed).  But I guess the primary question is on Uros, do we
> want to handle this in the *intrin.h headers and thus increase the size
> of those (and their parsing time etc.), or do we want to handle this
> in the target folders (tree as well as gimple one), where we'd convert
> e.g. __builtin_ia32_psllwi256 to the shift if the shift count is constant.
> 
Ok. I will await what you decide.

Btw. I thought of an alternative idea: make a new set of built-ins, called for 
instance __builtin_lshift and __builtin_rshift, that translate simply to 
GIMPLE shifts, just like cpp_shifts currently does, the only difference being 
that the new shifts (unlike C/C++ shifts) are defined for all shift sizes and 
on negative values.  With this, variable shift intrinsics can also be written 
without builtins.  Doing this would require making a whole set of them for all 
integer types, though, so it would need to be implemented in the c-parser like 
__builtin_shuffle, and not with the other generic builtins.

Would that make sense?

Best regards
Allan
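To make the "defined for all shift sizes" idea concrete, here is a rough sketch in plain C of what such a shift could compute for one 16-bit lane.  The names lshift16_total and rshift16_total are hypothetical and are not the proposed builtins.  Returning zero for an out-of-range count is an assumption borrowed from the (unsigned int)__B < 16 test in the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: logical left shift of a 16-bit lane that is
   defined for every count.  Counts of 16 or more yield 0; negative
   counts become large once converted to unsigned, so they yield 0 too.  */
uint16_t
lshift16_total (uint16_t x, int count)
{
  if ((unsigned int) count >= 16)
    return 0;
  /* Shift in a wider unsigned type, then truncate back to 16 bits.  */
  return (uint16_t) ((uint32_t) x << count);
}

/* Hypothetical helper: logical right shift with the same total
   behaviour for out-of-range counts.  */
uint16_t
rshift16_total (uint16_t x, int count)
{
  if ((unsigned int) count >= 16)
    return 0;
  return (uint16_t) (x >> count);
}
```

The range test comes first so that the C shift operator is only ever reached with a count it is defined for.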



[PATCH] Fix PR80549

2017-05-02 Thread Richard Biener

I have tested the following patch to fix another case where threading
plus CFG cleanup messes up loops, identifying a previous loop with
a new one and thus messing up nb_iterations_upper_bound.  I believe
preserving existing loops as much as possible is desirable (and
CFG cleanup goes a long way protecting loop headers).

Thus the following patch makes it impossible that CFG cleanup
accidentally makes a loop header the header of a larger/different
loop by ensuring we only have a single entry edge (aka preheaders).

While it would be nice to have this as a CFG property throughout,
threading breaks it easily, and until we fix that (and maybe other
places) the simpler fix is to make CFG cleanup re-instantiate
preheaders where required.
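The decision of when a preheader has to be re-created can be sketched outside of GCC as follows.  This is only an illustration: struct pred_edge and needs_forwarder_p are hypothetical, and the two flags stand in for the EDGE_ABNORMAL test and the dominated_by_p query used in the patch below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one predecessor edge of a loop header.  */
struct pred_edge
{
  bool abnormal;              /* stands in for e->flags & EDGE_ABNORMAL */
  bool src_dominated_by_hdr;  /* stands in for dominated_by_p (...) */
};

/* Return true if the header still has a latch, has no abnormal
   predecessor, and is entered through more than one non-latch edge, so
   that a forwarder block (preheader) is needed.  */
bool
needs_forwarder_p (const struct pred_edge *preds, size_t n)
{
  bool found_latch = false;
  size_t entries = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* Abnormal edges cannot be redirected, so give up.  */
      if (preds[i].abnormal)
        return false;
      /* An edge whose source is dominated by the header is a latch.  */
      if (preds[i].src_dominated_by_hdr)
        found_latch = true;
      else
        entries++;
    }
  return found_latch && entries > 1;
}
```

A header with one latch and two entry edges needs the forwarder; a header with one latch and a single entry already has its preheader.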

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

As this is a regression I plan to backport this so comments are still
welcome.

Thanks,
Richard.

2017-04-28  Richard Biener  

PR tree-optimization/80549
* tree-cfgcleanup.c (mfb_keep_latches): New helper.
(cleanup_tree_cfg_noloop): Create forwarders to known loop
headers if they do not have a preheader.

* gcc.dg/torture/pr80549.c: New testcase.

Index: gcc/tree-cfgcleanup.c
===
*** gcc/tree-cfgcleanup.c   (revision 247368)
--- gcc/tree-cfgcleanup.c   (working copy)
*** cleanup_tree_cfg_1 (void)
*** 739,744 
--- 739,749 
return retval;
  }
  
+ static bool
+ mfb_keep_latches (edge e)
+ {
+   return ! dominated_by_p (CDI_DOMINATORS, e->src, e->dest);
+ }
  
  /* Remove unreachable blocks and other miscellaneous clean up work.
 Return true if the flowgraph was modified, false otherwise.  */
*** cleanup_tree_cfg_noloop (void)
*** 766,771 
--- 771,834 
changed = false;
  }
  
+   /* Ensure that we have single entries into loop headers.  Otherwise
+  if one of the entries is becoming a latch due to CFG cleanup
+  (from formerly being part of an irreducible region) then we mess
+  up loop fixup and associate the old loop with a different region
+  which makes niter upper bounds invalid.  See for example PR80549.
+  This needs to be done before we remove trivially dead edges as
+  we need to capture the dominance state before the pending transform.  */
+   if (current_loops)
+ {
+   loop_p loop;
+   unsigned i;
+   FOR_EACH_VEC_ELT (*get_loops (cfun), i, loop)
+   if (loop && loop->header)
+ {
+   basic_block bb = loop->header;
+   edge_iterator ei;
+   edge e;
+   bool found_latch = false;
+   bool any_abnormal = false;
+   unsigned n = 0;
+   /* We are only interested in preserving existing loops, but
+  we need to check whether they are still real and of course
+  if we need to add a preheader at all.  */
+   FOR_EACH_EDGE (e, ei, bb->preds)
+ {
+   if (e->flags & EDGE_ABNORMAL)
+ {
+   any_abnormal = true;
+   break;
+ }
+   if (dominated_by_p (CDI_DOMINATORS, e->src, bb))
+ {
+   found_latch = true;
+   continue;
+ }
+   n++;
+ }
+   /* If we have more than one entry to the loop header
+  create a forwarder.  */
+   if (found_latch && ! any_abnormal && n > 1)
+ {
+   edge fallthru = make_forwarder_block (bb, mfb_keep_latches,
+ NULL);
+   loop->header = fallthru->dest;
+   if (! loops_state_satisfies_p (LOOPS_NEED_FIXUP))
+ {
+   /* The loop updating from the CFG hook is incomplete
+  when we have multiple latches, fixup manually.  */
+   remove_bb_from_loops (fallthru->src);
+   loop_p cloop = loop;
+   FOR_EACH_EDGE (e, ei, fallthru->src->preds)
+ cloop = find_common_loop (cloop, e->src->loop_father);
+   add_bb_to_loop (fallthru->src, cloop);
+ }
+ }
+ }
+ }
+ 
changed |= cleanup_tree_cfg_1 ();
  
gcc_assert (dom_info_available_p (CDI_DOMINATORS));
Index: gcc/testsuite/gcc.dg/torture/pr80549.c
===
*** gcc/testsuite/gcc.dg/torture/pr80549.c  (nonexistent)
--- gcc/testsuite/gcc.dg/torture/pr80549.c  (working copy)
***
*** 0 
--- 1,33 
+ /* { dg-do run } */
+ 
+ signed char a, b;
+ int c;
+ short d;
+ void fn1(int p1)
+ {
+   short e = 4;
+   int f;
+   d = 0;
+   for (; d <= 0; d++)
+ e = 0;
+   if (e)
+ goto L1;
+ L2:
+   if (p1) {
+   a = 9;
+   for (; a; ++a) {
+ f = 5;
+ for (; f != 

[PATCH, alpha]: Merge some *_ieee insn patterns with their base patterns

2017-05-02 Thread Uros Bizjak
2017-05-02  Uros Bizjak  

* config/alpha/alpha.md (*add3_ieee): Merge to add3
using enabled attribute.
(*sub3_ieee): Merge to sub3 using enabled attribute.
(*mul3_ieee): Merge to mul3 using enabled attribute.
(*div3_ieee): Merge to div3 using enabled attribute.
(*sqrt2_ieee): Merge to sqrt2 using enabled attribute.
(*fix_truncdfdi_ieee): Merge to *fix_truncdfdi2 using enabled attribute.
(*fix_truncsfdi_ieee): Merge to *fix_truncsfdi2 using enabled attribute.
(*floatdisf_ieee): Merge to floatdisf2 using enabled attribute.
(*floatdidf_ieee): Merge to floatdidf2 using enabled attribute.
(*truncdfsf2_ieee): Merge to truncdfsf2 using enabled attribute.
(*cmpdf_ieee): Merge to *cmpdf_internal using enabled attribute.

Bootstrapped and regression tested on alphaev68-linux-gnu.

Committed to mainline SVN.

Uros.
Index: config/alpha/alpha.md
===
--- config/alpha/alpha.md   (revision 247428)
+++ config/alpha/alpha.md   (working copy)
@@ -1671,27 +1671,21 @@
   "cpysn %R2,%R1,%0"
   [(set_attr "type" "fadd")])
 
-(define_insn "*add3_ieee"
-  [(set (match_operand:FMODE 0 "register_operand" "=&f")
-   (plus:FMODE (match_operand:FMODE 1 "reg_or_0_operand" "%fG")
-   (match_operand:FMODE 2 "reg_or_0_operand" "fG")))]
-  "TARGET_FP && alpha_fptm >= ALPHA_FPTM_SU"
-  "add%/ %R1,%R2,%0"
-  [(set_attr "type" "fadd")
-   (set_attr "trap" "yes")
-   (set_attr "round_suffix" "normal")
-   (set_attr "trap_suffix" "u_su_sui")])
-
 (define_insn "add3"
-  [(set (match_operand:FMODE 0 "register_operand" "=f")
-   (plus:FMODE (match_operand:FMODE 1 "reg_or_0_operand" "%fG")
-   (match_operand:FMODE 2 "reg_or_0_operand" "fG")))]
+  [(set (match_operand:FMODE 0 "register_operand" "=f,&f")
+   (plus:FMODE (match_operand:FMODE 1 "reg_or_0_operand" "%fG,fG")
+   (match_operand:FMODE 2 "reg_or_0_operand" "fG,fG")))]
   "TARGET_FP"
   "add%/ %R1,%R2,%0"
   [(set_attr "type" "fadd")
(set_attr "trap" "yes")
(set_attr "round_suffix" "normal")
-   (set_attr "trap_suffix" "u_su_sui")])
+   (set_attr "trap_suffix" "u_su_sui")
+   (set (attr "enabled")
+ (cond [(eq_attr "alternative" "0")
+ (symbol_ref "alpha_fptm < ALPHA_FPTM_SU")
+  ]
+  (symbol_ref "true")))])
 
 (define_insn "*adddf_ext1"
   [(set (match_operand:DF 0 "register_operand" "=f")
@@ -1725,27 +1719,21 @@
   "TARGET_HAS_XFLOATING_LIBS"
   "alpha_emit_xfloating_arith (PLUS, operands); DONE;")
 
-(define_insn "*sub3_ieee"
-  [(set (match_operand:FMODE 0 "register_operand" "=&f")
-   (minus:FMODE (match_operand:FMODE 1 "reg_or_0_operand" "fG")
-(match_operand:FMODE 2 "reg_or_0_operand" "fG")))]
-  "TARGET_FP && alpha_fptm >= ALPHA_FPTM_SU"
-  "sub%/ %R1,%R2,%0"
-  [(set_attr "type" "fadd")
-   (set_attr "trap" "yes")
-   (set_attr "round_suffix" "normal")
-   (set_attr "trap_suffix" "u_su_sui")])
-
 (define_insn "sub3"
-  [(set (match_operand:FMODE 0 "register_operand" "=f")
-   (minus:FMODE (match_operand:FMODE 1 "reg_or_0_operand" "fG")
-(match_operand:FMODE 2 "reg_or_0_operand" "fG")))]
+  [(set (match_operand:FMODE 0 "register_operand" "=f,&f")
+   (minus:FMODE (match_operand:FMODE 1 "reg_or_0_operand" "fG,fG")
+(match_operand:FMODE 2 "reg_or_0_operand" "fG,fG")))]
   "TARGET_FP"
   "sub%/ %R1,%R2,%0"
   [(set_attr "type" "fadd")
(set_attr "trap" "yes")
(set_attr "round_suffix" "normal")
-   (set_attr "trap_suffix" "u_su_sui")])
+   (set_attr "trap_suffix" "u_su_sui")
+   (set (attr "enabled")
+ (cond [(eq_attr "alternative" "0")
+ (symbol_ref "alpha_fptm < ALPHA_FPTM_SU")
+  ]
+  (symbol_ref "true")))])
 
 (define_insn "*subdf_ext1"
   [(set (match_operand:DF 0 "register_operand" "=f")
@@ -1791,27 +1779,21 @@
   "TARGET_HAS_XFLOATING_LIBS"
   "alpha_emit_xfloating_arith (MINUS, operands); DONE;")
 
-(define_insn "*mul3_ieee"
-  [(set (match_operand:FMODE 0 "register_operand" "=&f")
-   (mult:FMODE (match_operand:FMODE 1 "reg_or_0_operand" "%fG")
-   (match_operand:FMODE 2 "reg_or_0_operand" "fG")))]
-  "TARGET_FP && alpha_fptm >= ALPHA_FPTM_SU"
-  "mul%/ %R1,%R2,%0"
-  [(set_attr "type" "fmul")
-   (set_attr "trap" "yes")
-   (set_attr "round_suffix" "normal")
-   (set_attr "trap_suffix" "u_su_sui")])
-
 (define_insn "mul3"
-  [(set (match_operand:FMODE 0 "register_operand" "=f")
-   (mult:FMODE (match_operand:FMODE 1 "reg_or_0_operand" "%fG")
-   (match_operand:FMODE 2 "reg_or_0_operand" "fG")))]
+  [(set (match_operand:FMODE 0 "register_operand" "=f,&f")
+   (mult:FMODE (match_operand:FMODE 1 "reg_or_0_operand" "%fG,fG")
+   (match_operand:FMODE 2 "reg_or_0_operand" "fG,fG")))]
   "TARGET_FP"
   "mul%/ %R1,%R2,%0"
   [(set_attr "type" "fmul")
(set_attr "trap" "yes")
(set_attr "round_suf

[PATCH] Fix cross compiling to x86_64-w64-mingw32

2017-05-02 Thread Hugo Beauzée-Luyssen
This patch fixes cross compiling to x86_64-w64-mingw32.
See https://github.com/Alexpux/MINGW-packages/issues/1580 and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69506

My apologies if I missed something in the contributing/sending a patch
guidelines.

Regards,

Index: libstdc++-v3/config/os/mingw32-w64/os_defines.h
===
--- libstdc++-v3/config/os/mingw32-w64/os_defines.h (revision 247489)
+++ libstdc++-v3/config/os/mingw32-w64/os_defines.h (working copy)
@@ -76,6 +76,8 @@

 #ifdef __x86_64__
 #define _GLIBCXX_LLP64 1
+// See libstdc++/69506
+#define _GLIBCXX_USE_WEAK_REF 0
 #endif

 // Enable use of GetModuleHandleEx (requires Windows XP/2003) in


Re: [PATCH] Optimize in VRP loads from constant arrays (take 2)

2017-05-02 Thread Jakub Jelinek
On Tue, May 02, 2017 at 01:13:19PM +0200, Richard Biener wrote:
> On Sat, 29 Apr 2017, Jakub Jelinek wrote:
> 
> > On Fri, Apr 21, 2017 at 04:42:03PM +0200, Richard Biener wrote:
> > > On April 21, 2017 4:06:59 PM GMT+02:00, Jakub Jelinek  
> > > wrote:
> > > >This patch attempts to implement the optimization mentioned in
> > > >http://gcc.gnu.org/ml/gcc-patches/2017-02/msg00945.html
> > > >If we know a (relatively small) value range for ARRAY_REF index
> > > >in constant array load and the values in the array for all the indexes
> > > >in the array are the same (and constant), we can just replace the load
> > > >with the constant.
> > > 
> > > But this should be done during propagation (and thus can even produce a 
> > > range rather than just a constant).
> > 
> > True, I've changed the implementation to do that.
> > Except that it is only useful during propagation for integral or pointer
> > types, for anything else (and for pointers when not just doing NULL vs.
> > non-NULL vr) it is also performed during simplification (in that case
> > it requires operand_equal_p values).
> > 
> > The patch also newly computes range for the index and range from the array
> > bounds (taking into account arrays at struct end) and intersects them,
> > plus takes into account that some iterators might have a couple of zero LBS
> > bits (as computed by ccp).
> 
> Hmm, while I can see how this helps I'd rather not do this in this way.
> (see PR80533 and my followup with a testcase which shows
> array_at_struct_end_p is wrong)  Ideally we'd add asserts for array
> indices instead.  Thus
> 
>   i_3 = ASSERT_EXPR 
>   .. = a[i_3];
> 
> which of course needs adjustment to -Warray-bounds to look at the
> range of the original SSA name (and in loops even that might degrade...).
> 
> Was this necessary to get any reasonable results?

Yes, it is very common that you end up with a VR that has negative min, or
very large max, but if the array is in the middle of a struct, or it is a
toplevel non-common array variable etc., which is quite a common case, then
we still should be able to optimize it.
Adding extra ASSERT_EXPRs would work too, but wouldn't it be too expensive?
I mean there can be lots of array accesses with the same index, if we'd add
an ASSERT_EXPR for every case, VRP work could increase many times.  It is
true that we'd be able to optimize:
int foo [5] = { 1, 2, 3, 4, 5 };
void
bar (int i, int j)
{
  if (j > 10)
    {
      foo[i] = 10;
      if (i < 0 || i >= 5)
        link_error ();
    }
  foo[i]++;
}

If array_at_struct_end_p is wrong, it should be fixed ;)

> Still given the high cost of get_array_ctor_element_at_index which
> does linear searching I'd add a few early-outs like
> 
>   base = get_base_address (rhs);
>   ctor = ctor_for_folding (base);
>   if (! ctor)
> return NULL_TREE;

This one I can add easily.

> I'd restructure the patch quite different, using for_each_index on the
> ref gather an array of index pointers (bail out on sth unhandled).
> Then I'd see if I have interesting ranges for them, if not, bail out.
> Also compute the size product of all ranges and test that against
> PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS.  Then store the minimum range
> value in the index places (temporarily) and use get_base_ref_and_extent to
> get at the constant "starting" offset.  From there iterate using
> the remembered indices (remember the ref tree as well via for_each_index),
> directly adjusting the constant offset so you can feed
> fold_ctor_reference the constant offsets of all elements that need to
> be considered.  As optimization fold_ctor_reference would know how
> to start from the "last" offset (much refactoring would need to be
> done here given nested ctors and multiple indices I guess).

But for this, don't you want to take it over?
I agree that the current implementation is not very efficient and that is
why it is also limited to that small number of iterations.
As many cases just aren't able to use the valueize callback, handling
arbitrary numbers of non-constant indexes would be harder.

Jakub
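For reference, the optimization under discussion can be sketched in plain C, outside of VRP.  fold_const_array_load is a hypothetical helper, not GCC code.  It assumes a one-dimensional int array whose bounds can be trusted (so not a trailing array in a struct) and an index range idx_min..idx_max that is already known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: if every element reachable through an index in
   [idx_min, idx_max], intersected with the array bounds, has the same
   value, store that value in *value and return true.  */
bool
fold_const_array_load (const int *elts, size_t nelts,
                       long idx_min, long idx_max, int *value)
{
  assert (nelts > 0);
  /* Intersect the index range with the array bounds [0, nelts - 1].  */
  long lo = idx_min < 0 ? 0 : idx_min;
  long hi = idx_max > (long) nelts - 1 ? (long) nelts - 1 : idx_max;
  if (lo > hi)
    return false;
  /* Linear scan; this is the cost that the iteration limit bounds.  */
  for (long i = lo + 1; i <= hi; i++)
    if (elts[i] != elts[lo])
      return false;
  *value = elts[lo];
  return true;
}
```

For the array { 7, 7, 7, 3 } an index range of -5..2 folds to 7, while 1..3 does not fold because the elements differ.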


Re: [PATCH] Improve switchconv optimization (PR tree-optimization/79472)

2017-05-02 Thread Jakub Jelinek
On Tue, May 02, 2017 at 01:16:05PM +0200, Richard Biener wrote:
> On Sat, 29 Apr 2017, Jakub Jelinek wrote:
> 
> > On Wed, Feb 15, 2017 at 12:51:30PM +0100, Jakub Jelinek wrote:
> > > On Wed, Feb 15, 2017 at 12:46:44PM +0100, Richard Biener wrote:
> > > > >> Possibly, but for GCC 8.
> > > > > 
> > > > > To both this switchconv patch and the potential improvement for 
> > > > > loading
> > > > > from const arrays (can create an enhancement PR for that), or just the
> > > > > latter?
> > > > 
> > > > Both I think - the patch is pretty big.
> > > 
> > > Ok, I'll queue the patch for GCC8 then.
> > 
> > If the tree-vrp.c change makes it in, is this patch ok for trunk too now
> > that we are in stage1?  Bootstrapped/regtested on x86_64-linux and
> > i686-linux on top of the tree-vrp.c change (without the tree-vrp.c change
> > vrp40.c regresses).
> 
> Ok.

So with XFAILing vrp40.c for now until the constant load opt is resolved?

Jakub


Re: [PATCH] Improve switchconv optimization (PR tree-optimization/79472)

2017-05-02 Thread Richard Biener
On Tue, 2 May 2017, Jakub Jelinek wrote:

> On Tue, May 02, 2017 at 01:16:05PM +0200, Richard Biener wrote:
> > On Sat, 29 Apr 2017, Jakub Jelinek wrote:
> > 
> > > On Wed, Feb 15, 2017 at 12:51:30PM +0100, Jakub Jelinek wrote:
> > > > On Wed, Feb 15, 2017 at 12:46:44PM +0100, Richard Biener wrote:
> > > > > >> Possibly, but for GCC 8.
> > > > > > 
> > > > > > To both this switchconv patch and the potential improvement for 
> > > > > > loading
> > > > > > from const arrays (can create an enhancement PR for that), or just 
> > > > > > the
> > > > > > latter?
> > > > > 
> > > > > Both I think - the patch is pretty big.
> > > > 
> > > > Ok, I'll queue the patch for GCC8 then.
> > > 
> > > If the tree-vrp.c change makes it in, is this patch ok for trunk too now
> > > that we are in stage1?  Bootstrapped/regtested on x86_64-linux and
> > > i686-linux on top of the tree-vrp.c change (without the tree-vrp.c change
> > > vrp40.c regresses).
> > 
> > Ok.
> 
> So with XFAILing vrp40.c for now until the constant load opt is resolved?

Yes.  Maybe instead add vrp40-2.c XFAILed and leave vrp40.c with
switchconv disabled?

Richard.


Re: [PATCH] Optimize in VRP loads from constant arrays (take 2)

2017-05-02 Thread Richard Biener
On Tue, 2 May 2017, Jakub Jelinek wrote:

> On Tue, May 02, 2017 at 01:13:19PM +0200, Richard Biener wrote:
> > On Sat, 29 Apr 2017, Jakub Jelinek wrote:
> > 
> > > On Fri, Apr 21, 2017 at 04:42:03PM +0200, Richard Biener wrote:
> > > > On April 21, 2017 4:06:59 PM GMT+02:00, Jakub Jelinek 
> > > >  wrote:
> > > > >This patch attempts to implement the optimization mentioned in
> > > > >http://gcc.gnu.org/ml/gcc-patches/2017-02/msg00945.html
> > > > >If we know a (relatively small) value range for ARRAY_REF index
> > > > >in constant array load and the values in the array for all the indexes
> > > > >in the array are the same (and constant), we can just replace the load
> > > > >with the constant.
> > > > 
> > > > But this should be done during propagation (and thus can even produce a 
> > > > range rather than just a constant).
> > > 
> > > True, I've changed the implementation to do that.
> > > Except that it is only useful during propagation for integral or pointer
> > > types, for anything else (and for pointers when not just doing NULL vs.
> > > non-NULL vr) it is also performed during simplification (in that case
> > > it requires operand_equal_p values).
> > > 
> > > The patch also newly computes range for the index and range from the array
> > > bounds (taking into account arrays at struct end) and intersects them,
> > > plus takes into account that some iterators might have a couple of zero 
> > > LBS
> > > bits (as computed by ccp).
> > 
> > Hmm, while I can see how this helps I'd rather not do this in this way.
> > (see PR80533 and my followup with a testcase which shows
> > array_at_struct_end_p is wrong)  Ideally we'd add asserts for array
> > indices instead.  Thus
> > 
> >   i_3 = ASSERT_EXPR 
> >   .. = a[i_3];
> > 
> > which of course needs adjustment to -Warray-bounds to look at the
> > range of the original SSA name (and in loops even that might degrade...).
> > 
> > Was this necessary to get any reasonable results?
> 
> Yes, it is very common that you end up with a VR that has negative min, or
> very large max, but if the array is in the middle of a struct, or it is a
> toplevel non-common array variable etc., which is quite a common case, then
> we still should be able to optimize it.
> Adding extra ASSERT_EXPRs would work too, but wouldn't it be too expensive?
> I mean there can be lots of array accesses with the same index, if we'd add
> an ASSERT_EXPR for every case, VRP work could increase many times.  It is
> true that we'd be able to optimize:
> int foo [5] = { 1, 2, 3, 4, 5 };
> void
> bar (int i, int j)
> {
>   if (j > 10)
>     {
>       foo[i] = 10;
>       if (i < 0 || i >= 5)
>         link_error ();
>     }
>   foo[i]++;
> }

I don't think it's too expensive.  Yes, you get one additional assert
(and SSA name) per array index.  My main gripe would be the 
effect on -Warray-bounds...

> 
> If array_at_struct_end_p is wrong, it should be fixed ;)

Indeed.  It was originally meant to say false if you can trust
TYPE_DOMAIN of the array but now it says false if there's some means
to constrain the array size (the DECL_P path and now your STRING_CST
one).  But all callers afterwards just look at TYPE_DOMAIN ...

> > Still given the high cost of get_array_ctor_element_at_index which
> > does linear searching I'd add a few early-outs like
> > 
> >   base = get_base_address (rhs);
> >   ctor = ctor_for_folding (base);
> >   if (! ctor)
> > return NULL_TREE;
> 
> This one I can add easily.
> 
> > I'd restructure the patch quite different, using for_each_index on the
> > ref gather an array of index pointers (bail out on sth unhandled).
> > Then I'd see if I have interesting ranges for them, if not, bail out.
> > Also compute the size product of all ranges and test that against
> > PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS.  Then store the minimum range
> > value in the index places (temporarily) and use get_base_ref_and_extent to
> > get at the constant "starting" offset.  From there iterate using
> > the remembered indices (remember the ref tree as well via for_each_index),
> > directly adjusting the constant offset so you can feed
> > fold_ctor_reference the constant offsets of all elements that need to
> > be considered.  As optimization fold_ctor_reference would know how
> > to start from the "last" offset (much refactoring would need to be
> > done here given nested ctors and multiple indices I guess).
> 
> But for this, don't you want to take it over?

I can try.  Is there a PR for this?

> I agree that the current implementation is not very efficient and that is
> why it is also limited to that small number of iterations.
> As many cases just aren't able to use the valueize callback, handling
> arbitrary numbers of non-constant indexes would be harder.

Sure.  I'd have expected you simply handle ARRAY_REF of a VAR_DECL
and nothing else ;)

Richard.


Re: [PATCH][AARCH64]Simplify call, call_value, sibcall, sibcall_value patterns.

2017-05-02 Thread Richard Earnshaw (lists)
On 01/12/16 15:39, Renlin Li wrote:
> Hi all,
> 
> This patch refactors the code used in call, call_value, sibcall,
> sibcall_value expanders.
> 
> Before the change, the logic is as follows:
> 
> call expander          --> call_internal          --> call_reg/call_symbol
> call_value expander    --> call_value_internal    --> call_value_reg/call_value_symbol
> 
> sibcall expander       --> sibcall_internal       --> sibcall_insn
> sibcall_value expander --> sibcall_value_internal --> sibcall_value_insn
> 
> After the change, the logic is simplified into:
> 
> call expander          --> aarch64_expand_call() --> call_insn
> call_value expander    --> aarch64_expand_call() --> call_value_insn
> 
> sibcall expander       --> aarch64_expand_call() --> sibcall_insn
> sibcall_value expander --> aarch64_expand_call() --> sibcall_value_insn
> 
> The code is factored out from each expander into aarch64_expand_call ().
> 
> This also fixes the two issues Richard Henderson suggested in comment 8:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64971
> 
> aarch64-none-elf regression test Okay, aarch64-linux bootstrap Okay.
> Okay for trunk?
> 
> Regards,
> Renlin Li
> 
> 
> gcc/ChangeLog:
> 
> 2016-12-01  Renlin Li  
> 
> * config/aarch64/aarch64-protos.h (aarch64_expand_call): Declare.
> * config/aarch64/aarch64.c (aarch64_expand_call): Define.
> * config/aarch64/constraints.md (Usf): Add long call check.
> * config/aarch64/aarch64.md (call): Use aarch64_expand_call.
> (call_value): Likewise.
> (sibcall): Likewise.
> (sibcall_value): Likewise.
> (call_insn): New.
> (call_value_insn): New.
> (sibcall_insn): Update rtx pattern.
> (sibcall_value_insn): Likewise.
> (call_internal): Remove.
> (call_value_internal): Likewise.
> (sibcall_internal): Likewise.
> (sibcall_value_internal): Likewise.
> (call_reg): Likewise.
> (call_symbol): Likewise.
> (call_value_reg): Likewise.
> (call_value_symbol): Likewise.
> 
> 
> new.diff
> 
> 
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index 7f67f14..3a5babb 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -305,6 +305,7 @@ bool aarch64_const_vec_all_same_int_p (rtx, 
> HOST_WIDE_INT);
>  bool aarch64_constant_address_p (rtx);
>  bool aarch64_emit_approx_div (rtx, rtx, rtx);
>  bool aarch64_emit_approx_sqrt (rtx, rtx, bool);
> +void aarch64_expand_call (rtx, rtx, bool);
>  bool aarch64_expand_movmem (rtx *);
>  bool aarch64_float_const_zero_rtx_p (rtx);
>  bool aarch64_function_arg_regno_p (unsigned);
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 68a3380..c313cf5 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -4343,6 +4343,51 @@ aarch64_fixed_condition_code_regs (unsigned int *p1, 
> unsigned int *p2)
>return true;
>  }
>  
> +/* This function is used by the call expanders of the machine description.
> +   RESULT is the register in which the result is returned.  It's NULL for
> +   "call" and "sibcall".
> +   MEM is the location of the function call.
> +   SIBCALL indicates whether this function call is normal call or sibling 
> call.
> +   It will generate different pattern accordingly.  */
> +
> +void
> +aarch64_expand_call (rtx result, rtx mem, bool sibcall)
> +{
> +  rtx call, callee, tmp;
> +  rtvec vec;
> +  machine_mode mode;
> +
> +  gcc_assert (MEM_P (mem));
> +  callee = XEXP (mem, 0);
> +  mode = GET_MODE (callee);
> +  gcc_assert (mode == Pmode);
> +
> +  /* Decide if we should generate indirect calls by loading the
> + 64-bit address of the callee into a register before performing

Drop '64-bit'.  This code should also work for ILP32, where the
addresses are 32-bit.

> + the branch-and-link.  */
> +
> +  if (GET_CODE (callee) == SYMBOL_REF

Use SYMBOL_REF_P.

OK with those changes.

R.


> +  ? (aarch64_is_long_call_p (callee)
> +  || aarch64_is_noplt_call_p (callee))
> +  : !REG_P (callee))
> +  XEXP (mem, 0) = force_reg (mode, callee);
> +
> +  call = gen_rtx_CALL (VOIDmode, mem, const0_rtx);
> +
> +  if (result != NULL_RTX)
> +call = gen_rtx_SET (result, call);
> +
> +  if (sibcall)
> +tmp = ret_rtx;
> +  else
> +tmp = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, LR_REGNUM));
> +
> +  vec = gen_rtvec (2, call, tmp);
> +  call = gen_rtx_PARALLEL (VOIDmode, vec);
> +
> +  aarch64_emit_call_insn (call);
> +}
> +
>  /* Emit call insn with PAT and do aarch64-specific handling.  */
>  
>  void
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index bc6d8a2..5682686 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -718,12 +718,6 @@
>  ;; Subroutine calls and sibcalls
>  ;; ---
>  
> -(define_expand "call_internal"
> -  [(parallel [(call (match_operand 0 "memo

[PATCH][www] Document -fno-strict-overflow changes (GCC 8)

2017-05-02 Thread Richard Biener

Doing this early so I won't forget.  Wording can be improved.

Committed.

Richard.

2017-05-02  Richard Biener  

* changes.html (-fno-strict-overflow): Document changes.

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
retrieving revision 1.2
diff -u -r1.2 changes.html
--- changes.html12 Mar 2017 14:25:34 -  1.2
+++ changes.html2 May 2017 12:57:58 -
@@ -50,7 +50,11 @@
 
 C family
 
-  
+-fno-strict-overflow is now mapped to
+ -fwrapv and signed integer overflow is now undefined by
+ default at all optimization levels.  Using
+ -fsanitize=signed-integer-overflow is now the preferred
+ way to audit code, -Wstrict-overflow is deprecated.
 
 
 C++
@@ -150,11 +154,11 @@
 
 
 
-
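Since the note above says signed integer overflow is undefined by default, here is a small sketch of an addition check that does not rely on wrapping.  checked_add is a hypothetical helper name.  It only uses limits.h and never evaluates an overflowing signed expression, so it should behave the same with or without -fwrapv.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Return true and store a + b in *sum if the addition does not overflow;
   return false otherwise.  The range test runs before the addition, so
   no overflowing signed operation is ever performed.  */
bool
checked_add (int a, int b, int *sum)
{
  if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
    return false;
  *sum = a + b;
  return true;
}
```

Code that instead tests the result after the fact (for example a + b < a) depends on wrapping behaviour and is the kind of code -fsanitize=signed-integer-overflow is meant to flag.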


Re: [PATCH][x86] Add missing intrinsics for ADD[SD,SS] and SUB[SD,SS]

2017-05-02 Thread Uros Bizjak
On Tue, May 2, 2017 at 11:39 AM, Peryt, Sebastian
 wrote:
> Hi,
> Can you please commit it for me?

Done.

Uros.


Re: [PATCH] Implement a warning for bogus sizeof(pointer) / sizeof(pointer[0])

2017-05-02 Thread Bernd Edlinger
On 05/01/17 17:54, Jason Merrill wrote:
> On Fri, Apr 28, 2017 at 1:05 PM, Bernd Edlinger
>  wrote:
>> On 04/28/17 17:29, Martin Sebor wrote:
>>> On 04/28/2017 08:12 AM, Bernd Edlinger wrote:

>>>> Do you want me to change the %qT format strings to %T?
>>>
>>> Yes, with the surrounding %< and %> the nested directives should
>>> use the unquoted forms, otherwise the printer would end up quoting
>>> both the whole expression and the type operand.
>>>
>>> FWIW, to help avoid this mistake, I think this might be something
>>> for GCC -Wformat to warn on and the pretty-printer to detect (and
>>> ICE on).
>>>
>>
>> Ah, now I understand.  That's pretty advanced.
>>
>> Here is the modified patch with correct quoting of the expression.
>>
>> Bootstrap and reg-testing on x86_64-pc-linux-gnu.
>
>> * cp-gimplify.c (cp_fold): Implement the -Wsizeof_pointer_div warning.
>
> I think this warning belongs in cp_build_binary_op rather than cp_fold.
>

Done, as suggested.

Bootstrapped and regression-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?
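
For context, a small sketch of the pattern the warning is aimed at. This is an illustration only, not one of the new test cases; the function names are invented, and the pointer variant is deliberately written in two steps so that the sketch itself does not trip the new warning.

```c
#include <stddef.h>

/* Correct only when A is a real array, not a pointer.  */
#define ARRAY_LEN(a) (sizeof (a) / sizeof ((a)[0]))

/* With a true array the division yields the element count.  */
static size_t
count_from_array (void)
{
  int values[8] = { 0 };
  (void) values;
  return ARRAY_LEN (values);
}

/* The same arithmetic applied to a pointer: the result is the size of
   the pointer divided by the size of one element, which has nothing to
   do with how many elements P points to.  */
static size_t
count_from_pointer (const int *p)
{
  size_t pointer_bytes = sizeof (p);
  size_t element_bytes = sizeof (p[0]);
  return pointer_bytes / element_bytes;
}
```

Writing sizeof (p) / sizeof (p[0]) directly in count_from_pointer is the form the patch is meant to diagnose.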


Thanks
Bernd.
gcc:
2017-05-01  Bernd Edlinger  

* doc/invoke.texi: Document the -Wsizeof-pointer-div warning.

gcc/c-family:
2017-05-01  Bernd Edlinger  

* c.opt (Wsizeof-pointer-div): New warning option.

gcc/c:
2017-05-01  Bernd Edlinger  

* c-parser.c (c_parser_binary_expression): Implement the
-Wsizeof-pointer-div warning.
(c_parser_postfix_expression): Allow SIZEOF_EXPR as expr.original_code
from a parenthesized expression.
(c_parser_expr_list): Use c_last_sizeof_loc.
* c-tree.h (c_last_sizeof_loc): New external.
* c-typeck.c (c_last_sizeof_loc): New variable.
(c_expr_sizeof_expr, c_expr_sizeof_type): Assign c_last_sizeof_loc.


gcc/cp:
2017-05-01  Bernd Edlinger  

* typeck.c (cp_build_binary_op): Implement the -Wsizeof-pointer-div
warning.

gcc/testsuite:
2017-05-01  Bernd Edlinger  

* c-c++-common/Wsizeof-pointer-div.c: New test.
* gcc.dg/Wsizeof-pointer-memaccess1.c: Add test cases with parens.
* gcc.dg/torture/Wsizeof-pointer-memaccess1.c: Likewise.
* gcc.target/i386/sse-init-v4hi-1.c: Fix test case.
* gcc.target/i386/sse-init-v4sf-1.c: Likewise.
* gcc.target/i386/sse-set-ps-1.c: Likewise.
* gcc.target/i386/sse2-init-v16qi-1.c: Likewise.
* gcc.target/i386/sse2-init-v2di-1.c: Likewise.
* gcc.target/i386/sse2-init-v4si-1.c: Likewise.
* gcc.target/i386/sse2-init-v8hi-1.c: Likewise.
* gcc.target/i386/sse2-set-epi32-1.c: Likewise.
* gcc.target/i386/sse2-set-epi64x-1.c: Likewise.
* gcc.target/i386/sse4_1-init-v16qi-1.c: Likewise.
* gcc.target/i386/sse4_1-init-v2di-1.c: Likewise.
* gcc.target/i386/sse4_1-init-v4sf-1.c: Likewise.
* gcc.target/i386/sse4_1-init-v4si-1.c: Likewise.
* gcc.target/i386/sse4_1-set-epi32-1.c: Likewise.
* gcc.target/i386/sse4_1-set-epi64x-1.c: Likewise.
* gcc.target/i386/sse4_1-set-ps-1.c: Likewise.
* libgomp.c/pr39591-2.c: Likewise.
* libgomp.c/pr39591-3.c: Likewise.

Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c	(revision 247440)
+++ gcc/c/c-parser.c	(working copy)
@@ -6657,6 +6657,8 @@ c_parser_binary_expression (c_parser *parser, stru
 enum tree_code op;
 /* The source location of this operation.  */
 location_t loc;
+/* The sizeof argument if expr.original_code == SIZEOF_EXPR.  */
+tree sizeof_arg;
   } stack[NUM_PRECS];
   int sp;
   /* Location of the binary operator.  */
@@ -6673,6 +6675,31 @@ c_parser_binary_expression (c_parser *parser, stru
 	c_inhibit_evaluation_warnings -= (stack[sp - 1].expr.value	  \
 	  == truthvalue_true_node);	  \
 	break;  \
+  case TRUNC_DIV_EXPR: 		  \
+	if (stack[sp - 1].expr.original_code == SIZEOF_EXPR		  \
+	&& stack[sp].expr.original_code == SIZEOF_EXPR)		  \
+	  {  \
+	tree type0 = stack[sp - 1].sizeof_arg;			  \
+	tree type1 = stack[sp].sizeof_arg;  \
+	tree first_arg = type0;	  \
+	if (!TYPE_P (type0))	  \
+	  type0 = TREE_TYPE (type0);  \
+	if (!TYPE_P (type1))	  \
+	  type1 = TREE_TYPE (type1);  \
+	if (POINTER_TYPE_P (type0)	  \
+		&& comptypes (TREE_TYPE (type0), type1)			  \
+		&& !(TREE_CODE (first_arg) == PARM_DECL			  \
+		 && C_ARRAY_PARAMETER (first_arg)			  \
+		 && warn_sizeof_array_argument))			  \
+	  if (warning_at (stack[sp].loc, OPT_Wsizeof_pointer_div,	  \
+			  "division %<sizeof (%T) / sizeof (%T)%> does "	  \
+			  "not compute the number of array elements", \
+			  type0, type1))  \
+		if (DECL_P (first_arg))	  \
+		  inform (DECL_SOURCE_LOCATION (first_arg),		  \
+			  "first %<sizeof%> operand was declared here");  \
+	  }  \
+	break;  

[PATCH] Fix PR31130, remove strict-overflow handling from VRP

2017-05-02 Thread Richard Biener

The following patch removes (well, the patch only disables)
strict-overflow handling (and thus the emission of -Wstrict-overflow
diagnostics) from VRP.

I XFAILed three testcases (well, all three are really the same testcase),
removed one XFAIL and added a testcase for the missed VRP caused by
those INF(OVF) being "sticky".

I'm now re-bootstrapping and testing this on trunk (after I've committed
the -f[no-]strict-overflow removal).  The patch bootstrapped and
tested ok a week ago.

I'm not sure if I will commit the patch as-is or if I actually remove
(some) of the strict-overflow machinery itself in the initial patch.

Comments still appreciated.  And yes, it should be easily possible
to preserve the warning on the testcase -- even from within
the FE itself I guess but also from VRP (the IV is not SCEV
analyzable otherwise -Waggressive-loop-optimizations would warn).
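
As a side note on the XFAILed testcases: their loops only terminate because a signed induction variable overflows. A rough sketch of the same computation with well-defined arithmetic follows. It is not part of the patch, count_int_bits is an invented name, and it assumes unsigned int has at least one more value bit than int (true on the usual targets).

```c
#include <limits.h>

/* Counts the bits of int (sign bit included) by doubling an unsigned
   induction variable.  Unlike "for (i = 1; i > 0; i += i)" on a signed
   int, nothing here depends on signed overflow.  */
static int
count_int_bits (void)
{
  unsigned int i;
  int bits = 1;			/* start at 1 to account for the sign bit */

  for (i = 1; i <= (unsigned int) INT_MAX; i += i)
    ++bits;
  return bits;
}
```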

Richard.

2017-05-02  Richard Biener  

PR tree-optimization/31130
* tree-vrp.c (needs_overflow_infinity): Return always false.
(set_value_range_to_nonnegative): Properly guard
supports_overflow_infinity.

* gcc.dg/Wstrict-overflow-12.c: XFAIL.
* gcc.dg/Wstrict-overflow-13.c: Likewise.
* gcc.dg/Wstrict-overflow-21.c: Likewise.
* gcc.dg/pr52904.c: Remove XFAIL.
* gcc.dg/tree-ssa/vrp113.c: New testcase.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c.orig 2017-04-28 11:24:38.953886334 +0200
--- gcc/tree-vrp.c  2017-05-02 15:14:48.438675683 +0200
*** vrp_val_is_min (const_tree val)
*** 220,228 
 TYPE_{MIN,MAX}_VALUE.  */
  
  static inline bool
! needs_overflow_infinity (const_tree type)
  {
!   return INTEGRAL_TYPE_P (type) && !TYPE_OVERFLOW_WRAPS (type);
  }
  
  /* Return whether TYPE can support our overflow infinity
--- 220,228 
 TYPE_{MIN,MAX}_VALUE.  */
  
  static inline bool
! needs_overflow_infinity (const_tree)
  {
!   return false;
  }
  
  /* Return whether TYPE can support our overflow infinity
*** set_value_range_to_nonnegative (value_ra
*** 558,564 
  {
tree zero;
  
!   if (overflow_infinity && !supports_overflow_infinity (type))
  {
set_value_range_to_varying (vr);
return;
--- 558,566 
  {
tree zero;
  
!   if (overflow_infinity
!   && (needs_overflow_infinity (type)
! && !supports_overflow_infinity (type)))
  {
set_value_range_to_varying (vr);
return;
*** set_value_range_to_nonnegative (value_ra
*** 566,572 
  
zero = build_int_cst (type, 0);
set_value_range (vr, VR_RANGE, zero,
!  (overflow_infinity
? positive_overflow_infinity (type)
: TYPE_MAX_VALUE (type)),
   vr->equiv);
--- 568,574 
  
zero = build_int_cst (type, 0);
set_value_range (vr, VR_RANGE, zero,
!  (overflow_infinity && needs_overflow_infinity (type)
? positive_overflow_infinity (type)
: TYPE_MAX_VALUE (type)),
   vr->equiv);
Index: gcc/testsuite/gcc.dg/Wstrict-overflow-12.c
===
*** gcc/testsuite/gcc.dg/Wstrict-overflow-12.c.orig 2007-11-30 
13:59:34.0 +0100
--- gcc/testsuite/gcc.dg/Wstrict-overflow-12.c  2017-05-02 15:19:32.259012861 
+0200
*** int
*** 10,16 
  foo ()
  {
int i, bits;
!   for (i = 1, bits = 1; i > 0; i += i) /* { dg-warning "assuming signed 
overflow does not occur" "correct warning" } */
  ++bits;
return bits;
  }
--- 10,16 
  foo ()
  {
int i, bits;
!   for (i = 1, bits = 1; i > 0; i += i) /* { dg-warning "assuming signed 
overflow does not occur" "correct warning" { xfail *-*-* } } */
  ++bits;
return bits;
  }
Index: gcc/testsuite/gcc.dg/Wstrict-overflow-13.c
===
*** gcc/testsuite/gcc.dg/Wstrict-overflow-13.c.orig 2007-11-30 
13:59:35.0 +0100
--- gcc/testsuite/gcc.dg/Wstrict-overflow-13.c  2017-05-02 15:20:03.927497363 
+0200
*** int
*** 11,17 
  foo ()
  {
int j;
!   for (j = 1; 0 < j; j *= 2) /* { dg-warning "assuming signed overflow does 
not occur" "correct warning" } */
  if (! bigtime_test (j))
return 1;
return 0;
--- 11,17 
  foo ()
  {
int j;
!   for (j = 1; 0 < j; j *= 2) /* { dg-warning "assuming signed overflow does 
not occur" "correct warning" { xfail *-*-* } } */
  if (! bigtime_test (j))
return 1;
return 0;
Index: gcc/testsuite/gcc.dg/Wstrict-overflow-21.c
===
*** gcc/testsuite/gcc.dg/Wstrict-overflow-21.c.orig 2008-01-22 
15:53:57.0 +0100
--- gcc/testsuite/gcc.dg/Wstrict-overflow-21.c  2017-05-02 15:20:18.087714039 
+0200
*** int
*** 5,11 
  foo ()
  {
int i, bits;
! 

Re: [PATCH GCC8][14/33]Handle more cheap operations in force_expr_to_var_cost

2017-05-02 Thread Richard Biener
On Thu, Apr 27, 2017 at 5:45 PM, Bin.Cheng  wrote:
> On Thu, Apr 27, 2017 at 4:30 PM, Jeff Law  wrote:
>> On 04/26/2017 06:58 AM, Richard Biener wrote:
>>>
>>> On Tue, Apr 18, 2017 at 12:44 PM, Bin Cheng 
>>> wrote:

>>>> Hi, This patch handles more cheap cases in function
>>>> force_expr_to_var_cost, specifically, TRUNC_DIV_EXPR, BIT_AND_EXPR,
>>>> BIT_IOR_EXPR, RSHIFT_EXPR and BIT_NOT_EXPR.
>>>>
>>>> Is it OK?
>>>
>>>
>>> I wonder if using add_cost is good here.  TRUNC_DIV by power of two better
>>> matches shift_cost, no, or div_pow2_cheap?  Likewise for
>>> LSHIFT/RSHIFT.  We do have [us]div_cost as well btw. And we have
>>> neg_cost.
>>
>> In an ideal world, we'd have a canonical form and just handle the canonical
>> form.  But that hasn't ever really panned out for this kind of stuff in RTL
>> -- the decision about what is the preferred form of an expression changes
>> based on use context.
>>
>> I don't think these problems are as bad at the gimple level, but clearly
>> they still exist.
>>
>> The more we query the target, the less predictable the compiler's behavior
>> will be over time.   It was a huge problem in RTL leading us to a point
>> where it became exceedingly difficult to predict how a change in a pass
>> would ultimately affect the performance across targets.
> Yeah, this is what I had in mind, but not expressed as clearly as this.
> The patch's intention is to differentiate between expensive (div) and
> cheap operations.

Ok.  Let's go with the patch as-is then and if possible change the code
to use estimate_num_insns instead (but I guess we may end up comparing
the cost to the address costs we compute more exactly?)
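
To make the earlier remark about TRUNC_DIV by a power of two matching shift_cost concrete, here is a source-level sketch. It is an illustration only and covers just the unsigned case; signed truncating division needs an extra rounding adjustment. udiv_pow2 is a made-up name.

```c
#include <assert.h>
#include <limits.h>

/* Unsigned division by 2**log2_divisor is exactly a logical right
   shift, which is why costing it like a shift rather than like a
   general division is reasonable.  */
static unsigned int
udiv_pow2 (unsigned int x, unsigned int log2_divisor)
{
  /* Shifting by the full width or more would be undefined.  */
  assert (log2_divisor < sizeof (unsigned int) * CHAR_BIT);
  return x >> log2_divisor;
}
```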

Richard.

> Thanks,
> bin
>>
>> That led to a guiding principle that we want to avoid querying the target in
>> gimple as much as possible.  We've relaxed that somewhat (we have to be
>> pragmatic), but we need to be real careful here.
>>
>> So my recommendation would be to define a set of costs for gimple and get
>> those as solid as we can given an "ideal" target.  Only query the target for
>> cases where it's critical.
>>
>>
>> Jeff


[PATCH] canonical type hashing

2017-05-02 Thread Nathan Sidwell
On the modules branch, I need to rematerialize canonical types and the like
from a read-in serialized tree.


I discovered the canonical-type hash table is fed a bespoke hash value
by each type creator.  There was no generic type hasher :( The rationale
appears to be to allow each type constructor to just specialize for its
needs.  Excitingly, a generic type hasher is hiding inside
build_type_attribute_qual_variant.  So I broke it out of there and
generalized it a bit more.


The type hashers had diverged from the attribute-variant hasher.  This
is not an error, because the attribute variant version was creating
variants with non-null attributes, so they were guaranteed to be
different.  But it was confusing.


This generic hasher is slightly different to the bespoke hashers in a
few places. One place of note was the vector type hasher, which mixed
{elt-type, num-elts, vector-mode}, but vector-mode is entirely
determined by the first two objects, so mixing it in doesn't add any
entropy.  I dropped the mode.


I still kept generating the hashvalue separate from the type_hash_canon
call itself.  Perhaps a future patch could change that, but I didn't
want too much churn in this patch.


I've included Jakub's recent TYPE_TYPELESS_STORAGE changes. (And notice 
that the attribute-type hasher wasn't dealing with it.)
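
To illustrate the point about the vector mode, here is a toy sketch. It is not GCC's inchash API; the struct, the mixing function and the constants are stand-ins chosen for illustration. A field that is fully determined by the fields already mixed cannot make two otherwise-equal keys hash differently, so it can be left out.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a vector type.  A "mode" field
   would be a pure function of the two fields below, so it is omitted.  */
struct toy_vector_type
{
  uint32_t element_type_hash;
  uint32_t num_elements;
};

/* FNV-style mixing step.  Multiplying by an odd constant is invertible
   modulo 2**32, so two states that differ before a step still differ
   after it.  */
static uint32_t
toy_mix (uint32_t state, uint32_t value)
{
  return (state ^ value) * 16777619u;
}

/* Hash only the independent fields.  */
static uint32_t
toy_type_hash (const struct toy_vector_type *t)
{
  uint32_t h = 2166136261u;

  h = toy_mix (h, t->element_type_hash);
  h = toy_mix (h, t->num_elements);
  return h;
}
```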


booted and tested on x86_64-linux-gnu, ok?

nathan
--
Nathan Sidwell
2017-05-02  Nathan Sidwell  

	Canonicalize canonical type hashing
	gcc/
	* tree.h (type_hash_default): Declare.
	* tree.c (type_hash_list, attribute_hash_list): Move into
	type_hash_default.
	(build_type_attribute_qual_variant): Break out hash code calc into
	type_hash_default.
	(type_hash_default): New.  Generic type hash computation.
	(build_range_type_1, build_array_type_1, build_function_type,
	build_method_type_directly, build_offset_type, build_complex_type,
	make_vector_type): Call it.
	gcc/c-family/
	* c-common.c (complete_array_type): Use type_hash_default.

Index: c-family/c-common.c
===
--- c-family/c-common.c	(revision 247485)
+++ c-family/c-common.c	(working copy)
@@ -6368,12 +6368,8 @@ complete_array_type (tree *ptype, tree i
   layout_type (main_type);
 
   /* Make sure we have the canonical MAIN_TYPE. */
-  inchash::hash hstate;
-  hstate.add_object (TYPE_HASH (unqual_elt));
-  hstate.add_object (TYPE_HASH (TYPE_DOMAIN (main_type)));
-  if (!AGGREGATE_TYPE_P (unqual_elt))
-hstate.add_flag (TYPE_TYPELESS_STORAGE (main_type));
-  main_type = type_hash_canon (hstate.end (), main_type);
+  hashval_t hashcode = type_hash_default (main_type);
+  main_type = type_hash_canon (hashcode, main_type);
 
   /* Fix the canonical type.  */
   if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (main_type))
Index: tree.c
===
--- tree.c	(revision 247485)
+++ tree.c	(working copy)
@@ -248,8 +248,6 @@ static void set_type_quals (tree, int);
 static void print_type_hash_statistics (void);
 static void print_debug_expr_statistics (void);
 static void print_value_expr_statistics (void);
-static void type_hash_list (const_tree, inchash::hash &);
-static void attribute_hash_list (const_tree, inchash::hash &);
 
 tree global_trees[TI_MAX];
 tree integer_types[itk_none];
@@ -4828,11 +4826,7 @@ build_type_attribute_qual_variant (tree
 {
   if (! attribute_list_equal (TYPE_ATTRIBUTES (ttype), attribute))
 {
-  inchash::hash hstate;
   tree ntype;
-  int i;
-  tree t;
-  enum tree_code code = TREE_CODE (ttype);
 
   /* Building a distinct copy of a tagged type is inappropriate; it
 	 causes breakage in code that expects there to be a one-to-one
@@ -4856,37 +4850,8 @@ build_type_attribute_qual_variant (tree
 
   TYPE_ATTRIBUTES (ntype) = attribute;
 
-  hstate.add_int (code);
-  if (TREE_TYPE (ntype))
-	hstate.add_object (TYPE_HASH (TREE_TYPE (ntype)));
-  attribute_hash_list (attribute, hstate);
-
-  switch (TREE_CODE (ntype))
-	{
-	case FUNCTION_TYPE:
-	  type_hash_list (TYPE_ARG_TYPES (ntype), hstate);
-	  break;
-	case ARRAY_TYPE:
-	  if (TYPE_DOMAIN (ntype))
-	hstate.add_object (TYPE_HASH (TYPE_DOMAIN (ntype)));
-	  break;
-	case INTEGER_TYPE:
-	  t = TYPE_MAX_VALUE (ntype);
-	  for (i = 0; i < TREE_INT_CST_NUNITS (t); i++)
-	hstate.add_object (TREE_INT_CST_ELT (t, i));
-	  break;
-	case REAL_TYPE:
-	case FIXED_POINT_TYPE:
-	  {
-	unsigned int precision = TYPE_PRECISION (ntype);
-	hstate.add_object (precision);
-	  }
-	  break;
-	default:
-	  break;
-	}
-
-  ntype = type_hash_canon (hstate.end(), ntype);
+  hashval_t hash = type_hash_default (ntype);
+  ntype = type_hash_canon (hash, ntype);
 
   /* If the target-dependent attributes make NTYPE different from
 	 its canonical type, we will need to use structural equality
@@ -6994,18 +6959,80 @@ decl_debug_args_insert (tree from)
 /* Hashing of types so that we don't make duplicates.
The entry point is `type

[wwwdocs] grammar & style fixes for /gcc-7/changes.html

2017-05-02 Thread Jonathan Wakely

This is the result of proofreading the release notes for GCC 7. Some
are obvious fixes for simple typos, but I've also tried to improve the
clarity of some text. Please take a look and let me know if you
disagree with any changes.


Index: htdocs/gcc-7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
retrieving revision 1.80
diff -u -r1.80 changes.html
--- htdocs/gcc-7/changes.html	1 May 2017 17:51:14 -	1.80
+++ htdocs/gcc-7/changes.html	2 May 2017 13:51:58 -
@@ -58,8 +58,8 @@
   A new code hoisting optimization has been added to the partial
   redundancy elimination pass.  It attempts to move evaluation of
   expressions executed on all paths to the function exit as early as
-  possible, which helps primarily for code size, but can be useful for
-  speed of generated code as well.  It is enabled by the
+  possible. This primarily helps improve code size, but can improve the
+  speed of the generated code as well.  It is enabled by the
   -fcode-hoisting option and at the -O2
   optimization level or higher (and -Os).
 
@@ -75,18 +75,18 @@
   ignored.
 
   A new interprocedural value range propagation optimization has been
-  added, which propagates integral ranges that variable values can be proven
-  to be within across the call graph.  It is enabled by the
-  -fipa-vrp option and at the -O2 optimization
-  level and higher (and -Os).
-
-  A new loop splitting optimization pass has been added.  It splits
-  certain loops if they contain a condition that is always true on one
-  side of the iteration space and always false on the other into two
-  loops where each of the new two loops iterates just on one of the sides
-  of the iteration space and the condition does not need to be checked
-  inside of the loop.  It is enabled by the -fsplit-loops
-  option and at the -O3 optimization level or higher.
+  added, which propagates integral range information across the call graph
+  when variable values can be proven to be within those ranges.  It is
+  enabled by the -fipa-vrp option and at the -O2
+  optimization level and higher (and -Os).
+
+  A new loop splitting optimization pass has been added.  Certain loops
+  which contain a condition that is always true on one side of the iteration
+  space and always false on the other are split into two loops, such that
+  each of the two new loops iterates on just one side of the iteration space
+  and the condition does not need to be checked inside of the loop.
+  It is enabled by the -fsplit-loops option and
+  at the -O3 optimization level or higher.
 
   The shrink-wrapping optimization can now separate portions of
   prologues and epilogues to improve performance if some of the
@@ -129,14 +129,14 @@
   The option is enabled by default with -fsanitize=address and disabled
   by default with -fsanitize=kernel-address.
   Compared to the LLVM compiler, where the option already exists,
-  the implementation in the GCC compiler has couple of improvements and advantages:
+  the implementation in the GCC compiler has some improvements and advantages:
   
-  A complex usage of gotos and case labels are properly handled and should not
-  report any false positive or false negatives.
+  Complex uses of gotos and case labels are properly handled and
+  should not report any false positive or false negatives.
   
   C++ temporaries are sanitized.
   Sanitization can handle invalid memory stores that are optimized out
-  by the LLVM compiler when using an optimization level.
+  by the LLVM compiler when optimization is enabled.
   
 
   
@@ -149,7 +149,7 @@
   href="http://www.dwarfstd.org/Download.php";>DWARF debugging
   information standard is supported through the -gdwarf-5
   option.  The DWARF version 4 debugging information remains the
-  default until debugging information consumers are adjusted.
+  default until consumers of debugging information are adjusted.
 
 
 
@@ -190,7 +190,7 @@
 through.  This warning has five different levels.  The compiler is
 	able to parse a wide range of fallthrough comments, depending on
 	the level.  It also handles control-flow statements, such as ifs.
-	It's possible to suppres the warning by either adding a fallthrough
+	It's possible to suppress the warning by either adding a fallthrough
 	comment, or by using a null statement: __attribute__
 	((fallthrough)); (C, C++), or [[fallthrough]];
 (C++17), or [[gnu::fallthrough]]; (C++11/C++14).
@@ -596,8 +596,8 @@
   The C++ front end has experimental support for all of the current C++17
 draft with the -std=c++1z or -std=gnu++1z flags,
 including if constexpr, class template argument
-deduction, auto template parameters, and decomposition
-declarations.  For a full list of new features,
+deduction, auto template parameters, and structured bindings.
+For a full list of new features,
 see https://g

Re: [PATCH GCC8][22/33]Generate TMR in new reassociation order

2017-05-02 Thread Richard Biener
On Wed, Apr 26, 2017 at 12:20 PM, Bin.Cheng  wrote:
> This is another one where context diff might help.  No code change
> from previous version.

This isn't a context diff.

Anyways, changes like using 'tmp' really interfere with creating a
useful diff so it's hard
to review no-op changes from the real meat.  I spot re-ordering and
doing parts.offset
in a different way first.

I wonder if we can do better by first re-factoring fields of mem_address to how
TARGET_MEM_REF is laid out now -- merge symbol and base and introduce
index2 so that create_mem_ref_raw becomes a 1:1 mapping.

Anyway, the patch looks fine (after much staring) but it could really use some
more commenting on what we try to do in what order and why.

Thanks,
Richard.

> Thanks,
> bin
>
> On Tue, Apr 18, 2017 at 11:49 AM, Bin Cheng  wrote:
>> Hi,
>> This patch generates TMR for ivopts in new re-association order.  General
>> idea is, by querying target's addressing mode, we put as much address
>> computation as possible in memory reference.  For computation that has to
>> be done outside of memory reference, we re-associate the address
>> expression in new order so that loop invariant expression is kept and
>> exposed for later lim pass.
>> Is it OK?
>>
>> Thanks,
>> bin
>> 2017-04-11  Bin Cheng  
>>
>> * tree-ssa-address.c: Include header file.
>> (move_hint_to_base): Return TRUE if BASE_HINT is moved to memory
>> address.
>> (add_to_parts): Refactor.
>> (addr_to_parts): New parameter.  Update use of move_hint_to_base.
>> (create_mem_ref): Update use of addr_to_parts.  Re-associate addr
>> in new order.
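
A source-level analogy of the re-association described in the quoted text may help. This is only a sketch; the actual transformation works on GIMPLE address expressions inside ivopts, and the function names here are invented. The loop-invariant part of the address is grouped so that it can be computed once outside the loop.

```c
#include <stddef.h>

/* Address computed as a + (k + i) * sizeof (int) on every iteration.  */
static long
sum_naive (const int *a, size_t n, size_t k)
{
  long sum = 0;

  for (size_t i = 0; i < n; i++)
    sum += a[k + i];
  return sum;
}

/* Same accesses, re-associated as (a + k) + i: the invariant part
   a + k is exposed and hoisted, and only the index varies in the loop.  */
static long
sum_reassociated (const int *a, size_t n, size_t k)
{
  const int *base = a + k;	/* loop-invariant part, computed once */
  long sum = 0;

  for (size_t i = 0; i < n; i++)
    sum += base[i];
  return sum;
}
```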


Re: [PATCH 1/5][GIMPLE FE] PR testsuite/80580. Handle missing labels in goto statements

2017-05-02 Thread Richard Biener
On Mon, May 1, 2017 at 8:04 PM, Mikhail Maltsev  wrote:
> The first problem happens because we don't check for missing labels when
> parsing 'goto' statements. I.e.:
>
> __GIMPLE() void fn1() {
>   if (1)
> goto
> }
>
> The fix is pretty obvious: just add a check.
> My question is: which functions should I use to produce diagnostics? The
> surrounding code uses 'c_parser_error', but this function does not handle
> locations very well (in fact, it uses input_location).

Certainly an improvement.  I suppose we can do better error recovery
for cases like

 if (1)
   goto
 else
   goto bar;

but I guess this is better than nothing.

And yes, we use c_parser_error -- input_location should be ok but here
we just peek which may upset the parser.  Maybe it works better
when consuming the token before issuing the error?  Thus

Index: gcc/c/gimple-parser.c
===
--- gcc/c/gimple-parser.c   (revision 247498)
+++ gcc/c/gimple-parser.c   (working copy)
@@ -1315,8 +1315,8 @@ c_parser_gimple_if_stmt (c_parser *parse
   loc = c_parser_peek_token (parser)->location;
   c_parser_consume_token (parser);
   label = c_parser_peek_token (parser)->value;
-  t_label = lookup_label_for_goto (loc, label);
   c_parser_consume_token (parser);
+  t_label = lookup_label_for_goto (loc, label);
   if (! c_parser_require (parser, CPP_SEMICOLON, "expected %<;%>"))
return;
 }

?

Patch is ok with or without this adjustment (and testcase adjustment).
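
For readers following along, the kind of check under discussion can be sketched with a toy token stream. Everything below (token kinds, struct, function names) is hypothetical and unrelated to the real c_parser interfaces; it only shows the "is there a label after goto?" test being made before the label is consumed.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_token_kind
{
  TOK_GOTO, TOK_NAME, TOK_SEMICOLON, TOK_CLOSE_BRACE, TOK_EOF
};

/* The token array must be terminated by TOK_EOF.  */
struct toy_parser
{
  const enum toy_token_kind *tokens;
  size_t pos;
};

static enum toy_token_kind
toy_peek (const struct toy_parser *p)
{
  return p->tokens[p->pos];
}

/* Parses "goto NAME ;".  Returns false as soon as an expected token is
   missing, leaving the offending token unconsumed.  */
static bool
toy_parse_goto (struct toy_parser *p)
{
  if (toy_peek (p) != TOK_GOTO)
    return false;
  p->pos++;
  if (toy_peek (p) != TOK_NAME)
    return false;		/* missing label: a real parser would report an error here */
  p->pos++;
  if (toy_peek (p) != TOK_SEMICOLON)
    return false;
  p->pos++;
  return true;
}
```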

Thanks,
Richard.

> --
> Regards,
>Mikhail Maltsev
>
> gcc/testsuite/ChangeLog:
>
> 2017-05-01  Mikhail Maltsev  
>
> * gcc.dg/gimplefe-error-4.c: New test.
> * gcc.dg/gimplefe-error-5.c: New test.
>
>
> gcc/c/ChangeLog:
>
> 2017-05-01  Mikhail Maltsev  
>
> * gimple-parser.c (c_parser_gimple_if_stmt): Check for empty labels.
>
>


Re: [PATCH 4/5][GIMPLE FE] PR testsuite/80580. Handle invalid __MEM

2017-05-02 Thread Richard Biener
On Mon, May 1, 2017 at 8:08 PM, Mikhail Maltsev  wrote:
> This patch deals with invalid __MEM construct. Before we start building an
> expression for __MEM, we must check that parsing succeeded and that the __MEM
> operand is a pointer.

Ok.

Thanks,
Richard.

> --
> Regards,
>Mikhail Maltsev
>
>
> gcc/c/ChangeLog:
>
> 2017-05-01  Mikhail Maltsev  
>
> * gimple-parser.c (c_parser_gimple_postfix_expression): Handle
> invalid __MEM.
>
> gcc/testsuite/ChangeLog:
>
> 2017-05-01  Mikhail Maltsev  
>
> * gcc.dg/gimplefe-error-9.c: New test.
>
>
>


Re: [PATCH 5/5][GIMPLE FE] PR testsuite/80580: Handle invalid SSA names

2017-05-02 Thread Richard Biener
On Mon, May 1, 2017 at 8:09 PM, Mikhail Maltsev  wrote:
> When parsing SSA names, we should check that parent names are scalars.
> In fact, this patch just uses the condition of a 'gcc_assert' in 
> 'make_ssa_name_fn'.

+ if (!(VAR_P (parent)
+   || TREE_CODE (parent) == PARM_DECL
+   || TREE_CODE (parent) == RESULT_DECL
+   || (TYPE_P (parent) && is_gimple_reg_type (parent))))
+   {
+ error ("invalid SSA name %qE", parent);
+ return error_mark_node;
+   }

please drop || (TYPE_P (parent) && is_gimple_reg_type (parent))), that
case isn't valid.
Please also change wording slightly to "invalid base %qE for SSA name".

Ok with those changes.

Thanks,
Richard.

> --
> Regards,
>Mikhail Maltsev
>
>
> gcc/testsuite/ChangeLog:
>
> 2017-05-01  Mikhail Maltsev  
>
> * gcc.dg/gimplefe-error-11.c: New test.
>
>
> gcc/c/ChangeLog:
>
> 2017-05-01  Mikhail Maltsev  
>
> * gimple-parser.c (c_parser_parse_ssa_name): Validate SSA name base.
>
>


[PATCH] Remove LTO_STREAMER_DEBUG (PR lto/79489).

2017-05-02 Thread Martin Liška
Hi.

After a discussion with Richi on IRC, I'm removing the debugging
infrastructure as it's obsolete.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Installed as it's pre-approved.
Martin
From 88768989be6c79c43bd3166c87eaa877b867957c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 2 May 2017 15:12:13 +0200
Subject: [PATCH] Remove LTO_STREAMER_DEBUG (PR lto/79489).

gcc/ChangeLog:

2017-05-02  Martin Liska  

	PR lto/79489.
	* lto-streamer-in.c (lto_read_tree_1): Remove
	LTO_STREAMER_DEBUG.
	* lto-streamer.c (struct tree_hash_entry): Likewise.
	(struct tree_entry_hasher): Likewise.
	(tree_entry_hasher::hash): Likewise.
	(tree_entry_hasher::equal): Likewise.
	(lto_streamer_init): Likewise.
	(lto_orig_address_map): Likewise.
	(lto_orig_address_get): Likewise.
	(lto_orig_address_remove): Likewise.
	* lto-streamer.h: Likewise.
	* tree-streamer-in.c (streamer_alloc_tree): Likewise.
	* tree-streamer-out.c (streamer_write_tree_header): Likewise.
---
 gcc/lto-streamer-in.c   |  6 
 gcc/lto-streamer.c  | 92 -
 gcc/lto-streamer.h  | 13 ---
 gcc/tree-streamer-in.c  | 20 ---
 gcc/tree-streamer-out.c | 10 --
 5 files changed, 141 deletions(-)

diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
index 515aa532ce6..6da217d5589 100644
--- a/gcc/lto-streamer-in.c
+++ b/gcc/lto-streamer-in.c
@@ -1337,12 +1337,6 @@ lto_read_tree_1 (struct lto_input_block *ib, struct data_in *data_in, tree expr)
   && TREE_CODE (expr) != FUNCTION_DECL
   && TREE_CODE (expr) != TRANSLATION_UNIT_DECL)
 DECL_INITIAL (expr) = stream_read_tree (ib, data_in);
-
-#ifdef LTO_STREAMER_DEBUG
-  /* Remove the mapping to RESULT's original address set by
- streamer_alloc_tree.  */
-  lto_orig_address_remove (expr);
-#endif
 }
 
 /* Read the physical representation of a tree node with tag TAG from
diff --git a/gcc/lto-streamer.c b/gcc/lto-streamer.c
index 04d733024d8..74fe0e259bf 100644
--- a/gcc/lto-streamer.c
+++ b/gcc/lto-streamer.c
@@ -257,35 +257,6 @@ print_lto_report (const char *s)
 	 lto_section_name[i], lto_stats.section_size[i]);
 }
 
-
-#ifdef LTO_STREAMER_DEBUG
-struct tree_hash_entry
-{
-  tree key;
-  intptr_t value;
-};
-
-struct tree_entry_hasher : nofree_ptr_hash 
-{
-  static inline hashval_t hash (const tree_hash_entry *);
-  static inline bool equal (const tree_hash_entry *, const tree_hash_entry *);
-};
-
-inline hashval_t
-tree_entry_hasher::hash (const tree_hash_entry *e)
-{
-  return htab_hash_pointer (e->key);
-}
-
-inline bool
-tree_entry_hasher::equal (const tree_hash_entry *e1, const tree_hash_entry *e2)
-{
-  return (e1->key == e2->key);
-}
-
-static hash_table *tree_htab;
-#endif
-
 /* Initialization common to the LTO reader and writer.  */
 
 void
@@ -297,10 +268,6 @@ lto_streamer_init (void)
  handle it.  */
   if (flag_checking)
 streamer_check_handled_ts_structures ();
-
-#ifdef LTO_STREAMER_DEBUG
-  tree_htab = new hash_table (31);
-#endif
 }
 
 
@@ -314,65 +281,6 @@ gate_lto_out (void)
 	  && !seen_error ());
 }
 
-
-#ifdef LTO_STREAMER_DEBUG
-/* Add a mapping between T and ORIG_T, which is the numeric value of
-   the original address of T as it was seen by the LTO writer.  This
-   mapping is useful when debugging streaming problems.  A debugging
-   session can be started on both reader and writer using ORIG_T
-   as a breakpoint value in both sessions.
-
-   Note that this mapping is transient and only valid while T is
-   being reconstructed.  Once T is fully built, the mapping is
-   removed.  */
-
-void
-lto_orig_address_map (tree t, intptr_t orig_t)
-{
-  struct tree_hash_entry ent;
-  struct tree_hash_entry **slot;
-
-  ent.key = t;
-  ent.value = orig_t;
-  slot = tree_htab->find_slot (&ent, INSERT);
-  gcc_assert (!*slot);
-  *slot = XNEW (struct tree_hash_entry);
-  **slot = ent;
-}
-
-
-/* Get the original address of T as it was seen by the writer.  This
-   is only valid while T is being reconstructed.  */
-
-intptr_t
-lto_orig_address_get (tree t)
-{
-  struct tree_hash_entry ent;
-  struct tree_hash_entry **slot;
-
-  ent.key = t;
-  slot = tree_htab->find_slot (&ent, NO_INSERT);
-  return (slot ? (*slot)->value : 0);
-}
-
-
-/* Clear the mapping of T to its original address.  */
-
-void
-lto_orig_address_remove (tree t)
-{
-  struct tree_hash_entry ent;
-  struct tree_hash_entry **slot;
-
-  ent.key = t;
-  slot = tree_htab->find_slot (&ent, NO_INSERT);
-  gcc_assert (slot);
-  free (*slot);
-  tree_htab->clear_slot (slot);
-}
-#endif
-
-
 /* Check that the version MAJOR.MINOR is the correct version number.  */
 
 void
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index 854bcd2d75e..9ab3007a9df 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -27,14 +27,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "gcov-io.h"
 #include "diagnostic.h"
 
-/* Define when debugging the LTO streamer.  This causes the writer
-   to output the num

[PATH][GCC][mid-end] Check the alternate cost model just as costs_lt_p

2017-05-02 Thread Tamar Christina
Hi all,

When comparing costs, the RTL function costs_lt_p compares the costs of
A and B such that, if they are equal and we are optimizing for speed,
the size cost is compared and used as the determining factor.

This patch applies the same principle to the cost comparison done for
expr expansions.  It potentially makes code optimized for size faster and
code optimized for speed smaller.
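
The tie-breaking idea can be sketched in isolation as follows. This is an illustration with invented names (toy_cost, toy_cost_less), not the code from costs_lt_p or from the patch: compare on the primary metric, and fall back to the other metric only when the primary costs are equal.

```c
#include <stdbool.h>

/* Hypothetical pair of costs for one instruction sequence.  */
struct toy_cost
{
  unsigned speed;
  unsigned size;
};

/* Returns true if A is cheaper than B.  The primary metric is chosen by
   SPEED_P; the other metric is consulted only to break ties.  */
static bool
toy_cost_less (struct toy_cost a, struct toy_cost b, bool speed_p)
{
  unsigned a_primary = speed_p ? a.speed : a.size;
  unsigned b_primary = speed_p ? b.speed : b.size;
  unsigned a_secondary = speed_p ? a.size : a.speed;
  unsigned b_secondary = speed_p ? b.size : b.speed;

  if (a_primary != b_primary)
    return a_primary < b_primary;
  return a_secondary < b_secondary;
}
```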

Bootstrapped on aarch64-none-linux-gnu and x86_64-linux
and reg-tested on aarch64-none-linux-gnu with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-04-26  Tamar Christina  

* expr.c (expand_expr_real_2): Re-cost if previous costs are the same.

diff --git a/gcc/expr.c b/gcc/expr.c
index 00f08aad55c90a0563bf93cf578107fe0b871231..2469608311c6793a58e7a042f751f390341242b4 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8837,6 +8837,15 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	   end_sequence ();
 	   unsigned uns_cost = seq_cost (uns_insns, speed_p);
 	   unsigned sgn_cost = seq_cost (sgn_insns, speed_p);
+
+	   /* If costs are the same then use the other factor as a
+	  tie breaker.  */
+	   if (uns_cost == sgn_cost)
+	 {
+		uns_cost = seq_cost (uns_insns, !speed_p);
+		sgn_cost = seq_cost (sgn_insns, !speed_p);
+	 }
+
 	   if (uns_cost < sgn_cost || (uns_cost == sgn_cost && unsignedp))
 	 {
 	   emit_insn (uns_insns);



[PATCH][GCC][mid-end] Support combining of LSHIFTRT + LSHIFTRT operations

2017-05-02 Thread Tamar Christina
Hi all,

r217118 added an optimization to combine ashiftrt and lshiftrt.
This same optimization can at the very least also apply to lshiftrt + lshiftrt
with the same constraints. i.e. that both operations are done for scalar modes,
that second operation operates on a subreg of the first one and that the shift
amount of the first operation is larger than the mode bitsize of the subreg.

This reduces

umull   x1, w0, w1
lsr x1, x1, 32
lsr w1, w1, 5

to

umull   x1, w0, w1
lsr x1, x1, 37
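
To illustrate why the two forms are equivalent, here is a small C sketch of
the 64-bit to 32-bit case shown above. The function names are made up for
this example; the shift amounts (32, 5, 37) are the ones from the assembly.

```c
#include <stdint.h>

/* Two-step form: shift the 64-bit value right by 32, take the low
   32 bits (the subreg), then shift that 32-bit value right by 5.  */
uint32_t
shift_twice (uint64_t x)
{
  uint32_t low = (uint32_t) (x >> 32);
  return low >> 5;
}

/* Combined form: one 64-bit logical shift by 32 + 5 = 37, then take
   the low 32 bits.  */
uint32_t
shift_once (uint64_t x)
{
  return (uint32_t) (x >> 37);
}
```

Because the first shift already moves the upper half into the low 32 bits,
truncating loses nothing and the two shift amounts can simply be added.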


Bootstrapped on aarch64-none-linux-gnu and x86_64-linux
and reg-tested on aarch64-none-linux-gnu with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-04-27  Tamar Christina  

* simplify-rtx.c (simplify_binary_operation_1): Add LSHIFTRT case.

gcc/testsuite/
2017-04-27  Tamar Christina  

* gcc.dg/lsr-div1.c: New testcase.

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 640ccb7cb95933a6991bf1599099f7aed455daec..feaceff06d6267b372f40fcd263e2ae67bbd4c74 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -3343,19 +3343,21 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 	  && UINTVAL (trueop0) == GET_MODE_MASK (mode)
 	  && ! side_effects_p (op1))
 	return op0;
+
+canonicalize_shift:
   /* Given:
 	 scalar modes M1, M2
 	 scalar constants c1, c2
 	 size (M2) > size (M1)
 	 c1 == size (M2) - size (M1)
 	 optimize:
-	 (ashiftrt:M1 (subreg:M1 (lshiftrt:M2 (reg:M2) (const_int ))
+	 ([a|l]shiftrt:M1 (subreg:M1 (lshiftrt:M2 (reg:M2) (const_int ))
  )
 		  (const_int ))
 	 to:
-	 (subreg:M1 (ashiftrt:M2 (reg:M2) (const_int ))
+	 (subreg:M1 ([a|l]shiftrt:M2 (reg:M2) (const_int ))
 		).  */
-  if (code == ASHIFTRT
+  if ((code == ASHIFTRT || code == LSHIFTRT)
 	  && !VECTOR_MODE_P (mode)
 	  && SUBREG_P (op0)
 	  && CONST_INT_P (op1)
@@ -3372,13 +3374,13 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 	  rtx tmp = GEN_INT (INTVAL (XEXP (SUBREG_REG (op0), 1))
 			 + INTVAL (op1));
 	  machine_mode inner_mode = GET_MODE (SUBREG_REG (op0));
-	  tmp = simplify_gen_binary (ASHIFTRT,
+	  tmp = simplify_gen_binary (code,
  GET_MODE (SUBREG_REG (op0)),
  XEXP (SUBREG_REG (op0), 0),
  tmp);
 	  return lowpart_subreg (mode, tmp, inner_mode);
 	}
-canonicalize_shift:
+
   if (SHIFT_COUNT_TRUNCATED && CONST_INT_P (op1))
 	{
 	  val = INTVAL (op1) & (GET_MODE_PRECISION (mode) - 1);
diff --git a/gcc/testsuite/gcc.dg/lsr-div1.c b/gcc/testsuite/gcc.dg/lsr-div1.c
new file mode 100644
index ..962054d34d953b63c9736134b9ad147791a491d3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lsr-div1.c
@@ -0,0 +1,57 @@
+/* Test division by const int generates only one shift.  */
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-rtl-combine-all" } */
+/* { dg-options "-O2 -fdump-rtl-combine-all -mtune=cortex-a53" { target aarch64*-*-* } } */
+/* { dg-require-effective-target int32plus } */
+
+extern void abort (void);
+
+#define NOINLINE __attribute__((noinline))
+
+static NOINLINE int
+f1 (unsigned int n)
+{
+  return n % 0x33;
+}
+
+static NOINLINE int
+f2 (unsigned int n)
+{
+  return n % 0x12;
+}
+
+int
+main ()
+{
+  int a = 0x;
+  int b = 0x;
+  int c;
+  c = f1 (a);
+  if (c != 0x11)
+abort ();
+  c = f1 (b);
+  if (c != 0x22)
+abort ();
+  c = f2 (a);
+  if (c != 0xE)
+abort ();
+  c = f2 (b);
+  if (c != 0x7)
+abort ();
+  return 0;
+}
+
+/* Following replacement pattern of integer division by constant, GCC is expected
+   to generate UMULL and (x)SHIFTRT.  This test checks that considering division
+   by const 0x33, gcc generates a single LSHIFTRT by 37, instead of
+   two - LSHIFTRT by 32 and LSHIFTRT by 5.  */
+
+/* { dg-final { scan-rtl-dump "\\(set \\(subreg:DI \\(reg:SI" "combine" { target aarch64*-*-* } } } */
+/* { dg-final { scan-rtl-dump "\\(lshiftrt:DI \\(reg:DI" "combine" { target aarch64*-*-* } } } */
+/* { dg-final { scan-rtl-dump "\\(const_int 37 " "combine" { target aarch64*-*-* } } } */
+
+/* Similarly, considering division by const 0x12, gcc generates a
+   single LSHIFTRT by 34, instead of two - LSHIFTRT by 32 and LSHIFTRT by 2.  */
+
+/* { dg-final { scan-rtl-dump "\\(const_int 34 " "combine" { target aarch64*-*-* } } } */
+



Re: [PATCH] handle sprintf(d, "%s", ...) in gimple-ssa-sprintf.c

2017-05-02 Thread Martin Sebor

FWIW, my fix for bug 79062 is only partial (it gets the pass
to run but the warnings are still not issued).  I don't quite
understand what prevents the warning flag(s) from getting set
when -flto is used.  This seems to be a bigger problem than
just the sprintf pass not doing something just right.


I've never dug deeply in the LTO stuff, but I believe we stream the compiler
flags, so it could be something there.


We do.


Alternately you might be running into a case where in LTO mode we recreate
base types.  Look for a type equality tester that goes beyond just testing
pointer equality.

ie, in LTO I think we'll create a type based on the streamed data, but I
also think we'll create various basic types.  Thus in LTO mode pointer
equality may not be sufficient.


We make sure that for most basic types we end up re-using them where possible.
char_type_node is an example where that generally doesn't work because its
value depends on a command-line flag.


That answers the first part of the question of why the sprintf
pass wouldn't run (or do anything) with -flto.   With it fixed
(as in fold-const.c or tree-ssa-strlen.c as you suggested in
bug 79062) it runs and the optimization does its job, but no
warnings are issued.  The warn_foo flags for warnings that are
enabled implicitly (e.g., by -Wall or -Wextra on the command
line) are clear.  There seem to be dependencies between warnings
in c.opt that ignore LTO (as a language), but even with those
corrected (i.e., with LTO added as a language to -Wformat and
-Wall) the flags are still clear when LTO runs.  Does that ring
any bells for you?

Thanks
Martin


Re: [wwwdocs] grammar & style fixes for /gcc-7/changes.html

2017-05-02 Thread Sandra Loosemore

On 05/02/2017 07:52 AM, Jonathan Wakely wrote:

This is the result of proofreading the release notes for GCC 7. Some
are obvious fixes for simple typos, but I've also tried to improve the
clarity of some text. Please take a look and let me know if you
disagree with any changes.


Overall this is an improvement.  Just a couple suggestions:


@@ -863,22 +863,22 @@
 ARC

  
-   Add support for ARC HS and ARC EM processors.
+   Added support for ARC HS and ARC EM processors has been added.
  


Too many "added"s now.


 GCC's already extensive testsuite has gained some new
   capabilities, to further improve the reliability of the compiler:
 
-  GCC now has has an internal unit testing API and a suite of tests
+  GCC now has an internal unit testing API and a suite of tests
 for programmatic self-testing of subsystems.


s/unit testing API/unit-testing API/

-Sandra



Re: [ARM] Enable FP16 vector arithmetic operations.

2017-05-02 Thread Tamar Christina
Hi All,

I'm taking this one over from Matthew, I think it slipped through the cracks 
before.

Since it still applies cleanly on trunk I'm just pinging it.

Ok for trunk?

Tamar

From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Matthew Wahab 
Sent: Friday, September 23, 2016 4:02 PM
To: gcc-patches
Subject: [ARM] Enable FP16 vector arithmetic operations.

Hello,

Support for the ARMv8.2-A FP16 NEON arithmetic instructions was added
using non-standard names for the instruction patterns. This was needed
because the NEON floating point semantics meant that their use by the
compiler for HFmode arithmetic operations needed to be restricted. This
follows the implementation for 32-bit NEON instructions.

As with the 32-bit instructions, the restriction on the HFmode
operation can be lifted when -funsafe-math-optimizations is
enabled. This patch does that, defining the standard pattern names
addhf3, subhf3, mulhf3 and fmahf3.

This patch also updates the NEON intrinsics to use the arithmetic
operations when -ffast-math is enabled. This is to keep the 16-bit
support consistent with the 32-bit support. It is needed so that code
using the f16 intrinsics is subject to the same optimizations as code
using the f32 intrinsics would be.

Tested for arm-none-linux-gnueabihf with native bootstrap and make check
on ARMv8-A and for arm-none-eabi and armeb-none-eabi with cross-compiled
make check on an ARMv8.2-A emulator.

Ok for trunk?
Matthew

gcc/
2016-09-23  Matthew Wahab  

* config/arm/arm_neon.h (vadd_f16): Use standard arithmetic
operations in fast-math mode.
(vaddq_f16): Likewise.
(vmul_f16): Likewise.
(vmulq_f16): Likewise.
(vsub_f16): Likewise.
(vsubq_f16): Likewise.
* config/arm/neon.md (add3): New.
(sub3): New.
(fma:3): New.  Also remove outdated comment.
(mul3): New.

testsuite/
2016-09-23  Matthew Wahab  

* gcc.target/arm/armv8_2-fp16-arith-1.c: Expand comment.  Update
expected output of vadd, vsub and vmul instructions.
* gcc.target/arm/armv8_2-fp16-arith-2.c: New.
* gcc.target/arm/armv8_2-fp16-neon-2.c: New.
* gcc.target/arm/armv8_2-fp16-neon-3.c: New.


Re: [PATCH] Fix cross compiling to x86_64-w64-mingw32

2017-05-02 Thread JonY
On 05/02/2017 12:11 PM, Hugo Beauzée-Luyssen wrote:
> This patch fixes cross compiling to x86_64-w64-mingw32
> See https://github.com/Alexpux/MINGW-packages/issues/1580 and
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69506
> 
> My apologies if I missed something in the contributing/sending a patch
> guidelines.
> 
> Regards,
> 
> Index: libstdc++-v3/config/os/mingw32-w64/os_defines.h
> ===
> --- libstdc++-v3/config/os/mingw32-w64/os_defines.h (revision 247489)
> +++ libstdc++-v3/config/os/mingw32-w64/os_defines.h (working copy)
> @@ -76,6 +76,8 @@
> 
>  #ifdef __x86_64__
>  #define _GLIBCXX_LLP64 1
> +// See libstdc++/69506
> +#define _GLIBCXX_USE_WEAK_REF 0
>  #endif
> 
>  // Enable use of GetModuleHandleEx (requires Windows XP/2003) in
> 

Looks good, go ahead and apply.






Re: [PATCH GCC8][22/33]Generate TMR in new reassociation order

2017-05-02 Thread Bin.Cheng
On Tue, May 2, 2017 at 3:09 PM, Richard Biener
 wrote:
> On Wed, Apr 26, 2017 at 12:20 PM, Bin.Cheng  wrote:
>> This is another one where context diff might help.  No code change
>> from previous version.
>
> This isn't a context diff.
Thanks for reviewing.  I used git diff -U20 to generate patch.  Maybe
20 is too small?

>
> Anyways, changes like using 'tmp' really interfere with creating a
> useful diff so it's hard
> to review no-op changes from the real meat.  I spot re-ordering and
> doing parts.offset
> in a different way first.
>
> I wonder if we can do better by first re-factoring fields of mem_address to 
> how
> TARGET_MEM_REF is laid out now -- merge symbol and base and introduce
> index2 so that create_mem_ref_raw becomes a 1:1 mapping.
Probably.  Note the mapping shall be done in addr_to_parts?  Changes
in create_mem_ref tries to simplify address expression not supported
by current target into supported forms.

>
> Anyway, the patch looks fine (after much staring) but it could really need 
> some
> more commenting on what we try to do in what order and why.
Simple comments added as in updated patch.  Will commit this updated version.

Thanks,
bin
>
> Thanks,
> Richard.
>
>> Thanks,
>> bin
>>
>> On Tue, Apr 18, 2017 at 11:49 AM, Bin Cheng  wrote:
>>> Hi,
>>> This patch generates TMR for ivopts in new re-association order.  General 
>>> idea is,
>>> by querying target's addressing mode, we put as much address computation as 
>>> possible
>>> in memory reference.  For computation that has to be done outside of memory 
>>> reference,
>>> we re-associate the address expression in new order so that loop invariant 
>>> expression
>>> is kept and exposed for later lim pass.
>>> Is it OK?
>>>
>>> Thanks,
>>> bin
>>> 2017-04-11  Bin Cheng  
>>>
>>> * tree-ssa-address.c: Include header file.
>>> (move_hint_to_base): Return TRUE if BASE_HINT is moved to memory
>>> address.
>>> (add_to_parts): Refactor.
>>> (addr_to_parts): New parameter.  Update use of move_hint_to_base.
>>> (create_mem_ref): Update use of addr_to_parts.  Re-associate addr
>>> in new order.
diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c
index 8aefed6..8257fde 100644
--- a/gcc/tree-ssa-address.c
+++ b/gcc/tree-ssa-address.c
@@ -29,40 +29,41 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree.h"
 #include "gimple.h"
 #include "memmodel.h"
 #include "stringpool.h"
 #include "tree-vrp.h"
 #include "tree-ssanames.h"
 #include "expmed.h"
 #include "insn-config.h"
 #include "emit-rtl.h"
 #include "recog.h"
 #include "tree-pretty-print.h"
 #include "fold-const.h"
 #include "stor-layout.h"
 #include "gimple-iterator.h"
 #include "gimplify-me.h"
 #include "tree-ssa-loop-ivopts.h"
 #include "expr.h"
 #include "tree-dfa.h"
 #include "dumpfile.h"
 #include "tree-affine.h"
+#include "gimplify.h"
 
 /* FIXME: We compute address costs using RTL.  */
 #include "tree-ssa-address.h"
 
 /* TODO -- handling of symbols (according to Richard Hendersons
comments, http://gcc.gnu.org/ml/gcc-patches/2005-04/msg00949.html):
 
There are at least 5 different kinds of symbols that we can run up against:
 
  (1) binds_local_p, small data area.
  (2) binds_local_p, eg local statics
  (3) !binds_local_p, eg global variables
  (4) thread local, local_exec
  (5) thread local, !local_exec
 
Now, (1) won't appear often in an array context, but it certainly can.
All you have to do is set -GN high enough, or explicitly mark any
random object __attribute__((section (".sdata"))).
 
All of these affect whether or not a symbol is in fact a valid address.
@@ -410,71 +411,73 @@ move_fixed_address_to_symbol (struct mem_address *parts, aff_tree *addr)
   tree val = NULL_TREE;
 
   for (i = 0; i < addr->n; i++)
 {
   if (addr->elts[i].coef != 1)
continue;
 
   val = addr->elts[i].val;
   if (TREE_CODE (val) == ADDR_EXPR
  && fixed_address_object_p (TREE_OPERAND (val, 0)))
break;
 }
 
   if (i == addr->n)
 return;
 
   parts->symbol = val;
   aff_combination_remove_elt (addr, i);
 }
 
-/* If ADDR contains an instance of BASE_HINT, move it to PARTS->base.  */
+/* Return true if ADDR contains an instance of BASE_HINT and it's moved to
+   PARTS->base.  */
 
-static void
+static bool
 move_hint_to_base (tree type, struct mem_address *parts, tree base_hint,
   aff_tree *addr)
 {
   unsigned i;
   tree val = NULL_TREE;
   int qual;
 
   for (i = 0; i < addr->n; i++)
 {
   if (addr->elts[i].coef != 1)
continue;
 
   val = addr->elts[i].val;
   if (operand_equal_p (val, base_hint, 0))
break;
 }
 
   if (i == addr->n)
-return;
+return false;
 
   /* Cast value to appropriate pointer type.  We cannot use a pointer
  to TYPE directly, as the back-end will assume registers of pointer
  type are aligned, and just the base itself may not actually be.
 

Re: [PATCH][GCC][mid-end] Support combining of LSHIFTRT + LSHIFTRT operations

2017-05-02 Thread Jeff Law

On 05/02/2017 08:34 AM, Tamar Christina wrote:

Hi all,

r217118 added an optimization to combine ashiftrt and lshiftrt.
This same optimization can at the very least also apply to lshiftrt + lshiftrt
with the same constraints. i.e. that both operations are done for scalar modes,
that second operation operates on a subreg of the first one and that the shift
amount of the first operation is larger than the mode bitsize of the subreg.

This reduces

umull   x1, w0, w1
lsr x1, x1, 32
lsr w1, w1, 5

to

umull   x1, w0, w1
lsr x1, x1, 37


Bootstrapped on aarch64-none-linux-gnu and x86_64-linux
and reg-tested on aarch64-none-linux-gnu with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-04-27  Tamar Christina  

* simplify-rtx.c (simplify_binary_operation_1): Add LSHIFTRT case.

gcc/testsuite/
2017-04-27  Tamar Christina  

* gcc.dg/lsr-div1.c: New testcase.

OK for the trunk.

Thanks,
Jeff


Re: [PATH][GCC][mid-end] Check the alternate cost model just as costs_lt_p

2017-05-02 Thread Jeff Law

On 05/02/2017 08:34 AM, Tamar Christina wrote:

Hi all,

When comparing costs, the rtl function costs_lt_p compares the costs of
A and B such that if they are the same and we were checking for speed,
compare the size and use that as determining factor.

This applies the same principle to the comparison done for the costing
of expr expansions. Potentially makes -Osize code faster and -Ospeed code
smaller.

Bootstrapped on aarch64-none-linux-gnu and x86_64-linux
and reg-tested on aarch64-none-linux-gnu with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-04-26  Tamar Christina  

* expr.c (expand_expr_real_2): Re-cost if previous costs are the same.

OK
jeff



Re: [PATCH, GCC/ARM, Stage 1] Enable Purecode for ARMv8-M Baseline

2017-05-02 Thread Ramana Radhakrishnan
On Tue, May 02, 2017 at 11:45:48AM +0100, Prakhar Bahuguna wrote:
> This patch adds support for purecode to ARMv8-M Baseline, in addition to the
> existing support for ARMv7-M and ARMv8-M Mainline.
> 
> gcc/ChangeLog:
> 
> 2017-01-11  Prakhar Bahuguna  
>   Andre Simoes Dias Vieira  
> 
>   * config/arm/arm.md (movsi): Change TARGET_32BIT to TARGET_HAVE_MOVT.
>   (movt splitter): Likewise.
>   * config/arm/arm.c (arm_option_check_internal): Change arm_arch_thumb2
>   to TARGET_HAVE_MOVT, and merge with -mslow-flash-data check.
>   (const_ok_for_arm): Change else to else if (TARGET_THUMB2) and add else
>   block for Thumb-1 with MOVT.
>   (thumb2_legitimate_address_p): Move code block ...
>   (can_avoid_literal_pool_for_label_p): ... into this new function.
>   (thumb1_legitimate_address_p): Add check for TARGET_HAVE_MOVT and
>   literal pool.
>   (thumb_legitimate_constant_p): Add conditional on TARGET_HAVE_MOVT
> 
> doc/ChangeLog:
> 
> 2017-01-11  Prakhar Bahuguna  
>   Andre Simoes Dias Vieira  
> 
>   * invoke.texi (-mpure-code): Change "ARMv7-M targets" for
>   "Thumb-only targets with the MOVT instruction".
> 
> testsuite/ChangeLog:
> 
> 2017-01-11  Prakhar Bahuguna  
>   Andre Simoes Dias Vieira  
> 
>   * gcc.target/arm/pure-code/pure-code.exp: Add conditional for
>   check_effective_target_arm_thumb1_movt_ok.
> 
> Testing done: Ran regression tests for arm-eabi-none targeting Cortex-M23, 
> both
> with and without -mpure-code.
> 
> Okay for stage 1?
> 
> -- 
> 
> Prakhar Bahuguna

> From a77336404fdbdc9ed9b836e4e164803915aa3b22 Mon Sep 17 00:00:00 2001
> From: Prakhar Bahuguna 
> Date: Wed, 15 Mar 2017 10:25:03 +
> Subject: [PATCH] Enable Purecode for ARMv8-M Baseline.
> 
> ---
>  gcc/config/arm/arm.c   | 77 
> ++
>  gcc/config/arm/arm.md  |  6 +-
>  gcc/doc/invoke.texi|  3 +-
>  .../gcc.target/arm/pure-code/pure-code.exp |  5 +-
>  4 files changed, 57 insertions(+), 34 deletions(-)
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index d719020dcde..1088895e7e5 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -2833,16 +2833,15 @@ arm_option_check_internal (struct gcc_options *opts)
>flag_pic = 0;
>  }
>  
> -  /* We only support -mslow-flash-data on armv7-m targets.  */
> -  if (target_slow_flash_data
> -  && ((!(arm_arch7 && !arm_arch_notm) && !arm_arch7em)
> -   || (TARGET_THUMB1_P (flags) || flag_pic || TARGET_NEON)))
> -error ("-mslow-flash-data only supports non-pic code on armv7-m 
> targets");
> -
> -  /* We only support pure-code on Thumb-2 M-profile targets.  */
> -  if (target_pure_code
> -  && (!arm_arch_thumb2 || arm_arch_notm || flag_pic || TARGET_NEON))
> -error ("-mpure-code only supports non-pic code on armv7-m targets");
> +  /* We only support -mpure-code and -mslow-flash-data on Thumb-only targets
> + with MOVT.  */


It would be good to support this with movw / movt but without
new relocations it's not possible.



> +  if ((target_pure_code || target_slow_flash_data)
> +  && (!TARGET_HAVE_MOVT || arm_arch_notm || flag_pic || TARGET_NEON))
> +{
> +  const char *flag = (target_pure_code ? "-mpure-code" : 
> "-mslow-flash-data");

Check line length here.

  const char *flag = (target_pure_code ? "-mpure-code"
   : "-mslow-flash-data");

> +  error ("%s only supports non-pic code on Thumb-only targets with the "
> +  "MOVT instruction", flag);

I'd prefer the error message to be direct and say
"M profile targets" instead of "Thumb-only targets" ?


> +}



>  
>  }
>  
> @@ -4077,7 +4076,7 @@ const_ok_for_arm (HOST_WIDE_INT i)
>  || (i & ~0xfc03) == 0))
>   return TRUE;
>  }
> -  else
> +  else if (TARGET_THUMB2)
>  {
>HOST_WIDE_INT v;
>  
> @@ -4093,6 +4092,14 @@ const_ok_for_arm (HOST_WIDE_INT i)
>if (i == v)
>   return TRUE;
>  }
> +  else if (TARGET_HAVE_MOVT)
> +{
> +  /* Thumb-1 Targets with MOVT.  */
> +  if (i > 0x)
> + return FALSE;
> +  else
> + return TRUE;
> +}
>  
>return FALSE;
>  }
> @@ -7736,6 +7743,32 @@ arm_legitimate_address_outer_p (machine_mode mode, rtx 
> x, RTX_CODE outer,
>return 0;
>  }
>  
> +/* Return true if we can avoid creating a constant pool entry for x.  */
> +bool
> +can_avoid_literal_pool_for_label_p (rtx x)

This can be static, no ?

> +{
> +  /* Normally we can assign constant values to target registers without
> + the help of constant pool.  But there are cases we have to use constant
> + pool like:
> + 1) assign a label to register.
> + 2) sign-extend a 8bit value to 32bit and then assign to register.
> +
> + Constant pool access in format:
> + (set (reg r0) (mem (symbol_ref (".L

Re: [PATCH] PR libstdc++/80553 don't allow destroying non-destructible types

2017-05-02 Thread Marc Glisse

On Tue, 2 May 2017, Jonathan Wakely wrote:


On 02/05/17 10:16 +0100, Jonathan Wakely wrote:

On 28/04/17 13:56 +0100, Jonathan Wakely wrote:

We optimize _Destroy and _Destroy_n to do nothing when the type has a
trivial destructor, which means we do nothing (instead of giving an
error) when trying to destroy types with deleted destructors.


I wonder if this optimisation should even exist. The compiler should
be able to optimise away a loop that just calls trivial destructors,
without help from the library.


The compiler can indeed do that optimisation, even for destructors
like ~T() { } that are empty, but not trivial according to the
language rules. The libstdc++ optimisation does make a difference at
-O0 though. If we get any more bugs in that code I think we should
just remove it though, and let the compiler do the right thing.


Does the compiler manage it for all containers, even those with iterators 
much more complicated than vector's? I'd rather keep the special code in 
the library, if it doesn't cause too much trouble.


--
Marc Glisse


Re: [wwwdocs] grammar & style fixes for /gcc-7/changes.html

2017-05-02 Thread Jonathan Wakely

On 02/05/17 08:44 -0600, Sandra Loosemore wrote:

On 05/02/2017 07:52 AM, Jonathan Wakely wrote:

This is the result of proofreading the release notes for GCC 7. Some
are obvious fixes for simple typos, but I've also tried to improve the
clarity of some text. Please take a look and let me know if you
disagree with any changes.


Overall this is an improvement.  Just a couple suggestions:


@@ -863,22 +863,22 @@
ARC
   
 
-   Add support for ARC HS and ARC EM processors.
+   Added support for ARC HS and ARC EM processors has been added.
 


Too many "added"s now.



Oops, I started changing them all to "Support for ... has been added"
then reverted it, but not completely.


GCC's already extensive testsuite has gained some new
  capabilities, to further improve the reliability of the compiler:

-  GCC now has has an internal unit testing API and a suite of tests
+  GCC now has an internal unit testing API and a suite of tests
for programmatic self-testing of subsystems.


s/unit testing API/unit-testing API/


Thanks, I'll make those changes.




[PATCH][GCC][ARM] Adjust costs so udiv is preferred over sdiv when both are valid. [Patch (2/2)]

2017-05-02 Thread Tamar Christina
Hi All, 

This patch adjusts the cost model so that when both sdiv and udiv are possible
it prefers udiv over sdiv. This was done by making sdiv slightly more expensive
instead of making udiv cheaper to keep the baseline costs of a division the same
as before.
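
As a rough illustration of when "both sdiv and udiv are possible": if both
operands are known to be non-negative, the signed and unsigned divisions give
the same result, so either instruction is valid. The function names in this
sketch are hypothetical and only mirror the shape of the testcase below.

```c
/* Both operands come from unsigned narrow types, so after promotion
   to int they are known to be non-negative.  */
int
div_signed (unsigned short x, unsigned char d)
{
  return (int) x / (int) d;
}

/* The same operands divided as unsigned values.  */
int
div_unsigned (unsigned short x, unsigned char d)
{
  return (int) ((unsigned) x / (unsigned) d);
}

/* A negative dividend: here the two interpretations differ, so only
   the signed division gives the intended result.  */
int
neg_div_signed (unsigned char d)
{
  return -100 / (int) d;
}

unsigned
neg_div_unsigned (unsigned char d)
{
  return (unsigned) -100 / (unsigned) d;
}
```

The first pair is the situation where the cost model gets to choose; the
second pair is why a case like -100 / p[1] must stay a signed division.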

Similar to aarch64 this patch along with my other two related mid-end changes
makes a big difference in division by constants.

Given:

int f2(int x)
{
  return ((x * x) % 300) + ((x * x) / 300);
}

we now generate

f2:
mul r3, r0, r0
mov r0, r3
ldr r1, .L3
umull   r2, r3, r0, r1
lsr r2, r3, #5
add r3, r2, r2, lsl #2
rsb r3, r3, r3, lsl #4
sub r0, r0, r3, lsl #2
add r0, r0, r2
bx  lr

as opposed to

f2:
mul r3, r0, r0
mov r0, r3
ldr r3, .L4
push {r4, r5}
smull   r4, r5, r0, r3
asr r3, r0, #31
rsb r3, r3, r5, asr #5
add r2, r3, r3, lsl #2
rsb r2, r2, r2, lsl #4
sub r0, r0, r2, lsl #2
add r0, r0, r3
pop {r4, r5}
bx  lr
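
For readers following the assembly: the add/rsb/sub sequence after the shift
rebuilds 300 * q from shifts and adds in order to compute the remainder. A
small C sketch of that arithmetic, with made-up function names and the
corresponding instructions from the first listing noted in comments:

```c
#include <stdint.h>

/* Multiply by 300 using only shifts, adds and subtracts.  */
uint32_t
times300 (uint32_t q)
{
  uint32_t t = q + (q << 2);   /* add r3, r2, r2, lsl #2  ->   5 * q */
  t = (t << 4) - t;            /* rsb r3, r3, r3, lsl #4  ->  75 * q */
  return t << 2;               /* operand r3, lsl #2      -> 300 * q */
}

/* Remainder of N by 300, given the quotient Q = N / 300.  */
uint32_t
rem300 (uint32_t n, uint32_t q)
{
  return n - times300 (q);     /* sub r0, r0, r3, lsl #2 */
}
```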

Bootstrapped and reg tested on arm-none-eabi
with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-05-02  Tamar Christina  

* config/arm/arm.c (arm_rtx_costs_internal): Make sdiv more expensive 
than udiv.


gcc/testsuite/
2017-05-02  Tamar Christina  

* gcc.target/arm/sdiv_costs_1.c: New.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index b24143e32e2f10f3b150f7ed0df4fabb3cc8..ecc7688b1db6309a4dd694a8e254e64abe14d7e3 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9258,6 +9258,8 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	*cost += COSTS_N_INSNS (speed_p ? extra_cost->mult[0].idiv : 0);
   else
 	*cost = LIBCALL_COST (2);
+
+  *cost += (code == DIV ? 1 : 0);
   return false;	/* All arguments must be in registers.  */
 
 case MOD:
@@ -9280,7 +9282,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
 
 /* Fall-through.  */
 case UMOD:
-  *cost = LIBCALL_COST (2);
+  *cost = LIBCALL_COST (2) + (code == MOD ? 1 : 0);
   return false;	/* All arguments must be in registers.  */
 
 case ROTATE:
diff --git a/gcc/testsuite/gcc.target/arm/sdiv_costs_1.c b/gcc/testsuite/gcc.target/arm/sdiv_costs_1.c
new file mode 100644
index ..76086ab9ce28fceb37a4e8a615a38923fa7b985a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/sdiv_costs_1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8-a" } */
+
+/* Both sdiv and udiv can be used here, so prefer udiv.  */
+int f1 (unsigned char *p)
+{
+  return 100 / p[1];
+}
+
+int f2 (unsigned char *p, unsigned short x)
+{
+  return x / p[0];
+}
+
+int f3 (unsigned char *p, int x)
+{
+  x &= 0x7fff;
+  return x / p[0];
+}
+
+int f5 (unsigned char *p, unsigned short x)
+{
+  return x % p[0];
+}
+
+/* This should only generate signed divisions.  */
+int f4 (unsigned char *p)
+{
+  return -100 / p[1];
+}
+
+int f6 (unsigned char *p, short x)
+{
+  return x % p[0];
+}
+
+/* { dg-final { scan-assembler-times "udiv\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+" 4 } } */
+/* { dg-final { scan-assembler-times "sdiv\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+" 2 } } */



[PATCH][GCC][AARCH64]Adjust costs so udiv is preferred over sdiv when both are valid. [Patch (1/2)]

2017-05-02 Thread Tamar Christina
Hi All, 

This patch adjusts the cost model so that when both sdiv and udiv are possible
it prefers udiv over sdiv. This was done by making sdiv slightly more expensive
instead of making udiv cheaper to keep the baseline costs of a division the same
as before.

For aarch64 this patch along with my other two related mid-end changes
makes a big difference in division by constants.

Given:

int f2(int x)
{
  return ((x * x) % 300) + ((x * x) / 300);
}

we now generate

f2:
mul w0, w0, w0
mov w1, 33205
movk w1, 0x1b4e, lsl 16
mov w2, 300
umull   x1, w0, w1
lsr x1, x1, 37
msub w0, w1, w2, w0
add w0, w0, w1
ret

as opposed to

f2:
mul w0, w0, w0
mov w2, 33205
movk w2, 0x1b4e, lsl 16
mov w3, 300
smull   x1, w0, w2
umull   x2, w0, w2
asr x1, x1, 37
sub w1, w1, w0, asr 31
lsr x2, x2, 37
msub w0, w1, w3, w0
add w0, w0, w2
ret
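
For reference, the umull/lsr pair in the first listing is the usual
multiply-and-shift replacement for an unsigned division by 300. A minimal C
sketch follows; the macro and function names are illustrative and not part of
the patch, while the constant and the shift amount are taken from the
assembly (0x1b4e << 16 | 33205 = 0x1b4e81b5, which is 2^37 / 300 rounded up).

```c
#include <stdint.h>

/* Constant built by the mov/movk pair: ceil (2^37 / 300).  */
#define MAGIC_300 UINT32_C (0x1b4e81b5)

/* Unsigned division by 300 without a divide instruction.  */
uint32_t
udiv300 (uint32_t n)
{
  /* umull: widening 32x32 -> 64-bit multiply;
     lsr:   logical shift right by 37.  */
  return (uint32_t) (((uint64_t) n * MAGIC_300) >> 37);
}
```

The signed variant in the second listing needs the extra asr/sub fix-up for
negative inputs, which is where the additional instructions come from.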

Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-05-02  Tamar Christina  

* config/aarch64/aarch64.c (aarch64_rtx_costs): Make sdiv more 
expensive than udiv.
Remove floating point cases from mod.

gcc/testsuite/
2017-05-02  Tamar Christina  

* gcc.target/aarch64/sdiv_costs_1.c: New.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 4f769a40a4e9de83cb5aacfd3ff58301c2feeb78..1f4fe51eda9057f1ccaded8e0d5ccd4bc3bc11ab 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7484,17 +7484,13 @@ cost_plus:
 case UMOD:
   if (speed)
 	{
+	  /* Slightly prefer UMOD over SMOD.  */
 	  if (VECTOR_MODE_P (mode))
 	*cost += extra_cost->vect.alu;
 	  else if (GET_MODE_CLASS (mode) == MODE_INT)
 	*cost += (extra_cost->mult[mode == DImode].add
-		  + extra_cost->mult[mode == DImode].idiv);
-	  else if (mode == DFmode)
-	*cost += (extra_cost->fp[1].mult
-		  + extra_cost->fp[1].div);
-	  else if (mode == SFmode)
-	*cost += (extra_cost->fp[0].mult
-		  + extra_cost->fp[0].div);
+		  + extra_cost->mult[mode == DImode].idiv
+		  + (code == MOD ? 1 : 0));
 	}
   return false;  /* All arguments need to be in registers.  */
 
@@ -7508,7 +7504,9 @@ cost_plus:
 	  else if (GET_MODE_CLASS (mode) == MODE_INT)
 	/* There is no integer SQRT, so only DIV and UDIV can get
 	   here.  */
-	*cost += extra_cost->mult[mode == DImode].idiv;
+	*cost += (extra_cost->mult[mode == DImode].idiv
+		 /* Slightly prefer UDIV over SDIV.  */
+		 + (code == DIV ? 1 : 0));
 	  else
 	*cost += extra_cost->fp[mode == DFmode].div;
 	}
diff --git a/gcc/testsuite/gcc.target/aarch64/sdiv_costs_1.c b/gcc/testsuite/gcc.target/aarch64/sdiv_costs_1.c
new file mode 100644
index ..24d7f7df2089398288bdf67a489eb71d733a4450
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sdiv_costs_1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+/* Both sdiv and udiv can be used here, so prefer udiv.  */
+int f1 (unsigned char *p)
+{
+  return 100 / p[1];
+}
+
+int f2 (unsigned char *p, unsigned short x)
+{
+  return x / p[0];
+}
+
+int f3 (unsigned char *p, int x)
+{
+  x &= 0x7fff;
+  return x / p[0];
+}
+
+int f5 (unsigned char *p, unsigned short x)
+{
+  return x % p[0];
+}
+
+/* This should only generate signed divisions.  */
+int f4 (unsigned char *p)
+{
+  return -100 / p[1];
+}
+
+int f6 (unsigned char *p, short x)
+{
+  return x % p[0];
+}
+
+/* { dg-final { scan-assembler-times "udiv\tw\[0-9\]+, w\[0-9\]+" 4 } } */
+/* { dg-final { scan-assembler-times "sdiv\tw\[0-9\]+, w\[0-9\]+" 2 } } */



[PATCH] Fix documentation and a ctor in gcov.c

2017-05-02 Thread Martin Liška
On 04/28/2017 07:23 PM, Nathan Sidwell wrote:
> Write proper member initializers please.

Hi.

Done. The patch bootstraps on ppc64le-redhat-linux and survives regression
tests.
I consider the patch pre-approved.

Martin
From e694ed03b29882bbaaa02747acb188e16d459514 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 2 May 2017 13:38:57 +0200
Subject: [PATCH] Fix documentation and a ctor in gcov.c

gcc/ChangeLog:

2017-05-02  Martin Liska  

	* doc/gcov.texi: Add missing preposition.
	* gcov.c (function_info::function_info): Properly fill up
	all member variables.
---
 gcc/doc/gcov.texi | 2 +-
 gcc/gcov.c| 7 ---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index c96f86df830..706aa6cf0b0 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -325,7 +325,7 @@ containing no code.  Unexecuted lines are marked @samp{#} or
 @samp{}, depending on whether they are reachable by
 non-exceptional paths or only exceptional paths such as C++ exception
 handlers, respectively. Given @samp{-a} option, unexecuted blocks are
-marked @samp{$} or @samp{%}, depending whether a basic block
+marked @samp{$} or @samp{%}, depending on whether a basic block
 is reachable via non-exceptional or exceptional paths.
 
 Some lines of information at the start have @var{line_number} of zero.
diff --git a/gcc/gcov.c b/gcc/gcov.c
index 4e6771e79d0..a5aa4aadcac 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -435,10 +435,11 @@ static char *mangle_name (const char *, char *);
 static void release_structures (void);
 extern int main (int, char **);
 
-function_info::function_info ()
+function_info::function_info (): name (NULL), demangled_name (NULL),
+  ident (0), lineno_checksum (0), cfg_checksum (0), has_catch (0),
+  blocks (), blocks_executed (0), counts (NULL), num_counts (0),
+  line (0), src (0), next_file_fn (NULL), next (NULL)
 {
-  /* The type is POD, so that we can use memset.  */
-  memset (this, 0, sizeof (*this));
 }
 
 function_info::~function_info ()
-- 
2.12.2



[PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

2017-05-02 Thread Tamar Christina
Hi All, 

This patch adjusts the cost model for Cortex-A53 to increase the cost of
an integer division. The reason is that we always want to expand a
division by a constant into a multiply.

On the Cortex-A53 shifts are modeled to cost 1 instruction, and the
expansion has to perform two shifts and an addition. However, because the
cost model can't model things such as fusing of shifts, we have to fully
cost both shifts.

This leads to the cost model telling us that for the Cortex-A53 we can never
do the expansion. By increasing the cost of the division by two instructions
we recover the room required in the cost calculation to do the expansion.

The reason for all of this is that currently the code does not produce what
you'd expect, which is that divisions by constants are always expanded. It is
also inconsistent, because unsigned division does get expanded.

This all reduces the ability to do CSE when using signed modulo, since that
one is also expanded.

Given:

void f5(void)
{
  int x = 0;
  while (x > -1000)
  {
    g(x % 300);
    x--;
  }
}


we now generate

smull   x0, w19, w21
asr     x0, x0, 37
sub     w0, w0, w19, asr 31
msub    w0, w0, w20, w19
sub     w19, w19, #1
bl      g

as opposed to

sdiv    w0, w19, w20
msub    w0, w0, w20, w19
sub     w19, w19, #1
bl      g
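
The expanded sequence can be read as the following C sketch. The names div300 and mod300 are hypothetical, the constant 458129845 is ceil(2^37 / 300) and is assumed here to be the value held in w21, and the code assumes that right-shifting a negative value is an arithmetic shift (true for GCC, but implementation-defined in ISO C).

```c
#include <stdint.h>

/* Hypothetical helper: signed x / 300 without a divide instruction.
   458129845 == ceil (2^37 / 300).  Relies on >> of a negative value
   being an arithmetic shift, which is what GCC implements.  */
static int32_t div300 (int32_t x)
{
  int64_t prod = (int64_t) x * 458129845;  /* smull: 32x32 -> 64 multiply */
  int32_t q = (int32_t) (prod >> 37);      /* asr #37: floor of the quotient */
  return q - (x >> 31);                    /* sub ..., asr 31: add 1 when x < 0 */
}

/* Hypothetical helper: x % 300 built from the quotient above.  */
static int32_t mod300 (int32_t x)
{
  return x - div300 (x) * 300;             /* msub */
}
```

The correction by x >> 31 turns the floor produced by the shift into the truncation towards zero that C requires for negative dividends.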


Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-05-02  Tamar Christina  

* config/arm/aarch-cost-tables.h (cortexa53_extra_costs): Increase idiv
cost.

diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index 68f84b04fe2f2cbdb66c6dd0c7add097606a7878..8cff517861aea2c249a07a07f6775c60f75fb9a0 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -154,7 +154,7 @@ const struct cpu_cost_table cortexa53_extra_costs =
   COSTS_N_INSNS (1),	/* extend.  */
   COSTS_N_INSNS (1),	/* add.  */
   COSTS_N_INSNS (1),	/* extend_add.  */
-  COSTS_N_INSNS (7)		/* idiv.  */
+  COSTS_N_INSNS (9)		/* idiv.  */
 },
 /* MULT DImode */
 {



[PATCH, testsuite]: Fix g++.dg/lto/pr79671 execute failure

2017-05-02 Thread Uros Bizjak
Hello!

We have to match input and output operand (or use "+") of an empty asm
to move the value to a register.
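
A minimal sketch of the two forms, assuming GCC-style extended asm; the function names are hypothetical. With separate "=r" and "r" operands the compiler may pick different registers for input and output, so an empty asm leaves the output undefined. Using "+r", or a matching constraint such as "0", ties the two together.

```c
/* Hypothetical helpers illustrating the constraint forms.  */

/* "+r": p is both read and written, in one register.  */
static int *launder (int *p)
{
  __asm__ volatile ("" : "+r" (p));
  return p;
}

/* Matching constraint: input "0" must live in the same place as
   output operand 0.  */
static int *launder_matched (int *p)
{
  __asm__ volatile ("" : "=r" (p) : "0" (p));
  return p;
}

/* Both forms hand back the pointer unchanged.  */
static int launder_demo (void)
{
  int v = 42;
  return *launder (&v) + *launder_matched (&v);
}
```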

2017-05-02  Uros Bizjak  

* g++.dg/lto/pr79671_0.C (foo): Fix asm constraints.

Tested on alphaev68-linux-gnu (where the unpatched testcase fails) and
x86_64-linux-gnu.

OK for mainline?

Uros.
Index: g++.dg/lto/pr79671_0.C
===
--- g++.dg/lto/pr79671_0.C  (revision 247497)
+++ g++.dg/lto/pr79671_0.C  (working copy)
@@ -13,7 +13,7 @@ int __attribute__((noinline)) foo()
   new (&x) B (0);
   y = x;
   B *q = reinterpret_cast<B *> (&y);
-  asm volatile ("" : "=r" (q) : "r" (q));
+  asm volatile ("" : "+r" (q));
   return q->i;
 }
 extern "C" void bar ();


Re: [PATCH v2] Destroy arguments for _Cilk_spawn calling in the child (PR 80038)

2017-05-02 Thread Jeff Law

On 05/02/2017 01:56 AM, Xi Ruoyao wrote:

On 2017-05-02 09:16 +0200, Andreas Schwab wrote:


This could be related to --enable-checking=release:

In file included from ../../gcc/c-family/c-common.h:26:0,
  from ../../gcc/c-family/cilk.c:28:
../../gcc/c-family/cilk.c: In function 'bool cilk_set_spawn_marker(location_t, 
tree)':
../../gcc/tree.h:901:42: error: 'tree_check2' was not declared in this scope
  CALL_EXPR, AGGR_INIT_EXPR)->base.u.bits.unsigned_flag)
   ^
../../gcc/c-family/cilk.c:113:9: note: in expansion of macro 'EXPR_CILK_SPAWN'
  EXPR_CILK_SPAWN (fcall) = 1;
  ^
../../gcc/tree.h:901:42: error: 'tree_check2' was not declared in this scope
  CALL_EXPR, AGGR_INIT_EXPR)->base.u.bits.unsigned_flag)
   ^
../../gcc/c-family/cilk.c:115:9: note: in expansion of macro 'EXPR_CILK_SPAWN'
  EXPR_CILK_SPAWN (TREE_OPERAND (fcall, 1)) = 1;
  ^

Andreas.



Sorry T_T.  I've made a stupid mistake in tree.h.

Let's apply following patch, and alert the RM when backporting r247446.

2017-05-02 Xi Ruoyao 

* tree.h (EXPR_CILK_SPAWN): Use macro TREE_CHECK2 instead of
function tree_check2.

Thanks.  Installed.

jeff


Re: [PATCH GCC8][22/33]Generate TMR in new reassociation order

2017-05-02 Thread Marc Glisse

On Tue, 2 May 2017, Bin.Cheng wrote:


On Tue, May 2, 2017 at 3:09 PM, Richard Biener
 wrote:

On Wed, Apr 26, 2017 at 12:20 PM, Bin.Cheng  wrote:

This is another one where context diff might help.  No code change
from previous version.


This isn't a context diff.

Thanks for reviewing.  I used git diff -U20 to generate patch.  Maybe
20 is too small?


See option -c (instead of -u) in man diff.

--
Marc Glisse


Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

2017-05-02 Thread Marc Glisse

On Tue, 2 May 2017, Jakub Jelinek wrote:

Also, you've removed some builtin uses like __builtin_ia32_psllwi256 
above, but haven't removed those builtins from the compiler (unlike the 
intrinsics, the builtins are not supported and can be removed).


When we changed previous intrinsics, the same issue came up, and Ada folks 
asked us to keep the builtins...


--
Marc Glisse


Re: [PATCH] PR libstdc++/80553 don't allow destroying non-destructible types

2017-05-02 Thread Jonathan Wakely

On 02/05/17 17:30 +0200, Marc Glisse wrote:

On Tue, 2 May 2017, Jonathan Wakely wrote:


On 02/05/17 10:16 +0100, Jonathan Wakely wrote:

On 28/04/17 13:56 +0100, Jonathan Wakely wrote:

We optimize _Destroy and _Destroy_n to do nothing when the type has a
trivial destructor, which means we do nothing (instead of giving an
error) when trying to destroy types with deleted destructors.


I wonder if this optimisation should even exist. The compiler should
be able to optimise away a loop that just calls trivial destructors,
without help from the library.


The compiler can indeed do that optimisation, even for destructors
like ~T() { } that are empty, but not trivial according to the
language rules. The libstdc++ optimisation does make a difference at
-O0 though. If we get any more bugs in that code I think we should
just remove it though, and let the compiler do the right thing.


Does the compiler manage it for all containers, even those with 
iterators much more complicated than vector's?


It seems to for std::deque (not very complicated) and std::map
(moderately complicated). I didn't try for something like a
directory_iterator which almost certainly wouldn't get optimised away!

I'd rather keep the 
special code in the library, if it doesn't cause too much trouble.


Yes, assuming the code's correct now then we might as well keep it,
but if it's a source of too many more bugs then it starts to cause too
much trouble.




New template for 'gcc' made available

2017-05-02 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.  (If you have
any questions, send them to .)

A new POT file for textual domain 'gcc' has been made available
to the language teams for translation.  It is archived as:

http://translationproject.org/POT-files/gcc-7.1.0.pot

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

Below is the URL which has been provided to the translators of your
package.  Please inform the translation coordinator, at the address
at the bottom, if this information is not current:

ftp://ftp.gnu.org/gnu/gcc/gcc-7.1.0/gcc-7.1.0.tar.bz2

Translated PO files will later be automatically e-mailed to you.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




New French PO file for 'gcc' (version 7.1.0)

2017-05-02 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the French team of translators.  The file is available at:

http://translationproject.org/latest/gcc/fr.po

(This file, 'gcc-7.1.0.fr.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[wwwdocs, ARM, AArch64] Document ABI changes and fixes

2017-05-02 Thread Richard Earnshaw (lists)
This patch adds some release notes for the gcc ABI changes affecting ARM
and AArch64.

Does this sound reasonable?

R.
Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
retrieving revision 1.80
diff -u -r1.80 changes.html
--- changes.html1 May 2017 17:51:14 -   1.80
+++ changes.html2 May 2017 15:34:33 -
@@ -38,6 +38,17 @@
   
 
   The Cilk+ extensions to the C and C++ languages have been deprecated.
+  On ARM targets (arm*-*-*),
+  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77728";>a
+  bug introduced in GCC 5 that affects conformance to the
+  procedure call standard (AAPCS) has been fixed.  The bug affects
+  some C++ code where class objects are passed by value to
+  functions and could result in incorrect or inconsistent code
+  being generated.  This is an ABI change.  If the
+  option -Wpsabi is enabled (on by default) the
+  compiler will emit a diagnostic note for code that might be
+  affected.
+  
 
 
 
@@ -822,6 +833,11 @@
 AArch64

  
+   GCC has been updated to the latest revision of the procedure
+   call standard (AAPCS64) to provide support for parameter
+   passing when data types have been over-aligned.
+ 
+ 
The ARMv8.3-A architecture is now supported.  It can be used by
specifying the -march=armv8.3-a option.
  


Re: [wwwdocs] grammar & style fixes for /gcc-7/changes.html

2017-05-02 Thread Jonathan Wakely

On 02/05/17 08:44 -0600, Sandra Loosemore wrote:

On 05/02/2017 07:52 AM, Jonathan Wakely wrote:

This is the result of proofreading the release notes for GCC 7. Some
are obvious fixes for simple typos, but I've also tried to improve the
clarity of some text. Please take a look and let me know if you
disagree with any changes.


Overall this is an improvement.  Just a couple suggestions:


@@ -863,22 +863,22 @@
ARC
   
 
-   Add support for ARC HS and ARC EM processors.
+   Added support for ARC HS and ARC EM processors has been added.
 


Too many "added"s now.


GCC's already extensive testsuite has gained some new
  capabilities, to further improve the reliability of the compiler:

-  GCC now has has an internal unit testing API and a suite of tests
+  GCC now has an internal unit testing API and a suite of tests
for programmatic self-testing of subsystems.


s/unit testing API/unit-testing API/

-Sandra



Here's what I committed, which also removes the disclaimer about GCC 7
not being released yet.


Index: htdocs/gcc-7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
retrieving revision 1.81
diff -u -r1.81 changes.html
--- htdocs/gcc-7/changes.html	2 May 2017 16:04:29 -	1.81
+++ htdocs/gcc-7/changes.html	2 May 2017 16:49:43 -
@@ -19,10 +19,6 @@
 
 
 
-Disclaimer: GCC 7 has not been released yet, so this document is
-a work-in-progress.
-
-
 Caveats
 
   GCC now uses https://gcc.gnu.org/wiki/LRAIsDefault";>LRA (a
@@ -58,8 +54,8 @@
   A new code hoisting optimization has been added to the partial
   redundancy elimination pass.  It attempts to move evaluation of
   expressions executed on all paths to the function exit as early as
-  possible, which helps primarily for code size, but can be useful for
-  speed of generated code as well.  It is enabled by the
+  possible. This primarily helps improve code size, but can improve the
+  speed of the generated code as well.  It is enabled by the
   -fcode-hoisting option and at the -O2
   optimization level or higher (and -Os).
 
@@ -75,18 +71,18 @@
   ignored.
 
   A new interprocedural value range propagation optimization has been
-  added, which propagates integral ranges that variable values can be proven
-  to be within across the call graph.  It is enabled by the
-  -fipa-vrp option and at the -O2 optimization
-  level and higher (and -Os).
-
-  A new loop splitting optimization pass has been added.  It splits
-  certain loops if they contain a condition that is always true on one
-  side of the iteration space and always false on the other into two
-  loops where each of the new two loops iterates just on one of the sides
-  of the iteration space and the condition does not need to be checked
-  inside of the loop.  It is enabled by the -fsplit-loops
-  option and at the -O3 optimization level or higher.
+  added, which propagates integral range information across the call graph
+  when variable values can be proven to be within those ranges.  It is
+  enabled by the -fipa-vrp option and at the -O2
+  optimization level and higher (and -Os).
+
+  A new loop splitting optimization pass has been added.  Certain loops
+  which contain a condition that is always true on one side of the iteration
+  space and always false on the other are split into two loops, such that
+  each of the two new loops iterates on just one side of the iteration space
+  and the condition does not need to be checked inside of the loop.
+  It is enabled by the -fsplit-loops option and
+  at the -O3 optimization level or higher.
 
   The shrink-wrapping optimization can now separate portions of
   prologues and epilogues to improve performance if some of the
@@ -129,14 +125,14 @@
   The option is enabled by default with -fsanitize=address and disabled
   by default with -fsanitize=kernel-address.
   Compared to the LLVM compiler, where the option already exists,
-  the implementation in the GCC compiler has couple of improvements and advantages:
+  the implementation in the GCC compiler has some improvements and advantages:
   
-  A complex usage of gotos and case labels are properly handled and should not
-  report any false positive or false negatives.
+  Complex uses of gotos and case labels are properly handled and
+  should not report any false positive or false negatives.
   
   C++ temporaries are sanitized.
   Sanitization can handle invalid memory stores that are optimized out
-  by the LLVM compiler when using an optimization level.
+  by the LLVM compiler when optimization is enabled.
   
 
   
@@ -149,7 +145,7 @@
   href="http://www.dwarfstd.org/Download.php";>DWARF debugging
   information standard is supported through the -gdwarf-5
   option.  The DWARF version 4 debugging information remains the
-  default until debugging information consumers are adjusted.
+  default until consu

Re: [PATCH, GCC/LTO, ping3] Fix PR69866: LTO with def for weak alias in regular object file

2017-05-02 Thread Thomas Preudhomme

Now that GCC 7 is released, ping?

Original message below:

Hi,

This patch fixes an assert failure when linking one LTOed object file
having a weak alias with a regular object file containing a strong
definition for that same symbol. The patch is twofold:

+ do not add an alias to a partition if it is external
+ do not declare (.globl) an alias if it is external

ChangeLog entries are as follows:

*** gcc/lto/ChangeLog ***

2017-03-01  Thomas Preud'homme  

PR lto/69866
* lto/lto-partition.c (add_symbol_to_partition_1): Do not add external
aliases to partition.

*** gcc/ChangeLog ***

2017-03-01  Thomas Preud'homme  

PR lto/69866
* cgraphunit.c (cgraph_node::assemble_thunks_and_aliases): Do not
declare external aliases.

*** gcc/testsuite/ChangeLog ***

2017-02-28  Thomas Preud'homme  

PR lto/69866
* gcc.dg/lto/pr69866_0.c: New test.
* gcc.dg/lto/pr69866_1.c: Likewise.


Testing: Testsuite shows no regression when targeting Cortex-M3 with an
arm-none-eabi GCC cross-compiler, neither does it show any regression with 
native LTO-bootstrapped x86_64-linux-gnu and aarch64-linux-gnu compilers.


Is this ok for stage4?

Best regards,

Thomas

On 31/03/17 18:07, Richard Biener wrote:

On March 31, 2017 5:23:03 PM GMT+02:00, Jeff Law  wrote:

On 03/16/2017 08:05 AM, Thomas Preudhomme wrote:

Ping?

Is this ok for stage4?

Given the lack of response from Richi, I'd suggest deferring to stage1.


Honza needs to review this, I have too little knowledge here.

Richard.


jeff


diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index c82a88a599ca61b068dd9783d2a6158163809b37..580500ff922b8546d33119261a2455235edbf16d 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1972,7 +1972,7 @@ cgraph_node::assemble_thunks_and_aliases (void)
   FOR_EACH_ALIAS (this, ref)
 {
   cgraph_node *alias = dyn_cast <cgraph_node *> (ref->referring);
-  if (!alias->transparent_alias)
+  if (!alias->transparent_alias && !DECL_EXTERNAL (alias->decl))
 	{
 	  bool saved_written = TREE_ASM_WRITTEN (decl);
 
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index e27d0d1690c1fcfb39e2fac03ce0f4154031fc7c..f44fd435ed075a27e373bdfdf0464eb06e1731ef 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -178,7 +178,8 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node)
   /* Add all aliases associated with the symbol.  */
 
   FOR_EACH_ALIAS (node, ref)
-if (!ref->referring->transparent_alias)
+if (!ref->referring->transparent_alias
+	&& ref->referring->get_partitioning_class () != SYMBOL_EXTERNAL)
   add_symbol_to_partition_1 (part, ref->referring);
 else
   {
@@ -189,7 +190,8 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node)
 	  {
 	/* Nested transparent aliases are not permitted.  */
 	gcc_checking_assert (!ref2->referring->transparent_alias);
-	add_symbol_to_partition_1 (part, ref2->referring);
+	if (ref2->referring->get_partitioning_class () != SYMBOL_EXTERNAL)
+	  add_symbol_to_partition_1 (part, ref2->referring);
 	  }
   }
 
diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_0.c b/gcc/testsuite/gcc.dg/lto/pr69866_0.c
new file mode 100644
index ..f49ef8d4c1da7a21d1bfb5409d647bd18141595b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr69866_0.c
@@ -0,0 +1,13 @@
+/* { dg-lto-do link } */
+
+int _umh(int i)
+{
+  return i+1;
+}
+
+int weaks(int i) __attribute__((weak, alias("_umh")));
+
+int main()
+{
+  return weaks(10);
+}
diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_1.c b/gcc/testsuite/gcc.dg/lto/pr69866_1.c
new file mode 100644
index ..3a14f850eefaffbf659ce4642adef7900330f4ed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr69866_1.c
@@ -0,0 +1,6 @@
+/* { dg-options { -fno-lto } } */
+
+int weaks(int i)
+{
+  return i+1;
+}


[New file, Ping] Add testcase to ensure that #pragma GCC diagnostic push/pop works with -Wtraditional.

2017-05-02 Thread Eric Gallager
On 3/24/17, Eric Gallager  wrote:
> On 3/24/17, David Malcolm  wrote:
>> On Fri, 2017-03-24 at 14:10 -0400, Eric Gallager wrote:
>>> The attached test case failed with gcc 4.9 and older, but started
>>> compiling successfully with only the 1 expected warning with gcc 5.
>>> Adding it to the test suite would ensure that this behavior doesn't
>>> regress.
>>
>> Thanks for posting this.
>>
>> What's the significance of the leading space in the:
>>  #pragma GCC diagnostic pop
>> line?  Is *that* the bug?  (did we have a bug # for this, I wonder?)
>>
>
> It prints a warning without it, which would be entirely correct of it to
> do:
>
> /Users/ericgallager/gcc-git/gcc/testsuite/gcc.dg/pragma-diag-7.c:8:2:
> warning: suggest hiding #pragma from traditional C with an indented #
> [-Wtraditional]
>  #pragma GCC diagnostic pop
>   ^
>
> I only wanted the test case to be testing for the warnings about
> suffixes; another warning about the pragma would just be noise, albeit
> correct noise.
>
>>
>>> Note that I have only tested it by compiling it manually, and
>>> not by actually running it as part of the entire test suite, so
>>> please
>>> let me know if I got any of the dejagnu directives wrong.
>>
>> When I started contributing to gcc, it took me a while to figure out
>> how to run just one case in the testsuite, so in case it's helpful I'll
>> post the recipe here:
>>
>> 1) Find the pertinent Tcl script that runs the test: a .exp script in
>> the same directory, or one of the ancestors directories.  For this case
>> it's gcc.dg/dg.exp.  The significant part is the filename: dg.exp
>>
>> 2) Figure out the appropriate "make" target, normally based on the
>> source language for the test.  For this case it's "check-gcc"
>>
>> 3) Run make in your BUILDDIR/gcc, passing in a suitable value for
>> RUNTESTFLAGS based on the filename found in step 1 above.
>> For this case, giving it a couple of "-v" flags for verbosity (so that
>> we can see the command-line of the compiler invocation) it would be:
>>
>> $ make -jN && make check-gcc RUNTESTFLAGS="-v -v dg.exp=pragma-diag-7.c"
>>
>> (for some N; I like the "make && make check-FOO" construction to ensure
>> that the compiler is rebuilt before running the tests).
>>
>> ...which leads to a summary of:
>>
>> # of expected passes 3
>>
>> which looks good.
>
> Okay, I tried this, and I also got:
>
> # of expected passes  3
>
> too, so that's good.
>
>>
>> You can also use wildcards e.g.:
>>
>> make -j64 && make check-gcc RUNTESTFLAGS="-v -v dg.exp=pragma-diag-*.c"
>>
>> (and can use -jN on the "make check-FOO" invocation if there are a lot of
>> tests; I tend not to use it for a small number of tests, to avoid
>> interleaving of output in the logs).
>>
>> Thanks,
>>> Eric Gallager
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2017-03-24  Eric Gallager  
>>>
>>> * gcc.dg/pragma-diag-7.c: New test.
>>
>> I tested your new test case via the above approach and it looks good to
>> me.
>>
>> Although we're meant to only be accepting regression fixes and
>> documentation fixes right now (stage 4 of gcc 7 development) I feel
>> that extra test coverage like this also ought to be acceptable.
>
> It's okay to save it for next stage 1, I'm already submitting it later
> than I intended to, so extra waiting won't hurt.
>

Okay, GCC 7 has been released and GCC 8 stage 1 is open now, so I'm
pinging this:
https://gcc.gnu.org/ml/gcc-patches/2017-03/msg01319.html

>>
>> I don't know if the test case is sufficiently small to be exempt from
>> the FSF's paperwork requirements here:
>>   https://gcc.gnu.org/contribute.html
>> (do you have that paperwork in place?)
>>
>> Thanks
>> Dave
>
> Yes, I dropped off my copyright assignment at the FSF in December, but
> I don't have commit access yet though.
> Thanks,
> Eric
>

David, can I list you as my sponsor when applying for
write-after-approval SVN access? Or would someone else be better?
Thanks,
Eric


Re: [PATCH GCC8][07/33]Offset validity check in address expression

2017-05-02 Thread Bin.Cheng
On Mon, Apr 24, 2017 at 11:34 AM, Richard Biener
 wrote:
> On Tue, Apr 18, 2017 at 12:41 PM, Bin Cheng  wrote:
>> Hi,
>> For now, we check validity of offset by computing the maximum offset then
>> checking if offset is smaller than the max offset.  This is inaccurate, for
>> example, some targets may require offset to be aligned by power of 2.  This
>> patch introduces new interface checking validity of offset.  It also buffers
>> rtx among different calls.
>>
>> Is it OK?
>
> -  static vec max_offset_list;
> -
> +  auto_vec addr_list;
>as = TYPE_ADDR_SPACE (TREE_TYPE (use->iv->base));
>mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));
>
> -  num = max_offset_list.length ();
> +  num = addr_list.length ();
>list_index = (unsigned) as * MAX_MACHINE_MODE + (unsigned) mem_mode;
>if (list_index >= num)
>
> num here is always zero and thus the compare is always true.
>
> +  addr_list.safe_grow_cleared (list_index + MAX_MACHINE_MODE);
> +  for (; num < addr_list.length (); num++)
> +   addr_list[num] = NULL;
>
> the loop is now redundant (safe_grow_cleared)
>
> +  addr = addr_list[list_index];
> +  if (!addr)
>  {
>
> always true again...
>
> I wonder if you really intended to drop 'static' from addr_list?
> There's no caching
> across function calls.
Right, the redundancy is because I tried to cache across function
calls with declarations like:
  static unsigned num = 0;
  static GTY ((skip)) rtx *addr_list = NULL;
But this doesn't work, the addr_list[list_index] still gets corrupted somehow.

Thanks,
bin
>
> + /* Split group if aksed to, or the offset against the first
> +use can't fit in offset part of addressing mode.  IV uses
> +having the same offset are still kept in one group.  */
> + if (offset != 0 &&
> + (split_p || !addr_offset_valid_p (use, offset)))
>
> && goes to the next line.
>
> Richard.
>
>
>
>> Thanks,
>> bin
>> 2017-04-11  Bin Cheng  
>>
>> * tree-ssa-loop-ivopts.c (compute_max_addr_offset): Delete.
>> (addr_offset_valid_p): New function.
>> (split_address_groups): Check offset validity with above function.


Re: [gcn][patch] Add -mgpu option and plumb in assembler/linker

2017-05-02 Thread Martin Jambor
Hi Andrew,

sorry for replying only now but yesterday was public holiday here and
I am still only in the process of recovering from a long weekend.

While the only objection I have is the C++ style comment in
config/gcn/gcn.c, another problem, for me at least...

On Fri, Apr 28, 2017 at 06:06:39PM +0100, Andrew Stubbs wrote:
> This patch, for the "gcn" branch, does three things:
> 
> 1. Add specs to drive the LLVM assembler and linker. It requires them to be
> installed as "as" and "ld", under $target/bin, but then the compiler Just
> Works with these specs.

...is that I do not have llvm linker at hand and without it I did not
manage to make the patch produce loadable code.  Because ROCm 1.5 has
been released today, I will update our environment, which is a bit
obsolete, get llvm ld and try again.  This might take me a few days,
so please bear with me for a little more, I would like to make sure it
works on carrizos.

Thanks,

Martin

> 
> 2. Switch to HSACO format version 2, and have the assembler auto-set the
> architecture flags from -mcpu. This means the amdphdr utility is no longer
> required.
> 
> 3. Add -mgpu option and corresponding --with-gpu. I've deliberately used
> "gpu" instead of "cpu" because I want offloading compilers to be able to say
> "-mcpu=foo -foffload=-mgpu=bar", or even have the host compiler just
> understand -mgpu and DTRT.
> 
> The patch also removes the unused and unwritten "arch" and "tune" settings.
> They can be added back when useful, but the assembler requires a GPU name, I
> think, so we need that as input.
> 
> OK to commit to GCN branch?
> 2017-04-28  Andrew Stubbs  
> 
>   gcc/
>   * config.gcc (amdgcn): Remove --with-arch and --with-tune.
>   Add --with-gpu, and set default to "carrizo"
>   (add_defaults): Add "gpu".
>   * config/gcn/gcn-opts.h: New file.
>   * config/gcn/gcn.c (output_file_start): Switch to HSACO version
>   2 and auto-detection of GPU type (from -mcpu).
>   (gcn_arch, gcn_tune): Remove.
>   * config/gcn/gcn.h: Include gcn-opts.h.
>   (enum processor_type): Move to gcn-opts.h.
>   (LIBGCC_SPEC, ASM_SPEC, LINK_SPEC): Define.
>   (gcn_arch, gcn_tune): Remove.
>   (OPTION_DEFAULT_SPECS): Remove "arch" and "tune"; add "gpu".
>   * config/gcn/gcn.opt: Include gcn-opts.h.
>   (gpu_type): New Enum.
>   (mgpu): New option.
> 


[1/2] PR 78736: New warning -Wenum-conversion

2017-05-02 Thread Prathamesh Kulkarni
Hi,
The attached patch attempts to add the option -Wenum-conversion for C and
Objective-C, similar to clang. It warns when a value of one enum type is
implicitly converted to another enum type, and is enabled by -Wall.

Bootstrapped+tested on x86_64-unknown-linux-gnu.
Is the patch OK for trunk ?

Thanks,
Prathamesh
2017-05-02  Prathamesh Kulkarni  

* doc/invoke.texi: Document Wenum-conversion.
* c-family/c.opt (Wenum-conversion): New option.
* c/c-typeck.c (convert_for_assignment): Handle Wenum-conversion.

testsuite/
* gcc.dg/Wenum-conversion.c: New test-case.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 9ad2f6e1fcc..e04312ec253 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -492,6 +492,10 @@ Wenum-compare
 C ObjC C++ ObjC++ Var(warn_enum_compare) Init(-1) Warning LangEnabledBy(C ObjC,Wall || Wc++-compat)
 Warn about comparison of different enum types.
 
+Wenum-conversion
+C ObjC Var(warn_enum_conversion) Init(0) Warning LangEnabledBy(C Objc,Wall)
+Warn about implicit conversion of enum types.
+
 Werror
 C ObjC C++ ObjC++
 ; Documented in common.opt
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 6f9909c6396..c9cde8d7fef 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -6309,6 +6309,20 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
}
 }
 
+  if (warn_enum_conversion)
+{
+  tree checktype = origtype != NULL_TREE ? origtype : rhstype;
+  if (checktype != error_mark_node
+ && TREE_CODE (checktype) == ENUMERAL_TYPE
+ && TREE_CODE (type) == ENUMERAL_TYPE
+ && TYPE_MAIN_VARIANT (checktype) != TYPE_MAIN_VARIANT (type))
+   {
+ gcc_rich_location loc (location);
+ warning_at_rich_loc (&loc, 0, "implicit conversion from"
+  " enum type of %qT to %qT", checktype, type);
+   }
+}
+
   if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (rhstype))
 return rhs;
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0eeea7b3b87..79b1e175374 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -273,7 +273,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wdisabled-optimization @gol
 -Wno-discarded-qualifiers  -Wno-discarded-array-qualifiers @gol
 -Wno-div-by-zero  -Wdouble-promotion  -Wduplicated-cond @gol
--Wempty-body  -Wenum-compare  -Wno-endif-labels  -Wexpansion-to-defined @gol
+-Wempty-body  -Wenum-compare -Wenum-conversion  -Wno-endif-labels  -Wexpansion-to-defined @gol
 -Werror  -Werror=*  -Wextra-semi  -Wfatal-errors @gol
 -Wfloat-equal  -Wformat  -Wformat=2 @gol
 -Wno-format-contains-nul  -Wno-format-extra-args  @gol
@@ -3754,6 +3754,7 @@ Options} and @ref{Objective-C and Objective-C++ Dialect Options}.
 -Wcomment  @gol
 -Wduplicate-decl-specifier @r{(C and Objective-C only)} @gol
 -Wenum-compare @r{(in C/ObjC; this is on by default in C++)} @gol
+-Wenum-conversion @r{in C/ObjC;} @gol
 -Wformat   @gol
 -Wint-in-bool-context  @gol
 -Wimplicit @r{(C and Objective-C only)} @gol
@@ -5961,6 +5962,12 @@ In C++ enumerated type mismatches in conditional expressions are also
 diagnosed and the warning is enabled by default.  In C this warning is 
 enabled by @option{-Wall}.
 
+@item -Wenum-conversion @r{(C, Objective-C only)}
+@opindex Wenum-conversion
+@opindex Wno-enum-conversion
+Warn when an enum value of a type is implicitly converted to an enum of
+another type. This warning is enabled by @option{-Wall}.
+
 @item -Wextra-semi @r{(C++, Objective-C++ only)}
 @opindex Wextra-semi
 @opindex Wno-extra-semi
diff --git a/gcc/testsuite/gcc.dg/Wenum-conversion.c b/gcc/testsuite/gcc.dg/Wenum-conversion.c
new file mode 100644
index 000..4459109c7cb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wenum-conversion.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-Wenum-conversion" } */
+
+enum X { x1, x2 };
+enum Y { y1, y2 };
+
+enum X obj = y1;  /* { dg-warning "implicit conversion from enum type of .enum Y. to .enum X." } */
+enum Y obj2 = y1;
+
+enum X obj3;
+void foo()
+{
+  obj3 = y2; /* { dg-warning "implicit conversion from enum type of .enum Y. to .enum X." } */
+}
+
+void bar(enum X);
+void f(void)
+{
+  bar (y1); /* { dg-warning "implicit conversion from enum type of .enum Y. to .enum X." } */
+}


[2/2] PR 78736: libgomp fallout

2017-05-02 Thread Prathamesh Kulkarni
Hi,
During gcc bootstrap, there are a couple of places where the warning got
triggered.
I suppose this isn't a false positive, since enum gomp_schedule_type
and enum omp_sched_t
are different types (although they have the same set of values)?

Bootstrap+tested on x86_64-unknown-linux-gnu.
Is this patch OK to commit ?

Thanks,
Prathamesh
2017-05-02  Prathamesh Kulkarni  

* icv.c (omp_set_schedule): Cast kind to enum gomp_schedule_type
before assigning to icv->run_sched_var.
(omp_get_schedule): Cast icv->run_sched_var to enum omp_sched_t before
assigning it to *kind.

diff --git a/libgomp/icv.c b/libgomp/icv.c
index 233d6dbe10e..71e1f677fd7 100644
--- a/libgomp/icv.c
+++ b/libgomp/icv.c
@@ -87,14 +87,14 @@ omp_set_schedule (omp_sched_t kind, int chunk_size)
 default:
   return;
 }
-  icv->run_sched_var = kind;
+  icv->run_sched_var = (enum gomp_schedule_type) kind;
 }
 
 void
 omp_get_schedule (omp_sched_t *kind, int *chunk_size)
 {
   struct gomp_task_icv *icv = gomp_icv (false);
-  *kind = icv->run_sched_var;
+  *kind = (enum omp_sched_t) icv->run_sched_var;
   *chunk_size = icv->run_sched_chunk_size;
 }
 
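
The casts in the patch above follow a general pattern: two enum types that share the same enumerator values are still distinct types, so a conversion between them is written out explicitly. A minimal standalone sketch of that pattern follows; the names `public_sched`, `internal_sched`, `set_sched` and `get_sched` are hypothetical stand-ins for illustration and are not the libgomp declarations.

```c
/* Hypothetical stand-ins for a public and an internal enum that
   deliberately share the same numeric values.  */
enum public_sched { PUB_STATIC = 1, PUB_DYNAMIC = 2 };
enum internal_sched { INT_STATIC = 1, INT_DYNAMIC = 2 };

static enum internal_sched current_sched = INT_STATIC;

/* Store a public value in the internal variable.  The explicit cast
   documents that the conversion between enum types is intentional.  */
void
set_sched (enum public_sched kind)
{
  current_sched = (enum internal_sched) kind;
}

/* Read the internal value back as the public type.  */
enum public_sched
get_sched (void)
{
  return (enum public_sched) current_sched;
}
```

Without the casts, an assignment such as `current_sched = kind;` is the kind of implicit enum-to-enum conversion the proposed warning is described as reporting.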


[PATCH] RFC: spellchecker for comments, plus -Wfixme and -Wtodo

2017-05-02 Thread David Malcolm
Currently the C/C++ frontends discard comments when parsing.
It's possible to set up libcpp to capture comments as tokens,
by setting CPP_OPTION (pfile, discard_comments) to false,
and this can be enabled using the -C command line option (see
also -CC), but c-family/c-lex.c then discards any CPP_COMMENT
tokens it sees, so they're not seen by the frontend parser.

The following patch adds an (optional) callback to libcpp
for handling comments, giving the comment content, and the
location it was seen at.  This approach allows arbitrary
logic to be wired up to comments, and avoids having to
copy the comment content to a new buffer (which the CPP_COMMENT
approach does).

This could be used by plugins to chain up on the callback
e.g. to parse specially-formatted comments, e.g. for
documentation generation, or e.g. for GObject introspection
annotations [1].

As a proof of concept, the patch uses this to add a spellchecker
for comments.  It uses the Enchant meta-library:
   https://abiword.github.io/enchant/
(essentially a wrapper around 8 different spellchecking libraries).
I didn't bother with the autotool detection for enchant, and
just hacked it in for now.

Example output:

test.c:3:46: warning: spellcheck_word: "evaulate"
When NONCONST_PRED is false the code will evaulate to constant and
  ^~~~
test.c:3:46: note: suggestion: "evaluate"
When NONCONST_PRED is false the code will evaulate to constant and
  ^~~~
  evaluate
test.c:3:46: note: suggestion: "ululate"
When NONCONST_PRED is false the code will evaulate to constant and
  ^~~~
  ululate
test.c:3:46: note: suggestion: "elevate"
When NONCONST_PRED is false the code will evaulate to constant and
  ^~~~
  elevate

License-wise, Enchant is LGPL 2.1 "or (at your option) any
later version." with a special exception to allow non-LGPL
spellchecking providers (e.g. to allow linking against an
OS-provided spellchecker).

Various FIXMEs are present (e.g. hardcoded "en_US" for the
language to spellcheck against).
Also, the spellchecker has a lot of false positives e.g.
it doesn't grok URLs (and thus complains when it sees them);
similar for DejaGnu directives etc.

Does enchant seem like a reasonable dependency for the compiler?
(it pulls in libpthread.so.0, libglib-2.0.so.0, libgmodule-2.0.so.0).
Or would this be better pursued as a plugin?  (if so, I'd
prefer the plugin to live in the source tree as an example,
rather than out-of-tree).

Unrelated to spellchecking, I also added two new options:
-Wfixme and -Wtodo, for warning when comments containing
"FIXME" or "TODO" are encountered.
I use such comments a lot during development.  I thought some
people might want a warning about them (I tend to just use grep
though).  [TODO: document these in invoke.texi, add test cases]
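
As a rough illustration of the FIXME/TODO check described above, here is a minimal sketch of a substring test over comment text. It assumes the comment arrives as a pointer plus a length, as a comment callback might supply it; the function name `comment_contains_marker` is hypothetical and is not part of libcpp or of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if MARKER (e.g. "FIXME") occurs within the first LEN
   bytes of COMMENT.  The comment text need not be NUL-terminated,
   mirroring a callback that receives a pointer and a length.  */
bool
comment_contains_marker (const char *comment, size_t len, const char *marker)
{
  size_t mlen = strlen (marker);
  if (mlen == 0 || mlen > len)
    return false;
  for (size_t i = 0; i + mlen <= len; i++)
    if (memcmp (comment + i, marker, mlen) == 0)
      return true;
  return false;
}
```

A caller would run this once per marker, for example with "FIXME" and "TODO", and emit a warning at the comment's location on a match.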

Thoughts?  Does any of this sound useful?

[not yet bootstrapped; as noted above, I haven't yet done
the autoconf stuff for handling Enchant]

[1] https://wiki.gnome.org/Projects/GObjectIntrospection/Annotations

gcc/ChangeLog:
* Makefile.in (LIBS): Hack in -lenchant for now.
(OBJS): Add spellcheck-enchant.o.
* common.opt (Wfixme): New option.
(Wtodo): New option.
* spellcheck-enchant.c: New file.
* spellcheck-enchant.h: New file.

gcc/c-family/ChangeLog:
* c-lex.c: Include spellcheck-enchant.h.
(init_c_lex): Wire up spellcheck_enchant_check_comment to the
comment callback.
* c-opts.c: Include spellcheck-enchant.h.
(c_common_post_options): Call spellcheck_enchant_init.
(c_common_finish): Call spellcheck_enchant_finish.

libcpp/ChangeLog:
* include/cpplib.h (struct cpp_callbacks): Add "comment"
callback.
* lex.c (_cpp_lex_direct): Call the comment callback if non-NULL.
---
 gcc/Makefile.in  |   3 +-
 gcc/c-family/c-lex.c |   2 +
 gcc/c-family/c-opts.c|   4 +
 gcc/common.opt   |   8 ++
 gcc/spellcheck-enchant.c | 294 +++
 gcc/spellcheck-enchant.h |  33 ++
 libcpp/include/cpplib.h  |   4 +
 libcpp/lex.c |   7 ++
 8 files changed, 354 insertions(+), 1 deletion(-)
 create mode 100644 gcc/spellcheck-enchant.c
 create mode 100644 gcc/spellcheck-enchant.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f675e07..6bb3dc0 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1046,7 +1046,7 @@ BUILD_LIBDEPS= $(BUILD_LIBIBERTY)
 # How to link with both our special library facilities
 # and the system's installed libraries.
 LIBS = @LIBS@ libcommon.a $(CPPLIB) $(LIBINTL) $(LIBICONV) $(LIBBACKTRACE) \
-   $(LIBIBERTY) $(LIBDECNUMBER) $(HOST_LIBS)
+   $(LIBIBER

[committed] Support fix-it hints that add new lines

2017-05-02 Thread David Malcolm
Previously fix-it hints couldn't contain newlines.  This is
due to the need to print something user-readable for them
within diagnostic-show-locus, and for handling them within
edit-context for printing diffs and regenerating content.

This patch enables limited support for fix-it hints with newlines,
for suggesting adding new lines.
Such a fix-it hint must have exactly one newline character, at the
end of the content.  It must be an insertion at the beginning of
a line (so that e.g. fix-its that split a pre-existing line are
still rejected).

They are printed by diagnostic-show-locus with a '+' in the
left-hand margin, like this:

test.c:42:4: note: suggest adding 'break;' here
+  break;
 case 'b':
 ^

and the printer injects "spans" if the insertion location is not
near the primary range of the diagnostic e.g.:

test.c:4:2: note: unrecognized 'putchar'; suggest including ''
test.c:1:1:
+#include 

test.c:4:2:
  putchar (ch);
  ^~~
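
The restriction on newlines described above can be sketched as a small standalone check. This is an illustrative helper only, not the logic in rich_location::maybe_add_fixit; the name `fixit_insertion_ok` and the use of a 1-based column to mean "start of line" are assumptions.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if CONTENT, inserted at 1-based column COLUMN, satisfies
   the rule described above: either it has no newline at all, or it has
   exactly one newline, as its final character, and the insertion is at
   the start of a line.  */
bool
fixit_insertion_ok (const char *content, int column)
{
  const char *nl = strchr (content, '\n');
  if (nl == NULL)
    return true;            /* Ordinary single-line insertion.  */
  if (nl[1] != '\0')
    return false;           /* Newline is not the last character.  */
  return column == 1;       /* Must not split an existing line.  */
}
```

Under these assumptions, inserting "  break;\n" at column 1 is accepted, while the same text at column 5 is rejected because it would split an existing line.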

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

Committed to trunk as r247522.

I'm working on implementing both of the above examples.

gcc/ChangeLog:
* diagnostic-show-locus.c
(layout::should_print_annotation_line_p): Make private.
(layout::print_annotation_line): Make private.
(layout::annotation_line_showed_range_p): Make private.
(layout::show_ruler): Make private.
(layout::print_source_line): Make private.  Pass in line and
line_width, rather than calling location_get_source_line.  Drop
returned value.
(layout::print_leading_fixits): New method.
(layout::print_any_fixits): Rename to...
(layout::print_trailing_fixits): ...this, and make private.
Don't print newline fixits.
(diagnostic_show_locus): Move logic for printing one row into...
(layout::print_line): ...this new function.  Move the
location_get_source_line call and error-handling from
print_source_line to here.  Call print_leading_fixits, and rename
print_any_fixits to print_trailing_fixits.
(selftest::test_fixit_insert_containing_newline): Update now that
newlines are partially supported.
(selftest::test_fixit_insert_containing_newline_2): New test.
(selftest::test_fixit_replace_containing_newline): Update comments.
(selftest::diagnostic_show_locus_c_tests): Call the new test.
* edit-context.c (class added_line): New class.
(class edited_line): Describe newline handling in comment.
(edited_line::actually_edited_p): New method.
(edited_line::print_content): Delete redundant decl.
(edited_line::m_predecessors): New field.
(edited_file::print_content): Call edited_line::print_content.
(edited_file::print_diff): Update to support newlines.
(edited_file::print_diff_hunk): Likewise.
(edited_file::print_run_of_changed_lines): New function.
(edited_file::print_diff_line): Convert to...
(print_diff_line): ...this.
(edited_file::get_effective_line_count): New function.
(edited_line::edited_line): Initialize new field m_predecessors.
(edited_line::~edited_line): Clean up m_predecessors.
(edited_line::apply_fixit): Handle newlines.
(edited_line::get_effective_line_count): New function.
(edited_line::print_content): New function.
(edited_line::print_diff_lines): New function.
(selftest::test_applying_fixits_insert_containing_newline): New
test.
(selftest::test_applying_fixits_replace_containing_newline): New
test.
(selftest::insert_line): New function.
(selftest::test_applying_fixits_multiple_lines): Add example of
inserting a line.
(selftest::edit_context_c_tests): Call the new tests.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic-test-show-locus-bw.c
(test_fixit_insert_newline): New function.
* gcc.dg/plugin/diagnostic-test-show-locus-color.c
(test_fixit_insert_newline): New function.
* gcc.dg/plugin/diagnostic-test-show-locus-generate-patch.c
(test_fixit_insert_newline): New function.
* gcc.dg/plugin/diagnostic-test-show-locus-parseable-fixits.c
(test_fixit_insert_newline): New function.
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
(test_show_locus): Handle test_fixit_insert_newline.

libcpp/ChangeLog:
* include/line-map.h (class rich_location): Update description of
newline handling.
(class fixit_hint): Likewise.
(fixit_hint::ends_with_newline_p): New decl.
* line-map.c (rich_location::maybe_add_fixit): Support newlines
in fix-it hints that are insertions of single lines at the start
of a line.  Don't consolidate into such fix-it hints.
(fixit_hint::ends_with_newline_p): New method.
---
 gcc/diagnostic-show-locus.c| 211 ---
 

New Spanish PO file for 'gcc' (version 7.1.0)

2017-05-02 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Spanish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/es.po

(This file, 'gcc-7.1.0.es.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [wwwdocs, ARM, AArch64] Document ABI changes and fixes

2017-05-02 Thread Jakub Jelinek
On Tue, May 02, 2017 at 05:32:21PM +0100, Richard Earnshaw (lists) wrote:
> This patch adds some release notes for the gcc ABI changes affecting ARM
> and AArch64.
> 
> Does this sound reasonable?

LGTM.

Jakub


Re: [PATCH] RFC: spellchecker for comments, plus -Wfixme and -Wtodo

2017-05-02 Thread Mike Stump
On May 2, 2017, at 12:08 PM, David Malcolm  wrote:
> 
> As a proof of concept, the patch uses this to add a spellchecker
> for comments.

:-)

> I didn't bother with the autotool detection for enchant, and
> just hacked it in for now.

My only other comment would be that, rather than requiring it, it would be nice
to make it optional.  I can host gcc in an environment that is a bare metal
target on newlib.  autoconf can smell it, and tell that a ton of things are missing.



Re: [1/2] PR 78736: New warning -Wenum-conversion

2017-05-02 Thread Martin Sebor

On 05/02/2017 11:11 AM, Prathamesh Kulkarni wrote:

Hi,
The attached patch attempts to add option -Wenum-conversion for C and
objective-C similar to clang, which warns when an enum value of a type
is implicitly converted to enum value of another type and is enabled
by Wall.


It seems quite useful.  My only high-level concern is with
the growing number of specialized warnings and options for each
and their interaction.

I've been working on a -Wenum-assign patch that complains about
assigning to an enum variable an integer constant that doesn't
match any of the enumerators of the type.  Testing revealed that
-Wenum-assign duplicated a subset of the warnings already issued
by -Wconversion enabled with -Wpedantic.  I'm debating whether
to suppress that part of -Wenum-assign altogether or only when
-Wconversion and -Wpedantic are enabled.

My point is that these dependencies tend to be hard to discover
and understand, and the interactions tricky to get right (e.g.,
avoid duplicate warnings for similar but distinct problems).

This is not meant to be a negative comment on your patch, but
rather a comment about a general problem that might be worth
starting to think about.

One comment on the patch itself:

+ warning_at_rich_loc (&loc, 0, "implicit conversion from"
+  " enum type of %qT to %qT", checktype, type);

Unlike C++, the C front end formats an enumerated type E using
%qT as 'enum E', so the warning prints "enum type of 'enum E'",
duplicating the "enum" part.

I would suggest to simplify that to:

  warning_at_rich_loc (&loc, 0, "implicit conversion from "
   "%qT to %qT", checktype, ...

Martin

PS As an example to illustrate my concern above, consider this:

  enum __attribute__ ((packed)) E { e1 = 1 };
  enum F { f256 = 256 };

  enum E e = f256;

It triggers -Woverflow:

warning: large integer implicitly truncated to unsigned type [-Woverflow]
   enum E e = f256;
  ^~~~

also my -Wenum-assign:

warning: integer constant ‘256’ converted to ‘0’ due to limited range 
[0, 255] of type ‘‘enum E’’ [-Wassign-enum]

   enum E e = f256;
  ^~~~

and (IIUC) will trigger your new -Wenum-conversion.

Martin


Re: [driver, doc] Support escaping special characters in specs

2017-05-02 Thread Joseph Myers
On Tue, 21 Feb 2017, Rainer Orth wrote:

> 2017-01-10  Jeff Downs  
>   Rainer Orth  
> 
>   * gcc.c (handle_braces): Support escaping in switch matching
>   text.
>   * doc/invoke.texi (Spec Files): Document it.
>   Remove superfluous @code markup in items.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 09/12] [i386] Add patterns and predicates foutline-msabi-xlouges

2017-05-02 Thread Daniel Santos

Thank you for the review.

On 05/01/2017 06:18 AM, Uros Bizjak wrote:

On Thu, Apr 27, 2017 at 10:09 AM, Daniel Santos  wrote:

Adds the predicates save_multiple and restore_multiple to predicates.md,
which are used by following patterns in sse.md:

* save_multiple - insn that calls a save stub
* restore_multiple - call_insn that calls a save stub and returns to the
   function to allow a sibling call (which should typically offer better
   optimization than the restore stub as the tail call)
* restore_multiple_and_return - a jump_insn that returns from the
   function as a tail-call.
* restore_multiple_leave_return - like the above, but restores the frame
   pointer before returning.

Signed-off-by: Daniel Santos 
---
  gcc/config/i386/predicates.md | 155 ++
  gcc/config/i386/sse.md|  37 ++
  2 files changed, 192 insertions(+)

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 8f250a2e720..36fe8abc3f4 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1657,3 +1657,158 @@
(ior (match_operand 0 "register_operand")
 (and (match_code "const_int")
 (match_test "op == constm1_rtx"
+
+;; Return true if:
+;; 1. first op is a symbol reference,
+;; 2. >= 13 operands, and
+;; 3. operands 2 to end is one of:
+;;   a. save a register to a memory location, or
+;;   b. restore stack pointer.
+(define_predicate "save_multiple"
+  (match_code "parallel")
+{
+  const unsigned nregs = XVECLEN (op, 0);
+  rtx head = XVECEXP (op, 0, 0);
+  unsigned i;
+
+  if (GET_CODE (head) != USE)
+return false;
+  else
+{
+  rtx op0 = XEXP (head, 0);
+  if (op0 == NULL_RTX || GET_CODE (op0) != SYMBOL_REF)
+   return false;
+}
+
+  if (nregs < 13)
+return false;
+
+  for (i = 2; i < nregs; i++)
+{
+  rtx e, src, dest;
+
+  e = XVECEXP (op, 0, i);
+
+  switch (GET_CODE (e))
+   {
+ case SET:
+   src  = SET_SRC (e);
+   dest = SET_DEST (e);
+
+   /* storing a register to memory.  */
+   if (GET_CODE (src) == REG && GET_CODE (dest) == MEM)

Please use REG_P (...) and MEM_P (...) - and possible others -
predicates in the code.


+ {
+   rtx addr = XEXP (dest, 0);
+
+   /* Good if dest address is in RAX.  */
+   if (GET_CODE (addr) == REG
+   && REGNO (addr) == AX_REG)
+ continue;
+
+   /* Good if dest address is offset of RAX.  */
+   if (GET_CODE (addr) == PLUS
+   && GET_CODE (XEXP (addr, 0)) == REG
+   && REGNO (XEXP (addr, 0)) == AX_REG)
+ continue;
+ }
+   break;
+
+ default:
+   break;
+   }
+   return false;
+}
+  return true;
+})
+
+;; Return true if:
+;; * first op is (return) or a use (symbol reference),
+;; * >= 14 operands, and
+;; * operands 2 to end are one of:
+;;   - restoring a register from a memory location that's an offset of RSI.
+;;   - clobbering a reg
+;;   - adjusting SP
+(define_predicate "restore_multiple"
+  (match_code "parallel")
+{
+  const unsigned nregs = XVECLEN (op, 0);
+  rtx head = XVECEXP (op, 0, 0);
+  unsigned i;
+
+  switch (GET_CODE (head))
+{
+  case RETURN:
+   i = 3;
+   break;
+
+  case USE:
+  {
+   rtx op0 = XEXP (head, 0);
+
+   if (op0 == NULL_RTX || GET_CODE (op0) != SYMBOL_REF)
+ return false;
+
+   i = 1;
+   break;
+  }
+
+  default:
+   return false;
+}
+
+  if (nregs < i + 12)
+return false;
+
+  for (; i < nregs; i++)
+{
+  rtx e, src, dest;
+
+  e = XVECEXP (op, 0, i);
+
+  switch (GET_CODE (e))
+   {
+ case CLOBBER:
+   continue;

I don't see where CLOBBER is generated in ix86_emit_outlined_ms2sysv_restore.


I think this is clutter that I didn't remove after changing the stubs.


+
+ case SET:
+   src  = SET_SRC (e);
+   dest = SET_DEST (e);
+
+   /* Restoring a register from memory.  */
+   if (GET_CODE (src) == MEM && GET_CODE (dest) == REG)
+ {
+   rtx addr = XEXP (src, 0);
+
+   /* Good if src address is in RSI.  */
+   if (GET_CODE (addr) == REG
+   && REGNO (addr) == SI_REG)
+ continue;
+
+   /* Good if src address is offset of RSI.  */
+   if (GET_CODE (addr) == PLUS
+   && GET_CODE (XEXP (addr, 0)) == REG
+   && REGNO (XEXP (addr, 0)) == SI_REG)
+ continue;
+
+   /* Good if adjusting stack pointer.  */
+   if (GET_CODE (dest) == REG
+   && REGNO (dest) == SP_REG
+   && GET_CODE (src) == PLUS
+   && GET_CODE (XEXP (src, 0)) == REG
+   && REGNO (XEXP (src, 0)) == SP_REG)
+

Re: [PATCH] Pretty-printing of some unsupported expressions (PR c/35441)

2017-05-02 Thread Joseph Myers
On Fri, 10 Mar 2017, Volker Reichelt wrote:

> a) This part (with foo1 and foo2 from the testcase) is straightforward.

That part is OK.

> b) I chose the shift operators 'a << b' and 'a >> b' for the rotate
>expressions, which is not 100% correct. Would it be better to use
>something like 'lrotate(a, b)', '__lrotate__(a, b)' or 'a lrotate b'
>instead? Or is there something like an '__builtin_lrotate' that I missed?

I'd be inclined to use the notation <<< and >>> for rotation, cf. 
.
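
To make the difference between a shift and a rotate concrete, here is a minimal sketch for 32-bit values. The helper names `rotl32` and `rotr32` are hypothetical and written for illustration; they are not GCC built-ins.

```c
#include <stdint.h>

/* Rotate X left by N bits; bits shifted out at the top re-enter at
   the bottom.  N is reduced modulo 32 so no shift count reaches 32.  */
uint32_t
rotl32 (uint32_t x, unsigned n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}

/* Rotate X right by N bits; bits shifted out at the bottom re-enter
   at the top.  */
uint32_t
rotr32 (uint32_t x, unsigned n)
{
  n &= 31;
  return (x >> n) | (x << ((32 - n) & 31));
}
```

A plain `x << 1` discards the top bit, whereas `rotl32 (x, 1)` moves it to bit 0, which is why printing a rotate as an ordinary shift is only an approximation.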

> c) I chose 'max(q, b)' and 'min(q, b)'.

I think that's fine.

> In addition I found some more division operators in gcc/tree.def that
> aren't handled by the pretty-printer:
> 
>   CEIL_DIV_EXPR
>   FLOOR_DIV_EXPR
>   ROUND_DIV_EXPR
>   CEIL_MOD_EXPR
>   FLOOR_MOD_EXPR
>   ROUND_MOD_EXPR
> 
> Alas I don't have testcases for them. Nevertheless, I could handle them
> like the other MOD and DIV operators just to be safe.

These can probably appear from Ada code, but maybe not from C.
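
For readers unfamiliar with the rounding variants named above, here is a small sketch of how floor and ceiling division differ from C's built-in truncating `/`. The function names are hypothetical, the divisor is assumed to be nonzero, and overflow cases such as INT_MIN / -1 are not handled.

```c
/* Division rounding toward negative infinity.  B must be nonzero.  */
int
floor_div (int a, int b)
{
  int q = a / b;            /* C division truncates toward zero.  */
  if ((a % b != 0) && ((a < 0) != (b < 0)))
    q--;                    /* Inexact negative quotient: round down.  */
  return q;
}

/* Division rounding toward positive infinity.  B must be nonzero.  */
int
ceil_div (int a, int b)
{
  int q = a / b;
  if ((a % b != 0) && ((a < 0) == (b < 0)))
    q++;                    /* Inexact positive quotient: round up.  */
  return q;
}

/* Remainder matching floor_div; the result has the sign of B.  */
int
floor_mod (int a, int b)
{
  return a - floor_div (a, b) * b;
}
```

For -7 and 2, truncating division gives -3, floor division gives -4 and ceiling division gives -3, so the three forms would need distinct spellings when printed.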

Now we have caret diagnostics and location ranges I think we should be 
moving away from printing complicated expressions from trees anyway.  So 
for the diagnostics about calling non-functions, it would be best to make 
a location range for the called expression available if it isn't already, 
then do a diagnostic with a location that underlines that text rather than 
trying to reproduce an expression text from trees.

-- 
Joseph S. Myers
jos...@codesourcery.com


Fix ICE caused by roundoff error in ipa-inline-analysis

2017-05-02 Thread Jan Hubicka
Hi,
the following patch silences an overactive sanity check which now fires due to
roundoff errors of sreals.  Because this affects some bootstraps I am committing
it tonight.  I will add a testcase tomorrow.  My apologies for the breakage.

Bootstrapped/regtested x86_64-linux.

Honza

* ipa-inline-analysis.c (estimate_node_size_and_time): Allow roundoff
errors when comparing specialized and unspecialized times.
Index: ipa-inline-analysis.c
===
--- ipa-inline-analysis.c   (revision 247436)
+++ ipa-inline-analysis.c   (working copy)
@@ -3422,7 +3422,15 @@ estimate_node_size_and_time (struct cgra
   min_size = (*info->entry)[0].size;
   gcc_checking_assert (size >= 0);
   gcc_checking_assert (time >= 0);
-  gcc_checking_assert (nonspecialized_time >= time);
+  /* nonspecialized_time should be always bigger than specialized time.
+ Roundoff issues however may get into the way.  */
+  gcc_checking_assert ((nonspecialized_time - time) >= -1);
+
+  /* Roundoff issues may make specialized time bigger than nonspecialized
+ time.  We do not really want that to happen because some heuristics
+ may get confused by seeing negative speedups.  */
+  if (time > nonspecialized_time)
+time = nonspecialized_time;
 
   if (info->loop_iterations
   && !evaluate_predicate (info->loop_iterations, possible_truths))
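
The clamping step in the patch above can be shown in isolation. This is a sketch only: it uses `double` as a stand-in for GCC's sreal type, and the function name `clamp_specialized_time` is hypothetical.

```c
/* Return TIME limited to at most NONSPECIALIZED_TIME, so that a
   speedup computed as nonspecialized_time - result is never negative
   even when rounding made TIME slightly too large.  */
double
clamp_specialized_time (double time, double nonspecialized_time)
{
  if (time > nonspecialized_time)
    return nonspecialized_time;
  return time;
}
```

With this in place, later code can treat the difference between the two estimates as non-negative without a separate check.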


[PATCH] ggc-page loop

2017-05-02 Thread Nathan Sidwell
This loop in ggc-page confused me, because the iterator is one greater than the 
indexing value.  Also the formatting of the array indexing is incorrect.


Fixed thusly, and applied as obvious after booting on x86_64-linux-gnu

nathan
--
Nathan Sidwell
2017-05-02  Nathan Sidwell  

* ggc-page.c (move_ptes_to_front): Replace unsigned >0 with i--
check.  Fix formatting.

Index: ggc-page.c
===
--- ggc-page.c  (revision 247528)
+++ ggc-page.c  (working copy)
@@ -2507,8 +2507,6 @@ ggc_pch_finish (struct ggc_pch_data *d,
 static void
 move_ptes_to_front (int count_old_page_tables, int count_new_page_tables)
 {
-  unsigned i;
-
   /* First, we swap the new entries to the front of the varrays.  */
   page_entry **new_by_depth;
   unsigned long **new_save_in_use;
@@ -2536,10 +2534,10 @@ move_ptes_to_front (int count_old_page_t
   G.save_in_use = new_save_in_use;
 
   /* Now update all the index_by_depth fields.  */
-  for (i = G.by_depth_in_use; i > 0; --i)
+  for (unsigned i = G.by_depth_in_use; i--;)
 {
-  page_entry *p = G.by_depth[i-1];
-  p->index_by_depth = i-1;
+  page_entry *p = G.by_depth[i];
+  p->index_by_depth = i;
 }
 
   /* And last, we update the depth pointers in G.depth.  The first
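
The loop form adopted in the patch above, `for (unsigned i = n; i--;)`, is a common idiom for walking an unsigned index from n-1 down to 0. Here is a minimal standalone sketch; the function name `fill_indices_descending` and the plain array are illustrative and are not the ggc-page data structures.

```c
/* Store each element's own index into it, visiting indices N-1 down
   to 0.  The condition `i--` tests the old value and then decrements,
   so the body sees N-1, ..., 0 and the loop stops once the old value
   was 0.  When N is 0 the body never runs.  */
void
fill_indices_descending (unsigned *slots, unsigned n)
{
  for (unsigned i = n; i--;)
    slots[i] = i;
}
```

Compared with `for (i = n; i > 0; --i)` and indexing with `i-1`, the body here uses the iterator directly as the index.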


Re: [PATCH] RFC: spellchecker for comments, plus -Wfixme and -Wtodo

2017-05-02 Thread Trevor Saunders
On Tue, May 02, 2017 at 03:08:01PM -0400, David Malcolm wrote:
> Currently the C/C++ frontends discard comments when parsing.
> It's possible to set up libcpp to capture comments as tokens,
> by setting CPP_OPTION (pfile, discard_comments) to false),
> and this can be enabled using the -C command line option (see
> also -CC), but c-family/c-lex.c then discards any CPP_COMMENT
> tokens it sees, so they're not seen by the frontend parser.
> 
> The following patch adds an (optional) callback to libcpp
> for handling comments, giving the comment content, and the
> location it was seen at.  This approach allows arbitrary
> logic to be wired up to comments, and avoids having to
> copy the comment content to a new buffer (which the CPP_COMMENT
> approach does).
> 
> This could be used by plugins to chain up on the callback
> e.g. to parse specially-formatted comments, e.g. for
> documentation generation, or e.g. for GObject introspection
> annotations [1].

Making that kind of task easier seems like a good thing.  One difficulty
will be associating the comment with the declaration it's for.  In C++ it's
probably better to use attributes when possible but that won't work for
the documentation issue.

> Does enchant seem like a reasonable dependency for the compiler?
> (it pulls in libpthread.so.0, libglib-2.0.so.0, libgmodule-2.0.so.0).
> Or would this be better pursued as a plugin?  (if so, I'd
> prefer the plugin to live in the source tree as an example,
> rather than out-of-tree).

I'd kind of worry about what loading all that does to start up time, but
maybe that's premature optimization.  Anyway it seems like making things
plugins where reasonable will help modularity, so kinda seems like a
good idea anyway.

> Unrelated to spellchecking, I also added two new options:
> -Wfixme and -Wtodo, for warning when comments containing
> "FIXME" or "TODO" are encountered.
> I use such comments a lot during development.  I thought some
> people might want a warning about them (I tend to just use grep
> though).  [TODO: document these in invoke.texi, add test cases]

It seems useful if you don't have existing committed fixmes all over
already, or you can tell those apart easily.

It may also be worth supporting more wordings; for some reason Mozilla
uses XXX a lot.

> Thoughts?  Does any of this sound useful?

I didn't read the code carefully, but the ideas seem useful.

Trev


