Re: [PATCH] consider MIN_EXPR in get_size_range() (PR 85888)

2018-05-31 Thread Richard Biener
On May 31, 2018 12:42:39 AM GMT+02:00, Jeff Law  wrote:
>On 05/30/2018 03:37 AM, Richard Biener wrote:
>> On Tue, May 29, 2018 at 4:58 PM Martin Sebor 
>wrote:
>> 
>>> On 05/28/2018 03:11 AM, Richard Biener wrote:
>>>> On Fri, May 25, 2018 at 10:15 PM Martin Sebor wrote:
>>>>
>>>>> Attached is revision 3 of the patch incorporating your
>>>>> determine_value_range function with the requested changes.
>>>>
>>>> I'm somewhat torn about removing the "basic" interface on SSA names
>>>> so can you please not change get_range_info for now and instead
>>>> use determine_value_range in get_size_range for now?
>> 
>>> I can do that.  Can you explain why you're having second thoughts
>>> about going this route?
>> 
>> I've seen you recurse between both APIs, thus they call each other.
>> That's ugly which is why I prefer to keep one of them a simple
>accessor
>> to the range-info associated with an SSA name.
>> 
>> A future enhancement for the new API would be to walk def stmts
>> but then the API should stop at SSA names that do have range-info
>> associated and record range-info it computed into SSA names it
>> walked so the IL itself serves as a cache.  That requires a way
>> to see whether an SSA name has range-info rather than having
>> get_range_info recurse into the walking machinery again.
>There's a lot of similarities between what you're suggesting here and
>the ranger API that Andrew has been working on.

Maybe - at least it integrates easily with existing infrastructure and doesn't 
require yet another set of operations on ranges. 

I'll yet have to review the ranger stuff. 

Richard. 

>
>Jeff
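
The split discussed above (a plain accessor plus a separate walker that
caches what it computes on the names it visits) can be sketched in a toy
model. Everything below is an assumption made for illustration: the
toy_* types and functions are hypothetical, only loosely named after
get_range_info and determine_value_range, and do not reflect GCC's real
data structures or signatures. The only definition kind the walker looks
through is a MIN of two operands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and functions are hypothetical and are not
   GCC's real tree, get_range_info, or determine_value_range.  */
enum toy_def_kind { TOY_DEF_UNKNOWN, TOY_DEF_MIN };

struct toy_ssa_name
{
  bool has_range;            /* true once range-info is recorded */
  long min, max;             /* valid only when has_range is true */
  enum toy_def_kind kind;    /* how the name is defined */
  struct toy_ssa_name *op0;  /* operands of a TOY_DEF_MIN definition */
  struct toy_ssa_name *op1;
};

static long
toy_min (long a, long b)
{
  return a < b ? a : b;
}

/* Simple accessor: reports recorded range-info and never walks defs.  */
static bool
toy_get_range_info (const struct toy_ssa_name *name, long *min, long *max)
{
  assert (name != NULL);
  if (!name->has_range)
    return false;
  *min = name->min;
  *max = name->max;
  return true;
}

/* Walker: stops at names that already have range-info, otherwise looks
   through a MIN definition and records the result on the name, so the
   names themselves act as the cache.  */
static bool
toy_determine_value_range (struct toy_ssa_name *name, long *min, long *max)
{
  long min0, max0, min1, max1;

  if (toy_get_range_info (name, min, max))
    return true;
  if (name->kind != TOY_DEF_MIN)
    return false;
  if (!toy_determine_value_range (name->op0, &min0, &max0)
      || !toy_determine_value_range (name->op1, &min1, &max1))
    return false;

  /* MIN of [min0, max0] and [min1, max1] lies in
     [MIN (min0, min1), MIN (max0, max1)].  */
  *min = toy_min (min0, min1);
  *max = toy_min (max0, max1);
  name->has_range = true;
  name->min = *min;
  name->max = *max;
  return true;
}
```

Because the accessor never calls the walker, the two do not recurse into
each other, and a second query on the same name is answered from the
recorded range.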



Re: [PATCH][AArch64] Fix aarch64_ira_change_pseudo_allocno_class

2018-05-31 Thread Christophe Lyon
On 29 May 2018 at 19:34, Wilco Dijkstra  wrote:
> James Greenhalgh wrote:
>
>> > Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
>>
>> > I'd prefer more detail than this for a workaround; which test, why did it
>> > start to fail, why is this the right solution, etc.
>
> It was gcc.target/aarch64/vect_copy_lane_1.c generating:
>
> test_copy_laneq_f64:
>         umov    x0, v1.d[1]
>         fmov    d0, x0
>         ret
>
> For some reason returning a double uses DImode temporaries, so it's essential
> to prefer FP_REGS here and mark the lane copy correctly.
>
> Wilco
>

Hi Wilco,

This has probably been reported elsewhere already but I can't find
such a report, so sorry for possible duplicate,
but this patch is causing ICEs on aarch64
FAIL:gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve
(internal compiler error)
FAIL:gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve
(internal compiler error)

and also many scan-assembler regressions:

http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/260951/report-build-info.html

Can you check?

Thanks

Christophe


Re: [PATCH][AArch64] Improve LDP/STP generation that requires a base register

2018-05-31 Thread Christophe Lyon
Hi,

On 29 May 2018 at 18:02, James Greenhalgh  wrote:
> On Tue, May 29, 2018 at 10:28:27AM -0500, Kyrill Tkachov wrote:
>> [sending on behalf of Jackson Woodruff]
>>
>> Hi all,
>>
>> This patch generalizes the formation of LDP/STP that require a base register.
>>
>> In AArch64, LDP/STP instructions have different sized immediate offsets than
>> normal LDR/STR instructions. This part of the backend attempts to spot groups
>> of four LDR/STR instructions that can be turned into LDP/STP instructions by
>> using a base register.
>>
>> Previously, we would only accept address pairs that were ordered in ascending
>> or descending order, and only strictly sequential loads/stores. In fact, the
>> instructions that we generate from this should be able to consider any order
>> of loads or stores (provided that they can be re-ordered). They should also be
>> able to accept non-sequential loads and stores provided that the two pairs of
>> addresses are amenable to pairing. The current code is also overly restrictive
>> on the range of addresses that are accepted, as LDP/STP instructions may take
>> negative offsets as well as positive ones.
>>
>> This patch improves that by allowing us to accept all orders of loads/stores
>> that are valid, and extending the range that the LDP/STP addresses can reach.
>
> OK.
>

The new test ldp_stp_10.c fails in ILP32 mode:
FAIL:gcc.target/aarch64/ldp_stp_10.c scan-assembler-times
ldp\tw[0-9]+, w[0-9]+,  2
FAIL:gcc.target/aarch64/ldp_stp_10.c scan-assembler-times
ldp\tx[0-9]+, x[0-9]+,  2

Christophe

> Thanks,
> James
>
>
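
The point about offset ranges in the quoted description can be made
concrete with a small sketch. It assumes the A64 encoding in which LDP/STP
take a signed 7-bit immediate scaled by the access size, giving byte
offsets of -256..252 for W registers and -512..504 for X registers. The
helper names are hypothetical and this is not the code in the AArch64
backend.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC code: true if OFFSET (in bytes) fits the
   signed, scaled 7-bit immediate of an LDP/STP of ACCESS_SIZE-byte
   registers (4 for W registers, 8 for X registers).  */
static bool
toy_ldp_offset_ok (long offset, int access_size)
{
  assert (access_size == 4 || access_size == 8);
  if (offset % access_size != 0)
    return false;
  long scaled = offset / access_size;
  return scaled >= -64 && scaled <= 63;
}

/* Hypothetical helper: true if both pairs are reachable from BASE, where
   LOW and HIGH are the byte offsets of the first load of each pair.  */
static bool
toy_base_reaches_both (long base, long low, long high, int access_size)
{
  return toy_ldp_offset_ok (low - base, access_size)
	 && toy_ldp_offset_ok (high - base, access_size);
}
```

For two W-register pairs at byte offsets 0 and 508, a base of 0 cannot
reach the second pair, while a base of 256 reaches both (at -256 and 252),
which is only possible because negative offsets are accepted.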


Re: [PATCH][AArch64] Fix aarch64_ira_change_pseudo_allocno_class

2018-05-31 Thread Richard Sandiford
Christophe Lyon  writes:
> On 29 May 2018 at 19:34, Wilco Dijkstra  wrote:
>> James Greenhalgh wrote:
>>
>>> > Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
>>>
>>> > I'd prefer more detail than this for a workaround; which test, why did it
>>> > start to fail, why is this the right solution, etc.
>>
>> It was gcc.target/aarch64/vect_copy_lane_1.c generating:
>>
>> test_copy_laneq_f64:
>>         umov    x0, v1.d[1]
>>         fmov    d0, x0
>>         ret
>>
>> For some reason returning a double uses DImode temporaries, so it's essential
>> to prefer FP_REGS here and mark the lane copy correctly.
>>
>> Wilco
>>
>
> Hi Wilco,
>
> This has probably been reported elsewhere already but I can't find
> such a report, so sorry for possible duplicate,
> but this patch is causing ICEs on aarch64
> FAIL:gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve
> (internal compiler error)
> FAIL:gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve
> (internal compiler error)
>
> and also many scan-assembler regressions:
>
> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/260951/report-build-info.html

Thanks for the heads-up.  Looks like they're all SVE, so I'll take this.

Richard


[patch] fix libsanitizer build on sparc64 (32bit multilib)

2018-05-31 Thread Matthias Klose
The fix for PR85835 causes the build to fail on sparc64-linux-gnu in the 32bit
multilib.  Testing the attached patch in a multilib enabled sparc64 cross build.
 Ok for the trunk and branches if the build succeeds?

Matthias

2018-05-31  Matthias Klose  

PR sanitizer/86012
* sanitizer_common/sanitizer_platform_limits_posix.cc: Define
SIZEOF_STRUCT_USTAT for 32bit sparc.


libsanitizer/

2018-05-31  Matthias Klose  

	PR sanitizer/86012
	* sanitizer_common/sanitizer_platform_limits_posix.cc: Define
	SIZEOF_STRUCT_USTAT for 32bit sparc.

--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc
@@ -256,7 +256,7 @@
   || defined(__x86_64__)
 #define SIZEOF_STRUCT_USTAT 32
 #elif defined(__arm__) || defined(__i386__) || defined(__mips__) \
-  || defined(__powerpc__) || defined(__s390__)
+  || defined(__powerpc__) || defined(__s390__) || defined(__sparc__)
 #define SIZEOF_STRUCT_USTAT 20
 #else
 #error Unknown size of struct ustat


Re: [patch] fix libsanitizer build on sparc64 (32bit multilib)

2018-05-31 Thread Jakub Jelinek
On Thu, May 31, 2018 at 11:32:50AM +0200, Matthias Klose wrote:
> The fix for PR85835 causes the build to fail on sparc64-linux-gnu in the 32bit
> multilib.  Testing the attached patch in a multilib enabled sparc64 cross 
> build.
>  Ok for the trunk and branches if the build succeeds?
> 
> Matthias
> 
> 2018-05-31  Matthias Klose  
> 
> PR sanitizer/86012
> * sanitizer_common/sanitizer_platform_limits_posix.cc: Define
> SIZEOF_STRUCT_USTAT for 32bit sparc.

Ok, though if you could also propagate it upstream, it would be appreciated.

> 2018-05-31  Matthias Klose  
> 
>   PR sanitizer/86012
>   * sanitizer_common/sanitizer_platform_limits_posix.cc: Define
>   SIZEOF_STRUCT_USTAT for 32bit sparc.
> 
> --- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc
> +++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc
> @@ -256,7 +256,7 @@
>|| defined(__x86_64__)
>  #define SIZEOF_STRUCT_USTAT 32
>  #elif defined(__arm__) || defined(__i386__) || defined(__mips__) \
> -  || defined(__powerpc__) || defined(__s390__)
> +  || defined(__powerpc__) || defined(__s390__) || defined(__sparc__)
>  #define SIZEOF_STRUCT_USTAT 20
>  #else
>  #error Unknown size of struct ustat

Jakub


Re: [PATCH] Avoid hot/cold partitioning in naked functions (PR target/85984)

2018-05-31 Thread Richard Biener
On May 31, 2018 8:51:59 AM GMT+02:00, Jakub Jelinek  wrote:
>Hi!
>
>We say in the documentation only supported body of naked functions
>is basic asm, anything else may but might not work.  On the following
>testcase we end up with ICE, because the missing epilogue means the
>first
>partition is not separated from the second partition with a barrier and
>something before that.
>
>I think easiest is just not to partition such functions, for the really
>supported case it shouldn't make a difference anyway.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Is naked an attribute that is specified for all targets? If so, OK. Otherwise 
we may instead want to add a target hook for whether a function has a 
prologue/epilogue? 

Richard. 

>2018-05-31  Jakub Jelinek  
>
>   PR target/85984
>   * bb-reorder.c (pass_partition_blocks::gate): Return false for
>   functions with naked attribute.
>
>   * gcc.target/i386/pr85984.c: New test.
>
>--- gcc/bb-reorder.c.jj   2018-05-09 20:12:28.399260557 +0200
>+++ gcc/bb-reorder.c   2018-05-30 16:01:12.113006870 +0200
>@@ -2928,8 +2928,8 @@ pass_partition_blocks::gate (function *f
> {
>   /* The optimization to partition hot/cold basic blocks into separate
>  sections of the .o file does not work well with linkonce or with
>- user defined section attributes.  Don't call it if either case
>- arises.  */
>+ user defined section attributes or with naked attribute.  Don't call
>+ it if either case arises.  */
>   return (flag_reorder_blocks_and_partition
> && optimize
> /* See pass_reorder_blocks::gate.  We should not partition if
>@@ -2937,6 +2937,7 @@ pass_partition_blocks::gate (function *f
> && optimize_function_for_speed_p (fun)
> && !DECL_COMDAT_GROUP (current_function_decl)
> && !lookup_attribute ("section", DECL_ATTRIBUTES (fun->decl))
>+&& !lookup_attribute ("naked", DECL_ATTRIBUTES (fun->decl))
> /* Workaround a bug in GDB where read_partial_die doesn't cope
>with DIEs with DW_AT_ranges, see PR81115.  */
> && !(in_lto_p && MAIN_NAME_P (DECL_NAME (fun->decl))));
>--- gcc/testsuite/gcc.target/i386/pr85984.c.jj   2018-05-30 16:08:24.951523398 +0200
>+++ gcc/testsuite/gcc.target/i386/pr85984.c   2018-05-30 16:08:12.184508165 +0200
>@@ -0,0 +1,18 @@
>+/* PR target/85984 */
>+/* { dg-do compile } */
>+/* { dg-options "-O2" } */
>+
>+int foo (void);
>+
>+void __attribute__((naked))
>+bar (void)
>+{
>+  if (!foo ())
>+__builtin_abort ();
>+}
>+
>+void
>+baz (void)
>+{
>+  bar ();
>+}
>
>   Jakub



Re: [PATCH][AArch64] Improve LDP/STP generation that requires a base register

2018-05-31 Thread Kyrill Tkachov

Hi Christophe,

On 31/05/18 09:38, Christophe Lyon wrote:

Hi,

On 29 May 2018 at 18:02, James Greenhalgh  wrote:
> On Tue, May 29, 2018 at 10:28:27AM -0500, Kyrill Tkachov wrote:
>> [sending on behalf of Jackson Woodruff]
>>
>> Hi all,
>>
>> This patch generalizes the formation of LDP/STP that require a base register.
>>
>> In AArch64, LDP/STP instructions have different sized immediate offsets than
>> normal LDR/STR instructions. This part of the backend attempts to spot groups
>> of four LDR/STR instructions that can be turned into LDP/STP instructions by
>> using a base register.
>>
>> Previously, we would only accept address pairs that were ordered in ascending
>> or descending order, and only strictly sequential loads/stores. In fact, the
>> instructions that we generate from this should be able to consider any order
>> of loads or stores (provided that they can be re-ordered). They should also be
>> able to accept non-sequential loads and stores provided that the two pairs of
>> addresses are amenable to pairing. The current code is also overly restrictive
>> on the range of addresses that are accepted, as LDP/STP instructions may take
>> negative offsets as well as positive ones.
>>
>> This patch improves that by allowing us to accept all orders of loads/stores
>> that are valid, and extending the range that the LDP/STP addresses can reach.
>
> OK.
>

The new test ldp_stp_10.c fails in ILP32 mode:
FAIL:gcc.target/aarch64/ldp_stp_10.c scan-assembler-times
ldp\tw[0-9]+, w[0-9]+,  2
FAIL:gcc.target/aarch64/ldp_stp_10.c scan-assembler-times
ldp\tx[0-9]+, x[0-9]+,  2



This is because the register allocation is such that the last load in the 
sequence clobbers the address register like so:
...
ldr w0, [x2, 1600]
ldr w1, [x2, 2108]
ldr w3, [x2, 1604]
ldr w2, [x2, 2112] //<<--- x2 is an address and a destination
...

The checks in aarch64_operands_adjust_ok_for_ldpstp bail out for this case.
I believe as long as w2 is loaded in the second/last LDP pair that this 
optimisation generates
and the address is not a writeback address (as we are guaranteed in this 
context) then it should
be safe to form the LDP pairs.
So this is a missed-optimization to me.
Can you please file a bug report?

Thanks,
Kyrill
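
The condition described above can be written down as a small predicate.
This is only a sketch of the condition as stated in this message, under
the same no-writeback assumption; the struct and function names are
hypothetical and it is not what aarch64_operands_adjust_ok_for_ldpstp
actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one LDR: destination register number and byte
   offset from the shared base register.  Not GCC's RTL representation.  */
struct toy_load
{
  int dest_reg;
  long offset;
};

/* LOADS holds four loads already sorted by offset, so loads[0..1] form the
   first LDP and loads[2..3] the second.  Returns true if BASE_REG is not
   overwritten by the first LDP, i.e. it is either not a destination at
   all or a destination only within the second pair.  Assumes no writeback
   addressing.  */
static bool
toy_base_clobber_is_safe (const struct toy_load loads[4], int base_reg)
{
  assert (loads != NULL);
  for (size_t i = 0; i < 2; i++)
    if (loads[i].dest_reg == base_reg)
      return false;
  return true;
}
```

With the sequence shown above sorted by offset (w0 at 1600, w3 at 1604,
w1 at 2108, w2 at 2112) and x2 as the base, the predicate accepts the
group because register 2 is written only in the second pair.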



Christophe

> Thanks,
> James
>
>




Re: [PATCH] Avoid hot/cold partitioning in naked functions (PR target/85984)

2018-05-31 Thread Jakub Jelinek
On Thu, May 31, 2018 at 11:46:33AM +0200, Richard Biener wrote:
> Is naked an attribute that is specified for all targets?  If so, OK. 

It is not specified for all targets, but all targets for which it is
specified have the same behavior.
We handle "naked" a couple of times in the generic code already, e.g. in
attribs.c (naked implies noinline/noclone), or in cfgexpand.c (refuse
allocation of vars on the stack for "naked" functions).

> Otherwise we may instead want to add a target hook for whether a function
> has a prologue/epilogue?

Jakub


[Ada] Set Etype on rewritten Max_Queue_Length expressions

2018-05-31 Thread Pierre-Marie de Rodat
Rewriting of Max_Queue_Length expression into N_Integer_Literal should probably
be done in expansion and not in analysis, but anyway it should not strip the
expression from its Etype because backends (e.g. GNATprove) expect that Etype
to be present.

No frontend test is provided, because GNAT doesn't care about the missing
Etype decoration. This patch allows AST processing in GNATprove to be
simplified.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-31  Piotr Trojanek  

gcc/ada/

* sem_prag.adb (Analyze_Pragma): Set Etype on the rewritten
Max_Queue_Length expression.

--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -18833,6 +18833,7 @@ package body Sem_Prag is
 
 if Nkind (Arg) /= N_Integer_Literal then
Rewrite (Arg, Make_Integer_Literal (Sloc (Arg), Val));
+   Set_Etype (Arg, Etype (Original_Node (Arg)));
 end if;
 
 Record_Rep_Item (Entry_Id, N);



[Ada] Posix 2008: reimplement System.OS_Primitives.Clock using clock_gettime

2018-05-31 Thread Pierre-Marie de Rodat
gettimeofday is deprecated in Posix 2008, clock_gettime is the recommended
replacement.
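
For readers more familiar with the C side, the same call can be sketched
as follows. This is an illustrative C analogue of the new Ada body, not
part of the patch; the two function names are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Convert a timespec to seconds as a double.  A double cannot hold
   nanosecond resolution for present-day epoch values, so this is for
   illustration only.  */
static double
timespec_to_seconds (const struct timespec *ts)
{
  assert (ts != NULL);
  return (double) ts->tv_sec + (double) ts->tv_nsec / 1e9;
}

/* Read CLOCK_REALTIME with clock_gettime.  Returns 0 on success and -1
   on failure, leaving *SECONDS untouched in the latter case.  */
static int
realtime_clock_seconds (double *seconds)
{
  struct timespec ts;

  if (clock_gettime (CLOCK_REALTIME, &ts) != 0)
    return -1;
  *seconds = timespec_to_seconds (&ts);
  return 0;
}
```

Unlike gettimeofday, which reports microseconds in a timeval,
clock_gettime fills a timespec with nanoseconds, hence the division by
10**9 rather than 10**6.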

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-31  Doug Rupp  

gcc/ada/

* libgnat/s-osprim__posix2008.adb (Clock): Implement using
clock_gettime.

--- gcc/ada/libgnat/s-osprim__posix2008.adb
+++ gcc/ada/libgnat/s-osprim__posix2008.adb
@@ -32,8 +32,11 @@
 --  This version is for POSIX.1-2008-like operating systems
 
 with System.CRTL;
+with System.OS_Constants;
 package body System.OS_Primitives is
 
+   subtype int is System.CRTL.int;
+
--  ??? These definitions are duplicated from System.OS_Interface because
--  we don't want to depend on any package. Consider removing these
--  declarations in System.OS_Interface and move these ones to the spec.
@@ -54,43 +57,22 @@ package body System.OS_Primitives is
---
 
function Clock return Duration is
+  TS : aliased timespec;
+  Result : int;
 
-  type timeval is array (1 .. 3) of Long_Integer;
-  --  The timeval array is sized to contain Long_Long_Integer sec and
-  --  Long_Integer usec. If Long_Long_Integer'Size = Long_Integer'Size then
-  --  it will be overly large but that will not effect the implementation
-  --  since it is not accessed directly.
-
-  procedure timeval_to_duration
-(T: not null access timeval;
- sec  : not null access Long_Long_Integer;
- usec : not null access Long_Integer);
-  pragma Import (C, timeval_to_duration, "__gnat_timeval_to_duration");
-
-  Micro  : constant := 10**6;
-  sec: aliased Long_Long_Integer;
-  usec   : aliased Long_Integer;
-  TV : aliased timeval;
-  Result : Integer;
-  pragma Unreferenced (Result);
-
-  function gettimeofday
-(Tv : access timeval;
- Tz : System.Address := System.Null_Address) return Integer;
-  pragma Import (C, gettimeofday, "gettimeofday");
-
-   begin
-  --  The return codes for gettimeofday are as follows (from man pages):
-  --EPERM  settimeofday is called by someone other than the superuser
-  --EINVAL Timezone (or something else) is invalid
-  --EFAULT One of tv or tz pointed outside accessible address space
+  type clockid_t is new int;
+  CLOCK_REALTIME : constant clockid_t :=
+ System.OS_Constants.CLOCK_REALTIME;
 
-  --  None of these codes signal a potential clock skew, hence the return
-  --  value is never checked.
+  function clock_gettime
+(clock_id : clockid_t;
+ tp   : access timespec) return int;
+  pragma Import (C, clock_gettime, "clock_gettime");
 
-  Result := gettimeofday (TV'Access, System.Null_Address);
-  timeval_to_duration (TV'Access, sec'Access, usec'Access);
-  return Duration (sec) + Duration (usec) / Micro;
+   begin
+  Result := clock_gettime (CLOCK_REALTIME, TS'Unchecked_Access);
+  pragma Assert (Result = 0);
+  return Duration (TS.tv_sec) + Duration (TS.tv_nsec) / 10#1#E9;
end Clock;
 
-



[Ada] Post warning on object size clause for subtype

2018-05-31 Thread Pierre-Marie de Rodat
This ensures that a warning for an object size clause present on a subtype
is posted on the clause and not on a size clause present on the type.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-31  Eric Botcazou  

gcc/ada/

* einfo.ads (Object_Size_Clause): Declare.
* einfo.adb (Object_Size_Clause): New function.
* gcc-interface/utils.c (maybe_pad_type): Test Has_Size_Clause before
retrieving Size_Clause and post the warning on the object size clause
if Has_Object_Size_Clause is true.

gcc/testsuite/

* gnat.dg/size_clause1.adb: New testcase.

--- gcc/ada/einfo.adb
+++ gcc/ada/einfo.adb
@@ -8755,6 +8755,15 @@ package body Einfo is
   return N;
end Number_Formals;
 
+   ------------------------
+   -- Object_Size_Clause --
+   ------------------------
+
+   function Object_Size_Clause (Id : E) return N is
+   begin
+  return Get_Attribute_Definition_Clause (Id, Attribute_Object_Size);
+   end Object_Size_Clause;
+

-- Parameter_Mode --


--- gcc/ada/einfo.ads
+++ gcc/ada/einfo.ads
@@ -1828,7 +1828,7 @@ package Einfo is
 
 --Has_Object_Size_Clause (Flag172)
 --   Defined in entities for types and subtypes. Set if an Object_Size
---   clause has been processed for the type Used to prevent multiple
+--   clause has been processed for the type. Used to prevent multiple
 --   Object_Size clauses for a given entity.
 
 --Has_Out_Or_In_Out_Parameter (Flag110)
@@ -3753,6 +3753,15 @@ package Einfo is
 --   Applies to subprograms and subprogram types. Yields the number of
 --   formals as a value of type Pos.
 
+--Object_Size_Clause (synthesized)
+--   Applies to entities for types and subtypes. If an object size clause
+--   is present in the rep item chain for an entity then the attribute
+--   definition clause node is returned. Otherwise Object_Size_Clause
+--   returns Empty if no item is present. Usually this is only meaningful
+--   if the flag Has_Object_Size_Clause is set. This is because when the
+--   representation item chain is copied for a derived type, it can inherit
+--   an object size clause that is not applicable to the entity.
+
 --OK_To_Rename (Flag247)
 --   Defined only in entities for variables. If this flag is set, it
 --   means that if the entity is used as the initial value of an object
@@ -5782,6 +5791,7 @@ package Einfo is
--Is_Access_Protected_Subprogram_Type (synth)
--Is_Atomic_Or_VFA(synth)
--Is_Controlled   (synth)
+   --Object_Size_Clause  (synth)
--Partial_Invariant_Procedure (synth)
--Predicate_Function  (synth)
--Predicate_Function_M(synth)
@@ -7673,6 +7683,7 @@ package Einfo is
function Number_Dimensions   (Id : E) return Pos;
function Number_Entries  (Id : E) return Nat;
function Number_Formals  (Id : E) return Pos;
+   function Object_Size_Clause  (Id : E) return N;
function Parameter_Mode  (Id : E) return Formal_Kind;
function Partial_Refinement_Constituents (Id : E) return L;
function Primitive_Operations(Id : E) return L;

--- gcc/ada/gcc-interface/utils.c
+++ gcc/ada/gcc-interface/utils.c
@@ -1507,7 +1507,7 @@ built:
 	   || TREE_OVERFLOW (orig_size)
 	   || tree_int_cst_lt (size, orig_size
 {
-  Node_Id gnat_error_node = Empty;
+  Node_Id gnat_error_node;
 
   /* For a packed array, post the message on the original array type.  */
   if (Is_Packed_Array_Impl_Type (gnat_entity))
@@ -1517,8 +1517,12 @@ built:
 	   || Ekind (gnat_entity) == E_Discriminant)
 	  && Present (Component_Clause (gnat_entity)))
 	gnat_error_node = Last_Bit (Component_Clause (gnat_entity));
-  else if (Present (Size_Clause (gnat_entity)))
+  else if (Has_Size_Clause (gnat_entity))
 	gnat_error_node = Expression (Size_Clause (gnat_entity));
+  else if (Has_Object_Size_Clause (gnat_entity))
+	gnat_error_node = Expression (Object_Size_Clause (gnat_entity));
+  else
+	gnat_error_node = Empty;
 
   /* Generate message only for entities that come from source, since
 	 if we have an entity created by expansion, the message will be

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/size_clause1.adb
@@ -0,0 +1,11 @@
+procedure Size_Clause1 is
+
+  type Modular is mod 2**64;
+  for Modular'Size use 64;
+
+  subtype Enlarged_Modular is Modular;
+  for Enlarged_Modular'Object_Size use 128; --  { dg-warning "warning: 64 bits of \"Enlarged_Modular\" unused" }
+
+begin
+null;
+end Size_Clause1;



[Ada] Simplify call to Unique_Defining_Entity on protected entry declarations

2018-05-31 Thread Pierre-Marie de Rodat
Calling Unique_Defining_Entity on protected entry declarations is
equivalent to calling the simpler Defining_Entity; use the simpler routine.

Simplification only; semantics unaffected, so no test provided.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-31  Piotr Trojanek  

gcc/ada/

* sem_prag.adb (Analyze_Pragma): Replace call to Unique_Defining_Entity
with a semantically equivalent call to Defining_Entity.

--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -18795,7 +18795,7 @@ package body Sem_Prag is
   return;
end if;
 
-   Entry_Id := Unique_Defining_Entity (Entry_Decl);
+   Entry_Id := Defining_Entity (Entry_Decl);
 
 --  Otherwise the pragma is associated with an illegal construct
 



[Ada] Fix check on placement of multiple loop (in)variant pragmas

2018-05-31 Thread Pierre-Marie de Rodat
Loop (in)variants should appear next to each other, which is checked by the
GNAT frontend. As statements inserted during expansion may break this
contiguity, GNAT gives special treatment to such statements when they
originate in loop pragmas. In some cases, this special treatment was not
properly put in place, which led to spurious errors being issued.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-31  Yannick Moy  

gcc/ada/

* sem_prag.adb (Analyze_Pragma.Check_Loop_Pragma_Placement): Invert the
order of treatment between nodes recognized as loop pragmas (or
generated from one) and block statements.

--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -5931,23 +5931,9 @@ package body Sem_Prag is
Stmt := First (L);
while Present (Stmt) loop
 
-  --  Pragmas Loop_Invariant and Loop_Variant may only appear
-  --  inside a loop or a block housed inside a loop. Inspect
-  --  the declarations and statements of the block as they may
-  --  contain the first grouping.
-
-  if Nkind (Stmt) = N_Block_Statement then
- HSS := Handled_Statement_Sequence (Stmt);
-
- Check_Grouping (Declarations (Stmt));
-
- if Present (HSS) then
-Check_Grouping (Statements (HSS));
- end if;
-
   --  First pragma of the first topmost grouping has been found
 
-  elsif Is_Loop_Pragma (Stmt) then
+  if Is_Loop_Pragma (Stmt) then
 
  --  The group and the current pragma are not in the same
  --  declarative or statement list.
@@ -6004,6 +5990,24 @@ package body Sem_Prag is
 
 raise Program_Error;
  end if;
+
+  --  Pragmas Loop_Invariant and Loop_Variant may only appear
+  --  inside a loop or a block housed inside a loop. Inspect
+  --  the declarations and statements of the block as they may
+  --  contain the first grouping. This case follows the one for
+  --  loop pragmas, as block statements which originate in a
+  --  loop pragma (and so Is_Loop_Pragma will return True on
+  --  that block statement) should be treated in the previous
+  --  case.
+
+  elsif Nkind (Stmt) = N_Block_Statement then
+ HSS := Handled_Statement_Sequence (Stmt);
+
+ Check_Grouping (Declarations (Stmt));
+
+ if Present (HSS) then
+Check_Grouping (Statements (HSS));
+ end if;
   end if;
 
   Next (Stmt);



[Ada] Spurious tampering check failure

2018-05-31 Thread Pierre-Marie de Rodat
This patch modifies the transient scope mechanism to create a scope when the
condition of an iteration scheme returns a controlled result or involves the
secondary stack. As a result, a while loop which iterates over a container
properly manages the tampering bit at each iteration of the loop.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-31  Hristian Kirtchev  

gcc/ada/

* exp_ch7.adb (Find_Transient_Context): An iteration scheme is a valid
boundary for a transient scope.

gcc/testsuite/

* gnat.dg/tampering_check1.adb, gnat.dg/tampering_check1_ivectors.ads,
gnat.dg/tampering_check1_trim.adb, gnat.dg/tampering_check1_trim.ads:
New testcase.

--- gcc/ada/exp_ch7.adb
+++ gcc/ada/exp_ch7.adb
@@ -4987,6 +4987,7 @@ package body Exp_Ch7 is
| N_Entry_Body_Formal_Part
| N_Exit_Statement
| N_If_Statement
+   | N_Iteration_Scheme
| N_Terminate_Alternative
 =>
pragma Assert (Present (Prev));
@@ -5058,13 +5059,11 @@ package body Exp_Ch7 is
   return Curr;
end if;
 
---  An iteration scheme or an Ada 2012 iterator specification is
---  not a valid context because Analyze_Iteration_Scheme already
---  employs special processing for them.
+--  An Ada 2012 iterator specification is not a valid context
+--  because Analyze_Iterator_Specification already employs special
+--  processing for it.
 
-when N_Iteration_Scheme
-   | N_Iterator_Specification
-=>
+when N_Iterator_Specification =>
return Empty;
 
 when N_Loop_Parameter_Specification =>

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/tampering_check1.adb
@@ -0,0 +1,15 @@
+--  { dg-do run }
+
+with Tampering_Check1_IVectors; use Tampering_Check1_IVectors;
+with Tampering_Check1_Trim;
+
+procedure Tampering_Check1 is
+   V : Vector;
+
+begin
+   V.Append (-1);
+   V.Append (-2);
+   V.Append (-3);
+
+   Tampering_Check1_Trim (V);
+end Tampering_Check1;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/tampering_check1_ivectors.ads
@@ -0,0 +1,4 @@
+with Ada.Containers.Vectors;
+
+package Tampering_Check1_IVectors is new
+   Ada.Containers.Vectors (Positive, Integer);

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/tampering_check1_trim.adb
@@ -0,0 +1,9 @@
+procedure Tampering_Check1_Trim
+  (V : in out Tampering_Check1_IVectors.Vector) is
+   use Tampering_Check1_IVectors;
+
+begin
+   while not Is_Empty (V) and then V (V.First) < 0 loop
+  V.Delete_First;
+   end loop;
+end Tampering_Check1_Trim;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/tampering_check1_trim.ads
@@ -0,0 +1,4 @@
+with Tampering_Check1_IVectors;
+
+procedure Tampering_Check1_Trim
+  (V : in out Tampering_Check1_IVectors.Vector);



[Ada] Update comment on __atomic_compare_exchange in s-atomic_primitives

2018-05-31 Thread Pierre-Marie de Rodat
Remove mention of unavailability, long obsolete, and reword suggestion of use
to indicate that we might want to switch to an internal interface using them.
The current wording suggests just that we should bind the current
Sync_Compare_And_Swap Ada subprograms to __atomic_compare builtins instead of
__sync_compare, which would be highly confusing.
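
The difference in shape between the two families of builtins is easier to
see side by side. The sketch below uses GCC's __sync_val_compare_and_swap
and __atomic_compare_exchange_n builtins on an 8-bit object; the wrapper
names are invented for the example and it is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* __sync style: EXPECTED is passed by value and the previous contents of
   *PTR are returned, whether or not the swap happened.  */
static uint8_t
sync_style_cas (uint8_t *ptr, uint8_t expected, uint8_t desired)
{
  assert (ptr != NULL);
  return __sync_val_compare_and_swap (ptr, expected, desired);
}

/* __atomic style: EXPECTED is passed by address, the result is a success
   flag, and on failure *EXPECTED is overwritten with the current contents
   of *PTR.  Memory orders are explicit.  */
static bool
atomic_style_cas (uint8_t *ptr, uint8_t *expected, uint8_t desired)
{
  assert (ptr != NULL && expected != NULL);
  return __atomic_compare_exchange_n (ptr, expected, desired,
				      false /* strong variant */,
				      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

Since the second form returns a Boolean and takes the expected value by
address, it would not be a drop-in binding for a subprogram declared with
the Sync_Compare_And_Swap profile, which is consistent with moving the
comment to a separately named Atomic_Compare_Exchange_8.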

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-31  Olivier Hainque  

gcc/ada/

* libgnat/s-atopri.ads: Update comment on __atomic_compare_exchange
builtins.

--- gcc/ada/libgnat/s-atopri.ads
+++ gcc/ada/libgnat/s-atopri.ads
@@ -92,18 +92,6 @@ package System.Atomic_Primitives is
   Sync_Compare_And_Swap_8,
   "__sync_val_compare_and_swap_1");
 
-   --  ??? Should use __atomic_compare_exchange_1 (doesn't work yet):
-   --  function Sync_Compare_And_Swap_8
-   --(Ptr   : Address;
-   -- Expected  : Address;
-   -- Desired   : uint8;
-   -- Weak  : Boolean   := False;
-   -- Success_Model : Mem_Model := Seq_Cst;
-   -- Failure_Model : Mem_Model := Seq_Cst) return Boolean;
-   --  pragma Import (Intrinsic,
-   -- Sync_Compare_And_Swap_8,
-   -- "__atomic_compare_exchange_1");
-
function Sync_Compare_And_Swap_16
  (Ptr  : Address;
   Expected : uint16;
@@ -128,6 +116,20 @@ package System.Atomic_Primitives is
   Sync_Compare_And_Swap_64,
   "__sync_val_compare_and_swap_8");
 
+   --  ??? We might want to switch to the __atomic series of builtins for
+   --  compare-and-swap operations at some point.
+
+   --  function Atomic_Compare_Exchange_8
+   --(Ptr   : Address;
+   -- Expected  : Address;
+   -- Desired   : uint8;
+   -- Weak  : Boolean   := False;
+   -- Success_Model : Mem_Model := Seq_Cst;
+   -- Failure_Model : Mem_Model := Seq_Cst) return Boolean;
+   --  pragma Import (Intrinsic,
+   -- Atomic_Compare_Exchange_8,
+   -- "__atomic_compare_exchange_1");
+
--
-- Lock-free operations --
--



[Ada] Detect returning procedures annotated with No_Return

2018-05-31 Thread Pierre-Marie de Rodat
GNAT was emitting a warning about procedures with No_Return aspect on the
spec and a returning body, but failed to handle similar procedures with
no explicit spec. Now fixed.

This was also affecting GNATprove, where an undetected mismatch between the
No_Return aspect and the body was a soundness bug, i.e. GNATprove would
silently accept code that raises a run-time exception.


------------
-- Source --
------------

procedure P (X : Boolean) with No_Return is
begin
   if X then
  raise Program_Error;
   end if;
end;

-----------------
-- Compilation --
-----------------

$ gcc -c p.adb
p.adb:3:04: warning: implied return after this statement will raise
 Program_Error
p.adb:3:04: warning: procedure "P" is marked as No_Return

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-31  Piotr Trojanek  

gcc/ada/

* sem_ch6.adb (Check_Missing_Return): Handle procedures with no
explicit spec.

--- gcc/ada/sem_ch6.adb
+++ gcc/ada/sem_ch6.adb
@@ -3040,11 +3040,16 @@ package body Sem_Ch6 is
 
  --  If procedure with No_Return, check returns
 
- elsif Nkind (Body_Spec) = N_Procedure_Specification
-   and then Present (Spec_Id)
-   and then No_Return (Spec_Id)
- then
-Check_Returns (HSS, 'P', Missing_Ret, Spec_Id);
+ elsif Nkind (Body_Spec) = N_Procedure_Specification then
+if Present (Spec_Id) then
+   Id := Spec_Id;
+else
+   Id := Body_Id;
+end if;
+
+if No_Return (Id) then
+   Check_Returns (HSS, 'P', Missing_Ret, Id);
+end if;
  end if;
 
  --  Special checks in SPARK mode



[Ada] Illegal copy of limited object

2018-05-31 Thread Pierre-Marie de Rodat
This patch fixes a spurious copy of a limited object, when that object
is a discriminated record component of a limited type LT, and the enclosing
record is initialized by means of an aggregate, one of whose components is a
call to a build-in-place function that returns an unconstrained object of
type T.

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-31  Ed Schonberg  

gcc/ada/

* checks.adb (Apply_Discriminant_Check): Do not apply discriminant
check to a call to a build-in-place function, given that the return
object is limited and cannot be copied.

gcc/testsuite/

* gnat.dg/limited1.adb, gnat.dg/limited1_inner.adb,
gnat.dg/limited1_inner.ads, gnat.dg/limited1_outer.adb,
gnat.dg/limited1_outer.ads: New testcase.
--- gcc/ada/checks.adb
+++ gcc/ada/checks.adb
@@ -1458,6 +1458,19 @@ package body Checks is
  T_Typ := Typ;
   end if;
 
+  --  If the expression is a function call that returns a limited object
+  --  it cannot be copied. It is not clear how to perform the proper
+  --  discriminant check in this case because the discriminant value must
+  --  be retrieved from the constructed object itself.
+
+  if Nkind (N) = N_Function_Call
+and then Is_Limited_Type (Typ)
+and then Is_Entity_Name (Name (N))
+and then Returns_By_Ref (Entity (Name (N)))
+  then
+ return;
+  end if;
+
   --  Only apply checks when generating code and discriminant checks are
   --  not suppressed. In GNATprove mode, we do not apply the checks, but we
   --  still analyze the expression to possibly issue errors on SPARK code

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/limited1.adb
@@ -0,0 +1,9 @@
+--  { dg-do run }
+
+with Limited1_Outer; use Limited1_Outer;
+
+procedure Limited1 is
+   X : Outer_Type := Make_Outer;
+begin
+   null;
+end;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/limited1_inner.adb
@@ -0,0 +1,15 @@
+package body Limited1_Inner is
+   overriding procedure Finalize (X : in out Limited_Type) is
+   begin
+  if X.Self /= X'Unchecked_Access then
+ raise Program_Error with "Copied!";
+  end if;
+   end;
+
+   function Make_Inner return Inner_Type is
+   begin
+  return Inner : Inner_Type (True) do
+ null;
+  end return;
+   end;
+end;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/limited1_inner.ads
@@ -0,0 +1,18 @@
+with Ada.Finalization;
+package Limited1_Inner is
+   type Limited_Type is new Ada.Finalization.Limited_Controlled with record
+  Self : access Limited_Type := Limited_Type'Unchecked_Access;
+   end record;
+   overriding procedure Finalize (X : in out Limited_Type);
+
+   type Inner_Type (What : Boolean) is record
+  case What is
+ when False =>
+null;
+ when True =>
+L : Limited_Type;
+  end case;
+   end record;
+
+   function Make_Inner return Inner_Type;
+end;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/limited1_outer.adb
@@ -0,0 +1,6 @@
+package body Limited1_Outer is
+   function Make_Outer return Outer_Type is
+   begin
+  return (What => True, Inner => Make_Inner);
+   end;
+end;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/limited1_outer.ads
@@ -0,0 +1,9 @@
+with Limited1_Inner; use Limited1_Inner;
+
+package Limited1_Outer is
+   type Outer_Type (What : Boolean) is record
+  Inner : Inner_Type (What);
+   end record;
+
+   function Make_Outer return Outer_Type;
+end Limited1_Outer;



[Ada] Static predicate check on characters of a string literal

2018-05-31 Thread Pierre-Marie de Rodat
This patch implements the rule given in RM 4.2 (11): if the component type of
a string literal is a character type with a static predicate, that predicate
must be applied to each character in the string.

Compiling the example below must yield:

   gcc -c -gnata main.adb

  main.adb:4:23: warning: static expression fails static predicate check on "C"
  main.adb:4:23: warning: expression is no longer considered static
  main.adb:4:24: warning: static expression fails static predicate check on "C"
  main.adb:4:24: warning: expression is no longer considered static
  main.adb:4:25: warning: static expression fails static predicate check on "C"
  main.adb:4:25: warning: expression is no longer considered static

Execution must yield:

  raised SYSTEM.ASSERTIONS.ASSERT_FAILURE :
Static_Predicate failed at main.adb:4


procedure Main is
   subtype C is Character with Static_Predicate => C in 'A' | 'B' | 'C';
   type S is array (Positive range <>) of C;
   X : constant S := "abc";
begin
   null;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-31  Ed Schonberg  

gcc/ada/

* sem_res.adb (Resolve_String_Literal): If the type is a string type
whose component subtype has a static predicate, ensure that the
predicate is applied to each character by expanding the string into the
equivalent aggregate. This is also done if the component subtype is
constrained.
--- gcc/ada/sem_res.adb
+++ gcc/ada/sem_res.adb
@@ -10774,7 +10774,9 @@ package body Sem_Res is
  --  whether the evaluation of the string will raise constraint error.
  --  Otherwise we need to transform the string literal into the
  --  corresponding character aggregate and let the aggregate code do
- --  the checking.
+ --  the checking. We use the same transformation if the component
+ --  type has a static predicate, which will be applied to each
+ --  character when the aggregate is resolved.
 
  if Is_Standard_Character_Type (R_Typ) then
 
@@ -10811,7 +10813,9 @@ package body Sem_Res is
  end if;
   end loop;
 
-  return;
+  if not Has_Static_Predicate (C_Typ) then
+ return;
+  end if;
end if;
 end;
  end if;



[Ada] Fix __gnat_backtrace for VxWorks7 on x86

2018-05-31 Thread Pierre-Marie de Rodat
A STORAGE_ERROR is raised in __gnat_backtrace:

adainit: 0x00400DBC

Execution of ce.vxe terminated by unhandled exception
raised STORAGE_ERROR : SIGSEGV: possible stack overflow
Call stack traceback locations:
0x4082f1 0x408323 0x4080c9

It was passing with vxsim because WRS_RTP_BASE is set to a different
address there, so the (CURRENT) < (TOP_STACK) test happened to stop the
backtrace at the right frame. So let's stop at the main symbol when RTS=rtp.
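
The "stop at main" test boils down to decoding the call instruction that
precedes a return address. Below is a minimal C++ sketch of that idea,
run on a fake code buffer instead of real frames. The names
is_return_from_sketch and demo_call_matches are hypothetical and only
illustrate the technique; the helper in the patch reads the bytes
directly and compares against &main. Only the 5-byte "call rel32" form
(opcode 0xE8) is handled, as in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true when the 5 bytes just before ret_addr are a "call rel32"
// (opcode 0xE8) whose destination is `target`.  The destination of such
// a call is the return address plus the signed 32-bit displacement.
bool is_return_from_sketch(const unsigned char *target,
                           const unsigned char *ret_addr)
{
  if (ret_addr[-5] != 0xE8)
    return false;

  std::int32_t rel32;
  std::memcpy(&rel32, ret_addr - 4, sizeof rel32);  // unaligned-safe read
  return ret_addr + rel32 == target;
}

// Hypothetical driver: plants a call at code[8] that lands on code[20]
// and asks whether the return address code[13] "returns from" the
// function starting at code[target_index].
bool demo_call_matches(std::int32_t target_index)
{
  unsigned char code[32] = {};
  const int call_index = 8;
  const int ret_index = call_index + 5;        // call rel32 is 5 bytes long
  const std::int32_t rel32 = 20 - ret_index;   // displacement to code[20]

  code[call_index] = 0xE8;
  std::memcpy(&code[call_index + 1], &rel32, sizeof rel32);
  return is_return_from_sketch(code + target_index, code + ret_index);
}
```

With this, demo_call_matches(20) is true and demo_call_matches(21) is
false, which is the property the unwinder relies on to recognise the
frame whose return address points back into main's caller.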

Tested on x86_64-pc-linux-gnu, committed on trunk

2018-05-31  Frederic Konrad  

gcc/ada/

* tracebak.c (STOP_FRAME): Harden condition.
(is_return_from, EXTRA_STOP_CONDITION): New helpers for VxWorks in RTP
mode.
--- gcc/ada/tracebak.c
+++ gcc/ada/tracebak.c
@@ -478,10 +478,11 @@ struct layout
 #define PC_ADJUST -2
 #define STOP_FRAME(CURRENT, TOP_STACK) \
   (IS_BAD_PTR((long)(CURRENT)) \
+   || (void *) (CURRENT) < (TOP_STACK) \
|| IS_BAD_PTR((long)(CURRENT)->return_address) \
|| (CURRENT)->return_address == 0 \
|| (void *) ((CURRENT)->next) < (TOP_STACK)  \
-   || (void *) (CURRENT) < (TOP_STACK))
+   || EXTRA_STOP_CONDITION(CURRENT))
 
 #define BASE_SKIP (1+FRAME_LEVEL)
 
@@ -504,6 +505,37 @@ struct layout
 || ((*((ptr) - 1) & 0xff) == 0xff) \
 || (((*(ptr) & 0xd0ff) == 0xd0ff
 
+#if defined (__vxworks) && defined (__RTP__)
+
+/* For VxWorks following backchains past the "main" frame gets us into the
+   kernel space, where it can't be dereferenced. So lets stop at the main
+   symbol.  */
+extern void main();
+
+static int
+is_return_from(void *symbol_addr, void *ret_addr)
+{
+  int ret = 0;
+  char *ptr = (char *)ret_addr;
+
+  if ((*(ptr - 5) & 0xff) == 0xe8)
+{
+  /* call addr16  E8 xx xx xx xx  */
+  int32_t offset = *(int32_t *)(ptr - 4);
+  ret = (ptr + offset) == symbol_addr;
+}
+
+  /* Others not implemented yet...  But it is very likely that call addr16
+ is used here.  */
+  return ret;
+}
+
+#define EXTRA_STOP_CONDITION(CURRENT) \
+  (is_return_from(&main, (CURRENT)->return_address))
+#else /* not (defined (__vxworks) && defined (__RTP__)) */
+#define EXTRA_STOP_CONDITION(CURRENT) (0)
+#endif /* not (defined (__vxworks) && defined (__RTP__)) */
+
 /*- qnx --*/
 
 #elif defined (__QNX__)



Re: [PATCH][AArch64] Fix aarch64_ira_change_pseudo_allocno_class

2018-05-31 Thread Wilco Dijkstra
Richard Sandiford wrote:

>> This has probably been reported elsewhere already but I can't find
>> such a report, so sorry for possible duplicate,
>> but this patch is causing ICEs on aarch64
>> FAIL:    gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve
>> (internal compiler error)
>> FAIL:    gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve
>> (internal compiler error)
>>
>> and also many scan-assembler regressions:
>>
>>  
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/260951/report-build-info.html
>
> Thanks for the heads-up.  Looks like they're all SVE, so I'll take this.

It seems this is due to unnecessary spills of PR_REGS - the subset doesn't work
for those. The original proposal doing:

  if (allocno_class != POINTER_AND_FP_REGS)
    return allocno_class;

doesn't appear to affect SVE. However the question is whether the register
allocator can get confused about PR_REGS and end up with POINTER_AND_FP_REGS
for both the allocno_class and best_class? If so then the return needs to
support predicate modes too.

Wilco

[testsuite] Run more gcc.dg/store_merging_*.c tests

2018-05-31 Thread Eric Botcazou
There is a handful of gcc.dg/store_merging_*.c tests in the testsuite with 
both a main procedure doing sanity checks and a dg-do compile directive.

Tested on x86-64/Linux, applied on the mainline and 8 branch as obvious.


2018-05-31  Eric Botcazou  

* gcc.dg/store_merging_10.c: Turn dg-do compile into dg-do run.
* gcc.dg/store_merging_11.c: Likewise.
* gcc.dg/store_merging_13.c: Likewise.
* gcc.dg/store_merging_14.c: Likewise.
* gcc.dg/store_merging_15.c: Likewise.
* gcc.dg/store_merging_16.c: Likewise.  Remove local variable.

-- 
Eric Botcazou
Index: gcc.dg/store_merging_10.c
===
--- gcc.dg/store_merging_10.c	(revision 260913)
+++ gcc.dg/store_merging_10.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do run } */
 /* { dg-require-effective-target store_merge } */
 /* { dg-options "-O2 -fdump-tree-store-merging" } */
 
Index: gcc.dg/store_merging_11.c
===
--- gcc.dg/store_merging_11.c	(revision 260913)
+++ gcc.dg/store_merging_11.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do run } */
 /* { dg-require-effective-target store_merge } */
 /* { dg-options "-O2 -fdump-tree-store-merging" } */
 
Index: gcc.dg/store_merging_13.c
===
--- gcc.dg/store_merging_13.c	(revision 260913)
+++ gcc.dg/store_merging_13.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do run } */
 /* { dg-require-effective-target store_merge } */
 /* { dg-options "-O2 -fdump-tree-store-merging" } */
 
Index: gcc.dg/store_merging_14.c
===
--- gcc.dg/store_merging_14.c	(revision 260913)
+++ gcc.dg/store_merging_14.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do run } */
 /* { dg-require-effective-target store_merge } */
 /* { dg-options "-O2 -fdump-tree-store-merging" } */
 
Index: gcc.dg/store_merging_15.c
===
--- gcc.dg/store_merging_15.c	(revision 260913)
+++ gcc.dg/store_merging_15.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do run } */
 /* { dg-require-effective-target store_merge } */
 /* { dg-options "-O2 -fdump-tree-store-merging" } */
 
Index: gcc.dg/store_merging_16.c
===
--- gcc.dg/store_merging_16.c	(revision 260913)
+++ gcc.dg/store_merging_16.c	(working copy)
@@ -1,6 +1,6 @@
 /* Only test on some 64-bit targets which do have bswap{si,di}2 patterns and
are either big or little endian (not pdp endian).  */
-/* { dg-do compile { target { lp64 && { i?86-*-* x86_64-*-* powerpc*-*-* aarch64*-*-* } } } } */
+/* { dg-do run { target { lp64 && { i?86-*-* x86_64-*-* powerpc*-*-* aarch64*-*-* } } } } */
 /* { dg-require-effective-target store_merge } */
 /* { dg-options "-O2 -fdump-tree-store-merging" } */
 
@@ -114,7 +114,7 @@ main ()
 {
   unsigned char a[8];
   int i;
-  struct S b, c, d;
+  struct S c, d;
   f1 (a, 0x0102030405060708ULL);
   for (i = 0; i < 8; ++i)
 if (a[i] != 8 - i)


Re: [testsuite] Run more gcc.dg/store_merging_*.c tests

2018-05-31 Thread Jakub Jelinek
On Thu, May 31, 2018 at 01:22:00PM +0200, Eric Botcazou wrote:
> There is a handful of gcc.dg/store_merging_*.c tests in the testsuite with 
> both a main procedure doing sanity checks and a dg-do compile directive.
> 
> Tested on x86-64/Linux, applied on the mainline and 8 branch as obvious.

Oops, thanks for fixing that.

> 2018-05-31  Eric Botcazou  
> 
>   * gcc.dg/store_merging_10.c: Turn dg-do compile into dg-do run.
>   * gcc.dg/store_merging_11.c: Likewise.
>   * gcc.dg/store_merging_13.c: Likewise.
>   * gcc.dg/store_merging_14.c: Likewise.
>   * gcc.dg/store_merging_15.c: Likewise.
>   * gcc.dg/store_merging_16.c: Likewise.  Remove local variable.

Jakub


[testsuite] Minor cleanup in gnat.dg/stack_usage* tests

2018-05-31 Thread Eric Botcazou
This replaces the scanning of the -fstack-usage output with -Wstack-usage.

Tested on x86-64/Linux, applied on the mainline and 8 branch.


2018-05-31  Eric Botcazou  

* gnat.dg/stack_usage1.adb: Replace -fstack-usage with -Wstack-usage.
* gnat.dg/stack_usage1b.adb: Likewise.
* gnat.dg/stack_usage1c.adb: Likewise.
* gnat.dg/stack_usage3.adb: Likewise.
* gnat.dg/stack_usage1_pkg.adb: Delete.

-- 
Eric Botcazou
Index: gnat.dg/stack_usage1.adb
===
--- gnat.dg/stack_usage1.adb	(revision 260913)
+++ gnat.dg/stack_usage1.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile }
--- { dg-options "-fstack-usage" }
+-- { dg-options "-Wstack-usage=128" { target i?86-*-* x86_64-*-* } }
 
 with Stack_Usage1_Pkg; use Stack_Usage1_Pkg;
 
@@ -34,6 +34,3 @@ begin
end case;
 
 end Stack_Usage1;
-
--- { dg-final { scan-stack-usage "\t\[0-9\]\[0-9\]\t" { target i?86-*-* x86_64-*-* } } }
--- { dg-final { cleanup-stack-usage } }
Index: gnat.dg/stack_usage1_pkg.adb
===
--- gnat.dg/stack_usage1_pkg.adb	(revision 260913)
+++ gnat.dg/stack_usage1_pkg.adb	(nonexistent)
@@ -1,13 +0,0 @@
-package body Stack_Usage1_Pkg is
-
-   function Ident_Int (X : Integer) return Integer is
-   begin
-  return X;
-   end Ident_Int;
-
-   procedure My_Proc (X : R) is
-   begin
-  null;
-   end My_Proc;
-
-end Stack_Usage1_Pkg;
Index: gnat.dg/stack_usage1b.adb
===
--- gnat.dg/stack_usage1b.adb	(revision 260913)
+++ gnat.dg/stack_usage1b.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile }
--- { dg-options "-O -fstack-usage" }
+-- { dg-options "-O -Wstack-usage=128" { target i?86-*-* x86_64-*-* } }
 
 with Stack_Usage1_Pkg; use Stack_Usage1_Pkg;
 
@@ -34,6 +34,3 @@ begin
end case;
 
 end Stack_Usage1b;
-
--- { dg-final { scan-stack-usage "\t\[0-9\]\[0-9\]\t" { target i?86-*-* x86_64-*-* } } }
--- { dg-final { cleanup-stack-usage } }
Index: gnat.dg/stack_usage1c.adb
===
--- gnat.dg/stack_usage1c.adb	(revision 260913)
+++ gnat.dg/stack_usage1c.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile }
--- { dg-options "-O2 -fstack-usage" }
+-- { dg-options "-O2 -Wstack-usage=128" { target i?86-*-* x86_64-*-* } }
 
 with Stack_Usage1_Pkg; use Stack_Usage1_Pkg;
 
@@ -34,6 +34,3 @@ begin
end case;
 
 end Stack_Usage1c;
-
--- { dg-final { scan-stack-usage "\t\[0-9\]\[0-9\]\t" { target i?86-*-* x86_64-*-* } } }
--- { dg-final { cleanup-stack-usage } }
Index: gnat.dg/stack_usage3.adb
===
--- gnat.dg/stack_usage3.adb	(revision 260913)
+++ gnat.dg/stack_usage3.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile }
--- { dg-options "-O -fstack-usage" }
+-- { dg-options "-O -Wstack-usage=1024" }
 
 with Ada.Text_IO; use Ada.Text_IO;
 with Stack_Usage3_Pkg; use Stack_Usage3_Pkg;
@@ -27,6 +27,3 @@ begin
Put_Line (Diag ("Diag line 19"));
Put_Line (Diag ("Diag line 20"));
 end;
-
--- { dg-final { scan-stack-usage-not "\t\[0-9\]\[0-9\]\[0-9\]\[0-9\]\t" } }
--- { dg-final { cleanup-stack-usage } }


Re: [PATCH] [MSP430] Fix PR39240 execution failure for msp430-elf

2018-05-31 Thread Jozef Lawrynowicz

On 31/05/18 00:28, Jeff Law wrote:


> It's not clear why having that subreg is causing incorrect code to be
> generated, but the subreg is clearly wrong since it's a qisi pattern not
> a hisi pattern.  Based on that I've approved and installed the change.


There's some further info in PR85941, but I closed that since it didn't appear
combine was at fault.

If that subreg expression is in the insn pattern, then the expand RTL pass is
coerced into adding that additional expression so it is matched as
zero_extendqisi2 in the next pass:

(insn 11 10 12 2 (set (reg:SI 29)
(zero_extend:SI (subreg:HI (subreg/s/v:QI (reg:HI 23 [ _1 ]) 0) 0)))

By the time we get to the combine pass, the zero extend part looks like:

(zero_extend:SI (subreg:HI (reg:QI 28) 0)))

The expression to zero_extend then becomes R12:HI, losing the zero extension
from QImode. See below from the combine rtl dump:

insn_cost 4 for 8: r28:QI=R12:QI
  REG_DEAD R12:QI
...
insn_cost 8 for 11: r29:SI=zero_extend(r28:QI#0)
  REG_DEAD r28:QI
...
allowing combination of insns 8 and 11
original costs 4 + 8 = 12
replacement cost 8
deferring deletion of insn with uid = 8.
modifying insn i3 11: r29:SI=zero_extend(R12:HI)
  REG_DEAD R12:QI

I guess combine decided it was valid to just directly use hard reg R12 in HImode
at this point.


> I'm a big believer that any subreg appearing in an md is fishy and
> should be reviewed and justified.  I've found they're generally a bad idea.


Ok, there are quite a few uses of subreg expressions in msp430 insn patterns,
mostly when a PSImode operand is involved. I'll make a note to review these in
the future.

Thanks,
Jozef



[PATCH] PR libstdc++/85951 for make_signed/make_unsigned for character types

2018-05-31 Thread Jonathan Wakely

Because the wide character types are neither signed integer types nor
unsigned integer types they need to be transformed to an integral type
of the correct size and the lowest rank (which is not necessarily the
underlying type). Reuse the helpers for enumeration types to select the
correct integer.

The refactoring of __make_unsigned_selector and __make_signed_selector
slightly reduces the number of template instantiations and so reduces
memory usage.
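
As a rough illustration of the intended behaviour (not the testsuite
changes listed below), the following C++11 sketch checks the properties
that make_signed/make_unsigned should have for the character types: the
result is an integral type of the right signedness, has the same size as
the character type, and is not the character type itself. The template
name check_char_transform is hypothetical. The sketch deliberately does
not assert which concrete integer type is chosen, since that depends on
the target.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: instantiating it for a character type fires the
// static_asserts below if make_signed/make_unsigned misbehave.
template<typename Char>
struct check_char_transform
{
  using U = typename std::make_unsigned<Char>::type;
  using S = typename std::make_signed<Char>::type;

  static_assert(std::is_integral<U>::value && std::is_unsigned<U>::value,
                "make_unsigned must yield an unsigned integral type");
  static_assert(std::is_integral<S>::value && std::is_signed<S>::value,
                "make_signed must yield a signed integral type");
  static_assert(sizeof(U) == sizeof(Char) && sizeof(S) == sizeof(Char),
                "the result must have the same size as the character type");
  static_assert(!std::is_same<U, Char>::value
                && !std::is_same<S, Char>::value,
                "the result must not be the character type itself");

  static constexpr bool value = true;
};

// Force the checks for the three types covered by the PR.
static_assert(check_char_transform<wchar_t>::value, "wchar_t");
static_assert(check_char_transform<char16_t>::value, "char16_t");
static_assert(check_char_transform<char32_t>::value, "char32_t");
```

All checks happen at compile time, so a translation unit containing this
either compiles cleanly or points at the failing property.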

PR libstdc++/85951
* include/std/type_traits [_GLIBCXX_USE_C99_STDINT_TR1]: Do not define
uint_least16_t and uint_least32_t.
(__make_unsigned): Define unconditionally.
(__make_unsigned_selector<_Tp, true, false>): Remove intermediate
typedefs.
(__make_unsigned_selector_base): New type to provide helper templates.
(__make_unsigned_selector<_Tp, false, true>): Reimplement using
__make_unsigned_selector_base helpers.
(__make_unsigned, __make_unsigned): Define.
(__make_signed_selector<_Tp, true, false>): Remove intermediate
typedefs.
(__make_signed, __make_signed)
(__make_signed)): Define unconditionally.
* testsuite/20_util/make_signed/requirements/typedefs-3.cc: Check
wchar_t, char16_t and char32_t are transformed correctly.
* testsuite/20_util/make_signed/requirements/typedefs_neg.cc: Adjust
dg-error lineno.
* testsuite/20_util/make_unsigned/requirements/typedefs-3.cc: Check
wchar_t, char16_t and char32_t are transformed correctly.
* testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc: Adjust
dg-error lineno.

Tested powerpc64le-linux, committed to trunk.

I'll backport a simpler version without the refactoring.

commit 332a337bcff7cd337476972cb9ce910db43ee236
Author: Jonathan Wakely 
Date:   Thu May 31 00:10:59 2018 +0100

PR libstdc++/85951 for make_signed/make_unsigned for character types

Because the wide character types are neither signed integer types nor
unsigned integer types they need to be transformed to an integral type
of the correct size and the lowest rank (which is not necessarily the
underlying type). Reuse the helpers for enumeration types to select the
correct integer.

The refactoring of __make_unsigned_selector and __make_signed_selector
slightly reduces the number of template instantiations and so reduces
memory usage.

PR libstdc++/85951
* include/std/type_traits [_GLIBCXX_USE_C99_STDINT_TR1]: Do not 
define
uint_least16_t and uint_least32_t.
(__make_unsigned): Define unconditionally.
(__make_unsigned_selector<_Tp, true, false>): Remove intermediate
typedefs.
(__make_unsigned_selector_base): New type to provide helper 
templates.
(__make_unsigned_selector<_Tp, false, true>): Reimplement using
__make_unsigned_selector_base helpers.
(__make_unsigned, __make_unsigned): Define.
(__make_signed_selector<_Tp, true, false>): Remove intermediate
typedefs.
(__make_signed, __make_signed)
(__make_signed)): Define unconditionally.
* testsuite/20_util/make_signed/requirements/typedefs-3.cc: Check
wchar_t, char16_t and char32_t are transformed correctly.
* testsuite/20_util/make_signed/requirements/typedefs_neg.cc: Adjust
dg-error lineno.
* testsuite/20_util/make_unsigned/requirements/typedefs-3.cc: Check
wchar_t, char16_t and char32_t are transformed correctly.
* testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc: 
Adjust
dg-error lineno.

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 7c0ba727511..4397c484f20 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -37,18 +37,6 @@
 
 #include 
 
-#ifdef _GLIBCXX_USE_C99_STDINT_TR1
-# if defined (__UINT_LEAST16_TYPE__) && defined(__UINT_LEAST32_TYPE__)
-namespace std
-{
-  typedef __UINT_LEAST16_TYPE__ uint_least16_t;
-  typedef __UINT_LEAST32_TYPE__ uint_least32_t;
-}
-# else
-#  include 
-# endif
-#endif
-
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -1576,12 +1564,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __make_unsigned
 { typedef unsigned long long __type; };
 
-#if defined(_GLIBCXX_USE_WCHAR_T) && !defined(__WCHAR_UNSIGNED__)
-  template<>
-struct __make_unsigned : __make_unsigned<__WCHAR_TYPE__>
-{ };
-#endif
-
 #if defined(__GLIBCXX_TYPE_INT_N_0)
   template<>
 struct __make_unsigned<__GLIBCXX_TYPE_INT_N_0>
@@ -1612,36 +1594,77 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 class __make_unsigned_selector<_Tp, true, false>
 {
-  typedef __make_unsigned::type> __unsignedt;
-  typedef typename __unsignedt::__type __unsigned_type;
- 

Re: [PATCH] Remove MPX support

2018-05-31 Thread Martin Liška
On 05/30/2018 07:40 PM, Jakub Jelinek wrote:
> On Tue, May 01, 2018 at 11:28:13AM -0600, Jeff Law wrote:
>> On 04/27/2018 05:00 AM, Martin Liška wrote:
>>> I'm sending patch that removes MPX. It preserves all options 
>>> -fcheck-pointer-bounds, -fchkp-* and -mmpx
>>> target option. These options are now NOP. On the contrary following options 
>>> were removed:
>>> --static-libmpx  -static-libmpxwrappers. Is it fine to remove them?
>>>
>>> Patch can bootstrap on x86_64-linux-gnu, ppc64le-linux-gnu and survives 
>>> regression tests.
>>> And the patch bootstraps also on aarch64-linux-gnu.
>>>
>>> Note that the patch is trim for files (some) that are removed. Doing that 
>>> was necessary to
>>> fit in 100K with bzip2 patch file.
>>>
>>> Ready to be installed after some time?
>> Yes.  Please coordinate with Jakub & Richi since this touches a fair
>> amount of code and might interfere with attempts to backport changes
>> from the trunk into the gcc-8 release branch.
>>
>> I wouldn't be surprised if we find bits of MPX code after the patch is
>> installed.  Changes to remove any stragglers are pre-approved as well.
> 
> Martin, any progress with this?  I'm not worried about MPX removal making
> backports much harder.
> 
>   Jakub
> 
Hi.

I agreed with Richi that Honza will review the IPA and i386-related parts,
and Richi will then review the general changes. Honza is aware of the review
request and will work on it soon.

Martin


Re: [PATCH][AArch64] Fix aarch64_ira_change_pseudo_allocno_class

2018-05-31 Thread Richard Sandiford
Wilco Dijkstra  writes:
> Richard Sandiford wrote:
>
>>> This has probably been reported elsewhere already but I can't find
>>> such a report, so sorry for possible duplicate,
>>> but this patch is causing ICEs on aarch64
>>> FAIL:    gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve
>>> (internal compiler error)
>>> FAIL:    gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve
>>> (internal compiler error)
>>>
>>> and also many scan-assembler regressions:
>>>
>>>  
>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/260951/report-build-info.html
>>
>> Thanks for the heads-up.  Looks like they're all SVE, so I'll take this.
>
> It seems this is due to unnecessary spills of PR_REGS - the subset doesn't 
> work for those.

It does, but I'd originally suggested:

  if (!reg_class_subset_p (GENERAL_REGS, ...)
  || !reg_class_subset_p (FP_REGS, ...))
...bail out...

whereas the committed patch had:

  if (reg_class_subset_p (..., GENERAL_REGS)
  || reg_class_subset_p (..., FP_REGS))
...bail out...

That's an important difference.  The idea with the first was that
we should only make a choice between GENERAL_REGS and FP_REGS
if the original classes included both of them.  And that's what
we want because the new class has to be a refinement of the
original: it shouldn't include entirely new registers.

The committed version instead says that we won't make a choice
between GENERAL_REGS and FP_REGS if one of the classes is already
specific to one of them.  I think this would also lead to us changing
POINTER_REGS to GENERAL_REGS, although I don't know how much that
matters in practice.
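
To make the difference between the two tests concrete, here is a small
C++ model. It is only a sketch: the register classes are toy bitmasks
with one bit per register (x0, sp, v0, p0), the TOY_* names and the
functions are hypothetical, and toy_subset_p merely mimics the argument
order of reg_class_subset_p (first class contained in the second).

```cpp
#include <cassert>

// One bit per toy register: bit 0 = x0, bit 1 = sp, bit 2 = v0, bit 3 = p0.
enum toy_class : unsigned
{
  TOY_GENERAL_REGS        = 0x1,  // x0
  TOY_POINTER_REGS        = 0x3,  // x0 + sp
  TOY_FP_REGS             = 0x4,  // v0
  TOY_POINTER_AND_FP_REGS = 0x7,  // x0 + sp + v0
  TOY_PR_REGS             = 0x8   // p0
};

// True if every register in c1 is also in c2.
constexpr bool toy_subset_p(unsigned c1, unsigned c2)
{
  return (c1 & ~c2) == 0;
}

// Committed form: bail out only if the class is already specific to
// GENERAL_REGS or FP_REGS.
constexpr bool committed_bails_out(unsigned cls)
{
  return toy_subset_p(cls, TOY_GENERAL_REGS)
         || toy_subset_p(cls, TOY_FP_REGS);
}

// Suggested form: bail out unless the class contains both GENERAL_REGS
// and FP_REGS, so the result can only refine the original class.
constexpr bool suggested_bails_out(unsigned cls)
{
  return !toy_subset_p(TOY_GENERAL_REGS, cls)
         || !toy_subset_p(TOY_FP_REGS, cls);
}
```

In this model the committed form does not bail out for TOY_PR_REGS or
TOY_POINTER_REGS, so both would go on to be replaced by a general or FP
class, while the suggested form bails out for both and only proceeds for
TOY_POINTER_AND_FP_REGS.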

> The original proposal doing:
>
>   if (allocno_class != POINTER_AND_FP_REGS)
> return allocno_class;
>
> doesn't appear to affect SVE. However the question is whether the
> register allocator can get confused about PR_REGS and end up with
> POINTER_AND_FP_REGS for both the allocno_class and best_class? If so
> then the return needs to support predicate modes too.

Yeah, that shouldn't happen, since predicate modes are only allowed in
predicate registers.

I think the reduc_1 ICE is a separate bug that I'll post a patch for,
but it goes latent again after the patch below.

Tested on aarch64-linux-gnu.  I don't think it can be called obvious
given the above, and it's only SVE-specific by chance, so: OK to install?

Thanks,
Richard


2018-05-31  Richard Sandiford  

gcc/
* config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
Fix subreg tests so that we only return a choice between
GENERAL_REGS and FP_REGS if the original classes included both.

Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c	2018-05-30 19:31:14.212387813 +0100
+++ gcc/config/aarch64/aarch64.c	2018-05-31 13:12:56.836974021 +0100
@@ -1108,12 +1108,12 @@ aarch64_ira_change_pseudo_allocno_class
 {
   machine_mode mode;
 
-  if (reg_class_subset_p (allocno_class, GENERAL_REGS)
-  || reg_class_subset_p (allocno_class, FP_REGS))
+  if (!reg_class_subset_p (GENERAL_REGS, allocno_class)
+  || !reg_class_subset_p (FP_REGS, allocno_class))
 return allocno_class;
 
-  if (reg_class_subset_p (best_class, GENERAL_REGS)
-  || reg_class_subset_p (best_class, FP_REGS))
+  if (!reg_class_subset_p (GENERAL_REGS, best_class)
+  || !reg_class_subset_p (FP_REGS, best_class))
 return best_class;
 
   mode = PSEUDO_REGNO_MODE (regno);


Re: [PATCH] c/55976 -Werror=return-type should error on returning a value from a void function

2018-05-31 Thread H.J. Lu
On Wed, May 30, 2018 at 3:56 PM, Jeff Law  wrote:
> On 04/22/2018 01:17 PM, dave.pa...@oracle.com wrote:
>> This patch fixes handling of -Werror=return-type as well as
>> -Wno-return-type. Currently, -Werror=return-type does not turn the
>> warnings into errors and -Wno-return-type does not turn off
>> warning/error. Now they both work as expected.
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55976
>>
>> Initialize warn_return_type only for C++/C++ with ObjC extensions. In C,
>> this allows us to differentiate between default (no option), or cases
>> where -Wreturn-type/-Wno-return-type are specified. Elsewhere, update
>> references to warn_return_type (for C) to reflect change in initialization.
>>
>> Patch was successfully bootstrapped and tested on x86_64-linux.
>>
>> --Dave
>>
>>
>> CL-55976
>>
>>
>> /c
>> 2018-04-22  David Pagan  
>>
>>   PR c/55976
>>   * c-decl.c (grokdeclarator): Update check for return type warnings.
>>   (start_function): Likewise.
>>   (finish_function): Likewise.
>>   * c-typeck.c (c_finish_return): Update check for return type warnings.
>>   Pass OPT_Wreturn_type to pedwarn when appropriate.
>>   * c-opts.c (c_common_post_options): Set default for warn_return_type
>>   for C++/C++ with ObjC extensions only. For C, makes it possible to
>>   differentiate between default (no option), -Wreturn-type, and
>>   -Wno-return-type.
>>
>> /testsuite
>> 2018-04-22  David Pagan  
>>
>>   PR c/55976
>>   * gcc.dg/noncompile/pr55976-1.c: New test.
>>   * gcc.dg/noncompile/pr55976-2.c: New test.
> Thanks.  Installed on the trunk.
>

On x86, I got

FAIL: gcc.dg/noncompile/pr55976-1.c   -O0  (test for excess errors)
FAIL: gcc.dg/noncompile/pr55976-1.c   -O1  (test for excess errors)
FAIL: gcc.dg/noncompile/pr55976-1.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/noncompile/pr55976-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.dg/noncompile/pr55976-1.c   -O2  (test for excess errors)
FAIL: gcc.dg/noncompile/pr55976-1.c   -O3 -g  (test for excess errors)
FAIL: gcc.dg/noncompile/pr55976-1.c   -Os  (test for excess errors)

[hjl@gnu-skx-1 testsuite]$
/export/ssd/git/gcc-test-native/bld/gcc/xgcc
-B/export/ssd/git/gcc-test-native/bld/gcc/
/export/ssd/git/gcc-test-native/src-trunk/gcc/testsuite/gcc.dg/noncompile/pr55976-1.c
-mx32 -B/export/ssd/git/gcc-test-native/bld/x86_64-pc-linux-gnu/32/libmpx/
-B/export/ssd/git/gcc-test-native/bld/x86_64-pc-linux-gnu/32/libmpx/mpxrt
-L/export/ssd/git/gcc-test-native/bld/x86_64-pc-linux-gnu/32/libmpx/mpxrt/.libs
-B/export/ssd/git/gcc-test-native/bld/x86_64-pc-linux-gnu/32/libmpx/
-B/export/ssd/git/gcc-test-native/bld/x86_64-pc-linux-gnu/32/libmpx/mpxwrap
-L/export/ssd/git/gcc-test-native/bld/x86_64-pc-linux-gnu/32/libmpx/mpxwrap/.libs
-fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects -Werror=return-type -S -o
pr55976-1.s
/export/ssd/git/gcc-test-native/src-trunk/gcc/testsuite/gcc.dg/noncompile/pr55976-1.c:
In function ‘t’:
/export/ssd/git/gcc-test-native/src-trunk/gcc/testsuite/gcc.dg/noncompile/pr55976-1.c:7:20:
error: ‘return’ with a value, in function returning void
[-Werror=return-type]
/export/ssd/git/gcc-test-native/src-trunk/gcc/testsuite/gcc.dg/noncompile/pr55976-1.c:7:6:
note: declared here
/export/ssd/git/gcc-test-native/src-trunk/gcc/testsuite/gcc.dg/noncompile/pr55976-1.c:
In function ‘b’:
/export/ssd/git/gcc-test-native/src-trunk/gcc/testsuite/gcc.dg/noncompile/pr55976-1.c:8:12:
error: ‘return’ with no value, in function returning
non-void [-Werror=return-type]
/export/ssd/git/gcc-test-native/src-trunk/gcc/testsuite/gcc.dg/noncompile/pr55976-1.c:8:5:
note: declared here
cc1: some warnings being treated as errors
[hjl@gnu-skx-1 testsuite]$

-- 
H.J.


Re: [PATCH] avoid ICE when pretty-printing a VLA with an error bound (PR 85956)

2018-05-31 Thread Jason Merrill
On Thu, May 31, 2018 at 2:58 AM, Jakub Jelinek  wrote:
> On Wed, May 30, 2018 at 02:39:15PM -0600, Martin Sebor wrote:
>> gcc/c-family/ChangeLog:
>>
>>   PR middle-end/85956
>>   * c-pretty-print.c (c_pretty_printer::direct_abstract_declarator):
>>   Handle error-mark-node in array bounds gracefully.
>
> This isn't sufficient, as it still ICEs with C++:
> during GIMPLE pass: vrp
> In function ‘_Z3fooiPv._omp_fn.0’:
> tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in 
> build_int_cst, at tree.c:1342
>#pragma omp parallel shared(a) default(none)
>^~~
> 0x15ef6b3 tree_class_check_failed(tree_node const*, tree_code_class, char 
> const*, int, char const*)
> ../../gcc/tree.c:9385
> 0x80fb7c tree_class_check(tree_node*, tree_code_class, char const*, int, char 
> const*)
> ../../gcc/tree.h:3258
> 0x15d017c build_int_cst(tree_node*, poly_int<1u, long>)
> ../../gcc/tree.c:1342
> 0xe2b685 round_up_loc(unsigned int, tree_node*, unsigned int)
> ../../gcc/fold-const.c:14330
> 0x1233717 finalize_type_size
> ../../gcc/stor-layout.c:1908
> 0x1238390 layout_type(tree_node*)
> ../../gcc/stor-layout.c:2578
> 0x15e9d8c build_array_type_1
> ../../gcc/tree.c:7869
> 0x15ea022 build_array_type(tree_node*, tree_node*, bool)
> ../../gcc/tree.c:7906
> 0xad28b7 build_cplus_array_type(tree_node*, tree_node*)
> ../../gcc/cp/tree.c:985
> 0xad46c5 strip_typedefs(tree_node*, bool*)
> ../../gcc/cp/tree.c:1459
> 0x9312a8 type_to_string
> ../../gcc/cp/error.c:3176
> 0x93425c cp_printer
> ../../gcc/cp/error.c:4085
> 0x1f79f1b pp_format(pretty_printer*, text_info*)
> ../../gcc/pretty-print.c:1375
>
> I came up with the following hack instead (or in addition to),
> replace those error_mark_node bounds with NULL (i.e. pretend flexible array
> members) if during OpenMP/OpenACC outlining we've decided not to pass around
> the bounds artificial decl because nothing really use it.
>
> Is this a reasonable hack, or shall we go with Martin's patch + similar
> change in C++ pretty printer to handle error_mark_node specially and perhaps
> also handle NULL specially too as the patch does, or both those FE changes
> and this, something else?

We generally try to avoid embedded error_mark_node within other trees.
If the array bound is erroneous, can we replace the whole array type
with error_mark_node?

Jason


[patch] Enhance GIMPLE store-merging pass for bit-fields

2018-05-31 Thread Eric Botcazou
Hi,

this enhances the GIMPLE store-merging pass by teaching it to deal with 
generic stores to bit-fields, i.e. not just stores of immediates.  The 
motivating example is:

struct S {
  unsigned int flag : 1;
  unsigned int size : 31;
};

void foo (struct S *s, unsigned int size)
{
  s->flag = 1;
  s->size = size & 0x7FFFFFFF;
}

which may seem a little contrived but is the direct translation of something 
very natural in Ada, and for which the compiler currently generates at -O2:

        orb     $1, (%rdi)
        leal    (%rsi,%rsi), %eax
        movl    (%rdi), %esi
        andl    $1, %esi
        orl     %eax, %esi
        movl    %esi, (%rdi)
        ret

With the change, the generated code is optimal:

        leal    1(%rsi,%rsi), %esi
        movl    %esi, (%rdi)
        ret

The patch adds a 4th class of stores (with the BIT_INSERT_EXPR code) that can 
be merged into groups containing other BIT_INSERT_EXPR or INTEGER_CST stores.
These stores are merged like constant stores, but the immediate value is not 
updated (unlike the mask) and instead output_merged_store synthesizes the bit 
insertion sequences from the original stores.  It also contains a few cleanups 
for the dumping code and other minor fixes.
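
As a rough illustration of the transformation (not part of the patch), the sketch below hand-writes the merged store for the motivating example and compares it with the two bit-field stores.  It assumes a 32-bit unsigned int and a layout where flag is bit 0 and size occupies bits 1..31, as on x86-64/Linux.  The helper names (merged_word, raw_after_field_stores, flag_is_bit0, stores_agree, check_examples) are made up for this example, and the mask is written as the full 31-bit value 0x7FFFFFFF.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// The struct from the motivating example.
struct S {
  unsigned int flag : 1;
  unsigned int size : 31;
};

// ASSUMPTION: the two bit-fields share a single 32-bit storage unit.
static_assert(sizeof(S) == sizeof(std::uint32_t), "S must be one 32-bit word");

// The two separate bit-field stores, as written in the source.
inline void store_fields(S *s, unsigned int size) {
  s->flag = 1;
  s->size = size & 0x7FFFFFFF;
}

// Hand-merged equivalent: one 32-bit value, matching leal 1(%rsi,%rsi).
// ASSUMPTION: flag is bit 0 and size is bits 1..31 of the word.
constexpr std::uint32_t merged_word(std::uint32_t size) {
  return (size << 1) | 1u;  // the shift drops bit 31, so no explicit mask
}

// Raw 32-bit image of S after the two bit-field stores.
inline std::uint32_t raw_after_field_stores(unsigned int size) {
  S s;
  std::memset(&s, 0, sizeof s);
  store_fields(&s, size);
  std::uint32_t raw;
  std::memcpy(&raw, &s, sizeof raw);
  return raw;
}

// Probe whether this ABI really places flag in bit 0.
inline bool flag_is_bit0() {
  S s;
  std::memset(&s, 0, sizeof s);
  s.flag = 1;
  std::uint32_t raw;
  std::memcpy(&raw, &s, sizeof raw);
  return raw == 1u;
}

// True when both forms produce the same word; also true when the layout
// assumption does not hold, since there is then nothing to compare.
inline bool stores_agree(unsigned int size) {
  return !flag_is_bit0() || raw_after_field_stores(size) == merged_word(size);
}

// Spot checks that can be called from a test driver.
inline void check_examples() {
  assert(stores_agree(0u));
  assert(stores_agree(0x1234u));
  assert(stores_agree(0xFFFFFFFFu));
}
```

Calling check_examples () from main should pass silently on x86-64.  The shift in merged_word is also why the mask disappears from the optimal sequence: shifting left by one already discards the top bit.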

Tested on x86-64/Linux and SPARC/Solaris, OK for the mainline?


2018-05-31  Eric Botcazou  

* gimple-ssa-store-merging.c: Include gimple-fold.h.
(struct store_immediate_info): Document BIT_INSERT_EXPR stores.
(struct merged_store_group): Add bit_insertion field.
(dump_char_array): Use standard hexadecimal format.
(merged_store_group::merged_store_group): Set bit_insertion to false.
(merged_store_group::apply_stores): Use optimal buffer size.  Deal
with BIT_INSERT_EXPR stores.  Move up code updating the mask and
also print the mask in the dump file.
(pass_store_merging::gate): Minor tweak.
(imm_store_chain_info::coalesce_immediate): Fix wrong association
of stores with groups in dump.  Allow coalescing of BIT_INSERT_EXPR
with INTEGER_CST stores.
(count_multiple_uses) : New case.
(imm_store_chain_info::output_merged_store): Add try_bitpos variable
and use it throughout.  Generate bit insertion sequences if need be.
(pass_store_merging::process_store): Remove redundant condition.
Record store from a SSA name to a bit-field with BIT_INSERT_EXPR.


2018-05-31  Eric Botcazou  

* gcc.dg/store_merging_20.c: New test.
* gnat.dg/opt71.adb: Likewise.
* gnat.dg/opt71_pkg.ads: New helper.

-- 
Eric Botcazou
Index: gimple-ssa-store-merging.c
===
--- gimple-ssa-store-merging.c	(revision 260913)
+++ gimple-ssa-store-merging.c	(working copy)
@@ -18,9 +18,10 @@
along with GCC; see the file COPYING3.  If not see
.  */
 
-/* The purpose of the store merging pass is to combine multiple memory
-   stores of constant values, values loaded from memory or bitwise operations
-   on those to consecutive memory locations into fewer wider stores.
+/* The purpose of the store merging pass is to combine multiple memory stores
+   of constant values, values loaded from memory, bitwise operations on those,
+   or bit-field values, to consecutive locations, into fewer wider stores.
+
For example, if we have a sequence peforming four byte stores to
consecutive memory locations:
[p ] := imm1;
@@ -28,7 +29,7 @@
[p + 2B] := imm3;
[p + 3B] := imm4;
we can transform this into a single 4-byte store if the target supports it:
-  [p] := imm1:imm2:imm3:imm4 //concatenated immediates according to endianness.
+   [p] := imm1:imm2:imm3:imm4 concatenated according to endianness.
 
Or:
[p ] := [q ];
@@ -46,12 +47,18 @@
if there is no overlap can be transformed into a single 4-byte
load, xored with imm1:imm2:imm3:imm4 and stored using a single 4-byte store.
 
+   Or:
+   [p:1 ] := imm;
+   [p:31] := val & 0x7FFFFFFF;
+   we can transform this into a single 4-byte store if the target supports it:
+   [p] := imm:(val & 0x7FFFFFFF) concatenated according to endianness.
+
The algorithm is applied to each basic block in three phases:
 
-   1) Scan through the basic block recording assignments to
-   destinations that can be expressed as a store to memory of a certain size
-   at a certain bit offset from expressions we can handle.  For bit-fields
-   we also note the surrounding bit region, bits that could be stored in
+   1) Scan through the basic block and record assignments to destinations
+   that can be expressed as a store to memory of a certain size at a certain
+   bit offset from base expressions we can handle.  For bit-fields we also
+   record the surrounding bit region, i.e. bits that could be stored in
a read-modify-write operation when storing the bit-field.  Record store
chains to different bases in a hash_map 

Re: [PATCH] avoid ICE when pretty-printing a VLA with an error bound (PR 85956)

2018-05-31 Thread Jakub Jelinek
On Thu, May 31, 2018 at 09:14:33AM -0400, Jason Merrill wrote:
> > I came up with the following hack instead (or in addition to),
> > replace those error_mark_node bounds with NULL (i.e. pretend flexible array
> > members) if during OpenMP/OpenACC outlining we've decided not to pass around
> > the bounds artificial decl because nothing really use it.
> >
> > Is this a reasonable hack, or shall we go with Martin's patch + similar
> > change in C++ pretty printer to handle error_mark_node specially and perhaps
> > also handle NULL specially too as the patch does, or both those FE changes
> > and this, something else?
> 
> We generally try to avoid embedded error_mark_node within other trees.
> If the array bound is erroneous, can we replace the whole array type
> with error_mark_node?

The array bound isn't erroneous, just becomes unknown (well, known only in
an outer function), we still need to know it is an array type and that it
has 0 as the low bound.
Instead of replacing it with NULL we in theory could just create another
VAR_DECL and never initialize it, it wouldn't be far from what happens with
some other VLAs - during optimizations it is possible that the array bound
var is optimized away.  Just it would be much more work to do it that way.

Jakub


Re: [PATCH] Avoid hot/cold partitioning in naked functions (PR target/85984)

2018-05-31 Thread Richard Biener
On May 31, 2018 12:02:49 PM GMT+02:00, Jakub Jelinek  wrote:
>On Thu, May 31, 2018 at 11:46:33AM +0200, Richard Biener wrote:
>> Is naked an attribute that is specified for all targets?  If so, OK. 
>
>It is not specified for all targets, but all targets for which it is
>specified have the same behavior.
>We handle "naked" a couple of times in the generic code already, e.g.
>in
>attribs.c (naked implies noinline/noclone), or in cfgexpand.c (refuse
>allocation of vars on the stack for "naked" functions).

OK then. 

Richard. 

>> Otherwise we may instead want to add a target hook for whether a
>function
>> has a prologue/epilogue?
>
>   Jakub



[PATCH] fix checking error with OpenACC reference types variables (PR85879)

2018-05-31 Thread Cesar Philippidis
OpenACC has slightly different data semantics for reference types and
pointers. Without this patch, omp-low was treating all reference types
like pointers which resulted in some gimple type checking errors when
checking is enabled. It turned out that Chung-Lin had resolved the issue
way back in gomp-4_0-branch for PR77371, but that fix never made its way
into trunk. The gimplifier stuff is not relevant to this PR, but I left
it in there because it does address the original issue in PR77371.

I tested this patch from og8 on trunk on x86_64 / nvptx offloading, and
the results came back clean. Is this OK for trunk?

Thanks,
Cesar
2018-05-31  Chung-Lin Tang  
	Cesar Philippidis  

	PR middle-end/85879

	gcc/
	* gimplify.c (gimplify_adjust_omp_clauses): Add 'remove = true'
	when emitting error on private/firstprivate reductions.
	* omp-low.c (lower_omp_target): Avoid reference-type processing
	on pointers for firstprivate clause.

	gcc/testsuite/
	* gfortran.dg/goacc/pr77371-1.f90: New test.
	* gfortran.dg/goacc/pr77371-2.f90: New test.
	* gfortran.dg/goacc/pr85879.f90: New test.


diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 9771804f27e..44cb784620a 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -9275,13 +9275,16 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
 	case OMP_CLAUSE_REDUCTION:
 	  decl = OMP_CLAUSE_DECL (c);
 	  /* OpenACC reductions need a present_or_copy data clause.
-	 Add one if necessary.  Error is the reduction is private.  */
+	 Add one if necessary.  Emit error when the reduction is private.  */
 	  if (ctx->region_type == ORT_ACC_PARALLEL)
 	{
 	  n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
 	  if (n->value & (GOVD_PRIVATE | GOVD_FIRSTPRIVATE))
-		error_at (OMP_CLAUSE_LOCATION (c), "invalid private "
-			  "reduction on %qE", DECL_NAME (decl));
+		{
+		  remove = true;
+		  error_at (OMP_CLAUSE_LOCATION (c), "invalid private "
+			"reduction on %qE", DECL_NAME (decl));
+		}
 	  else if ((n->value & GOVD_MAP) == 0)
 		{
 		  tree next = OMP_CLAUSE_CHAIN (c);
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index d8588b9faed..ba6c705cf8b 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -7700,7 +7700,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE)
 	  {
 		gcc_assert (is_gimple_omp_oacc (ctx->stmt));
-		if (omp_is_reference (new_var))
+		if (omp_is_reference (new_var)
+		&& TREE_CODE (TREE_TYPE (new_var)) != POINTER_TYPE)
 		  {
 		/* Create a local object to hold the instance
 		   value.  */
diff --git a/gcc/testsuite/gfortran.dg/goacc/pr77371-1.f90 b/gcc/testsuite/gfortran.dg/goacc/pr77371-1.f90
new file mode 100644
index 000..11c29ba3e6d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/pr77371-1.f90
@@ -0,0 +1,9 @@
+! PR fortran/77371
+! { dg-do compile }
+program p
+  character(:), allocatable :: z
+  !$acc parallel
+  z = 'abc' 
+  !$acc end parallel
+  print *, z
+end
diff --git a/gcc/testsuite/gfortran.dg/goacc/pr77371-2.f90 b/gcc/testsuite/gfortran.dg/goacc/pr77371-2.f90
new file mode 100644
index 000..9d42c17ac4e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/pr77371-2.f90
@@ -0,0 +1,7 @@
+! PR fortran/77371
+! { dg-do compile }
+program p
+   integer, allocatable :: n
+!$acc parallel reduction (+:n) private(n) ! { dg-error "invalid private reduction" }
+!$acc end parallel
+end
diff --git a/gcc/testsuite/gfortran.dg/goacc/pr85879.f90 b/gcc/testsuite/gfortran.dg/goacc/pr85879.f90
new file mode 100644
index 000..cd50be2fdb4
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/pr85879.f90
@@ -0,0 +1,12 @@
+! PR middle-end/85879
+! { dg-do compile }
+
+program p
+   integer, pointer :: i
+   integer, target :: j
+   j = 2
+   i => j
+   !$acc parallel
+   j = i
+   !$acc end parallel
+end


Re: [PATCH] fix checking error with OpenACC reference types variables (PR85879)

2018-05-31 Thread Jakub Jelinek
On Thu, May 31, 2018 at 06:54:11AM -0700, Cesar Philippidis wrote:
> 2018-05-31  Chung-Lin Tang  
>   Cesar Philippidis  
> 
>   PR middle-end/85879
> 
>   gcc/
>   * gimplify.c (gimplify_adjust_omp_clauses): Add 'remove = true'
>   when emitting error on private/firstprivate reductions.
>   * omp-low.c (lower_omp_target): Avoid reference-type processing
>   on pointers for firstprivate clause.
> 
>   gcc/testsuite/
>   * gfortran.dg/goacc/pr77371-1.f90: New test.
>   * gfortran.dg/goacc/pr77371-2.f90: New test.
>   * gfortran.dg/goacc/pr85879.f90: New test.

Ok, thanks.

Jakub


Re: [PATCH] PR libstdc++/85951 for make_signed/make_unsigned for character types

2018-05-31 Thread Jonathan Wakely

On 31/05/18 13:17 +0100, Jonathan Wakely wrote:

> Because the wide character types are neither signed integer types nor
> unsigned integer types they need to be transformed to an integral type
> of the correct size and the lowest rank (which is not necessarily the
> underlying type). Reuse the helpers for enumeration types to select the
> correct integer.
>
> The refactoring of __make_unsigned_selector and __make_signed_selector
> slightly reduces the number of template instantiations and so reduces
> memory usage.
>
> PR libstdc++/85951
> * include/std/type_traits [_GLIBCXX_USE_C99_STDINT_TR1]: Do not define
> uint_least16_t and uint_least32_t.
> (__make_unsigned): Define unconditionally.
> (__make_unsigned_selector<_Tp, true, false>): Remove intermediate
> typedefs.
> (__make_unsigned_selector_base): New type to provide helper templates.
> (__make_unsigned_selector<_Tp, false, true>): Reimplement using
> __make_unsigned_selector_base helpers.
> (__make_unsigned, __make_unsigned): Define.
> (__make_signed_selector<_Tp, true, false>): Remove intermediate
> typedefs.
> (__make_signed, __make_signed)
> (__make_signed)): Define unconditionally.
> * testsuite/20_util/make_signed/requirements/typedefs-3.cc: Check
> wchar_t, char16_t and char32_t are transformed correctly.
> * testsuite/20_util/make_signed/requirements/typedefs_neg.cc: Adjust
> dg-error lineno.
> * testsuite/20_util/make_unsigned/requirements/typedefs-3.cc: Check
> wchar_t, char16_t and char32_t are transformed correctly.
> * testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc: Adjust
> dg-error lineno.
>
> Tested powerpc64le-linux, committed to trunk.
>
> I'll backport a simpler version without the refactoring.


This is the backport I'm committing to the branches. I forgot that one
of the reasons I did the refactoring was to avoid std::conditional,
which isn't defined yet and so can't be used by the new explicit
specializations of __make_unsigned and __make_signed. To solve that
problem on the branches I'll just put the new specializations later in
the file.

Tested x86_64-linux, committed to gcc-8-branch (and others to follow
soon).
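
To make the intended behaviour concrete, here is a small compile-time check written only against the public std::make_signed and std::make_unsigned traits.  The helper names (unsigned_result_ok, signed_result_ok) are hypothetical and not part of libstdc++.  The checks encode one reading of the description above: the result is an integer type of the same size and the requested signedness, and is not the character type itself.  They do not verify the "lowest rank" part.

```cpp
#include <type_traits>

// Hypothetical helper: make_unsigned<T> must yield an unsigned integer type
// of the same size as T, and must not hand back T unchanged.
template <typename T>
constexpr bool unsigned_result_ok() {
  using U = typename std::make_unsigned<T>::type;
  return std::is_integral<U>::value && std::is_unsigned<U>::value
         && sizeof(U) == sizeof(T) && !std::is_same<U, T>::value;
}

// Hypothetical helper: the same check for make_signed<T>.
template <typename T>
constexpr bool signed_result_ok() {
  using I = typename std::make_signed<T>::type;
  return std::is_integral<I>::value && std::is_signed<I>::value
         && sizeof(I) == sizeof(T) && !std::is_same<I, T>::value;
}

// The three character types named in the ChangeLog.
static_assert(unsigned_result_ok<wchar_t>(), "make_unsigned<wchar_t>");
static_assert(unsigned_result_ok<char16_t>(), "make_unsigned<char16_t>");
static_assert(unsigned_result_ok<char32_t>(), "make_unsigned<char32_t>");
static_assert(signed_result_ok<wchar_t>(), "make_signed<wchar_t>");
static_assert(signed_result_ok<char16_t>(), "make_signed<char16_t>");
static_assert(signed_result_ok<char32_t>(), "make_signed<char32_t>");
```

Everything is evaluated at compile time, so a translation unit containing this block either compiles or reports which trait misbehaves.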


commit a200d73247acdf8394574fe28250b7d3bf817576
Author: Jonathan Wakely 
Date:   Thu May 31 13:42:20 2018 +0100

PR libstdc++/85951 for make_signed/make_unsigned for character types

Because the wide character types are neither signed integer types nor
unsigned integer types they need to be transformed to an integral type
of the correct size and the lowest rank (which is not necessarily the
underlying type). Reuse the helpers for enumeration types to select the
correct integer.

PR libstdc++/85951
* include/std/type_traits [_GLIBCXX_USE_C99_STDINT_TR1]: Do not define
uint_least16_t and uint_least32_t.
(__make_unsigned): Define unconditionally.
(__make_unsigned, __make_unsigned): Define.
(__make_signed, __make_signed)
(__make_signed)): Define unconditionally.
* testsuite/20_util/make_signed/requirements/typedefs-3.cc: Check
wchar_t, char16_t and char32_t are transformed correctly.
* testsuite/20_util/make_signed/requirements/typedefs_neg.cc: Adjust
dg-error lineno.
* testsuite/20_util/make_unsigned/requirements/typedefs-3.cc: Check
wchar_t, char16_t and char32_t are transformed correctly.
* testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc: Adjust
dg-error lineno.

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 711d6c50dd1..41607f63096 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -37,18 +37,6 @@
 
 #include 
 
-#ifdef _GLIBCXX_USE_C99_STDINT_TR1
-# if defined (__UINT_LEAST16_TYPE__) && defined(__UINT_LEAST32_TYPE__)
-namespace std
-{
-  typedef __UINT_LEAST16_TYPE__ uint_least16_t;
-  typedef __UINT_LEAST32_TYPE__ uint_least32_t;
-}
-# else
-#  include 
-# endif
-#endif
-
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -1576,12 +1564,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __make_unsigned
 { typedef unsigned long long __type; };
 
-#if defined(_GLIBCXX_USE_WCHAR_T) && !defined(__WCHAR_UNSIGNED__)
-  template<>
-struct __make_unsigned : __make_unsigned<__WCHAR_TYPE__>
-{ };
-#endif
-
 #if defined(__GLIBCXX_TYPE_INT_N_0)
   template<>
 struct __make_unsigned<__GLIBCXX_TYPE_INT_N_0>
@@ -1686,21 +1668,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __make_signed
 { typedef signed long long __type; };
 
-#if defined(_GLIBCXX_USE_WCHAR_T) && defined(__WCHAR_UNSIGNED__)
-  template<>
-struct __make_signed : __make_signed<__WCHAR_TYPE__>
-{ };
-#endif
-
-#ifdef _GLIBCXX_USE_C99_STDINT_TR1
-  template<>
-struct

Re: C++ PATCH for c++/85977, array reference size deduction failure

2018-05-31 Thread Jason Merrill
On Wed, May 30, 2018 at 5:23 PM, Marek Polacek  wrote:
> We are failing to deduce the template parameter N here
>
>   template 
>   void foo(const long int (&)[N]) {}
>
>   void bar() {
>     foo ({1,2,3});
>   }
>
> because of the type mismatch; parm is long int (element type of the array),
> while arg is int (element type of {1, 2, 3}), and unify doesn't like that:
>
> 21789   /* We have already checked cv-qualification at the top of the
> 21790  function.  */
> 21791   if (!same_type_ignoring_top_level_qualifiers_p (arg, parm))
> 21792 return unify_type_mismatch (explain_p, parm, arg);
>
> But since the parameter type is array, we should see if there exists an
> implicit conversion sequence for each element of the array from the
> corresponding element of the initializer list, and that's what I tried,
> and it seems to work.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2018-05-30  Marek Polacek  
>
> PR c++/85977
> * pt.c (unify): Handle the [over.ics.list]/6 case.
>
> + /* [over.ics.list]/6 says we should try an implicit conversion
> +from each list element to the corresponding array element
> +type.  */

[over.ics.list] doesn't apply during template argument deduction, but
rather [temp.deduct.call].

> + if (TREE_CODE (parm) == ARRAY_TYPE)
> +   {
> + tree x = perform_implicit_conversion (elttype, elt, 
> complain);

Rather than check this immediately here, we want to do more or less
what unify_one_argument does:

  if (strict != DEDUCE_EXACT
  && TYPE_P (parm) && !uses_deducible_template_parms (parm))
/* For function parameters with no deducible template parameters,
   just return.  We'll check non-dependent conversions later.  */
return unify_success (explain_p);

so if elttype has no deducible template parms, don't do deduction from
the list elements at all.

And then we want to check convertibility of the elements in
type_unification_real, when we check convertibility of other function
parameters that don't involve template parameters:

  /* DR 1391: All parameters have args, now check non-dependent parms for
     convertibility.  */
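
A small stand-alone illustration of the deduction being discussed.  The function name array_length and the choice of std::size_t for N are assumptions, since the template header is elided in the quoted test case.  The element type already matches here, so this form should be accepted regardless of the PR; the int-initializer form from the report is left in a comment because whether it compiles depends on the compiler having the fix.

```cpp
#include <cassert>
#include <cstddef>

// N is deduced from the length of the braced list.  The element type (long)
// contains no template parameters, so only N needs deducing.
template <std::size_t N>
constexpr std::size_t array_length(const long (&)[N]) { return N; }

// Spot checks that can be called from a test driver.
inline void check_examples() {
  assert(array_length({1L, 2L, 3L}) == 3);
  assert(array_length({42L}) == 1);
  // The PR 85977 form: int elements that only need an implicit conversion
  // to long.  Left commented out because compilers without the fix reject it.
  // assert(array_length({1, 2, 3}) == 3);
}
```

With the approach suggested above, the commented-out call would deduce N the same way, and the int-to-long conversions would be checked later together with the other non-dependent parameters.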

Jason


Re: PING^1: [PATCH GCC 8] x86: Re-enable partial_reg_dependency and movx for Haswell

2018-05-31 Thread H.J. Lu
On Wed, May 30, 2018 at 5:43 AM, H.J. Lu  wrote:
> On Sun, May 20, 2018 at 11:51 AM, Jan Hubicka  wrote:
>>> r254152 disabled partial_reg_dependency and movx for Haswell and newer
>>> Intel processors.  r258972 restored them for skylake-avx512.  For Haswell,
>>> movx improves performance.  But partial_reg_stall may be better than
>>> partial_reg_dependency in theory.  We will investigate performance impact
>>> of partial_reg_stall vs partial_reg_dependency on Haswell for GCC 9.  In
>>> the meantime, this patch restores both partial_reg_dependency and movx for
>>> Haswell in GCC 8.
>>>
>>> OK for GCC 8?
>>
>> I would still like to know in what situations/benchmarks it improves the
>> performance.
>> The change was benchmarked on spec2000/2006 plus some additional benchmarks,
>> so it would be nice to know where it hurts.
>
> From
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85829#c5:
>
> I have made measurements on HSW comparing
> -mtune-ctrl=movx,partial_reg_dependency -Ofast -march=haswell to
> -Ofast -mtune=haswell and I see improvements on EEMBC benchmarks.
>
> automotive
> =
>   aifftr01 (default) - goodperf: Runtime improvement of   2.6% (time).
>   aiifft01 (default) - goodperf: Runtime improvement of   2.2% (time).
>
> networking
> =
>   ip_pktcheckb1m (default) - goodperf: Runtime improvement of   3.8% (time).
>   ip_pktcheckb2m (default) - goodperf: Runtime improvement of   5.2% (time).
>   ip_pktcheckb4m (default) - goodperf: Runtime improvement of   4.4% (time).
>   ip_pktcheckb512k (default) - goodperf: Runtime improvement of   4.2% (time).
>
> telecom
> =
>   fft00data_1 (default) - goodperf: Runtime improvement of   8.4% (time).
>   fft00data_2 (default) - goodperf: Runtime improvement of   8.6% (time).
>   fft00data_3 (default) - goodperf: Runtime improvement of   9.0% (time).
>
> OK for GCC 8?
>

This is the patch I am going to check into GCC 8.

-- 
H.J.
From 9ecbfa1fd04dc4370a9ec4f3d56189cc07aee668 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 17 May 2018 09:52:09 -0700
Subject: [PATCH] x86: Re-enable partial_reg_dependency and movx for Haswell

r254152 disabled partial_reg_dependency and movx for Haswell and newer
Intel processors.  r258972 restored them for skylake-avx512.  For Haswell,
movx improves performance.  But partial_reg_stall may be better than
partial_reg_dependency in theory.  We will investigate performance impact
of partial_reg_stall vs partial_reg_dependency on Haswell for GCC 9.  In
the meantime, this patch restores both partial_reg_dependency and movx for
Haswell in GCC 8.

On Haswell, improvements for EEMBC benchmarks with

-mtune-ctrl=movx,partial_reg_dependency -Ofast -march=haswell

vs

-Ofast -mtune=haswell

are

automotive
=
  aifftr01 (default) - goodperf: Runtime improvement of   2.6% (time).
  aiifft01 (default) - goodperf: Runtime improvement of   2.2% (time).

networking
=
  ip_pktcheckb1m (default) - goodperf: Runtime improvement of   3.8% (time).
  ip_pktcheckb2m (default) - goodperf: Runtime improvement of   5.2% (time).
  ip_pktcheckb4m (default) - goodperf: Runtime improvement of   4.4% (time).
  ip_pktcheckb512k (default) - goodperf: Runtime improvement of   4.2% (time).

telecom
=
  fft00data_1 (default) - goodperf: Runtime improvement of   8.4% (time).
  fft00data_2 (default) - goodperf: Runtime improvement of   8.6% (time).
  fft00data_3 (default) - goodperf: Runtime improvement of   9.0% (time).

	PR target/85829
	* config/i386/x86-tune.def: Re-enable partial_reg_dependency
	and movx for Haswell.
---
 gcc/config/i386/x86-tune.def | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 5649fdcf416..60625668236 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -48,7 +48,7 @@ DEF_TUNE (X86_TUNE_SCHEDULE, "schedule",
over partial stores.  For example preffer MOVZBL or MOVQ to load 8bit
value over movb.  */
 DEF_TUNE (X86_TUNE_PARTIAL_REG_DEPENDENCY, "partial_reg_dependency",
-  m_P4_NOCONA | m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE
+  m_P4_NOCONA | m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE | m_HASWELL
 	  | m_BONNELL | m_SILVERMONT | m_INTEL
 	  | m_KNL | m_KNM | m_AMD_MULTIPLE | m_SKYLAKE_AVX512 | m_GENERIC)
 
@@ -84,7 +84,7 @@ DEF_TUNE (X86_TUNE_PARTIAL_FLAG_REG_STALL, "partial_flag_reg_stall",
partial dependencies.  */
 DEF_TUNE (X86_TUNE_MOVX, "movx",
   m_PPRO | m_P4_NOCONA | m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE
-	  | m_BONNELL | m_SILVERMONT | m_KNL | m_KNM | m_INTEL
+	  | m_BONNELL | m_SILVERMONT | m_KNL | m_KNM | m_INTEL | m_HASWELL
 	  | m_GEODE | m_AMD_MULTIPLE | m_SKYLAKE_AVX512 | m_GENERIC)
 
 /* X86_TUNE_MEMORY_MISMATCH_STALL: Avoid partial stores that are followed by
-- 
2.17.0



Re: [PATCH] avoid ICE when pretty-printing a VLA with an error bound (PR 85956)

2018-05-31 Thread Martin Sebor

On 05/31/2018 07:30 AM, Jakub Jelinek wrote:

> On Thu, May 31, 2018 at 09:14:33AM -0400, Jason Merrill wrote:
>>> I came up with the following hack instead (or in addition to),
>>> replace those error_mark_node bounds with NULL (i.e. pretend flexible array
>>> members) if during OpenMP/OpenACC outlining we've decided not to pass around
>>> the bounds artificial decl because nothing really use it.
>>>
>>> Is this a reasonable hack, or shall we go with Martin's patch + similar
>>> change in C++ pretty printer to handle error_mark_node specially and perhaps
>>> also handle NULL specially too as the patch does, or both those FE changes
>>> and this, something else?
>>
>> We generally try to avoid embedded error_mark_node within other trees.
>> If the array bound is erroneous, can we replace the whole array type
>> with error_mark_node?
>
> The array bound isn't erroneous, just becomes unknown (well, known only in
> an outer function), we still need to know it is an array type and that it
> has 0 as the low bound.
> Instead of replacing it with NULL we in theory could just create another
> VAR_DECL and never initialize it, it wouldn't be far from what happens with
> some other VLAs - during optimizations it is possible that the array bound
> var is optimized away.  Just it would be much more work to do it that way.


In my mind the issue boils down to two questions:

1) should the pretty printer handle error-mark-node gracefully
   or is it appropriate for it to abort?
2) is it appropriate to be embedding/using error_mark_node in
   valid constructs as a proxy for "unused" or "unknown" or
   such?

I would expect the answer to (1) to be yes.  Despite that,
I agree with Jason that the answer to (2) should be no.

That said, I don't think the fix for this bug needs to depend
on solving (2).  We can avoid the ICE by changing the pretty
printers and adjust the openmp implementation later.

Martin


Re: PING^1: [PATCH GCC 8] x86: Re-enable partial_reg_dependency and movx for Haswell

2018-05-31 Thread Jan Hubicka
> 
> This is the patch I am going to check into GCC 8.
> 
> -- 
> H.J.

> From 9ecbfa1fd04dc4370a9ec4f3d56189cc07aee668 Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" 
> Date: Thu, 17 May 2018 09:52:09 -0700
> Subject: [PATCH] x86: Re-enable partial_reg_dependency and movx for Haswell
> 
> r254152 disabled partial_reg_dependency and movx for Haswell and newer
> Intel processors.  r258972 restored them for skylake-avx512.  For Haswell,
> movx improves performance.  But partial_reg_stall may be better than
> partial_reg_dependency in theory.  We will investigate performance impact
> of partial_reg_stall vs partial_reg_dependency on Haswell for GCC 9.  In
> the meantime, this patch restores both partial_reg_dependency and movx for
> Haswell in GCC 8.
> 
> On Haswell, improvements for EEMBC benchmarks with
> 
> -mtune-ctrl=movx,partial_reg_dependency -Ofast -march=haswell
> 
> vs
> 
> -Ofast -mtune=haswell
> 
> are
> 
> automotive
> =
>   aifftr01 (default) - goodperf: Runtime improvement of   2.6% (time).
>   aiifft01 (default) - goodperf: Runtime improvement of   2.2% (time).
> 
> networking
> =
>   ip_pktcheckb1m (default) - goodperf: Runtime improvement of   3.8% (time).
>   ip_pktcheckb2m (default) - goodperf: Runtime improvement of   5.2% (time).
>   ip_pktcheckb4m (default) - goodperf: Runtime improvement of   4.4% (time).
>   ip_pktcheckb512k (default) - goodperf: Runtime improvement of   4.2% (time).
> 
> telecom
> =
>   fft00data_1 (default) - goodperf: Runtime improvement of   8.4% (time).
>   fft00data_2 (default) - goodperf: Runtime improvement of   8.6% (time).
>   fft00data_3 (default) - goodperf: Runtime improvement of   9.0% (time).

Thanks for the data. Why did you commit the patch to the release branch only?
The patch is OK for mainline too.

I do not have access to the benchmark so I can not check. Why do we get
the improvements here and how does that behave on skylake+?

honza
> 
>   PR target/85829
>   * config/i386/x86-tune.def: Re-enable partial_reg_dependency
>   and movx for Haswell.
> ---
>  gcc/config/i386/x86-tune.def | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
> index 5649fdcf416..60625668236 100644
> --- a/gcc/config/i386/x86-tune.def
> +++ b/gcc/config/i386/x86-tune.def
> @@ -48,7 +48,7 @@ DEF_TUNE (X86_TUNE_SCHEDULE, "schedule",
> over partial stores.  For example preffer MOVZBL or MOVQ to load 8bit
> value over movb.  */
>  DEF_TUNE (X86_TUNE_PARTIAL_REG_DEPENDENCY, "partial_reg_dependency",
> -  m_P4_NOCONA | m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE
> +  m_P4_NOCONA | m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE | m_HASWELL
> | m_BONNELL | m_SILVERMONT | m_INTEL
> | m_KNL | m_KNM | m_AMD_MULTIPLE | m_SKYLAKE_AVX512 | m_GENERIC)
>  
> @@ -84,7 +84,7 @@ DEF_TUNE (X86_TUNE_PARTIAL_FLAG_REG_STALL, 
> "partial_flag_reg_stall",
> partial dependencies.  */
>  DEF_TUNE (X86_TUNE_MOVX, "movx",
>m_PPRO | m_P4_NOCONA | m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE
> -   | m_BONNELL | m_SILVERMONT | m_KNL | m_KNM | m_INTEL
> +   | m_BONNELL | m_SILVERMONT | m_KNL | m_KNM | m_INTEL | m_HASWELL
> | m_GEODE | m_AMD_MULTIPLE | m_SKYLAKE_AVX512 | m_GENERIC)
>  
>  /* X86_TUNE_MEMORY_MISMATCH_STALL: Avoid partial stores that are followed by
> -- 
> 2.17.0
> 



Re: [PATCH] avoid ICE when pretty-printing a VLA with an error bound (PR 85956)

2018-05-31 Thread Jason Merrill
On Thu, May 31, 2018 at 11:00 AM, Martin Sebor  wrote:
> On 05/31/2018 07:30 AM, Jakub Jelinek wrote:
>>
>> On Thu, May 31, 2018 at 09:14:33AM -0400, Jason Merrill wrote:

>>>> I came up with the following hack instead (or in addition to),
>>>> replace those error_mark_node bounds with NULL (i.e. pretend flexible
>>>> array
>>>> members) if during OpenMP/OpenACC outlining we've decided not to pass
>>>> around
>>>> the bounds artificial decl because nothing really use it.
>>>>
>>>> Is this a reasonable hack, or shall we go with Martin's patch + similar
>>>> change in C++ pretty printer to handle error_mark_node specially and
>>>> perhaps
>>>> also handle NULL specially too as the patch does, or both those FE
>>>> changes
>>>> and this, something else?
>>>
>>>
>>> We generally try to avoid embedded error_mark_node within other trees.
>>> If the array bound is erroneous, can we replace the whole array type
>>> with error_mark_node?
>>
>>
>> The array bound isn't erroneous, just becomes unknown (well, known only in
>> an outer function), we still need to know it is an array type and that it
>> has 0 as the low bound.
>> Instead of replacing it with NULL we in theory could just create another
>> VAR_DECL and never initialize it, it wouldn't be far from what happens with
>> some other VLAs - during optimizations it is possible that the array bound
>> var is optimized away.  Just it would be much more work to do it that way.
>
>
> In my mind the issue boils down to two questions:
>
> 1) should the pretty printer handle error-mark-node gracefully
>    or is it appropriate for it to abort?
> 2) is it appropriate to be embedding/using error_mark_node in
>    valid constructs as a proxy for "unused" or "unknown" or
>    such?
>
> I would expect the answer to (1) to be yes.  Despite that,
> I agree with Jason that the answer to (2) should be no.
>
> That said, I don't think the fix for this bug needs to depend
> on solving (2).  We can avoid the ICE by changing the pretty
> printers and adjust the openmp implementation later.

The problem with embedded error_mark_node is that lots of places are
going to blow up like this, and we don't want to change everything to
expect it.  Adjusting the pretty-printer might fix this particular
testcase, but other things are likely to get tripped up by the same
problem.

Where is the error_mark_node coming from in the first place?

Jason


Re: [PATCH] avoid ICE when pretty-printing a VLA with an error bound (PR 85956)

2018-05-31 Thread Jakub Jelinek
On Thu, May 31, 2018 at 11:19:08AM -0400, Jason Merrill wrote:
> > In my mind the issue boils down to two questions:
> >
> > 1) should the pretty printer handle error-mark-node gracefully
> >or is it appropriate for it to abort?
> > 2) is it appropriate to be embedding/using error_mark_node in
> >valid constructs as a proxy for "unused" or "unknown" or
> >such?
> >
> > I would expect the answer to (1) to be yes.  Despite that,
> > I agree with Jason that the answer to (2) should be no.
> >
> > That said, I don't think the fix for this bug needs to depend
> > on solving (2).  We can avoid the ICE by changing the pretty
> > printers and adjust the openmp implementation later.
> 
> The problem with embedded error_mark_node is that lots of places are
> going to blow up like this, and we don't want to change everything to
> expect it.  Adjusting the pretty-printer might fix this particular
> testcase, but other things are likely to get tripped up by the same
> problem.
> 
> Where is the error_mark_node coming from in the first place?

remap_type invoked during omp-low.c (scan_omp).
omp_copy_decl returns error_mark_node for decls that tree-inline.c wants
to remap, but they aren't actually remapped for some reason.
For normal VLAs gimplify.c makes sure the needed artificial decls are
firstprivatized, but in this case (VLA not in some decl's type, but just
referenced indirectly through pointers) nothing scans those unless
those temporaries are actually used in the code.

Jakub
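
For readers unfamiliar with the shape being described, here is a minimal
sketch of a VLA type that is reached only through a pointer, so that no
declared variable has the array type itself.  It is an illustration only and
not the testcase from the PR; the function name store_first and its
parameters are made up.  Built without OpenMP support, the pragmas are
ignored and the code runs serially.

```c
/* Hypothetical illustration: ROWS points to arrays of N ints.  The VLA
   type int[n] appears only as the pointed-to type, never as the type of
   a declared array variable.  */
int
store_first (int n, void *p, int value)
{
  int (*rows)[n] = p;   /* pointer to VLA; the bound n is evaluated here */
#pragma omp parallel shared(rows)
#pragma omp master
  rows[0][0] = value;   /* only the pointer is referenced in the region */
  return rows[0][0];
}
```

Nothing inside the region names the artificial variable that holds the array
bound, which is the situation the message above says is not scanned.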


Re: PING^1: [PATCH GCC 8] x86: Re-enable partial_reg_dependency and movx for Haswell

2018-05-31 Thread H.J. Lu
On Thu, May 31, 2018 at 8:08 AM, Jan Hubicka  wrote:
>>
>> This is the patch I am going to check into GCC 8.
>>
>> --
>> H.J.
>
>> From 9ecbfa1fd04dc4370a9ec4f3d56189cc07aee668 Mon Sep 17 00:00:00 2001
>> From: "H.J. Lu" 
>> Date: Thu, 17 May 2018 09:52:09 -0700
>> Subject: [PATCH] x86: Re-enable partial_reg_dependency and movx for Haswell
>>
>> r254152 disabled partial_reg_dependency and movx for Haswell and newer
>> Intel processors.  r258972 restored them for skylake-avx512.  For Haswell,
>> movx improves performance.  But partial_reg_stall may be better than
>> partial_reg_dependency in theory.  We will investigate performance impact
>> of partial_reg_stall vs partial_reg_dependency on Haswell for GCC 9.  In
>> the meantime, this patch restores both partial_reg_dependency and movx for
>> Haswell in GCC 8.
>>
>> On Haswell, improvements for EEMBC benchmarks with
>>
>> -mtune-ctrl=movx,partial_reg_dependency -Ofast -march=haswell
>>
>> vs
>>
>> -Ofast -mtune=haswell
>>
>> are
>>
>> automotive
>> =
>>   aifftr01 (default) - goodperf: Runtime improvement of   2.6% (time).
>>   aiifft01 (default) - goodperf: Runtime improvement of   2.2% (time).
>>
>> networking
>> =
>>   ip_pktcheckb1m (default) - goodperf: Runtime improvement of   3.8% (time).
>>   ip_pktcheckb2m (default) - goodperf: Runtime improvement of   5.2% (time).
>>   ip_pktcheckb4m (default) - goodperf: Runtime improvement of   4.4% (time).
>>   ip_pktcheckb512k (default) - goodperf: Runtime improvement of   4.2% (time).
>>
>> telecom
>> =
>>   fft00data_1 (default) - goodperf: Runtime improvement of   8.4% (time).
>>   fft00data_2 (default) - goodperf: Runtime improvement of   8.6% (time).
>>   fft00data_3 (default) - goodperf: Runtime improvement of   9.0% (time).
>
> Thanks for the data. Why did you commit the patch to the release branch only?
> The patch is OK for mainline too.

I am checking this patch into trunk now.

> I do not have access to the benchmark so I can not check. Why do we get

From the Intel optimization guide:

3.5.2.4 Partial Register Stalls

General purpose registers can be accessed in granularities of bytes,
words, doublewords; 64-bit mode also supports quadword granularity.
Referencing a portion of a register is referred to as a partial
register reference.

A partial register stall happens when an instruction refers to a
register, portions of which were previously modified by other
instructions. For example, partial register stalls occurs with a read
to AX while previous instructions stored AL and AH, or a read to EAX
while previous instruction modified AX.

The delay of a partial register stall is small in processors based on
Intel Core and NetBurst microarchitectures, and in Pentium M processor
(with CPUID signature family 6, model 13), Intel Core Solo, and Intel
Core Duo processors. Pentium M processors (CPUID signature with family
6, model 9) and the P6 family incur a large penalty.

Note that in Intel 64 architecture, an update to the lower 32 bits of
a 64 bit integer register is architecturally defined to zero extend
the upper 32 bits. While this action may be logically viewed as a 32
bit update, it is really a 64 bit update (and therefore does not cause
a partial stall).

Referencing partial registers frequently produces code sequences with
either false or real dependencies. Example 3-18 demonstrates a series
of false and real dependencies caused by referencing partial
registers.
...
When you want to load from memory to a partial register, consider
using MOVZX or MOVSX to avoid the additional merge micro-op penalty.

We have movx, partial_reg_dependency and partial_reg_stall to deal with
it.  movx is always good.   But partial_reg_stall is enabled only for i686.  We
need to investigate partial_reg_stall vs partial_reg_dependency on Haswell+.
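
As a rough illustration of the source-level pattern these tunings affect, the
sketch below widens bytes loaded from memory.  Whether a compiler emits a
zero-extending load or a byte move into a partial register depends on the
target tuning; the function name sum_bytes is made up and nothing here is
taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: sum the bytes of a buffer.  Each element is
   loaded as 8 bits and widened to 32 bits before the add.  On x86 the
   widening load can be done with MOVZX, which writes the whole register
   and so does not depend on (or merge with) its previous contents.  */
uint32_t
sum_bytes (const uint8_t *buf, size_t len)
{
  uint32_t sum = 0;
  for (size_t i = 0; i < len; i++)
    sum += buf[i];      /* zero-extend 8 -> 32 bits, then add */
  return sum;
}
```

Comparing the assembly from -Ofast -mtune=haswell with that from the
-mtune-ctrl=movx,partial_reg_dependency options quoted earlier is one way to
look for a difference, though the output will vary by compiler version.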

> the improvements here and how does that behave on skylake+?

This is

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84413

We are working on it.

> honza
>>
>>   PR target/85829
>>   * config/i386/x86-tune.def: Re-enable partial_reg_dependency
>>   and movx for Haswell.
>> ---
>>  gcc/config/i386/x86-tune.def | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
>> index 5649fdcf416..60625668236 100644
>> --- a/gcc/config/i386/x86-tune.def
>> +++ b/gcc/config/i386/x86-tune.def
>> @@ -48,7 +48,7 @@ DEF_TUNE (X86_TUNE_SCHEDULE, "schedule",
>> over partial stores.  For example preffer MOVZBL or MOVQ to load 8bit
>> value over movb.  */
>>  DEF_TUNE (X86_TUNE_PARTIAL_REG_DEPENDENCY, "partial_reg_dependency",
>> -  m_P4_NOCONA | m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE
>> +  m_P4_NOCONA | m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE | m_HASWELL
>> | m_BONNELL | m_SILVERMONT | m_INTEL
>> | m_KNL | m_KNM | m_AMD_MULTIPLE | m_SKYLAKE_AVX512 | m_GENERIC)
>>
>> @@ -84,7 +84,7 @@ DEF_TUNE (X86_TUNE_PARTIAL_FLAG_REG_STALL, 
>> "partial_flag_re

Re: [PATCH] correct/improve handling of -Walloc-size-larger-than (PR 82063)

2018-05-31 Thread Martin Sebor

On 05/30/2018 05:18 PM, Jeff Law wrote:

On 05/18/2018 05:58 PM, Martin Sebor wrote:

The -Walloc-size-larger-than= option is supposed to make it possible
to disable the warning by specifying a limit that's larger than
the default of PTRDIFF_MAX (the handler for the option argument
gets around the INT_MAX maximum for numeric arguments by accepting
suffixes like MB or GB).  Unfortunately, a silly typo in the handler
prevents this from working correctly, and because there is no
-Wno-alloc-size-larger-than option it's impossible to suppress
unwanted instances of this warning.

The attached patch removes these shortcomings by:

1) fixing the typo,
2) letting the argument handler accept excessively large arguments
   (> ULLONG_MAX) and treat them as infinite,
3) adding -Wno-alloc-size-larger-than option to disable the warning

The patch also issues a warning for invalid arguments (they either
reset the limit to zero or leave it at PTRDIFF_MAX otherwise).
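
The argument handling described above might be sketched roughly as follows.
This is not the code in calls.c: the helper name parse_size_limit, the set of
suffixes, and the choice of powers of 1000 are all assumptions made for the
illustration.

```c
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not GCC's alloc_max_size.  Parses ARG as a decimal
   number with an optional kB/MB/GB suffix (assumed here to be powers of
   1000).  Values that do not fit are saturated to ULLONG_MAX, which the
   caller treats as "no limit".  Returns false for malformed input.  */
bool
parse_size_limit (const char *arg, unsigned long long *limit)
{
  unsigned long long value = 0;
  bool saturated = false;

  if (!isdigit ((unsigned char) *arg))
    return false;

  for (; isdigit ((unsigned char) *arg); arg++)
    {
      unsigned int digit = (unsigned int) (*arg - '0');
      if (value > (ULLONG_MAX - digit) / 10)
        saturated = true;       /* next digit would overflow */
      else
        value = value * 10 + digit;
    }

  unsigned long long unit = 1;
  if (strcmp (arg, "kB") == 0)
    unit = 1000ULL;
  else if (strcmp (arg, "MB") == 0)
    unit = 1000ULL * 1000;
  else if (strcmp (arg, "GB") == 0)
    unit = 1000ULL * 1000 * 1000;
  else if (*arg != '\0')
    return false;               /* unknown suffix */

  if (saturated || value > ULLONG_MAX / unit)
    *limit = ULLONG_MAX;        /* too large: treat as infinite */
  else
    *limit = value * unit;
  return true;
}
```

A caller would warn when the function returns false and disable the check
when the limit comes back as ULLONG_MAX.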

I'm looking for approval to commit this patch to trunk and all
release branches that support the option (8 and 7).

For trunk, as the next step, I'd like to generalize the argument
handler and move it where other similar options (for example,
-Wlarger-than, -Walloca-larger-than, -Wframe-larger-than, and
-Wstack-usage) can make use of it.

Martin

gcc-82063.diff


PR c/82063 - issues with arguments enabled by -Wall

gcc/c-family/ChangeLog:

PR c/82063
* c.opt (-Wno-alloc-size-larger-than): New option.

gcc/ChangeLog:

PR c/82063
* calls.c (alloc_max_size): Correct a logic error/typo.
Treat excessive arguments as infinite.  Warn for invalid arguments.

gcc/testsuite/ChangeLog:

PR c/82063
* gcc.dg/Walloc-size-larger-than-1.c: New test.
* gcc.dg/Walloc-size-larger-than-10.c: New test.
* gcc.dg/Walloc-size-larger-than-11.c: New test.
* gcc.dg/Walloc-size-larger-than-12.c: New test.
* gcc.dg/Walloc-size-larger-than-13.c: New test.
* gcc.dg/Walloc-size-larger-than-14.c: New test.
* gcc.dg/Walloc-size-larger-than-15.c: New test.
* gcc.dg/Walloc-size-larger-than-16.c: New test.
* gcc.dg/Walloc-size-larger-than-17.c: New test.
* gcc.dg/Walloc-size-larger-than-2.c: New test.
* gcc.dg/Walloc-size-larger-than-3.c: New test.
* gcc.dg/Walloc-size-larger-than-4.c: New test.
* gcc.dg/Walloc-size-larger-than-5.c: New test.
* gcc.dg/Walloc-size-larger-than-6.c: New test.
* gcc.dg/Walloc-size-larger-than-7.c: New test.
* gcc.dg/Walloc-size-larger-than-8.c: New test.
* gcc.dg/Walloc-size-larger-than-9.c: New test.
* gcc.dg/Walloc-size-larger-than.c: New test.

OK for the trunk.  Not sure this is really a regression, so it'd need
Richi or Jakub to approve for the release branches.


I also updated the manual and committed r261030 to trunk.

Without the patch, the false positives that the new warning sometimes
(if rarely) issues cannot be suppressed.  The particular false
positive that prompted me to fix this is due to C++ front end
bug 82063, and there's nothing the implementation of the warning
can do to avoid it.

Richi and/or Jakub, can you please review the patch and let me
know if it's suitable for the release branches?

Martin


[gomp5] atomic with memory-order clauses or hint, parsing of requires directive

2018-05-31 Thread Jakub Jelinek
Hi!

I've committed following patch to gomp-5_0-branch, which:
1) adds support for memory-order clauses other than seq_cst to atomic
   construct
2) adds support for hint clause on atomic construct, fixes some hint related
   glitches on critical construct; hints are ignored after checking their
   arguments
3) adds parsing of the requires directive (though not yet passing that info
   to libgomp to be able to filter out some devices)
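
For reference, item 1 corresponds to source like the following sketch.  The
clause spellings follow OpenMP 5.0; the counter and the three helper names
are made up for the example, and the hint clause and the requires directive
are not shown.  Built without OpenMP support, the pragmas are ignored and the
code simply runs serially.

```c
/* Hypothetical shared counter used by the helpers below.  */
static int counter;

void
bump (void)
{
  /* Atomic update with an explicit relaxed memory-order clause.  */
#pragma omp atomic update relaxed
  counter++;
}

int
peek (void)
{
  int v;
  /* Atomic read with acquire ordering.  */
#pragma omp atomic read acquire
  v = counter;
  return v;
}

void
reset (int value)
{
  /* Atomic write with release ordering.  */
#pragma omp atomic write release
  counter = value;
}
```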

Regtested on x86_64-linux.

2018-05-31  Jakub Jelinek  

* Makefile.in (GTFILES): Add omp-general.h.
* gengtype.c (open_base_files): Likewise.
* tree-core.h (enum omp_memory_order): New enum.
(struct tree_base): Add omp_atomic_memory_order field into union.
Remove OMP_ATOMIC_SEQ_CST comment.
* tree.h (OMP_ATOMIC_SEQ_CST): Remove.
(OMP_ATOMIC_MEMORY_ORDER): Define.
* tree-pretty-print.h (dump_omp_atomic_memory_order): Declare.
* tree-pretty-print.c (dump_omp_atomic_memory_order): New function.
(dump_generic_node): Use it.
* gimple.h (enum gf_mask): Remove GF_OMP_ATOMIC_SEQ_CST, add
GF_OMP_ATOMIC_MEMORY_ORDER, use different value for
GF_OMP_ATOMIC_NEED_VALUE.
(gimple_build_omp_atomic_load): Add enum omp_memory_order argument.
(gimple_build_omp_atomic_store): Likewise.
(gimple_omp_atomic_seq_cst_p): Remove.
(gimple_omp_atomic_memory_order): New function.
(gimple_omp_atomic_set_seq_cst): Remove.
(gimple_omp_atomic_set_memory_order): New function.
* gimple.c (gimple_build_omp_atomic_load): Add mo argument, call
gimple_omp_atomic_set_memory_order.
(gimple_build_omp_atomic_store): Likewise.
* gimple-pretty-print.c (dump_gimple_omp_atomic_load,
dump_gimple_omp_atomic_store): Use dump_omp_atomic_memory_order.
* gimplify.c (gimplify_omp_atomic): Use OMP_ATOMIC_MEMORY_ORDER instead
of OMP_ATOMIC_SEQ_CST, pass it as new argument to
gimple_build_omp_atomic_load and gimple_build_omp_atomic_store, remove
gimple_omp_atomic_set_seq_cst calls.
* omp-general.h (enum omp_requires): New enum.
(omp_requires_mask): Declare.
* omp-general.c (enum omp_requires): New variable.
* omp-expand.c (omp_memory_order_to_memmodel): New function.
(expand_omp_atomic_load, expand_omp_atomic_store,
expand_omp_atomic_fetch_op): Use it and gimple_omp_atomic_memory_order
instead of gimple_omp_atomic_seq_cst_p.
* omp-low.c (lower_reduction_clauses): Initialize
OMP_ATOMIC_MEMORY_ORDER to relaxed.
* tree-parloops.c (create_call_for_reduction_1): Pass
OMP_MEMORY_ORDER_RELAXED as new argument to dump_gimple_omp_atomic_load
and dump_gimple_omp_atomic_store.
c-family/
* c-common.h (c_finish_omp_atomic): Replace bool seq_cst argument with
enum omp_memory_order memory_order.
* c-omp.c (c_finish_omp_atomic): Likewise.  Set OMP_ATOMIC_MEMORY_ORDER
instead of OMP_ATOMIC_SEQ_CST.
* c-pragma.c (omp_pragmas): Add requires.
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_REQUIRES.
c/
* c-parser.c (c_parser_omp_requires): New function.
(c_parser_pragma): Handle PRAGMA_OMP_REQUIRES.
(c_parser_omp_clause_hint): Require constant integer expression rather
than just integer expression.
(c_parser_omp_atomic): Parse hint and memory order clauses.  Handle
default memory order from requires directive if any.  Adjust
c_finish_omp_atomic caller.
(c_parser_omp_critical): Allow comma in between (name) and hint clause.
(c_parser_omp_target): Set OMP_REQUIRES_TARGET_USED bit in
omp_requires_mask.
cp/
* cp-tree.h (OMP_ATOMIC_DEPENDENT_P): Return true also for first
argument being OMP_CLAUSE.
(finish_omp_atomic): Remove seq_cst argument.  Add clauses and mo
arguments.
* parser.c (cp_parser_omp_atomic): Parse hint and memory order clauses.
Handle default memory order from requires directive if any.  Adjust
finish_omp_atomic caller.
(cp_parser_omp_critical): Allow comma in between (name) and hint
clause.
(cp_parser_omp_target): Set OMP_REQUIRES_TARGET_USED bit in
omp_requires_mask.
(cp_parser_omp_requires): New function.
(cp_parser_pragma): Handle PRAGMA_OMP_REQUIRES.
* pt.c (tsubst_expr) : Call tsubst_omp_clauses
on clauses if any, adjust finish_omp_atomic caller.  Use
OMP_ATOMIC_MEMORY_ORDER rather than OMP_ATOMIC_SEQ_CST.
* semantics.c (finish_omp_clauses): Use error_at rather than
error for priority and hint clause diagnostics.  Fix pasto for
hint clause.  Diagnose hint expression that doesn't fold into
INTEGER_CST.
(finish_omp_atomic): Remove seq_cst argument.  Add clauses and mo
arguments.  Adjust c_finish_omp_atomic caller.  Stick 

Re: [PATCH] avoid ICE when pretty-printing a VLA with an error bound (PR 85956)

2018-05-31 Thread Jason Merrill
On Thu, May 31, 2018 at 11:31 AM, Jakub Jelinek  wrote:
> On Thu, May 31, 2018 at 11:19:08AM -0400, Jason Merrill wrote:
>> > In my mind the issue boils down to two questions:
>> >
>> > 1) should the pretty printer handle error-mark-node gracefully
>> >or is it appropriate for it to abort?
>> > 2) is it appropriate to be embedding/using error_mark_node in
>> >valid constructs as a proxy for "unused" or "unknown" or
>> >such?
>> >
>> > I would expect the answer to (1) to be yes.  Despite that,
>> > I agree with Jason that the answer to (2) should be no.
>> >
>> > That said, I don't think the fix for this bug needs to depend
>> > on solving (2).  We can avoid the ICE by changing the pretty
>> > printers and adjust the openmp implementation later.
>>
>> The problem with embedded error_mark_node is that lots of places are
>> going to blow up like this, and we don't want to change everything to
>> expect it.  Adjusting the pretty-printer might fix this particular
>> testcase, but other things are likely to get tripped up by the same
>> problem.
>>
>> Where is the error_mark_node coming from in the first place?
>
> remap_type invoked during omp-low.c (scan_omp).
> omp_copy_decl returns error_mark_node for decls that tree-inline.c wants
> to remap, but they aren't actually remapped for some reason.
> For normal VLAs gimplify.c makes sure the needed artificial decls are
> firstprivatized, but in this case (VLA not in some decl's type, but just
> referenced indirectly through pointers) nothing scans those unless
> those temporaries are actually used in the code.

Returning error_mark_node from omp_copy_decl and then continuing seems
like the problem, then.  Would it really be that hard to return an
uninitialized variable instead?

Jason


Re: [PATCH] avoid ICE when pretty-printing a VLA with an error bound (PR 85956)

2018-05-31 Thread Jakub Jelinek
On Thu, May 31, 2018 at 01:34:19PM -0400, Jason Merrill wrote:
> >> Where is the error_mark_node coming from in the first place?
> >
> > remap_type invoked during omp-low.c (scan_omp).
> > omp_copy_decl returns error_mark_node for decls that tree-inline.c wants
> > to remap, but they aren't actually remapped for some reason.
> > For normal VLAs gimplify.c makes sure the needed artificial decls are
> > firstprivatized, but in this case (VLA not in some decl's type, but just
> > referenced indirectly through pointers) nothing scans those unless
> > those temporaries are actually used in the code.
> 
> Returning error_mark_node from omp_copy_decl and then continuing seems
> like the problem, then.  Would it really be that hard to return an
> uninitialized variable instead?

The routine doesn't know if it is used in a context of a VLA bound or
something else, in the former case it is acceptable to swap it for some
other var, but in the latter case it would be just a bug, so using
error_mark_node in that case instead is better to catch those.
I can try to do that in tree-inline.c, but I'm really not sure how hard it
would be.
Or handle it in the gimplifier, scan for such vars and add private clauses
for those, that will mean nothing will be passed around.

Jakub


Re: [RFC][PR82479] missing popcount builtin detection

2018-05-31 Thread Bin.Cheng
On Thu, May 31, 2018 at 3:51 AM, Kugan Vivekanandarajah
 wrote:
> Hi Bin,
>
> Thanks for the review. Please find the revised patch based on the
> review comments.
>
> Thanks,
> Kugan
>
> On 17 May 2018 at 19:56, Bin.Cheng  wrote:
>> On Thu, May 17, 2018 at 2:39 AM, Kugan Vivekanandarajah
>>  wrote:
>>> Hi Richard,
>>>
>>> On 6 March 2018 at 02:24, Richard Biener  wrote:
 On Thu, Feb 8, 2018 at 1:41 AM, Kugan Vivekanandarajah
  wrote:
> Hi Richard,
>

Hi,
Thanks very much for working on this.

> +/* Utility function to check if OP is defined by a stmt
> +   that is a val - 1.  If that is the case, set it to STMT.  */
> +
> +static bool
> +ssa_defined_by_and_minus_one_stmt_p (tree op, tree val, gimple **stmt)
This is checking if op is defined as val - 1, so name it as
ssa_defined_by_minus_one_stmt_p?

> +{
> +  if (TREE_CODE (op) == SSA_NAME
> +  && (*stmt = SSA_NAME_DEF_STMT (op))
> +  && is_gimple_assign (*stmt)
> +  && (gimple_assign_rhs_code (*stmt) == PLUS_EXPR)
> +  && val == gimple_assign_rhs1 (*stmt)
> +  && integer_minus_onep (gimple_assign_rhs2 (*stmt)))
> +return true;
> +  else
> +return false;
You can simply return the boolean condition.

> +}
> +
> +/* See if LOOP is a popcout implementation of the form
...
> +  rhs1 = gimple_assign_rhs1 (and_stmt);
> +  rhs2 = gimple_assign_rhs2 (and_stmt);
> +
> +  if (ssa_defined_by_and_minus_one_stmt_p (rhs1, rhs2, &and_minus_one))
> +rhs1 = rhs2;
> +  else if (ssa_defined_by_and_minus_one_stmt_p (rhs2, rhs1, &and_minus_one))
> +;
> +  else
> +return false;
> +
> +  /* Check the recurrence.  */
> +  phi = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (and_minus_one));
So gimple_assign_rhs1 (and_minus_one) == rhs1 is always true?  Please
use rhs1 directly.

> +  gimple *src_phi = SSA_NAME_DEF_STMT (rhs2);
I think this is checking the wrong thing and is redundant.  Either rhs2
equals rhs1 or is defined as (rhs1 - 1).  For the (rhs2 == rhs1) case,
the check duplicates the check on phi; for the latter, it's never a PHI
stmt and shouldn't be checked.

> +  if (gimple_code (phi) != GIMPLE_PHI
> +  || gimple_code (src_phi) != GIMPLE_PHI)
> +return false;
> +
> +  dest = gimple_assign_lhs (count_stmt);
> +  tree fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
> +  tree src = gimple_phi_arg_def (src_phi, loop_preheader_edge (loop)->dest_idx);
> +  if (adjust)
> +iter = fold_build2 (MINUS_EXPR, TREE_TYPE (dest),
> +build_call_expr (fn, 1, src),
> +build_int_cst (TREE_TYPE (dest), 1));
> +  else
> +iter = build_call_expr (fn, 1, src);
Note tree-ssa-loop-niters.c always uses unsigned_type_for (IV-type) as
niters type.  An unsigned type is unnecessary in this case, but isn't it
better to follow existing behavior?

> +  max = int_cst_value (TYPE_MAX_VALUE (TREE_TYPE (dest)));
As richi suggested, max should be the number of bits in the type of the IV.

> +
> +  niter->assumptions = boolean_false_node;
Redundant.

> +  niter->control.base = NULL_TREE;
> +  niter->control.step = NULL_TREE;
> +  niter->control.no_overflow = false;
> +  niter->niter = iter;
> +  niter->assumptions = boolean_true_node;
> +  niter->may_be_zero = boolean_false_node;
> +  niter->max = max;
> +  niter->bound = NULL_TREE;
> +  niter->cmp = ERROR_MARK;
> +  return true;
> +}
> +
> +
Apologies if these are nitpicks.

Thanks,
bin
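
For context, the loop shape the patch tries to recognise looks roughly like
the sketch below.  The function names are made up; the second function only
shows the value the loop computes, using the GCC/Clang builtin.

```c
/* Hypothetical example of the loop under discussion: each iteration
   clears the lowest set bit, so the body runs once per set bit.  */
int
count_bits_loop (unsigned long x)
{
  int count = 0;
  while (x)
    {
      x &= x - 1;       /* clear the lowest set bit */
      count++;
    }
  return count;
}

/* The same result expressed with the builtin.  */
int
count_bits_builtin (unsigned long x)
{
  return __builtin_popcountl (x);
}
```

Since every iteration removes one set bit, the iteration count can never
exceed the number of bits in the type, which is the bound suggested for
niter->max above.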


[PATCH, i386]: Remove concat_tg_mode mode attribute.

2018-05-31 Thread Uros Bizjak
No functional changes.

2018-05-31  Uros Bizjak  

* config/i386/sse.md (avx_vec_concat):
Substitute concat_tg_mode mode attribute with xtg_mode.
(avx512dq_broadcast_1): Ditto.
(concat_tg_mode): Remove mode attribute.

Bootstrapped and regression tested on x86_64-linux-gnu.

Committed to mainline.

Uros.
Index: sse.md
===
--- sse.md  (revision 261026)
+++ sse.md  (working copy)
@@ -913,11 +913,6 @@
(V8DF "sd")  (V4DF "sd") (V2DF "sd")])
 
 ;; Tie mode of assembler operand to mode iterator
-(define_mode_attr concat_tg_mode
-  [(V32QI "t") (V16HI "t") (V8SI "t") (V4DI "t") (V8SF "t") (V4DF "t")
-   (V64QI "g") (V32HI "g") (V16SI "g") (V8DI "g") (V16SF "g") (V8DF "g")])
-
-;; Tie mode of assembler operand to mode iterator
 (define_mode_attr xtg_mode
   [(V16QI "x") (V8HI "x") (V4SI "x") (V2DI "x") (V4SF "x") (V2DF "x")
(V32QI "t") (V16HI "t") (V8SI "t") (V4DI "t") (V8SF "t") (V4DF "t")
@@ -18070,7 +18065,7 @@
  (match_operand:<64x2mode> 1 "nonimmediate_operand" "v,m")))]
   "TARGET_AVX512DQ"
   "@
-   vshuf64x2\t{$0x0, %1, %1, 
%0|%0, %1, %1, 
0x0}
+   vshuf64x2\t{$0x0, %1, %1, 
%0|%0, %1, %1, 0x0}
vbroadcast64x2\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssemov")
(set_attr "prefix_extra" "1")
@@ -18919,21 +18914,21 @@
   switch (which_alternative)
 {
 case 0:
-  return "vinsert\t{$0x1, %2, %1, %0|%0, 
%1, %2, 0x1}";
+  return "vinsert\t{$0x1, %2, %1, %0|%0, %1, %2, 
0x1}";
 case 1:
   if ( == 64)
{
  if (TARGET_AVX512DQ && GET_MODE_SIZE (mode) == 4)
-   return "vinsert32x8\t{$0x1, %2, %1, 
%0|%0, %1, %2, 0x1}";
+   return "vinsert32x8\t{$0x1, %2, %1, %0|%0, 
%1, %2, 0x1}";
  else
-   return "vinsert64x4\t{$0x1, %2, %1, 
%0|%0, %1, %2, 0x1}";
+   return "vinsert64x4\t{$0x1, %2, %1, %0|%0, 
%1, %2, 0x1}";
}
   else
{
  if (TARGET_AVX512DQ && GET_MODE_SIZE (mode) == 8)
-   return "vinsert64x2\t{$0x1, %2, %1, 
%0|%0, %1, %2, 0x1}";
+   return "vinsert64x2\t{$0x1, %2, %1, %0|%0, 
%1, %2, 0x1}";
  else
-   return "vinsert32x4\t{$0x1, %2, %1, 
%0|%0, %1, %2, 0x1}";
+   return "vinsert32x4\t{$0x1, %2, %1, %0|%0, 
%1, %2, 0x1}";
}
 case 2:
 case 3:


Re: [PATCH, alpha] PR target/85095

2018-05-31 Thread Gerald Pfeifer
On Thu, 24 May 2018, Jeff Law wrote:
> > I can try to fix openbsd soon, but freebsd dropped alpha in 2009.
> > How does gcc feel about me picking up patches from openbsd ports, their
> > package manager, that weren't written by myself?
> So ISTM we should deprecate freebsd alpha.

Ab-so-lutely.  In fact, just yank it out, without any deprecation.

   % grep sparc $PORTSDIR/lang/gcc*/Makefile | grep -i alpha | wc -l
   0

None of the GCC ports in the FreeBSD Ports Collection (the oldest of 
which is gcc47 right now) has had any trace of alpha support left for 
years and years, and there have been zero requests/complaints.

Gerald


[PATCH] PR libstdc++/78870 support std::filesystem on Windows

2018-05-31 Thread Jonathan Wakely

This adds incomplete but functional support for std::filesystem and
std::experimental::filesystem on MinGW. In theory there should be no
changes to the existing behaviour for POSIX targets from this patch,
as all the various bugs I found while working on this have already
been fixed in separate patches.

Tested powerpc64le-linux, and x86_64-w64-mingw32 (with a few expected
FAILures on mingw-w64). Committed to trunk.


commit c5a8ea40f82117e98784b0342c2d873a97d990ef
Author: Jonathan Wakely 
Date:   Thu May 31 16:37:44 2018 +0100

PR libstdc++/78870 support std::filesystem on Windows

PR libstdc++/78870 support std::filesystem on Windows
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Check for link, readlink and symlink.
* include/bits/fs_path.h (path::operator/=(const path&)): Move
definition out of class body.
(path::is_absolute(), path::_M_append(path)): Likewise.
(operator<<(basic_ostream, const path&)): Use std::quoted directly.
(operator>>(basic_istream, path&)): Likewise.
(u8path): Reorder definitions and fix Windows implementation.
(path::is_absolute()): Define inline and fix for Windows.
[!_GLIBCXX_FILESYSTEM_IS_WINDOWS] (path::operator/=(const path&)):
Define POSIX version inline.
(path::_M_append(path)): Define inline.
* include/experimental/bits/fs_path.h (path::is_absolute()): Move
definition out of class body.
(operator<<(basic_ostream, const path&)): Fix type of delimiter and
escape characters.
(operator>>(basic_istream, path&)): Likewise.
(path::is_absolute()): Define inline and fix for Windows.
* src/filesystem/dir-common.h (__gnu_posix): New namespace.
(__gnu_posix::char_type, __gnu_posix::DIR, __gnu_posix::dirent)
(__gnu_posix::opendir, __gnu_posix::readdir, __gnu_posix::closedir):
Define as adaptors for Windows functions/types or as
using-declarations for POSIX functions/types.
(_Dir_base, get_file_type): Qualify names to use declarations from
__gnu_posix namespace.
(_Dir_base::is_dot_or_dotdot): New helper functions.
* src/filesystem/dir.cc (_Dir, recursive_directory_iterator): Qualify
names to use declarations from __gnu_posix namespace.
* src/filesystem/ops-common.h (__gnu_posix): New nested namespace.
(__gnu_posix::open, __gnu_posix::close, __gnu_posix::stat_type)
(__gnu_posix::stat, __gnu_posix::lstat, __gnu_posix::mode_t)
(__gnu_posix::chmod, __gnu_posix::mkdir, __gnu_posix::getcwd)
(__gnu_posix::chdir, __gnu_posix::utimbuf, __gnu_posix::utime)
(__gnu_posix::rename, __gnu_posix::truncate, __gnu_posix::char_type):
Define as adaptors for Windows functions/types or as
using-declarations for POSIX functions/types.
(stat_type, do_copy_file): Qualify names to use declarations from
__gnu_posix namespace.
(do_space): Declare new function.
(make_file_type): Only use S_ISLNK if defined.
* src/filesystem/ops.cc (char_ptr, filesystem::canonical): Use
path::value_type not char.
(filesystem::copy, create_dir, filesystem::create_directory): Qualify
names to use declarations from __gnu_posix namespace.
(filesystem::create_hard_link): Check HAVE_LINK autoconf macro and
add implementation for Windows.
(filesystem::create_symlink): Check HAVE_SYMLINK autoconf macro.
(filesystem::current_path(error_code&)): Use __gnu_posix::getcwd.
[!_PC_PATH_MAX]: Don't use pathconf.
[PATH_MAX]: Use if defined.
(filesystem::current_path(const path&, error_code&))
(filesystem::equivalent, do_stat, filesystem::hard_link_count)
(filesystem::last_write_time, filesystem::permissions): Use names
from __gnu_posix.
(filesystem::read_symlink): Check HAVE_READLINK autoconf macro.
(filesystem::remove) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Add
implementation for Windows.
(filesystem::rename, filesystem::resize_file): Use names from
__gnu_posix.
(filesystem::space): Use do_space.
[_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Get absolute path to directory.
(filesystem::status, filesystem::symlink_status): Use names from
__gnu_posix.
(filesystem::temp_directory_path): Add implementation for Windows.
* src/filesystem/path.cc (dot): Define constant.
(path::replace_extension): Use dot.
(path::_M_find_extension): Likewise. Use path::string_type not
std::string.
(path::_M_split_cmpts)

Re: [PATCH] rs6000: Fix mangling for 128-bit float

2018-05-31 Thread Segher Boessenkool
On Wed, May 30, 2018 at 06:43:23PM +0200, Jakub Jelinek wrote:
> On Wed, May 30, 2018 at 08:45:22AM -0500, Segher Boessenkool wrote:
> > > If you need to keep g for compatibility (you do), then why not just have
> > > e (long double is double)
> > > g (long double when matching __ibm128, or explicit __ibm128)
> > > u9__ieee128 (long double when matching __ieee128, or explicit __ieee128)
> > 
> > "g" means __float128.  Which is __ieee128.  And it has to be, because
> > so much code expects that already, and it will only become more.  But
> > "g" is demangled as __float128.  Confusion galore.
> 
> Is that such a big deal?  It hasn't been a problem in the past decade when
> it was similarly wrong too.  And, with the intent to phase out long double
> as __ibm128 (at least on powerpc64le) shortly only libstdc++ and perhaps a
> couple of other libraries will have this demangling glitch (and those will
> have it in any case, because there will be the compatibility aliases that
> will demangle as __float128).  So I really don't see any advantages of
> changing the __ibm128 and/or long double that is __ibm128 mangling from
> g to something else.  But I see severe disadvantages.
> 
> The libstdc++ changes to support both __ibm128 and __ieee128 will be very
> hard themselves, there is no need to complicate it further.

We're running some tests now with "g" mangling for everything double-double
(and u9__ieee128 for everything IEEE quad-precision float).
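
As a reminder of where the two spellings come from: in the Itanium C++ ABI
mangling, "g" is the builtin type code for __float128, while a name such as
u9__ieee128 is a vendor extended type, written as 'u' followed by the length
of the name and the name itself.  A small sketch of that second encoding,
with a made-up helper name:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the "vendor extended type" encoding, which
   is 'u', then the decimal length of NAME, then NAME.  Returns the number
   of characters that the full encoding needs, as snprintf does.  */
int
mangle_vendor_type (const char *name, char *out, size_t out_size)
{
  return snprintf (out, out_size, "u%zu%s", strlen (name), name);
}
```

Calling it with "__ieee128" produces u9__ieee128, because the name is nine
characters long.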

We think it will all work out, certainly for systems where double-double
in the future is just a bad memory (and for other systems the problems
are nothing new).

Thanks to you all for all the input,


Segher


[PATCH, rs6000 0/9] gimple folding of vector loads/stores + tests

2018-05-31 Thread Will Schmidt
Hi, 
  I've broken this set of patches up into bite-sized chunks for easier
review and management.  They'll be showing up as replies to this
message. 
#1-6 are straightforward tests to cover the variations of the vector
load and store intrinsics. These look much alike, but there really are
differences between them. :-)
#7 touches a few existing testcases to allow continued PASSing after the
gimple folding affects codegen.
#8 introduces the actual gimple-folding for the built-ins.
#9 adds support to allow _builtin_vec_xst() to take *double or *long
long as the third parameter, where it currently only allows *vector (of
double) or *vector (of long long).  Two of the new tests will fail
without this update.

The series has been successfully regtested on Linux -
P6,P7,P8(le,be),P9. 

Thanks,
-Will



[PATCH, i386]: __builtin_cpu_is() is not detecting bdver2 with Model = 0x02

2018-05-31 Thread Uros Bizjak
Hello!

As reported in the PR, AMDFAM15H model 0x2 should return
AMDFAM15H_BDVER2 subtype.

2018-05-31  Uros Bizjak  

PR target/85591
* config/i386/cpuinfo.c (get_amd_cpu): Return
AMDFAM15H_BDVER2 for AMDFAM15H model 0x2.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 8c9878c..a7bb9da 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -83,17 +83,20 @@ get_amd_cpu (unsigned int family, unsigned int model)
 /* AMD Family 15h "Bulldozer".  */
 case 0x15:
   __cpu_model.__cpu_type = AMDFAM15H;
+
+  if (model == 0x2)
+   __cpu_model.__cpu_subtype = AMDFAM15H_BDVER2;  
   /* Bulldozer version 1.  */
-  if ( model <= 0xf)
+  else if (model <= 0xf)
__cpu_model.__cpu_subtype = AMDFAM15H_BDVER1;
   /* Bulldozer version 2 "Piledriver" */
-  if (model >= 0x10 && model <= 0x2f)
+  else if (model <= 0x2f)
__cpu_model.__cpu_subtype = AMDFAM15H_BDVER2;  
   /* Bulldozer version 3 "Steamroller"  */
-  if (model >= 0x30 && model <= 0x4f)
+  else if (model <= 0x4f)
__cpu_model.__cpu_subtype = AMDFAM15H_BDVER3;
   /* Bulldozer version 4 "Excavator"   */
-  if (model >= 0x60 && model <= 0x7f)
+  else if (model <= 0x7f)
__cpu_model.__cpu_subtype = AMDFAM15H_BDVER4;
   break;
 /* AMD Family 16h "btver2" */
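
Read as straight-line C, the reordered checks in the diff above amount to
the following.  This is a stand-alone sketch for illustration; the enum
and function names are made up here and are not the libgcc ones.

```c
/* Hypothetical stand-alone model of the Family 15h subtype selection.
   None of these names exist in libgcc.  */
enum fam15h_subtype
{
  FAM15H_UNKNOWN,
  FAM15H_BDVER1,
  FAM15H_BDVER2,
  FAM15H_BDVER3,
  FAM15H_BDVER4
};

static enum fam15h_subtype
classify_fam15h (unsigned int model)
{
  /* Model 0x2 is reported as bdver2 even though it lies inside the
     0x0-0xf range, so it has to be tested before the range checks.  */
  if (model == 0x2)
    return FAM15H_BDVER2;
  else if (model <= 0xf)
    return FAM15H_BDVER1;
  else if (model <= 0x2f)
    return FAM15H_BDVER2;
  else if (model <= 0x4f)
    return FAM15H_BDVER3;
  else if (model <= 0x7f)
    return FAM15H_BDVER4;
  /* Anything above 0x7f is left unclassified.  */
  return FAM15H_UNKNOWN;
}
```

One side effect of chaining the comparisons: models 0x50 to 0x5f now fall
into the last bucket, whereas the earlier explicit ranges left them
without a subtype.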


[PATCH, rs6000 1/9] Testcase coverage for vec_xl() intrinsics

2018-05-31 Thread Will Schmidt
Hi,

Add testcase coverage for variations of the vec_xl() intrinsic.
Regtest clean across assorted Linux systems (p6-p9).
OK for trunk?
Thanks,
-Will

[testsuite]

2018-05-31  Will Schmidt  

* gcc.target/powerpc/fold-vec-load-vec_xl-char.c: New.
* gcc.target/powerpc/fold-vec-load-vec_xl-double.c: New.
* gcc.target/powerpc/fold-vec-load-vec_xl-float.c: New.
* gcc.target/powerpc/fold-vec-load-vec_xl-int.c: New.
* gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c: New.
* gcc.target/powerpc/fold-vec-load-vec_xl-short.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_xl-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_xl-char.c
new file mode 100644
index 000..2f60a71
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_xl-char.c
@@ -0,0 +1,37 @@
+/* Verify that overloaded built-ins for vec_xl with char
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+#define BUILD_VAR_TEST(TESTNAME1, RETTYPE, VAR_OFFSET, LOADFROM)   \
+RETTYPE \
+TESTNAME1 ## _var (VAR_OFFSET offset, LOADFROM * loadfrom) \
+{  \
+   return vec_xl (offset, loadfrom);   \
+}
+
+#define BUILD_CST_TEST(TESTNAME1, RETTYPE, CST_OFFSET, LOADFROM)   \
+RETTYPE \
+TESTNAME1 ## _cst (LOADFROM * loadfrom)\
+{  \
+   return vec_xl (CST_OFFSET, loadfrom);   \
+}
+
+BUILD_VAR_TEST( test1,  vector signed char, signed long long, vector signed char);
+BUILD_VAR_TEST( test2,  vector signed char, signed int, vector signed char);
+BUILD_CST_TEST( test3,  vector signed char, 12, vector signed char);
+BUILD_VAR_TEST( test4,  vector unsigned char, signed long long, vector unsigned char);
+BUILD_VAR_TEST( test5,  vector unsigned char, signed int, vector unsigned char);
+BUILD_CST_TEST( test6,  vector unsigned char, 12, vector unsigned char);
+
+BUILD_VAR_TEST( test7,  vector signed char, signed long long, signed char);
+BUILD_VAR_TEST( test8,  vector signed char, signed int, signed char);
+BUILD_CST_TEST( test9,  vector signed char, 12, signed char);
+BUILD_VAR_TEST( test10,  vector unsigned char, signed long long, unsigned char);
+BUILD_VAR_TEST( test11,  vector unsigned char, signed int, unsigned char);
+BUILD_CST_TEST( test12,  vector unsigned char, 12, unsigned char);
+
+/* { dg-final { scan-assembler-times "lxvw4x|lxvd2x|lxvx|lvx" 12 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_xl-double.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_xl-double.c
new file mode 100644
index 000..d99662a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_xl-double.c
@@ -0,0 +1,31 @@
+/* Verify that overloaded built-ins for vec_xl with double
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux*  } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+
+#define BUILD_VAR_TEST(TESTNAME1, RETTYPE, VAR_OFFSET, LOADFROM)   \
+RETTYPE \
+TESTNAME1 ## _var (VAR_OFFSET offset, LOADFROM * loadfrom) \
+{  \
+   return vec_xl (offset, loadfrom);   \
+}
+
+#define BUILD_CST_TEST(TESTNAME1, RETTYPE, CST_OFFSET, LOADFROM)   \
+RETTYPE \
+TESTNAME1 ## _cst (LOADFROM * loadfrom)\
+{  \
+   return vec_xl (CST_OFFSET, loadfrom);   \
+}
+
+BUILD_VAR_TEST( test1,  vector double, signed long long, vector double);
+BUILD_VAR_TEST( test2,  vector double, signed int, vector double);
+BUILD_CST_TEST( test3,  vector double, 12, vector double);
+BUILD_VAR_TEST( test4,  vector double, signed long long, double);
+BUILD_VAR_TEST( test5,  vector double, signed int, double);
+BUILD_CST_TEST( test6,  vector double, 12, double);
+
+/* { dg-final { scan-assembler-times "lxvd2x|lxvx|lvx" 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_xl-float.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_xl-float.c
new file mode 100644
index 000..365961c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_xl-float.c
@@ -0,0 +1,31 @@
+/* Verify that overloaded built-ins for vec_xl with float
+   inputs produce the right code.  */
+

[PATCH, rs6000 3/9] Testcase coverage for vec_vsx_ld() intrinsics

2018-05-31 Thread Will Schmidt
Hi,
Add testcase coverage for variations of the vec_vsx_ld() intrinsic.
Regtest clean across assorted Linux systems (p6-p9).
OK for trunk?
Thanks,
-Will

[testsuite]

2018-05-31  Will Schmidt  

* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c: New.
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c: New.
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c: New.
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c: New.
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c: New.
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c
new file mode 100644
index 000..19b0968
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c
@@ -0,0 +1,38 @@
+/* Verify that overloaded built-ins for vec_vsx_ld* with char
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux*  } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+#define BUILD_VAR_TEST(TESTNAME1, RETTYPE, VAR_OFFSET, LOADFROM)   \
+RETTYPE \
+TESTNAME1 ## _var (VAR_OFFSET offset, LOADFROM * loadfrom) \
+{  \
+   return vec_vsx_ld (offset, loadfrom);   \
+}
+
+#define BUILD_CST_TEST(TESTNAME1, RETTYPE, CST_OFFSET, LOADFROM)   \
+RETTYPE \
+TESTNAME1 ## _cst (LOADFROM * loadfrom)\
+{  \
+   return vec_vsx_ld (CST_OFFSET, loadfrom);   \
+}
+
+BUILD_VAR_TEST( test1,  vector signed char, signed long long, signed char);
+BUILD_VAR_TEST( test2,  vector signed char, signed int, signed char);
+BUILD_CST_TEST( test3,  vector signed char, 12, signed char);
+BUILD_VAR_TEST( test4,  vector unsigned char, signed long long, unsigned char);
+BUILD_VAR_TEST( test5,  vector unsigned char, signed int, unsigned char);
+BUILD_CST_TEST( test6,  vector unsigned char, 12, unsigned char);
+
+BUILD_VAR_TEST( test7,  vector signed char, signed long long, vector signed char);
+BUILD_VAR_TEST( test8,  vector signed char, signed int, vector signed char);
+BUILD_CST_TEST( test9,  vector signed char, 12, vector signed char);
+BUILD_VAR_TEST( test10,  vector unsigned char, signed long long, vector unsigned char);
+BUILD_VAR_TEST( test11,  vector unsigned char, signed int, vector unsigned char);
+BUILD_CST_TEST( test12,  vector unsigned char, 12, vector unsigned char);
+
+/* { dg-final { scan-assembler-times "lxvw4x|lxvd2x|lxvx|lvx" 12 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c
new file mode 100644
index 000..f01d0bd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c
@@ -0,0 +1,31 @@
+/* Verify that overloaded built-ins for vec_vsx_ld* with double
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+#define BUILD_VAR_TEST(TESTNAME1, RETTYPE, VAR_OFFSET, LOADFROM)   \
+RETTYPE \
+TESTNAME1 ## _var (VAR_OFFSET offset, LOADFROM * loadfrom) \
+{  \
+   return vec_vsx_ld (offset, loadfrom);   \
+}
+
+#define BUILD_CST_TEST(TESTNAME1, RETTYPE, CST_OFFSET, LOADFROM)   \
+RETTYPE \
+TESTNAME1 ## _cst (LOADFROM * loadfrom)\
+{  \
+   return vec_vsx_ld (CST_OFFSET, loadfrom);   \
+}
+
+BUILD_VAR_TEST( test1, vector  double, long long, double);
+BUILD_VAR_TEST( test2, vector  double, int, double);
+BUILD_CST_TEST( test3, vector  double, 12, double);
+
+BUILD_VAR_TEST( test4, vector  double, int, vector double);
+BUILD_VAR_TEST( test5, vector  double, long long, vector double);
+BUILD_CST_TEST( test6, vector  double, 12, vector double);
+
+/* { dg-final { scan-assembler-times "lxvd2x|lxvx|lvx" 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c
new file mode 100644
index 000..8236d8b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c
@@ -0,0 +1,31 @@
+/* Ver

[PATCH, rs6000 2/9] Testcase coverage for builtin_vec_xl() intrinsics

2018-05-31 Thread Will Schmidt
Hi,
Add testcase coverage for variations of the builtin_vec_xl() intrinsic.
Regtest clean across assorted Linux systems (p6-p9).
OK for trunk?
Thanks,
-Will

[testsuite]

2018-05-31  Will Schmidt  

* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c: New.
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c: New.
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c: New.
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c: New.
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c: New.
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c
new file mode 100644
index 000..1fc5f16
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c
@@ -0,0 +1,38 @@
+/* Verify that overloaded built-ins for __builtin_vec_xl with char
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux*  } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+
+#define BUILD_VAR_TEST(TESTNAME1, RETTYPE, VAR_OFFSET, LOADFROM)\
+RETTYPE\
+TESTNAME1 ## _var (VAR_OFFSET offset, LOADFROM * loadfrom) \
+{  \
+   return __builtin_vec_xl (offset, loadfrom); \
+}
+
+#define BUILD_CST_TEST(TESTNAME1, RETTYPE, CST_OFFSET, LOADFROM)   \
+RETTYPE\
+TESTNAME1 ## _cst (LOADFROM * loadfrom)\
+{  \
+   return __builtin_vec_xl (CST_OFFSET, loadfrom); \
+}
+
+BUILD_VAR_TEST( test1, vector signed char,   signed long long, signed char);
+BUILD_VAR_TEST( test2, vector signed char,   signed int, signed char);
+BUILD_CST_TEST( test3, vector signed char,   2, signed char);
+BUILD_VAR_TEST( test4, vector unsigned char, signed long long, unsigned char);
+BUILD_VAR_TEST( test5 ,vector unsigned char, signed int, unsigned char);
+BUILD_CST_TEST( test6, vector unsigned char, 4, unsigned char);
+
+BUILD_VAR_TEST( test7, vector signed char,   signed long long, vector signed char);
+BUILD_VAR_TEST( test8, vector signed char,   signed int, vector signed char);
+BUILD_CST_TEST( test9, vector signed char,   6, vector signed char);
+BUILD_VAR_TEST( test10, vector unsigned char, signed long long, vector unsigned char);
+BUILD_VAR_TEST( test11, vector unsigned char, signed int, vector unsigned char);
+BUILD_CST_TEST( test12, vector unsigned char, 8, vector unsigned char);
+
+/* { dg-final { scan-assembler-times "lxvw4x|lxvd2x|lxvx|lvx" 12 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c
new file mode 100644
index 000..7fbc65d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c
@@ -0,0 +1,32 @@
+/* Verify that overloaded built-ins for __builtin_vec_xl with double
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux*  } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+
+#define BUILD_VAR_TEST(TESTNAME1, RETTYPE, VAR_OFFSET, LOADFROM)\
+RETTYPE\
+TESTNAME1 ## _var (VAR_OFFSET offset, LOADFROM * loadfrom) \
+{  \
+   return __builtin_vec_xl (offset, loadfrom); \
+}
+
+#define BUILD_CST_TEST(TESTNAME1, RETTYPE, CST_OFFSET, LOADFROM)   \
+RETTYPE\
+TESTNAME1 ## _cst (LOADFROM * loadfrom)\
+{  \
+   return __builtin_vec_xl (CST_OFFSET, loadfrom); \
+}
+
+BUILD_VAR_TEST( test1, vector double, signed long long, double);
+BUILD_VAR_TEST( test2, vector double, signed int, double);
+BUILD_CST_TEST( test3, vector double, 12, double);
+
+BUILD_VAR_TEST( test4, vector double, signed long long, vector double);
+BUILD_VAR_TEST( test5, vector double, signed int, vector double);
+BUILD_CST_TEST( test6, vector double, 12, vector double);
+
+/* { dg-final { scan-assembler-times "lxvd2x|lxvx|lvx" 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c
new file mode 100644
index 000..0743b0e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c
@@ -0,0 +1,32 @@
+/* Verify that overloaded built-

[PATCH, rs6000 5/9] Testcase coverage for builtin_vec_xst() intrinsics

2018-05-31 Thread Will Schmidt
Hi,
Add testcase coverage for variations of the builtin_vec_xst() intrinsic.
Regtest clean across assorted Linux systems (p6-p9).
OK for trunk?
Thanks,
-Will


[testsuite]

2018-05-31  Will Schmidt  

* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c: New.
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c: New.
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c: New.
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c: New.
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-longlong.c: New.
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c
new file mode 100644
index 000..6bfbb0b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c
@@ -0,0 +1,38 @@
+/* Verify that overloaded built-ins for __builtin_vec_xst with char
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux*  } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+
+#define BUILD_VAR_TEST(TESTNAME1, VALUE, VAR_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _var (VALUE value, VAR_OFFSET offset, SAVETO * saveto)\
+{  \
+   __builtin_vec_xst (value, offset, saveto);  \
+}
+
+#define BUILD_CST_TEST(TESTNAME1, VALUE, CST_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _cst (VALUE value, SAVETO * saveto)   \
+{  \
+   __builtin_vec_xst (value, CST_OFFSET, saveto);  \
+}
+
+BUILD_VAR_TEST( test1,  vector signed char, signed long long, signed char );
+BUILD_VAR_TEST( test2,  vector signed char, signed int, signed char );
+BUILD_CST_TEST( test3,  vector signed char, 12, signed char );
+BUILD_VAR_TEST( test4,  vector unsigned char, signed long long, unsigned char );
+BUILD_VAR_TEST( test5,  vector unsigned char, signed int, unsigned char );
+BUILD_CST_TEST( test6,  vector unsigned char, 12, unsigned char );
+
+BUILD_VAR_TEST( test7,  vector signed char, signed long long, vector signed char );
+BUILD_VAR_TEST( test8,  vector signed char, signed int, vector signed char );
+BUILD_CST_TEST( test9,  vector signed char, 12, vector signed char );
+BUILD_VAR_TEST( test10,  vector unsigned char, signed long long, vector unsigned char );
+BUILD_VAR_TEST( test11,  vector unsigned char, signed int, vector unsigned char );
+BUILD_CST_TEST( test12,  vector unsigned char, 12, vector unsigned char );
+
+/* { dg-final { scan-assembler-times "stxvw4x|stxvd2x|stxvx|stvx" 12 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c
new file mode 100644
index 000..a64558c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c
@@ -0,0 +1,32 @@
+/* Verify that overloaded built-ins for __builtin_vec_xst with double
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux*  } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#define BUILD_VAR_TEST(TESTNAME1, VALUE, VAR_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _var (VALUE value, VAR_OFFSET offset, SAVETO * saveto)\
+{  \
+   __builtin_vec_xst (value, offset, saveto);  \
+}
+
+#define BUILD_CST_TEST(TESTNAME1, VALUE, CST_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _cst (VALUE value, SAVETO * saveto)   \
+{  \
+   __builtin_vec_xst (value, CST_OFFSET, saveto);  \
+}
+
+#include <altivec.h>
+
+BUILD_VAR_TEST( test1,  vector double, signed long long, vector double );
+BUILD_VAR_TEST( test2,  vector double, signed int, vector double );
+BUILD_CST_TEST( test3,  vector double, 12, vector double );
+
+BUILD_VAR_TEST( testvld,  vector double, signed long long, double );
+BUILD_VAR_TEST( testvid,  vector double, signed int, double );
+BUILD_CST_TEST( testcd,  vector double, 12, double );
+
+/* { dg-final { scan-assembler-times "stxvd2x|stxvx|stvx" 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c
new file mode 100644
index 000..82a1e20
--- /dev/null
+

[PATCH, rs6000 6/9] Testcase coverage for vec_vsx_st() intrinsics

2018-05-31 Thread Will Schmidt
Hi,
   
New testcases to cover variations of the vec_vsx_st() intrinsic.
Regtest clean across assorted Linux systems (p6-p9).
OK for trunk?
Thanks,
-Will

[testsuite]

2018-05-31  Will Schmidt  

* gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c: New.
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c: New.
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c: New.
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c: New.
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c: New.
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c
new file mode 100644
index 000..e512c43
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c
@@ -0,0 +1,37 @@
+/* Verify that overloaded built-ins for vec_vsx_st with char
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux*  } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+#define BUILD_VAR_TEST(TESTNAME1, VALUE, VAR_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _var (VALUE value, VAR_OFFSET offset, SAVETO * saveto)\
+{  \
+   vec_vsx_st (value, offset, saveto); \
+}
+
+#define BUILD_CST_TEST(TESTNAME1, VALUE, CST_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _cst (VALUE value, SAVETO * saveto)   \
+{  \
+   vec_vsx_st (value, CST_OFFSET, saveto); \
+}
+
+BUILD_VAR_TEST( test1,  vector signed char, signed long long, signed char );
+BUILD_VAR_TEST( test2,  vector signed char, signed int, signed char );
+BUILD_CST_TEST( test3,  vector signed char, 12, signed char );
+BUILD_VAR_TEST( test4,  vector unsigned char, signed long long, unsigned char );
+BUILD_VAR_TEST( test5,  vector unsigned char, signed int, unsigned char );
+BUILD_CST_TEST( test6,  vector unsigned char, 12, unsigned char );
+
+BUILD_VAR_TEST( test7,  vector signed char, signed long long, vector signed char );
+BUILD_VAR_TEST( test8,  vector signed char, signed int, vector signed char );
+BUILD_CST_TEST( test9,  vector signed char, 12, vector signed char );
+BUILD_VAR_TEST( test10, vector unsigned char, signed long long, vector unsigned char );
+BUILD_VAR_TEST( test11, vector unsigned char, signed int, vector unsigned char );
+BUILD_CST_TEST( test12, vector unsigned char, 12, vector unsigned char );
+
+/* { dg-final { scan-assembler-times "stxvw4x|stxvd2x|stxvx|stvx" 12 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c
new file mode 100644
index 000..7755860
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c
@@ -0,0 +1,31 @@
+/* Verify that overloaded built-ins for vec_vsx_st with double
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux*  } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+#define BUILD_VAR_TEST(TESTNAME1, VALUE, VAR_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _var (VALUE value, VAR_OFFSET offset, SAVETO * saveto)\
+{  \
+   vec_vsx_st (value, offset, saveto); \
+}
+
+#define BUILD_CST_TEST(TESTNAME1, VALUE, CST_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _cst (VALUE value, SAVETO * saveto)   \
+{  \
+   vec_vsx_st (value, CST_OFFSET, saveto); \
+}
+
+BUILD_VAR_TEST( test1,  vector double, signed long long, double );
+BUILD_VAR_TEST( test2,  vector double, signed int, double );
+BUILD_CST_TEST( test3,  vector double, 12, double );
+
+BUILD_VAR_TEST( test7,  vector double, signed long long, vector double );
+BUILD_VAR_TEST( test8,  vector double, signed int, vector double );
+BUILD_CST_TEST( test9,  vector double, 12, vector double );
+
+/* { dg-final { scan-assembler-times "stxvd2x|stxvx|stvx" 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c
new file mode 100644
index 000..f8f2d82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c
@@ -0,0 +1,31 @@
+/* Verify that overloaded built-ins for vec_vsx_st 

[PATCH, rs6000 7/9] testcase updates for unaligned loads/stores

2018-05-31 Thread Will Schmidt
Hi,
  Assorted updates to existing tests to compensate for codegen changes
introduced with the gimple-folding of unaligned vector loads and stores.
Regtest clean across assorted Linux systems (p6-p9).
OK for trunk?
Thanks,
-Will

[testsuite]

2018-05-31  Will Schmidt  

* gcc.target/powerpc/p8-vec-xl-xst-v2.c: New.
* gcc.target/powerpc/p8-vec-xl-xst.c: Disable gimple-folding.
* gcc.target/powerpc/swaps-p8-17.c: Same.

diff --git a/gcc/testsuite/gcc.target/powerpc/p8-vec-xl-xst-v2.c b/gcc/testsuite/gcc.target/powerpc/p8-vec-xl-xst-v2.c
new file mode 100644
index 000..3315c5f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p8-vec-xl-xst-v2.c
@@ -0,0 +1,64 @@
+/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O2" } */
+
+/* Verify fix for problem where vec_xl and vec_xst are not recognized
+   for the vector char and vector short cases on P8 only.
+   This test duplicates p8-vec-xl-xst.c , except that it allows gimple-folding,
+   which changes the expected codegen.  */
+
+#include <altivec.h>
+
+vector unsigned char
+foo (unsigned char * address)
+{
+  return __builtin_vec_xl (0, address);
+}
+
+void
+bar (vector unsigned char x, unsigned char * address)
+{
+  __builtin_vec_xst (x, 0, address);
+}
+
+vector unsigned short
+foot (unsigned short * address)
+{
+  return __builtin_vec_xl (0, address);
+}
+
+void
+bart (vector unsigned short x, unsigned short * address)
+{
+  __builtin_vec_xst (x, 0, address);
+}
+
+vector unsigned char
+fool (unsigned char * address)
+{
+  return vec_xl (0, address);
+}
+
+void
+barl (vector unsigned char x, unsigned char * address)
+{
+  vec_xst (x, 0, address);
+}
+
+vector unsigned short
+footle (unsigned short * address)
+{
+  return vec_xl (0, address);
+}
+
+void
+bartle (vector unsigned short x, unsigned short * address)
+{
+  vec_xst (x, 0, address);
+}
+
+/* { dg-final { scan-assembler-times "lvx" 4 } } */
+/* { dg-final { scan-assembler-times "stvx"  4 } } */
+/* { dg-final { scan-assembler-times "xxpermdi" 0 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/p8-vec-xl-xst.c b/gcc/testsuite/gcc.target/powerpc/p8-vec-xl-xst.c
index bbf7d91..06f3457 100644
--- a/gcc/testsuite/gcc.target/powerpc/p8-vec-xl-xst.c
+++ b/gcc/testsuite/gcc.target/powerpc/p8-vec-xl-xst.c
@@ -1,10 +1,11 @@
 /* { dg-do compile { target { powerpc64le-*-* } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
-/* { dg-options "-mcpu=power8 -O2" } */
+/* { dg-options "-mcpu=power8 -O2 -mno-fold-gimple" } */
+/* { dg-prune-output "gimple folding of rs6000 builtins has been disabled." } */
 
 /* Verify fix for problem where vec_xl and vec_xst are not recognized
for the vector char and vector short cases on P8 only.  */
 
 #include <altivec.h>
diff --git a/gcc/testsuite/gcc.target/powerpc/swaps-p8-17.c b/gcc/testsuite/gcc.target/powerpc/swaps-p8-17.c
index 7a9cfbf..889bbf7 100644
--- a/gcc/testsuite/gcc.target/powerpc/swaps-p8-17.c
+++ b/gcc/testsuite/gcc.target/powerpc/swaps-p8-17.c
@@ -1,8 +1,9 @@
 /* { dg-do compile { target { powerpc64le-*-* } } } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
-/* { dg-options "-mcpu=power8 -O1" } */
+/* { dg-options "-mcpu=power8 -O1 -mno-fold-gimple" } */
+/* { dg-prune-output "gimple folding of rs6000 builtins has been disabled." } */
 /* { dg-final { scan-assembler "lxvd2x" } } */
 /* { dg-final { scan-assembler "xxpermdi" } } */
 
 /* Verify that we don't try to do permute removal in the presence of
vec_ste.  This used to ICE.  */




[PATCH, rs6000 4/9] Testcase coverage for vec_xst() intrinsics

2018-05-31 Thread Will Schmidt
Hi,
Testcase coverage for variations of the vec_xst() intrinsic.
Regtest clean across assorted Linux systems (p6-p9).
OK for trunk?
Thanks,
-Will

[testsuite]

2018-05-31  Will Schmidt  

* gcc.target/powerpc/fold-vec-store-vec_xst-char.c: New.
* gcc.target/powerpc/fold-vec-store-vec_xst-double.c: New.
* gcc.target/powerpc/fold-vec-store-vec_xst-float.c: New.
* gcc.target/powerpc/fold-vec-store-vec_xst-int.c: New.
* gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c: New.
* gcc.target/powerpc/fold-vec-store-vec_xst-short.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_xst-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_xst-char.c
new file mode 100644
index 000..76dacf5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_xst-char.c
@@ -0,0 +1,37 @@
+/* Verify that overloaded built-ins for vec_xst with char
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux*  } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+#define BUILD_VAR_TEST(TESTNAME1, VALUE, VAR_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _var (VALUE value, VAR_OFFSET offset, SAVETO * saveto)\
+{  \
+   vec_xst (value, offset, saveto);\
+}
+
+#define BUILD_CST_TEST(TESTNAME1, VALUE, CST_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _cst (VALUE value, SAVETO * saveto)   \
+{  \
+   vec_xst (value, CST_OFFSET, saveto);\
+}
+
+BUILD_VAR_TEST( test1,  vector signed char, signed long long, signed char );
+BUILD_VAR_TEST( test2,  vector signed char, signed int, signed char );
+BUILD_CST_TEST( test3,  vector signed char, 12, signed char );
+BUILD_VAR_TEST( test4,  vector unsigned char, signed long long, unsigned char );
+BUILD_VAR_TEST( test5,  vector unsigned char, signed int, unsigned char );
+BUILD_CST_TEST( test6,  vector unsigned char, 12, unsigned char );
+
+BUILD_VAR_TEST( test7,  vector signed char, signed long long, vector signed char );
+BUILD_VAR_TEST( test8,  vector signed char, signed int, vector signed char );
+BUILD_CST_TEST( test9,  vector signed char, 12, vector signed char );
+BUILD_VAR_TEST( test10, vector unsigned char, signed long long, vector unsigned char );
+BUILD_VAR_TEST( test11, vector unsigned char, signed int, vector unsigned char );
+BUILD_CST_TEST( test12, vector unsigned char, 12, vector unsigned char );
+
+/* { dg-final { scan-assembler-times "stxvw4x|stxvd2x|stxvx|stvx" 12 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_xst-double.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_xst-double.c
new file mode 100644
index 000..a9cf409
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_xst-double.c
@@ -0,0 +1,31 @@
+/* Verify that overloaded built-ins for vec_xst with double
+   inputs produce the right code.  */
+
+/* { dg-do compile { target { powerpc*-*-linux*  } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+#define BUILD_VAR_TEST(TESTNAME1, VALUE, VAR_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _var (VALUE value, VAR_OFFSET offset, SAVETO * saveto)\
+{  \
+   vec_xst (value, offset, saveto);\
+}
+
+#define BUILD_CST_TEST(TESTNAME1, VALUE, CST_OFFSET, SAVETO)   \
+void   \
+TESTNAME1 ## _cst (VALUE value, SAVETO * saveto)   \
+{  \
+   vec_xst (value, CST_OFFSET, saveto);\
+}
+
+BUILD_VAR_TEST( test1,  vector double, signed long long, double );
+BUILD_VAR_TEST( test2,  vector double, signed int, double );
+BUILD_CST_TEST( test3,  vector double, 12, double );
+
+BUILD_VAR_TEST( test7,  vector double, signed long long, vector double );
+BUILD_VAR_TEST( test8,  vector double, signed int, vector double );
+BUILD_CST_TEST( test9,  vector double, 12, vector double );
+
+/* { dg-final { scan-assembler-times "stxvd2x|stxvx|stvx" 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_xst-float.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_xst-float.c
new file mode 100644
index 000..a5c805b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-vec_xst-float.c
@@ -0,0 +1,31 @@
+/* Verify that overloaded built-ins for vec_xst with float
+   inputs produce the right code.  */
+

[PATCH, rs6000 8/9] enable gimple folding for vec_xl, vec_xst

2018-05-31 Thread Will Schmidt
Hi, 
  Add support for gimple folding of unaligned vector loads and stores.
Testcases are posted separately in this thread.

Regtest completed across a variety of systems: P6, P7, P8, P9.
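
As a rough model of what the folded form computes (plain C for
illustration, not the GIMPLE that the patch emits): the offset is
converted to a size type, added to the address as a byte offset, and the
vector is read or written through the resulting pointer with no alignment
requirement.  The names vec16, model_vec_xl and model_vec_xst are
hypothetical, and memcpy stands in for the MEM_REF.  The sketch assumes a
non-negative offset.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical 16-byte stand-in for a VSX vector type.  */
typedef struct { unsigned char b[16]; } vec16;

/* Model of the folded load.  Assumes OFFSET is non-negative.  */
static vec16
model_vec_xl (long long offset, const void *addr)
{
  /* The offset is converted to a size type first ...  */
  size_t temp_offset = (size_t) offset;
  /* ... then added to the address as a byte count ...  */
  const unsigned char *temp_addr = (const unsigned char *) addr + temp_offset;
  /* ... and the vector is read from there with no further offset.  */
  vec16 lhs;
  memcpy (&lhs, temp_addr, sizeof lhs);
  return lhs;
}

/* Model of the folded store.  Assumes OFFSET is non-negative.  */
static void
model_vec_xst (vec16 value, long long offset, void *addr)
{
  size_t temp_offset = (size_t) offset;
  unsigned char *temp_addr = (unsigned char *) addr + temp_offset;
  /* No low-order address bits are masked off; the store is unaligned.  */
  memcpy (temp_addr, &value, sizeof value);
}
```

The point of doing this at the GIMPLE level is that the load or store
becomes an ordinary memory reference that later passes can see through,
rather than an opaque builtin call.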

OK for trunk?
Thanks,
-Will

[gcc]

2018-05-31 Will Schmidt 

* config/rs6000/rs6000.c (rs6000_builtin_valid_without_lhs): Add vec_xst
variants to the list.
(rs6000_gimple_fold_builtin): Add support for folding unaligned vector
loads and stores.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index d62abdf..54b7de2 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15360,10 +15360,16 @@ rs6000_builtin_valid_without_lhs (enum rs6000_builtins fn_code)
 case ALTIVEC_BUILTIN_STVX_V8HI:
 case ALTIVEC_BUILTIN_STVX_V4SI:
 case ALTIVEC_BUILTIN_STVX_V4SF:
 case ALTIVEC_BUILTIN_STVX_V2DI:
 case ALTIVEC_BUILTIN_STVX_V2DF:
+case VSX_BUILTIN_STXVW4X_V16QI:
+case VSX_BUILTIN_STXVW4X_V8HI:
+case VSX_BUILTIN_STXVW4X_V4SF:
+case VSX_BUILTIN_STXVW4X_V4SI:
+case VSX_BUILTIN_STXVD2X_V2DF:
+case VSX_BUILTIN_STXVD2X_V2DI:
   return true;
 default:
   return false;
 }
 }
@@ -15869,10 +15875,77 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
gimple_set_location (g, loc);
gsi_replace (gsi, g, true);
return true;
   }
 
+/* unaligned Vector loads.  */
+case VSX_BUILTIN_LXVW4X_V16QI:
+case VSX_BUILTIN_LXVW4X_V8HI:
+case VSX_BUILTIN_LXVW4X_V4SF:
+case VSX_BUILTIN_LXVW4X_V4SI:
+case VSX_BUILTIN_LXVD2X_V2DF:
+case VSX_BUILTIN_LXVD2X_V2DI:
+  {
+arg0 = gimple_call_arg (stmt, 0);  // offset
+arg1 = gimple_call_arg (stmt, 1);  // address
+lhs = gimple_call_lhs (stmt);
+location_t loc = gimple_location (stmt);
+/* Since arg1 may be cast to a different type, just use ptr_type_node
+   here instead of trying to enforce TBAA on pointer types.  */
+tree arg1_type = ptr_type_node;
+tree lhs_type = TREE_TYPE (lhs);
+/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
+   the tree using the value from arg0.  The resulting type will match
+   the type of arg1.  */
+gimple_seq stmts = NULL;
+tree temp_offset = gimple_convert (&stmts, loc, sizetype, arg0);
+tree temp_addr = gimple_build (&stmts, loc, POINTER_PLUS_EXPR,
+  arg1_type, arg1, temp_offset);
+gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+/* Use the build2 helper to set up the mem_ref.  The MEM_REF could also
+   take an offset, but since we've already incorporated the offset
+   above, here we just pass in a zero.  */
+gimple *g;
+g = gimple_build_assign (lhs, build2 (MEM_REF, lhs_type, temp_addr,
+   build_int_cst (arg1_type, 0)));
+gimple_set_location (g, loc);
+gsi_replace (gsi, g, true);
+return true;
+  }
+
+/* unaligned Vector stores.  */
+case VSX_BUILTIN_STXVW4X_V16QI:
+case VSX_BUILTIN_STXVW4X_V8HI:
+case VSX_BUILTIN_STXVW4X_V4SF:
+case VSX_BUILTIN_STXVW4X_V4SI:
+case VSX_BUILTIN_STXVD2X_V2DF:
+case VSX_BUILTIN_STXVD2X_V2DI:
+  {
+arg0 = gimple_call_arg (stmt, 0); /* Value to be stored.  */
+arg1 = gimple_call_arg (stmt, 1); /* Offset.  */
+tree arg2 = gimple_call_arg (stmt, 2); /* Store-to address.  */
+location_t loc = gimple_location (stmt);
+tree arg0_type = TREE_TYPE (arg0);
+/* Use ptr_type_node (no TBAA) for the arg2_type.  */
+tree arg2_type = ptr_type_node;
+/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
+   the tree using the value from arg1.  The resulting type will match
+   the type of arg2.  */
+gimple_seq stmts = NULL;
+tree temp_offset = gimple_convert (&stmts, loc, sizetype, arg1);
+tree temp_addr = gimple_build (&stmts, loc, POINTER_PLUS_EXPR,
+  arg2_type, arg2, temp_offset);
+/* Mask off any lower bits from the address.  */
+gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+gimple *g;
+g = gimple_build_assign (build2 (MEM_REF, arg0_type, temp_addr,
+  build_int_cst (arg2_type, 0)), arg0);
+gimple_set_location (g, loc);
+gsi_replace (gsi, g, true);
+return true;
+  }
+
 /* Vector Fused multiply-add (fma).  */
 case ALTIVEC_BUILTIN_VMADDFP:
 case VSX_BUILTIN_XVMADDDP:
 case ALTIVEC_BUILTIN_VMLADDUHM:
   {
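
The effect of the folding can be sketched in plain C.  This is a model
only, not the rs6000 implementation: v4sf_model, model_vec_xl and
model_vec_xst are hypothetical names, and memcpy stands in for the
MEM_REF that the patch builds at address + offset, with no alignment
assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector of four floats.  */
typedef struct { float e[4]; } v4sf_model;

/* Model of the folded load: form ADDR + OFFSET as a byte address, then
   read one whole vector from it.  memcpy makes no alignment assumption,
   which mirrors the "unaligned" nature of the builtin.  */
static v4sf_model
model_vec_xl (long offset, const void *addr)
{
  assert (addr != NULL);
  v4sf_model result;
  const char *byte_addr = (const char *) addr + (ptrdiff_t) offset;
  memcpy (&result, byte_addr, sizeof result);
  return result;
}

/* Model of the folded store: same address computation, then write the
   whole vector.  The argument order (value, offset, address) follows
   the comments in the patch.  */
static void
model_vec_xst (v4sf_model value, long offset, void *addr)
{
  assert (addr != NULL);
  char *byte_addr = (char *) addr + (ptrdiff_t) offset;
  memcpy (byte_addr, &value, sizeof value);
}
```

In this model the offset is counted in bytes, so loading at offset
sizeof (float) from a float array starts at its second element.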




[PATCH, rs6000 9/9] Enable some additional combinations for builtin_vec_xst

2018-05-31 Thread Will Schmidt
Hi,
 Enable some variations of the _builtin_vec_xst() intrinsic.
For most data types (char, short, int, ...) the _builtin_vec_xst() intrinsic
accepts either a pointer to <type> or a pointer to a vector of elements of
<type>.  We currently do not accept pointer-to-type for the long long or
double data types.

This adds the combinations to accept *double and both signed and unsigned
*long long for the intrinsic.

Testcase coverage is provided by the fold-vec-load-* and fold-vec-store-*
series of tests, posted separately.

Regtest is clean across P6-P9 Linux platforms.
OK for trunk?
Thanks,
-Will

[gcc]

2018-05-31  Will Schmidt  

* config/rs6000/rs6000-c.c: Add BUILTIN_VEC_XST entries for *double
and *long long.

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 98a812e..61ff4e2 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -4079,13 +4079,19 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
 RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V16QI },
   { ALTIVEC_BUILTIN_VEC_STVRXL, ALTIVEC_BUILTIN_STVRXL,
 RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI },
   { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_STXVD2X_V2DF,
 RS6000_BTI_void, RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_STXVD2X_V2DF,
+RS6000_BTI_void, RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_double },
   { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_STXVD2X_V2DI,
 RS6000_BTI_void, RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI },
   { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_STXVD2X_V2DI,
+RS6000_BTI_void, RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_long_long },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_STXVD2X_V2DI, RS6000_BTI_void,
+RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_long_long },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_STXVD2X_V2DI,
 RS6000_BTI_void, RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI,
 ~RS6000_BTI_unsigned_V2DI },
   { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_STXVD2X_V2DI,
 RS6000_BTI_void, RS6000_BTI_bool_V2DI, RS6000_BTI_INTSI,
 ~RS6000_BTI_bool_V2DI },





committed: [PATCH][Middle-end][version 3]2nd patch of PR78809 and PR83026

2018-05-31 Thread Qing Zhao
Hi, 

I have committed the patch as revision 261039.

thanks.

Qing

> On May 29, 2018, at 7:08 PM, Qing Zhao  wrote:
> 
> Hi, Jeff,
> 
> Thanks a lot for your review and comments.
> 
> I have updated my patch based on your suggestion, and retested this whole 
> patch on both X86 and aarch64.
> 
> please take a look at the patch again.
> 
> thanks.
> 
> Qing
> 
>> On May 25, 2018, at 3:38 PM, Jeff Law  wrote:
> 
>> So I originally thought you had the core logic wrong in the immediate
>> uses loop.  But it's actually the case that the return value is the
>> exact opposite of what I expected.
>> 
>> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
>> was not transformed.
>> 
>> Can you fix that so it's not so confusing?
>> 
>> I think with that change we'll be good to go, but please repost for a
>> final looksie.
>> 
>> Thanks,
>> Jeff


[wwwdocs] Add (empty for now) GCC 9 release notes

2018-05-31 Thread Gerald Pfeifer
With that in place, please go ahead and start filling this.

For GCC 8 many items came late, and I'm sure that we are still
missing many that would have been worth documenting here.

Gerald

Applied as follows...

Index: changes.html
===
RCS file: changes.html
diff -N changes.html
--- /dev/null   1 Jan 1970 00:00:00 -
+++ changes.html   31 May 2018 21:08:33 -
@@ -0,0 +1,137 @@
+
+
+GCC 9 Release Series — Changes, New Features, and Fixes
+
+
+
+
+
+GCC 9 Release Series: Changes, New Features, and Fixes
+
+
+This page is a "brief" summary of some of the huge number of improvements
+in GCC 9.
+
+
+
+Note: GCC 9 has not been released yet, so this document is
+a work-in-progress.
+
+
+Caveats
+
+  ...
+
+
+
+
+General Improvements
+
+
+
+
+New Languages and Language specific improvements
+
+New Targets and Target Specific Improvements
+
+Operating Systems
+


libgo patch committed: Update to Go 1.10.2 release

2018-05-31 Thread Ian Lance Taylor
This patch to libgo updates it to the 1.10.2 release.  Bootstrapped
and ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 260913)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-9731580e76c065b76e3a103356bb8920da05a685
+79eca4fd642724d89e9bec8f79889451f6632a46
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/MERGE
===
--- libgo/MERGE (revision 260048)
+++ libgo/MERGE (working copy)
@@ -1,4 +1,4 @@
-bf86aec25972f3a100c3aa58a6abcbcc35bdea49
+71bdbf431b79dff61944f22c25c7e085ccfc25d5
 
 The first line of this file holds the git revision number of the
 last merge done from the master library sources.
Index: libgo/VERSION
===
--- libgo/VERSION   (revision 260048)
+++ libgo/VERSION   (working copy)
@@ -1 +1 @@
-go1.10
+go1.10.2
Index: libgo/check-packages.txt
===
--- libgo/check-packages.txt(revision 260048)
+++ libgo/check-packages.txt(working copy)
@@ -122,6 +122,7 @@ net/http/httptest
 net/http/httptrace
 net/http/httputil
 net/http/internal
+net/http/pprof
 net/internal/socktest
 net/mail
 net/rpc
Index: libgo/go/archive/zip/reader.go
===
--- libgo/go/archive/zip/reader.go  (revision 260048)
+++ libgo/go/archive/zip/reader.go  (working copy)
@@ -366,7 +366,7 @@ parseExtras:
epoch := time.Date(1601, time.January, 1, 0, 0, 0, 0, time.UTC)
modified = time.Unix(epoch.Unix()+secs, nsecs)
}
-   case unixExtraID:
+   case unixExtraID, infoZipUnixExtraID:
if len(fieldBuf) < 8 {
continue parseExtras
}
@@ -378,12 +378,6 @@ parseExtras:
continue parseExtras
}
ts := int64(fieldBuf.uint32()) // ModTime since Unix epoch
-   modified = time.Unix(ts, 0)
-   case infoZipUnixExtraID:
-   if len(fieldBuf) < 4 {
-   continue parseExtras
-   }
-   ts := int64(fieldBuf.uint32()) // ModTime since Unix epoch
modified = time.Unix(ts, 0)
}
}
Index: libgo/go/archive/zip/reader_test.go
===
--- libgo/go/archive/zip/reader_test.go (revision 260048)
+++ libgo/go/archive/zip/reader_test.go (working copy)
@@ -414,7 +414,7 @@ var tests = []ZipTest{
Name: "test.txt",
Content:  []byte{},
Size: 1<<32 - 1,
-   Modified: time.Date(2017, 10, 31, 21, 17, 27, 0, timeZone(-7*time.Hour)),
+   Modified: time.Date(2017, 10, 31, 21, 11, 57, 0, timeZone(-7*time.Hour)),
Mode: 0644,
},
},
Index: libgo/go/cmd/go/go_test.go
===
--- libgo/go/cmd/go/go_test.go  (revision 260048)
+++ libgo/go/cmd/go/go_test.go  (working copy)
@@ -3265,6 +3265,20 @@ func TestGoVetWithOnlyTestFiles(t *testi
tg.run("vet", "p")
 }
 
+// Issue 24193.
+func TestVetWithOnlyCgoFiles(t *testing.T) {
+   if !canCgo {
+   t.Skip("skipping because cgo not enabled")
+   }
+
+   tg := testgo(t)
+   defer tg.cleanup()
+   tg.parallel()
+   tg.tempFile("src/p/p.go", "package p; import \"C\"; func F() {}")
+   tg.setenv("GOPATH", tg.path("."))
+   tg.run("vet", "p")
+}
+
 // Issue 9767, 19769.
 func TestGoGetDotSlashDownload(t *testing.T) {
testenv.MustHaveExternalNetwork(t)
@@ -5099,6 +5113,28 @@ func TestCacheOutput(t *testing.T) {
}
 }
 
+func TestCacheListStale(t *testing.T) {
+   tooSlow(t)
+   if strings.Contains(os.Getenv("GODEBUG"), "gocacheverify") {
+   t.Skip("GODEBUG gocacheverify")
+   }
+   tg := testgo(t)
+   defer tg.cleanup()
+   tg.parallel()
+   tg.makeTempdir()
+   tg.setenv("GOCACHE", tg.path("cache"))
+   tg.tempFile("gopath/src/p/p.go", "package p; import _ \"q\"; func F(){}\n")
+   tg.tempFile("gopath/src/q/q.go", "package q; func F(){}\n")
+   tg.tempFile("gopath/src/m/m.go", "package main; import _ \"q\"; func main(){}\n")
+
+   tg.setenv("GOPATH", tg.path("gopath"))
+   tg.run("install", "p", "m")
+   tg.run("list

Re: [RFC] Add gcc.dg-selftests/dg-final.exp

2018-05-31 Thread Mike Stump
On May 30, 2018, at 3:41 AM, Tom de Vries  wrote:
> 
> this patch tests the error behaviour of dg-final directives when called with an
> incorrect number of arguments.

Seems reasonable.  Unless someone wanted to argue against it for some reason 
(not portable, takes too much time, hard to maintain, doesn't test features), 
I'd ok it.

GCC 8 branch: update to Go 1.10.2, other fixes

2018-05-31 Thread Ian Lance Taylor
I've committed this patch to update the GCC 8 branch to the Go 1.10.2
release, and also brought along a couple of other recent fixes.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

cmd/go: support more Solaris assembler syntaxes

Patch by Rainer Orth.

For https://gcc.gnu.org/PR/85429.

cmd/go: update to match recent changes to gc

In https://golang.org/cl/111097 the gc version of cmd/go was updated
to include some gofrontend-specific changes. The gofrontend code
already has different versions of those changes; this CL makes the
gofrontend match the upstream code.

libgo: update to Go 1.10.2 release

cmd/go, cmd/vet: make vet work with gccgo

Backport https://golang.org/cl/113715 and https://golang.org/cl/113716:

cmd/go: don't pass -compiler flag to vet

Without this running go vet -compiler=gccgo causes vet to fail.
The vet tool does need to know the compiler, but it is passed in
vetConfig.Compiler.

cmd/go, cmd/vet, go/internal/gccgoimport: make vet work with gccgo

When using gccgo/GoLLVM, there is no package file for a standard
library package. Since it is impossible for the go tool to rebuild the
package, and since the package file exists only in the form of a .gox
file, this seems like the best choice. Unfortunately it was confusing
vet, which wanted to see a real file. This caused vet to report errors
about missing package files for standard library packages. The
gccgoimporter knows how to correctly handle this case. Fix this by

1) telling vet which packages are standard;
2) letting vet skip those packages;
3) letting the gccgoimporter handle this case.

As a separate required fix, gccgo/GoLLVM has no runtime/cgo package,
so don't try to depend on it (as it happens, this fixes golang/go#25324).

The result is that the cmd/go vet tests pass when using -compiler=gccgo.

crypto/x509: specify path to AIX certificate file

go/build, cmd/go: update to match recent changes to gc

Several recent changes to the gc version of cmd/go improve the
gofrontend support. These changes are partially copies of existing
gofrontend differences, and partially new code. This CL makes the
gofrontend match the upstream code.

The changes included here come from:
https://golang.org/cl/111575
https://golang.org/cl/111595
https://golang.org/cl/111635
https://golang.org/cl/111636

For the record, the following recent gc changes are based on code
already present in the gofrontend repo:
https://golang.org/cl/110915
https://golang.org/cl/111615

For the record, a gc change, partially based on earlier gofrontend
work, also with new gc code, was already copied to gofrontend repo in
CL 111099:

https://golang.org/cl/111097

This moves the generated list of standard library packages from
cmd/go/internal/load to go/build.

Ian

gotools/ChangeLog:

Backport from mainline:
2018-05-09  Ian Lance Taylor  
* Makefile.am (check-go-tool): Don't copy zstdpkglist.go.
* Makefile.in: Rebuild.
Index: gotools/Makefile.am
===
--- gotools/Makefile.am (revision 261041)
+++ gotools/Makefile.am (working copy)
@@ -232,7 +232,6 @@ check-go-tool: go$(EXEEXT) $(noinst_PROG
$(MKDIR_P) check-go-dir/src/cmd/go
cp $(cmdsrcdir)/go/*.go check-go-dir/src/cmd/go/
cp -r $(cmdsrcdir)/go/internal check-go-dir/src/cmd/go/
-   cp $(libgodir)/zstdpkglist.go check-go-dir/src/cmd/go/internal/load/
cp $(libgodir)/zdefaultcc.go check-go-dir/src/cmd/go/internal/cfg/
cp -r $(cmdsrcdir)/go/testdata check-go-dir/src/cmd/go/
cp -r $(cmdsrcdir)/internal check-go-dir/src/cmd/
Index: libgo/MERGE
===
--- libgo/MERGE (revision 261041)
+++ libgo/MERGE (working copy)
@@ -1,4 +1,4 @@
-bf86aec25972f3a100c3aa58a6abcbcc35bdea49
+71bdbf431b79dff61944f22c25c7e085ccfc25d5
 
 The first line of this file holds the git revision number of the
 last merge done from the master library sources.
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 261041)
+++ libgo/Makefile.am   (working copy)
@@ -614,13 +614,13 @@ s-runtime-inc: runtime.lo Makefile
rm -f runtime.inc.tmp2 runtime.inc.tmp3
$(STAMP) $@
 
-noinst_DATA += zstdpkglist.go zdefaultcc.go
+noinst_DATA += zdefaultcc.go
 
 # Generate the list of go std packages that were included in libgo
 zstdpkglist.go: s-zstdpkglist; @true
 s-zstdpkglist: Makefile
rm -f zstdpkglist.go.tmp
-   echo 'package load' > zstdpkglist.go.tmp
+   echo 'package build' > zstdpkglist.go.tmp
echo "" >> zstdpkglist.go.tmp
echo 'var stdpkg = map[string]bool{' >> zstdpkglist.go.tmp
echo $(libgo_go_objs) 'unsafe.lo'

Re: [PATCH] Warn for ignored ASM labels on typdef declarations PR 85444 (v.3)

2018-05-31 Thread Joseph Myers
On Sat, 26 May 2018, Will Hawkins wrote:

> > +  if (asmspec_tree != NULL_TREE)
> > +{
> > +  warning (OPT_Wignored_asm_name, "asm-specifier is ignored in "
> > +   "typedef declaration");
> > +}

We avoid braces around a single statement like this.

I don't think diagnostics generally use hyphenated syntax production names
like asm-specifier.  Rather, the hyphens are omitted, and literal code
enclosed in quotes, so "%<asm%> specifier" (and %<typedef%>).

> > +  warning (OPT_Wignored_asm_name, "asm-specifier is ignored for "
> > +   "typedef declarations");

Please use the same wording for C and C++ to save work for translators 
(thus, don't say "declaration" in one and "declarations" in the other, or 
"in" in one and "for" in the other, unless there is a concrete reason 
related to the languages to need a difference).

> > +Warn when an assembler name is given but ignored. For C and C++, this
> > +happens when a @code{typdef} declaration is given an assembler name.

typedef, not typdef.

> > diff --git a/gcc/testsuite/g++.dg/asm-pr85444.C

I think you should put the test in c-c++-common if possible.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [RFC][PR64946] "abs" vectorization fails for char/short types

2018-05-31 Thread Kugan Vivekanandarajah
Hi Richard,

This is the revised patch based on the review and the discussion in
https://gcc.gnu.org/ml/gcc/2018-05/msg00179.html.

In summary:
- I skipped the (element_precision (type) < element_precision (TREE_TYPE
(@0))) check in the match.pd pattern, as it would prevent the transformation
for the case in the PR.
That is, what I am interested in is something like:
  char t = (char) ABS_EXPR <(int) x>
and I want to generate
  char t = (char) ABSU_EXPR <x>

- I also haven't added all the necessary match.pd changes for
ABSU_EXPR. I have a patch for that but will submit it separately based on
this review.

- I also tried to add ABSU_EXPR support in the places where it is necessary,
found by grepping for ABS_EXPR.

- I also had to add special casing in the vectorizer for ABSU_EXPR as its
result is of unsigned type.
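
To make the intent concrete, here is a small C model of the two forms.
It is only a sketch: absu_int, absu_schar and abs_via_int are
hypothetical helper names, and the functions model the tree codes
rather than reproduce anything in the patch.  It uses signed char
instead of plain char so that the sign is unambiguous.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Model of ABSU_EXPR on int: the result type is unsigned, so the
   negation happens in unsigned arithmetic and is well defined even
   for INT_MIN, where abs () would overflow.  */
static unsigned int
absu_int (int x)
{
  return x < 0 ? 0u - (unsigned int) x : (unsigned int) x;
}

/* Model of ABSU_EXPR on signed char: the result is the unsigned
   variant of the argument type and can represent |SCHAR_MIN|.  */
static unsigned char
absu_schar (signed char x)
{
  return (unsigned char) (x < 0 ? -(int) x : (int) x);
}

/* The form from the PR: widen to int, take abs there, narrow again.
   Narrowing an out-of-range value is implementation-defined in C;
   GCC reduces it modulo 2^N.  */
static signed char
abs_via_int (signed char x)
{
  return (signed char) abs ((int) x);
}
```

Under these assumptions, (signed char) absu_schar (x) and
abs_via_int (x) agree for every x, while the ABSU form never needs the
wider int operation, which is what makes it attractive for
vectorizing char and short loops.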

Is this OK? The patch bootstraps and the regression test is ongoing.

Thanks,
Kugan


On 18 May 2018 at 12:36, Kugan Vivekanandarajah
 wrote:
> Hi Richard,
>
> Thanks for the review. I am revising the patch based on Andrew's comments too.
>
> On 17 May 2018 at 20:36, Richard Biener  wrote:
>> On Thu, May 17, 2018 at 4:56 AM Andrew Pinski  wrote:
>>
>>> On Wed, May 16, 2018 at 7:14 PM, Kugan Vivekanandarajah
>>>  wrote:
>>> > As mentioned in the PR, I am trying to add ABSU_EXPR to fix this
>>> > issue. In the attached patch, in fold_cond_expr_with_comparison I am
>>> > generating ABSU_EXPR for these cases. As I understand, absu_expr is
>>> > well defined in RTL. So, the issue is generating absu_expr  and
>>> > transferring to RTL in the correct way. I am not sure I am doing
>>> > all that is needed. I will clean up and add more test-cases based on
>>> > the feedback.
>>
>>
>>> diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
>>> index 71e172c..2b812e5 100644
>>> --- a/gcc/optabs-tree.c
>>> +++ b/gcc/optabs-tree.c
>>> @@ -235,6 +235,7 @@ optab_for_tree_code (enum tree_code code, const_tree
>> type,
>>> return trapv ? negv_optab : neg_optab;
>>
>>>   case ABS_EXPR:
>>> +case ABSU_EXPR:
>>> return trapv ? absv_optab : abs_optab;
>>
>>
>>> This part is not correct, it should be something like this:
>>
>>>   case ABS_EXPR:
>>> return trapv ? absv_optab : abs_optab;
>>> +case ABSU_EXPR:
>>> +   return abs_optab ;
>>
>>> Because ABSU is not undefined at the TYPE_MAX.
>>
>> Also
>>
>> /* Unsigned abs is simply the operand.  Testing here means we don't
>>   risk generating incorrect code below.  */
>> -  if (TYPE_UNSIGNED (type))
>> +  if (TYPE_UNSIGNED (type)
>> + && (code != ABSU_EXPR))
>>  return op0;
>>
>> is wrong.  ABSU of an unsigned number is still just that number.
>>
>> The change to fold_cond_expr_with_comparison looks odd to me
>> (premature optimization).  It should be done separately - it seems
>> you are doing
>
> FE seems to be using this to generate ABS_EXPR from
> c_fully_fold_internal to fold_build3_loc and so on. I changed this to
> generate ABSU_EXPR for the case in the testcase. So the question
> should be, in what cases do we need ABS_EXPR and in what cases do we
> need ABSU_EXPR. It is not very clear to me.
>
>
>>
>> (simplify (abs (convert @0)) (convert (absu @0)))
>>
>> here.
>>
>> You touch one other place in fold-const.c but there seem to be many
>> more that need ABSU_EXPR handling (you touched the one needed
>> for correctness) - esp. you should at least handle constant folding
>> in const_unop and the nonnegative predicate.
>
> OK.
>>
>> @@ -3167,6 +3167,9 @@ verify_expr (tree *tp, int *walk_subtrees, void *data
>> ATTRIBUTE_UNUSED)
>> CHECK_OP (0, "invalid operand to unary operator");
>> break;
>>
>> +case ABSU_EXPR:
>> +  break;
>> +
>>   case REALPART_EXPR:
>>   case IMAGPART_EXPR:
>>
>> verify_expr is no more.  Did you test this recently against trunk?
>
> This patch is against slightly older trunk. I will rebase it.
>
>>
>> @@ -3937,6 +3940,9 @@ verify_gimple_assign_unary (gassign *stmt)
>>   case PAREN_EXPR:
>>   case CONJ_EXPR:
>> break;
>> +case ABSU_EXPR:
>> +  /* FIXME.  */
>> +  return false;
>>
>> no - please not!  Please add verification here - ABSU should be only
>> called on INTEGRAL, vector or complex INTEGRAL types and the
>> type of the LHS should be always the unsigned variant of the
>> argument type.
>
> OK.
>>
>> if (is_gimple_val (cond_expr))
>>   return cond_expr;
>>
>> -  if (TREE_CODE (cond_expr) == ABS_EXPR)
>> +  if (TREE_CODE (cond_expr) == ABS_EXPR
>> +  || TREE_CODE (cond_expr) == ABSU_EXPR)
>>   {
>> rhs1 = TREE_OPERAND (cond_expr, 1);
>> STRIP_USELESS_TYPE_CONVERSION (rhs1);
>>
>> err, but the next line just builds a ABS_EXPR ...
>>
>> How did you identify spots that need adjustment?  I would expect that
>> once folding generates ABSU_EXPR that you need to adjust frontends
>> (C++ constexpr handling for example).  Also I miss adjustments
>> to gimple-pretty-print.c and the GIMPLE FE parser.
>
> I will add this.
>>
>> recursi

Re: [PATCH] Implement Fortran 2018's RANDOM_INIT

2018-05-31 Thread Janne Blomqvist
On Mon, May 28, 2018 at 8:06 PM, Steve Kargl <
s...@troutmask.apl.washington.edu> wrote:

> The attached patch implements the RANDOM_INIT intrinsic
> subroutine specified in Fortran 2018.  I have had this
> patch in my local tree for the last 5+ months.  Now that
> 8.1 is out, it is time to submit it.  It has been built
> and regression tested on x86_64-*-freebsd.  OK to commit?
>
> Note, I have only tested with -fcoarray=single as I don't
> have OpenCoarray set up to build with trunk.  Testing with
> OpenCoarray is encouraged.
>
> 2018-05-28  Steven G. Kargl  
>
> * check.c (gfc_check_random_init): New function. Check arguments of
> RANDOM_INIT.
> * gfortran.h (GFC_ISYM_RANDOM_INIT): New enum token.
> * intrinsic.c (add_subroutines): Add RANDOM_INIT to list of
> subroutines.
> * intrinsic.h: Add prototypes for gfc_check_random_init and
> gfc_resolve_random_init
> * intrinsic.texi: Document new intrinsic subprogram.
> * iresolve.c (gfc_resolve_random_init): Resolve routine name.
> * trans-decl.c: Declare gfor_fndecl_random_init
> * trans-intrinsic.c (conv_intrinsic_random_init): New function.
> Translate call to RANDOM_INIT.
> (gfc_conv_intrinsic_subroutine): Call it.
> * trans.h: Declare gfor_fndecl_random_init
>
> 2018-05-28  Steven G. Kargl  
>
> * gfortran.dg/random_init_1.f90: New test.
> * gfortran.dg/random_init_2.f90: New test.
> * gfortran.dg/random_init_3.f90: New test.
> * gfortran.dg/random_init_4.f90: New test.
> * gfortran.dg/random_init_5.f90: New test.
> * gfortran.dg/random_init_6.f90: New test.
>
> 2018-05-28  Steven G. Kargl  
>
> * libgfortran/Makefile.am: Add random_init.f90 to build.
> * libgfortran/Makefile.in: Regenerated.
> * libgfortran/gfortran.map: Expose symbol for
> _gfortran_random_init.
> * libgfortran/intrinsics/random_init.f90: Implementation.
>
> --
> Steve
>

Looks good, thanks for the patch!

-- 
Janne Blomqvist


Re: [PATCH] PR fortran/85981 -- Check kind of errmsg variable.

2018-05-31 Thread Janne Blomqvist
On Wed, May 30, 2018 at 2:24 AM, Steve Kargl <
s...@troutmask.apl.washington.edu> wrote:

> The new comment in the patch explains the patch.  This was
> developed and tested on 8-branch, but will be applied to
> trunk prior to committing to branches.  Built and regression
> tested on x86_64-*-freebsd.  OK to commit?
>
> 2018-05-29  Steven G. Kargl  
>
> PR fortran/85981
> * resolve.c (resolve_allocate_deallocate): Check errmsg is default
> character kind.
>
> 2018-05-29  Steven G. Kargl  
>
> PR fortran/85981
>
> * gfortran.dg/allocate_alloc_opt_14.f90: New test.
> * gfortran.dg/allocate_alloc_opt_1.f90: Update error string.
> * gfortran.dg/allocate_stat_2.f90: Ditto.
> * gfortran.dg/deallocate_alloc_opt_1.f90: Ditto.
>
> --
> Steve
>

Ok, thanks!

-- 
Janne Blomqvist


Re: [PATCH, rs6000 8/9] enable gimple folding for vec_xl, vec_xst

2018-05-31 Thread Richard Biener
On Thu, May 31, 2018 at 9:59 PM Will Schmidt  wrote:
>
> Hi,
>   Add support for gimple folding for unaligned vector loads and stores.
> testcases posted separately in this thread.
>
> Regtest completed across variety of systems, P6,P7,P8,P9.
>
> OK for trunk?
> Thanks,
> -Will
>
> [gcc]
>
> 2018-05-31 Will Schmidt 
>
> * config/rs6000/rs6000.c: (rs6000_builtin_valid_without_lhs) Add vec_xst
> variants to the list.  (rs6000_gimple_fold_builtin) Add support for
> folding unaligned vector loads and stores.
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index d62abdf..54b7de2 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -15360,10 +15360,16 @@ rs6000_builtin_valid_without_lhs (enum rs6000_builtins fn_code)
>  case ALTIVEC_BUILTIN_STVX_V8HI:
>  case ALTIVEC_BUILTIN_STVX_V4SI:
>  case ALTIVEC_BUILTIN_STVX_V4SF:
>  case ALTIVEC_BUILTIN_STVX_V2DI:
>  case ALTIVEC_BUILTIN_STVX_V2DF:
> +case VSX_BUILTIN_STXVW4X_V16QI:
> +case VSX_BUILTIN_STXVW4X_V8HI:
> +case VSX_BUILTIN_STXVW4X_V4SF:
> +case VSX_BUILTIN_STXVW4X_V4SI:
> +case VSX_BUILTIN_STXVD2X_V2DF:
> +case VSX_BUILTIN_STXVD2X_V2DI:
>return true;
>  default:
>return false;
>  }
>  }
> @@ -15869,10 +15875,77 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> gimple_set_location (g, loc);
> gsi_replace (gsi, g, true);
> return true;
>}
>
> +/* unaligned Vector loads.  */
> +case VSX_BUILTIN_LXVW4X_V16QI:
> +case VSX_BUILTIN_LXVW4X_V8HI:
> +case VSX_BUILTIN_LXVW4X_V4SF:
> +case VSX_BUILTIN_LXVW4X_V4SI:
> +case VSX_BUILTIN_LXVD2X_V2DF:
> +case VSX_BUILTIN_LXVD2X_V2DI:
> +  {
> +arg0 = gimple_call_arg (stmt, 0);  // offset
> +arg1 = gimple_call_arg (stmt, 1);  // address
> +lhs = gimple_call_lhs (stmt);
> +location_t loc = gimple_location (stmt);
> +/* Since arg1 may be cast to a different type, just use ptr_type_node
> +   here instead of trying to enforce TBAA on pointer types.  */
> +tree arg1_type = ptr_type_node;
> +tree lhs_type = TREE_TYPE (lhs);
> +/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
> +   the tree using the value from arg0.  The resulting type will match
> +   the type of arg1.  */
> +gimple_seq stmts = NULL;
> +tree temp_offset = gimple_convert (&stmts, loc, sizetype, arg0);
> +tree temp_addr = gimple_build (&stmts, loc, POINTER_PLUS_EXPR,
> +  arg1_type, arg1, temp_offset);
> +gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
> +/* Use the build2 helper to set up the mem_ref.  The MEM_REF could also
> +   take an offset, but since we've already incorporated the offset
> +   above, here we just pass in a zero.  */
> +gimple *g;
> +g = gimple_build_assign (lhs, build2 (MEM_REF, lhs_type, temp_addr,
> +   build_int_cst (arg1_type, 0)));

So in GIMPLE the type of the MEM_REF specifies the alignment so my question
is what type does the lhs usually have here?  I'd simply guess V4SF, etc.?  In
this case you are missing a

  tree ltype = build_aligned_type (lhs_type, desired-alignment);

and use that ltype for building the MEM_REF.  I suppose in this case the known
alignment is either BITS_PER_UNIT or element alignment (thus
TYPE_ALIGN (TREE_TYPE (lhs_type)))?

Or is the type of the load the element types?

Richard.
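
At the source level, the point that the access type carries the
alignment can be illustrated with GCC's vector extensions.  This is a
sketch of the idea only, not the GIMPLE change being requested: v4sf_u
and sum_four are hypothetical names, and the aligned and may_alias
attributes on the typedef play the role that build_aligned_type and
ptr_type_node play in the patch.  It assumes a compiler that supports
the GNU vector_size, aligned and may_alias attributes.

```c
#include <assert.h>

/* A 16-byte vector of floats whose alignment is reduced to that of a
   single element.  An access through this type promises the compiler
   only element alignment, so it must use an unaligned-capable load.  */
typedef float v4sf_u
  __attribute__ ((vector_size (16), aligned (__alignof__ (float)), may_alias));

/* Load four consecutive floats starting at P, which need not be
   16-byte aligned, and return their sum.  */
static float
sum_four (const float *p)
{
  assert (p != 0);
  v4sf_u v = *(const v4sf_u *) p;
  return v[0] + v[1] + v[2] + v[3];
}
```

Had the typedef kept the default 16-byte alignment, dereferencing a
pointer such as buf + 1 would be invalid, which is the source-level
analogue of building the MEM_REF with the plain vector type.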

> +gimple_set_location (g, loc);
> +gsi_replace (gsi, g, true);
> +return true;
> +  }
> +
> +/* unaligned Vector stores.  */
> +case VSX_BUILTIN_STXVW4X_V16QI:
> +case VSX_BUILTIN_STXVW4X_V8HI:
> +case VSX_BUILTIN_STXVW4X_V4SF:
> +case VSX_BUILTIN_STXVW4X_V4SI:
> +case VSX_BUILTIN_STXVD2X_V2DF:
> +case VSX_BUILTIN_STXVD2X_V2DI:
> +  {
> +arg0 = gimple_call_arg (stmt, 0); /* Value to be stored.  */
> +arg1 = gimple_call_arg (stmt, 1); /* Offset.  */
> +tree arg2 = gimple_call_arg (stmt, 2); /* Store-to address.  */
> +location_t loc = gimple_location (stmt);
> +tree arg0_type = TREE_TYPE (arg0);
> +/* Use ptr_type_node (no TBAA) for the arg2_type.  */
> +tree arg2_type = ptr_type_node;
> +/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
> +   the tree using the value from arg0.  The resulting type will match
> +   the type of arg2.  */
> +gimple_seq stmts = NULL;
> +tree temp_offset = gimple_convert (&stmts, loc, sizetype, arg1);
> +tree temp_addr = gimple_build (&stmts, loc, POINTER_PLUS_EXPR,
> +  arg2_type, arg2, temp_offset);
> +/* Mask off any lower bits from the address.  */
> +gsi_insert_seq_be

fwprop addressing costs

2018-05-31 Thread Robin Dapp
Hi,

when investigating a regression, I realized that we create a superfluous
load on S390.  The snippet looks something like

LA  %r10, 0(%r8,%r9)
LLH %r4, 0(%r10)

meaning the address in r10 is computed by an LA even though LLH supports
the addressing already.  The same address is used multiple times so
combine cannot do anything about it.

Looking into fwprop, I realized it actually tries to propagate the
address but exits because we specify higher costs for an address with
index than for one without.  This was meant to account for the fact
that, in general and all other things being equal, not every instruction
can handle indexed addressing mode.

Now, in this case, fwprop actually knows the instructions it propagates
into and could decide based on the full costs, seeing that it would not
be more expensive.  Currently, it recursively descends to the parts that
are going to be propagated or replaced and compares the costs of both
without regarding the full instruction.
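
The difference between the two ways of deciding can be sketched as a
toy model in C.  Everything here is an assumption made for
illustration: the cost numbers are invented, and struct mem_insn,
address_cost, insn_cost and the two predicates are hypothetical names
that do not correspond to fwprop or backend functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a memory instruction.  */
struct mem_insn { bool supports_index; };

/* Toy address cost: an indexed address is rated higher than a plain
   base address, as a generic backend default might do.  */
static int
address_cost (bool has_index)
{
  return has_index ? 2 : 1;
}

/* Toy whole-instruction cost: if the instruction can encode
   base + index directly, the index is free; otherwise model an extra
   address computation.  */
static int
insn_cost (const struct mem_insn *insn, bool has_index)
{
  assert (insn != 0);
  int cost = 4;
  if (has_index && !insn->supports_index)
    cost += 4;
  return cost;
}

/* Decision based only on the address being replaced.  */
static bool
ok_by_address_cost (void)
{
  return address_cost (true) <= address_cost (false);
}

/* Decision based on the instruction the address is propagated into.  */
static bool
ok_by_insn_cost (const struct mem_insn *insn)
{
  return insn_cost (insn, true) <= insn_cost (insn, false);
}
```

With these made-up numbers the address-only check rejects the
propagation, while the instruction-level check accepts it for an
instruction like LLH that handles indexed addresses and still rejects
it for one that does not.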

Would it make sense to enhance fwprop with a more detailed cost
evaluation or are there other passes that should do the same - i.e.
what's the preferred way to solve this? Is changing the address costs in
the backend depending on addressing mode sensible at all? As far as I
can see, the x86 backend also changes costs depending on global
properties (i.e. to prefer fewer registers when addressing).

Regards
 Robin