[PATCH] rs6000: Allow MMA built-in initialization regardless of compiler options

2020-07-08 Thread Peter Bergner via Gcc-patches
PR96125 shows a bug when we try to use an MMA built-in within a function
that uses #pragma target/attribute target to enable power10 code generation
and the -mcpu= command line option is pre-power10.

The problem is that we only initialize built-ins once, fairly early, when
the command line options are in force.  If the -mcpu= is pre-power10,
then we fail to initialize the MMA built-ins at all and so they are not
available to call in a #pragma target/attribute target function.

The patch below basically always (on server type cpus) initializes the MMA
built-ins so we can use them in #pragma target/attribute target functions.
The patch below fixes the bug and is currently in the middle of testing.
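
For illustration, here is a minimal sketch of the usage pattern the PR describes: an MMA built-in called from a function that enables power10 through the target attribute while the command line may name an older cpu. This is not the new pr96125.c test. The function name zero_accumulator, the HAVE_MMA_SKETCH macro and the compiler-version guard are assumptions made for the sketch, and on non-PowerPC hosts the MMA part is simply compiled out.

```cpp
// Sketch only: the guard below is an assumption, chosen so that the MMA part
// is compiled only by a powerpc64 GCC that is new enough to carry the fix.
#if defined(__powerpc64__) && defined(__GNUC__) && !defined(__clang__) \
    && __GNUC__ >= 11
#define HAVE_MMA_SKETCH 1

// power10 is enabled for this one function only; the command line may
// still say something like -mcpu=power9.
__attribute__ ((target ("cpu=power10")))
void
zero_accumulator (__vector_quad *acc)
{
  __builtin_mma_xxsetaccz (acc);
}
#else
#define HAVE_MMA_SKETCH 0
#endif

// Reports whether the MMA part of the sketch was compiled in.
int
mma_sketch_compiled ()
{
  return HAVE_MMA_SKETCH;
}
```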

Is this ok for trunk assuming the bootstrap and regression testing
show no regressions?

This also affects GCC10, so I'd like to backport this before the release.
Ok there too after it sits on trunk a day or two?

Peter


gcc/
PR target/96125
* config/rs6000/rs6000-call.c (rs6000_init_builtins): Define the MMA
specific types __vector_quad and __vector_pair, and initialize the
MMA built-ins if TARGET_EXTRA_BUILTINS is set.
(mma_init_builtins): Don't test for mask set in rs6000_builtin_mask.
Remove now unneeded mask variable.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add the
OPTION_MASK_MMA flag for power10 if not already set.

gcc/testsuite/
PR target/96125
* gcc.target/powerpc/pr96125.c: New test.


diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 8e7bb54c73d..883c66810e6 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -12572,7 +12572,7 @@ rs6000_init_builtins (void)
 ieee128_float_type_node = ibm128_float_type_node = long_double_type_node;
 
   /* Vector pair and vector quad support.  */
-  if (TARGET_MMA)
+  if (TARGET_EXTRA_BUILTINS)
 {
   tree oi_uns_type = make_unsigned_type (256);
   vector_pair_type_node = build_distinct_type_copy (oi_uns_type);
@@ -12648,13 +12648,14 @@ rs6000_init_builtins (void)
   pixel_V8HI_type_node = rs6000_vector_type ("__vector __pixel",
 pixel_type_node, 8);
 
-  /* Create Altivec and VSX builtins on machines with at least the
+  /* Create Altivec, VSX and MMA builtins on machines with at least the
  general purpose extensions (970 and newer) to allow the use of
  the target attribute.  */
   if (TARGET_EXTRA_BUILTINS)
-altivec_init_builtins ();
-  if (TARGET_MMA)
-mma_init_builtins ();
+{
+  altivec_init_builtins ();
+  mma_init_builtins ();
+}
   if (TARGET_HTM)
 htm_init_builtins ();
 
@@ -13388,20 +13389,12 @@ mma_init_builtins (void)
   for (unsigned i = 0; i < ARRAY_SIZE (bdesc_mma); i++, d++)
 {
   tree op[MAX_MMA_OPERANDS], type;
-  HOST_WIDE_INT mask = d->mask;
   unsigned icode = (unsigned) d->icode;
   unsigned attr = rs6000_builtin_info[d->code].attr;
   int attr_args = (attr & RS6000_BTC_OPND_MASK);
   bool gimple_func = (attr & RS6000_BTC_GIMPLE);
   unsigned nopnds = 0;
 
-  if ((mask & rs6000_builtin_mask) != mask)
-   {
- if (TARGET_DEBUG_BUILTIN)
-   fprintf (stderr, "mma_builtin, skip binary %s\n", d->name);
- continue;
-   }
-
   if (d->name == 0)
{
  if (TARGET_DEBUG_BUILTIN)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index fef72884b31..15af9b230e6 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4264,8 +4264,12 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_PCREL;
 }
 
+  /* Enable -mmma by default on power10 systems.  */
+  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0)
+rs6000_isa_flags |= OPTION_MASK_MMA;
+
   /* Turn off vector pair/mma options on non-power10 systems.  */
-  if (!TARGET_POWER10 && TARGET_MMA)
+  else if (!TARGET_POWER10 && TARGET_MMA)
 {
   if ((rs6000_isa_flags_explicit & OPTION_MASK_MMA) != 0)
error ("%qs requires %qs", "-mmma", "-mcpu=power10");
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index bbd8060e143..ea2c89eb6b3 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -577,7 +577,8 @@ extern int rs6000_vector_align[];
 || TARGET_POPCNTD   /* ISA 2.06 */  \
 || TARGET_ALTIVEC   \
 || TARGET_VSX   \
-|| TARGET_HARD_FLOAT)
+|| TARGET_HARD_FLOAT\
+|| TARGET_MMA)
 
 /* E500 cores only support plain "sync", not lwsync.  */
 #define TARGET_NO_LWSYNC (rs6000_cpu == PROCESSOR_PPC8540 \
diff --git 

Re: [PATCH] rs6000: Split movsf_from_si from high word before reload[PR89310]

2020-07-08 Thread luoxhu via Gcc-patches



On 2020/7/9 06:43, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Jul 08, 2020 at 11:19:21AM +0800, luoxhu wrote:
>> For extracting high part element from DImode register like:
>>
>> {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}
>>
>> split it before reload with "and mask" to avoid generating shift right
>> 32 bit then shift left 32 bit.  This pattern also exists in PR42475 and
>> PR67741, etc.
> 
>> 2020-07-08  Xionghu Luo  
>>
>>  PR rtl-optimization/89310
>>  * config/rs6000/rs6000.md (movsf_from_si2): New
>>  define_insn_and_split.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2020-07-08  Xionghu Luo  
>>
>>  PR rtl-optimization/89310
>>  * gcc.target/powerpc/pr89310.c: New test.
> 
>> +;; For extracting high part element from DImode register like:
>> +;; {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}
>> +;; split it before reload with "and mask" to avoid generating shift right
>> +;; 32 bit then shift left 32 bit.
>> +(define_insn_and_split "movsf_from_si2"
>> +  [(set (match_operand:SF 0 "gpc_reg_operand" "=wa")
>> +(unspec:SF [
>> + (subreg:SI (ashiftrt:DI
> 
> Put the  (ashiftrt:DI  on a separate line as well?  With indent changes,
> etc.
> 
>> +   (match_operand:DI 1 "input_operand" "r")
>> +   (const_int 32))
>> +  0)]
>> + UNSPEC_SF_FROM_SI))
>> +(clobber (match_scratch:DI 2 "=r"))]
>> +  "TARGET_NO_SF_SUBREG"
>> +  "@
>> +  #"
> 
> @ with only one alternative doesn't do anything; so just write
>"#"
> please.
> 
>> +
> 
> And no empty line here.
> 
>> +  "&& !reload_completed
> 
> Why this?  Why will this not work after reload?  In the very few cases
> where you do need this, you usually also need to check for
> lra_in_progress.
> 
>> +   && vsx_reg_sfsubreg_ok (operands[0], SFmode)"
>> +  [(const_int 0)]
>> +{
>> +  if (GET_CODE (operands[2]) == SCRATCH)
>> +operands[2] = gen_reg_rtx (DImode);
>> +
>> +  rtx mask = GEN_INT (HOST_WIDE_INT_M1U << 32);
> 
> The mask should be different for QI and HI.
> 
>> +  emit_insn (gen_anddi3 (operands[2], operands[1], mask));
>> +  emit_insn (gen_p8_mtvsrd_sf (operands[0], operands[2]));
>> +  emit_insn (gen_vsx_xscvspdpn_directmove (operands[0], operands[0]));
>> +  DONE;
>> +}
>> +  [(set_attr "length" "12")
>> +  (set_attr "type" "vecfloat")
>> +  (set_attr "isa" "p8v")])
>> +
> 
> No extra whiteline please.
> 
> 
> Maybe change it back to just SI?  It won't match often at all for QI or
> HI anyway, it seems.  Sorry for that detour.  Should be good with the
> above nits fixed :-)

OK, if I understand correctly, a subreg of DImode should be SImode, and I used
subreg:SI to match only SI, so there is no need to consider the mask for QI and HI? :)

The others are updated: removed reload_completed and adjusted the formatting.


For extracting high part element from DImode register like:

{%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}

split it before reload with "and mask" to avoid generating shift right
32 bit then shift left 32 bit.  This pattern also exists in PR42475 and
PR67741, etc.

srdi 3,3,32
sldi 9,3,32
mtvsrd 1,9
xscvspdpn 1,1

=>

rldicr 3,3,0,31
mtvsrd 1,3
xscvspdpn 1,1
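
The equivalence behind the two sequences above can be checked on plain 64-bit integers. This is only a sketch of the arithmetic, not of the RTL pattern; the function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Old sequence: shift right by 32 (srdi), then shift left by 32 (sldi).
std::uint64_t
high_word_by_shifts (std::uint64_t x)
{
  return (x >> 32) << 32;
}

// New sequence: a single AND with a mask that keeps the upper 32 bits,
// which is what the rldicr in the new code does.
std::uint64_t
high_word_by_mask (std::uint64_t x)
{
  const std::uint64_t mask = ~std::uint64_t{0} << 32;  // 0xffffffff00000000
  return x & mask;
}
```

Both functions clear the low word and leave the high word in place, so the mtvsrd that follows sees the same value either way.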

Bootstrapped and regression-tested on Power8-LE; both pass.

gcc/ChangeLog:

2020-07-08  Xionghu Luo  

PR rtl-optimization/89310
* config/rs6000/rs6000.md (movsf_from_si2): New
define_insn_and_split.

gcc/testsuite/ChangeLog:

2020-07-08  Xionghu Luo  

PR rtl-optimization/89310
* gcc.target/powerpc/pr89310.c: New test.
---
 gcc/config/rs6000/rs6000.md| 31 ++
 gcc/testsuite/gcc.target/powerpc/pr89310.c | 17 
 2 files changed, 48 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr89310.c

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 4fcd6a94022..a493dfd4596 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7593,6 +7593,37 @@ (define_insn_and_split "movsf_from_si"
"*,  *, p9v,   p8v,   *, *,
 p8v,p8v,   p8v,   *")])
 
+;; For extracting high part element from DImode register like:
+;; {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}
+;; split it before reload with "and mask" to avoid generating shift right
+;; 32 bit then shift left 32 bit.
+(define_insn_and_split "movsf_from_si2"
+  [(set (match_operand:SF 0 "gpc_reg_operand" "=wa")
+   (unspec:SF [
+(subreg:SI (
+ashiftrt:DI (
+  match_operand:DI 1 "input_operand" "r")
+(const_int 32))
+ 0)]
+UNSPEC_SF_FROM_SI))
+   (clobber (match_scratch:DI 2 "=r"))]
+  "TARGET_NO_SF_SUBREG"
+  "#"
+  "&& vsx_reg_sfsubreg_ok (operands[0], SFmode)"
+  [(const_int 0)]
+{
+  if (GET_CODE (operands[2]) == SCRATCH)
+operands[2] = gen_reg_rtx (DImode);
+
+  rtx mask = GEN_INT (HOST_WIDE_INT_M1U << 32);
+  emit_insn (gen_anddi3 (operands[2], operands[1], 

Re: [PATCH] remove premature vect_verify_datarefs_alignment

2020-07-08 Thread Kewen.Lin via Gcc-patches
on 2020/7/9 10:48 AM, Kewen.Lin via Gcc-patches wrote:
> Hi Richi,
> 
> on 2020/7/8 10:45 PM, Richard Biener wrote:
>> This followup removes vect_verify_datarefs_alignment and its
>> premature cancellation of vectorization leaving the actual
>> decision whether alignment is supported to the functions
>> deciding whether we can vectorize a load or store.
>>
>> I'll see whether to find a suitable machine to test !hw_misalign_supported
>> (altivec-only ppc I think?  hints welcome...), but maybe I'm lazy...
>>
> 
> Thanks for caring about this!  In my limited experience, a Power7 machine
> qualifies for this, even with the explicit configuration -mcpu=power7;

Oops, sorry I meant --with-cpu=power7.

> most of the vector test cases will run with -mno-allow-movmisalign there.
> 
> cfarm gcc110 looks fine to use.
> 

BR,
Kewen


[PATCH] c++: Diagnose cv-qualified decltype(auto) [PR79815]

2020-07-08 Thread Marek Polacek via Gcc-patches
"If the placeholder is the decltype(auto) type-specifier, T shall be the
placeholder alone." but we weren't detecting "const decltype(auto)".

I've just expanded the existing diagnostic detecting "decltype(auto) &"
and similar.
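
As a reminder of why the placeholder has to stand alone, here is a small C++14 sketch (the variable names are made up): plain decltype(auto) already takes constness and reference-ness from the initializer, so a cv-qualifier on the placeholder has no role to play.

```cpp
#include <cassert>
#include <type_traits>

int value = 1;
const int cvalue = 2;

decltype(auto) plain = value;      // deduces int
decltype(auto) as_ref = (value);   // parenthesized lvalue: deduces int&
decltype(auto) as_const = cvalue;  // deduces const int from the initializer

static_assert (std::is_same<decltype (plain), int>::value, "plain is int");
static_assert (std::is_same<decltype (as_ref), int &>::value, "as_ref is int&");
static_assert (std::is_same<decltype (as_const), const int>::value,
               "as_const is const int");

// The forms this patch diagnoses are left commented out:
//   const decltype(auto) bad1 = value;
//   decltype(auto) const bad2 = value;
```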

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/79815
* decl.c (grokdeclarator): Detect cv-qual decltype(auto).
* pt.c (do_auto_deduction): Likewise.

gcc/testsuite/ChangeLog:

PR c++/79815
* g++.dg/cpp1y/auto-fn58.C: New test.
---
 gcc/cp/decl.c  | 17 +
 gcc/cp/pt.c|  6 ++
 gcc/testsuite/g++.dg/cpp1y/auto-fn58.C |  8 
 3 files changed, 27 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/auto-fn58.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 60a09e9497a..839ea8059e7 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -12251,11 +12251,20 @@ grokdeclarator (const cp_declarator *declarator,
/* Only plain decltype(auto) is allowed.  */
if (tree a = type_uses_auto (type))
  {
-   if (AUTO_IS_DECLTYPE (a) && a != type)
+   if (AUTO_IS_DECLTYPE (a))
  {
-   error_at (typespec_loc, "%qT as type rather than "
- "plain %<decltype(auto)%>", type);
-   return error_mark_node;
+   if (a != type)
+ {
+   error_at (typespec_loc, "%qT as type rather than "
+ "plain %<decltype(auto)%>", type);
+   return error_mark_node;
+ }
+   else if (TYPE_QUALS (type) != TYPE_UNQUALIFIED)
+ {
+   error_at (typespec_loc, "%<decltype(auto)%> cannot be "
+ "cv-qualified");
+   return error_mark_node;
+ }
  }
  }
 
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index b6423f7432b..33d194e9e15 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -28993,6 +28993,12 @@ do_auto_deduction (tree type, tree init, tree auto_node,
error ("%qT as type rather than plain %<decltype(auto)%>", type);
  return error_mark_node;
}
+  else if (TYPE_QUALS (type) != TYPE_UNQUALIFIED)
+   {
+ if (complain & tf_error)
+   error ("%<decltype(auto)%> cannot be cv-qualified");
+ return error_mark_node;
+   }
 }
   else
 {
diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn58.C b/gcc/testsuite/g++.dg/cpp1y/auto-fn58.C
new file mode 100644
index 000..8f6ec9b79ab
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn58.C
@@ -0,0 +1,8 @@
+// PR c++/79815
+// { dg-do compile { target c++14 } }
+
+decltype(auto) const x = 1; // { dg-error "cannot be cv-qualified" }
+volatile decltype(auto) x2 = 1; // { dg-error "cannot be cv-qualified" }
+const volatile decltype(auto) x3 = 1; // { dg-error "cannot be cv-qualified" }
+const decltype(auto) fn() { return 42; } // { dg-error "cannot be cv-qualified" }
+const decltype(auto) fn2(); // { dg-error "cannot be cv-qualified" }

base-commit: 50873cc588fcc20384212b6dddca74393023a0e3
-- 
2.26.2



Re: [PATCH] remove premature vect_verify_datarefs_alignment

2020-07-08 Thread Kewen.Lin via Gcc-patches
Hi Richi,

on 2020/7/8 下午10:45, Richard Biener wrote:
> This followup removes vect_verify_datarefs_alignment and its
> premature cancellation of vectorization leaving the actual
> decision whether alignment is supported to the functions
> deciding whether we can vectorize a load or store.
> 
> I'll see whether to find a suitable machine to test !hw_misalign_supported
> (altivec-only ppc I think?  hints welcome...), but maybe I'm lazy...
> 

Thanks for caring about this!  In my limited experience, a Power7 machine
qualifies for this, even with the explicit configuration -mcpu=power7;
most of the vector test cases will run with -mno-allow-movmisalign there.

cfarm gcc110 looks fine to use.

BR,
Kewen


Re: [PATCH ping 3] ppc64 check for incompatible setting of minimal-toc

2020-07-08 Thread Douglas B Rupp

Greetings yet again,

It would be greatly appreciated if you could look at this patch when you
have a minute.


--Doug

On 6/17/20 2:14 PM, Douglas B Rupp wrote:

Greetings again,

Could you please look at this patch when convenient?

--Doug

On 6/1/20 10:13 AM, Douglas B Rupp wrote:

Greetings,

Curious if you've had a chance to look at this patch yet?

--Doug

On 5/18/20 4:02 PM, Douglas B Rupp wrote:

Greetings,

The attached patch is proposed for rs6000/linux64.h.

The problem it addresses is that the current checking only tests for
existence, not for an incompatible/compatible setting.


For example:

$ powerpc64-linux-gnu-gcc -mcmodel=medium -mminimal-toc foo.c
is an incompatible set of switches

however

$ powerpc64-linux-gnu-gcc -mcmodel=medium -mno-minimal-toc foo.c
is ok.

Currently both are reported as incompatible.
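
The difference between the two checks can be modelled in a few lines. This is a hypothetical sketch, not the linux64.h code: the enums and function names are invented for the example, and only the -mcmodel=medium case from the examples above is modelled.

```cpp
#include <cassert>

enum class CModel { Small, Medium };

// Unset: no option given; Off: -mno-minimal-toc; On: -mminimal-toc.
enum class MinimalToc { Unset, Off, On };

// Existence-only check: any explicit -m[no-]minimal-toc is treated as a
// conflict with -mcmodel=medium.  This models the current behaviour.
bool
conflict_existence_only (CModel cmodel, MinimalToc toc)
{
  return cmodel == CModel::Medium && toc != MinimalToc::Unset;
}

// Value-aware check: only -mminimal-toc itself conflicts with
// -mcmodel=medium.  This models the proposed behaviour.
bool
conflict_by_value (CModel cmodel, MinimalToc toc)
{
  return cmodel == CModel::Medium && toc == MinimalToc::On;
}
```

With -mcmodel=medium -mno-minimal-toc, the first function reports a conflict and the second does not, which is the case the patch is after.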

--Douglas Rupp, AdaCore



Re: [PATCH v2, rs6000] Add support to enable vmsumudm behind vec_msum builtin.

2020-07-08 Thread Segher Boessenkool
On Wed, Jul 08, 2020 at 10:55:59AM -0500, will schmidt wrote:
> On Tue, 2020-06-30 at 18:39 -0500, Segher Boessenkool wrote:
> > On Tue, Jun 30, 2020 at 12:57:45PM -0500, will schmidt wrote:
> > >   Add support for the vmsumudm instruction and tie it into the
> > > vec_msum
> > >   built-in to support the variants of that built-in using vector
> > >  _int128 parameters.
> > > 2020-06-18  Will Schmidt  
> > > 
> > > * config/rs6000/altivec.h (vec_vmsumudm): New define.
> > > * config/rs6000/altivec.md (UNSPEC_VMSUMUDM): New unspec.
> > > (altivec_vmsumudm): New define_insn.
> > > * config/rs6000/rs6000-builtin.def (altivec_vmsumudm): New
> > > BU_ALTIVEC_3 entry. (vmsumudm): New BU_ALTIVEC_OVERLOAD_3
> > > entry.
> > 
> > No line break before (vmsumudm) please.
> > 
> > > * config/rs6000/rs6000-call.c
> > > (altivec_overloaded_builtins):
> > > Add entries for ALTIVEC_BUILTIN_VMSUMUDM variants of
> > > vec_msum.
> > 
> > The patch is okay for trunk with that (and some int128 selector in
> > the
> > testcases that need one).  Thanks!
> 
> Thanks for the review, etc.
> 
> OK for backports too? (after baking on trunk for a bit).

Yes please.  Do it in time before the RC?  Thanks!


Segher


Re: [PATCH v3] RS6000, add VSX mask manipulation support

2020-07-08 Thread Segher Boessenkool
Hi Carl,

On Tue, Jul 07, 2020 at 04:19:33PM -0700, Carl Love wrote:
> I have fixed the issues you mentioned in version 2. I also rebased the
> patch onto the latest mainline.  This resulted in having to change
> FUTURE to P10 everywhere.  

Yeah, that is painful.  I took the brunt of it, I should know :-/

Please fix everything Will found (as always, thanks Will!)  I don't see
more problems, so fourth time should be the charm?  :-)


Segher


[PATCH] improve validation of attribute arguments (PR c/78666)

2020-07-08 Thread Martin Sebor via Gcc-patches

GCC has gotten better at detecting conflicts between various
attributes but it still doesn't do a perfect job of detecting
similar problems due to mismatches between contradictory
arguments to the same attribute.  For example,

  __attribute ((alloc_size (1))) void* allocate (size_t, size_t);

followed by

  __attribute ((alloc_size (2))) void* allocate (size_t, size_t);

is accepted with the former overriding the latter in calls to
the function.  A similar problem exists with a few other attributes
that take arguments.

The attached change adds a new utility function that checks for
such mismatches and issues warnings.  It also adds calls to it
to detect the problem in attributes alloc_align, alloc_size, and
section.  This isn't meant to be a comprehensive fix but rather
a starting point for one.
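
The comparison rule the new function applies (both first arguments must be equal, and the second arguments must either both be absent or both be present and equal) can be sketched outside of GCC as follows. The AttrArgs struct and attr_args_match are invented for this illustration; the real code works on trees.

```cpp
#include <cassert>

// Simplified stand-in for one or two integral attribute arguments,
// e.g. alloc_size (1) or alloc_size (1, 2).
struct AttrArgs
{
  int first;
  bool has_second;
  int second;  // meaningful only when has_second is true
};

// Returns true when the new arguments do not conflict with the ones
// already applied to an earlier declaration.
bool
attr_args_match (const AttrArgs &newer, const AttrArgs &prev)
{
  if (newer.first != prev.first)
    return false;
  if (newer.has_second != prev.has_second)
    return false;
  return !newer.has_second || newer.second == prev.second;
}
```

For the example above, alloc_size (1) followed by alloc_size (2) fails on the first comparison, which is where the patch would warn.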

Tested on x86_64-linux.

Martin

PS I ran into this again while debugging some unrelated changes
and wondering about the behavior in similar situations to mine.
Since the behavior seemed clearly suboptimal I figured I might
as well fix it.

PPS The improved checking triggers warnings in a few calls to
__builtin_has_attribute due to apparent conflicts.  I've xfailed
those in the test since it's a known issue with some existing
attributes that should be fixed at some point.  Valid uses of
the built-in shouldn't trigger diagnostics except for completely
nonsensical arguments.  Unfortunately, the line between valid
and completely nonsensical is a blurry one (GCC either issues
errors, or -Wattributes, or silently ignores some cases
altogether, such as those that are the subject of this patch)
and there is no internal mechanism to control the response.
Detect conflicts between incompatible uses of the same attribute (PR c/78666).

Resolves:
PR c/78666 - conflicting attribute alloc_size accepted
PR c/96126 - conflicting attribute section accepted on redeclaration

gcc/c-family/ChangeLog:

	PR c/78666
	PR c/96126
	* c-attribs.c (validate_attr_args): New function.
	(validate_attr_arg): Same.
	(handle_section_attribute): Call it.  Introduce a local variable.
	(handle_alloc_size_attribute):  Same.
	(handle_alloc_align_attribute): Same.

gcc/testsuite/ChangeLog:

	PR c/78666
	PR c/96126
	* gcc.dg/attr-alloc_align-5.c: New test.
	* gcc.dg/attr-alloc_size-13.c: New test.
	* gcc.dg/attr-section.c: New test.

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 37214831538..bc4f409e346 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -720,6 +725,124 @@ positional_argument (const_tree fntype, const_tree atname, tree pos,
   return pos;
 }
 
+/* Given a pair of NODEs for arbitrary DECLs or TYPEs, validate one or
+   two integral or string attribute arguments NEWARGS to be applied to
+   NODE[0] for the absence of conflicts with the same attribute arguments
+   already applied to NODE[1]. Issue a warning for conflicts and return
+   false.  Otherwise, when no conflicts are found, return true.  */
+
+static bool
+validate_attr_args (tree node[2], tree name, tree newargs[2])
+{
+  /* First validate the arguments against those already applied to
+ the same declaration (or type).  */
+  tree self[2] = { node[0], node[0] };
+  if (node[0] != node[1] && !validate_attr_args (self, name, newargs))
+return false;
+
+  if (!node[1])
+return true;
+
+  /* Extract the same attribute from the previous declaration or type.  */
+  tree prevattr = NULL_TREE;
+  if (DECL_P (node[1]))
+{
+  prevattr = DECL_ATTRIBUTES (node[1]);
+  if (!prevattr)
+	{
+	  tree type = TREE_TYPE (node[1]);
+	  prevattr = TYPE_ATTRIBUTES (type);
+	}
+}
+  else if (TYPE_P (node[1]))
+prevattr = TYPE_ATTRIBUTES (node[1]);
+
+  const char* const namestr = IDENTIFIER_POINTER (name);
+  prevattr = lookup_attribute (namestr, prevattr);
+  if (!prevattr)
+return true;
+
+  /* Extract one or both attribute arguments.  */
+  tree prevargs[2];
+  prevargs[0] = TREE_VALUE (TREE_VALUE (prevattr));
+  prevargs[1] = TREE_CHAIN (TREE_VALUE (prevattr));
+  if (prevargs[1])
+prevargs[1] = TREE_VALUE (prevargs[1]);
+
+  /* Both arguments must be equal or, for the second pair, neither must
+ be provided to succeed.  */
+  bool arg1eq, arg2eq;
+  if (TREE_CODE (newargs[0]) == INTEGER_CST)
+{
+  arg1eq = tree_int_cst_equal (newargs[0], prevargs[0]);
+  if (newargs[1] && prevargs[1])
+	arg2eq = tree_int_cst_equal (newargs[1], prevargs[1]);
+  else
+	arg2eq = newargs[1] == prevargs[1];
+}
+  else if (TREE_CODE (newargs[0]) == STRING_CST)
+{
+  const char *s0 = TREE_STRING_POINTER (newargs[0]);
+  const char *s1 = TREE_STRING_POINTER (prevargs[0]);
+  arg1eq = strcmp (s0, s1) == 0;
+  if (newargs[1] && prevargs[1])
+	{
+	  s0 = TREE_STRING_POINTER (newargs[1]);
+	  s1 = TREE_STRING_POINTER (prevargs[1]);
+	  arg2eq = strcmp (s0, s1) == 0;
+	}
+  else
+	arg2eq = newargs[1] == prevargs[1];
+}
+  else
+gcc_unreachable ();
+
+  if (arg1eq && arg2eq)
+return 

Re: [PATCH] libgomp: Add OMPD Address Space Information functions.

2020-07-08 Thread y2s1982 . via Gcc-patches
Hello Jakub,

Thank you again for the detailed feedback. I have a few questions.

On Wed, Jul 8, 2020 at 4:42 PM Jakub Jelinek wrote:

> On Wed, Jul 08, 2020 at 03:30:35PM -0400, y2s1982 wrote:
> > +ompd_rc_t
> > +ompd_get_omp_version (ompd_address_space_handle_t *address_space,
> > +   ompd_word_t *omp_version)
> > +{
> > +  if (omp_version == NULL)
> > +return ompd_rc_bad_input;
> > +  if (address_space == NULL)
> > +return ompd_rc_stale_handle;
> > +
> > +  /* _OPENMP macro is defined to have mm integer.  */
> > +  ompd_size_t macro_length = sizeof (int);
> > +
> > +  ompd_rc_t ret = ompd_rc_ok;
> > +
> > +  struct ompd_address_t addr;
> > +  ret = gompd_callbacks.symbol_addr_lookup (address_space->context, NULL,
> > + "openmp_version", &addr, NULL);
>
> This can't be right.  There is no openmp_version variable in libgomp.so.1
> (and I don't think we should add it).
> As I said multiple times before, you should add one (read-only) data
> variable to libgomp.so.1 that will encode a lot of information that OMPD
> needs and the version should be in there.
>

I do remember, though I obviously misunderstood. Sorry about that.
I had assumed it might have something to do with ICVs but didn't realize it
would also apply to other variables. In all honesty, I was looking for the
_OPENMP macro; I assumed such information would be stored somewhere already
and thought symbol_addr_lookup() would find it somehow. I saw mentions of it
in the ChangeLog, the testsuites, and in one string, but I couldn't find the
actual macro. As for openmp_version, I (wrongly) made the assumption from
looking at omp_lib.h.in. I should learn more about the syntax of .in files
and what they do.

To place a variable in libgomp.so.1, should I define a related struct,
declare a global extern variable of that struct type in omp.h, and define it
in some related .c file? Can I then simply use the name of the declared
variable (where "openmp_version" currently is) to find the struct? As for
the value of the _OPENMP version, where can I find it, or should OMPD
maintain its own value for it?


> > +ompd_rc_t
> > +ompd_get_omp_version_string (ompd_address_space_handle_t *address_space,
> > +  const char **string)
> > +{
> > +  if (string == NULL)
> > +return ompd_rc_bad_input;
> > +
> > +  if (address_space == NULL)
> > +return ompd_rc_stale_handle;
> > +
> > +  ompd_word_t omp_version;
> > +  ompd_rc_t ret = ompd_get_omp_version (address_space, &omp_version);
> > +  if (ret != ompd_rc_ok)
> > +return ret;
> > +
> > +  char *tmp = "GNU OpenMP Runtime implementing OpenMP 5.0 "
> > + ompd_stringify (omp_version);
>
> This will append "omp_version" to the string literal, won't it?
> That is not what you want in there, you instead want the value.
>

Oh I see. Thanks for pointing that out. I am still learning how macro
expansion works.

Cheers,

Tony

>
> Jakub
>
>


Re: [PATCH] rs6000: Refine RTL unroll adjust hook

2020-07-08 Thread Segher Boessenkool
On Wed, Jul 08, 2020 at 11:39:56AM +0800, Jiufu Guo wrote:
> Segher Boessenkool writes:
> > I am not happy about what is considered "a complex loop" here.
> With an early exit, the *next* unrolled iterations may not be executed,
> so unrolling may not be beneficial.

Yes, and it can well result in worse branch prediction.

> Too many branches (this patch says 20% of insns) may cause more
> branch misses even in an unrolled loop.
> 
> From my initial intuition, I once thought each condition alone might
> *define* a loop as complex. :-)
> 
> But for each single condition, loop unrolling may still be helpful.
> However, if all these conditions occur in one loop, unrolling is more
> likely to have a negative impact.

Yes, but how many loops have *all* these conditions?  That is my problem
with it: it is only tested with one specific loop, and only benefits
that loop.

> > (A PARAM would be nice, but too many of those isn't actually useful
> > either...  Next time, add one as soon as writing the code, at least it
> > is useful at that point in time, when you still need to experiment with
> > it :-) )
> Yes, with a PARAM, we can just change the param to experiment:
> "if (loop->ninsns <= 10) return 0;" (6 insns to 10 insns) may fit in just
> one cache line, so the loop may not need unrolling.  But after some tuning, I
> chose 6/4, 10/2 here.  4 hardcoded values may be too many for PARAMs.  2 PARAMs
> (one for 6, one for 10) may be ok.  Any comments?

RTL insns are not one-to-one with machine insns.  One important reason
to unroll is for small loops, where we need to hide the latency of a
fetch redirect (say, three cycles); this is complicated by most insns
having a 2 cycle latency (or more for FP and such), and by different
SMT modes changing stuff here, oh and different CPUs of course.

So it is very important to unroll small loops enough, and it can be
beneficial for larger loops too, but it also hurts (in general, and
more for bigger loops, or loops with calls, or jumps).

It's not a science, more an art.  You'll just have to find something
that works well in practice.  But not something that looks at very
special cases only, preferably.

> > So, we use rs6000_complex_loop_p only to prevent all unrolling, never to
> > reduce the unrolling, and only in very specific cases.
> >
> > Is there no middle road possible?  Say, don't unroll to more than 25
> > insns total (which is what the "only small loops" does, sort of -- it
> > also avoids unrolling 3x a bit, yes), and don't unroll to more than 2
> > calls, and not to more than 4 branches (I'm making up those numbers, of
> > course, and PARAMS would be helpful).  Some of this already does exist,
> > and might need retuning for us?
> It may make sense. There are param_max_unrolled_insns,
> param_max_unroll_times, param_max_peel_branches (cunroll)...; we may add
> checks on the number of calls and branches to the RTL unroller.

Hrm yes, that may be generally useful even.
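
As a rough sketch of the "middle road" idea (cap the unroll factor by budgets rather than refusing to unroll at all), something along these lines could work. It is not GCC code: the structs and the function are invented for illustration, and the budget numbers used in the note below are the made-up ones from the quoted text (25 insns, 2 calls, 4 branches).

```cpp
#include <algorithm>
#include <cassert>

// Per-iteration counts for one loop body.
struct LoopStats
{
  unsigned insns;
  unsigned calls;
  unsigned branches;
};

// Totals the unrolled loop must stay within.
struct UnrollBudget
{
  unsigned max_insns;
  unsigned max_calls;
  unsigned max_branches;
  unsigned max_times;
};

// Largest unroll factor that keeps every total within its budget.
// A result of 1 means "leave the loop as it is".
unsigned
capped_unroll_factor (const LoopStats &loop, const UnrollBudget &budget)
{
  unsigned factor = budget.max_times;
  if (loop.insns != 0)
    factor = std::min (factor, budget.max_insns / loop.insns);
  if (loop.calls != 0)
    factor = std::min (factor, budget.max_calls / loop.calls);
  if (loop.branches != 0)
    factor = std::min (factor, budget.max_branches / loop.branches);
  return std::max (factor, 1u);
}
```

With a budget of {25, 2, 4, 8}, a 6-insn loop with one branch is unrolled 4 times, while a 10-insn loop with one call and two branches is still unrolled, but only twice.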

> Actually, though, here we would need a condition to define a *complex* loop,
> one where a call (maybe just 1), branches (maybe 2), and an early
> exit (maybe 1) all occur at the same time, but each number is not large.
> Any suggestions? Thanks.

How many loops have you seen where all those conditions are true, but
the generic code still decides to unroll things?


Segher


Re: [PATCH] rs6000: Split movsf_from_si from high word before reload[PR89310]

2020-07-08 Thread Segher Boessenkool
Hi!

On Wed, Jul 08, 2020 at 11:19:21AM +0800, luoxhu wrote:
> For extracting high part element from DImode register like:
> 
> {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}
> 
> split it before reload with "and mask" to avoid generating shift right
> 32 bit then shift left 32 bit.  This pattern also exists in PR42475 and
> PR67741, etc.

> 2020-07-08  Xionghu Luo  
> 
>   PR rtl-optimization/89310
>   * config/rs6000/rs6000.md (movsf_from_si2): New
>   define_insn_and_split.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-07-08  Xionghu Luo  
> 
>   PR rtl-optimization/89310
>   * gcc.target/powerpc/pr89310.c: New test.

> +;; For extracting high part element from DImode register like:
> +;; {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}
> +;; split it before reload with "and mask" to avoid generating shift right
> +;; 32 bit then shift left 32 bit.
> +(define_insn_and_split "movsf_from_si2"
> +  [(set (match_operand:SF 0 "gpc_reg_operand" "=wa")
> + (unspec:SF [
> +  (subreg:SI (ashiftrt:DI

Put the  (ashiftrt:DI  on a separate line as well?  With indent changes,
etc.

> +(match_operand:DI 1 "input_operand" "r")
> +(const_int 32))
> +   0)]
> +  UNSPEC_SF_FROM_SI))
> + (clobber (match_scratch:DI 2 "=r"))]
> +  "TARGET_NO_SF_SUBREG"
> +  "@
> +  #"

@ with only one alternative doesn't do anything; so just write
  "#"
please.

> +

And no empty line here.

> +  "&& !reload_completed

Why this?  Why will this not work after reload?  In the very few cases
where you do need this, you usually also need to check for
lra_in_progress.

> +   && vsx_reg_sfsubreg_ok (operands[0], SFmode)"
> +  [(const_int 0)]
> +{
> +  if (GET_CODE (operands[2]) == SCRATCH)
> +operands[2] = gen_reg_rtx (DImode);
> +
> +  rtx mask = GEN_INT (HOST_WIDE_INT_M1U << 32);

The mask should be different for QI and HI.

> +  emit_insn (gen_anddi3 (operands[2], operands[1], mask));
> +  emit_insn (gen_p8_mtvsrd_sf (operands[0], operands[2]));
> +  emit_insn (gen_vsx_xscvspdpn_directmove (operands[0], operands[0]));
> +  DONE;
> +}
> +  [(set_attr "length" "12")
> +  (set_attr "type" "vecfloat")
> +  (set_attr "isa" "p8v")])
> +

No extra whiteline please.


Maybe change it back to just SI?  It won't match often at all for QI or
HI anyway, it seems.  Sorry for that detour.  Should be good with the
above nits fixed :-)


Segher


Re: [PATCH] libgomp: Add OMPD Address Space Information functions.

2020-07-08 Thread Jakub Jelinek via Gcc-patches
On Wed, Jul 08, 2020 at 03:30:35PM -0400, y2s1982 wrote:
> +ompd_rc_t
> +ompd_get_omp_version (ompd_address_space_handle_t *address_space,
> +   ompd_word_t *omp_version)
> +{
> +  if (omp_version == NULL)
> +return ompd_rc_bad_input;
> +  if (address_space == NULL)
> +return ompd_rc_stale_handle;
> +
> +  /* _OPENMP macro is defined to have mm integer.  */
> +  ompd_size_t macro_length = sizeof (int);
> +
> +  ompd_rc_t ret = ompd_rc_ok;
> +
> +  struct ompd_address_t addr;
> +  ret = gompd_callbacks.symbol_addr_lookup (address_space->context, NULL,
> + "openmp_version", &addr, NULL);

This can't be right.  There is no openmp_version variable in libgomp.so.1
(and I don't think we should add it).
As I said multiple times before, you should add one (read-only) data
variable to libgomp.so.1 that will encode a lot of information that OMPD
needs and the version should be in there.

> +ompd_rc_t
> +ompd_get_omp_version_string (ompd_address_space_handle_t *address_space,
> +  const char **string)
> +{
> +  if (string == NULL)
> +return ompd_rc_bad_input;
> +
> +  if (address_space == NULL)
> +return ompd_rc_stale_handle;
> +
> +  ompd_word_t omp_version;
> +  ompd_rc_t ret = ompd_get_omp_version (address_space, &omp_version);
> +  if (ret != ompd_rc_ok)
> +return ret;
> +
> +  char *tmp = "GNU OpenMP Runtime implementing OpenMP 5.0 "
> + ompd_stringify (omp_version);

This will append "omp_version" to the string literal, won't it?
That is not what you want in there, you instead want the value.
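
To make the point concrete, here is a small sketch with made-up macro and function names (SKETCH_STR, SKETCH_XSTR and SKETCH_VERSION are not libgomp names). Stringifying a variable yields its name; a two-level macro can stringify a macro's expansion; a value known only at run time has to be formatted instead.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

#define SKETCH_STR(x) #x
#define SKETCH_XSTR(x) SKETCH_STR (x)
#define SKETCH_VERSION 201811  /* illustrative compile-time constant */

// The # operator sees the token omp_version, not the variable's value.
std::string
stringify_variable_name ()
{
  long omp_version = 201811;
  (void) omp_version;
  return "OpenMP " SKETCH_STR (omp_version);
}

// Going through a second macro expands SKETCH_VERSION before stringifying.
std::string
stringify_macro_value ()
{
  return "OpenMP " SKETCH_XSTR (SKETCH_VERSION);
}

// A run-time value has to be formatted, for example with snprintf.
std::string
format_runtime_value (long omp_version)
{
  char buf[64];
  std::snprintf (buf, sizeof buf, "OpenMP %ld", omp_version);
  return buf;
}
```

The first function returns "OpenMP omp_version", which is the problem described above; the other two return "OpenMP 201811".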

Jakub



Re: [PATCH] c++: Improve checking of decls with trailing return type [PR95820]

2020-07-08 Thread Marek Polacek via Gcc-patches
Ping.

On Wed, Jun 24, 2020 at 07:27:14PM -0400, Marek Polacek via Gcc-patches wrote:
> This is an ICE-on-invalid but I've been seeing it when reducing
> various testcases, so it's more important for me than usual.
> 
> splice_late_return_type now checks that if we've seen a late return
> type, the function return type was auto.  That's a fair assumption
> but grokdeclarator/cdk_function wasn't giving errors for function
> pointers and similar.  So we want to perform various checks not only
> when funcdecl_p || inner_declarator == NULL.  But only give the
> !late_return_type errors when funcdecl_p, to accept e.g.
> 
> auto (*fp)() = f;
> 
> in C++11.  Here's a diff -w to ease the review:
> 
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -12102,14 +12102,9 @@ grokdeclarator (const cp_declarator *declarator,
> 
>   /* Handle a late-specified return type.  */
>   tree late_return_type = declarator->u.function.late_return_type;
> - if (funcdecl_p
> - /* This is the case e.g. for
> -using T = auto () -> int.  */
> - || inner_declarator == NULL)
> -   {
>   if (tree auto_node = type_uses_auto (type))
> {
> - if (!late_return_type)
> + if (!late_return_type && funcdecl_p)
> {
>   if (current_class_type
>   && LAMBDA_TYPE_P (current_class_type))
> @@ -12201,7 +12196,6 @@ grokdeclarator (const cp_declarator *declarator,
>   "type specifier", name);
>   return error_mark_node;
> }
> -   }
>   type = splice_late_return_type (type, late_return_type);
>   if (type == error_mark_node)
> return error_mark_node;
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> gcc/cp/ChangeLog:
> 
>   PR c++/95820
>   * decl.c (grokdeclarator) : Check also
>   pointers/references/... to functions.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c++/95820
>   * g++.dg/cpp1y/auto-fn58.C: New test.
> ---
>  gcc/cp/decl.c  | 166 -
>  gcc/testsuite/g++.dg/cpp1y/auto-fn58.C |  13 ++
>  2 files changed, 93 insertions(+), 86 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp1y/auto-fn58.C
> 
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index 3afad5ca805..a9ec328c498 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -12102,106 +12102,100 @@ grokdeclarator (const cp_declarator *declarator,
>  
>   /* Handle a late-specified return type.  */
>   tree late_return_type = declarator->u.function.late_return_type;
> - if (funcdecl_p
> - /* This is the case e.g. for
> -using T = auto () -> int.  */
> - || inner_declarator == NULL)
> + if (tree auto_node = type_uses_auto (type))
> {
> - if (tree auto_node = type_uses_auto (type))
> + if (!late_return_type && funcdecl_p)
> {
> - if (!late_return_type)
> + if (current_class_type
> + && LAMBDA_TYPE_P (current_class_type))
> +   /* OK for C++11 lambdas.  */;
> + else if (cxx_dialect < cxx14)
> {
> - if (current_class_type
> - && LAMBDA_TYPE_P (current_class_type))
> -   /* OK for C++11 lambdas.  */;
> - else if (cxx_dialect < cxx14)
> -   {
> - error_at (typespec_loc, "%qs function uses "
> -   "%<auto%> type specifier without "
> -   "trailing return type", name);
> - inform (typespec_loc,
> - "deduced return type only available "
> - "with %<-std=c++14%> or %<-std=gnu++14%>");
> -   }
> - else if (virtualp)
> -   {
> - error_at (typespec_loc, "virtual function "
> -   "cannot have deduced return type");
> - virtualp = false;
> -   }
> + error_at (typespec_loc, "%qs function uses "
> +   "%<auto%> type specifier without "
> +   "trailing return type", name);
> + inform (typespec_loc,
> + "deduced return type only available "
> + "with %<-std=c++14%> or %<-std=gnu++14%>");
> }
> - else if (!is_auto (type) && sfk != sfk_conversion)
> + else if (virtualp)
> {
> - error_at (typespec_loc, "%qs function with trailing "
> -   "return type has %qT as its 

Re: [PATCH v2] c++: Make convert_like complain about bad ck_ref_bind again [PR95789]

2020-07-08 Thread Marek Polacek via Gcc-patches
On Fri, Jul 03, 2020 at 05:24:34PM -0400, Jason Merrill via Gcc-patches wrote:
> On 6/22/20 10:09 PM, Marek Polacek wrote:
> > convert_like issues errors about bad_p conversions at the beginning
> > of the function, but in the ck_ref_bind case, it only issues them
> > after we've called convert_like on the next conversion.
> > 
> > This doesn't work as expected since r10-7096 because when we see
> > a conversion from/to class type in a template, we return early, thereby
> > missing the error, and a bad_p conversion goes by undetected.  That
> > made the attached test compile even though it should not.
> 
> Hmm, why isn't there an error at instantiation time?

We threw away the result: we're called from

12213   if (complain & tf_error)
12214 {
12215   if (conv)
12216 convert_like (conv, expr, complain);
...
12228   return error_mark_node;

and convert_like never saw converting this->f to B& again when instantiating.

> Though giving an error at template parsing time is definitely preferable.

Yup.

> > I had thought that I could just move the ck_ref_bind/bad_p errors
> > above to the rest of them, but that regressed diagnostics because
> > expr then wasn't converted yet by the nested convert_like_real call.
> 
> Yeah, the early section is really just for scalar conversions.
> 
> It would probably be good to do normal processing for all other bad
> conversions and only afterward build the IMPLICIT_CONV_EXPR if we aren't
> returning error_mark_node.

Ok, so that if we add more bad_p errors, we won't run into this again.

Unfortunately it's a bit ugly.  I could introduce a RETURN macro to
use RETURN (expr); instead of what I have now, but it wouldn't be simply
"conv_expr ? conv_expr : expr", because if we have error_mark_node, we
want to return that, not conv_expr.  Does that seem worth it?
(I wish I could at least use the op0 ?: op1 GNU extension.)
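
A minimal sketch of the selection rule described above, using hypothetical stand-in types rather than GCC's real tree and error_mark_node: the error marker wins, otherwise the IMPLICIT_CONV_EXPR stand-in is preferred when one was built.

```c
#include <stddef.h>

/* Hypothetical stand-in for a tree node; not GCC's real type.  */
struct sketch_node { int code; };

/* Hypothetical stand-in for error_mark_node.  */
struct sketch_node sketch_error_mark = { -1 };

/* Picks the value to return: the error marker if EXPR is the error
   marker, else CONV_EXPR when it is non-null, else EXPR itself.  */
struct sketch_node *
sketch_pick_result (struct sketch_node *expr, struct sketch_node *conv_expr)
{
  if (expr == &sketch_error_mark)
    return expr;
  return conv_expr ? conv_expr : expr;
}
```

This is why a plain "conv_expr ? conv_expr : expr" is not enough on its own: the error case has to be checked first.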

I've also added another test.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
convert_like issues errors about bad_p conversions at the beginning
of the function, but in the ck_ref_bind case, it only issues them
after we've called convert_like on the next conversion.

This doesn't work as expected since r10-7096 because when we see
a conversion from/to class type in a template, we return early, thereby
missing the error, and a bad_p conversion goes by undetected.  That
made the attached test compile even though it should not.

I had thought that I could just move the ck_ref_bind/bad_p errors
above to the rest of them, but that regressed diagnostics because
expr then wasn't converted yet by the nested convert_like_real call.

So, for bad_p conversions, do the normal processing, but still return
the IMPLICIT_CONV_EXPR to avoid introducing trees that the template
processing can't handle well.

gcc/cp/ChangeLog:

PR c++/95789
* call.c (convert_like_real): Do the normal processing for
conversions that are bad_p.  Return the IMPLICIT_CONV_EXPR
instead of EXPR if we're processing a bad_p conversion in
a template.

gcc/testsuite/ChangeLog:

PR c++/95789
* g++.dg/conversion/ref4.C: New test.
* g++.dg/conversion/ref5.C: New test.
---
 gcc/cp/call.c  | 47 +++---
 gcc/testsuite/g++.dg/conversion/ref4.C | 22 
 gcc/testsuite/g++.dg/conversion/ref5.C | 14 
 3 files changed, 64 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/conversion/ref4.C
 create mode 100644 gcc/testsuite/g++.dg/conversion/ref5.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 5341a572980..65565fc90a8 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -7400,12 +7400,19 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum,
  so use an IMPLICIT_CONV_EXPR for this conversion.  We would have
  created such codes e.g. when calling a user-defined conversion
  function.  */
+  tree conv_expr = NULL_TREE;
   if (processing_template_decl
   && convs->kind != ck_identity
    && (CLASS_TYPE_P (totype) || CLASS_TYPE_P (TREE_TYPE (expr))))
 {
-  expr = build1 (IMPLICIT_CONV_EXPR, totype, expr);
-  return convs->kind == ck_ref_bind ? expr : convert_from_reference (expr);
+  conv_expr = build1 (IMPLICIT_CONV_EXPR, totype, expr);
+  if (convs->kind != ck_ref_bind)
+   conv_expr = convert_from_reference (conv_expr);
+  if (!convs->bad_p)
+   return conv_expr;
+  /* Do the normal processing to give the bad_p errors.  But we still
+need to return the IMPLICIT_CONV_EXPR, unless we're returning
+error_mark_node.  */
 }
 
   switch (convs->kind)
@@ -7465,7 +7472,7 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum,
TARGET_EXPR_LIST_INIT_P (expr) = true;
TARGET_EXPR_DIRECT_INIT_P (expr) = direct;
  }
-   return expr;
+   return conv_expr ? 

Re: [PATCH ver 4] RS6000, add VSX mask manipulation support

2020-07-08 Thread Carl Love via Gcc-patches
Will:
> 
> > 
> > @@ -5701,3 +5716,55 @@
> >"TARGET_POWER10"
> >" %x0,%x1"
> >[(set_attr "type" "vecfloat")])
> > +
> > +;; VSX mask manipulation instructions
> > +;;;(define_expand "vec_mtvsrbm"
> > +;;;  [(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
> > +;;;(unspec:V16QI [(match_operand:DI 1 "gpc_reg_operand"
> > "b")]
> > +;;;   UNSPEC_MTVSBM))]
> > +;;;   "TARGET_POWER10"
> > +;;; {
> > +;;;emit_insn (gen_vec_mtvsr_v16qi (operands[0], operands[1]));
> > +;;;DONE;
> > +;;;})
> > +
> 
> Still there.

Sorry, didn't attach the correct updated patch.

--

[PATCH] RS6000, add VSX mask manipulation support

Version 4
  vec_mtvsrbm was commented out in ver 3.  Forgot to go back and actually
remove it.  I was supposed to after testing. It is no longer needed with
the removal of vec_mtvsrbm_mtvsrbmi. Removed it from ChangeLog.
  Clarification, vec_mtvsrbm_mtvsrbmi was removed in version 3.  Updated the
code to use vec_mtvsr_v16qi instead.  Hopefully that clarifies Will's
comment about "Reworked define_expand vec_mtvsrbm_mtvsrbmi" in version 3.
  Fixed ChangeLog, replaced the FUTURE with P10 that I missed previously.

---
version 3
  rebased onto mainline 7/7/2020
  Change FUTURE to P10 in code and ChangeLog.
  ChangeLog, fixed the name of a couple of files which were wrong.
  Reformatted define_mode_attr VSX_MM_SUFFIX definition to shorten the line.
  Reworked define_expand "vec_mtvsrbm_mtvsrbmi" as it will not work as
intended.
  Changed vsx_register_operand to altivec_register_operand for "v"
constraint.
  Removed --save-temps from test cases as it is not needed.
  Reran regression testing and ran test cases manually on mambo.

---
version 2

Addressed Will's comments
  - ChangeLog: fixed name/symbol order;
changed reference from rs6000-c.c to rs6000-builtin.def.

  - define_expand "vec_mtvsrbm": changed name to vec_mtvsrbm_mtvsrbmi,
updated comment

  - vsx_mask-runnable.c: divided it up into four smaller test cases,
vsx_mask-count-runnable.c, vsx_mask-expane-runnable.c,
vsx_mask-extract-runnable.c, vsx_mask-move-runnable.c.

---
RS6000 RFC 2629, add VSX mask manipulation support

The following patch adds support for builtins vec_genbm(),  vec_genhm(),
vec_genwm(), vec_gendm(), vec_genqm(), vec_cntm(), vec_expandm(),
vec_extractm().  Support for instructions mtvsrbm, mtvsrhm, mtvsrwm,
mtvsrdm, mtvsrqm, cntm, vexpandm, vextractm.
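
As a rough illustration of what the byte variants of these operations compute, here is a portable sketch that models a vector as 16 bytes. All sketch_* names are hypothetical, and mapping element i to mask bit i is an assumption of this sketch; the real instructions define their own element and bit numbering, which this does not try to reproduce.

```c
#include <stdint.h>

#define SKETCH_NBYTES 16

/* Mask-to-vector move: bit i of MASK selects all ones or all zeros
   for byte i (assumed numbering, see the note above).  */
void
sketch_genbm (uint16_t mask, uint8_t out[SKETCH_NBYTES])
{
  for (int i = 0; i < SKETCH_NBYTES; i++)
    out[i] = ((mask >> i) & 1u) ? 0xFF : 0x00;
}

/* Expand: replicate the most significant bit of each byte across it.  */
void
sketch_expandm (const uint8_t in[SKETCH_NBYTES], uint8_t out[SKETCH_NBYTES])
{
  for (int i = 0; i < SKETCH_NBYTES; i++)
    out[i] = (in[i] & 0x80) ? 0xFF : 0x00;
}

/* Extract: gather the most significant bit of each byte into a mask.  */
uint16_t
sketch_extractm (const uint8_t in[SKETCH_NBYTES])
{
  uint16_t mask = 0;
  for (int i = 0; i < SKETCH_NBYTES; i++)
    if (in[i] & 0x80)
      mask |= (uint16_t) (1u << i);
  return mask;
}

/* Count: number of bytes whose most significant bit equals MP (0 or 1).  */
unsigned
sketch_cntm (const uint8_t in[SKETCH_NBYTES], unsigned mp)
{
  unsigned count = 0;
  for (int i = 0; i < SKETCH_NBYTES; i++)
    if (((in[i] >> 7) & 1u) == (mp & 1u))
      count++;
  return count;
}
```

With this model, generating a vector from a mask and extracting the mask again gives back the original mask.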

The patch has been tested on:

  powerpc64le-unknown-linux-gnu (Power 9 LE)

and mambo with no regression errors.

Please let me know if this patch is acceptable for inclusion in the mainline
branch.  Thanks.

   Carl Love
---

RS6000, add VSX mask manipulation support

gcc/ChangeLog

2020-07-07  Carl Love  

* config/rs6000/vsx.md  (VSX_MM): New define_mode_iterator.
(VSX_MM4): New define_mode_iterator.
(VSX_MM_SUFFIX4): New define_mode_attr.
(vec_mtvsrbmi): New define_insn.
(vec_mtvsr_): New define_insn.
(vec_cntmb_): New define_insn.
(vec_extract_): New define_insn.
(vec_expand_): New define_insn.
(define_c_enum unspec): Add entries UNSPEC_MTVSBM, UNSPEC_VCNTMB,
UNSPEC_VEXTRACT, UNSPEC_VEXPAND.
* config/rs6000/altivec.h ( vec_genbm, vec_genhm, vec_genwm,
vec_gendm, vec_genqm, vec_cntm, vec_expandm, vec_extractm): Add
defines.
* config/rs6000/rs6000-builtin.def: Add defines BU_P10_2, BU_P10_1.
(BU_P10_1): Add definitions for mtvsrbm, mtvsrhm, mtvsrwm,
mtvsrdm, mtvsrqm, vexpandmb, vexpandmh, vexpandmw, vexpandmd,
vexpandmq, vextractmb, vextractmh, vextractmw, vextractmd, vextractmq.
(BU_P10_2): Add definitions for cntmbb, cntmbh, cntmbw, cntmbd.
(BU_P10_OVERLOAD_1): Add definitions for mtvsrbm, mtvsrhm,
mtvsrwm, mtvsrdm, mtvsrqm, vexpandm, vextractm.
(BU_P10_OVERLOAD_2): Add definition for cntm.
* config/rs6000/rs6000-call.c (rs6000_expand_binop_builtin): Add
checks for CODE_FOR_vec_cntmbb_v16qi, CODE_FOR_vec_cntmb_v8hi,
CODE_FOR_vec_cntmb_v4si, CODE_FOR_vec_cntmb_v2di.
(altivec_overloaded_builtins): Add overloaded argument entries for
P10_BUILTIN_VEC_MTVSRBM, P10_BUILTIN_VEC_MTVSRHM,
P10_BUILTIN_VEC_MTVSRWM, P10_BUILTIN_VEC_MTVSRDM,
P10_BUILTIN_VEC_MTVSRQM, P10_BUILTIN_VEC_VCNTMBB,
P10_BUILTIN_VCNTMBB, P10_BUILTIN_VCNTMBH,
P10_BUILTIN_VCNTMBW, P10_BUILTIN_VCNTMBD,
P10_BUILTIN_VEXPANDMB, P10_BUILTIN_VEXPANDMH,
P10_BUILTIN_VEXPANDMW, P10_BUILTIN_VEXPANDMD,
P10_BUILTIN_VEXPANDMQ, P10_BUILTIN_VEXTRACTMB,
P10_BUILTIN_VEXTRACTMH, P10_BUILTIN_VEXTRACTMW,
  

Re: [PATCH 0/6 ver 4] ] Permute Class Operations

2020-07-08 Thread Carl Love via Gcc-patches
[PATCH 5/6] rs6000, Add vector splat builtin support

--
V4 Fixes:

   Rebased on mainline.  Changed FUTURE to P10.
   define_predicate "s32bit_cint_operand" removed unnecessary cast in
 definition.
   Changed define_expand "xxsplti32dx_v4si" to use "0" for constraint
 of operand 1.
   Changed define_insn "xxsplti32dx_v4si_inst" to use "0" for constraint
 of operand 1.
   Removed define_predicate "f32bit_const_operand".  Use const_double_operand
 instead.

   *** Please provide feedback for the following change:
   define_insn "xxspltidp_v2df_inst": added a print statement to warn of
   possible undefined behavior.  The xxspltidp instruction result is
   undefined for subnormal inputs.  I added a test for subnormal input with
   a fprintf to stderr to warn the "user" if the constant input is a subnormal
   value.  I tried assert initially, but that causes GCC to exit ungracefully
   with no information as to why.  I really didn't like that behavior.
   A subnormal input is not really a fatal error but the "user" needs
   to be told it is not a good idea.  Not sure if using an fprintf statement
   in a define_insn is an acceptable thing either.  But it does give the
   user the needed input and GCC exits normally.  Let me know if there
   is a better option here.
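
For reference, the subnormal test itself can be sketched in standalone C as below. This is only an illustration under the assumption that the constant is available as a float; it is not the code in the patch, and the sketch_* names are hypothetical.

```c
#include <math.h>
#include <stdio.h>

/* Returns 1 if VALUE is a subnormal single-precision number, else 0.  */
int
sketch_is_subnormal_f32 (float value)
{
  return fpclassify (value) == FP_SUBNORMAL;
}

/* Prints a warning-style message on stderr for subnormal inputs and
   keeps going.  Returns 1 if a message was printed, else 0.  */
int
sketch_warn_if_subnormal (float value)
{
  if (!sketch_is_subnormal_f32 (value))
    return 0;
  fprintf (stderr, "warning: constant %a is subnormal; "
	   "the result is undefined\n", (double) value);
  return 1;
}
```

Zero is not classified as subnormal here, so only nonzero values below the smallest normal float trigger the message.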

v3 fixes:
   Minor cleanup in the ChangeLog description.

-
v2 fixes:

  change log fixes
gcc/config/rs6000/altivec changed name of define_insn and define_expand
for vxxspltiw... to xxspltiw...   Fixed spaces in gen_xxsplti32dx_v4sf_inst 
(operands[0], GEN_INT

gcc/rs6000-builtin.def propagated name changes above where they are used.

Updated the s32bit_cint_operand, c32bit_cint_operand and
f32bit_const_operand predicate definitions.

Changed name of rs6000_constF32toI32 to rs6000_const_f32_to_i32, propagated
name change as needed.  Replaced if test with gcc_assert().

Fixed description of vec_splatid() in documentation.
---

GCC maintainers:

The following patch adds support for the vec_splati, vec_splatid and
vec_splati_ins builtins.

This patch adds support for instructions that take a 32-bit immediate
value that represents a floating point value.  This support adds new
predicates and a support function to properly handle the immediate value.
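
A minimal sketch of the kind of conversion involved, assuming float is IEEE 754 binary32: the floating point constant is reinterpreted as the 32-bit pattern that would go into the immediate field. The sketch_* helpers are hypothetical and are not the support function added by the patch.

```c
#include <stdint.h>
#include <string.h>

/* This sketch assumes a 32-bit float representation.  */
_Static_assert (sizeof (float) == sizeof (uint32_t),
		"float must be 32 bits wide");

/* Returns the bit pattern of VALUE without changing any bits.  */
uint32_t
sketch_f32_to_bits (float value)
{
  uint32_t bits;
  memcpy (&bits, &value, sizeof bits);
  return bits;
}

/* Inverse of the above: rebuilds the float from its bit pattern.  */
float
sketch_bits_to_f32 (uint32_t bits)
{
  float value;
  memcpy (&value, &bits, sizeof value);
  return value;
}
```

memcpy is used rather than a pointer cast so the reinterpretation stays within the aliasing rules.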

The patch has been compiled and tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regression errors.

The test case was compiled on a Power 9 system and then tested on
Mambo.

Please let me know if this patch is acceptable for the mainline
branch.  Thanks.

 Carl Love

gcc/ChangeLog

2020-07-06  Carl Love  

* config/rs6000/altivec.h (vec_splati, vec_splatid, vec_splati_ins):
Add defines.
* config/rs6000/altivec.md (UNSPEC_XXSPLTIW, UNSPEC_XXSPLTID,
UNSPEC_XXSPLTI32DX): New.
(vxxspltiw_v4si, vxxspltiw_v4sf_inst, vxxspltidp_v2df_inst,
vxxsplti32dx_v4si_inst, vxxsplti32dx_v4sf_inst): New define_insn.
(vxxspltiw_v4sf, vxxspltidp_v2df, vxxsplti32dx_v4si,
vxxsplti32dx_v4sf): New define_expands.
* config/rs6000/predicates.md (u1bit_cint_operand,
s32bit_cint_operand, c32bit_cint_operand): New predicates.
* config/rs6000/rs6000-builtin.def (VXXSPLTIW_V4SI, VXXSPLTIW_V4SF,
VXXSPLTID): New definitions.
(VXXSPLTI32DX_V4SI, VXXSPLTI32DX_V4SF): New BU_P10V_3
definitions.
(XXSPLTIW, XXSPLTID): New definitions.
(XXSPLTI32DX): Add definitions.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_XXSPLTIW,
P10_BUILTIN_VEC_XXSPLTID, P10_BUILTIN_VEC_XXSPLTI32DX):
New definitions.
* config/rs6000/rs6000-protos.h (rs6000_constF32toI32): New extern
declaration.
* config/rs6000/rs6000.c (rs6000_constF32toI32): New function.
* doc/extend.texi: Add documentation for vec_splati,
vec_splatid, and vec_splati_ins.

gcc/testsuite/ChangeLog

2020-07-06  Carl Love  

* gcc.target/powerpc/vec-splati-runnable.c: New test.
---
 gcc/config/rs6000/altivec.h   |   3 +
 gcc/config/rs6000/altivec.md  | 116 ++
 gcc/config/rs6000/predicates.md   |  15 ++
 gcc/config/rs6000/rs6000-builtin.def  |  12 ++
 gcc/config/rs6000/rs6000-call.c   |  19 +++
 gcc/config/rs6000/rs6000-protos.h |   1 +
 gcc/config/rs6000/rs6000.c|  11 ++
 gcc/doc/extend.texi   |  35 +
 .../gcc.target/powerpc/vec-splati-runnable.c  | 145 ++
 9 files changed, 357 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index c202fcf25da..126409c168b 100644
--- 

Re: [PATCH 0/6 ver 4] ] Permute Class Operations

2020-07-08 Thread Carl Love via Gcc-patches
[PATCH 6/6] rs6000 Add vector blend, permute builtin support

--
V4 Fixes:

   Rebased on mainline.  Changed FUTURE to P10.
-

v3 fixes:
   Replace spaces with tabs in ChangeLog description.
   Fix implementation comments for define_expand "xxpermx" in file
 gcc/config/rs6000/altivec.md.
   Fix minor typos in the comments for the changes in 
gcc/config/rs6000/rs6000-call.c.


v2 changes:

   Updated ChangeLog per comments.

   Updated implementation of the define_expand "xxpermx".

   Fixed the comments and check for 3-bit immediate field for the
CODE_FOR_xxpermx check.

   gcc/doc/extend.texi:
comment "Maybe it should say it is related to vsel/xxsel, but per
bigger element?", added comment.  I took the description directly
from spec.  Don't really want to mess with the approved
description.

   fixed typo for Vector Permute Extendedextracth

--

GCC maintainers:

The following patch adds support for the vec_blendv and vec_permx
builtins.

The patch has been compiled and tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regression errors.

The test cases were compiled on a Power 9 system and then tested on
Mambo.

 Carl Love

---
rs6000 RFC2609 vector blend, permute instructions

gcc/ChangeLog

2020-07-06  Carl Love  

* config/rs6000/altivec.h (vec_blendv, vec_permx): Add define.
* config/rs6000/altivec.md (UNSPEC_XXBLEND, UNSPEC_XXPERMX): New
unspecs.
(VM3): New define_mode.
(VM3_char): New define_attr.
(xxblend_ mode VM3): New define_insn.
(xxpermx): New define_expand.
(xxpermx_inst): New define_insn.
* config/rs6000/rs6000-builtin.def (VXXBLEND_V16QI, VXXBLEND_V8HI,
VXXBLEND_V4SI, VXXBLEND_V2DI, VXXBLEND_V4SF, VXXBLEND_V2DF): New
BU_P10V_3 definitions.
(XXBLENDBU_P10_OVERLOAD_3): New BU_P10_OVERLOAD_3 definition.
(XXPERMX): New BU_P10_OVERLOAD_4 definition.
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
(P10_BUILTIN_VXXPERMX): Add if case support.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VXXBLEND_V16QI,
P10_BUILTIN_VXXBLEND_V8HI, P10_BUILTIN_VXXBLEND_V4SI,
P10_BUILTIN_VXXBLEND_V2DI, P10_BUILTIN_VXXBLEND_V4SF,
P10_BUILTIN_VXXBLEND_V2DF, P10_BUILTIN_VXXPERMX): Define
overloaded arguments.
(rs6000_expand_quaternop_builtin): Add if case for CODE_FOR_xxpermx.
(builtin_quaternary_function_type): Add v16uqi_type and xxpermx_type
variables, add case statement for P10_BUILTIN_VXXPERMX.
(builtin_function_type)[P10_BUILTIN_VXXBLEND_V16QI,
P10_BUILTIN_VXXBLEND_V8HI, P10_BUILTIN_VXXBLEND_V4SI,
P10_BUILTIN_VXXBLEND_V2DI]: Add case statements.
* doc/extend.texi: Add documentation for vec_blendv and vec_permx.

gcc/testsuite/ChangeLog

2020-07-06  Carl Love  
gcc.target/powerpc/vec-blend-runnable.c: New test.
gcc.target/powerpc/vec-permute-ext-runnable.c: New test.
---
 gcc/config/rs6000/altivec.h   |   2 +
 gcc/config/rs6000/altivec.md  |  71 +
 gcc/config/rs6000/rs6000-builtin.def  |  13 +
 gcc/config/rs6000/rs6000-c.c  |  27 +-
 gcc/config/rs6000/rs6000-call.c   |  95 ++
 gcc/doc/extend.texi   |  63 
 .../gcc.target/powerpc/vec-blend-runnable.c   | 276 
 .../powerpc/vec-permute-ext-runnable.c| 294 ++
 8 files changed, 835 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 126409c168b..e8fdeb31b0b 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -708,6 +708,8 @@ __altivec_scalar_pred(vec_any_nle,
 #define vec_splati(a)  __builtin_vec_xxspltiw (a)
 #define vec_splatid(a) __builtin_vec_xxspltid (a)
 #define vec_splati_ins(a, b, c)__builtin_vec_xxsplti32dx (a, b, c)
+#define vec_blendv(a, b, c)__builtin_vec_xxblend (a, b, c)
+#define vec_permx(a, b, c, d)  __builtin_vec_xxpermx (a, b, c, d)
 
 #define vec_gnb(a, b)  __builtin_vec_gnb (a, b)
 #define vec_clrl(a, b) __builtin_vec_clrl (a, b)
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index f6858b5bf2a..226cf121f12 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -177,6 +177,8 @@
UNSPEC_XXSPLTIW
UNSPEC_XXSPLTID
UNSPEC_XXSPLTI32DX
+   UNSPEC_XXBLEND
+   UNSPEC_XXPERMX
 ])
 
 (define_c_enum "unspecv"
@@ -219,6 +221,21 @@
   (KF "FLOAT128_VECTOR_P (KFmode)")
   (TF "FLOAT128_VECTOR_P (TFmode)")])
 
+;; 

Re: [PATCH ver 4] RS6000, add VSX mask manipulation support

2020-07-08 Thread will schmidt via Gcc-patches
On Wed, 2020-07-08 at 09:22 -0700, Carl Love wrote:
> Will, Segher:
> 
> I fixed up the patch based on Will's comments.  I thought I had made
> and committed the fixes that Will caught, but no   Sorry about
> that.  I will get this right yet.

> 
> Carl Love
> ---
> 
> Version 4
>   vec_mtvsrbm was commented out in ver 3.  Forgot to go back and actually
> remove it.  I was supposed to after testing. It is no longer needed with
> the removal of vec_mtvsrbm_mtvsrbmi. Removed it from ChangeLog.

>   Clarification, vec_mtvsrbm_mtvsrbmi was removed in version 3.  Updated the
> code to use vec_mtvsr_v16qi instead.  Hopefully that clarifies Will's
> comment about "Reworked define_expand vec_mtvsrbm_mtvsrbmi" in version 3.
>   Fixed ChangeLog, replaced the FUTURE with P10 that I missed previously.

Don't need to changelog the changelog changes so much, but I'd keep
this part simple, leave the larger story in a paragraph elsewhere if
it's needed.  And.. though interesting, we don't always need the
background, just the contents and a call-out for things that changed
wrt previous versions.
This can be simplified to something like
"removed vec_mtvsrbm reference from changelog (v3 change)".
"Updated references to FUTURE_* to match the current P10_*
implementation.  "

> 
> ---
> version 3
>   rebased onto mainline 7/7/2020
>   Change FUTURE to P10 in code and ChangeLog.
>   ChangeLog, fixed the name of a couple of files which were wrong.
>   Reformatted define_mode_attr VSX_MM_SUFFIX definition to shorten the line.
>   Reworked define_expand "vec_mtvsrbm_mtvsrbmi" as it will not work as
> intended.
>   Changed vsx_register_operand to altivec_register_operand for "v"
> constraint.
>   Removed --save-temps from test cases as it is not needed.
>   Reran regression testing and ran test cases manually on mambo.
> 
> ---
> version 2
> 
> Addressed Will's comments
>   - ChangeLog: fixed name/symbol order;
> changed reference from rs6000-c.c to rs6000-builtin.def.
> 
>   - define_expand "vec_mtvsrbm": changed name to vec_mtvsrbm_mtvsrbmi,
> updated comment
> 
>   - vsx_mask-runnable.c: divided it up into four smaller test cases,
> vsx_mask-count-runnable.c, vsx_mask-expane-runnable.c,
> vsx_mask-extract-runnable.c, vsx_mask-move-runnable.c.
> 
> ---
> RS6000 RFC 2629, add VSX mask manipulation support
> 
> The following patch adds support for builtins vec_genbm(),  vec_genhm(),
> vec_genwm(), vec_gendm(), vec_genqm(), vec_cntm(), vec_expandm(),
> vec_extractm().  Support for instructions mtvsrbm, mtvsrhm, mtvsrwm,
> mtvsrdm, mtvsrqm, cntm, vexpandm, vextractm.
> 
> The patch has been tested on:
> 
>   powerpc64le-unknown-linux-gnu (Power 9 LE)
> 
> and mambo with no regression errors.
> 
> Please let me know if this patch is acceptable for inclusion in the mainline
> branch.  Thanks.
> 
>Carl Love
> ---
> 
> RS6000, add VSX mask manipulation support
> 
> gcc/ChangeLog
> 
> 2020-07-07  Carl Love  
> 
>   * config/rs6000/vsx.md  (VSX_MM): New define_mode_iterator.
>   (VSX_MM4): New define_mode_iterator.
>   (VSX_MM_SUFFIX4): New define_mode_attr.
>   (vec_mtvsrbmi): New define_insn.
>   (vec_mtvsr_): New define_insn.
>   (vec_cntmb_): New define_insn.
>   (vec_extract_): New define_insn.
>   (vec_expand_): New define_insn.
>   (define_c_enum unspec): Add entries UNSPEC_MTVSBM, UNSPEC_VCNTMB,
>   UNSPEC_VEXTRACT, UNSPEC_VEXPAND.
>   * config/rs6000/altivec.h ( vec_genbm, vec_genhm, vec_genwm,
>   vec_gendm, vec_genqm, vec_cntm, vec_expandm, vec_extractm): Add
>   defines.
>   * config/rs6000/rs6000-builtin.def: Add defines BU_P10_2, BU_P10_1.
>   (BU_P10_1): Add definitions for mtvsrbm, mtvsrhm, mtvsrwm,
>   mtvsrdm, mtvsrqm, vexpandmb, vexpandmh, vexpandmw, vexpandmd,
>   vexpandmq, vextractmb, vextractmh, vextractmw, vextractmd, vextractmq.
>   (BU_P10_2): Add definitions for cntmbb, cntmbh, cntmbw, cntmbd.
>   (BU_P10_OVERLOAD_1): Add definitions for mtvsrbm, mtvsrhm,
>   mtvsrwm, mtvsrdm, mtvsrqm, vexpandm, vextractm.
>   (BU_P10_OVERLOAD_2): Add definition for cntm.
>   * config/rs6000/rs6000-call.c (rs6000_expand_binop_builtin): Add
>   checks for CODE_FOR_vec_cntmbb_v16qi, CODE_FOR_vec_cntmb_v8hi,
>   CODE_FOR_vec_cntmb_v4si, CODE_FOR_vec_cntmb_v2di.
>   (altivec_overloaded_builtins): Add overloaded argument entries for
>   P10_BUILTIN_VEC_MTVSRBM, P10_BUILTIN_VEC_MTVSRHM,
>   P10_BUILTIN_VEC_MTVSRWM, P10_BUILTIN_VEC_MTVSRDM,
>   P10_BUILTIN_VEC_MTVSRQM, P10_BUILTIN_VEC_VCNTMBB,
>   P10_BUILTIN_VCNTMBB, P10_BUILTIN_VCNTMBH,
>   P10_BUILTIN_VCNTMBW, P10_BUILTIN_VCNTMBD,
>   

Re: [PATCH 0/6 ver 4] ] Permute Class Operations

2020-07-08 Thread Carl Love via Gcc-patches
[PATCH 4/6] rs6000, Add vector shift double builtin support

--
V4 Fixes:

   Rebased on mainline.  Changed FUTURE to P10.
   Changed SLDB_LR to SLDB_lr
   Changed error ("argument 3 must be in the range 0 to 7"); to
   error ("argument 3 must be a constant in the range 0 to 7");

-
V3 Fixes
Replace spaces with tabs in ChangeLog.
Minor edits to ChangeLog entry.
Minor edits to vec_sldb description in gcc/doc/extend.texi.


v2 fixes:

 change logs redone

  gcc/config/rs6000/rs6000-call.c - added spaces before parenthesis around args.

-
GCC maintainers:

The following patch adds support for the vector shift double builtins.
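
As a rough model of what a shift-double operation computes, here is a sketch that treats each 128-bit operand as two 64-bit halves. It is based on my reading of the shift-double-by-bit definitions (concatenate the two sources, shift by 0 to 7 bits, keep 128 bits) and should be taken as an assumption rather than a statement of what the builtins do for every element type. The sketch_* names are hypothetical.

```c
#include <stdint.h>

/* A 128-bit value as two halves; hi holds the most significant bits.  */
struct sketch_u128 { uint64_t hi, lo; };

/* Left variant: the upper 128 bits of (A:B) shifted left by SH bits.
   Returns 0 on success, -1 if SH is outside the range 0 to 7.  */
int
sketch_sldb (struct sketch_u128 a, struct sketch_u128 b, unsigned sh,
	     struct sketch_u128 *out)
{
  if (sh > 7)
    return -1;
  if (sh == 0)
    {
      *out = a;			/* avoid shifting by 64 below */
      return 0;
    }
  out->hi = (a.hi << sh) | (a.lo >> (64 - sh));
  out->lo = (a.lo << sh) | (b.hi >> (64 - sh));
  return 0;
}

/* Right variant: the lower 128 bits of (A:B) shifted right by SH bits.
   Returns 0 on success, -1 if SH is outside the range 0 to 7.  */
int
sketch_srdb (struct sketch_u128 a, struct sketch_u128 b, unsigned sh,
	     struct sketch_u128 *out)
{
  if (sh > 7)
    return -1;
  if (sh == 0)
    {
      *out = b;			/* avoid shifting by 64 below */
      return 0;
    }
  out->hi = (b.hi >> sh) | (a.lo << (64 - sh));
  out->lo = (b.lo >> sh) | (b.hi << (64 - sh));
  return 0;
}
```

The range check mirrors the "constant in the range 0 to 7" restriction on the third argument.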

The patch has been compiled and tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

and Mambo with no regression errors.

Please let me know if this patch is acceptable for the mainline branch.

Thanks.

 Carl Love

---

gcc/ChangeLog

2020-07-06  Carl Love  

* config/rs6000/altivec.h (vec_sldb, vec_srdb): New defines.
* config/rs6000/altivec.md (UNSPEC_SLDB, UNSPEC_SRDB): New.
(SLDB_LR): New attribute.
(VSHIFT_DBL_LR): New iterator.
(vsdb_): New define_insn.
* config/rs6000/rs6000-builtin.def (VSLDB_V16QI, VSLDB_V8HI,
VSLDB_V4SI, VSLDB_V2DI, VSRDB_V16QI, VSRDB_V8HI, VSRDB_V4SI,
VSRDB_V2DI): New BU_P10V_3 definitions.
(SLDB, SRDB): New BU_P10_OVERLOAD_3 definitions.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_SLDB,
P10_BUILTIN_VEC_SRDB): New definitions.
(rs6000_expand_ternop_builtin) [CODE_FOR_vsldb_v16qi,
CODE_FOR_vsldb_v8hi, CODE_FOR_vsldb_v4si, CODE_FOR_vsldb_v2di,
CODE_FOR_vsrdb_v16qi, CODE_FOR_vsrdb_v8hi, CODE_FOR_vsrdb_v4si,
CODE_FOR_vsrdb_v2di]: Add clauses.
* doc/extend.texi: Add description for vec_sldb and vec_srdb.

gcc/testsuite/ChangeLog

2020-07-06  Carl Love  

* gcc.target/powerpc/vec-shift-double-runnable.c:  New test file.
---
 gcc/config/rs6000/altivec.h   |   2 +
 gcc/config/rs6000/altivec.md  |  18 +
 gcc/config/rs6000/rs6000-builtin.def  |  12 +
 gcc/config/rs6000/rs6000-call.c   |  70 
 gcc/doc/extend.texi   |  53 +++
 .../powerpc/vec-shift-double-runnable.c   | 384 ++
 6 files changed, 539 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 560c43cfc99..c202fcf25da 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -703,6 +703,8 @@ __altivec_scalar_pred(vec_any_nle,
 #define vec_inserth(a, b, c)   __builtin_vec_inserth (a, b, c)
 #define vec_replace_elt(a, b, c)   __builtin_vec_replace_elt (a, b, c)
 #define vec_replace_unaligned(a, b, c) __builtin_vec_replace_un (a, b, c)
+#define vec_sldb(a, b, c)  __builtin_vec_sldb (a, b, c)
+#define vec_srdb(a, b, c)  __builtin_vec_srdb (a, b, c)
 
 #define vec_gnb(a, b)  __builtin_vec_gnb (a, b)
 #define vec_clrl(a, b) __builtin_vec_clrl (a, b)
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 749b2c42c14..c58fb3961e0 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -172,6 +172,8 @@
UNSPEC_XXEVAL
UNSPEC_VSTRIR
UNSPEC_VSTRIL
+   UNSPEC_SLDB
+   UNSPEC_SRDB
 ])
 
 (define_c_enum "unspecv"
@@ -782,6 +784,22 @@
   DONE;
 })
 
+;; Map UNSPEC_SLDB to "l" and  UNSPEC_SRDB to "r".
+(define_int_attr SLDB_lr [(UNSPEC_SLDB "l")
+ (UNSPEC_SRDB "r")])
+
+(define_int_iterator VSHIFT_DBL_LR [UNSPEC_SLDB UNSPEC_SRDB])
+
+(define_insn "vsdb_"
+ [(set (match_operand:VI2 0 "register_operand" "=v")
+  (unspec:VI2 [(match_operand:VI2 1 "register_operand" "v")
+  (match_operand:VI2 2 "register_operand" "v")
+  (match_operand:QI 3 "const_0_to_12_operand" "n")]
+ VSHIFT_DBL_LR))]
+  "TARGET_POWER10"
+  "vsdbi %0,%1,%2,%3"
+  [(set_attr "type" "vecsimple")])
+
 (define_expand "vstrir_"
   [(set (match_operand:VIshort 0 "altivec_register_operand")
(unspec:VIshort [(match_operand:VIshort 1 "altivec_register_operand")]
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index e22b3e4d53b..c6fdfadeda8 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2738,6 +2738,16 @@ BU_P10V_3 (VREPLACE_UN_V2DI, "vreplace_un_v2di", CONST, vreplace_un_v2di)
 BU_P10V_3 (VREPLACE_UN_UV2DI, "vreplace_un_uv2di", CONST, vreplace_un_v2di)
 BU_P10V_3 (VREPLACE_UN_V2DF, "vreplace_un_v2df", CONST, vreplace_un_v2df)
 
+BU_P10V_3 (VSLDB_V16QI, 

Re: [PATCH] RISC-V: Implment __builtin_thread_pointer

2020-07-08 Thread Jim Wilson
On Tue, Jul 7, 2020 at 2:52 AM Kito Cheng  wrote:
> gcc/ChangeLog:
> * gcc/config/riscv/riscv.md (): New.
> (TP_REGNUM): Ditto.
> * doc/extend.texi (Target Builtins): Add RISC-V built-in section.
> Document __builtin_thread_pointer.
> gcc/testsuite/ChangeLog:
> * gcc.target/riscv/read-thread-pointer.c: New.

It looks OK to me in general.

You added builtin_thread_pointer but not builtin_set_thread_pointer.
Maybe we should implement both as long as we are implementing one?  If
clang only implements one, maybe it should implement the other also?
This doesn't have to be part of this patch.  This could be a separate
issue.

The builtin_thread_pointer docs look out-of-date.  It is documented
for alpha and SH, but it is implemented in gcc/builtins.c not in the
backends.  A scan of md files show that quite a few targets support it
but don't document it.  I think it should be documented in the generic
builtins section not in the target dependent builtins sections with
some language that says not all targets support it.  This doesn't have
to be part of this patch.  This could be a separate issue.

We have two existing undocumented builtins, __builtin_riscv_fsflags
and __builtin_riscv_frflags, for setting or reading the FP flags.  I
don't know if anyone uses them though.  newlib and glibc both use
extended asms for these operations.  This doesn't have to be part of
this patch.  This could be a separate issue.

There is a document https://github.com/riscv/riscv-c-api-doc for
coordinating gcc and llvm work that has an empty list of builtin
functions.  I'm not sure if this document is still useful.  If this is
a RISC-V specific builtin then it should be listed here, but I don't
think it should be considered a RISC-V specific builtin.  There is an
unresolved pull request for the frflags and fsflags builtins.  I guess
I forgot about that.

Jim


Re: [PATCH 0/6 ver 4] ] Permute Class Operations

2020-07-08 Thread Carl Love via Gcc-patches
[PATCH 3/6] rs6000, Add vector replace builtin support

--
V4 Fixes:

   Rebased on mainline.  Changed FUTURE to P10 in code and ChangeLog.
   Set DEBUG to 0 in vec-replace-word-runnable.c test program.
   Fixed too long lines in ChangeLog.

--
V3 fixes:
   Fixed bad word breaks in ChangeLog.
   Replace spaces with tabs in ChangeLog.


v2 fixes:

change log entries config/rs6000/vsx.md, config/rs6000/rs6000-builtin.def,
config/rs6000/rs6000-call.c.

gcc/config/rs6000/rs6000-call.c: fixed if check for 3rd arg between 0 and 3
 fixed if check for 3rd arg between 0 and 12

gcc/config/rs6000/vsx.md: removed REPLACE_ELT_atr definition and used
  VS_scalar instead.
  removed REPLACE_ELT_inst definition and used
   instead
  fixed spelling mistake on Endianness.
  fixed indenting for vreplace_elt_

---

GCC maintainers:

The following patch adds support for builtins vec_replace_elt and
vec_replace_unaligned.

The patch has been compiled and tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

and mambo with no regression errors.

Please let me know if this patch is acceptable for the mainline
branch.  Thanks.

 Carl Love

---

gcc/ChangeLog

2020-07-06 Carl Love  

* config/rs6000/altivec.h: Add define for vec_replace_elt and
vec_replace_unaligned.
* config/rs6000/vsx.md (UNSPEC_REPLACE_ELT, UNSPEC_REPLACE_UN): New.
(REPLACE_ELT): New mode iterator.
(REPLACE_ELT_atr, REPLACE_ELT_inst, REPLACE_ELT_char,
REPLACE_ELT_sh, REPLACE_ELT_max): New mode attributes.
(vreplace_un_, vreplace_elt__inst): New.
* config/rs6000/rs6000-builtin.def (VREPLACE_ELT_V4SI,
VREPLACE_ELT_UV4SI, VREPLACE_ELT_V4SF, VREPLACE_ELT_UV2DI,
VREPLACE_ELT_V2DF, VREPLACE_UN_V4SI, VREPLACE_UN_UV4SI,
VREPLACE_UN_V4SF, VREPLACE_UN_V2DI, VREPLACE_UN_UV2DI,
VREPLACE_UN_V2DF, (REPLACE_ELT, REPLACE_UN): New.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_REPLACE_ELT,
P10_BUILTIN_VEC_REPLACE_UN): New.
(rs6000_expand_ternop_builtin): Add 3rd argument checks for
CODE_FOR_vreplace_elt_v4si, CODE_FOR_vreplace_elt_v4sf,
CODE_FOR_vreplace_un_v4si, CODE_FOR_vreplace_un_v4sf.
(builtin_function_type) [P10_BUILTIN_VREPLACE_ELT_UV4SI,
P10_BUILTIN_VREPLACE_ELT_UV2DI, P10_BUILTIN_VREPLACE_UN_UV4SI,
P10_BUILTIN_VREPLACE_UN_UV2DI]: New cases.
* doc/extend.texi: Add description for vec_replace_elt and
vec_replace_unaligned builtins.

gcc/testsuite/ChangeLog

2020-07-06 Carl Love  

* gcc.target/powerpc/vec-replace-word.c: Add new test.
---
 gcc/config/rs6000/altivec.h   |   2 +
 gcc/config/rs6000/rs6000-builtin.def  |  16 +
 gcc/config/rs6000/rs6000-call.c   |  61 
 gcc/config/rs6000/vsx.md  |  60 
 gcc/doc/extend.texi   |  50 +++
 .../powerpc/vec-replace-word-runnable.c   | 289 ++
 6 files changed, 478 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 0563853c03f..560c43cfc99 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -701,6 +701,8 @@ __altivec_scalar_pred(vec_any_nle,
 #define vec_extracth(a, b, c)  __builtin_vec_extracth (a, b, c)
 #define vec_insertl(a, b, c)   __builtin_vec_insertl (a, b, c)
 #define vec_inserth(a, b, c)   __builtin_vec_inserth (a, b, c)
+#define vec_replace_elt(a, b, c)   __builtin_vec_replace_elt (a, b, c)
+#define vec_replace_unaligned(a, b, c) __builtin_vec_replace_un (a, b, c)
 
 #define vec_gnb(a, b)  __builtin_vec_gnb (a, b)
 #define vec_clrl(a, b) __builtin_vec_clrl (a, b)
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index e73d144c1cc..e22b3e4d53b 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2724,6 +2724,20 @@ BU_P10V_3 (VINSERTVPRBR, "vinsvubvrx", CONST, 
vinsertvr_v16qi)
 BU_P10V_3 (VINSERTVPRHR, "vinsvuhvrx", CONST, vinsertvr_v8hi)
 BU_P10V_3 (VINSERTVPRWR, "vinsvuwvrx", CONST, vinsertvr_v4si)
 
+BU_P10V_3 (VREPLACE_ELT_V4SI, "vreplace_v4si", CONST, vreplace_elt_v4si)
+BU_P10V_3 (VREPLACE_ELT_UV4SI, "vreplace_uv4si", CONST, vreplace_elt_v4si)
+BU_P10V_3 (VREPLACE_ELT_V4SF, "vreplace_v4sf", CONST, vreplace_elt_v4sf)
+BU_P10V_3 (VREPLACE_ELT_V2DI, "vreplace_v2di", CONST, vreplace_elt_v2di)
+BU_P10V_3 (VREPLACE_ELT_UV2DI, "vreplace_uv2di", CONST, vreplace_elt_v2di)
+BU_P10V_3 (VREPLACE_ELT_V2DF, "vreplace_v2df", CONST, vreplace_elt_v2df)
+

Re: [PATCH 0/6 ver 4] ] Permute Class Operations

2020-07-08 Thread Carl Love via Gcc-patches


[PATCH 1/6] rs6000, Update support for vec_extract

-
V4 changes
rebased onto mainline 7/2/2020
Add iterator name to Change log

---
V3 changes

  Redo ChangeLog for code move.
  Replace spaces with tabs in ChangeLog.
  Replaced instruction names using * with the actual list of names.  For
example vextdu*vrx with the explicit instruction names vextdubvrx,
vextduhvrx, etc.
-
v2 changes

config/rs6000/altivec.md log entry for move from changed as suggested.

config/rs6000/vsx.md log entry for moved to here changed as suggested.

define_mode_iterator VI2 also moved, included in both change log entries


GCC maintainers:

Move the existing vector extract support in altivec.md to vsx.md
so all of the vector insert and extract support is in the same file.

The patch also updates the name of the builtins and descriptions for the
builtins in the documentation file so they match the approved builtin
names and descriptions.

The patch does not make any functional changes.

Please let me know if the changes are acceptable for mainline.  Thanks.

  Carl Love

--

gcc/ChangeLog

2020-07-06  Carl Love  

* config/rs6000/altivec.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
(vextractl, vextractr)
(vextractl_internal, vextractr_internal for mode VI2)
(VI2): Move to ...
* config/rs6000/vsx.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
(vextractl, vextractr)
(vextractl_internal, vextractr_internal for mode VI2)
(VI2):  ..here.
* gcc/doc/extend.texi: Update documentation for vec_extractl.
Replace builtin name vec_extractr with vec_extracth.  Update description
of vec_extracth.
---
 gcc/config/rs6000/altivec.md | 64 -
 gcc/config/rs6000/vsx.md | 66 ++
 gcc/doc/extend.texi  | 78 ++--
 3 files changed, 105 insertions(+), 103 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 2ce9227c765..749b2c42c14 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -172,8 +172,6 @@
UNSPEC_XXEVAL
UNSPEC_VSTRIR
UNSPEC_VSTRIL
-   UNSPEC_EXTRACTL
-   UNSPEC_EXTRACTR
 ])
 
 (define_c_enum "unspecv"
@@ -184,8 +182,6 @@
UNSPECV_DSS
   ])
 
-;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops
-(define_mode_iterator VI2 [V4SI V8HI V16QI V2DI])
 ;; Short vec int modes
 (define_mode_iterator VIshort [V8HI V16QI])
 ;; Longer vec int modes for rotate/mask ops
@@ -786,66 +782,6 @@
   DONE;
 })
 
-(define_expand "vextractl"
-  [(set (match_operand:V2DI 0 "altivec_register_operand")
-   (unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
- (match_operand:VI2 2 "altivec_register_operand")
- (match_operand:SI 3 "register_operand")]
-UNSPEC_EXTRACTL))]
-  "TARGET_POWER10"
-{
-  if (BYTES_BIG_ENDIAN)
-{
-  emit_insn (gen_vextractl_internal (operands[0], operands[1],
-  operands[2], operands[3]));
-  emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
-}
-  else
-emit_insn (gen_vextractr_internal (operands[0], operands[2],
-operands[1], operands[3]));
-  DONE;
-})
-
-(define_insn "vextractl_internal"
-  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
-   (unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
- (match_operand:VEC_I 2 "altivec_register_operand" "v")
- (match_operand:SI 3 "register_operand" "r")]
-UNSPEC_EXTRACTL))]
-  "TARGET_POWER10"
-  "vextvlx %0,%1,%2,%3"
-  [(set_attr "type" "vecsimple")])
-
-(define_expand "vextractr"
-  [(set (match_operand:V2DI 0 "altivec_register_operand")
-   (unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
- (match_operand:VI2 2 "altivec_register_operand")
- (match_operand:SI 3 "register_operand")]
-UNSPEC_EXTRACTR))]
-  "TARGET_POWER10"
-{
-  if (BYTES_BIG_ENDIAN)
-{
-  emit_insn (gen_vextractr_internal (operands[0], operands[1],
-  operands[2], operands[3]));
-  emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
-}
-  else
-emit_insn (gen_vextractl_internal (operands[0], operands[2],
-operands[1], operands[3]));
-  DONE;
-})
-
-(define_insn "vextractr_internal"
-  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
-   (unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
- (match_operand:VEC_I 2 

Re: [PATCH 0/6 ver 4] ] Permute Class Operations

2020-07-08 Thread Carl Love via Gcc-patches
[PATCH 2/6] rs6000 Add vector insert builtin support


V4 changes
  Rebased on mainline.  Changed FUTURE to P10 as needed.


V3 changes

  Replace spaces with tabs in ChangeLog
  Ditto in gcc/config/rs6000/vsx.md.
  Updated description for vec_insertl() builtin.
  Cleaned up vec_insert description.

-
v2 changes

Fix change log entry for config/rs6000/altivec.h

Fix change log entry for config/rs6000/rs6000-builtin.def

Fix change log entry for config/rs6000/rs6000-call.c

vsx.md: Fixed if (BYTES_BIG_ENDIAN) else statements.
Porting error from pu branch.

---
GCC maintainers:

This patch adds support for vec_insertl and vec_inserth builtins.

The patch has been compiled and tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

and mambo with no regression errors.

Please let me know if this patch is acceptable for the mainline branch.

Thanks.

 Carl Love

--
gcc/ChangeLog

2020-07-02  Carl Love  

* config/rs6000/altivec.h (vec_insertl, vec_inserth): New defines.
* config/rs6000/rs6000-builtin.def (VINSERTGPRBL, VINSERTGPRHL,
VINSERTGPRWL, VINSERTGPRDL, VINSERTVPRBL, VINSERTVPRHL, VINSERTVPRWL,
VINSERTGPRBR, VINSERTGPRHR, VINSERTGPRWR, VINSERTGPRDR, VINSERTVPRBR,
VINSERTVPRHR, VINSERTVPRWR): New builtins.
(INSERTL, INSERTH): New builtins.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VEC_INSERTL,
P10_BUILTIN_VEC_INSERTH): New overloaded definitions.
(P10_BUILTIN_VINSERTGPRBL, P10_BUILTIN_VINSERTGPRHL,
P10_BUILTIN_VINSERTGPRWL, P10_BUILTIN_VINSERTGPRDL,
P10_BUILTIN_VINSERTVPRBL, P10_BUILTIN_VINSERTVPRHL,
P10_BUILTIN_VINSERTVPRWL): Add case entries.
* config/rs6000/vsx.md (define_c_enum): Add UNSPEC_INSERTL,
UNSPEC_INSERTR.
(define_expand): Add vinsertvl_, vinsertvr_,
vinsertgl_, vinsertgr_, mode is VI2.
(define_insn): vinsertvl_internal_, vinsertvr_internal_,
vinsertgl_internal_, vinsertgr_internal_, mode VEC_I.
* doc/extend.texi: Add documentation for vec_insertl, vec_inserth.

gcc/testsuite/ChangeLog

2020-07-02  Carl Love  

* gcc.target/powerpc/vec-insert-word-runnable.c: New test case.
---
 gcc/config/rs6000/altivec.h   |   2 +
 gcc/config/rs6000/rs6000-builtin.def  |  18 +
 gcc/config/rs6000/rs6000-call.c   |  51 +++
 gcc/config/rs6000/vsx.md  | 110 ++
 gcc/doc/extend.texi   |  71 
 .../powerpc/vec-insert-word-runnable.c| 345 ++
 6 files changed, 597 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index bb1524f4a67..0563853c03f 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -699,6 +699,8 @@ __altivec_scalar_pred(vec_any_nle,
 /* Overloaded built-in functions for ISA 3.1.  */
 #define vec_extractl(a, b, c)  __builtin_vec_extractl (a, b, c)
 #define vec_extracth(a, b, c)  __builtin_vec_extracth (a, b, c)
+#define vec_insertl(a, b, c)   __builtin_vec_insertl (a, b, c)
+#define vec_inserth(a, b, c)   __builtin_vec_inserth (a, b, c)
 
 #define vec_gnb(a, b)  __builtin_vec_gnb (a, b)
 #define vec_clrl(a, b) __builtin_vec_clrl (a, b)
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 363656ec05c..e73d144c1cc 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2708,6 +2708,22 @@ BU_P10V_3 (VEXTRACTHR, "vextduhvhx", CONST, 
vextractrv8hi)
 BU_P10V_3 (VEXTRACTWR, "vextduwvhx", CONST, vextractrv4si)
 BU_P10V_3 (VEXTRACTDR, "vextddvhx", CONST, vextractrv2di)
 
+BU_P10V_3 (VINSERTGPRBL, "vinsgubvlx", CONST, vinsertgl_v16qi)
+BU_P10V_3 (VINSERTGPRHL, "vinsguhvlx", CONST, vinsertgl_v8hi)
+BU_P10V_3 (VINSERTGPRWL, "vinsguwvlx", CONST, vinsertgl_v4si)
+BU_P10V_3 (VINSERTGPRDL, "vinsgudvlx", CONST, vinsertgl_v2di)
+BU_P10V_3 (VINSERTVPRBL, "vinsvubvlx", CONST, vinsertvl_v16qi)
+BU_P10V_3 (VINSERTVPRHL, "vinsvuhvlx", CONST, vinsertvl_v8hi)
+BU_P10V_3 (VINSERTVPRWL, "vinsvuwvlx", CONST, vinsertvl_v4si)
+
+BU_P10V_3 (VINSERTGPRBR, "vinsgubvrx", CONST, vinsertgr_v16qi)
+BU_P10V_3 (VINSERTGPRHR, "vinsguhvrx", CONST, vinsertgr_v8hi)
+BU_P10V_3 (VINSERTGPRWR, "vinsguwvrx", CONST, vinsertgr_v4si)
+BU_P10V_3 (VINSERTGPRDR, "vinsgudvrx", CONST, vinsertgr_v2di)
+BU_P10V_3 (VINSERTVPRBR, "vinsvubvrx", CONST, vinsertvr_v16qi)
+BU_P10V_3 (VINSERTVPRHR, "vinsvuhvrx", CONST, vinsertvr_v8hi)
+BU_P10V_3 (VINSERTVPRWR, "vinsvuwvrx", CONST, vinsertvr_v4si)
+
 BU_P10V_1 (VSTRIBR, "vstribr", CONST, vstrir_v16qi)
 BU_P10V_1 (VSTRIHR, "vstrihr", CONST, 

[PATCH 0/6 ver 4] ] Permute Class Operations

2020-07-08 Thread Carl Love via Gcc-patches
Segher:

The following is version 4 of the series of patches for the permute
class operations.  Per your request, I will send each patch as a reply
to this message so they are all in the same thread in your email box.  

Patches 1, 2, 3 and 4 just have minor fixes per your earlier comments.
However, the patches have been rebased onto the latest mainline tree
which required changing FUTURE to P10 in the code and test cases.  So I
am sending everything.  

Patch 5 has the changes for the F32bit_const_operand stuff that we
discussed at length.  It has the changes with regards to the xxspltidp
instruction which has undefined results for subnormal inputs that we
also talked about.  See comments in the V4 fixes as to specific things
that need reviewing and commenting on.

Patch 6 didn't get reviewed the last time as we discussed that the
whole series needed rebasing due to the FUTURE to P10 changes that had
gone into mainline.

The series has been retested on Power 9 as well as running the
testcases on mambo.  Everything seems to check out fine.

Please let me know if the series is acceptable for mainline.  Thanks.

 Carl 



Re: [PATCH] gcc/Makefile.in: move SELFTEST_DEPS before including language makefile fragments

2020-07-08 Thread Romain Naour via Gcc-patches
Le 03/06/2020 à 22:24, Jeff Law a écrit :
> On Wed, 2020-06-03 at 21:56 +0200, Romain Naour wrote:
>> Hi Jeff,
>>
>> Le 03/06/2020 à 20:33, Jeff Law a écrit :
>>> On Thu, 2020-05-21 at 17:35 +0200, Romain Naour via Gcc-patches wrote:
 As reported by several Buildroot users [1][2][3], the gcc build
 may fail while running selftests makefile target.

 The problem only occurs when ccache is used with gcc 9 and 10,
 probably due to a race condition.

 While debugging with "make -p" we can notice that the s-selftest-c target
 contains only "cc1" as a dependency instead of cc1 and SELFTEST_DEPS [4].

   s-selftest-c: cc1

 While the build is failing, the s-selftest-c dependencies recipe is
 still running and reported as a bug by make.

   "Dependencies recipe running (THIS IS A BUG)."

 A change [5] in gcc 9 seems to introduce the problem since we can't
 reproduce this problem with gcc 8.

 As suggested by Yann E. MORIN [6], move SELFTEST_DEPS before
 including language makefile fragments.

 With the fix applied, the s-selftest-c dependencies contain the
 SELFTEST_DEPS value.

   s-selftest-c: cc1 xgcc specs stmp-int-hdrs ../../gcc/testsuite/selftests

 [1] http://lists.busybox.net/pipermail/buildroot/2020-May/282171.html
 [2] http://lists.busybox.net/pipermail/buildroot/2020-May/282766.html
 [3] https://github.com/cirosantilli/linux-kernel-module-cheat/issues/108
 [4] 
 https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/c/Make-lang.in;h=bfae6fd2549c4f728816cd355fa9739dcc08fcde;hb=033eb5671769a4c681a44aad08a454e667e08502#l120
 [5] 
 https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=033eb5671769a4c681a44aad08a454e667e08502
 [6] http://lists.busybox.net/pipermail/buildroot/2020-May/283213.html

 Signed-off-by: Romain Naour 
 Cc: Ben Dakin-Norris 
 Cc: Maxim Kochetkov 
 Cc: Thomas Petazzoni 
 Cc: Yann E. MORIN 
 Cc: David Malcolm 
 ---
 This patch should be backported to gcc 10 and gcc 9.
 ---
  gcc/ChangeLog   | 5 +
  gcc/Makefile.in | 6 --
  2 files changed, 9 insertions(+), 2 deletions(-)

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index 977e7664b62..c3bb18f2afd 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,8 @@
 +2020-05-21  Romain Naour 
 +
 +  * Makefile.in: move SELFTEST_DEPS before including language
 +  makefile fragments.
>>> Thanks.  I've installed this on the trunk.
>>
>> Many thanks for merging the patch!
>>
>> But I don't see the commit log I've written to explain the issue.
>> Was there a reason to drop it?
> As a project we're still trying to sort out the right level of verbosity of
> the commit log.  I tend to use short ones.

This patch should be backported to gcc 10 and 9.

Best regards,
Romain

> 
> jeff
>>
> 



Re: [PATCH] RISC-V: Disable remove unneeded save-restore call optimization if there are any arguments on stack.

2020-07-08 Thread Jim Wilson
On Tue, Jul 7, 2020 at 12:28 AM Kito Cheng  wrote:
> gcc/ChangeLog:
> * config/riscv/riscv-sr.c (riscv_remove_unneeded_save_restore_calls):
> Abort if any arguments on stack.
> gcc/testsuite/ChangeLog
> * gcc.target/riscv/save-restore-9.c: New.

Looks good to me.

Jim


[PATCH] libgomp: Add OMPD Address Space Information functions.

2020-07-08 Thread y2s1982 via Gcc-patches
This patch adds Address Space Information function implementations as
defined in section 5.5.4 of OpenMP API Specification 5.0.
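
As a rough illustration of the flow such a version query follows (validate
the arguments, look up a symbol through a tool callback, then read the value
from the target), here is a minimal, self-contained sketch.  Every name in it
(the demo_* types, functions and the fake_openmp_version variable) is
hypothetical and stands in for the real OMPD types and callbacks; it is not
the code from this patch.  The value 201811 is the _OPENMP value for
OpenMP 5.0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the OMPD return codes and word type.  */
typedef enum
{
  demo_rc_ok,
  demo_rc_bad_input,
  demo_rc_stale_handle,
  demo_rc_error
} demo_rc_t;

typedef long demo_word_t;

/* Hypothetical address space handle; only the context matters here.  */
typedef struct
{
  const void *context;
} demo_address_space_t;

/* Pretend target memory holding the runtime's version word.  */
static const demo_word_t fake_openmp_version = 201811;

/* Hypothetical tool callback: resolve a symbol name to an address.  */
static demo_rc_t
demo_symbol_lookup (const char *name, const void **addr)
{
  if (strcmp (name, "openmp_version") != 0)
    return demo_rc_error;
  *addr = &fake_openmp_version;
  return demo_rc_ok;
}

/* Hypothetical tool callback: copy NBYTES from target memory to BUF.  */
static demo_rc_t
demo_read_memory (const void *addr, size_t nbytes, void *buf)
{
  memcpy (buf, addr, nbytes);
  return demo_rc_ok;
}

demo_rc_t
demo_get_omp_version (demo_address_space_t *space, demo_word_t *version)
{
  if (version == NULL)
    return demo_rc_bad_input;
  if (space == NULL)
    return demo_rc_stale_handle;

  const void *addr;
  demo_rc_t ret = demo_symbol_lookup ("openmp_version", &addr);
  if (ret != demo_rc_ok)
    return ret;

  /* Pass the read result back to the caller rather than assuming
     the read succeeded.  */
  return demo_read_memory (addr, sizeof *version, version);
}
```

In this sketch a caller passes a non-NULL handle and an output word, and
checks the returned code before trusting the output.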

2020-07-08  Tony Sim  

libgomp/ChangeLog:

* Makefile.am (libgompd_la_OBJECTS): Add ompd-addr.c.
* Makefile.in: Regenerate.
* ompd-addr.c: New file.

---
 libgomp/Makefile.am |  2 +-
 libgomp/Makefile.in |  5 +--
 libgomp/ompd-addr.c | 88 +
 3 files changed, 92 insertions(+), 3 deletions(-)
 create mode 100644 libgomp/ompd-addr.c

diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index fe0a92122ea..0a4a9c10eb9 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -90,7 +90,7 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c 
env.c error.c \
oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
affinity-fmt.c teams.c allocator.c oacc-profiling.c oacc-target.c
 
-libgompd_la_SOURCES = ompd-lib.c ompd-proc.c
+libgompd_la_SOURCES = ompd-lib.c ompd-proc.c ompd-addr.c
 
 include $(top_srcdir)/plugin/Makefrag.am
 
diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index 2b487e00499..9ceb2c6e460 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -235,7 +235,7 @@ am_libgomp_la_OBJECTS = alloc.lo atomic.lo barrier.lo 
critical.lo \
$(am__objects_1)
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
 libgompd_la_LIBADD =
-am_libgompd_la_OBJECTS = ompd-lib.lo ompd-proc.lo
+am_libgompd_la_OBJECTS = ompd-lib.lo ompd-proc.lo ompd-addr.lo
 libgompd_la_OBJECTS = $(am_libgompd_la_OBJECTS)
 AM_V_P = $(am__v_P_@AM_V@)
 am__v_P_ = $(am__v_P_@AM_DEFAULT_V@)
@@ -592,7 +592,7 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c 
env.c \
oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
affinity-fmt.c teams.c allocator.c oacc-profiling.c \
oacc-target.c $(am__append_4)
-libgompd_la_SOURCES = ompd-lib.c ompd-proc.c
+libgompd_la_SOURCES = ompd-lib.c ompd-proc.c ompd-addr.c
 
 # Nvidia PTX OpenACC plugin.
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info 
$(libtool_VERSION)
@@ -816,6 +816,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-plugin.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-profiling.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-target.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-addr.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-lib.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-proc.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ordered.Plo@am__quote@
diff --git a/libgomp/ompd-addr.c b/libgomp/ompd-addr.c
new file mode 100644
index 000..f1f3b8071c1
--- /dev/null
+++ b/libgomp/ompd-addr.c
@@ -0,0 +1,88 @@
+/* Copyright (C) 2020 Free Software Foundation, Inc.
+   Contributed by Yoosuk Sim .
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+/* This file contains function definitions for OMPD's Address Space Information
+   functions defined in the OpenMP 5.0 API Documentation, 5.5.4.  */
+
+#include 
+#include 
+#include "omp-tools.h"
+#include "libgompd.h"
+
+ompd_rc_t
+ompd_get_omp_version (ompd_address_space_handle_t *address_space,
+ ompd_word_t *omp_version)
+{
+  if (omp_version == NULL)
+return ompd_rc_bad_input;
+  if (address_space == NULL)
+return ompd_rc_stale_handle;
+
+  /* _OPENMP macro is defined to have yyyymm integer.  */
+  ompd_size_t macro_length = sizeof (int);
+
+  ompd_rc_t ret = ompd_rc_ok;
+
+  struct ompd_address_t addr;
+  ret = gompd_callbacks.symbol_addr_lookup (address_space->context, NULL,
+   "openmp_version", &addr, NULL);
+  if (ret != ompd_rc_ok)
+return ret;
+
+  ret = gompd_callbacks.read_memory (address_space->context, NULL, &addr,
+macro_length, (void *) omp_version);
+  return ompd_rc_ok;
+}
+
+ompd_rc_t

[PATCH, committed] PR fortran/96085 - ICE in gfc_finish_var_decl, at fortran/trans-decl.c:694

2020-07-08 Thread Harald Anlauf
Committed as obvious.  Check whether the target of a legacy ASSIGN statement
is a parameter and reject if this is the case.

Regtested on x86_64-pc-linux-gnu.

Thanks,
Harald


PR fortran/96085 - ICE in gfc_finish_var_decl, at fortran/trans-decl.c:694

Legacy ASSIGN requires a scalar integer variable.  Reject parameter
arguments.
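
The rule being enforced can be sketched as a small predicate.  This is an
illustration only: the demo_* types, the field names and the default kind
value of 4 are assumptions made for the example, not gfortran's internal
data structures.

```c
#include <stdbool.h>

/* Hypothetical, simplified symbol description.  */
enum demo_type { DEMO_INTEGER, DEMO_REAL };
enum demo_flavor { DEMO_VARIABLE, DEMO_PARAMETER };

struct demo_symbol
{
  enum demo_type type;
  int kind;
  enum demo_flavor flavor;
  bool is_array;
};

/* Assumed default integer kind for this sketch.  */
#define DEMO_DEFAULT_INTEGER_KIND 4

/* An ASSIGN target must be a scalar default-integer variable; a named
   constant (parameter) is rejected even if its type and kind match.  */
bool
demo_valid_assign_target (const struct demo_symbol *sym)
{
  return sym->type == DEMO_INTEGER
	 && sym->kind == DEMO_DEFAULT_INTEGER_KIND
	 && sym->flavor != DEMO_PARAMETER
	 && !sym->is_array;
}
```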

gcc/fortran/
PR fortran/96085
* resolve.c (gfc_resolve_code): Check whether assign target is a
parameter.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 223de91..6bc1c46a97d 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -11900,6 +11900,7 @@ start:
 		  || code->expr1->symtree->n.sym->ts.type != BT_INTEGER
 		  || code->expr1->symtree->n.sym->ts.kind
 		 != gfc_default_integer_kind
+		  || code->expr1->symtree->n.sym->attr.flavor == FL_PARAMETER
 		  || code->expr1->symtree->n.sym->as != NULL))
 	gfc_error ("ASSIGN statement at %L requires a scalar "
 		   "default INTEGER variable", &code->expr1->where);
diff --git a/gcc/testsuite/gfortran.dg/pr96085.f90 b/gcc/testsuite/gfortran.dg/pr96085.f90
new file mode 100644
index 000..82b1cdec0f6
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr96085.f90
@@ -0,0 +1,12 @@
+! { dg-do compile }
+! { dg-options "-std=legacy" }
+! PR fortran/96085 - ICE in gfc_finish_var_decl, at fortran/trans-decl.c:694
+
+module m
+  integer, parameter :: a = 1
+contains
+  subroutine s
+assign 2 to a   ! { dg-error "requires a scalar default INTEGER variable" }
+2   print *, a
+  end
+end


Re: [Patch][gcn, nvptx, offloading] mkoffload – handle -fpic/-fPIC

2020-07-08 Thread Tom de Vries
On 7/8/20 7:35 PM, Kwok Cheung Yeung wrote:
> (Resent with CC to gcc-patches)
> 
> Hello
> 
>> I tried out the patch with one test-case and -pie -fPIC/-fpic already
>> seems to work, so perhaps we could have at least one test-case
>> exercising this in libgomp?  That sounds easier to do than the
>> shared-lib test-case.
> 
> I've created a simple testcase which tries to generate a shared library
> with offloaded code. Without the mkoffload patch, it fails during
> linking with:
> 
> ld: /tmp/ccNaT7fO.target.o: relocation R_X86_64_32 against `.rodata' can
> not be used when making a shared object; recompile with -fPIC
> 
> I have tested this on an x64 host with offloading to nvptx and gcn. On
> AMD GCN, it also produces a couple of extra linker warnings that I have
> added dg-warning entries for.
> 
> Okay for trunk/OG10 together with the previous mkoffload patch?
> 

Hi Kwok,

the test-case looks ok to me, but I can't approve it.

FWIW, the nvptx part of the previous mkoffload patch was already
approved, so AFAIU you could have committed that already.

Thanks,
- Tom

> Thanks
> 
> Kwok
> 
> 
> mkoffload_test_case.patch
> 
> commit 43238117c261285a6b95d881bcc2f9efd9f752ad
> Author: Kwok Cheung Yeung 
> Date:   Wed Jul 8 03:28:08 2020 -0700
> 
> Add test case
> 
> 2020-07-08  Kwok Cheung Yeung  
> 
>   libgomp/
>   * testsuite/libgomp.oacc-c-c++-common/shared-lib.c: New.
> 
> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/shared-lib.c 
> b/libgomp/testsuite/libgomp.oacc-c-c++-common/shared-lib.c
> new file mode 100644
> index 000..6d8a4ac
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/shared-lib.c
> @@ -0,0 +1,16 @@
> +/* { dg-do link } */
> +/* { dg-additional-options "-shared -fPIC" { target fpic } } */
> +
> +#define N 512
> +
> +void f(int a[])
> +{
> +  int i;
> +
> +  #pragma acc parallel
> +  for (i = 0; i < N; i++)
> +a[i]++;
> +}
> +
> +/* { dg-warning "relocation against `.*' in read-only section `\.rodata'" "" 
> { target openacc_radeon_accel_selected } 0 } */
> +/* { dg-warning "creating DT_TEXTREL in a shared object" "" { target 
> openacc_radeon_accel_selected } 0 } */
> 


Re: [PATCH] libgomp: Add skeletal OMPD process functions and datatypes.

2020-07-08 Thread Jakub Jelinek via Gcc-patches
On Wed, Jul 08, 2020 at 02:02:44PM -0400, y2s1982 wrote:
> +  ret = gompd_callbacks.free_memory (handle);
> +  return ret;

Here you could just write
  return gompd_callbacks.free_memory (handle);

No need to bother with ret variable.
Ok with that fixed, no need to repost.
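
For illustration, the shape of that simplification in a self-contained
sketch.  The callback table and all demo_* names below are hypothetical and
only loosely modelled on the snippet quoted above; they are not the libgomp
definitions.

```c
#include <stddef.h>
#include <stdlib.h>

typedef enum { demo_rc_ok, demo_rc_bad_input } demo_rc_t;

/* Hypothetical callback table with a single memory-release hook.  */
struct demo_callbacks
{
  demo_rc_t (*free_memory) (void *ptr);
};

static demo_rc_t
demo_free (void *ptr)
{
  free (ptr);
  return demo_rc_ok;
}

static struct demo_callbacks demo_cb = { demo_free };

demo_rc_t
demo_release_handle (void *handle)
{
  if (handle == NULL)
    return demo_rc_bad_input;

  /* Return the callback's result directly; no temporary is needed.  */
  return demo_cb.free_memory (handle);
}
```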

Jakub



Re: [PATCH] libgomp: Add OMPD process functions and datatypes.

2020-07-08 Thread y2s1982 . via Gcc-patches
Hello

On Wed, Jul 8, 2020 at 5:08 AM Jakub Jelinek  wrote:

> On Tue, Jul 07, 2020 at 02:52:37PM -0400, y2s1982 . via Gcc-patches wrote:
> > I have re-read the documentation trying to find a different solution.
> > In particular, ompd_device_initialize states that
> > ompd_device_t kind, ompd_size_t sizeof_id, and void *id represent
> > a device identifier. To dig further, I read up on the ompd_device_t. A
> > passage from ompd_device_t says that the OMPD library and a tool that uses
> > it must agree on the format of the object that is passed.
> > It also says that ompd_device_t is a pointer to where the device identifier
> > is stored and the size of the device identifier. I am not sure how this
> > applies to ompd_device_initialize, as those two pieces of information seem
> > to be supplied separately: *id and sizeof_id. In fact, ompd-type.h provides 4
> > examples, 2 of which are host and cuda, and they all simply contain unique
> > numerical values.  So does this mean that I should just decide on what the
> > library and tool will use for device id data type and simply stick to it?
> >
> > Otherwise, is it possible to know the proper data type to cast the void *id
> > to, based on the device type (host/cuda)?
>
> Looking at libompd, it ignores sizeof_id completely and saves id (the
> pointer) in its device context, but doesn't ever do anything with it, so
> effectively ignores it completely.
>
> I think starting with devices is not a good idea, just return failure from
> ompd_device_initialize for now and get back to it much later, when
> handling of parallel, host teams, task etc. is all done.
> Because communicating with devices will need also communication with the
> libgomp plugins.
>

Thank you for the guidance. I have uploaded an updated patch where
ompd_device_initialize() ultimately returns an error. I have also added a
FIXME for future planning.
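
A placeholder of that kind might look roughly like the sketch below.  The
demo_* names, the parameter list and the choice of an "unsupported" return
code are assumptions made for illustration; the email does not say which
error the updated patch returns.

```c
#include <stddef.h>

typedef enum
{
  demo_rc_ok,
  demo_rc_bad_input,
  demo_rc_unsupported
} demo_rc_t;

/* Hypothetical stub: validate the arguments, then report that device
   support is not available yet.  */
demo_rc_t
demo_device_initialize (void *process_handle, unsigned long kind,
			size_t sizeof_id, void *id, void **device_handle)
{
  if (process_handle == NULL || id == NULL || device_handle == NULL)
    return demo_rc_bad_input;

  (void) kind;
  (void) sizeof_id;

  /* FIXME: implement once parallel, teams and task handling is done and
     communication with the device plugins is available.  */
  *device_handle = NULL;
  return demo_rc_unsupported;
}
```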

Tony

>
> Jakub
>
>


[PATCH] libgomp: Add skeletal OMPD process functions and datatypes.

2020-07-08 Thread y2s1982 via Gcc-patches
This patch adds OMPD functions defined in 5.5.2 of the OpenMP 5.0 API
documentation. It adds per-process and per-device functions and defines
the related handle data types. It also adds ompd-types.h to the Makefile.

2020-07-08  Tony Sim  

libgomp/ChangeLog:

* Makefile.am (libgompd_la_SOURCES): Add ompd-proc.c.
(nodist_libsubinclude_HEADERS): Add ompd-types.h.
* Makefile.in: Regenerate.
* libgompd.h (ompd_address_space_handle_t): Add definition.
* ompd-proc.c: New file.
* ompd-types.h: New file.

---
 libgomp/Makefile.am  |   4 +-
 libgomp/Makefile.in  |  11 ++--
 libgomp/libgompd.h   |   9 +++
 libgomp/ompd-proc.c  | 130 +++
 libgomp/ompd-types.h |  90 ++
 5 files changed, 237 insertions(+), 7 deletions(-)
 create mode 100644 libgomp/ompd-proc.c
 create mode 100644 libgomp/ompd-types.h

diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index e15a838e55c..fe0a92122ea 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -90,7 +90,7 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c 
env.c error.c \
oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
affinity-fmt.c teams.c allocator.c oacc-profiling.c oacc-target.c
 
-libgompd_la_SOURCES = ompd-lib.c
+libgompd_la_SOURCES = ompd-lib.c ompd-proc.c
 
 include $(top_srcdir)/plugin/Makefrag.am
 
@@ -99,7 +99,7 @@ libgomp_la_SOURCES += openacc.f90
 endif
 
 nodist_noinst_HEADERS = libgomp_f.h
-nodist_libsubinclude_HEADERS = omp.h omp-tools.h openacc.h acc_prof.h
+nodist_libsubinclude_HEADERS = omp.h omp-tools.h ompd-types.h openacc.h 
acc_prof.h
 if USE_FORTRAN
 nodist_finclude_HEADERS = omp_lib.h omp_lib.f90 omp_lib.mod omp_lib_kinds.mod \
openacc_lib.h openacc.f90 openacc.mod openacc_kinds.mod
diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index af897d6c6ba..2b487e00499 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -235,7 +235,7 @@ am_libgomp_la_OBJECTS = alloc.lo atomic.lo barrier.lo 
critical.lo \
$(am__objects_1)
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
 libgompd_la_LIBADD =
-am_libgompd_la_OBJECTS = ompd-lib.lo
+am_libgompd_la_OBJECTS = ompd-lib.lo ompd-proc.lo
 libgompd_la_OBJECTS = $(am_libgompd_la_OBJECTS)
 AM_V_P = $(am__v_P_@AM_V@)
 am__v_P_ = $(am__v_P_@AM_DEFAULT_V@)
@@ -574,10 +574,10 @@ nodist_toolexeclib_HEADERS = libgomp.spec
 libgomp_version_info = -version-info $(libtool_VERSION)
 libgompd_version_info = -version-info $(libtool_VERSION)
 libgomp_la_LDFLAGS = $(libgomp_version_info) $(libgomp_version_script) \
-$(lt_host_flags)
+   $(lt_host_flags)
 
 libgompd_la_LDFLAGS = $(libgompd_version_info) $(libgompd_version_script) \
-$(lt_host_flags)
+   $(lt_host_flags)
 
 libgomp_la_DEPENDENCIES = $(libgomp_version_dep)
 libgompd_la_DEPENDENCIES = $(libgompd_version_dep)
@@ -592,7 +592,7 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c 
env.c \
oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
affinity-fmt.c teams.c allocator.c oacc-profiling.c \
oacc-target.c $(am__append_4)
-libgompd_la_SOURCES = ompd-lib.c
+libgompd_la_SOURCES = ompd-lib.c ompd-proc.c
 
 # Nvidia PTX OpenACC plugin.
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info 
$(libtool_VERSION)
@@ -628,7 +628,7 @@ libgompd_la_SOURCES = ompd-lib.c
 @PLUGIN_GCN_TRUE@libgomp_plugin_gcn_la_LIBADD = libgomp.la $(PLUGIN_GCN_LIBS)
 @PLUGIN_GCN_TRUE@libgomp_plugin_gcn_la_LIBTOOLFLAGS = --tag=disable-static
 nodist_noinst_HEADERS = libgomp_f.h
-nodist_libsubinclude_HEADERS = omp.h omp-tools.h openacc.h acc_prof.h
+nodist_libsubinclude_HEADERS = omp.h omp-tools.h ompd-types.h openacc.h 
acc_prof.h
 @USE_FORTRAN_TRUE@nodist_finclude_HEADERS = omp_lib.h omp_lib.f90 omp_lib.mod 
omp_lib_kinds.mod \
 @USE_FORTRAN_TRUE@ openacc_lib.h openacc.f90 openacc.mod openacc_kinds.mod
 
@@ -817,6 +817,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-profiling.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-target.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-lib.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-proc.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ordered.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/parallel.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/priority_queue.Plo@am__quote@
diff --git a/libgomp/libgompd.h b/libgomp/libgompd.h
index 3a428e1c1e4..495995e00d3 100644
--- a/libgomp/libgompd.h
+++ b/libgomp/libgompd.h
@@ -38,4 +38,13 @@
 
 extern ompd_callbacks_t gompd_callbacks;
 
+typedef struct _ompd_aspace_handle {
+  ompd_address_space_context_t *context;
+  ompd_device_t kind;
+  ompd_size_t sizeof_id;
+  void *id;
+  ompd_address_space_handle_t *process_reference;
+  ompd_size_t ref_count;
+} ompd_address_space_handle_t;
+
 #endif /* 
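
As a reading aid for the handle structure added to libgompd.h, here is a minimal sketch of how a handle of that shape could be reference-counted. It is not taken from ompd-proc.c. The names ex_aspace_handle_t, ex_handle_new and ex_handle_release are hypothetical, and the meaning given to ref_count (number of live references, with a device handle holding one reference on its process handle) is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the handle in libgompd.h; field names follow
   the patch, but the types are simplified for illustration.  */
typedef struct ex_aspace_handle ex_aspace_handle_t;
struct ex_aspace_handle
{
  void *context;
  int kind;
  size_t sizeof_id;
  void *id;
  ex_aspace_handle_t *process_reference;
  size_t ref_count;
};

/* Create a handle with one reference.  If PROCESS is non-NULL the new
   handle is treated as a device handle that keeps PROCESS alive
   (assumed semantics).  */
ex_aspace_handle_t *
ex_handle_new (void *context, ex_aspace_handle_t *process)
{
  ex_aspace_handle_t *h = calloc (1, sizeof *h);
  if (h == NULL)
    return NULL;
  h->context = context;
  h->process_reference = process;
  h->ref_count = 1;
  if (process != NULL)
    process->ref_count++;
  return h;
}

/* Drop one reference.  Returns 1 if the handle was freed, 0 if it is
   still referenced.  */
int
ex_handle_release (ex_aspace_handle_t *h)
{
  assert (h != NULL && h->ref_count > 0);
  if (--h->ref_count > 0)
    return 0;
  if (h->process_reference != NULL)
    ex_handle_release (h->process_reference);
  free (h);
  return 1;
}
```

In this sketch, releasing a process handle while a device handle still refers to it only decrements the count; the storage goes away when the last device handle is released.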

Re: [Patch][gcn, nvptx, offloading] mkoffload – handle -fpic/-fPIC

2020-07-08 Thread Kwok Cheung Yeung

(Resent with CC to gcc-patches)

Hello

> I tried out the patch with one test-case and -pie -fPIC/-fpic already
> seems to work, so perhaps we could have at least one test-case
> exercising this in libgomp?  That sounds easier to do than the
> shared-lib test-case.

I've created a simple testcase which tries to generate a shared library with 
offloaded code. Without the mkoffload patch, it fails during linking with:


ld: /tmp/ccNaT7fO.target.o: relocation R_X86_64_32 against `.rodata' can not be 
used when making a shared object; recompile with -fPIC


I have tested this on an x64 host with offloading to nvptx and gcn. On AMD GCN, 
it also produces a couple of extra linker warnings that I have added dg-warning 
entries for.


Okay for trunk/OG10 together with the previous mkoffload patch?

Thanks

Kwok

commit 43238117c261285a6b95d881bcc2f9efd9f752ad
Author: Kwok Cheung Yeung 
Date:   Wed Jul 8 03:28:08 2020 -0700

Add test case

2020-07-08  Kwok Cheung Yeung  

libgomp/
* testsuite/libgomp.oacc-c-c++-common/shared-lib.c: New.

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/shared-lib.c 
b/libgomp/testsuite/libgomp.oacc-c-c++-common/shared-lib.c
new file mode 100644
index 0000000..6d8a4ac
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/shared-lib.c
@@ -0,0 +1,16 @@
+/* { dg-do link } */
+/* { dg-additional-options "-shared -fPIC" { target fpic } } */
+
+#define N 512
+
+void f(int a[])
+{
+  int i;
+
+  #pragma acc parallel
+  for (i = 0; i < N; i++)
+a[i]++;
+}
+
+/* { dg-warning "relocation against `.*' in read-only section `\.rodata'" "" { 
target openacc_radeon_accel_selected } 0 } */
+/* { dg-warning "creating DT_TEXTREL in a shared object" "" { target 
openacc_radeon_accel_selected } 0 } */


RE: [PATCH 6/6] aarch64: Fix BTI support in libitm

2020-07-08 Thread Kyrylo Tkachov



> -----Original Message-----
> From: Szabolcs Nagy 
> Sent: 08 July 2020 17:28
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Kyrylo Tkachov
> 
> Subject: [PATCH 6/6] aarch64: Fix BTI support in libitm
> 
> sjlj.S did not have the GNU property note markup and the BTI c
> instructions that are necessary when it is built with branch
> protection.
> 
> The notes are only added when libitm is built with branch
> protection, because old linkers mishandle the note (merge
> them incorrectly or emit warnings), the BTI instructions
> are added unconditionally.
> 
> libitm/ChangeLog:
> 
> 2020-07-08  Szabolcs Nagy  
> 
>   * config/aarch64/sjlj.S: Add BTI marking and related definitions,
>   and add BTI c to function entries.
> 
> ---
> Note: there is some redundancy: the libgcc fixup patch needed
> the same macro definitions, but i did not find a convenient
> place from where both libgcc and libitm can include them. Since
> this is a common problem i expect a change in the assembler
> that will be able to add the note without doing this manually,
> until then i think we can live with the code duplication.

Ok.
Thanks,
Kyrill

> ---
>  libitm/config/aarch64/sjlj.S | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/libitm/config/aarch64/sjlj.S b/libitm/config/aarch64/sjlj.S
> index 5b97b973e27..e2093ca1a97 100644
> --- a/libitm/config/aarch64/sjlj.S
> +++ b/libitm/config/aarch64/sjlj.S
> @@ -24,6 +24,8 @@
> 
>  #include "asmcfi.h"
> 
> +#define BTI_C hint 34
> +
>   .text
>   .align  2
>   .global _ITM_beginTransaction
> @@ -31,6 +33,7 @@
> 
>  _ITM_beginTransaction:
>   cfi_startproc
> + BTI_C
>   mov x1, sp
>   stp x29, x30, [sp, -11*16]!
>   cfi_adjust_cfa_offset(11*16)
> @@ -70,6 +73,7 @@ GTM_longjmp:
>   /* The first parameter becomes the return value (x0).
>  The third parameter is ignored for now.  */
>   cfi_startproc
> + BTI_C
>   ldp x19, x20, [x1, 1*16]
>   ldp x21, x22, [x1, 2*16]
>   ldp x23, x24, [x1, 3*16]
> @@ -87,6 +91,29 @@ GTM_longjmp:
>   cfi_endproc
>   .size   GTM_longjmp, . - GTM_longjmp
> 
> +/* GNU_PROPERTY_AARCH64_* macros from elf.h for use in asm code.  */
> +#define FEATURE_1_AND 0xc0000000
> +#define FEATURE_1_BTI 1
> +#define FEATURE_1_PAC 2
> +
> +/* Add a NT_GNU_PROPERTY_TYPE_0 note.  */
> +#define GNU_PROPERTY(type, value)\
> +  .section .note.gnu.property, "a";  \
> +  .p2align 3;\
> +  .word 4;   \
> +  .word 16;  \
> +  .word 5;   \
> +  .asciz "GNU";  \
> +  .word type;\
> +  .word 4;   \
> +  .word value;   \
> +  .word 0;
> +
>  #if defined(__linux__) || defined(__FreeBSD__)
>  .section .note.GNU-stack, "", %progbits
> +
> +/* Add GNU property note if built with branch protection.  */
> +# ifdef __ARM_FEATURE_BTI_DEFAULT
> +GNU_PROPERTY (FEATURE_1_AND, FEATURE_1_BTI)
> +# endif
>  #endif
> --
> 2.17.1



RE: [PATCH 5/6] aarch64: Fix BTI support in libgcc

2020-07-08 Thread Kyrylo Tkachov



> -----Original Message-----
> From: Szabolcs Nagy 
> Sent: 08 July 2020 17:28
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Kyrylo Tkachov
> 
> Subject: [PATCH 5/6] aarch64: Fix BTI support in libgcc
> 
> lse.S did not have the GNU property note markup and the BTI c
> instructions that are necessary when it is built with branch
> protection.
> 
> The notes are only added when libgcc is built with branch
> protection, because old linkers mishandle the note (merge
> them incorrectly or emit warnings), the BTI instructions
> are added unconditionally.
> 
> Note: BTI c is only necessary at function entry if the function
> may be called indirectly, currently lse functions are not called
> indirectly, but BTI is added for ABI reasons e.g. to allow
> linkers later to emit stub code with indirect jump.
> 

Ok.
Thanks,
Kyrill

> libgcc/ChangeLog:
> 
> 2020-07-08  Szabolcs Nagy  
> 
>   * config/aarch64/lse.S: Add BTI marking and related definitions,
>   and add BTI c to function entries.
> ---
>  libgcc/config/aarch64/lse.S | 26 ++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
> index 9e2acae806b..64691c601c1 100644
> --- a/libgcc/config/aarch64/lse.S
> +++ b/libgcc/config/aarch64/lse.S
> @@ -136,6 +136,8 @@ see the files COPYING3 and COPYING.RUNTIME
> respectively.  If not, see
>  #define tmp1 17
>  #define tmp2 15
> 
> +#define BTI_C hint 34
> +
>  /* Start and end a function.  */
>  .macro   STARTFN name
>   .text
> @@ -145,6 +147,7 @@ see the files COPYING3 and COPYING.RUNTIME
> respectively.  If not, see
>   .type   \name, %function
>   .cfi_startproc
>  \name:
> + BTI_C
>  .endm
> 
>  .macro   ENDFN name
> @@ -275,6 +278,29 @@ STARTFN  NAME(LDNM)
>  ENDFNNAME(LDNM)
>  #endif
> 
> +/* GNU_PROPERTY_AARCH64_* macros from elf.h for use in asm code.  */
> +#define FEATURE_1_AND 0xc0000000
> +#define FEATURE_1_BTI 1
> +#define FEATURE_1_PAC 2
> +
> +/* Add a NT_GNU_PROPERTY_TYPE_0 note.  */
> +#define GNU_PROPERTY(type, value)\
> +  .section .note.gnu.property, "a";  \
> +  .p2align 3;\
> +  .word 4;   \
> +  .word 16;  \
> +  .word 5;   \
> +  .asciz "GNU";  \
> +  .word type;\
> +  .word 4;   \
> +  .word value;   \
> +  .word 0;
> +
>  #if defined(__linux__) || defined(__FreeBSD__)
>  .section .note.GNU-stack, "", %progbits
> +
> +/* Add GNU property note if built with branch protection.  */
> +# ifdef __ARM_FEATURE_BTI_DEFAULT
> +GNU_PROPERTY (FEATURE_1_AND, FEATURE_1_BTI)
> +# endif
>  #endif
> --
> 2.17.1



RE: [PATCH 2/6] aarch64: Add missing ACLE support for PAC-RET

2020-07-08 Thread Kyrylo Tkachov



> -----Original Message-----
> From: Szabolcs Nagy 
> Sent: 08 July 2020 17:27
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Kyrylo Tkachov
> 
> Subject: [PATCH 2/6] aarch64: Add missing ACLE support for PAC-RET
> 
> Define the __ARM_FEATURE_PAC_DEFAULT feature test
> macro when PAC-RET branch protection is enabled.
> 

Ok once the prerequisites are in.
The developer.arm.com website hosting the ACLE spec is having trouble at the 
moment but, being familiar with this part of the spec, the definition looks 
correct to me.

Thanks,
Kyrill

> gcc/ChangeLog:
> 
> 2020-07-08  Szabolcs Nagy  
> 
>   * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Add
>   __ARM_FEATURE_PAC_DEFAULT support.
> 
> ---
> Note: i expect to push this patch after the pac-ret
> __builtin_return_address and unwinder patches are
> resolved so we only advertise pac-ret support in a
> fixed gcc which makes it possible to configure test
> for __builtin_return_address behaviour.
> ---
>  gcc/config/aarch64/aarch64-c.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
> index 1882288af8d..1a1f4ecef04 100644
> --- a/gcc/config/aarch64/aarch64-c.c
> +++ b/gcc/config/aarch64/aarch64-c.c
> @@ -181,6 +181,19 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
>aarch64_def_or_undef (aarch64_bti_enabled (),
>   "__ARM_FEATURE_BTI_DEFAULT", pfile);
> 
> +  cpp_undef (pfile, "__ARM_FEATURE_PAC_DEFAULT");
> +  if (aarch64_ra_sign_scope != AARCH64_FUNCTION_NONE)
> +{
> +  int v = 0;
> +  if (aarch64_ra_sign_key == AARCH64_KEY_A)
> + v |= 1;
> +  if (aarch64_ra_sign_key == AARCH64_KEY_B)
> + v |= 2;
> +  if (aarch64_ra_sign_scope == AARCH64_FUNCTION_ALL)
> + v |= 4;
> +  builtin_define_with_int_value ("__ARM_FEATURE_PAC_DEFAULT", v);
> +}
> +
>aarch64_def_or_undef (TARGET_I8MM,
> "__ARM_FEATURE_MATMUL_INT8", pfile);
>aarch64_def_or_undef (TARGET_BF16_SIMD,
>   "__ARM_FEATURE_BF16_VECTOR_ARITHMETIC",
> pfile);
> --
> 2.17.1



RE: [PATCH 1/6] aarch64: Add missing ACLE support for BTI

2020-07-08 Thread Kyrylo Tkachov



> -----Original Message-----
> From: Szabolcs Nagy 
> Sent: 08 July 2020 17:26
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Kyrylo Tkachov
> 
> Subject: [PATCH 1/6] aarch64: Add missing ACLE support for BTI
> 
> Define the __ARM_FEATURE_BTI_DEFAULT feature test
> macro when BTI branch protection is enabled.
> 

Ok.
Thanks,
Kyrill

> gcc/ChangeLog:
> 
> 2020-07-08  Szabolcs Nagy  
> 
>   * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Add
>   __ARM_FEATURE_BTI_DEFAULT support.
> ---
>  gcc/config/aarch64/aarch64-c.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
> index e1c1cd415dc..1882288af8d 100644
> --- a/gcc/config/aarch64/aarch64-c.c
> +++ b/gcc/config/aarch64/aarch64-c.c
> @@ -178,6 +178,9 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
>aarch64_def_or_undef (TARGET_RNG, "__ARM_FEATURE_RNG", pfile);
>aarch64_def_or_undef (TARGET_MEMTAG,
> "__ARM_FEATURE_MEMORY_TAGGING", pfile);
> 
> +  aarch64_def_or_undef (aarch64_bti_enabled (),
> + "__ARM_FEATURE_BTI_DEFAULT", pfile);
> +
>aarch64_def_or_undef (TARGET_I8MM,
> "__ARM_FEATURE_MATMUL_INT8", pfile);
>aarch64_def_or_undef (TARGET_BF16_SIMD,
>   "__ARM_FEATURE_BF16_VECTOR_ARITHMETIC",
> pfile);
> --
> 2.17.1



RE: [PATCH 4/6] aarch64: Fix noexecstack note in libgcc

2020-07-08 Thread Kyrylo Tkachov



> -----Original Message-----
> From: Szabolcs Nagy 
> Sent: 08 July 2020 17:27
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Kyrylo Tkachov
> 
> Subject: [PATCH 4/6] aarch64: Fix noexecstack note in libgcc
> 
> lse.S did not have GNU stack note, this may cause missing
> PT_GNU_STACK in binaries on Linux and FreeBSD.
> 

Ok.
Thanks,
Kyrill

> libgcc/ChangeLog:
> 
> 2020-07-08  Szabolcs Nagy  
> 
>   * config/aarch64/lse.S: Add stack note.
> ---
>  libgcc/config/aarch64/lse.S | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
> index f3ccf5cf543..9e2acae806b 100644
> --- a/libgcc/config/aarch64/lse.S
> +++ b/libgcc/config/aarch64/lse.S
> @@ -274,3 +274,7 @@ STARTFN   NAME(LDNM)
> 
>  ENDFNNAME(LDNM)
>  #endif
> +
> +#if defined(__linux__) || defined(__FreeBSD__)
> +.section .note.GNU-stack, "", %progbits
> +#endif
> --
> 2.17.1



RE: [PATCH 3/6] aarch64: Fix noexecstack note in libitm

2020-07-08 Thread Kyrylo Tkachov



> -----Original Message-----
> From: Szabolcs Nagy 
> Sent: 08 July 2020 17:27
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Kyrylo Tkachov
> 
> Subject: [PATCH 3/6] aarch64: Fix noexecstack note in libitm
> 
> sjlj.S only had the note on Linux, but it is supposed
> to have it on FreeBSD too.
> 

Ok.
Thanks,
Kyrill

> libitm/ChangeLog:
> 
> 2020-07-08  Szabolcs Nagy  
> 
>   * config/aarch64/sjlj.S: Add stack note if __FreeBSD__ is defined.
> 
> ---
> Note: this is a minor change to make the asm consistent with
> gcc code generation (which emits the note on freebsd too).
> the linker defaults to noexecstack on aarch64 so this should
> not matter much in practice.
> ---
>  libitm/config/aarch64/sjlj.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libitm/config/aarch64/sjlj.S b/libitm/config/aarch64/sjlj.S
> index 27626c1f378..5b97b973e27 100644
> --- a/libitm/config/aarch64/sjlj.S
> +++ b/libitm/config/aarch64/sjlj.S
> @@ -87,6 +87,6 @@ GTM_longjmp:
>   cfi_endproc
>   .size   GTM_longjmp, . - GTM_longjmp
> 
> -#ifdef __linux__
> +#if defined(__linux__) || defined(__FreeBSD__)
>  .section .note.GNU-stack, "", %progbits
>  #endif
> --
> 2.17.1



Re: [PATCH] Add C++2a synchronization support

2020-07-08 Thread Jonathan Wakely via Gcc-patches

On 05/06/20 17:29 -0700, Thomas Rodgers wrote:

Add support for -
   atomic wait/notify_one/notify_all
   counting_semaphore
   binary_semaphore
   latch

   * include/Makefile.am (bits_headers): Add new header.
* include/Makefile.in: Regenerate.
* include/bits/atomic_base.h (__atomic_base<_Itp>::wait): Define.
(__atomic_base<_Itp>::notify_one): Likewise.
(__atomic_base<_Itp>::notify_all): Likewise.
(__atomic_base<_Ptp*>::wait): Likewise.
(__atomic_base<_Ptp*>::notify_one): Likewise.
(__atomic_base<_Ptp*>::notify_all): Likewise.
(__atomic_impl::wait): Likewise.
(__atomic_impl::notify_one): Likewise.
(__atomic_impl::notify_all): Likewise.
(__atomic_float<_Fp>::wait): Likewise.
(__atomic_float<_Fp>::notify_one): Likewise.
(__atomic_float<_Fp>::notify_all): Likewise.
(__atomic_ref<_Tp>::wait): Likewise.
(__atomic_ref<_Tp>::notify_one): Likewise.
(__atomic_ref<_Tp>::notify_all): Likewise.
(atomic_wait<_Tp>): Likewise.
(atomic_wait_explicit<_Tp>): Likewise.
(atomic_notify_one<_Tp>): Likewise.
(atomic_notify_all<_Tp>): Likewise.
* include/bits/atomic_wait.h: New file.
   * include/bits/atomic_timed_wait.h: New file.
   * include/bits/semaphore_base.h: New file.
* include/std/atomic (atomic::wait): Define.
(atomic::wait_one): Likewise.
(atomic::wait_all): Likewise.
(atomic<_Tp>::wait): Likewise.
(atomic<_Tp>::wait_one): Likewise.
(atomic<_Tp>::wait_all): Likewise.
(atomic<_Tp*>::wait): Likewise.
(atomic<_Tp*>::wait_one): Likewise.
(atomic<_Tp*>::wait_all): Likewise.
   * include/std/latch: New file.
   * include/std/semaphore: New file.
   * include/std/version: Add __cpp_lib_semaphore and
   __cpp_lib_latch defines.
* testsuite/29_atomic/atomic/wait_notify/atomic_refs.cc: New test.
* testsuite/29_atomic/atomic/wait_notify/bool.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/integrals.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/floats.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/pointers.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/generic.h: New File.
   * testsuite/30_thread/semaphore/1.cc: New test.
   * testsuite/30_thread/semaphore/2.cc: Likewise.
   * testsuite/30_thread/semaphore/least_max_value_neg.cc: Likewise.
   * testsuite/30_thread/semaphore/try_acquire.cc: Likewise.
   * testsuite/30_thread/semaphore/try_acquire_for.cc: Likewise.
   * testsuite/30_thread/semaphore/try_acquire_futex.cc: Likewise.
   * testsuite/30_thread/semaphore/try_acquire_posix.cc: Likewise.
   * testsuite/30_thread/semaphore/try_acquire_until.cc: Likewise.
   * testsuite/30_thread/latch/1.cc: New test.
   * testsuite/30_thread/latch/2.cc: New test.
   * testsuite/30_thread/latch/3.cc: New test.




diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 80aeb3f8959..b3ac1a3365f 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -52,6 +52,7 @@ std_headers = \
${std_srcdir}/iostream \
${std_srcdir}/istream \
${std_srcdir}/iterator \
+   ${std_srcdir}/latch\


Missing space before the backslash here.


${std_srcdir}/limits \
${std_srcdir}/list \
${std_srcdir}/locale \



--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -823,6 +851,30 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   int(__m1), int(__m2));
  }

+#if __cplusplus > 201703L
+  _GLIBCXX_ALWAYS_INLINE void
+  wait(__pointer_type __old, memory_order __m = memory_order_seq_cst) 
noexcept


This line should be < 80 cols.


+  {
@@ -911,6 +963,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 int(__success), int(__failure));
  }

+#if __cplusplus > 201703L
+template
+  _GLIBCXX_ALWAYS_INLINE void
+  wait(const _Tp* __ptr, _Val<_Tp> __old, memory_order __m = 
memory_order_seq_cst) noexcept


And this one.



@@ -1164,6 +1242,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __cmpexch_failure_order(__order));
  }

+  _GLIBCXX_ALWAYS_INLINE void
+  wait(_Fp __old, memory_order __m = memory_order_seq_cst) const noexcept
+  { __atomic_impl::wait(&_M_fp, __old, __m); }
+
+  // TODO add const volatile overload
+
+  _GLIBCXX_ALWAYS_INLINE void
+  notify_one() const noexcept
+  { __atomic_impl::notify_one(&_M_fp); }
+
+  // TODO add const volatile overload
+
+  _GLIBCXX_ALWAYS_INLINE void
+  notify_all() const noexcept
+  { __atomic_impl::notify_all(&_M_fp); }
+
+  // TODO add const volatile overload


Please add a newline after this comment.


 

[PATCH 6/6] aarch64: Fix BTI support in libitm

2020-07-08 Thread Szabolcs Nagy
sjlj.S did not have the GNU property note markup and the BTI c
instructions that are necessary when it is built with branch
protection.

The notes are only added when libitm is built with branch
protection, because old linkers mishandle the note (merge
them incorrectly or emit warnings), the BTI instructions
are added unconditionally.

libitm/ChangeLog:

2020-07-08  Szabolcs Nagy  

* config/aarch64/sjlj.S: Add BTI marking and related definitions,
and add BTI c to function entries.

---
Note: there is some redundancy: the libgcc fixup patch needed
the same macro definitions, but i did not find a convenient
place from where both libgcc and libitm can include them. Since
this is a common problem i expect a change in the assembler
that will be able to add the note without doing this manually,
until then i think we can live with the code duplication.
---
 libitm/config/aarch64/sjlj.S | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/libitm/config/aarch64/sjlj.S b/libitm/config/aarch64/sjlj.S
index 5b97b973e27..e2093ca1a97 100644
--- a/libitm/config/aarch64/sjlj.S
+++ b/libitm/config/aarch64/sjlj.S
@@ -24,6 +24,8 @@
 
 #include "asmcfi.h"
 
+#define BTI_C  hint 34
+
.text
.align  2
.global _ITM_beginTransaction
@@ -31,6 +33,7 @@
 
 _ITM_beginTransaction:
cfi_startproc
+   BTI_C
mov x1, sp
stp x29, x30, [sp, -11*16]!
cfi_adjust_cfa_offset(11*16)
@@ -70,6 +73,7 @@ GTM_longjmp:
/* The first parameter becomes the return value (x0).
   The third parameter is ignored for now.  */
cfi_startproc
+   BTI_C
ldp x19, x20, [x1, 1*16]
ldp x21, x22, [x1, 2*16]
ldp x23, x24, [x1, 3*16]
@@ -87,6 +91,29 @@ GTM_longjmp:
cfi_endproc
.size   GTM_longjmp, . - GTM_longjmp
 
+/* GNU_PROPERTY_AARCH64_* macros from elf.h for use in asm code.  */
+#define FEATURE_1_AND 0xc0000000
+#define FEATURE_1_BTI 1
+#define FEATURE_1_PAC 2
+
+/* Add a NT_GNU_PROPERTY_TYPE_0 note.  */
+#define GNU_PROPERTY(type, value)  \
+  .section .note.gnu.property, "a";\
+  .p2align 3;  \
+  .word 4; \
+  .word 16;\
+  .word 5; \
+  .asciz "GNU";\
+  .word type;  \
+  .word 4; \
+  .word value; \
+  .word 0;
+
 #if defined(__linux__) || defined(__FreeBSD__)
 .section .note.GNU-stack, "", %progbits
+
+/* Add GNU property note if built with branch protection.  */
+# ifdef __ARM_FEATURE_BTI_DEFAULT
+GNU_PROPERTY (FEATURE_1_AND, FEATURE_1_BTI)
+# endif
 #endif
-- 
2.17.1
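
To make the layout produced by the GNU_PROPERTY macro easier to follow, here is a small sketch that builds the same NT_GNU_PROPERTY_TYPE_0 note into a byte buffer in C. It assumes 4-byte .word values, little-endian byte order, and the value 0xc0000000 for GNU_PROPERTY_AARCH64_FEATURE_1_AND from elf.h. The EX_ names and the functions ex_put_word and ex_build_gnu_property_note are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative constants mirroring the values used in the asm macro.  */
#define EX_NT_GNU_PROPERTY_TYPE_0 5u
#define EX_FEATURE_1_AND 0xc0000000u
#define EX_FEATURE_1_BTI 1u
#define EX_NOTE_SIZE 32u

/* Store a 32-bit word at BUF + OFF in little-endian order and return the
   offset just past it.  */
static size_t
ex_put_word (uint8_t *buf, size_t off, uint32_t w)
{
  buf[off + 0] = (uint8_t) (w & 0xff);
  buf[off + 1] = (uint8_t) ((w >> 8) & 0xff);
  buf[off + 2] = (uint8_t) ((w >> 16) & 0xff);
  buf[off + 3] = (uint8_t) ((w >> 24) & 0xff);
  return off + 4;
}

/* Write one property note into BUF.  Returns the number of bytes written,
   or 0 if CAP is too small.  */
size_t
ex_build_gnu_property_note (uint8_t *buf, size_t cap,
                            uint32_t type, uint32_t value)
{
  assert (buf != NULL);
  if (cap < EX_NOTE_SIZE)
    return 0;
  size_t off = 0;
  off = ex_put_word (buf, off, 4);     /* namesz: "GNU" plus NUL */
  off = ex_put_word (buf, off, 16);    /* descsz: one padded property */
  off = ex_put_word (buf, off, EX_NT_GNU_PROPERTY_TYPE_0);
  memcpy (buf + off, "GNU", 4);        /* name, including the NUL */
  off += 4;
  off = ex_put_word (buf, off, type);  /* pr_type */
  off = ex_put_word (buf, off, 4);     /* pr_datasz */
  off = ex_put_word (buf, off, value); /* pr_data */
  off = ex_put_word (buf, off, 0);     /* padding to 8-byte alignment */
  return off;
}
```

Calling ex_build_gnu_property_note with EX_FEATURE_1_AND and EX_FEATURE_1_BTI yields the 32 bytes that the macro invocation at the end of the file would place in .note.gnu.property, under the assumptions above.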



[PATCH 4/6] aarch64: Fix noexecstack note in libgcc

2020-07-08 Thread Szabolcs Nagy
lse.S did not have GNU stack note, this may cause missing
PT_GNU_STACK in binaries on Linux and FreeBSD.

libgcc/ChangeLog:

2020-07-08  Szabolcs Nagy  

* config/aarch64/lse.S: Add stack note.
---
 libgcc/config/aarch64/lse.S | 4 
 1 file changed, 4 insertions(+)

diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
index f3ccf5cf543..9e2acae806b 100644
--- a/libgcc/config/aarch64/lse.S
+++ b/libgcc/config/aarch64/lse.S
@@ -274,3 +274,7 @@ STARTFN NAME(LDNM)
 
 ENDFN  NAME(LDNM)
 #endif
+
+#if defined(__linux__) || defined(__FreeBSD__)
+.section .note.GNU-stack, "", %progbits
+#endif
-- 
2.17.1



[PATCH 3/6] aarch64: Fix noexecstack note in libitm

2020-07-08 Thread Szabolcs Nagy
sjlj.S only had the note on Linux, but it is supposed
to have it on FreeBSD too.

libitm/ChangeLog:

2020-07-08  Szabolcs Nagy  

* config/aarch64/sjlj.S: Add stack note if __FreeBSD__ is defined.

---
Note: this is a minor change to make the asm consistent with
gcc code generation (which emits the note on freebsd too).
the linker defaults to noexecstack on aarch64 so this should
not matter much in practice.
---
 libitm/config/aarch64/sjlj.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libitm/config/aarch64/sjlj.S b/libitm/config/aarch64/sjlj.S
index 27626c1f378..5b97b973e27 100644
--- a/libitm/config/aarch64/sjlj.S
+++ b/libitm/config/aarch64/sjlj.S
@@ -87,6 +87,6 @@ GTM_longjmp:
cfi_endproc
.size   GTM_longjmp, . - GTM_longjmp
 
-#ifdef __linux__
+#if defined(__linux__) || defined(__FreeBSD__)
 .section .note.GNU-stack, "", %progbits
 #endif
-- 
2.17.1



[PATCH 5/6] aarch64: Fix BTI support in libgcc

2020-07-08 Thread Szabolcs Nagy
lse.S did not have the GNU property note markup and the BTI c
instructions that are necessary when it is built with branch
protection.

The notes are only added when libgcc is built with branch
protection, because old linkers mishandle the note (merge
them incorrectly or emit warnings), the BTI instructions
are added unconditionally.

Note: BTI c is only necessary at function entry if the function
may be called indirectly, currently lse functions are not called
indirectly, but BTI is added for ABI reasons e.g. to allow
linkers later to emit stub code with indirect jump.

libgcc/ChangeLog:

2020-07-08  Szabolcs Nagy  

* config/aarch64/lse.S: Add BTI marking and related definitions,
and add BTI c to function entries.
---
 libgcc/config/aarch64/lse.S | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
index 9e2acae806b..64691c601c1 100644
--- a/libgcc/config/aarch64/lse.S
+++ b/libgcc/config/aarch64/lse.S
@@ -136,6 +136,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 #define tmp1   17
 #define tmp2   15
 
+#define BTI_C  hint 34
+
 /* Start and end a function.  */
 .macro STARTFN name
.text
@@ -145,6 +147,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
.type   \name, %function
.cfi_startproc
 \name:
+   BTI_C
 .endm
 
 .macro ENDFN name
@@ -275,6 +278,29 @@ STARTFNNAME(LDNM)
 ENDFN  NAME(LDNM)
 #endif
 
+/* GNU_PROPERTY_AARCH64_* macros from elf.h for use in asm code.  */
+#define FEATURE_1_AND 0xc0000000
+#define FEATURE_1_BTI 1
+#define FEATURE_1_PAC 2
+
+/* Add a NT_GNU_PROPERTY_TYPE_0 note.  */
+#define GNU_PROPERTY(type, value)  \
+  .section .note.gnu.property, "a";\
+  .p2align 3;  \
+  .word 4; \
+  .word 16;\
+  .word 5; \
+  .asciz "GNU";\
+  .word type;  \
+  .word 4; \
+  .word value; \
+  .word 0;
+
 #if defined(__linux__) || defined(__FreeBSD__)
 .section .note.GNU-stack, "", %progbits
+
+/* Add GNU property note if built with branch protection.  */
+# ifdef __ARM_FEATURE_BTI_DEFAULT
+GNU_PROPERTY (FEATURE_1_AND, FEATURE_1_BTI)
+# endif
 #endif
-- 
2.17.1



[PATCH 1/6] aarch64: Add missing ACLE support for BTI

2020-07-08 Thread Szabolcs Nagy
Define the __ARM_FEATURE_BTI_DEFAULT feature test
macro when BTI branch protection is enabled.

gcc/ChangeLog:

2020-07-08  Szabolcs Nagy  

* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Add
__ARM_FEATURE_BTI_DEFAULT support.
---
 gcc/config/aarch64/aarch64-c.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index e1c1cd415dc..1882288af8d 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -178,6 +178,9 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
   aarch64_def_or_undef (TARGET_RNG, "__ARM_FEATURE_RNG", pfile);
   aarch64_def_or_undef (TARGET_MEMTAG, "__ARM_FEATURE_MEMORY_TAGGING", pfile);
 
+  aarch64_def_or_undef (aarch64_bti_enabled (),
+   "__ARM_FEATURE_BTI_DEFAULT", pfile);
+
   aarch64_def_or_undef (TARGET_I8MM, "__ARM_FEATURE_MATMUL_INT8", pfile);
   aarch64_def_or_undef (TARGET_BF16_SIMD,
"__ARM_FEATURE_BF16_VECTOR_ARITHMETIC", pfile);
-- 
2.17.1



[PATCH 2/6] aarch64: Add missing ACLE support for PAC-RET

2020-07-08 Thread Szabolcs Nagy
Define the __ARM_FEATURE_PAC_DEFAULT feature test
macro when PAC-RET branch protection is enabled.

gcc/ChangeLog:

2020-07-08  Szabolcs Nagy  

* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Add
__ARM_FEATURE_PAC_DEFAULT support.

---
Note: i expect to push this patch after the pac-ret
__builtin_return_address and unwinder patches are
resolved so we only advertise pac-ret support in a
fixed gcc which makes it possible to configure test
for __builtin_return_address behaviour.
---
 gcc/config/aarch64/aarch64-c.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 1882288af8d..1a1f4ecef04 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -181,6 +181,19 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
   aarch64_def_or_undef (aarch64_bti_enabled (),
"__ARM_FEATURE_BTI_DEFAULT", pfile);
 
+  cpp_undef (pfile, "__ARM_FEATURE_PAC_DEFAULT");
+  if (aarch64_ra_sign_scope != AARCH64_FUNCTION_NONE)
+{
+  int v = 0;
+  if (aarch64_ra_sign_key == AARCH64_KEY_A)
+   v |= 1;
+  if (aarch64_ra_sign_key == AARCH64_KEY_B)
+   v |= 2;
+  if (aarch64_ra_sign_scope == AARCH64_FUNCTION_ALL)
+   v |= 4;
+  builtin_define_with_int_value ("__ARM_FEATURE_PAC_DEFAULT", v);
+}
+
   aarch64_def_or_undef (TARGET_I8MM, "__ARM_FEATURE_MATMUL_INT8", pfile);
   aarch64_def_or_undef (TARGET_BF16_SIMD,
"__ARM_FEATURE_BF16_VECTOR_ARITHMETIC", pfile);
-- 
2.17.1
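
The bit assignment in the hunk above can be restated as a small stand-alone function, which may help when checking the macro value against a given branch-protection configuration. This is only a sketch of the same arithmetic: the enums ex_sign_scope and ex_sign_key and the function ex_pac_default are hypothetical stand-ins for the GCC internals, and the return value -1 is used here to mean "macro not defined".

```c
#include <assert.h>

/* Hypothetical stand-ins for aarch64_ra_sign_scope and
   aarch64_ra_sign_key.  */
enum ex_sign_scope { EX_SCOPE_NONE, EX_SCOPE_NON_LEAF, EX_SCOPE_ALL };
enum ex_sign_key { EX_KEY_A, EX_KEY_B };

/* Compute the value __ARM_FEATURE_PAC_DEFAULT would get, following the
   logic of the patch: bit 0 for key A, bit 1 for key B, bit 2 when all
   functions (including leaf functions) are signed.  Returns -1 when
   return-address signing is off and the macro stays undefined.  */
int
ex_pac_default (enum ex_sign_scope scope, enum ex_sign_key key)
{
  assert (key == EX_KEY_A || key == EX_KEY_B);
  if (scope == EX_SCOPE_NONE)
    return -1;
  int v = 0;
  if (key == EX_KEY_A)
    v |= 1;
  if (key == EX_KEY_B)
    v |= 2;
  if (scope == EX_SCOPE_ALL)
    v |= 4;
  return v;
}
```

User code would normally test the macro itself, for example with `#if defined(__ARM_FEATURE_PAC_DEFAULT) && (__ARM_FEATURE_PAC_DEFAULT & 4)` to detect that leaf functions are signed as well.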



[PATCH 0/6] aarch64: Fix target libraries for BTI [PR96001]

2020-07-08 Thread Szabolcs Nagy
Some asm files in target libraries were not branch protected
and gcc missed preprocessor macros when it is generating
branch protection code (which is needed to fix the asm files).

Szabolcs Nagy (6):
  aarch64: Add missing ACLE support for BTI
  aarch64: Add missing ACLE support for PAC-RET
  aarch64: Fix noexecstack note in libitm
  aarch64: Fix noexecstack note in libgcc
  aarch64: Fix BTI support in libgcc
  aarch64: Fix BTI support in libitm

 gcc/config/aarch64/aarch64-c.c | 16 
 libgcc/config/aarch64/lse.S| 30 ++
 libitm/config/aarch64/sjlj.S   | 29 -
 3 files changed, 74 insertions(+), 1 deletion(-)

-- 
2.17.1



[PATCH ver 4] RS6000, add VSX mask manipulation support

2020-07-08 Thread Carl Love via Gcc-patches
Will, Segher:

I fixed up the patch based on Will's comments.  I thought I had made
and committed the fixes that Will caught, but no.  Sorry about
that.  I will get this right yet.

Carl Love
---

Version 4
  vec_mtvsrbm was commented out in ver 3.  Forgot to go back and actually
remove it.  I was supposed to after testing. It is no longer needed with
the removal of vec_mtvsrbm_mtvsrbmi. Removed it from ChangeLog.
  Clarification, vec_mtvsrbm_mtvsrbmi was removed in version 3.  Updated the
code to use vec_mtvsr_v16qi instead.  Hopefully that clarifies Will's
comment about "Reworked define_expand vec_mtvsrbm_mtvsrbmi" in version 3.
  Fixed ChangeLog, replaced the FUTURE with P10 that I missed previously.

---
version 3
  rebased onto mainline 7/7/2020
  Change FUTURE to P10 in code and ChangeLog.
  ChangeLog, fixed the name of a couple of files which were wrong.
  Reformated define_mode_attr VSX_MM_SUFFIX definition to shorten the line.
  Reworked define_expand "vec_mtvsrbm_mtvsrbmi" as it will not work as
intended.
  Changed vsx_register_operand to altivec_register_operand for "v"
constraint.
  Removed --save-temps from test cases as it is not needed.
  Reran regression testing and ran test cases manually on mambo.

---
version 2

Addressed Will's comments
  - ChangeLog: fixed name/symbol order;
changed reference from rs6000-c.c to rs6000-builtin.def.

  - define_expand "vec_mtvsrbm": changed name to vec_mtvsrbm_mtvsrbmi,
updated comment

  - vsx_mask-runnable.c: divided it up into four smaller test cases,
vsx_mask-count-runnable.c, vsx_mask-expand-runnable.c,
vsx_mask-extract-runnable.c, vsx_mask-move-runnable.c.

---
RS6000 RFC 2629, add VSX mask manipulation support

The following patch adds support for the builtins vec_genbm(), vec_genhm(),
vec_genwm(), vec_gendm(), vec_genqm(), vec_cntm(), vec_expandm() and
vec_extractm().  It also adds support for the instructions mtvsrbm, mtvsrhm,
mtvsrwm, mtvsrdm, mtvsrqm, cntm, vexpandm and vextractm.
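
For readers new to these operations, the byte-element variants can be pictured with a plain scalar model in C. This is only a sketch and does not use the real builtins or altivec.h: the functions ex_genbm, ex_expandm, ex_extractm and ex_cntm are hypothetical, and the mapping of mask bit i to array element i is an assumption (the element order of the real instructions depends on endianness).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { EX_NBYTES = 16 };

/* Model of generating a byte mask: element i becomes all ones when bit i
   of MASK is set, all zeros otherwise.  */
void
ex_genbm (uint16_t mask, uint8_t out[EX_NBYTES])
{
  assert (out != NULL);
  for (size_t i = 0; i < EX_NBYTES; i++)
    out[i] = ((mask >> i) & 1u) ? 0xFF : 0x00;
}

/* Model of expanding a mask: replicate the most significant bit of each
   element across the whole element.  */
void
ex_expandm (const uint8_t in[EX_NBYTES], uint8_t out[EX_NBYTES])
{
  assert (in != NULL && out != NULL);
  for (size_t i = 0; i < EX_NBYTES; i++)
    out[i] = (in[i] & 0x80) ? 0xFF : 0x00;
}

/* Model of extracting a mask: gather the most significant bit of each
   element into bit i of the result.  */
uint16_t
ex_extractm (const uint8_t in[EX_NBYTES])
{
  assert (in != NULL);
  uint16_t r = 0;
  for (size_t i = 0; i < EX_NBYTES; i++)
    r |= (uint16_t) ((in[i] >> 7) << i);
  return r;
}

/* Model of counting mask bits: number of elements whose most significant
   bit equals MP (0 or 1).  */
unsigned
ex_cntm (const uint8_t in[EX_NBYTES], unsigned mp)
{
  assert (in != NULL && mp <= 1u);
  unsigned n = 0;
  for (size_t i = 0; i < EX_NBYTES; i++)
    if ((unsigned) (in[i] >> 7) == mp)
      n++;
  return n;
}
```

In this model, generating a mask from 0x0005 and extracting it again returns 0x0005, and counting with mp set to 1 gives 2.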

The patch has been tested on:

  powerpc64le-unknown-linux-gnu (Power 9 LE)

and mambo with no regression errors.

Please let me know if this patch is acceptable for inclusion in the mainline
branch.  Thanks.

   Carl Love
---

RS6000, add VSX mask manipulation support

gcc/ChangeLog

2020-07-07  Carl Love  

* config/rs6000/vsx.md  (VSX_MM): New define_mode_iterator.
(VSX_MM4): New define_mode_iterator.
(VSX_MM_SUFFIX4): New define_mode_attr.
(vec_mtvsrbmi): New define_insn.
(vec_mtvsr_): New define_insn.
(vec_cntmb_): New define_insn.
(vec_extract_): New define_insn.
(vec_expand_): New define_insn.
(define_c_enum unspec): Add entries UNSPEC_MTVSBM, UNSPEC_VCNTMB,
UNSPEC_VEXTRACT, UNSPEC_VEXPAND.
* config/rs6000/altivec.h ( vec_genbm, vec_genhm, vec_genwm,
vec_gendm, vec_genqm, vec_cntm, vec_expandm, vec_extractm): Add
defines.
* config/rs6000/rs6000-builtin.def: Add defines BU_P10_2, BU_P10_1.
(BU_P10_1): Add definitions for mtvsrbm, mtvsrhm, mtvsrwm,
mtvsrdm, mtvsrqm, vexpandmb, vexpandmh, vexpandmw, vexpandmd,
vexpandmq, vextractmb, vextractmh, vextractmw, vextractmd, vextractmq.
(BU_P10_2): Add definitions for cntmbb, cntmbh, cntmbw, cntmbd.
(BU_P10_OVERLOAD_1): Add definitions for mtvsrbm, mtvsrhm,
mtvsrwm, mtvsrdm, mtvsrqm, vexpandm, vextractm.
(BU_P10_OVERLOAD_2): Add definition for cntm.
* config/rs6000/rs6000-call.c (rs6000_expand_binop_builtin): Add
checks for CODE_FOR_vec_cntmbb_v16qi, CODE_FOR_vec_cntmb_v8hi,
CODE_FOR_vec_cntmb_v4si, CODE_FOR_vec_cntmb_v2di.
(altivec_overloaded_builtins): Add overloaded argument entries for
P10_BUILTIN_VEC_MTVSRBM, P10_BUILTIN_VEC_MTVSRHM,
P10_BUILTIN_VEC_MTVSRWM, P10_BUILTIN_VEC_MTVSRDM,
P10_BUILTIN_VEC_MTVSRQM, P10_BUILTIN_VEC_VCNTMBB,
P10_BUILTIN_VCNTMBB, P10_BUILTIN_VCNTMBH,
P10_BUILTIN_VCNTMBW, P10_BUILTIN_VCNTMBD,
P10_BUILTIN_VEXPANDMB, P10_BUILTIN_VEXPANDMH,
P10_BUILTIN_VEXPANDMW, P10_BUILTIN_VEXPANDMD,
P10_BUILTIN_VEXPANDMQ, P10_BUILTIN_VEXTRACTMB,
P10_BUILTIN_VEXTRACTMH, P10_BUILTIN_VEXTRACTMW,
P10_BUILTIN_VEXTRACTMD, P10_BUILTIN_VEXTRACTMQ.
(builtin_function_type): Add case entries for P10_BUILTIN_MTVSRBM,
P10_BUILTIN_MTVSRHM, P10_BUILTIN_MTVSRWM, P10_BUILTIN_MTVSRDM,
P10_BUILTIN_MTVSRQM, P10_BUILTIN_VCNTMBB, P10_BUILTIN_VCNTMBH,
P10_BUILTIN_VCNTMBW, P10_BUILTIN_VCNTMBD,
P10_BUILTIN_VEXPANDMB, P10_BUILTIN_VEXPANDMH,
P10_BUILTIN_VEXPANDMW, P10_BUILTIN_VEXPANDMD,

Re: [PATCH v2, rs6000] Add support to enable vmsumudm behind vec_msum builtin.

2020-07-08 Thread will schmidt via Gcc-patches
On Tue, 2020-06-30 at 18:39 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Jun 30, 2020 at 12:57:45PM -0500, will schmidt wrote:
> >   Add support for the vmsumudm instruction and tie it into the
> > vec_msum
> >   built-in to support the variants of that built-in using vector
> >  _int128 parameters.
> > 2020-06-18  Will Schmidt  
> > 
> > * config/rs6000/altivec.h (vec_vmsumudm): New define.
> > * config/rs6000/altivec.md (UNSPEC_VMSUMUDM): New unspec.
> > (altivec_vmsumudm): New define_insn.
> > * config/rs6000/rs6000-builtin.def (altivec_vmsumudm): New
> > BU_ALTIVEC_3 entry. (vmsumudm): New BU_ALTIVEC_OVERLOAD_3
> > entry.
> 
> No line break before (vmsumudm) please.
> 
> > * config/rs6000/rs6000-call.c
> > (altivec_overloaded_builtins):
> > Add entries for ALTIVEC_BUILTIN_VMSUMUDM variants of
> > vec_msum.
> 
> The patch is okay for trunk with that (and some int128 selector in
> the
> testcases that need one).  Thanks!

Thanks for the review, etc.

OK for backports too? (after baking on trunk for a bit).

thanks
-Will

> 
> 
> Segher
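
For readers unfamiliar with the instruction discussed in this thread, the
following is a scalar sketch of the arithmetic that vmsumudm performs, as I
understand it: the two 64-bit element pairs are multiplied, and both 128-bit
products are added to a 128-bit addend, wrapping modulo 2^128.  The name
model_vmsumudm and the array-based interface are hypothetical, and the sketch
assumes a compiler and target that provide unsigned __int128.

```c
#include <stdint.h>

typedef unsigned __int128 u128;

/* Model of vmsumudm: multiply corresponding 64-bit elements, then add
   both 128-bit products and the 128-bit addend C.  Unsigned arithmetic
   wraps modulo 2^128, matching the "modulo" form of the instruction.  */
u128
model_vmsumudm (const uint64_t a[2], const uint64_t b[2], u128 c)
{
  return (u128) a[0] * b[0] + (u128) a[1] * b[1] + c;
}
```

The casts to u128 before each multiplication matter: without them the
products would be truncated to 64 bits before the sum is formed.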



RE: [PATCH 2/4] aarch64: fix __builtin_eh_return with pac-ret [PR94891]

2020-07-08 Thread Kyrylo Tkachov


> -----Original Message-----
> From: Szabolcs Nagy 
> Sent: 08 July 2020 16:48
> To: Kyrylo Tkachov 
> Cc: gcc-patches@gcc.gnu.org; fwei...@redhat.com; Richard Earnshaw
> ; Daniel Kiss 
> Subject: Re: [PATCH 2/4] aarch64: fix __builtin_eh_return with pac-ret
> [PR94891]
> 
> The 07/08/2020 13:24, Kyrylo Tkachov wrote:
> > Hi Szabolcs,
> > > The 06/05/2020 17:51, Szabolcs Nagy wrote:
> > > > --- a/gcc/config/aarch64/aarch64.c
> > > > +++ b/gcc/config/aarch64/aarch64.c
> > > > @@ -6954,6 +6954,10 @@ aarch64_return_address_signing_enabled
> (void)
> > > >/* This function should only be called after frame laid out.   */
> > > >gcc_assert (cfun->machine->frame.laid_out);
> > > >
> > > > +  /* TODO: Big hammer handling of __builtin_eh_return.  */
> >
> > ... I don't think this comment is very useful. Please make it a bit more
> descriptive. If you want to leave the TODO here, please give a more concrete
> action plan.
> 
> see attached patch with more detailed comment and commit message.

Nice, thanks.
Kyrill




Re: [PATCH 2/4] aarch64: fix __builtin_eh_return with pac-ret [PR94891]

2020-07-08 Thread Szabolcs Nagy
The 07/08/2020 13:24, Kyrylo Tkachov wrote:
> Hi Szabolcs,
> > The 06/05/2020 17:51, Szabolcs Nagy wrote:
> > > --- a/gcc/config/aarch64/aarch64.c
> > > +++ b/gcc/config/aarch64/aarch64.c
> > > @@ -6954,6 +6954,10 @@ aarch64_return_address_signing_enabled (void)
> > >/* This function should only be called after frame laid out.   */
> > >gcc_assert (cfun->machine->frame.laid_out);
> > >
> > > +  /* TODO: Big hammer handling of __builtin_eh_return.  */
> 
> ... I don't think this comment is very useful. Please make it a bit more 
> descriptive. If you want to leave the TODO here, please give a more concrete 
> action plan.

see attached patch with more detailed comment and commit message.

From e0f4b9b94be1b59c3141abc136ea387bb43fcdce Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy 
Date: Thu, 4 Jun 2020 13:42:16 +0100
Subject: [PATCH v2] aarch64: fix __builtin_eh_return with pac-ret [PR94891]

The handler argument must not be signed since that may come from
outside the current module and exposing signed addresses is a pointer
ABI break. (The signed address also may not be representable as void *
which is why pac-ret is currently broken on ilp32.)

There is no point protecting the eh return path with pointer auth
since an arbitrary target can be reached with the instruction sequence
in the caller function anyway.  However, this is a big hammer solution
that turns off pac-ret for the caller completely, not just on the eh
return path.  A proper fix would change eh return to use an indirect
branch instead of ret (and ensure BTI j landing pads are in place);
this is not attempted so the patch remains small and backportable.
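
The decision described above can be sketched as a small standalone model.
This is not the GCC implementation: enum sign_scope, struct func_info and
signing_enabled are hypothetical stand-ins for the compiler's internal state,
and the non-leaf rule is taken from the existing comment in
aarch64_return_address_signing_enabled (a function is only signed in that
scope if its LR is pushed onto the stack).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the compiler's internal state.  */
enum sign_scope { SCOPE_NONE, SCOPE_NON_LEAF, SCOPE_ALL };

struct func_info
{
  bool calls_eh_return;   /* The function uses __builtin_eh_return.  */
  bool lr_saved;          /* LR is pushed onto the stack.  */
};

/* Return true if the return address of the function should be signed.  */
bool
signing_enabled (enum sign_scope scope, const struct func_info *f)
{
  /* Big hammer: any use of __builtin_eh_return turns signing off for
     the whole function, not only on the eh return path.  */
  if (f->calls_eh_return)
    return false;

  return scope == SCOPE_ALL
	 || (scope == SCOPE_NON_LEAF && f->lr_saved);
}
```

The early return is what makes the change small: nothing on the normal
signing path needs to know about eh return.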

2020-07-08  Szabolcs Nagy  

gcc/ChangeLog:

	PR target/94891
	* config/aarch64/aarch64.c (aarch64_return_address_signing_enabled):
	Disable return address signing if __builtin_eh_return is used.

gcc/testsuite/ChangeLog:

	PR target/94891
	* gcc.target/aarch64/return_address_sign_1.c: Update test.
	* gcc.target/aarch64/return_address_sign_b_1.c: Likewise.
---
 gcc/config/aarch64/aarch64.c  | 11 +++
 .../gcc.target/aarch64/return_address_sign_1.c|  8 
 .../gcc.target/aarch64/return_address_sign_b_1.c  |  8 
 3 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5865d1d7b78..3ad96e07b7b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6957,6 +6957,17 @@ aarch64_return_address_signing_enabled (void)
   /* This function should only be called after frame laid out.   */
   gcc_assert (cfun->machine->frame.laid_out);
 
+  /* Turn return address signing off in any function that uses
+ __builtin_eh_return.  The address passed to __builtin_eh_return
+ is not signed so either it has to be signed (with original sp)
+ or the code path that uses it has to avoid authenticating it.
+ Currently eh return introduces a return to anywhere gadget, no
+ matter what we do here since it uses ret with user provided
+ address. An ideal fix for that is to use indirect branch which
+ can be protected with BTI j (to some extent).  */
+  if (crtl->calls_eh_return)
+return false;
+
   /* If signing scope is AARCH64_FUNCTION_NON_LEAF, we only sign a leaf function
  if its LR is pushed onto stack.  */
   return (aarch64_ra_sign_scope == AARCH64_FUNCTION_ALL
diff --git a/gcc/testsuite/gcc.target/aarch64/return_address_sign_1.c b/gcc/testsuite/gcc.target/aarch64/return_address_sign_1.c
index 0140bee194f..232ba67ade0 100644
--- a/gcc/testsuite/gcc.target/aarch64/return_address_sign_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/return_address_sign_1.c
@@ -41,12 +41,12 @@ func3 (int a, int b, int c)
 void __attribute__ ((target ("arch=armv8.3-a")))
 func4 (long offset, void *handler, int *ptr, int imm1, int imm2)
 {
-  /* paciasp */
+  /* no paciasp */
   *ptr = imm1 + foo (imm1) + imm2;
   __builtin_eh_return (offset, handler);
-  /* autiasp */
+  /* no autiasp */
   return;
 }
 
-/* { dg-final { scan-assembler-times "autiasp" 4 } } */
-/* { dg-final { scan-assembler-times "paciasp" 4 } } */
+/* { dg-final { scan-assembler-times "autiasp" 3 } } */
+/* { dg-final { scan-assembler-times "paciasp" 3 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_1.c b/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_1.c
index 32d788ddf3f..43e32ab6cb7 100644
--- a/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/return_address_sign_b_1.c
@@ -41,12 +41,12 @@ func3 (int a, int b, int c)
 void __attribute__ ((target ("arch=armv8.3-a")))
 func4 (long offset, void *handler, int *ptr, int imm1, int imm2)
 {
-  /* pacibsp */
+  /* no pacibsp */
   *ptr = imm1 + foo (imm1) + imm2;
   __builtin_eh_return (offset, handler);
-  /* autibsp */
+  /* no autibsp */
   return;
 }
 
-/* { dg-final { scan-assembler-times "pacibsp" 4 } } */
-/* { dg-final { scan-assembler-times "autibsp" 

Re: [PATCH v3] RS6000, add VSX mask manipulation support

2020-07-08 Thread will schmidt via Gcc-patches
On Tue, 2020-07-07 at 16:19 -0700, Carl Love wrote:
> Segher:
> 
> I have fixed the issues you mentioned in version 2. I also rebased the
> patch onto the latest mainline.  This resulted in having to change
> FUTURE to P10 everywhere.  
> 
> I reran regression testing on Power 9 with no regression issues.
> I also ran test cases manually on mambo. 
> 
> Please let me know if the patch is acceptable for mainline.  Thanks for
> your time and previous reviews of the patch.
> 

>  Carl Love
> -
> 
> version 3  Changes
>   rebased onto mainline 7/7/2020
>   Change FUTURE to P10 in code and ChangeLog.
>   ChangeLog, fixed the name of a couple of files which were wrong.
>   Reformatted define_mode_attr VSX_MM_SUFFIX definition to shorten the 
>line.
>   Reworked define_expand "vec_mtvsrbm_mtvsrbmi" as it will not work as
> intended.

And renamed?  I don't see vec_mtvsrbm_mtvsrbmi referenced in the patch.

Did this get renamed back to vec_mtvsrbm, reversing that version 2 change?


>   Changed vsx_register_operand to altivec_register_operand for "v"
> constraint.
>   Removed --save-temps from test cases as it is not needed.
> 
> 
> ---
> version 2 Changes
> 
> Addressed Will's comments
>   - ChangeLog: fixed name/symbol order;
> changed reference from rs6000-c.c to rs6000-builtin.def.
> 
>   - define_expand "vec_mtvsrbm": changed name to vec_mtvsrbm_mtvsrbmi,
> updated comment
> 
>   - vsx_mask-runnable.c: divided it up into four smaller test cases,
> vsx_mask-count-runnable.c, vsx_mask-expane-runnable.c,
> vsx_mask-extract-runnable.c, vsx_mask-move-runnable.c.
> 
> ---
> RS6000 RFC 2629, add VSX mask manipulation support
> 
> The following patch adds support for builtins vec_genbm(),  vec_genhm(),
> vec_genwm(), vec_gendm(), vec_genqm(), vec_cntm(), vec_expandm(),
> vec_extractm().  Support for instructions mtvsrbm, mtvsrhm, mtvsrwm,
> mtvsrdm, mtvsrqm, cntm, vexpandm, vextractm.
> 
> The test has been tested on:
> 
>   powerpc64le-unknown-linux-gnu (Power 9 LE)
> 
> and mambo with no regression errors.
> 
> Please let me know if this patch is acceptable for inclusion in the mainline
> branch.  Thanks.
> 
>Carl Love
> ---
> 
> RS6000, add VSX mask manipulation support
> 
> gcc/ChangeLog
> 
> 2020-07-07  Carl Love  
> 
>   * config/rs6000/vsx.md  (VSX_MM): New define_mode_iterator.
>   (VSX_MM4): New define_mode_iterator.
>   (VSX_MM_SUFFIX4): New define_mode_attr.
>   (vec_mtvsrbm): New define_expand.

This (vec_mtvsrbm) is commented out in the patch below.


>   (vec_mtvsrbmi): New define_insn.
>   (vec_mtvsr_): New define_insn.
>   (vec_cntmb_): New define_insn.
>   (vec_extract_): New define_insn.
>   (vec_expand_): New define_insn.
>   (define_c_enum unspec): Add entries UNSPEC_MTVSBM, UNSPEC_VCNTMB,
>   UNSPEC_VEXTRACT, UNSPEC_VEXPAND.
>   * config/rs6000/altivec.h ( vec_genbm, vec_genhm, vec_genwm,
>   vec_gendm, vec_genqm, vec_cntm, vec_expandm, vec_extractm): Add
>   defines.
>   * config/rs6000/rs6000-builtin.def: Add defines BU_P10_2, BU_P10_1.
>   (BU_P10_1): Add definitions for mtvsrbm, mtvsrhm, mtvsrwm,
>   mtvsrdm, mtvsrqm, vexpandmb, vexpandmh, vexpandmw, vexpandmd,
>   vexpandmq, vextractmb, vextractmh, vextractmw, vextractmd, vextractmq.
>   (BU_P10_2): Add definitions for cntmbb, cntmbh, cntmbw, cntmbd.
>   (BU_P10_OVERLOAD_1): Add definitions for mtvsrbm, mtvsrhm,
>   mtvsrwm, mtvsrdm, mtvsrqm, vexpandm, vextractm.
>   (BU_P10_OVERLOAD_2): Add definition for cntm.
>   * config/rs6000/rs6000-call.c (rs6000_expand_binop_builtin): Add
>   checks for CODE_FOR_vec_cntmbb_v16qi, CODE_FOR_vec_cntmb_v8hi,
>   CODE_FOR_vec_cntmb_v4si, CODE_FOR_vec_cntmb_v2di.
>   (altivec_overloaded_builtins): Add overloaded argument entries for
>   FUTURE_BUILTIN_VEC_MTVSRBM, FUTURE_BUILTIN_VEC_MTVSRHM,
>   FUTURE_BUILTIN_VEC_MTVSRWM, FUTURE_BUILTIN_VEC_MTVSRDM,
>   FUTURE_BUILTIN_VEC_MTVSRQM, FUTURE_BUILTIN_VEC_VCNTMBB,
>   FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH,
>   FUTURE_BUILTIN_VCNTMBW, FUTURE_BUILTIN_VCNTMBD,
>   FUTURE_BUILTIN_VEXPANDMB, FUTURE_BUILTIN_VEXPANDMH,
>   FUTURE_BUILTIN_VEXPANDMW, FUTURE_BUILTIN_VEXPANDMD,
>   FUTURE_BUILTIN_VEXPANDMQ, FUTURE_BUILTIN_VEXTRACTMB,
>   FUTURE_BUILTIN_VEXTRACTMH, FUTURE_BUILTIN_VEXTRACTMW,
>   FUTURE_BUILTIN_VEXTRACTMD, FUTURE_BUILTIN_VEXTRACTMQ.
>   (builtin_function_type): Add case entries for FUTURE_BUILTIN_MTVSRBM,
>   FUTURE_BUILTIN_MTVSRHM, FUTURE_BUILTIN_MTVSRWM, FUTURE_BUILTIN_MTVSRDM,
>   FUTURE_BUILTIN_MTVSRQM, FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH,
>   FUTURE_BUILTIN_VCNTMBW, FUTURE_BUILTIN_VCNTMBD,
>   

Re: [patch] Make memory copy functions scalar storage order barriers

2020-07-08 Thread Richard Biener via Gcc-patches
On Wed, Jul 8, 2020 at 10:53 AM Eric Botcazou  wrote:
>
> [Sorry for dropping the ball here]
>
> > But GCC does not see the reverse storage order in mymemcpy so
> > it happily folds the memcpy inside it, inlines the result and then?
>
> You're right, this breaks, hence the following alternative: either we prevent
> inlining from happening, or we declare that this is simply not supported and
> warn (there is a -Wscalar-storage-order warning for problematic constructs).
>
> I didn't find any existing infrastructure for the former and I'm not sure it's
> worth adding, so the attached implements the latter.  Tested on x86-64/Linux.
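
To make the underlying hazard concrete without relying on the
scalar_storage_order attribute, here is a portable sketch.  It models a
big-endian field as four explicit bytes: a field-wise load converts the
value, whereas a raw memcpy just reinterprets the bytes in host order, which
is why a memory copy between objects of different storage orders cannot be
folded into a plain scalar assignment.  The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Store V into BUF in big-endian byte order, one byte at a time.  */
void
store_be32 (unsigned char buf[4], uint32_t v)
{
  buf[0] = (unsigned char) (v >> 24);
  buf[1] = (unsigned char) (v >> 16);
  buf[2] = (unsigned char) (v >> 8);
  buf[3] = (unsigned char) v;
}

/* Field-wise load: the byte order is converted, so the value survives.  */
uint32_t
load_be32 (const unsigned char buf[4])
{
  return ((uint32_t) buf[0] << 24) | ((uint32_t) buf[1] << 16)
	 | ((uint32_t) buf[2] << 8) | (uint32_t) buf[3];
}

/* Raw copy: the bytes are reinterpreted in host byte order, so on a
   little-endian host the value comes back byte-swapped.  */
uint32_t
load_raw32 (const unsigned char buf[4])
{
  uint32_t v;
  memcpy (&v, buf, sizeof v);
  return v;
}
```

On a big-endian host both loads agree; on a little-endian host only
load_be32 returns the stored value.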

OK with me.

I still believe we could handle reverse storage order more "optimistically"
(without all the current usage restrictions).  We seem to have no problems
with address-spaces in this area for example (their problematic cases are
of course slightly different).

Richard.

>
> 2020-07-08  Eric Botcazou  
>
> c-family/
> * c.opt (Wscalar-storage-order): Add warn_scalar_storage_order 
> variable.
>
>
> 2020-07-08  Eric Botcazou  
>
> c/
> * c-typeck.c (convert_for_assignment): If -Wscalar-storage-order is 
> set,
> warn for conversions between pointers that point to incompatible 
> scalar
> storage orders.
>
>
> 2020-07-08  Eric Botcazou  
>
> * gimple-fold.c (gimple_fold_builtin_memory_op): Do not fold if either
> type has reverse scalar storage order.
> * tree-ssa-sccvn.c (vn_reference_lookup_3): Do not propagate through a
> memory copy if either type has reverse scalar storage order.
>
>
> 2020-07-08  Eric Botcazou  
>
> testsuite/
> * gcc.dg/sso-11.c: New test.
> * gcc.dg/sso/sso.exp: Pass -Wno-scalar-storage-order.
> * gcc.dg/sso/memcpy-1.c: New test.
>
>
> --
> Eric Botcazou


[Ada] Fix C miss parentheses warning on Windows

2020-07-08 Thread Pierre-Marie de Rodat
When the C compiler switch -Wparentheses is in effect, this code triggers
the warning: suggest parentheses around '&&' within '||'.
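
A minimal sketch of the pattern, using a hypothetical helper rather than the
real __gnat_minus_500ms: in C, && binds more tightly than ||, so the explicit
parentheses do not change the result; they only state the grouping the
compiler already used and silence the warning.

```c
#include <stdbool.h>

/* Hypothetical helper: return true if MAJOR.MINOR.BUILD is older than
   Windows 10.0.17763.  The parentheses around the && chain make the
   existing precedence explicit, which is what -Wparentheses asks for.  */
bool
older_than_10_0_17763 (unsigned major, unsigned minor, unsigned build)
{
  return major < 10
	 || (major == 10
	     && minor == 0
	     && build < 17763);
}
```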

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* socket.c [_WIN32] (__gnat_minus_500ms): Parentheses around &&
operations.  Remove notes about TN in comment.

diff --git a/gcc/ada/socket.c b/gcc/ada/socket.c
--- a/gcc/ada/socket.c
+++ b/gcc/ada/socket.c
@@ -808,14 +808,12 @@ int __gnat_minus_500ms() {
 ZeroMemory(&osvi, sizeof(OSVERSIONINFO));
 osvi.dwOSVersionInfoSize = sizeof(OSVERSIONINFO);
 // Documentation proposes to use IsWindowsVersionOrGreater(10, 0, 17763)
-// but it does not compare by the build number (last parameter). See
-// regression test for RC03-012 in fixedbugs, there are some code to
-// investigate Windows version API behavior.
+// but it does not compare by the build number (last parameter).
 GetVersionEx(&osvi);
 return osvi.dwMajorVersion < 10
-|| osvi.dwMajorVersion == 10
-&& osvi.dwMinorVersion == 0
-&& osvi.dwBuildNumber < 17763;
+|| (osvi.dwMajorVersion == 10
+&& osvi.dwMinorVersion == 0
+&& osvi.dwBuildNumber < 17763);
   } else {
 return !IsWindows8OrGreater();
   }




[Ada] Do not apply constraint checks on allocator with No_Initialization

2020-07-08 Thread Pierre-Marie de Rodat
For an allocator in the subtype mark case, the constraints of the subtype
must be checked against the designated subtype, except in the case where
the No_Initialization flag is set on the allocator node.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Expand_N_Allocator): In the subtype mark case, do
not apply constraint checks if the No_Initialization flag is set.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -4847,10 +4847,11 @@ package body Exp_Ch4 is
  Temp_Type : Entity_Id;
 
   begin
- --  Apply constraint checks against designated subtype (RM 4.8(10/2)).
+ --  Apply constraint checks against designated subtype (RM 4.8(10/2))
+ --  but ignore the expression if the No_Initialization flag is set.
  --  Discriminant checks will be generated by the expansion below.
 
- if Is_Array_Type (Dtyp) then
+ if Is_Array_Type (Dtyp) and then not No_Initialization (N) then
 Apply_Constraint_Check (Expression (N), Dtyp, No_Sliding => True);
 
 Apply_Predicate_Check (Expression (N), Dtyp);




[Ada] Disable warning about unsafe use of __builtin_frame_address

2020-07-08 Thread Pierre-Marie de Rodat
This disables the warning that is given by the C compiler when it is
compiling the generic implementation of the backtrace facility used
on some platforms.

This happens when a positive frame level is requested, which can be
problematic in the general case.  But this implementation is known to
work for platforms where it is used, so the warning is useless here.
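
A small sketch of the technique, assuming GCC (or a compiler that accepts the
same pragmas and builtins).  Unlike the change itself, which disables the
warning for the rest of the file, this example scopes the suppression with
push/pop; the function names are hypothetical.

```c
#include <stdint.h>

/* Requesting a frame level greater than 0 triggers -Wframe-address, so
   the warning is disabled just around this function.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wframe-address"

__attribute__ ((noinline)) uintptr_t
caller_frame_address (void)
{
  return (uintptr_t) __builtin_frame_address (1);
}

#pragma GCC diagnostic pop

/* Level 0 is always safe and never warns.  */
__attribute__ ((noinline)) uintptr_t
own_frame_address (void)
{
  return (uintptr_t) __builtin_frame_address (0);
}
```

Whether a positive level yields a meaningful address depends on the platform
and on how the callers were compiled, which is the reason the warning exists
in the first place.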

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* tracebak.c [generic implementation]: Add pragma GCC diagnostic
to disable warning about __builtin_frame_address.

diff --git a/gcc/ada/tracebak.c b/gcc/ada/tracebak.c
--- a/gcc/ada/tracebak.c
+++ b/gcc/ada/tracebak.c
@@ -690,6 +690,9 @@ __gnat_backtrace (void ** traceback __attribute__((unused)),
 
 #elif defined (USE_GENERIC_UNWINDER)
 
+/* No warning since the cases where FRAME_LEVEL > 0 are known to work.  */
+#pragma GCC diagnostic ignored "-Wframe-address"
+
 #ifndef CURRENT_STACK_FRAME
 # define CURRENT_STACK_FRAME  ({ char __csf; &__csf; })
 #endif




[Ada] Fix warnings in C runtime files on Windows

2020-07-08 Thread Pierre-Marie de Rodat
They are mostly warnings on unused parameters, useless variables, casts
between integer and pointers of different size, or missing or incorrect
prototypes.  There is also an improper check of a return value.
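
Two of the idioms used for these fixes can be sketched in isolation.  The
functions below are hypothetical and only illustrate the pattern; the
fallback definition of ATTRIBUTE_UNUSED is an assumption for compilers that
understand GNU attributes (in the GCC tree the macro comes from the shared
headers).

```c
#include <stddef.h>

#ifndef ATTRIBUTE_UNUSED
# define ATTRIBUTE_UNUSED __attribute__ ((unused))
#endif

/* Parameters kept only for interface compatibility are marked unused,
   so -Wunused-parameter stays quiet without changing the prototype.  */
int
lookup_port (const char *name, char *buf ATTRIBUTE_UNUSED,
	     size_t buflen ATTRIBUTE_UNUSED)
{
  return (name != NULL && name[0] != '\0') ? 0 : -1;
}

static int
side_effect (int *counter)
{
  return ++*counter;
}

/* A result that is deliberately ignored is cast to void instead of
   being stored in a variable that is never read.  */
void
bump (int *counter)
{
  (void) side_effect (counter);
}
```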

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* adaint.h (__gnat_expect_portable_execvp): Fix prototype.
(__gnat_expect_poll): Likewise.
* expect.c [_WIN32]: Include adaint.h file.
(__gnat_waitpid): Remove useless variable.
(__gnat_expect_portable_execvp): Add ATTRIBUTE_UNUSED on parameter.
* raise-gcc.c [SEH] (__gnat_personality_v0): Add ATTRIBUTE_UNUSED.
* socket.c [_WIN32] (__gnat_getservbyport): Add ATTRIBUTE_UNUSED on
a couple of parameters.
(__gnat_gethostbyname): Likewise.
(__gnat_gethostbyaddr): Likewise.
(__gnat_getservbyname): Likewise.
(__gnat_last_socket_in_set): Use variables local to loops.
(__gnat_socket_ioctl): Cast 3rd parameter to proper type if _WIN32.
(__gnat_inet_pton): Cast 2nd parameter to proper type if _WIN32.
* sysdep.c (__gnat_localtime_tzoff): Remove superfluous test.
* terminals.c [_WIN32]: Include io.h file.
(is_gui_app): Remove useless variables and fix unsigned comparison.
(nt_spawnve): Add ATTRIBUTE_UNUSED on first parameter.  Initialize a
local variable and remove others that are useless.  Add missing cast
(__gnat_setup_child_communication): Remove useless variable and call
Use proper formatting string in call to sprintf.
(__gnat_setup_parent_communication): Cast to proper type.
(find_child_console): Fix prototype and remove useless variable.
(find_process_handle): Likewise.
(_gnat_interrupt_process): Move to after __gnat_interrupt_pid.
(__gnat_reset_tty): Add ATTRIBUTE_UNUSED on parameter, remove return
(__gnat_setup_winsize): Add ATTRIBUTE_UNUSED on all parameters.

diff --git a/gcc/ada/adaint.h b/gcc/ada/adaint.h
--- a/gcc/ada/adaint.h
+++ b/gcc/ada/adaint.h
@@ -282,9 +282,10 @@ extern char  *mktemp   (char *);
 extern void   __gnat_set_exit_status		   (int);
 
 extern int__gnat_expect_fork		   (void);
-extern void   __gnat_expect_portable_execvp	   (char *, char *[]);
+extern void   __gnat_expect_portable_execvp	   (int *, char *, char *[]);
 extern int__gnat_pipe			   (int *);
-extern int__gnat_expect_poll		   (int *, int, int, int *);
+extern int__gnat_expect_poll		   (int *, int, int, int *,
+		int *);
 extern void   __gnat_set_binary_mode		   (int);
 extern void   __gnat_set_text_mode		   (int);
 extern void   __gnat_set_mode			   (int,int);


diff --git a/gcc/ada/expect.c b/gcc/ada/expect.c
--- a/gcc/ada/expect.c
+++ b/gcc/ada/expect.c
@@ -78,6 +78,7 @@
 #include 
 #include 
 #include 
+#include "adaint.h"
 #include "mingw32.h"
 
 int
@@ -85,11 +86,10 @@ __gnat_waitpid (int pid)
 {
   HANDLE h = OpenProcess (PROCESS_ALL_ACCESS, FALSE, pid);
   DWORD exitcode = 1;
-  DWORD res;
 
   if (h != NULL)
 {
-  res = WaitForSingleObject (h, INFINITE);
+  (void) WaitForSingleObject (h, INFINITE);
   GetExitCodeProcess (h, &exitcode);
   CloseHandle (h);
 }
@@ -105,7 +105,8 @@ __gnat_expect_fork (void)
 }
 
 void
-__gnat_expect_portable_execvp (int *pid, char *cmd, char *argv[])
+__gnat_expect_portable_execvp (int *pid, char *cmd ATTRIBUTE_UNUSED,
+   char *argv[])
 {
   *pid = __gnat_portable_no_block_spawn (argv);
 }


diff --git a/gcc/ada/raise-gcc.c b/gcc/ada/raise-gcc.c
--- a/gcc/ada/raise-gcc.c
+++ b/gcc/ada/raise-gcc.c
@@ -1611,7 +1611,7 @@ __gnat_personality_seh0 (PEXCEPTION_RECORD ms_exc, void *this_frame,
 
 /* Define __gnat_personality_v0 for convenience */
 
-PERSONALITY_STORAGE _Unwind_Reason_Code
+PERSONALITY_STORAGE ATTRIBUTE_UNUSED _Unwind_Reason_Code
 __gnat_personality_v0 (version_arg_t version_arg,
 		   phases_arg_t phases_arg,
 		   _Unwind_Exception_Class uw_exception_class,


diff --git a/gcc/ada/socket.c b/gcc/ada/socket.c
--- a/gcc/ada/socket.c
+++ b/gcc/ada/socket.c
@@ -333,8 +333,8 @@ __gnat_getservbyport (int port, const char *proto,
 }
 #else
 int
-__gnat_gethostbyname (const char *name,
-  struct hostent *ret, char *buf, size_t buflen,
+__gnat_gethostbyname (const char *name, struct hostent *ret,
+  char *buf ATTRIBUTE_UNUSED, size_t buflen ATTRIBUTE_UNUSED,
   int *h_errnop)
 {
   struct hostent *rh;
@@ -349,8 +349,8 @@ __gnat_gethostbyname (const char *name,
 }
 
 int
-__gnat_gethostbyaddr (const char *addr, int len, int type,
-  struct hostent *ret, char *buf, size_t buflen,
+__gnat_gethostbyaddr (const char *addr, int len, int type, struct hostent *ret,
+  char *buf ATTRIBUTE_UNUSED, size_t buflen ATTRIBUTE_UNUSED,
   int *h_errnop)
 {
   struct hostent *rh;
@@ -365,8 +365,8 @@ __gnat_gethostbyaddr (const char *addr, int len, int type,
 }
 
 int
-__gnat_getservbyname (const char *name, const char *proto,
-  struct servent *ret, 

[Ada] Fix internal error on string type comparison with predicate

2020-07-08 Thread Pierre-Marie de Rodat
This freezing issue shows that Freeze_Expression does not fully control
the placement of freeze nodes produced by an expression coming from the
Actions list of various constructs, here an N_And_Then node, because it
does not check whether the entity being frozen, for example a type, is
really declared in the expression or merely referenced in it.

This change attempts to unify the handling of such expressions and adds
a new predicate function to check that the entity is declared locally.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* freeze.adb (Has_Decl_In_List): New predicate to check that an
entity is declared in a list of nodes.
(Freeze_Expression): Use it to deal with Expression_With_Actions,
short-circuit expression, if- and case-expression and ensure that
the freeze node is put onto their Actions list if the entity is
declared locally.

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -7060,6 +7060,13 @@ package body Freeze is
   --  proc, a stream subprogram, or a renaming as body. If so, this is not
   --  a freezing context and the entity will be frozen at a later point.
 
+  function Has_Decl_In_List
+(E : Entity_Id;
+ N : Node_Id;
+ L : List_Id) return Boolean;
+  --  Determines whether an entity E referenced in node N is declared in
+  --  the list L.
+
   -----------------------------------------
   -- Find_Aggregate_Component_Desig_Type --
   -----------------------------------------
@@ -7141,6 +7148,30 @@ package body Freeze is
  end if;
   end In_Expanded_Body;
 
+  ----------------------
+  -- Has_Decl_In_List --
+  ----------------------
+
+  function Has_Decl_In_List
+(E : Entity_Id;
+ N : Node_Id;
+ L : List_Id) return Boolean
+  is
+ Decl_Node : Node_Id;
+
+  begin
+ --  If E is an itype, pretend that it is declared in N
+
+ if Is_Itype (E) then
+Decl_Node := N;
+ else
+Decl_Node := Declaration_Node (E);
+ end if;
+
+ return Is_List_Member (Decl_Node)
+   and then List_Containing (Decl_Node) = L;
+  end Has_Decl_In_List;
+
   --  Local variables
 
   In_Spec_Exp : constant Boolean := In_Spec_Expression;
@@ -7592,7 +7623,6 @@ package body Freeze is
 
when N_Abortable_Part
   | N_Accept_Alternative
-  | N_And_Then
   | N_Case_Statement_Alternative
   | N_Compilation_Unit_Aux
   | N_Conditional_Entry_Call
@@ -7603,21 +7633,50 @@ package body Freeze is
   | N_Extended_Return_Statement
   | N_Freeze_Entity
   | N_If_Statement
-  | N_Or_Else
   | N_Selective_Accept
   | N_Triggering_Alternative
=>
   exit when Is_List_Member (P);
 
-   --  Freeze nodes produced by an expression coming from the
-   --  Actions list of a N_Expression_With_Actions node must remain
-   --  within the Actions list. Inserting the freeze nodes further
-   --  up the tree may lead to use before declaration issues in the
-   --  case of array types.
+   --  The freeze nodes produced by an expression coming from the
+   --  Actions list of an N_Expression_With_Actions, short-circuit
+   --  expression or N_Case_Expression_Alternative node must remain
+   --  within the Actions list if they freeze an entity declared in
+   --  this list, as inserting the freeze nodes further up the tree
+   --  may lead to use before declaration issues for the entity.
+
+   when N_Case_Expression_Alternative
+  | N_Expression_With_Actions
+  | N_Short_Circuit
+   =>
+  exit when (Present (Nam)
+  and then
+ Has_Decl_In_List (Nam, P, Actions (Parent_P)))
+or else (Present (Typ)
+  and then
+ Has_Decl_In_List (Typ, P, Actions (Parent_P)));
 
-   when N_Expression_With_Actions =>
-  exit when Is_List_Member (P)
-and then List_Containing (P) = Actions (Parent_P);
+   --  Likewise for an N_If_Expression and its two Actions list
+
+   when N_If_Expression =>
+  declare
+ L1 : constant List_Id := Then_Actions (Parent_P);
+ L2 : constant List_Id := Else_Actions (Parent_P);
+
+  begin
+ exit when (Present (Nam)
+ and then
+Has_Decl_In_List (Nam, P, 

[Ada] Clean up in Interfaces.C.Extensions

2020-07-08 Thread Pierre-Marie de Rodat
Now that Interfaces.C also defines long_long and unsigned_long_long,
make the definitions in Interfaces.C.Extensions subtypes of them.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/i-cexten.ads (long_long, unsigned_long_long): Now
subtypes of Interfaces.C types.
* libgnat/a-calcon.ads, libgnat/a-calcon.adb
(To_Unix_Nano_Time): Use Interfaces.C.long_long instead of
Interfaces.C.Extensions.long_long.

diff --git a/gcc/ada/libgnat/a-calcon.adb b/gcc/ada/libgnat/a-calcon.adb
--- a/gcc/ada/libgnat/a-calcon.adb
+++ b/gcc/ada/libgnat/a-calcon.adb
@@ -30,7 +30,6 @@
 --
 
 with Interfaces.C;use Interfaces.C;
-with Interfaces.C.Extensions; use Interfaces.C.Extensions;
 
 package body Ada.Calendar.Conversions is
 


diff --git a/gcc/ada/libgnat/a-calcon.ads b/gcc/ada/libgnat/a-calcon.ads
--- a/gcc/ada/libgnat/a-calcon.ads
+++ b/gcc/ada/libgnat/a-calcon.ads
@@ -33,7 +33,6 @@
 --  time models - Time, Duration, struct tm and struct timespec.
 
 with Interfaces.C;
-with Interfaces.C.Extensions;
 
 package Ada.Calendar.Conversions is
 
@@ -112,7 +111,7 @@ package Ada.Calendar.Conversions is
--  fit into a Time value.
 
function To_Unix_Nano_Time
- (Ada_Time : Time) return Interfaces.C.Extensions.long_long;
+ (Ada_Time : Time) return Interfaces.C.long_long;
--  Convert a time value represented as number of time units since the Ada
--  implementation-defined Epoch to a value relative to the Unix Epoch. The
--  units of the result are nanoseconds. Raises Time_Error if the result


diff --git a/gcc/ada/libgnat/i-cexten.ads b/gcc/ada/libgnat/i-cexten.ads
--- a/gcc/ada/libgnat/i-cexten.ads
+++ b/gcc/ada/libgnat/i-cexten.ads
@@ -61,8 +61,8 @@ package Interfaces.C.Extensions is
 
--  64-bit integer types
 
-   subtype long_long is Long_Long_Integer;
-   type unsigned_long_long is mod 2 ** 64;
+   subtype long_long is Interfaces.C.long_long;
+   subtype unsigned_long_long is Interfaces.C.unsigned_long_long;
 
--  128-bit integer type available on 64-bit platforms:
--  typedef int signed_128 __attribute__ ((mode (TI)));




[Ada] Accept aspect Relaxed_Initialization on generic subprograms

2020-07-08 Thread Pierre-Marie de Rodat
Aspect Relaxed_Initialization has been prototyped for ordinary
subprograms and then SPARK RM 6.10 allowed it for generic subprograms as
well.

This is mostly straightforward to implement, except when 'Result appears
in the aspect expression for a generic function. When instantiated in a
wrapper package, the aspect gets attached to a subprogram with internal
name, while 'Result in the aspect is still prefixed with the name of a
generic function.

We already had a mechanism to correct this discrepancy, as it also
happens with Post, Depends and Refined_Depends aspects. However,
expressions of those aspects are first relocated to pragmas and then
analysed with renaming of the instantiated subprogram in scope. This
patch relaxes a guard for this mechanism, so that the correction also
applies to aspect Relaxed_Initialization, which is not translated to a
pragma.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_attr.adb (Analyze_Attribute): Correct prefix of 'Result
when this prefix is a generic function but the enclosing aspect or
pragma is attached to its instance.
* sem_ch12.adb (Analyze_Generic_Subprogram_Declaration): Analyze
generic subprogram formal parameters (including the implicit
result of a generic function) and only then analyse its aspects,
because with Relaxed_Initialization the aspect expression might
refer to those formal parameters.
* sem_ch13.adb (Analyze_Aspect_Relaxed_Initialization): Accept
aspect on generic subprograms; install formal parameters of a
generic subprogram but not formal parameters of the generic unit
itself (the previous code was inspired by aspects Post and
Depends, where both kinds of formals are allowed).
* sem_util.ads (Enter_Name): Fix name of a subprogram referenced
in comment.

diff --git a/gcc/ada/sem_attr.adb b/gcc/ada/sem_attr.adb
--- a/gcc/ada/sem_attr.adb
+++ b/gcc/ada/sem_attr.adb
@@ -5512,8 +5512,16 @@ package body Sem_Attr is
 if Is_Entity_Name (P) then
Pref_Id := Entity (P);
 
-   if Ekind_In (Pref_Id, E_Function, E_Generic_Function)
- and then Ekind (Spec_Id) = Ekind (Pref_Id)
+   --  Either both the prefix and the annotated spec must be
+   --  generic functions, or they both must be non-generic
+   --  functions, or the prefix must be generic and the spec
+   --  must be non-generic (i.e. it must denote an instance).
+
+   if (Ekind_In (Pref_Id, E_Function, E_Generic_Function)
+   and then Ekind (Pref_Id) = Ekind (Spec_Id))
+or else
+  (Ekind (Pref_Id) = E_Generic_Function
+   and then Ekind (Spec_Id) = E_Function)
then
   if Denote_Same_Function (Pref_Id, Spec_Id) then
 


diff --git a/gcc/ada/sem_ch12.adb b/gcc/ada/sem_ch12.adb
--- a/gcc/ada/sem_ch12.adb
+++ b/gcc/ada/sem_ch12.adb
@@ -3873,13 +3873,6 @@ package body Sem_Ch12 is
  Set_Ekind (Id, E_Generic_Procedure);
   end if;
 
-  --  Analyze the aspects of the generic copy to ensure that all generated
-  --  pragmas (if any) perform their semantic effects.
-
-  if Has_Aspects (N) then
- Analyze_Aspect_Specifications (N, Id);
-  end if;
-
   --  Set SPARK_Mode from context
 
   Set_SPARK_Pragma   (Id, SPARK_Mode_Pragma);
@@ -3951,6 +3944,13 @@ package body Sem_Ch12 is
  Set_Etype (Id, Standard_Void_Type);
   end if;
 
+  --  Analyze the aspects of the generic copy to ensure that all generated
+  --  pragmas (if any) perform their semantic effects.
+
+  if Has_Aspects (N) then
+ Analyze_Aspect_Specifications (N, Id);
+  end if;
+
   --  For a library unit, we have reconstructed the entity for the unit,
   --  and must reset it in the library tables. We also make sure that
   --  Body_Required is set properly in the original compilation unit node.


diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -2276,7 +2276,9 @@ package body Sem_Ch13 is
 
--  Annotation of a subprogram; aspect expression is required
 
-   elsif Is_Subprogram_Or_Entry (E) then
+   elsif Is_Subprogram_Or_Entry (E)
+ or else Is_Generic_Subprogram (E)
+   then
   if Present (Expr) then
 
  --  If we analyze subprogram body that acts as its own
@@ -2291,11 +2293,13 @@ package body Sem_Ch13 is
 Restore_Scope := True;
 Push_Scope (E);
 
-if Is_Generic_Subprogram (E) then
-   Install_Generic_Formals (E);
-else
-   Install_Formals (E);
-end if;
+--  

[Ada] Update entities on class-wide condition function creation

2020-07-08 Thread Pierre-Marie de Rodat
Problem: Expression functions that have class-wide pre/post conditions
which call functions that are themselves expression functions refer to
entities from a specification that isn't theirs.

Solution: When creating the class-wide clone of the function that has
the class-wide condition, update its entities to refer to the right
spec.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.adb (Build_Class_Wide_Clone_Body): Update entities to
refer to the right spec.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -1510,17 +1510,38 @@ package body Sem_Util is
   Loc: constant Source_Ptr := Sloc (Bod);
   Clone_Id   : constant Entity_Id  := Class_Wide_Clone (Spec_Id);
   Clone_Body : Node_Id;
+  Assoc_List : constant Elist_Id := New_Elmt_List;
 
begin
   --  The declaration of the class-wide clone was created when the
   --  corresponding class-wide condition was analyzed.
 
+  --  The body of the original condition may contain references to
+  --  the formals of Spec_Id. In the body of the classwide clone,
+  --  these must be replaced with the corresponding formals of
+  --  the clone.
+
+  declare
+ Spec_Formal_Id  : Entity_Id := First_Formal (Spec_Id);
+ Clone_Formal_Id : Entity_Id := First_Formal (Clone_Id);
+  begin
+ while Present (Spec_Formal_Id) loop
+Append_Elmt (Spec_Formal_Id,  Assoc_List);
+Append_Elmt (Clone_Formal_Id, Assoc_List);
+
+Next_Formal (Spec_Formal_Id);
+Next_Formal (Clone_Formal_Id);
+ end loop;
+  end;
+
   Clone_Body :=
 Make_Subprogram_Body (Loc,
   Specification  =>
 Copy_Subprogram_Spec (Parent (Clone_Id)),
   Declarations   => Declarations (Bod),
-  Handled_Statement_Sequence => Handled_Statement_Sequence (Bod));
+  Handled_Statement_Sequence =>
+New_Copy_Tree (Handled_Statement_Sequence (Bod),
+  Map => Assoc_List));
 
   --  The new operation is internal and overriding indicators do not apply
   --  (the original primitive may have carried one).




[Ada] Static expression function problems with -gnatc and -gnatd.F (SPARK mode)

2020-07-08 Thread Pierre-Marie de Rodat
The implementation of static expression functions exhibited various
problems when compiling with the switches -gnatd.F (SPARK mode) or
-gnatc.  Use of those switches could lead to errors on legal calls to
static expression functions (such as the calls being flagged as not
static), plus the compiler could crash on cases of illegal static
function calls when using -gnatd.F.  Those problems are fixed, and the
unpleasant special-case code that was added in
Expand_Simple_Function_Return is eliminated as part of these changes.
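
For readers unfamiliar with the feature, here is a minimal sketch of a static
expression function and a call that has to be static. The package name and
the function are made up for illustration; the aspect Static needs an
Ada 2022 mode (assumed here to be -gnat2022 or -gnatX, depending on the
GNAT release).

```ada
--  Illustrative only: Static_Demo and Double are hypothetical names.
--  Assumes an Ada 2022 compilation mode (e.g. -gnat2022 or -gnatX).
package Static_Demo is

   --  An expression function with aspect Static: calls with static
   --  actuals are themselves static expressions.
   function Double (X : Integer) return Integer is (2 * X)
     with Static;

   --  A number declaration requires a static expression, so this
   --  only compiles if the call is recognized as static.
   Eight : constant := Double (4);

end Static_Demo;
```

This is the kind of call that could be flagged as not static when compiling
with -gnatc or -gnatd.F before the fix.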

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch6.adb (Expand_Simple_Function_Return): Remove ugly code
that was copying the return expression, resetting Analyzed
flags, etc. for the return expression of static expression
functions.
* inline.adb (Inline_Static_Expression_Function_Call): Set the
Parent of the copied expression to that of the call. This avoids
a blowup in Insert_Actions when GNATprove_Mode is set and there
are nested SEF calls. Add ??? comment.
* sem_ch6.adb (Analyze_Expression_Function): In the case of a
static expression function, create a new copy of the expression
and replace the function's expression with the copy; the
original expression is used in the expression function's body
and will be analyzed and rewritten, and we need to save a clean
copy for later use in processing static calls to the function.
This allows removing the kludgy code that was in
Expand_Simple_Function_Return.
* sem_eval.adb (Eval_Qualified_Expression): Return immediately
if any errors have been posted on the qualified expression, to
avoid blowups when GNATprove_Mode is enabled (or with -gnatd.F),
since illegal static expressions are handled differently in that
case and attempting to fold such expressions would fail.

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -7356,33 +7356,9 @@ package body Exp_Ch6 is
  Reason => PE_Accessibility_Check_Failed));
   end Check_Against_Result_Level;
 
-  --  Local Data
-
-  New_Copy_Of_Exp : Node_Id := Empty;
-
--  Start of processing for Expand_Simple_Function_Return
 
begin
-  --  For static expression functions, the expression of the function
-  --  needs to be available in a form that can be replicated later for
-  --  calls, but rewriting of the return expression in the body created
-  --  for expression functions will cause the original expression to no
-  --  longer be properly copyable via New_Copy_Tree, because the Parent
-  --  fields of the nodes will now point to nodes in the rewritten tree,
-  --  and New_Copy_Tree won't copy the deeper nodes of the original tree.
-  --  So we work around that by making a copy of the expression tree
-  --  before any rewriting occurs, and replacing the original expression
-  --  tree with this copy (see the end of this procedure). We also reset
-  --  the Analyzed flags on the nodes in the tree copy to ensure that
-  --  later copies of the tree will be fully reanalyzed. This copying
-  --  is of course rather inelegant, to say the least, and it would be
-  --  nice if there were a way to avoid it. ???
-
-  if Is_Static_Expression_Function (Scope_Id) then
- New_Copy_Of_Exp := New_Copy_Tree (Exp);
- Reset_Analyzed_Flags (New_Copy_Of_Exp);
-  end if;
-
   if Is_Class_Wide_Type (R_Type)
 and then not Is_Class_Wide_Type (Exp_Typ)
 and then Nkind (Exp) /= N_Type_Conversion
@@ -8094,21 +8070,6 @@ package body Exp_Ch6 is
  Analyze_And_Resolve (Exp);
   end if;
 
-  --  If a new copy of a static expression function's expression was made
-  --  (see the beginning of this procedure's statement part), then we now
-  --  replace the original expression tree with the copy and also change
-  --  the Original_Node field of the rewritten expression to point to that
-  --  copy. It would be nice to find a way to avoid this???
-
-  if Present (New_Copy_Of_Exp) then
- Set_Expression
-   (Original_Node (Subprogram_Spec (Scope_Id)), New_Copy_Of_Exp);
-
- if Exp /= Original_Node (Exp) then
-Set_Original_Node (Exp, New_Copy_Of_Exp);
- end if;
-  end if;
-
   --  Ada 2020 (AI12-0279)
 
   if Has_Yield_Aspect (Scope_Id)


diff --git a/gcc/ada/inline.adb b/gcc/ada/inline.adb
--- a/gcc/ada/inline.adb
+++ b/gcc/ada/inline.adb
@@ -4714,6 +4714,13 @@ package body Inline is
 
  Establish_Actual_Mapping_For_Inlined_Call (N, Subp, Decls, Func_Expr);
 
+ --  Ensure that the copy has the same parent as the call (this seems
+ --  to matter when GNATprove_Mode is set and there are nested static
+ --  calls; prevents blowups in Insert_Actions, though 

[Ada] Add comment on implementation choice for byte-packed array types

2020-07-08 Thread Pierre-Marie de Rodat
This documents the implementation choice made for byte-packed array
types, where we let the code generator deal with them if the type is
composite and use the manipulation routines of the front-end if the
type is discrete.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* freeze.adb (Freeze_Array_Type): Add comment on implementation
choice for byte-packed array types.

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -2928,7 +2928,10 @@ package body Freeze is
 
  --  Bit packing is not needed for multiples of the storage
  --  unit if the type is composite because the back end can
- --  byte pack composite types.
+ --  byte pack composite types efficiently. That's not true
+ --  for discrete types because every read would generate a
+ --  lot of instructions, so we keep using the manipulation
+ --  routines of the runtime for them.
 
  elsif Csiz mod System_Storage_Unit = 0
and then Is_Composite_Type (Ctyp)




[Ada] Add expected and actual size to "bit number out of range" error message

2020-07-08 Thread Pierre-Marie de Rodat
This commit lets users know what the expected and actual sizes are when
a size clause and a component clause of a record conflict.
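
A small sketch of the situation, with hypothetical names. The legal layout
is compiled; the conflicting component clause is left as a comment. Going by
the patched message text, the commented clause would presumably be reported
with "(expected 8, got 9)", since the size clause gives 8 bits and the last
bit is 8.

```ada
--  Illustrative only: Flags_Size and Flags are hypothetical names.
function Flags_Size return Natural is

   type Flags is record
      A : Boolean;
      B : Boolean;
   end record;

   --  The size clause comes first, so the component clauses below
   --  are checked against it.
   for Flags'Size use 8;

   for Flags use record
      A at 0 range 0 .. 0;
      B at 0 range 7 .. 7;
      --  B at 0 range 8 .. 8;
      --  The line above would put B outside the 8 specified bits and
      --  trigger "bit number out of range of specified size".
   end record;

begin
   return Flags'Size;
end Flags_Size;
```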

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch13.adb (Analyze_Record_Representation_Clause,
Check_Record_Representation_Clause): Add expected and actual
size to error message.

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -8105,8 +8105,10 @@ package body Sem_Ch13 is
  if Has_Size_Clause (Rectype)
and then RM_Size (Rectype) <= Lbit
  then
-Error_Msg_N
-  ("bit number out of range of specified size",
+Error_Msg_Uint_1 := RM_Size (Rectype);
+Error_Msg_Uint_2 := Lbit + 1;
+Error_Msg_N ("bit number out of range of specified "
+   & "size (expected ^, got ^)",
Last_Bit (CC));
  else
 Set_Component_Clause (Comp, CC);
@@ -11552,8 +11554,10 @@ package body Sem_Ch13 is
 if Has_Size_Clause (Rectype)
   and then RM_Size (Rectype) <= Lbit
 then
-   Error_Msg_N
- ("bit number out of range of specified size",
+   Error_Msg_Uint_1 := RM_Size (Rectype);
+   Error_Msg_Uint_2 := Lbit + 1;
+   Error_Msg_N ("bit number out of range of specified "
+  & "size (expected ^, got ^)",
   Last_Bit (CC));
 
--  Check for overlap with tag or parent component




[Ada] Ada_2020 AI12-0250 : Implement Iterator filters.

2020-07-08 Thread Pierre-Marie de Rodat
Iterator filters can appear in loop parameter specifications and in
iterator specifications, and determine which elements of some domain of
iteration are to be used in a loop, aggregate, or quantified expression.
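
As a rough illustration (the function name is made up), the sketch below
uses a filter in a loop parameter specification and compares it with a
hand-written loop whose statements are wrapped in an if statement, which is
how the expander changes in this patch handle the filter. It assumes an
Ada 2022 mode (the sources call it Ada_2020; the switch is assumed to be
-gnat2022 or -gnatX depending on the GNAT release).

```ada
--  Illustrative only: Sum_Of_Evens is a hypothetical name.
--  Assumes an Ada 2022 compilation mode (e.g. -gnat2022 or -gnatX).
function Sum_Of_Evens (Last : Natural) return Natural is
   Filtered : Natural := 0;
   Expanded : Natural := 0;
begin
   --  Loop parameter specification with an iterator filter: the body
   --  runs only for the values that satisfy the condition.
   for I in 1 .. Last when I mod 2 = 0 loop
      Filtered := Filtered + I;
   end loop;

   --  Hand-written equivalent: the original statements become the
   --  Then_Statements of an if statement testing the filter.
   for I in 1 .. Last loop
      if I mod 2 = 0 then
         Expanded := Expanded + I;
      end if;
   end loop;

   pragma Assert (Filtered = Expanded);
   return Filtered;
end Sum_Of_Evens;
```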

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* par.adb (P_Iterator_Specification): Make public for use in
other parser subprograms.
* par-ch4.adb (P_Iterated_Component_Association): In Ada_2020,
recognize use of Iterator_Specification in an element iterator.
To simplify disambiguation between the two iterator forms, mark
the component association as carrying an Iterator_Specification
only when the element iterator (using "OF") is used.
* par-ch5.adb (P_Loop_Parameter_Specification): In Ada_2020,
parse iterator filter when present.
(P_Iterator_Specification): Ditto.  Remove declaration of
P_Iterator_Specification, now in parent unit.
* exp_ch5.adb (Expand_N_Loop_Statement): Apply Iterator filter
when present.
(Expand_Iterator_Loop_Over_Array): Ditto.
(Expand_Iterator_Loop_Over_Container): Ditto.
* sem_aggr.adb (Resolve_Array_Aggregate): Emit error message if
an iterated component association includes an iterator
specification with an element iterator, i.e. one that uses the
OF keyword.
* sem_ch5.adb (Analyze_Iterator_Specification): Analyze Iterator
filter when present.
(Analyze_Loop_Parameter_Specification): Ditto.
* sinfo.adb: Subprogram bodies for new syntactic element
Iterator_Filter.
* sinfo.ads: Add Iterator_Filter to relevant nodes.  Structure
of Component_Association and Iterated_Component_Association
nodes is modified to take into account the possible presence of
an iterator specification in the latter.

diff --git a/gcc/ada/exp_ch5.adb b/gcc/ada/exp_ch5.adb
--- a/gcc/ada/exp_ch5.adb
+++ b/gcc/ada/exp_ch5.adb
@@ -3868,13 +3868,20 @@ package body Exp_Ch5 is
   Array_Dim  : constant Pos:= Number_Dimensions (Array_Typ);
   Id : constant Entity_Id  := Defining_Identifier (I_Spec);
   Loc: constant Source_Ptr := Sloc (Isc);
-  Stats  : constant List_Id:= Statements (N);
+  Stats  : List_Id:= Statements (N);
   Core_Loop  : Node_Id;
   Dim1   : Int;
   Ind_Comp   : Node_Id;
   Iterator   : Entity_Id;
 
begin
+  if Present (Iterator_Filter (I_Spec)) then
+ pragma Assert (Ada_Version >= Ada_2020);
+ Stats := New_List (Make_If_Statement (Loc,
+Condition => Iterator_Filter (I_Spec),
+Then_Statements => Stats));
+  end if;
+
   --  for Element of Array loop
 
   --  It requires an internally generated cursor to iterate over the array
@@ -4145,7 +4152,9 @@ package body Exp_Ch5 is
   Elem_Typ : constant Entity_Id   := Etype (Id);
   Id_Kind  : constant Entity_Kind := Ekind (Id);
   Loc  : constant Source_Ptr  := Sloc (N);
-  Stats: constant List_Id := Statements (N);
+
+  Stats: List_Id := Statements (N);
+  --  Maybe wrapped in a conditional if a filter is present
 
   Cursor: Entity_Id;
   Decl  : Node_Id;
@@ -4167,6 +4176,13 @@ package body Exp_Ch5 is
   --  The package in which the container type is declared
 
begin
+  if Present (Iterator_Filter (I_Spec)) then
+ pragma Assert (Ada_Version >= Ada_2020);
+ Stats := New_List (Make_If_Statement (Loc,
+Condition => Iterator_Filter (I_Spec),
+Then_Statements => Stats));
+  end if;
+
   --  Determine the advancement and initialization steps for the cursor.
   --  Analysis of the expanded loop will verify that the container has a
   --  reverse iterator.
@@ -4640,11 +4656,20 @@ package body Exp_Ch5 is
 Loop_Id : constant Entity_Id := Defining_Identifier (LPS);
 Ltype   : constant Entity_Id := Etype (Loop_Id);
 Btype   : constant Entity_Id := Base_Type (Ltype);
+Stats   : constant List_Id   := Statements (N);
 Expr: Node_Id;
 Decls   : List_Id;
 New_Id  : Entity_Id;
 
  begin
+if Present (Iterator_Filter (LPS)) then
+   pragma Assert (Ada_Version >= Ada_2020);
+   Set_Statements (N,
+  New_List (Make_If_Statement (Loc,
+Condition => Iterator_Filter (LPS),
+Then_Statements => Stats)));
+end if;
+
 --  Deal with loop over predicates
 
 if Is_Discrete_Type (Ltype)
@@ -4761,7 +4786,7 @@ package body Exp_Ch5 is
Declarations => Decls,
Handled_Statement_Sequence =>
  Make_Handled_Sequence_Of_Statements (Loc,
-   Statements => Statements (N,
+  

[Ada] Small tweak to Narrow_Large_Operation procedure

2020-07-08 Thread Pierre-Marie de Rodat
This recently introduced procedure is responsible for narrowing the
type of operations done originally in Universal_Integer, rewriting
them into operations done in Integer or Long_Long_Integer.

This changes the procedure to use the base type instead of the first
subtype for these two integer types, which avoids the need for useless
conversions between the base type and the first subtype.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Narrow_Large_Operation): Use the base type instead
of the first subtype of standard integer types as narrower type.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -14067,13 +14067,15 @@ package body Exp_Ch4 is
  return;
   end if;
 
-  --  Now pick the narrower type according to the size
+  --  Now pick the narrower type according to the size. We use the base
+  --  type instead of the first subtype because operations are done in
+  --  the base type, so this avoids the need for useless conversions.
 
   if Nsiz <= RM_Size (Standard_Integer) then
- Ntyp := Standard_Integer;
+ Ntyp := Etype (Standard_Integer);
 
   elsif Nsiz <= RM_Size (Standard_Long_Long_Integer) then
- Ntyp := Standard_Long_Long_Integer;
+ Ntyp := Etype (Standard_Long_Long_Integer);
 
   else
  return;




[Ada] ACATS 4.1R - BD10001 - Error missed

2020-07-08 Thread Pierre-Marie de Rodat
GNAT did not reject specifying aspect Inline twice (once via an aspect
and once via a pragma). This patch adds the missing check.
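
A sketch of the two duplicate cases, with hypothetical names. The duplicates
are left as comments so that the unit still compiles. Judging from the
ChangeLog below, the aspect plus pragma combination is an error, while a
repeated pragma only draws a warning under -gnatwr.

```ada
--  Illustrative only: all names here are hypothetical.
function Inline_Demo return Natural is
   Calls : Natural := 0;

   procedure Once with Inline;
   --  pragma Inline (Once);
   --  Aspect plus pragma for the same entity: expected to be rejected.

   procedure Twice;
   pragma Inline (Twice);
   --  pragma Inline (Twice);
   --  Pragma plus pragma: suspicious, warned about under -gnatwr.

   procedure Once is
   begin
      Calls := Calls + 1;
   end Once;

   procedure Twice is
   begin
      Once;
      Once;
   end Twice;

begin
   Twice;
   return Calls;
end Inline_Demo;
```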

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_prag.adb (Process_Inline): Check for duplicate
pragma+aspect Inline. Minor code cleanup.
(Check_Duplicate_Pragma): Add warning for duplicate
pragma [No_]Inline under -gnatwr.
* sinfo.ads, sinfo.adb (Next_Rep_Item): Allow N_Null_Statement
which can appear when a pragma is rewritten.
* sem_util.ads, sem_util.adb, bindo-writers.adb: Fix bad
copy/paste now flagged.
* libgnat/s-mmap.ads: Remove redundant pragma Inline.

diff --git a/gcc/ada/bindo-writers.adb b/gcc/ada/bindo-writers.adb
--- a/gcc/ada/bindo-writers.adb
+++ b/gcc/ada/bindo-writers.adb
@@ -1037,7 +1037,7 @@ package body Bindo.Writers is
   --  output.
 
   procedure Write_Components (G : Library_Graph);
-  pragma Inline (Write_Component);
+  pragma Inline (Write_Components);
   --  Write all components of library graph G to standard output
 
   procedure Write_Edges_To_Successors


diff --git a/gcc/ada/libgnat/s-mmap.ads b/gcc/ada/libgnat/s-mmap.ads
--- a/gcc/ada/libgnat/s-mmap.ads
+++ b/gcc/ada/libgnat/s-mmap.ads
@@ -223,13 +223,11 @@ package System.Mmap is
--  (File); such accesses may cause Storage_Error to be raised.
 
function Data (Region : Mapped_Region) return Str_Access;
-   pragma Inline (Data);
--  The data mapped in Region as requested. The result is an unconstrained
--  string, so you cannot use the usual 'First and 'Last attributes.
--  Instead, these are respectively 1 and Size.
 
function Data (File : Mapped_File) return Str_Access;
-   pragma Inline (Data);
--  Likewise for the region contained in File
 
function Is_Mutable (Region : Mapped_Region) return Boolean;


diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -5905,7 +5905,18 @@ package body Sem_Prag is
 then
Error_Msg_NE ("aspect% for & previously given#", N, Id);
 else
-   Error_Msg_NE ("pragma% for & duplicates pragma#", N, Id);
+   --  If -gnatwr is set, warn in case of a duplicate pragma
+   --  [No_]Inline which is suspicious but not an error, generate
+   --  an error for other pragmas.
+
+   if Nam_In (Pragma_Name (N), Name_Inline, Name_No_Inline) then
+  if Warn_On_Redundant_Constructs then
+ Error_Msg_NE
+   ("?r?pragma% for & duplicates pragma#", N, Id);
+  end if;
+   else
+  Error_Msg_NE ("pragma% for & duplicates pragma#", N, Id);
+   end if;
 end if;
 
 raise Pragma_Exit;
@@ -10127,6 +10138,18 @@ package body Sem_Prag is
   Applies := True;
 
else
+  --  Check for RM 13.1(9.2/4): If a [...] aspect_specification
+  --  is given that directly specifies an aspect of an entity,
+  --  then it is illegal to give another [...]
+  --  aspect_specification that directly specifies the same
+  --  aspect of the entity.
+  --  We only check Subp directly as per "directly specifies"
+  --  above and because the case of pragma Inline is really
+  --  special given its pre aspect usage.
+
+  Check_Duplicate_Pragma (Subp);
+  Record_Rep_Item (Subp, N);
+
   Make_Inline (Subp);
 
   --  For the pragma case, climb homonym chain. This is
@@ -10138,8 +10161,8 @@ package body Sem_Prag is
  while Present (Homonym (Subp))
and then Scope (Homonym (Subp)) = Current_Scope
  loop
-Make_Inline (Homonym (Subp));
 Subp := Homonym (Subp);
+Make_Inline (Subp);
  end loop;
   end if;
end if;


diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -21407,7 +21407,7 @@ package body Sem_Util is
   --  New_Id is the corresponding new entity generated during Phase 1.
 
   procedure Add_Pending_Itype (Assoc_Nod : Node_Id; Itype : Entity_Id);
-  pragma Inline (Add_New_Entity);
+  pragma Inline (Add_Pending_Itype);
   --  Add an entry in the NCT_Pending_Itypes which maps key Assoc_Nod to
   --  value Itype. Assoc_Nod is the associated node of an itype. Itype is
   --  an itype.


diff --git a/gcc/ada/sem_util.ads b/gcc/ada/sem_util.ads
--- a/gcc/ada/sem_util.ads
+++ b/gcc/ada/sem_util.ads
@@ -2512,7 +2512,7 @@ package Sem_Util is
--  with the same mode.
 
procedure Next_Global (Node : in out Node_Id);
-   pragma Inline 

[Ada] Disallow Predicate_Failure without predicate

2020-07-08 Thread Pierre-Marie de Rodat
This patch checks RM-3.2.4(14.2/4), which requires Predicate_Failure to
be specified only on a subtype with a previous predicate specification
(for Static_Predicate or Dynamic_Predicate). We apply the same rule to
the GNAT-specific Predicate aspect.
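
A minimal sketch of the rule, with made-up names. The subtype with a
predicate may carry Predicate_Failure; the commented subtype has no
predicate and is the case this patch is meant to reject.

```ada
--  Illustrative only: Next_Even, Even and No_Pred are hypothetical names.
function Next_Even (N : Integer) return Integer is

   --  Legal: Predicate_Failure follows a predicate on the same subtype
   subtype Even is Integer
     with Dynamic_Predicate => Even mod 2 = 0,
          Predicate_Failure => "value must be even";

   --  Illegal per RM 3.2.4(14.2/4), there is no predicate to fail:
   --  subtype No_Pred is Integer
   --    with Predicate_Failure => "nothing to check";

   Result : constant Even := N + (if N mod 2 = 0 then 2 else 1);
begin
   return Result;
end Next_Even;
```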

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch13.adb (Predicate_Failure): Check that the type has
predicates.  Remove the setting of Has_Delayed_Aspects and
Freeze_Node, because (if the code is legal) it should have
already been done by the predicate aspect.

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -3120,6 +3120,12 @@ package body Sem_Ch13 is
  Error_Msg_N
("predicate cannot apply to incomplete view", Aspect);
  goto Continue;
+
+  elsif not Has_Predicates (E) then
+ Error_Msg_N
+   ("Predicate_Failure requires previous predicate" &
+" specification", Aspect);
+ goto Continue;
   end if;
 
   --  Construct the pragma
@@ -3132,16 +3138,6 @@ package body Sem_Ch13 is
  Expression => Relocate_Node (Expr))),
  Pragma_Name => Name_Predicate_Failure);
 
-  --  If the type is private, indicate that its completion
-  --  has a freeze node, because that is the one that will
-  --  be visible at freeze time.
-
-  if Is_Private_Type (E) and then Present (Full_View (E)) then
- Set_Has_Predicates (Full_View (E));
- Set_Has_Delayed_Aspects (Full_View (E));
- Ensure_Freeze_Node (Full_View (E));
-  end if;
-
--  Case 2b: Aspects corresponding to pragmas with two
--  arguments, where the second argument is a local name
--  referring to the entity, and the first argument is the




[Ada] Fix incorrect placement of freeze node with predicate

2020-07-08 Thread Pierre-Marie de Rodat
This prevents the freezing mechanism from putting a node inside the
subprogram body generated for a predicate function, which can for
example happen for a function referenced in the predicate.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* freeze.adb (In_Expanded_Body): Return true for the body of a
generated predicate function.

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -7054,11 +7054,11 @@ package body Freeze is
   --  as well.
 
   function In_Expanded_Body (N : Node_Id) return Boolean;
-  --  Given an N_Handled_Sequence_Of_Statements node N, determines whether
-  --  it is the handled statement sequence of an expander-generated
-  --  subprogram: init proc, stream subprogram, renaming as body, or body
-  --  created for an expression function. If so, this is not a freezing
-  --  context and the entity will be frozen at a later point.
+  --  Given an N_Handled_Sequence_Of_Statements node, determines whether it
+  --  is the statement sequence of an expander-generated subprogram: body
+  --  created for an expression function, for a predicate function, an init
+  --  proc, a stream subprogram, or a renaming as body. If so, this is not
+  --  a freezing context and the entity will be frozen at a later point.
 
   -
   -- Find_Aggregate_Component_Desig_Type --
@@ -7112,6 +7112,13 @@ package body Freeze is
  elsif Was_Expression_Function (P) then
 return not Comes_From_Source (P);
 
+ --  This is the body of a generated predicate function
+
+ elsif Present (Corresponding_Spec (P))
+   and then Is_Predicate_Function (Corresponding_Spec (P))
+ then
+return True;
+
  else
 Id := Defining_Unit_Name (Specification (P));
 




[Ada] Remove excessive validity checks on in-parameters

2020-07-08 Thread Pierre-Marie de Rodat
Routine Safe_To_Capture_Value was written to capture just the "value",
which only made sense for assignable entities. However, it was later
employed to capture other properties, e.g. the value being [non-]null or
valid. Those properties can be captured for non-assignable entities as
well, e.g. constants and in-parameters (as is done here).

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.ads, sem_util.adb (Safe_To_Capture_Value): Return
True for in-parameters.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -25718,23 +25718,25 @@ package body Sem_Util is
is
begin
   --  The only entities for which we track constant values are variables
-  --  which are not renamings, constants, out parameters, and in out
-  --  parameters, so check if we have this case.
+  --  which are not renamings, constants and formal parameters, so check
+  --  if we have this case.
 
   --  Note: it may seem odd to track constant values for constants, but in
   --  fact this routine is used for other purposes than simply capturing
-  --  the value. In particular, the setting of Known[_Non]_Null.
+  --  the value. In particular, the setting of Known[_Non]_Null and
+  --  Is_Known_Valid.
 
   if (Ekind (Ent) = E_Variable and then No (Renamed_Object (Ent)))
-or else
-  Ekind_In (Ent, E_Constant, E_Out_Parameter, E_In_Out_Parameter)
+   or else
+ Ekind (Ent) = E_Constant
+   or else
+ Is_Formal (Ent)
   then
  null;
 
-  --  For conditionals, we also allow loop parameters and all formals,
-  --  including in parameters.
+  --  For conditionals, we also allow loop parameters
 
-  elsif Cond and then Ekind_In (Ent, E_Loop_Parameter, E_In_Parameter) then
+  elsif Cond and then Ekind (Ent) = E_Loop_Parameter then
  null;
 
   --  For all other cases, not just unsafe, but impossible to capture


diff --git a/gcc/ada/sem_util.ads b/gcc/ada/sem_util.ads
--- a/gcc/ada/sem_util.ads
+++ b/gcc/ada/sem_util.ads
@@ -2743,13 +2743,14 @@ package Sem_Util is
  (N: Node_Id;
   Ent  : Entity_Id;
   Cond : Boolean := False) return Boolean;
-   --  The caller is interested in capturing a value (either the current value,
-   --  or an indication that the value is non-null) for the given entity Ent.
-   --  This value can only be captured if sequential execution semantics can be
-   --  properly guaranteed so that a subsequent reference will indeed be sure
-   --  that this current value indication is correct. The node N is the
-   --  construct which resulted in the possible capture of the value (this
-   --  is used to check if we are in a conditional).
+   --  The caller is interested in capturing a value (either the current
+   --  value, an indication that the value is [non-]null or an indication that
+   --  the value is valid) for the given entity Ent. This value can only be
+   --  captured if sequential execution semantics can be properly guaranteed so
+   --  that a subsequent reference will indeed be sure that this current value
+   --  indication is correct. The node N is the construct which resulted in
+   --  the possible capture of the value (this is used to check if we are in
+   --  a conditional).
--
--  Cond is used to skip the test for being inside a conditional. It is used
--  in the case of capturing values from if/while tests, which already do a




[Ada] Analyze aspects once generic subprograms are recognized

2020-07-08 Thread Pierre-Marie de Rodat
When analysing aspect Yield we were adding a minimum decoration to the
annotated entity by setting its kind to E_Function/E_Procedure. This
kind was then correctly reset to E_Generic_Function/E_Generic_Procedure
after all aspects had been analysed.

It seems cleaner to set this kind once and correctly before analysing
aspects on generic subprograms. This way we don't need to repeat this
minimal decoration for other aspects, e.g. Relaxed_Initialization.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch12.adb (Analyze_Generic_Subprogram_Declaration): Call
Analyze_Aspect_Specifications after setting Ekind of the
analyzed entity.
* sem_ch13.adb (Analyze_Aspect_Yield): Remove minimal decoration
of generic subprograms.

diff --git a/gcc/ada/sem_ch12.adb b/gcc/ada/sem_ch12.adb
--- a/gcc/ada/sem_ch12.adb
+++ b/gcc/ada/sem_ch12.adb
@@ -3860,13 +3860,6 @@ package body Sem_Ch12 is
   Enter_Name (Id);
   Set_Scope_Depth_Value (Id, Scope_Depth (Current_Scope) + 1);
 
-  --  Analyze the aspects of the generic copy to ensure that all generated
-  --  pragmas (if any) perform their semantic effects.
-
-  if Has_Aspects (N) then
- Analyze_Aspect_Specifications (N, Id);
-  end if;
-
   Push_Scope (Id);
   Enter_Generic_Scope (Id);
   Set_Inner_Instances (Id, New_Elmt_List);
@@ -3880,6 +3873,13 @@ package body Sem_Ch12 is
  Set_Ekind (Id, E_Generic_Procedure);
   end if;
 
+  --  Analyze the aspects of the generic copy to ensure that all generated
+  --  pragmas (if any) perform their semantic effects.
+
+  if Has_Aspects (N) then
+ Analyze_Aspect_Specifications (N, Id);
+  end if;
+
   --  Set SPARK_Mode from context
 
   Set_SPARK_Pragma   (Id, SPARK_Mode_Pragma);


diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -2563,22 +2563,6 @@ package body Sem_Ch13 is
end if;
 
if Expr_Value then
-
-  --  Adding minimum decoration to generic subprograms to set
-  --  the Yield attribute (since at this stage it may not be
-  --  set; see Analyze_Generic_Subprogram_Declaration).
-
-  if Nkind (N) in N_Generic_Subprogram_Declaration
-and then Ekind (E) = E_Void
-  then
- if Nkind (Specification (N)) = N_Function_Specification
- then
-Set_Ekind (E, E_Function);
- else
-Set_Ekind (E, E_Procedure);
- end if;
-  end if;
-
   Set_Has_Yield_Aspect (E);
end if;
 




[Ada] Allow boolean expressions in aspect Relaxed_Initialization

2020-07-08 Thread Pierre-Marie de Rodat
The final version of SPARK RM 6.10 allows the Relaxed_Initialization
status of subprogram parameters (and of the function result) to be
controlled by optional boolean expressions, e.g.:

   function F (Arg : Integer) return Integer
   with Relaxed_Initialization => (Arg => True, F'Result);

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch13.adb (Analyze_Aspect_Relaxed_Initialization): Analyze
optional boolean expressions.
* sem_util.ads, sem_util.adb (Has_Relaxed_Initialization): Adapt
query; update comment.

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -2304,12 +2304,48 @@ package body Sem_Ch13 is
 
  if Nkind (Expr) = N_Aggregate then
 
---  Component associations are not allowed in the
---  aspect expression aggregate.
+--  Component associations in the aggregate must be a
+--  parameter name followed by a static boolean
+--  expression.
 
 if Present (Component_Associations (Expr)) then
-   Error_Msg_N ("illegal aspect % expression", Expr);
-else
+   declare
+  Assoc : Node_Id :=
+First (Component_Associations (Expr));
+   begin
+  while Present (Assoc) loop
+ if List_Length (Choices (Assoc)) = 1 then
+Analyze_Relaxed_Parameter
+  (E, First (Choices (Assoc)), Seen);
+
+if Inside_A_Generic then
+   Preanalyze_And_Resolve
+ (Expression (Assoc), Any_Boolean);
+else
+   Analyze_And_Resolve
+ (Expression (Assoc), Any_Boolean);
+end if;
+
+if not Is_OK_Static_Expression
+  (Expression (Assoc))
+then
+   Error_Msg_N
+ ("expression of aspect % " &
+  "must be static", Aspect);
+end if;
+
+ else
+Error_Msg_N
+  ("illegal aspect % expression", Expr);
+ end if;
+ Next (Assoc);
+  end loop;
+   end;
+end if;
+
+--  Expressions of the aggregate are parameter names
+
+if Present (Expressions (Expr)) then
declare
   Param : Node_Id := First (Expressions (Expr));
 


diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -12525,6 +12525,7 @@ package body Sem_Util is
Subp_Id : Entity_Id;
Aspect_Expr : Node_Id;
Param_Expr  : Node_Id;
+   Assoc   : Node_Id;
 
 begin
if Is_Formal (E) then
@@ -12538,13 +12539,30 @@ package body Sem_Util is
 Find_Value_Of_Aspect
   (Subp_Id, Aspect_Relaxed_Initialization);
 
-  --  Aspect expression is either an aggregate, e.g.:
+  --  Aspect expression is either an aggregate with an optional
+  --  Boolean expression (which defaults to True), e.g.:
   --
   --function F (X : Integer) return Integer
-  --  with Relaxed_Initialization => (X, F'Result);
+  --  with Relaxed_Initialization => (X => True, F'Result);
 
   if Nkind (Aspect_Expr) = N_Aggregate then
 
+ if Present (Component_Associations (Aspect_Expr)) then
+Assoc := First (Component_Associations (Aspect_Expr));
+
+while Present (Assoc) loop
+   if Denotes_Relaxed_Parameter
+ (First (Choices (Assoc)), E)
+   then
+  return
+Is_True
+  (Static_Boolean (Expression (Assoc)));
+   end if;
+
+   Next (Assoc);
+end loop;
+ end if;
+
  Param_Expr := First 

[Ada] Fix inaccurate -gnatR output for derived untagged types

2020-07-08 Thread Pierre-Marie de Rodat
This fixes a couple of quirks in the output generated by the -gnatR switch
for derived untagged types with discriminants:

  1. in normal mode, it would display both the hidden and the visible
 discriminants, which results in overlapping components,

  2. in JSON mode, it would display only the hidden discriminants.

Both modes are changed to display only the visible discriminants.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* repinfo.adb (Compute_Max_Length): Skip hidden discriminants.
(List_Record_Layout): Likewise.
(List_Structural_Record_Layout): Use First_Discriminant instead
of First_Stored_Discriminant and Next_Discriminant instead of
Next_Stored_Discriminant to walk the list of discriminants.

diff --git a/gcc/ada/repinfo.adb b/gcc/ada/repinfo.adb
--- a/gcc/ada/repinfo.adb
+++ b/gcc/ada/repinfo.adb
@@ -1006,10 +1006,12 @@ package body Repinfo is
  Comp := First_Component_Or_Discriminant (Ent);
  while Present (Comp) loop
 
---  Skip discriminant in unchecked union (since it is not there!)
+--  Skip a completely hidden discriminant or a discriminant in an
+--  unchecked union (since it is not there).
 
 if Ekind (Comp) = E_Discriminant
-  and then Is_Unchecked_Union (Ent)
+  and then (Is_Completely_Hidden (Comp)
+ or else Is_Unchecked_Union (Ent))
 then
goto Continue;
 end if;
@@ -1278,10 +1280,12 @@ package body Repinfo is
  Comp := First_Component_Or_Discriminant (Ent);
  while Present (Comp) loop
 
---  Skip discriminant in unchecked union (since it is not there!)
+--  Skip a completely hidden discriminant or a discriminant in an
+--  unchecked union (since it is not there).
 
 if Ekind (Comp) = E_Discriminant
-  and then Is_Unchecked_Union (Ent)
+  and then (Is_Completely_Hidden (Comp)
+ or else Is_Unchecked_Union (Ent))
 then
goto Continue;
 end if;
@@ -1370,7 +1374,7 @@ package body Repinfo is
 Derived_Disc : Entity_Id;
 
  begin
-Derived_Disc := First_Stored_Discriminant (Outer_Ent);
+Derived_Disc := First_Discriminant (Outer_Ent);
 
 --  Loop over the discriminants of the extension
 
@@ -1394,7 +1398,7 @@ package body Repinfo is
   end if;
end if;
 
-   Next_Stored_Discriminant (Derived_Disc);
+   Next_Discriminant (Derived_Disc);
 end loop;
 
 --  Disc is not constrained by a discriminant of Outer_Ent
@@ -1463,12 +1467,13 @@ package body Repinfo is
end if;
 
--  If the record has discriminants and is not an unchecked
-   --  union, then display them now.
+   --  union, then display them now. Note that, even if this is
+   --  a structural layout, we list the visible discriminants.
 
if Has_Discriminants (Ent)
  and then not Is_Unchecked_Union (Ent)
then
-  Disc := First_Stored_Discriminant (Ent);
+  Disc := First_Discriminant (Ent);
   while Present (Disc) loop
 
  --  If this is a record extension and the discriminant is
@@ -1506,7 +1511,7 @@ package body Repinfo is
  List_Component_Layout (Listed_Disc, Indent => Indent);
 
   <>
- Next_Stored_Discriminant (Disc);
+ Next_Discriminant (Disc);
   end loop;
end if;
 




[Ada] Extend optimization to True/False prefixed with Standard

2020-07-08 Thread Pierre-Marie de Rodat
The optimization that replaces

   if expression then
  return true;
   else
  return false;
   end if;

with

   return expression;

is now trivially extended to detect True/False prefixed by Standard.
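
As a rough illustration of the rewrite described above, here is a small
C++ model that works on expression text.  It is only a sketch: the names
denotes and simplify_if_return are made up for this example, and the real
code in exp_ch5.adb compares entities (Standard_True/Standard_False) rather
than spellings, so a user-defined True would not match there.  The
"return not (...)" form for the swapped case is an assumption about how the
second simplification is spelled.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper: does NAME spell the Boolean literal LIT, either
// directly ("True") or as an expanded name ("Standard.True")?
// Ada is case-insensitive; this toy only accepts the casing shown.
static bool denotes(const std::string &name, const std::string &lit)
{
  return name == lit || name == "Standard." + lit;
}

// Model of the simplification of
//   if COND then return THEN_EXPR; else return ELSE_EXPR; end if;
// Returns the replacement statement, or nothing if the pattern does not
// apply.
std::optional<std::string>
simplify_if_return(const std::string &cond,
                   const std::string &then_expr,
                   const std::string &else_expr)
{
  assert(!cond.empty());  // precondition of this sketch

  if (denotes(then_expr, "True") && denotes(else_expr, "False"))
    return "return " + cond + ";";

  if (denotes(then_expr, "False") && denotes(else_expr, "True"))
    return "return not (" + cond + ");";

  return std::nullopt;
}
```

With this model, passing "Standard.True" and "False" as the two branch
expressions gives the same result as passing "True" and "False", which is
the extension the patch makes.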

Found while investigating excessive validity checks.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch5.adb (Expand_N_If_Statement): Detect True/False
prefixed with Standard.

diff --git a/gcc/ada/exp_ch5.adb b/gcc/ada/exp_ch5.adb
--- a/gcc/ada/exp_ch5.adb
+++ b/gcc/ada/exp_ch5.adb
@@ -3743,9 +3743,9 @@ package body Exp_Ch5 is
   --  Another optimization, special cases that can be simplified
 
   -- if expression then
-  --return true;
+  --return [standard.]true;
   -- else
-  --return false;
+  --return [standard.]false;
   -- end if;
 
   --  can be changed to:
@@ -3755,9 +3755,9 @@ package body Exp_Ch5 is
   --  and
 
   -- if expression then
-  --return false;
+  --return [standard.]false;
   -- else
-  --return true;
+  --return [standard.]true;
   -- end if;
 
   --  can be changed to:
@@ -3790,9 +3790,9 @@ package body Exp_Ch5 is
  Else_Expr : constant Node_Id := Expression (Else_Stm);
 
   begin
- if Nkind (Then_Expr) = N_Identifier
+ if Nkind_In (Then_Expr, N_Expanded_Name, N_Identifier)
   and then
-Nkind (Else_Expr) = N_Identifier
+Nkind_In (Else_Expr, N_Expanded_Name, N_Identifier)
  then
 if Entity (Then_Expr) = Standard_True
   and then Entity (Else_Expr) = Standard_False




[Ada] Check predicates for subtypes of private types

2020-07-08 Thread Pierre-Marie de Rodat
This patch fixes a bug where if we have "subtype S is T with Predicate
=> ...", and T is a private type whose full type is derived from another
private type, then the predicate of S is not checked.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch13.adb (Analyze_Aspect_Specifications): Add freeze node
for the Underlying_Full_View if it exists. The freeze node is
what triggers the generation of the predicate function.
* freeze.adb: Minor reformatting.

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -6175,8 +6175,7 @@ package body Freeze is
 
 if Present (F_Node) then
Inherit_Freeze_Node
- (Fnod => F_Node,
-  Typ  => Full_View (E));
+ (Fnod => F_Node, Typ => Full_View (E));
 else
Set_Has_Delayed_Freeze (Full_View (E), False);
Set_Freeze_Node (Full_View (E), Empty);
@@ -6187,9 +6186,7 @@ package body Freeze is
 F_Node := Freeze_Node (Full_View (E));
 
 if Present (F_Node) then
-   Inherit_Freeze_Node
- (Fnod => F_Node,
-  Typ  => E);
+   Inherit_Freeze_Node (Fnod => F_Node, Typ => E);
 else
--  {Incomplete,Private}_Subtypes with Full_Views
--  constrained by discriminants.


diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -3051,6 +3051,21 @@ package body Sem_Ch13 is
 
  Set_Has_Delayed_Aspects (Full_View (E));
  Ensure_Freeze_Node (Full_View (E));
+
+ --  If there is an Underlying_Full_View, also create a
+ --  freeze node for that one.
+
+ if Is_Private_Type (Full_View (E)) then
+declare
+   U_Full : constant Entity_Id :=
+ Underlying_Full_View (Full_View (E));
+begin
+   if Present (U_Full) then
+  Set_Has_Delayed_Aspects (U_Full);
+  Ensure_Freeze_Node (U_Full);
+   end if;
+end;
+ end if;
   end if;
 
--  Predicate_Failure




Re: [PATCH][RFC] __builtin_shuffle sometimes should produce zip1 rather than TBL (PR82199)

2020-07-08 Thread Richard Sandiford
Dmitrij Pochepko  writes:
> Hi,
>
> thank you for looking into this.
>
> I prepared new patch with all your comments addressed.

Thanks, looks good, just a couple of minor things:

> @@ -20090,6 +20092,62 @@ aarch64_evpc_trn (struct expand_vec_perm_d *d)
>return true;
>  }
>  
> +/* Try to re-encode the PERM constant so it use the bigger size up.

maybe s/use bigger size up/combines odd and even elements/

> +   This rewrites constants such as {0, 1, 4, 5}/V4SF to {0, 2}/V2DI.
> +   We retry with this new constant with the full suite of patterns.  */
> +static bool
> +aarch64_evpc_reencode (struct expand_vec_perm_d *d)
> +{
> +  expand_vec_perm_d newd;
> +  unsigned HOST_WIDE_INT nelt;
> +
> +  if (d->vec_flags != VEC_ADVSIMD)
> +return false;
> +
> +  /* Get the new mode.  Always twice the size of the inner
> + and half the elements.  */
> +  poly_uint64 vec_bits = GET_MODE_BITSIZE (d->vmode);
> +  unsigned int new_elt_bits = GET_MODE_UNIT_BITSIZE (d->vmode) * 2;
> +  auto new_elt_mode = int_mode_for_size (new_elt_bits, false).require ();
> +  machine_mode new_mode = aarch64_simd_container_mode (new_elt_mode, vec_bits);
> +
> +  if (new_mode == word_mode)
> +return false;
> +
> +  /* to_constant is safe since this routine is specific to Advanced SIMD
> + vectors.  */
> +  nelt = d->perm.length ().to_constant ();
> +
> +  vec_perm_builder newpermconst;
> +  newpermconst.new_vector (nelt / 2, nelt / 2, 1);
> +
> +  /* Convert the perm constant if we can.  Require even, odd as the pairs.  */
> +  for (unsigned int i = 0; i < nelt; i += 2)
> +{
> +  poly_int64 elt_poly0 = d->perm[i];
> +  poly_int64 elt_poly1 = d->perm[i+1];
> +  if (!elt_poly0.is_constant () || !elt_poly1.is_constant ())
> + return false;
> +  unsigned int elt0 = elt_poly0.to_constant ();
> +  unsigned int elt1 = elt_poly1.to_constant ();
> +  if ((elt0 & 1) != 0 || elt0 + 1 != elt1)
> + return false;
> +  newpermconst.quick_push (elt0 / 2);

It should be possible to do this without the to_constants, e.g.:

  poly_int64 elt0 = d->perm[i];
  poly_int64 elt1 = d->perm[i + 1];
  poly_int64 newelt;
  if (!multiple_p (elt0, 2, &newelt) || maybe_ne (elt0 + 1, elt1))
return false;

(The coding conventions require spaces around “+”, even though I agree
“[i+1]” looks better.)

Looks good otherwise.

Richard
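
For readers following the thread, the pair check discussed above can be
modelled on plain integers.  This is only a sketch under simplifying
assumptions: reencode_pairs is a made-up name, the indices are ordinary
unsigned values rather than poly_int64, and none of the mode handling from
aarch64_evpc_reencode is reproduced.  It shows how {0, 1, 4, 5} on narrow
elements becomes {0, 2} on elements twice as wide.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Re-encode a permutation on N narrow elements as a permutation on N/2
// elements of twice the width.  Each consecutive pair must be
// (even index, that index + 1); otherwise the re-encoding is not possible.
std::optional<std::vector<unsigned>>
reencode_pairs(const std::vector<unsigned> &perm)
{
  if (perm.size() % 2 != 0)
    return std::nullopt;

  std::vector<unsigned> wide;
  wide.reserve(perm.size() / 2);

  for (std::size_t i = 0; i < perm.size(); i += 2)
    {
      unsigned elt0 = perm[i];
      unsigned elt1 = perm[i + 1];
      // Same condition as in the quoted patch: the first index of the pair
      // must be even and the second must follow it directly.
      if ((elt0 & 1) != 0 || elt0 + 1 != elt1)
        return std::nullopt;
      wide.push_back(elt0 / 2);
    }

  assert(wide.size() == perm.size() / 2);
  return wide;
}
```

A permutation such as {0, 2, 4, 6} is rejected because its pairs do not
select adjacent narrow elements.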


[PATCH] remove premature vect_verify_datarefs_alignment

2020-07-08 Thread Richard Biener
This followup removes vect_verify_datarefs_alignment and its
premature cancellation of vectorization, leaving the actual
decision whether alignment is supported to the functions
deciding whether we can vectorize a load or store.

I'll see whether I can find a suitable machine to test !hw_misalign_supported
(altivec-only ppc I think?  hints welcome...), but maybe I'm lazy...

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2020-07-08  Richard Biener  

* tree-vectorizer.h (vect_verify_datarefs_alignment): Remove.
(vect_slp_analyze_and_verify_instance_alignment): Rename to ...
(vect_slp_analyze_instance_alignment): ... this.
* tree-vect-data-refs.c (verify_data_ref_alignment): Remove.
(vect_verify_datarefs_alignment): Likewise.
(vect_enhance_data_refs_alignment): Do not call
vect_verify_datarefs_alignment.
(vect_slp_analyze_node_alignment): Rename from
vect_slp_analyze_and_verify_node_alignment and do not
call verify_data_ref_alignment.
(vect_slp_analyze_instance_alignment): Rename from
vect_slp_analyze_and_verify_instance_alignment.
* tree-vect-stmts.c (vectorizable_store): Dump when
we vectorize an unaligned access.
(vectorizable_load): Likewise.
* tree-vect-loop.c (vect_analyze_loop_2): Do not call
vect_verify_datarefs_alignment.
* tree-vect-slp.c (vect_slp_analyze_bb_1): Adjust.
---
 gcc/tree-vect-data-refs.c | 88 ---
 gcc/tree-vect-loop.c  |  2 -
 gcc/tree-vect-slp.c   |  2 +-
 gcc/tree-vect-stmts.c | 14 +++
 gcc/tree-vectorizer.h |  4 +-
 5 files changed, 25 insertions(+), 85 deletions(-)

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 2b4421b5fb4..e35a215e042 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1214,56 +1214,6 @@ vect_relevant_for_alignment_p (dr_vec_info *dr_info)
   return true;
 }
 
-/* Function verify_data_ref_alignment
-
-   Return TRUE if DR_INFO can be handled with respect to alignment.  */
-
-static opt_result
-verify_data_ref_alignment (vec_info *vinfo, dr_vec_info *dr_info)
-{
-  enum dr_alignment_support supportable_dr_alignment
-= vect_supportable_dr_alignment (vinfo, dr_info, false);
-  if (!supportable_dr_alignment)
-return opt_result::failure_at
-  (dr_info->stmt->stmt,
-   DR_IS_READ (dr_info->dr)
-   ? "not vectorized: unsupported unaligned load: %T\n"
-   : "not vectorized: unsupported unaligned store: %T\n",
-   DR_REF (dr_info->dr));
-
-  if (supportable_dr_alignment != dr_aligned && dump_enabled_p ())
-dump_printf_loc (MSG_NOTE, vect_location,
-"Vectorizing an unaligned access.\n");
-
-  return opt_result::success ();
-}
-
-/* Function vect_verify_datarefs_alignment
-
-   Return TRUE if all data references in the loop can be
-   handled with respect to alignment.  */
-
-opt_result
-vect_verify_datarefs_alignment (loop_vec_info loop_vinfo)
-{
-  vec datarefs = LOOP_VINFO_DATAREFS (loop_vinfo);
-  struct data_reference *dr;
-  unsigned int i;
-
-  FOR_EACH_VEC_ELT (datarefs, i, dr)
-{
-  dr_vec_info *dr_info = loop_vinfo->lookup_dr (dr);
-  if (!vect_relevant_for_alignment_p (dr_info))
-   continue;
-
-  opt_result res = verify_data_ref_alignment (loop_vinfo, dr_info);
-  if (!res)
-   return res;
-}
-
-  return opt_result::success ();
-}
-
 /* Given an memory reference EXP return whether its alignment is less
than its size.  */
 
@@ -2091,13 +2041,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
 
   /* Check if all datarefs are supportable and log.  */
   if (do_peeling && known_alignment_for_access_p (dr0_info) && npeel == 0)
-{
-  opt_result stat = vect_verify_datarefs_alignment (loop_vinfo);
-  if (!stat)
-do_peeling = false;
-  else
-   return stat;
-}
+   return opt_result::success ();
 
   /* Cost model #1 - honor --param vect-max-peeling-for-alignment.  */
   if (do_peeling)
@@ -2186,9 +2130,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
  /* The inside-loop cost will be accounted for in vectorizable_load
 and vectorizable_store correctly with adjusted alignments.
 Drop the body_cst_vec on the floor here.  */
- opt_result stat = vect_verify_datarefs_alignment (loop_vinfo);
- gcc_assert (stat);
-  return stat;
+ return opt_result::success ();
 }
 }
 
@@ -2318,16 +2260,13 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   /* Peeling and versioning can't be done together at this time.  */
   gcc_assert (! (do_peeling && do_versioning));
 
-  opt_result stat = vect_verify_datarefs_alignment (loop_vinfo);
-  gcc_assert (stat);
-  return stat;
+  return opt_result::success ();
 }
 
   

Re: [Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions

2020-07-08 Thread Richard Sandiford
Matthew Malcomson  writes:
> With suggestions applied.
> Testing with `-mabi=ilp32` found a bug around the trampoline
> initialisation where the new larger size of the trampoline caused a
> different execution path of `emit_block_move` which ICE'd on the
> pre-existing `ptr_mode` address.

OK, thanks, and sorry for the slow review.

Richard


Re: [GCC 10 PATCH] aarch64: Treat GNU and Advanced SIMD vectors as distinct [PR95726]

2020-07-08 Thread Jakub Jelinek via Gcc-patches
On Wed, Jul 08, 2020 at 03:10:14PM +0100, Richard Sandiford wrote:
> gcc/
>   PR target/95726
>   * config/aarch64/aarch64.c (aarch64_attribute_table): Add
>   "Advanced SIMD type".
>   * config/aarch64/aarch64-builtins.c: Include stringpool.h and
>   attribs.h.
>   (aarch64_init_simd_builtin_types): Add an "Advanced SIMD type"
>   attribute to each Advanced SIMD type.
> 
> gcc/cp/
>   PR target/95726
>   * typeck.c (structural_comptypes): When comparing template
>   specializations, differentiate between vectors that have and
>   do not have an "Advanced SIMD type" attribute.
> 
> gcc/testsuite/
>   PR target/95726
>   * g++.target/aarch64/pr95726.C: New test.
> --- a/gcc/cp/typeck.c
> +++ b/gcc/cp/typeck.c
> @@ -1429,6 +1429,15 @@ structural_comptypes (tree t1, tree t2, int strict)
> || maybe_ne (TYPE_VECTOR_SUBPARTS (t1), TYPE_VECTOR_SUBPARTS (t2))
> || !same_type_p (TREE_TYPE (t1), TREE_TYPE (t2)))
>   return false;

I'd at least add an explanatory comment saying that it is a hack for GCC 8-10
only, for aarch64 and arm targets, and why, with a reference to the PR and a
note that it is solved differently for GCC 11+.

> +  if (comparing_specializations)
> + {
> +   bool asimd1 = lookup_attribute ("Advanced SIMD type",
> +   TYPE_ATTRIBUTES (t1));
> +   bool asimd2 = lookup_attribute ("Advanced SIMD type",
> +   TYPE_ATTRIBUTES (t2));
> +   if (asimd1 != asimd2)
> + return false;
> + }

Otherwise LGTM for release branches if it is acceptable that way to Jason.

Just note, Richi announced 10.2 RC will be July 15th, so would be nice to
have it in by then.

Jakub



[GCC 10 PATCH] aarch64: Treat GNU and Advanced SIMD vectors as distinct [PR95726]

2020-07-08 Thread Richard Sandiford
This is a release branch version of r11-1741-g:31427b974ed7b7dd54e2.
The trunk version of the patch made GNU and Advanced SIMD vectors
distinct (but inter-convertible) in all cases.  However, the
traditional behaviour is that the types are distinct in template
arguments but not otherwise.

Following a suggestion from Jason, this patch puts the check
for different vector types under comparing_specializations.
In order to keep the backport as simple as possible, the patch
hard-codes the name of the attribute in the frontend rather than
adding a new branch-only target hook.  This code will be reused
for AArch32 too.
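
As a rough model of the behaviour described above (not the tree-based code
in typeck.c), the comparison can be pictured like this.  VecType and
same_vector_type are hypothetical names for this sketch, and the
advsimd_attr flag stands in for the presence of the "Advanced SIMD type"
attribute.

```cpp
#include <cassert>

// Toy description of a vector type.
struct VecType
{
  unsigned nunits;    // number of elements
  unsigned elt_bits;  // width of each element in bits
  bool advsimd_attr;  // stands in for the "Advanced SIMD type" attribute
};

// Two vector types with the same shape compare equal in general, but are
// kept distinct when the comparison is made on behalf of template
// specializations.
bool same_vector_type(const VecType &a, const VecType &b,
                      bool comparing_specializations)
{
  assert(a.nunits > 0 && b.nunits > 0);  // precondition of this sketch

  if (a.nunits != b.nunits || a.elt_bits != b.elt_bits)
    return false;

  if (comparing_specializations && a.advsimd_attr != b.advsimd_attr)
    return false;

  return true;
}
```

So a GNU vector of four 32-bit elements and its Advanced SIMD counterpart
compare equal in ordinary contexts, and unequal only when
comparing_specializations is set.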

I didn't find a test that tripped the assert on the branch,
even with the --param in the PR, so instead I tested this by
forcing the hash function to only hash the tree code.  That
made the static assertion in the test fail without the patch
but pass with it.

This means that the test passes for unmodified sources even
without the patch (unless you're very unlucky).

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK for branches?

Richard


gcc/
PR target/95726
* config/aarch64/aarch64.c (aarch64_attribute_table): Add
"Advanced SIMD type".
* config/aarch64/aarch64-builtins.c: Include stringpool.h and
attribs.h.
(aarch64_init_simd_builtin_types): Add an "Advanced SIMD type"
attribute to each Advanced SIMD type.

gcc/cp/
PR target/95726
* typeck.c (structural_comptypes): When comparing template
specializations, differentiate between vectors that have and
do not have an "Advanced SIMD type" attribute.

gcc/testsuite/
PR target/95726
* g++.target/aarch64/pr95726.C: New test.
---
 gcc/config/aarch64/aarch64-builtins.c  | 14 +++
 gcc/config/aarch64/aarch64.c   |  1 +
 gcc/cp/typeck.c|  9 +++
 gcc/testsuite/g++.target/aarch64/pr95726.C | 28 ++
 4 files changed, 48 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/pr95726.C

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 95213cd70c8..8407a34b594 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -43,6 +43,8 @@
 #include "gimple-iterator.h"
 #include "case-cfn-macros.h"
 #include "emit-rtl.h"
+#include "stringpool.h"
+#include "attribs.h"
 
 #define v8qi_UP  E_V8QImode
 #define v4hi_UP  E_V4HImode
@@ -802,10 +804,14 @@ aarch64_init_simd_builtin_types (void)
 
   if (aarch64_simd_types[i].itype == NULL)
{
- aarch64_simd_types[i].itype
-   = build_distinct_type_copy
- (build_vector_type (eltype, GET_MODE_NUNITS (mode)));
- SET_TYPE_STRUCTURAL_EQUALITY (aarch64_simd_types[i].itype);
+ tree type = build_vector_type (eltype, GET_MODE_NUNITS (mode));
+ type = build_distinct_type_copy (type);
+ SET_TYPE_STRUCTURAL_EQUALITY (type);
+
+ TYPE_ATTRIBUTES (type)
+   = tree_cons (get_identifier ("Advanced SIMD type"),
+NULL_TREE, TYPE_ATTRIBUTES (type));
+ aarch64_simd_types[i].itype = type;
}
 
   tdecl = add_builtin_type (aarch64_simd_types[i].name,
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 8aeb78a4793..60173b34b23 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1429,6 +1429,7 @@ static const struct attribute_spec aarch64_attribute_table[] =
   { "arm_sve_vector_bits", 1, 1, false, true,  false, true,
  aarch64_sve::handle_arm_sve_vector_bits_attribute,
  NULL },
+  { "Advanced SIMD type", 0, 0, false, true,  false, true,  NULL, NULL },
   { "SVE type",  3, 3, false, true,  false, true,  NULL, NULL },
   { "SVE sizeless type",  0, 0, false, true,  false, true,  NULL, NULL },
   { NULL, 0, 0, false, false, false, false, NULL, NULL }
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 5f8f8290f0f..34a27fe2414 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -1429,6 +1429,15 @@ structural_comptypes (tree t1, tree t2, int strict)
  || maybe_ne (TYPE_VECTOR_SUBPARTS (t1), TYPE_VECTOR_SUBPARTS (t2))
  || !same_type_p (TREE_TYPE (t1), TREE_TYPE (t2)))
return false;
+  if (comparing_specializations)
+   {
+ bool asimd1 = lookup_attribute ("Advanced SIMD type",
+ TYPE_ATTRIBUTES (t1));
+ bool asimd2 = lookup_attribute ("Advanced SIMD type",
+ TYPE_ATTRIBUTES (t2));
+ if (asimd1 != asimd2)
+   return false;
+   }
   break;
 
 case TYPE_PACK_EXPANSION:
diff --git a/gcc/testsuite/g++.target/aarch64/pr95726.C b/gcc/testsuite/g++.target/aarch64/pr95726.C
new file mode 100644
index 000..3327b335ff5
--- /dev/null
+++ 

Re: [PATCH] expr: Fix REDUCE_BIT_FIELD for constants [PR95694]

2020-07-08 Thread Richard Biener via Gcc-patches
On Tue, Jul 7, 2020 at 9:20 PM Richard Sandiford wrote:
>
> [Sorry, been sitting on this patch for a while and just realised
>  I never sent it.]
>
> This is yet another PR caused by constant integer rtxes not storing
> a mode.  We were calling REDUCE_BIT_FIELD on a constant integer that
> didn't fit in poly_int64, and then tripped the as_a <scalar_int_mode>
> assert on VOIDmode.
>
> AFAICT REDUCE_BIT_FIELD is always passed rtxes that have TYPE_MODE
> (rather than some other mode) and it just fills in the redundant
> sign bits of that TYPE_MODE value.  So it should be safe to get
> the mode from the type instead of the rtx.  The patch does that
> and asserts that the modes agree, where information is available.
>
> That on its own is enough to fix the bug, but we might as well
> extend the folding case to all constant integers, not just those
> that fit poly_int64.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to trunk
> and release branches?

OK.

Thanks,
Richard.

> Richard
>
>
> gcc/
> PR middle-end/95694
> * expr.c (expand_expr_real_2): Get the mode from the type rather
> than the rtx, and assert that it is consistent with the mode of
> the rtx (where known).  Optimize all constant integers, not just
> those that can be represented in poly_int64.
>
> gcc/testsuite/
> PR middle-end/95694
> * gcc.dg/pr95694.c: New test.
> ---
>  gcc/expr.c | 15 ---
>  gcc/testsuite/gcc.dg/pr95694.c | 23 +++
>  2 files changed, 31 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr95694.c
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 3c68b0d754c..715edae819a 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -11525,26 +11525,27 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>  static rtx
>  reduce_to_bit_field_precision (rtx exp, rtx target, tree type)
>  {
> +  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type);
>HOST_WIDE_INT prec = TYPE_PRECISION (type);
> -  if (target && GET_MODE (target) != GET_MODE (exp))
> +  gcc_assert (GET_MODE (exp) == VOIDmode || GET_MODE (exp) == mode);
> +  if (target && GET_MODE (target) != mode)
>  target = 0;
> -  /* For constant values, reduce using build_int_cst_type. */
> -  poly_int64 const_exp;
> -  if (poly_int_rtx_p (exp, &const_exp))
> +
> +  /* For constant values, reduce using wide_int_to_tree. */
> +  if (poly_int_rtx_p (exp))
>  {
> -  tree t = build_int_cst_type (type, const_exp);
> +  auto value = wi::to_poly_wide (exp, mode);
> +  tree t = wide_int_to_tree (type, value);
>return expand_expr (t, target, VOIDmode, EXPAND_NORMAL);
>  }
>else if (TYPE_UNSIGNED (type))
>  {
> -  scalar_int_mode mode = as_a <scalar_int_mode> (GET_MODE (exp));
>rtx mask = immed_wide_int_const
> (wi::mask (prec, false, GET_MODE_PRECISION (mode)), mode);
>return expand_and (mode, exp, mask, target);
>  }
>else
>  {
> -  scalar_int_mode mode = as_a <scalar_int_mode> (GET_MODE (exp));
>int count = GET_MODE_PRECISION (mode) - prec;
>exp = expand_shift (LSHIFT_EXPR, mode, exp, count, target, 0);
>return expand_shift (RSHIFT_EXPR, mode, exp, count, target, 0);
> diff --git a/gcc/testsuite/gcc.dg/pr95694.c b/gcc/testsuite/gcc.dg/pr95694.c
> new file mode 100644
> index 000..6f5e1900a02
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr95694.c
> @@ -0,0 +1,23 @@
> +/* PR tree-optimization/68835 */
> +/* { dg-do run { target int128 } } */
> +/* { dg-options "-fno-tree-forwprop -fno-tree-ccp -O1 -fno-tree-dominator-opts -fno-tree-fre" } */
> +
> +__attribute__((noinline, noclone)) unsigned __int128
> +foo (void)
> +{
> +  unsigned __int128 x = (unsigned __int128) 0xULL;
> +  struct { unsigned __int128 a : 65; } w;
> +  w.a = x;
> +  w.a += x;
> +  return w.a;
> +}
> +
> +int
> +main ()
> +{
> +  unsigned __int128 x = foo ();
> +  if ((unsigned long long) x != 0xfffeULL
> +  || (unsigned long long) (x >> 64) != 1)
> +__builtin_abort ();
> +  return 0;
> +}
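
The two non-constant branches of reduce_to_bit_field_precision in the quoted
patch (mask for unsigned types, shift left then right for signed types) can
be sketched on 64-bit host integers.  This is a simplified model, not the
rtx-based code: reduce_unsigned and reduce_signed are made-up names, the
"mode" is fixed at 64 bits, and the signed variant relies on two's
complement conversion and arithmetic right shift, which C++20 guarantees
and earlier standards leave implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low PREC bits of VALUE (0 < PREC < 64), as the unsigned
// branch does with an AND mask.
std::uint64_t reduce_unsigned(std::uint64_t value, unsigned prec)
{
  assert(prec > 0 && prec < 64);
  std::uint64_t mask = (std::uint64_t{1} << prec) - 1;
  return value & mask;
}

// Sign-extend VALUE from PREC bits (0 < PREC < 64), as the signed branch
// does with a left shift followed by an arithmetic right shift.
std::int64_t reduce_signed(std::int64_t value, unsigned prec)
{
  assert(prec > 0 && prec < 64);
  unsigned count = 64 - prec;
  // Shift in the unsigned domain to avoid signed overflow on the left shift.
  std::uint64_t shifted = static_cast<std::uint64_t>(value) << count;
  return static_cast<std::int64_t>(shifted) >> count;
}
```

For a 4-bit field, reduce_unsigned keeps only the low four bits, while
reduce_signed maps the bit pattern 1111 to -1 and 1000 to -8.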


Re: [PATCH PR95804]Force reduction partition to be scheduled in the last

2020-07-08 Thread Richard Biener via Gcc-patches
On Tue, Jul 7, 2020 at 12:03 PM bin.cheng via Gcc-patches wrote:
>
> Hi,
> This is a followup fix for PR95638 which changed the way post order numbers
> are maintained for partition graph.  It missed one case that when SCC of
> reduction partition is broken by runtime alias checks, we do need to make
> sure the reduction partition be scheduled in the last.  This patch does
> this by forcing a negative post order to it.
>
> Bootstrap and test on x86_64, is it OK?

OK.

Richard.

> Thanks,
> bin


Re: [PATCH] Map filename from print in gfortran with -ffile-prefix-map (PR96069)

2020-07-08 Thread Yichao Yu via Gcc-patches
Forwarding to fort...@gcc.gnu.org as suggested by Dominique d'Humieres.

On Sun, Jul 5, 2020 at 9:29 PM Yichao Yu  wrote:
>
> > I think this remapping should happen with `file-prefix-map` but
> > shouldn't with `debug-prefix-map` (though if it happens for both it's
> > also not too bad) and I believe this patch is the minimum change to
> > achieve that. I think it makes sense to make this follow
> > `macro-prefix-map` although I'm not sure if this is a macro... (OTOH,
> > __builtin_FILE isn't a macro either so maybe it's fine?). I haven't
> > figured out how I can allow the option in gfortran or how to document
> > this new behavior though (e.g. I actually don't know what this is
> > called in fortran...)
>
> And here's a version that makes -fmacro-prefix-map a common option.
>
> ---
> diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
> index 9b6300f330f..6d105e24f16 100644
> --- a/gcc/c-family/c-opts.c
> +++ b/gcc/c-family/c-opts.c
> @@ -40,7 +40,6 @@ along with GCC; see the file COPYING3.  If not see
> #include "plugin.h"/* For PLUGIN_INCLUDE_FILE event.  */
> #include "mkdeps.h"
> #include "dumpfile.h"
> -#include "file-prefix-map.h"/* add_*_prefix_map()  */
>
> #ifndef DOLLARS_IN_IDENTIFIERS
> # define DOLLARS_IN_IDENTIFIERS true
> @@ -443,10 +442,6 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,
>   cpp_opts->dollars_in_ident = value;
>   break;
>
> -case OPT_fmacro_prefix_map_:
> -  add_macro_prefix_map (arg);
> -  break;
> -
> case OPT_ffreestanding:
>   value = !value;
>   /* Fall through.  */
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 89a58282b3f..bf9899d1aef 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -1517,10 +1517,6 @@ fdollars-in-identifiers
> C ObjC C++ ObjC++
> Permit '$' as an identifier character.
>
> -fmacro-prefix-map=
> -C ObjC C++ ObjC++ Joined RejectNegative
> --fmacro-prefix-map=<old>=<new> Map one directory name to another in __FILE__, __BASE_FILE__, and __builtin_FILE().
> -
> fdump-ada-spec
> C ObjC C++ ObjC++ RejectNegative Var(flag_dump_ada_spec)
> Write all declarations as Ada code transitively.
> diff --git a/gcc/common.opt b/gcc/common.opt
> index df8af365d1b..e018716af89 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1217,6 +1217,10 @@ fdebug-prefix-map=
> Common Joined RejectNegative Var(common_deferred_options) Defer
> -fdebug-prefix-map=<old>=<new> Map one directory name to another in debug information.
>
> +fmacro-prefix-map=
> +Common Joined RejectNegative Var(common_deferred_options) Defer
> +-fmacro-prefix-map=<old>=<new> Map one directory name to another in __FILE__, __BASE_FILE__, and __builtin_FILE().
> +
> ffile-prefix-map=
> Common Joined RejectNegative Var(common_deferred_options) Defer
> -ffile-prefix-map=<old>=<new>  Map one directory name to another in compilation result.
> diff --git a/gcc/fortran/trans-io.c b/gcc/fortran/trans-io.c
> index 21bdd5ef0d8..4d406493603 100644
> --- a/gcc/fortran/trans-io.c
> +++ b/gcc/fortran/trans-io.c
> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "trans-types.h"
> #include "trans-const.h"
> #include "options.h"
> +#include "file-prefix-map.h" /* remap_macro_filename()  */
>
> /* Members of the ioparm structure.  */
>
> @@ -1026,7 +1027,7 @@ set_error_locus (stmtblock_t * block, tree var, locus * where)
>TREE_TYPE (p->field), locus_file,
>p->field, NULL_TREE);
>   f = where->lb->file;
> -  str = gfc_build_cstring_const (f->filename);
> +  str = gfc_build_cstring_const (remap_macro_filename(f->filename));
>
>   str = gfc_build_addr_expr (pchar_type_node, str);
>   gfc_add_modify (block, locus_file, str);
> diff --git a/gcc/opts-global.c b/gcc/opts-global.c
> index b1a8429dc3c..574db430430 100644
> --- a/gcc/opts-global.c
> +++ b/gcc/opts-global.c
> @@ -380,6 +380,10 @@ handle_common_deferred_options (void)
>  add_debug_prefix_map (opt->arg);
>  break;
>
> +   case OPT_fmacro_prefix_map_:
> + add_macro_prefix_map (opt->arg);
> + break;
> +
>case OPT_ffile_prefix_map_:
>  add_file_prefix_map (opt->arg);
>  break;
> diff --git a/gcc/testsuite/gfortran.dg/pr96069.f90 b/gcc/testsuite/gfortran.dg/pr96069.f90
> new file mode 100644
> index 000..d7fed59a150
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/pr96069.f90
> @@ -0,0 +1,11 @@
> +! { dg-do compile }
> +! { dg-options "-fmacro-prefix-map==MACRO-PREFIX" }
> +
> +subroutine f(name)
> +  implicit none
> +  character*(*) name
> +  print *,name
> +  return
> +end subroutine f
> +
> +! { dg-final { scan-assembler ".string\t\"MACRO-PREFIX" } }
>
>
>
> >
> > ---
> >  gcc/fortran/trans-io.c|  3 ++-
> > gcc/testsuite/gfortran.dg/pr96069.f90 | 11 +++
> > 2 files changed, 13 insertions(+), 1 deletion(-)
> > create mode 100644 gcc/testsuite/gfortran.dg/pr96069.f90
> >
> > diff 
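
For readers less familiar with what -fmacro-prefix-map affects on the C side, here is a minimal sketch. It relies only on the option text quoted above, which says the mapping applies to __FILE__ and __builtin_FILE(). The function name file_names_agree is made up for illustration and is not part of the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 when __FILE__ and __builtin_FILE ()
   name the same file at this point in the source.  Both spellings are
   covered by -fmacro-prefix-map according to the option text, so they
   are expected to be remapped together.  */
int
file_names_agree (void)
{
  const char *from_macro = __FILE__;
  const char *from_builtin = __builtin_FILE ();

  assert (from_macro != NULL && from_builtin != NULL);
  return strcmp (from_macro, from_builtin) == 0;
}
```

Compiling a file like this with and without an -fmacro-prefix-map mapping for its directory, and comparing the strings in the generated assembly, is one way to observe the remapping. The Fortran test in the patch does the equivalent with scan-assembler.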

New Swedish PO file for 'gcc' (version 10.1.0)

2020-07-08 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-10.1.0.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: GCC 10.1.1 Status Report (2020-06-29)

2020-07-08 Thread Richard Biener
On Mon, 29 Jun 2020, Richard Biener wrote:

> 
> Status
> ==
> 
> The GCC 10 branch is in regression and documentation fixing mode.
> 
> We're close to two months after the GCC 10.1 release which means
> a first bugfix release is about to happen.  The plan is to release
> mid July and I am targeting for a release candidate mid next
> week, no later than July 17th.

So this sparked some confusion as "mid next week" is now but July 17th
is end of next week.  Thus I will do 10.1 RC1 next week July 15th.

Sorry for the confusion,
Richard.


git-hooks: integrate ChangeLog format check

2020-07-08 Thread Martin Liška

On 5/13/20 7:53 PM, Joseph Myers wrote:

  See the issues
I filed at https://github.com/AdaCore/git-hooks/issues  for the existing
custom GCC changes and the pull request
https://github.com/AdaCore/git-hooks/pull/12  to bring in implementations
of many of those features (not sure if it covers everything or not).


Hey Joseph.

If I see correctly, the pull request was merged.
Are we now in a position where we can separate the GCC-specific
ChangeLog check in gcc-hooks? If so, can you please guide me
a bit?

Thanks,
Martin


RE: [PATCH 2/4] aarch64: fix __builtin_eh_return with pac-ret [PR94891]

2020-07-08 Thread Kyrylo Tkachov
Hi Szabolcs,

> -Original Message-
> From: Gcc-patches  On Behalf Of
> Szabolcs Nagy
> Sent: 26 June 2020 15:49
> To: gcc-patches@gcc.gnu.org
> Cc: fwei...@redhat.com; Richard Earnshaw ;
> Daniel Kiss 
> Subject: Re: [PATCH 2/4] aarch64: fix __builtin_eh_return with pac-ret
> [PR94891]
> 
> The 06/05/2020 17:51, Szabolcs Nagy wrote:
> > The handler argument must not be signed since that may come from
> > outside the current module and exposing signed addresses is a pointer
> > ABI break. (The signed address also may not be representable as void *
> > which is why pac-ret is currently broken on ilp32.)
> >
> > There is no point protecting the eh return path with pointer auth
> > since arbitrary target can be reached with the instruction sequence
> > in the caller function anyway, however this is a big hammer solution
> > that turns off pac-ret for the caller completely not just on the eh
> > return path.
> >
> > 2020-06-04  Szabolcs Nagy  
> >
> > * config/aarch64/aarch64.c
> (aarch64_return_address_signing_enabled):
> > Disable return address signing if __builtin_eh_return is used.
> 
> ping.
> 
> this fixes a correctness bug in pac-ret, tested
> on aarch64, with only the following regressions:
> 
> FAIL: gcc.target/aarch64/return_address_sign_1.c scan-assembler-times
> autiasp 4
> FAIL: gcc.target/aarch64/return_address_sign_1.c scan-assembler-times
> paciasp 4
> FAIL: gcc.target/aarch64/return_address_sign_b_1.c scan-assembler-times
> autibsp 4
> FAIL: gcc.target/aarch64/return_address_sign_b_1.c scan-assembler-times
> pacibsp 4
> 
> which can be fixed by
> 
> -/* { dg-final { scan-assembler-times "autiasp" 4 } } */
> -/* { dg-final { scan-assembler-times "paciasp" 4 } } */
> +/* { dg-final { scan-assembler-times "autiasp" 3 } } */
> +/* { dg-final { scan-assembler-times "paciasp" 3 } } */
> 
> since __builtin_eh_return path no longer uses pac/aut.

This is ok but...

> 
> > ---
> >  gcc/config/aarch64/aarch64.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> > index 6a2f85c4af7..d9557f7c0a2 100644
> > --- a/gcc/config/aarch64/aarch64.c
> > +++ b/gcc/config/aarch64/aarch64.c
> > @@ -6954,6 +6954,10 @@ aarch64_return_address_signing_enabled (void)
> >/* This function should only be called after frame laid out.   */
> >gcc_assert (cfun->machine->frame.laid_out);
> >
> > +  /* TODO: Big hammer handling of __builtin_eh_return.  */

... I don't think this comment is very useful. Please make it a bit more 
descriptive. If you want to leave the TODO here, please give a more concrete 
action plan.

Thanks,
Kyrill

> > +  if (crtl->calls_eh_return)
> > +return false;
> > +
> >/* If signing scope is AARCH64_FUNCTION_NON_LEAF, we only sign a leaf
> function
> >   if its LR is pushed onto stack.  */
> >return (aarch64_ra_sign_scope == AARCH64_FUNCTION_ALL
> > --
> > 2.17.1
> >
> 
> --


RE: [PATCH 1/4] aarch64: fix return address access with pac [PR94891][PR94791]

2020-07-08 Thread Kyrylo Tkachov


> -Original Message-
> From: Gcc-patches  On Behalf Of
> Szabolcs Nagy
> Sent: 26 June 2020 15:44
> To: gcc-patches@gcc.gnu.org
> Cc: fwei...@redhat.com; Richard Earnshaw ;
> Daniel Kiss 
> Subject: Re: [PATCH 1/4] aarch64: fix return address access with pac
> [PR94891][PR94791]
> 
> The 06/05/2020 17:51, Szabolcs Nagy wrote:
> > This is a big hammer fix for __builtin_return_address (PR target/94891)
> > returning signed addresses (sometimes, depending on whether lr happens
> > to be signed or not at the time of call which depends on optimizations),
> > and similarly -pg may pass signed return address to _mcount
> > (PR target/94791).
> >
> > At the time of return address expansion we don't know if it's signed or
> > not so it is done unconditionally.
> >
> > I wonder if allocate_initial_value for the lr reg may solve this better
> > such that get_hard_reg_initial_val just gives the right (unsigned) value?
> >
> > 2020-06-04  Szabolcs Nagy  
> >
> > * config/aarch64/aarch64-protos.h (aarch64_return_addr_rtx):
> Declare.
> > * config/aarch64/aarch64.c (aarch64_return_addr_rtx): New.
> > (aarch64_return_addr): Use aarch64_return_addr_rtx.
> > * config/aarch64/aarch64.h (PROFILE_HOOK): Likewise.
> 
> ping.
> 

This looks ok to me.
Thanks,
Kyrill

> (this fixes a correctness bug in pac-ret, tested with no regressions).
> 
> > ---
> >  gcc/config/aarch64/aarch64-protos.h |  1 +
> >  gcc/config/aarch64/aarch64.c| 20 +++-
> >  gcc/config/aarch64/aarch64.h|  2 +-
> >  3 files changed, 21 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/config/aarch64/aarch64-protos.h
> b/gcc/config/aarch64/aarch64-protos.h
> > index 9e43adb7db0..723d9ba6ac6 100644
> > --- a/gcc/config/aarch64/aarch64-protos.h
> > +++ b/gcc/config/aarch64/aarch64-protos.h
> > @@ -578,6 +578,7 @@ int aarch64_vec_fpconst_pow_of_2 (rtx);
> >  rtx aarch64_eh_return_handler_rtx (void);
> >  rtx aarch64_mask_from_zextract_ops (rtx, rtx);
> >  const char *aarch64_output_move_struct (rtx *operands);
> > +rtx aarch64_return_addr_rtx (void);
> >  rtx aarch64_return_addr (int, rtx);
> >  rtx aarch64_simd_gen_const_vector_dup (machine_mode,
> HOST_WIDE_INT);
> >  bool aarch64_simd_mem_operand_p (rtx);
> > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> > index 6352d4ff78a..6a2f85c4af7 100644
> > --- a/gcc/config/aarch64/aarch64.c
> > +++ b/gcc/config/aarch64/aarch64.c
> > @@ -10819,6 +10819,24 @@ aarch64_initial_elimination_offset (unsigned
> from, unsigned to)
> >return cfun->machine->frame.frame_size;
> >  }
> >
> > +
> > +/* Get return address without mangling.  */
> > +
> > +rtx
> > +aarch64_return_addr_rtx (void)
> > +{
> > +  rtx val = get_hard_reg_initial_val (Pmode, LR_REGNUM);
> > +  /* Note: aarch64_return_address_signing_enabled only
> > + works after cfun->machine->frame.laid_out is set,
> > + so here we don't know if the return address will
> > + be signed or not.  */
> > +  rtx lr = gen_rtx_REG (Pmode, LR_REGNUM);
> > +  emit_move_insn (lr, val);
> > +  emit_insn (GEN_FCN (CODE_FOR_xpaclri) ());
> > +  return lr;
> > +}
> > +
> > +
> >  /* Implement RETURN_ADDR_RTX.  We do not support moving back to a
> > previous frame.  */
> >
> > @@ -10827,7 +10845,7 @@ aarch64_return_addr (int count, rtx frame
> ATTRIBUTE_UNUSED)
> >  {
> >if (count != 0)
> >  return const0_rtx;
> > -  return get_hard_reg_initial_val (Pmode, LR_REGNUM);
> > +  return aarch64_return_addr_rtx ();
> >  }
> >
> >
> > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> > index 2be52fd4d73..f11941bbc86 100644
> > --- a/gcc/config/aarch64/aarch64.h
> > +++ b/gcc/config/aarch64/aarch64.h
> > @@ -1112,7 +1112,7 @@ typedef struct
> >  #define PROFILE_HOOK(LABEL)                                    \
> >    {                                                            \
> >      rtx fun, lr;                                               \
> > -    lr = get_hard_reg_initial_val (Pmode, LR_REGNUM);          \
> > +    lr = aarch64_return_addr_rtx ();                           \
> >      fun = gen_rtx_SYMBOL_REF (Pmode, MCOUNT_NAME);             \
> >      emit_library_call (fun, LCT_NORMAL, VOIDmode, lr, Pmode);  \
> >    }
> > --
> > 2.17.1
> >
> 
> --
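
As a user-level illustration of what PR94891 is about, the sketch below reads the caller's address with __builtin_return_address (0). The function name caller_address is an assumption made for this example. On a pac-ret build for AArch64 the value read from lr may carry a signature, which is what the patch strips. On other targets the code simply returns the plain return address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper.  noinline keeps a real call, so there is a
   return address to read.  */
__attribute__ ((noinline)) void *
caller_address (void)
{
  void *ra = __builtin_return_address (0);

  /* GCC documents __builtin_extract_return_addr as the way to turn the
     value from __builtin_return_address into the actual address on
     targets that encode extra bits in it.  */
  void *plain = __builtin_extract_return_addr (ra);

  assert (plain != NULL);
  return plain;
}
```

Code that compares such addresses against symbol ranges, as unwinders and profilers do, is the kind of consumer that breaks when a signed address leaks out.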


Re: [PATCH] vect: Enhance condition check to use partial vectors in vectorizable_condition

2020-07-08 Thread Richard Sandiford
"Kewen.Lin"  writes:
> Hi,
>
> This patch is derived from the review of vector with length patch series.
> The length-based partial vector approach doesn't support reduction so far,
> so we would like to disable vectorization with partial vectors explicitly
> for it in vectorizable_condition.  Otherwise, it will cause some unexpected
> failures for a few cases like gcc.dg/vect/pr65947-2.c.
>
> But if we disable it for all cases except reduction_type equal to
> EXTRACT_LAST_REDUCTION, it causes one regression failure on aarch64:
>   gcc.target/aarch64/sve/reduc_8.c -march=armv8.2-a+sve
>
> Disabling it means the outer loop can't use partial vectors, so the
> check fails.  But that case is safe to handle.  As Richard S. pointed out
> in the review comments, the extra inactive lanes only matter for double
> reductions, so this patch permits vectorization with partial vectors
> for EXTRACT_LAST_REDUCTION and nested-cycle reductions.
>
> Testing is ongoing, is it ok for trunk if the testing goes well?
>
> BR,
> Kewen
> -
> gcc/ChangeLog:
>
>   * tree-vect-stmts.c (vectorizable_condition): Prohibit vectorization
>   with partial vectors explicitly excepting for EXTRACT_LAST_REDUCTION
>   or nested-cycle reduction.
>
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 40e2664f93b..c23520aceab 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -9968,11 +9968,16 @@ vectorizable_condition (vec_info *vinfo,
> return false;
>   }
>  
> -  if (loop_vinfo
> -   && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> -   && reduction_type == EXTRACT_LAST_REDUCTION)
> - vect_record_loop_mask (loop_vinfo, &LOOP_VINFO_MASKS (loop_vinfo),
> -ncopies * vec_num, vectype, NULL);
> +  if (loop_vinfo && for_reduction
> +   && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
> + {
> +   if (reduction_type == EXTRACT_LAST_REDUCTION)
> + vect_record_loop_mask (loop_vinfo, &LOOP_VINFO_MASKS (loop_vinfo),
> +ncopies * vec_num, vectype, NULL);
> +   /* Extra inactive lanes should be safe for vect_nested_cycle.  */
> +   else if (STMT_VINFO_DEF_TYPE (reduc_info) != vect_nested_cycle)
> + LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;

We should print a dump message when setting this to false.  E.g.:

  if (dump_enabled_p ())
    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                     "conditional reduction prevents the use"
                     " of partial vectors\n");

OK with that change, thanks.

Richard
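
For reference, a conditional reduction of the kind discussed here can look like the sketch below. The loop body, the constant N and the start value 66 are assumptions made for illustration, and demo_condition_reduction is a made-up driver. The point is only that the result is the value from the last iteration in which the condition held, which is what an extract-last style reduction has to compute.

```c
#include <assert.h>

#define N 32

/* Returns the index of the last element below MIN_V, or 66 if no
   element qualifies.  */
int
condition_reduction (const int *a, int min_v)
{
  int last = 66;

  for (int i = 0; i < N; i++)
    if (a[i] < min_v)
      last = i;
  return last;
}

/* Hypothetical driver: only elements 5 and 20 are below 0, so the
   last matching index is 20.  */
int
demo_condition_reduction (void)
{
  int a[N] = { 0 };

  a[5] = -1;
  a[20] = -3;
  int last = condition_reduction (a, 0);
  assert (last == 20);
  return last;
}
```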


Re: [PATCH] vect/testsuite: Adjust dumping for fully masking decision

2020-07-08 Thread Richard Sandiford
"Kewen.Lin"  writes:
> Hi,
>
> As Richard S. suggested in the review of vector with length patch
> series, we can use one message on "partial vectors" instead of
> "fully with masking".  This patch is to update the dumping string
> and related test cases.
>
> Bootstrapped/regtested on aarch64-linux-gnu.
>
> Is it ok for trunk?
>
> BR,
> Kewen
> -
> gcc/ChangeLog:
>
>   * tree-vect-loop.c (vect_analyze_loop_2): Update dumping string
>   for fully masking to be more common.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/sve/clastb_1.c: Update dumping string.
>   * gcc.target/aarch64/sve/clastb_2.c: Likewise.
>   * gcc.target/aarch64/sve/clastb_3.c: Likewise.
>   * gcc.target/aarch64/sve/clastb_4.c: Likewise.
>   * gcc.target/aarch64/sve/clastb_5.c: Likewise.
>   * gcc.target/aarch64/sve/clastb_6.c: Likewise.
>   * gcc.target/aarch64/sve/clastb_7.c: Likewise.

OK, thanks.

Richard

>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clastb_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clastb_1.c
> index d3ea52dea47..f4445d443ac 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clastb_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clastb_1.c
> @@ -17,5 +17,5 @@ condition_reduction (int *a, int min_v)
>return last;
>  }
>  
> -/* { dg-final { scan-tree-dump "using a fully-masked loop." "vect" } } */
> +/* { dg-final { scan-tree-dump "operating on partial vectors." "vect" } } */
>  /* { dg-final { scan-assembler {\tclastb\ts[0-9]+, p[0-7], s[0-9]+, 
> z[0-9]+\.s} } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clastb_2.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clastb_2.c
> index c222b707912..27d4cd94a3c 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clastb_2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clastb_2.c
> @@ -23,5 +23,5 @@ condition_reduction (TYPE *a, TYPE min_v)
>return last;
>  }
>  
> -/* { dg-final { scan-tree-dump "using a fully-masked loop." "vect" } } */
> +/* { dg-final { scan-tree-dump "operating on partial vectors." "vect" } } */
>  /* { dg-final { scan-assembler {\tclastb\ts[0-9]+, p[0-7], s[0-9]+, 
> z[0-9]+\.s} } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clastb_3.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clastb_3.c
> index 5aaa71f948d..597f8268413 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clastb_3.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clastb_3.c
> @@ -5,5 +5,5 @@
>  
>  #include "clastb_2.c"
>  
> -/* { dg-final { scan-tree-dump "using a fully-masked loop." "vect" } } */
> +/* { dg-final { scan-tree-dump "operating on partial vectors." "vect" } } */
>  /* { dg-final { scan-assembler {\tclastb\tb[0-9]+, p[0-7], b[0-9]+, 
> z[0-9]+\.b} } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clastb_4.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clastb_4.c
> index b4db170ea06..788e29fe982 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clastb_4.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clastb_4.c
> @@ -5,5 +5,5 @@
>  
>  #include "clastb_2.c"
>  
> -/* { dg-final { scan-tree-dump "using a fully-masked loop." "vect" } } */
> +/* { dg-final { scan-tree-dump "operating on partial vectors." "vect" } } */
>  /* { dg-final { scan-assembler {\tclastb\tw[0-9]+, p[0-7], w[0-9]+, 
> z[0-9]+\.h} } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clastb_5.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clastb_5.c
> index 28d40a01f93..0e7e20b69ec 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clastb_5.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clastb_5.c
> @@ -5,5 +5,5 @@
>  
>  #include "clastb_2.c"
>  
> -/* { dg-final { scan-tree-dump "using a fully-masked loop." "vect" } } */
> +/* { dg-final { scan-tree-dump "operating on partial vectors." "vect" } } */
>  /* { dg-final { scan-assembler {\tclastb\td[0-9]+, p[0-7], d[0-9]+, 
> z[0-9]+\.d} } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clastb_6.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clastb_6.c
> index 38632a21be1..b9d687e0597 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clastb_6.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clastb_6.c
> @@ -21,5 +21,5 @@ condition_reduction (TYPE *a, TYPE min_v)
>return last;
>  }
>  
> -/* { dg-final { scan-tree-dump "using a fully-masked loop." "vect" } } */
> +/* { dg-final { scan-tree-dump "operating on partial vectors." "vect" } } */
>  /* { dg-final { scan-assembler {\tclastb\ts[0-9]+, p[0-7], s[0-9]+, 
> z[0-9]+\.s} } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clastb_7.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clastb_7.c
> index e5307d2edc8..a9f1a6aea98 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clastb_7.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clastb_7.c
> @@ -4,5 +4,5 @@
>  #define TYPE double
>  #include "clastb_6.c"
>  
> -/* { dg-final { scan-tree-dump "using a fully-masked loop." "vect" } } */
> +/* { dg-final { scan-tree-dump "operating on partial vectors." "vect" } } */
>  /* { dg-final { scan-assembler 

Re: [PATCH 5/7 v6] vect: Support vector load/store with length in vectorizer

2020-07-08 Thread Richard Sandiford
"Kewen.Lin"  writes:
>> […]
>>> I tested the updated patch with this releasing, LOOP_VINFO_PEELING_FOR_GAPS
>>> part looks fine, but LOOP_VINFO_PEELING_FOR_ALIGNMENT caused one case to
>>> fail at execution during vect-partial-vector-usage=2.  So far the patch
>>> doesn't handle any niters_skip cases.  I think if we want to support it, 
>>> we have to add some handlings in/like what we have for masking, such as: 
>>> mask_skip_niters, vect_prepare_for_masked_peels etc.  
>>>
>>> Do you prefer me to extend the support in this patch series?
>> 
>> It's not so much whether it has to be supported now, but more why
>> it doesn't work now.  What was the reason for the failure?
>> 
>> The peeling-with-masking thing is just an optimisation, so that we
>> can vectorise the peeled iterations rather than falling back to
>> scalar code for them.  It shouldn't be needed for correctness.
>> 
>
> Whoops, thanks for the clarification!  Nice, I just realized it's a way to
> adopt partial vectors for prologue.  The fail case is 
> gcc.dg/vect/vect-ifcvt-11.c.
> There the first iteration is optimized out due to the known AND result of
> IV 0, then it tries to peel 3 iterations, the number of remaining iterations
> for vectorization body is expected to be 12.  But it still uses 15 and causes
> out-of-bound access.
>
> The below fix can fix the failure.  The justification is that we need to use
> the fixed up niters after peeling prolog for the vectorization body for
> partial vectors.  I'm not sure why the other cases not using partial vectors 
> don't need the fixed up niters, to avoid troubles I guarded it with 
> LOOP_VINFO_USING_PARTIAL_VECTORS_P explicitly.

I think the reason is that if we're peeling prologue iterations and
the total number of iterations isn't fixed, full-vector vectorisation
will “almost always” need an epilogue loop too, and in that case
niters_vector will be nonnull.

But that's not guaranteed to be true forever.  E.g. if the start
pointers have a known misalignment that require peeling a constant
number of iterations N, and if we can prove (using enhanced range/
nonzero-bits information) that the way niters is calculated means
that niter - N is a multiple of the vector size, we could peel
the prologue and not the epilogue.  In that case, what your patch
does would be correct.

So…

> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -,6 +8896,11 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple 
> *loop_vectorized_call)
>LOOP_VINFO_INT_NITERS (loop_vinfo) / lowest_vf);
> step_vector = build_one_cst (TREE_TYPE (niters));
>   }
> +  else if (LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo)
> +   && !vect_use_loop_mask_for_alignment_p (loop_vinfo))
> +   vect_gen_vector_loop_niters (loop_vinfo, LOOP_VINFO_NITERS (loop_vinfo),
> + &niters_vector, &step_vector,
> + niters_no_overflow);
>else
>   vect_gen_vector_loop_niters (loop_vinfo, niters, &niters_vector,
>&step_vector, niters_no_overflow);

…I think we should drop the LOOP_VINFO_USING_PARTIAL_VECTORS_P
condition.  Could you also add a comment above the new call saying:

   /* vect_do_peeling subtracted the number of peeled prologue
  iterations from LOOP_VINFO_NITERS.  */

It wasn't obvious to me where the update was happening when I first
looked at the code.

Very minor, but maybe also switch the last two cases round so that
“else” is the default behaviour and the “if”s are the exceptions.

OK with those changes, thanks.

Richard
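
The arithmetic behind the vect-ifcvt-11.c failure can be sketched as follows, using the numbers from the message above: 15 iterations in total, 3 peeled for the prologue, 12 left for the vector body. The helper names are made up, and the model assumes that a partial-vector body handles exactly the iteration count it is given.

```c
#include <assert.h>

/* Scalar iterations left for the vector body once PEELED prologue
   iterations have run.  */
unsigned
remaining_niters (unsigned total, unsigned peeled)
{
  assert (peeled <= total);
  return total - peeled;
}

/* The vector body starts right after the prologue and covers
   BODY_NITERS scalar iterations.  Returns 1 if that stays within the
   TOTAL iterations of the original loop, 0 if it runs past the end.  */
int
body_in_bounds (unsigned total, unsigned peeled, unsigned body_niters)
{
  return peeled + body_niters <= total;
}
```

With these helpers, body_in_bounds (15, 3, remaining_niters (15, 3)) holds, while body_in_bounds (15, 3, 15) does not. The second call corresponds to the out-of-bounds access seen when the niters value from before peeling is reused.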


[committed] wwwdocs: Switch www.doxygen.nl to https.

2020-07-08 Thread Gerald Pfeifer
---
 htdocs/codingconventions.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/codingconventions.html b/htdocs/codingconventions.html
index a08ddcbb..c0377315 100644
--- a/htdocs/codingconventions.html
+++ b/htdocs/codingconventions.html
@@ -752,7 +752,7 @@ outside the build system are discouraged, and should be 
sent upstream
 first. 
 
 libstdc++-v3:  In docs/doxygen, comments in *.cfg.in are
-partially autogenerated from http://www.doxygen.nl;>the
+partially autogenerated from https://www.doxygen.nl;>the
 Doxygen tool.  In docs/html, the ext/lwg-* files are copied from http://www.open-std.org/jtc1/sc22/wg21/;>the C++ committee homepage,
 the 27_io/binary_iostream_* files are copies of Usenet postings, and most
-- 
2.27.0


Re: [PATCH] compute and check alignment info during analysis

2020-07-08 Thread Richard Sandiford
Richard Biener  writes:
> This moves querying the alignment support scheme from load/store
> transform time to get_load_store_type where we should know best
> what alignment constraints we actually need.  This should make
> verify_data_ref_alignment obsolete which prematurely disqualifies
> all vectorization IMHO.

Nice.  LGTM FWIW.

> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
>
> 2020-07-08  Richard Biener  
>
>   * tree-vect-stmts.c (get_group_load_store_type): Pass
>   in the SLP node and the alignment support scheme output.
>   Set that.
>   (get_load_store_type): Likewise.
>   (vectorizable_store): Adjust.
>   (vectorizable_load): Likewise.
> ---
>  gcc/tree-vect-stmts.c | 72 ++-
>  1 file changed, 50 insertions(+), 22 deletions(-)
>
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index fcae3ef5f35..cec5c601268 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -2058,9 +2058,10 @@ vector_vector_composition_type (tree vtype, 
> poly_uint64 nelts, tree *ptype)
>  
>  static bool
>  get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
> -tree vectype, bool slp,
> +tree vectype, slp_tree slp_node,
>  bool masked_p, vec_load_store_type vls_type,
>  vect_memory_access_type *memory_access_type,
> +dr_alignment_support *alignment_support_scheme,
>  gather_scatter_info *gs_info)
>  {
>loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
> @@ -2089,10 +2090,15 @@ get_group_load_store_type (vec_info *vinfo, 
> stmt_vec_info stmt_info,
>gcc_assert (!STMT_VINFO_STRIDED_P (first_stmt_info) || gap == 0);
>  
>/* Stores can't yet have gaps.  */
> -  gcc_assert (slp || vls_type == VLS_LOAD || gap == 0);
> +  gcc_assert (slp_node || vls_type == VLS_LOAD || gap == 0);
>  
> -  if (slp)
> +  if (slp_node)
>  {
> +  /* For SLP vectorization we directly vectorize a subchain
> +  without permutation.  */
> +  if (! SLP_TREE_LOAD_PERMUTATION (slp_node).exists ())
> + first_dr_info
> +   = STMT_VINFO_DR_INFO (SLP_TREE_SCALAR_STMTS (slp_node)[0]);
>if (STMT_VINFO_STRIDED_P (first_stmt_info))
>   {
> /* Try to use consecutive accesses of DR_GROUP_SIZE elements,
> @@ -2232,6 +2238,13 @@ get_group_load_store_type (vec_info *vinfo, 
> stmt_vec_info stmt_info,
>   *memory_access_type = VMAT_GATHER_SCATTER;
>  }
>  
> +  if (*memory_access_type == VMAT_GATHER_SCATTER
> +  || *memory_access_type == VMAT_ELEMENTWISE)
> +*alignment_support_scheme = dr_unaligned_supported;
> +  else
> +*alignment_support_scheme
> +  = vect_supportable_dr_alignment (vinfo, first_dr_info, false);
> +
>if (vls_type != VLS_LOAD && first_stmt_info == stmt_info)
>  {
>/* STMT is the leader of the group. Check the operands of all the
> @@ -2268,7 +2281,9 @@ get_group_load_store_type (vec_info *vinfo, 
> stmt_vec_info stmt_info,
>  /* Analyze load or store statement STMT_INFO of type VLS_TYPE.  Return true
> if there is a memory access type that the vectorized form can use,
> storing it in *MEMORY_ACCESS_TYPE if so.  If we decide to use gathers
> -   or scatters, fill in GS_INFO accordingly.
> +   or scatters, fill in GS_INFO accordingly.  In addition
> +   *ALIGNMENT_SUPPORT_SCHEME is filled out and false is returned if
> +   the target does not support the alignment scheme.
>  
> SLP says whether we're performing SLP rather than loop vectorization.
> MASKED_P is true if the statement is conditional on a vectorized mask.
> @@ -2277,10 +2292,11 @@ get_group_load_store_type (vec_info *vinfo, 
> stmt_vec_info stmt_info,
>  
>  static bool
>  get_load_store_type (vec_info  *vinfo, stmt_vec_info stmt_info,
> -  tree vectype, bool slp,
> +  tree vectype, slp_tree slp_node,
>bool masked_p, vec_load_store_type vls_type,
>unsigned int ncopies,
>vect_memory_access_type *memory_access_type,
> +  dr_alignment_support *alignment_support_scheme,
>gather_scatter_info *gs_info)
>  {
>loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
> @@ -2300,22 +2316,29 @@ get_load_store_type (vec_info  *vinfo, stmt_vec_info 
> stmt_info,
>vls_type == VLS_LOAD ? "gather" : "scatter");
> return false;
>   }
> +  /* Gather-scatter accesses perform only component accesses, alignment
> +  is irrelevant for them.  */
> +  *alignment_support_scheme = dr_unaligned_supported;
>  }
>else if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
>  {
> -  if (!get_group_load_store_type (vinfo, stmt_info, vectype, slp, 
> masked_p,
> -   vls_type, memory_access_type, gs_info))
> +  if (!get_group_load_store_type 

[PATCH] compute and check alignment info during analysis

2020-07-08 Thread Richard Biener
This moves querying the alignment support scheme from load/store
transform time to get_load_store_type where we should know best
what alignment constraints we actually need.  This should make
verify_data_ref_alignment obsolete which prematurely disqualifies
all vectorization IMHO.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

2020-07-08  Richard Biener  

* tree-vect-stmts.c (get_group_load_store_type): Pass
in the SLP node and the alignment support scheme output.
Set that.
(get_load_store_type): Likewise.
(vectorizable_store): Adjust.
(vectorizable_load): Likewise.
---
 gcc/tree-vect-stmts.c | 72 ++-
 1 file changed, 50 insertions(+), 22 deletions(-)

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index fcae3ef5f35..cec5c601268 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -2058,9 +2058,10 @@ vector_vector_composition_type (tree vtype, poly_uint64 
nelts, tree *ptype)
 
 static bool
 get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
-  tree vectype, bool slp,
+  tree vectype, slp_tree slp_node,
   bool masked_p, vec_load_store_type vls_type,
   vect_memory_access_type *memory_access_type,
+  dr_alignment_support *alignment_support_scheme,
   gather_scatter_info *gs_info)
 {
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
@@ -2089,10 +2090,15 @@ get_group_load_store_type (vec_info *vinfo, 
stmt_vec_info stmt_info,
   gcc_assert (!STMT_VINFO_STRIDED_P (first_stmt_info) || gap == 0);
 
   /* Stores can't yet have gaps.  */
-  gcc_assert (slp || vls_type == VLS_LOAD || gap == 0);
+  gcc_assert (slp_node || vls_type == VLS_LOAD || gap == 0);
 
-  if (slp)
+  if (slp_node)
 {
+  /* For SLP vectorization we directly vectorize a subchain
+without permutation.  */
+  if (! SLP_TREE_LOAD_PERMUTATION (slp_node).exists ())
+   first_dr_info
+ = STMT_VINFO_DR_INFO (SLP_TREE_SCALAR_STMTS (slp_node)[0]);
   if (STMT_VINFO_STRIDED_P (first_stmt_info))
{
  /* Try to use consecutive accesses of DR_GROUP_SIZE elements,
@@ -2232,6 +2238,13 @@ get_group_load_store_type (vec_info *vinfo, 
stmt_vec_info stmt_info,
*memory_access_type = VMAT_GATHER_SCATTER;
 }
 
+  if (*memory_access_type == VMAT_GATHER_SCATTER
+  || *memory_access_type == VMAT_ELEMENTWISE)
+*alignment_support_scheme = dr_unaligned_supported;
+  else
+*alignment_support_scheme
+  = vect_supportable_dr_alignment (vinfo, first_dr_info, false);
+
   if (vls_type != VLS_LOAD && first_stmt_info == stmt_info)
 {
   /* STMT is the leader of the group. Check the operands of all the
@@ -2268,7 +2281,9 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info 
stmt_info,
 /* Analyze load or store statement STMT_INFO of type VLS_TYPE.  Return true
if there is a memory access type that the vectorized form can use,
storing it in *MEMORY_ACCESS_TYPE if so.  If we decide to use gathers
-   or scatters, fill in GS_INFO accordingly.
+   or scatters, fill in GS_INFO accordingly.  In addition
+   *ALIGNMENT_SUPPORT_SCHEME is filled out and false is returned if
+   the target does not support the alignment scheme.
 
SLP says whether we're performing SLP rather than loop vectorization.
MASKED_P is true if the statement is conditional on a vectorized mask.
@@ -2277,10 +2292,11 @@ get_group_load_store_type (vec_info *vinfo, 
stmt_vec_info stmt_info,
 
 static bool
 get_load_store_type (vec_info  *vinfo, stmt_vec_info stmt_info,
-tree vectype, bool slp,
+tree vectype, slp_tree slp_node,
 bool masked_p, vec_load_store_type vls_type,
 unsigned int ncopies,
 vect_memory_access_type *memory_access_type,
+dr_alignment_support *alignment_support_scheme,
 gather_scatter_info *gs_info)
 {
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
@@ -2300,22 +2316,29 @@ get_load_store_type (vec_info  *vinfo, stmt_vec_info 
stmt_info,
 vls_type == VLS_LOAD ? "gather" : "scatter");
  return false;
}
+  /* Gather-scatter accesses perform only component accesses, alignment
+is irrelevant for them.  */
+  *alignment_support_scheme = dr_unaligned_supported;
 }
   else if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
 {
-  if (!get_group_load_store_type (vinfo, stmt_info, vectype, slp, masked_p,
- vls_type, memory_access_type, gs_info))
+  if (!get_group_load_store_type (vinfo, stmt_info, vectype, slp_node,
+ masked_p,
+ vls_type, memory_access_type,
+ 
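
The decision added at the end of get_group_load_store_type can be summarised by the sketch below. The enums are simplified stand-ins, not GCC's real vect_memory_access_type or dr_alignment_support, and the target_answer parameter stands in for the result of vect_supportable_dr_alignment. All names in the block are hypothetical.

```c
#include <assert.h>

/* Simplified stand-ins for the vectorizer's enums.  */
enum access_type
{
  ACCESS_CONTIGUOUS,
  ACCESS_ELEMENTWISE,
  ACCESS_GATHER_SCATTER
};

enum align_scheme
{
  ALIGN_UNALIGNED_UNSUPPORTED,
  ALIGN_UNALIGNED_SUPPORTED,
  ALIGN_ALIGNED
};

/* Element-wise and gather/scatter accesses touch one component at a
   time, so vector alignment does not matter for them.  Every other
   access type takes whatever the target query reported.  */
enum align_scheme
choose_align_scheme (enum access_type access, enum align_scheme target_answer)
{
  if (access == ACCESS_GATHER_SCATTER || access == ACCESS_ELEMENTWISE)
    return ALIGN_UNALIGNED_SUPPORTED;
  return target_answer;
}

/* Hypothetical driver exercising both branches.  */
void
demo_choose_align_scheme (void)
{
  assert (choose_align_scheme (ACCESS_ELEMENTWISE,
                               ALIGN_UNALIGNED_UNSUPPORTED)
          == ALIGN_UNALIGNED_SUPPORTED);
  assert (choose_align_scheme (ACCESS_CONTIGUOUS, ALIGN_ALIGNED)
          == ALIGN_ALIGNED);
}
```

Doing this at analysis time means the caller can reject an unsupported scheme before any transformation starts, which is the stated aim of the patch.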

Re: OMP: zero-length array sections

2020-07-08 Thread Jakub Jelinek via Gcc-patches
On Wed, Jul 08, 2020 at 12:21:57PM +0200, Thomas Schwinge wrote:
> Andrew and I are currently trying to sort out zero-length array sections
> behavior in the OpenACC context.  From that fell out a testcase that I
> created to learn/verify (some of?) the OpenMP behavior.  OK to push?
> 
> 
> Oh, and any thought on this:
> 
> int dev = omp_get_default_device();
> bool shared_mem = false;
> #pragma omp target map(alloc:shared_mem)
> shared_mem = true;
> #if 1 //TODO
> /* 'omp_target_is_present' (and a few others) have behavior different from
>    most others, where 'devicep == NULL' is handled the same as
>    'device_num == GOMP_DEVICE_HOST_FALLBACK'.  Is that difference intentional?  */
> /* Given that GCC doesn't support shared-memory offload devices, we fake the
>    expected thing as follows.  */
> if (shared_mem)
>   dev = omp_get_initial_device(); // GOMP_DEVICE_HOST_FALLBACK
> #endif

The difference is in quite unclear cases, where the device is either not
valid (outside of range) or the device used to be valid, but has been
unloaded already.
I don't remember why I did it that way; we could just as well treat the
device == NULL cases as fallback too, but any expectations about how it will
behave in those cases are wrong IMHO.

>   int *a = malloc(n * sizeof *a);
>   assert(a);

I'd probably just return 0; from the test if a is NULL instead of asserting;
that is not something invalid.
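
A minimal sketch of that suggestion, assuming a hypothetical helper
run_with_buffer that is not part of the posted testcase: a failed allocation
makes the helper return 0 (nothing to test) rather than tripping an assert.

```c
#include <stdlib.h>

/* Hypothetical helper, for illustration only.  Returns 0 when the check
   passes or when memory could not be allocated, and 1 on a real failure.  */
int
run_with_buffer (int n)
{
  int *a = malloc (n * sizeof *a);
  if (a == NULL)
    return 0;	/* Out of memory is not a test failure; skip quietly.  */

  for (int i = 0; i < n; ++i)
    a[i] = -i;

  /* Stand-in for the real test body: verify the last element.  */
  int failed = (n > 0 && a[n - 1] != -(n - 1));

  free (a);
  return failed;
}
```

In the posted testcase the same idea would amount to replacing assert(a)
with an early return 0 from main.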

Otherwise LGTM.

Jakub



OMP: zero-length array sections

2020-07-08 Thread Thomas Schwinge
Hi Jakub!

Andrew and I are currently trying to sort out zero-length array sections
behavior in the OpenACC context.  From that fell out a testcase that I
created to learn/verify (some of?) the OpenMP behavior.  OK to push?


Oh, and any thought on this:

int dev = omp_get_default_device();
bool shared_mem = false;
#pragma omp target map(alloc:shared_mem)
shared_mem = true;
#if 1 //TODO
/* 'omp_target_is_present' (and a few others) have behavior different from
   most others, where 'devicep == NULL' is handled the same as 'device_num ==
   GOMP_DEVICE_HOST_FALLBACK'.  Is that difference intentional?  */
/* Given that GCC doesn't support shared-memory offload devices, we fake the
   expected thing as follows.  */
if (shared_mem)
  dev = omp_get_initial_device(); // GOMP_DEVICE_HOST_FALLBACK
#endif


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter

#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <omp.h>

static int dev;
static bool shared_mem = false;

static void test(bool mapped_outside, int *a, int n_map, int n_iterate)
{
#pragma omp target enter data map(alloc:a[0:n_map])
  if (!mapped_outside && n_map == 0)
    assert(!!omp_target_is_present(a, dev) == shared_mem);
  else
    assert(omp_target_is_present(a, dev));

  {
    bool a_NULL;
#pragma omp target map(from:a[0:n_map]) map(from:a_NULL)
    {
      a_NULL = (a == NULL);

      for (int i = 0; i < n_iterate; ++i)
	a[i] = n_iterate + i;
    }
    if (!shared_mem && !mapped_outside && n_map == 0)
      assert(a_NULL);
    else
      assert(!a_NULL);
  }

#pragma omp target exit data map(from:a[0:n_map])
  if (mapped_outside)
    assert(omp_target_is_present(a, dev));
  else
    assert(!!omp_target_is_present(a, dev) == shared_mem);
}

int main(int argc, char *argv[])
{
  dev = omp_get_default_device();
#pragma omp target map(alloc:shared_mem)
  shared_mem = true;
#if 1 //TODO
  /* 'omp_target_is_present' (and a few others) have behavior different from
     most others, where 'devicep == NULL' is handled the same as 'device_num ==
     GOMP_DEVICE_HOST_FALLBACK'.  Is that difference intentional?  */
  /* Given that GCC doesn't support shared-memory offload devices, we fake the
     expected thing as follows.  */
  if (shared_mem)
    dev = omp_get_initial_device(); // GOMP_DEVICE_HOST_FALLBACK
#endif

  const int n = 38;

  int *a = malloc(n * sizeof *a);
  assert(a);
  assert(!!omp_target_is_present(a, dev) == shared_mem);

  for (int i = 0; i < n; ++i)
    a[i] = -i;
  test(false, a, n, n);
  for (int i = 0; i < n; ++i)
    assert(a[i] == n + i);

  for (int i = 0; i < n; ++i)
    a[i] = -i;
  test(false, a, 0, 0);
  for (int i = 0; i < n; ++i)
    assert(a[i] == -i);

  for (int i = 0; i < n; ++i)
    a[i] = -i;
#pragma omp target enter data map(alloc:a[0:n])
  assert(omp_target_is_present(a, dev));
  test(true, a, n, n);
  for (int i = 0; i < n; ++i)
    if (shared_mem)
      assert(a[i] == n + i);
    else
      assert(a[i] == -i);
#pragma omp target exit data map(from:a[0:n])
  assert(!!omp_target_is_present(a, dev) == shared_mem);
  for (int i = 0; i < n; ++i)
    assert(a[i] == n + i);

  for (int i = 0; i < n; ++i)
    a[i] = -i;
#pragma omp target enter data map(alloc:a[0:n])
  assert(omp_target_is_present(a, dev));
  test(true, a, 0, n);
  for (int i = 0; i < n; ++i)
    if (shared_mem)
      assert(a[i] == n + i);
    else
      assert(a[i] == -i);
#pragma omp target exit data map(from:a[0:n])
  assert(!!omp_target_is_present(a, dev) == shared_mem);
  for (int i = 0; i < n; ++i)
    assert(a[i] == n + i);

  free(a);

  return 0;
}


Re: [PATCH] libgomp: Add OMPD process functions and datatypes.

2020-07-08 Thread Jakub Jelinek via Gcc-patches
On Tue, Jul 07, 2020 at 02:52:37PM -0400, y2s1982 . via Gcc-patches wrote:
> I have re-read the documentation trying to find a different solution.
> In particular, ompd_device_initialize states that
> ompd_device_t kind, ompd_size_t sizeof_id, and void *id represents
> a device identifier. To dig further, I read up on the ompd_device_t. A
> passage from ompd_device_t says that the OMPD library and a tool that uses
> it must agree on the format of the object that is passed.
> It also says that ompd_device_t is a pointer to where the device identifier
> is stored and the size of the device identifier. I am not sure how this
> applies to ompd_device_initialize, as those two pieces of information seem
> to be supplied separately: *id and sizeof_id. In fact, ompd-type.h provides 4
> examples, 2 of which are host and cuda, and they all simply contain unique
> numerical values.  So does this mean that I should just decide on what the
> library and tool will use for device id data type and simply stick to it?
> 
> Otherwise, Is it possible to know the proper data type to cast the void *id
> based on the device type (host/cuda)?

Looking at libompd, it ignores sizeof_id completely and saves id (the
pointer) in its device context, but doesn't ever do anything with it, so
effectively ignores it completely.

I think starting with devices is not a good idea; just return failure from
ompd_device_initialize for now and get back to it much later, when
handling of parallel, host teams, task etc. is all done, because
communicating with devices will also need communication with the
libgomp plugins.
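
For illustration only, a stub along those lines might look as follows.  All
sketch_* names and values are hypothetical stand-ins so that the fragment
compiles by itself; a real implementation would use the OMPD types and
return codes from the OpenMP tools header, and the parameter list below
only mirrors the arguments mentioned in this thread (kind, sizeof_id, id).

```c
#include <stddef.h>

/* Stand-in declarations (assumptions, not the real OMPD definitions).  */
typedef enum { sketch_rc_ok = 0, sketch_rc_unsupported = 1 } sketch_rc_t;
typedef struct sketch_handle sketch_handle_t;
typedef struct sketch_context sketch_context_t;
typedef unsigned long long sketch_device_t;
typedef unsigned long long sketch_size_t;

/* Device support is deferred: accept the arguments, touch none of them,
   and report failure so a tool knows no device handle was created.  */
sketch_rc_t
sketch_device_initialize (sketch_handle_t *process_handle,
			  sketch_context_t *device_context,
			  sketch_device_t kind, sketch_size_t sizeof_id,
			  void *id, sketch_handle_t **device_handle)
{
  (void) process_handle;
  (void) device_context;
  (void) kind;
  (void) sizeof_id;
  (void) id;

  /* Never leave the output handle uninitialized.  */
  if (device_handle != NULL)
    *device_handle = NULL;

  return sketch_rc_unsupported;
}
```

Clearing the output handle before returning keeps a caller that ignores the
return code from picking up a stale pointer.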

Jakub



Re: [patch] Make memory copy functions scalar storage order barriers

2020-07-08 Thread Eric Botcazou
[Sorry for dropping the ball here]

> But GCC does not see the reverse storage order in mymemcpy so
> it happily folds the memcpy inside it, inlines the result and then?

You're right, this breaks, hence the following alternative: either we prevent 
inlining from happening, or we declare that this is simply not supported and 
warn (there is a -Wscalar-storage-order warning for problematic constructs).

I didn't find any existing infrastructure for the former and I'm not sure it's 
worth adding, so the attached implements the latter.  Tested on x86-64/Linux.
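
To make the construct concrete, here is a rough sketch that assumes GCC's
scalar_storage_order type attribute.  The struct and function names are made
up for illustration and do not come from the attached patch or its tests.
A direct memcpy call is the built-in case that the patch leaves unwarned,
while the commented-out wrapper call is the kind of argument passing that
the new -Wscalar-storage-order check is meant to flag.

```c
#include <string.h>

/* Same layout, different scalar storage order (GCC extension).  */
struct __attribute__ ((scalar_storage_order ("big-endian"))) be_rec
{
  unsigned int v;
};

struct native_rec
{
  unsigned int v;
};

/* Copy the raw bytes of a big-endian record into a native-order record.
   memcpy is a built-in, so no warning is intended here; on a little-endian
   host the result is expected to be the byte-swapped input.  */
unsigned int
copy_via_builtin (unsigned int value)
{
  struct be_rec be;
  struct native_rec nat;

  be.v = value;
  memcpy (&nat, &be, sizeof nat);
  return nat.v;
}

/* By contrast, given a user-defined wrapper such as
     void *mymemcpy (void *dst, const void *src, size_t n);
   a call like mymemcpy (&nat, &be, sizeof nat) passes a pointer to a
   reverse-order aggregate where a plain pointer is expected, which is the
   case the patch proposes to warn about instead of supporting.  */
```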


2020-07-08  Eric Botcazou  

c-family/
* c.opt (Wscalar-storage-order): Add warn_scalar_storage_order variable.


2020-07-08  Eric Botcazou  

c/
* c-typeck.c (convert_for_assignment): If -Wscalar-storage-order is set,
warn for conversions between pointers that point to incompatible scalar
storage orders.


2020-07-08  Eric Botcazou  

* gimple-fold.c (gimple_fold_builtin_memory_op): Do not fold if either
type has reverse scalar storage order.
* tree-ssa-sccvn.c (vn_reference_lookup_3): Do not propagate through a
memory copy if either type has reverse scalar storage order.


2020-07-08  Eric Botcazou  

testsuite/
* gcc.dg/sso-11.c: New test.
* gcc.dg/sso/sso.exp: Pass -Wno-scalar-storage-order.
* gcc.dg/sso/memcpy-1.c: New test.


-- 
Eric Botcazou

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 89a58282b3f..21df0c10cfe 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1072,7 +1072,7 @@ C ObjC C++ ObjC++ Var(warn_return_type) Warning LangEnabledBy(C ObjC C++ ObjC++,
 Warn whenever a function's return type defaults to \"int\" (C), or about inconsistent return types (C++).
 
 Wscalar-storage-order
-C ObjC C++ ObjC++ Init(1) Warning
+C ObjC C++ ObjC++ Var(warn_scalar_storage_order) Init(1) Warning
 Warn on suspicious constructs involving reverse scalar storage order.
 
 Wselector
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 3be3690c6e2..b28c2c5ff62 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -7151,6 +7151,41 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
 	  }
 	}
 
+  /* See if the pointers point to incompatible scalar storage orders.  */
+  if (warn_scalar_storage_order
+	  && (AGGREGATE_TYPE_P (ttl) && TYPE_REVERSE_STORAGE_ORDER (ttl))
+	 != (AGGREGATE_TYPE_P (ttr) && TYPE_REVERSE_STORAGE_ORDER (ttr)))
+	{
+	  switch (errtype)
+	  {
+	  case ic_argpass:
+	/* Do not warn for built-in functions, for example memcpy, since we
+	   control how they behave and they can be useful in this area.  */
+	if (TREE_CODE (rname) != FUNCTION_DECL || !DECL_IS_BUILTIN (rname))
+	  warning_at (location, OPT_Wscalar_storage_order,
+			  "passing argument %d of %qE from incompatible "
+			  "scalar storage order", parmnum, rname);
+	break;
+	  case ic_assign:
+	warning_at (location, OPT_Wscalar_storage_order,
+			"assignment to %qT from pointer type %qT with "
+			"incompatible scalar storage order", type, rhstype);
+	break;
+	  case ic_init:
+	warning_at (location, OPT_Wscalar_storage_order,
+			"initialization of %qT from pointer type %qT with "
+			"incompatible scalar storage order", type, rhstype);
+	break;
+	  case ic_return:
+	warning_at (location, OPT_Wscalar_storage_order,
+			"returning %qT from pointer type with incompatible "
+			"scalar storage order %qT", rhstype, type);
+	break;
+	  default:
+	gcc_unreachable ();
+	  }
+	}
+
   /* Any non-function converts to a [const][volatile] void *
 	 and vice versa; otherwise, targets must be the same.
 	 Meanwhile, the lhs target must have all the qualifiers of the rhs.  */
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 72c5e43300a..41b84ba3bb3 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -740,15 +740,24 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 }
   else
 {
-  tree srctype, desttype, destvar, srcvar, srcoff;
+  /* We cannot (easily) change the type of the copy if it is a storage
+	 order barrier, i.e. is equivalent to a VIEW_CONVERT_EXPR that can
+	 modify the storage order of objects (see storage_order_barrier_p).  */
+  tree srctype
+	= POINTER_TYPE_P (TREE_TYPE (src))
+	  ? TREE_TYPE (TREE_TYPE (src)) : NULL_TREE;
+  tree desttype
+	= POINTER_TYPE_P (TREE_TYPE (dest))
+	  ? TREE_TYPE (TREE_TYPE (dest)) : NULL_TREE;
+  tree destvar, srcvar, srcoff;
   unsigned int src_align, dest_align;
-  tree off0;
-  const char *tmp_str;
   unsigned HOST_WIDE_INT tmp_len;
+  const char *tmp_str;
 
   /* Build accesses at offset zero with a ref-all character type.  */
-  off0 = build_int_cst (build_pointer_type_for_mode (char_type_node,
-			 ptr_mode, true), 0);
+  tree off0
+	= build_int_cst (build_pointer_type_for_mode (char_type_node,
+		  ptr_mode, true), 0);
 
   /* If we can 

Re: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved registers with CMSE

2020-07-08 Thread Andre Simoes Dias Vieira



On 07/07/2020 13:43, Christophe Lyon wrote:

Hi,


On Mon, 6 Jul 2020 at 16:31, Andre Vieira (lists)
 wrote:


On 30/06/2020 14:50, Andre Vieira (lists) wrote:

On 29/06/2020 11:15, Christophe Lyon wrote:

On Mon, 29 Jun 2020 at 10:56, Andre Vieira (lists)
 wrote:

On 23/06/2020 21:52, Christophe Lyon wrote:

On Tue, 23 Jun 2020 at 15:28, Andre Vieira (lists)
 wrote:

On 23/06/2020 13:10, Kyrylo Tkachov wrote:

-----Original Message-----
From: Andre Vieira (lists) 
Sent: 22 June 2020 09:52
To: gcc-patches@gcc.gnu.org
Cc: Kyrylo Tkachov 
Subject: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved registers with CMSE

Hi,

As reported in bugzilla, when the -mcmse option is used while compiling
for size (-Os) with a Thumb-1 target, the generated code will clear the
registers r7-r10. These, however, are callee saved and should be
preserved across ABI boundaries. This happens because these registers
are made "fixed" when optimising for size with Thumb-1, as a way to make
sure they are not used, since pushing and popping hi-registers requires
extra moves to and from LO_REGS.

To fix this, this patch uses 'callee_saved_reg_p', which accounts for
this optimisation, instead of 'call_used_or_fixed_reg_p'. Be aware of
the definition of 'callee_saved_reg_p': it still takes call-used
registers into account, which aren't callee saved in my opinion, so the
name is rather a misnomer. It works to our advantage here, though, as it
does exactly what we need.

Regression tested on arm-none-eabi.

Is this OK for trunk? (Will eventually backport to previous versions if
stable.)

Ok.
Thanks,
Kyrill

As I was getting ready to push this I noticed I didn't add any skip-ifs
to prevent this failing with specific target options. So here's a new
version with those.

Still OK?


Hi,

This is not sufficient to skip arm-linux-gnueabi* configs built with
non-default cpu/fpu.

For instance, with arm-linux-gnueabihf --with-cpu=cortex-a9
--with-fpu=neon-fp16 --with-float=hard
I see:
FAIL: gcc.target/arm/pr95646.c (test for excess errors)
Excess errors:
cc1: error: ARMv8-M Security Extensions incompatible with selected FPU
cc1: error: target CPU does not support ARM mode

and the testcase is compiled with -mcpu=cortex-m23 -mcmse -Os

Resending as I don't think my earlier one made it to the lists (sorry if
you are receiving this twice!)

I'm not following this. Before I go off and try to reproduce it, what do
you mean by 'the testcase is compiled with -mcpu=cortex-m23 -mcmse -Os'?
Are these the options you are seeing in the log file? Surely they should
override the default options? The only thing I can think of is that this
might need an extra -mfloat-abi=soft to make sure it overrides the
default float-abi.  Could you give that a try?

No it doesn't make a difference alone.

I also had to add:
-mfpu=auto (that clears the above warning)
-mthumb otherwise we now get cc1: error: target CPU does not support
ARM mode

Looks like some effective-target machinery is needed

So I had a look at this. I was pretty sure that -mfloat-abi=soft
overrode -mfpu=<>, which by and large it does, in that no FP
instructions will be generated, but the error you see only checks for
the right number of FP registers; it doesn't check whether
'TARGET_HARD_FLOAT' is set or not. I'll fix this too and use the
check-effective-target for armv8-m.base for this test, as it is indeed
a better approach than my bag of skip-ifs. I'm testing it locally to
make sure my changes don't break anything.

Cheers,
Andre

Hi,

Sorry for the delay. So I changed the test to use the effective-target
machinery as you suggested and I also made sure that you don't get the
"ARMv8-M Security Extensions incompatible with selected FPU" when
-mfloat-abi=soft.
Further changed 'asm' to '__asm__' to avoid failures with '-std=' options.

Regression tested on arm-none-eabi.
@Christophe: could you test this for your configuration? It shouldn't
fail anymore!


Indeed with your patch I don't see any failure with pr95646.c

Note that it is still unsupported with arm-eabi when running the tests
with -mcpu=cortex-mXX
because the compiler complains that -mcpu=cortex-mXX conflicts with
-march=armv8-m.base,
thus the effective-target test fails.

BTW, is that warning useful/practical? Wouldn't it be more convenient
if the last -mcpu/-march on the command line was the only one taken
into account? (I had a similar issue when running tests (libstdc++)
getting -march=armv8-m.main+fp from their multilib environment and
forcing -mcpu=cortex-m33 because it also means '+dsp' and produces a
warning; I had to use -mcpu=cortex-m33 -march=armv8-m.main+fp+dsp to
work around this)

Yeah I've been annoyed by that before, also in the context of testing
multilibs.


I can see how it can be a useful warning, though, if you are
using these in build systems and you accidentally introduce a new
(incompatible) -mcpu/-march alongside the old one. It does seem
arbitrary, though, as we do not warn against multiple -mcpu's or 
