Re: introduce -fcallgraph-info option

2019-11-14 Thread Richard Biener
On Fri, 15 Nov 2019, Alexandre Oliva wrote:

> On Nov 14, 2019, Alexandre Oliva  wrote:
> 
> > %{!c:%{!S:-dumpbase %b}
> 
> Uhh, I failed to adjust this one to add the executable output name to
> dumpbase.
> 
> Anyway, getting the right semantics out of specs is proving to be a
> lot trickier than I had anticipated.  I'm now pondering a single spec
> function to deal with all of these dumpbase possibilities.
> 
> I'm also a little uncertain about behavior change WRT .dwo files.
> Though their names are built out of the .o files in the objcopy
> commands, they're built from aux_base_name in dwarf2out.  Currently,
> since aux_base_name is derived from the output object file name, this
> ensures they have the same name and directory, but once we enable
> -dumpdir to be specified to override it, that may no longer be the
> case.  Ugh...

Hmm, -dwo-base-name to the rescue? ;)  Well, I guess the debug info
has to somewhere encode the full/relative path to the .dwo files
so all that is needed is to keep that consistent?

Richard.


Re: [Patch] [mid-end][__RTL] Clean df state despite invalid __RTL startwith passes

2019-11-14 Thread Richard Biener
On Thu, 14 Nov 2019, Matthew Malcomson wrote:

> Hi there,
> 
> When compiling an __RTL function that has an invalid "startwith" pass we
> currently don't run the dfinish cleanup pass. This means we ICE on the next
> function.
> 
> This change ensures that all state is cleaned up for the next function
> to run correctly.
> 
> As an example, before this change the following code would ICE when compiling
> the function `foo2` because the "peephole2" pass is not run at optimisation
> level -O0.
> 
> When compiled with
> ./aarch64-none-linux-gnu-gcc -O0 -S missed-pass-error.c -o test.s
> 
> ```
> int __RTL (startwith ("peephole2")) badfoo ()
> {
> (function "badfoo"
>   (insn-chain
> (block 2
>   (edge-from entry (flags "FALLTHRU"))
>   (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
>   (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
>   (cinsn 10 (use (reg/i:SI x19)))
>   (edge-to exit (flags "FALLTHRU"))
> ) ;; block 2
>   ) ;; insn-chain
> ) ;; function "foo2"
> }
> 
> int __RTL (startwith ("final")) foo2 ()
> {
> (function "foo2"
>   (insn-chain
> (block 2
>   (edge-from entry (flags "FALLTHRU"))
>   (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
>   (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
>   (cinsn 10 (use (reg/i:SI x19)))
>   (edge-to exit (flags "FALLTHRU"))
> ) ;; block 2
>   ) ;; insn-chain
> ) ;; function "foo2"
> }
> ```
> 
> Now it silently ignores the __RTL function and successfully compiles foo2.
> 
> regtest done on aarch64
> regtest done on x86_64
> 
> OK for trunk?

OK.

Richard.

> gcc/ChangeLog:
> 
> 2019-11-14  Matthew Malcomson  
> 
>   * passes.c (should_skip_pass_p): Always run "dfinish".
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-11-14  Matthew Malcomson  
> 
>   * gcc.dg/rtl/aarch64/missed-pass-error.c: New test.
> 
> 
> 
> ### Attachment also inlined for ease of reply
> ###
> 
> 
> diff --git a/gcc/passes.c b/gcc/passes.c
> index d86af115ecb16fcab6bfce070f1f3e4f1d90ce71..258f85ab4f8a1519b978b75dfa67536d2eacd106 100644
> --- a/gcc/passes.c
> +++ b/gcc/passes.c
> @@ -2375,7 +2375,8 @@ should_skip_pass_p (opt_pass *pass)
>  return false;
>  
>/* Don't skip df init; later RTL passes need it.  */
> -  if (strstr (pass->name, "dfinit") != NULL)
> +  if (strstr (pass->name, "dfinit") != NULL
> +  || strstr (pass->name, "dfinish") != NULL)
>  return false;
>  
>if (!quiet_flag)
> diff --git a/gcc/testsuite/gcc.dg/rtl/aarch64/missed-pass-error.c b/gcc/testsuite/gcc.dg/rtl/aarch64/missed-pass-error.c
> new file mode 100644
> index ..2f02ca9d0c40b372d86b24009540e157ed1a8c59
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/rtl/aarch64/missed-pass-error.c
> @@ -0,0 +1,45 @@
> +/* { dg-do compile { target aarch64-*-* } } */
> +/* { dg-additional-options "-O0" } */
> +
> +/*
> +   When compiling __RTL functions the startwith string can be either incorrect
> +   (i.e. not matching a pass) or be unused (i.e. can refer to a pass that is
> +   not run at the current optimisation level).
> +
> +   Here we ensure that the state clean up is still run, so that functions other
> +   than the faulty one can still be compiled.
> + */
> +
> +int __RTL (startwith ("peephole2")) badfoo ()
> +{
> +(function "badfoo"
> +  (insn-chain
> +(block 2
> +  (edge-from entry (flags "FALLTHRU"))
> +  (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
> +  (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
> +  (cinsn 10 (use (reg/i:SI x19)))
> +  (edge-to exit (flags "FALLTHRU"))
> +) ;; block 2
> +  ) ;; insn-chain
> +) ;; function "foo2"
> +}
> +
> +/* Compile a valid __RTL function to test state from the "dfinit" pass has been
> +   cleaned with the "dfinish" pass.  */
> +
> +int __RTL (startwith ("final")) foo2 ()
> +{
> +(function "foo2"
> +  (insn-chain
> +(block 2
> +  (edge-from entry (flags "FALLTHRU"))
> +  (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
> +  (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
> +  (cinsn 10 (use (reg/i:SI x19)))
> +  (edge-to exit (flags "FALLTHRU"))
> +) ;; block 2
> +  ) ;; insn-chain
> +) ;; function "foo2"
> +}
> +
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [patch, fortran] Load scalar intent-in variables at the beginning of procedures

2019-11-14 Thread Tobias Burnus

Hi Thomas,

On 11/11/19 10:55 PM, Thomas König wrote:

the attached patch loads scalar INTENT(IN) variables to a local
variable at the start of a procedure, as suggested in PR 67202, in
order to aid optimization.  This is controlled by front-end
optimization so it is easier to catch if any bugs should turn up :-)



+  if (f->sym == NULL || f->sym->attr.dimension || f->sym->attr.allocatable
+ || f->sym->attr.optional || f->sym->attr.pointer
+ || f->sym->attr.codimension || f->sym->attr.value
+ || f->sym->attr.proc_pointer || f->sym->attr.target
+ || f->sym->attr.asynchronous
+ || f->sym->ts.type == BT_CHARACTER || f->sym->ts.type == BT_DERIVED
+ || f->sym->ts.type == BT_CLASS)
+   continue;
I think you need to add at least VOLATILE to this list. Beyond that, I
have not thought much about corner cases, nor have I studied the patch in detail, sorry.


Cheers,

Tobias



Re: introduce -fcallgraph-info option

2019-11-14 Thread Alexandre Oliva
On Nov 14, 2019, Alexandre Oliva  wrote:

> %{!c:%{!S:-dumpbase %b}

Uhh, I failed to adjust this one to add the executable output name to
dumpbase.

Anyway, getting the right semantics out of specs is proving to be a
lot trickier than I had anticipated.  I'm now pondering a single spec
function to deal with all of these dumpbase possibilities.

I'm also a little uncertain about behavior change WRT .dwo files.
Though their names are built out of the .o files in the objcopy
commands, they're built from aux_base_name in dwarf2out.  Currently,
since aux_base_name is derived from the output object file name, this
ensures they have the same name and directory, but once we enable
-dumpdir to be specified to override it, that may no longer be the
case.  Ugh...

-- 
Alexandre Oliva, freedom fighter   he/him   https://FSFLA.org/blogs/lxo
Free Software Evangelist   Stallman was right, but he's left :(
GNU Toolchain EngineerFSMatrix: It was he who freed the first of us
FSF & FSFLA board memberThe Savior shall return (true);


Re: [PATCH] PR92398: Fix testcase failure of pr72804.c

2019-11-14 Thread luoxhu



On 2019/11/15 11:12, Xiong Hu Luo wrote:

The instructions generated for P9LE are no worse than those for P8LE:
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
Update the test case to fix failures.

gcc/testsuite/ChangeLog:

2019-11-15  Luo Xiong Hu  

PR testsuite/92398
* gcc.target/powerpc/pr72804.h: New.
* gcc.target/powerpc/pr72804.p8.c: New.
* gcc.target/powerpc/pr72804.c: Rename to ...
* gcc.target/powerpc/pr72804.p9.c: ... this one.
---
  gcc/testsuite/gcc.target/powerpc/pr72804.h| 17 ++
  gcc/testsuite/gcc.target/powerpc/pr72804.p8.c | 16 ++
  .../powerpc/{pr72804.c => pr72804.p9.c}   | 22 ++-
  3 files changed, 40 insertions(+), 15 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr72804.h
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr72804.p8.c
  rename gcc/testsuite/gcc.target/powerpc/{pr72804.c => pr72804.p9.c} (59%)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr72804.h b/gcc/testsuite/gcc.target/powerpc/pr72804.h
new file mode 100644
index 000..8a5ea93cc17
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr72804.h
@@ -0,0 +1,17 @@
+/* This test code is included into pr72804.p8.c and pr72804.p9.c
+   The two files have the tests for the number of instructions generated for
+   P8LE versus P9LE.  */
+
+__int128_t
+foo (__int128_t *src)
+{
+  return ~*src;
+}
+
+void
+bar (__int128_t *dst, __int128_t src)
+{
+  *dst =  ~src;
+}
+
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr72804.p8.c b/gcc/testsuite/gcc.target/powerpc/pr72804.p8.c
new file mode 100644
index 000..ad968769aae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr72804.p8.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx -mdejagnu-cpu=power8" } */
+
+/* { dg-final { scan-assembler-times "not " 4 {xfail be} } } */
+/* { dg-final { scan-assembler-times "std " 2 {xfail be} } } */
+/* { dg-final { scan-assembler-times "ld " 2 } } */
+/* { dg-final { scan-assembler-not "lxvd2x" } } */
+/* { dg-final { scan-assembler-not "stxvd2x" } } */
+/* { dg-final { scan-assembler-not "xxpermdi" } } */


Updated to the following after testing it on P8BE:
-/* { dg-final { scan-assembler-not "stxvd2x" } } */
-/* { dg-final { scan-assembler-not "xxpermdi" } } */
+/* { dg-final { scan-assembler-not "stxvd2x" {xfail be} } } */
+/* { dg-final { scan-assembler-not "xxpermdi" {xfail be} } } */



+/* { dg-final { scan-assembler-not "mfvsrd" } } */
+/* { dg-final { scan-assembler-not "mfvsrd" } } */
+
+/* Source code for the test in pr72804.h */
+#include "pr72804.h"
diff --git a/gcc/testsuite/gcc.target/powerpc/pr72804.c b/gcc/testsuite/gcc.target/powerpc/pr72804.p9.c
similarity index 59%
rename from gcc/testsuite/gcc.target/powerpc/pr72804.c
rename to gcc/testsuite/gcc.target/powerpc/pr72804.p9.c
index 10e37caed6b..2059d7df1a2 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr72804.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr72804.p9.c
@@ -1,25 +1,17 @@
  /* { dg-do compile { target { lp64 } } } */
  /* { dg-skip-if "" { powerpc*-*-darwin* } } */
  /* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-O2 -mvsx -fno-inline-functions --param max-inline-insns-single-O2=200" } */
+/* { dg-options "-O2 -mvsx -mdejagnu-cpu=power9" } */
  
-__int128_t

-foo (__int128_t *src)
-{
-  return ~*src;
-}
-
-void
-bar (__int128_t *dst, __int128_t src)
-{
-  *dst =  ~src;
-}
-
-/* { dg-final { scan-assembler-times "not " 4 } } */
-/* { dg-final { scan-assembler-times "std " 2 } } */
+/* { dg-final { scan-assembler-times "not " 2 } } */
+/* { dg-final { scan-assembler-times "std " 0 } } */
  /* { dg-final { scan-assembler-times "ld " 2 } } */
  /* { dg-final { scan-assembler-not "lxvd2x" } } */
  /* { dg-final { scan-assembler-not "stxvd2x" } } */
  /* { dg-final { scan-assembler-not "xxpermdi" } } */
  /* { dg-final { scan-assembler-not "mfvsrd" } } */
  /* { dg-final { scan-assembler-not "mfvsrd" } } */
+
+/* Source code for the test in pr72804.h */
+#include "pr72804.h"
+





Go patch committed: Fix inlining of sink names

2019-11-14 Thread Ian Lance Taylor
This Go frontend patch by Than McIntosh fixes inlining of sink names.
When the compiler writes an inlinable function to the export data,
parameter names are written out (in Export::write_name) using the
Gogo::message_name as opposed to a raw/encoded name.  This means that
sink parameters (those named "_") get created with the name "_"
instead of "._" (the name created by the lexer/parser).  This confuses
Gogo::is_sink_name, which looks for the latter sequence and not just
"_".  This can cause issues later on if an inlinable function is
imported and fed through the rest of the compiler (things that are
sinks are not recognized as such).  To fix these issues, change
Gogo::is_sink_name to return true for either variant ("_" or "._").
This fixes https://golang.org/issue/35586.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 277299)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-1e2d98b27701744cf0ec57b19d7fc8f594184b9a
+2d0504236c7236345ee17a0cb43a3bb9ce3acf7f
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.h
===
--- gcc/go/gofrontend/gogo.h(revision 277299)
+++ gcc/go/gofrontend/gogo.h(working copy)
@@ -222,7 +222,9 @@ class Gogo
   {
 return (name[0] == '.'
&& name[name.length() - 1] == '_'
-   && name[name.length() - 2] == '.');
+   && name[name.length() - 2] == '.')
+|| (name[0] == '_'
+&& name.length() == 1);
   }
 
   // Helper used when adding parameters (including receiver param) to the


[PATCH] PR92398: Fix testcase failure of pr72804.c

2019-11-14 Thread Xiong Hu Luo
The instructions generated for P9LE are no worse than those for P8LE:
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
Update the test case to fix failures.

gcc/testsuite/ChangeLog:

2019-11-15  Luo Xiong Hu  

PR testsuite/92398
* gcc.target/powerpc/pr72804.h: New.
* gcc.target/powerpc/pr72804.p8.c: New.
* gcc.target/powerpc/pr72804.c: Rename to ...
* gcc.target/powerpc/pr72804.p9.c: ... this one.
---
 gcc/testsuite/gcc.target/powerpc/pr72804.h| 17 ++
 gcc/testsuite/gcc.target/powerpc/pr72804.p8.c | 16 ++
 .../powerpc/{pr72804.c => pr72804.p9.c}   | 22 ++-
 3 files changed, 40 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr72804.h
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr72804.p8.c
 rename gcc/testsuite/gcc.target/powerpc/{pr72804.c => pr72804.p9.c} (59%)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr72804.h b/gcc/testsuite/gcc.target/powerpc/pr72804.h
new file mode 100644
index 000..8a5ea93cc17
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr72804.h
@@ -0,0 +1,17 @@
+/* This test code is included into pr72804.p8.c and pr72804.p9.c
+   The two files have the tests for the number of instructions generated for
+   P8LE versus P9LE.  */
+
+__int128_t
+foo (__int128_t *src)
+{
+  return ~*src;
+}
+
+void
+bar (__int128_t *dst, __int128_t src)
+{
+  *dst =  ~src;
+}
+
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr72804.p8.c b/gcc/testsuite/gcc.target/powerpc/pr72804.p8.c
new file mode 100644
index 000..ad968769aae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr72804.p8.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx -mdejagnu-cpu=power8" } */
+
+/* { dg-final { scan-assembler-times "not " 4 {xfail be} } } */
+/* { dg-final { scan-assembler-times "std " 2 {xfail be} } } */
+/* { dg-final { scan-assembler-times "ld " 2 } } */
+/* { dg-final { scan-assembler-not "lxvd2x" } } */
+/* { dg-final { scan-assembler-not "stxvd2x" } } */
+/* { dg-final { scan-assembler-not "xxpermdi" } } */
+/* { dg-final { scan-assembler-not "mfvsrd" } } */
+/* { dg-final { scan-assembler-not "mfvsrd" } } */
+
+/* Source code for the test in pr72804.h */
+#include "pr72804.h"
diff --git a/gcc/testsuite/gcc.target/powerpc/pr72804.c b/gcc/testsuite/gcc.target/powerpc/pr72804.p9.c
similarity index 59%
rename from gcc/testsuite/gcc.target/powerpc/pr72804.c
rename to gcc/testsuite/gcc.target/powerpc/pr72804.p9.c
index 10e37caed6b..2059d7df1a2 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr72804.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr72804.p9.c
@@ -1,25 +1,17 @@
 /* { dg-do compile { target { lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-O2 -mvsx -fno-inline-functions --param max-inline-insns-single-O2=200" } */
+/* { dg-options "-O2 -mvsx -mdejagnu-cpu=power9" } */
 
-__int128_t
-foo (__int128_t *src)
-{
-  return ~*src;
-}
-
-void
-bar (__int128_t *dst, __int128_t src)
-{
-  *dst =  ~src;
-}
-
-/* { dg-final { scan-assembler-times "not " 4 } } */
-/* { dg-final { scan-assembler-times "std " 2 } } */
+/* { dg-final { scan-assembler-times "not " 2 } } */
+/* { dg-final { scan-assembler-times "std " 0 } } */
 /* { dg-final { scan-assembler-times "ld " 2 } } */
 /* { dg-final { scan-assembler-not "lxvd2x" } } */
 /* { dg-final { scan-assembler-not "stxvd2x" } } */
 /* { dg-final { scan-assembler-not "xxpermdi" } } */
 /* { dg-final { scan-assembler-not "mfvsrd" } } */
 /* { dg-final { scan-assembler-not "mfvsrd" } } */
+
+/* Source code for the test in pr72804.h */
+#include "pr72804.h"
+
-- 
2.21.0.777.g83232e3864



Re: [PATCH] Add support for C++2a stop_token

2019-11-14 Thread Thomas Rodgers
Tested x86_64-pc-linux-gnu, committed to trunk.

Jonathan Wakely writes:

> On 13/11/19 17:59 -0800, Thomas Rodgers wrote:
>>+/** @file include/stop_token
>>+ *  This is a Standard C++ Library header.
>>+ */
>>+
>>+#ifndef _GLIBCXX_STOP_TOKEN
>>+#define _GLIBCXX_STOP_TOKEN
>>+
>>+#if __cplusplus >= 201703L
>
> This should be > not >=
>
> OK for trunk with that change.



Improve checks on C2x fallthrough attribute

2019-11-14 Thread Joseph Myers
When adding C2x attribute support, some [[fallthrough]] support
appeared as a side-effect because of code for that attribute going
through separate paths from the normal attribute handling.

However, going through those paths without the normal attribute
handlers meant that certain checks, such as for the invalid usage
[[fallthrough()]], did not operate.  This patch improves checks by
adding this attribute to the standard attribute table, so that the
parser knows it expects no arguments, along with adding an explicit
check for "[[fallthrough]];" attribute-declarations at top level.  As
with other attributes, there are still cases where warnings should be
pedwarns because C2x constraints are violated, but this patch improves
the attribute handling.

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  Applied to 
mainline.

gcc/c:
2019-11-15  Joseph Myers  

* c-decl.c (std_attribute_table): Add fallthrough.
* c-parser.c (c_parser_declaration_or_fndef): Diagnose fallthrough
attribute at top level.

gcc/c-family:
2019-11-15  Joseph Myers  

* c-attribs.c (handle_fallthrough_attribute): Remove static.
* c-common.h (handle_fallthrough_attribute): Declare.

gcc/testsuite:
2019-11-15  Joseph Myers  

* gcc.dg/c2x-attr-fallthrough-2.c,
gcc.dg/c2x-attr-fallthrough-3.c: New tests.

Index: gcc/c/c-decl.c
===
--- gcc/c/c-decl.c  (revision 278268)
+++ gcc/c/c-decl.c  (working copy)
@@ -4343,6 +4343,8 @@ const struct attribute_spec std_attribute_table[]
affects_type_identity, handler, exclude } */
   { "deprecated", 0, 1, false, false, false, false,
 handle_deprecated_attribute, NULL },
+  { "fallthrough", 0, 0, false, false, false, false,
+handle_fallthrough_attribute, NULL },
   { NULL, 0, 0, false, false, false, false, NULL, NULL }
 };
 
Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c(revision 278268)
+++ gcc/c/c-parser.c(working copy)
@@ -1927,9 +1927,15 @@ c_parser_declaration_or_fndef (c_parser *parser, b
{
  if (fallthru_attr_p != NULL)
*fallthru_attr_p = true;
- tree fn = build_call_expr_internal_loc (here, IFN_FALLTHROUGH,
- void_type_node, 0);
- add_stmt (fn);
+ if (nested)
+   {
+ tree fn = build_call_expr_internal_loc (here, IFN_FALLTHROUGH,
+ void_type_node, 0);
+ add_stmt (fn);
+   }
+ else
+   pedwarn (here, OPT_Wattributes,
+"% attribute at top level");
}
   else if (empty_ok && !(have_attrs
 && specs->non_std_attrs_seen_p))
Index: gcc/c-family/c-attribs.c
===
--- gcc/c-family/c-attribs.c(revision 278268)
+++ gcc/c-family/c-attribs.c(working copy)
@@ -144,7 +144,6 @@ static tree handle_simd_attribute (tree *, tree, t
 static tree handle_omp_declare_target_attribute (tree *, tree, tree, int,
 bool *);
 static tree handle_designated_init_attribute (tree *, tree, tree, int, bool *);
-static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
 static tree handle_patchable_function_entry_attribute (tree *, tree, tree,
   int, bool *);
 static tree handle_copy_attribute (tree *, tree, tree, int, bool *);
@@ -4114,7 +4113,7 @@ handle_designated_init_attribute (tree *node, tree
 /* Handle a "fallthrough" attribute; arguments as in struct
attribute_spec.handler.  */
 
-static tree
+tree
 handle_fallthrough_attribute (tree *, tree name, tree, int,
  bool *no_add_attrs)
 {
Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h (revision 278268)
+++ gcc/c-family/c-common.h (working copy)
@@ -1359,6 +1359,7 @@ extern void warn_for_multistatement_macros (locati
 extern bool attribute_takes_identifier_p (const_tree);
 extern tree handle_deprecated_attribute (tree *, tree, tree, int, bool *);
 extern tree handle_unused_attribute (tree *, tree, tree, int, bool *);
+extern tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
 extern int parse_tm_stmt_attr (tree, int);
 extern int tm_attr_to_mask (tree);
 extern tree tm_mask_to_attr (int);
Index: gcc/testsuite/gcc.dg/c2x-attr-fallthrough-2.c
===
--- gcc/testsuite/gcc.dg/c2x-attr-fallthrough-2.c   (nonexistent)
+++ gcc/testsuite/gcc.dg/c2x-attr-fallthrough-2.c   (working copy)
@@ -0,0 +1,35 @@
+/* Test C2x attribute syntax.  Invalid use of fallthrough attribute.  */
+/* { dg-do compile } */
+/* { dg-options "-s

Re: introduce -fcallgraph-info option

2019-11-14 Thread Alexandre Oliva
On Nov  8, 2019, Eric Gallager  wrote:

> If you're touching the -auxbase option... is that related to the other
> options starting with -aux at all?

'fraid they're entirely unrelated.  We're talking about how the compiler
names aux and dump output files, which is not related to the contents
of an explicitly-named output file as in the PR you mentioned.

-- 
Alexandre Oliva, freedom fighter   he/him   https://FSFLA.org/blogs/lxo
Free Software Evangelist   Stallman was right, but he's left :(
GNU Toolchain EngineerFSMatrix: It was he who freed the first of us
FSF & FSFLA board memberThe Savior shall return (true);


Re: introduce -fcallgraph-info option

2019-11-14 Thread Alexandre Oliva
On Nov  8, 2019, Richard Biener  wrote:

> Wow, thanks for the elaborate write-up!  I wonder if we can
> cut&paste this into documentation somewhere appropriate, maybe
> there's already a section for "auxiliary compiler outputs".

Sure, that makes sense.

>> I'm a little hesitant, this amounts to quite significant behavior
>> changes.  Do these seem acceptable and desirable nevertheless?

> I think the current state is somewhat of a mess and in some
> cases confusing and your suggestion sounds like an overall
> improvement to me (you didn't actually suggest to remove
> either of the -dump{base,dir} -auxbase{-strip} options?)

I was trying to narrow down the desired behavior before trying to figure
out what options we could do away with.  If what I proposed was
acceptable, I thought we could drop the internal -auxbase* options
altogether.

However, I missed one relevant case in my analysis.  I suggested the
auxbase internally derived from dumpbase would drop the dumpbase
extension iff the extension matched that of the input file name.  That
doesn't work when compilation takes an intermediate file rather than the
input, e.g., in a -save-temps compilation, in which we'll have separate
preprocessing and the actual compiler will take the saved preprocessed
input, but should still output dumps to files named after the .c input.

ex $CC -S srcdir/foo.c -o objdir/foo.s -save-temps
-> objdir/foo.i objdir/foo.s objdir/foo.su objdir/foo.c.#t.original

The compilation line would only take the .c from -dumpbase, but since
its input is .i, it wouldn't match, and we wouldn't strip the .c from
aux outputs, and would instead generate:

-> objdir/foo.i objdir/foo.s objdir/foo.c.su objdir/foo.c.#t.original
   ^^

(which would likely be ok for .su, but quite unexpected for .dwo)

In order to address this, I propose we add an internal option (not for
the driver), -dumpbase-ext, that names the extension to be discarded
from dumpbase to form aux output names.

-dumpdir objdir -dumpbase foo.c -dumpbase-ext .c

The new -dumpbase-ext option specifies the extension to drop from the
specified -dumpbase to form aux output names, but dump output names keep
that intermediate extension.  When absent, we take it from the main
input file.

So aux outputs end up as objdir/foo.* whereas dump outputs end up as
objdump/foo.c.*, just as expected.

We could keep -dumpbase-ext an internal option, used only when doing
separate preprocessing, but it might make sense to expose it for use
along with -dumpbase for such tools as ccache and distcc, that call the
compiler driver with internal .i files, but would still prefer dumps and
aux files to be generated just as they would have for the .c files.

Specs would change from:

%{!dumpbase:-dumpbase %B}
%{c|S:%{o*:-auxbase-strip %*}
  %{!o*:-auxbase %b}}}
%{!c:%{!S:-auxbase %b}

to

%{!dumpdir:%{o*:-dumpdir %:dirname(%*)}}
%{c|S:%{!dumpbase:%{o*:-dumpbase %:replace-extension(%:basename(%*) 
%:extension(%i))}
  %{!o*:-dumpbase %b}}}
%{!c:%{!S:-dumpbase %b}

and add to separate preprocessing commands:

%{!dumpbase-ext:-dumpbase-ext %:extension(%i)}


Then we'd set up aux_base_name from dump_base_name minus the extension,
given or taken from main_input_filename.

-- 
Alexandre Oliva, freedom fighter   he/him   https://FSFLA.org/blogs/lxo
Free Software Evangelist   Stallman was right, but he's left :(
GNU Toolchain EngineerFSMatrix: It was he who freed the first of us
FSF & FSFLA board memberThe Savior shall return (true);


[PATCH], V8, #6 of 6, Testsuite: Test -fstack-protector-strong works with prefixed addressing

2019-11-14 Thread Michael Meissner
This patch checks whether the -fstack-protector-strong option works with a large
stack frame on -mcpu=future systems where prefixed instructions are generated.

Can I check this into the FSF trunk?

2019-11-14  Michael Meissner  

* gcc.target/powerpc/prefix-stack-protect.c: New test to make sure
-fstack-protector-strong works with prefixed addressing.

--- /tmp/byVdyb_prefix-stack-protect.c  2019-11-13 17:45:35.374176204 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-stack-protect.c 2019-11-13 17:45:35.143174125 -0500
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future -fstack-protector-strong" } */
+
+/* Test that we can handle large stack frames with -fstack-protector-strong and
+   prefixed addressing.  */
+
+extern long foo (char *);
+
+long
+bar (void)
+{
+  char buffer[0x2];
+  return foo (buffer) + 1;
+}
+
+/* { dg-final { scan-assembler {\mpld\M}  } } */
+/* { dg-final { scan-assembler {\mpstd\M} } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH], V8, #5 of 6, Testsuite: Test PC-relative load/store instructions

2019-11-14 Thread Michael Meissner
This patch adds tests that use PC-relative addressing on the 'future'
system.

Can I check this patch into the FSF trunk after the patch in the V7 series that
enables PC-relative addressing by default on 64-bit Linux systems has been
committed?

2019-11-14  Michael Meissner  

* gcc/testsuite/gcc.target/powerpc/prefix-pcrel.h: New set of
tests to test prefixed addressing on 'future' system with
PC-relative tests.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sd.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-sf.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-si.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-udi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uhi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-uqi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-usi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-pcrel-v2df.c: New test.

--- /tmp/79Y8V6_prefix-pcrel-dd.c   2019-11-13 17:43:34.462087329 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-dd.c  2019-11-13 17:43:34.183084816 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for SImode.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
--- /tmp/st8ftv_prefix-pcrel-df.c   2019-11-13 17:43:34.472087419 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-df.c  2019-11-13 17:43:34.188084861 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for DFmode.  */
+
+#define TYPE double
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
--- /tmp/Wo2P1T_prefix-pcrel-di.c   2019-11-13 17:43:34.479087482 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-di.c  2019-11-13 17:43:34.194084915 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for DImode.  */
+
+#define TYPE long
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
--- /tmp/KmOSBi_prefix-pcrel-hi.c   2019-11-13 17:43:34.487087554 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-hi.c  2019-11-13 17:43:34.199084960 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for HImode.  */
+
+#define TYPE short
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplh[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M} 2 } } */
--- /tmp/BalpdH_prefix-pcrel-kf.c   2019-11-13 17:43:34.494087617 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-kf.c  2019-11-13 17:43:34.205085014 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for KFmode.  */
+
+#define TYPE __float128
+
+#include "prefix-pcrel.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
--- /tmp/FMdpQ5_prefix-pcrel-qi.c   2019-11-13 17:43:34.502087689 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-pcrel-qi.c  2019-11-13 17:43:34.210085059 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_pcrel_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether pc-relative prefixed
+   instructions are generated for QImode.  */
+
+#define TYPE signed char
+
+#include "p

[PATCH], V8, #4 of 6, Testsuite: Test for prefixed instructions with large offsets

2019-11-14 Thread Michael Meissner
This patch tests whether using large numeric offsets causes prefixed loads or
stores to be generated.

Can I check this patch into the FSF trunk?

2019-11-14  Michael Meissner  

* gcc/testsuite/gcc.target/powerpc/prefix-large.h: New set of
tests to test prefixed addressing on 'future' system with large
numeric offsets.
* gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-df.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-di.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-sd.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-sf.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-si.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-udi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-uhi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-uqi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-usi.c: New test.
* gcc/testsuite/gcc.target/powerpc/prefix-large-v2df.c: New test.

--- /tmp/RMaUEu_prefix-large-dd.c   2019-11-13 17:42:31.960524470 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-dd.c  2019-11-13 
17:42:31.719522299 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE _Decimal64
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
--- /tmp/ASyj4G_prefix-large-df.c   2019-11-13 17:42:31.968524542 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-df.c  2019-11-13 
17:42:31.725522354 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE double
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplfd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstfd\M} 2 } } */
--- /tmp/uCv6uT_prefix-large-di.c   2019-11-13 17:42:31.975524605 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-di.c  2019-11-13 
17:42:31.730522399 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE long
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mpld\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M} 2 } } */
--- /tmp/M6slX5_prefix-large-hi.c   2019-11-13 17:42:31.983524677 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-hi.c  2019-11-13 
17:42:31.735522443 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE short
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplh[az]\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpsth\M} 2 } } */
--- /tmp/iEQZqi_prefix-large-kf.c   2019-11-13 17:42:31.990524740 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-kf.c  2019-11-13 
17:42:31.740522489 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  */
+
+#define TYPE __float128
+
+#include "prefix-large.h"
+
+/* { dg-final { scan-assembler-times {\mplxv\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 2 } } */
--- /tmp/01w3Vu_prefix-large-qi.c   2019-11-13 17:42:31.997524803 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-large-qi.c  2019-11-13 
17:42:31.745522534 -0500
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests for prefixed instructions testing whether we can generate a prefixed
+   load/store instruction that has a 34-bit offset.  

[PATCH], V8, #3 of 6, Testsuite: Ensure no prefixed instruction uses update addressing

2019-11-14 Thread Michael Meissner
The prefixed instructions do not support the update forms of the memory
instructions (i.e., internally, addresses using PRE_INC, PRE_DEC, or
PRE_MODIFY).

Can I check this into the FSF trunk?

2019-11-14  Michael Meissner  

* gcc.target/powerpc/prefix-premodify.c: New test to make sure we
do not generate PRE_INC, PRE_DEC, or PRE_MODIFY on prefixed loads
or stores.

--- /tmp/LMc94y_prefix-premodify.c  2019-11-13 17:41:36.037020850 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-premodify.c 2019-11-13 
17:41:35.807018779 -0500
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Make sure that we don't try to generate a prefixed form of the load and
+   store with update instructions.  */
+
+#ifndef SIZE
+#define SIZE 5
+#endif
+
+struct foo {
+  unsigned int field;
+  char pad[SIZE];
+};
+
+struct foo *inc_load (struct foo *p, unsigned int *q)
+{
+  *q = (++p)->field;
+  return p;
+}
+
+struct foo *dec_load (struct foo *p, unsigned int *q)
+{
+  *q = (--p)->field;
+  return p;
+}
+
+struct foo *inc_store (struct foo *p, unsigned int *q)
+{
+  (++p)->field = *q;
+  return p;
+}
+
+struct foo *dec_store (struct foo *p, unsigned int *q)
+{
+  (--p)->field = *q;
+  return p;
+}
+
+/* { dg-final { scan-assembler-times {\mpli\M|\mpla\M|\mpaddi\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mplwz\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstw\M}  2 } } */
+/* { dg-final { scan-assembler-not   {\mp?lwzu\M}  } } */
+/* { dg-final { scan-assembler-not   {\mp?stwzu\M} } } */
+/* { dg-final { scan-assembler-not   {\maddis\M}   } } */
+/* { dg-final { scan-assembler-not   {\maddi\M}} } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH], V8, #2 of 6, Testsuite: Test illegal DS/DQ offsets become prefixed insns

2019-11-14 Thread Michael Meissner
This test checks that, when an offset is not legal for the traditional DS
instructions (which require the bottom 2 bits of the offset to be zero) or DQ
instructions (which require the bottom 4 bits to be zero), the compiler
properly generates the prefixed form of the instruction instead of loading the
offset into a GPR and doing an indexed memory operation.

Can I check this into the FSF trunk?

2019-11-14  Michael Meissner  

* gcc.target/powerpc/prefix-odd-memory.c: New test to make sure
prefixed instructions are generated if an offset would not be
legal for the non-prefixed DS/DQ instructions.

--- /tmp/Clb8P3_prefix-odd-memory.c 2019-11-13 17:40:31.750441916 -0500
+++ gcc/testsuite/gcc.target/powerpc/prefix-odd-memory.c2019-11-13 
17:40:31.568440277 -0500
@@ -0,0 +1,156 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Tests whether we can generate a prefixed load/store operation for addresses
+   that don't meet DS/DQ alignment constraints.  */
+
+unsigned long
+load_uc_odd (unsigned char *p)
+{
+  return p[1]; /* should generate LBZ.  */
+}
+
+long
+load_sc_odd (signed char *p)
+{
+  return p[1]; /* should generate LBZ + EXTSB.  */
+}
+
+unsigned long
+load_us_odd (unsigned char *p)
+{
+  return *(unsigned short *)(p + 1);   /* should generate LHZ.  */
+}
+
+long
+load_ss_odd (unsigned char *p)
+{
+  return *(short *)(p + 1);/* should generate LHA.  */
+}
+
+unsigned long
+load_ui_odd (unsigned char *p)
+{
+  return *(unsigned int *)(p + 1); /* should generate LWZ.  */
+}
+
+long
+load_si_odd (unsigned char *p)
+{
+  return *(int *)(p + 1);  /* should generate PLWA.  */
+}
+
+unsigned long
+load_ul_odd (unsigned char *p)
+{
+  return *(unsigned long *)(p + 1);/* should generate PLD.  */
+}
+
+long
+load_sl_odd (unsigned char *p)
+{
+  return *(long *)(p + 1); /* should generate PLD.  */
+}
+
+float
+load_float_odd (unsigned char *p)
+{
+  return *(float *)(p + 1);/* should generate LFS.  */
+}
+
+double
+load_double_odd (unsigned char *p)
+{
+  return *(double *)(p + 1);   /* should generate LFD.  */
+}
+
+__ieee128
+load_ieee128_odd (unsigned char *p)
+{
+  return *(__ieee128 *)(p + 1);/* should generate PLXV.  */
+}
+
+void
+store_uc_odd (unsigned char uc, unsigned char *p)
+{
+  p[1] = uc;   /* should generate STB.  */
+}
+
+void
+store_sc_odd (signed char sc, signed char *p)
+{
+  p[1] = sc;   /* should generate STB.  */
+}
+
+void
+store_us_odd (unsigned short us, unsigned char *p)
+{
+  *(unsigned short *)(p + 1) = us; /* should generate STH.  */
+}
+
+void
+store_ss_odd (signed short ss, unsigned char *p)
+{
+  *(signed short *)(p + 1) = ss;   /* should generate STH.  */
+}
+
+void
+store_ui_odd (unsigned int ui, unsigned char *p)
+{
+  *(unsigned int *)(p + 1) = ui;   /* should generate STW.  */
+}
+
+void
+store_si_odd (signed int si, unsigned char *p)
+{
+  *(signed int *)(p + 1) = si; /* should generate STW.  */
+}
+
+void
+store_ul_odd (unsigned long ul, unsigned char *p)
+{
+  *(unsigned long *)(p + 1) = ul;  /* should generate PSTD.  */
+}
+
+void
+store_sl_odd (signed long sl, unsigned char *p)
+{
+  *(signed long *)(p + 1) = sl;/* should generate PSTD.  */
+}
+
+void
+store_float_odd (float f, unsigned char *p)
+{
+  *(float *)(p + 1) = f;   /* should generate STF.  */
+}
+
+void
+store_double_odd (double d, unsigned char *p)
+{
+  *(double *)(p + 1) = d;  /* should generate STD.  */
+}
+
+void
+store_ieee128_odd (__ieee128 ieee, unsigned char *p)
+{
+  *(__ieee128 *)(p + 1) = ieee;/* should generate PSTXV.  */
+}
+
+/* { dg-final { scan-assembler-times {\mextsb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlbz\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mlfd\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlfs\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlha\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlhz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mlwz\M}   1 } } */
+/* { dg-final { scan-assembler-times {\mpld\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mplwa\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mplxv\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mpstd\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mpstxv\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstb\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstfd\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mstfs\M}  1 } } */
+/* { dg-final { scan-assembler-times {\msth\M}   2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M}   2 } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH], V8, #1 of 6, Testsuite: Add PADDI tests

2019-11-14 Thread Michael Meissner
This patch adds 3 tests that check whether PLI (PADDI) is generated to load
DImode constants, to load SImode constants, and to add 34-bit constants.

Once the appropriate patches to generate this code have been checked in (V7,
#1-3), can I check these new tests into the FSF trunk?

2019-11-14   Michael Meissner  

* gcc.target/powerpc/paddi-1.c: New test to test using PLI to
load up a large DImode constant.
* gcc.target/powerpc/paddi-2.c: New test to test using PLI to
load up a large SImode constant.
* gcc.target/powerpc/paddi-3.c: New test to test using PADDI to
add a large DImode constant.

--- /tmp/s2UNQW_paddi-1.c   2019-11-13 17:39:21.274807246 -0500
+++ gcc/testsuite/gcc.target/powerpc/paddi-1.c  2019-11-13 17:39:21.067805382 
-0500
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PADDI is generated to add a large constant.  */
+unsigned long
+add (unsigned long a)
+{
+  return a + 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpaddi\M} } } */
--- /tmp/T53ePo_paddi-2.c   2019-11-13 17:39:21.283807328 -0500
+++ gcc/testsuite/gcc.target/powerpc/paddi-2.c  2019-11-13 17:39:21.069805400 
-0500
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant.  */
+unsigned long
+large (void)
+{
+  return 0x12345678UL;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */
--- /tmp/gyt7OQ_paddi-3.c   2019-11-13 17:39:21.291807400 -0500
+++ gcc/testsuite/gcc.target/powerpc/paddi-3.c  2019-11-13 17:39:21.071805418 
-0500
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_prefixed_addr_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=future" } */
+
+/* Test that PLI (PADDI) is generated to load a large constant for SImode.  */
+void
+large_si (unsigned int *p)
+{
+  *p = 0x12345U;
+}
+
+/* { dg-final { scan-assembler {\mpli\M} } } */


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


PowerPC V8 testsuite patches

2019-11-14 Thread Michael Meissner
After the V7 patches have been processed, these patches add various tests for
new features with -mcpu=future.  If we don't apply the last patch of V7 (that
sets the defaults for 64-bit Linux to enable both prefixed and PC-relative
addressing when -mcpu=future is used), some of these patches may need to be
modified to have the appropriate -mpcrel, etc. switches.  I would prefer to get
the default changed, but if there are reasons why we can't do that patch, I
would prefer to get these patches in ASAP.

There are 6 patches in this set.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Support C2x [[deprecated]] attribute

2019-11-14 Thread Joseph Myers
This patch adds support for the C2x [[deprecated]] attribute.  All the
actual logic for generating warnings can be identical to the GNU
__attribute__ ((deprecated)), as can the attribute handler, so this is
just a matter of wiring things up appropriately and adding the checks
specified in the standard.  Unlike for C++, this patch gives
"deprecated" an entry in a table of standard attributes rather than
remapping it internally to the GNU attribute, as that seems a cleaner
approach to me.

Specifically, the only form of arguments to the attribute permitted in
the standard is (string-literal); empty parentheses are not permitted
in the case of no arguments, and a string literal (which includes
concatenated adjacent string literals, because concatenation is an
earlier phase of translation) cannot have further redundant
parentheses around it.  For the case of empty parentheses, this patch
makes the C parser disallow them for all known attributes using the
[[]] syntax, as done for C++.  For string literals (where the C++
front end is missing the check to avoid redundant parentheses, 92521
filed for that issue), a special case is inserted in the C parser.

A known issue that I think can be addressed later as a bug fix is that
the warnings for the attribute being ignored in certain cases
(attribute declarations, statements, most uses on types) ought to be
pedwarns, as those usages are constraint violations.

Bad handling of wide string literals with this attribute is also a
pre-existing bug (91182 - although that's filed as a C++ bug, the code
in question is language-independent, in tree.c).

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  Applied to 
mainline.

gcc/c:
2019-11-15  Joseph Myers  

* c-decl.c (std_attribute_table): New.
(c_init_decl_processing): Register attributes from
std_attribute_table.
* c-parser.c (c_parser_attribute_arguments): Add arguments
require_string and allow_empty_args.  All callers changed.
(c_parser_std_attribute): Set require_string argument for
"deprecated" attribute.

gcc/c-family:
2019-11-15  Joseph Myers  

* c-attribs.c (handle_deprecated_attribute): Remove static.
* c-common.h (handle_deprecated_attribute): Declare.

gcc/testsuite:
2019-11-15  Joseph Myers  

* gcc.dg/c2x-attr-deprecated-1.c, gcc.dg/c2x-attr-deprecated-2.c,
gcc.dg/c2x-attr-deprecated-3.c: New tests.

Index: gcc/c/c-decl.c
===
--- gcc/c/c-decl.c  (revision 278265)
+++ gcc/c/c-decl.c  (working copy)
@@ -4336,6 +4336,16 @@ lookup_name_fuzzy (tree name, enum lookup_name_fuz
 }
 
 
+/* Table of supported standard (C2x) attributes.  */
+const struct attribute_spec std_attribute_table[] =
+{
+  /* { name, min_len, max_len, decl_req, type_req, fn_type_req,
+   affects_type_identity, handler, exclude } */
+  { "deprecated", 0, 1, false, false, false, false,
+handle_deprecated_attribute, NULL },
+  { NULL, 0, 0, false, false, false, false, NULL, NULL }
+};
+
 /* Create the predefined scalar types of C,
and some nodes representing standard constants (0, 1, (void *) 0).
Initialize the global scope.
@@ -4349,6 +4359,8 @@ c_init_decl_processing (void)
   /* Initialize reserved words for parser.  */
   c_parse_init ();
 
+  register_scoped_attributes (std_attribute_table, NULL);
+
   current_function_decl = NULL_TREE;
 
   gcc_obstack_init (&parser_obstack);
Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c(revision 278265)
+++ gcc/c/c-parser.c(working copy)
@@ -4478,7 +4478,8 @@ c_parser_gnu_attribute_any_word (c_parser *parser)
allow identifiers declared as types to start the arguments?  */
 
 static tree
-c_parser_attribute_arguments (c_parser *parser, bool takes_identifier)
+c_parser_attribute_arguments (c_parser *parser, bool takes_identifier,
+ bool require_string, bool allow_empty_args)
 {
   vec *expr_list;
   tree attr_args;
@@ -4518,7 +4519,21 @@ static tree
   else
 {
   if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN))
-   attr_args = NULL_TREE;
+   {
+ if (!allow_empty_args)
+   error_at (c_parser_peek_token (parser)->location,
+ "parentheses must be omitted if "
+ "attribute argument list is empty");
+ attr_args = NULL_TREE;
+   }
+  else if (require_string)
+   {
+ /* The only valid argument for this attribute is a string
+literal.  Handle this specially here to avoid accepting
+string literals with excess parentheses.  */
+ tree string = c_parser_string_literal (parser, false, true).value;
+ attr_args = build_tree_list (NULL_TREE, string);
+   }
   else
{
  expr_list = c_parser_expr_list (parser, false, true,
@@ -4601,7 +4616,8 @@ c_parser_gnu_attribute (c_pa

Re: Use known value ranges while evaluating ipa predicates

2019-11-14 Thread Jakub Jelinek
On Tue, Nov 12, 2019 at 02:03:32PM +0100, Jan Hubicka wrote:
> this implements use of value ranges in ipa-predicates so inliner know
> when some tests are going to be removed (especially NULL pointer
> checks).
> 
> Bootstrapped/regtested x86_64-linux. Martin, I would apprechiate if you
> look on the patch. 

>   * gcc.dg/ipa/inline-9.c: New testcase.

The testcase is now UNRESOLVED on both x86_64-linux and i686-linux:

> --- testsuite/gcc.dg/ipa/inline-9.c   (nonexistent)
> +++ testsuite/gcc.dg/ipa/inline-9.c   (working copy)
> @@ -0,0 +1,21 @@
> +/* { dg-options "-Os -fdump-ipa-inline"  } */
> +int test(int a)
> +{
> + if (a>100)
> +   {
> + foo();
> + foo();
> + foo();
> + foo();
> + foo();
> + foo();
> + foo();
> + foo();
> +   }
> +}
> +main()
> +{
> +  for (int i=0;i<100;i++)
> +test(i);
> +}
> +/* { dg-final { scan-tree-dump "Inlined 1 calls" "inline" } } */

PASS: gcc.dg/ipa/inline-9.c (test for excess errors)
gcc.dg/ipa/inline-9.c: dump file does not exist
UNRESOLVED: gcc.dg/ipa/inline-9.c scan-tree-dump inline "Inlined 1 calls"

but fixing the obvious bug in there, s/scan-tree-dump/scan-ipa-dump/
doesn't help, nothing is really inlined.

Jakub



[PATCH], V7, #7 of 7, Turn on -mpcrel for Linux 64-bit, but not for other targets

2019-11-14 Thread Michael Meissner
This patch enables prefixed addressing and PC-relative addressing under
-mcpu=future when the OS target support indicates that the OS supports those
addressing modes.  At the moment, 64-bit Linux is the only system that enables
both prefixed addressing and PC-relative addressing.

I have built bootstrap compilers with this patch and there were no regressions
in the testsuite.  In addition, during development, I set each of the two
options, built a compiler with it, and observed the expected default behavior
for whether prefixed addressing and PC-relative support are enabled.  Can I
check this into the FSF trunk?

2019-11-14  Michael Meissner  

* config/rs6000/linux64.h (TARGET_PREFIXED_ADDR_DEFAULT): Enable
prefixed addressing by default.
(TARGET_PCREL_DEFAULT): Enable pc-relative addressing by default.
* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Only
enable -mprefixed-addr and -mpcrel if the OS tm.h says to enable
it.
(ADDRESSING_FUTURE_MASKS): New mask macro.
(OTHER_FUTURE_MASKS): Use ADDRESSING_FUTURE_MASKS.
* config/rs6000/rs6000.c (TARGET_PREFIXED_ADDR_DEFAULT): Do not
enable -mprefixed-addr unless the OS tm.h says to.
(TARGET_PCREL_DEFAULT): Do not enable -mpcrel unless the OS tm.h
says to.
(rs6000_option_override_internal): Do not enable -mprefixed-addr
or -mpcrel unless the OS tm.h says to enable it.  Add more checks
for -mcpu=future.

Index: gcc/config/rs6000/linux64.h
===
--- gcc/config/rs6000/linux64.h (revision 278173)
+++ gcc/config/rs6000/linux64.h (working copy)
@@ -640,3 +640,11 @@ extern int dot_symbols;
enabling the __float128 keyword.  */
 #undef TARGET_FLOAT128_ENABLE_TYPE
 #define TARGET_FLOAT128_ENABLE_TYPE 1
+
+/* Enable support for pc-relative and numeric prefixed addressing on the
+   'future' system.  */
+#undef  TARGET_PREFIXED_ADDR_DEFAULT
+#define TARGET_PREFIXED_ADDR_DEFAULT   1
+
+#undef  TARGET_PCREL_DEFAULT
+#define TARGET_PCREL_DEFAULT   1
Index: gcc/config/rs6000/rs6000-cpus.def
===
--- gcc/config/rs6000/rs6000-cpus.def   (revision 278173)
+++ gcc/config/rs6000/rs6000-cpus.def   (working copy)
@@ -75,15 +75,21 @@
 | OPTION_MASK_P8_VECTOR\
 | OPTION_MASK_P9_VECTOR)
 
-/* Support for a future processor's features.  Do not enable -mpcrel until it
-   is fully functional.  */
+/* Support for a future processor's features.  The prefixed and pc-relative
+   addressing bits are not added here.  Instead, rs6000.c adds them if the OS
+   tm.h says that it supports the addressing modes.  */
#define ISA_FUTURE_MASKS_SERVER(ISA_3_0_MASKS_SERVER   \
-| OPTION_MASK_FUTURE   \
+| OPTION_MASK_FUTURE)
+
+/* Addressing related flags on a future processor.  These flags are broken out
+   because not all targets will support either pc-relative addressing, or even
+   prefixed addressing, and we want to clear all of the addressing bits
+   on targets that cannot support prefixed/pcrel addressing.  */
+#define ADDRESSING_FUTURE_MASKS(OPTION_MASK_PCREL  \
 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-future.  */
-#define OTHER_FUTURE_MASKS (OPTION_MASK_PCREL  \
-| OPTION_MASK_PREFIXED_ADDR)
+#define OTHER_FUTURE_MASKS ADDRESSING_FUTURE_MASKS
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS  (OPTION_MASK_FLOAT128_HW\
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 278181)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -98,6 +98,16 @@
 #endif
 #endif
 
+/* Set up the defaults for whether prefixed addressing is used, and if it is
+   used, whether we want to turn on pc-relative support by default.  */
+#ifndef TARGET_PREFIXED_ADDR_DEFAULT
+#define TARGET_PREFIXED_ADDR_DEFAULT   0
+#endif
+
+#ifndef TARGET_PCREL_DEFAULT
+#define TARGET_PCREL_DEFAULT   0
+#endif
+
 /* Support targetm.vectorize.builtin_mask_for_load.  */
 GTY(()) tree altivec_builtin_mask_for_load;
 
@@ -2535,6 +2545,14 @@ rs6000_debug_reg_global (void)
   if (TARGET_DIRECT_MOVE_128)
 fprintf (stderr, DEBUG_FMT_D, "VSX easy 64-bit mfvsrld element",
 (int)VECTOR_ELEMENT_MFVSRLD_64BIT);
+
+  if (TARGET_FUTURE)
+{
+  fprintf (stderr, DEBUG_FMT_D, "TARGET_PREFIXED_ADDR_DEFAULT",
+  TARGET_PREFIXED_ADDR_DEFAULT);
+  fprintf (stderr, DEBUG_FMT_D, "TAR

[PATCH], V7, #6 of 7, Fix issues with vector extract and prefixed instructions

2019-11-14 Thread Michael Meissner
This patch fixes two issues with vector extracts and prefixed instructions.

The first occurs when you use a vector extract on a vector that is located in
memory and access the vector through a PC-relative address with a variable
index, e.g.:

#include 

static vector int vi;

int get (int n)
{
  return vec_extract (vi, n);
}

In this case, the current code re-uses the temporary that holds the offset of
the element to load up the address of the vector, losing the offset.  This
code prevents the combiner from combining the load of the vector from memory
with the vector extract if the vector is accessed via a PC-relative address.
Instead, the vector is loaded up into a register, and the variable extract from
a register is done.

I needed to add a new constraint (em) in addition to new predicate functions.
I discovered that with the predicate function alone, the register allocator
would re-create the address.  The constraint prevents this combination.

I also modified the vector extract code to generate a single PC-relative load
if the vector has a PC-relative address and the offset is constant.

I have built a bootstrap compiler with this change, and there were no
regressions in the test suite.  Can I check this into the FSF trunk?

2019-11-14  Michael Meissner  

* config/rs6000/constraints.md (em constraint): New constraint for
non-prefixed memory.
* config/rs6000/predicates.md (non_prefixed_memory): New
predicate.
(reg_or_non_prefixed_memory): New predicate.
* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add support
for optimizing extracting a constant vector element from a vector
that uses a prefixed address.  If the element number is variable
and the address uses a prefixed address, abort.
* config/rs6000/vsx.md (vsx_extract__var, VSX_D iterator):
Do not allow combining prefixed memory with a variable vector
extract.
(vsx_extract_v4sf_var): Do not allow combining prefixed memory
with a variable vector extract.
(vsx_extract__var, VSX_EXTRACT_I iterator): Do not allow
combining prefixed memory with a variable vector extract.
(vsx_extract__mode_var): Do not allow combining
prefixed memory with a variable vector extract.
* doc/md.texi (PowerPC constraints): Document the em constraint.

Index: gcc/config/rs6000/constraints.md
===
--- gcc/config/rs6000/constraints.md(revision 278173)
+++ gcc/config/rs6000/constraints.md(working copy)
@@ -202,6 +202,11 @@ (define_constraint "H"
 
 ;; Memory constraints
 
+(define_memory_constraint "em"
+  "A memory operand that does not contain a prefixed address."
+  (and (match_code "mem")
+   (match_test "non_prefixed_memory (op, mode)")))
+
 (define_memory_constraint "es"
   "A ``stable'' memory operand; that is, one which does not include any
 automodification of the base register.  Unlike @samp{m}, this constraint
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 278177)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -1836,3 +1836,24 @@ (define_predicate "prefixed_memory"
 {
   return address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
 })
+
+;; Return true if the operand is a memory address that does not use a prefixed
+;; address.
+(define_predicate "non_prefixed_memory"
+  (match_code "mem")
+{
+  /* If the operand is not a valid memory operand even if it is not prefixed,
+ do not return true.  */
+  if (!memory_operand (op, mode))
+return false;
+
+  return !address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
+})
+
+;; Return true if the operand is either a register or it is a non-prefixed
+;; memory operand.
+(define_predicate "reg_or_non_prefixed_memory"
+  (match_code "reg,subreg,mem")
+{
+  return gpc_reg_operand (op, mode) || non_prefixed_memory (op, mode);
+})
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 278178)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -6734,6 +6734,7 @@ rs6000_adjust_vec_address (rtx scalar_re
   rtx element_offset;
   rtx new_addr;
   bool valid_addr_p;
+  bool pcrel_p = pcrel_local_address (addr, Pmode);
 
   /* Vector addresses should not have PRE_INC, PRE_DEC, or PRE_MODIFY.  */
   gcc_assert (GET_RTX_CLASS (GET_CODE (addr)) != RTX_AUTOINC);
@@ -6771,6 +6772,38 @@ rs6000_adjust_vec_address (rtx scalar_re
   else if (REG_P (addr) || SUBREG_P (addr))
 new_addr = gen_rtx_PLUS (Pmode, addr, element_offset);
 
+  /* Optimize PC-relative addresses with a constant offset.  */
+  else if (pcrel_p && CONST_INT_P (element_offset))
+{
+  rtx addr2 = addr;
+  HOST_WIDE_INT offset = INTVAL (element_offset);

[PATCH], V7, #5 of 7, Add more effective targets for the 'future' system to target-supports.

2019-11-14 Thread Michael Meissner
This patch adds more effective targets to the target-supports.exp in the
testsuite.  I broke the tests down by whether prefixed instructions are
allowed, whether the target is generating 64-bit code with prefixed
instructions, and whether -mpcrel support is available.  I also enabled
'future' testing on the actual hardware (or simulator).

The tests in V8 will use some of these capabilities.

I have run the test suite on a little endian power8 system with no degradation.
Can I check this into the FSF trunk?

2019-11-14   Michael Meissner  

* lib/target-supports.exp
(check_effective_target_powerpc_future_ok): Do not require 64-bit
or Linux support before doing the test.  Use a 32-bit constant in
PLI.
(check_effective_target_powerpc_prefixed_addr_ok): New effective
target test to see if prefixed memory instructions are supported.
(check_effective_target_powerpc_pcrel_ok): New effective target
test to test whether PC-relative addressing is supported.
(is-effective-target): Add test for the PowerPC 'future' hardware
support.

Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   (revision 278173)
+++ gcc/testsuite/lib/target-supports.exp   (working copy)
@@ -5345,16 +5345,14 @@ proc check_effective_target_powerpc_p9mo
 }
 }
 
-# Return 1 if this is a PowerPC target supporting -mfuture.
-# Limit this to 64-bit linux systems for now until other
-# targets support FUTURE.
+# Return 1 if this is a PowerPC target supporting -mcpu=future.
 
 proc check_effective_target_powerpc_future_ok { } {
-if { ([istarget powerpc64*-*-linux*]) } {
+if { ([istarget powerpc*-*-*]) } {
return [check_no_compiler_messages powerpc_future_ok object {
int main (void) {
long e;
-   asm ("pli %0,%1" : "=r" (e) : "n" (0x12345));
+   asm ("pli %0,%1" : "=r" (e) : "n" (0x1234));
return e;
}
} "-mfuture"]
@@ -5363,6 +5361,46 @@ proc check_effective_target_powerpc_futu
 }
 }
 
+# Return 1 if this is a PowerPC target supporting -mcpu=future.  The compiler
+# must support large numeric prefixed addresses by default when -mfuture is
+# used.  We test loading up a large constant to verify that the full 34-bit
+# offset for prefixed instructions is supported and we check for a prefixed
+# load as well.
+
+proc check_effective_target_powerpc_prefixed_addr_ok { } {
+if { ([istarget powerpc*-*-*]) } {
+   return [check_no_compiler_messages powerpc_prefixed_addr_ok object {
+   int main (void) {
+   extern long l[];
+   long e, e2;
+   asm ("pli %0,%1" : "=r" (e) : "n" (0x12345678));
+   asm ("pld %0,0x12345678(%1)" : "=r" (e2) : "r" (& l[0]));
+   return e - e2;
+   }
+   } "-mfuture"]
+} else {
+   return 0
+}
+}
+
+# Return 1 if this is a PowerPC target supporting -mfuture.  The compiler must
+# support PC-relative addressing when -mcpu=future is used to pass this test.
+
+proc check_effective_target_powerpc_pcrel_ok { } {
+if { ([istarget powerpc*-*-*]) } {
+   return [check_no_compiler_messages powerpc_pcrel_ok object {
+ int main (void) {
+ static int s __attribute__((__used__));
+ int e;
+ asm ("plwa %0,s@pcrel(0),1" : "=r" (e));
+ return e;
+ }
+ } "-mfuture"]
+  } else {
+ return 0
+  }
+}
+
 # Return 1 if this is a PowerPC target supporting -mfloat128 via either
 # software emulation on power7/power8 systems or hardware support on power9.
 
@@ -7261,6 +7299,7 @@ proc is-effective-target { arg } {
  "named_sections" { set selected [check_named_sections_available] }
  "gc_sections"{ set selected [check_gc_sections_available] }
  "cxa_atexit" { set selected [check_cxa_atexit_available] }
+ "powerpc_future_hw" { set selected [check_powerpc_future_hw_available] }
  default  { error "unknown effective target keyword `$arg'" }
}
 }

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH] V7, #4 of 7, Add explicit (0),1 to @pcrel references

2019-11-14 Thread Michael Meissner
In some of my previous work, I had made a mistake, forgetting that the PADDI
instruction did not allow adding a PC-relative reference to a register (you can
either load up a PC-relative address without adding a register, or you can add
a register to a constant).  The assembler allowed the instruction, but it
didn't do what I expected.

This patch adds an explicit (0),1 to PC-relative references.  This way if you
try to add the PC-relative offset to a register, it will get a syntax error.

I have built compilers with this patch installed on a little endian power8
Linux system, and there were no regressions.  Can I check this into the trunk?

2019-11-14  Michael Meissner  

* config/rs6000/rs6000.c (print_operand_address): Add (0),1 to
@pcrel to catch errant usage.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 278175)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -13241,7 +13241,10 @@ print_operand_address (FILE *file, rtx x
   if (SYMBOL_REF_P (x) && !SYMBOL_REF_LOCAL_P (x))
fprintf (file, "@got");
 
-  fprintf (file, "@pcrel");
+  /* Specifically add (0),1 to catch uses where a @pcrel was added to an
+address with a base register, since the hardware does not support
+adding a base register to a PC-relative address.  */
+  fprintf (file, "@pcrel(0),1");
 }
   else if (SYMBOL_REF_P (x) || GET_CODE (x) == CONST
   || GET_CODE (x) == LABEL_REF)

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH], V7, #3 of 7, Use PADDI for 34-bit immediate adds

2019-11-14 Thread Michael Meissner
This patch generates PADDI to add 34-bit immediate constants on the 'future'
system, and prevents such adds from being split.

I have built and bootstrapped compilers with the patch, and there were no
regressions.  Can I check this into the trunk?

2019-11-14  Michael Meissner  

* config/rs6000/predicates.md (add_operand): Add support for
PADDI.
* config/rs6000/rs6000.md (add<mode>3): Add support for PADDI.

Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 278173)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -839,7 +839,8 @@ (define_special_predicate "indexed_addre
 (define_predicate "add_operand"
   (if_then_else (match_code "const_int")
 (match_test "satisfies_constraint_I (op)
-|| satisfies_constraint_L (op)")
+|| satisfies_constraint_L (op)
+|| satisfies_constraint_eI (op)")
 (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 278176)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -1761,15 +1761,17 @@ (define_expand "add<mode>3"
 })
 
(define_insn "*add<mode>3"
-  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
-   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
- (match_operand:GPR 2 "add_operand" "r,I,L")))]
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
+   (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")
+ (match_operand:GPR 2 "add_operand" "r,I,L,eI")))]
   ""
   "@
add %0,%1,%2
addi %0,%1,%2
-   addis %0,%1,%v2"
-  [(set_attr "type" "add")])
+   addis %0,%1,%v2
+   addi %0,%1,%2"
+  [(set_attr "type" "add")
+   (set_attr "isa" "*,*,*,fut")])
 
 (define_insn "*addsi3_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b")

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH], V7, #2 of 7, Use PLI to load up 32-bit SImode constants

2019-11-14 Thread Michael Meissner
This patch generates the PLI (PADDI) instruction to load up 32-bit SImode
constants on the future system.  It adds an alternative to movsi, and prevents
the movsi load immediate from being split.

I have built compilers with this patch which bootstrapped fine, and there were
no regressions in the test suite.  Can I check this into the trunk?

2019-11-14  Michael Meissner  

* config/rs6000/rs6000.md (movsi_internal1): Add support to load
up 32-bit SImode integer constants with PADDI.
(movsi integer constant splitter): Do not split constant if PADDI
can load it up directly.

Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 278175)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -6891,22 +6891,22 @@ (define_split
 
 ;; MR   LA   LWZ  LFIWZX   LXSIWZX
 ;; STW  STFIWX   STXSIWX  LI   LIS
-;; #XXLORXXSPLTIB 0   XXSPLTIB -1  VSPLTISW
-;; XXLXOR 0 XXLORC -1P9 const MTVSRWZ  MFVSRWZ
-;; MF%1 MT%0 NOP
+;; PLI  #XXLORXXSPLTIB 0   XXSPLTIB -1
+;; VSPLTISW XXLXOR 0 XXLORC -1P9 const MTVSRWZ
+;; MFVSRWZ  MF%1 MT%0 NOP
 (define_insn "*movsi_internal1"
   [(set (match_operand:SI 0 "nonimmediate_operand"
"=r, r,   r,   d,   v,
 m,  Z,   Z,   r,   r,
-r,  wa,  wa,  wa,  v,
-wa, v,   v,   wa,  r,
-r,  *h,  *h")
+r,  r,   wa,  wa,  wa,
+v,  wa,  v,   v,   wa,
+r,  r,   *h,  *h")
(match_operand:SI 1 "input_operand"
"r,  U,   m,   Z,   Z,
 r,  d,   v,   I,   L,
-n,  wa,  O,   wM,  wB,
-O,  wM,  wS,  r,   wa,
-*h, r,   0"))]
+eI, n,   wa,  O,   wM,
+wB, O,   wM,  wS,  r,
+wa, *h,  r,   0"))]
   "gpc_reg_operand (operands[0], SImode)
|| gpc_reg_operand (operands[1], SImode)"
   "@
@@ -6920,6 +6920,7 @@ (define_insn "*movsi_internal1"
stxsiwx %x1,%y0
li %0,%1
lis %0,%v1
+   li %0,%1
#
xxlor %x0,%x1,%x1
xxspltib %x0,0
@@ -6936,21 +6937,21 @@ (define_insn "*movsi_internal1"
   [(set_attr "type"
"*,  *,   load,fpload,  fpload,
 store,  fpstore, fpstore, *,   *,
-*,  veclogical,  vecsimple,   vecsimple,   vecsimple,
-veclogical, veclogical,  vecsimple,   mffgpr,  mftgpr,
-*,  *,   *")
+*,  *,   veclogical,  vecsimple,   vecsimple,
+vecsimple,  veclogical,  veclogical,  vecsimple,   mffgpr,
+mftgpr, *,   *,   *")
(set_attr "length"
"*,  *,   *,   *,   *,
 *,  *,   *,   *,   *,
-8,  *,   *,   *,   *,
-*,  *,   8,   *,   *,
-*,  *,   *")
+*,  8,   *,   *,   *,
+*,  *,   *,   8,   *,
+*,  *,   *,   *")
(set_attr "isa"
"*,  *,   *,   p8v, p8v,
 *,  p8v, p8v, *,   *,
-*,  p8v, p9v, p9v, p8v,
-p9v,p8v, p9v, p8v, p8v,
-*,  *,   *")])
+fut,*,   p8v, p9v, p9v,
+p8v,p9v, p8v, p9v, p8v,
+p8v,*,   *,   *")])
 
 ;; Like movsi, but adjust a SF value to be used in a SI context, i.e.
 ;; (set (reg:SI ...) (subreg:SI (reg:SF ...) 0))
@@ -7095,14 +7096,15 @@ (define_insn "*movsi_from_df"
   "xscvdpsp %x0,%x1"
   [(set_attr "type" "fp")])
 
-;; Split a load of a large constant into the appropriate two-insn
-;; sequence.
+;; Split a load of a large constant into the appropriate

[PATCH] V7, #1 of 7, Use PLI to load up 34-bit DImode constants

2019-11-14 Thread Michael Meissner
This patch adds an alternative to movdi to allow using PLI (PADDI) to load up
34-bit constants on the 'future' machine.

I have built compilers with this patch, and there were no regressions in the
test suite.  Can I check this into the trunk?

2019-11-14  Michael Meissner  

* config/rs6000/rs6000.c (num_insns_constant_gpr): Add support for
PADDI to load up and/or add 34-bit integer constants.
(rs6000_rtx_costs): Treat constants loaded up with PADDI with the
same cost as normal 16-bit constants.
* config/rs6000/rs6000.md (movdi_internal64): Add support to load
up 34-bit integer constants with PADDI.
(movdi integer constant splitter): Add comment about PADDI.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 278173)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -5552,7 +5552,7 @@ static int
 num_insns_constant_gpr (HOST_WIDE_INT value)
 {
   /* signed constant loadable with addi */
-  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x1)
+  if (SIGNED_16BIT_OFFSET_P (value))
 return 1;
 
   /* constant loadable with addis */
@@ -5560,6 +5560,10 @@ num_insns_constant_gpr (HOST_WIDE_INT va
   && (value >> 31 == -1 || value >> 31 == 0))
 return 1;
 
+  /* PADDI can support up to 34 bit signed integers.  */
+  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
+return 1;
+
   else if (TARGET_POWERPC64)
 {
   HOST_WIDE_INT low  = ((value & 0x) ^ 0x8000) - 0x8000;
@@ -20679,7 +20683,8 @@ rs6000_rtx_costs (rtx x, machine_mode mo
|| outer_code == PLUS
|| outer_code == MINUS)
   && (satisfies_constraint_I (x)
-  || satisfies_constraint_L (x)))
+  || satisfies_constraint_L (x)
+  || satisfies_constraint_eI (x)))
  || (outer_code == AND
  && (satisfies_constraint_K (x)
  || (mode == SImode
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 278173)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -8808,24 +8808,24 @@ (define_split
   DONE;
 })
 
-;;  GPR store  GPR load   GPR move   GPR li GPR lis GPR #
-;;  FPR store  FPR load   FPR move   AVX store  AVX store   AVX load
-;;  AVX load   VSX move   P9 0   P9 -1  AVX 0/-1VSX 0
-;;  VSX -1 P9 const   AVX const  From SPR   To SPR  SPR<->SPR
-;;  VSX->GPR   GPR->VSX
+;;  GPR store  GPR load   GPR move   GPR li GPR lis GPR pli
+;;  GPR #  FPR store  FPR load   FPR move   AVX store   AVX store
+;;  AVX load   AVX load   VSX move   P9 0   P9 -1   AVX 0/-1
+;;  VSX 0  VSX -1 P9 const   AVX const  From SPRTo SPR
+;;  SPR<->SPR  VSX->GPR   GPR->VSX
 (define_insn "*movdi_internal64"
   [(set (match_operand:DI 0 "nonimmediate_operand"
"=YZ,   r, r, r, r,  r,
-m, ^d,^d,wY,Z,  $v,
-$v,^wa,   wa,wa,v,  wa,
-wa,v, v, r, *h, *h,
-?r,?wa")
+r, m, ^d,^d,wY, Z,
+$v,$v,^wa,   wa,wa, v,
+wa,wa,v, v, r,  *h,
+*h,?r,?wa")
(match_operand:DI 1 "input_operand"
-   "r, YZ,r, I, L,  nF,
-^d,m, ^d,^v,$v, wY,
-Z, ^wa,   Oj,wM,OjwM,   Oj,
-wM,wS,wB,*h,r,  0,
-wa,r"))]
+   "r, YZ,r, I, L,  eI,
+nF,^d,m, ^d,^v, $v,
+wY,Z, ^wa,   Oj,wM, OjwM,
+Oj,wM,wS,wB,*h, r,
+0, wa,r"))]
   "TARGET_POWERPC64
&& (gpc_reg_operand (operands[0], DImode)
|| gpc_reg_operand (operands[1], DImode))"
@@ -8835,6 +8835,7 @@ (define_insn "*movdi_internal64"
mr %0,%1
li %0,%1
lis %0,%v1
+   li %0,%1
#
stfd%U0%X0 %1,%0
lfd%U1%X1 %0,%1
@@ -8858,26 +8859,28 @@ (define_insn "*movdi_internal64"
mtvsrd %x0,%1"
   [(set_attr "type"
"store,  load,  *, *, *, *,
-fpstore,fpload, fpsimple,  fpstore,   fpstore,   fpload,
-  

Re: [RFC C++ PATCH] __builtin_source_location ()

2019-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2019 at 10:15:21PM +, Jonathan Wakely wrote:
> > namespace std {
> >  struct source_location {
> >struct __impl {
> 
> Will this work if the library type is actually in an inline namespace,
> such as std::__8::source_location::__impl (as for
> --enable-symvers=gnu-versioned-namespace) or
> std::__v1::source_location::__impl (as it probably would be in
> libc++).
> 
> If I'm reading the patch correctly, it would work fine, because
> qualified lookup for std::source_location would find that name even if
> it's really in some inline namespace.

I'd say so, but the unfinished testsuite coverage would need to cover it of
course.

> >  const char *__file;
> >  const char *__function;
> >  unsigned int __line, __column;
> >};
> >const void *__ptr;
> 
> If the magic type the compiler generates is declared in the header,
> then this member might as well be 'const __impl*'.

Yes, with the static_cast on the __builtin_source_location ()
result sure.  I can't easily make it return const std::source_location::__impl*
though, because the initialization of the builtins happens early, before
<source_location> is parsed.
And as it is a nested class, I think I can't predeclare it in the compiler.
If it would be std::__source_location_impl instead of
std::source_location::__impl, perhaps I could pretend there is
namespace std { struct __source_location_impl; }
but I bet that wouldn't work well with the inline namespaces.
So, is const void * return from the builtin ok?

Jakub



[committed] Change range_operator:fold_range to return a boolean indicating success.

2019-11-14 Thread Andrew MacLeod

One final tweak to the range-ops API.

Most fold_range calls were returning true, which was a trigger for the 
original change to return by value.   When I changed it back to a return 
by reference, I should also have added the boolean result back.  Now all 
3 routines are similar...


    virtual bool fold_range (value_range &r, tree type, const value_range &lh, const value_range &rh) const;
    virtual bool op1_range (value_range &r, tree type, const value_range &lhs, const value_range &op2) const;
    virtual bool op2_range (value_range &r, tree type, const value_range &lhs, const value_range &op1) const;


And again, this is pretty mechanical.  bootstraps and passes all tests.

Checked in as revision 278266

Andrew

2019-11-14  Andrew MacLeod  

	* range-op.h (range_operator::fold_range): Return a bool.
	* range-op.cc (range_operator::wi_fold): Assert supported type.
	(range_operator::fold_range): Assert supported type and return true.
	(operator_equal::fold_range): Return true.
	(operator_not_equal::fold_range): Same.
	(operator_lt::fold_range): Same.
	(operator_le::fold_range): Same.
	(operator_gt::fold_range): Same.
	(operator_ge::fold_range): Same.
	(operator_plus::op1_range): Adjust call to fold_range.
	(operator_plus::op2_range): Same.
	(operator_minus::op1_range): Same.
	(operator_minus::op2_range): Same.
	(operator_exact_divide::op1_range): Same.
	(operator_lshift::fold_range): Return true and adjust fold_range call.
	(operator_rshift::fold_range): Same.
	(operator_cast::fold_range): Return true.
	(operator_logical_and::fold_range): Same.
	(operator_logical_or::fold_range): Same.
	(operator_logical_not::fold_range): Same.
	(operator_bitwise_not::fold_range): Adjust call to fold_range.
	(operator_bitwise_not::op1_range): Same.
	(operator_cst::fold_range): Return true.
	(operator_identity::fold_range): Return true.
	(operator_negate::fold_range): Return true and adjust fold_range call.
	(operator_addr_expr::fold_range): Return true.
	(operator_addr_expr::op1_range): Adjust call to fold_range.
	(range_cast): Same.
	* tree-vrp.c (range_fold_binary_symbolics_p): Adjust call to fold_range.
	(range_fold_unary_symbolics_p): Same.

Index: range-op.h
===
*** range-op.h	(revision 278265)
--- range-op.h	(working copy)
*** class range_operator
*** 50,56 
  {
  public:
// Perform an operation between 2 ranges and return it.
!   virtual void fold_range (value_range &r, tree type,
  			   const value_range &lh,
  			   const value_range &rh) const;
  
--- 50,56 
  {
  public:
// Perform an operation between 2 ranges and return it.
!   virtual bool fold_range (value_range &r, tree type,
  			   const value_range &lh,
  			   const value_range &rh) const;
  
*** public:
*** 73,79 
  			  const value_range &op1) const;
  
  protected:
!   // Perform an operation between 2 sub-ranges and return it.
virtual void wi_fold (value_range &r, tree type,
  		const wide_int &lh_lb,
  		const wide_int &lh_ub,
--- 73,79 
  			  const value_range &op1) const;
  
  protected:
!   // Perform an integral operation between 2 sub-ranges and return it.
virtual void wi_fold (value_range &r, tree type,
  		const wide_int &lh_lb,
  		const wide_int &lh_ub,
Index: range-op.cc
===
*** range-op.cc	(revision 278265)
--- range-op.cc	(working copy)
*** range_operator::wi_fold (value_range &r,
*** 131,149 
  			 const wide_int &rh_lb ATTRIBUTE_UNUSED,
  			 const wide_int &rh_ub ATTRIBUTE_UNUSED) const
  {
r = value_range (type);
  }
  
  // The default for fold is to break all ranges into sub-ranges and
  // invoke the wi_fold method on each sub-range pair.
  
! void
  range_operator::fold_range (value_range &r, tree type,
  			const value_range &lh,
  			const value_range &rh) const
  {
if (empty_range_check (r, lh, rh))
! return;
  
value_range tmp;
r.set_undefined ();
--- 131,151 
  			 const wide_int &rh_lb ATTRIBUTE_UNUSED,
  			 const wide_int &rh_ub ATTRIBUTE_UNUSED) const
  {
+   gcc_checking_assert (value_range::supports_type_p (type));
r = value_range (type);
  }
  
  // The default for fold is to break all ranges into sub-ranges and
  // invoke the wi_fold method on each sub-range pair.
  
! bool
  range_operator::fold_range (value_range &r, tree type,
  			const value_range &lh,
  			const value_range &rh) const
  {
+   gcc_checking_assert (value_range::supports_type_p (type));
if (empty_range_check (r, lh, rh))
! return true;
  
value_range tmp;
r.set_undefined ();
*** range_operator::fold_range (value_range
*** 157,164 
  	wi_fold (tmp, type, lh_lb, lh_ub, rh_lb, rh_ub);
  	r.union_ (tmp);
  	if (r.varying_p ())
! 	  return;
}
  }
  
  // The default for op1_range is to return false.
--- 159,167 
  	wi_fold (

PowerPC V7 future machine patches

2019-11-14 Thread Michael Meissner
This set of patches from V6 got missed, and should go in ASAP.

I am breaking the patches into 3 logical steps:

1) The V7 patches that were part of the V6 patches that have not been
committed.  There are 7 patches in the suite (6 to the compiler, 1 to the
testsuite to enable finer granularity for running the 'future' tests).

2) The V8 patches will be all of the tests that have been modified.  There are
6 testsuite patches.

3) The V9 patches will be the implementation of PCREL_OPT, which has not been
submitted yet.

I acknowledge that there will be a larger patch after these to clean up the
insn length issues, but none of these patches do anything but add a simple insn
that may or may not be prefixed.  I would prefer to get these out of the way
first.

I have built compilers with these patches, and they have bootstrapped on a
little endian power8 system with no issue.  There have been no regressions in
the test suite.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [RFC C++ PATCH] __builtin_source_location ()

2019-11-14 Thread Jonathan Wakely

On 14/11/19 20:34 +0100, Jakub Jelinek wrote:

Hi!

The following WIP patch implements __builtin_source_location (),
which returns const void pointer to a std::source_location::__impl
struct that is required to contain __file, __function, __line and __column
fields, the first two with const char * type, the latter some integral type.

I don't have testcase coverage yet and the hash map to allow sharing of
VAR_DECLs with the same location is commented out both because it
doesn't compile for some reason and because hashing on location_t
is not enough, we probably need to hash on both location_t and fndecl,
as the baz case in the following shows.

Comments?

namespace std {
 struct source_location {
   struct __impl {


Will this work if the library type is actually in an inline namespace,
such as std::__8::source_location::__impl (as for
--enable-symvers=gnu-versioned-namespace) or
std::__v1::source_location::__impl (as it probably would be in
libc++).

If I'm reading the patch correctly, it would work fine, because
qualified lookup for std::source_location would find that name even if
it's really in some inline namespace.



 const char *__file;
 const char *__function;
 unsigned int __line, __column;
   };
   const void *__ptr;


If the magic type the compiler generates is declared in the header,
then this member might as well be 'const __impl*'.


   constexpr source_location () : __ptr (nullptr) {}
   static consteval source_location
   current (const void *__p = __builtin_source_location ()) {
 source_location __ret;
 __ret.__ptr = __p;
 return __ret;
   }
   constexpr const char *file () const {
 return static_cast<const __impl *> (__ptr)->__file;


Not really relevant to your patch, but I'll say it here for the
benefit of others reading these mails ...

On IRC I suggested that the default constructor should set the __ptr
member to null, and these member functions should check for null, e.g.

 if (__ptr) [[likely]]
   return __ptr->__function;
 else
   return "";

The alternative is for the default constructor to call
__builtin_source_location() or refer to some static object in the
runtime library, but both options waste space. Adding a [[likely]]
branch to the accessors wastes no space and should only penalise users
who are misusing source_location by trying to get meaningful values
out of default constructed objects. If that's a bit slower I don't
care.



Re: [PATCH V2] Refactor tree-loop-distribution for thread safety

2019-11-14 Thread Giuliano Belinassi
Previously, the suggested patch removed all tree-loop-distributions.c global
variables moving them into a struct and passing it aroung across the functions.
This patch address this problem by using C++ classes instead, avoiding passing
the struct as argument since it will be accessible from this pointer.

gcc/ChangeLog
2019-11-14  Giuliano Belinassi  

* cfgloop.c (get_loop_body_in_custom_order): New.
* cfgloop.h (get_loop_body_in_custom_order): New prototype.
* tree-loop-distribution.c (class loop_distribution): New.
(bb_top_order_cmp): Remove.
(bb_top_order_cmp_r): New.
(create_rdg_vertices): Move into class loop_distribution.
(stmts_from_loop): Same as above.
(update_for_merge): Same as above.
(partition_merge_into): Same as above.
(get_data_dependence): Same as above.
(data_dep_in_cycle_p): Same as above.
(update_type_for_merge): Same as above.
(build_rdg_partition_for_vertex): Same as above.
(classify_builtin_ldst): Same as above.
(classify_partition): Same as above.
(share_memory_accesses): Same as above.
(rdg_build_partitions): Same as above.
(pg_add_dependence_edges): Same as above.
(build_partition_graph): Same as above.
(merge_dep_scc_partitions): Same as above.
(break_alias_scc_partitions): Same as above.
(finalize_partitions): Same as above.
(distribute_loop): Same as above.
(bb_top_order_init): New method.
(bb_top_order_destroy): New method.
(get_bb_top_order_index_size): New method.
(get_bb_top_order_index_index): New method.
(loop_distribution::execute): New method.
(pass_loop_distribution::execute): Instantiate loop_distribution.
diff --git gcc/cfgloop.c gcc/cfgloop.c
index f18d2b3f24b..db0066ea859 100644
--- gcc/cfgloop.c
+++ gcc/cfgloop.c
@@ -980,6 +980,19 @@ get_loop_body_in_custom_order (const class loop *loop,
   return bbs;
 }
 
+/* Same as above, but use gcc_sort_r instead of qsort.  */
+
+basic_block *
+get_loop_body_in_custom_order (const class loop *loop, void *data,
+			   int (*bb_comparator) (const void *, const void *, void *))
+{
+  basic_block *bbs = get_loop_body (loop);
+
+  gcc_sort_r (bbs, loop->num_nodes, sizeof (basic_block), bb_comparator, data);
+
+  return bbs;
+}
+
 /* Get body of a LOOP in breadth first sort order.  */
 
 basic_block *
diff --git gcc/cfgloop.h gcc/cfgloop.h
index 0b0154ffd7b..6256cc01ff4 100644
--- gcc/cfgloop.h
+++ gcc/cfgloop.h
@@ -376,6 +376,8 @@ extern basic_block *get_loop_body_in_dom_order (const class loop *);
 extern basic_block *get_loop_body_in_bfs_order (const class loop *);
 extern basic_block *get_loop_body_in_custom_order (const class loop *,
 			   int (*) (const void *, const void *));
+extern basic_block *get_loop_body_in_custom_order (const class loop *, void *,
+			   int (*) (const void *, const void *, void *));
 
 extern vec<edge> get_loop_exit_edges (const class loop *);
 extern edge single_exit (const class loop *);
diff --git gcc/tree-loop-distribution.c gcc/tree-loop-distribution.c
index 81784866ad1..6afb3089ec1 100644
--- gcc/tree-loop-distribution.c
+++ gcc/tree-loop-distribution.c
@@ -155,21 +155,10 @@ ddr_hasher::equal (const data_dependence_relation *ddr1,
   return (DDR_A (ddr1) == DDR_A (ddr2) && DDR_B (ddr1) == DDR_B (ddr2));
 }
 
-/* The loop (nest) to be distributed.  */
-static vec<loop_p> loop_nest;
 
-/* Vector of data references in the loop to be distributed.  */
-static vec<data_reference_p> datarefs_vec;
 
-/* If there is nonaddressable data reference in above vector.  */
-static bool has_nonaddressable_dataref_p;
-
-/* Store index of data reference in aux field.  */
 #define DR_INDEX(dr)  ((uintptr_t) (dr)->aux)
 
-/* Hash table for data dependence relation in the loop to be distributed.  */
-static hash_table<ddr_hasher> *ddrs_table;
-
 /* A Reduced Dependence Graph (RDG) vertex representing a statement.  */
 struct rdg_vertex
 {
@@ -216,6 +205,83 @@ struct rdg_edge
 
 #define RDGE_TYPE(E)((struct rdg_edge *) ((E)->data))->type
 
+/* Kind of distributed loop.  */
+enum partition_kind {
+PKIND_NORMAL,
+/* Partial memset stands for a partition can be distributed into a loop
+   of memset calls, rather than a single memset call.  It's handled just
+   like a normal partition, i.e., distributed as separate loop, no memset
+   call is generated.
+
+   Note: This is a hacking fix trying to distribute ZERO-ing stmt in a
+   loop nest as deep as possible.  As a result, parloop achieves better
+   parallelization by parallelizing deeper loop nest.  This hack should
+   be unnecessary and removed once distributed memset can be understood
+   and analyzed in data reference analysis.  See PR82604 for more.  */
+PKIND_PARTIAL_MEMSET,
+PKIND_MEMSET, PKIND_MEMCPY, PKIND_MEMMOVE
+};
+
+/* Type of distributed loop.  

Re: [PATCH v3] gdbinit.in: allow to pass function argument explicitly

2019-11-14 Thread Konstantin Kharlamov

On 14.11.2019 23:50, Segher Boessenkool wrote:

Hi!

On Thu, Nov 14, 2019 at 07:01:47PM +0300, Konstantin Kharlamov wrote:

Generally, people expect functions to accept arguments directly. But
ones defined in gdbinit did not use the argument, which may be confusing
for newcomers. But we can't change behavior to use the argument without
breaking existing users of the gdbinit. Let's fix this by adding a check
for whether a user passed an argument, and either use it or go with
older behavior.

2019-11-14  Konstantin Kharlamov  

 * gdbinit.in (pp, pr, prl, pt, pct, pgg, pgq, pgs, pge, pmz, ptc, pdn,
 ptn, pdd, prc, pi, pbm, pel, trt): Make use of $arg0 if a user passed 
it


(Lines in a changelog end in a dot).

I love this...  if it works :-)  How was it tested?  With what GDB versions?


I ran GCC under GDB, and set a breakpoint somewhere in the GCC code 
where `stmt` variable is defined (I don't remember the type, ATM I'm 
from another PC). Then I tested 3 things:


1. that `pgg stmt` prints the `stmt` content
2. that `p 0` followed by `pgg` prints nothing (because the underlying function checks its argument for being zero)

3. that `p 1` followed by `pgg` crashes.

Tested on Archlinux, gdb version is 8.3.1
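The $argc-based dispatch being tested can be sketched as a gdb user-defined command (a sketch only: the command name is taken from the thread, but the callee and the fallback variable are illustrative and may differ from the real gdbinit.in):

```
define pgg
  # Use the explicit argument when one is given, otherwise fall back to
  # the old behaviour of operating on the last printed value ($).
  if $argc == 1
    call debug ($arg0)
  else
    call debug ($)
  end
end
```

gdb makes `$argc` and `$arg0`..`$argN` available inside `define` bodies, which is what makes this backward-compatible dispatch possible.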


Re: [PATCH] Check suitability of spill register for mode

2019-11-14 Thread Vladimir Makarov

On 11/14/19 7:34 AM, Kwok Cheung Yeung wrote:

Hello

Currently, when choosing a spill register, GCC just picks the first 
available register in the register class returned by the 
TARGET_SPILL_CLASS hook that doesn't conflict.


On AMD GCN this can cause problems as DImode values stored in SGPRs 
must start on an even register number and TImode values on a multiple 
of 4. This is enforced by defining TARGET_HARD_REGNO_MODE_OK to be 
false when this condition is not satisfied. However, 
assign_spill_hard_regs does not check TARGET_HARD_REGNO_MODE_OK, so it 
can assign an unsuitable hard register for the mode. I have fixed this 
by rejecting spill registers that do not satisfy 
TARGET_HARD_REGNO_MODE_OK for the largest mode of the spilt register.


Built and tested for an AMD GCN target. This fixes failures in:

gcc.dg/vect/no-scevccp-outer-9.c
gcc.dg/vect/no-scevccp-outer-10.c

I have also ensured that the code bootstraps on x86_64, though it 
currently does not use spill registers.


Okay for trunk?

Spilling into registers of another class was an experimental feature.  
On some x86-64 micro-architectures, it resulted in a worse performance 
code.  But I guess in your case, it is profitable.   I hope after your 
patches it can be switched on for some x86-64 micro-architectures.


As for the patch, it is OK to commit.  Thank you.


2019-11-14 Kwok Cheung Yeung  

gcc/
* lra-spills.c (assign_spill_hard_regs): Check that the spill
register is suitable for the mode.
---
 gcc/lra-spills.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/lra-spills.c b/gcc/lra-spills.c
index 54f76cc..8fbd3a8 100644
--- a/gcc/lra-spills.c
+++ b/gcc/lra-spills.c
@@ -283,7 +283,8 @@ assign_spill_hard_regs (int *pseudo_regnos, int n)
   for (k = 0; k < spill_class_size; k++)
 {
   hard_regno = ira_class_hard_regs[spill_class][k];
-  if (TEST_HARD_REG_BIT (eliminable_regset, hard_regno))
+  if (TEST_HARD_REG_BIT (eliminable_regset, hard_regno)
+  || !targetm.hard_regno_mode_ok (hard_regno, mode))
 continue;
   if (! overlaps_hard_reg_set_p (conflict_hard_regs, mode, 
hard_regno))

 break;





[PATCH] PR c++/92078 Add access specifiers to specializations

2019-11-14 Thread Andrew Sutton
Fixes mentioned issue. Tested on bootstrap and cmcsstl2.

gcc/cp/
* pt.c (maybe_new_partial_specialization): Apply access to newly
created partial specializations. Update comment style.

gcc/testsuite/
* g++.dg/cpp2a/concepts-pr92078.C: New.

Andrew


pr92078.patch
Description: Binary data


Re: [PATCH v3] gdbinit.in: allow to pass function argument explicitly

2019-11-14 Thread Segher Boessenkool
Hi!

On Thu, Nov 14, 2019 at 07:01:47PM +0300, Konstantin Kharlamov wrote:
> Generally, people expect functions to accept arguments directly. But
> ones defined in gdbinit did not use the argument, which may be confusing
> for newcomers. But we can't change behavior to use the argument without
> breaking existing users of the gdbinit. Let's fix this by adding a check
> for whether a user passed an argument, and either use it or go with
> older behavior.
> 
> 2019-11-14  Konstantin Kharlamov  
> 
> * gdbinit.in (pp, pr, prl, pt, pct, pgg, pgq, pgs, pge, pmz, ptc, pdn,
> ptn, pdd, prc, pi, pbm, pel, trt): Make use of $arg0 if a user passed 
> it

(Lines in a changelog end in a dot).

I love this...  if it works :-)  How was it tested?  With what GDB versions?


Segher


Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-14 Thread Jan Hubicka
> >> The cost model used by self-recursive cloning is mainly based on the existing
> >> stuff
> >> in ipa-cp cloning, size growth and time benefit are considered. But since
> >> recursive cloning is a more aggressive cloning, we will actually have 
> >> another
> >> problem, which is opposite to your concern.  By default, current parameter
> >> set used to control ipa-cp and recursive-inliner gives priority to code 
> >> size,
> >> not completely for performance. This makes ipa-cp behave somewhat
> 
> > Yes, for a while the cost model is quite off.  On Firefox it does just
> > few clonings where code size increases, so it desperately needs retuning.
> 
> > But since recursive cloning is quite a different case from the normal one,
> > perhaps having independent set of limits would help in particular ...
> I did consider this way, but this seems to be contradictory for normal
> and recursive cloning.

We could definitely discuss the cost model incrementally.  It is safe to do
what you do currently (rely on the existing ipa-cp's over-conservative
heuristics).

> 
> > > Do you have some data on code size/performance effects of this change?
> > For spec2017, no obvious code size and performance change with default 
> > setting.
> > Specifically, for exchange2, with ipa-cp-eval-threshold=1 and 
> > ipcp-unit-growth=80,
> > performance +31%, size +7%, on aarch64.
> 
> > ... it will help here since ipa-cp-eval-threshold value needed are quite 
> > off of what we need to do.
> 
> > I wonder about the 80% of unit growth which is also more than we can
> > enable by default.  How come the overall size change is only 7%?
> 343624 -> 365632 (with debug info, -g)  recursion-depth=8
> 273488 -> 273760 (no debug info)        recursion-depth=8

What seems bit odd is that ipcp's metrics ends up with 80% code growth.
I will try to look into it and see if I can think better what to do
about the costs.

Honza


[PATCH, fortran] Extend the builtin directive

2019-11-14 Thread Szabolcs Nagy
The builtin directive allows specifying the simd attribute for a builtin
function. Similarly to how the C language simd attribute was extended to
allow declaring a specific vector variant, this updates the Fortran builtin
directive too.

Before the patch, only the masking (inbranch/notinbranch) could be specified,
when declaring the availability of vector variants, e.g.:

 !GCC$ builtin (expf) attributes simd (notinbranch) if('x86_64')

now the simdlen and simdabi (aka ISA) can be specified too, and a different
name may be used instead of the vector ABI name, e.g.:

 !GCC$ builtin (expf) attributes simd (notinbranch, 4, 'b', 'vexpf') if('x86_64')

Tested on aarch64-linux-gnu.

The C language change is at

https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01288.html

Note: I don't have much Fortran experience, so I'm not sure if the syntax
makes sense or if I modified the frontend reasonably.

2019-11-14  Szabolcs Nagy  

* gfortran.h (struct gfc_vect_builtin): Define.
* decl.c (gfc_match_gcc_builtin): Parse new flags.
* trans-intrinsic.c (add_simd_flag_for_built_in): Update.
(gfc_adjust_builtins): Update.

gcc/testsuite/ChangeLog:

2019-11-14  Szabolcs Nagy  

* gfortran.dg/simd-builtins-9.f90: New test.
* gfortran.dg/simd-builtins-9.h: New test.
* gfortran.dg/simd-builtins-10.f90: New test.
* gfortran.dg/simd-builtins-10.h: New test.
diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 7858973cc20..dab8a323148 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -105,7 +105,7 @@ bool directive_vector = false;
 bool directive_novector = false;
 
 /* Map of middle-end built-ins that should be vectorized.  */
-hash_map<nofree_string_hash, int> *gfc_vectorized_builtins;
+hash_map<nofree_string_hash, gfc_vect_builtin> *gfc_vectorized_builtins;
 
 /* If a kind expression of a component of a parameterized derived type is
parameterized, temporarily store the expression here.  */
@@ -11600,9 +11600,13 @@ gfc_match_gcc_unroll (void)
 /* Match a !GCC$ builtin (b) attributes simd flags if('target') form:
 
The parameter b is name of a middle-end built-in.
-   FLAGS is optional and must be one of:
- - (inbranch)
- - (notinbranch)
+   FLAGS is optional and must be of the form:
+ (mask)
+ (mask, simdlen)
+ (mask, simdlen, 'simdabi')
+ (mask, simdlen, 'simdabi', 'name')
+   where mask is inbranch or notinbranch, simdlen is an integer, simdabi
+   and name are strings.
 
IF('target') is optional and TARGET is a name of a multilib ABI.
 
@@ -11613,15 +11617,44 @@ gfc_match_gcc_builtin (void)
 {
   char builtin[GFC_MAX_SYMBOL_LEN + 1];
   char target[GFC_MAX_SYMBOL_LEN + 1];
+  char simdabi[GFC_MAX_SYMBOL_LEN + 1] = "";
+  char name[GFC_MAX_SYMBOL_LEN + 1] = "";
+  bool inbranch;
+  bool flags = false;
+  int simdlen = 0;
 
   if (gfc_match (" ( %n ) attributes simd", builtin) != MATCH_YES)
 return MATCH_ERROR;
 
-  gfc_simd_clause clause = SIMD_NONE;
-  if (gfc_match (" ( notinbranch ) ") == MATCH_YES)
-clause = SIMD_NOTINBRANCH;
-  else if (gfc_match (" ( inbranch ) ") == MATCH_YES)
-clause = SIMD_INBRANCH;
+  if (gfc_match (" ( ") == MATCH_YES)
+{
+  flags = true;
+  if (gfc_match ("notinbranch") == MATCH_YES)
+	inbranch = false;
+  else if (gfc_match ("inbranch") == MATCH_YES)
+	inbranch = true;
+  else
+	{
+syntax_error:
+	  gfc_error ("Syntax error in !GCC$ BUILTIN directive at %C");
+	  return MATCH_ERROR;
+	}
+
+  if (gfc_match (" , ") == MATCH_YES)
+	if (gfc_match_small_int (&simdlen) != MATCH_YES || simdlen < 0)
+	  goto syntax_error;
+
+  if (gfc_match (" , ") == MATCH_YES)
+	if (gfc_match (" '%n'", &simdabi) != MATCH_YES)
+	  goto syntax_error;
+
+  if (gfc_match (" , ") == MATCH_YES)
+	if (gfc_match (" '%n'", &name) != MATCH_YES)
+	  goto syntax_error;
+
+  if (gfc_match (" ) ") != MATCH_YES)
+	goto syntax_error;
+}
 
   if (gfc_match (" if ( '%n' ) ", target) == MATCH_YES)
 {
@@ -11631,16 +11664,27 @@ gfc_match_gcc_builtin (void)
 }
 
   if (gfc_vectorized_builtins == NULL)
-    gfc_vectorized_builtins = new hash_map<nofree_string_hash, int> ();
+    gfc_vectorized_builtins =
+      new hash_map<nofree_string_hash, gfc_vect_builtin> ();
 
   char *r = XNEWVEC (char, strlen (builtin) + 32);
   sprintf (r, "__builtin_%s", builtin);
 
   bool existed;
-  int &value = gfc_vectorized_builtins->get_or_insert (r, &existed);
-  value |= clause;
+  gfc_vect_builtin *v = &gfc_vectorized_builtins->get_or_insert (r, &existed);
   if (existed)
-free (r);
+{
+  free (r);
+  gfc_vect_builtin *next = v->next;
+  v->next = new gfc_vect_builtin;
+  v = v->next;
+  v->next = next;
+}
+  v->flags = flags;
+  v->inbranch = inbranch;
+  v->simdlen = simdlen;
+  v->simdabi = simdabi[0] ? xstrdup (simdabi) : 0;
+  v->name = name[0] ? xstrdup (name) : 0;
 
   return MATCH_YES;
 }
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 920acdafc6b..56becb207b2 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -2812,26 +2812,18 @@ extern bool

Re: [C++ Patch] Use cp_expr_loc_or_input_loc in a few additional typeck.c places

2019-11-14 Thread Paolo Carlini

Hi again,

On 14/11/19 12:09, Paolo Carlini wrote:

Hi,

tested x86_64-linux.


Instead of sending a separate patch, the below has two additional uses. 
Tested as usual.


Thanks, Paolo.

///

/cp
2019-11-14  Paolo Carlini  

* typeck.c (cp_build_addr_expr_1): Use cp_expr_loc_or_input_loc
in three places.
(cxx_sizeof_expr): Use it in one additional place.
(cxx_alignof_expr): Likewise.
(lvalue_or_else): Likewise.

/testsuite
2019-11-14  Paolo Carlini  

* g++.dg/cpp0x/addressof2.C: Test locations too.
* g++.dg/cpp0x/rv-lvalue-req.C: Likewise.
* g++.dg/expr/crash2.C: Likewise.
* g++.dg/expr/lval1.C: Likewise.
* g++.dg/expr/unary2.C: Likewise.
* g++.dg/ext/lvaddr.C: Likewise.
* g++.dg/ext/lvalue1.C: Likewise.
* g++.dg/tree-ssa/pr20280.C: Likewise.
* g++.dg/warn/Wplacement-new-size.C: Likewise.
* g++.old-deja/g++.brendan/alignof.C: Likewise.
* g++.old-deja/g++.brendan/sizeof2.C: Likewise.
* g++.old-deja/g++.law/temps1.C: Likewise.
Index: cp/typeck.c
===
--- cp/typeck.c (revision 278253)
+++ cp/typeck.c (working copy)
@@ -1765,7 +1765,8 @@ cxx_sizeof_expr (tree e, tsubst_flags_t complain)
   if (bitfield_p (e))
 {
   if (complain & tf_error)
-	error ("invalid application of %<sizeof%> to a bit-field");
+	error_at (cp_expr_loc_or_input_loc (e),
+		  "invalid application of %<sizeof%> to a bit-field");
   else
 return error_mark_node;
   e = char_type_node;
@@ -1825,7 +1826,8 @@ cxx_alignof_expr (tree e, tsubst_flags_t complain)
   else if (bitfield_p (e))
 {
   if (complain & tf_error)
-error ("invalid application of %<__alignof%> to a bit-field");
+   error_at (cp_expr_loc_or_input_loc (e),
+ "invalid application of %<__alignof%> to a bit-field");
   else
 return error_mark_node;
   t = size_one_node;
@@ -6126,7 +6128,7 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue
   if (kind == clk_none)
{
  if (complain & tf_error)
-   lvalue_error (input_location, lv_addressof);
+   lvalue_error (cp_expr_loc_or_input_loc (arg), lv_addressof);
  return error_mark_node;
}
   if (strict_lvalue && (kind & (clk_rvalueref|clk_class)))
@@ -6134,7 +6136,8 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue
  if (!(complain & tf_error))
return error_mark_node;
  /* Make this a permerror because we used to accept it.  */
- permerror (input_location, "taking address of rvalue");
+ permerror (cp_expr_loc_or_input_loc (arg),
+"taking address of rvalue");
}
 }
 
@@ -6228,7 +6231,8 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue
   if (bitfield_p (arg))
 {
   if (complain & tf_error)
-   error ("attempt to take address of bit-field");
+   error_at (cp_expr_loc_or_input_loc (arg),
+ "attempt to take address of bit-field");
   return error_mark_node;
 }
 
@@ -10431,7 +10435,7 @@ lvalue_or_else (tree ref, enum lvalue_use use, tsu
   if (kind == clk_none)
 {
   if (complain & tf_error)
-   lvalue_error (input_location, use);
+   lvalue_error (cp_expr_loc_or_input_loc (ref), use);
   return 0;
 }
   else if (kind & (clk_rvalueref|clk_class))
Index: testsuite/g++.dg/cpp0x/addressof2.C
===
--- testsuite/g++.dg/cpp0x/addressof2.C (revision 278253)
+++ testsuite/g++.dg/cpp0x/addressof2.C (working copy)
@@ -8,19 +8,19 @@ addressof (T &x) noexcept
   return __builtin_addressof (x);
 }
 
-auto a = __builtin_addressof (1);	// { dg-error "lvalue required as unary" }
-auto b = addressof (1);		// { dg-error "cannot bind non-const lvalue reference of type" }
+auto a = __builtin_addressof (1);	// { dg-error "31:lvalue required as unary" }
+auto b = addressof (1);		// { dg-error "21:cannot bind non-const lvalue reference of type" }
 
 struct S { int s : 5; int t; void foo (); } s;
 
 auto c = __builtin_addressof (s);
 auto d = addressof (s);
-auto e = __builtin_addressof (s.s);	// { dg-error "attempt to take address of bit-field" }
-auto f = addressof (s.s);		// { dg-error "cannot bind bit-field" }
-auto g = __builtin_addressof (S{});	// { dg-error "taking address of rvalue" }
-auto h = addressof (S{});		// { dg-error "cannot bind non-const lvalue reference of type" }
-auto i = __builtin_addressof (S::t);	// { dg-error "invalid use of non-static data member" }
-auto j = __builtin_addressof (S::foo);	// { dg-error "invalid use of non-static member function" }
+auto e = __builtin_addressof (s.s);// { dg-error

[PATCH v3] Extend the simd function attribute

2019-11-14 Thread Szabolcs Nagy
Sorry, v2 had a bug.

v2: added documentation and tests.
v3: fixed expand_simd_clones so different ISA variants are actually
generated.

GCC currently supports two ways to declare the availability of vector
variants of a scalar function:

  #pragma omp declare simd
  void f (void);

and

  __attribute__ ((simd))
  void f (void);

However these declare a set of symbols that are different simd variants
of f, so a library either provides definitions for all those symbols or
it cannot use these declarations. (The set of declared symbols can be
narrowed down with additional omp clauses, but not enough to allow
declaring a single symbol. OpenMP 5 has a declare variant feature that
allows declaring more specific simd variants, but it is complicated and
still requires gcc or vendor extension for unambiguous declarations.)

This patch extends the gcc specific simd attribute such that it can
specify a single vector variant of simple scalar functions (functions
that only take and return scalar integer or floating type values):

  __attribute__ ((simd (mask, simdlen, simdabi, name)))

where mask is "inbranch" or "notinbranch" like now, simdlen is an int
with the same meaning as in omp declare simd, and simdabi is a string
specifying the call ABI (which the Intel vector ABI calls the ISA). The
name is optional and allows a library to use a different symbol name
than what the vector ABI specifies.

The simd attribute can currently be used for both declarations and
definitions; in the latter case the simd variants of the function are
generated, which should work with the extended simd attribute too.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.

gcc/ChangeLog:

2019-11-14  Szabolcs Nagy  

* cgraph.h (struct cgraph_simd_clone): Add simdname field.
* doc/extend.texi: Update the simd attribute documentation.
* tree.h (OMP_CLAUSE__SIMDABI__EXPR): Define.
(OMP_CLAUSE__SIMDNAME__EXPR): Define.
* tree.c (walk_tree_1): Handle new omp clauses.
* tree-core.h (enum omp_clause_code): Likewise.
* tree-nested.c (convert_nonlocal_omp_clauses): Likewise.
* tree-pretty-print.c (dump_omp_clause): Likewise.
* omp-low.c (scan_sharing_clauses): Likewise.
* omp-simd-clone.c (simd_clone_clauses_extract): Likewise.
(simd_clone_mangle): Handle simdname.
(expand_simd_clones): Reset vecsize_mangle when generating clones.
* config/aarch64/aarch64.c
(aarch64_simd_clone_compute_vecsize_and_simdlen): Warn about
unsupported SIMD ABI.
* config/i386/i386.c
(ix86_simd_clone_compute_vecsize_and_simdlen): Likewise.

gcc/c-family/ChangeLog:

2019-11-14  Szabolcs Nagy  

* c-attribs.c (handle_simd_attribute): Handle 4 arguments.

gcc/testsuite/ChangeLog:

2019-11-14  Szabolcs Nagy  

* c-c++-common/attr-simd-5.c: Update.
* c-c++-common/attr-simd-6.c: New test.
* c-c++-common/attr-simd-7.c: New test.
* c-c++-common/attr-simd-8.c: New test.
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index c62cebf7bfd..bf2301eb790 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -448,7 +448,7 @@ const struct attribute_spec c_common_attribute_table[] =
 			  handle_omp_declare_variant_attribute, NULL },
   { "omp declare variant variant", 0, -1, true,  false, false, false,
 			  handle_omp_declare_variant_attribute, NULL },
-  { "simd",		  0, 1, true,  false, false, false,
+  { "simd",		  0, 4, true,  false, false, false,
 			  handle_simd_attribute, NULL },
   { "omp declare target", 0, -1, true, false, false, false,
 			  handle_omp_declare_target_attribute, NULL },
@@ -3094,13 +3094,22 @@ handle_simd_attribute (tree *node, tree name, tree args, int, bool *no_add_attrs
 {
   tree t = get_identifier ("omp declare simd");
   tree attr = NULL_TREE;
+
+  /* Allow
+	  simd
+	  simd (mask)
+	  simd (mask, simdlen)
+	  simd (mask, simdlen, simdabi)
+	  simd (mask, simdlen, simdabi, name)
+	 forms.  */
+
   if (args)
 	{
 	  tree id = TREE_VALUE (args);
 
 	  if (TREE_CODE (id) != STRING_CST)
 	{
-	  error ("attribute %qE argument not a string", name);
+	  error ("attribute %qE first argument not a string", name);
 	  *no_add_attrs = true;
 	  return NULL_TREE;
 	}
@@ -3113,13 +3122,75 @@ handle_simd_attribute (tree *node, tree name, tree args, int, bool *no_add_attrs
  OMP_CLAUSE_INBRANCH);
 	  else
 	{
-	  error ("only %<inbranch%> and %<notinbranch%> flags are "
-		 "allowed for %<__simd__%> attribute");
+	  error ("%qE attribute first argument must be %<inbranch%> or "
+		 "%<notinbranch%>", name);
+	  *no_add_attrs = true;
+	  return NULL_TREE;
+	}
+
+	  args = TREE_CHAIN (args);
+	}
+
+  if (args)
+	{
+	  tree arg = TREE_VALUE (args);
+
+	  arg = c_fully_fold (arg, false, NULL);
+	  if (TREE_CODE (arg) != INTEGER_CST
+	  || !INTEGRAL_TYPE_P (TREE_TYPE (arg))
+	  || tree_int_cst_sgn (ar

Support UTF-8 character constants for C2x

2019-11-14 Thread Joseph Myers
C2x adds u8'' character constants to C.  This patch adds the
corresponding GCC support.

Most of the support was already present for C++ and just needed
enabling for C2x.  However, in C2x these constants have type unsigned
char, which required corresponding adjustments in the compiler and the
preprocessor to give them that type for C.

For C, it seems clear to me that having type unsigned char means the
constants are unsigned in the preprocessor (and thus treated as having
type uintmax_t in #if conditionals), so this patch implements that.  I
included a conditional in the libcpp change to avoid affecting
signedness for C++, but I'm not sure if in fact these constants should
also be unsigned in the preprocessor for C++ in which case that
!CPP_OPTION (pfile, cplusplus) conditional would not be needed.

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  Applied to 
mainline.

gcc/c:
2019-11-14  Joseph Myers  

* c-parser.c (c_parser_postfix_expression)
(c_parser_check_literal_zero): Handle CPP_UTF8CHAR.
* gimple-parser.c (c_parser_gimple_postfix_expression): Likewise.

gcc/c-family:
2019-11-14  Joseph Myers  

* c-lex.c (lex_charconst): Make CPP_UTF8CHAR constants unsigned
char for C.

gcc/testsuite:
2019-11-14  Joseph Myers  

* gcc.dg/c11-utf8char-1.c, gcc.dg/c2x-utf8char-1.c,
gcc.dg/c2x-utf8char-2.c, gcc.dg/c2x-utf8char-3.c,
gcc.dg/gnu2x-utf8char-1.c: New tests.

libcpp:
2019-11-14  Joseph Myers  

* charset.c (narrow_str_to_charconst): Make CPP_UTF8CHAR constants
unsigned for C.
* init.c (lang_defaults): Set utf8_char_literals for GNUC2X and
STDC2X.

Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c(revision 278253)
+++ gcc/c/c-parser.c(working copy)
@@ -8783,6 +8783,7 @@ c_parser_postfix_expression (c_parser *parser)
 case CPP_CHAR:
 case CPP_CHAR16:
 case CPP_CHAR32:
+case CPP_UTF8CHAR:
 case CPP_WCHAR:
   expr.value = c_parser_peek_token (parser)->value;
   /* For the purpose of warning when a pointer is compared with
@@ -10459,6 +10460,7 @@ c_parser_check_literal_zero (c_parser *parser, uns
 case CPP_WCHAR:
 case CPP_CHAR16:
 case CPP_CHAR32:
+case CPP_UTF8CHAR:
   /* If a parameter is literal zero alone, remember it
 for -Wmemset-transposed-args warning.  */
   if (integer_zerop (tok->value)
Index: gcc/c/gimple-parser.c
===
--- gcc/c/gimple-parser.c   (revision 278253)
+++ gcc/c/gimple-parser.c   (working copy)
@@ -1395,6 +1395,7 @@ c_parser_gimple_postfix_expression (gimple_parser
 case CPP_CHAR:
 case CPP_CHAR16:
 case CPP_CHAR32:
+case CPP_UTF8CHAR:
 case CPP_WCHAR:
   expr.value = c_parser_peek_token (parser)->value;
   set_c_expr_source_range (&expr, tok_range);
Index: gcc/c-family/c-lex.c
===
--- gcc/c-family/c-lex.c(revision 278253)
+++ gcc/c-family/c-lex.c(working copy)
@@ -1376,7 +1376,9 @@ lex_charconst (const cpp_token *token)
 type = char16_type_node;
   else if (token->type == CPP_UTF8CHAR)
 {
-  if (flag_char8_t)
+  if (!c_dialect_cxx ())
+   type = unsigned_char_type_node;
+  else if (flag_char8_t)
 type = char8_type_node;
   else
 type = char_type_node;
Index: gcc/testsuite/gcc.dg/c11-utf8char-1.c
===
--- gcc/testsuite/gcc.dg/c11-utf8char-1.c   (nonexistent)
+++ gcc/testsuite/gcc.dg/c11-utf8char-1.c   (working copy)
@@ -0,0 +1,7 @@
+/* Test C2x UTF-8 characters.  Test not accepted for C11.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+#define z(x) 0
+#define u8 z(
+unsigned char a = u8'a');
Index: gcc/testsuite/gcc.dg/c2x-utf8char-1.c
===
--- gcc/testsuite/gcc.dg/c2x-utf8char-1.c   (nonexistent)
+++ gcc/testsuite/gcc.dg/c2x-utf8char-1.c   (working copy)
@@ -0,0 +1,29 @@
+/* Test C2x UTF-8 characters.  Test valid usages.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+unsigned char a = u8'a';
+_Static_assert (u8'a' == 97);
+
+unsigned char b = u8'\0';
+_Static_assert (u8'\0' == 0);
+
+unsigned char c = u8'\xff';
+_Static_assert (u8'\xff' == 255);
+
+unsigned char d = u8'\377';
+_Static_assert (u8'\377' == 255);
+
+_Static_assert (sizeof (u8'a') == 1);
+_Static_assert (sizeof (u8'\0') == 1);
+_Static_assert (sizeof (u8'\xff') == 1);
+_Static_assert (sizeof (u8'\377') == 1);
+
+_Static_assert (_Generic (u8'a', unsigned char: 1, default: 2) == 1);
+_Static_assert (_Generic (u8'\0', unsigned char: 1, default: 2) == 1);
+_Static_assert (_Generic (u8'\xff', unsigned char: 1, default: 2) == 1);
+_Static_assert (_Generic (u8'\377', un

[RFC C++ PATCH] __builtin_source_location ()

2019-11-14 Thread Jakub Jelinek
Hi!

The following WIP patch implements __builtin_source_location (),
which returns const void pointer to a std::source_location::__impl
struct that is required to contain __file, __function, __line and __column
fields, the first two with const char * type, the latter some integral type.

I don't have testcase coverage yet and the hash map to allow sharing of
VAR_DECLs with the same location is commented out both because it
doesn't compile for some reason and because hashing on location_t
is not enough, we probably need to hash on both location_t and fndecl,
as the baz case in the following shows.

Comments?

namespace std {
  struct source_location {
struct __impl {
  const char *__file;
  const char *__function;
  unsigned int __line, __column;
};
const void *__ptr;
constexpr source_location () : __ptr (nullptr) {}
static consteval source_location
current (const void *__p = __builtin_source_location ()) {
  source_location __ret;
  __ret.__ptr = __p;
  return __ret;
}
constexpr const char *file () const {
  return static_cast <const __impl *> (__ptr)->__file;
}
constexpr const char *function () const {
  return static_cast <const __impl *> (__ptr)->__function;
}
constexpr unsigned line () const {
  return static_cast <const __impl *> (__ptr)->__line;
}
constexpr unsigned column () const {
  return static_cast <const __impl *> (__ptr)->__column;
}
  };
}

using namespace std;

consteval source_location
bar (const source_location x = source_location::current ())
{
  return x;
}

void
foo (const char **p, unsigned *q)
{
  constexpr source_location s = source_location::current ();
  constexpr source_location t = bar ();
  p[0] = s.file ();
  p[1] = s.function ();
  q[0] = s.line ();
  q[1] = s.column ();
  p[2] = t.file ();
  p[3] = t.function ();
  q[2] = t.line ();
  q[3] = t.column ();
  constexpr const char *r = s.file ();
}

template <int N>
constexpr source_location
baz ()
{
  return source_location::current ();
}

constexpr source_location s1 = baz <0> ();
constexpr source_location s2 = baz <1> ();
const source_location *p1 = &s1;
const source_location *p2 = &s2;

--- gcc/cp/tree.c.jj2019-11-13 10:54:45.437045793 +0100
+++ gcc/cp/tree.c   2019-11-14 18:11:42.391213117 +0100
@@ -445,7 +445,9 @@ builtin_valid_in_constant_expr_p (const_
   if (DECL_BUILT_IN_CLASS (decl) != BUILT_IN_NORMAL)
 {
   if (fndecl_built_in_p (decl, CP_BUILT_IN_IS_CONSTANT_EVALUATED,
-  BUILT_IN_FRONTEND))
+BUILT_IN_FRONTEND)
+ || fndecl_built_in_p (decl, CP_BUILT_IN_SOURCE_LOCATION,
+   BUILT_IN_FRONTEND))
return true;
   /* Not a built-in.  */
   return false;
--- gcc/cp/constexpr.c.jj   2019-11-13 10:54:45.426045960 +0100
+++ gcc/cp/constexpr.c  2019-11-14 18:26:40.691581038 +0100
@@ -1238,6 +1238,9 @@ cxx_eval_builtin_function_call (const co
   return boolean_true_node;
 }
 
+  if (fndecl_built_in_p (fun, CP_BUILT_IN_SOURCE_LOCATION, BUILT_IN_FRONTEND))
+return fold_builtin_source_location (EXPR_LOCATION (t));
+
   /* Be permissive for arguments to built-ins; __builtin_constant_p should
  return constant false for a non-constant argument.  */
   constexpr_ctx new_ctx = *ctx;
--- gcc/cp/name-lookup.c.jj 2019-11-13 10:54:45.495044911 +0100
+++ gcc/cp/name-lookup.c2019-11-14 18:38:30.765804391 +0100
@@ -5747,6 +5747,8 @@ get_std_name_hint (const char *name)
 {"shared_lock", "<shared_mutex>", cxx14},
 {"shared_mutex", "<shared_mutex>", cxx17},
 {"shared_timed_mutex", "<shared_mutex>", cxx14},
+/* <source_location>.  */
+{"source_location", "<source_location>", cxx2a},
 /* <sstream>.  */
 {"basic_stringbuf", "<sstream>", cxx98},
 {"basic_istringstream", "<sstream>", cxx98},
--- gcc/cp/cp-gimplify.c.jj 2019-11-06 08:58:38.036473709 +0100
+++ gcc/cp/cp-gimplify.c2019-11-14 20:22:32.905068438 +0100
@@ -35,6 +35,9 @@ along with GCC; see the file COPYING3.
 #include "attribs.h"
 #include "asan.h"
 #include "gcc-rich-location.h"
+#include "output.h"
+#include "file-prefix-map.h"
+#include "cgraph.h"
 
 /* Forward declarations.  */
 
@@ -896,8 +899,12 @@ cp_gimplify_expr (tree *expr_p, gimple_s
  tree decl = cp_get_callee_fndecl_nofold (*expr_p);
  if (decl
  && fndecl_built_in_p (decl, CP_BUILT_IN_IS_CONSTANT_EVALUATED,
- BUILT_IN_FRONTEND))
+   BUILT_IN_FRONTEND))
*expr_p = boolean_false_node;
+ else if (decl
+  && fndecl_built_in_p (decl, CP_BUILT_IN_SOURCE_LOCATION,
+BUILT_IN_FRONTEND))
+   *expr_p = fold_builtin_source_location (EXPR_LOCATION (*expr_p));
}
   break;
 
@@ -2641,9 +2648,17 @@ cp_fold (tree x)
/* Defer folding __builtin_is_constant_evaluated.  */
if (callee
&& fndecl_built_in_p (callee, CP_BUILT_IN_IS_CONSTANT_EVALUATED,
-   BUILT_IN_FRONTEND))
+  

[PATCH][ARM][GCC][0/x]: Support for MVE ACLE intrinsics.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch series adds support for the Arm MVE ACLE intrinsics.

Please refer to the Arm reference manual [1] and the MVE intrinsics page [2]
for more details.
Please refer to Chapter 13, MVE ACLE [3], for MVE intrinsics concepts.

This patch series depends on upstream patches "Armv8.1-M Mainline Security 
Extension" [4],
"CLI and multilib support for Armv8.1-M Mainline MVE extensions" [5] and 
"support for Armv8.1-M
Mainline scalar shifts" [6].

[1] 
https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf?_ga=2.102521798.659307368.1572453718-1501600630.1548848914
[2] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
[3] 
https://static.docs.arm.com/101028/0009/Q3-ACLE_2019Q3_release-0009.pdf?_ga=2.239684871.588348166.1573726994-1501600630.1548848914
[4] https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01654.html
[5] https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00641.html
[6] https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01194.html

Srinath Parvathaneni(38):
[PATCH][ARM][GCC][1/x]: MVE ACLE intrinsics framework patch.
[PATCH][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
[PATCH][ARM][GCC][3/x]: MVE ACLE intrinsics framework patch.
[PATCH][ARM][GCC][4/x]: MVE ACLE vector interleaving store intrinsics.
[PATCH][ARM][GCC][1/1x]: Patch to support MVE ACLE intrinsics with unary 
operand.
[PATCH][ARM][GCC][2/1x]: MVE intrinsics with unary operand.
[PATCH][ARM][GCC][3/1x]: MVE intrinsics with unary operand.
[PATCH][ARM][GCC][4/1x]: MVE intrinsics with unary operand.
[PATCH][ARM][GCC][1/2x]: MVE intrinsics with binary operands.
[PATCH][ARM][GCC][2/2x]: MVE intrinsics with binary operands.
[PATCH][ARM][GCC][3/2x]: MVE intrinsics with binary operands.
[PATCH][ARM][GCC][4/2x]: MVE intrinsics with binary operands.
[PATCH][ARM][GCC][5/2x]: MVE intrinsics with binary operands.
[PATCH][ARM][GCC][1/3x]: MVE intrinsics with ternary operands.
[PATCH][ARM][GCC][2/3x]: MVE intrinsics with ternary operands.
[PATCH][ARM][GCC][3/3x]: MVE intrinsics with ternary operands.
[PATCH][ARM][GCC][1/4x]: MVE intrinsics with quaternary operands.
[PATCH][ARM][GCC][2/4x]: MVE intrinsics with quaternary operands.
[PATCH][ARM][GCC][3/4x]: MVE intrinsics with quaternary operands.
[PATCH][ARM][GCC][4/4x]: MVE intrinsics with quaternary operands.
[PATCH][ARM][GCC][1/5x]: MVE store intrinsics.
[PATCH][ARM][GCC][2/5x]: MVE load intrinsics.
[PATCH][ARM][GCC][3/5x]: MVE store intrinsics with predicated suffix.
[PATCH][ARM][GCC][4/5x]: MVE load intrinsics with zero(_z) suffix.
[PATCH][ARM][GCC][5/5x]: MVE ACLE load intrinsics which load a byte, halfword, 
or word from memory.
[PATCH][ARM][GCC][6/5x]: Remaining MVE load intrinsics which loads half word 
and word or double word from memory.
[PATCH][ARM][GCC][7/5x]: MVE store intrinsics which stores byte,half word or 
word to memory.
[PATCH][ARM][GCC][8/5x]: Remaining MVE store intrinsics which stores an half 
word, word and double word to memory.
[PATCH][ARM][GCC][6x]:MVE ACLE vaddq intrinsics using arithmetic plus operator.
[PATCH][ARM][GCC][7x]: MVE vreinterpretq and vuninitializedq intrinsics.
[PATCH][ARM][GCC][1/8x]: MVE ACLE vidup, vddup, viwdup and vdwdup intrinsics 
with writeback.
[PATCH][ARM][GCC][2/8x]: MVE ACLE gather load and scatter store intrinsics with 
writeback.
[PATCH][ARM][GCC][9x]: MVE ACLE predicated intrinsics with (dont-care) variant.
[PATCH][ARM][GCC][10x]: MVE ACLE intrinsics "add with carry across beats" and 
"beat-wise subtract".
[PATCH][ARM][GCC][11x]: MVE ACLE vector interleaving store and deinterleaving 
load intrinsics and also aliases to vstr and vldr intrinsics.
[PATCH][ARM][GCC][12x]: MVE ACLE intrinsics to set and get vector lane.
[PATCH][ARM][GCC][13x]: MVE ACLE scalar shift intrinsics.
[PATCH][ARM][GCC][14x]: MVE ACLE whole vector left shift with carry intrinsics.

Regards,
Srinath.

[PATCH][ARM][GCC][11x]: MVE ACLE vector interleaving store and deinterleaving load intrinsics and also aliases to vstr and vldr intrinsics.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics, which are aliases of the
vstr and vldr intrinsics.

vst1q_p_u8, vst1q_p_s8, vld1q_z_u8, vld1q_z_s8, vst1q_p_u16, vst1q_p_s16,
vld1q_z_u16, vld1q_z_s16, vst1q_p_u32, vst1q_p_s32, vld1q_z_u32, vld1q_z_s32,
vld1q_z_f16, vst1q_p_f16, vld1q_z_f32, vst1q_p_f32.

This patch also supports the following MVE ACLE vector deinterleaving loads
and vector interleaving stores.

vst2q_s8, vst2q_u8, vld2q_s8, vld2q_u8, vld4q_s8, vld4q_u8, vst2q_s16, 
vst2q_u16,
vld2q_s16, vld2q_u16, vld4q_s16, vld4q_u16, vst2q_s32, vst2q_u32, vld2q_s32,
vld2q_u32, vld4q_s32, vld4q_u32, vld4q_f16, vld2q_f16, vst2q_f16, vld4q_f32,
vld2q_f32, vst2q_f32.

Please refer to the M-profile Vector Extension (MVE) intrinsics page [1] for
more details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-08  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vst1q_p_u8): Define macro.
(vst1q_p_s8): Likewise.
(vst2q_s8): Likewise.
(vst2q_u8): Likewise.
(vld1q_z_u8): Likewise.
(vld1q_z_s8): Likewise.
(vld2q_s8): Likewise.
(vld2q_u8): Likewise.
(vld4q_s8): Likewise.
(vld4q_u8): Likewise.
(vst1q_p_u16): Likewise.
(vst1q_p_s16): Likewise.
(vst2q_s16): Likewise.
(vst2q_u16): Likewise.
(vld1q_z_u16): Likewise.
(vld1q_z_s16): Likewise.
(vld2q_s16): Likewise.
(vld2q_u16): Likewise.
(vld4q_s16): Likewise.
(vld4q_u16): Likewise.
(vst1q_p_u32): Likewise.
(vst1q_p_s32): Likewise.
(vst2q_s32): Likewise.
(vst2q_u32): Likewise.
(vld1q_z_u32): Likewise.
(vld1q_z_s32): Likewise.
(vld2q_s32): Likewise.
(vld2q_u32): Likewise.
(vld4q_s32): Likewise.
(vld4q_u32): Likewise.
(vld4q_f16): Likewise.
(vld2q_f16): Likewise.
(vld1q_z_f16): Likewise.
(vst2q_f16): Likewise.
(vst1q_p_f16): Likewise.
(vld4q_f32): Likewise.
(vld2q_f32): Likewise.
(vld1q_z_f32): Likewise.
(vst2q_f32): Likewise.
(vst1q_p_f32): Likewise.
(__arm_vst1q_p_u8): Define intrinsic.
(__arm_vst1q_p_s8): Likewise.
(__arm_vst2q_s8): Likewise.
(__arm_vst2q_u8): Likewise.
(__arm_vld1q_z_u8): Likewise.
(__arm_vld1q_z_s8): Likewise.
(__arm_vld2q_s8): Likewise.
(__arm_vld2q_u8): Likewise.
(__arm_vld4q_s8): Likewise.
(__arm_vld4q_u8): Likewise.
(__arm_vst1q_p_u16): Likewise.
(__arm_vst1q_p_s16): Likewise.
(__arm_vst2q_s16): Likewise.
(__arm_vst2q_u16): Likewise.
(__arm_vld1q_z_u16): Likewise.
(__arm_vld1q_z_s16): Likewise.
(__arm_vld2q_s16): Likewise.
(__arm_vld2q_u16): Likewise.
(__arm_vld4q_s16): Likewise.
(__arm_vld4q_u16): Likewise.
(__arm_vst1q_p_u32): Likewise.
(__arm_vst1q_p_s32): Likewise.
(__arm_vst2q_s32): Likewise.
(__arm_vst2q_u32): Likewise.
(__arm_vld1q_z_u32): Likewise.
(__arm_vld1q_z_s32): Likewise.
(__arm_vld2q_s32): Likewise.
(__arm_vld2q_u32): Likewise.
(__arm_vld4q_s32): Likewise.
(__arm_vld4q_u32): Likewise.
(__arm_vld4q_f16): Likewise.
(__arm_vld2q_f16): Likewise.
(__arm_vld1q_z_f16): Likewise.
(__arm_vst2q_f16): Likewise.
(__arm_vst1q_p_f16): Likewise.
(__arm_vld4q_f32): Likewise.
(__arm_vld2q_f32): Likewise.
(__arm_vld1q_z_f32): Likewise.
(__arm_vst2q_f32): Likewise.
(__arm_vst1q_p_f32): Likewise.
(vld1q_z): Define polymorphic variant.
(vld2q): Likewise.
(vld4q): Likewise.
(vst1q_p): Likewise.
(vst2q): Likewise.
* config/arm/arm_mve_builtins.def (STORE1): Use builtin qualifier.
(LOAD1): Likewise.
* config/arm/mve.md (mve_vst2q): Define RTL pattern.
(mve_vld2q): Likewise.
(mve_vld4q): Likewise.

gcc/testsuite/ChangeLog:

2019-11-08  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vld1q_z_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vld1q_z_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_f32.c: Likewise.

[PATCH][ARM][GCC][12x]: MVE ACLE intrinsics to set and get vector lane.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics to get and set a vector lane.

vsetq_lane_f16, vsetq_lane_f32, vsetq_lane_s16, vsetq_lane_s32, vsetq_lane_s8,
vsetq_lane_s64, vsetq_lane_u8, vsetq_lane_u16, vsetq_lane_u32, vsetq_lane_u64,
vgetq_lane_f16, vgetq_lane_f32, vgetq_lane_s16, vgetq_lane_s32, vgetq_lane_s8,
vgetq_lane_s64, vgetq_lane_u8, vgetq_lane_u16, vgetq_lane_u32, vgetq_lane_u64.

Please refer to the M-profile Vector Extension (MVE) intrinsics documentation [1]
for more details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.
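For readers unfamiliar with the lane intrinsics, their semantics can be sketched in plain C. The `q32x4_model` type and `model_*` names below are invented for this sketch only; they are not part of the ACLE or of this patch:

```c
#include <stdint.h>

/* Scalar model of the s32 lane intrinsics: a 128-bit MVE vector holds
   four 32-bit lanes; vsetq_lane replaces one lane and vgetq_lane reads
   one.  The type and function names here are invented for the sketch.  */
typedef struct { int32_t lane[4]; } q32x4_model;

/* Models __arm_vsetq_lane_s32 (value, vector, lane-index).  */
static q32x4_model
model_vsetq_lane_s32 (int32_t a, q32x4_model b, int idx)
{
  b.lane[idx] = a;	/* all other lanes pass through unchanged */
  return b;
}

/* Models __arm_vgetq_lane_s32 (vector, lane-index).  */
static int32_t
model_vgetq_lane_s32 (q32x4_model a, int idx)
{
  return a.lane[idx];
}
```

Unlike this model, the real intrinsics require the lane index to be a compile-time constant (the patch adds __ARM_CHECK_LANEQ for that check) and expand to single vector move instructions via the new mve_vec_set/mve_vec_extract patterns.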

gcc/ChangeLog:

2019-11-08  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vsetq_lane_f16): Define macro.
(vsetq_lane_f32): Likewise.
(vsetq_lane_s16): Likewise.
(vsetq_lane_s32): Likewise.
(vsetq_lane_s8): Likewise.
(vsetq_lane_s64): Likewise.
(vsetq_lane_u8): Likewise.
(vsetq_lane_u16): Likewise.
(vsetq_lane_u32): Likewise.
(vsetq_lane_u64): Likewise.
(vgetq_lane_f16): Likewise.
(vgetq_lane_f32): Likewise.
(vgetq_lane_s16): Likewise.
(vgetq_lane_s32): Likewise.
(vgetq_lane_s8): Likewise.
(vgetq_lane_s64): Likewise.
(vgetq_lane_u8): Likewise.
(vgetq_lane_u16): Likewise.
(vgetq_lane_u32): Likewise.
(vgetq_lane_u64): Likewise.
(__ARM_NUM_LANES): Likewise.
(__ARM_LANEQ): Likewise.
(__ARM_CHECK_LANEQ): Likewise.
(__arm_vsetq_lane_s16): Define intrinsic.
(__arm_vsetq_lane_s32): Likewise.
(__arm_vsetq_lane_s8): Likewise.
(__arm_vsetq_lane_s64): Likewise.
(__arm_vsetq_lane_u8): Likewise.
(__arm_vsetq_lane_u16): Likewise.
(__arm_vsetq_lane_u32): Likewise.
(__arm_vsetq_lane_u64): Likewise.
(__arm_vgetq_lane_s16): Likewise.
(__arm_vgetq_lane_s32): Likewise.
(__arm_vgetq_lane_s8): Likewise.
(__arm_vgetq_lane_s64): Likewise.
(__arm_vgetq_lane_u8): Likewise.
(__arm_vgetq_lane_u16): Likewise.
(__arm_vgetq_lane_u32): Likewise.
(__arm_vgetq_lane_u64): Likewise.
(__arm_vsetq_lane_f16): Likewise.
(__arm_vsetq_lane_f32): Likewise.
(__arm_vgetq_lane_f16): Likewise.
(__arm_vgetq_lane_f32): Likewise.
(vgetq_lane): Define polymorphic variant.
(vsetq_lane): Likewise.
* config/arm/mve.md (mve_vec_extract): Define RTL
pattern.
(mve_vec_extractv2didi): Likewise.
(mve_vec_extract_sext_internal): Likewise.
(mve_vec_extract_zext_internal): Likewise.
(mve_vec_set_internal): Likewise.
(mve_vec_setv2di_internal): Likewise.
* config/arm/neon.md (vec_set): Move RTL pattern to vec-common.md
file.
(vec_extract): Rename to
"neon_vec_extract".
(vec_extractv2didi): Rename to "neon_vec_extractv2didi".
* config/arm/vec-common.md (vec_extract): Define RTL
pattern common for MVE and NEON.
(vec_set): Move RTL pattern from neon.md and modify to accept both
MVE and NEON.

gcc/testsuite/ChangeLog:

2019-11-08  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
d0259d7bd96c565d901b7634e9f735e0e14bc9dc..9dcf8d692670cd8552fade9868bc051683553b91

[PATCH][ARM][GCC][13x]: MVE ACLE scalar shift intrinsics.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE scalar shift intrinsics.

sqrshr, sqrshrl, sqrshrl_sat48, sqshl, sqshll, srshr, srshrl, uqrshl,
uqrshll, uqrshll_sat48, uqshl, uqshll, urshr, urshrl, lsll, asrl.

Please refer to the M-profile Vector Extension (MVE) intrinsics documentation [1]
for more details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.
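As a rough illustration of what the saturating scalar shifts compute, here is a plain-C sketch of the 32-bit signed case. The function name and the clamping bounds are my reading of the sqshl semantics, not code from this patch:

```c
#include <stdint.h>

/* Scalar sketch of sqshl for the 32-bit signed case: shift left, but
   clamp to INT32_MIN/INT32_MAX instead of wrapping on overflow.
   Valid for 0 <= shift <= 31; widening through int64_t keeps the
   intermediate result exact.  */
static int32_t
model_sqshl (int32_t value, int shift)
{
  int64_t wide = (int64_t) value << shift;
  if (wide > INT32_MAX)
    return INT32_MAX;
  if (wide < INT32_MIN)
    return INT32_MIN;
  return (int32_t) wide;
}
```

The r-variants (srshr, urshr, ...) round instead of truncating, and the ll-variants operate on 64-bit values; the same clamp-on-overflow idea applies.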

gcc/ChangeLog:

2019-11-08  Srinath Parvathaneni  

* config/arm/arm-builtins.c (LSLL_QUALIFIERS): Define builtin qualifier.
(UQSHL_QUALIFIERS): Likewise.
(ASRL_QUALIFIERS): Likewise.
(SQSHL_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (sqrshr): Define macro.
(sqrshrl): Likewise.
(sqrshrl_sat48): Likewise.
(sqshl): Likewise.
(sqshll): Likewise.
(srshr): Likewise.
(srshrl): Likewise.
(uqrshl): Likewise.
(uqrshll): Likewise.
(uqrshll_sat48): Likewise.
(uqshl): Likewise.
(uqshll): Likewise.
(urshr): Likewise.
(urshrl): Likewise.
(lsll): Likewise.
(asrl): Likewise.
(__arm_lsll): Define intrinsic.
(__arm_asrl): Likewise.
(__arm_uqrshll): Likewise.
(__arm_uqrshll_sat48): Likewise.
(__arm_sqrshrl): Likewise.
(__arm_sqrshrl_sat48): Likewise.
(__arm_uqshll): Likewise.
(__arm_urshrl): Likewise.
(__arm_srshrl): Likewise.
(__arm_sqshll): Likewise.
(__arm_uqrshl): Likewise.
(__arm_sqrshr): Likewise.
(__arm_uqshl): Likewise.
(__arm_urshr): Likewise.
(__arm_sqshl): Likewise.
(__arm_srshr): Likewise.
* config/arm/arm_mve_builtins.def (LSLL_QUALIFIERS): Use builtin
qualifier.
(UQSHL_QUALIFIERS): Likewise.
(ASRL_QUALIFIERS): Likewise.
(SQSHL_QUALIFIERS): Likewise.
* config/arm/mve.md (mve_uqrshll_sat_di): Define RTL pattern.
(mve_sqrshrl_sat_di): Likewise.
(mve_uqrshl_si): Likewise.
(mve_sqrshr_si): Likewise.
(mve_uqshll_di): Likewise.
(mve_urshrl_di): Likewise.
(mve_uqshl_si): Likewise.
(mve_urshr_si): Likewise.
(mve_sqshl_si): Likewise.
(mve_srshr_si): Likewise.
(mve_srshrl_di): Likewise.
(mve_sqshll_di): Likewise.

gcc/testsuite/ChangeLog:

2019-11-08  Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/asrl.c: New test.
* gcc.target/arm/mve/intrinsics/lsll.c: Likewise.
* gcc.target/arm/mve/intrinsics/sqrshr.c: Likewise.
* gcc.target/arm/mve/intrinsics/sqrshrl_sat48.c: Likewise.
* gcc.target/arm/mve/intrinsics/sqrshrl_sat64.c: Likewise.
* gcc.target/arm/mve/intrinsics/sqshl.c: Likewise.
* gcc.target/arm/mve/intrinsics/sqshll.c: Likewise.
* gcc.target/arm/mve/intrinsics/srshr.c: Likewise.
* gcc.target/arm/mve/intrinsics/srshrl.c: Likewise.
* gcc.target/arm/mve/intrinsics/uqrshl.c: Likewise.
* gcc.target/arm/mve/intrinsics/uqrshll_sat48.c: Likewise.
* gcc.target/arm/mve/intrinsics/uqrshll_sat64.c: Likewise.
* gcc.target/arm/mve/intrinsics/uqshl.c: Likewise.
* gcc.target/arm/mve/intrinsics/uqshll.c: Likewise.
* gcc.target/arm/mve/intrinsics/urshr.c: Likewise.
* gcc.target/arm/mve/intrinsics/urshrl.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
9e87dff264d6b535f64407f669c6e83b0ed639a6..31bff8511e368f8e789297818e9b0b9f885463ae
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -738,6 +738,26 @@ arm_strsbwbu_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   qualifier_unsigned, qualifier_unsigned};
 #define STRSBWBU_P_QUALIFIERS (arm_strsbwbu_p_qualifiers)
 
+static enum arm_type_qualifiers
+arm_lsll_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_none};
+#define LSLL_QUALIFIERS (arm_lsll_qualifiers)
+
+static enum arm_type_qualifiers
+arm_uqshl_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_const};
+#define UQSHL_QUALIFIERS (arm_uqshl_qualifiers)
+
+static enum arm_type_qualifiers
+arm_asrl_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_none};
+#define ASRL_QUALIFIERS (arm_asrl_qualifiers)
+
+static enum arm_type_qualifiers
+arm_sqshl_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_const};
+#define SQSHL_QUALIFIERS (arm_sqshl_qualifiers)
+
 /* End of Qualifier for MVE builtins.  */
 
/* void ([T element type] *, T, immediate).  */
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
9dcf8d692670cd8552fade9868bc051683553b91..2adae7f8b21f44aa3b80231b89bd68bcd0812611
 100644
--- 

[PATCH][ARM][GCC][14x]: MVE ACLE whole vector left shift with carry intrinsics.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE whole-vector left shift with carry
intrinsics.

vshlcq_m_s8, vshlcq_m_s16, vshlcq_m_s32, vshlcq_m_u8, vshlcq_m_u16, 
vshlcq_m_u32.

Please refer to the M-profile Vector Extension (MVE) intrinsics documentation [1]
for more details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.
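To make the "whole vector left shift with carry" behavior concrete, here is a plain-C sketch of the unpredicated u32 case as I understand it: the vector shifts left as a whole, with the low bits of the carry variable entering lane 0 and the bits leaving the top lane becoming the new carry. The names are invented, the model restricts imm to 1..31, and it omits the per-lane predication that the _m forms in this patch add:

```c
#include <stdint.h>

/* Plain-C sketch of vshlcq over four 32-bit lanes: the whole vector is
   shifted left by imm bits, the low imm bits of *carry shift in at the
   bottom of lane 0, and the bits shifted out of the top of lane 3
   become the new carry.  Valid here for 1 <= imm <= 31.  */
static void
model_vshlcq_u32 (uint32_t v[4], uint32_t *carry, int imm)
{
  uint32_t in = *carry & ((1u << imm) - 1);
  for (int i = 0; i < 4; i++)
    {
      uint32_t out = v[i] >> (32 - imm);	/* bits leaving the top of lane i */
      v[i] = (v[i] << imm) | in;		/* bits entering the bottom */
      in = out;
    }
  *carry = in;
}
```

This mirrors how the intrinsic implementations above split the work: one builtin produces the shifted vector (vshlcq_m_vec_) and a second produces the updated carry (vshlcq_m_carry_), written back through the pointer argument.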

gcc/ChangeLog:

2019-11-09  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vshlcq_m_s8): Define macro.
(vshlcq_m_u8): Likewise.
(vshlcq_m_s16): Likewise.
(vshlcq_m_u16): Likewise.
(vshlcq_m_s32): Likewise.
(vshlcq_m_u32): Likewise.
(__arm_vshlcq_m_s8): Define intrinsic.
(__arm_vshlcq_m_u8): Likewise.
(__arm_vshlcq_m_s16): Likewise.
(__arm_vshlcq_m_u16): Likewise.
(__arm_vshlcq_m_s32): Likewise.
(__arm_vshlcq_m_u32): Likewise.
(vshlcq_m): Define polymorphic variant.
* config/arm/arm_mve_builtins.def (QUADOP_NONE_NONE_UNONE_IMM_UNONE):
Use builtin qualifier.
(QUADOP_UNONE_UNONE_UNONE_IMM_UNONE): Likewise.
* config/arm/mve.md (mve_vshlcq_m_vec_): Define RTL pattern.
(mve_vshlcq_m_carry_): Likewise.
(mve_vshlcq_m_): Likewise.

gcc/testsuite/ChangeLog:

2019-11-09  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vshlcq_m_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vshlcq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlcq_m_u8.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
2adae7f8b21f44aa3b80231b89bd68bcd0812611..5d385081a0affacc4dd21d010b01bceb38a9b699
 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -2542,6 +2542,12 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t;
 #define urshrl(__p0, __p1) __arm_urshrl(__p0, __p1)
 #define lsll(__p0, __p1) __arm_lsll(__p0, __p1)
 #define asrl(__p0, __p1) __arm_asrl(__p0, __p1)
+#define vshlcq_m_s8(__a,  __b,  __imm, __p) __arm_vshlcq_m_s8(__a,  __b,  
__imm, __p)
+#define vshlcq_m_u8(__a,  __b,  __imm, __p) __arm_vshlcq_m_u8(__a,  __b,  
__imm, __p)
+#define vshlcq_m_s16(__a,  __b,  __imm, __p) __arm_vshlcq_m_s16(__a,  __b,  
__imm, __p)
+#define vshlcq_m_u16(__a,  __b,  __imm, __p) __arm_vshlcq_m_u16(__a,  __b,  
__imm, __p)
+#define vshlcq_m_s32(__a,  __b,  __imm, __p) __arm_vshlcq_m_s32(__a,  __b,  
__imm, __p)
+#define vshlcq_m_u32(__a,  __b,  __imm, __p) __arm_vshlcq_m_u32(__a,  __b,  
__imm, __p)
 #endif
 
 /* For big-endian, GCC's vector indices are reversed within each 64 bits
@@ -16667,6 +16673,60 @@ __arm_srshr (int32_t value, const int shift)
   return __builtin_mve_srshr_si (value, shift);
 }
 
+__extension__ extern __inline int8x16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vshlcq_m_s8 (int8x16_t __a, uint32_t * __b, const int __imm, 
mve_pred16_t __p)
+{
+  int8x16_t __res = __builtin_mve_vshlcq_m_vec_sv16qi (__a, *__b, __imm, __p);
+  *__b = __builtin_mve_vshlcq_m_carry_sv16qi (__a, *__b, __imm, __p);
+  return __res;
+}
+
+__extension__ extern __inline uint8x16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vshlcq_m_u8 (uint8x16_t __a, uint32_t * __b, const int __imm, 
mve_pred16_t __p)
+{
+  uint8x16_t __res = __builtin_mve_vshlcq_m_vec_uv16qi (__a, *__b, __imm, __p);
+  *__b = __builtin_mve_vshlcq_m_carry_uv16qi (__a, *__b, __imm, __p);
+  return __res;
+}
+
+__extension__ extern __inline int16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vshlcq_m_s16 (int16x8_t __a, uint32_t * __b, const int __imm, 
mve_pred16_t __p)
+{
+  int16x8_t __res = __builtin_mve_vshlcq_m_vec_sv8hi (__a, *__b, __imm, __p);
+  *__b = __builtin_mve_vshlcq_m_carry_sv8hi (__a, *__b, __imm, __p);
+  return __res;
+}
+
+__extension__ extern __inline uint16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vshlcq_m_u16 (uint16x8_t __a, uint32_t * __b, const int __imm, 
mve_pred16_t __p)
+{
+  uint16x8_t __res = __builtin_mve_vshlcq_m_vec_uv8hi (__a, *__b, __imm, __p);
+  *__b = __builtin_mve_vshlcq_m_carry_uv8hi (__a, *__b, __imm, __p);
+  return __res;
+}
+
+__extension__ extern __inline int32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vshlcq_m_s32 (int32x4_t __a, uint32_t * __b, const int __imm, 
mve_pred16_t __p)
+{
+  int32x4_t __res = __builtin_mve_vshlcq_m_vec_sv4si (__a, *__b, __imm, __p);
+  *__b = __builtin_mve_vshlcq_m_ca

[PATCH][ARM][GCC][10x]: MVE ACLE intrinsics "add with carry across beats" and "beat-wise subtract".

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE "add with carry across beats"
intrinsics and "beat-wise subtract" intrinsics.

vadciq_s32, vadciq_u32, vadciq_m_s32, vadciq_m_u32, vadcq_s32, vadcq_u32,
vadcq_m_s32, vadcq_m_u32, vsbciq_s32, vsbciq_u32, vsbciq_m_s32, vsbciq_m_u32,
vsbcq_s32, vsbcq_u32, vsbcq_m_s32, vsbcq_m_u32.

Please refer to the M-profile Vector Extension (MVE) intrinsics documentation [1]
for more details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.
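The key point of these intrinsics is that the carry is chained across the four 32-bit beats, so a pair of vectors effectively behaves as a 128-bit integer. A plain-C sketch of the vadcq u32 case (invented names; my reading is that the i-variants vadciq/vsbciq fix the initial carry instead of taking it from the caller, which this sketch does not model):

```c
#include <stdint.h>

/* Sketch of the vadcq carry chain: the four 32-bit beats are added
   with a carry propagated from one beat to the next, giving a 128-bit
   addition overall.  The initial carry is read from, and the final
   carry written back through, *carry.  */
static void
model_vadcq_u32 (uint32_t r[4], const uint32_t a[4],
                 const uint32_t b[4], unsigned *carry)
{
  unsigned c = *carry & 1;
  for (int i = 0; i < 4; i++)
    {
      uint64_t sum = (uint64_t) a[i] + b[i] + c;
      r[i] = (uint32_t) sum;		/* low 32 bits stay in the beat */
      c = (unsigned) (sum >> 32);	/* carry-out feeds the next beat */
    }
  *carry = c;
}
```

The beat-wise subtracts (vsbcq/vsbciq) follow the same chain with a borrow instead of a carry.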

gcc/ChangeLog:

2019-11-08  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vadciq_s32): Define macro.
(vadciq_u32): Likewise.
(vadciq_m_s32): Likewise.
(vadciq_m_u32): Likewise.
(vadcq_s32): Likewise.
(vadcq_u32): Likewise.
(vadcq_m_s32): Likewise.
(vadcq_m_u32): Likewise.
(vsbciq_s32): Likewise.
(vsbciq_u32): Likewise.
(vsbciq_m_s32): Likewise.
(vsbciq_m_u32): Likewise.
(vsbcq_s32): Likewise.
(vsbcq_u32): Likewise.
(vsbcq_m_s32): Likewise.
(vsbcq_m_u32): Likewise.
(__arm_vadciq_s32): Define intrinsic.
(__arm_vadciq_u32): Likewise.
(__arm_vadciq_m_s32): Likewise.
(__arm_vadciq_m_u32): Likewise.
(__arm_vadcq_s32): Likewise.
(__arm_vadcq_u32): Likewise.
(__arm_vadcq_m_s32): Likewise.
(__arm_vadcq_m_u32): Likewise.
(__arm_vsbciq_s32): Likewise.
(__arm_vsbciq_u32): Likewise.
(__arm_vsbciq_m_s32): Likewise.
(__arm_vsbciq_m_u32): Likewise.
(__arm_vsbcq_s32): Likewise.
(__arm_vsbcq_u32): Likewise.
(__arm_vsbcq_m_s32): Likewise.
(__arm_vsbcq_m_u32): Likewise.
(vadciq_m): Define polymorphic variant.
(vadciq): Likewise.
(vadcq_m): Likewise.
(vadcq): Likewise.
(vsbciq_m): Likewise.
(vsbciq): Likewise.
(vsbcq_m): Likewise.
(vsbcq): Likewise.
* config/arm/arm_mve_builtins.def (BINOP_NONE_NONE_NONE): Use builtin
qualifier.
(BINOP_UNONE_UNONE_UNONE): Likewise.
(QUADOP_NONE_NONE_NONE_NONE_UNONE): Likewise.
(QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE): Likewise.
* config/arm/mve.md (VADCIQ): Define iterator.
(VADCIQ_M): Likewise.
(VSBCQ): Likewise.
(VSBCQ_M): Likewise.
(VSBCIQ): Likewise.
(VSBCIQ_M): Likewise.
(VADCQ): Likewise.
(VADCQ_M): Likewise.
(mve_vadciq_m_v4si): Define RTL pattern.
(mve_vadciq_v4si): Likewise.
(mve_vadcq_m_v4si): Likewise.
(mve_vadcq_v4si): Likewise.
(mve_vsbciq_m_v4si): Likewise.
(mve_vsbciq_v4si): Likewise.
(mve_vsbcq_m_v4si): Likewise.
(mve_vsbcq_v4si): Likewise.

gcc/testsuite/ChangeLog:

2019-11-08  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vadciq_m_s32.c: New test.
* gcc.target/arm/mve/intrinsics/vadciq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadciq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadciq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadcq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadcq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadcq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vadcq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbciq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbciq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbcq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsbcq_u32.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
31ad3fc5cddfedede02b10e194a426a98bd13024..1704b622c5d6e0abcf814ae1d439bb732f0bd76e
 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -2450,6 +2450,22 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t;
 #define vrev32q_x_f16(__a, __p) __arm_vrev32q_x_f16(__a, __p)
 #define vrev64q_x_f16(__a, __p) __arm_vrev64q_x_f16(__a, __p)
 #define vrev64q_x_f32(__a, __p) __arm_vrev64q_x_f32(__a, __p)
+#define vadciq_s32(__a, __b,  __carry_out) __arm_vadciq_s32(__a, __b,  
__carry_out)
+#define vadciq_u32(__a, __b,  __carry_out) __arm_vadciq_u32(__a, __b,  
__carry_out)
+#define vadciq_m_s32(__inactive, __a, __b,  __carry_out, __p) 
__arm_vadciq_m_s32(__inactive, __a, __b,  __carry_out, __p)
+#define vadciq_m_u32(__inactive

[PATCH][ARM][GCC][9x]: MVE ACLE predicated intrinsics with (dont-care) variant.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE predicated intrinsics with the `_x`
(don't-care) variant.
* ``_x`` (don't-care) indicates that the false-predicated lanes have
undefined values.
These are syntactic sugar for the merge intrinsics with a ``vuninitializedq``
inactive parameter.

vabdq_x_f16, vabdq_x_f32, vabdq_x_s16, vabdq_x_s32, vabdq_x_s8, vabdq_x_u16, 
vabdq_x_u32, vabdq_x_u8,
vabsq_x_f16, vabsq_x_f32, vabsq_x_s16, vabsq_x_s32, vabsq_x_s8, vaddq_x_f16, 
vaddq_x_f32, vaddq_x_n_f16,
vaddq_x_n_f32, vaddq_x_n_s16, vaddq_x_n_s32, vaddq_x_n_s8, vaddq_x_n_u16, 
vaddq_x_n_u32, vaddq_x_n_u8,
vaddq_x_s16, vaddq_x_s32, vaddq_x_s8, vaddq_x_u16, vaddq_x_u32, vaddq_x_u8, 
vandq_x_f16, vandq_x_f32,
vandq_x_s16, vandq_x_s32, vandq_x_s8, vandq_x_u16, vandq_x_u32, vandq_x_u8, 
vbicq_x_f16, vbicq_x_f32,
vbicq_x_s16, vbicq_x_s32, vbicq_x_s8, vbicq_x_u16, vbicq_x_u32, vbicq_x_u8, 
vbrsrq_x_n_f16,
vbrsrq_x_n_f32, vbrsrq_x_n_s16, vbrsrq_x_n_s32, vbrsrq_x_n_s8, vbrsrq_x_n_u16, 
vbrsrq_x_n_u32,
vbrsrq_x_n_u8, vcaddq_rot270_x_f16, vcaddq_rot270_x_f32, vcaddq_rot270_x_s16, 
vcaddq_rot270_x_s32,
vcaddq_rot270_x_s8, vcaddq_rot270_x_u16, vcaddq_rot270_x_u32, 
vcaddq_rot270_x_u8, vcaddq_rot90_x_f16,
vcaddq_rot90_x_f32, vcaddq_rot90_x_s16, vcaddq_rot90_x_s32, vcaddq_rot90_x_s8, 
vcaddq_rot90_x_u16,
vcaddq_rot90_x_u32, vcaddq_rot90_x_u8, vclsq_x_s16, vclsq_x_s32, vclsq_x_s8, 
vclzq_x_s16, vclzq_x_s32,
vclzq_x_s8, vclzq_x_u16, vclzq_x_u32, vclzq_x_u8, vcmulq_rot180_x_f16, 
vcmulq_rot180_x_f32,
vcmulq_rot270_x_f16, vcmulq_rot270_x_f32, vcmulq_rot90_x_f16, 
vcmulq_rot90_x_f32, vcmulq_x_f16,
vcmulq_x_f32, vcvtaq_x_s16_f16, vcvtaq_x_s32_f32, vcvtaq_x_u16_f16, 
vcvtaq_x_u32_f32, vcvtbq_x_f32_f16,
vcvtmq_x_s16_f16, vcvtmq_x_s32_f32, vcvtmq_x_u16_f16, vcvtmq_x_u32_f32, 
vcvtnq_x_s16_f16,
vcvtnq_x_s32_f32, vcvtnq_x_u16_f16, vcvtnq_x_u32_f32, vcvtpq_x_s16_f16, 
vcvtpq_x_s32_f32,
vcvtpq_x_u16_f16, vcvtpq_x_u32_f32, vcvtq_x_f16_s16, vcvtq_x_f16_u16, 
vcvtq_x_f32_s32, vcvtq_x_f32_u32,
vcvtq_x_n_f16_s16, vcvtq_x_n_f16_u16, vcvtq_x_n_f32_s32, vcvtq_x_n_f32_u32, 
vcvtq_x_n_s16_f16,
vcvtq_x_n_s32_f32, vcvtq_x_n_u16_f16, vcvtq_x_n_u32_f32, vcvtq_x_s16_f16, 
vcvtq_x_s32_f32,
vcvtq_x_u16_f16, vcvtq_x_u32_f32, vcvttq_x_f32_f16, vddupq_x_n_u16, 
vddupq_x_n_u32, vddupq_x_n_u8,
vddupq_x_wb_u16, vddupq_x_wb_u32, vddupq_x_wb_u8, vdupq_x_n_f16, vdupq_x_n_f32, 
vdupq_x_n_s16,
vdupq_x_n_s32, vdupq_x_n_s8, vdupq_x_n_u16, vdupq_x_n_u32, vdupq_x_n_u8, 
vdwdupq_x_n_u16, vdwdupq_x_n_u32,
vdwdupq_x_n_u8, vdwdupq_x_wb_u16, vdwdupq_x_wb_u32, vdwdupq_x_wb_u8, 
veorq_x_f16, veorq_x_f32, veorq_x_s16,
veorq_x_s32, veorq_x_s8, veorq_x_u16, veorq_x_u32, veorq_x_u8, vhaddq_x_n_s16, 
vhaddq_x_n_s32,
vhaddq_x_n_s8, vhaddq_x_n_u16, vhaddq_x_n_u32, vhaddq_x_n_u8, vhaddq_x_s16, 
vhaddq_x_s32, vhaddq_x_s8,
vhaddq_x_u16, vhaddq_x_u32, vhaddq_x_u8, vhcaddq_rot270_x_s16, 
vhcaddq_rot270_x_s32, vhcaddq_rot270_x_s8,
vhcaddq_rot90_x_s16, vhcaddq_rot90_x_s32, vhcaddq_rot90_x_s8, vhsubq_x_n_s16, 
vhsubq_x_n_s32,
vhsubq_x_n_s8, vhsubq_x_n_u16, vhsubq_x_n_u32, vhsubq_x_n_u8, vhsubq_x_s16, 
vhsubq_x_s32, vhsubq_x_s8,
vhsubq_x_u16, vhsubq_x_u32, vhsubq_x_u8, vidupq_x_n_u16, vidupq_x_n_u32, 
vidupq_x_n_u8, vidupq_x_wb_u16,
vidupq_x_wb_u32, vidupq_x_wb_u8, viwdupq_x_n_u16, viwdupq_x_n_u32, 
viwdupq_x_n_u8, viwdupq_x_wb_u16,
viwdupq_x_wb_u32, viwdupq_x_wb_u8, vmaxnmq_x_f16, vmaxnmq_x_f32, vmaxq_x_s16, 
vmaxq_x_s32, vmaxq_x_s8,
vmaxq_x_u16, vmaxq_x_u32, vmaxq_x_u8, vminnmq_x_f16, vminnmq_x_f32, 
vminq_x_s16, vminq_x_s32, vminq_x_s8,
vminq_x_u16, vminq_x_u32, vminq_x_u8, vmovlbq_x_s16, vmovlbq_x_s8, 
vmovlbq_x_u16, vmovlbq_x_u8,
vmovltq_x_s16, vmovltq_x_s8, vmovltq_x_u16, vmovltq_x_u8, vmulhq_x_s16, 
vmulhq_x_s32, vmulhq_x_s8,
vmulhq_x_u16, vmulhq_x_u32, vmulhq_x_u8, vmullbq_int_x_s16, vmullbq_int_x_s32, 
vmullbq_int_x_s8,
vmullbq_int_x_u16, vmullbq_int_x_u32, vmullbq_int_x_u8, vmullbq_poly_x_p16, 
vmullbq_poly_x_p8,
vmulltq_int_x_s16, vmulltq_int_x_s32, vmulltq_int_x_s8, vmulltq_int_x_u16, 
vmulltq_int_x_u32,
vmulltq_int_x_u8, vmulltq_poly_x_p16, vmulltq_poly_x_p8, vmulq_x_f16, 
vmulq_x_f32, vmulq_x_n_f16,
vmulq_x_n_f32, vmulq_x_n_s16, vmulq_x_n_s32, vmulq_x_n_s8, vmulq_x_n_u16, 
vmulq_x_n_u32, vmulq_x_n_u8,
vmulq_x_s16, vmulq_x_s32, vmulq_x_s8, vmulq_x_u16, vmulq_x_u32, vmulq_x_u8, 
vmvnq_x_n_s16, vmvnq_x_n_s32,
vmvnq_x_n_u16, vmvnq_x_n_u32, vmvnq_x_s16, vmvnq_x_s32, vmvnq_x_s8, 
vmvnq_x_u16, vmvnq_x_u32, vmvnq_x_u8,
vnegq_x_f16, vnegq_x_f32, vnegq_x_s16, vnegq_x_s32, vnegq_x_s8, vornq_x_f16, 
vornq_x_f32, vornq_x_s16,
vornq_x_s32, vornq_x_s8, vornq_x_u16, vornq_x_u32, vornq_x_u8, vorrq_x_f16, 
vorrq_x_f32, vorrq_x_s16,
vorrq_x_s32, vorrq_x_s8, vorrq_x_u16, vorrq_x_u32, vorrq_x_u8, vrev16q_x_s8, 
vrev16q_x_u8, vrev32q_x_f16,
vrev32q_x_s16, vrev32q_x_s8, vrev32q_x_u16, vrev32q_x_u8, vrev64q_x_f16, 
vrev64q_x_f32, vrev64q_x_s16,
vrev64q_x_s32, vrev64q_x_s8, vrev64q_x_u16, vrev64q_x_u32, vrev64q_x_u8, 
vrhaddq_x_s16, vrhaddq_x_s32,
vrhaddq_x_s8, vrhaddq_x_u16, vrhaddq_x_u32, vrhaddq_x_u8, vrmu

[PATCH][ARM][GCC][2/8x]: MVE ACLE gather load and scatter store intrinsics with writeback.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with writeback.

vldrdq_gather_base_wb_s64, vldrdq_gather_base_wb_u64, 
vldrdq_gather_base_wb_z_s64,
vldrdq_gather_base_wb_z_u64, vldrwq_gather_base_wb_f32, 
vldrwq_gather_base_wb_s32,
vldrwq_gather_base_wb_u32, vldrwq_gather_base_wb_z_f32, 
vldrwq_gather_base_wb_z_s32,
vldrwq_gather_base_wb_z_u32, vstrdq_scatter_base_wb_p_s64, 
vstrdq_scatter_base_wb_p_u64,
vstrdq_scatter_base_wb_s64, vstrdq_scatter_base_wb_u64, 
vstrwq_scatter_base_wb_p_s32,
vstrwq_scatter_base_wb_p_f32, vstrwq_scatter_base_wb_p_u32, 
vstrwq_scatter_base_wb_s32,
vstrwq_scatter_base_wb_u32, vstrwq_scatter_base_wb_f32.

Please refer to the M-profile Vector Extension (MVE) intrinsics documentation [1]
for more details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.
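A plain-C sketch of the gather-load-with-writeback shape, for orientation: each lane loads from its own address plus a common offset, and the address vector is then advanced by that offset. The real intrinsics keep the addresses in a uint32x4_t and take the offset as an immediate in bytes; this sketch uses pointers and a word offset to stay portable, and all names are invented:

```c
#include <stdint.h>

/* Sketch of a base-plus-offset gather with writeback over four 32-bit
   lanes: gather from base[i] + offset, then write the advanced
   address back into the base vector.  */
static void
model_gather_base_wb_u32 (uint32_t r[4], const uint32_t *base[4],
                          int offset_words)
{
  for (int i = 0; i < 4; i++)
    {
      r[i] = base[i][offset_words];	/* gather from base + offset */
      base[i] += offset_words;		/* writeback the advanced base */
    }
}
```

The scatter-store writeback forms are the mirror image: store each lane to base[i] + offset, then update the base vector the same way.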

gcc/ChangeLog:

2019-11-07  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (LDRGBWBS_QUALIFIERS): Define builtin
qualifier.
(LDRGBWBU_QUALIFIERS): Likewise.
(LDRGBWBS_Z_QUALIFIERS): Likewise.
(LDRGBWBU_Z_QUALIFIERS): Likewise.
(STRSBWBS_QUALIFIERS): Likewise.
(STRSBWBU_QUALIFIERS): Likewise.
(STRSBWBS_P_QUALIFIERS): Likewise.
(STRSBWBU_P_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vldrdq_gather_base_wb_s64): Define macro.
(vldrdq_gather_base_wb_u64): Likewise.
(vldrdq_gather_base_wb_z_s64): Likewise.
(vldrdq_gather_base_wb_z_u64): Likewise.
(vldrwq_gather_base_wb_f32): Likewise.
(vldrwq_gather_base_wb_s32): Likewise.
(vldrwq_gather_base_wb_u32): Likewise.
(vldrwq_gather_base_wb_z_f32): Likewise.
(vldrwq_gather_base_wb_z_s32): Likewise.
(vldrwq_gather_base_wb_z_u32): Likewise.
(vstrdq_scatter_base_wb_p_s64): Likewise.
(vstrdq_scatter_base_wb_p_u64): Likewise.
(vstrdq_scatter_base_wb_s64): Likewise.
(vstrdq_scatter_base_wb_u64): Likewise.
(vstrwq_scatter_base_wb_p_s32): Likewise.
(vstrwq_scatter_base_wb_p_f32): Likewise.
(vstrwq_scatter_base_wb_p_u32): Likewise.
(vstrwq_scatter_base_wb_s32): Likewise.
(vstrwq_scatter_base_wb_u32): Likewise.
(vstrwq_scatter_base_wb_f32): Likewise.
(__arm_vldrdq_gather_base_wb_s64): Define intrinsic.
(__arm_vldrdq_gather_base_wb_u64): Likewise.
(__arm_vldrdq_gather_base_wb_z_s64): Likewise.
(__arm_vldrdq_gather_base_wb_z_u64): Likewise.
(__arm_vldrwq_gather_base_wb_s32): Likewise.
(__arm_vldrwq_gather_base_wb_u32): Likewise.
(__arm_vldrwq_gather_base_wb_z_s32): Likewise.
(__arm_vldrwq_gather_base_wb_z_u32): Likewise.
(__arm_vstrdq_scatter_base_wb_s64): Likewise.
(__arm_vstrdq_scatter_base_wb_u64): Likewise.
(__arm_vstrdq_scatter_base_wb_p_s64): Likewise.
(__arm_vstrdq_scatter_base_wb_p_u64): Likewise.
(__arm_vstrwq_scatter_base_wb_p_s32): Likewise.
(__arm_vstrwq_scatter_base_wb_p_u32): Likewise.
(__arm_vstrwq_scatter_base_wb_s32): Likewise.
(__arm_vstrwq_scatter_base_wb_u32): Likewise.
(__arm_vldrwq_gather_base_wb_f32): Likewise.
(__arm_vldrwq_gather_base_wb_z_f32): Likewise.
(__arm_vstrwq_scatter_base_wb_f32): Likewise.
(__arm_vstrwq_scatter_base_wb_p_f32): Likewise.
(vstrwq_scatter_base_wb): Define polymorphic variant.
(vstrwq_scatter_base_wb_p): Likewise.
(vstrdq_scatter_base_wb_p): Likewise.
(vstrdq_scatter_base_wb): Likewise.
* config/arm/arm_mve_builtins.def (LDRGBWBS_QUALIFIERS): Use builtin
qualifier.
* config/arm/mve.md (mve_vstrwq_scatter_base_wb_v4si): Define RTL
pattern.
(mve_vstrwq_scatter_base_wb_add_v4si): Likewise.
(mve_vstrwq_scatter_base_wb_v4si_insn): Likewise.
(mve_vstrwq_scatter_base_wb_p_v4si): Likewise.
(mve_vstrwq_scatter_base_wb_p_add_v4si): Likewise.
(mve_vstrwq_scatter_base_wb_p_v4si_insn): Likewise.
(mve_vstrwq_scatter_base_wb_fv4sf): Likewise.
(mve_vstrwq_scatter_base_wb_add_fv4sf): Likewise.
(mve_vstrwq_scatter_base_wb_fv4sf_insn): Likewise.
(mve_vstrwq_scatter_base_wb_p_fv4sf): Likewise.
(mve_vstrwq_scatter_base_wb_p_add_fv4sf): Likewise.
(mve_vstrwq_scatter_base_wb_p_fv4sf_insn): Likewise.
(mve_vstrdq_scatter_base_wb_v2di): Likewise.
(mve_vstrdq_scatter_base_wb_add_v2di): Likewise.
(mve_vstrdq_scatter_base_wb_v2di_insn): Likewise.
(mve_vstrdq_scatter_base_wb_p_v2di): Likewise.
(mve_vstrdq_scatter_base_wb_p_add_v2di): Likewise.
(mve_vstrdq_scatter_base_wb_p_v2di_insn): Likewise.
(mve_vldrwq_gather_base_wb_v4si): Likewise.
(mve_vldrwq_gather_base_wb_v4si_insn): Likewise.
(

[PATCH][ARM][GCC][7x]: MVE vreinterpretq and vuninitializedq intrinsics.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics.

vreinterpretq_s16_s32, vreinterpretq_s16_s64, vreinterpretq_s16_s8, 
vreinterpretq_s16_u16,
vreinterpretq_s16_u32, vreinterpretq_s16_u64, vreinterpretq_s16_u8, 
vreinterpretq_s32_s16,
vreinterpretq_s32_s64, vreinterpretq_s32_s8, vreinterpretq_s32_u16, 
vreinterpretq_s32_u32,
vreinterpretq_s32_u64, vreinterpretq_s32_u8, vreinterpretq_s64_s16, 
vreinterpretq_s64_s32,
vreinterpretq_s64_s8, vreinterpretq_s64_u16, vreinterpretq_s64_u32, 
vreinterpretq_s64_u64,
vreinterpretq_s64_u8, vreinterpretq_s8_s16, vreinterpretq_s8_s32, 
vreinterpretq_s8_s64,
vreinterpretq_s8_u16, vreinterpretq_s8_u32, vreinterpretq_s8_u64, 
vreinterpretq_s8_u8,
vreinterpretq_u16_s16, vreinterpretq_u16_s32, vreinterpretq_u16_s64, 
vreinterpretq_u16_s8,
vreinterpretq_u16_u32, vreinterpretq_u16_u64, vreinterpretq_u16_u8, 
vreinterpretq_u32_s16,
vreinterpretq_u32_s32, vreinterpretq_u32_s64, vreinterpretq_u32_s8, 
vreinterpretq_u32_u16,
vreinterpretq_u32_u64, vreinterpretq_u32_u8, vreinterpretq_u64_s16, 
vreinterpretq_u64_s32,
vreinterpretq_u64_s64, vreinterpretq_u64_s8, vreinterpretq_u64_u16, 
vreinterpretq_u64_u32,
vreinterpretq_u64_u8, vreinterpretq_u8_s16, vreinterpretq_u8_s32, 
vreinterpretq_u8_s64,
vreinterpretq_u8_s8, vreinterpretq_u8_u16, vreinterpretq_u8_u32, 
vreinterpretq_u8_u64,
vreinterpretq_s32_f16, vreinterpretq_s32_f32, vreinterpretq_u16_f16, 
vreinterpretq_u16_f32,
vreinterpretq_u32_f16, vreinterpretq_u32_f32, vreinterpretq_u64_f16, 
vreinterpretq_u64_f32,
vreinterpretq_u8_f16, vreinterpretq_u8_f32, vreinterpretq_f16_f32, 
vreinterpretq_f16_s16,
vreinterpretq_f16_s32, vreinterpretq_f16_s64, vreinterpretq_f16_s8, 
vreinterpretq_f16_u16,
vreinterpretq_f16_u32, vreinterpretq_f16_u64, vreinterpretq_f16_u8, 
vreinterpretq_f32_f16,
vreinterpretq_f32_s16, vreinterpretq_f32_s32, vreinterpretq_f32_s64, 
vreinterpretq_f32_s8,
vreinterpretq_f32_u16, vreinterpretq_f32_u32, vreinterpretq_f32_u64, 
vreinterpretq_f32_u8,
vreinterpretq_s16_f16, vreinterpretq_s16_f32, vreinterpretq_s64_f16, 
vreinterpretq_s64_f32,
vreinterpretq_s8_f16, vreinterpretq_s8_f32, vuninitializedq_u8, 
vuninitializedq_u16,
vuninitializedq_u32, vuninitializedq_u64, vuninitializedq_s8, 
vuninitializedq_s16,
vuninitializedq_s32, vuninitializedq_s64, vuninitializedq_f16, 
vuninitializedq_f32 and
vuninitializedq.

Please refer to the M-profile Vector Extension (MVE) intrinsics documentation [1]
for more details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.
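vreinterpretq is a pure bit-pattern cast: the same 128 bits viewed under a different element type, with no value conversion. That can be sketched in plain C with memcpy between two 16-byte structs (the *_model type and function names are invented for the sketch):

```c
#include <stdint.h>
#include <string.h>

/* Sketch of vreinterpretq_u32_f32: reinterpret the same 128 bits as
   four uint32_t words instead of four floats.  memcpy between
   equal-sized structs is the portable way to express this in C.  */
typedef struct { float f[4]; } f32x4_model;
typedef struct { uint32_t w[4]; } u32x4_model;

static u32x4_model
model_vreinterpretq_u32_f32 (f32x4_model a)
{
  u32x4_model r;
  memcpy (&r, &a, sizeof r);	/* same bits, new type; no conversion */
  return r;
}
```

vuninitializedq, by contrast, simply yields a vector whose contents are unspecified; it exists mainly to serve as the inactive argument behind the `_x` predicated forms.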

gcc/ChangeLog:

2019-11-14  Srinath Parvathaneni  

* config/arm/arm_mve.h (vreinterpretq_s16_s32): Define macro.
(vreinterpretq_s16_s64): Likewise.
(vreinterpretq_s16_s8): Likewise.
(vreinterpretq_s16_u16): Likewise.
(vreinterpretq_s16_u32): Likewise.
(vreinterpretq_s16_u64): Likewise.
(vreinterpretq_s16_u8): Likewise.
(vreinterpretq_s32_s16): Likewise.
(vreinterpretq_s32_s64): Likewise.
(vreinterpretq_s32_s8): Likewise.
(vreinterpretq_s32_u16): Likewise.
(vreinterpretq_s32_u32): Likewise.
(vreinterpretq_s32_u64): Likewise.
(vreinterpretq_s32_u8): Likewise.
(vreinterpretq_s64_s16): Likewise.
(vreinterpretq_s64_s32): Likewise.
(vreinterpretq_s64_s8): Likewise.
(vreinterpretq_s64_u16): Likewise.
(vreinterpretq_s64_u32): Likewise.
(vreinterpretq_s64_u64): Likewise.
(vreinterpretq_s64_u8): Likewise.
(vreinterpretq_s8_s16): Likewise.
(vreinterpretq_s8_s32): Likewise.
(vreinterpretq_s8_s64): Likewise.
(vreinterpretq_s8_u16): Likewise.
(vreinterpretq_s8_u32): Likewise.
(vreinterpretq_s8_u64): Likewise.
(vreinterpretq_s8_u8): Likewise.
(vreinterpretq_u16_s16): Likewise.
(vreinterpretq_u16_s32): Likewise.
(vreinterpretq_u16_s64): Likewise.
(vreinterpretq_u16_s8): Likewise.
(vreinterpretq_u16_u32): Likewise.
(vreinterpretq_u16_u64): Likewise.
(vreinterpretq_u16_u8): Likewise.
(vreinterpretq_u32_s16): Likewise.
(vreinterpretq_u32_s32): Likewise.
(vreinterpretq_u32_s64): Likewise.
(vreinterpretq_u32_s8): Likewise.
(vreinterpretq_u32_u16): Likewise.
(vreinterpretq_u32_u64): Likewise.
(vreinterpretq_u32_u8): Likewise.
(vreinterpretq_u64_s16): Likewise.
(vreinterpretq_u64_s32): Likewise.
(vreinterpretq_u64_s64): Likewise.
(vreinterpretq_u64_s8): Likewise.
(vreinterpretq_u64_u16): Likewise.
(vreinterpretq_u64_u32): Likewise.
(vreinterpretq_u64_u8): Likewise.
(vreinterpretq_u8_s16): Likewise.
(vreinterpretq_u8_s32): Likewise.
(vreinterpretq_u8_s64): Likewise.
(vreinterpretq_u8_s8): Likewise.
(vreinterpretq_u8_u16): Likewise.
  

[PATCH][ARM][GCC][1/8x]: MVE ACLE vidup, vddup, viwdup and vdwdup intrinsics with writeback.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports following MVE ACLE intrinsics with writeback.

vddupq_m_n_u8, vddupq_m_n_u32, vddupq_m_n_u16, vddupq_m_wb_u8, vddupq_m_wb_u16,
vddupq_m_wb_u32, vddupq_n_u8, vddupq_n_u32, vddupq_n_u16, vddupq_wb_u8,
vddupq_wb_u16, vddupq_wb_u32, vdwdupq_m_n_u8, vdwdupq_m_n_u32, vdwdupq_m_n_u16,
vdwdupq_m_wb_u8, vdwdupq_m_wb_u32, vdwdupq_m_wb_u16, vdwdupq_n_u8, 
vdwdupq_n_u32,
vdwdupq_n_u16, vdwdupq_wb_u8, vdwdupq_wb_u32, vdwdupq_wb_u16, vidupq_m_n_u8,
vidupq_m_n_u32, vidupq_m_n_u16, vidupq_m_wb_u8, vidupq_m_wb_u16, 
vidupq_m_wb_u32,
vidupq_n_u8, vidupq_n_u32, vidupq_n_u16, vidupq_wb_u8, vidupq_wb_u16,
vidupq_wb_u32, viwdupq_m_n_u8, viwdupq_m_n_u32, viwdupq_m_n_u16, 
viwdupq_m_wb_u8,
viwdupq_m_wb_u32, viwdupq_m_wb_u16, viwdupq_n_u8, viwdupq_n_u32, viwdupq_n_u16,
viwdupq_wb_u8, viwdupq_wb_u32, viwdupq_wb_u16.
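As a rough scalar model of the incrementing-dup semantics (my paraphrase of the ACLE description, not code from the patch): vidupq fills lane i with base + i * imm, where imm is 1, 2, 4 or 8, and the writeback (_wb) forms also advance the scalar base past the vector:

```c
#include <stdint.h>

/* Sketch of vidupq_wb_u8 semantics: lane i gets base + i * imm,
   and the base is written back advanced by 16 * imm (16 lanes of
   u8).  imm is assumed to be 1, 2, 4 or 8.  */
static void vidup_u8_model (uint8_t out[16], uint32_t *base, uint32_t imm)
{
  for (int i = 0; i < 16; i++)
    out[i] = (uint8_t) (*base + (uint32_t) i * imm);
  *base += 16 * imm;          /* writeback */
}
```

The vddupq variants decrement instead, and the viwdupq/vdwdupq variants additionally wrap at a given limit.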

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-07  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c
(QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Define quinary
builtin qualifier.
* config/arm/arm_mve.h (vddupq_m_n_u8): Define macro.
(vddupq_m_n_u32): Likewise.
(vddupq_m_n_u16): Likewise.
(vddupq_m_wb_u8): Likewise.
(vddupq_m_wb_u16): Likewise.
(vddupq_m_wb_u32): Likewise.
(vddupq_n_u8): Likewise.
(vddupq_n_u32): Likewise.
(vddupq_n_u16): Likewise.
(vddupq_wb_u8): Likewise.
(vddupq_wb_u16): Likewise.
(vddupq_wb_u32): Likewise.
(vdwdupq_m_n_u8): Likewise.
(vdwdupq_m_n_u32): Likewise.
(vdwdupq_m_n_u16): Likewise.
(vdwdupq_m_wb_u8): Likewise.
(vdwdupq_m_wb_u32): Likewise.
(vdwdupq_m_wb_u16): Likewise.
(vdwdupq_n_u8): Likewise.
(vdwdupq_n_u32): Likewise.
(vdwdupq_n_u16): Likewise.
(vdwdupq_wb_u8): Likewise.
(vdwdupq_wb_u32): Likewise.
(vdwdupq_wb_u16): Likewise.
(vidupq_m_n_u8): Likewise.
(vidupq_m_n_u32): Likewise.
(vidupq_m_n_u16): Likewise.
(vidupq_m_wb_u8): Likewise.
(vidupq_m_wb_u16): Likewise.
(vidupq_m_wb_u32): Likewise.
(vidupq_n_u8): Likewise.
(vidupq_n_u32): Likewise.
(vidupq_n_u16): Likewise.
(vidupq_wb_u8): Likewise.
(vidupq_wb_u16): Likewise.
(vidupq_wb_u32): Likewise.
(viwdupq_m_n_u8): Likewise.
(viwdupq_m_n_u32): Likewise.
(viwdupq_m_n_u16): Likewise.
(viwdupq_m_wb_u8): Likewise.
(viwdupq_m_wb_u32): Likewise.
(viwdupq_m_wb_u16): Likewise.
(viwdupq_n_u8): Likewise.
(viwdupq_n_u32): Likewise.
(viwdupq_n_u16): Likewise.
(viwdupq_wb_u8): Likewise.
(viwdupq_wb_u32): Likewise.
(viwdupq_wb_u16): Likewise.
(__arm_vddupq_m_n_u8): Define intrinsic.
(__arm_vddupq_m_n_u32): Likewise.
(__arm_vddupq_m_n_u16): Likewise.
(__arm_vddupq_m_wb_u8): Likewise.
(__arm_vddupq_m_wb_u16): Likewise.
(__arm_vddupq_m_wb_u32): Likewise.
(__arm_vddupq_n_u8): Likewise.
(__arm_vddupq_n_u32): Likewise.
(__arm_vddupq_n_u16): Likewise.
(__arm_vdwdupq_m_n_u8): Likewise.
(__arm_vdwdupq_m_n_u32): Likewise.
(__arm_vdwdupq_m_n_u16): Likewise.
(__arm_vdwdupq_m_wb_u8): Likewise.
(__arm_vdwdupq_m_wb_u32): Likewise.
(__arm_vdwdupq_m_wb_u16): Likewise.
(__arm_vdwdupq_n_u8): Likewise.
(__arm_vdwdupq_n_u32): Likewise.
(__arm_vdwdupq_n_u16): Likewise.
(__arm_vdwdupq_wb_u8): Likewise.
(__arm_vdwdupq_wb_u32): Likewise.
(__arm_vdwdupq_wb_u16): Likewise.
(__arm_vidupq_m_n_u8): Likewise.
(__arm_vidupq_m_n_u32): Likewise.
(__arm_vidupq_m_n_u16): Likewise.
(__arm_vidupq_n_u8): Likewise.
(__arm_vidupq_m_wb_u8): Likewise.
(__arm_vidupq_m_wb_u16): Likewise.
(__arm_vidupq_m_wb_u32): Likewise.
(__arm_vidupq_n_u32): Likewise.
(__arm_vidupq_n_u16): Likewise.
(__arm_vidupq_wb_u8): Likewise.
(__arm_vidupq_wb_u16): Likewise.
(__arm_vidupq_wb_u32): Likewise.
(__arm_vddupq_wb_u8): Likewise.
(__arm_vddupq_wb_u16): Likewise.
(__arm_vddupq_wb_u32): Likewise.
(__arm_viwdupq_m_n_u8): Likewise.
(__arm_viwdupq_m_n_u32): Likewise.
(__arm_viwdupq_m_n_u16): Likewise.
(__arm_viwdupq_m_wb_u8): Likewise.
(__arm_viwdupq_m_wb_u32): Likewise.
(__arm_viwdupq_m_wb_u16): Likewise.
(__arm_viwdupq_n_u8): Likewise.
(__arm_viwdupq_n_u32): Likewise.
(__arm_viwdupq_n_u16): Likewise.
(__arm_viwdupq_

[PATCH][ARM][GCC][7/5x]: MVE store intrinsics which stores byte,half word or word to memory.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE store intrinsics, which store a byte,
halfword, or word to memory.

vst1q_f32, vst1q_f16, vst1q_s8, vst1q_s32, vst1q_s16, vst1q_u8, vst1q_u32, 
vst1q_u16,
vstrhq_f16, vstrhq_scatter_offset_s32, vstrhq_scatter_offset_s16, 
vstrhq_scatter_offset_u32,
vstrhq_scatter_offset_u16, vstrhq_scatter_offset_p_s32, 
vstrhq_scatter_offset_p_s16,
vstrhq_scatter_offset_p_u32, vstrhq_scatter_offset_p_u16, 
vstrhq_scatter_shifted_offset_s32,
vstrhq_scatter_shifted_offset_s16, vstrhq_scatter_shifted_offset_u32,
vstrhq_scatter_shifted_offset_u16, vstrhq_scatter_shifted_offset_p_s32,
vstrhq_scatter_shifted_offset_p_s16, vstrhq_scatter_shifted_offset_p_u32,
vstrhq_scatter_shifted_offset_p_u16, vstrhq_s32, vstrhq_s16, vstrhq_u32, 
vstrhq_u16,
vstrhq_p_f16, vstrhq_p_s32, vstrhq_p_s16, vstrhq_p_u32, vstrhq_p_u16, 
vstrwq_f32,
vstrwq_s32, vstrwq_u32, vstrwq_p_f32, vstrwq_p_s32, vstrwq_p_u32.
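The difference between the _scatter_offset and _scatter_shifted_offset forms is how the per-lane byte address is computed; a scalar sketch of the halfword case (my reading of the MVE addressing modes, not the builtin's implementation) is:

```c
#include <stdint.h>

/* Model of vstrhq_scatter_shifted_offset: element i of `v` is
   stored at base + (offs[i] << 1), i.e. the offset is scaled by
   the halfword size before being added, so it acts as an element
   index.  The unshifted _scatter_offset form uses raw byte
   offsets instead.  */
static void scatter_h_shifted (uint16_t *base, const uint16_t offs[8],
                               const uint16_t v[8])
{
  for (int i = 0; i < 8; i++)
    base[offs[i]] = v[i];
}
```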

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1]  
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-01  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vst1q_f32): Define macro.
(vst1q_f16): Likewise.
(vst1q_s8): Likewise.
(vst1q_s32): Likewise.
(vst1q_s16): Likewise.
(vst1q_u8): Likewise.
(vst1q_u32): Likewise.
(vst1q_u16): Likewise.
(vstrhq_f16): Likewise.
(vstrhq_scatter_offset_s32): Likewise.
(vstrhq_scatter_offset_s16): Likewise.
(vstrhq_scatter_offset_u32): Likewise.
(vstrhq_scatter_offset_u16): Likewise.
(vstrhq_scatter_offset_p_s32): Likewise.
(vstrhq_scatter_offset_p_s16): Likewise.
(vstrhq_scatter_offset_p_u32): Likewise.
(vstrhq_scatter_offset_p_u16): Likewise.
(vstrhq_scatter_shifted_offset_s32): Likewise.
(vstrhq_scatter_shifted_offset_s16): Likewise.
(vstrhq_scatter_shifted_offset_u32): Likewise.
(vstrhq_scatter_shifted_offset_u16): Likewise.
(vstrhq_scatter_shifted_offset_p_s32): Likewise.
(vstrhq_scatter_shifted_offset_p_s16): Likewise.
(vstrhq_scatter_shifted_offset_p_u32): Likewise.
(vstrhq_scatter_shifted_offset_p_u16): Likewise.
(vstrhq_s32): Likewise.
(vstrhq_s16): Likewise.
(vstrhq_u32): Likewise.
(vstrhq_u16): Likewise.
(vstrhq_p_f16): Likewise.
(vstrhq_p_s32): Likewise.
(vstrhq_p_s16): Likewise.
(vstrhq_p_u32): Likewise.
(vstrhq_p_u16): Likewise.
(vstrwq_f32): Likewise.
(vstrwq_s32): Likewise.
(vstrwq_u32): Likewise.
(vstrwq_p_f32): Likewise.
(vstrwq_p_s32): Likewise.
(vstrwq_p_u32): Likewise.
(__arm_vst1q_s8): Define intrinsic.
(__arm_vst1q_s32): Likewise.
(__arm_vst1q_s16): Likewise.
(__arm_vst1q_u8): Likewise.
(__arm_vst1q_u32): Likewise.
(__arm_vst1q_u16): Likewise.
(__arm_vstrhq_scatter_offset_s32): Likewise.
(__arm_vstrhq_scatter_offset_s16): Likewise.
(__arm_vstrhq_scatter_offset_u32): Likewise.
(__arm_vstrhq_scatter_offset_u16): Likewise.
(__arm_vstrhq_scatter_offset_p_s32): Likewise.
(__arm_vstrhq_scatter_offset_p_s16): Likewise.
(__arm_vstrhq_scatter_offset_p_u32): Likewise.
(__arm_vstrhq_scatter_offset_p_u16): Likewise.
(__arm_vstrhq_scatter_shifted_offset_s32): Likewise.
(__arm_vstrhq_scatter_shifted_offset_s16): Likewise.
(__arm_vstrhq_scatter_shifted_offset_u32): Likewise.
(__arm_vstrhq_scatter_shifted_offset_u16): Likewise.
(__arm_vstrhq_scatter_shifted_offset_p_s32): Likewise.
(__arm_vstrhq_scatter_shifted_offset_p_s16): Likewise.
(__arm_vstrhq_scatter_shifted_offset_p_u32): Likewise.
(__arm_vstrhq_scatter_shifted_offset_p_u16): Likewise.
(__arm_vstrhq_s32): Likewise.
(__arm_vstrhq_s16): Likewise.
(__arm_vstrhq_u32): Likewise.
(__arm_vstrhq_u16): Likewise.
(__arm_vstrhq_p_s32): Likewise.
(__arm_vstrhq_p_s16): Likewise.
(__arm_vstrhq_p_u32): Likewise.
(__arm_vstrhq_p_u16): Likewise.
(__arm_vstrwq_s32): Likewise.
(__arm_vstrwq_u32): Likewise.
(__arm_vstrwq_p_s32): Likewise.
(__arm_vstrwq_p_u32): Likewise.
(__arm_vstrwq_p_f32): Likewise.
(__arm_vstrwq_f32): Likewise.
(__arm_vst1q_f32): Likewise.
(__arm_vst1q_f16): Likewise.
(__arm_vstrhq_f16): Likewise.
(__arm_vstrhq_p_f16): Likewise.
(vst1q): Define polymorphic variant.
(vstrhq): Likewise.
(vstrhq_p): Likewise.
(vstrhq_scatter_offset_p): Likewise.
(vstrhq_scatter_offset): Likewise.
 

[PATCH][ARM][GCC][5/5x]: MVE ACLE load intrinsics which load a byte, halfword, or word from memory.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE load intrinsics which load a byte, 
halfword,
or word from memory.
vld1q_s8, vld1q_s32, vld1q_s16, vld1q_u8, vld1q_u32, vld1q_u16, 
vldrhq_gather_offset_s32,
vldrhq_gather_offset_s16, vldrhq_gather_offset_u32, vldrhq_gather_offset_u16,
vldrhq_gather_offset_z_s32, vldrhq_gather_offset_z_s16, 
vldrhq_gather_offset_z_u32,
vldrhq_gather_offset_z_u16, vldrhq_gather_shifted_offset_s32, vldrwq_f32, 
vldrwq_z_f32,
vldrhq_gather_shifted_offset_s16, vldrhq_gather_shifted_offset_u32,
vldrhq_gather_shifted_offset_u16, vldrhq_gather_shifted_offset_z_s32,
vldrhq_gather_shifted_offset_z_s16, vldrhq_gather_shifted_offset_z_u32,
vldrhq_gather_shifted_offset_z_u16, vldrhq_s32, vldrhq_s16, vldrhq_u32, 
vldrhq_u16,
vldrhq_z_s32, vldrhq_z_s16, vldrhq_z_u32, vldrhq_z_u16, vldrwq_s32, vldrwq_u32,
vldrwq_z_s32, vldrwq_z_u32, vld1q_f32, vld1q_f16, vldrhq_f16, vldrhq_z_f16.
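As with the stores, the _gather_offset loads use raw byte offsets while the _gather_shifted_offset loads scale the offset by the element size first. A scalar sketch of the shifted halfword gather (illustrative only, not the builtin's implementation):

```c
#include <stdint.h>

/* Model of vldrhq_gather_shifted_offset: lane i is loaded from
   base + (offs[i] << 1), so offs acts as a vector of halfword
   element indices rather than byte offsets.  */
static void gather_h_shifted (uint16_t out[8], const uint16_t *base,
                              const uint16_t offs[8])
{
  for (int i = 0; i < 8; i++)
    out[i] = base[offs[i]];
}
```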

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1]  
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-01  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vld1q_s8): Define macro.
(vld1q_s32): Likewise.
(vld1q_s16): Likewise.
(vld1q_u8): Likewise.
(vld1q_u32): Likewise.
(vld1q_u16): Likewise.
(vldrhq_gather_offset_s32): Likewise.
(vldrhq_gather_offset_s16): Likewise.
(vldrhq_gather_offset_u32): Likewise.
(vldrhq_gather_offset_u16): Likewise.
(vldrhq_gather_offset_z_s32): Likewise.
(vldrhq_gather_offset_z_s16): Likewise.
(vldrhq_gather_offset_z_u32): Likewise.
(vldrhq_gather_offset_z_u16): Likewise.
(vldrhq_gather_shifted_offset_s32): Likewise.
(vldrhq_gather_shifted_offset_s16): Likewise.
(vldrhq_gather_shifted_offset_u32): Likewise.
(vldrhq_gather_shifted_offset_u16): Likewise.
(vldrhq_gather_shifted_offset_z_s32): Likewise.
(vldrhq_gather_shifted_offset_z_s16): Likewise.
(vldrhq_gather_shifted_offset_z_u32): Likewise.
(vldrhq_gather_shifted_offset_z_u16): Likewise.
(vldrhq_s32): Likewise.
(vldrhq_s16): Likewise.
(vldrhq_u32): Likewise.
(vldrhq_u16): Likewise.
(vldrhq_z_s32): Likewise.
(vldrhq_z_s16): Likewise.
(vldrhq_z_u32): Likewise.
(vldrhq_z_u16): Likewise.
(vldrwq_s32): Likewise.
(vldrwq_u32): Likewise.
(vldrwq_z_s32): Likewise.
(vldrwq_z_u32): Likewise.
(vld1q_f32): Likewise.
(vld1q_f16): Likewise.
(vldrhq_f16): Likewise.
(vldrhq_z_f16): Likewise.
(vldrwq_f32): Likewise.
(vldrwq_z_f32): Likewise.
(__arm_vld1q_s8): Define intrinsic.
(__arm_vld1q_s32): Likewise.
(__arm_vld1q_s16): Likewise.
(__arm_vld1q_u8): Likewise.
(__arm_vld1q_u32): Likewise.
(__arm_vld1q_u16): Likewise.
(__arm_vldrhq_gather_offset_s32): Likewise.
(__arm_vldrhq_gather_offset_s16): Likewise.
(__arm_vldrhq_gather_offset_u32): Likewise.
(__arm_vldrhq_gather_offset_u16): Likewise.
(__arm_vldrhq_gather_offset_z_s32): Likewise.
(__arm_vldrhq_gather_offset_z_s16): Likewise.
(__arm_vldrhq_gather_offset_z_u32): Likewise.
(__arm_vldrhq_gather_offset_z_u16): Likewise.
(__arm_vldrhq_gather_shifted_offset_s32): Likewise.
(__arm_vldrhq_gather_shifted_offset_s16): Likewise.
(__arm_vldrhq_gather_shifted_offset_u32): Likewise.
(__arm_vldrhq_gather_shifted_offset_u16): Likewise.
(__arm_vldrhq_gather_shifted_offset_z_s32): Likewise.
(__arm_vldrhq_gather_shifted_offset_z_s16): Likewise.
(__arm_vldrhq_gather_shifted_offset_z_u32): Likewise.
(__arm_vldrhq_gather_shifted_offset_z_u16): Likewise.
(__arm_vldrhq_s32): Likewise.
(__arm_vldrhq_s16): Likewise.
(__arm_vldrhq_u32): Likewise.
(__arm_vldrhq_u16): Likewise.
(__arm_vldrhq_z_s32): Likewise.
(__arm_vldrhq_z_s16): Likewise.
(__arm_vldrhq_z_u32): Likewise.
(__arm_vldrhq_z_u16): Likewise.
(__arm_vldrwq_s32): Likewise.
(__arm_vldrwq_u32): Likewise.
(__arm_vldrwq_z_s32): Likewise.
(__arm_vldrwq_z_u32): Likewise.
(__arm_vld1q_f32): Likewise.
(__arm_vld1q_f16): Likewise.
(__arm_vldrwq_f32): Likewise.
(__arm_vldrwq_z_f32): Likewise.
(__arm_vldrhq_z_f16): Likewise.
(__arm_vldrhq_f16): Likewise.
(vld1q): Define polymorphic variant.
(vldrhq_gather_offset): Likewise.
(vldrhq_gather_offset_z): Likewise.
(vldrhq_gather_shifted_offset): Likewise.
(vldrhq_gather_shifted_offset_z): Likewise.
* config

[PATCH][ARM][GCC][6x]: MVE ACLE vaddq intrinsics using arithmetic plus operator.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE vaddq intrinsics. The RTL patterns for
these intrinsics are added using the arithmetic "plus" operator.

vaddq_s8, vaddq_s16, vaddq_s32, vaddq_u8, vaddq_u16, vaddq_u32, vaddq_f16, 
vaddq_f32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1]  
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-05  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vaddq_s8): Define macro.
(vaddq_s16): Likewise.
(vaddq_s32): Likewise.
(vaddq_u8): Likewise.
(vaddq_u16): Likewise.
(vaddq_u32): Likewise.
(vaddq_f16): Likewise.
(vaddq_f32): Likewise.
(__arm_vaddq_s8): Define intrinsic.
(__arm_vaddq_s16): Likewise.
(__arm_vaddq_s32): Likewise.
(__arm_vaddq_u8): Likewise.
(__arm_vaddq_u16): Likewise.
(__arm_vaddq_u32): Likewise.
(__arm_vaddq_f16): Likewise.
(__arm_vaddq_f32): Likewise.
(vaddq): Define polymorphic variant.
* config/arm/iterators.md (VNIM): Define mode iterator for common types
Neon, IWMMXT and MVE.
(VNINOTM): Likewise.
* config/arm/mve.md (mve_vaddq): Define RTL pattern.
(mve_vaddq_f): Define RTL pattern.
* config/arm/neon.md (add3): Rename to addv4hf3 RTL pattern.
(addv8hf3_neon): Define RTL pattern.
* config/arm/vec-common.md (add3): Modify standard add RTL pattern
to support MVE.
(addv8hf3): Define standard RTL pattern for MVE and Neon.
(add3): Modify existing standard add RTL pattern for Neon and 
IWMMXT.

gcc/testsuite/ChangeLog:

2019-11-05  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vaddq_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vaddq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_u8.c: Likewise.
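Because these intrinsics are defined directly on GCC vector types as `__a + __b` (visible in the diff below), vaddq is plain lane-wise addition with the usual unsigned wrap-around. A portable model of the u8 case, for illustration only:

```c
#include <stdint.h>

/* Lane-wise model of __arm_vaddq_u8: each byte lane is added
   independently, wrapping modulo 256 -- exactly what `__a + __b`
   means on a 16-lane GCC vector type.  */
static void vaddq_u8_model (uint8_t r[16], const uint8_t a[16],
                            const uint8_t b[16])
{
  for (int i = 0; i < 16; i++)
    r[i] = (uint8_t) (a[i] + b[i]);  /* wraps modulo 256 */
}
```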


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
42e98f9ad1e357fe974e58378a49bcaaf36c302a..89456589c9dcdff5b56e8707dd720fb15141
 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -1898,6 +1898,14 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t;
 #define vstrwq_scatter_shifted_offset_p_u32(__base, __offset, __value, __p) 
__arm_vstrwq_scatter_shifted_offset_p_u32(__base, __offset, __value, __p)
 #define vstrwq_scatter_shifted_offset_s32(__base, __offset, __value) 
__arm_vstrwq_scatter_shifted_offset_s32(__base, __offset, __value)
 #define vstrwq_scatter_shifted_offset_u32(__base, __offset, __value) 
__arm_vstrwq_scatter_shifted_offset_u32(__base, __offset, __value)
+#define vaddq_s8(__a, __b) __arm_vaddq_s8(__a, __b)
+#define vaddq_s16(__a, __b) __arm_vaddq_s16(__a, __b)
+#define vaddq_s32(__a, __b) __arm_vaddq_s32(__a, __b)
+#define vaddq_u8(__a, __b) __arm_vaddq_u8(__a, __b)
+#define vaddq_u16(__a, __b) __arm_vaddq_u16(__a, __b)
+#define vaddq_u32(__a, __b) __arm_vaddq_u32(__a, __b)
+#define vaddq_f16(__a, __b) __arm_vaddq_f16(__a, __b)
+#define vaddq_f32(__a, __b) __arm_vaddq_f32(__a, __b)
 #endif
 
 __extension__ extern __inline void
@@ -12341,6 +12349,48 @@ __arm_vstrwq_scatter_shifted_offset_u32 (uint32_t * 
__base, uint32x4_t __offset,
   __builtin_mve_vstrwq_scatter_shifted_offset_uv4si ((__builtin_neon_si *) 
__base, __offset, __value);
 }
 
+__extension__ extern __inline int8x16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vaddq_s8 (int8x16_t __a, int8x16_t __b)
+{
+  return __a + __b;
+}
+
+__extension__ extern __inline int16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vaddq_s16 (int16x8_t __a, int16x8_t __b)
+{
+  return __a + __b;
+}
+
+__extension__ extern __inline int32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vaddq_s32 (int32x4_t __a, int32x4_t __b)
+{
+  return __a + __b;
+}
+
+__extension__ extern __inline uint8x16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vaddq_u8 (uint8x16_t __a, uint8x16_t __b)
+{
+  return __a + __b;
+}
+
+__extension__ extern __inline uint16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vaddq_u16 (uint16x8_t __a, uint16x8_t __b)
+{
+  return __a + __b;
+}
+
+__extension__ extern __inline uint32x4_t
+__attribute__ ((__always_inline__, __gnu_inl

[PATCH][ARM][GCC][6/5x]: Remaining MVE load intrinsics which loads half word and word or double word from memory.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the remaining MVE ACLE load intrinsics, which load a halfword,
word, or double word from memory.

vldrdq_gather_base_s64, vldrdq_gather_base_u64, vldrdq_gather_base_z_s64,
vldrdq_gather_base_z_u64, vldrdq_gather_offset_s64, vldrdq_gather_offset_u64,
vldrdq_gather_offset_z_s64, vldrdq_gather_offset_z_u64, 
vldrdq_gather_shifted_offset_s64,
vldrdq_gather_shifted_offset_u64, vldrdq_gather_shifted_offset_z_s64,
vldrdq_gather_shifted_offset_z_u64, vldrhq_gather_offset_f16, 
vldrhq_gather_offset_z_f16,
vldrhq_gather_shifted_offset_f16, vldrhq_gather_shifted_offset_z_f16, 
vldrwq_gather_base_f32,
vldrwq_gather_base_z_f32, vldrwq_gather_offset_f32, vldrwq_gather_offset_s32,
vldrwq_gather_offset_u32, vldrwq_gather_offset_z_f32, 
vldrwq_gather_offset_z_s32,
vldrwq_gather_offset_z_u32, vldrwq_gather_shifted_offset_f32, 
vldrwq_gather_shifted_offset_s32,
vldrwq_gather_shifted_offset_u32, vldrwq_gather_shifted_offset_z_f32,
vldrwq_gather_shifted_offset_z_s32, vldrwq_gather_shifted_offset_z_u32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1]  
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-01  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vldrdq_gather_base_s64): Define macro.
(vldrdq_gather_base_u64): Likewise.
(vldrdq_gather_base_z_s64): Likewise.
(vldrdq_gather_base_z_u64): Likewise.
(vldrdq_gather_offset_s64): Likewise.
(vldrdq_gather_offset_u64): Likewise.
(vldrdq_gather_offset_z_s64): Likewise.
(vldrdq_gather_offset_z_u64): Likewise.
(vldrdq_gather_shifted_offset_s64): Likewise.
(vldrdq_gather_shifted_offset_u64): Likewise.
(vldrdq_gather_shifted_offset_z_s64): Likewise.
(vldrdq_gather_shifted_offset_z_u64): Likewise.
(vldrhq_gather_offset_f16): Likewise.
(vldrhq_gather_offset_z_f16): Likewise.
(vldrhq_gather_shifted_offset_f16): Likewise.
(vldrhq_gather_shifted_offset_z_f16): Likewise.
(vldrwq_gather_base_f32): Likewise.
(vldrwq_gather_base_z_f32): Likewise.
(vldrwq_gather_offset_f32): Likewise.
(vldrwq_gather_offset_s32): Likewise.
(vldrwq_gather_offset_u32): Likewise.
(vldrwq_gather_offset_z_f32): Likewise.
(vldrwq_gather_offset_z_s32): Likewise.
(vldrwq_gather_offset_z_u32): Likewise.
(vldrwq_gather_shifted_offset_f32): Likewise.
(vldrwq_gather_shifted_offset_s32): Likewise.
(vldrwq_gather_shifted_offset_u32): Likewise.
(vldrwq_gather_shifted_offset_z_f32): Likewise.
(vldrwq_gather_shifted_offset_z_s32): Likewise.
(vldrwq_gather_shifted_offset_z_u32): Likewise.
(__arm_vldrdq_gather_base_s64): Define intrinsic.
(__arm_vldrdq_gather_base_u64): Likewise.
(__arm_vldrdq_gather_base_z_s64): Likewise.
(__arm_vldrdq_gather_base_z_u64): Likewise.
(__arm_vldrdq_gather_offset_s64): Likewise.
(__arm_vldrdq_gather_offset_u64): Likewise.
(__arm_vldrdq_gather_offset_z_s64): Likewise.
(__arm_vldrdq_gather_offset_z_u64): Likewise.
(__arm_vldrdq_gather_shifted_offset_s64): Likewise.
(__arm_vldrdq_gather_shifted_offset_u64): Likewise.
(__arm_vldrdq_gather_shifted_offset_z_s64): Likewise.
(__arm_vldrdq_gather_shifted_offset_z_u64): Likewise.
(__arm_vldrwq_gather_offset_s32): Likewise.
(__arm_vldrwq_gather_offset_u32): Likewise.
(__arm_vldrwq_gather_offset_z_s32): Likewise.
(__arm_vldrwq_gather_offset_z_u32): Likewise.
(__arm_vldrwq_gather_shifted_offset_s32): Likewise.
(__arm_vldrwq_gather_shifted_offset_u32): Likewise.
(__arm_vldrwq_gather_shifted_offset_z_s32): Likewise.
(__arm_vldrwq_gather_shifted_offset_z_u32): Likewise.
(__arm_vldrhq_gather_offset_f16): Likewise.
(__arm_vldrhq_gather_offset_z_f16): Likewise.
(__arm_vldrhq_gather_shifted_offset_f16): Likewise.
(__arm_vldrhq_gather_shifted_offset_z_f16): Likewise.
(__arm_vldrwq_gather_base_f32): Likewise.
(__arm_vldrwq_gather_base_z_f32): Likewise.
(__arm_vldrwq_gather_offset_f32): Likewise.
(__arm_vldrwq_gather_offset_z_f32): Likewise.
(__arm_vldrwq_gather_shifted_offset_f32): Likewise.
(__arm_vldrwq_gather_shifted_offset_z_f32): Likewise.
(vldrhq_gather_offset): Define polymorphic variant.
(vldrhq_gather_offset_z): Likewise.
(vldrhq_gather_shifted_offset): Likewise.
(vldrhq_gather_shifted_offset_z): Likewise.
(vldrwq_gather_offset): Likewise.
(vldrwq_gather_offset_z): Likewise.
(vldrwq_gather_shifted_offset): Likewise.
(vldrwq_gather_shifted_offs

[PATCH][ARM][GCC][3/5x]: MVE store intrinsics with predicated suffix.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE store intrinsics with a predicated
suffix.

vstrbq_p_s8, vstrbq_p_s32, vstrbq_p_s16, vstrbq_p_u8, vstrbq_p_u32,
vstrbq_p_u16, vstrbq_scatter_offset_p_s8, vstrbq_scatter_offset_p_s32,
vstrbq_scatter_offset_p_s16, vstrbq_scatter_offset_p_u8,
vstrbq_scatter_offset_p_u32, vstrbq_scatter_offset_p_u16,
vstrwq_scatter_base_p_s32, vstrwq_scatter_base_p_u32.
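A predicated (_p) store writes only the active lanes and leaves memory untouched for false-predicated lanes. A scalar sketch of the byte case (simplified: here bit i of `p` guards lane i, whereas the real mve_pred16_t carries one predicate bit per byte of the vector):

```c
#include <stdint.h>

/* Model of vstrbq_p semantics: lanes whose predicate bit is clear
   do not store, so the corresponding memory bytes are preserved.  */
static void vstrb_p_model (uint8_t *mem, const uint8_t v[16], uint16_t p)
{
  for (int i = 0; i < 16; i++)
    if (p & (1u << i))
      mem[i] = v[i];
}
```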

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1]  
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-01  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (STRS_P_QUALIFIERS): Define builtin
qualifier.
(STRU_P_QUALIFIERS): Likewise.
(STRSU_P_QUALIFIERS): Likewise.
(STRSS_P_QUALIFIERS): Likewise.
(STRSBS_P_QUALIFIERS): Likewise.
(STRSBU_P_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vstrbq_p_s8): Define macro.
(vstrbq_p_s32): Likewise.
(vstrbq_p_s16): Likewise.
(vstrbq_p_u8): Likewise.
(vstrbq_p_u32): Likewise.
(vstrbq_p_u16): Likewise.
(vstrbq_scatter_offset_p_s8): Likewise.
(vstrbq_scatter_offset_p_s32): Likewise.
(vstrbq_scatter_offset_p_s16): Likewise.
(vstrbq_scatter_offset_p_u8): Likewise.
(vstrbq_scatter_offset_p_u32): Likewise.
(vstrbq_scatter_offset_p_u16): Likewise.
(vstrwq_scatter_base_p_s32): Likewise.
(vstrwq_scatter_base_p_u32): Likewise.
(__arm_vstrbq_p_s8): Define intrinsic.
(__arm_vstrbq_p_s32): Likewise.
(__arm_vstrbq_p_s16): Likewise.
(__arm_vstrbq_p_u8): Likewise.
(__arm_vstrbq_p_u32): Likewise.
(__arm_vstrbq_p_u16): Likewise.
(__arm_vstrbq_scatter_offset_p_s8): Likewise.
(__arm_vstrbq_scatter_offset_p_s32): Likewise.
(__arm_vstrbq_scatter_offset_p_s16): Likewise.
(__arm_vstrbq_scatter_offset_p_u8): Likewise.
(__arm_vstrbq_scatter_offset_p_u32): Likewise.
(__arm_vstrbq_scatter_offset_p_u16): Likewise.
(__arm_vstrwq_scatter_base_p_s32): Likewise.
(__arm_vstrwq_scatter_base_p_u32): Likewise.
(vstrbq_p): Define polymorphic variant.
(vstrbq_scatter_offset_p): Likewise.
(vstrwq_scatter_base_p): Likewise.
* config/arm/arm_mve_builtins.def (STRS_P_QUALIFIERS): Use builtin
qualifier.
(STRU_P_QUALIFIERS): Likewise.
(STRSU_P_QUALIFIERS): Likewise.
(STRSS_P_QUALIFIERS): Likewise.
(STRSBS_P_QUALIFIERS): Likewise.
(STRSBU_P_QUALIFIERS): Likewise.
* config/arm/mve.md (mve_vstrbq_scatter_offset_p_): Define
RTL pattern.
(mve_vstrwq_scatter_base_p_v4si): Likewise.
(mve_vstrbq_p_): Likewise.

gcc/testsuite/ChangeLog:

2019-11-01  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vstrbq_p_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vstrbq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_u32.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
02ea297937b18099f33a50c808964d1dd7eac1b3..b5639051bf07785d906ed596e08d670f4de1a67e
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -589,6 +589,41 @@ arm_strsbu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define STRSBU_QUALIFIERS (arm_strsbu_qualifiers)
 
 static enum arm_type_qualifiers
+arm_strs_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_void, qualifier_pointer, qualifier_none, qualifier_unsigned};
+#define STRS_P_QUALIFIERS (arm_strs_p_qualifiers)
+
+static enum arm_type_qualifiers
+arm_stru_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_void, qualifier_pointer, qualifier_unsigned,
+  qualifier_unsigned};
+#define STRU_P_QUALIFIERS (arm_stru_

[PATCH][ARM][GCC][8/5x]: Remaining MVE store intrinsics which stores an half word, word and double word to memory.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE store intrinsics, which store a halfword,
word, or double word to memory.

vstrdq_scatter_base_p_s64, vstrdq_scatter_base_p_u64, vstrdq_scatter_base_s64,
vstrdq_scatter_base_u64, vstrdq_scatter_offset_p_s64, 
vstrdq_scatter_offset_p_u64,
vstrdq_scatter_offset_s64, vstrdq_scatter_offset_u64, 
vstrdq_scatter_shifted_offset_p_s64,
vstrdq_scatter_shifted_offset_p_u64, vstrdq_scatter_shifted_offset_s64,
vstrdq_scatter_shifted_offset_u64, vstrhq_scatter_offset_f16, 
vstrhq_scatter_offset_p_f16,
vstrhq_scatter_shifted_offset_f16, vstrhq_scatter_shifted_offset_p_f16,
vstrwq_scatter_base_f32, vstrwq_scatter_base_p_f32, vstrwq_scatter_offset_f32,
vstrwq_scatter_offset_p_f32, vstrwq_scatter_offset_p_s32, 
vstrwq_scatter_offset_p_u32,
vstrwq_scatter_offset_s32, vstrwq_scatter_offset_u32, 
vstrwq_scatter_shifted_offset_f32,
vstrwq_scatter_shifted_offset_p_f32, vstrwq_scatter_shifted_offset_p_s32,
vstrwq_scatter_shifted_offset_p_u32, vstrwq_scatter_shifted_offset_s32,
vstrwq_scatter_shifted_offset_u32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1]  
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

In this patch a new predicate "Ri" is defined to check that the immediate is in
the range +/-1016 and is a multiple of 8.
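The constraint the new "Ri" predicate enforces, restated as a standalone check (1016 = 127 * 8, i.e. a signed 7-bit count of double-words):

```c
#include <stdbool.h>

/* Offsets accepted by the "Ri" predicate described above: within
   [-1016, 1016] and a multiple of 8.  */
static bool ri_ok (int imm)
{
  return imm >= -1016 && imm <= 1016 && imm % 8 == 0;
}
```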

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-05  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vstrdq_scatter_base_p_s64): Define macro.
(vstrdq_scatter_base_p_u64): Likewise.
(vstrdq_scatter_base_s64): Likewise.
(vstrdq_scatter_base_u64): Likewise.
(vstrdq_scatter_offset_p_s64): Likewise.
(vstrdq_scatter_offset_p_u64): Likewise.
(vstrdq_scatter_offset_s64): Likewise.
(vstrdq_scatter_offset_u64): Likewise.
(vstrdq_scatter_shifted_offset_p_s64): Likewise.
(vstrdq_scatter_shifted_offset_p_u64): Likewise.
(vstrdq_scatter_shifted_offset_s64): Likewise.
(vstrdq_scatter_shifted_offset_u64): Likewise.
(vstrhq_scatter_offset_f16): Likewise.
(vstrhq_scatter_offset_p_f16): Likewise.
(vstrhq_scatter_shifted_offset_f16): Likewise.
(vstrhq_scatter_shifted_offset_p_f16): Likewise.
(vstrwq_scatter_base_f32): Likewise.
(vstrwq_scatter_base_p_f32): Likewise.
(vstrwq_scatter_offset_f32): Likewise.
(vstrwq_scatter_offset_p_f32): Likewise.
(vstrwq_scatter_offset_p_s32): Likewise.
(vstrwq_scatter_offset_p_u32): Likewise.
(vstrwq_scatter_offset_s32): Likewise.
(vstrwq_scatter_offset_u32): Likewise.
(vstrwq_scatter_shifted_offset_f32): Likewise.
(vstrwq_scatter_shifted_offset_p_f32): Likewise.
(vstrwq_scatter_shifted_offset_p_s32): Likewise.
(vstrwq_scatter_shifted_offset_p_u32): Likewise.
(vstrwq_scatter_shifted_offset_s32): Likewise.
(vstrwq_scatter_shifted_offset_u32): Likewise.
(__arm_vstrdq_scatter_base_p_s64): Define intrinsic.
(__arm_vstrdq_scatter_base_p_u64): Likewise.
(__arm_vstrdq_scatter_base_s64): Likewise.
(__arm_vstrdq_scatter_base_u64): Likewise.
(__arm_vstrdq_scatter_offset_p_s64): Likewise.
(__arm_vstrdq_scatter_offset_p_u64): Likewise.
(__arm_vstrdq_scatter_offset_s64): Likewise.
(__arm_vstrdq_scatter_offset_u64): Likewise.
(__arm_vstrdq_scatter_shifted_offset_p_s64): Likewise.
(__arm_vstrdq_scatter_shifted_offset_p_u64): Likewise.
(__arm_vstrdq_scatter_shifted_offset_s64): Likewise.
(__arm_vstrdq_scatter_shifted_offset_u64): Likewise.
(__arm_vstrwq_scatter_offset_p_s32): Likewise.
(__arm_vstrwq_scatter_offset_p_u32): Likewise.
(__arm_vstrwq_scatter_offset_s32): Likewise.
(__arm_vstrwq_scatter_offset_u32): Likewise.
(__arm_vstrwq_scatter_shifted_offset_p_s32): Likewise.
(__arm_vstrwq_scatter_shifted_offset_p_u32): Likewise.
(__arm_vstrwq_scatter_shifted_offset_s32): Likewise.
(__arm_vstrwq_scatter_shifted_offset_u32): Likewise.
(__arm_vstrhq_scatter_offset_f16): Likewise.
(__arm_vstrhq_scatter_offset_p_f16): Likewise.
(__arm_vstrhq_scatter_shifted_offset_f16): Likewise.
(__arm_vstrhq_scatter_shifted_offset_p_f16): Likewise.
(__arm_vstrwq_scatter_base_f32): Likewise.
(__arm_vstrwq_scatter_base_p_f32): Likewise.
(__arm_vstrwq_scatter_offset_f32): Likewise.
(__arm_vstrwq_scatter_offset_p_f32): Likewise.
(__arm_vstrwq_scatter_shifted_offset_f32): Likewise.
(__arm_vstrwq_scatter_shifted_offset_p_f32): Likewise.
(vstrhq_scatter_offset): Define polymorphic variant.
(vstrhq_scatter_offset_p): Likewise.
(vstrhq_scatter_shifted_offset): Likewise.
(vstrhq_scatter

[PATCH][ARM][GCC][4/5x]: MVE load intrinsics with zero(_z) suffix.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE load intrinsics with zero(_z)
suffix.
* ``_z`` (zero), which indicates that false-predicated lanes are filled with
zeroes; these are only used for load instructions.

vldrbq_gather_offset_z_s16, vldrbq_gather_offset_z_u8, 
vldrbq_gather_offset_z_s32,
vldrbq_gather_offset_z_u16, vldrbq_gather_offset_z_u32, 
vldrbq_gather_offset_z_s8,
vldrbq_z_s16, vldrbq_z_u8, vldrbq_z_s8, vldrbq_z_s32, vldrbq_z_u16, 
vldrbq_z_u32,
vldrwq_gather_base_z_u32, vldrwq_gather_base_z_s32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1]  
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-01  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (LDRGBS_Z_QUALIFIERS): Define builtin
qualifier.
(LDRGBU_Z_QUALIFIERS): Likewise.
(LDRGS_Z_QUALIFIERS): Likewise.
(LDRGU_Z_QUALIFIERS): Likewise.
(LDRS_Z_QUALIFIERS): Likewise.
(LDRU_Z_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vldrbq_gather_offset_z_s16): Define macro.
(vldrbq_gather_offset_z_u8): Likewise.
(vldrbq_gather_offset_z_s32): Likewise.
(vldrbq_gather_offset_z_u16): Likewise.
(vldrbq_gather_offset_z_u32): Likewise.
(vldrbq_gather_offset_z_s8): Likewise.
(vldrbq_z_s16): Likewise.
(vldrbq_z_u8): Likewise.
(vldrbq_z_s8): Likewise.
(vldrbq_z_s32): Likewise.
(vldrbq_z_u16): Likewise.
(vldrbq_z_u32): Likewise.
(vldrwq_gather_base_z_u32): Likewise.
(vldrwq_gather_base_z_s32): Likewise.
(__arm_vldrbq_gather_offset_z_s8): Define intrinsic.
(__arm_vldrbq_gather_offset_z_s32): Likewise.
(__arm_vldrbq_gather_offset_z_s16): Likewise.
(__arm_vldrbq_gather_offset_z_u8): Likewise.
(__arm_vldrbq_gather_offset_z_u32): Likewise.
(__arm_vldrbq_gather_offset_z_u16): Likewise.
(__arm_vldrbq_z_s8): Likewise.
(__arm_vldrbq_z_s32): Likewise.
(__arm_vldrbq_z_s16): Likewise.
(__arm_vldrbq_z_u8): Likewise.
(__arm_vldrbq_z_u32): Likewise.
(__arm_vldrbq_z_u16): Likewise.
(__arm_vldrwq_gather_base_z_s32): Likewise.
(__arm_vldrwq_gather_base_z_u32): Likewise.
(vldrbq_gather_offset_z): Define polymorphic variant.
* config/arm/arm_mve_builtins.def (LDRGBS_Z_QUALIFIERS): Use builtin
qualifier.
(LDRGBU_Z_QUALIFIERS): Likewise.
(LDRGS_Z_QUALIFIERS): Likewise.
(LDRGU_Z_QUALIFIERS): Likewise.
(LDRS_Z_QUALIFIERS): Likewise.
(LDRU_Z_QUALIFIERS): Likewise.
* config/arm/mve.md (mve_vldrbq_gather_offset_z_): Define
RTL pattern.
(mve_vldrbq_z_): Likewise.
(mve_vldrwq_gather_base_z_v4si): Likewise.

gcc/testsuite/ChangeLog: Likewise.

2019-11-01  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_z_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_z_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_z_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_z_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_z_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_z_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_z_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_z_u32.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index b5639051bf07785d906ed596e08d670f4de1a67e..c3d12375d2fbc933ad33f7a15a3bbf53079d0639 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -653,6 +653,40 @@ arm_ldrgbu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate};
 #define LDRGBU_QUALIFIERS (arm_ldrgbu_qualifiers)
 
+static enum arm_type_qualifiers
+arm_ldrgbs_z_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_unsigned, qualifier_immediate,
+  qualifier_unsigned};
+#define LDRGBS_Z_QUALIFIERS (arm_ldrgbs_z_qualifiers)
+
+static enum arm_type_qualifiers
+arm_ldrgbu_z_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualif

[PATCH][ARM][GCC][2/5x]: MVE load intrinsics.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE load intrinsics.

vldrbq_gather_offset_u8, vldrbq_gather_offset_s8, vldrbq_s8, vldrbq_u8,
vldrbq_gather_offset_u16, vldrbq_gather_offset_s16, vldrbq_s16, vldrbq_u16,
vldrbq_gather_offset_u32, vldrbq_gather_offset_s32, vldrbq_s32, vldrbq_u32,
vldrwq_gather_base_s32, vldrwq_gather_base_u32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1]  
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-01  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (LDRGU_QUALIFIERS): Define builtin
qualifier.
(LDRGS_QUALIFIERS): Likewise.
(LDRS_QUALIFIERS): Likewise.
(LDRU_QUALIFIERS): Likewise.
(LDRGBS_QUALIFIERS): Likewise.
(LDRGBU_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vldrbq_gather_offset_u8): Define macro.
(vldrbq_gather_offset_s8): Likewise.
(vldrbq_s8): Likewise.
(vldrbq_u8): Likewise.
(vldrbq_gather_offset_u16): Likewise.
(vldrbq_gather_offset_s16): Likewise.
(vldrbq_s16): Likewise.
(vldrbq_u16): Likewise.
(vldrbq_gather_offset_u32): Likewise.
(vldrbq_gather_offset_s32): Likewise.
(vldrbq_s32): Likewise.
(vldrbq_u32): Likewise.
(vldrwq_gather_base_s32): Likewise.
(vldrwq_gather_base_u32): Likewise.
(__arm_vldrbq_gather_offset_u8): Define intrinsic.
(__arm_vldrbq_gather_offset_s8): Likewise.
(__arm_vldrbq_s8): Likewise.
(__arm_vldrbq_u8): Likewise.
(__arm_vldrbq_gather_offset_u16): Likewise.
(__arm_vldrbq_gather_offset_s16): Likewise.
(__arm_vldrbq_s16): Likewise.
(__arm_vldrbq_u16): Likewise.
(__arm_vldrbq_gather_offset_u32): Likewise.
(__arm_vldrbq_gather_offset_s32): Likewise.
(__arm_vldrbq_s32): Likewise.
(__arm_vldrbq_u32): Likewise.
(__arm_vldrwq_gather_base_s32): Likewise.
(__arm_vldrwq_gather_base_u32): Likewise.
(vldrbq_gather_offset): Define polymorphic variant.
* config/arm/arm_mve_builtins.def (LDRGU_QUALIFIERS): Use builtin
qualifier.
(LDRGS_QUALIFIERS): Likewise.
(LDRS_QUALIFIERS): Likewise.
(LDRU_QUALIFIERS): Likewise.
(LDRGBS_QUALIFIERS): Likewise.
(LDRGBU_QUALIFIERS): Likewise.
* config/arm/mve.md (VLDRBGOQ): Define iterator.
(VLDRBQ): Likewise. 
(VLDRWGBQ): Likewise.
(mve_vldrbq_gather_offset_): Define RTL pattern.
(mve_vldrbq_): Likewise.
(mve_vldrwq_gather_base_v4si): Likewise.

gcc/testsuite/ChangeLog:

2019-11-01  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_u32.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index ec88199bb5e7e9c15a346061c70841f3086004ef..02ea297937b18099f33a50c808964d1dd7eac1b3 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -588,6 +588,36 @@ arm_strsbu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   qualifier_unsigned};
 #define STRSBU_QUALIFIERS (arm_strsbu_qualifiers)
 
+static enum arm_type_qualifiers
+arm_ldrgu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_pointer, qualifier_unsigned};
+#define LDRGU_QUALIFIERS (arm_ldrgu_qualifiers)
+
+static enum arm_type_qualifiers
+arm_ldrgs_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_pointer, qualifier_unsigned};
+#define LDRGS_QUALIFIERS (arm_ldrgs_qualifiers)
+
+static enum arm_type_qualifiers
+arm_ldrs_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_pointer};
+#define LDRS_QUALIFIERS (arm_ldrs_qualifiers)
+
+static enum arm_typ

[PATCH][ARM][GCC][1/5x]: MVE store intrinsics.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE store intrinsics.

vstrbq_scatter_offset_s8, vstrbq_scatter_offset_s32, vstrbq_scatter_offset_s16,
vstrbq_scatter_offset_u8, vstrbq_scatter_offset_u32, vstrbq_scatter_offset_u16,
vstrbq_s8, vstrbq_s32, vstrbq_s16, vstrbq_u8, vstrbq_u32, vstrbq_u16,
vstrwq_scatter_base_s32, vstrwq_scatter_base_u32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1]  
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-01  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (STRS_QUALIFIERS): Define builtin qualifier.
(STRU_QUALIFIERS): Likewise.
(STRSS_QUALIFIERS): Likewise.
(STRSU_QUALIFIERS): Likewise.
(STRSBS_QUALIFIERS): Likewise.
(STRSBU_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vstrbq_s8): Define macro.
(vstrbq_u8): Likewise.
(vstrbq_u16): Likewise.
(vstrbq_scatter_offset_s8): Likewise.
(vstrbq_scatter_offset_u8): Likewise.
(vstrbq_scatter_offset_u16): Likewise.
(vstrbq_s16): Likewise.
(vstrbq_u32): Likewise.
(vstrbq_scatter_offset_s16): Likewise.
(vstrbq_scatter_offset_u32): Likewise.
(vstrbq_s32): Likewise.
(vstrbq_scatter_offset_s32): Likewise.
(vstrwq_scatter_base_s32): Likewise.
(vstrwq_scatter_base_u32): Likewise.
(__arm_vstrbq_scatter_offset_s8): Define intrinsic.
(__arm_vstrbq_scatter_offset_s32): Likewise.
(__arm_vstrbq_scatter_offset_s16): Likewise.
(__arm_vstrbq_scatter_offset_u8): Likewise.
(__arm_vstrbq_scatter_offset_u32): Likewise.
(__arm_vstrbq_scatter_offset_u16): Likewise.
(__arm_vstrbq_s8): Likewise.
(__arm_vstrbq_s32): Likewise.
(__arm_vstrbq_s16): Likewise.
(__arm_vstrbq_u8): Likewise.
(__arm_vstrbq_u32): Likewise.
(__arm_vstrbq_u16): Likewise.
(__arm_vstrwq_scatter_base_s32): Likewise.
(__arm_vstrwq_scatter_base_u32): Likewise.
(vstrbq): Define polymorphic variant.
(vstrbq_scatter_offset): Likewise.
(vstrwq_scatter_base): Likewise.
* config/arm/arm_mve_builtins.def (STRS_QUALIFIERS): Use builtin
qualifier.
(STRU_QUALIFIERS): Likewise.
(STRSS_QUALIFIERS): Likewise.
(STRSU_QUALIFIERS): Likewise.
(STRSBS_QUALIFIERS): Likewise.
(STRSBU_QUALIFIERS): Likewise.
* config/arm/mve.md (MVE_B_ELEM): Define mode attribute iterator.
(VSTRWSBQ): Define iterators.
(VSTRBSOQ): Likewise. 
(VSTRBQ): Likewise.
(mve_vstrbq_): Define RTL pattern.
(mve_vstrbq_scatter_offset_): Likewise.
(mve_vstrwq_scatter_base_v4si): Likewise.

gcc/testsuite/ChangeLog:

2019-11-01  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vstrbq_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vstrbq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrbq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_u32.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 6dffb36fe179357c62cb4f35d486513971e3487d..ec88199bb5e7e9c15a346061c70841f3086004ef 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -555,6 +555,39 @@ arm_quadop_unone_unone_unone_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define QUADOP_UNONE_UNONE_UNONE_NONE_UNONE_QUALIFIERS \
   (arm_quadop_unone_unone_unone_none_unone_qualifiers)
 
+static enum arm_type_qualifiers
+arm_strs_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_void, qualifier_pointer, qualifier_none };
+#define STRS_QUALIFIERS (arm_strs_qualifiers)
+
+static enum arm_type_qualifiers
+arm_stru_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_void, qualifier_pointer, qualifier_unsigned };
+#define STRU_QUALIFIERS (arm_str

[PATCH][ARM][GCC][1/4x]: MVE intrinsics with quaternary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with quaternary operands.

vsriq_m_n_s8, vsubq_m_s8, vsubq_x_s8, vcvtq_m_n_f16_u16, vcvtq_x_n_f16_u16,
vqshluq_m_n_s8, vabavq_p_s8, vsriq_m_n_u8, vshlq_m_u8, vshlq_x_u8, vsubq_m_u8,
vsubq_x_u8, vabavq_p_u8, vshlq_m_s8, vshlq_x_s8, vcvtq_m_n_f16_s16,
vcvtq_x_n_f16_s16, vsriq_m_n_s16, vsubq_m_s16, vsubq_x_s16, vcvtq_m_n_f32_u32,
vcvtq_x_n_f32_u32, vqshluq_m_n_s16, vabavq_p_s16, vsriq_m_n_u16,
vshlq_m_u16, vshlq_x_u16, vsubq_m_u16, vsubq_x_u16, vabavq_p_u16, vshlq_m_s16,
vshlq_x_s16, vcvtq_m_n_f32_s32, vcvtq_x_n_f32_s32, vsriq_m_n_s32, vsubq_m_s32,
vsubq_x_s32, vqshluq_m_n_s32, vabavq_p_s32, vsriq_m_n_u32, vshlq_m_u32,
vshlq_x_u32, vsubq_m_u32, vsubq_x_u32, vabavq_p_u32, vshlq_m_s32, vshlq_x_s32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-29  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c 
(QUADOP_UNONE_UNONE_NONE_NONE_UNONE_QUALIFIERS):
Define builtin qualifier.
(QUADOP_NONE_NONE_NONE_NONE_UNONE_QUALIFIERS): Likewise.
(QUADOP_NONE_NONE_NONE_IMM_UNONE_QUALIFIERS): Likewise.
(QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Likewise.
(QUADOP_UNONE_UNONE_NONE_IMM_UNONE_QUALIFIERS): Likewise.
(QUADOP_NONE_NONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise.
(QUADOP_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise.
(QUADOP_UNONE_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vsriq_m_n_s8): Define macro.
(vsubq_m_s8): Likewise.
(vcvtq_m_n_f16_u16): Likewise.
(vqshluq_m_n_s8): Likewise.
(vabavq_p_s8): Likewise.
(vsriq_m_n_u8): Likewise.
(vshlq_m_u8): Likewise.
(vsubq_m_u8): Likewise.
(vabavq_p_u8): Likewise.
(vshlq_m_s8): Likewise.
(vcvtq_m_n_f16_s16): Likewise.
(vsriq_m_n_s16): Likewise.
(vsubq_m_s16): Likewise.
(vcvtq_m_n_f32_u32): Likewise.
(vqshluq_m_n_s16): Likewise.
(vabavq_p_s16): Likewise.
(vsriq_m_n_u16): Likewise.
(vshlq_m_u16): Likewise.
(vsubq_m_u16): Likewise.
(vabavq_p_u16): Likewise.
(vshlq_m_s16): Likewise.
(vcvtq_m_n_f32_s32): Likewise.
(vsriq_m_n_s32): Likewise.
(vsubq_m_s32): Likewise.
(vqshluq_m_n_s32): Likewise.
(vabavq_p_s32): Likewise.
(vsriq_m_n_u32): Likewise.
(vshlq_m_u32): Likewise.
(vsubq_m_u32): Likewise.
(vabavq_p_u32): Likewise.
(vshlq_m_s32): Likewise.
(__arm_vsriq_m_n_s8): Define intrinsic.
(__arm_vsubq_m_s8): Likewise.
(__arm_vqshluq_m_n_s8): Likewise.
(__arm_vabavq_p_s8): Likewise.
(__arm_vsriq_m_n_u8): Likewise.
(__arm_vshlq_m_u8): Likewise.
(__arm_vsubq_m_u8): Likewise.
(__arm_vabavq_p_u8): Likewise.
(__arm_vshlq_m_s8): Likewise.
(__arm_vsriq_m_n_s16): Likewise.
(__arm_vsubq_m_s16): Likewise.
(__arm_vqshluq_m_n_s16): Likewise.
(__arm_vabavq_p_s16): Likewise.
(__arm_vsriq_m_n_u16): Likewise.
(__arm_vshlq_m_u16): Likewise.
(__arm_vsubq_m_u16): Likewise.
(__arm_vabavq_p_u16): Likewise.
(__arm_vshlq_m_s16): Likewise.
(__arm_vsriq_m_n_s32): Likewise.
(__arm_vsubq_m_s32): Likewise.
(__arm_vqshluq_m_n_s32): Likewise.
(__arm_vabavq_p_s32): Likewise.
(__arm_vsriq_m_n_u32): Likewise.
(__arm_vshlq_m_u32): Likewise.
(__arm_vsubq_m_u32): Likewise.
(__arm_vabavq_p_u32): Likewise.
(__arm_vshlq_m_s32): Likewise.
(__arm_vcvtq_m_n_f16_u16): Likewise.
(__arm_vcvtq_m_n_f16_s16): Likewise.
(__arm_vcvtq_m_n_f32_u32): Likewise.
(__arm_vcvtq_m_n_f32_s32): Likewise.
(vcvtq_m_n): Define polymorphic variant.
(vqshluq_m_n): Likewise.
(vshlq_m): Likewise.
(vsriq_m_n): Likewise.
(vsubq_m): Likewise.
(vabavq_p): Likewise.
* config/arm/arm_mve_builtins.def
(QUADOP_UNONE_UNONE_NONE_NONE_UNONE_QUALIFIERS): Use builtin qualifier.
(QUADOP_NONE_NONE_NONE_NONE_UNONE_QUALIFIERS): Likewise.
(QUADOP_NONE_NONE_NONE_IMM_UNONE_QUALIFIERS): Likewise.
(QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Likewise.
(QUADOP_UNONE_UNONE_NONE_IMM_UNONE_QUALIFIERS): Likewise.
(QUADOP_NONE_NONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise.
(QUADOP_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise.
(QUADOP_UNONE_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Likewise.
* config/arm/mve.md (VABAVQ_P): Define iterator.
(VSHLQ_M): Likewise.
(VSRIQ_M_N): Likewise.
(V

[PATCH][ARM][GCC][2/4x]: MVE intrinsics with quaternary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with quaternary operands.

vabdq_m_s8, vabdq_m_s32, vabdq_m_s16, vabdq_m_u8, vabdq_m_u32, vabdq_m_u16, 
vaddq_m_n_s8,
vaddq_m_n_s32, vaddq_m_n_s16, vaddq_m_n_u8, vaddq_m_n_u32, vaddq_m_n_u16, 
vaddq_m_s8,
vaddq_m_s32, vaddq_m_s16, vaddq_m_u8, vaddq_m_u32, vaddq_m_u16, vandq_m_s8, 
vandq_m_s32,
vandq_m_s16, vandq_m_u8, vandq_m_u32, vandq_m_u16, vbicq_m_s8, vbicq_m_s32, 
vbicq_m_s16,
vbicq_m_u8, vbicq_m_u32, vbicq_m_u16, vbrsrq_m_n_s8, vbrsrq_m_n_s32, 
vbrsrq_m_n_s16,
vbrsrq_m_n_u8, vbrsrq_m_n_u32, vbrsrq_m_n_u16, vcaddq_rot270_m_s8, 
vcaddq_rot270_m_s32,
vcaddq_rot270_m_s16, vcaddq_rot270_m_u8, vcaddq_rot270_m_u32, 
vcaddq_rot270_m_u16,
vcaddq_rot90_m_s8, vcaddq_rot90_m_s32, vcaddq_rot90_m_s16, vcaddq_rot90_m_u8,
vcaddq_rot90_m_u32, vcaddq_rot90_m_u16, veorq_m_s8, veorq_m_s32, veorq_m_s16, 
veorq_m_u8,
veorq_m_u32, veorq_m_u16, vhaddq_m_n_s8, vhaddq_m_n_s32, vhaddq_m_n_s16, 
vhaddq_m_n_u8,
vhaddq_m_n_u32, vhaddq_m_n_u16, vhaddq_m_s8, vhaddq_m_s32, vhaddq_m_s16, 
vhaddq_m_u8,
vhaddq_m_u32, vhaddq_m_u16, vhcaddq_rot270_m_s8, vhcaddq_rot270_m_s32, 
vhcaddq_rot270_m_s16,
vhcaddq_rot90_m_s8, vhcaddq_rot90_m_s32, vhcaddq_rot90_m_s16, vhsubq_m_n_s8, 
vhsubq_m_n_s32,
vhsubq_m_n_s16, vhsubq_m_n_u8, vhsubq_m_n_u32, vhsubq_m_n_u16, vhsubq_m_s8, 
vhsubq_m_s32,
vhsubq_m_s16, vhsubq_m_u8, vhsubq_m_u32, vhsubq_m_u16, vmaxq_m_s8, vmaxq_m_s32, 
vmaxq_m_s16,
vmaxq_m_u8, vmaxq_m_u32, vmaxq_m_u16, vminq_m_s8, vminq_m_s32, vminq_m_s16, 
vminq_m_u8,
vminq_m_u32, vminq_m_u16, vmladavaq_p_s8, vmladavaq_p_s32, vmladavaq_p_s16, 
vmladavaq_p_u8,
vmladavaq_p_u32, vmladavaq_p_u16, vmladavaxq_p_s8, vmladavaxq_p_s32, 
vmladavaxq_p_s16,
vmlaq_m_n_s8, vmlaq_m_n_s32, vmlaq_m_n_s16, vmlaq_m_n_u8, vmlaq_m_n_u32, 
vmlaq_m_n_u16,
vmlasq_m_n_s8, vmlasq_m_n_s32, vmlasq_m_n_s16, vmlasq_m_n_u8, vmlasq_m_n_u32, 
vmlasq_m_n_u16,
vmlsdavaq_p_s8, vmlsdavaq_p_s32, vmlsdavaq_p_s16, vmlsdavaxq_p_s8, 
vmlsdavaxq_p_s32,
vmlsdavaxq_p_s16, vmulhq_m_s8, vmulhq_m_s32, vmulhq_m_s16, vmulhq_m_u8, 
vmulhq_m_u32,
vmulhq_m_u16, vmullbq_int_m_s8, vmullbq_int_m_s32, vmullbq_int_m_s16, 
vmullbq_int_m_u8,
vmullbq_int_m_u32, vmullbq_int_m_u16, vmulltq_int_m_s8, vmulltq_int_m_s32, 
vmulltq_int_m_s16,
vmulltq_int_m_u8, vmulltq_int_m_u32, vmulltq_int_m_u16, vmulq_m_n_s8, 
vmulq_m_n_s32,
vmulq_m_n_s16, vmulq_m_n_u8, vmulq_m_n_u32, vmulq_m_n_u16, vmulq_m_s8, 
vmulq_m_s32,
vmulq_m_s16, vmulq_m_u8, vmulq_m_u32, vmulq_m_u16, vornq_m_s8, vornq_m_s32, 
vornq_m_s16,
vornq_m_u8, vornq_m_u32, vornq_m_u16, vorrq_m_s8, vorrq_m_s32, vorrq_m_s16, 
vorrq_m_u8,
vorrq_m_u32, vorrq_m_u16, vqaddq_m_n_s8, vqaddq_m_n_s32, vqaddq_m_n_s16, 
vqaddq_m_n_u8,
vqaddq_m_n_u32, vqaddq_m_n_u16, vqaddq_m_s8, vqaddq_m_s32, vqaddq_m_s16, 
vqaddq_m_u8, 
vqaddq_m_u32, vqaddq_m_u16, vqdmladhq_m_s8, vqdmladhq_m_s32, vqdmladhq_m_s16, 
vqdmladhxq_m_s8,
vqdmladhxq_m_s32, vqdmladhxq_m_s16, vqdmlahq_m_n_s8, vqdmlahq_m_n_s32, 
vqdmlahq_m_n_s16,
vqdmlahq_m_n_u8, vqdmlahq_m_n_u32, vqdmlahq_m_n_u16, vqdmlsdhq_m_s8, 
vqdmlsdhq_m_s32,
vqdmlsdhq_m_s16, vqdmlsdhxq_m_s8, vqdmlsdhxq_m_s32, vqdmlsdhxq_m_s16, 
vqdmulhq_m_n_s8,
vqdmulhq_m_n_s32, vqdmulhq_m_n_s16, vqdmulhq_m_s8, vqdmulhq_m_s32, 
vqdmulhq_m_s16,
vqrdmladhq_m_s8, vqrdmladhq_m_s32, vqrdmladhq_m_s16, vqrdmladhxq_m_s8, 
vqrdmladhxq_m_s32,
vqrdmladhxq_m_s16, vqrdmlahq_m_n_s8, vqrdmlahq_m_n_s32, vqrdmlahq_m_n_s16, 
vqrdmlahq_m_n_u8,
vqrdmlahq_m_n_u32, vqrdmlahq_m_n_u16, vqrdmlashq_m_n_s8, vqrdmlashq_m_n_s32, 
vqrdmlashq_m_n_s16,
vqrdmlashq_m_n_u8, vqrdmlashq_m_n_u32, vqrdmlashq_m_n_u16, vqrdmlsdhq_m_s8, 
vqrdmlsdhq_m_s32,
vqrdmlsdhq_m_s16, vqrdmlsdhxq_m_s8, vqrdmlsdhxq_m_s32, vqrdmlsdhxq_m_s16, 
vqrdmulhq_m_n_s8,
vqrdmulhq_m_n_s32, vqrdmulhq_m_n_s16, vqrdmulhq_m_s8, vqrdmulhq_m_s32, 
vqrdmulhq_m_s16,
vqrshlq_m_s8, vqrshlq_m_s32, vqrshlq_m_s16, vqrshlq_m_u8, vqrshlq_m_u32, 
vqrshlq_m_u16,
vqshlq_m_n_s8, vqshlq_m_n_s32, vqshlq_m_n_s16, vqshlq_m_n_u8, vqshlq_m_n_u32, 
vqshlq_m_n_u16,
vqshlq_m_s8, vqshlq_m_s32, vqshlq_m_s16, vqshlq_m_u8, vqshlq_m_u32, 
vqshlq_m_u16, 
vqsubq_m_n_s8, vqsubq_m_n_s32, vqsubq_m_n_s16, vqsubq_m_n_u8, vqsubq_m_n_u32, 
vqsubq_m_n_u16,
vqsubq_m_s8, vqsubq_m_s32, vqsubq_m_s16, vqsubq_m_u8, vqsubq_m_u32, 
vqsubq_m_u16,
vrhaddq_m_s8, vrhaddq_m_s32, vrhaddq_m_s16, vrhaddq_m_u8, vrhaddq_m_u32, 
vrhaddq_m_u16,
vrmulhq_m_s8, vrmulhq_m_s32, vrmulhq_m_s16, vrmulhq_m_u8, vrmulhq_m_u32, 
vrmulhq_m_u16,
vrshlq_m_s8, vrshlq_m_s32, vrshlq_m_s16, vrshlq_m_u8, vrshlq_m_u32, 
vrshlq_m_u16, vrshrq_m_n_s8,
vrshrq_m_n_s32, vrshrq_m_n_s16, vrshrq_m_n_u8, vrshrq_m_n_u32, vrshrq_m_n_u16, 
vshlq_m_n_s8,
vshlq_m_n_s32, vshlq_m_n_s16, vshlq_m_n_u8, vshlq_m_n_u32, vshlq_m_n_u16, 
vshrq_m_n_s8,
vshrq_m_n_s32, vshrq_m_n_s16, vshrq_m_n_u8, vshrq_m_n_u32, vshrq_m_n_u16, 
vsliq_m_n_s8,
vsliq_m_n_s32, vsliq_m_n_s16, vsliq_m_n_u8, vsliq_m_n_u32, vsliq_m_n_u16, 
vsubq_m_n_s8,
vsubq_m_n_s32, vsubq_m_n_s16, vsubq_m_n_u8, vsubq_m_n_u32, vsubq_m_n_u16.


Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
de

[PATCH][ARM][GCC][1/3x]: MVE intrinsics with ternary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with ternary operands.

vabavq_s8, vabavq_s16, vabavq_s32, vbicq_m_n_s16, vbicq_m_n_s32,
vbicq_m_n_u16, vbicq_m_n_u32, vcmpeqq_m_f16, vcmpeqq_m_f32,
vcvtaq_m_s16_f16, vcvtaq_m_u16_f16, vcvtaq_m_s32_f32, vcvtaq_m_u32_f32,
vcvtq_m_f16_s16, vcvtq_m_f16_u16, vcvtq_m_f32_s32, vcvtq_m_f32_u32,
vqrshrnbq_n_s16, vqrshrnbq_n_u16, vqrshrnbq_n_s32, vqrshrnbq_n_u32,
vqrshrunbq_n_s16, vqrshrunbq_n_s32, vrmlaldavhaq_s32, vrmlaldavhaq_u32,
vshlcq_s8, vshlcq_u8, vshlcq_s16, vshlcq_u16, vshlcq_s32, vshlcq_u32,
vabavq_s8, vabavq_s16, vabavq_s32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-23  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (TERNOP_UNONE_UNONE_UNONE_IMM_QUALIFIERS):
Define qualifier for ternary operands.
(TERNOP_UNONE_UNONE_NONE_NONE_QUALIFIERS): Likewise.
(TERNOP_UNONE_NONE_UNONE_IMM_QUALIFIERS): Likewise.
(TERNOP_NONE_NONE_UNONE_IMM_QUALIFIERS): Likewise.
(TERNOP_UNONE_UNONE_NONE_IMM_QUALIFIERS): Likewise.
(TERNOP_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Likewise.
(TERNOP_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise.
(TERNOP_UNONE_NONE_NONE_UNONE_QUALIFIERS): Likewise.
(TERNOP_NONE_NONE_NONE_IMM_QUALIFIERS): Likewise.
(TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS): Likewise.
(TERNOP_NONE_NONE_IMM_UNONE_QUALIFIERS): Likewise.
(TERNOP_NONE_NONE_UNONE_UNONE_QUALIFIERS): Likewise.
(TERNOP_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Likewise.
(TERNOP_NONE_NONE_NONE_NONE_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vabavq_s8): Define macro.
(vabavq_s16): Likewise.
(vabavq_s32): Likewise.
(vbicq_m_n_s16): Likewise.
(vbicq_m_n_s32): Likewise.
(vbicq_m_n_u16): Likewise.
(vbicq_m_n_u32): Likewise.
(vcmpeqq_m_f16): Likewise.
(vcmpeqq_m_f32): Likewise.
(vcvtaq_m_s16_f16): Likewise.
(vcvtaq_m_u16_f16): Likewise.
(vcvtaq_m_s32_f32): Likewise.
(vcvtaq_m_u32_f32): Likewise.
(vcvtq_m_f16_s16): Likewise.
(vcvtq_m_f16_u16): Likewise.
(vcvtq_m_f32_s32): Likewise.
(vcvtq_m_f32_u32): Likewise.
(vqrshrnbq_n_s16): Likewise.
(vqrshrnbq_n_u16): Likewise.
(vqrshrnbq_n_s32): Likewise.
(vqrshrnbq_n_u32): Likewise.
(vqrshrunbq_n_s16): Likewise.
(vqrshrunbq_n_s32): Likewise.
(vrmlaldavhaq_s32): Likewise.
(vrmlaldavhaq_u32): Likewise.
(vshlcq_s8): Likewise.
(vshlcq_u8): Likewise.
(vshlcq_s16): Likewise.
(vshlcq_u16): Likewise.
(vshlcq_s32): Likewise.
(vshlcq_u32): Likewise.
(vabavq_u8): Likewise.
(vabavq_u16): Likewise.
(vabavq_u32): Likewise.
(__arm_vabavq_s8): Define intrinsic.
(__arm_vabavq_s16): Likewise.
(__arm_vabavq_s32): Likewise.
(__arm_vabavq_u8): Likewise.
(__arm_vabavq_u16): Likewise.
(__arm_vabavq_u32): Likewise.
(__arm_vbicq_m_n_s16): Likewise.
(__arm_vbicq_m_n_s32): Likewise.
(__arm_vbicq_m_n_u16): Likewise.
(__arm_vbicq_m_n_u32): Likewise.
(__arm_vqrshrnbq_n_s16): Likewise.
(__arm_vqrshrnbq_n_u16): Likewise.
(__arm_vqrshrnbq_n_s32): Likewise.
(__arm_vqrshrnbq_n_u32): Likewise.
(__arm_vqrshrunbq_n_s16): Likewise.
(__arm_vqrshrunbq_n_s32): Likewise.
(__arm_vrmlaldavhaq_s32): Likewise.
(__arm_vrmlaldavhaq_u32): Likewise.
(__arm_vshlcq_s8): Likewise.
(__arm_vshlcq_u8): Likewise.
(__arm_vshlcq_s16): Likewise.
(__arm_vshlcq_u16): Likewise.
(__arm_vshlcq_s32): Likewise.
(__arm_vshlcq_u32): Likewise.
(__arm_vcmpeqq_m_f16): Likewise.
(__arm_vcmpeqq_m_f32): Likewise.
(__arm_vcvtaq_m_s16_f16): Likewise.
(__arm_vcvtaq_m_u16_f16): Likewise.
(__arm_vcvtaq_m_s32_f32): Likewise.
(__arm_vcvtaq_m_u32_f32): Likewise.
(__arm_vcvtq_m_f16_s16): Likewise.
(__arm_vcvtq_m_f16_u16): Likewise.
(__arm_vcvtq_m_f32_s32): Likewise.
(__arm_vcvtq_m_f32_u32): Likewise.
(vcvtaq_m): Define polymorphic variant.
(vcvtq_m): Likewise.
(vabavq): Likewise.
(vshlcq): Likewise.
(vbicq_m_n): Likewise.
(vqrshrnbq_n): Likewise.
(vqrshrunbq_n): Likewise.
* config/arm/arm_mve_builtins.def
(TERNOP_UNONE_UNONE_UNONE_IMM_QUALIFIERS): Use the builtin qualifer.
(TERNOP_UNONE_UNONE_NONE_NONE_QUALIFIERS): Likewise.
(TERNOP_UNONE_NONE_UNONE_IMM_QUALIFIERS): Likewise.
(TERNOP_NONE_NON

[PATCH][ARM][GCC][4/4x]: MVE intrinsics with quaternary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with quaternary operands.

vabdq_m_f32, vabdq_m_f16, vaddq_m_f32, vaddq_m_f16, vaddq_m_n_f32, 
vaddq_m_n_f16,
vandq_m_f32, vandq_m_f16, vbicq_m_f32, vbicq_m_f16, vbrsrq_m_n_f32, 
vbrsrq_m_n_f16,
vcaddq_rot270_m_f32, vcaddq_rot270_m_f16, vcaddq_rot90_m_f32, 
vcaddq_rot90_m_f16,
vcmlaq_m_f32, vcmlaq_m_f16, vcmlaq_rot180_m_f32, vcmlaq_rot180_m_f16,
vcmlaq_rot270_m_f32, vcmlaq_rot270_m_f16, vcmlaq_rot90_m_f32, 
vcmlaq_rot90_m_f16,
vcmulq_m_f32, vcmulq_m_f16, vcmulq_rot180_m_f32, vcmulq_rot180_m_f16,
vcmulq_rot270_m_f32, vcmulq_rot270_m_f16, vcmulq_rot90_m_f32, 
vcmulq_rot90_m_f16,
vcvtq_m_n_s32_f32, vcvtq_m_n_s16_f16, vcvtq_m_n_u32_f32, vcvtq_m_n_u16_f16,
veorq_m_f32, veorq_m_f16, vfmaq_m_f32, vfmaq_m_f16, vfmaq_m_n_f32, 
vfmaq_m_n_f16,
vfmasq_m_n_f32, vfmasq_m_n_f16, vfmsq_m_f32, vfmsq_m_f16, vmaxnmq_m_f32,
vmaxnmq_m_f16, vminnmq_m_f32, vminnmq_m_f16, vmulq_m_f32, vmulq_m_f16,
vmulq_m_n_f32, vmulq_m_n_f16, vornq_m_f32, vornq_m_f16, vorrq_m_f32, 
vorrq_m_f16,
vsubq_m_f32, vsubq_m_f16, vsubq_m_n_f32, vsubq_m_n_f16.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1]  
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-31  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vabdq_m_f32): Define macro.
(vabdq_m_f16): Likewise.
(vaddq_m_f32): Likewise.
(vaddq_m_f16): Likewise.
(vaddq_m_n_f32): Likewise.
(vaddq_m_n_f16): Likewise.
(vandq_m_f32): Likewise.
(vandq_m_f16): Likewise.
(vbicq_m_f32): Likewise.
(vbicq_m_f16): Likewise.
(vbrsrq_m_n_f32): Likewise.
(vbrsrq_m_n_f16): Likewise.
(vcaddq_rot270_m_f32): Likewise.
(vcaddq_rot270_m_f16): Likewise.
(vcaddq_rot90_m_f32): Likewise.
(vcaddq_rot90_m_f16): Likewise.
(vcmlaq_m_f32): Likewise.
(vcmlaq_m_f16): Likewise.
(vcmlaq_rot180_m_f32): Likewise.
(vcmlaq_rot180_m_f16): Likewise.
(vcmlaq_rot270_m_f32): Likewise.
(vcmlaq_rot270_m_f16): Likewise.
(vcmlaq_rot90_m_f32): Likewise.
(vcmlaq_rot90_m_f16): Likewise.
(vcmulq_m_f32): Likewise.
(vcmulq_m_f16): Likewise.
(vcmulq_rot180_m_f32): Likewise.
(vcmulq_rot180_m_f16): Likewise.
(vcmulq_rot270_m_f32): Likewise.
(vcmulq_rot270_m_f16): Likewise.
(vcmulq_rot90_m_f32): Likewise.
(vcmulq_rot90_m_f16): Likewise.
(vcvtq_m_n_s32_f32): Likewise.
(vcvtq_m_n_s16_f16): Likewise.
(vcvtq_m_n_u32_f32): Likewise.
(vcvtq_m_n_u16_f16): Likewise.
(veorq_m_f32): Likewise.
(veorq_m_f16): Likewise.
(vfmaq_m_f32): Likewise.
(vfmaq_m_f16): Likewise.
(vfmaq_m_n_f32): Likewise.
(vfmaq_m_n_f16): Likewise.
(vfmasq_m_n_f32): Likewise.
(vfmasq_m_n_f16): Likewise.
(vfmsq_m_f32): Likewise.
(vfmsq_m_f16): Likewise.
(vmaxnmq_m_f32): Likewise.
(vmaxnmq_m_f16): Likewise.
(vminnmq_m_f32): Likewise.
(vminnmq_m_f16): Likewise.
(vmulq_m_f32): Likewise.
(vmulq_m_f16): Likewise.
(vmulq_m_n_f32): Likewise.
(vmulq_m_n_f16): Likewise.
(vornq_m_f32): Likewise.
(vornq_m_f16): Likewise.
(vorrq_m_f32): Likewise.
(vorrq_m_f16): Likewise.
(vsubq_m_f32): Likewise.
(vsubq_m_f16): Likewise.
(vsubq_m_n_f32): Likewise.
(vsubq_m_n_f16): Likewise.
(__attribute__): Likewise.
(__arm_vabdq_m_f32): Likewise.
(__arm_vabdq_m_f16): Likewise.
(__arm_vaddq_m_f32): Likewise.
(__arm_vaddq_m_f16): Likewise.
(__arm_vaddq_m_n_f32): Likewise.
(__arm_vaddq_m_n_f16): Likewise.
(__arm_vandq_m_f32): Likewise.
(__arm_vandq_m_f16): Likewise.
(__arm_vbicq_m_f32): Likewise.
(__arm_vbicq_m_f16): Likewise.
(__arm_vbrsrq_m_n_f32): Likewise.
(__arm_vbrsrq_m_n_f16): Likewise.
(__arm_vcaddq_rot270_m_f32): Likewise.
(__arm_vcaddq_rot270_m_f16): Likewise.
(__arm_vcaddq_rot90_m_f32): Likewise.
(__arm_vcaddq_rot90_m_f16): Likewise.
(__arm_vcmlaq_m_f32): Likewise.
(__arm_vcmlaq_m_f16): Likewise.
(__arm_vcmlaq_rot180_m_f32): Likewise.
(__arm_vcmlaq_rot180_m_f16): Likewise.
(__arm_vcmlaq_rot270_m_f32): Likewise.
(__arm_vcmlaq_rot270_m_f16): Likewise.
(__arm_vcmlaq_rot90_m_f32): Likewise.
(__arm_vcmlaq_rot90_m_f16): Likewise.
(__arm_vcmulq_m_f32): Likewise.
(__arm_vcmulq_m_f16): Likewise.
(__arm_vcmulq_rot180_m_f32): Define intrinsic.
(__arm_vcmulq_rot180_m_f16): Likewise.
(__arm_vcmulq_rot270_m_f32): Likewise.

[PATCH][ARM][GCC][3/4x]: MVE intrinsics with quaternary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with quaternary operands.

vmlaldavaq_p_s16, vmlaldavaq_p_s32, vmlaldavaq_p_u16, vmlaldavaq_p_u32, 
vmlaldavaxq_p_s16,
vmlaldavaxq_p_s32, vmlaldavaxq_p_u16, vmlaldavaxq_p_u32, vmlsldavaq_p_s16, 
vmlsldavaq_p_s32,
vmlsldavaxq_p_s16, vmlsldavaxq_p_s32, vmullbq_poly_m_p16, vmullbq_poly_m_p8,
vmulltq_poly_m_p16, vmulltq_poly_m_p8, vqdmullbq_m_n_s16, vqdmullbq_m_n_s32, 
vqdmullbq_m_s16,
vqdmullbq_m_s32, vqdmulltq_m_n_s16, vqdmulltq_m_n_s32, vqdmulltq_m_s16, 
vqdmulltq_m_s32,
vqrshrnbq_m_n_s16, vqrshrnbq_m_n_s32, vqrshrnbq_m_n_u16, vqrshrnbq_m_n_u32, 
vqrshrntq_m_n_s16,
vqrshrntq_m_n_s32, vqrshrntq_m_n_u16, vqrshrntq_m_n_u32, vqrshrunbq_m_n_s16, 
vqrshrunbq_m_n_s32,
vqrshruntq_m_n_s16, vqrshruntq_m_n_s32, vqshrnbq_m_n_s16, vqshrnbq_m_n_s32, 
vqshrnbq_m_n_u16,
vqshrnbq_m_n_u32, vqshrntq_m_n_s16, vqshrntq_m_n_s32, vqshrntq_m_n_u16, 
vqshrntq_m_n_u32,
vqshrunbq_m_n_s16, vqshrunbq_m_n_s32, vqshruntq_m_n_s16, vqshruntq_m_n_s32, 
vrmlaldavhaq_p_s32,
vrmlaldavhaq_p_u32, vrmlaldavhaxq_p_s32, vrmlsldavhaq_p_s32, 
vrmlsldavhaxq_p_s32,
vrshrnbq_m_n_s16, vrshrnbq_m_n_s32, vrshrnbq_m_n_u16, vrshrnbq_m_n_u32, 
vrshrntq_m_n_s16,
vrshrntq_m_n_s32, vrshrntq_m_n_u16, vrshrntq_m_n_u32, vshllbq_m_n_s16, 
vshllbq_m_n_s8,
vshllbq_m_n_u16, vshllbq_m_n_u8, vshlltq_m_n_s16, vshlltq_m_n_s8, 
vshlltq_m_n_u16, vshlltq_m_n_u8,
vshrnbq_m_n_s16, vshrnbq_m_n_s32, vshrnbq_m_n_u16, vshrnbq_m_n_u32, 
vshrntq_m_n_s16,
vshrntq_m_n_s32, vshrntq_m_n_u16, vshrntq_m_n_u32.

Please refer to the M-profile Vector Extension (MVE) intrinsics [1] for more
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
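The predicated narrowing shifts in the list above (for example vqrshrnbq_m_n_s16) combine three behaviours: a rounding arithmetic right shift, saturation to the narrower element type, and merging predication into the bottom (even) bytes of the destination. A rough scalar model follows; this is plain Python, not GCC code, and predication is simplified to one flag per 16-bit source lane:

```python
def vqrshrnb_lane(x16: int, n: int) -> int:
    """Rounded, saturating arithmetic shift right of one signed 16-bit
    lane, narrowed to signed 8 bits (one lane of vqrshrnbq_n_s16)."""
    assert 1 <= n <= 8                 # shift range for 16->8 narrowing
    r = (x16 + (1 << (n - 1))) >> n    # add rounding constant, then shift
    return max(-128, min(127, r))      # saturate to int8

def vqrshrnbq_m_n_s16(a, b, n, pred):
    """Predicated '_m' variant: a is the 16-byte destination, b the eight
    16-bit sources.  Lanes whose predicate flag is clear keep the old
    bottom byte; odd bytes of a are never written by the '_b' form."""
    out = list(a)
    for i, x in enumerate(b):
        if pred[i]:
            out[2 * i] = vqrshrnb_lane(x, n)
    return out
```

As a usage sketch, narrowing eight lanes of 100 with a shift of 1 under an alternating predicate writes 50 into bytes 0, 4, 8 and 12 and leaves the rest of the destination untouched.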

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-31  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-protos.h (arm_mve_immediate_check): Declare.
* config/arm/arm.c (arm_mve_immediate_check): Define function to check
mode and integer value.
* config/arm/arm_mve.h (vmlaldavaq_p_s32): Define macro.
(vmlaldavaq_p_s16): Likewise.
(vmlaldavaq_p_u32): Likewise.
(vmlaldavaq_p_u16): Likewise.
(vmlaldavaxq_p_s32): Likewise.
(vmlaldavaxq_p_s16): Likewise.
(vmlaldavaxq_p_u32): Likewise.
(vmlaldavaxq_p_u16): Likewise.
(vmlsldavaq_p_s32): Likewise.
(vmlsldavaq_p_s16): Likewise.
(vmlsldavaxq_p_s32): Likewise.
(vmlsldavaxq_p_s16): Likewise.
(vmullbq_poly_m_p8): Likewise.
(vmullbq_poly_m_p16): Likewise.
(vmulltq_poly_m_p8): Likewise.
(vmulltq_poly_m_p16): Likewise.
(vqdmullbq_m_n_s32): Likewise.
(vqdmullbq_m_n_s16): Likewise.
(vqdmullbq_m_s32): Likewise.
(vqdmullbq_m_s16): Likewise.
(vqdmulltq_m_n_s32): Likewise.
(vqdmulltq_m_n_s16): Likewise.
(vqdmulltq_m_s32): Likewise.
(vqdmulltq_m_s16): Likewise.
(vqrshrnbq_m_n_s32): Likewise.
(vqrshrnbq_m_n_s16): Likewise.
(vqrshrnbq_m_n_u32): Likewise.
(vqrshrnbq_m_n_u16): Likewise.
(vqrshrntq_m_n_s32): Likewise.
(vqrshrntq_m_n_s16): Likewise.
(vqrshrntq_m_n_u32): Likewise.
(vqrshrntq_m_n_u16): Likewise.
(vqrshrunbq_m_n_s32): Likewise.
(vqrshrunbq_m_n_s16): Likewise.
(vqrshruntq_m_n_s32): Likewise.
(vqrshruntq_m_n_s16): Likewise.
(vqshrnbq_m_n_s32): Likewise.
(vqshrnbq_m_n_s16): Likewise.
(vqshrnbq_m_n_u32): Likewise.
(vqshrnbq_m_n_u16): Likewise.
(vqshrntq_m_n_s32): Likewise.
(vqshrntq_m_n_s16): Likewise.
(vqshrntq_m_n_u32): Likewise.
(vqshrntq_m_n_u16): Likewise.
(vqshrunbq_m_n_s32): Likewise.
(vqshrunbq_m_n_s16): Likewise.
(vqshruntq_m_n_s32): Likewise.
(vqshruntq_m_n_s16): Likewise.
(vrmlaldavhaq_p_s32): Likewise.
(vrmlaldavhaq_p_u32): Likewise.
(vrmlaldavhaxq_p_s32): Likewise.
(vrmlsldavhaq_p_s32): Likewise.
(vrmlsldavhaxq_p_s32): Likewise.
(vrshrnbq_m_n_s32): Likewise.
(vrshrnbq_m_n_s16): Likewise.
(vrshrnbq_m_n_u32): Likewise.
(vrshrnbq_m_n_u16): Likewise.
(vrshrntq_m_n_s32): Likewise.
(vrshrntq_m_n_s16): Likewise.
(vrshrntq_m_n_u32): Likewise.
(vrshrntq_m_n_u16): Likewise.
(vshllbq_m_n_s8): Likewise.
(vshllbq_m_n_s16): Likewise.
(vshllbq_m_n_u8): Likewise.
(vshllbq_m_n_u16): Likewise.
(vshlltq_m_n_s8): Likewise.
(vshlltq_m_n_s16): Likewise.
(vshlltq_m_n_u8): Likewise.
(vshlltq_m_n_u16): Likewise.
(vshrnbq_m_n_s32): Likewise.
(vshrnbq_m_n_s16): Likewise.
(vshrnbq_m_n_u32): Likewise.
(vshrnbq_m_n_u16): Likewise.
(vshrntq_m_n_s32): Likewise.
(vshrntq_m_n_s16): Likewise.
(vshrntq_m_n_u32): Likewise.

[PATCH][ARM][GCC][2/3x]: MVE intrinsics with ternary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with ternary operands.

vpselq_u8, vpselq_s8, vrev64q_m_u8, vqrdmlashq_n_u8, vqrdmlahq_n_u8,
vqdmlahq_n_u8, vmvnq_m_u8, vmlasq_n_u8, vmlaq_n_u8, vmladavq_p_u8,
vmladavaq_u8, vminvq_p_u8, vmaxvq_p_u8, vdupq_m_n_u8, vcmpneq_m_u8,
vcmpneq_m_n_u8, vcmphiq_m_u8, vcmphiq_m_n_u8, vcmpeqq_m_u8,
vcmpeqq_m_n_u8, vcmpcsq_m_u8, vcmpcsq_m_n_u8, vclzq_m_u8, vaddvaq_p_u8,
vsriq_n_u8, vsliq_n_u8, vshlq_m_r_u8, vrshlq_m_n_u8, vqshlq_m_r_u8,
vqrshlq_m_n_u8, vminavq_p_s8, vminaq_m_s8, vmaxavq_p_s8, vmaxaq_m_s8,
vcmpneq_m_s8, vcmpneq_m_n_s8, vcmpltq_m_s8, vcmpltq_m_n_s8, vcmpleq_m_s8,
vcmpleq_m_n_s8, vcmpgtq_m_s8, vcmpgtq_m_n_s8, vcmpgeq_m_s8, vcmpgeq_m_n_s8,
vcmpeqq_m_s8, vcmpeqq_m_n_s8, vshlq_m_r_s8, vrshlq_m_n_s8, vrev64q_m_s8,
vqshlq_m_r_s8, vqrshlq_m_n_s8, vqnegq_m_s8, vqabsq_m_s8, vnegq_m_s8,
vmvnq_m_s8, vmlsdavxq_p_s8, vmlsdavq_p_s8, vmladavxq_p_s8, vmladavq_p_s8,
vminvq_p_s8, vmaxvq_p_s8, vdupq_m_n_s8, vclzq_m_s8, vclsq_m_s8, vaddvaq_p_s8,
vabsq_m_s8, vqrdmlsdhxq_s8, vqrdmlsdhq_s8, vqrdmlashq_n_s8, vqrdmlahq_n_s8,
vqrdmladhxq_s8, vqrdmladhq_s8, vqdmlsdhxq_s8, vqdmlsdhq_s8, vqdmlahq_n_s8,
vqdmladhxq_s8, vqdmladhq_s8, vmlsdavaxq_s8, vmlsdavaq_s8, vmlasq_n_s8,
vmlaq_n_s8, vmladavaxq_s8, vmladavaq_s8, vsriq_n_s8, vsliq_n_s8, vpselq_u16,
vpselq_s16, vrev64q_m_u16, vqrdmlashq_n_u16, vqrdmlahq_n_u16, vqdmlahq_n_u16,
vmvnq_m_u16, vmlasq_n_u16, vmlaq_n_u16, vmladavq_p_u16, vmladavaq_u16,
vminvq_p_u16, vmaxvq_p_u16, vdupq_m_n_u16, vcmpneq_m_u16, vcmpneq_m_n_u16,
vcmphiq_m_u16, vcmphiq_m_n_u16, vcmpeqq_m_u16, vcmpeqq_m_n_u16, vcmpcsq_m_u16,
vcmpcsq_m_n_u16, vclzq_m_u16, vaddvaq_p_u16, vsriq_n_u16, vsliq_n_u16,
vshlq_m_r_u16, vrshlq_m_n_u16, vqshlq_m_r_u16, vqrshlq_m_n_u16, vminavq_p_s16,
vminaq_m_s16, vmaxavq_p_s16, vmaxaq_m_s16, vcmpneq_m_s16, vcmpneq_m_n_s16,
vcmpltq_m_s16, vcmpltq_m_n_s16, vcmpleq_m_s16, vcmpleq_m_n_s16, vcmpgtq_m_s16,
vcmpgtq_m_n_s16, vcmpgeq_m_s16, vcmpgeq_m_n_s16, vcmpeqq_m_s16, vcmpeqq_m_n_s16,
vshlq_m_r_s16, vrshlq_m_n_s16, vrev64q_m_s16, vqshlq_m_r_s16, vqrshlq_m_n_s16,
vqnegq_m_s16, vqabsq_m_s16, vnegq_m_s16, vmvnq_m_s16, vmlsdavxq_p_s16,
vmlsdavq_p_s16, vmladavxq_p_s16, vmladavq_p_s16, vminvq_p_s16, vmaxvq_p_s16,
vdupq_m_n_s16, vclzq_m_s16, vclsq_m_s16, vaddvaq_p_s16, vabsq_m_s16,
vqrdmlsdhxq_s16, vqrdmlsdhq_s16, vqrdmlashq_n_s16, vqrdmlahq_n_s16,
vqrdmladhxq_s16, vqrdmladhq_s16, vqdmlsdhxq_s16, vqdmlsdhq_s16, vqdmlahq_n_s16,
vqdmladhxq_s16, vqdmladhq_s16, vmlsdavaxq_s16, vmlsdavaq_s16, vmlasq_n_s16,
vmlaq_n_s16, vmladavaxq_s16, vmladavaq_s16, vsriq_n_s16, vsliq_n_s16, 
vpselq_u32,
vpselq_s32, vrev64q_m_u32, vqrdmlashq_n_u32, vqrdmlahq_n_u32, vqdmlahq_n_u32,
vmvnq_m_u32, vmlasq_n_u32, vmlaq_n_u32, vmladavq_p_u32, vmladavaq_u32,
vminvq_p_u32, vmaxvq_p_u32, vdupq_m_n_u32, vcmpneq_m_u32, vcmpneq_m_n_u32,
vcmphiq_m_u32, vcmphiq_m_n_u32, vcmpeqq_m_u32, vcmpeqq_m_n_u32, vcmpcsq_m_u32,
vcmpcsq_m_n_u32, vclzq_m_u32, vaddvaq_p_u32, vsriq_n_u32, vsliq_n_u32,
vshlq_m_r_u32, vrshlq_m_n_u32, vqshlq_m_r_u32, vqrshlq_m_n_u32, vminavq_p_s32,
vminaq_m_s32, vmaxavq_p_s32, vmaxaq_m_s32, vcmpneq_m_s32, vcmpneq_m_n_s32,
vcmpltq_m_s32, vcmpltq_m_n_s32, vcmpleq_m_s32, vcmpleq_m_n_s32, vcmpgtq_m_s32,
vcmpgtq_m_n_s32, vcmpgeq_m_s32, vcmpgeq_m_n_s32, vcmpeqq_m_s32, vcmpeqq_m_n_s32,
vshlq_m_r_s32, vrshlq_m_n_s32, vrev64q_m_s32, vqshlq_m_r_s32, vqrshlq_m_n_s32,
vqnegq_m_s32, vqabsq_m_s32, vnegq_m_s32, vmvnq_m_s32, vmlsdavxq_p_s32,
vmlsdavq_p_s32, vmladavxq_p_s32, vmladavq_p_s32, vminvq_p_s32, vmaxvq_p_s32,
vdupq_m_n_s32, vclzq_m_s32, vclsq_m_s32, vaddvaq_p_s32, vabsq_m_s32,
vqrdmlsdhxq_s32, vqrdmlsdhq_s32, vqrdmlashq_n_s32, vqrdmlahq_n_s32,
vqrdmladhxq_s32, vqrdmladhq_s32, vqdmlsdhxq_s32, vqdmlsdhq_s32, vqdmlahq_n_s32,
vqdmladhxq_s32, vqdmladhq_s32, vmlsdavaxq_s32, vmlsdavaq_s32, vmlasq_n_s32,
vmlaq_n_s32, vmladavaxq_s32, vmladavaq_s32, vsriq_n_s32, vsliq_n_s32,
vpselq_u64, vpselq_s64.

Please refer to the M-profile Vector Extension (MVE) intrinsics [1] for more
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

In this patch, new constraints "Rc" and "Re" are added, which check that the
constant is within the range 0 to 15 and 0 to 31 respectively.

Also, new predicates "mve_imm_15" and "mve_imm_31" are added, to check the
matching constraints Rc and Re respectively.
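As a small sketch of what the new predicates accept, the following plain Python stands in for the match conditions; the real implementation lives in config/arm/predicates.md and constraints.md as RTL predicates, so the function names below only mirror the patch:

```python
def mve_imm_15(value: int) -> bool:
    # An operand only matches when the constant is in 0..15 (constraint "Rc").
    return 0 <= value <= 15

def mve_imm_31(value: int) -> bool:
    # An operand only matches when the constant is in 0..31 (constraint "Re").
    return 0 <= value <= 31
```

Out-of-range immediates therefore fail to match the insn pattern rather than silently producing an unencodable instruction.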

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-25  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vpselq_u8): Define macro.
(vpselq_s8): Likewise.
(vrev64q_m_u8): Likewise.
(vqrdmlashq_n_u8): Likewise.
(vqrdmlahq_n_u8): Likewise.
(vqdmlahq_n_u8): Likewise.
(vmvnq_m_u8): Likewise.
(vmlasq_n_u8): Likewise.
(vmlaq_n_u8): Likewise.
(vmladavq_p_u8): Likewise.
(vmladavaq_u8): Likewise.
(vminvq_p_u8): Likewise.
(vmaxvq_p_u8): Likewise.

[PATCH][ARM][GCC][3/3x]: MVE intrinsics with ternary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with ternary operands.

vrmlaldavhaxq_s32, vrmlsldavhaq_s32, vrmlsldavhaxq_s32, vaddlvaq_p_s32, 
vcvtbq_m_f16_f32,
vcvtbq_m_f32_f16, vcvttq_m_f16_f32, vcvttq_m_f32_f16, vrev16q_m_s8, 
vrev32q_m_f16,
vrmlaldavhq_p_s32, vrmlaldavhxq_p_s32, vrmlsldavhq_p_s32, vrmlsldavhxq_p_s32, 
vaddlvaq_p_u32,
vrev16q_m_u8, vrmlaldavhq_p_u32, vmvnq_m_n_s16, vorrq_m_n_s16, vqrshrntq_n_s16, 
vqshrnbq_n_s16,
vqshrntq_n_s16, vrshrnbq_n_s16, vrshrntq_n_s16, vshrnbq_n_s16, vshrntq_n_s16, 
vcmlaq_f16,
vcmlaq_rot180_f16, vcmlaq_rot270_f16, vcmlaq_rot90_f16, vfmaq_f16, vfmaq_n_f16, 
vfmasq_n_f16,
vfmsq_f16, vmlaldavaq_s16, vmlaldavaxq_s16, vmlsldavaq_s16, vmlsldavaxq_s16, 
vabsq_m_f16,
vcvtmq_m_s16_f16, vcvtnq_m_s16_f16, vcvtpq_m_s16_f16, vcvtq_m_s16_f16, 
vdupq_m_n_f16,
vmaxnmaq_m_f16, vmaxnmavq_p_f16, vmaxnmvq_p_f16, vminnmaq_m_f16, 
vminnmavq_p_f16, 
vminnmvq_p_f16, vmlaldavq_p_s16, vmlaldavxq_p_s16, vmlsldavq_p_s16, 
vmlsldavxq_p_s16, 
vmovlbq_m_s8, vmovltq_m_s8, vmovnbq_m_s16, vmovntq_m_s16, vnegq_m_f16, 
vpselq_f16,
vqmovnbq_m_s16, vqmovntq_m_s16, vrev32q_m_s8, vrev64q_m_f16, vrndaq_m_f16, 
vrndmq_m_f16,
vrndnq_m_f16, vrndpq_m_f16, vrndq_m_f16, vrndxq_m_f16, vcmpeqq_m_n_f16, 
vcmpgeq_m_f16,
vcmpgeq_m_n_f16, vcmpgtq_m_f16, vcmpgtq_m_n_f16, vcmpleq_m_f16, vcmpleq_m_n_f16,
vcmpltq_m_f16, vcmpltq_m_n_f16, vcmpneq_m_f16, vcmpneq_m_n_f16, vmvnq_m_n_u16,
vorrq_m_n_u16, vqrshruntq_n_s16, vqshrunbq_n_s16, vqshruntq_n_s16, 
vcvtmq_m_u16_f16,
vcvtnq_m_u16_f16, vcvtpq_m_u16_f16, vcvtq_m_u16_f16, vqmovunbq_m_s16, 
vqmovuntq_m_s16,
vqrshrntq_n_u16, vqshrnbq_n_u16, vqshrntq_n_u16, vrshrnbq_n_u16, 
vrshrntq_n_u16, 
vshrnbq_n_u16, vshrntq_n_u16, vmlaldavaq_u16, vmlaldavaxq_u16, vmlaldavq_p_u16,
vmlaldavxq_p_u16, vmovlbq_m_u8, vmovltq_m_u8, vmovnbq_m_u16, vmovntq_m_u16, 
vqmovnbq_m_u16,
vqmovntq_m_u16, vrev32q_m_u8, vmvnq_m_n_s32, vorrq_m_n_s32, vqrshrntq_n_s32, 
vqshrnbq_n_s32,
vqshrntq_n_s32, vrshrnbq_n_s32, vrshrntq_n_s32, vshrnbq_n_s32, vshrntq_n_s32, 
vcmlaq_f32,
vcmlaq_rot180_f32, vcmlaq_rot270_f32, vcmlaq_rot90_f32, vfmaq_f32, vfmaq_n_f32, 
vfmasq_n_f32,
vfmsq_f32, vmlaldavaq_s32, vmlaldavaxq_s32, vmlsldavaq_s32, vmlsldavaxq_s32, 
vabsq_m_f32,
vcvtmq_m_s32_f32, vcvtnq_m_s32_f32, vcvtpq_m_s32_f32, vcvtq_m_s32_f32, 
vdupq_m_n_f32,
vmaxnmaq_m_f32, vmaxnmavq_p_f32, vmaxnmvq_p_f32, vminnmaq_m_f32, 
vminnmavq_p_f32,
vminnmvq_p_f32, vmlaldavq_p_s32, vmlaldavxq_p_s32, vmlsldavq_p_s32, 
vmlsldavxq_p_s32,
vmovlbq_m_s16, vmovltq_m_s16, vmovnbq_m_s32, vmovntq_m_s32, vnegq_m_f32, 
vpselq_f32,
vqmovnbq_m_s32, vqmovntq_m_s32, vrev32q_m_s16, vrev64q_m_f32, vrndaq_m_f32, 
vrndmq_m_f32,
vrndnq_m_f32, vrndpq_m_f32, vrndq_m_f32, vrndxq_m_f32, vcmpeqq_m_n_f32, 
vcmpgeq_m_f32,
vcmpgeq_m_n_f32, vcmpgtq_m_f32, vcmpgtq_m_n_f32, vcmpleq_m_f32, vcmpleq_m_n_f32,
vcmpltq_m_f32, vcmpltq_m_n_f32, vcmpneq_m_f32, vcmpneq_m_n_f32, vmvnq_m_n_u32, 
vorrq_m_n_u32,
vqrshruntq_n_s32, vqshrunbq_n_s32, vqshruntq_n_s32, vcvtmq_m_u32_f32, 
vcvtnq_m_u32_f32,
vcvtpq_m_u32_f32, vcvtq_m_u32_f32, vqmovunbq_m_s32, vqmovuntq_m_s32, 
vqrshrntq_n_u32,
vqshrnbq_n_u32, vqshrntq_n_u32, vrshrnbq_n_u32, vrshrntq_n_u32, vshrnbq_n_u32, 
vshrntq_n_u32,
vmlaldavaq_u32, vmlaldavaxq_u32, vmlaldavq_p_u32, vmlaldavxq_p_u32, 
vmovlbq_m_u16,
vmovltq_m_u16, vmovnbq_m_u32, vmovntq_m_u32, vqmovnbq_m_u32, vqmovntq_m_u32, 
vrev32q_m_u16.

Please refer to the M-profile Vector Extension (MVE) intrinsics [1] for more
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
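The "_m" suffix used throughout this series denotes merging predication: lanes whose predicate flag is clear keep the value of the "inactive" vector that the C intrinsics take as their first argument, instead of the operation's result. A minimal sketch in plain Python (one flag per lane; the lane-wise add is only a stand-in operation to show the merge, not a specific intrinsic from this patch):

```python
def predicated_merge(op, inactive, a, b, pred):
    """Scalar model of an MVE '_m' (merging) intrinsic: active lanes take
    op(a, b); inactive lanes keep the corresponding 'inactive' value."""
    return [op(x, y) if p else keep
            for keep, x, y, p in zip(inactive, a, b, pred)]

# Example: a lane-wise add under the predicate [1, 0, 1, 0].
res = predicated_merge(lambda x, y: x + y,
                       [9, 9, 9, 9],          # inactive values
                       [1, 2, 3, 4],          # first operand
                       [10, 20, 30, 40],      # second operand
                       [1, 0, 1, 0])          # predicate flags
```

Here res is [11, 9, 33, 9]: lanes 1 and 3 keep the inactive value 9.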

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-29  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vrmlaldavhaxq_s32): Define macro.
(vrmlsldavhaq_s32): Likewise.
(vrmlsldavhaxq_s32): Likewise.
(vaddlvaq_p_s32): Likewise.
(vcvtbq_m_f16_f32): Likewise.
(vcvtbq_m_f32_f16): Likewise.
(vcvttq_m_f16_f32): Likewise.
(vcvttq_m_f32_f16): Likewise.
(vrev16q_m_s8): Likewise.
(vrev32q_m_f16): Likewise.
(vrmlaldavhq_p_s32): Likewise.
(vrmlaldavhxq_p_s32): Likewise.
(vrmlsldavhq_p_s32): Likewise.
(vrmlsldavhxq_p_s32): Likewise.
(vaddlvaq_p_u32): Likewise.
(vrev16q_m_u8): Likewise.
(vrmlaldavhq_p_u32): Likewise.
(vmvnq_m_n_s16): Likewise.
(vorrq_m_n_s16): Likewise.
(vqrshrntq_n_s16): Likewise.
(vqshrnbq_n_s16): Likewise.
(vqshrntq_n_s16): Likewise.
(vrshrnbq_n_s16): Likewise.
(vrshrntq_n_s16): Likewise.
(vshrnbq_n_s16): Likewise.
(vshrntq_n_s16): Likewise.
(vcmlaq_f16): Likewise.
(vcmlaq_rot180_f16): Likewise.
(vcmlaq_rot270_f16): Likewise.
(vcmlaq_rot90_f16): Likewise.
(vfmaq_f16): Likewise.
(vfmaq_n_f16): Likewise.
(vfmasq_n_f16): Likewise.
(vfmsq_f16): Likewise.

[PATCH][ARM][GCC][4/2x]: MVE intrinsics with binary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with binary operands.

vsubq_u8, vsubq_n_u8, vrmulhq_u8, vrhaddq_u8, vqsubq_u8, vqsubq_n_u8, vqaddq_u8,
vqaddq_n_u8, vorrq_u8, vornq_u8, vmulq_u8, vmulq_n_u8, vmulltq_int_u8, 
vmullbq_int_u8,
vmulhq_u8, vmladavq_u8, vminvq_u8, vminq_u8, vmaxvq_u8, vmaxq_u8, vhsubq_u8, 
vhsubq_n_u8,
vhaddq_u8, vhaddq_n_u8, veorq_u8, vcmpneq_n_u8, vcmphiq_u8, vcmphiq_n_u8, 
vcmpeqq_u8, 
vcmpeqq_n_u8, vcmpcsq_u8, vcmpcsq_n_u8, vcaddq_rot90_u8, vcaddq_rot270_u8, 
vbicq_u8,
vandq_u8, vaddvq_p_u8, vaddvaq_u8, vaddq_n_u8, vabdq_u8, vshlq_r_u8, vrshlq_u8,
vrshlq_n_u8, vqshlq_u8, vqshlq_r_u8, vqrshlq_u8, vqrshlq_n_u8, vminavq_s8, 
vminaq_s8,
vmaxavq_s8, vmaxaq_s8, vbrsrq_n_u8, vshlq_n_u8, vrshrq_n_u8, vqshlq_n_u8, 
vcmpneq_n_s8,
vcmpltq_s8, vcmpltq_n_s8, vcmpleq_s8, vcmpleq_n_s8, vcmpgtq_s8, vcmpgtq_n_s8, 
vcmpgeq_s8,
vcmpgeq_n_s8, vcmpeqq_s8, vcmpeqq_n_s8, vqshluq_n_s8, vaddvq_p_s8, vsubq_s8, 
vsubq_n_s8,
vshlq_r_s8, vrshlq_s8, vrshlq_n_s8, vrmulhq_s8, vrhaddq_s8, vqsubq_s8, 
vqsubq_n_s8,
vqshlq_s8, vqshlq_r_s8, vqrshlq_s8, vqrshlq_n_s8, vqrdmulhq_s8, vqrdmulhq_n_s8, 
vqdmulhq_s8,
vqdmulhq_n_s8, vqaddq_s8, vqaddq_n_s8, vorrq_s8, vornq_s8, vmulq_s8, 
vmulq_n_s8, vmulltq_int_s8,
vmullbq_int_s8, vmulhq_s8, vmlsdavxq_s8, vmlsdavq_s8, vmladavxq_s8, 
vmladavq_s8, vminvq_s8,
vminq_s8, vmaxvq_s8, vmaxq_s8, vhsubq_s8, vhsubq_n_s8, vhcaddq_rot90_s8, 
vhcaddq_rot270_s8,
vhaddq_s8, vhaddq_n_s8, veorq_s8, vcaddq_rot90_s8, vcaddq_rot270_s8, 
vbrsrq_n_s8, vbicq_s8,
vandq_s8, vaddvaq_s8, vaddq_n_s8, vabdq_s8, vshlq_n_s8, vrshrq_n_s8, 
vqshlq_n_s8, vsubq_u16,
vsubq_n_u16, vrmulhq_u16, vrhaddq_u16, vqsubq_u16, vqsubq_n_u16, vqaddq_u16, 
vqaddq_n_u16,
vorrq_u16, vornq_u16, vmulq_u16, vmulq_n_u16, vmulltq_int_u16, vmullbq_int_u16, 
vmulhq_u16,
vmladavq_u16, vminvq_u16, vminq_u16, vmaxvq_u16, vmaxq_u16, vhsubq_u16, 
vhsubq_n_u16,
vhaddq_u16, vhaddq_n_u16, veorq_u16, vcmpneq_n_u16, vcmphiq_u16, vcmphiq_n_u16, 
vcmpeqq_u16,
vcmpeqq_n_u16, vcmpcsq_u16, vcmpcsq_n_u16, vcaddq_rot90_u16, vcaddq_rot270_u16, 
vbicq_u16,
vandq_u16, vaddvq_p_u16, vaddvaq_u16, vaddq_n_u16, vabdq_u16, vshlq_r_u16, 
vrshlq_u16,
vrshlq_n_u16, vqshlq_u16, vqshlq_r_u16, vqrshlq_u16, vqrshlq_n_u16, 
vminavq_s16, vminaq_s16,
vmaxavq_s16, vmaxaq_s16, vbrsrq_n_u16, vshlq_n_u16, vrshrq_n_u16, vqshlq_n_u16, 
vcmpneq_n_s16,
vcmpltq_s16, vcmpltq_n_s16, vcmpleq_s16, vcmpleq_n_s16, vcmpgtq_s16, 
vcmpgtq_n_s16, 
vcmpgeq_s16, vcmpgeq_n_s16, vcmpeqq_s16, vcmpeqq_n_s16, vqshluq_n_s16, 
vaddvq_p_s16, vsubq_s16,
vsubq_n_s16, vshlq_r_s16, vrshlq_s16, vrshlq_n_s16, vrmulhq_s16, vrhaddq_s16, 
vqsubq_s16,
vqsubq_n_s16, vqshlq_s16, vqshlq_r_s16, vqrshlq_s16, vqrshlq_n_s16, 
vqrdmulhq_s16,
vqrdmulhq_n_s16, vqdmulhq_s16, vqdmulhq_n_s16, vqaddq_s16, vqaddq_n_s16, 
vorrq_s16, vornq_s16,
vmulq_s16, vmulq_n_s16, vmulltq_int_s16, vmullbq_int_s16, vmulhq_s16, 
vmlsdavxq_s16, vmlsdavq_s16,
vmladavxq_s16, vmladavq_s16, vminvq_s16, vminq_s16, vmaxvq_s16, vmaxq_s16, 
vhsubq_s16,
vhsubq_n_s16, vhcaddq_rot90_s16, vhcaddq_rot270_s16, vhaddq_s16, vhaddq_n_s16, 
veorq_s16,
vcaddq_rot90_s16, vcaddq_rot270_s16, vbrsrq_n_s16, vbicq_s16, vandq_s16, 
vaddvaq_s16, vaddq_n_s16,
vabdq_s16, vshlq_n_s16, vrshrq_n_s16, vqshlq_n_s16, vsubq_u32, vsubq_n_u32, 
vrmulhq_u32,
vrhaddq_u32, vqsubq_u32, vqsubq_n_u32, vqaddq_u32, vqaddq_n_u32, vorrq_u32, 
vornq_u32, vmulq_u32,
vmulq_n_u32, vmulltq_int_u32, vmullbq_int_u32, vmulhq_u32, vmladavq_u32, 
vminvq_u32, vminq_u32,
vmaxvq_u32, vmaxq_u32, vhsubq_u32, vhsubq_n_u32, vhaddq_u32, vhaddq_n_u32, 
veorq_u32, vcmpneq_n_u32,
vcmphiq_u32, vcmphiq_n_u32, vcmpeqq_u32, vcmpeqq_n_u32, vcmpcsq_u32, 
vcmpcsq_n_u32,
vcaddq_rot90_u32, vcaddq_rot270_u32, vbicq_u32, vandq_u32, vaddvq_p_u32, 
vaddvaq_u32, vaddq_n_u32,
vabdq_u32, vshlq_r_u32, vrshlq_u32, vrshlq_n_u32, vqshlq_u32, vqshlq_r_u32, 
vqrshlq_u32, vqrshlq_n_u32,
vminavq_s32, vminaq_s32, vmaxavq_s32, vmaxaq_s32, vbrsrq_n_u32, vshlq_n_u32, 
vrshrq_n_u32,
vqshlq_n_u32, vcmpneq_n_s32, vcmpltq_s32, vcmpltq_n_s32, vcmpleq_s32, 
vcmpleq_n_s32, vcmpgtq_s32,
vcmpgtq_n_s32, vcmpgeq_s32, vcmpgeq_n_s32, vcmpeqq_s32, vcmpeqq_n_s32, 
vqshluq_n_s32, vaddvq_p_s32,
vsubq_s32, vsubq_n_s32, vshlq_r_s32, vrshlq_s32, vrshlq_n_s32, vrmulhq_s32, 
vrhaddq_s32, vqsubq_s32,
vqsubq_n_s32, vqshlq_s32, vqshlq_r_s32, vqrshlq_s32, vqrshlq_n_s32, 
vqrdmulhq_s32, vqrdmulhq_n_s32,
vqdmulhq_s32, vqdmulhq_n_s32, vqaddq_s32, vqaddq_n_s32, vorrq_s32, vornq_s32, 
vmulq_s32, vmulq_n_s32,
vmulltq_int_s32, vmullbq_int_s32, vmulhq_s32, vmlsdavxq_s32, vmlsdavq_s32, 
vmladavxq_s32, vmladavq_s32,
vminvq_s32, vminq_s32, vmaxvq_s32, vmaxq_s32, vhsubq_s32, vhsubq_n_s32, 
vhcaddq_rot90_s32,
vhcaddq_rot270_s32, vhaddq_s32, vhaddq_n_s32, veorq_s32, vcaddq_rot90_s32, 
vcaddq_rot270_s32,
vbrsrq_n_s32, vbicq_s32, vandq_s32, vaddvaq_s32, vaddq_n_s32, vabdq_s32, 
vshlq_n_s32, vrshrq_n_s32,
vqshlq_n_s32.

Please refer to the M-profile Vector Extension (MVE) intrinsics [1] for more
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

[PATCH][ARM][GCC][5/2x]: MVE intrinsics with binary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with binary operands.

vqmovntq_u16, vqmovnbq_u16, vmulltq_poly_p8, vmullbq_poly_p8, vmovntq_u16,
vmovnbq_u16, vmlaldavxq_u16, vmlaldavq_u16, vqmovuntq_s16, vqmovunbq_s16, 
vshlltq_n_u8, vshllbq_n_u8, vorrq_n_u16, vbicq_n_u16, vcmpneq_n_f16, 
vcmpneq_f16,
vcmpltq_n_f16, vcmpltq_f16, vcmpleq_n_f16, vcmpleq_f16, vcmpgtq_n_f16,
vcmpgtq_f16, vcmpgeq_n_f16, vcmpgeq_f16, vcmpeqq_n_f16, vcmpeqq_f16, vsubq_f16,
vqmovntq_s16, vqmovnbq_s16, vqdmulltq_s16, vqdmulltq_n_s16, vqdmullbq_s16,
vqdmullbq_n_s16, vorrq_f16, vornq_f16, vmulq_n_f16, vmulq_f16, vmovntq_s16,
vmovnbq_s16, vmlsldavxq_s16, vmlsldavq_s16, vmlaldavxq_s16, vmlaldavq_s16,
vminnmvq_f16, vminnmq_f16, vminnmavq_f16, vminnmaq_f16, vmaxnmvq_f16, 
vmaxnmq_f16,
vmaxnmavq_f16, vmaxnmaq_f16, veorq_f16, vcmulq_rot90_f16, vcmulq_rot270_f16,
vcmulq_rot180_f16, vcmulq_f16, vcaddq_rot90_f16, vcaddq_rot270_f16, vbicq_f16,
vandq_f16, vaddq_n_f16, vabdq_f16, vshlltq_n_s8, vshllbq_n_s8, vorrq_n_s16, 
vbicq_n_s16, vqmovntq_u32, vqmovnbq_u32, vmulltq_poly_p16, vmullbq_poly_p16,
vmovntq_u32, vmovnbq_u32, vmlaldavxq_u32, vmlaldavq_u32, vqmovuntq_s32,
vqmovunbq_s32, vshlltq_n_u16, vshllbq_n_u16, vorrq_n_u32, vbicq_n_u32, 
vcmpneq_n_f32, vcmpneq_f32, vcmpltq_n_f32, vcmpltq_f32, vcmpleq_n_f32, 
vcmpleq_f32, vcmpgtq_n_f32, vcmpgtq_f32, vcmpgeq_n_f32, vcmpgeq_f32, 
vcmpeqq_n_f32, vcmpeqq_f32, vsubq_f32, vqmovntq_s32, vqmovnbq_s32, 
vqdmulltq_s32, vqdmulltq_n_s32, vqdmullbq_s32, vqdmullbq_n_s32, vorrq_f32,
vornq_f32, vmulq_n_f32, vmulq_f32, vmovntq_s32, vmovnbq_s32, vmlsldavxq_s32,
vmlsldavq_s32, vmlaldavxq_s32, vmlaldavq_s32, vminnmvq_f32, vminnmq_f32,
vminnmavq_f32, vminnmaq_f32, vmaxnmvq_f32, vmaxnmq_f32, vmaxnmavq_f32,
vmaxnmaq_f32, veorq_f32, vcmulq_rot90_f32, vcmulq_rot270_f32, vcmulq_rot180_f32,
vcmulq_f32, vcaddq_rot90_f32, vcaddq_rot270_f32, vbicq_f32, vandq_f32, 
vaddq_n_f32, vabdq_f32, vshlltq_n_s16, vshllbq_n_s16, vorrq_n_s32, vbicq_n_s32, 
vrmlaldavhq_u32, vctp8q_m, vctp64q_m, vctp32q_m, vctp16q_m, vaddlvaq_u32, 
vrmlsldavhxq_s32, vrmlsldavhq_s32, vrmlaldavhxq_s32, vrmlaldavhq_s32,
vcvttq_f16_f32, vcvtbq_f16_f32, vaddlvaq_s32.

Please refer to the M-profile Vector Extension (MVE) intrinsics [1] for more
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
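Several intrinsics above come in "_b" (bottom) and "_t" (top) pairs, e.g. vmovnbq/vmovntq: the narrowed result lands in the even or odd bytes of the destination while the other half is left untouched, so calling both interleaves two narrowed vectors into one. A rough scalar model (plain Python, not GCC code; 16-byte destination, eight 16-bit sources):

```python
def vmovnbq_u16(a_bytes, b_halves):
    """'Bottom' narrowing: the low byte of each 16-bit source lane
    overwrites the even byte of the destination; odd bytes are kept."""
    out = list(a_bytes)
    for i, h in enumerate(b_halves):
        out[2 * i] = h & 0xFF
    return out

def vmovntq_u16(a_bytes, b_halves):
    """'Top' variant: writes the odd bytes instead, so _b then _t
    interleaves two narrowed results."""
    out = list(a_bytes)
    for i, h in enumerate(b_halves):
        out[2 * i + 1] = h & 0xFF
    return out
```

For example, narrowing 0x1234 with the bottom form writes 0x34 into the even bytes, and a subsequent top call fills the odd bytes from a second source vector.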

The above intrinsics are defined using the existing builtin qualifiers
BINOP_NONE_NONE_IMM, BINOP_NONE_NONE_NONE, BINOP_UNONE_NONE_NONE,
BINOP_UNONE_UNONE_IMM, BINOP_UNONE_UNONE_NONE and BINOP_UNONE_UNONE_UNONE.

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-23  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vqmovntq_u16): Define macro.
(vqmovnbq_u16): Likewise.
(vmulltq_poly_p8): Likewise.
(vmullbq_poly_p8): Likewise.
(vmovntq_u16): Likewise.
(vmovnbq_u16): Likewise.
(vmlaldavxq_u16): Likewise.
(vmlaldavq_u16): Likewise.
(vqmovuntq_s16): Likewise.
(vqmovunbq_s16): Likewise.
(vshlltq_n_u8): Likewise.
(vshllbq_n_u8): Likewise.
(vorrq_n_u16): Likewise.
(vbicq_n_u16): Likewise.
(vcmpneq_n_f16): Likewise.
(vcmpneq_f16): Likewise.
(vcmpltq_n_f16): Likewise.
(vcmpltq_f16): Likewise.
(vcmpleq_n_f16): Likewise.
(vcmpleq_f16): Likewise.
(vcmpgtq_n_f16): Likewise.
(vcmpgtq_f16): Likewise.
(vcmpgeq_n_f16): Likewise.
(vcmpgeq_f16): Likewise.
(vcmpeqq_n_f16): Likewise.
(vcmpeqq_f16): Likewise.
(vsubq_f16): Likewise.
(vqmovntq_s16): Likewise.
(vqmovnbq_s16): Likewise.
(vqdmulltq_s16): Likewise.
(vqdmulltq_n_s16): Likewise.
(vqdmullbq_s16): Likewise.
(vqdmullbq_n_s16): Likewise.
(vorrq_f16): Likewise.
(vornq_f16): Likewise.
(vmulq_n_f16): Likewise.
(vmulq_f16): Likewise.
(vmovntq_s16): Likewise.
(vmovnbq_s16): Likewise.
(vmlsldavxq_s16): Likewise.
(vmlsldavq_s16): Likewise.
(vmlaldavxq_s16): Likewise.
(vmlaldavq_s16): Likewise.
(vminnmvq_f16): Likewise.
(vminnmq_f16): Likewise.
(vminnmavq_f16): Likewise.
(vminnmaq_f16): Likewise.
(vmaxnmvq_f16): Likewise.
(vmaxnmq_f16): Likewise.
(vmaxnmavq_f16): Likewise.
(vmaxnmaq_f16): Likewise.
(veorq_f16): Likewise.
(vcmulq_rot90_f16): Likewise.
(vcmulq_rot270_f16): Likewise.
(vcmulq_rot180_f16): Likewise.
(vcmulq_f16): Likewise.
(vcaddq_rot90_f16): Likewise.
(vcaddq_rot270_f16): Likewise.
(vbicq_f16): Likewise.
(vandq_f16): Likewise.
(vaddq_n_f16): Likewise.
(vabdq_f16): Likewise.
(vshlltq_n_s8): Likewise.
(vshllbq_n_s8): Likewise.
(vorrq_n_s16): Likewise.

[PATCH][ARM][GCC][3/2x]: MVE intrinsics with binary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with binary operands.

vaddlvq_p_s32, vaddlvq_p_u32, vcmpneq_s8, vcmpneq_s16, vcmpneq_s32, vcmpneq_u8,
vcmpneq_u16, vcmpneq_u32, vshlq_s8, vshlq_s16, vshlq_s32, vshlq_u8, vshlq_u16, 
vshlq_u32.

Please refer to the M-profile Vector Extension (MVE) intrinsics [1] for more
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
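The vshlq forms here differ from the immediate-shift forms elsewhere in the series: the shift count comes from the second vector and is signed per lane, so a negative count shifts right. A rough scalar model for the s8 variant (plain Python, not GCC code; the shifted result simply wraps to 8 bits, with saturation left to the vqshlq variants):

```python
def vshlq_s8(a, b):
    """Scalar model of vshlq_s8: each signed 8-bit lane of a is shifted
    by the signed count in the matching lane of b (negative => right),
    then truncated back to 8 bits."""
    def lane(x, s):
        r = x << s if s >= 0 else x >> -s   # Python's >> is arithmetic
        r &= 0xFF                           # wrap to 8 bits
        return r - 0x100 if r & 0x80 else r # re-interpret as signed
    return [lane(x, s) for x, s in zip(a, b)]
```

For instance, shifting 64 left by 1 wraps to -128, while shifting -8 by a count of -2 yields the arithmetic right shift -2.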

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (BINOP_NONE_NONE_UNONE_QUALIFIERS): Define
qualifier for binary operands.
(BINOP_UNONE_NONE_NONE_QUALIFIERS): Likewise.
(BINOP_UNONE_UNONE_NONE_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vaddlvq_p_s32): Define macro.
(vaddlvq_p_u32): Likewise.
(vcmpneq_s8): Likewise.
(vcmpneq_s16): Likewise.
(vcmpneq_s32): Likewise.
(vcmpneq_u8): Likewise.
(vcmpneq_u16): Likewise.
(vcmpneq_u32): Likewise.
(vshlq_s8): Likewise.
(vshlq_s16): Likewise.
(vshlq_s32): Likewise.
(vshlq_u8): Likewise.
(vshlq_u16): Likewise.
(vshlq_u32): Likewise.
(__arm_vaddlvq_p_s32): Define intrinsic.
(__arm_vaddlvq_p_u32): Likewise.
(__arm_vcmpneq_s8): Likewise.
(__arm_vcmpneq_s16): Likewise.
(__arm_vcmpneq_s32): Likewise.
(__arm_vcmpneq_u8): Likewise.
(__arm_vcmpneq_u16): Likewise.
(__arm_vcmpneq_u32): Likewise.
(__arm_vshlq_s8): Likewise.
(__arm_vshlq_s16): Likewise.
(__arm_vshlq_s32): Likewise.
(__arm_vshlq_u8): Likewise.
(__arm_vshlq_u16): Likewise.
(__arm_vshlq_u32): Likewise.
(vaddlvq_p): Define polymorphic variant.
(vcmpneq): Likewise.
(vshlq): Likewise.
* config/arm/arm_mve_builtins.def (BINOP_NONE_NONE_UNONE_QUALIFIERS):
Use it.
(BINOP_UNONE_NONE_NONE_QUALIFIERS): Likewise.
(BINOP_UNONE_UNONE_NONE_QUALIFIERS): Likewise.
* config/arm/mve.md (mve_vaddlvq_p_v4si): Define RTL pattern.
(mve_vcmpneq_): Likewise.
(mve_vshlq_): Likewise.

gcc/testsuite/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vaddlvq_p_s32.c: New test.
* gcc.target/arm/mve/intrinsics/vaddlvq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_u8.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
ec56dbcdd2bb1de696142186955d75570772..73a0a3070bda2e35f3e400994dbb6ccfb3043f76
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -391,6 +391,24 @@ arm_binop_unone_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define BINOP_UNONE_NONE_IMM_QUALIFIERS \
   (arm_binop_unone_none_imm_qualifiers)
 
+static enum arm_type_qualifiers
+arm_binop_none_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_unsigned };
+#define BINOP_NONE_NONE_UNONE_QUALIFIERS \
+  (arm_binop_none_none_unone_qualifiers)
+
+static enum arm_type_qualifiers
+arm_binop_unone_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_none, qualifier_none };
+#define BINOP_UNONE_NONE_NONE_QUALIFIERS \
+  (arm_binop_unone_none_none_qualifiers)
+
+static enum arm_type_qualifiers
+arm_binop_unone_unone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_none };
+#define BINOP_UNONE_UNONE_NONE_QUALIFIERS \
+  (arm_binop_unone_unone_none_qualifiers)
+
 /* End of Qualifier for MVE builtins.  */
 
/* void ([T element type] *, T, immediate).  */
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
0318a2cd8c3cdea430a32506e75a5aceb4d3a475..0140b0cb38ccee5f9caa7570627bdc010e158e71
 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -225,6 +225,20 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t;
 #define vshrq_n_u8(__a,  __im

[PATCH][ARM][GCC][2/2x]: MVE intrinsics with binary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with binary operands.

vcvtq_n_s16_f16, vcvtq_n_s32_f32, vcvtq_n_u16_f16, vcvtq_n_u32_f32, vcreateq_u8,
vcreateq_u16, vcreateq_u32, vcreateq_u64, vcreateq_s8, vcreateq_s16, 
vcreateq_s32,
vcreateq_s64, vshrq_n_s8, vshrq_n_s16, vshrq_n_s32, vshrq_n_u8, vshrq_n_u16, 
vshrq_n_u32.

Please refer to the M-profile Vector Extension (MVE) intrinsics [1] for more
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

In this patch, new constraints "Rb" and "Rf" are added, which check that the
constant is within the range 1 to 8 and 1 to 32 respectively.

Also, new predicates "mve_imm_8" and "mve_imm_32" are added, to check the
matching constraints Rb and Rf respectively.
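As a rough illustration of why the range starts at 1 rather than 0 and runs up to the element width: for vshrq_n_s8 an arithmetic right shift by the full 8-bit width is still meaningful (it yields the sign), while the immediate shift encodings do not provide a count of 0. A scalar sketch of the s8 variant (plain Python, not GCC code):

```python
def vshrq_n_s8(vec, n):
    """Scalar model of vshrq_n_s8: arithmetic right shift of each signed
    8-bit lane by an immediate restricted to 1..8 (the 'Rb' range)."""
    assert 1 <= n <= 8
    def lane(x):
        x &= 0xFF
        if x & 0x80:          # sign-extend the 8-bit value
            x -= 0x100
        return x >> n         # Python's >> on a negative int is arithmetic
    return [lane(x) for x in vec]
```

A shift by 8 collapses every lane to 0 or -1, which is why the upper bound of the constraint is the element size rather than size minus one.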

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (BINOP_UNONE_UNONE_IMM_QUALIFIERS): Define
qualifier for binary operands.
(BINOP_UNONE_UNONE_UNONE_QUALIFIERS): Likewise.
(BINOP_UNONE_NONE_IMM_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vcvtq_n_s16_f16): Define macro.
(vcvtq_n_s32_f32): Likewise.
(vcvtq_n_u16_f16): Likewise.
(vcvtq_n_u32_f32): Likewise.
(vcreateq_u8): Likewise.
(vcreateq_u16): Likewise.
(vcreateq_u32): Likewise.
(vcreateq_u64): Likewise.
(vcreateq_s8): Likewise.
(vcreateq_s16): Likewise.
(vcreateq_s32): Likewise.
(vcreateq_s64): Likewise.
(vshrq_n_s8): Likewise.
(vshrq_n_s16): Likewise.
(vshrq_n_s32): Likewise.
(vshrq_n_u8): Likewise.
(vshrq_n_u16): Likewise.
(vshrq_n_u32): Likewise.
(__arm_vcreateq_u8): Define intrinsic.
(__arm_vcreateq_u16): Likewise.
(__arm_vcreateq_u32): Likewise.
(__arm_vcreateq_u64): Likewise.
(__arm_vcreateq_s8): Likewise.
(__arm_vcreateq_s16): Likewise.
(__arm_vcreateq_s32): Likewise.
(__arm_vcreateq_s64): Likewise.
(__arm_vshrq_n_s8): Likewise.
(__arm_vshrq_n_s16): Likewise.
(__arm_vshrq_n_s32): Likewise.
(__arm_vshrq_n_u8): Likewise.
(__arm_vshrq_n_u16): Likewise.
(__arm_vshrq_n_u32): Likewise.
(__arm_vcvtq_n_s16_f16): Likewise.
(__arm_vcvtq_n_s32_f32): Likewise.
(__arm_vcvtq_n_u16_f16): Likewise.
(__arm_vcvtq_n_u32_f32): Likewise.
(vshrq_n): Define polymorphic variant.
* config/arm/arm_mve_builtins.def (BINOP_UNONE_UNONE_IMM_QUALIFIERS):
Use it.
(BINOP_UNONE_UNONE_UNONE_QUALIFIERS): Likewise.
(BINOP_UNONE_NONE_IMM_QUALIFIERS): Likewise.
* config/arm/constraints.md (Rb): Define constraint to check constant is
in the range of 1 to 8.
(Rf): Define constraint to check constant is in the range of 1 to 32.
* config/arm/mve.md (mve_vcreateq_): Define RTL pattern.
(mve_vshrq_n_): Likewise.
(mve_vcvtq_n_from_f_): Likewise.
* config/arm/predicates.md (mve_imm_8): Define predicate to check
the matching constraint Rb.
(mve_imm_32): Define predicate to check the matching constraint Rf.

gcc/testsuite/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vcreateq_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vcreateq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_u8.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
c2dad057d1365914477c64d559aa1fd1c32bbf19..ec56dbcdd2bb1de696142186955d75570772
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -373,6 

[PATCH][ARM][GCC][1/2x]: MVE intrinsics with binary operands.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with binary operands.

vsubq_n_f16, vsubq_n_f32, vbrsrq_n_f16, vbrsrq_n_f32, vcvtq_n_f16_s16,
vcvtq_n_f32_s32, vcvtq_n_f16_u16, vcvtq_n_f32_u32, vcreateq_f16, vcreateq_f32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

In this patch a new constraint "Rd" is added, which checks that the constant
is within the range of 1 to 16.
Also a new predicate "mve_imm_16" is added, to check the matching
constraint Rd.

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (BINOP_NONE_NONE_NONE_QUALIFIERS): Define
qualifier for binary operands.
(BINOP_NONE_NONE_IMM_QUALIFIERS): Likewise.
(BINOP_NONE_UNONE_IMM_QUALIFIERS): Likewise.
(BINOP_NONE_UNONE_UNONE_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vsubq_n_f16): Define macro.
(vsubq_n_f32): Likewise.
(vbrsrq_n_f16): Likewise.
(vbrsrq_n_f32): Likewise.
(vcvtq_n_f16_s16): Likewise.
(vcvtq_n_f32_s32): Likewise.
(vcvtq_n_f16_u16): Likewise.
(vcvtq_n_f32_u32): Likewise.
(vcreateq_f16): Likewise.
(vcreateq_f32): Likewise.
(__arm_vsubq_n_f16): Define intrinsic.
(__arm_vsubq_n_f32): Likewise.
(__arm_vbrsrq_n_f16): Likewise.
(__arm_vbrsrq_n_f32): Likewise.
(__arm_vcvtq_n_f16_s16): Likewise.
(__arm_vcvtq_n_f32_s32): Likewise.
(__arm_vcvtq_n_f16_u16): Likewise.
(__arm_vcvtq_n_f32_u32): Likewise.
(__arm_vcreateq_f16): Likewise.
(__arm_vcreateq_f32): Likewise.
(vsubq): Define polymorphic variant.
(vbrsrq): Likewise.
(vcvtq_n): Likewise.
* config/arm/arm_mve_builtins.def (BINOP_NONE_NONE_NONE_QUALIFIERS): Use
it.
(BINOP_NONE_NONE_IMM_QUALIFIERS): Likewise.
(BINOP_NONE_UNONE_IMM_QUALIFIERS): Likewise.
(BINOP_NONE_UNONE_UNONE_QUALIFIERS): Likewise.
* config/arm/constraints.md (Rd): Define constraint to check constant is
in the range of 1 to 16.
* config/arm/mve.md (mve_vsubq_n_f): Define RTL pattern.
(mve_vbrsrq_n_f): Likewise.
(mve_vcvtq_n_to_f_): Likewise.
(mve_vcreateq_f): Likewise.
* config/arm/predicates.md (mve_imm_16): Define predicate to check
the matching constraint Rd.

gcc/testsuite/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vbrsrq_n_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vbrsrq_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f16_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f16_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f32_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f32_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_f32.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
cd82aa159089c288607e240de02a85dcbb134a14..c2dad057d1365914477c64d559aa1fd1c32bbf19
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -349,6 +349,30 @@ arm_unop_unone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define UNOP_UNONE_IMM_QUALIFIERS \
   (arm_unop_unone_imm_qualifiers)
 
+static enum arm_type_qualifiers
+arm_binop_none_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_none };
+#define BINOP_NONE_NONE_NONE_QUALIFIERS \
+  (arm_binop_none_none_none_qualifiers)
+
+static enum arm_type_qualifiers
+arm_binop_none_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_immediate };
+#define BINOP_NONE_NONE_IMM_QUALIFIERS \
+  (arm_binop_none_none_imm_qualifiers)
+
+static enum arm_type_qualifiers
+arm_binop_none_unone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_unsigned, qualifier_immediate };
+#define BINOP_NONE_UNONE_IMM_QUALIFIERS \
+  (arm_binop_none_unone_imm_qualifiers)
+
+static enum arm_type_qualifiers
+arm_binop_none_unone_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_unsigned, qualifier_unsigned };
+#define BINOP_NONE_UNONE_UNONE_QUALIFIERS \
+  (arm_binop_none_unone_unone_qualifiers)
+
 /* End of Qualifier for MVE builtins.  */
 
/* void ([T element type] *, T, immediate).  */
diff --git a/gcc/config/arm/arm_mve.h b/

[PATCH][ARM][GCC][1/1x]: Patch to support MVE ACLE intrinsics with unary operand.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports MVE ACLE intrinsics vcvtq_f16_s16, vcvtq_f32_s32,
vcvtq_f16_u16, vcvtq_f32_u32,
vrndxq_f16, vrndxq_f32, vrndq_f16, vrndq_f32, vrndpq_f16, vrndpq_f32, 
vrndnq_f16, vrndnq_f32,
vrndmq_f16, vrndmq_f32, vrndaq_f16, vrndaq_f32, vrev64q_f16, vrev64q_f32, 
vnegq_f16, vnegq_f32,
vdupq_n_f16, vdupq_n_f32, vabsq_f16, vabsq_f32, vrev32q_f16, vcvttq_f32_f16, 
vcvtbq_f32_f16.


Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-17  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (UNOP_NONE_NONE_QUALIFIERS): Define macro.
(UNOP_NONE_SNONE_QUALIFIERS): Likewise.
(UNOP_NONE_UNONE_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vrndxq_f16): Define macro.
(vrndxq_f32): Likewise.
(vrndq_f16): Likewise.
(vrndq_f32): Likewise.
(vrndpq_f16): Likewise. 
(vrndpq_f32): Likewise.
(vrndnq_f16): Likewise.
(vrndnq_f32): Likewise.
(vrndmq_f16): Likewise.
(vrndmq_f32): Likewise. 
(vrndaq_f16): Likewise.
(vrndaq_f32): Likewise.
(vrev64q_f16): Likewise.
(vrev64q_f32): Likewise.
(vnegq_f16): Likewise.
(vnegq_f32): Likewise.
(vdupq_n_f16): Likewise.
(vdupq_n_f32): Likewise.
(vabsq_f16): Likewise. 
(vabsq_f32): Likewise.
(vrev32q_f16): Likewise.
(vcvttq_f32_f16): Likewise.
(vcvtbq_f32_f16): Likewise.
(vcvtq_f16_s16): Likewise. 
(vcvtq_f32_s32): Likewise.
(vcvtq_f16_u16): Likewise.
(vcvtq_f32_u32): Likewise.
(__arm_vrndxq_f16): Define intrinsic.
(__arm_vrndxq_f32): Likewise.
(__arm_vrndq_f16): Likewise.
(__arm_vrndq_f32): Likewise.
(__arm_vrndpq_f16): Likewise.
(__arm_vrndpq_f32): Likewise.
(__arm_vrndnq_f16): Likewise.
(__arm_vrndnq_f32): Likewise.
(__arm_vrndmq_f16): Likewise.
(__arm_vrndmq_f32): Likewise.
(__arm_vrndaq_f16): Likewise.
(__arm_vrndaq_f32): Likewise.
(__arm_vrev64q_f16): Likewise.
(__arm_vrev64q_f32): Likewise.
(__arm_vnegq_f16): Likewise.
(__arm_vnegq_f32): Likewise.
(__arm_vdupq_n_f16): Likewise.
(__arm_vdupq_n_f32): Likewise.
(__arm_vabsq_f16): Likewise.
(__arm_vabsq_f32): Likewise.
(__arm_vrev32q_f16): Likewise.
(__arm_vcvttq_f32_f16): Likewise.
(__arm_vcvtbq_f32_f16): Likewise.
(__arm_vcvtq_f16_s16): Likewise.
(__arm_vcvtq_f32_s32): Likewise.
(__arm_vcvtq_f16_u16): Likewise.
(__arm_vcvtq_f32_u32): Likewise.
(vrndxq): Define polymorphic variants.
(vrndq): Likewise.
(vrndpq): Likewise.
(vrndnq): Likewise.
(vrndmq): Likewise.
(vrndaq): Likewise.
(vrev64q): Likewise.
(vnegq): Likewise.
(vabsq): Likewise.
(vrev32q): Likewise.
(vcvtbq_f32): Likewise.
(vcvttq_f32): Likewise.
(vcvtq): Likewise.
* config/arm/arm_mve_builtins.def (VAR2): Define.
(VAR1): Define.
* config/arm/mve.md (mve_vrndxq_f): Add RTL pattern.
(mve_vrndq_f): Likewise.
(mve_vrndpq_f): Likewise.
(mve_vrndnq_f): Likewise.
(mve_vrndmq_f): Likewise.
(mve_vrndaq_f): Likewise.
(mve_vrev64q_f): Likewise.
(mve_vnegq_f): Likewise.
(mve_vdupq_n_f): Likewise.
(mve_vabsq_f): Likewise.
(mve_vrev32q_fv8hf): Likewise.
(mve_vcvttq_f32_f16v4sf): Likewise.
(mve_vcvtbq_f32_f16v4sf): Likewise.
(mve_vcvtq_to_f_): Likewise.

gcc/testsuite/ChangeLog:

2019-10-17  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vabsq_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vabsq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtbq_f32_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f16_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f16_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f32_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f32_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvttq_f32_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev32q_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vr

[PATCH][ARM][GCC][4/x]: MVE ACLE vector interleaving store intrinsics.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports MVE ACLE intrinsics vst4q_s8, vst4q_s16, vst4q_s32, 
vst4q_u8,
vst4q_u16, vst4q_u32, vst4q_f16 and vst4q_f32.

In this patch the arm_mve_builtins.def file is added to the source code, in
which the builtins for the MVE ACLE intrinsics are defined using builtin
qualifiers.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-12  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (CF): Define mve_builtin_data.
(VAR1): Define.
(ARM_BUILTIN_MVE_PATTERN_START): Define.
(arm_init_mve_builtins): Define function.
(arm_init_builtins): Add TARGET_HAVE_MVE check.
(arm_expand_builtin_1): Check the range of fcode.
(arm_expand_mve_builtin): Define function to expand MVE builtins.
(arm_expand_builtin): Check the range of fcode.
* config/arm/arm_mve.h (__ARM_FEATURE_MVE): Define MVE floating point
types.
(__ARM_MVE_PRESERVE_USER_NAMESPACE): Define to protect user namespace.
(vst4q_s8): Define macro.
(vst4q_s16): Likewise.
(vst4q_s32): Likewise.
(vst4q_u8): Likewise.
(vst4q_u16): Likewise.
(vst4q_u32): Likewise.
(vst4q_f16): Likewise.
(vst4q_f32): Likewise.
(__arm_vst4q_s8): Define inline builtin.
(__arm_vst4q_s16): Likewise.
(__arm_vst4q_s32): Likewise.
(__arm_vst4q_u8): Likewise.
(__arm_vst4q_u16): Likewise.
(__arm_vst4q_u32): Likewise.
(__arm_vst4q_f16): Likewise.
(__arm_vst4q_f32): Likewise.
(__ARM_mve_typeid): Define macro with MVE types.
(__ARM_mve_coerce): Define macro with _Generic feature.
(vst4q): Define polymorphic variant for different vst4q builtins.
* config/arm/arm_mve_builtins.def: New file.
* config/arm/mve.md (MVE_VLD_ST): Define iterator.
(unspec): Define unspec.
(mve_vst4q): Define RTL pattern.
* config/arm/t-arm (arm.o): Add entry for arm_mve_builtins.def.
(arm-builtins.o): Likewise.

gcc/testsuite/ChangeLog:

2019-11-12  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vst4q_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vst4q_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u8.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
d4cb0ea3deb49b10266d1620c85e243ed34aee4d..a9f76971ef310118bf7edea6a8dd3de1da46b46b
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -401,6 +401,13 @@ static arm_builtin_datum neon_builtin_data[] =
 };
 
 #undef CF
+#define CF(N,X) CODE_FOR_mve_##N##X
+static arm_builtin_datum mve_builtin_data[] =
+{
+#include "arm_mve_builtins.def"
+};
+
+#undef CF
 #undef VAR1
 #define VAR1(T, N, A) \
   {#N, UP (A), CODE_FOR_arm_##N, 0, T##_QUALIFIERS},
@@ -705,6 +712,13 @@ enum arm_builtins
 
 #include "arm_acle_builtins.def"
 
+  ARM_BUILTIN_MVE_BASE,
+
+#undef VAR1
+#define VAR1(T, N, X) \
+  ARM_BUILTIN_MVE_##N##X,
+#include "arm_mve_builtins.def"
+
   ARM_BUILTIN_MAX
 };
 
@@ -714,6 +728,9 @@ enum arm_builtins
 #define ARM_BUILTIN_NEON_PATTERN_START \
   (ARM_BUILTIN_NEON_BASE + 1)
 
+#define ARM_BUILTIN_MVE_PATTERN_START \
+  (ARM_BUILTIN_MVE_BASE + 1)
+
 #define ARM_BUILTIN_ACLE_PATTERN_START \
   (ARM_BUILTIN_ACLE_BASE + 1)
 
@@ -1219,6 +1236,22 @@ arm_init_acle_builtins (void)
 }
 }
 
+/* Set up all the MVE builtins mentioned in arm_mve_builtins.def file.  */
+static void
+arm_init_mve_builtins (void)
+{
+  volatile unsigned int i, fcode = ARM_BUILTIN_MVE_PATTERN_START;
+
+  arm_init_simd_builtin_scalar_types ();
+  arm_init_simd_builtin_types ();
+
+  for (i = 0; i < ARRAY_SIZE (mve_builtin_data); i++, fcode++)
+{
+  arm_builtin_datum *d = &mve_builtin_data[i];
+  arm_init_builtin (fcode, d, "__builtin_mve");
+}
+}
+
 /* Set up all the NEON builtins, even builtins for instructions that are not
in the current target ISA to allow the user to compile particular modules
with different target specific options that differ from the command line
@@ -1961,8 +1994,10 @@ arm_init_builtins (void)
   = add_builtin_function ("__builtin_arm_lane_check", lane_check_fpr,
  ARM_BUILT

[PATCH][ARM][GCC][1/x]: MVE ACLE intrinsics framework patch.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch creates the required framework for MVE ACLE intrinsics.

The following changes are done in this patch to support MVE ACLE intrinsics.

The header file arm_mve.h is added to the source code; it contains the
definitions of the MVE ACLE intrinsics and the different data types used in
MVE. The machine description file mve.md is also added; it contains the RTL
patterns defined for MVE.

A new register "p0" is added, which is used by MVE predicated patterns. A new
register class "VPR_REG" is added and its contents are defined in
REG_CLASS_CONTENTS.

The vec-common.md file is modified to support the standard move patterns. The
prefix of the neon functions which are also used by MVE is changed from
"neon_" to "simd_".
eg: neon_immediate_valid_for_move is changed to simd_immediate_valid_for_move.

In this patch the standard patterns mve_move, mve_store and mve_load for MVE
are added, and the neon.md and vfp.md files are modified to support these
common patterns.

Please refer to Arm reference manual [1] for more details.

[1] 
https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf?_ga=2.102521798.659307368.1572453718-1501600630.1548848914

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath

gcc/ChangeLog:

2019-11-11  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config.gcc (arm_mve.h): Add header file.
* config/arm/aout.h (p0): Add new register name.
* config/arm-builtins.c (ARM_BUILTIN_SIMD_LANE_CHECK): Define.
(ARM_BUILTIN_NEON_LANE_CHECK): Remove.
(arm_init_simd_builtin_types): Add TARGET_HAVE_MVE check.
(arm_init_neon_builtins): Move a check to arm_init_builtins function.
(arm_init_builtins): Move a check from arm_init_neon_builtins function.
(mve_dereference_pointer): Add new function.
(arm_expand_builtin_args): Add TARGET_HAVE_MVE check.
(arm_expand_neon_builtin): Move a check to arm_expand_builtin function.
(arm_expand_builtin): Move a check from arm_expand_neon_builtin 
function.
* config/arm/arm-c.c (arm_cpu_builtins): Define macros for MVE.
* config/arm/arm-modes.def (INT_MODE): Add three new integer modes.
* config/arm/arm-protos.h (neon_immediate_valid_for_move): Rename 
function.
(simd_immediate_valid_for_move): Rename neon_immediate_valid_for_move 
function.
* config/arm/arm.c (arm_options_perform_arch_sanity_checks): Enable mve
isa bit.
(use_return_insn): Add TARGET_HAVE_MVE check.
(aapcs_vfp_allocate): Add TARGET_HAVE_MVE check.
(aapcs_vfp_allocate_return_reg): Add TARGET_HAVE_MVE check.
(thumb2_legitimate_address_p): Add TARGET_HAVE_MVE check.
(arm_rtx_costs_internal): Add TARGET_HAVE_MVE check.
(neon_valid_immediate): Rename to simd_valid_immediate.
(simd_valid_immediate): Rename from neon_valid_immediate.
(neon_immediate_valid_for_move): Rename to 
simd_immediate_valid_for_move.
(simd_immediate_valid_for_move): Rename from 
neon_immediate_valid_for_move.
(neon_immediate_valid_for_logic): Modify call to neon_valid_immediate 
function.
(neon_make_constant): Modify call to neon_valid_immediate function.
(neon_vector_mem_operand): Add TARGET_HAVE_MVE check.
(output_move_neon): Add TARGET_HAVE_MVE check.
(arm_compute_frame_layout): Add TARGET_HAVE_MVE check.
(arm_save_coproc_regs): Add TARGET_HAVE_MVE check.
(arm_print_operand): Add case 'E' to print memory operands.
(arm_print_operand_address): Add TARGET_HAVE_MVE check.
(arm_hard_regno_mode_ok): Add TARGET_HAVE_MVE check.
(arm_modes_tieable_p): Add TARGET_HAVE_MVE check.
(arm_regno_class): Add VPR_REGNUM check.
(arm_expand_epilogue_apcs_frame): Add TARGET_HAVE_MVE check.
(arm_expand_epilogue): Add TARGET_HAVE_MVE check.
(arm_vector_mode_supported_p): Add TARGET_HAVE_MVE check for MVE vector 
modes.
(arm_array_mode_supported_p): Add TARGET_HAVE_MVE check.
(arm_conditional_register_usage): For TARGET_HAVE_MVE enable VPR 
register.
* config/arm/arm.h (IS_VPR_REGNUM): Macro to check for VPR register.
(FIRST_PSEUDO_REGISTER): Modify.
(VALID_MVE_MODE): Define.
(VALID_MVE_SI_MODE): Define.
(VALID_MVE_SF_MODE): Define.
(VALID_MVE_STRUCT_MODE): Define.
(REG_ALLOC_ORDER): Add VPR_REGNUM entry.
(enum reg_class): Add VPR_REG entry.
(REG_CLASS_NAMES): Add VPR_REG entry.
* config/arm/arm.md (VPR_REGNUM): Define.
(arm_movsf_soft_insn): Add TARGET_HAVE_MVE check to not allow MVE.
(vfp_pop_multiple_with_writeback): Add TARGET_HAVE_MVE check to allow 
writeback.
(include "mve.md"): Include mve.md file.
* config/arm/arm_mve.h: New file.
* config/arm/constraints.md (Up): Define.
* config/arm/iterators.md (VNIM1): Define.
(VNINOTM1):

[PATCH][ARM][GCC][2/1x]: MVE intrinsics with unary operand.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with unary operands.

vmvnq_n_s16, vmvnq_n_s32, vrev64q_s8, vrev64q_s16, vrev64q_s32, vcvtq_s16_f16, 
vcvtq_s32_f32,
vrev64q_u8, vrev64q_u16, vrev64q_u32, vmvnq_n_u16, vmvnq_n_u32, vcvtq_u16_f16, 
vcvtq_u32_f32,
vrev64q.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (UNOP_SNONE_SNONE_QUALIFIERS): Define.
(UNOP_SNONE_NONE_QUALIFIERS): Likewise.
(UNOP_SNONE_IMM_QUALIFIERS): Likewise.
(UNOP_UNONE_NONE_QUALIFIERS): Likewise.
(UNOP_UNONE_UNONE_QUALIFIERS): Likewise.
(UNOP_UNONE_IMM_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vmvnq_n_s16): Define macro.
(vmvnq_n_s32): Likewise.
(vrev64q_s8): Likewise.
(vrev64q_s16): Likewise.
(vrev64q_s32): Likewise.
(vcvtq_s16_f16): Likewise.
(vcvtq_s32_f32): Likewise.
(vrev64q_u8): Likewise.
(vrev64q_u16): Likewise.
(vrev64q_u32): Likewise.
(vmvnq_n_u16): Likewise.
(vmvnq_n_u32): Likewise.
(vcvtq_u16_f16): Likewise.
(vcvtq_u32_f32): Likewise.
(__arm_vmvnq_n_s16): Define intrinsic.
(__arm_vmvnq_n_s32): Likewise.
(__arm_vrev64q_s8): Likewise.
(__arm_vrev64q_s16): Likewise.
(__arm_vrev64q_s32): Likewise.
(__arm_vrev64q_u8): Likewise.
(__arm_vrev64q_u16): Likewise.
(__arm_vrev64q_u32): Likewise.
(__arm_vmvnq_n_u16): Likewise.
(__arm_vmvnq_n_u32): Likewise.
(__arm_vcvtq_s16_f16): Likewise.
(__arm_vcvtq_s32_f32): Likewise.
(__arm_vcvtq_u16_f16): Likewise.
(__arm_vcvtq_u32_f32): Likewise.
(vrev64q): Define polymorphic variant.
* config/arm/arm_mve_builtins.def (UNOP_SNONE_SNONE): Use it.
(UNOP_SNONE_NONE): Likewise.
(UNOP_SNONE_IMM): Likewise.
(UNOP_UNONE_UNONE): Likewise.
(UNOP_UNONE_NONE): Likewise.
(UNOP_UNONE_IMM): Likewise.
* config/arm/mve.md (mve_vrev64q_): Define RTL pattern.
(mve_vcvtq_from_f_): Likewise.
(mve_vmvnq_n_): Likewise.

gcc/testsuite/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vcvtq_s16_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vcvtq_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_u8.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
2fee417fe6585f457edd4cf96655366b1d6bd1a0..21b213d8e1bc99a3946f15e97161e01d73832799
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -313,6 +313,42 @@ arm_unop_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define UNOP_NONE_UNONE_QUALIFIERS \
   (arm_unop_none_unone_qualifiers)
 
+static enum arm_type_qualifiers
+arm_unop_snone_snone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none };
+#define UNOP_SNONE_SNONE_QUALIFIERS \
+  (arm_unop_snone_snone_qualifiers)
+
+static enum arm_type_qualifiers
+arm_unop_snone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none };
+#define UNOP_SNONE_NONE_QUALIFIERS \
+  (arm_unop_snone_none_qualifiers)
+
+static enum arm_type_qualifiers
+arm_unop_snone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_immediate };
+#define UNOP_SNONE_IMM_QUALIFIERS \
+  (arm_unop_snone_imm_qualifiers)
+
+static enum arm_type_qualifiers
+arm_unop_unone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_none };
+#define UNOP_UNONE_NONE_QUALIFIERS \
+  (arm_unop_unone_none_qualifiers)
+
+static enum arm_type_qualifiers
+arm_unop_unone_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned };
+#define UNOP_UNONE_UNONE_QUALIFIERS \
+  (arm_u

[PATCH][ARM][GCC][4/1x]: MVE intrinsics with unary operand.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports the following MVE ACLE intrinsics with unary operands.

vctp16q, vctp32q, vctp64q, vctp8q, vpnot.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

There are a few conflicts in defining the machine registers, which are
resolved by re-ordering VPR_REGNUM, APSRQ_REGNUM and APSRGE_REGNUM.

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-12  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (hi_UP): Define mode.
* config/arm/arm.h (IS_VPR_REGNUM): Move.
* config/arm/arm.md (VPR_REGNUM): Define before APSRQ_REGNUM.
(APSRQ_REGNUM): Modify.
(APSRGE_REGNUM): Modify.
* config/arm/arm_mve.h (vctp16q): Define macro.
(vctp32q): Likewise.
(vctp64q): Likewise.
(vctp8q): Likewise.
(vpnot): Likewise.
(__arm_vctp16q): Define intrinsic.
(__arm_vctp32q): Likewise.
(__arm_vctp64q): Likewise.
(__arm_vctp8q): Likewise.
(__arm_vpnot): Likewise.
* config/arm/arm_mve_builtins.def (UNOP_UNONE_UNONE): Use builtin
qualifier.
* config/arm/mve.md (mve_vctpqhi): Define RTL pattern.
(mve_vpnothi): Likewise.

gcc/testsuite/ChangeLog:

2019-11-12  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vctp16q.c: New test.
* gcc.target/arm/mve/intrinsics/vctp32q.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp64q.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp8q.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpnot.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
21b213d8e1bc99a3946f15e97161e01d73832799..cd82aa159089c288607e240de02a85dcbb134a14
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -387,6 +387,7 @@ arm_set_sat_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define oi_UP   E_OImode
 #define hf_UP   E_HFmode
 #define si_UP   E_SImode
+#define hi_UP   E_HImode
 #define void_UP E_VOIDmode
 
 #define UP(X) X##_UP
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 
485db72f05f16ca389227289a35c232dc982bf9d..95ec7963a57a1a5652a0a9dc30391a0ce6348242
 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -955,6 +955,9 @@ extern int arm_arch_cmse;
 #define IS_IWMMXT_GR_REGNUM(REGNUM) \
   (((REGNUM) >= FIRST_IWMMXT_GR_REGNUM) && ((REGNUM) <= LAST_IWMMXT_GR_REGNUM))
 
+#define IS_VPR_REGNUM(REGNUM) \
+  ((REGNUM) == VPR_REGNUM)
+
 /* Base register for access to local variables of the function.  */
 #define FRAME_POINTER_REGNUM   102
 
@@ -999,7 +1002,7 @@ extern int arm_arch_cmse;
&& (LAST_VFP_REGNUM - (REGNUM) >= 2 * (N) - 1))
 
 /* The number of hard registers is 16 ARM + 1 CC + 1 SFP + 1 AFP
-   + 1 APSRQ + 1 APSRGE + 1 VPR.  */
+   +1 VPR + 1 APSRQ + 1 APSRGE.  */
 /* Intel Wireless MMX Technology registers add 16 + 4 more.  */
 /* VFP (VFP3) adds 32 (64) + 1 VFPCC.  */
 #define FIRST_PSEUDO_REGISTER   107
@@ -1101,13 +1104,10 @@ extern int arm_regs_in_sequence[];
   /* Registers not for general use.  */\
   CC_REGNUM, VFPCC_REGNUM, \
   FRAME_POINTER_REGNUM, ARG_POINTER_REGNUM,\
-  SP_REGNUM, PC_REGNUM, APSRQ_REGNUM, APSRGE_REGNUM,   \
-  VPR_REGNUM   \
+  SP_REGNUM, PC_REGNUM, VPR_REGNUM, APSRQ_REGNUM,\
+  APSRGE_REGNUM\
 }
 
-#define IS_VPR_REGNUM(REGNUM) \
-  ((REGNUM) == VPR_REGNUM)
-
 /* Use different register alloc ordering for Thumb.  */
 #define ADJUST_REG_ALLOC_ORDER arm_order_regs_for_local_alloc ()
 
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
689baa0b0ff63ef90f47d2fd844cb98c9a1457a0..2a90482a873f8250a3b2b1dec141669f55e0c58b
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -39,9 +39,9 @@
(LAST_ARM_REGNUM  15)   ;
(CC_REGNUM   100)   ; Condition code pseudo register
(VFPCC_REGNUM101)   ; VFP Condition code pseudo register
-   (APSRQ_REGNUM104)   ; Q bit pseudo register
-   (APSRGE_REGNUM   105)   ; GE bits pseudo register
-   (VPR_REGNUM  106)   ; Vector Predication Register - MVE register.
+   (VPR_REGNUM  104)   ; Vector Predication Register - MVE register.
+   (APSRQ_REGNUM105)   ; Q bit pseudo register
+   (APSRGE_REGNUM   106)   ; GE bits pseudo register
   ]
 )
 ;; 3rd operand to select_dominance_cc_mode
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
1d357180ba9ddb26347b55cde625903bdb09eef6..c8d9b6471634725cea9bab3f9fa145810b506938
 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/co

[PATCH][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch is part of the MVE ACLE intrinsics framework.
This patch adds support to update (read/write) the APSR (Application Program
Status Register) and the FPSCR (Floating-point Status and Control Register)
for MVE.
This patch also enables thumb2 mov RTL patterns for MVE.

Please refer to Arm reference manual [1] for more details.
[1] 
https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf?_ga=2.102521798.659307368.1572453718-1501600630.1548848914

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath

gcc/ChangeLog:

2019-11-11  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Add check to not 
allow
TARGET_HAVE_MVE for this pattern.
(thumb2_cmse_entry_return): Add TARGET_HAVE_MVE check to update APSR 
register.
* config/arm/unspecs.md (UNSPEC_GET_FPSCR): Define.
(VUNSPEC_GET_FPSCR): Remove.
* config/arm/vfp.md (thumb2_movhi_vfp): Add TARGET_HAVE_MVE check.
(thumb2_movhi_fp16): Add TARGET_HAVE_MVE check.
(thumb2_movsi_vfp): Add TARGET_HAVE_MVE check.
(movdi_vfp): Add TARGET_HAVE_MVE check.
(thumb2_movdf_vfp): Add TARGET_HAVE_MVE check.
(thumb2_movsfcc_vfp): Add TARGET_HAVE_MVE check.
(thumb2_movdfcc_vfp): Add TARGET_HAVE_MVE check.
(push_multi_vfp): Add TARGET_HAVE_MVE check.
(set_fpscr): Add TARGET_HAVE_MVE check.
(get_fpscr): Add TARGET_HAVE_MVE check.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 
809461a25da5a8058a8afce972dea0d3131effc0..81afd8fcdc1b0a82493dc0758bce16fa9e5fde20
 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -435,10 +435,10 @@
 (define_insn "*cmovsi_insn"
   [(set (match_operand:SI 0 "arm_general_register_operand" "=r,r,r,r,r,r,r")
(if_then_else:SI
-(match_operator 1 "arm_comparison_operator"
- [(match_operand 2 "cc_register" "") (const_int 0)])
-(match_operand:SI 3 "arm_reg_or_m1_or_1" "r, r,UM, r,U1,UM,U1")
-(match_operand:SI 4 "arm_reg_or_m1_or_1" "r,UM, r,U1, r,UM,U1")))]
+   (match_operator 1 "arm_comparison_operator"
+[(match_operand 2 "cc_register" "") (const_int 0)])
+   (match_operand:SI 3 "arm_reg_or_m1_or_1" "r, r,UM, r,U1,UM,U1")
+   (match_operand:SI 4 "arm_reg_or_m1_or_1" "r,UM, r,U1, r,UM,U1")))]
   "TARGET_THUMB2 && TARGET_COND_ARITH
&& (!((operands[3] == const1_rtx && operands[4] == constm1_rtx)
|| (operands[3] == constm1_rtx && operands[4] == const1_rtx)))"
@@ -540,7 +540,7 @@
  [(match_operand 4 "cc_register" "") (const_int 0)])
 (match_operand:SF 1 "s_register_operand" "0,r")
 (match_operand:SF 2 "s_register_operand" "r,0")))]
-  "TARGET_THUMB2 && TARGET_SOFT_FLOAT"
+  "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE"
   "@
it\\t%D3\;mov%D3\\t%0, %2
it\\t%d3\;mov%d3\\t%0, %1"
@@ -1226,7 +1226,7 @@
; added to clear the APSR and potentially the FPSCR if VFP is available, so
; we adapt the length accordingly.
(set (attr "length")
- (if_then_else (match_test "TARGET_HARD_FLOAT")
+ (if_then_else (match_test "TARGET_HARD_FLOAT || TARGET_HAVE_MVE")
   (const_int 34)
   (const_int 8)))
; We do not support predicate execution of returns from cmse_nonsecure_entry
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 
b3b4f8ee3e2d1bdad968a9dd8ccbc72ded274f48..ac7fe7d0af19f1965356d47d8327e24d410b99bd
 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -170,6 +170,7 @@
   UNSPEC_TORC  ; Used by the intrinsic form of the iWMMXt TORC 
instruction.
   UNSPEC_TORVSC; Used by the intrinsic form of the iWMMXt 
TORVSC instruction.
   UNSPEC_TEXTRC; Used by the intrinsic form of the iWMMXt 
TEXTRC instruction.
+  UNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
 ])
 
 
@@ -216,7 +217,6 @@
   VUNSPEC_SLX  ; Represent a store-register-release-exclusive.
   VUNSPEC_LDA  ; Represent a store-register-acquire.
   VUNSPEC_STL  ; Represent a store-register-release.
-  VUNSPEC_GET_FPSCR; Represent fetch of FPSCR content.
   VUNSPEC_SET_FPSCR; Represent assign of FPSCR content.
   VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
   VUNSPEC_CDP  ; Represent the coprocessor cdp instruction.
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index 
6349c0570540ec25a599166f5d427fcbdbf2af68..461a5d71ca8548cfc61c83f9716249425633ad21
 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -74,10 +74,10 @@
 (define_insn "*thumb2_movhi_vfp"
  [(set
(match_operand:HI 0 "nonimmediate_operand"
-"=rk, r, l, r, m, r, *t, r, *t")
+"=rk, r, l, r, m, r, *

[PATCH][ARM][GCC][3/1x]: MVE intrinsics with unary operand.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch supports following MVE ACLE intrinsics with unary operand.

vdupq_n_s8, vdupq_n_s16, vdupq_n_s32, vabsq_s8, vabsq_s16, vabsq_s32, vclsq_s8, 
vclsq_s16, vclsq_s32, vclzq_s8, vclzq_s16, vclzq_s32, vnegq_s8, vnegq_s16, 
vnegq_s32, vaddlvq_s32, vaddvq_s8, vaddvq_s16, vaddvq_s32, vmovlbq_s8, 
vmovlbq_s16, vmovltq_s8, vmovltq_s16, vmvnq_s8, vmvnq_s16, vmvnq_s32, 
vrev16q_s8, vrev32q_s8, vrev32q_s16, vqabsq_s8, vqabsq_s16, vqabsq_s32, 
vqnegq_s8, vqnegq_s16, vqnegq_s32, vcvtaq_s16_f16, vcvtaq_s32_f32, 
vcvtnq_s16_f16, vcvtnq_s32_f32, vcvtpq_s16_f16, vcvtpq_s32_f32, vcvtmq_s16_f16, 
vcvtmq_s32_f32, vmvnq_u8, vmvnq_u16, vmvnq_u32, vdupq_n_u8, vdupq_n_u16, 
vdupq_n_u32, vclzq_u8, vclzq_u16, vclzq_u32, vaddvq_u8, vaddvq_u16, vaddvq_u32, 
vrev32q_u8, vrev32q_u16, vmovltq_u8, vmovltq_u16, vmovlbq_u8, vmovlbq_u16, 
vrev16q_u8, vaddlvq_u32, vcvtpq_u16_f16, vcvtpq_u32_f32, vcvtnq_u16_f16, 
vcvtmq_u16_f16, vcvtmq_u32_f32, vcvtaq_u16_f16, vcvtaq_u32_f32, vdupq_n, vabsq, 
vclsq, vclzq, vnegq, vaddlvq, vaddvq, vmovlbq, vmovltq, vmvnq, vrev16q, 
vrev32q, vqabsq, vqnegq.

A new register class "EVEN_REGS", which allows only even registers, is added
in this patch.

The new constraint "e" allows only registers of the EVEN_REGS class.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm.h (enum reg_class): Define new class EVEN_REGS.
* config/arm/arm_mve.h (vdupq_n_s8): Define macro.
(vdupq_n_s16): Likewise.
(vdupq_n_s32): Likewise.
(vabsq_s8): Likewise.
(vabsq_s16): Likewise.
(vabsq_s32): Likewise.
(vclsq_s8): Likewise.
(vclsq_s16): Likewise.
(vclsq_s32): Likewise.
(vclzq_s8): Likewise.
(vclzq_s16): Likewise.
(vclzq_s32): Likewise.
(vnegq_s8): Likewise.
(vnegq_s16): Likewise.
(vnegq_s32): Likewise.
(vaddlvq_s32): Likewise.
(vaddvq_s8): Likewise.
(vaddvq_s16): Likewise.
(vaddvq_s32): Likewise.
(vmovlbq_s8): Likewise.
(vmovlbq_s16): Likewise.
(vmovltq_s8): Likewise.
(vmovltq_s16): Likewise.
(vmvnq_s8): Likewise.
(vmvnq_s16): Likewise.
(vmvnq_s32): Likewise.
(vrev16q_s8): Likewise.
(vrev32q_s8): Likewise.
(vrev32q_s16): Likewise.
(vqabsq_s8): Likewise.
(vqabsq_s16): Likewise.
(vqabsq_s32): Likewise.
(vqnegq_s8): Likewise.
(vqnegq_s16): Likewise.
(vqnegq_s32): Likewise.
(vcvtaq_s16_f16): Likewise.
(vcvtaq_s32_f32): Likewise.
(vcvtnq_s16_f16): Likewise.
(vcvtnq_s32_f32): Likewise.
(vcvtpq_s16_f16): Likewise.
(vcvtpq_s32_f32): Likewise.
(vcvtmq_s16_f16): Likewise.
(vcvtmq_s32_f32): Likewise.
(vmvnq_u8): Likewise.
(vmvnq_u16): Likewise.
(vmvnq_u32): Likewise.
(vdupq_n_u8): Likewise.
(vdupq_n_u16): Likewise.
(vdupq_n_u32): Likewise.
(vclzq_u8): Likewise.
(vclzq_u16): Likewise.
(vclzq_u32): Likewise.
(vaddvq_u8): Likewise.
(vaddvq_u16): Likewise.
(vaddvq_u32): Likewise.
(vrev32q_u8): Likewise.
(vrev32q_u16): Likewise.
(vmovltq_u8): Likewise.
(vmovltq_u16): Likewise.
(vmovlbq_u8): Likewise.
(vmovlbq_u16): Likewise.
(vrev16q_u8): Likewise.
(vaddlvq_u32): Likewise.
(vcvtpq_u16_f16): Likewise.
(vcvtpq_u32_f32): Likewise.
(vcvtnq_u16_f16): Likewise.
(vcvtmq_u16_f16): Likewise.
(vcvtmq_u32_f32): Likewise.
(vcvtaq_u16_f16): Likewise.
(vcvtaq_u32_f32): Likewise.
(__arm_vdupq_n_s8): Define intrinsic.
(__arm_vdupq_n_s16): Likewise.
(__arm_vdupq_n_s32): Likewise.
(__arm_vabsq_s8): Likewise.
(__arm_vabsq_s16): Likewise.
(__arm_vabsq_s32): Likewise.
(__arm_vclsq_s8): Likewise.
(__arm_vclsq_s16): Likewise.
(__arm_vclsq_s32): Likewise.
(__arm_vclzq_s8): Likewise.
(__arm_vclzq_s16): Likewise.
(__arm_vclzq_s32): Likewise.
(__arm_vnegq_s8): Likewise.
(__arm_vnegq_s16): Likewise.
(__arm_vnegq_s32): Likewise.
(__arm_vaddlvq_s32): Likewise.
(__arm_vaddvq_s8): Likewise.
(__arm_vaddvq_s16): Likewise.
(__arm_vaddvq_s32): Likewise.
(__arm_vmovlbq_s8): Likewise.
(__arm_vmovlbq_s16): Likewise.
(__arm_vmovltq_s8): Likewise.
(__arm_vmovltq_s16): Likewise.
(__arm_vmvnq_s8): Likewise.
(__arm_vmvnq_s16): Likewise.
(__arm_vmvnq_s32): Likewise.
(__ar

[PATCH][ARM][GCC][3/x]: MVE ACLE intrinsics framework patch.

2019-11-14 Thread Srinath Parvathaneni
Hello,

This patch is part of the MVE ACLE intrinsics framework.

The patch supports the use of emulation for the double-precision arithmetic
operations for MVE. These changes support the MVE ACLE intrinsics that
operate on vector floating-point arithmetic operations.

Please refer to Arm reference manual [1] for more details.
[1] 
https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf?_ga=2.102521798.659307368.1572453718-1501600630.1548848914

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-11  Andre Vieira  
Srinath Parvathaneni  

* config/arm/arm.c (arm_libcall_uses_aapcs_base): Modify function to add
emulator calls for double-precision arithmetic operations for MVE.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
6faed76206b93c1a9dea048e2f693dc16ee58072..358b2638b65a2007d1c7e8062844b67682597f45
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -5658,9 +5658,25 @@ arm_libcall_uses_aapcs_base (const_rtx libcall)
   /* Values from double-precision helper functions are returned in core
 registers if the selected core only supports single-precision
 arithmetic, even if we are using the hard-float ABI.  The same is
-true for single-precision helpers, but we will never be using the
-hard-float ABI on a CPU which doesn't support single-precision
-operations in hardware.  */
+true for single-precision helpers except in case of MVE, because in
+MVE we will be using the hard-float ABI on a CPU which doesn't support
+single-precision operations in hardware.  In MVE the following check
+enables use of emulation for the double-precision arithmetic
+operations.  */
+  if (TARGET_HAVE_MVE)
+   {
+ add_libcall (libcall_htab, optab_libfunc (add_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (sdiv_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (smul_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (neg_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (sub_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (eq_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (lt_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (le_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (ge_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (gt_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (unord_optab, SFmode));
+   }
   add_libcall (libcall_htab, optab_libfunc (add_optab, DFmode));
   add_libcall (libcall_htab, optab_libfunc (sdiv_optab, DFmode));
   add_libcall (libcall_htab, optab_libfunc (smul_optab, DFmode));


[committed] operator_abs::fold_range() returning incorrect result for overflows (pr92506)

2019-11-14 Thread Andrew MacLeod
Traced it back to a typo in operator_abs::fold_range() from when I did the
conversion, where the wrong line got copied in.


Instead of returning value_range (type) when an overflow happens, it was
returning the same result as the previous check, which was the case for
all positives.
This had EVRP setting the range of an ABS to [-MIN, -1] instead of 
varying, which later caused VRP to intersect that with 0 - [-MIN, -1] 
and all heck broke loose. doh.


I also stumbled across a case where we should be starting with undefined 
in the default fold_range() and building with union for each sub-range.  
We previously declared a local value_range to work with, and that 
defaulted to undefined.  When I changed it to a reference parameter, I
needed to explicitly initialize it.


Bootstraps, checked in as revision 278259.

Andrew

2019-11-14  Andrew MacLeod  

	* range-op.cc (range_operator::fold_range): Start with range undefined.
	(operator_abs::wi_fold): Fix wrong line copy... With wrapv, abs with
	overflow is varying.

Index: range-op.cc
===
*** range-op.cc	(revision 277979)
--- range-op.cc	(working copy)
*** range_operator::fold_range (value_range
*** 146,151 
--- 146,152 
  return;
  
value_range tmp;
+   r.set_undefined ();
for (unsigned x = 0; x < lh.num_pairs (); ++x)
  for (unsigned y = 0; y < rh.num_pairs (); ++y)
{
*** operator_abs::wi_fold (value_range &r, t
*** 2359,2365 
wide_int max_value = wi::max_value (prec, sign);
if (!TYPE_OVERFLOW_UNDEFINED (type) && wi::eq_p (lh_lb, min_value))
  {
!   r = value_range (type, lh_lb, lh_ub);
return;
  }
  
--- 2360,2366 
wide_int max_value = wi::max_value (prec, sign);
if (!TYPE_OVERFLOW_UNDEFINED (type) && wi::eq_p (lh_lb, min_value))
  {
!   r = value_range (type);
return;
  }
  


Re: Tweak gcc.dg/vect/bb-slp-4[01].c (PR92366)

2019-11-14 Thread Richard Biener
On November 14, 2019 7:10:10 PM GMT+01:00, Richard Sandiford 
 wrote:
>gcc.dg/vect/bb-slp-40.c was failing on some targets because the
>explicit dg-options overrode things like -maltivec.  This patch
>uses dg-additional-options instead.
>
>Also, it seems safer not to require exactly 1 instance of each message,
>since that depends on the target vector length.
>
>gcc.dg/vect/bb-slp-41.c contained invariant constructors that are
>vectorised on AArch64 (foo) and constructors that aren't (bar).
>This meant that the number of times we print "Found vectorizable
>constructor" depended on how many vector sizes we try, since we'd
>print it for each failed attempt.
>
>In foo, we create invariant { b[0], ... } and { b[1], ... },
>and the test is making sure that the two separate invariant vectors
>can be fed from the same vector load at b.  This is a different case
>from bb-slp-40.c, where the constructors are naturally separate.
>(The expected count is 4 rather than 2 because we can vectorise the
>epilogue too.)
>
>However, due to limitations in the loop vectoriser, we still do the
>addition of { b[0], ... } and { b[1], ... } in the loop.  Hopefully
>that'll be fixed at some point, so this patch adds an alternative test
>that directly needs 4 separate invariant constructors.  E.g. with
>Joel's
>SLP optimisation, the new test generates:
>
>ldr q4, [x1]
>dup v7.4s, v4.s[0]
>dup v6.4s, v4.s[1]
>dup v5.4s, v4.s[2]
>dup v4.4s, v4.s[3]
>
>instead of the somewhat bizarre:
>
>ldp s6, s5, [x1, 4]
>ldr s4, [x1, 12]
>ld1r{v7.4s}, [x1]
>dup v6.4s, v6.s[0]
>dup v5.4s, v5.s[0]
>dup v4.4s, v4.s[0]
>
>The patch then disables vectorisation of the original foo in
>bb-vect-slp-41.c, so that we get the same correctness testing
>for bar but don't need to test for specific counts.
>
>Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
>OK to install?

Ok. 

Richard. 

>Richard
>
>
>2019-11-14  Richard Sandiford  
>
>gcc/testsuite/
>   PR testsuite/92366
>   * gcc.dg/vect/bb-slp-40.c: Use dg-additional-options instead
>   of dg-options.  Remove expected counts.
>   * gcc.dg/vect/bb-slp-41.c: Remove dg-options and explicit
>   dg-do run.  Suppress vectorization of foo.
>   * gcc.dg/vect/bb-slp-42.c: New test.
>
>Index: gcc/testsuite/gcc.dg/vect/bb-slp-40.c
>===
>--- gcc/testsuite/gcc.dg/vect/bb-slp-40.c  2019-11-04 21:13:57.363758109
>+
>+++ gcc/testsuite/gcc.dg/vect/bb-slp-40.c  2019-11-14 18:08:36.323546916
>+
>@@ -1,5 +1,5 @@
> /* { dg-do compile } */
>-/* { dg-options "-O3 -fdump-tree-slp-all" } */
>+/* { dg-additional-options "-fvect-cost-model=dynamic" } */
> /* { dg-require-effective-target vect_int } */
> 
> char g_d[1024], g_s1[1024], g_s2[1024];
>@@ -30,5 +30,5 @@ void foo(void)
> }
> 
> /* See that we vectorize an SLP instance.  */
>-/* { dg-final { scan-tree-dump-times "Found vectorizable constructor"
>1 "slp1" } } */
>-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1
>"slp1" } } */
>+/* { dg-final { scan-tree-dump "Found vectorizable constructor" "slp1"
>} } */
>+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "slp1" }
>} */
>Index: gcc/testsuite/gcc.dg/vect/bb-slp-41.c
>===
>--- gcc/testsuite/gcc.dg/vect/bb-slp-41.c  2019-11-04 21:13:57.363758109
>+
>+++ gcc/testsuite/gcc.dg/vect/bb-slp-41.c  2019-11-14 18:08:36.323546916
>+
>@@ -1,10 +1,9 @@
>-/* { dg-do run } */
>-/* { dg-options "-O3 -fdump-tree-slp-all -fno-vect-cost-model" } */
> /* { dg-require-effective-target vect_int } */
> 
> #define ARR_SIZE 1000
> 
>-void foo (int *a, int *b)
>+void __attribute__((optimize (0)))
>+foo (int *a, int *b)
> {
>   int i;
>   for (i = 0; i < (ARR_SIZE - 2); ++i)
>@@ -56,6 +55,4 @@ int main ()
>   return 0;
> 
> }
>-/* See that we vectorize an SLP instance.  */
>-/* { dg-final { scan-tree-dump-times "Found vectorizable constructor"
>12 "slp1" } } */
>-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4
>"slp1" } } */
>+/* { dg-final { scan-tree-dump-not "vectorizing stmts using SLP"
>"slp1" } } */
>Index: gcc/testsuite/gcc.dg/vect/bb-slp-42.c
>===
>--- /dev/null  2019-09-17 11:41:18.176664108 +0100
>+++ gcc/testsuite/gcc.dg/vect/bb-slp-42.c  2019-11-14 18:08:36.323546916
>+
>@@ -0,0 +1,49 @@
>+/* { dg-require-effective-target vect_int } */
>+/* { dg-require-effective-target vect_perm } */
>+
>+#include "tree-vect.h"
>+
>+#define ARR_SIZE 1024
>+
>+void __attribute__((noipa))
>+foo (int a[][ARR_SIZE], int *b)
>+{
>+  int i;
>+  for (i = 0; i < ARR_SIZE; ++i)
>+{
>+  a[0][i] += b[0];
>+  a[1][i] += b[1];
>+  a[2][i] += b[2];
>+  a[3][i] += b[3];
>+}
>+}
>+
>+int
>+main

[PATCH v2] Extend the simd function attribute

2019-11-14 Thread Szabolcs Nagy
GCC currently supports two ways to declare the availability of vector
variants of a scalar function:

  #pragma omp declare simd
  void f (void);

and

  __attribute__ ((simd))
  void f (void);

However these declare a set of symbols that are different simd variants
of f, so a library either provides definitions for all those symbols or
it cannot use these declarations. (The set of declared symbols can be
narrowed down with additional omp clauses, but not enough to allow
declaring a single symbol.)

OpenMP 5 has a declare variant feature that allows declaring more
specific simd variants, but it is complicated and still doesn't provide
a reliable mechanism (requires gcc or vendor specific extension for
unambiguous declarations). And it requires -fopenmp.

A simpler approach is to extend the gcc specific simd attribute such
that it can specify a single vector variant of simple scalar functions.
Here, simple scalar functions are ones that only take and return scalar
integer or floating-point values. I believe this can be achieved by

  __attribute__ ((simd (mask, simdlen, simdabi, name)))

where mask is "inbranch" or "notinbranch" like now, simdlen is an int
with the same meaning as in omp declare simd and simdabi is a string
specifying the call ABI (which the intel vector ABI calls ISA). The
name is optional and allows a library to use a different symbol name
than what the vector ABI specifies.
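For illustration, a single-variant library declaration under the proposed syntax might look like the following. The vector symbol name and the ABI string are illustrative assumptions only, not part of any shipped ABI, and the attribute form requires the patched compiler:

```c
/* Hypothetical declaration under the proposed extended attribute:
   the library advertises exactly one 2-lane, not-in-branch vector
   variant of its scalar sin, under a name of its own choosing.  */
__attribute__ ((simd ("notinbranch", 2, "simd", "_ZGVnN2v_libm_sin")))
double sin (double x);
```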

The simd attribute can currently be used for both declarations and
definitions; in the latter case the simd variants of the function are
generated, which should work with the extended simd attribute too.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.


gcc/ChangeLog:

2019-11-14  Szabolcs Nagy  

* cgraph.h (struct cgraph_simd_clone): Add simdname field.
* doc/extend.texi: Update the simd attribute documentation.
* tree.h (OMP_CLAUSE__SIMDABI__EXPR): Define.
(OMP_CLAUSE__SIMDNAME__EXPR): Define.
* tree.c (walk_tree_1): Handle new omp clauses.
* tree-core.h (enum omp_clause_code): Likewise.
* tree-nested.c (convert_nonlocal_omp_clauses): Likewise.
* tree-pretty-print.c (dump_omp_clause): Likewise.
* omp-low.c (scan_sharing_clauses): Likewise.
* omp-simd-clone.c (simd_clone_clauses_extract): Likewise.
(simd_clone_mangle): Handle simdname.
* config/aarch64/aarch64.c
(aarch64_simd_clone_compute_vecsize_and_simdlen): Warn about
unsupported SIMD ABI.
* config/i386/i386.c
(ix86_simd_clone_compute_vecsize_and_simdlen): Likewise.

gcc/c-family/ChangeLog:

2019-11-14  Szabolcs Nagy  

* c-attribs.c (handle_simd_attribute): Handle 4 arguments.

gcc/testsuite/ChangeLog:

2019-11-14  Szabolcs Nagy  

* c-c++-common/attr-simd-5.c: Update.
* c-c++-common/attr-simd-6.c: New test.
* c-c++-common/attr-simd-7.c: New test.
* c-c++-common/attr-simd-8.c: New test.

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index c62cebf7bfd..bf2301eb790 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -448,7 +448,7 @@ const struct attribute_spec c_common_attribute_table[] =
 			  handle_omp_declare_variant_attribute, NULL },
   { "omp declare variant variant", 0, -1, true,  false, false, false,
 			  handle_omp_declare_variant_attribute, NULL },
-  { "simd",		  0, 1, true,  false, false, false,
+  { "simd",		  0, 4, true,  false, false, false,
 			  handle_simd_attribute, NULL },
   { "omp declare target", 0, -1, true, false, false, false,
 			  handle_omp_declare_target_attribute, NULL },
@@ -3094,13 +3094,22 @@ handle_simd_attribute (tree *node, tree name, tree args, int, bool *no_add_attrs
 {
   tree t = get_identifier ("omp declare simd");
   tree attr = NULL_TREE;
+
+  /* Allow
+	  simd
+	  simd (mask)
+	  simd (mask, simdlen)
+	  simd (mask, simdlen, simdabi)
+	  simd (mask, simdlen, simdabi, name)
+	 forms.  */
+
   if (args)
 	{
 	  tree id = TREE_VALUE (args);
 
 	  if (TREE_CODE (id) != STRING_CST)
 	{
-	  error ("attribute %qE argument not a string", name);
+	  error ("attribute %qE first argument not a string", name);
 	  *no_add_attrs = true;
 	  return NULL_TREE;
 	}
@@ -3113,13 +3122,75 @@ handle_simd_attribute (tree *node, tree name, tree args, int, bool *no_add_attrs
  OMP_CLAUSE_INBRANCH);
 	  else
 	{
-	  error ("only % and % flags are "
-		 "allowed for %<__simd__%> attribute");
+	  error ("%qE attribute first argument must be % or "
+		 "%", name);
+	  *no_add_attrs = true;
+	  return NULL_TREE;
+	}
+
+	  args = TREE_CHAIN (args);
+	}
+
+  if (args)
+	{
+	  tree arg = TREE_VALUE (args);
+
+	  arg = c_fully_fold (arg, false, NULL);
+	  if (TREE_CODE (arg) != INTEGER_CST
+	  || !INTEGRAL_TYPE_P (TREE_TYPE (arg))
+	  || tree_int_cst_sgn (arg) < 0)
+	{
+	  error ("%qE attribute second argument m

[SVE] PR89007 - Implement generic vector average expansion

2019-11-14 Thread Prathamesh Kulkarni
Hi,
As suggested in the PR, the attached patch falls back to distributing the
rshift over plus_expr, instead of the widening -> arithmetic -> narrowing
fallback sequence, when target support is not available.
Bootstrap+tested on x86_64-unknown-linux-gnu and aarch64-linux-gnu.
OK to commit ?

Thanks,
Prathamesh
2019-11-15  Prathamesh Kulkarni  

PR tree-optimization/89007
* tree-vect-patterns.c (vect_recog_average_pattern): If there is no
target support available, generate code to distribute rshift over plus
and add one depending upon floor or ceil rounding.

testsuite/
* gcc.target/aarch64/sve/pr89007.c: New test.

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr89007.c 
b/gcc/testsuite/gcc.target/aarch64/sve/pr89007.c
new file mode 100644
index 000..b682f3f3b74
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr89007.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#define N 1024
+unsigned char dst[N];
+unsigned char in1[N];
+unsigned char in2[N];
+
+void
+foo ()
+{
+  for( int x = 0; x < N; x++ )
+dst[x] = (in1[x] + in2[x] + 1) >> 1;
+}
+
+/* { dg-final { scan-assembler-not {\tuunpklo\t} } } */
+/* { dg-final { scan-assembler-not {\tuunpkhi\t} } } */
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 8ebbcd76b64..7025a3b4dc2 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -2019,22 +2019,59 @@ vect_recog_average_pattern (stmt_vec_info 
last_stmt_info, tree *type_out)
 
   /* Check for target support.  */
   tree new_vectype = get_vectype_for_scalar_type (vinfo, new_type);
-  if (!new_vectype
-  || !direct_internal_fn_supported_p (ifn, new_vectype,
- OPTIMIZE_FOR_SPEED))
+
+  if (!new_vectype)
 return NULL;
 
+  bool ifn_supported
+= direct_internal_fn_supported_p (ifn, new_vectype, OPTIMIZE_FOR_SPEED);
+
   /* The IR requires a valid vector type for the cast result, even though
  it's likely to be discarded.  */
   *type_out = get_vectype_for_scalar_type (vinfo, type);
   if (!*type_out)
 return NULL;
 
-  /* Generate the IFN_AVG* call.  */
   tree new_var = vect_recog_temp_ssa_var (new_type, NULL);
   tree new_ops[2];
   vect_convert_inputs (last_stmt_info, 2, new_ops, new_type,
   unprom, new_vectype);
+
+  if (!ifn_supported)
+{
+  /* If there is no target support available, generate code
+to distribute rshift over plus and add one depending
+upon floor or ceil rounding.  */
+
+  tree one_cst = build_one_cst (new_type);
+
+  tree tmp1 = vect_recog_temp_ssa_var (new_type, NULL);
+  gassign *g1 = gimple_build_assign (tmp1, RSHIFT_EXPR, new_ops[0], 
one_cst);
+
+  tree tmp2 = vect_recog_temp_ssa_var (new_type, NULL);
+  gassign *g2 = gimple_build_assign (tmp2, RSHIFT_EXPR, new_ops[1], 
one_cst);
+
+  tree tmp3 = vect_recog_temp_ssa_var (new_type, NULL);
+  gassign *g3 = gimple_build_assign (tmp3, PLUS_EXPR, tmp1, tmp2);
+  
+  tree tmp4 = vect_recog_temp_ssa_var (new_type, NULL);
+  tree_code c = (ifn == IFN_AVG_CEIL) ? BIT_IOR_EXPR : BIT_AND_EXPR;
+  gassign *g4 = gimple_build_assign (tmp4, c, new_ops[0], new_ops[1]);
+ 
+  tree tmp5 = vect_recog_temp_ssa_var (new_type, NULL);
+  gassign *g5 = gimple_build_assign (tmp5, BIT_AND_EXPR, tmp4, one_cst);
+
+  gassign *g6 = gimple_build_assign (new_var, PLUS_EXPR, tmp3, tmp5);
+
+  append_pattern_def_seq (last_stmt_info, g1, new_vectype);
+  append_pattern_def_seq (last_stmt_info, g2, new_vectype);
+  append_pattern_def_seq (last_stmt_info, g3, new_vectype);
+  append_pattern_def_seq (last_stmt_info, g4, new_vectype);
+  append_pattern_def_seq (last_stmt_info, g5, new_vectype);
+  return vect_convert_output (last_stmt_info, type, g6, new_vectype);
+}
+
+  /* Generate the IFN_AVG* call.  */
   gcall *average_stmt = gimple_build_call_internal (ifn, 2, new_ops[0],
new_ops[1]);
   gimple_call_set_lhs (average_stmt, new_var);


[Patch] [mid-end][__RTL] Clean df state despite invalid __RTL startwith passes

2019-11-14 Thread Matthew Malcomson
Hi there,

When compiling an __RTL function that has an invalid "startwith" pass we
currently don't run the dfinish cleanup pass. This means we ICE on the next
function.

This change ensures that all state is cleaned up for the next function
to run correctly.

As an example, before this change the following code would ICE when compiling
the function `foo2` because the "peephole2" pass is not run at optimisation
level -O0.

When compiled with
./aarch64-none-linux-gnu-gcc -O0 -S missed-pass-error.c -o test.s

```
int __RTL (startwith ("peephole2")) badfoo ()
{
(function "badfoo"
  (insn-chain
(block 2
  (edge-from entry (flags "FALLTHRU"))
  (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
  (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
  (cinsn 10 (use (reg/i:SI x19)))
  (edge-to exit (flags "FALLTHRU"))
) ;; block 2
  ) ;; insn-chain
) ;; function "foo2"
}

int __RTL (startwith ("final")) foo2 ()
{
(function "foo2"
  (insn-chain
(block 2
  (edge-from entry (flags "FALLTHRU"))
  (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
  (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
  (cinsn 10 (use (reg/i:SI x19)))
  (edge-to exit (flags "FALLTHRU"))
) ;; block 2
  ) ;; insn-chain
) ;; function "foo2"
}
```

Now it silently ignores the __RTL function and successfully compiles foo2.

regtest done on aarch64
regtest done on x86_64

OK for trunk?

gcc/ChangeLog:

2019-11-14  Matthew Malcomson  

* passes.c (should_skip_pass_p): Always run "dfinish".

gcc/testsuite/ChangeLog:

2019-11-14  Matthew Malcomson  

* gcc.dg/rtl/aarch64/missed-pass-error.c: New test.



### Attachment also inlined for ease of reply###


diff --git a/gcc/passes.c b/gcc/passes.c
index 
d86af115ecb16fcab6bfce070f1f3e4f1d90ce71..258f85ab4f8a1519b978b75dfa67536d2eacd106
 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -2375,7 +2375,8 @@ should_skip_pass_p (opt_pass *pass)
 return false;
 
   /* Don't skip df init; later RTL passes need it.  */
-  if (strstr (pass->name, "dfinit") != NULL)
+  if (strstr (pass->name, "dfinit") != NULL
+  || strstr (pass->name, "dfinish") != NULL)
 return false;
 
   if (!quiet_flag)
diff --git a/gcc/testsuite/gcc.dg/rtl/aarch64/missed-pass-error.c 
b/gcc/testsuite/gcc.dg/rtl/aarch64/missed-pass-error.c
new file mode 100644
index 
..2f02ca9d0c40b372d86b24009540e157ed1a8c59
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/rtl/aarch64/missed-pass-error.c
@@ -0,0 +1,45 @@
+/* { dg-do compile { target aarch64-*-* } } */
+/* { dg-additional-options "-O0" } */
+
+/*
+   When compiling __RTL functions the startwith string can be either incorrect
+   (i.e. not matching a pass) or be unused (i.e. can refer to a pass that is
+   not run at the current optimisation level).
+
+   Here we ensure that the state clean up is still run, so that functions other
+   than the faulty one can still be compiled.
+ */
+
+int __RTL (startwith ("peephole2")) badfoo ()
+{
+(function "badfoo"
+  (insn-chain
+(block 2
+  (edge-from entry (flags "FALLTHRU"))
+  (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
+  (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
+  (cinsn 10 (use (reg/i:SI x19)))
+  (edge-to exit (flags "FALLTHRU"))
+) ;; block 2
+  ) ;; insn-chain
+) ;; function "foo2"
+}
+
+/* Compile a valid __RTL function to test state from the "dfinit" pass has been
+   cleaned with the "dfinish" pass.  */
+
+int __RTL (startwith ("final")) foo2 ()
+{
+(function "foo2"
+  (insn-chain
+(block 2
+  (edge-from entry (flags "FALLTHRU"))
+  (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
+  (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
+  (cinsn 10 (use (reg/i:SI x19)))
+  (edge-to exit (flags "FALLTHRU"))
+) ;; block 2
+  ) ;; insn-chain
+) ;; function "foo2"
+}
+


[mid-end][__RTL] Clean state despite unspecified __RTL startwith passes

2019-11-14 Thread Matthew Malcomson
Hi there,

When compiling an __RTL function that has an unspecified "startwith"
pass, we currently don't run the cleanup pass; this means that we ICE on
the next function (if it's a basic function).
I asked about this on the GCC mailing list a while ago and Richard mentioned
it might be a good idea to clear bad state so it doesn't leak to other
functions.
https://gcc.gnu.org/ml/gcc/2019-02/msg00106.html

This change ensures that the clean_state pass is run even if the
startwith pass is unspecified.

We also ensure the name of the startwith pass is always freed correctly.

As an example, before this change the following code would ICE when compiling
the function `foo_a`.

When compiled with
./aarch64-none-linux-gnu-gcc -O0 -S unspecified-pass-error.c -o test.s

```
int __RTL () badfoo ()
{
(function "badfoo"
  (insn-chain
(block 2
  (edge-from entry (flags "FALLTHRU"))
  (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
  (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
  (cinsn 10 (use (reg/i:SI x19)))
  (edge-to exit (flags "FALLTHRU"))
) ;; block 2
  ) ;; insn-chain
) ;; function "foo2"
}

int
foo_a ()
{
  return 200;
}
```

Now it silently ignores the __RTL function and successfully compiles foo_a.

regtest done on aarch64
regtest done on x86_64

OK for trunk?

gcc/ChangeLog:

2019-11-14  Matthew Malcomson  

* run-rtl-passes.c (run_rtl_passes): Accept and handle empty
"initial_pass_name" argument -- by running "*clean_state" pass.
Also free the "initial_pass_name" when done.

gcc/c/ChangeLog:

2019-11-14  Matthew Malcomson  

* c-parser.c (c_parser_parse_rtl_body): Always call
run_rtl_passes, even if startwith pass is not provided.

gcc/testsuite/ChangeLog:

2019-11-14  Matthew Malcomson  

* gcc.dg/rtl/aarch64/unspecified-pass-error.c: New test.



### Attachment also inlined for ease of reply###


diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9589cc68c25b5b15bb364fdae56e24dedbe91601..05485833d306cd79c5405543175b63c2e7e62538 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -20868,11 +20868,9 @@ c_parser_parse_rtl_body (c_parser *parser, char *start_with_pass)
   return;
 }
 
- /*  If a pass name was provided for START_WITH_PASS, run the backend
- accordingly now, on the cfun created above, transferring
- ownership of START_WITH_PASS.  */
-  if (start_with_pass)
-run_rtl_passes (start_with_pass);
+ /*  Run the backend on the cfun created above, transferring ownership of
+ START_WITH_PASS.  */
+  run_rtl_passes (start_with_pass);
 }
 
 #include "gt-c-c-parser.h"
diff --git a/gcc/run-rtl-passes.c b/gcc/run-rtl-passes.c
index f65c0af6dfd48aa9ca7ec29b63d7cd4108432178..38765ebbc288e7aef35d7c02693efd534c6b2ddc 100644
--- a/gcc/run-rtl-passes.c
+++ b/gcc/run-rtl-passes.c
@@ -49,24 +49,31 @@ run_rtl_passes (char *initial_pass_name)
   switch_to_section (text_section);
   (*debug_hooks->assembly_start) ();
 
-  /* Pass "expand" normally sets this up.  */
+  if (initial_pass_name)
+{
+  /* Pass "expand" normally sets this up.  */
 #ifdef INSN_SCHEDULING
-  init_sched_attrs ();
+  init_sched_attrs ();
 #endif
+  bitmap_obstack_initialize (NULL);
+  bitmap_obstack_initialize (&reg_obstack);
+  opt_pass *rest_of_compilation
+   = g->get_passes ()->get_rest_of_compilation ();
+  gcc_assert (rest_of_compilation);
+  execute_pass_list (cfun, rest_of_compilation);
 
-  bitmap_obstack_initialize (NULL);
-  bitmap_obstack_initialize (&reg_obstack);
-
-  opt_pass *rest_of_compilation
-= g->get_passes ()->get_rest_of_compilation ();
-  gcc_assert (rest_of_compilation);
-  execute_pass_list (cfun, rest_of_compilation);
-
-  opt_pass *clean_slate = g->get_passes ()->get_clean_slate ();
-  gcc_assert (clean_slate);
-  execute_pass_list (cfun, clean_slate);
-
-  bitmap_obstack_release (&reg_obstack);
+  opt_pass *clean_slate = g->get_passes ()->get_clean_slate ();
+  gcc_assert (clean_slate);
+  execute_pass_list (cfun, clean_slate);
+  bitmap_obstack_release (&reg_obstack);
+}
+  else
+{
+  opt_pass *clean_slate = g->get_passes ()->get_clean_slate ();
+  gcc_assert (clean_slate);
+  execute_pass_list (cfun, clean_slate);
+}
 
   cfun->curr_properties |= PROP_rtl;
+  free (initial_pass_name);
 }
diff --git a/gcc/testsuite/gcc.dg/rtl/aarch64/unspecified-pass-error.c b/gcc/testsuite/gcc.dg/rtl/aarch64/unspecified-pass-error.c
new file mode 100644
index 0000000000000000000000000000000000000000..596501e977044132bd3e9a2d0afd0f8b2b789186
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/rtl/aarch64/unspecified-pass-error.c
@@ -0,0 +1,30 @@
+/* { dg-do compile { target aarch64-*-* } } */
+/* { dg-additional-options "-O0" } */
+
+/*
+   Ensure an __RTL function with an unspecified "startwith" pass doesn't cause
+   an assertion error on the next function.
+ */
+
+int __RTL () badfoo ()
+{
+(function "badfoo"
+  (insn-chain
+(block 2
+

Tweak gcc.dg/vect/bb-slp-4[01].c (PR92366)

2019-11-14 Thread Richard Sandiford
gcc.dg/vect/bb-slp-40.c was failing on some targets because the
explicit dg-options overrode things like -maltivec.  This patch
uses dg-additional-options instead.

Also, it seems safer not to require exactly 1 instance of each message,
since that depends on the target vector length.

gcc.dg/vect/bb-slp-41.c contained invariant constructors that are
vectorised on AArch64 (foo) and constructors that aren't (bar).
This meant that the number of times we print "Found vectorizable
constructor" depended on how many vector sizes we try, since we'd
print it for each failed attempt.

In foo, we create invariant { b[0], ... } and { b[1], ... },
and the test is making sure that the two separate invariant vectors
can be fed from the same vector load at b.  This is a different case
from bb-slp-40.c, where the constructors are naturally separate.
(The expected count is 4 rather than 2 because we can vectorise the
epilogue too.)
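
As a rough sketch (not the testcase itself, which appears in the patch below), the pattern under discussion looks like this: b[0] and b[1] are loop-invariant, so SLP materializes the invariant vectors { b[0], ... } and { b[1], ... }, and both can be fed from a single vector load at b:

```c
#include <assert.h>

/* Illustrative sketch only -- not the actual bb-slp-41.c testcase.
   b[0] and b[1] are loop-invariant, so the vectorizer builds the
   invariant vectors { b[0], ... } and { b[1], ... }, which can share
   one vector load from b.  */
void
foo (int *restrict a, const int *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = b[0] + b[1] * a[i];
}
```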

However, due to limitations in the loop vectoriser, we still do the
addition of { b[0], ... } and { b[1], ... } in the loop.  Hopefully
that'll be fixed at some point, so this patch adds an alternative test
that directly needs 4 separate invariant constructors.  E.g. with Joel's
SLP optimisation, the new test generates:

ldr q4, [x1]
dup v7.4s, v4.s[0]
dup v6.4s, v4.s[1]
dup v5.4s, v4.s[2]
dup v4.4s, v4.s[3]

instead of the somewhat bizarre:

ldp s6, s5, [x1, 4]
ldr s4, [x1, 12]
ld1r{v7.4s}, [x1]
dup v6.4s, v6.s[0]
dup v5.4s, v5.s[0]
dup v4.4s, v4.s[0]

The patch then disables vectorisation of the original foo in
bb-slp-41.c, so that we get the same correctness testing
for bar but don't need to test for specific counts.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
OK to install?

Richard


2019-11-14  Richard Sandiford  

gcc/testsuite/
PR testsuite/92366
* gcc.dg/vect/bb-slp-40.c: Use dg-additional-options instead
of dg-options.  Remove expected counts.
* gcc.dg/vect/bb-slp-41.c: Remove dg-options and explicit
dg-do run.  Suppress vectorization of foo.
* gcc.dg/vect/bb-slp-42.c: New test.

Index: gcc/testsuite/gcc.dg/vect/bb-slp-40.c
===
--- gcc/testsuite/gcc.dg/vect/bb-slp-40.c   2019-11-04 21:13:57.363758109 +0000
+++ gcc/testsuite/gcc.dg/vect/bb-slp-40.c   2019-11-14 18:08:36.323546916 +0000
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -fdump-tree-slp-all" } */
+/* { dg-additional-options "-fvect-cost-model=dynamic" } */
 /* { dg-require-effective-target vect_int } */
 
 char g_d[1024], g_s1[1024], g_s2[1024];
@@ -30,5 +30,5 @@ void foo(void)
 }
 
 /* See that we vectorize an SLP instance.  */
-/* { dg-final { scan-tree-dump-times "Found vectorizable constructor" 1 "slp1" } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "slp1" } } */
+/* { dg-final { scan-tree-dump "Found vectorizable constructor" "slp1" } } */
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "slp1" } } */
Index: gcc/testsuite/gcc.dg/vect/bb-slp-41.c
===
--- gcc/testsuite/gcc.dg/vect/bb-slp-41.c   2019-11-04 21:13:57.363758109 +0000
+++ gcc/testsuite/gcc.dg/vect/bb-slp-41.c   2019-11-14 18:08:36.323546916 +0000
@@ -1,10 +1,9 @@
-/* { dg-do run } */
-/* { dg-options "-O3 -fdump-tree-slp-all -fno-vect-cost-model" } */
 /* { dg-require-effective-target vect_int } */
 
 #define ARR_SIZE 1000
 
-void foo (int *a, int *b)
+void __attribute__((optimize (0)))
+foo (int *a, int *b)
 {
   int i;
   for (i = 0; i < (ARR_SIZE - 2); ++i)
@@ -56,6 +55,4 @@ int main ()
   return 0;
 
 }
-/* See that we vectorize an SLP instance.  */
-/* { dg-final { scan-tree-dump-times "Found vectorizable constructor" 12 "slp1" } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "slp1" } } */
+/* { dg-final { scan-tree-dump-not "vectorizing stmts using SLP" "slp1" } } */
Index: gcc/testsuite/gcc.dg/vect/bb-slp-42.c
===
--- /dev/null   2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.dg/vect/bb-slp-42.c   2019-11-14 18:08:36.323546916 +0000
@@ -0,0 +1,49 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_perm } */
+
+#include "tree-vect.h"
+
+#define ARR_SIZE 1024
+
+void __attribute__((noipa))
+foo (int a[][ARR_SIZE], int *b)
+{
+  int i;
+  for (i = 0; i < ARR_SIZE; ++i)
+{
+  a[0][i] += b[0];
+  a[1][i] += b[1];
+  a[2][i] += b[2];
+  a[3][i] += b[3];
+}
+}
+
+int
+main ()
+{
+  int a[4][ARR_SIZE];
+  int b[4];
+
+  check_vect ();
+
+  for (int i = 0; i < 4; ++i)
+{
+  b[i] = 20 * i;
+  for (int j = 0; j < ARR_SIZE; ++j)
+   a[i][j] = (i + 1) * ARR_SIZE - j;
+

[arm] Follow up for asm-flags vs thumb1

2019-11-14 Thread Richard Henderson
What I committed today does in fact ICE for thumb1, as you suspected.

I'm currently testing the following vs

  arm-sim/
  arm-sim/-mthumb
  arm-sim/-mcpu=cortex-a15/-mthumb.

which, with the default cpu for arm-elf-eabi, should test all of arm, thumb1,
thumb2.

I'm not thrilled about the ifdef in aarch-common.c, but I don't see a different
way to catch this case for arm and still compile for aarch64.

Ideas?

Particularly ones that work with __attribute__((target("thumb")))?  Which, now
that I've thought about it I really should be testing...


r~
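
For reference, a minimal sketch of the flag-output construct at issue, shown in its x86 spelling (`=@ccz` reads the zero flag) since that form is widely supported; the arm `=@cc<cond>` form is analogous, and it is that form which this patch makes thumb1 reject with sorry():

```c
#include <assert.h>

/* Sketch of the =@cc<cond> asm flag-output syntax, using the x86
   spelling for illustration: "=@ccz" captures the Z flag set by the
   test instruction.  On arm, the equivalent =@cc<cond> outputs are
   unavailable in thumb1 mode, hence the sorry() above.  */
static int
is_zero (int x)
{
  int z;
  __asm__ ("testl %1, %1" : "=@ccz" (z) : "r" (x));
  return z;
}
```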
gcc/
* config/arm/aarch-common.c (arm_md_asm_adjust): Sorry
for asm flags in thumb1 mode.
* config/arm/arm-c.c (arm_cpu_builtins): Do not define
__GCC_ASM_FLAG_OUTPUTS__ in thumb1 mode.
* doc/extend.texi (FlagOutputOperands): Document thumb1 restriction.

gcc/testsuite/
* gcc.target/arm/asm-flag-1.c: Skip if arm_thumb1.
* gcc.target/arm/asm-flag-3.c: Skip if arm_thumb1.
* gcc.target/arm/asm-flag-5.c: Skip if arm_thumb1.
* gcc.target/arm/asm-flag-6.c: Skip if arm_thumb1.


diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c
index 760ef6c9c0a..6f3db3838ba 100644
--- a/gcc/config/arm/aarch-common.c
+++ b/gcc/config/arm/aarch-common.c
@@ -544,6 +544,15 @@ arm_md_asm_adjust (vec<rtx> &outputs, vec<rtx> &/*inputs*/,
   if (strncmp (con, "=@cc", 4) != 0)
continue;
   con += 4;
+
+#ifdef TARGET_THUMB1
+  if (TARGET_THUMB1)
+   {
+ sorry ("asm flags not supported in thumb1 mode");
+ break;
+   }
+#endif
+
   if (strchr (con, ',') != NULL)
{
	  error ("alternatives not allowed in %<asm%> flag output");
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index c4485ce7af1..865c448d531 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -122,7 +122,9 @@ arm_cpu_builtins (struct cpp_reader* pfile)
   if (arm_arch_notm)
 builtin_define ("__ARM_ARCH_ISA_ARM");
   builtin_define ("__APCS_32__");
-  builtin_define ("__GCC_ASM_FLAG_OUTPUTS__");
+
+  if (!TARGET_THUMB1)
+builtin_define ("__GCC_ASM_FLAG_OUTPUTS__");
 
   def_or_undef_macro (pfile, "__thumb__", TARGET_THUMB);
   def_or_undef_macro (pfile, "__thumb2__", TARGET_THUMB2);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1c8ae0d5cd3..62a98e939c8 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9810,6 +9810,8 @@ signed greater than
 signed less than equal
 @end table
 
+The flag output constraints are not supported in thumb1 mode.
+
 @item x86 family
 The flag output constraints for the x86 family are of the form
 @samp{=@@cc@var{cond}} where @var{cond} is one of the standard
diff --git a/gcc/testsuite/gcc.target/arm/asm-flag-1.c b/gcc/testsuite/gcc.target/arm/asm-flag-1.c
index 9707ebfcebb..97104d3ac73 100644
--- a/gcc/testsuite/gcc.target/arm/asm-flag-1.c
+++ b/gcc/testsuite/gcc.target/arm/asm-flag-1.c
@@ -1,6 +1,7 @@
 /* Test the valid @cc asm flag outputs.  */
 /* { dg-do compile } */
 /* { dg-options "-O" } */
+/* { dg-skip-if "" { arm_thumb1 } } */
 
 #ifndef __GCC_ASM_FLAG_OUTPUTS__
 #error "missing preprocessor define"
diff --git a/gcc/testsuite/gcc.target/arm/asm-flag-3.c b/gcc/testsuite/gcc.target/arm/asm-flag-3.c
index e84e3431277..e2d616051cc 100644
--- a/gcc/testsuite/gcc.target/arm/asm-flag-3.c
+++ b/gcc/testsuite/gcc.target/arm/asm-flag-3.c
@@ -1,6 +1,7 @@
 /* Test some of the valid @cc asm flag outputs.  */
 /* { dg-do compile } */
 /* { dg-options "-O" } */
+/* { dg-skip-if "" { arm_thumb1 } } */
 
 #define DO(C) \
 void f##C(void) { char x; asm("" : "=@cc"#C(x)); if (!x) asm(""); asm(""); }
diff --git a/gcc/testsuite/gcc.target/arm/asm-flag-5.c b/gcc/testsuite/gcc.target/arm/asm-flag-5.c
index 4d4394e1478..9a8ff586c29 100644
--- a/gcc/testsuite/gcc.target/arm/asm-flag-5.c
+++ b/gcc/testsuite/gcc.target/arm/asm-flag-5.c
@@ -1,6 +1,7 @@
 /* Test error conditions of asm flag outputs.  */
 /* { dg-do compile } */
 /* { dg-options "" } */
+/* { dg-skip-if "" { arm_thumb1 } } */
 
 void f_B(void) { _Bool x; asm("" : "=@"(x)); }
 void f_c(void) { char x; asm("" : "=@"(x)); }
diff --git a/gcc/testsuite/gcc.target/arm/asm-flag-6.c b/gcc/testsuite/gcc.target/arm/asm-flag-6.c
index 09174e04ae6..d862db4e106 100644
--- a/gcc/testsuite/gcc.target/arm/asm-flag-6.c
+++ b/gcc/testsuite/gcc.target/arm/asm-flag-6.c
@@ -1,5 +1,6 @@
 /* Executable testcase for 'output flags.'  */
 /* { dg-do run } */
+/* { dg-skip-if "" { arm_thumb1 } } */
 
 int test_bits (long nzcv)
 {


[PATCH] extend missing nul checks to all built-ins (PR 88226)

2019-11-14 Thread Martin Sebor

GCC 9 added checks for unsafe uses of unterminated constant char
arrays to a few string functions, but the checking is far from
comprehensive.  It has been on my list to do a more thorough
review and add the checks where they're missing.

The attached patch does this for the majority of common built-ins.
There still are a few where it could be added but this should cover
most of the commonly used ones where the misuses are likely to come
up.
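
A small sketch of the class of misuse the checks target (the array and helper below are illustrative, not from the patch): a constant character array with no terminating nul reaching a string built-in.

```c
#include <assert.h>
#include <string.h>

/* arr is a constant character array with no terminating nul -- the
   kind of argument the new checks diagnose when it reaches a string
   built-in.  strlen (arr) would read past the end of the array;
   strnlen bounds the scan and stays well defined.  */
static const char arr[3] = { 'a', 'b', 'c' };

static size_t
bounded_len (void)
{
  return strnlen (arr, sizeof arr);
}
```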

This patch depends on the one I posted earlier today for PR 92501:
  https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01233.html

I tested both patches together on x86_64-linux.

Martin

PS I considered introducing a new attribute, say string, first
to reduce the extent of the changes in GCC, and second to provide
a mechanism to let GCC check even user-defined functions for these
bugs.  I stopped short of doing this because most of the changes
to the built-ins are necessary either way, and also because it
seems late in the cycle to introduce such an extension.  Unless
there's a strong preference for adding it now I will revisit
the decision for GCC 11.
PR middle-end/88226 - missing warning on fprintf, fputs, and puts with an unterminated array

gcc/ChangeLog:

	PR middle-end/88226
	* builtins.c (check_nul_terminated_array): New function.
	(fold_builtin_0): Remove declaration.
	(fold_builtin_1): Same.
	(fold_builtin_2): Same.
	(fold_builtin_3): Same.
	(fold_builtin_strpbrk): Add argument.
	(fold_builtin_strspn): Same.
	(fold_builtin_strcspn): Same.
	(expand_builtin_strcat): Call it.  Remove unused argument.
	(expand_builtin_stpncpy): Same.
	(expand_builtin_strncat): Same.
	(expand_builtin_strncpy): Same.  Adjust indentation.
	(expand_builtin_strcmp): Same.
	(expand_builtin_strncmp): Same.
	(expand_builtin_fork_or_exec): Same.
	(expand_builtin): Handle more built-ins.
	(fold_builtin_2): Add argument.
	(fold_builtin_n): Make static.  Add argument.
	(fold_call_expr): Pass new argument to fold_builtin_n and fold_builtin_2.
	(fold_builtin_call_array): Pass new argument to fold_builtin_n.
	(fold_builtin_strpbrk): Add argument.  Call check_nul_terminated_array.
	(fold_call_stmt): Pass new argument to fold_builtin_n.
	* builtins.h: Correct a comment.
	* gimple-fold.c (gimple_fold_builtin_strchr): Call
	check_nul_terminated_array.
	* tree-ssa-strlen.c (handle_builtin_strlen): Call
	check_nul_terminated_array.
	(handle_builtin_strchr): Same.
	(handle_builtin_string_cmp): Same.

gcc/testsuite/ChangeLog:
	PR middle-end/88226
	* gcc.dg/Wstringop-overflow-22.c: New test.
	* gcc.dg/tree-ssa/builtin-fprintf-warn-1.c: Remove xfails.

Index: gcc/builtins.c
===
--- gcc/builtins.c	(revision 278253)
+++ gcc/builtins.c	(working copy)
@@ -131,7 +131,7 @@ static rtx expand_builtin_memory_copy_args (tree d
 static rtx expand_builtin_memmove (tree, rtx);
 static rtx expand_builtin_mempcpy (tree, rtx);
 static rtx expand_builtin_mempcpy_args (tree, tree, tree, rtx, tree, memop_ret);
-static rtx expand_builtin_strcat (tree, rtx);
+static rtx expand_builtin_strcat (tree);
 static rtx expand_builtin_strcpy (tree, rtx);
 static rtx expand_builtin_strcpy_args (tree, tree, tree, rtx);
 static rtx expand_builtin_stpcpy (tree, rtx, machine_mode);
@@ -166,15 +166,11 @@ static tree fold_builtin_fabs (location_t, tree, t
 static tree fold_builtin_abs (location_t, tree, tree);
 static tree fold_builtin_unordered_cmp (location_t, tree, tree, tree, enum tree_code,
 	enum tree_code);
-static tree fold_builtin_0 (location_t, tree);
-static tree fold_builtin_1 (location_t, tree, tree);
-static tree fold_builtin_2 (location_t, tree, tree, tree);
-static tree fold_builtin_3 (location_t, tree, tree, tree, tree);
 static tree fold_builtin_varargs (location_t, tree, tree*, int);
 
-static tree fold_builtin_strpbrk (location_t, tree, tree, tree);
-static tree fold_builtin_strspn (location_t, tree, tree);
-static tree fold_builtin_strcspn (location_t, tree, tree);
+static tree fold_builtin_strpbrk (location_t, tree, tree, tree, tree);
+static tree fold_builtin_strspn (location_t, tree, tree, tree);
+static tree fold_builtin_strcspn (location_t, tree, tree, tree);
 
 static rtx expand_builtin_object_size (tree);
 static rtx expand_builtin_memory_chk (tree, rtx, machine_mode,
@@ -564,6 +560,51 @@ warn_string_no_nul (location_t loc, const char *fn
 }
 }
 
+/* For a call EXPR (which may be null) that expects a string argument
+   and SRC as the argument, returns false if SRC is a character array
+   with no terminating NUL.  When nonnull, BOUND is the number of
+   characters in which to expect the terminating NUL.
+   When EXPR is nonnull also issues a warning.  */
+
+bool
+check_nul_terminated_array (tree expr, tree src, tree bound /* = NULL_TREE */)
+{
+  tree size;
+  bool exact;
+  tree nonstr = unterminated_array (src, &size, &exact);
+  if (!nonstr)
+return true;
+
+  /* NONSTR refers to the non-nul terminated constant array and SI

Re: [patch, libgomp] Add tests for print from offload target

2019-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2019 at 05:18:41PM +, Andrew Stubbs wrote:
> On 14/11/2019 17:05, Jakub Jelinek wrote:
> > On Thu, Nov 14, 2019 at 04:47:49PM +, Andrew Stubbs wrote:
> > > This patch adds new libgomp tests to ensure that C "printf" and Fortran
> > > "write" work correctly within offload kernels. Both should work for 
> > > amdgcn,
> > > but nvptx uses the libgfortran "minimal" mode which lacks "write" support.
> > 
> > So, do those *.f90 testcases now FAIL with nvptx offloading?
> > If yes, perhaps there should be effective target check that it is not nvptx
> > offloading.
> > Once the declare variant support is finished, at least in OpenMP it could be
> > handled through that, but Fortran support for that will definitely not make
> > GCC 10.
> 
> Oops, I forgot to regenerate the patch before posting it.
> 
> Here's the version with the nvptx xfails.
> 
> OK?

Ok.

> Add tests for print from offload target.
> 
> 2019-11-14  Andrew Stubbs  
> 
>   libgomp/
>   * testsuite/libgomp.c/target-print-1.c: New file.
>   * testsuite/libgomp.fortran/target-print-1.f90: New file.
>   * testsuite/libgomp.oacc-c/print-1.c: New file.
>   * testsuite/libgomp.oacc-fortran/print-1.f90: New file.

> +int
> +main ()
> +{
> +#pragma omp target
> +{
> +  printf ("The answer is %d\n", var);
> +}

Just a nit,
#pragma omp target
  printf ("The answer is %d\n", var);
would be valid too, but no need to change the testcase.

Jakub



Re: [patch, libgomp] Add tests for print from offload target

2019-11-14 Thread Andrew Stubbs

On 14/11/2019 17:05, Jakub Jelinek wrote:

On Thu, Nov 14, 2019 at 04:47:49PM +, Andrew Stubbs wrote:

This patch adds new libgomp tests to ensure that C "printf" and Fortran
"write" work correctly within offload kernels. Both should work for amdgcn,
but nvptx uses the libgfortran "minimal" mode which lacks "write" support.


So, do those *.f90 testcases now FAIL with nvptx offloading?
If yes, perhaps there should be effective target check that it is not nvptx
offloading.
Once the declare variant support is finished, at least in OpenMP it could be
handled through that, but Fortran support for that will definitely not make
GCC 10.


Oops, I forgot to regenerate the patch before posting it.

Here's the version with the nvptx xfails.

OK?

Andrew

Add tests for print from offload target.

2019-11-14  Andrew Stubbs  

	libgomp/
	* testsuite/libgomp.c/target-print-1.c: New file.
	* testsuite/libgomp.fortran/target-print-1.f90: New file.
	* testsuite/libgomp.oacc-c/print-1.c: New file.
	* testsuite/libgomp.oacc-fortran/print-1.f90: New file.

diff --git a/libgomp/testsuite/libgomp.c/target-print-1.c b/libgomp/testsuite/libgomp.c/target-print-1.c
new file mode 100644
index 000..5857b875ced
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/target-print-1.c
@@ -0,0 +1,17 @@
+/* Ensure that printf on the offload device works.  */
+
+/* { dg-do run } */
+/* { dg-output "The answer is 42(\n|\r\n|\r)+" } */
+
+#include <stdio.h>
+
+int var = 42;
+
+int
+main ()
+{
+#pragma omp target
+{
+  printf ("The answer is %d\n", var);
+}
+}
diff --git a/libgomp/testsuite/libgomp.fortran/target-print-1.f90 b/libgomp/testsuite/libgomp.fortran/target-print-1.f90
new file mode 100644
index 000..c71a0952483
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-print-1.f90
@@ -0,0 +1,15 @@
+! Ensure that printf on the offload device works.
+
+! { dg-do run }
+! { dg-output "The answer is 42(\n|\r\n|\r)+" }
+! { dg-xfail-if "no write for nvidia" { openacc_nvidia_accel_selected } }
+
+program main
+  implicit none
+  integer :: var = 42
+
+!$omp target 
+  write (0, '("The answer is ", I2)') var
+!$omp end target
+
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-c/print-1.c b/libgomp/testsuite/libgomp.oacc-c/print-1.c
new file mode 100644
index 000..593885b5c2c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c/print-1.c
@@ -0,0 +1,17 @@
+/* Ensure that printf on the offload device works.  */
+
+/* { dg-do run } */
+/* { dg-output "The answer is 42(\n|\r\n|\r)+" } */
+
+#include <stdio.h>
+
+int var = 42;
+
+int
+main ()
+{
+#pragma acc parallel
+{
+  printf ("The answer is %d\n", var);
+}
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90
new file mode 100644
index 000..a83280d1edb
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90
@@ -0,0 +1,15 @@
+! Ensure that printf on the offload device works.
+
+! { dg-do run }
+! { dg-output "The answer is 42(\n|\r\n|\r)+" }
+! { dg-xfail-if "no write for nvidia" { openacc_nvidia_accel_selected } }
+
+program main
+  implicit none
+  integer :: var = 42
+
+!$acc parallel
+  write (0, '("The answer is ", I2)') var
+!$acc end parallel
+
+end program main


Re: [patch, libgomp] Add tests for print from offload target

2019-11-14 Thread Tobias Burnus

On 11/14/19 5:47 PM, Andrew Stubbs wrote:
This patch adds new libgomp tests to ensure that C "printf" and 
Fortran "write" work correctly within offload kernels. Both should 
work for amdgcn, but nvptx uses the libgfortran "minimal" mode which 
lacks "write" support.


Can't you add something like:

! { dg-do run { target { ! { openacc_nvidia_accel_selected } } } }
! For openacc_nvidia_accel_selected, there is no I/O support.

To avoid FAILs?

Cheers,

Tobias



[COMMITTED] Remove range_intersect, range_invert, and range_union.

2019-11-14 Thread Aldy Hernandez
range_intersect, range_union, and range_invert are currently 
returning their results by value.  After Andrew's change, these should 
also return their results in an argument.  However, if we do this, the 
functions become superfluous since we have corresponding API methods 
with the same functionality:


-  r = range_intersect (op1, op2);
+  r = op1;
+  r.intersect (op2);

I have removed all 3 functions and have adjusted the code throughout.

Committed as mostly obvious, after having consulted with Andrew that it 
was these and not the range_true* ones as well that needed adjusting.


Aldy
commit e0f55e7de91f779fe12ab65fc9479e4df0fe2614
Author: Aldy Hernandez 
Date:   Thu Nov 14 17:55:32 2019 +0100

Remove range_intersect, range_invert, and range_union.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 051b10ed953..4266f6b1655 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2019-11-14  Aldy Hernandez  
+
+	* range-op.cc (*operator*::*range): Remove calls to
+	range_intersect, range_invert, and range_union in favor of calling
+	the in-place API methods.
+	(range_tests): Same.
+	* range.cc (range_intersect): Remove.
+	(range_union): Remove.
+	(range_invert): Remove.
+	* range.h (range_intersect): Remove.
+	(range_union): Remove.
+	(range_invert): Remove.
+
 2019-11-14  Ilya Leoshkevich  
 
 	PR rtl-optimization/92430
diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index ae3025c6eea..4a23cca3dbb 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -396,7 +396,8 @@ operator_equal::fold_range (value_range &r, tree type,
 {
   // If ranges do not intersect, we know the range is not equal,
   // otherwise we don't know anything for sure.
-  r = range_intersect (op1, op2);
+  r = op1;
+  r.intersect (op2);
   if (r.undefined_p ())
 	r = range_false (type);
   else
@@ -415,7 +416,10 @@ operator_equal::op1_range (value_range &r, tree type,
   // If the result is false, the only time we know anything is
   // if OP2 is a constant.
   if (wi::eq_p (op2.lower_bound(), op2.upper_bound()))
-	r = range_invert (op2);
+	{
+	  r = op2;
+	  r.invert ();
+	}
   else
 	r.set_varying (type);
   break;
@@ -476,7 +480,8 @@ operator_not_equal::fold_range (value_range &r, tree type,
 {
   // If ranges do not intersect, we know the range is not equal,
   // otherwise we don't know anything for sure.
-  r = range_intersect (op1, op2);
+  r = op1;
+  r.intersect (op2);
   if (r.undefined_p ())
 	r = range_true (type);
   else
@@ -495,7 +500,10 @@ operator_not_equal::op1_range (value_range &r, tree type,
   // If the result is true, the only time we know anything is if
   // OP2 is a constant.
   if (wi::eq_p (op2.lower_bound(), op2.upper_bound()))
-	r = range_invert (op2);
+	{
+	  r = op2;
+	  r.invert ();
+	}
   else
 	r.set_varying (type);
   break;
@@ -1974,7 +1982,8 @@ operator_logical_or::fold_range (value_range &r, tree type ATTRIBUTE_UNUSED,
   if (empty_range_check (r, lh, rh))
 return;
 
-  r = range_union (lh, rh);
+  r = lh;
+  r.union_ (rh);
 }
 
 bool
@@ -2221,7 +2230,10 @@ operator_logical_not::fold_range (value_range &r, tree type,
   if (lh.varying_p () || lh.undefined_p ())
 r = lh;
   else
-r = range_invert (lh);
+{
+  r = lh;
+  r.invert ();
+}
   gcc_checking_assert (lh.type() == type);
   return;
 }
@@ -2232,10 +2244,9 @@ operator_logical_not::op1_range (value_range &r,
  const value_range &lhs,
  const value_range &op2 ATTRIBUTE_UNUSED) const
 {
-  if (lhs.varying_p () || lhs.undefined_p ())
-r = lhs;
-  else
-r = range_invert (lhs);
+  r = lhs;
+  if (!lhs.varying_p () && !lhs.undefined_p ())
+r.invert ();
   return true;
 }
 
@@ -3033,13 +3044,6 @@ range_tests ()
   r1.union_ (r2);
   ASSERT_TRUE (r0 == r1);
 
-  // [10,20] U [30,40] ==> [10,20][30,40].
-  r0 = value_range (INT (10), INT (20));
-  r1 = value_range (INT (30), INT (40));
-  r0.union_ (r1);
-  ASSERT_TRUE (r0 == range_union (value_range (INT (10), INT (20)),
-  value_range (INT (30), INT (40))));
-
   // Make sure NULL and non-NULL of pointer types work, and that
   // inverses of them are consistent.
   tree voidp = build_pointer_type (void_type_node);
@@ -3049,27 +3053,12 @@ range_tests ()
   r0.invert ();
   ASSERT_TRUE (r0 == r1);
 
-  // [10,20][30,40] U [25,70] => [10,70].
-  r0 = range_union (value_range (INT (10), INT (20)),
-		 value_range (INT (30), INT (40)));
-  r1 = value_range (INT (25), INT (70));
-  r0.union_ (r1);
-  ASSERT_TRUE (r0 == range_union (value_range (INT (10), INT (20)),
-  value_range (INT (25), INT (70))));
-
   // [10,20] U [15, 30] => [10, 30].
   r0 = value_range (INT (10), INT (20));
   r1 = value_range (INT (15), INT (30));
   r0.union_ (r1);
   ASSERT_TRUE (r0 == value_range (INT (10), INT (30)));
 
-  // [10,20] U [25,25] => [10,20][25,25].
-  r0 = value_range (INT (10), INT (20));
-  r1 = value_range (INT (25),

[PATCH] fold strncmp of unterminated arrays (PR 92501)

2019-11-14 Thread Martin Sebor

Adding tests for unsafe uses of unterminated constant char arrays
in string functions exposed the limitation in strncmp folding
described in PR 92501: GCC only folds strncmp calls involving
nul-terminated constant strings.

The attached patch improves the folder to also handle unterminated
constant character arrays.  This capability is in turn relied on
for the dependable detection of unsafe uses of unterminated arrays
in strncpy, a patch for which I'm about to post separately.
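
A small sketch of the case the patch enables (array contents here are illustrative): neither array is nul-terminated, yet the bound stops the comparison inside both arrays, so the call is well defined and can now be folded to a constant.

```c
#include <assert.h>
#include <string.h>

/* Neither a nor b is nul-terminated, but strncmp with a bound of 3
   only inspects characters that exist in both arrays, so the call is
   well defined -- and, with this patch, foldable at compile time.  */
static const char a[4] = { 'a', 'b', 'c', 'd' };
static const char b[4] = { 'a', 'b', 'c', 'e' };

static int
cmp_first_three (void)
{
  return strncmp (a, b, 3);
}
```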

Tested on x86_64-linux.

Martin
PR tree-optimization/92501 - strncmp with constant unterminated arrays not folded

gcc/testsuite/ChangeLog:

	PR tree-optimization/92501
	* gcc.dg/strcmpopt_7.c: New test.

gcc/ChangeLog:

	PR tree-optimization/92501
	* gimple-fold.c (gimple_fold_builtin_string_compare): Let strncmp
	handle unterminated arrays.  Rename local variables for clarity.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c	(revision 278253)
+++ gcc/gimple-fold.c	(working copy)
@@ -2323,8 +2323,7 @@ gimple_load_first_char (location_t loc, tree str,
   return var;
 }
 
-/* Fold a call to the str{n}{case}cmp builtin pointed by GSI iterator.
-   FCODE is the name of the builtin.  */
+/* Fold a call to the str{n}{case}cmp builtin pointed by GSI iterator.  */
 
 static bool
 gimple_fold_builtin_string_compare (gimple_stmt_iterator *gsi)
@@ -2337,18 +2336,19 @@ gimple_fold_builtin_string_compare (gimple_stmt_it
   tree str1 = gimple_call_arg (stmt, 0);
   tree str2 = gimple_call_arg (stmt, 1);
   tree lhs = gimple_call_lhs (stmt);
-  HOST_WIDE_INT length = -1;
+  tree len = NULL_TREE;
+  HOST_WIDE_INT bound = -1;
 
   /* Handle strncmp and strncasecmp functions.  */
   if (gimple_call_num_args (stmt) == 3)
 {
-  tree len = gimple_call_arg (stmt, 2);
+  len = gimple_call_arg (stmt, 2);
   if (tree_fits_uhwi_p (len))
-	length = tree_to_uhwi (len);
+	bound = tree_to_uhwi (len);
 }
 
   /* If the LEN parameter is zero, return zero.  */
-  if (length == 0)
+  if (bound == 0)
 {
   replace_call_with_value (gsi, integer_zero_node);
   return true;
@@ -2361,9 +2361,32 @@ gimple_fold_builtin_string_compare (gimple_stmt_it
   return true;
 }
 
-  const char *p1 = c_getstr (str1);
-  const char *p2 = c_getstr (str2);
+  /* Initially set to the number of characters, including the terminating
+ nul if each array has one.   Nx == strnlen(Sx, Nx) implies that
+ the array is not terminated by a nul.
+ For nul-terminated strings then adjusted to their length.  */
+  unsigned HOST_WIDE_INT len1 = HOST_WIDE_INT_MAX, len2 = len1;
+  const char *p1 = c_getstr (str1, &len1);
+  const char *p2 = c_getstr (str2, &len2);
 
+  /* The position of the terminating nul character if one exists, otherwise
+ a value greater than LENx.  */
+  unsigned HOST_WIDE_INT nulpos1 = HOST_WIDE_INT_MAX, nulpos2 = nulpos1;
+
+  if (p1)
+{
+  nulpos1 = strnlen (p1, len1);
+  if (nulpos1 < len1)
+	len1 = nulpos1;
+}
+
+  if (p2)
+{
+  nulpos2 = strnlen (p2, len2);
+  if (nulpos2 < len2)
+	len2 = nulpos2;
+}
+
   /* For known strings, return an immediate value.  */
   if (p1 && p2)
 {
@@ -2374,17 +2397,19 @@ gimple_fold_builtin_string_compare (gimple_stmt_it
 	{
 	case BUILT_IN_STRCMP:
 	case BUILT_IN_STRCMP_EQ:
-	  {
-	r = strcmp (p1, p2);
-	known_result = true;
+	  if (len1 != nulpos1 || len2 != nulpos2)
 	break;
-	  }
+
+	  r = strcmp (p1, p2);
+	  known_result = true;
+	  break;
+
 	case BUILT_IN_STRNCMP:
 	case BUILT_IN_STRNCMP_EQ:
 	  {
-	if (length == -1)
+	if (bound == -1)
 	  break;
-	r = strncmp (p1, p2, length);
+	r = strncmp (p1, p2, bound);
 	known_result = true;
 	break;
 	  }
@@ -2394,9 +2419,9 @@ gimple_fold_builtin_string_compare (gimple_stmt_it
 	  break;
 	case BUILT_IN_STRNCASECMP:
 	  {
-	if (length == -1)
+	if (bound == -1)
 	  break;
-	r = strncmp (p1, p2, length);
+	r = strncmp (p1, p2, bound);
 	if (r == 0)
 	  known_result = true;
 	break;
@@ -2412,7 +2437,7 @@ gimple_fold_builtin_string_compare (gimple_stmt_it
 	}
 }
 
-  bool nonzero_length = length >= 1
+  bool nonzero_bound = bound >= 1
 || fcode == BUILT_IN_STRCMP
 || fcode == BUILT_IN_STRCMP_EQ
 || fcode == BUILT_IN_STRCASECMP;
@@ -2420,7 +2445,7 @@ gimple_fold_builtin_string_compare (gimple_stmt_it
   location_t loc = gimple_location (stmt);
 
   /* If the second arg is "", return *(const unsigned char*)arg1.  */
-  if (p2 && *p2 == '\0' && nonzero_length)
+  if (p2 && *p2 == '\0' && nonzero_bound)
 {
   gimple_seq stmts = NULL;
   tree var = gimple_load_first_char (loc, str1, &stmts);
@@ -2435,7 +2460,7 @@ gimple_fold_builtin_string_compare (gimple_stmt_it
 }
 
   /* If the first arg is "", return -*(const unsigned char*)arg2.  */
-  if (p1 && *p1 == '\0' && nonzero_length)
+  if (p1 && *p1 == '\0' && nonzero_bound)
 {
   gimple_se
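The guard the patch adds can be sketched outside the compiler; the helper names below are illustrative, not GCC internals, and the sketch uses memchr where GCC uses strnlen over the constant's known size:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Illustrative sketch of the check added to gimple_fold_builtin_string_compare:
// strcmp may be folded to a constant result only when both constant arrays
// contain a terminating nul within their known size.
static bool nul_terminated (const char *p, std::size_t n)
{
  // A nul within the first n bytes means the array is a proper string.
  return std::memchr (p, '\0', n) != nullptr;
}

static bool fold_strcmp_constants (const char *p1, std::size_t n1,
                                   const char *p2, std::size_t n2, int *r)
{
  if (!nul_terminated (p1, n1) || !nul_terminated (p2, n2))
    return false;               // unterminated array: leave the call alone
  *r = std::strcmp (p1, p2);
  return true;
}
```

With a string literal (which carries its nul) the fold succeeds; with a bare char array it is refused, mirroring the `len != nulpos` bail-out in the patch.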

Re: [patch, libgomp] Add tests for print from offload target

2019-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2019 at 04:47:49PM +, Andrew Stubbs wrote:
> This patch adds new libgomp tests to ensure that C "printf" and Fortran
> "write" work correctly within offload kernels. Both should work for amdgcn,
> but nvptx uses the libgfortran "minimal" mode which lacks "write" support.

So, do those *.f90 testcases now FAIL with nvptx offloading?
If yes, perhaps there should be effective target check that it is not nvptx
offloading.
Once the declare variant support is finished, at least in OpenMP it could be
handled through that, but Fortran support for that will definitely not make
GCC 10.

Jakub



Re: [Patch, fortran] PR69654 - ICE in gfc_trans_structure_assign

2019-11-14 Thread Steve Kargl
On Thu, Nov 14, 2019 at 03:52:26PM +, Paul Richard Thomas wrote:
> As I remarked in PR, this fix probably comes 1,379 days too late. I am
> not at all sure that I understand why I couldn't see the problem
> because it is rather trivial.
> 
> I am open to not adding the second gcc_assert - it does seem like overkill.
> 
> Regtested on FC30/x86_64 - OK for trunk and ?
> 

Yes.  7-branch is now closed.  So, if you're inclined
to backport then it is also ok for 8 and 9 branches
after testing. 

-- 
Steve


[PATCH] libstdc++: Implement new predicate concepts from P1716R3

2019-11-14 Thread Jonathan Wakely

* include/bits/iterator_concepts.h (__iter_concept_impl): Add
comments.
(indirect_relation): Rename to indirect_binary_predicate and adjust
definition as per P1716R3.
(indirect_equivalence_relation): Define.
(indirectly_comparable): Adjust definition.
* include/std/concepts (equivalence_relation): Define.
* testsuite/std/concepts/concepts.callable/relation.cc: Add tests for
equivalence_relation.

Tested powerpc64le-linux, committed to trunk.

commit 01e0b14116ce56d4327362686334c37272faac43
Author: Jonathan Wakely 
Date:   Wed Nov 13 22:26:06 2019 +

libstdc++: Implement new predicate concepts from P1716R3

* include/bits/iterator_concepts.h (__iter_concept_impl): Add
comments.
(indirect_relation): Rename to indirect_binary_predicate and adjust
definition as per P1716R3.
(indirect_equivalence_relation): Define.
(indirectly_comparable): Adjust definition.
* include/std/concepts (equivalence_relation): Define.
* testsuite/std/concepts/concepts.callable/relation.cc: Add tests for
equivalence_relation.

diff --git a/libstdc++-v3/include/bits/iterator_concepts.h 
b/libstdc++-v3/include/bits/iterator_concepts.h
index 7cc058eb8c9..90a8bc8071f 100644
--- a/libstdc++-v3/include/bits/iterator_concepts.h
+++ b/libstdc++-v3/include/bits/iterator_concepts.h
@@ -420,20 +420,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   namespace __detail
   {
template<typename _Iter>
-  struct __iter_concept_impl
-  { };
+  struct __iter_concept_impl;
 
+// ITER_CONCEPT(I) is ITER_TRAITS(I)::iterator_concept if that is valid.
template<typename _Iter>
   requires requires { typename __iter_traits<_Iter>::iterator_concept; }
   struct __iter_concept_impl<_Iter>
   { using type = typename __iter_traits<_Iter>::iterator_concept; };
 
+// Otherwise, ITER_TRAITS(I)::iterator_category if that is valid.
template<typename _Iter>
   requires (!requires { typename __iter_traits<_Iter>::iterator_concept; }
  && requires { typename __iter_traits<_Iter>::iterator_category; })
   struct __iter_concept_impl<_Iter>
   { using type = typename __iter_traits<_Iter>::iterator_category; };
 
+// Otherwise, random_access_iterator_tag if iterator_traits is not specialized.
template<typename _Iter>
   requires (!requires { typename __iter_traits<_Iter>::iterator_concept; }
  && !requires { typename __iter_traits<_Iter>::iterator_category; }
@@ -441,7 +443,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   struct __iter_concept_impl<_Iter>
   { using type = random_access_iterator_tag; };
 
-// ITER_TRAITS
+// Otherwise, there is no ITER_CONCEPT(I) type.
+template<typename _Iter>
+  struct __iter_concept_impl
+  { };
+
+// ITER_CONCEPT
template<typename _Iter>
   using __iter_concept = typename __iter_concept_impl<_Iter>::type;
   } // namespace __detail
@@ -615,15 +622,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   && predicate<_Fn&, iter_reference_t<_Iter>>
   && predicate<_Fn&, iter_common_reference_t<_Iter>>;
 
-  template<typename _Fn, typename _I1, typename _I2 = _I1>
-concept indirect_relation = readable<_I1> && readable<_I2>
+  template<typename _Fn, typename _I1, typename _I2>
+concept indirect_binary_predicate = readable<_I1> && readable<_I2>
   && copy_constructible<_Fn>
-  && relation<_Fn&, iter_value_t<_I1>&, iter_value_t<_I2>&>
-  && relation<_Fn&, iter_value_t<_I1>&, iter_reference_t<_I2>>
-  && relation<_Fn&, iter_reference_t<_I1>, iter_value_t<_I2>&>
-  && relation<_Fn&, iter_reference_t<_I1>, iter_reference_t<_I2>>
-  && relation<_Fn&, iter_common_reference_t<_I1>,
- iter_common_reference_t<_I2>>;
+  && predicate<_Fn&, iter_value_t<_I1>&, iter_value_t<_I2>&>
+  && predicate<_Fn&, iter_value_t<_I1>&, iter_reference_t<_I2>>
+  && predicate<_Fn&, iter_reference_t<_I1>, iter_value_t<_I2>&>
+  && predicate<_Fn&, iter_reference_t<_I1>, iter_reference_t<_I2>>
+  && predicate<_Fn&, iter_common_reference_t<_I1>,
+  iter_common_reference_t<_I2>>;
+
+  template<typename _Fn, typename _I1, typename _I2 = _I1>
+concept indirect_equivalence_relation = readable<_I1> && readable<_I2>
+  && copy_constructible<_Fn>
+  && equivalence_relation<_Fn&, iter_value_t<_I1>&, iter_value_t<_I2>&>
+  && equivalence_relation<_Fn&, iter_value_t<_I1>&, iter_reference_t<_I2>>
+  && equivalence_relation<_Fn&, iter_reference_t<_I1>, iter_value_t<_I2>&>
+  && equivalence_relation<_Fn&, iter_reference_t<_I1>,
+ iter_reference_t<_I2>>
+  && equivalence_relation<_Fn&, iter_common_reference_t<_I1>,
+ iter_common_reference_t<_I2>>;
 
template<typename _Fn, typename _I1, typename _I2 = _I1>
 concept indirect_strict_weak_order = readable<_I1> && readable<_I2>
@@ -767,7 +785,8 @@ namespace ranges
template<typename _I1, typename _I2, typename _Rel, typename _P1 = identity, typename _P2 = identity>
 concept indirectly_comparable
-  = indirect_relation<_Rel, projected<_I1, _P1>, projected<_I2, _P2>>;
+  = indirect_binary_predicate<_Rel, projected<_I1, _P1>,
+				  projected<_I2, _P2>>;

[PATCH] libstdc++: Rename disable_sized_sentinel [P1871R1]

2019-11-14 Thread Jonathan Wakely

* include/bits/iterator_concepts.h (disable_sized_sentinel): Rename to
disable_sized_sentinel_for.
* testsuite/24_iterators/headers/iterator/synopsis_c++20.cc: Adjust.

Tested powerpc64le-linux, committed to trunk.


commit 3a7f3e87680cc0ca20318b0983d517cd32851fc5
Author: Jonathan Wakely 
Date:   Wed Nov 13 22:27:59 2019 +

libstdc++: Rename disable_sized_sentinel [P1871R1]

* include/bits/iterator_concepts.h (disable_sized_sentinel): Rename to
disable_sized_sentinel_for.
* testsuite/24_iterators/headers/iterator/synopsis_c++20.cc: Adjust.

diff --git a/libstdc++-v3/include/bits/iterator_concepts.h 
b/libstdc++-v3/include/bits/iterator_concepts.h
index 8b398616a56..7cc058eb8c9 100644
--- a/libstdc++-v3/include/bits/iterator_concepts.h
+++ b/libstdc++-v3/include/bits/iterator_concepts.h
@@ -524,11 +524,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   && __detail::__weakly_eq_cmp_with<_Sent, _Iter>;
 
template<typename _Sent, typename _Iter>
-inline constexpr bool disable_sized_sentinel = false;
+inline constexpr bool disable_sized_sentinel_for = false;
 
template<typename _Sent, typename _Iter>
 concept sized_sentinel_for = sentinel_for<_Sent, _Iter>
-&& !disable_sized_sentinel<remove_cv_t<_Sent>, remove_cv_t<_Iter>>
+&& !disable_sized_sentinel_for<remove_cv_t<_Sent>, remove_cv_t<_Iter>>
 && requires(const _Iter& __i, const _Sent& __s)
 {
{ __s - __i } -> same_as<iter_difference_t<_Iter>>;
diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
b/libstdc++-v3/include/bits/stl_iterator.h
index 411feba90e0..a707621c9ed 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -449,9 +449,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #  if __cplusplus > 201703L && defined __cpp_lib_concepts
template<typename _Iterator1, typename _Iterator2>
 requires (!sized_sentinel_for<_Iterator1, _Iterator2>)
-inline constexpr bool disable_sized_sentinel<reverse_iterator<_Iterator1>,
-reverse_iterator<_Iterator2>>
-  = true;
+inline constexpr bool
+disable_sized_sentinel_for<reverse_iterator<_Iterator1>,
+  reverse_iterator<_Iterator2>> = true;
 #  endif // C++20
 # endif // C++14
 
diff --git 
a/libstdc++-v3/testsuite/24_iterators/headers/iterator/synopsis_c++20.cc 
b/libstdc++-v3/testsuite/24_iterators/headers/iterator/synopsis_c++20.cc
index 824b0b4f38c..fb3bb420a54 100644
--- a/libstdc++-v3/testsuite/24_iterators/headers/iterator/synopsis_c++20.cc
+++ b/libstdc++-v3/testsuite/24_iterators/headers/iterator/synopsis_c++20.cc
@@ -79,7 +79,7 @@ namespace std
 }
 
 struct I { };
-template<> constexpr bool std::disable_sized_sentinel<I, I> = true;
+template<> constexpr bool std::disable_sized_sentinel_for<I, I> = true;
 
 namespace __gnu_test
 {
@@ -87,8 +87,8 @@ namespace __gnu_test
   constexpr auto* iter_move = &std::ranges::iter_move;
   constexpr auto* iter_swap = &std::ranges::iter_swap;
   // sized sentinels
-  constexpr bool const* disable_sized_sentinel
-= &std::disable_sized_sentinel<I, I>;
+  constexpr bool const* disable_sized_sentinel_for
++= &std::disable_sized_sentinel_for<I, I>;
   // default sentinels
   constexpr std::default_sentinel_t const* default_sentinel
 = &std::default_sentinel;


[patch, libgomp] Add tests for print from offload target

2019-11-14 Thread Andrew Stubbs

Hi,

This patch adds new libgomp tests to ensure that C "printf" and Fortran 
"write" work correctly within offload kernels. Both should work for 
amdgcn, but nvptx uses the libgfortran "minimal" mode which lacks 
"write" support.


Obviously, printing from offload kernels is not recommended in 
production, but can be useful in development.


OK to commit?

Thanks

Andrew
Add tests for print from offload target.

2019-11-14  Andrew Stubbs  

	libgomp/
	* testsuite/libgomp.c/target-print-1.c: New file.
	* testsuite/libgomp.fortran/target-print-1.f90: New file.
	* testsuite/libgomp.oacc-c/print-1.c: New file.
	* testsuite/libgomp.oacc-fortran/print-1.f90: New file.

diff --git a/libgomp/testsuite/libgomp.c/target-print-1.c b/libgomp/testsuite/libgomp.c/target-print-1.c
new file mode 100644
index 000..5857b875ced
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/target-print-1.c
@@ -0,0 +1,17 @@
+/* Ensure that printf on the offload device works.  */
+
+/* { dg-do run } */
+/* { dg-output "The answer is 42(\n|\r\n|\r)+" } */
+
+#include <stdio.h>
+
+int var = 42;
+
+int
+main ()
+{
+#pragma omp target
+{
+  printf ("The answer is %d\n", var);
+}
+}
diff --git a/libgomp/testsuite/libgomp.fortran/target-print-1.f90 b/libgomp/testsuite/libgomp.fortran/target-print-1.f90
new file mode 100644
index 000..73ee09a2b79
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-print-1.f90
@@ -0,0 +1,14 @@
+! Ensure that printf on the offload device works.
+
+! { dg-do run }
+! { dg-output "The answer is 42(\n|\r\n|\r)+" }
+
+program main
+  implicit none
+  integer :: var = 42
+
+!$omp target 
+  write (0, '("The answer is ", I2)') var
+!$omp end target
+
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-c/print-1.c b/libgomp/testsuite/libgomp.oacc-c/print-1.c
new file mode 100644
index 000..593885b5c2c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c/print-1.c
@@ -0,0 +1,17 @@
+/* Ensure that printf on the offload device works.  */
+
+/* { dg-do run } */
+/* { dg-output "The answer is 42(\n|\r\n|\r)+" } */
+
+#include <stdio.h>
+
+int var = 42;
+
+int
+main ()
+{
+#pragma acc parallel
+{
+  printf ("The answer is %d\n", var);
+}
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90
new file mode 100644
index 000..bef664df4fa
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90
@@ -0,0 +1,14 @@
+! Ensure that printf on the offload device works.
+
+! { dg-do run }
+! { dg-output "The answer is 42(\n|\r\n|\r)+" }
+
+program main
+  implicit none
+  integer :: var = 42
+
+!$acc parallel
+  write (0, '("The answer is ", I2)') var
+!$acc end parallel
+
+end program main

