date:20190719

[C++ PATCH] PR c++/90101 - dependent class non-type parameter.

2019-07-19 Thread Jason Merrill

We shouldn't complain that a dependent type is incomplete.

Tested x86_64-pc-linux-gnu, applying to trunk.

* pt.c (invalid_nontype_parm_type_p): Check for dependent class type.
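
For readers who want the effect of the new early return spelled out, here is a toy model of the check order, offered only as a sketch. ToyType, its three flags, and toy_invalid_nontype_parm_type_p are hypothetical names invented for this illustration; they are not GCC's tree representation, and the real function performs further checks that are omitted here.

```cpp
#include <cassert>

// Hypothetical stand-in for a front-end type node. The three flags are
// assumptions made for illustration, not fields of GCC's tree type.
struct ToyType {
  bool dependent;  // depends on a template parameter, e.g. A<M>
  bool complete;   // the definition is known at this point
  bool literal;    // usable in a constant expression
};

// Toy model of the check order after the patch. Returns true when the
// type must be rejected as a non-type template parameter type.
bool toy_invalid_nontype_parm_type_p(const ToyType &type) {
  // New early exit: a dependent type cannot be judged before instantiation,
  // so it must not reach the completeness check below.
  if (type.dependent)
    return false;
  if (!type.complete)
    return true;
  if (!type.literal)
    return true;
  return false;
}

// Small self-check of the outcomes described above.
void toy_self_check() {
  // Dependent and (so far) incomplete: accepted for now.
  assert(!toy_invalid_nontype_parm_type_p(ToyType{true, false, false}));
  // Non-dependent and incomplete: still rejected.
  assert(toy_invalid_nontype_parm_type_p(ToyType{false, false, true}));
}
```

Without the first test, the dependent case would fall through to the completeness check and be rejected, which is the complaint the patch removes.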
---
 gcc/cp/pt.c  |  2 ++
 gcc/testsuite/g++.dg/cpp2a/nontype-class21.C | 10 ++
 gcc/testsuite/g++.dg/cpp2a/nontype-class22.C | 21 
 gcc/cp/ChangeLog |  5 +
 4 files changed, 38 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class21.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class22.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 53aaad1800a..e433413827a 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -25228,6 +25228,8 @@ invalid_nontype_parm_type_p (tree type, tsubst_flags_t complain)
 "with %<-std=c++2a%> or %<-std=gnu++2a%>");
  return true;
}
+  if (dependent_type_p (type))
+   return false;
   if (!complete_type_or_else (type, NULL_TREE))
return true;
   if (!literal_type_p (type))
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class21.C b/gcc/testsuite/g++.dg/cpp2a/nontype-class21.C
new file mode 100644
index 000..c58fe05b9dd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class21.C
@@ -0,0 +1,10 @@
+// PR c++/90101
+// { dg-do compile { target c++2a } }
+
+template
+struct A{};
+
+template>
+struct B {};
+
+B<2,A<2>{}> b;
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class22.C b/gcc/testsuite/g++.dg/cpp2a/nontype-class22.C
new file mode 100644
index 000..026855f0bc6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class22.C
@@ -0,0 +1,21 @@
+// PR c++/90100
+// { dg-do compile { target c++2a } }
+
+template
+inline constexpr bool is_nontype_list = false;
+
+template typename T, auto... NonTypes>
+inline constexpr bool is_nontype_list> = true;
+
+// works
+template
+struct A {};
+
+static_assert(is_nontype_list>);
+
+// fails
+struct X {
+int v;
+};
+
+static_assert(is_nontype_list>);
diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index cef36b2d1b2..c1fc980e287 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,8 @@
+2019-07-19  Jason Merrill  
+
+   PR c++/90101 - dependent class non-type parameter.
+   * pt.c (invalid_nontype_parm_type_p): Check for dependent class type.
+
 2019-07-18  Jason Merrill  
 
PR c++/90098 - partial specialization and class non-type parms.

base-commit: d4f6160a8abbf666fc86fc9c60763b4b93da8fb9
-- 
2.21.0

[PATCH] Fix hash checking ICE in temp slot handling (PR middle-end/91190)

2019-07-19 Thread Jakub Jelinek

Hi!

As mentioned in the PR, we ICE on the following testcase.
The problem is that offset_address (in this particular case) does
  update_temp_slot_address (XEXP (memref, 0), new_rtx);
  new_rtx = change_address_1 (memref, VOIDmode, new_rtx, 1, false);
where the first call inserts the address into a hash table and computes a
hash for it, but then change_address_1 will try to legitimize that address
and modifies it in place, so now the hash table contains an entry with an
incorrect hash value (and at an incorrect spot in the hash table).
Next we end up trying to insert the legitimized address into the hash table
and the new hash table checking rightfully complains.

The following patch ensures later in-place modifications of the address
don't affect the hash table.  Bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?
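
The failure mode described above can be reproduced in miniature. The following is only a sketch under stated assumptions: std::string stands in for an RTL address, toy_hash is a deliberately trivial hash, and SharedEntry, CopiedEntry and stale_hash_demo are hypothetical names that do not exist in GCC. It shows that an entry sharing the caller's object ends up with a stale cached hash after an in-place change, while an entry holding a copy does not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Deliberately simple hash (sum of bytes) so the values are easy to follow.
std::size_t toy_hash(const std::string &s) {
  std::size_t h = 0;
  for (unsigned char c : s)
    h += c;
  return h;
}

// Entry that shares the caller's object, like storing the address itself.
struct SharedEntry {
  const std::string *address;
  std::size_t hash;  // cached when the entry is created
};

// Entry that owns a private copy, like storing copy_rtx (address).
struct CopiedEntry {
  std::string address;
  std::size_t hash;  // cached when the entry is created
};

SharedEntry make_shared_entry(const std::string *a) { return {a, toy_hash(*a)}; }
CopiedEntry make_copied_entry(const std::string *a) { return {*a, toy_hash(*a)}; }

// An entry is consistent when its cached hash still matches its key.
bool consistent(const SharedEntry &e) { return toy_hash(*e.address) == e.hash; }
bool consistent(const CopiedEntry &e) { return toy_hash(e.address) == e.hash; }

// Returns true if the shared entry went stale and the copied one did not.
bool stale_hash_demo() {
  std::string addr = "sp+8";
  SharedEntry shared = make_shared_entry(&addr);
  CopiedEntry copied = make_copied_entry(&addr);
  assert(consistent(shared) && consistent(copied));

  addr = "r3";  // in-place modification, standing in for legitimization

  return !consistent(shared) && consistent(copied);
}
```

A real hash table would additionally have filed the shared entry in the bucket chosen by the old hash, which is the "incorrect spot" mentioned above.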

2019-07-19  Jakub Jelinek  

PR middle-end/91190
* function.c (insert_temp_slot_address): Store into the hash table
a copy of address to avoid RTL sharing issues.

* gcc.c-torture/compile/pr91190.c: New test.

--- gcc/function.c.jj   2019-07-10 15:52:56.785588326 +0200
+++ gcc/function.c  2019-07-18 09:40:40.172869182 +0200
@@ -704,7 +704,7 @@ static void
 insert_temp_slot_address (rtx address, class temp_slot *temp_slot)
 {
   struct temp_slot_address_entry *t = ggc_alloc ();
-  t->address = address;
+  t->address = copy_rtx (address);
   t->temp_slot = temp_slot;
   t->hash = temp_slot_address_compute_hash (t);
   *temp_slot_address_table->find_slot_with_hash (t, t->hash, INSERT) = t;
--- gcc/testsuite/gcc.c-torture/compile/pr91190.c.jj   2019-07-18 09:52:07.364298430 +0200
+++ gcc/testsuite/gcc.c-torture/compile/pr91190.c   2019-07-18 09:51:17.892058429 +0200
@@ -0,0 +1,31 @@
+/* PR middle-end/91190 */
+
+unsigned a[1], c;
+long d, h;
+int e[2], f, g;
+char i;
+
+int
+main ()
+{
+  char k = 0;
+  int l;
+  while (i || d)
+{
+  if (g)
+   while (1)
+ ;
+  e[1] = 0;
+  long m[2], n = ~(3 & (5 | (h | 9) * 2237420170));
+  g = 90 * n;
+  char b = m[3], j = 0;
+  c = 5 ^ a[c ^ (b & 5)];
+  int o = d;
+  k = o ? : j;
+  if (k)
+   for (l = 0; l < 3; l++)
+ if (m[20]) 
+   __builtin_printf ("%d", f);
+}
+  return 0; 
+}

Jakub

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-19 Thread Richard Sandiford

Jozef Lawrynowicz  writes:
> The attached patch adds a new target macro called
> CASE_INSENSITIVE_REGISTER_NAMES, which allows the case of register names
> used in an asm statement clobber list, or given in a command line option, to be
> disregarded when comparing with the register names defined for the target in
> REGISTER_NAMES. 
>
> The macro is set to 1 for msp430 only, and set to 0 by default, so
> comparisons continue to be case-sensitive for all targets except
> msp430.
>
> Previously, a register name provided by the user using one of the
> aforementioned methods must exactly match those defined in the targets
> REGISTER_NAMES macro.
>
> This means that, for example, for msp430-elf the following code emits an
> ambiguous error:
>
>> void
>> foo (void)
>> {
>>   __asm__ ("" : : : "r4", "R6");
>> }
>
>> asm-register-names.c:8:3: error: unknown register name 'r4' in 'asm'
>
> All the register names defined in the msp430 REGISTER_NAMES macro use an
> upper case 'R', so use of lower case 'r' gets rejected.
>
> Successfully bootstrapped and regtested on trunk for x86_64-pc-linux-gnu, and
> regtested for msp430-elf.
>
> Ok for trunk?
>
> From 82eadcdcbb8914b06818f7c8a10156336518e8d1 Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Wed, 17 Jul 2019 11:48:23 +0100
> Subject: [PATCH] Implement CASE_INSENSITIVE_REGISTER_NAMES
>
> gcc/ChangeLog:
>
> 2019-07-18  Jozef Lawrynowicz  
>
>   PR target/70320
>   * doc/tm.texi.in: Document new macro CASE_INSENSITIVE_REGISTER_NAMES.
>   * doc/tm.texi: Likewise.
>   * defaults.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 0.
>   * config/msp430/msp430.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 1.
>   * varasm.c (decode_reg_name_and_count): Use strcasecmp instead of
>   strcmp for comparisons of asmspec with a register name if 
>   CASE_INSENSITIVE_REGISTER_NAMES is defined to 1.

I really don't think we should be adding new target macros for things
like this.  The code is hardly on the critical path, so I don't think
compile time is a concern.  That said...

> diff --git a/gcc/varasm.c b/gcc/varasm.c
> index e886cdc71b8..ab04bc2c332 100644
> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -947,7 +947,12 @@ decode_reg_name_and_count (const char *asmspec, int 
> *pnregs)
>  
>for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
>   if (reg_names[i][0]
> - && ! strcmp (asmspec, strip_reg_name (reg_names[i])))
> +#if CASE_INSENSITIVE_REGISTER_NAMES
> + && ! strcasecmp (asmspec, strip_reg_name (reg_names[i]))
> +#else
> + && ! strcmp (asmspec, strip_reg_name (reg_names[i]))
> +#endif /* CASE_INSENSITIVE_REGISTER_NAMES */
> + )
> return i;
>  
>  #ifdef OVERLAPPING_REGISTER_NAMES

...if we do keep it as a macro, we should use:

    if (reg_names[i][0]
        && (CASE_INSENSITIVE_REGISTER_NAMES
            ? !strcasecmp (asmspec, strip_reg_name (reg_names[i]))
            : !strcmp (asmspec, strip_reg_name (reg_names[i]))))

So TBH I still prefer the DEFHOOKPOD suggestion.  I won't object if
someone else wants to approve the macro version though.

Thanks,
Richard
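
A self-contained sketch of the conditional-expression form suggested above, for illustration only. The register table, toy_decode_reg_name, and the case_insensitive parameter are hypothetical stand-ins for REGISTER_NAMES, decode_reg_name_and_count, and CASE_INSENSITIVE_REGISTER_NAMES; strip_reg_name is left out entirely. strcasecmp comes from POSIX <strings.h>, so this assumes a POSIX host.

```cpp
#include <cassert>
#include <string.h>   // strcmp
#include <strings.h>  // strcasecmp (POSIX, not ISO C++)

// Hypothetical register table using upper-case names, as msp430 does.
static const char *const toy_reg_names[] = {"R0", "R1", "R2", "R3",
                                            "R4", "R5", "R6", "R7"};
static const int toy_num_regs = sizeof toy_reg_names / sizeof toy_reg_names[0];

// Returns the register number matching asmspec, or -1 if there is none.
// The flag is an ordinary value here, so both comparison calls are always
// compiled, unlike with an #if/#else split.
int toy_decode_reg_name(const char *asmspec, bool case_insensitive) {
  for (int i = 0; i < toy_num_regs; i++)
    if (toy_reg_names[i][0]
        && (case_insensitive
            ? !strcasecmp(asmspec, toy_reg_names[i])
            : !strcmp(asmspec, toy_reg_names[i])))
      return i;
  return -1;
}

// Small self-check: "r4" only matches when case is ignored.
void toy_reg_self_check() {
  assert(toy_decode_reg_name("r4", true) == 4);
  assert(toy_decode_reg_name("r4", false) == -1);
}
```

With the flag false, "r4" is not found while "R6" is, which matches the msp430 behaviour described in the quoted report.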

Re: [PATCH] Fix hash checking ICE in temp slot handling (PR middle-end/91190)

2019-07-19 Thread Richard Biener

On Fri, 19 Jul 2019, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned in the PR, we ICE on the following testcase.
> The problem is that offset_address (in this particular case) does
>   update_temp_slot_address (XEXP (memref, 0), new_rtx);
>   new_rtx = change_address_1 (memref, VOIDmode, new_rtx, 1, false);
> where the first call inserts the address into a hash table and computes a
> hash for it, but then change_address_1 will try to legitimize that address
> and modifies it in place, so now the hash table contains an entry with an
> incorrect hash value (and at incorrect spot in the hash table).
> Next we end up trying to insert the legitimized address into the hash table
> and the new hash table checking rightfully complains.
> 
> The following patch ensures later in-place modifications of the address
> don't affect the hash table.  Bootstrapped/regtested on x86_64-linux and
> i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2019-07-19  Jakub Jelinek  
> 
>   PR middle-end/91190
>   * function.c (insert_temp_slot_address): Store into the hash table
>   a copy of address to avoid RTL sharing issues.
> 
>   * gcc.c-torture/compile/pr91190.c: New test.
> 
> --- gcc/function.c.jj 2019-07-10 15:52:56.785588326 +0200
> +++ gcc/function.c2019-07-18 09:40:40.172869182 +0200
> @@ -704,7 +704,7 @@ static void
>  insert_temp_slot_address (rtx address, class temp_slot *temp_slot)
>  {
>struct temp_slot_address_entry *t = ggc_alloc ();
> -  t->address = address;
> +  t->address = copy_rtx (address);
>t->temp_slot = temp_slot;
>t->hash = temp_slot_address_compute_hash (t);
>*temp_slot_address_table->find_slot_with_hash (t, t->hash, INSERT) = t;
> --- gcc/testsuite/gcc.c-torture/compile/pr91190.c.jj  2019-07-18 
> 09:52:07.364298430 +0200
> +++ gcc/testsuite/gcc.c-torture/compile/pr91190.c 2019-07-18 
> 09:51:17.892058429 +0200
> @@ -0,0 +1,31 @@
> +/* PR middle-end/91190 */
> +
> +unsigned a[1], c;
> +long d, h;
> +int e[2], f, g;
> +char i;
> +
> +int
> +main ()
> +{
> +  char k = 0;
> +  int l;
> +  while (i || d)
> +{
> +  if (g)
> + while (1)
> +   ;
> +  e[1] = 0;
> +  long m[2], n = ~(3 & (5 | (h | 9) * 2237420170));
> +  g = 90 * n;
> +  char b = m[3], j = 0;
> +  c = 5 ^ a[c ^ (b & 5)];
> +  int o = d;
> +  k = o ? : j;
> +  if (k)
> + for (l = 0; l < 3; l++)
> +   if (m[20]) 
> + __builtin_printf ("%d", f);
> +}
> +  return 0; 
> +}
> 
>   Jakub
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

[PATCH] Fix PR91207, revert vectorizer change for PR91178

2019-07-19 Thread Richard Biener



Need to think more about that one.  I'm leaving the testcase in,
it compiles somewhat slowly (6s for -O0 optimized cc1 with checking)
but still reasonable due to the other fix for said PR.

Applied.

Richard.

2019-07-19  Richard Biener  

PR tree-optimization/91207
Revert
2019-07-17  Richard Biener  

PR tree-optimization/91178
* tree-vect-stmts.c (get_group_load_store_type): For SLP
loads with a gap larger than the vector size always use
VMAT_STRIDED_SLP.
(vectorizable_load): For VMAT_STRIDED_SLP with a permutation
avoid loading vectors that are only contained in the gap
and thus are not needed.

* gcc.dg/torture/pr91207.c: New testcase.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 273590)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -2267,14 +2267,6 @@ get_group_load_store_type (stmt_vec_info
/ vect_get_scalar_dr_size (first_dr_info)))
overrun_p = false;
 
- /* If the gap at the end of the group exceeds a whole vector
-in size use the strided SLP code which can skip code-generation
-for the gap.  */
- if (vls_type == VLS_LOAD && known_gt (gap, nunits))
-   *memory_access_type = VMAT_STRIDED_SLP;
- else
-   *memory_access_type = VMAT_CONTIGUOUS;
-
  /* If the gap splits the vector in half and the target
 can do half-vector operations avoid the epilogue peeling
 by simply loading half of the vector only.  Usually
@@ -2282,8 +2274,7 @@ get_group_load_store_type (stmt_vec_info
  dr_alignment_support alignment_support_scheme;
  scalar_mode elmode = SCALAR_TYPE_MODE (TREE_TYPE (vectype));
  machine_mode vmode;
- if (*memory_access_type == VMAT_CONTIGUOUS
- && overrun_p
+ if (overrun_p
  && !masked_p
  && (((alignment_support_scheme
  = vect_supportable_dr_alignment (first_dr_info, false)))
@@ -2306,6 +2297,7 @@ get_group_load_store_type (stmt_vec_info
 "Peeling for outer loop is not supported\n");
  return false;
}
+ *memory_access_type = VMAT_CONTIGUOUS;
}
 }
   else
@@ -8740,7 +8732,6 @@ vectorizable_load (stmt_vec_info stmt_in
   /* Checked by get_load_store_type.  */
   unsigned int const_nunits = nunits.to_constant ();
   unsigned HOST_WIDE_INT cst_offset = 0;
-  unsigned int group_gap = 0;
 
   gcc_assert (!LOOP_VINFO_FULLY_MASKED_P (loop_vinfo));
   gcc_assert (!nested_in_vect_loop);
@@ -8758,7 +8749,6 @@ vectorizable_load (stmt_vec_info stmt_in
   if (slp && grouped_load)
{
  group_size = DR_GROUP_SIZE (first_stmt_info);
- group_gap = DR_GROUP_GAP (first_stmt_info);
  ref_type = get_group_alias_ptr_type (first_stmt_info);
}
   else
@@ -8902,14 +8892,6 @@ vectorizable_load (stmt_vec_info stmt_in
  if (nloads > 1)
vec_alloc (v, nloads);
  stmt_vec_info new_stmt_info = NULL;
- if (slp && slp_perm
- && (group_el % group_size) > group_size - group_gap
- && (group_el % group_size) + nloads * lnel < group_size)
-   {
- dr_chain.quick_push (NULL_TREE);
- group_el += nloads * lnel;
- continue;
-   }
  for (i = 0; i < nloads; i++)
{
  tree this_off = build_int_cst (TREE_TYPE (alias_off),
Index: gcc/testsuite/gcc.dg/torture/pr91207.c
===
--- gcc/testsuite/gcc.dg/torture/pr91207.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr91207.c  (working copy)
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+
+long long a;
+int b[92][32];
+unsigned int c, d;
+
+void e(long long *f, int p2) { *f = p2; }
+
+int main()
+{
+  for (int i = 6; i <= 20; d = i++)
+for (int j = 6; j <= 91; j++) {
+   for (int k = 16; k <= 31;k++)
+ b[j][k] ^= 7;
+   c *= d;
+}
+
+  for (int i = 0; i < 21; ++i)
+for (int j = 0; j < 32; ++j)
+  e(&a, b[i][j]);
+
+  if (a != 7)
+__builtin_abort ();
+  return 0;
+}

Re: [ARM/FDPIC v5 14/21] [ARM][testsuite] FDPIC: Skip unsupported tests

2019-07-19 Thread Kyrill Tkachov


Hi Christophe,

On 5/15/19 1:39 PM, Christophe Lyon wrote:

Several tests cannot work on ARM-FDPIC for various reasons: skip them,
or skip some directives.

gcc.dg/20020312-2.c: Skip since it forces -fno-pic.

gcc.target/arm/:
* Skip since r9 is clobbered by assembly code:
  20051215-1.c
  mmx-1.c
  pr61948.c
  pr77933-1.c
  pr77933-2.c

* Skip since the test forces armv5te which is not supported by FDPIC:
  pr40887.c
  pr19599.c

* Skip since FDPIC disables sibcall to external functions:
  sibcall-1.c
  tail-long-call
  vfp-longcall-apcs

* Skip size check since it's different for FDPIC:
  ivopts-2.c
  ivopts-3.c
  ivopts-4.c
  ivopts-5.c
  pr43597.c
  pr43920-2.c

* Disable assembler scanning invalid for FDPIC:
  pr45701-1.c
  pr45701-2.c
  stack-red-zone.c

* gnu2 TLS dialect is not supported by FDPIC:
  tlscall.c

* Test relies on symbols not generated in FDPIC:
  data-rel-2.c
  data-rel-3.c



Thanks for the summary.

Ok once the rest is approved.

Kyrill



2019-XX-XX  Christophe Lyon  
    Mickaël Guêné 

    gcc/testsuite/
    * gcc.dg/20020312-2.c: Skip on arm*-*-uclinuxfdpiceabi.
    * gcc.target/arm/20051215-1.c: Likewise.
    * gcc.target/arm/mmx-1.c: Likewise.
    * gcc.target/arm/pr19599.c: Likewise.
    * gcc.target/arm/pr40887.c: Likewise.
    * gcc.target/arm/pr61948.c: Likewise.
    * gcc.target/arm/pr77933-1.c: Likewise.
    * gcc.target/arm/pr77933-2.c: Likewise.
    * gcc.target/arm/sibcall-1.c: Likewise.
    * gcc.target/arm/data-rel-2.c: Likewise.
    * gcc.target/arm/data-rel-3.c: Likewise.
    * gcc.target/arm/tail-long-call: Likewise.
    * gcc.target/arm/tlscall.c: Likewise.
    * gcc.target/arm/vfp-longcall-apcs: Likewise.
    * gcc.target/arm/ivopts-2.c: Skip object-size test on
    arm*-*-uclinuxfdpiceabi.
    * gcc.target/arm/ivopts-3.c: Likewise.
    * gcc.target/arm/ivopts-4.c: Likewise.
    * gcc.target/arm/ivopts-5.c: Likewise.
    * gcc.target/arm/pr43597.c: Likewise.
    * gcc.target/arm/pr43920-2.c: Likewise.
    * gcc.target/arm/pr45701-1.c: Skip scan-assembler on
    arm*-*-uclinuxfdpiceabi.
    * gcc.target/arm/pr45701-2.c: Likewise.
    * gcc.target/arm/stack-red-zone.c: Likewise.

Change-Id: Icada7ce52537901fdac10403e7997571b7e2c509

diff --git a/gcc/testsuite/gcc.dg/20020312-2.c b/gcc/testsuite/gcc.dg/20020312-2.c
index c584d35..3df99d9 100644
--- a/gcc/testsuite/gcc.dg/20020312-2.c
+++ b/gcc/testsuite/gcc.dg/20020312-2.c
@@ -9,6 +9,7 @@
 /* { dg-options "-O -fno-pic" } */
 /* { dg-additional-options "-no-pie" { target pie_enabled } } */
 /* { dg-require-effective-target nonlocal_goto } */
+/* { dg-skip-if "" { arm*-*-uclinuxfdpiceabi } "*" "" } */

 extern void abort (void);

diff --git a/gcc/testsuite/gcc.target/arm/20051215-1.c b/gcc/testsuite/gcc.target/arm/20051215-1.c
index 0519dc7..cc07693 100644
--- a/gcc/testsuite/gcc.target/arm/20051215-1.c
+++ b/gcc/testsuite/gcc.target/arm/20051215-1.c
@@ -3,6 +3,7 @@
    the call would need an output reload.  */
 /* { dg-do run } */
 /* { dg-options "-O2 -fno-omit-frame-pointer" } */
+/* { dg-skip-if "r9 is reserved in FDPIC" { arm*-*-uclinuxfdpiceabi } "*" "" } */

 extern void abort (void);
 typedef void (*callback) (void);

diff --git a/gcc/testsuite/gcc.target/arm/data-rel-2.c b/gcc/testsuite/gcc.target/arm/data-rel-2.c
index 6ba47d6..7d37a8c 100644
--- a/gcc/testsuite/gcc.target/arm/data-rel-2.c
+++ b/gcc/testsuite/gcc.target/arm/data-rel-2.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "Not supported in FDPIC" { arm*-*-uclinuxfdpiceabi } "*" "" } */
 /* { dg-options "-fPIC -mno-pic-data-is-text-relative -mno-single-pic-base" } */
 /* { dg-final { scan-assembler-not "j-\\(.LPIC"  } } */
 /* { dg-final { scan-assembler "_GLOBAL_OFFSET_TABLE_-\\(.LPIC" } } */
diff --git a/gcc/testsuite/gcc.target/arm/data-rel-3.c b/gcc/testsuite/gcc.target/arm/data-rel-3.c
index 2ce1e66..534c6c4 100644
--- a/gcc/testsuite/gcc.target/arm/data-rel-3.c
+++ b/gcc/testsuite/gcc.target/arm/data-rel-3.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "Not supported in FDPIC" { arm*-*-uclinuxfdpiceabi } "*" "" } */
 /* { dg-options "-fPIC -mpic-data-is-text-relative" } */
 /* { dg-final { scan-assembler "j-\\(.LPIC"  } } */
 /* { dg-final { scan-assembler-not "_GLOBAL_OFFSET_TABLE_-\\(.LPIC" } } */
diff --git a/gcc/testsuite/gcc.target/arm/ivopts-2.c b/gcc/testsuite/gcc.target/arm/ivopts-2.c
index afe91aa..f1d5edb 100644
--- a/gcc/testsuite/gcc.target/arm/ivopts-2.c
+++ b/gcc/testsuite/gcc.target/arm/ivopts-2.c
@@ -14,4 +14,4 @@ tr4 (short array[], int n)

 /* { dg-final { scan-tree-dump-times "PHI +/* { dg-final { object-size text <= 26 { target { arm_thumb2 && { ! 
arm*-*-uclinuxfdpiceabi } } } } } */
diff --git a/gcc/testsuite/gcc.target/arm/ivopts-3.c b/gcc/testsuite/gcc.target/arm/ivopts-3.c
index faea996..357350c 100644
--- a/gcc/testsuite/gcc.target/arm/ivopts-3.c
+++ b/gcc/testsuite/gcc.target/arm/ivopts-3.c
@@ -16,4 +16,4 @@ tr3 (

Re: [ARM/FDPIC v5 15/21] [ARM][testsuite] FDPIC: Adjust scan-assembler patterns.

2019-07-19 Thread Kyrill Tkachov




On 5/15/19 1:39 PM, Christophe Lyon wrote:

In FDPIC mode, r9 is saved in addition to other registers, so update
the expected patterns accordingly.


Ok.

Thanks,

Kyrill



2019-XX-XX  Christophe Lyon  
    Mickaël Guêné 

    * gcc/testsuite/
    * gcc.target/arm/interrupt-1.c: Add scan-assembler pattern for
    arm*-*-uclinuxfdpiceabi.
    * gcc.target/arm/interrupt-2.c: Likewise.
    * gcc.target/arm/pr70830.c: Likewise.

Change-Id: Id946b79bacc32be585c31e60a355191f104cc29e

diff --git a/gcc/testsuite/gcc.target/arm/interrupt-1.c b/gcc/testsuite/gcc.target/arm/interrupt-1.c
index fe94877..493763d 100644
--- a/gcc/testsuite/gcc.target/arm/interrupt-1.c
+++ b/gcc/testsuite/gcc.target/arm/interrupt-1.c
@@ -13,5 +13,7 @@ void foo ()
   bar (0);
 }

-/* { dg-final { scan-assembler "push\t{r0, r1, r2, r3, r4, fp, ip, lr}" } } */
-/* { dg-final { scan-assembler "ldmfd\tsp!, {r0, r1, r2, r3, r4, fp, ip, pc}\\^" } } */
+/* { dg-final { scan-assembler "push\t{r0, r1, r2, r3, r4, fp, ip, lr}" { target { ! arm*-*-uclinuxfdpiceabi } } } } */
+/* { dg-final { scan-assembler "ldmfd\tsp!, {r0, r1, r2, r3, r4, fp, ip, pc}\\^" { target { ! arm*-*-uclinuxfdpiceabi } } } } */
+/* { dg-final { scan-assembler "push\t{r0, r1, r2, r3, r4, r5, r9, fp, ip, lr}" { target arm*-*-uclinuxfdpiceabi } } } */
+/* { dg-final { scan-assembler "ldmfd\tsp!, {r0, r1, r2, r3, r4, r5, r9, fp, ip, pc}\\^" { target arm*-*-uclinuxfdpiceabi } } } */
diff --git a/gcc/testsuite/gcc.target/arm/interrupt-2.c b/gcc/testsuite/gcc.target/arm/interrupt-2.c
index 289eca0..5be1f16 100644
--- a/gcc/testsuite/gcc.target/arm/interrupt-2.c
+++ b/gcc/testsuite/gcc.target/arm/interrupt-2.c
@@ -15,5 +15,7 @@ void test()
   foo = 0;
 }

-/* { dg-final { scan-assembler "push\t{r0, r1, r2, r3, r4, r5, ip, lr}" } } */
-/* { dg-final { scan-assembler "ldmfd\tsp!, {r0, r1, r2, r3, r4, r5, ip, pc}\\^" } } */
+/* { dg-final { scan-assembler "push\t{r0, r1, r2, r3, r4, r5, ip, lr}" { target { ! arm*-*-uclinuxfdpiceabi } } } } */
+/* { dg-final { scan-assembler "ldmfd\tsp!, {r0, r1, r2, r3, r4, r5, ip, pc}\\^" { target { ! arm*-*-uclinuxfdpiceabi } } } } */
+/* { dg-final { scan-assembler "push\t{r0, r1, r2, r3, r4, r5, r6, r9, ip, lr}" { target arm*-*-uclinuxfdpiceabi } } } */
+/* { dg-final { scan-assembler "ldmfd\tsp!, {r0, r1, r2, r3, r4, r5, r6, r9, ip, pc}\\^" { target arm*-*-uclinuxfdpiceabi } } } */
diff --git a/gcc/testsuite/gcc.target/arm/pr70830.c b/gcc/testsuite/gcc.target/arm/pr70830.c
index cad903b..cd84c42 100644
--- a/gcc/testsuite/gcc.target/arm/pr70830.c
+++ b/gcc/testsuite/gcc.target/arm/pr70830.c
@@ -11,4 +11,5 @@ void __attribute__ ((interrupt ("IRQ"))) dm3730_IRQHandler(void)
 {
 prints("IRQ" );
 }
-/* { dg-final { scan-assembler "ldmfd\tsp!, {r0, r1, r2, r3, ip, pc}\\^" } } */
+/* { dg-final { scan-assembler "ldmfd\tsp!, {r0, r1, r2, r3, ip, pc}\\^" { target { ! arm*-*-uclinuxfdpiceabi } } } } */
+/* { dg-final { scan-assembler "ldmfd\tsp!, {r0, r1, r2, r3, r4, r9, ip, pc}\\^" { target arm*-*-uclinuxfdpiceabi } } } */

--
2.6.3

Re: [ARM/FDPIC v5 16/21] [ARM][testsuite] FDPIC: Skip tests that don't work in PIC mode

2019-07-19 Thread Kyrill Tkachov




On 5/15/19 1:39 PM, Christophe Lyon wrote:

Some tests fail on arm*-*-uclinuxfdpiceabi because it generates PIC
code and they don't support it: skip them. They also fail on
arm*-linux* when forcing -fPIC.

2019-XX-XX  Christophe Lyon  

    gcc/testsuite/
    * gcc.target/arm/eliminate.c: Accept only nonpic targets.
    * g++.dg/other/anon5.C: Likewise.


Ok.

I think it's worth committing this now independently of the others.

Thanks,

Kyrill



Change-Id: I8efb8d356ce25b020c44a84b07f79a996dca0358

diff --git a/gcc/testsuite/g++.dg/other/anon5.C b/gcc/testsuite/g++.dg/other/anon5.C
index ee4601e..dadd92e 100644
--- a/gcc/testsuite/g++.dg/other/anon5.C
+++ b/gcc/testsuite/g++.dg/other/anon5.C
@@ -1,5 +1,6 @@
 // PR c++/34094
 // { dg-do link { target { ! { *-*-darwin* *-*-hpux* *-*-solaris2.* } } } }
+// { dg-require-effective-target nonpic }
 // { dg-options "-gdwarf-2" }
 // Ignore additional message on powerpc-ibm-aix
 // { dg-prune-output "obtain more information" } */
diff --git a/gcc/testsuite/gcc.target/arm/eliminate.c b/gcc/testsuite/gcc.target/arm/eliminate.c
index f254dd8..299d4df 100644
--- a/gcc/testsuite/gcc.target/arm/eliminate.c
+++ b/gcc/testsuite/gcc.target/arm/eliminate.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { nonpic } } } */
 /* { dg-options "-O2" }  */

 struct X
--
2.6.3

Re: [ARM/FDPIC v5 17/21] [ARM][testsuite] FDPIC: Handle --uclinux*

2019-07-19 Thread Kyrill Tkachov




On 5/15/19 1:39 PM, Christophe Lyon wrote:

Add *-*-uclinux* to tests that work on this target.

2019-XX-XX  Christophe Lyon  

    gcc/testsuite/
    * g++.dg/abi/forced.C: Add *-*-uclinux*.
    * g++.dg/abi/guard2.C: Likewise.
    * g++.dg/ext/cleanup-10.C: Likewise.
    * g++.dg/ext/cleanup-11.C: Likewise.
    * g++.dg/ext/cleanup-8.C: Likewise.
    * g++.dg/ext/cleanup-9.C: Likewise.
    * g++.dg/ext/sync-4.C: Likewise.
    * g++.dg/ipa/comdat.C: Likewise.
    * gcc.dg/20041106-1.c: Likewise.
    * gcc.dg/cleanup-10.c: Likewise.
    * gcc.dg/cleanup-11.c: Likewise.
    * gcc.dg/cleanup-8.c: Likewise.
    * gcc.dg/cleanup-9.c: Likewise.
    * gcc.dg/fdata-sections-1.c: Likewise.
    * gcc.dg/fdata-sections-2.c: Likewise.
    * gcc.dg/pr39323-1.c: Likewise.
    * gcc.dg/pr39323-2.c: Likewise.
    * gcc.dg/pr39323-3.c: Likewise.
    * gcc.dg/pr65780-1.c: Likewise.
    * gcc.dg/pr65780-2.c: Likewise.
    * gcc.dg/pr67338.c: Likewise.
    * gcc.dg/pr78185.c: Likewise.
    * gcc.dg/pr83100-1.c: Likewise.
    * gcc.dg/pr83100-4.c: Likewise.
    * gcc.dg/strlenopt-12g.c: Likewise.
    * gcc.dg/strlenopt-14g.c: Likewise.
    * gcc.dg/strlenopt-14gf.c: Likewise.
    * gcc.dg/strlenopt-16g.c: Likewise.
    * gcc.dg/strlenopt-17g.c: Likewise.
    * gcc.dg/strlenopt-18g.c: Likewise.
    * gcc.dg/strlenopt-1f.c: Likewise.
    * gcc.dg/strlenopt-22g.c: Likewise.
    * gcc.dg/strlenopt-2f.c: Likewise.
    * gcc.dg/strlenopt-31g.c: Likewise.
    * gcc.dg/strlenopt-33g.c: Likewise.
    * gcc.dg/strlenopt-4g.c: Likewise.
    * gcc.dg/strlenopt-4gf.c: Likewise.
    * gcc.dg/strncmp-2.c: Likewise.
    * gcc.dg/struct-ret-3.c: Likewise.
    * gcc.dg/torture/pr69760.c: Likewise.
    * gcc.target/arm/div64-unwinding.c: Likewise.
    * gcc.target/arm/stack-checking.c: Likewise.
    * gcc.target/arm/synchronize.c: Likewise.
    * gcc.target/arm/pr66912.c: Add arm*-*-uclinuxfdpiceabi.
    * lib/target-supports.exp (check_effective_target_pie): Likewise.
    (check_effective_target_sync_long_long_runtime): Likewise.
    (check_effective_target_sync_int_long): Likewise.
    (check_effective_target_sync_char_short): Likewise.


I think these are ok, but you're changing many generic test targets.

Are the testsuite maintainers ok with this change?

Thanks,

Kyrill


Change-Id: I89bfea79d4490c5df0b6470def5a31d7f31ac2cc

diff --git a/gcc/testsuite/g++.dg/abi/forced.C b/gcc/testsuite/g++.dg/abi/forced.C
index 0e6be28..2d1ec53 100644
--- a/gcc/testsuite/g++.dg/abi/forced.C
+++ b/gcc/testsuite/g++.dg/abi/forced.C
@@ -1,4 +1,4 @@
-// { dg-do run { target *-*-linux* *-*-gnu* } }
+// { dg-do run { target *-*-linux* *-*-gnu* *-*-uclinux* } }
 // { dg-options "-pthread" }

 #include 
diff --git a/gcc/testsuite/g++.dg/abi/guard2.C b/gcc/testsuite/g++.dg/abi/guard2.C
index c35fa7e..74139a8 100644
--- a/gcc/testsuite/g++.dg/abi/guard2.C
+++ b/gcc/testsuite/g++.dg/abi/guard2.C
@@ -1,6 +1,6 @@
 // PR c++/41611
 // Test that the guard gets its own COMDAT group.
-// { dg-final { scan-assembler "_ZGVZN1A1fEvE1i,comdat" { target *-*-linux* *-*-gnu* } } }
+// { dg-final { scan-assembler "_ZGVZN1A1fEvE1i,comdat" { target *-*-linux* *-*-gnu* *-*-uclinux* } } }

 struct A {
   static int f()
diff --git a/gcc/testsuite/g++.dg/ext/cleanup-10.C b/gcc/testsuite/g++.dg/ext/cleanup-10.C
index 66c7b76..56aeb66 100644
--- a/gcc/testsuite/g++.dg/ext/cleanup-10.C
+++ b/gcc/testsuite/g++.dg/ext/cleanup-10.C
@@ -1,4 +1,4 @@
-/* { dg-do run { target hppa*-*-hpux* *-*-linux* *-*-gnu* powerpc*-*-darwin* *-*-darwin[912]* } } */
+/* { dg-do run { target hppa*-*-hpux* *-*-linux* *-*-gnu* powerpc*-*-darwin* *-*-darwin[912]* *-*-uclinux* } } */
 /* { dg-options "-fexceptions -fnon-call-exceptions -O2" } */
 /* Verify that cleanups work with exception handling through signal frames
    on alternate stack.  */
diff --git a/gcc/testsuite/g++.dg/ext/cleanup-11.C b/gcc/testsuite/g++.dg/ext/cleanup-11.C
index 6e96521..c6d3560 100644
--- a/gcc/testsuite/g++.dg/ext/cleanup-11.C
+++ b/gcc/testsuite/g++.dg/ext/cleanup-11.C
@@ -1,4 +1,4 @@
-/* { dg-do run { target hppa*-*-hpux* *-*-linux* *-*-gnu* powerpc*-*-darwin* *-*-darwin[912]* } } */
+/* { dg-do run { target hppa*-*-hpux* *-*-linux* *-*-gnu* powerpc*-*-darwin* *-*-darwin[912]* *-*-uclinux* } } */
 /* { dg-options "-fexceptions -fnon-call-exceptions -O2" } */
 /* Verify that cleanups work with exception handling through realtime signal
    frames on alternate stack.  */
diff --git a/gcc/testsuite/g++.dg/ext/cleanup-8.C b/gcc/testsuite/g++.dg/ext/cleanup-8.C
index ccf9bef..e99508d 100644
--- a/gcc/testsuite/g++.dg/ext/cleanup-8.C
+++ b/gcc/testsuite/g++.dg/ext/cleanup-8.C
@@ -1,4 +1,4 @@
-/* { dg-do run { target hppa*-*-hpux* *-*-linux* *-*-gnu* powerpc*-*-darwin* *-*-darwin[912]* } } */
+/* { dg-do run { target

Re: [ARM/FDPIC v5 18/21] [ARM][testsuite] FDPIC: Enable tests on pie_enabled targets

2019-07-19 Thread Kyrill Tkachov




On 5/15/19 1:39 PM, Christophe Lyon wrote:

Some tests have the "nonpic" guard, but pass on
arm*-*-uclinuxfdpiceabi because it is in PIE mode by default. Rather
than adding this target to all these tests, add the "pie_enabled"
effective target.

2019-XX-XX  Christophe Lyon  

    gcc/testsuite/
    * g++.dg/cpp0x/noexcept03.C: Add pie_enabled.
    * g++.dg/ipa/devirt-c-7.C: Likewise.
    * g++.dg/ipa/ivinline-1.C: Likewise.
    * g++.dg/ipa/ivinline-2.C: Likewise.
    * g++.dg/ipa/ivinline-3.C: Likewise.
    * g++.dg/ipa/ivinline-4.C: Likewise.
    * g++.dg/ipa/ivinline-5.C: Likewise.
    * g++.dg/ipa/ivinline-7.C: Likewise.
    * g++.dg/ipa/ivinline-8.C: Likewise.
    * g++.dg/ipa/ivinline-9.C: Likewise.
    * g++.dg/tls/pr79288.C: Likewise.
    * gcc.dg/addr_equal-1.c: Likewise.
    * gcc.dg/const-1.c: Likewise.
    * gcc.dg/ipa/pure-const-1.c: Likewise.
    * gcc.dg/noreturn-8.c: Likewise.
    * gcc.dg/pr33826.c: Likewise.
    * gcc.dg/torture/ipa-pta-1.c: Likewise.
    * gcc.dg/tree-ssa/alias-2.c: Likewise.
    * gcc.dg/tree-ssa/ipa-split-5.c: Likewise.
    * gcc.dg/tree-ssa/loadpre6.c: Likewise.
    * gcc.dg/uninit-19.c: Likewise.


Looks sensible, but this is not an arm-specific patch.

CC'ing testsuite maintainers.

Thanks,

Kyrill



Change-Id: I1a0d836b892c23891f739fccdc467d0f354ab82c

diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept03.C b/gcc/testsuite/g++.dg/cpp0x/noexcept03.C
index 2d37867..906a44d 100644
--- a/gcc/testsuite/g++.dg/cpp0x/noexcept03.C
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept03.C
@@ -1,6 +1,6 @@
 // Runtime test for noexcept-specification.
 // { dg-options "-Wnoexcept" }
-// { dg-do run { target nonpic } }
+// { dg-do run { target { nonpic || pie_enabled } } }
 // { dg-require-effective-target c++11 }

 #include 
diff --git a/gcc/testsuite/g++.dg/ipa/devirt-c-7.C b/gcc/testsuite/g++.dg/ipa/devirt-c-7.C
index 2e76cbe..efb65c2 100644
--- a/gcc/testsuite/g++.dg/ipa/devirt-c-7.C
+++ b/gcc/testsuite/g++.dg/ipa/devirt-c-7.C
@@ -1,7 +1,6 @@
 /* Verify that ipa-cp will not get confused by placement new constructing an
    object within another one when looking for dynamic type change .  */
-/* { dg-do run } */
-/* { dg-require-effective-target nonpic } */
+/* { dg-do run { target { nonpic || pie_enabled } } } */
 /* { dg-options "-O3 -Wno-attributes"  } */

 extern "C" void abort (void);
diff --git a/gcc/testsuite/g++.dg/ipa/ivinline-1.C b/gcc/testsuite/g++.dg/ipa/ivinline-1.C
index 9b10d20..2d988bc 100644
--- a/gcc/testsuite/g++.dg/ipa/ivinline-1.C
+++ b/gcc/testsuite/g++.dg/ipa/ivinline-1.C
@@ -1,6 +1,6 @@
 /* Verify that simple virtual calls are inlined even without early
    inlining.  */
-/* { dg-do run { target nonpic } } */
+/* { dg-do run { target { nonpic || pie_enabled } } } */
 /* { dg-options "-O3 -fdump-ipa-inline -fno-early-inlining -fno-ipa-cp"  } */

 extern "C" void abort (void);
diff --git a/gcc/testsuite/g++.dg/ipa/ivinline-2.C b/gcc/testsuite/g++.dg/ipa/ivinline-2.C
index 21cd46f..d978638 100644
--- a/gcc/testsuite/g++.dg/ipa/ivinline-2.C
+++ b/gcc/testsuite/g++.dg/ipa/ivinline-2.C
@@ -1,6 +1,6 @@
 /* Verify that simple virtual calls using this pointer are inlined
    even without early inlining..  */
-/* { dg-do run { target nonpic } } */
+/* { dg-do run { target { nonpic || pie_enabled } } } */
 /* { dg-options "-O3 -fdump-ipa-inline -fno-early-inlining -fno-ipa-cp"  } */

 extern "C" void abort (void);
diff --git a/gcc/testsuite/g++.dg/ipa/ivinline-3.C b/gcc/testsuite/g++.dg/ipa/ivinline-3.C
index 1e24644..f756a16 100644
--- a/gcc/testsuite/g++.dg/ipa/ivinline-3.C
+++ b/gcc/testsuite/g++.dg/ipa/ivinline-3.C
@@ -1,6 +1,6 @@
 /* Verify that simple virtual calls on an object refrence are inlined
    even without early inlining.  */
-/* { dg-do run { target nonpic } } */
+/* { dg-do run { target { nonpic || pie_enabled } } } */
 /* { dg-options "-O3 -fdump-ipa-inline -fno-early-inlining -fno-ipa-cp"  } */

 extern "C" void abort (void);
diff --git a/gcc/testsuite/g++.dg/ipa/ivinline-4.C b/gcc/testsuite/g++.dg/ipa/ivinline-4.C
index cf0d980..5fbd3ef 100644
--- a/gcc/testsuite/g++.dg/ipa/ivinline-4.C
+++ b/gcc/testsuite/g++.dg/ipa/ivinline-4.C
@@ -1,7 +1,7 @@
 /* Verify that simple virtual calls are inlined even without early
    inlining, even when a typecast to an ancestor is involved along the
    way.  */
-/* { dg-do run { target nonpic } } */
+/* { dg-do run { target { nonpic || pie_enabled } } } */
 /* { dg-options "-O3 -fdump-ipa-inline -fno-early-inlining -fno-ipa-cp"  } */

 extern "C" void abort (void);
diff --git a/gcc/testsuite/g++.dg/ipa/ivinline-5.C b/gcc/testsuite/g++.dg/ipa/ivinline-5.C
index f15ebf2..6c19907 100644
--- a/gcc/testsuite/g++.dg/ipa/ivinline-5.C
+++ b/gcc/testsuite/g++.dg/ipa/ivinline-5.C
@@ -1,6 +1,6 @@
 /* Verify that virtual call inlining does not pick a wrong method when
    there is a user defined ancestor in an object.  *

Re: [ARM/FDPIC v5 19/21] [ARM][testsuite] FDPIC: Adjust pr43698.c to avoid clash with uclibc.

2019-07-19 Thread Kyrill Tkachov




On 5/15/19 1:39 PM, Christophe Lyon wrote:

uclibc defines bswap_32, so use a different name in this test.

2019-XX-XX  Christophe Lyon  

    gcc/testsuite/
    * gcc.target/arm/pr43698.c (bswap_32): Rename as my_bswap_32.
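
For illustration only, a minimal sketch of the rename, assuming the usual 32-bit byte-swap definition. It is not the test file itself: the wrapper function swap_u32 is hypothetical, and the unsigned suffixes on the masks are an addition made here. The point is that a test-local name cannot collide with a bswap_32 macro that the C library headers may already provide.

```c
#include <stdint.h>

/* Test-local byte-swap macro.  The my_ prefix keeps it from clashing
   with a bswap_32 that the C library's headers may already define.  */
#define my_bswap_32(x) \
  ((((x) & 0xff000000u) >> 24) | \
   (((x) & 0x00ff0000u) >>  8) | \
   (((x) & 0x0000ff00u) <<  8) | \
   (((x) & 0x000000ffu) << 24))

/* Hypothetical wrapper, not part of the patch: gives the macro a typed
   entry point so it can be called like a function.  */
static uint32_t
swap_u32 (uint32_t v)
{
  return my_bswap_32 (v);
}
```

Swapping twice should give back the original value, which is a quick sanity check for any such macro.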


Ok.

This can go in independently of the others.

Thanks,

Kyrill




Change-Id: I2591bd911030814331cabf97ee5cf6cf8124b4f3

diff --git a/gcc/testsuite/gcc.target/arm/pr43698.c b/gcc/testsuite/gcc.target/arm/pr43698.c
index 1fc497c..3b5dad0 100644
--- a/gcc/testsuite/gcc.target/arm/pr43698.c
+++ b/gcc/testsuite/gcc.target/arm/pr43698.c
@@ -6,7 +6,7 @@

 char do_reverse_endian = 0;

-#  define bswap_32(x) \
+#  define my_bswap_32(x) \
   ((((x) & 0xff000000) >> 24) | \
    (((x) & 0x00ff0000) >>  8) | \
    (((x) & 0x0000ff00) <<  8) | \
@@ -16,7 +16,7 @@ char do_reverse_endian = 0;
   (__extension__ ({ \
   uint64_t __res; \
   if (!do_reverse_endian) {    __res = (X); \
-  } else if (sizeof(X) == 4) { __res = bswap_32((X)); \
+  } else if (sizeof(X) == 4) { __res = my_bswap_32((X)); \
   } \
   __res; \
 }))
--
2.6.3

Re: [ARM/FDPIC v5 20/21] [ARM][testsuite] FDPIC: Skip tests using architectures unsupported by FDPIC

2019-07-19 Thread Kyrill Tkachov




On 5/15/19 1:39 PM, Christophe Lyon wrote:

Since FDPIC currently supports arm and thumb-2 modes only, these tests
fail because they enforce an architecture version that doesn't match
these restrictions.

This patch introduces new values for the arm_arch effective-target
(v4t_thumb, v5t_thumb, v5te_thumb, v6_thumb, v6k_thumb, v6z_thumb) as
needed, and adds them to the relevant tests.  It also adds the
corresponding non-thumb effective-target to the tests that were
missing it.

2019-XX-XX  Christophe Lyon  

    * lib/target-supports.exp
    (check_effective_target_arm_arch_FUNC_ok): Add v4t_thumb,
    v5t_thumb, v5te_thumb, v6_thumb, v6k_thumb, v6z_thumb.
    * gcc.target/arm/armv6-unaligned-load-ice.c: Add arm_arch
    effective-target.
    * gcc.target/arm/attr-unaligned-load-ice.c: Likewise.
    * gcc.target/arm/attr_arm-err.c: Likewise.
    * gcc.target/arm/ftest-armv4-arm.c: Likewise.
    * gcc.target/arm/ftest-armv4t-arm.c: Likewise.
    * gcc.target/arm/ftest-armv4t-thumb.c: Likewise.
    * gcc.target/arm/ftest-armv5t-arm.c: Likewise.
    * gcc.target/arm/ftest-armv5t-thumb.c: Likewise.
    * gcc.target/arm/ftest-armv5te-arm.c: Likewise.
    * gcc.target/arm/ftest-armv5te-thumb.c: Likewise.
    * gcc.target/arm/ftest-armv6-arm.c: Likewise.
    * gcc.target/arm/ftest-armv6-thumb.c: Likewise.
    * gcc.target/arm/ftest-armv6k-arm.c: Likewise.
    * gcc.target/arm/ftest-armv6k-thumb.c: Likewise.
    * gcc.target/arm/ftest-armv6m-thumb.c: Likewise.
    * gcc.target/arm/ftest-armv6t2-arm.c: Likewise.
    * gcc.target/arm/ftest-armv6t2-thumb.c: Likewise.
    * gcc.target/arm/ftest-armv6z-arm.c: Likewise.
    * gcc.target/arm/ftest-armv6z-thumb.c: Likewise.
    * gcc.target/arm/g2.c: Likewise.
    * gcc.target/arm/macro_defs1.c: Likewise.
    * gcc.target/arm/pr59858.c: Likewise.
    * gcc.target/arm/pr65647-2.c: Likewise.
    * gcc.target/arm/pr79058.c: Likewise.
    * gcc.target/arm/pr83712.c: Likewise.
    * gcc.target/arm/pragma_arch_switch_2.c: Likewise.
    * gcc.target/arm/scd42-1.c: Likewise.
    * gcc.target/arm/scd42-2.c: Likewise.
    * gcc.target/arm/scd42-3.c: Likewise.
    * gcc.c-torture/compile/pr82096.c: Fix arm_arch effective-target.


Ok.

This looks like a good improvement on its own.

Thanks,

Kyrill




Change-Id: I0845b262b241026561cc52a19ff8bb1659675e49

diff --git a/gcc/testsuite/gcc.c-torture/compile/pr82096.c b/gcc/testsuite/gcc.c-torture/compile/pr82096.c
index d144b70..4e695cd 100644
--- a/gcc/testsuite/gcc.c-torture/compile/pr82096.c
+++ b/gcc/testsuite/gcc.c-torture/compile/pr82096.c
@@ -1,4 +1,4 @@
-/* { dg-require-effective-target arm_arch_v5t_ok { target arm*-*-* } } */
+/* { dg-require-effective-target arm_arch_v5t_thumb_ok { target arm*-*-* } } */
 /* { dg-skip-if "Do not combine float-abi values" { arm*-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
 /* { dg-additional-options "-march=armv5t -mthumb -mfloat-abi=soft" { target arm*-*-* } } */

diff --git a/gcc/testsuite/gcc.target/arm/armv6-unaligned-load-ice.c b/gcc/testsuite/gcc.target/arm/armv6-unaligned-load-ice.c
index 88528f1..886a012 100644
--- a/gcc/testsuite/gcc.target/arm/armv6-unaligned-load-ice.c
+++ b/gcc/testsuite/gcc.target/arm/armv6-unaligned-load-ice.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv6k" } } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" } { "" } } */
+/* { dg-require-effective-target arm_arch_v6k_thumb_ok } */
 /* { dg-options "-mthumb -Os -mfloat-abi=softfp" } */
 /* { dg-add-options arm_arch_v6k } */

diff --git a/gcc/testsuite/gcc.target/arm/attr-unaligned-load-ice.c b/gcc/testsuite/gcc.target/arm/attr-unaligned-load-ice.c
index e1ed1c1..2eeb522 100644
--- a/gcc/testsuite/gcc.target/arm/attr-unaligned-load-ice.c
+++ b/gcc/testsuite/gcc.target/arm/attr-unaligned-load-ice.c
@@ -2,6 +2,7 @@
    Verify that unaligned_access is correctly with attribute target.  */
 /* { dg-do compile } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv6" } } */
+/* { dg-require-effective-target arm_arch_v6_ok } */
 /* { dg-options "-Os -mfloat-abi=softfp -mtp=soft" } */
 /* { dg-add-options arm_arch_v6 } */

diff --git a/gcc/testsuite/gcc.target/arm/attr_arm-err.c b/gcc/testsuite/gcc.target/arm/attr_arm-err.c
index 630c06a..d410056 100644
--- a/gcc/testsuite/gcc.target/arm/attr_arm-err.c
+++ b/gcc/testsuite/gcc.target/arm/attr_arm-err.c
@@ -2,6 +2,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arm_ok } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv6-m" } } */
+/* { dg-require-effective-target arm_arch_v6m_ok } */
 /* { dg-add-options arm_arch_v6m } */

 int __attribute__((target("arm")))
diff --git a/gcc/testsuite/g

Re: [ARM/FDPIC v5 21/21] [ARM] FDPIC: Handle stack-protector combined patterns

2019-07-19 Thread Kyrill Tkachov




On 5/15/19 1:39 PM, Christophe Lyon wrote:

The recent stack_protect_combined_set_insn and
stack_protect_combined_test_insn force recomputing of GOT base, but
need to take into account that in FDPIC mode, the PIC register is
fixed by the ABI (r9).

2019-XX-XX  Christophe Lyon  

    * config/arm/arm.md (stack_protect_combined_set_insn): Handle
    FDPIC mode.
    (stack_protect_combined_test_insn): Likewise.

Change-Id: Ib243fab0791fc883ca7b1c1205af1e0893f3e8c5

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 0edcb1d..5a4dd00 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -8869,8 +8869,19 @@
 {
   if (flag_pic)
 {
+  rtx pic_reg;
+
+  if (TARGET_FDPIC)
+    {
+ pic_reg = gen_rtx_REG (Pmode, FDPIC_REGNUM);
+   }
+  else
+    {
+ pic_reg = operands[3];
+   }


Redundant braces here...


+
   /* Forces recomputing of GOT base now.  */
-  legitimize_pic_address (operands[1], SImode, operands[2], operands[3],
+  legitimize_pic_address (operands[1], SImode, operands[2], pic_reg,
   true /*compute_now*/);
 }
   else
@@ -8943,8 +8954,19 @@

   if (flag_pic)
 {
+  rtx pic_reg;
+
+  if (TARGET_FDPIC)
+    {
+ pic_reg = gen_rtx_REG (Pmode, FDPIC_REGNUM);
+    }
+  else
+    {
+ pic_reg = operands[4];
+   }
+


... and here.

Ok with that fixed.

Thanks,

Kyrill




   /* Forces recomputing of GOT base now.  */
-  legitimize_pic_address (operands[1], SImode, operands[3], operands[4],
+  legitimize_pic_address (operands[1], SImode, operands[3], pic_reg,
   true /*compute_now*/);
 }
   else
--
2.6.3

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-19 Thread Jozef Lawrynowicz

On Fri, 19 Jul 2019 09:31:15 +0100
Richard Sandiford  wrote:

> Jozef Lawrynowicz  writes:
> >
> > From 82eadcdcbb8914b06818f7c8a10156336518e8d1 Mon Sep 17 00:00:00 2001
> > From: Jozef Lawrynowicz 
> > Date: Wed, 17 Jul 2019 11:48:23 +0100
> > Subject: [PATCH] Implement CASE_INSENSITIVE_REGISTER_NAMES
> >
> > gcc/ChangeLog:
> >
> > 2019-07-18  Jozef Lawrynowicz  
> >
> > PR target/70320
> > * doc/tm.texi.in: Document new macro CASE_INSENSITIVE_REGISTER_NAMES.
> > * doc/tm.texi: Likewise.
> > * defaults.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 0.
> > * config/msp430/msp430.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 1.
> > * varasm.c (decode_reg_name_and_count): Use strcasecmp instead of
> > strcmp for comparisons of asmspec with a register name if 
> > CASE_INSENSITIVE_REGISTER_NAMES is defined to 1.  
> 
> I really don't think we should be adding new target macros for things
> like this.  The code is hardly on the critical path, so I don't think
> compile time is a concern.  That said...
> 
> ...
> 
> So TBH I still prefer the DEFHOOKPOD suggestion.  I won't object if
> someone else wants to approve the macro version though.

Ok, as you say, this code isn't on the critical path so I'd be happy to change
this to a DEFHOOKPOD.

In general, what should be considered when deciding between a hook and macro?
Does the choice lean towards macros mainly when compile time is a concern? And
hooks otherwise?

Is the downside of this macro implementation compared to a DEFHOOKPOD mainly
just the maintainability/readability of the added code?

Since if I take your welcome recommendation for how to improve the macro
implementation -

> ...if we do keep it as a macro, we should use:
> 
>   if (reg_names[i][0]
>   && (CASE_INSENSITIVE_REGISTER_NAMES
>   ? !strcasecmp (asmspec, strip_reg_name (reg_names[i]))
>   : !strcmp (asmspec, strip_reg_name (reg_names[i]))))

then when I make it a DEFHOOKPOD the changes in this function will look the
same as above, except CASE_INSENSITIVE_REGISTER_NAMES will be
targetm.case_insensitive_register_names.
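
As a rough, self-contained sketch of that comparison (not the actual varasm.c code): the flag below stands in for either the macro or the targetm field, the register table is a made-up excerpt, and the strip_reg_name step is left out for brevity.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Stand-in for the target setting.  In the thread this would be either
   the CASE_INSENSITIVE_REGISTER_NAMES macro or a targetm data hook.  */
static bool case_insensitive_register_names = true;

/* Hypothetical table standing in for reg_names[].  */
static const char *const example_reg_names[] = { "R4", "R5", "R6", "R7" };

/* Return the index of the register named ASMSPEC, or -1 if none matches.  */
static int
lookup_reg (const char *asmspec)
{
  size_t n = sizeof example_reg_names / sizeof example_reg_names[0];
  for (size_t i = 0; i < n; i++)
    if (example_reg_names[i][0]
        && (case_insensitive_register_names
            ? !strcasecmp (asmspec, example_reg_names[i])
            : !strcmp (asmspec, example_reg_names[i])))
      return (int) i;
  return -1;
}
```

With the flag set, lookup_reg ("r5") finds the "R5" entry; with it cleared, only the exact spelling matches.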

Thanks,
Jozef

> 
> Thanks,
> Richard

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-19 Thread Jozef Lawrynowicz

On Thu, 18 Jul 2019 16:55:59 -0500
Segher Boessenkool  wrote:

> Hi!

Hi Segher,

> 
> On Thu, Jul 18, 2019 at 08:45:38PM +0100, Jozef Lawrynowicz wrote:
> > PR target/70320
> > * doc/tm.texi.in: Document new macro CASE_INSENSITIVE_REGISTER_NAMES.
> > * doc/tm.texi: Likewise.  
> 
> "Regenerate." -- or did you edit this file by hand?  Don't, or don't tell
> us anyway ;-)

I did indeed regenerate it; that's just a mistake in the ChangeLog entry.
> 
> > strcmp for comparisons of asmspec with a register name if   
> 
> (Trailing space here, and elsewhere).

Ah, I was given false confidence by contrib/check_GNU_style.sh; I guess it
only checks +/- lines.

> 
> > +/* { dg-do compile } */
> > +/* { dg-options "-ffixed-r6 -ffixed-R7" } */
> > +/* { dg-final { scan-assembler "PUSH.*R4" } } */
> > +/* { dg-final { scan-assembler "PUSH.*R5" } } */  
> 
> scan-assembler does multi-line matching by default, so that .* probably
> matches things you do not want it to match.  You can do things like
> 
> /* { dg-final { scan-assembler "(?n)PUSH.*R5" } } */
> 
> to make sure this is on one line at least.  See man re_syntax.

Right, thanks for pointing that out, the test was in fact matching cases it
shouldn't have been due to the multi-line matching.
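
To make the difference concrete, here is a small sketch in C. It is only an analogy: DejaGnu uses Tcl regular expressions, whereas this uses POSIX regcomp, where the REG_NEWLINE flag plays roughly the role of Tcl's (?n). The helper name pattern_matches is made up for the example.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if PATTERN matches somewhere in TEXT.  When
   NEWLINE_SENSITIVE is set, REG_NEWLINE stops '.' from matching a
   newline, so a match cannot span two lines of assembler output.  */
static bool
pattern_matches (const char *pattern, const char *text,
                 bool newline_sensitive)
{
  regex_t re;
  int flags = REG_EXTENDED | REG_NOSUB;

  if (newline_sensitive)
    flags |= REG_NEWLINE;
  if (regcomp (&re, pattern, flags) != 0)
    return false;

  bool found = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}
```

Given output where PUSH is on one line and R5 only appears on a later one, the pattern "PUSH.*R5" matches without the flag and stops matching with it.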

Thanks for the review,
Jozef

> 
> Rest looks fine, but I'm not an RTL maintainer.
> 
> 
> Segher

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-19 Thread Jakub Jelinek

On Fri, Jul 19, 2019 at 10:39:52AM +0100, Jozef Lawrynowicz wrote:
> > > 2019-07-18  Jozef Lawrynowicz  
> > >
> > >   PR target/70320
> > >   * doc/tm.texi.in: Document new macro CASE_INSENSITIVE_REGISTER_NAMES.
> > >   * doc/tm.texi: Likewise.
> > >   * defaults.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 0.
> > >   * config/msp430/msp430.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 1.
> > >   * varasm.c (decode_reg_name_and_count): Use strcasecmp instead of
> > >   strcmp for comparisons of asmspec with a register name if 
> > >   CASE_INSENSITIVE_REGISTER_NAMES is defined to 1.  

Ugh, do we really need this?  If it is just for msp430, can't it instead just
#define the ADDITIONAL_REGISTER_NAMES macro and add those 16 "rN" register name
aliases to the current REGISTER_NAMES "RN" names?
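
A minimal sketch of that idea, assuming the usual shape of ADDITIONAL_REGISTER_NAMES (an array of name/number pairs). The tables below are a hypothetical four-register excerpt and decode_example_reg_name is a made-up stand-in for the real lookup, not code from msp430.h or varasm.c.

```c
#include <string.h>

struct reg_alias
{
  const char *name;
  int number;
};

/* Hypothetical excerpt: the primary names are upper case...  */
static const char *const primary_names[] = { "R0", "R1", "R2", "R3" };

/* ...and the alias table adds lower-case spellings for the same
   register numbers, so no case-insensitive comparison is needed.  */
static const struct reg_alias additional_names[] = {
  { "r0", 0 }, { "r1", 1 }, { "r2", 2 }, { "r3", 3 }
};

/* Return the register number for ASMSPEC, or -1 if it is unknown.  */
static int
decode_example_reg_name (const char *asmspec)
{
  size_t i;

  for (i = 0; i < sizeof primary_names / sizeof primary_names[0]; i++)
    if (!strcmp (asmspec, primary_names[i]))
      return (int) i;
  for (i = 0; i < sizeof additional_names / sizeof additional_names[0]; i++)
    if (!strcmp (asmspec, additional_names[i].name))
      return additional_names[i].number;
  return -1;
}
```

Both "R2" and "r2" resolve to register 2 here, while a mixed spelling such as "r2" with other case variations outside the tables is still rejected.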

Jakub

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-19 Thread Richard Sandiford

Jozef Lawrynowicz  writes:
> On Fri, 19 Jul 2019 09:31:15 +0100
> Richard Sandiford  wrote:
>
>> Jozef Lawrynowicz  writes:
>> >
>> > From 82eadcdcbb8914b06818f7c8a10156336518e8d1 Mon Sep 17 00:00:00 2001
>> > From: Jozef Lawrynowicz 
>> > Date: Wed, 17 Jul 2019 11:48:23 +0100
>> > Subject: [PATCH] Implement CASE_INSENSITIVE_REGISTER_NAMES
>> >
>> > gcc/ChangeLog:
>> >
>> > 2019-07-18  Jozef Lawrynowicz  
>> >
>> >PR target/70320
>> >* doc/tm.texi.in: Document new macro CASE_INSENSITIVE_REGISTER_NAMES.
>> >* doc/tm.texi: Likewise.
>> >* defaults.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 0.
>> >* config/msp430/msp430.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 1.
>> >* varasm.c (decode_reg_name_and_count): Use strcasecmp instead of
>> >strcmp for comparisons of asmspec with a register name if 
>> >CASE_INSENSITIVE_REGISTER_NAMES is defined to 1.  
>> 
>> I really don't think we should be adding new target macros for things
>> like this.  The code is hardly on the critical path, so I don't think
>> compile time is a concern.  That said...
>> 
>> ...
>> 
>> So TBH I still prefer the DEFHOOKPOD suggestion.  I won't object if
>> someone else wants to approve the macro version though.
>
> Ok, as you say, this code isn't on the critical path so I'd be happy to change
> this to a DEFHOOKPOD.
>
> In general, what should be considered when deciding between a hook and macro?
> Does the choice lean towards macros mainly when compile time is a concern? And
> hooks otherwise?
>
> Is the downside of this macro implementation compared to a DEFHOOKPOD mainly
> just the maintainability/readability of the added code?

Macros are essentially the "old way" and target hooks the "new way".
The decision was made a long time ago to move away from macros where
possible.  TBH I no longer remember the reasons clearly, but I think
it was partly to avoid including so much target stuff in non-target
files, and partly in the hope that we might one day support multiple
targets in a single build.  Years later, we're not much closer to the
latter and I'm not sure it's a realistic goal.

So over time various macros have been moved to hooks and new target
tests should generally use hooks if at all possible.

That said, there are some macros that are too compile-time sensitive
to be hooks or variables.  BITS_PER_UNIT is the most obvious example.
Also, the current approach of using header files to control OS and
object format features makes it difficult to convert the related
macros to hooks in a natural way.
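
As a rough illustration of the two styles (a sketch with made-up names, not GCC's real target.def machinery): a macro is fixed when the compiler is built, while a data hook is a field in a per-target structure that generic code reads at run time.

```c
#include <stdbool.h>

/* Macro style: generic code sees one value, chosen at build time by
   whichever target header was included.  */
#ifndef CASE_INSENSITIVE_REGISTER_NAMES
#define CASE_INSENSITIVE_REGISTER_NAMES 0
#endif

static bool
fold_case_macro (void)
{
  return CASE_INSENSITIVE_REGISTER_NAMES != 0;
}

/* Hook style: each target fills in a structure.  A plain-data member
   like this is what a DEFHOOKPOD-style hook amounts to.  The struct
   and instance names here are hypothetical.  */
struct example_target
{
  bool case_insensitive_register_names;
};

static const struct example_target example_msp430 = { true };
static const struct example_target example_generic = { false };

static bool
fold_case_hook (const struct example_target *target)
{
  return target->case_insensitive_register_names;
}
```

Because the hook version takes the target as data, two differently configured targets can coexist in one program, which is the "multiple targets in a single build" idea mentioned above.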

Thanks,
Richard

Update m68k baseline symbols

2019-07-19 Thread Andreas Schwab

Installed as obvious.

Andreas.

* config/abi/post/m68k-linux-gnu/baseline_symbols.txt: Update.

diff --git a/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt b/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
index 2af971a5696..1b4fbffa9e4 100644
--- a/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
@@ -112,6 +112,7 @@ FUNC:_ZN11__gnu_debug19_Safe_sequence_base13_M_detach_allEv@@GLIBCXX_3.4
 FUNC:_ZN11__gnu_debug19_Safe_sequence_base18_M_detach_singularEv@@GLIBCXX_3.4
 FUNC:_ZN11__gnu_debug19_Safe_sequence_base22_M_revalidate_singularEv@@GLIBCXX_3.4
 FUNC:_ZN11__gnu_debug19_Safe_sequence_base7_M_swapERS0_@@GLIBCXX_3.4
+FUNC:_ZN11__gnu_debug25_Safe_local_iterator_base16_M_attach_singleEPNS_19_Safe_sequence_baseEb@@GLIBCXX_3.4.26
 FUNC:_ZN11__gnu_debug25_Safe_local_iterator_base9_M_attachEPNS_19_Safe_sequence_baseEb@@GLIBCXX_3.4.17
 FUNC:_ZN11__gnu_debug25_Safe_local_iterator_base9_M_detachEv@@GLIBCXX_3.4.17
 FUNC:_ZN11__gnu_debug30_Safe_unordered_container_base13_M_detach_allEv@@GLIBCXX_3.4.17
@@ -261,6 +262,7 @@ FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE8_M_limitEjj@@GLIBCXX_3.4
 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE8capacityEv@@GLIBCXX_3.4
 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE8max_sizeEv@@GLIBCXX_3.4
 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE9_M_ibeginEv@@GLIBCXX_3.4
+FUNC:_ZNKSbIwSt11char_traitsIwESaIwEEcvSt17basic_string_viewIwS0_EEv@@GLIBCXX_3.4.26
 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEEixEj@@GLIBCXX_3.4
 FUNC:_ZNKSi6gcountEv@@GLIBCXX_3.4
 FUNC:_ZNKSi6sentrycvbEv@@GLIBCXX_3.4
@@ -328,9 +330,66 @@ FUNC:_ZNKSs8_M_limitEjj@@GLIBCXX_3.4
 FUNC:_ZNKSs8capacityEv@@GLIBCXX_3.4
 FUNC:_ZNKSs8max_sizeEv@@GLIBCXX_3.4
 FUNC:_ZNKSs9_M_ibeginEv@@GLIBCXX_3.4
+FUNC:_ZNKSscvSt17basic_string_viewIcSt11char_traitsIcEEEv@@GLIBCXX_3.4.26
 FUNC:_ZNKSsixEj@@GLIBCXX_3.4
 FUNC:_ZNKSt10bad_typeid4whatEv@@GLIBCXX_3.4.9
 FUNC:_ZNKSt10error_code23default_error_conditionEv@@GLIBCXX_3.4.11
+FUNC:_ZNKSt10filesystem16filesystem_error4whatEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem16filesystem_error5path1Ev@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem16filesystem_error5path2Ev@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem18directory_iteratordeEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem28recursive_directory_iterator17recursion_pendingEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem28recursive_directory_iterator5depthEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem28recursive_directory_iterator7optionsEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem28recursive_directory_iteratordeEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path11parent_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path12has_filenameEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path13has_root_nameEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path13has_root_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path13relative_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path14root_directoryEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path15has_parent_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path16lexically_normalEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path17_M_find_extensionEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path17has_relative_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path18has_root_directoryEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path18lexically_relativeERKS0_@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path19lexically_proximateERKS0_@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path5_List13_Impl_deleterclEPNS1_5_ImplE@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path5_List3endEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path5_List5beginEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path7compareERKS0_@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path7compareESt17basic_string_viewIcSt11char_traitsIcEE@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path9root_nameEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path9root_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1116filesystem_error4whatEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1116filesystem_error5path1Ev@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1116filesystem_error5path2Ev@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1118directory_iteratordeEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iterator17recursion_pendingEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iterator5depthEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iterator7optionsEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iteratordeEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path11parent_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path12has_filenameEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path13has_root_nameEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path13has_root_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path13relative_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path14root_directo

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-19 Thread Jozef Lawrynowicz

On Fri, 19 Jul 2019 11:54:43 +0200
Jakub Jelinek  wrote:

> On Fri, Jul 19, 2019 at 10:39:52AM +0100, Jozef Lawrynowicz wrote:
> > > > 2019-07-18  Jozef Lawrynowicz  
> > > >
> > > > PR target/70320
> > > > * doc/tm.texi.in: Document new macro 
> > > > CASE_INSENSITIVE_REGISTER_NAMES.
> > > > * doc/tm.texi: Likewise.
> > > > * defaults.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 0.
> > > > * config/msp430/msp430.h: Define 
> > > > CASE_INSENSITIVE_REGISTER_NAMES to 1.
> > > > * varasm.c (decode_reg_name_and_count): Use strcasecmp instead 
> > > > of
> > > > strcmp for comparisons of asmspec with a register name if 
> > > > CASE_INSENSITIVE_REGISTER_NAMES is defined to 1.
> 
> Ugh, do we really need this?  If it is just for msp430, can't it instead
> just
> #define ADDITIONAL_REGISTER_NAMES macro and add those 16 "rN" register name
> aliases to the current REGISTER_NAMES "RN" names?
> 
>   Jakub

That is something I considered in previous discussion here:
https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00372.html

But it seemed like this could potentially be a useful feature for other targets
that wish to enable it. It doesn't appear necessary to restrict the case of
register names, unless a target exists that has different registers that differ
only by case.

It makes the programmer's life easier IMO and gives them more choice as to how
they style their code. You can already use the register number alone without a
prefix (i.e. "9" instead of "r9"), this seems like a natural extension to that
flexibility.

As far as I can tell the exact register names for the different targets aren't
documented anywhere either. The MSP430 ABI, at least, doesn't explicitly
enforce a case for the register names. I would imagine there are others that
don't either.

Jozef

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-19 Thread Jakub Jelinek

On Fri, Jul 19, 2019 at 11:17:51AM +0100, Jozef Lawrynowicz wrote:
> That is something I considered in previous discussion here:
> https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00372.html
> 
> But it seemed like this could potentially be a useful feature for other
> targets that wish to enable it. It doesn't appear necessary to restrict the
> case of register names, unless a target exists that has different registers
> that differ only by case.

It doesn't seem like a generally useful feature to me.  C as well as C++ are
case sensitive, and so is gcc command line parsing, so having the register
names handled case-insensitively is strange and undesirable.
Perhaps the reason you want it for msp430 is that the register names were
chosen badly as upper case which surprises people?
Having register int x __asm ("eAx"); register __m512i __asm ("ZmM11"); is simply
weird, not something we should allow nor promote.

Jakub

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-19 Thread Jozef Lawrynowicz

On Fri, 19 Jul 2019 12:24:57 +0200
Jakub Jelinek  wrote:

> On Fri, Jul 19, 2019 at 11:17:51AM +0100, Jozef Lawrynowicz wrote:
> > That is something I considered in previous discussion here:
> > https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00372.html
> > 
> > But it seemed like this could potentially be a useful feature for other
> > targets that wish to enable it. It doesn't appear necessary to restrict the
> > case of register names, unless a target exists that has different registers
> > that differ only by case.
> 
> It doesn't seem like a generally useful feature to me, C as well as C++ are
> case sensitive, so is gcc command line parsing, so having the register names
> handled insensitive is strange and undesirable.

That's true, but I guess users feel that they are in assembly language "mode"
rather than C mode, since the clobber list is a string adjacent to a string of
assembler code.
The GNU assembler documentation, at least for the Z80 port, states:
> "Upper and lower case are equivalent in register names, opcodes, condition
> codes and assembler directives."
(https://sourceware.org/binutils/docs/as/Z80_002dCase.html#Z80_002dCase)

> Perhaps the reason you want it for msp430 is that the register names were
> chosen badly as upper case which surprises people?

This is true. Users probably would not have complained and I would not be
considering this issue if the REGISTER_NAMES were originally lower case.

> Having register int x __asm ("eAx"); register __m512i __asm ("ZmM11"); is 
> simply
> weird, not something we should allow nor promote.

I was thinking along the lines that someone might want "EAX" rather than "eax".

At this point though, unless someone chimes in and says that they think
case-insensitive names would be useful for their target and they want this
hook, I will just go with the less controversial option and implement this
using ADDITIONAL_REGISTER_NAMES for msp430 only.

Thanks for the feedback,
Jozef

> 
>   Jakub

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-19 Thread Jozef Lawrynowicz

On Fri, 19 Jul 2019 10:55:59 +0100
Richard Sandiford  wrote:

> Jozef Lawrynowicz  writes:
> > On Fri, 19 Jul 2019 09:31:15 +0100
> > Richard Sandiford  wrote:
> >  
> >> Jozef Lawrynowicz  writes:  
> >> >
> >> > From 82eadcdcbb8914b06818f7c8a10156336518e8d1 Mon Sep 17 00:00:00 2001
> >> > From: Jozef Lawrynowicz 
> >> > Date: Wed, 17 Jul 2019 11:48:23 +0100
> >> > Subject: [PATCH] Implement CASE_INSENSITIVE_REGISTER_NAMES
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> > 2019-07-18  Jozef Lawrynowicz  
> >> >
> >> >  PR target/70320
> >> >  * doc/tm.texi.in: Document new macro CASE_INSENSITIVE_REGISTER_NAMES.
> >> >  * doc/tm.texi: Likewise.
> >> >  * defaults.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 0.
> >> >  * config/msp430/msp430.h: Define CASE_INSENSITIVE_REGISTER_NAMES to 1.
> >> >  * varasm.c (decode_reg_name_and_count): Use strcasecmp instead of
> >> >  strcmp for comparisons of asmspec with a register name if 
> >> >  CASE_INSENSITIVE_REGISTER_NAMES is defined to 1.
> >> 
> >> I really don't think we should be adding new target macros for things
> >> like this.  The code is hardly on the critical path, so I don't think
> >> compile time is a concern.  That said...
> >> 
> >> ...
> >> 
> >> So TBH I still prefer the DEFHOOKPOD suggestion.  I won't object if
> >> someone else wants to approve the macro version though.  
> >
> > Ok, as you say, this code isn't on the critical path so I'd be happy to 
> > change
> > this to a DEFHOOKPOD.
> >
> > In general, what should be considered when deciding between a hook and 
> > macro?
> > Does the choice lean towards macros mainly when compile time is a concern? 
> > And
> > hooks otherwise?
> >
> > Is the downside of this macro implementation compared to a DEFHOOKPOD mainly
> > just the maintainability/readability of the added code?  
> 
> Macros are essentially the "old way" and target hooks the "new way".
> The decision was made a long time ago to move away from macros where
> possible.  TBH I no longer remember the reasons clearly, but I think
> it was partly to avoid including so much target stuff in non-target
> files, and partly in the hope that we might one day support multiple
> targets in a single build.  Years later, we're not much closer to the
> latter and I'm not sure it's a realistic goal.
> 
> So over time various macros have been moved to hooks and new target
> tests should generally use hooks if at all possible.
> 
> That said, there are some macros that are too compile-time sensitive
> to be hooks or variables.  BITS_PER_UNIT is the most obvious example.
> Also, the current approach of using header files to control OS and
> object format features makes it difficult to convert the related
> macros to hooks in a natural way.

Ok great, thanks for the info.

Jozef
> 
> Thanks,
> Richard

Re: [RFC][tree-vect]PR 88915: Further vectorize second loop when versioning

2019-07-19 Thread Andre Vieira (lists)





On 15/07/2019 11:54, Richard Biener wrote:

On Mon, 15 Jul 2019, Andre Vieira (lists) wrote:




On 12/07/2019 11:19, Richard Biener wrote:

On Thu, 11 Jul 2019, Andre Vieira (lists) wrote:



I have code that can split the condition and alias checks in
'vect_loop_versioning'. For this approach I am considering keeping that bit of
code and seeing if I can patch up the checks after vectorizing the epilogue
further. I think initially I will just go with a "hacked up" way of passing
down the bb with the iteration check and split the false edge every time we
vectorize it further. Will keep you posted on progress. If you have any
pointers/tips they are most welcome!


I thought to somehow force the idea that we have a prologue loop
to the vectorizer so it creates the number-of-vectorized iterations
check and branch around the main (highest VF) vectorized loop.



Hmm I think I may have skimmed over this earlier. I am reading it now 
and am not entirely sure what you mean by this. Specifically the "number 
of vectorized iterations" or how a prologue loop plays a role in this.





The advantage is that this would just use the epilogue vectorization
code and it would avoid excessive code growth if you have many
VFs to consider (on x86 we now have 8 byte, 16 byte, 32 byte and
64 byte vectors...).  The disadvantage is of course that a small
number of loops will not enter the vector code at all - namely
those that would pass the alias check for lowest_VF but not the
one for highest_VF.  I'm sure this isn't a common situation and
in quite a number of cases we formulate the alias check in a way
that it isn't dependent on the VF anyways.


The code growth is indeed a factor and I can see the argument for choosing
this approach over the other. Cases of such specific overlaps are most likely
oddities rather than the common situation.


Yeah, it also looks simplest to me (and a motivation to enable
epilogue vectorization by default).


There's also possibly
an extra branch for the case the highest_VF loop isn't entered
(unless there already was a prologue loop).

I don't understand this one, can you elaborate?


The branch around the main vectorized loop I talked about above.
So I'd fool the versioning condition to use the lowest VF for
the iteration count checking and use the code that handles
zero-trip iteration count for the vector loop unconditionally.


I guess this sheds some light on the comment above. And it definitely 
implies we would need to know the lowest VF when creating this 
condition. Which is tricky.


In some way this makes checking the niter condition on the version
check pointless (at least if we have a really low lowest VF like
on x64 where it will likely be 2), so we may want to elide that
completely?  For the check to be "correct" we'd also need to
compute the lowest VF a vectorized epilogue is still profitable
(on x86 those will run once or never, but we can also end up
with say main AVX512 vectorization, and a single vectorized
epilogue with SSE2 if we somehow figure AVX256 vectorization
isn't profitable for it - we can also end up with non-vectorizable
epilogue).  So with the current setup how we vectorize epilogues
we maybe want to have a location of the version niter check we
can "patch up" later after (not) vectorizing the epilogue(s).


I think you come to the same conclusion here as I mentioned above. 
Somehow I wish I had understood this better when I first read it ... but 
eh such is life :)


I went on and continued hacking around the approach of splitting the 
niter and alias check I had earlier. I got it to work with a single 
loop. However, when dealing with nested loops I run into the problem 
that I'd need to sink the niter checks. Otherwise you could end up with 
an alias check and niter checks outside the outer loop. Where the 2nd 
and subsequent VF niter checks point to the corresponding epilogues in 
the inner loop.  However, once those are done and it iterates over the 
outer-loop, it will go through the higher VF's first, leading to wrong 
behavior.


To illustrate what I mean, here is a very simplistic illustration of 
what is happening:


BB1: Alias check
BB2: niter check VF 32
BB3: niter check VF 16
BB4: Vectorized loop VF32
BB5: Vectorized loop VF16
BB6: Remaining epilogue scalar loop
BB7: Outer loop iteration (updates IV's and DRs of inner loop)
BB8: Scalar inner&outer loop

With edges:
BB1 -T-> BB2
BB1 -F-> BB8
BB2 -T-> BB4
BB2 -F-> BB3
BB3 -T-> BB5
BB3 -F-> BB8
BB4 -> BB5
BB5 -> BB6
BB6 -> BB7
BB7 -> BB4

Where -T-> is a True edge and -F-> is a False edge

So my first thought to solve this is to sink BB2 and BB3 into the loop 
for which BB7 is the latch.


I.e. make BB7 -> BB2

But then I would argue, it would be good to introduce a BB9:
BB1 -T-> BB9
BB9 -T-> BB2
BB9 -F-> BB8

Where BB9 checks that niter is at least the lowest VF.
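
Written out as plain C, the shape I have in mind after sinking the niter checks looks roughly like the sketch below. This is only a scalar model, not vectorizer output: the function name, the fixed VFs of 32 and 16 and the row-sum workload are made up for illustration, and the alias check (BB1) is left out.

```c
#include <stddef.h>

/* Hypothetical sketch: sum each row of an m x n matrix.  The chunked
   loops stand in for the VF32 and VF16 vectorized loops.  */
enum { VF_HIGH = 32, VF_LOW = 16 };

void
row_sums (const int *a, size_t m, size_t n, long *out)
{
  if (n < VF_LOW)
    {
      /* BB9 false edge -> BB8: plain scalar inner and outer loop.  */
      for (size_t i = 0; i < m; i++)
        {
          long s = 0;
          for (size_t j = 0; j < n; j++)
            s += a[i * n + j];
          out[i] = s;
        }
      return;
    }

  for (size_t i = 0; i < m; i++)        /* Outer loop; BB7 is its latch.  */
    {
      const int *row = a + i * n;
      long s = 0;
      size_t j = 0;

      /* BB2 + BB4: the VF32 niter check is re-evaluated on every outer
         iteration because it now sits inside the outer loop.  */
      for (; n - j >= VF_HIGH; j += VF_HIGH)
        for (size_t k = 0; k < VF_HIGH; k++)
          s += row[j + k];

      /* BB3 + BB5: VF16 epilogue.  */
      for (; n - j >= VF_LOW; j += VF_LOW)
        for (size_t k = 0; k < VF_LOW; k++)
          s += row[j + k];

      /* BB6: remaining scalar epilogue.  */
      for (; j < n; j++)
        s += row[j];

      out[i] = s;
    }
}
```

The point of the sketch is only that each outer iteration starts again at the VF32 check, rather than jumping straight into a loop body chosen once outside the nest.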

Sorry if I am repeating what you were telling me to do all along :')

Cheers,
Andre

PS: I often find myself having to patch the DOMINA

[committed][AArch64] Rename +bitperm to +sve2-bitperm

2019-07-19 Thread Richard Sandiford

After some discussion, we've decided to rename the +bitperm feature
flag to +sve2-bitperm, so that it's consistent with the other SVE2
feature flags.  The associated macro was already
__ARM_FEATURE_SVE2_BITPERM, so only the feature flag itself
needs to change.

Tested on aarch64-linux-gnu and applied as r273600.

Richard
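
Since the macro name is unchanged, source-level feature tests should be unaffected by the rename; only the extension string passed on the command line changes. A minimal sketch, where the helper name is invented purely for illustration:

```c
/* Returns 1 when this translation unit was compiled with the SVE2 bitperm
   extension enabled (for example via an -march string that includes
   +sve2-bitperm), and 0 otherwise.  have_sve2_bitperm is a made-up name.  */
int
have_sve2_bitperm (void)
{
#ifdef __ARM_FEATURE_SVE2_BITPERM
  return 1;
#else
  return 0;
#endif
}
```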


2019-07-19  Richard Sandiford  

gcc/
* doc/invoke.texi: Rename the AArch64 +bitperm extension flag
to +sve2-bitperm.
* config/aarch64/aarch64-option-extensions.def: Likewise.
--

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi 2019-07-16 09:11:05.077427347 +0100
+++ gcc/doc/invoke.texi 2019-07-19 12:22:30.128041121 +0100
@@ -16083,7 +16083,7 @@ not affect code generation.  This option
 @item sve2
 Enable the Armv8-a Scalable Vector Extension 2.  This also enables SVE
 instructions.
-@item bitperm
+@item sve2-bitperm
 Enable SVE2 bitperm instructions.  This also enables SVE2 instructions.
 @item sve2-sm4
 Enable SVE2 sm4 instructions.  This also enables SVE2 instructions.
Index: gcc/config/aarch64/aarch64-option-extensions.def
===
--- gcc/config/aarch64/aarch64-option-extensions.def2019-05-29 
10:49:36.240711500 +0100
+++ gcc/config/aarch64/aarch64-option-extensions.def2019-07-19 
12:22:30.124041152 +0100
@@ -58,13 +58,13 @@
 /* Enabling "fp" just enables "fp".
Disabling "fp" also disables "simd", "crypto", "fp16", "aes", "sha2",
"sha3", sm3/sm4, "sve", "sve2", "sve2-aes", "sve2-sha3", "sve2-sm4", and
-   "bitperm".  */
+   "sve2-bitperm".  */
 AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, 0, AARCH64_FL_SIMD | 
AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2 | 
AARCH64_FL_SHA3 | AARCH64_FL_SM4 | AARCH64_FL_SVE | AARCH64_FL_SVE2 | 
AARCH64_FL_SVE2_AES | AARCH64_FL_SVE2_SHA3 | AARCH64_FL_SVE2_SM4 | 
AARCH64_FL_SVE2_BITPERM, false, "fp")
 
 /* Enabling "simd" also enables "fp".
Disabling "simd" also disables "crypto", "dotprod", "aes", "sha2", "sha3",
-   "sm3/sm4", "sve", "sve2", "sve2-aes", "sve2-sha3", "sve2-sm4", and 
"bitperm".
-   */
+   "sm3/sm4", "sve", "sve2", "sve2-aes", "sve2-sha3", "sve2-sm4", and
+   "sve2-bitperm".  */
 AARCH64_OPT_EXTENSION("simd", AARCH64_FL_SIMD, AARCH64_FL_FP, 
AARCH64_FL_CRYPTO | AARCH64_FL_DOTPROD | AARCH64_FL_AES | AARCH64_FL_SHA2 | 
AARCH64_FL_SHA3 | AARCH64_FL_SM4 | AARCH64_FL_SVE | AARCH64_FL_SVE2 | 
AARCH64_FL_SVE2_AES | AARCH64_FL_SVE2_SHA3 | AARCH64_FL_SVE2_SM4 | 
AARCH64_FL_SVE2_BITPERM, false, "asimd")
 
 /* Enabling "crypto" also enables "fp", "simd", "aes" and "sha2".
@@ -80,7 +80,7 @@ AARCH64_OPT_EXTENSION("lse", AARCH64_FL_
 
 /* Enabling "fp16" also enables "fp".
Disabling "fp16" disables "fp16", "fp16fml", "sve", "sve2", "sve2-aes",
-   "sve2-sha3", "sve2-sm4", and "bitperm".  */
+   "sve2-sha3", "sve2-sm4", and "sve2-bitperm".  */
 AARCH64_OPT_EXTENSION("fp16", AARCH64_FL_F16, AARCH64_FL_FP, AARCH64_FL_F16FML 
| AARCH64_FL_SVE | AARCH64_FL_SVE2 | AARCH64_FL_SVE2_AES | AARCH64_FL_SVE2_SHA3 
| AARCH64_FL_SVE2_SM4 | AARCH64_FL_SVE2_BITPERM, false, "fphp asimdhp")
 
 /* Enabling or disabling "rcpc" only changes "rcpc".  */
@@ -116,7 +116,7 @@ AARCH64_OPT_EXTENSION("fp16fml", AARCH64
 
 /* Enabling "sve" also enables "fp16", "fp" and "simd".
Disabling "sve" disables "sve", "sve2", "sve2-aes", "sve2-sha3", "sve2-sm4"
-   and "bitperm".  */
+   and "sve2-bitperm".  */
 AARCH64_OPT_EXTENSION("sve", AARCH64_FL_SVE, AARCH64_FL_FP | AARCH64_FL_SIMD | 
AARCH64_FL_F16, AARCH64_FL_SVE2 | AARCH64_FL_SVE2_AES | AARCH64_FL_SVE2_SHA3 | 
AARCH64_FL_SVE2_SM4 | AARCH64_FL_SVE2_BITPERM, false, "sve")
 
 /* Enabling/Disabling "profile" does not enable/disable any other feature.  */
@@ -139,7 +139,7 @@ AARCH64_OPT_EXTENSION("predres", AARCH64
 
 /* Enabling "sve2" also enables "sve", "fp16", "fp", and "simd".
Disabling "sve2" disables "sve2", "sve2-aes", "sve2-sha3", "sve2-sm4", and
-   "bitperm".  */
+   "sve2-bitperm".  */
 AARCH64_OPT_EXTENSION("sve2", AARCH64_FL_SVE2, AARCH64_FL_SVE | AARCH64_FL_FP 
| AARCH64_FL_SIMD | AARCH64_FL_F16, AARCH64_FL_SVE2_AES | AARCH64_FL_SVE2_SHA3 
| AARCH64_FL_SVE2_SM4 | AARCH64_FL_SVE2_BITPERM, false, "")
 
 /* Enabling "sve2-sm4" also enables "sm4", "simd", "fp16", "fp", "sve", and
@@ -154,8 +154,8 @@ AARCH64_OPT_EXTENSION("sve2-aes", AARCH6
"sve2". Disabling "sve2-sha3" just disables "sve2-sha3".  */
 AARCH64_OPT_EXTENSION("sve2-sha3", AARCH64_FL_SVE2_SHA3, AARCH64_FL_SHA3 | 
AARCH64_FL_SIMD | AARCH64_FL_F16 | AARCH64_FL_FP | AARCH64_FL_SVE | 
AARCH64_FL_SVE2, 0, false, "")
 
-/* Enabling "bitperm" also enables "simd", "fp16", "fp", "sve", and "sve2".
-   Disabling "bitperm" just disables "bitperm".  */
-AARCH64_OPT_EXTENSION("bitperm", AARCH64_FL_SVE2_BITPERM, AARCH64_FL_SIMD | 
AARCH64_FL_F16 | AARCH64_FL_FP | AARCH64_FL_SVE | AARCH64_FL_S

[PATCH][ARM] Cleanup logical DImode operations

2019-07-19 Thread Wilco Dijkstra



Clean up the logical DImode operations since the current implementation is way
too complicated.  Thumb-1, Thumb-2, VFP/Neon and iwMMXt all work differently,
resulting in a bewildering number of expansions, patterns and splits across
several md files.  All this complexity is counterproductive and results in
inefficient code.

A much simpler approach is to split these operations early in the expander
so that optimizations and register allocation are applied on the 32-bit halves.
Code generation is unchanged on Thumb-1 and Arm/Thumb-2 without Neon or iwMMXt
(which already expand these instructions early).  With Neon these changes save
~1000 instructions from the PR77308 testcase, mostly by significantly reducing
register pressure and spilling.
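
To illustrate the idea in plain C rather than RTL (the helper names and the use of uint64_t are for illustration only and are not part of the patch): a 64-bit logical operation is the same operation applied to each 32-bit half, with no dependency between the halves.

```c
#include <stdint.h>

/* 64-bit AND computed as two independent 32-bit ANDs.  */
uint64_t
and64_by_halves (uint64_t a, uint64_t b)
{
  uint32_t lo = (uint32_t) a & (uint32_t) b;
  uint32_t hi = (uint32_t) (a >> 32) & (uint32_t) (b >> 32);
  return ((uint64_t) hi << 32) | lo;
}

/* 64-bit NOT computed as two independent 32-bit NOTs.  */
uint64_t
not64_by_halves (uint64_t a)
{
  uint32_t lo = ~(uint32_t) a;
  uint32_t hi = ~(uint32_t) (a >> 32);
  return ((uint64_t) hi << 32) | lo;
}
```

Because the halves are independent, each one can be optimized and register-allocated as an ordinary 32-bit value once the split has happened.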

Bootstrap & regress OK on arm-none-linux-gnueabihf --with-cpu=cortex-a57

OK for commit?

ChangeLog:
2019-07-18  Wilco Dijkstra  

* config/arm/arm.md (split and/eor/ior): Remove Neon check.
(split not): Add DImode not splitter.
(anddi3): Remove pattern.
(anddi3_insn): Likewise.
(anddi_zesidi_di): Likewise.
(anddi_sesdi_di): Likewise.
(anddi_notdi_di): Likewise.
(anddi_notzesidi_di): Likewise.
(anddi_notsesidi_di): Likewise.
(iordi3): Likewise.
(iordi3_insn): Likewise.
(iordi_zesidi_di): Likewise.
(iordi_sesidi_di): Likewise.
(xordi3): Likewise.
(xordi3_insn): Likewise.
(xordi_sesidi_di): Likewise.
(xordi_zesidi_di): Likewise.
(one_cmpldi2): Likewise.
(one_cmpldi2_insn): Likewise.
* config/arm/constraints.md: Remove De, Df, Dg constraints.
* config/arm/iwmmxt.md (iwmmxt_iordi3): Remove general register
alternative.
(iwmmxt_xordi3): Likewise.
(iwmmxt_anddi3): Likewise.
* config/arm/neon.md (orndi3_neon): Remove pattern.
(anddi_notdi_di): Likewise.
* config/arm/predicates.md (arm_anddi_operand_neon): Remove.
(arm_iordi_operand_neon): Likewise.
(arm_xordi_operand_neon): Likewise.
* config/arm/thumb2.md (iordi_notdi_di): Remove pattern.
(iordi_notzesidi_di): Likewise.
(iordi_notdi_zesidi): Likewise.
(iordi_notsesidi_di): Likewise.


---
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
8f4a4c26ea849a023f2e63d2efbf327423512dfc..cab59c403b777c37c1e412ab9a69db2c2ec533a2
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -2183,19 +2183,16 @@ (define_expand "divdf3"
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE"
   "")
 
-;; Boolean and,ior,xor insns
 
-;; Split up double word logical operations
-
-;; Split up simple DImode logical operations.  Simply perform the logical
+;; Split DImode and, ior, xor operations.  Simply perform the logical
 ;; operation on the upper and lower halves of the registers.
+;; This is needed for atomic operations in arm_split_atomic_op.
 (define_split
   [(set (match_operand:DI 0 "s_register_operand" "")
(match_operator:DI 6 "logical_binary_operator"
  [(match_operand:DI 1 "s_register_operand" "")
   (match_operand:DI 2 "s_register_operand" "")]))]
   "TARGET_32BIT && reload_completed
-   && ! (TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0])))
&& ! IS_IWMMXT_REGNUM (REGNO (operands[0]))"
   [(set (match_dup 0) (match_op_dup:SI 6 [(match_dup 1) (match_dup 2)]))
(set (match_dup 3) (match_op_dup:SI 6 [(match_dup 4) (match_dup 5)]))]
@@ -2210,167 +2207,20 @@ (define_split
   }"
 )
 
+;; Split DImode not (needed for atomic operations in arm_split_atomic_op).
 (define_split
-  [(set (match_operand:DI 0 "s_register_operand" "")
-   (match_operator:DI 6 "logical_binary_operator"
- [(sign_extend:DI (match_operand:SI 2 "s_register_operand" ""))
-  (match_operand:DI 1 "s_register_operand" "")]))]
-  "TARGET_32BIT && reload_completed"
-  [(set (match_dup 0) (match_op_dup:SI 6 [(match_dup 1) (match_dup 2)]))
-   (set (match_dup 3) (match_op_dup:SI 6
-   [(ashiftrt:SI (match_dup 2) (const_int 31))
-(match_dup 4)]))]
-  "
-  {
-operands[3] = gen_highpart (SImode, operands[0]);
-operands[0] = gen_lowpart (SImode, operands[0]);
-operands[4] = gen_highpart (SImode, operands[1]);
-operands[1] = gen_lowpart (SImode, operands[1]);
-operands[5] = gen_highpart (SImode, operands[2]);
-operands[2] = gen_lowpart (SImode, operands[2]);
-  }"
-)
-
-;; The zero extend of operand 2 means we can just copy the high part of
-;; operand1 into operand0.
-(define_split
-  [(set (match_operand:DI 0 "s_register_operand" "")
-   (ior:DI
- (zero_extend:DI (match_operand:SI 2 "s_register_operand" ""))
- (match_operand:DI 1 "s_register_operand" "")))]
-  "TARGET_32BIT && operands[0] != operands[1] && reload_completed"
-  [(set (match_dup 0) (ior:SI (match_dup 1) (match_dup 2)))
-   (set (match_dup 3) (match_dup 4))]
-  "
-  {
-operands[4] = ge

Re: [RFC][tree-vect]PR 88915: Further vectorize second loop when versioning

2019-07-19 Thread Richard Biener

On Fri, 19 Jul 2019, Andre Vieira (lists) wrote:

> 
> 
> On 15/07/2019 11:54, Richard Biener wrote:
> > On Mon, 15 Jul 2019, Andre Vieira (lists) wrote:
> > 
> > > 
> > > 
> > > On 12/07/2019 11:19, Richard Biener wrote:
> > > > On Thu, 11 Jul 2019, Andre Vieira (lists) wrote:
> > > 
> > > 
> > > I have code that can split the condition and alias checks in
> > > 'vect_loop_versioning'. For this approach I am considering keeping that
> > > bit of
> > > code and seeing if I can patch up the checks after vectorizing the
> > > epilogue
> > > further. I think initially I will just go with a "hacked up" way of
> > > passing
> > > down the bb with the iteration check and split the false edge every time
> > > we
> > > vectorize it further. Will keep you posted on progress. If you have any
> > > > pointers/tips they are most welcome!
> > 
> > I thought to somehow force the idea that we have a prologue loop
> > to the vectorizer so it creates the number-of-vectorized iterations
> > check and branch around the main (highest VF) vectorized loop.
> > 
> 
> Hmm I think I may have skimmed over this earlier. I am reading it now and am
> not entirely sure what you mean by this. Specifically the "number of
> vectorized iterations" or how a prologue loop plays a role in this.

When there is no prologue then the versioning condition currently
ensures we always enter the vector loop.  Only if there is a prologue
is there a check and branch eventually skipping right to the epilogue.
If the versioning condition (now using a lower VF) doesn't guarantee
this we have to add this jump-around.

> 
> > > > 
> > > > The advantage is that this would just use the epilogue vectorization
> > > > code and it would avoid excessive code growth if you have many
> > > > VFs to consider (on x86 we now have 8 byte, 16 byte, 32 byte and
> > > > 64 byte vectors...).  The disadvantage is of course that a small
> > > > number of loops will not enter the vector code at all - namely
> > > > those that would pass the alias check for lowest_VF but not the
> > > > one for highest_VF.  I'm sure this isn't a common situation and
> > > > in quite a number of cases we formulate the alias check in a way
> > > > that it isn't dependent on the VF anyways.
> > > 
> > > The code growth is indeed a factor and I can see the argument for choosing
> > > this approach over the other. Cases of such specific overlaps are most
> > > likely
> > > oddities rather than the common situation.
> > 
> > Yeah, it also looks simplest to me (and a motivation to enable
> > epilogue vectorization by default).
> > 
> > > > There's also possibly
> > > > an extra branch for the case the highest_VF loop isn't entered
> > > > (unless there already was a prologue loop).
> > > I don't understand this one, can you elaborate?
> > 
> > The branch around the main vectorized loop I talked about above.
> > So I'd fool the versioning condition to use the lowest VF for
> > the iteration count checking and use the code that handles
> > zero-trip iteration count for the vector loop unconditionally.
> 
> I guess this sheds some light on the comment above. And it definitely implies
> we would need to know the lowest VF when creating this condition. Which is
> tricky.

We can simply use the smallest vector size supported by the target to
derive it from the actual VF, no?
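
Roughly like the sketch below. The function and parameter names are invented, and it assumes the VF scales linearly with the vector size in bytes, which need not hold for every loop.

```c
/* Hypothetical helper: scale the chosen VF down to the smallest vector
   size the target supports.  Never returns less than 1.  */
unsigned
lowest_vf_estimate (unsigned vf, unsigned vector_bytes,
                    unsigned smallest_vector_bytes)
{
  unsigned lowest = vf * smallest_vector_bytes / vector_bytes;
  return lowest ? lowest : 1;
}
```

For example, a VF of 16 with 64-byte vectors and a smallest supported size of 16 bytes gives an estimate of 4.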

> > 
> > In some way this makes checking the niter condition on the version
> > check pointless (at least if we have a really low lowest VF like
> > on x64 where it will likely be 2), so we may want to elide that
> > completely?  For the check to be "correct" we'd also need to
> > compute the lowest VF a vectorized epilogue is still profitable
> > (on x86 those will run once or never, but we can also end up
> > with say main AVX512 vectorization, and a single vectorized
> > epilogue with SSE2 if we somehow figure AVX256 vectorization
> > isn't profitable for it - we can also end up with non-vectorizable
> > epilogue).  So with the current setup how we vectorize epilogues
> > we maybe want to have a location of the version niter check we
> > can "patch up" later after (not) vectorizing the epilogue(s).
> 
> I think you come to the same conclusion here as I mentioned above. Somehow I
> wish I had understood this better when I first read it ... but eh such is life
> :)
> 
> I went on and continued hacking around the approach of splitting the niter and
> alias check I had earlier. I got it to work with a single loop. However, when
> dealing with nested loops I run into the problem that I'd need to sink the
> niter checks. Otherwise you could end up with an alias check and niter checks
> outside the outer loop. Where the 2nd and subsequent VF niter checks point to
> the corresponding epilogues in the inner loop.  However, once those are done
> and it iterates over the outer-loop, it will go through the higher VF's first,
> leading to wrong behavior.
> 
> To illustrate what I mean, here is a very simplistic illustration of what is
> happenin

[PATCH] Fix PR91200

2019-07-19 Thread Richard Biener



The following fixes cselim.

Bootstrapped/tested on x86_64-unknown-linux-gnu, applied.

Richard.
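
For context, here is a source-level picture of what cselim (conditional store elimination) does. It is a hand-written sketch with made-up function names, not GIMPLE, and it assumes *p is known not to trap.

```c
/* Before: the store happens only on one path.  */
void
store_before (int *p, int cond, int v)
{
  if (cond)
    *p = v;
}

/* After: load the old value, select, then store unconditionally.  */
void
store_after (int *p, int cond, int v)
{
  int tmp = *p;
  if (cond)
    tmp = v;
  *p = tmp;
}
```

The operands of the original store have to be available at the point where the unconditional store is inserted, which is what the added PHI check guards.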

2019-07-19  Richard Biener  

PR tree-optimization/91200
* tree-ssa-phiopt.c (cond_store_replacement): Check we have
no PHI nodes in middle-bb.

* gcc.dg/torture/pr91200.c: New testcase.

Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 273590)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -2216,6 +2216,11 @@ cond_store_replacement (basic_block midd
   || gimple_has_volatile_ops (assign))
 return false;
 
+  /* And no PHI nodes so all uses in the single stmt are also
+ available where we insert to.  */
+  if (!gimple_seq_empty_p (phi_nodes (middle_bb)))
+return false;
+
   locus = gimple_location (assign);
   lhs = gimple_assign_lhs (assign);
   rhs = gimple_assign_rhs1 (assign);
Index: gcc/testsuite/gcc.dg/torture/pr91200.c
===
--- gcc/testsuite/gcc.dg/torture/pr91200.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr91200.c  (working copy)
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+
+int printf (const char *, ...);
+
+char a;
+int b, c, **d;
+
+int main ()
+{
+  int f = -128, *g, *h[2] = {0, 0}, i;
+  printf("0");
+  if (a)
+{
+  while (f > a) {
+int *j = &i;
+*j |= 0;
+  }
+  h[i] = &c;
+}
+  if (h[1])
+{
+  int **k = &g;
+  *k = &f;
+  while (i)
+{
+  int **l[] = {&g};
+}
+  int **m = &g;
+  *d = *m = &b;
+}
+  return 0;
+}

[PATCH] Fix PR91211

2019-07-19 Thread Richard Biener



Another issue in partial-def VN.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.
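
The arithmetic of the fix, as a standalone sketch: the function name and the plain long types are made up (the real code works on the pd structure and HOST_WIDE_INT), and the guard for a non-positive length is my addition. When the partial def starts before the buffer, i.e. the offset is negative, the bytes that fall before the buffer have to be subtracted from the size to clear.

```c
#include <string.h>

static long min_l (long a, long b) { return a < b ? a : b; }
static long max_l (long a, long b) { return a > b ? a : b; }

/* Zero the part of [offset, offset + size) that lands in buf and return
   the number of bytes cleared.  The length mirrors the patched expression.  */
long
clear_partial_def (unsigned char *buf, long bufsize, long offset, long size)
{
  long len = min_l (bufsize, size + min_l (0, offset));
  if (len <= 0)
    return 0;   /* Guard added for this sketch only.  */
  memset (buf + max_l (0, offset), 0, (size_t) len);
  return len;
}
```

With an offset of -1 and a size of 2, only one byte falls inside the buffer; the old expression would have cleared two.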

2019-07-19  Richard Biener  

PR tree-optimization/91211
* tree-ssa-sccvn.c (vn_walk_cb_data::push_partial_def): Fix
memset encoding size.

* gcc.dg/torture/pr91211.c: New testcase.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 273592)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -1818,7 +1818,8 @@ vn_walk_cb_data::push_partial_def (const
   if (TREE_CODE (pd.rhs) == CONSTRUCTOR)
/* Empty CONSTRUCTOR.  */
memset (buffer + MAX (0, pd.offset),
-   0, MIN ((HOST_WIDE_INT)sizeof (buffer), pd.size));
+   0, MIN ((HOST_WIDE_INT)sizeof (buffer),
+   pd.size + MIN (0, pd.offset)));
   else
{
  unsigned pad = 0;
Index: gcc/testsuite/gcc.dg/torture/pr91211.c
===
--- gcc/testsuite/gcc.dg/torture/pr91211.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr91211.c  (working copy)
@@ -0,0 +1,19 @@
+/* { dg-do run } */
+
+typedef __UINT32_TYPE__ u32;
+
+int
+main (void)
+{
+  u32 b = 0x027C5902;
+  u32 a = 0;
+  __builtin_memset (1 + (char *) &b, 0, 2);
+  __builtin_memcpy (&a, 2 + (char *) &b, 2);
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  if (a != 0x0200)
+#else
+  if (a != 0x0002)
+#endif
+__builtin_abort();
+  return 0;
+}

Re: [RFC][tree-vect]PR 88915: Further vectorize second loop when versioning

2019-07-19 Thread Andre Vieira (lists)





On 19/07/2019 12:35, Richard Biener wrote:
> On Fri, 19 Jul 2019, Andre Vieira (lists) wrote:
>
> > On 15/07/2019 11:54, Richard Biener wrote:
> > > On Mon, 15 Jul 2019, Andre Vieira (lists) wrote:
> > >
> > > > On 12/07/2019 11:19, Richard Biener wrote:
> > > > > On Thu, 11 Jul 2019, Andre Vieira (lists) wrote:
> > > >
> > > > I have code that can split the condition and alias checks in
> > > > 'vect_loop_versioning'. For this approach I am considering keeping that
> > > > bit of code and seeing if I can patch up the checks after vectorizing
> > > > the epilogue further. I think initially I will just go with a "hacked
> > > > up" way of passing down the bb with the iteration check and split the
> > > > false edge every time we vectorize it further. Will keep you posted on
> > > > progress. If you have any pointers/tips they are most welcome!
> > >
> > > I thought to somehow force the idea that we have a prologue loop
> > > to the vectorizer so it creates the number-of-vectorized iterations
> > > check and branch around the main (highest VF) vectorized loop.
> >
> > Hmm I think I may have skimmed over this earlier. I am reading it now and am
> > not entirely sure what you mean by this. Specifically the "number of
> > vectorized iterations" or how a prologue loop plays a role in this.
>
> When there is no prologue then the versioning condition currently
> ensures we always enter the vector loop.  Only if there is a prologue
> is there a check and branch eventually skipping right to the epilogue.
> If the versioning condition (now using a lower VF) doesn't guarantee
> this we have to add this jump-around.

Right, I haven't looked at the prologue path yet. I had a quick look and
can't find where this branch skipping to the epilogue is constructed.  I
will take a better look after I got my current example to work.

> > I guess this sheds some light on the comment above. And it definitely implies
> > we would need to know the lowest VF when creating this condition. Which is
> > tricky.
>
> We can simply use the smallest vector size supported by the target to
> derive it from the actual VF, no?

So I could wait to introduce this check after all epilogue vectorization
is done, back track to the last niter check and duplicate that in the
outer loop.

What I didn't want to do was use the smallest possible vector size for
the target because I was under the impression that does not necessarily
correspond to the smallest VF we may have for a loop, as the vectorizer
may have decided not to vectorize for that vector size because of costs?
If I can assume this never happens, i.e. that once it starts to vectorize
epilogues it will keep vectorizing them for any vector size it knows of,
then yeah I can use that.

> I'm not sure I understand - why would you have any check not inside
> the outer loop?  Yes, we now eventually hoist versioning checks
> but the VF checks for the individual variants should be around
> the vectorized loop itself (so not really part of the versioning check).

Yeah I agree. I was just explaining what I had done wrong now.

> > Cheers,
> > Andre
> >
> > PS: I often find myself having to patch the DOMINATOR information, sometimes
> > it's easy to, but sometimes it can get pretty complicated. I wonder whether it
> > would make sense to write something that traverses a loop and corrects this,
> > if it doesn't exist already.
>
> There's iterate-fix-dominators, but you shouldn't need it unless you
> create new edges/blocks manually rather than doing
> split-block/redirect-edge, which should do dominator updating for you.

Ah I was doing everything manually after having some bad experiences
with lv_add_condition_to_bb.  I will have a look at those thanks!

Cheers,
Andre



Richard.





Richard.

Re: PR91166 - Unfolded ZIPs of constants

2019-07-19 Thread Richard Sandiford

Not really my area, but FWIW...

Prathamesh Kulkarni  writes:
> Hi,
> The attached patch tries to fix PR91166.
> Does it look OK ?
> Bootstrap+test in progress on aarch64-linux-gnu and x86_64-unknown-linux-gnu.
>
> Thanks,
> Prathamesh
>
> 2019-07-17  Prathamesh Kulkarni  
>
>   PR middle-end/91166
>   * match.pd (vec_perm_expr(v, v, mask) -> v): New pattern.
>   (define_predicates): Add entry for uniform_vector_p.
>
> testsuite/
>   * gcc.target/aarch64/sve/pr91166.c: New test.
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 4a7aa0185d8..2ad98c28fd8 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -36,7 +36,8 @@ along with GCC; see the file COPYING3.  If not see
> integer_valued_real_p
> integer_pow2p
> uniform_integer_cst_p
> -   HONOR_NANS)
> +   HONOR_NANS
> +   uniform_vector_p)
>  
>  /* Operator lists.  */
>  (define_operator_list tcc_comparison
> @@ -5568,3 +5569,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type; })
> (if (changed)
>  (vec_perm { op0; } { op1; } { op2; }))
> +
> +/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element.  */
> +(simplify
> + (vec_perm (vec_duplicate@0 @1) @0 @2)
> + { @0; })
> +
> +(simplify
> + (vec_perm uniform_vector_p@0 @0 @1)
> + { @0; }) 

No need for the curly braces here, can use "@0" as the target of
the simplification.

It'd probably be worth using (match ...) to define a new predicate
that handles (vec_duplicate ...), VECTOR_CST and CONSTRUCTOR,
calling into uniform_vector_p for the latter two.
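
To spell out why the fold is valid, here is a scalar model of a two-input permute in plain C. The names and the fixed length of 4 are just for illustration, and this is not how match.pd expresses it. Whichever element the selector picks from either input, it is the same value when both inputs are the same uniform vector.

```c
enum { PERM_N = 4 };

/* res[i] = concat (a, b)[sel[i] mod 2 * PERM_N].  */
void
perm2 (const int *a, const int *b, const unsigned *sel, int *res)
{
  for (int i = 0; i < PERM_N; i++)
    {
      unsigned s = sel[i] % (2 * PERM_N);
      res[i] = s < PERM_N ? a[s] : b[s - PERM_N];
    }
}
```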

Thanks,
Richard

> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
> new file mode 100644
> index 000..42654be3b31
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
> +
> +void
> +f1 (double x[][4]) 
> +{
> +  for (int i = 0; i < 4; ++i)
> +for (int j = 0; j < 4; ++j)
> +  x[i][j] = 0;
> +}
> +
> +void
> +f2 (double x[][4], double y)
> +{
> +  for (int i = 0; i < 4; ++i)
> +for (int j = 0; j < 4; ++j)
> +  x[i][j] = y;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */

Re: C++ PATCH for c++/89906 (GCC 8 backport)

2019-07-19 Thread Marek Polacek

Ping.

On Fri, Jul 12, 2019 at 04:16:38PM -0400, Marek Polacek wrote:
> In order to fix 89906 in GCC 8, we need to backport 86098.
> I think the patch is safe to be backported.
> 
> Tested x86_64-linux, ok for 8?
> 
> 2018-06-12  Jason Merrill  
> 
>   PR c++/86098 - ICE with template placeholder for TTP.
>   * typeck.c (structural_comptypes) [TEMPLATE_TYPE_PARM]: Check
>   CLASS_PLACEHOLDER_TEMPLATE.
> 
> --- gcc/cp/typeck.c
> +++ gcc/cp/typeck.c
> @@ -1375,6 +1375,11 @@ structural_comptypes (tree t1, tree t2, int strict)
>template parameters set, they can't be equal.  */
>if (!comp_template_parms_position (t1, t2))
>   return false;
> +  /* If T1 and T2 don't represent the same class template deduction,
> + they aren't equal.  */
> +  if (CLASS_PLACEHOLDER_TEMPLATE (t1)
> +   != CLASS_PLACEHOLDER_TEMPLATE (t2))
> + return false;
>/* Constrained 'auto's are distinct from parms that don't have the same
>constraints.  */
>if (!equivalent_placeholder_constraints (t1, t2))
> --- /dev/null
> +++ gcc/testsuite/g++.dg/cpp1z/class-deduction58.C
> @@ -0,0 +1,16 @@
> +// PR c++/86098
> +// { dg-additional-options -std=c++17 }
> +
> +template  class future;
> +template  T&& declval();
> +
> +template class T>
> +struct construct_deduced {
> +  template 
> +  using deduced_t = decltype(T{declval()...});
> +  template
> +  deduced_t operator()(AN&&... an) const;
> +};
> +
> +template
> +future future_from(T singleSender);

Marek

Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-19 Thread Matthias Klose

On 14.06.19 15:09, Gaius Mulley wrote:
> 
> Hello,
> 
> here is version two of the patches which introduce Modula-2 into the
> GCC trunk.  The patches include:
> 
>   (*)  a patch to allow all front ends to register a lang spec function.
>(included are patches for all front ends to provide an empty
> callback function).

fyi, the hook patches for brig, D, Fortran and Go are not yet part of your gm2
repository for the trunk (while they are for gcc-9).  The patches are also
duplicated as trunc and trunk.

Matthias

Add ARRAY_REF based access path disambiguation

2019-07-19 Thread Jan Hubicka

Hi,
this patch adds the bare bones of access path disambiguation via
ARRAY_REF.  As with COMPONENT_REF, we need the ARRAY_REF sizes to match
and a proof that the indexes are actually different.
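
As a sketch of the idea only (this is not the code in the patch; the function name, signature and return convention are invented): two ARRAY_REFs off the same base with the same element size cannot overlap when their constant indexes differ, and nothing is concluded otherwise.

```c
/* Returns 1 if base[idx1] and base[idx2] provably do not overlap and 0 if
   we cannot tell.  Hypothetical helper; indexes are known constants.  */
int
array_refs_disjoint_p (long elt_size1, long idx1, long elt_size2, long idx2)
{
  if (elt_size1 != elt_size2)
    return 0;   /* Element sizes do not match: give up.  */
  return idx1 != idx2;
}
```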

This adds about 20 new disambiguations to tramp3d.

Bootstrapped/regtested x86_64-linux, OK?

* tree-ssa-alias.c (nonoverlapping_component_refs_since_match_p):
Rename to ...
(nonoverlapping_refs_since_match_p): ... this; handle also ARRAY_REFs.
(alias_stats): Update stats.
(dump_alias_stats): Likewise.
(aliasing_matching_component_refs_p): Add partial_overlap argument;
pass it to nonoverlapping_refs_since_match_p.
(aliasing_component_refs_walk): Update call of
aliasing_matching_component_refs_p.
(nonoverlapping_array_refs_p): New function.
(decl_refs_may_alias_p, indirect_ref_may_alias_decl_p,
indirect_refs_may_alias_p): Update calls of
nonoverlapping_refs_since_match_p.
* gcc.dg/tree-ssa/alias-access-path-10.c: New testcase.

Index: tree-ssa-alias.c
===
--- tree-ssa-alias.c(revision 273570)
+++ tree-ssa-alias.c(working copy)
@@ -87,7 +87,7 @@ along with GCC; see the file COPYING3.
this file.  Low-level disambiguators dealing with points-to
information are in tree-ssa-structalias.c.  */
 
-static int nonoverlapping_component_refs_since_match_p (tree, tree, tree, 
tree);
+static int nonoverlapping_refs_since_match_p (tree, tree, tree, tree, bool);
 static bool nonoverlapping_component_refs_p (const_tree, const_tree);
 
 /* Query statistics for the different low-level disambiguators.
@@ -104,9 +104,9 @@ static struct {
   unsigned HOST_WIDE_INT aliasing_component_refs_p_no_alias;
   unsigned HOST_WIDE_INT nonoverlapping_component_refs_p_may_alias;
   unsigned HOST_WIDE_INT nonoverlapping_component_refs_p_no_alias;
-  unsigned HOST_WIDE_INT nonoverlapping_component_refs_since_match_p_may_alias;
-  unsigned HOST_WIDE_INT 
nonoverlapping_component_refs_since_match_p_must_overlap;
-  unsigned HOST_WIDE_INT nonoverlapping_component_refs_since_match_p_no_alias;
+  unsigned HOST_WIDE_INT nonoverlapping_refs_since_match_p_may_alias;
+  unsigned HOST_WIDE_INT nonoverlapping_refs_since_match_p_must_overlap;
+  unsigned HOST_WIDE_INT nonoverlapping_refs_since_match_p_no_alias;
 } alias_stats;
 
 void
@@ -137,15 +137,15 @@ dump_alias_stats (FILE *s)
   alias_stats.nonoverlapping_component_refs_p_no_alias,
   alias_stats.nonoverlapping_component_refs_p_no_alias
   + alias_stats.nonoverlapping_component_refs_p_may_alias);
-  fprintf (s, "  nonoverlapping_component_refs_since_match_p: "
+  fprintf (s, "  nonoverlapping_refs_since_match_p: "
   HOST_WIDE_INT_PRINT_DEC" disambiguations, "
   HOST_WIDE_INT_PRINT_DEC" must overlaps, "
   HOST_WIDE_INT_PRINT_DEC" queries\n",
-  alias_stats.nonoverlapping_component_refs_since_match_p_no_alias,
-  alias_stats.nonoverlapping_component_refs_since_match_p_must_overlap,
-  alias_stats.nonoverlapping_component_refs_since_match_p_no_alias
-  + alias_stats.nonoverlapping_component_refs_since_match_p_may_alias
-  + 
alias_stats.nonoverlapping_component_refs_since_match_p_must_overlap);
+  alias_stats.nonoverlapping_refs_since_match_p_no_alias,
+  alias_stats.nonoverlapping_refs_since_match_p_must_overlap,
+  alias_stats.nonoverlapping_refs_since_match_p_no_alias
+  + alias_stats.nonoverlapping_refs_since_match_p_may_alias
+  + alias_stats.nonoverlapping_refs_since_match_p_must_overlap);
   fprintf (s, "  aliasing_component_refs_p: "
   HOST_WIDE_INT_PRINT_DEC" disambiguations, "
   HOST_WIDE_INT_PRINT_DEC" queries\n",
@@ -856,7 +856,8 @@ type_has_components_p (tree type)
 
 /* MATCH1 and MATCH2 which are part of access path of REF1 and REF2
respectively are either pointing to same address or are completely
-   disjoint.
+   disjoint. If PARITAL_OVERLAP is true, assume that outermost arrays may
+   just partly overlap.
 
Try to disambiguate using the access path starting from the match
and return false if there is no conflict.
@@ -867,24 +868,27 @@ static bool
 aliasing_matching_component_refs_p (tree match1, tree ref1,
poly_int64 offset1, poly_int64 max_size1,
tree match2, tree ref2,
-   poly_int64 offset2, poly_int64 max_size2)
+   poly_int64 offset2, poly_int64 max_size2,
+   bool partial_overlap)
 {
   poly_int64 offadj, sztmp, msztmp;
   bool reverse;
 
-
-  get_ref_base_and_extent (match2, &offadj, &sztmp, &msztmp, &reverse);
-  offset2 -= offadj;
-  get_ref_base_and_extent (match1, &offadj, &sztmp, &msztmp, &reverse);
-  offset1 -= offadj;
-  if (!ranges_maybe_overlap_p (offset1, max_size1,

Fix reversed conditional in recursive_inlining

2019-07-19 Thread Jan Hubicka

Hi,
this patch fixes a bug in recursive_inlining noticed by Feng Xue.

Bootstrapped/regtested x86_64-linux, committed.

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 273602)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2019-07-19  Jan Hubicka  
+
+   PR ipa/91194
+   * ipa-inline.c (recursive_inlining): Fix limits check.
+
 2019-07-19  Richard Biener  
 
PR tree-optimization/91200
Index: ipa-inline.c
===
--- ipa-inline.c(revision 273570)
+++ ipa-inline.c(working copy)
@@ -1504,7 +1504,7 @@ recursive_inlining (struct cgraph_edge *
   struct cgraph_node *cnode, *dest = curr->callee;
 
   if (!can_inline_edge_p (curr, true)
- || can_inline_edge_by_limits_p (curr, true))
+ || !can_inline_edge_by_limits_p (curr, true))
continue;
 
   /* MASTER_CLONE is produced in the case we already started modified

[PATCH, i386]: Fix (target part) PR91204, ICE in expand_expr_real_2

2019-07-19 Thread Uros Bizjak

As suggested by Jakub in the PR, add the missing vector one_cmpl<mode>2
expander to mmx.md. A generic fix is in the works by Jakub.

2019-07-19  Uroš Bizjak  

PR target/91204
* config/i386/mmx.md (one_cmpl<mode>2): New expander.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 4c71e66e6607..c78b33b510a6 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1158,6 +1158,14 @@
 ;;
 ;
 
+(define_expand "one_cmpl<mode>2"
+  [(set (match_operand:MMXMODEI 0 "register_operand")
+   (xor:MMXMODEI
+ (match_operand:MMXMODEI 1 "register_operand")
+ (match_dup 2)))]
+  "TARGET_MMX_WITH_SSE"
+  "operands[2] = force_reg (<MODE>mode, CONSTM1_RTX (<MODE>mode));")
+
 (define_insn "mmx_andnot<mode>3"
   [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv")
(and:MMXMODEI

Re: [PATCH, rs6000] Support vrotr<mode>3 for int vector types

2019-07-19 Thread Segher Boessenkool

Hi!

On Fri, Jul 19, 2019 at 10:21:06AM +0800, Kewen.Lin wrote:
> on 2019/7/19 上午3:48, Segher Boessenkool wrote:
> > On Thu, Jul 18, 2019 at 01:44:36PM +0800, Kewen.Lin wrote:
> >> on 2019/7/17 下午9:40, Segher Boessenkool wrote:
> >>> On Wed, Jul 17, 2019 at 04:32:15PM +0800, Kewen.Lin wrote:
>  Regression testing just launched, is it OK for trunk if it's bootstrapped
>  and regresstested on powerpc64le-unknown-linux-gnu?
> >>>
>  +;; Expanders for rotatert to make use of vrotl
>  +(define_expand "vrotr3"
>  +  [(set (match_operand:VEC_I 0 "vint_operand")
>  +(rotatert:VEC_I (match_operand:VEC_I 1 "vint_operand")
>  +  (match_operand:VEC_I 2 
>  "vint_reg_or_const_vector")))]
> >>>
> >>> Having any rotatert in a define_expand or define_insn will regress.

This is wrong.  I don't know why I thought this for a while.

There shouldn't be any rotatert in anything that goes through recog, but
that is everything *except* define_expand.  So define_insn, define_split,
define_peephole, define_peephole2 (and define_insn_and_split, which is
just syntactic sugar).

> Thanks for further explanation!  Sorry that, but I didn't find this 
> HAVE_rotatert definition.  I guess it's due to the preparation is always 
> "DONE"?  Then it doesn't really generate rotatert. 

You only had one in a define_expand.  That is fine, that pattern is never
recognised against.  HAVE_rotatert means that something somewhere will
recognise rotatert RTL insns; if it isn't set, it doesn't make sense to
ever create them, because they will never match.

> although I can see rotatert in insn like below, it seems fine in note?

Sure, many things are allowed in notes that can never show up in RTL
proper.

So, this approach will work fine, and not be too bad.  Could you do a
new patch with it?  It's simple to do, and even if the generic thing
will happen eventually, this is a nice stepping stone for that.
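
For reference, the arithmetic behind expanding a right rotate through a left
rotate can be sketched in plain C++.  This is only an illustration for 32-bit
scalar elements; the helper names rotl32, rotr32 and rotr32_via_rotl are made
up for the sketch and are not part of the rs6000 patch.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left by n bits; masking keeps every shift count in [0, 31].
inline std::uint32_t rotl32(std::uint32_t x, unsigned n)
{
  n &= 31u;
  return (x << n) | (x >> ((32u - n) & 31u));
}

// Reference rotate right, written directly.
inline std::uint32_t rotr32(std::uint32_t x, unsigned n)
{
  n &= 31u;
  return (x >> n) | (x << ((32u - n) & 31u));
}

// Rotate right expressed as a left rotate by the negated count, which is
// the identity an expander for a right rotate can rely on.
inline std::uint32_t rotr32_via_rotl(std::uint32_t x, unsigned n)
{
  return rotl32(x, (32u - (n & 31u)) & 31u);
}
```

Since the expander only ever emits the left-rotate form, no rotatert RTL has
to be recognised afterwards.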

Thanks, and sorry for the confusion,

Segher

Re: [PATCH] Fix simd attribute handling on aarch64 (version 2)

2019-07-19 Thread Steve Ellcey

Here is version two of my patch to fix simd attribute handling on
aarch64.  Unlike the first patch where I swapped the order of the
calls to targetm.simd_clone.adjust and simd_clone_adjust_return_type,
in this one I remove the (conditional) call to build_distinct_type_copy
from simd_clone_adjust_return_type and do it unconditionally before
calling either routine.  The only downside to this that I can see is
that on non-aarch64 platforms where the return type of a vector
function is VOID (and not changed), we will create a distinct type
where we did not before.

I also added some tests to ensure that, on aarch64, the vector
functions created by cloning a simd function have the .variant_pcs
directive and that the original non-vector version of the function
does not have the directive.  Without this patch the non-vector
version emits the directive; that is what this patch
fixes.

Retested on x86 and aarch64 with no regressions.

OK to check in?

Steve Ellcey
sell...@marvell.com


2019-07-19  Steve Ellcey  

* omp-simd-clone.c (simd_clone_adjust_return_type): Remove call to
build_distinct_type_copy.
(simd_clone_adjust): Call build_distinct_type_copy.
(expand_simd_clones): Ditto.


2019-07-19  Steve Ellcey  

* gcc.target/aarch64/simd_pcs_attribute.c: New test.
* gcc.target/aarch64/simd_pcs_attribute-2.c: Ditto.
* gcc.target/aarch64/simd_pcs_attribute-3.c: Ditto.


diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
index caa8da3cba5..427d6f6f514 100644
--- a/gcc/omp-simd-clone.c
+++ b/gcc/omp-simd-clone.c
@@ -498,7 +498,6 @@ simd_clone_adjust_return_type (struct cgraph_node *node)
   /* Adjust the function return type.  */
   if (orig_rettype == void_type_node)
 return NULL_TREE;
-  TREE_TYPE (fndecl) = build_distinct_type_copy (TREE_TYPE (fndecl));
   t = TREE_TYPE (TREE_TYPE (fndecl));
   if (INTEGRAL_TYPE_P (t) || POINTER_TYPE_P (t))
 veclen = node->simdclone->vecsize_int;
@@ -1164,6 +1163,7 @@ simd_clone_adjust (struct cgraph_node *node)
 {
   push_cfun (DECL_STRUCT_FUNCTION (node->decl));
 
+  TREE_TYPE (node->decl) = build_distinct_type_copy (TREE_TYPE (node->decl));
   targetm.simd_clone.adjust (node);
 
   tree retval = simd_clone_adjust_return_type (node);
@@ -1737,6 +1737,8 @@ expand_simd_clones (struct cgraph_node *node)
 	simd_clone_adjust (n);
 	  else
 	{
+	  TREE_TYPE (n->decl)
+		= build_distinct_type_copy (TREE_TYPE (n->decl));
 	  targetm.simd_clone.adjust (n);
 	  simd_clone_adjust_return_type (n);
 	  simd_clone_adjust_argument_types (n);
diff --git a/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute-2.c b/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute-2.c
index e69de29bb2d..913960c607b 100644
--- a/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+/* { dg-require-effective-target aarch64_variant_pcs } */
+
+__attribute__ ((__simd__ ("notinbranch")))
+__attribute__ ((__nothrow__ , __leaf__ , __const__))
+extern double foo (double x);
+
+void bar(double * f, int n)
+{
+	int i;
+	for (i = 0; i < n; i++)
+		f[i] = foo(f[i]);
+}
+
+/* { dg-final { scan-assembler-not {\.variant_pcs\tfoo} } } */
+/* { dg-final { scan-assembler-times {\.variant_pcs\t_ZGVnN2v_foo} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute-3.c b/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute-3.c
index e69de29bb2d..e3debb0ab18 100644
--- a/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute-3.c
+++ b/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute-3.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+/* { dg-require-effective-target aarch64_variant_pcs } */
+
+__attribute__ ((__simd__))
+__attribute__ ((__nothrow__ , __leaf__ , __const__))
+double foo (double x);
+
+void bar(double *f, int n)
+{
+	int i;
+	for (i = 0; i < n; i++)
+		f[i] = foo(f[i]);
+}
+
+double foo(double x)
+{
+	return x * x / 3.0;
+}
+
+/* { dg-final { scan-assembler-not {\.variant_pcs\tfoo} } } */
+/* { dg-final { scan-assembler-times {\.variant_pcs\t_ZGVnM1v_foo} 1 } } */
+/* { dg-final { scan-assembler-times {\.variant_pcs\t_ZGVnM2v_foo} 1 } } */
+/* { dg-final { scan-assembler-times {\.variant_pcs\t_ZGVnN1v_foo} 1 } } */
+/* { dg-final { scan-assembler-times {\.variant_pcs\t_ZGVnN2v_foo} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute.c b/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute.c
index e69de29bb2d..17a0a701cf4 100644
--- a/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute.c
+++ b/gcc/testsuite/gcc.target/aarch64/simd_pcs_attribute.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+/* { dg-require-effective-target aarch64_variant_pcs } */
+
+__attribute__ ((__simd__ ("notinbranch")))
+__attribute__ ((__nothrow__ , __leaf__ , __const__))
+extern double log (double __x);
+
+void foo(double

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-19 Thread Segher Boessenkool

On Fri, Jul 19, 2019 at 10:55:59AM +0100, Richard Sandiford wrote:
> Jozef Lawrynowicz  writes:
> > Is the downside of this macro implementation compared to a DEFHOOKPOD mainly
> > just the maintainability/readability of the added code?
> 
> Macros are essentially the "old way" and target hooks the "new way".
> The decision was made a long time ago to move away from macros where
> possible.  TBH I no longer remember the reasons clearly, but I think
> it was partly to avoid including so much target stuff in non-target
> files, and partly in the hope that we might one day support multiple
> targets in a single build.  Years later, we're not much closer to the
> latter and I'm not sure it's a realistic goal.

Macros also are hard to use: they do not follow the normal language rules,
they just do text replacement (and C macros are terribly weak, at that).
Some of this is made better by conventions like "always put all uses of
macro parameters in parentheses", but that itself is an extra mental load
to remember.
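
A small sketch of the text-replacement problem, for illustration only.  The
names DOUBLE_BAD, DOUBLE_OK and double_fn are invented for this example and
do not exist in GCC.

```cpp
#include <cassert>

// Unparenthesized macro parameter: the argument is pasted in as text.
#define DOUBLE_BAD(x) x * 2
// The conventional workaround: parenthesize every use and the whole body.
#define DOUBLE_OK(x) ((x) * 2)

// An inline function follows the normal language rules: the argument is
// evaluated as an expression before the multiplication happens.
constexpr int double_fn(int x) { return x * 2; }

inline int demo_bad() { return DOUBLE_BAD(1 + 2); }  // expands to 1 + 2 * 2
inline int demo_ok()  { return DOUBLE_OK(1 + 2); }   // expands to ((1 + 2) * 2)
inline int demo_fn()  { return double_fn(1 + 2); }   // argument is 3
```

The first macro silently computes 5 instead of 6; the function form needs no
convention to get this right.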

> That said, there are some macros that are too compile-time sensitive
> to be hooks or variables.  BITS_PER_UNIT is the most obvious example.

Maybe C++ can help here?  Or maybe we can declare some hooks as inline
functions somehow?
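
One possible shape of that idea, sketched under the assumption that a
compile-time-sensitive value is exposed as a constexpr inline function instead
of a macro.  The names target_bits_per_unit, bits_in_bytes and bit_flags are
hypothetical and are not GCC interfaces.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a macro such as BITS_PER_UNIT.
constexpr unsigned target_bits_per_unit() { return 8; }

// Still usable in constant expressions, so the compiler can fold it
// just as it would fold the macro.
constexpr std::size_t bits_in_bytes(std::size_t bytes)
{
  return bytes * target_bits_per_unit();
}

static_assert(bits_in_bytes(4) == 32, "folded at compile time");

// Usable as an array bound, which an ordinary runtime hook could not be.
struct bit_flags
{
  bool flag[target_bits_per_unit()];
};
```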

Segher

Handle strncpy in tree-ssa-dse.c

2019-07-19 Thread Jeff Law


While looking at BZ 80576 I realized a few things.

First for STRNCPY we know the exact count of bytes written and we can
treat it just like MEMCPY and others, both in terms of removing/trimming
them and in terms of using them to allow removal of other stores.

This patch adds support for those routines in DSE.  We test that
subsequent statements can make those calls dead and vice versa and that
we can trim from the head or tail appropriately.
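
The property relied on here is that strncpy always writes exactly the
requested number of bytes, padding with NULs when the source is shorter.  A
small sketch of why a preceding store over the same bytes is dead; the
function names are made up for illustration and are not taken from the patch
or its tests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// strncpy writes exactly n bytes: the source characters, then NUL padding.
// So all n bytes of dst are overwritten no matter how short src is.
inline void copy_exact(char *dst, const char *src, std::size_t n)
{
  std::strncpy(dst, src, n);
}

// Same as above with a redundant memset in front.  Every byte the memset
// stores is overwritten by strncpy, so the memset call is a dead store.
inline void copy_exact_with_dead_store(char *dst, const char *src,
                                       std::size_t n)
{
  std::memset(dst, 0x7f, n);
  std::strncpy(dst, src, n);
}
```

Both functions leave the destination in the same state, which is what allows
the earlier call to be removed.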

While writing that code I also stumbled over a blob of code that I think
I copied from tree-ssa-alias.c that isn't necessary.  In the relevant
code the byte count is always found in the same place.  There's no need
to check the number of operands to the call to figure out where the
count would be.  So that little blob of code is simplified ever so slightly.

Finally, while writing the tests for strncpy I stumbled over a case that
we're still not handling well.

In particular something like this:



void h (char *s)
{
  extern char a[8];
  __builtin_memset (a, 0, sizeof a);
  __builtin_strncpy (a, s, sizeof a);
  frob (a);
}

In this case ref_maybe_used_by_stmt_p returns true for the "a" array at
the strncpy call.  AFAICT that appears to happen because "a" and "s"
could alias each other.

strncpy is documented as not allowing overlap between the source and
destination objects.  So in theory we could consider them not aliasing
for this call.  I haven't implemented this, but I've got some ideas
here.  Anyway, I've included an xfailed test for this case in this patch.

Bootstrapped and regression tested on x86_64, ppc64, ppc64le, aarch64 &
sparc64.  Installing on the trunk momentarily.

We could in theory handle stpncpy too, we just have to be more careful
with its return value.

Jeff
commit 844df9c9ed48c2c0e80b633eb4f513d1228ef62d
Author: Jeff Law 
Date:   Fri Jul 19 11:03:10 2019 -0600

* tree-ssa-dse.c (initialize_ao_ref_for_dse): Handle
strncpy.  Drop some trivial dead code.
(maybe_trim_memstar_call): Handle strncpy.

* gcc.dg/tree-ssa/ssa-dse-37.c: New test.
* gcc.dg/tree-ssa/ssa-dse-38.c: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 08f91ed32db..d8f60042ac1 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2019-07-19  Jeff Law  
+
+   * tree-ssa-dse.c (initialize_ao_ref_for_dse): Handle
+   strncpy.  Drop some trivial dead code.
+   (maybe_trim_memstar_call): Handle strncpy.
+
 2019-07-19  Richard Biener  
 
PR tree-optimization/91211
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 98fb40ddd96..ce8e3c781b9 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2019-07-19  Jeff Law  
+
+   * gcc.dg/tree-ssa/ssa-dse-37.c: New test.
+   * gcc.dg/tree-ssa/ssa-dse-38.c: New test.
+
 2019-07-19  Richard Biener  
 
PR tree-optimization/91211
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-37.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-37.c
new file mode 100644
index 000..56251fc340f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-37.c
@@ -0,0 +1,60 @@
+/* { dg-options "-O2 -fdump-tree-dse-details -fno-tree-fre" } */
+
+
+#ifndef SCOPE
+#define SCOPE
+#endif
+
+extern void frob (char *);
+
+void g (char *s)
+{
+  SCOPE char a[8];
+  __builtin_strncpy (a, s, sizeof a);
+  __builtin_memset (a, 0, sizeof a); 
+  frob (a);
+}
+
+void h (char *s)
+{
+  SCOPE char a[8];
+  __builtin_memset (a, 0, sizeof a); 
+  __builtin_strncpy (a, s, sizeof a);
+  frob (a);
+}
+
+void i (char *s)
+{
+  SCOPE char a[8];
+  __builtin_strncpy (a, s, sizeof a);
+  __builtin_memset (a, 0, sizeof a - 5); 
+  frob (a);
+}
+
+void j (char *s)
+{
+  SCOPE char a[8];
+  __builtin_memset (a, 0, sizeof a); 
+  __builtin_strncpy (a, s, sizeof a - 5);
+  frob (a);
+}
+
+void l (char *s)
+{
+  SCOPE char a[8];
+  __builtin_strncpy (a, s, sizeof a);
+  __builtin_memset (a + 2, 0, sizeof a - 2); 
+  frob (a);
+}
+
+void m (char *s)
+{
+  SCOPE char a[8];
+  __builtin_memset (a, 0, sizeof a); 
+  __builtin_strncpy (a + 2, s, sizeof a - 2);
+  frob (a);
+}
+
+/* { dg-final { scan-tree-dump-times "Deleted dead call" 2 "dse1" } } */
+/* { dg-final { scan-tree-dump-times "Trimming statement " 4 "dse1" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-38.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-38.c
new file mode 100644
index 000..7ae33bfd169
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-38.c
@@ -0,0 +1,12 @@
+/* { dg-options "-O2 -fdump-tree-dse-details -fno-tree-fre" } */
+
+
+/* This changes the scope of the destination object and exposes
+   missed optimizations in DSE.  */
+#define SCOPE extern
+#include "ssa-dse-37.c"
+
+/* { dg-final { scan-tree-dump-times "Deleted dead call" 2 "dse1" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "Trimming statement " 4 "dse1" { xfail *-*-* } } } */
+
+
diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index 9bdcf9ae6af..5b7c4

Re: [PATCH] Fix simd attribute handling on aarch64 (version 2)

2019-07-19 Thread Richard Sandiford

Steve Ellcey  writes:
> Here is version two of my patch to fix simd attribute handling on
> aarch64.  Unlike the first patch where I swapped the order of the
> calls to targetm.simd_clone.adjust and simd_clone_adjust_return_type,
> in this one I remove the (conditional) call to build_distinct_type_copy
> from simd_clone_adjust_return_type and do it unconditionally before
> calling either routine.  The only downside to this that I can see is
> that on non-aarch64 platforms where the return type of a vector
> function is VOID (and not changed), we will create a distinct type
> where we did not before.
>
> I also added some tests to ensure that, on aarch64, the vector
> functions created by cloning a simd function have the .variant_pcs
> directive and that the original non-vector version of the function
> does not have the directive.  Without this patch the non-vector
> version is putting out the directive, that is what this patch
> fixes.
>
> Retested on x86 and aarch64 with no regressions.
>
> OK to checkin?
>
> Steve Ellcey
> sell...@marvell.com
>
>
> 2019-07-19  Steve Ellcey  
>
>   * omp-simd-clone.c (simd_clone_adjust_return_type): Remove call to
>   build_distinct_type_copy.
>   (simd_clone_adjust): Call build_distinct_type_copy.
>   (expand_simd_clones): Ditto.
>
>
> 2019-07-19  Steve Ellcey  
>
>   * gcc.target/aarch64/simd_pcs_attribute.c: New test.
>   * gcc.target/aarch64/simd_pcs_attribute-2.c: Ditto.
>   * gcc.target/aarch64/simd_pcs_attribute-3.c: Ditto.
>
>
> diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
> index caa8da3cba5..427d6f6f514 100644
> --- a/gcc/omp-simd-clone.c
> +++ b/gcc/omp-simd-clone.c
> @@ -498,7 +498,6 @@ simd_clone_adjust_return_type (struct cgraph_node *node)
>/* Adjust the function return type.  */
>if (orig_rettype == void_type_node)
>  return NULL_TREE;
> -  TREE_TYPE (fndecl) = build_distinct_type_copy (TREE_TYPE (fndecl));
>t = TREE_TYPE (TREE_TYPE (fndecl));
>if (INTEGRAL_TYPE_P (t) || POINTER_TYPE_P (t))
>  veclen = node->simdclone->vecsize_int;
> @@ -1164,6 +1163,7 @@ simd_clone_adjust (struct cgraph_node *node)
>  {
>push_cfun (DECL_STRUCT_FUNCTION (node->decl));
>  
> +  TREE_TYPE (node->decl) = build_distinct_type_copy (TREE_TYPE (node->decl));
>targetm.simd_clone.adjust (node);
>  
>tree retval = simd_clone_adjust_return_type (node);
> @@ -1737,6 +1737,8 @@ expand_simd_clones (struct cgraph_node *node)
>   simd_clone_adjust (n);
> else
>   {
> +   TREE_TYPE (n->decl)
> + = build_distinct_type_copy (TREE_TYPE (n->decl));
> targetm.simd_clone.adjust (n);
> simd_clone_adjust_return_type (n);
> simd_clone_adjust_argument_types (n);

You can probably also remove:

  tree new_type = build_distinct_type_copy (TREE_TYPE (node->decl));
  ...
  TREE_TYPE (node->decl) = new_type;

in simd_clone_adjust_argument_types.

I'm happy doing it this way or doing the copy in the AArch64 hook.
It's really Jakub's call.

I don't think the tests need:

/* { dg-require-effective-target aarch64_variant_pcs } */

since they're only dg-do compile.  Leaving the line out would get more
coverage for people using older binutils.

The tests are OK with that change, thanks.

Richard

Re: [PATCH] Allow case-insensitive comparisons of register names by implementing CASE_INSENSITIVE_REGISTER_NAMES PR target/70320

2019-07-19 Thread Richard Sandiford

Segher Boessenkool  writes:
> On Fri, Jul 19, 2019 at 10:55:59AM +0100, Richard Sandiford wrote:
>> Jozef Lawrynowicz  writes:
>> > Is the downside of this macro implementation compared to a DEFHOOKPOD 
>> > mainly
>> > just the maintainability/readability of the added code?
>> 
>> Macros are essentially the "old way" and target hooks the "new way".
>> The decision was made a long time ago to move away from macros where
>> possible.  TBH I no longer remember the reasons clearly, but I think
>> it was partly to avoid including so much target stuff in non-target
>> files, and partly in the hope that we might one day support multiple
>> targets in a single build.  Years later, we're not much closer to the
>> latter and I'm not sure it's a realistic goal.
>
> Macros also are hard to use: they do not follow the normal language rules,
> they just do text replacement (and C macros are terribly weak, at that).
> Some of this is made better by conventions like "always put all uses of
> macro parameters in parentheses", but that itself is an extra mental load
> to remember.
>
>> That said, there are some macros that are too compile-time sensitive
>> to be hooks or variables.  BITS_PER_UNIT is the most obvious example.
>
> Maybe C++ can help here?  Or maybe we can declare some hooks as inline
> functions somehow?

FWIW, I have a WIP patch that uses C++ here.  I expect people will
hate it :-)

Richard

RE: sized delete in _Temporary_buffer<>

2019-07-19 Thread Morwenn Ed

If I'm not mistaken this patch allocates N*sizeof(_Tp) bytes of storage and 
deallocates N bytes when sized deallocation is enabled?

Shouldn't __return_temporary_buffer deallocate N*sizeof(_Tp) instead to match 
the value passed to new?
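
A minimal sketch of the matching pair, assuming the buffer is obtained with
plain ::operator new.  The names get_buffer and return_buffer are placeholders
for illustration and are not the libstdc++ internals.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Allocate raw storage for len objects of type T: len * sizeof(T) bytes.
template<typename T>
T* get_buffer(std::size_t len)
{
  return static_cast<T*>(::operator new(len * sizeof(T)));
}

// Release storage obtained from get_buffer.  The size handed to sized
// deallocation must be the byte count given to operator new, not the
// element count.
template<typename T>
void return_buffer(T* p, std::size_t len)
{
#if __cpp_sized_deallocation
  ::operator delete(p, len * sizeof(T));
#else
  (void) len;
  ::operator delete(p);
#endif
}
```

Passing len alone would understate the size by a factor of sizeof(T) for any
element type wider than one byte.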


From: libstdc++-ow...@gcc.gnu.org on behalf of François Dumont
Sent: Thursday, 18 July 2019 07:41
To: libstd...@gcc.gnu.org; gcc-patches
Subject: sized delete in _Temporary_buffer<>

As we adopted sized deallocation in new_allocator, why not do
the same in _Temporary_buffer<>?

 * include/bits/stl_tempbuf.h (__detail::__return_temporary_buffer):
New.
 (~_Temporary_buffer()): Use latter.
 (_Temporary_buffer(_FIterator, size_type)): Likewise.

Tested w/o activating sized deallocation. I'll try to run tests with
this option activated.

Ok to commit ?

François

[Darwin, committed] More specs TLC.

2019-07-19 Thread Iain Sandoe

This strips out (%< wise) a few driver specs that are only specifying a
default state.
Also warn on an option now ignored, and add some comments to the driver specs
section.  Update the comments to explain why we can’t process all the driver
specs here.

Tested on x86-64-darwin,
applied to mainline
thanks
Iain

2019-07-19  Iain Sandoe  

* config/darwin.h (DRIVER_SELF_SPECS): Ignore X and Mach specs which
refer to default conditions.  Warn for the 'y' spec which is ignored
by current linkers.


diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index e17bc64..ed87984 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -118,13 +118,23 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 /* True if pragma ms_struct is in effect.  */
 extern GTY(()) int darwin_ms_struct;
 
-#define DRIVER_SELF_SPECS  \
-  "%{gfull:-g -fno-eliminate-unused-debug-symbols} %

Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

2019-07-19 Thread Thomas Koenig


Hi Steve,


On Sun, Jul 14, 2019 at 12:07:58PM +0200, Thomas Koenig wrote:

OK, so here is a new version.

I think the discussion has shown that enlarging the buffer makes sense,
and that the buffer size for unformatted seems to be too bad.

I've reversed the names of the environment variables according to
Behnard's suggestion.

So, OK for trunk?

Also, what should we do about gcc-9?  I have now come to think
that we should add the environment variables to set the buffer lengths,
but leave the old default (8192).

What do you think?



If you are inclined to back port a portion of the patch to 9-branch,
then bumping up the old default would seem to be the most important
part.  As dje noted, users seem to have an aversion to reading the
documentation, so finding the environment variables may not happen.

Isn't 8192 an internal implementation detail for libgfortran?  Can
bumping it to larger value in 9-branch cause an issue for a normal
user?


Well, it allocates a bigger memory block, that's all.

Upon reconsideration, I think your point about people not reading
the docs is valid :-|

So, I will commit the patch to trunk over the weekend and to 9.2
a few days afterwards, unless somebody objects.

Regards

Thomas

Re: sized delete in _Temporary_buffer<>

2019-07-19 Thread François Dumont


(Second attempt at sending, as plain text this time.)

Good spot, fixed with attached patch, committed as trivial.

2019-07-19  François Dumont 

    * include/bits/stl_tempbuf.h (__detail::__return_temporary_buffer): Fix
    sized deallocation size computation.


On 7/19/19 9:46 PM, Morwenn Ed wrote:
If I'm not mistaken this patch allocates N*sizeof(_Tp) bytes of 
storage and deallocates N bytes when sized deallocation is enabled?


Shouldn't __return_temporary_buffer deallocate N*sizeof(_Tp) instead 
to match the value passed to new?



*From:* libstdc++-ow...@gcc.gnu.org on behalf of François Dumont
*Sent:* Thursday, 18 July 2019 07:41
*To:* libstd...@gcc.gnu.org; gcc-patches
*Subject:* sized delete in _Temporary_buffer<>
As we adopted sized deallocation in new_allocator, why not do
the same in _Temporary_buffer<>?

 * include/bits/stl_tempbuf.h (__detail::__return_temporary_buffer):
New.
 (~_Temporary_buffer()): Use latter.
 (_Temporary_buffer(_FIterator, size_type)): Likewise.

Tested w/o activating sized deallocation. I'll try to run tests with
this option activated.

Ok to commit ?

François



diff --git a/libstdc++-v3/include/bits/stl_tempbuf.h b/libstdc++-v3/include/bits/stl_tempbuf.h
index bb7c2cd1334..ce3f3624437 100644
--- a/libstdc++-v3/include/bits/stl_tempbuf.h
+++ b/libstdc++-v3/include/bits/stl_tempbuf.h
@@ -71,7 +71,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 size_t __len __attribute__((__unused__)))
   {
 #if __cpp_sized_deallocation
-	::operator delete(__p, __len);
+	::operator delete(__p, __len * sizeof(_Tp));
 #else
 	::operator delete(__p);
 #endif

Add std::copy_n overload for istreambuf_iterator

2019-07-19 Thread François Dumont


It sounds reasonable to overload std::copy_n for istreambuf_iterator.
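
For context, this is the kind of caller that would pick up such an overload.
It is a usage sketch only; read_prefix is a made-up helper, and the code
behaves the same with the generic input-iterator copy_n.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Copy at most n leading characters of text through a stream buffer.
// n is clamped to the available length because copy_n requires that many
// elements to be readable from the source.
inline std::string read_prefix(const std::string& text, std::size_t n)
{
  std::istringstream in(text);
  std::string out;
  std::copy_n(std::istreambuf_iterator<char>(in),
              std::min(n, text.size()),
              std::back_inserter(out));
  return out;
}
```

A dedicated overload can serve such a call from the stream buffer's get area
in chunks instead of one character at a time.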

    * include/bits/stl_algo.h (copy_n(istreambuf_iterator<>, _Size, _OIte)):
    New declaration.
    * include/bits/streambuf_iterator.h (istreambuf_iterator<>): Declare
    std::copy_n for istreambuf_iterator of char types to be friend.
    (std::copy_n(istreambuf_iterator<>, _Size, _OIte)): New overload.
    * include/std/streambuf(basic_streambuf<>): Declare std::copy_n for
    istreambuf_iterator of char types to be friend.
    * testsuite/25_algorithms/copy_n/istreambuf_iterator.cc: New.
    * testsuite/25_algorithms/copy_n/istreambuf_iterator_neg.cc: New.

Tested under Linux x86_64, normal and debug modes.

Ok to commit ?

François

diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
index 478f012def8..ec651e2cc45 100644
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -771,6 +771,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	 _OutputIterator __result, random_access_iterator_tag)
 { return std::copy(__first, __first + __n, __result); }
 
+  template<typename _CharT, typename _Size, typename _OutputIterator>
+typename enable_if<__is_char<_CharT>::__value,
+		   _OutputIterator>::type
+copy_n(istreambuf_iterator<_CharT, char_traits<_CharT> >,
+	   _Size __n, _OutputIterator __result);
+
   /**
*  @brief Copies the range [first,first+n) into [result,result+n).
*  @ingroup mutating_algorithms
diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 2f4ff494a3a..c682fa91bde 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -80,6 +80,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	__copy_move_a2(istreambuf_iterator<_CharT2>,
 		   istreambuf_iterator<_CharT2>, _CharT2*);
 
+#if __cplusplus >= 201103L
+  template<typename _CharT2, typename _Size, typename _OutputIterator>
+	friend typename enable_if<__is_char<_CharT2>::__value,
+  _OutputIterator>::type
+	copy_n(istreambuf_iterator<_CharT2>, _Size, _OutputIterator);
+#endif
+
   template
 	friend typename __gnu_cxx::__enable_if<__is_char<_CharT2>::__value,
 istreambuf_iterator<_CharT2> >::__type
@@ -367,6 +374,50 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return __result;
 }
 
+#if __cplusplus >= 201103L
+  template<typename _CharT, typename _Size, typename _OutputIterator>
+typename enable_if<__is_char<_CharT>::__value, _OutputIterator>::type
+copy_n(istreambuf_iterator<_CharT> __it, _Size __n,
+	   _OutputIterator __result)
+{
+  if (__n == 0)
+	return __result;
+
+  __glibcxx_assert(__n > 0);
+  __glibcxx_requires_cond(!__it._M_at_eof(),
+			  _M_message(__gnu_debug::__msg_inc_istreambuf)
+			  ._M_iterator(__it));
+
+  using traits_type = typename istreambuf_iterator<_CharT>::traits_type;
+  const auto __eof = traits_type::eof();
+
+  auto __sb = __it._M_sbuf;
+  while (__n > 0)
+	{
+	  streamsize __size = __sb->egptr() - __sb->gptr();
+	  if (__size == 0)
+	{
+	  if (traits_type::eq_int_type(__sb->underflow(), __eof))
+		{
+		  __glibcxx_requires_cond(__n == 0,
+		_M_message(__gnu_debug::__msg_inc_istreambuf)
+	  ._M_iterator(__it));
+		  break;
+		}
+
+	  __size =  __sb->egptr() - __sb->gptr();
+	}
+
+	  streamsize __xsize = std::min(__size, __n);
+	  __result = std::copy(__sb->gptr(), __sb->gptr() + __xsize, __result);
+	  __sb->__safe_gbump(__xsize);
+	  __n -= __xsize;
+	}
+
+  return __result;
+}
+#endif
+
   template
 typename __gnu_cxx::__enable_if<__is_char<_CharT>::__value,
 		  		istreambuf_iterator<_CharT> >::__type
diff --git a/libstdc++-v3/include/std/streambuf b/libstdc++-v3/include/std/streambuf
index d9ca981d704..4f62ebf4d95 100644
--- a/libstdc++-v3/include/std/streambuf
+++ b/libstdc++-v3/include/std/streambuf
@@ -155,6 +155,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __copy_move_a2(istreambuf_iterator<_CharT2>,
 		   istreambuf_iterator<_CharT2>, _CharT2*);
 
+#if __cplusplus >= 201103L
+  template<typename _CharT2, typename _Size, typename _OutputIterator>
+	friend typename enable_if<__is_char<_CharT2>::__value,
+  _OutputIterator>::type
+	copy_n(istreambuf_iterator<_CharT2>, _Size, _OutputIterator);
+#endif
+
   template
 friend typename __gnu_cxx::__enable_if<__is_char<_CharT2>::__value,
   istreambuf_iterator<_CharT2> >::__type
diff --git a/libstdc++-v3/testsuite/25_algorithms/copy_n/istreambuf_iterator.cc b/libstdc++-v3/testsuite/25_algorithms/copy_n/istreambuf_iterator.cc
new file mode 100644
index 000..ebd769cf7c0
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/copy_n/istreambuf_iterator.cc
@@ -0,0 +1,59 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be

[PATCH] handle multibyte stores larger than char in strlen (PR 91183, 86688)

2019-07-19 Thread Martin Sebor


On targets with permissive alignment requirements GCC sometimes
lowers stores of short (between two and 16 bytes), power-of-two
char sequences to single integer stores of the corresponding
width.  This happens for sequences of ordinary character stores
as well as for some calls to memcpy.

However, the strlen pass is only prepared to handle character
stores and not those of wider integers.  As a result, the strlen
optimization tends to get defeated in cases when it could benefit
the most: very short strings.  I counted 1544 instances where
the strlen optimization was disabled in a GCC build on x86_64
due to this sort of early store merging, and over two thousand
in a build of the Linux kernel.

In addition, -Wstringop-overflow only considers calls to string
functions and is ineffective against past-the-end accesses by
these merged multibyte stores.
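
As a rough illustration of the bounds arithmetic such a check needs,
here is a minimal sketch in plain C.  It assumes the object size,
the offset and the store width are all known constants; the names
remaining_bytes and store_overflows are hypothetical and do not
appear in GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bytes left in an object of SIZE bytes when starting at offset OFF.
   The result is zero when OFF is at or past the end of the object.  */
size_t
remaining_bytes (size_t size, size_t off)
{
  return off < size ? size - off : 0;
}

/* Return true if a store of WIDTH bytes at offset OFF would write
   past the end of an object of SIZE bytes.  */
bool
store_overflows (size_t size, size_t off, size_t width)
{
  assert (width > 0);   /* A store writes at least one byte.  */
  return width > remaining_bytes (size, off);
}
```

With these definitions a four-byte store at offset 2 of a four-byte
array is flagged, while the same store at offset 0 is not.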

To improve the effectiveness of both the optimization and the
warning, the attached patch enhances the strlen pass to
consider these wide accesses.  Since the infrastructure for doing
this is already in place (strlen can compute multibyte accesses
via MEM_REFs of character arrays), the enhancement isn't very
intrusive.  It relies on the native_encode_expr function to
determine the encoding of an expression and its "length".
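
The idea of deriving a string length from the byte encoding of a
wider store can be sketched outside the compiler as follows.  This
is only a model under stated assumptions: memcpy stands in for
native_encode_expr, and the helper names leading_nonzero_bytes and
strlen_after_word_store are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the nonzero bytes at the start of the NBYTES-byte
   representation REP, stopping at the first zero byte.  */
static size_t
leading_nonzero_bytes (const unsigned char *rep, size_t nbytes)
{
  assert (nbytes <= 16);   /* Merged stores are at most 16 bytes.  */
  size_t n = 0;
  while (n < nbytes && rep[n] != 0)
    n++;
  return n;
}

/* Model of "what string length does storing WORD imply?".  A result
   equal to sizeof WORD means no terminating NUL was stored, so the
   real length is only known to be at least that large.  */
size_t
strlen_after_word_store (uint32_t word)
{
  unsigned char rep[sizeof word];
  memcpy (rep, &word, sizeof rep);   /* Byte image in target order.  */
  return leading_nonzero_bytes (rep, sizeof rep);
}
```

Because the count is taken over the in-memory byte image, the result
depends on how the word was built, which is why the sketch should be
fed words assembled with memcpy from character data.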

I tested the patch on x86_64.  I expect the tests the patch
adds will need some adjustment for big-endian and strictly
aligned targets.

Martin
PR tree-optimization/91183 - strlen of a strcpy result with a conditional source not folded
PR tree-optimization/86688 - missing -Wstringop-overflow using a non-string local array in strnlen with excessive bound

gcc/ChangeLog:

	PR tree-optimization/91183
	PR tree-optimization/86688
	* builtins.c (compute_objsize): Handle MEM_REF.
	* tree-ssa-strlen.c (class ssa_name_limit_t): New.
	(get_min_string_length): Remove.
	(count_nonzero_bytes): New function.
	(handle_char_store): Rename...
	(handle_store): to this.  Handle multibyte stores via integer types.
	(strlen_check_and_optimize_stmt): Adjust conditional and the called
	function name.

gcc/testsuite/ChangeLog:

	PR tree-optimization/91183
	PR tree-optimization/86688
	* gcc.dg/attr-nonstring-2.c: Remove xfails.
	* gcc.dg/strlenopt-70.c: New test.
	* gcc.dg/strlenopt-71.c: New test.
	* gcc.dg/strlenopt-72.c: New test.
	* gcc.dg/strlenopt-8.c: Remove xfails.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index e5a9261e84c..695a9d191af 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3652,6 +3652,20 @@ compute_objsize (tree dest, int ostype)
   if (!ostype)
 return NULL_TREE;
 
+  if (TREE_CODE (dest) == MEM_REF)
+{
+  tree ref = TREE_OPERAND (dest, 0);
+  tree off = TREE_OPERAND (dest, 1);
+  if (tree size = compute_objsize (ref, ostype))
+	{
+	  if (tree_int_cst_lt (off, size))
+	return fold_build2 (MINUS_EXPR, size_type_node, size, off);
+	  return integer_zero_node;
+	}
+
+  return NULL_TREE;
+}
+
   if (TREE_CODE (dest) != ADDR_EXPR)
 return NULL_TREE;
 
diff --git a/gcc/testsuite/gcc.dg/attr-nonstring-2.c b/gcc/testsuite/gcc.dg/attr-nonstring-2.c
index 246a3729a2a..ef2144d6207 100644
--- a/gcc/testsuite/gcc.dg/attr-nonstring-2.c
+++ b/gcc/testsuite/gcc.dg/attr-nonstring-2.c
@@ -73,8 +73,8 @@ void test_strnlen_string_cst (void)
   T (3, "12",  3, 1);
   T (3, "12",  3, 9);
   T (3, "123", 3, 1);
-  T (3, "123", 3, 4);   /* { dg-warning "argument 1 declared attribute .nonstring. is smaller than the specified bound 4" "bug 86688" { xfail *-*-* } } */
-  T (3, "123", 3, 9);   /* { dg-warning "argument 1 declared attribute .nonstring. is smaller than the specified bound 9" "bug 86688" { xfail *-*-* } } */
+  T (3, "123", 3, 4);   /* { dg-warning "argument 1 declared attribute .nonstring. is smaller than the specified bound 4" } */
+  T (3, "123", 3, 9);   /* { dg-warning "argument 1 declared attribute .nonstring. is smaller than the specified bound 9" } */
 
   T (5, "1",   2, 1);
   T (5, "1",   2, 2);
@@ -110,6 +110,6 @@ void test_strnlen_string_range (void)
 {
   T (3, "1",   2, UR (0, 1));
   T (3, "1",   2, UR (3, 9));
-  T (3, "123", 3, UR (4, 5));   /* { dg-warning "argument 1 declared attribute .nonstring. is smaller than the specified bound \\\[4, 5]" "bug 86688" { xfail *-*-* } } */
-  T (3, "123", 3, UR (5, 9));   /* { dg-warning "argument 1 declared attribute .nonstring. is smaller than the specified bound \\\[5, 9]" "bug 86688" { xfail *-*-* } } */
+  T (3, "123", 3, UR (4, 5));   /* { dg-warning "argument 1 declared attribute .nonstring. is smaller than the specified bound \\\[4, 5]" } */
+  T (3, "123", 3, UR (5, 9));   /* { dg-warning "argument 1 declared attribute .nonstring. is smaller than the specified bound \\\[5, 9]" } */
 }
diff --git a/gcc/testsuite/gcc.dg/strlenopt-70.c b/gcc/testsuite/gcc.dg/strlenopt-70.c
new file mode 100644
index 000..59e1081c9b5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-70.c
@@

Go patch committed: Don't export function bodies marked go:noinline

2019-07-19 Thread Ian Lance Taylor

This Go frontend patch by Than McIntosh stops exporting bodies for
functions marked "go:noinline".  The current Mark_inline_candidates
helper looks only at the budget when deciding to mark a function or
method as inline (with the proviso that IR constructs not yet
supported by the inliner are given an artificially high cost).  This
patch changes the helper to also look at whether a function has the
"go:noinline" pragma; if it does, there is no point putting it into
the export data (it would just make the export data bigger).

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.
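
The decision described above can be modelled with a small sketch,
written in C rather than the frontend's C++.  The struct func_info,
the PRAGMA_NOINLINE bit and should_export_body are hypothetical
names, and the "cost no greater than budget" rule is an assumption
about how the budget is applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pragma bit standing in for GOPRAGMA_NOINLINE.  */
#define PRAGMA_NOINLINE 0x1u

/* Hypothetical summary of a function: its pragma bits and an
   already computed inlining cost.  */
struct func_info
{
  unsigned pragmas;
  int cost;
};

/* Return true if the body of F is worth writing to export data:
   it must not be marked noinline and must fit within BUDGET.  */
bool
should_export_body (const struct func_info *f, int budget)
{
  assert (f != NULL);
  if ((f->pragmas & PRAGMA_NOINLINE) != 0)
    return false;   /* Never inlined, so exporting it is wasted space.  */
  return f->cost <= budget;
}
```

The pragma test comes first so that no cost needs to be computed for
functions that can never be inlined.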

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 273577)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-4df7c8d7af894ee93f50c3a50debdcf4e369a2c6
+e242929304e7a524ced56dc94605bbf6d83e6489
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 273564)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -5109,6 +5109,8 @@ int
 Mark_inline_candidates::function(Named_object* no)
 {
   Function* func = no->func_value();
+  if ((func->pragmas() & GOPRAGMA_NOINLINE) != 0)
+return TRAVERSE_CONTINUE;
   int budget = budget_heuristic;
   Inline_within_budget iwb(&budget);
   func->block()->traverse(&iwb);
@@ -5138,6 +5140,8 @@ Mark_inline_candidates::type(Type* t)
   Named_object* no = *p;
   go_assert(no->is_function());
   Function *func = no->func_value();
+  if ((func->pragmas() & GOPRAGMA_NOINLINE) != 0)
+continue;
   int budget = budget_heuristic;
   Inline_within_budget iwb(&budget);
   func->block()->traverse(&iwb);

[PATCH] [rs6000] Add _mm_blend_epi16 and _mm_blendv_epi8

2019-07-19 Thread Paul Clarke

Add compatibility implementations of _mm_blend_epi16 and _mm_blendv_epi8
intrinsics.
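
For reference, the lane selection these two intrinsics are expected
to perform can be modelled in scalar C.  This is a sketch of the
semantics only, not of the vector implementation in the patch; the
function names blend_epi16_model and blendv_epi8_model are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of _mm_blend_epi16: when bit I of IMM8 is set, 16-bit
   lane I is taken from B, otherwise from A.  */
void
blend_epi16_model (uint16_t dst[8], const uint16_t a[8],
                   const uint16_t b[8], int imm8)
{
  assert (dst != NULL && a != NULL && b != NULL);
  for (size_t i = 0; i < 8; i++)
    dst[i] = (((unsigned) imm8 >> i) & 1u) ? b[i] : a[i];
}

/* Scalar model of _mm_blendv_epi8: when the high bit of mask byte I
   is set, byte I is taken from B, otherwise from A.  */
void
blendv_epi8_model (uint8_t dst[16], const uint8_t a[16],
                   const uint8_t b[16], const uint8_t mask[16])
{
  assert (dst != NULL && a != NULL && b != NULL && mask != NULL);
  for (size_t i = 0; i < 16; i++)
    dst[i] = (mask[i] & 0x80) ? b[i] : a[i];
}
```

Only the top bit of each mask byte matters in the second model, which
matches the check performed by the pblendvb test case.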

Respective test cases are copied almost verbatim (minor changes to
the dejagnu head lines) from i386.

2019-07-19  Paul A. Clarke  

[gcc]

* config/rs6000/smmintrin.h (_mm_blend_epi16): New.
(_mm_blendv_epi8): New.

[gcc/testsuite]

* gcc.target/powerpc/sse4_1-check.h: New.
* gcc.target/powerpc/sse4_1-pblendvb.c: New.
* gcc.target/powerpc/sse4_1-pblendw.c: New.
* gcc.target/powerpc/sse4_1-pblendw-2.c: New.

Tested on 64bit LE, 64bit and 32bit BE.

OK for trunk?

Index: gcc/config/rs6000/smmintrin.h
===
diff --git a/trunk/gcc/config/rs6000/smmintrin.h b/trunk/gcc/config/rs6000/smmintrin.h
--- a/trunk/gcc/config/rs6000/smmintrin.h   (revision 273615)
+++ b/trunk/gcc/config/rs6000/smmintrin.h   (working copy)
@@ -66,4 +66,27 @@ _mm_extract_ps (__m128 __X, const int __N)
   return ((__v4si)__X)[__N & 3];
 }
 
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_blend_epi16 (__m128i __A, __m128i __B, const int __imm8)
+{
+  __v8hu __bitmask = vec_splats ((unsigned short) __imm8);
+  const __v8hu __shifty = { 0, 1, 2, 3, 4, 5, 6, 7 };
+  __bitmask = vec_sr (__bitmask, __shifty);
+  const __v8hu __ones = vec_splats ((unsigned short) 0x0001);
+  __bitmask = vec_and (__bitmask, __ones);
+  const __v8hu __zero = {0};
+  __bitmask = vec_sub (__zero, __bitmask);
+  return (__m128i) vec_sel ((__v8hu) __A, (__v8hu) __B, __bitmask);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_blendv_epi8 (__m128i __A, __m128i __B, __m128i __mask)
+{
+  const __v16qu __hibits = vec_splats ((unsigned char) 0x80);
+  __v16qu __lmask = vec_and ((__v16qu) __mask, __hibits);
+  const __v16qu __zero = {0};
+  __lmask = (vector unsigned char) vec_cmpgt (__lmask, __zero);
+  return (__m128i) vec_sel ((__v16qu) __A, (__v16qu) __B, __lmask);
+}
+
 #endif
Index: gcc/testsuite/gcc.target/powerpc/sse4_1-check.h
===
diff --git a/trunk/gcc/testsuite/gcc.target/powerpc/sse4_1-check.h b/trunk/gcc/testsuite/gcc.target/powerpc/sse4_1-check.h
new file mode 100644
--- /dev/null   (revision 0)
+++ b/trunk/gcc/testsuite/gcc.target/powerpc/sse4_1-check.h (working copy)
@@ -0,0 +1,27 @@
+#include 
+#include 
+
+#include "m128-check.h"
+
+//#define DEBUG 1
+
+#define TEST sse4_1_test
+
+static void sse4_1_test (void);
+
+static void
+__attribute__ ((noinline))
+do_test (void)
+{
+  sse4_1_test ();
+}
+
+int
+main ()
+{
+  do_test ();
+#ifdef DEBUG
+  printf ("PASSED\n");
+#endif
+  return 0;
+}
Index: gcc/testsuite/gcc.target/powerpc/sse4_1-pblendvb.c
===
diff --git a/trunk/gcc/testsuite/gcc.target/powerpc/sse4_1-pblendvb.c b/trunk/gcc/testsuite/gcc.target/powerpc/sse4_1-pblendvb.c
new file mode 100644
--- /dev/null   (revision 0)
+++ b/trunk/gcc/testsuite/gcc.target/powerpc/sse4_1-pblendvb.c  (working copy)
@@ -0,0 +1,71 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#define NO_WARN_X86_INTRINSICS 1
+#ifndef CHECK_H
+#define CHECK_H "sse4_1-check.h"
+#endif
+
+#ifndef TEST
+#define TEST sse4_1_test
+#endif
+
+#include CHECK_H
+
+#include 
+#include 
+
+#define NUM 20
+
+static void
+init_pblendvb (unsigned char *src1, unsigned char *src2,
+  unsigned char *mask)
+{
+  int i, sign = 1; 
+
+  for (i = 0; i < NUM * 16; i++)
+{
+  src1[i] = i* i * sign;
+  src2[i] = (i + 20) * sign;
+  mask[i] = (i % 3) + ((i * (14 + sign))
+  ^ (src1[i] | src2[i] | (i*3)));
+  sign = -sign;
+}
+}
+
+static int
+check_pblendvb (__m128i *dst, unsigned char *src1,
+   unsigned char *src2, unsigned char *mask)
+{
+  unsigned char tmp[16];
+  int j;
+
+  memcpy (&tmp[0], src1, sizeof (tmp));
+  for (j = 0; j < 16; j++)
+if (mask [j] & 0x80)
+  tmp[j] = src2[j];
+
+  return memcmp (dst, &tmp[0], sizeof (tmp));
+}
+
+static void
+TEST (void)
+{
+  union
+{
+  __m128i x[NUM];
+  unsigned char c[NUM * 16];
+} dst, src1, src2, mask;
+  int i;
+
+  init_pblendvb (src1.c, src2.c, mask.c);
+
+  for (i = 0; i < NUM; i++)
+{
+  dst.x[i] = _mm_blendv_epi8 (src1.x[i], src2.x[i], mask.x[i]);
+  if (check_pblendvb (&dst.x[i], &src1.c[i * 16], &src2.c[i * 16],
+ &mask.c[i * 16]))
+   abort ();
+}
+}
Index: gcc/testsuite/gcc.target/powerpc/sse4_1-pblendw-2.c
===
diff --git a/trunk/gcc/testsuite/gcc.target/powerpc/sse4_1-pblendw-2.c b/trunk/gcc/testsuite/gcc.target/powerpc/sse4_1-pblendw-2.c
new file mode 100644
--- /dev/null   (revision 0)
+++ b/trunk/gcc/testsuite/gcc.target/powerpc

[committed] Improve simd with a single lastprivate iterator

2019-07-19 Thread Jakub Jelinek

Hi!

While the iterator of a simd collapse(1) loop is predetermined linear, in
OpenMP 5 one can also specify it explicitly in a linear, lastprivate or
private clause.  The following testcase shows that we weren't vectorizing
such loops if the iterator wasn't addressable and had been made explicitly
lastprivate, as the magic simd arrays prevented computing the number of
iterations.  Fixed by not forcing it into a simd array in that case.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2019-07-20  Jakub Jelinek  

* omp-low.c (lower_rec_input_clauses): Don't force simd arrays for
lastprivate non-addressable iterator of a collapse(1) simd.

* gcc.dg/vect/vect-simd-16.c: New test.

--- gcc/omp-low.c.jj    2019-07-19 13:25:52.001314547 +0200
+++ gcc/omp-low.c   2019-07-19 17:01:17.168704782 +0200
@@ -5123,7 +5123,10 @@ lower_rec_input_clauses (tree clauses, g
{
  tree y = lang_hooks.decls.omp_clause_dtor (c, new_var);
  if ((TREE_ADDRESSABLE (new_var) || nx || y
-  || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE
+  || (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE
+  && (gimple_omp_for_collapse (ctx->stmt) != 1
+  || (gimple_omp_for_index (ctx->stmt, 0)
+  != new_var)))
   || OMP_CLAUSE_CODE (c) == OMP_CLAUSE__CONDTEMP_
   || omp_is_reference (var))
  && lower_rec_simd_input_clauses (new_var, ctx, &sctx,
--- gcc/testsuite/gcc.dg/vect/vect-simd-16.c.jj 2019-07-19 17:16:24.035069658 +0200
+++ gcc/testsuite/gcc.dg/vect/vect-simd-16.c    2019-07-19 17:20:48.034099757 +0200
@@ -0,0 +1,61 @@
+/* { dg-additional-options "-fopenmp-simd" } */
+/* { dg-additional-options "-mavx" { target avx_runtime } } */
+/* { dg-final { scan-tree-dump-times "vectorized \[1-3] loops" 3 "vect" { target i?86-*-* x86_64-*-* } } } */
+
+#include "tree-vect.h"
+
+__attribute__((noipa)) int
+foo (int *a)
+{
+  int i;
+  #pragma omp simd lastprivate (i)
+  for (i = 0; i < 64; i++)
+a[i] = i;
+  return i;
+}
+
+__attribute__((noipa)) void
+bar (int *a)
+{
+  int i;
+  #pragma omp simd private (i)
+  for (i = 0; i < 64; i++)
+a[i] = i + 1;
+}
+
+__attribute__((noipa)) int
+baz (int *a)
+{
+  int i;
+  #pragma omp simd linear (i)
+  for (i = 0; i < 64; i++)
+a[i] = i + 2;
+  return i;
+}
+
+int
+main ()
+{
+  int i;
+  int a[64];
+  check_vect ();
+  if (foo (a) != 64)
+abort ();
+  for (i = 0; i < 64; ++i)
+if (a[i] != i)
+  abort ();
+else
+  a[i] = -8;
+  bar (a);
+  for (i = 0; i < 64; ++i)
+if (a[i] != i + 1)
+  abort ();
+else
+  a[i] = -8;
+  if (baz (a) != 64)
+abort ();
+  for (i = 0; i < 64; ++i)
+if (a[i] != i + 2)
+  abort ();
+  return 0;
+}

Jakub

[PATCH]: Fix ICE in expand_expr_real_2 (PR target/91204)

2019-07-19 Thread Jakub Jelinek

On Fri, Jul 19, 2019 at 04:41:06PM +0200, Uros Bizjak wrote:
> As suggested by Jakub in the PR, add missing vector one_cmpl2 to
> mmx.md. A generic fix is in the works by Jakub.

Yes, here it is.  Bootstrapped/regtested on x86_64-linux and i686-linux, ok
for trunk?
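
The fallback relies on the identity that a bitwise complement equals
an exclusive-or with an all-ones value.  A minimal sketch in plain C,
with an array standing in for an integer vector; the function names
not_via_xor and not_via_xor_lanes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute ~X as X ^ -1, using the all-ones 32-bit constant.  */
uint32_t
not_via_xor (uint32_t x)
{
  return x ^ UINT32_MAX;
}

/* Apply the same rewrite to each of the N lanes of V in place,
   modelling the integer vector mode case.  */
void
not_via_xor_lanes (uint32_t *v, size_t n)
{
  assert (v != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    v[i] = not_via_xor (v[i]);
}
```

The rewrite is only useful when the target has an xor pattern for the
mode in question, which is what the optab_handler check in the patch
guards.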

2019-07-20  Jakub Jelinek  

PR target/91204
* optabs.c (expand_unop): As fallback, expand ~op0 as op0 ^ -1.

* gcc.c-torture/compile/pr91204.c: New test.

--- gcc/optabs.c.jj 2019-07-15 10:53:10.743205405 +0200
+++ gcc/optabs.c        2019-07-19 00:38:20.271852242 +0200
@@ -2972,6 +2972,17 @@ expand_unop (machine_mode mode, optab un
   return target;
 }
 
+  /* Emit ~op0 as op0 ^ -1.  */
+  if (unoptab == one_cmpl_optab
+  && (SCALAR_INT_MODE_P (mode) || GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+  && optab_handler (xor_optab, mode) != CODE_FOR_nothing)
+{
+  temp = expand_binop (mode, xor_optab, op0, CONSTM1_RTX (mode),
+  target, unsignedp, OPTAB_DIRECT);
+  if (temp)
+   return temp;
+}
+
   if (optab_to_code (unoptab) == NEG)
 {
   /* Try negating floating point values by flipping the sign bit.  */
--- gcc/testsuite/gcc.c-torture/compile/pr91204.c.jj    2019-07-19 09:29:32.366011373 +0200
+++ gcc/testsuite/gcc.c-torture/compile/pr91204.c       2019-07-19 09:29:11.011340662 +0200
@@ -0,0 +1,11 @@
+/* PR target/91204 */
+
+int a, b, c[64];
+
+void
+foo (void)
+{
+  int i;
+  for (i = 2; i < 64; i++)
+c[i] &= b ^ c[i] ^ c[i - 2];
+}


Jakub

Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-07-19 Thread Maxim Kuvyrkov

> On Jul 16, 2019, at 5:14 PM, Maxim Kuvyrkov  wrote:
> 
>> On Jul 16, 2019, at 3:34 PM, Jason Merrill  wrote:
>> 
...
>> 
>>> b. Re-write tags/ branches into annotated tags.  Note that tags/* are 
>>> included into history of several branches via merge or copy commits, so we 
>>> would need to re-write history to have proper references to annotated tag 
>>> commits in the histories of such branches.
>> 
>> Missing tags is definitely something to fix about the current mirror.
>> I don't think we need to worry about inserting them into branch
>> history.
> 
> If we don't do this then "git branch -a --contains some/tag" will not work 
> correctly.

I was wrong here.  Git tag objects (annotated tags) cannot appear in branch 
history because they are resolved to the commits they are pointing to.  Only 
commit objects can appear in branch history.

This makes conversion of tags much simpler, since [annotated] tags cannot 
affect branch history.

Regards,

--
Maxim Kuvyrkov
www.linaro.org
