date:20200407

*PING**2 Re: [C/C++, OpenACC] Reject vars of different scope in acc declare (PR94120)

2020-04-07 Thread Tobias Burnus


*PING**2

On 3/24/20 11:18 AM, Tobias Burnus wrote:

On 3/11/20 2:28 PM, Tobias Burnus wrote:

Fortran patch:
https://gcc.gnu.org/pipermail/gcc-patches/current/541774.html

Like for Fortran, it also fixes some other issues – here for C++
related to namespaces. (For class, see PR c++/94140.)

Test case of the PR yields an ICE in the middle end and the
namespace tests an ICE in cc1plus. Additionally, invalid code
is not diagnosed.

The OpenACC spec has under "Declare Directive" has the following
restriction:

"A declare directive must be in the same scope
 as the declaration of any var that appears in
 the data clauses of the directive."

("A declare directive is used […] following a variable
  declaration in C or C++".)

NOTE for C++: This patch assumes that variables in a namespace
are handled in the same way as those which are at
global (namespace) scope; however, the OpenACC specification's
wording currently is "In C or C++ global scope, only …".
Hence, one can argue about this part of the patch; but as
it fixes an ICE and is a very sensible extension – the other
option is to reject it – I believe it is fine.
(On the OpenACC side, this is now Issue 288.)

OK for the trunk?

Tobias


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter

Re: *PING**2 Re: [C/C++, OpenACC] Reject vars of different scope in acc declare (PR94120)

2020-04-07 Thread Thomas Schwinge

Hi Tobias!

On 2020-04-07T09:21:27+0200, Tobias Burnus  wrote:
> *PING**2

<87h7y3ofad.fsf@euler.schwinge.homeip.net">http://mid.mail-archive.com/87h7y3ofad.fsf@euler.schwinge.homeip.net>?


Grüße
 Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter

[committed] d: Always set ASM_VOLATILE_P on asm statements (PR94425)

2020-04-07 Thread Iain Buclaw via Gcc-patches

Hi,

This patch removes the special treatment of 'asm pure' blocks, as the
need for code correctness outweighs the need for optimization around
these statements.

Bootstrapped and regression tested on x86_64-linux-gnu. Committed to
mainline.

Regards
Iain.

---
gcc/d/ChangeLog:

* toir.cc (IRVisitor::visit (GccAsmStatement *)): Set ASM_VOLATILE_P
on all asm statements.

---
 gcc/d/toir.cc   | 5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/d/toir.cc b/gcc/d/toir.cc
index 21e31dc93d4..6aaf10bf4e4 100644
--- a/gcc/d/toir.cc
+++ b/gcc/d/toir.cc
@@ -1427,8 +1427,9 @@ public:
 if (s->args == NULL && s->clobbers == NULL)
   ASM_INPUT_P (exp) = 1;
 
-/* Asm statements are treated as volatile unless 'pure'.  */
-ASM_VOLATILE_P (exp) = !(s->stc & STCpure);
+/* All asm statements are assumed to have a side effect.  As a future
+   optimization, this could be unset when building in release mode.  */
+ASM_VOLATILE_P (exp) = 1;
 
 add_stmt (exp);
   }
-- 
2.20.1

Re: [PATCH] aarch64: Fix {ash[lr],lshr}3 expanders [PR94488]

2020-04-07 Thread Richard Sandiford

Jakub Jelinek  writes:
> Hi!
>
> The following testcase ICEs on aarch64 apparently since the introduction of
> the aarch64 port.  The reason is that the {ashl,ashr,lshr}3 expanders
> completely unnecessarily FAIL; if operands[2] is something other than
> a CONST_INT or REG or MEM and the middle-end code can't cope with the
> pattern giving up in these cases.  All the expanders use general_operand
> predicate for the shift amount operand, but then have just a special case
> for CONST_INT (if in-bound, emit an immediate shift, otherwise force into
> REG), or MEM (force into REG), or REG (that is the case it handles).
> In the testcase, operands[2] is a lowpart SUBREG of a REG, which is valid
> general_operand.
> I don't see any reason what is magic about MEMs that it should be forced
> into REG and others like SUBREGs that it shouldn't, there isn't even a
> reason to check for !REG_P because force_reg will do nothing if the operand
> is already a REG, and otherwise can handle general_operand just fine.
>
> Fixed thusly, bootstrapped/regtested on aarch64-linux, ok for trunk and
> after a while for backports?
>
> 2020-04-06  Jakub Jelinek  
>
>   PR target/94488
>   * config/aarch64/aarch64-simd.md (ashl3, lshr3,
>   ashr3): Force operands[2] into reg whenever it is not CONST_INT.
>   Assume it is a REG after that instead of testing it and doing FAIL
>   otherwise.  Formatting fix.
>
>   * gcc.c-torture/compile/pr94488.c: New test.
>
> --- gcc/config/aarch64/aarch64-simd.md.jj 2020-03-09 12:43:00.742038043 
> +0100
> +++ gcc/config/aarch64/aarch64-simd.md2020-04-06 08:28:35.527116650 
> +0200
> @@ -1106,30 +1106,18 @@ (define_expand "ashl3"
>DONE;
>  }
>else
> -{
> -  operands[2] = force_reg (SImode, operands[2]);
> -}
> -}
> -  else if (MEM_P (operands[2]))
> -{
> -  operands[2] = force_reg (SImode, operands[2]);
> -}
> -
> -  if (REG_P (operands[2]))
> -{
> -  rtx tmp = gen_reg_rtx (mode);
> -  emit_insn (gen_aarch64_simd_dup (tmp,
> -  convert_to_mode (mode,
> -   operands[2],
> -   0)));
> -  emit_insn (gen_aarch64_simd_reg_sshl (operands[0], operands[1],
> -   tmp));
> -  DONE;
> + operands[2] = force_reg (SImode, operands[2]);
>  }
>else
> -FAIL;
> -}
> -)
> +operands[2] = force_reg (SImode, operands[2]);

Looks like we now do this force_reg whenever we fall through, so it'd
be simpler do it unconditionally and get rid of the elses.  Same for
the other patterns.

OK with that change, thanks.

Richard


> +
> +  rtx tmp = gen_reg_rtx (mode);
> +  emit_insn (gen_aarch64_simd_dup (tmp, convert_to_mode (mode,
> +operands[2],
> +0)));
> +  emit_insn (gen_aarch64_simd_reg_sshl (operands[0], operands[1], 
> tmp));
> +  DONE;
> +})
>  
>  (define_expand "lshr3"
>[(match_operand:VDQ_I 0 "register_operand")
> @@ -1155,28 +1143,18 @@ (define_expand "lshr3"
>else
>  operands[2] = force_reg (SImode, operands[2]);
>  }
> -  else if (MEM_P (operands[2]))
> -{
> -  operands[2] = force_reg (SImode, operands[2]);
> -}
> -
> -  if (REG_P (operands[2]))
> -{
> -  rtx tmp = gen_reg_rtx (SImode);
> -  rtx tmp1 = gen_reg_rtx (mode);
> -  emit_insn (gen_negsi2 (tmp, operands[2]));
> -  emit_insn (gen_aarch64_simd_dup (tmp1,
> -  convert_to_mode (mode,
> -   tmp, 0)));
> -  emit_insn (gen_aarch64_simd_reg_shl_unsigned (operands[0],
> -   operands[1],
> -   tmp1));
> -  DONE;
> -}
>else
> -FAIL;
> -}
> -)
> +operands[2] = force_reg (SImode, operands[2]);
> +
> +  rtx tmp = gen_reg_rtx (SImode);
> +  rtx tmp1 = gen_reg_rtx (mode);
> +  emit_insn (gen_negsi2 (tmp, operands[2]));
> +  emit_insn (gen_aarch64_simd_dup (tmp1,
> +  convert_to_mode (mode, tmp, 0)));
> +  emit_insn (gen_aarch64_simd_reg_shl_unsigned (operands[0], 
> operands[1],
> +   tmp1));
> +  DONE;
> +})
>  
>  (define_expand "ashr3"
>[(match_operand:VDQ_I 0 "register_operand")
> @@ -1202,28 +1180,18 @@ (define_expand "ashr3"
>else
>  operands[2] = force_reg (SImode, operands[2]);
>  }
> -  else if (MEM_P (operands[2]))
> -{
> -  operands[2] = force_reg (SImode, operands[2]);
> -}
> -
> -  if (REG_P (operands[2]))
> -{
> -  rtx tmp = gen_reg_rtx (SImode);
> -  rtx tmp1 = gen_reg_rtx (mode);
> -  emit_insn (g

Re: [PATCH] S/390: Fix layout of struct sigaction_t

2020-04-07 Thread Iain Buclaw via Gcc-patches

On 01/04/2020 18:20, Stefan Liebler wrote:
> On 4/1/20 12:54 PM, Iain Buclaw wrote:
>> On 01/04/2020 08:28, Stefan Liebler wrote:
>>> ping
>>>
>>
>> Thanks, I'll send the patch upstream, as it's the same there.
>>
>> Looks OK to me.
>>
>> Regards
>> Iain.
>>
> 
> Thanks for committing the patch upstream
> 

Hi Stefan,

They have been merged in: https://github.com/dlang/druntime/pull/3020

Do you want to commit this yourself?

Iain.

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Jonathan Wakely via Gcc-patches

On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell  wrote:
>
> On 4/6/20 4:34 AM, Martin Liška wrote:
>
> >
> > May I please ping Jason, Nathan and Jonathan who can help us here?
>
> On IRC Martin clarified the question as essentially 'how do you pair up
> operator new and operator delete calls?' (so you may delete them if the
> object is not used).
>
> I am not sure you're permitted to remove those calls in general.  All I
> can find is [expr.new]/12
> 'An implementation is allowed to omit a call to a replaceable global
> allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage
> is instead provided by the implementation or provided by extending the
> allocation of another new-expression.'

At https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94295#c6 Richard Smith
summarised the rules as "new-expressions, like std::allocator, may
obtain storage by calling 'operator new', but it's unspecified how
often it's called and with what arguments."

So if our optimisation is removing the calls to base::operator new and
base::operator delete, but not the B::operator new call, then it seems
to be working at the wrong level. It should be eliding any calls to
operator new and operator delete at the point of the new-expression
and delete-expression, not leaving one call to operator new present
and then removing another one (which leaves the call "partially
removed").

RE: [PATCH][GCC][Arm]: MVE: Fix polymorphism for scalars and constants

2020-04-07 Thread Kyrylo Tkachov

> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 07 April 2020 09:57
> To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH][GCC][Arm]: MVE: Fix polymorphism for scalars and 
> constants
> 
> Hi,
> 
> I rebased this patch and made some extra fixes.
> 
> This patch merges some polymorphic functions that were uncorrectly 
> separating scalar variants. It also simplifies the way we detect 
> scalars and constants in mve_typeid.
> 
> I also fixed some polymorphic intrinsics that were splitting of scalar cases.
> 
> Regression tested for arm-none-eabi.
> 
> Is this OK for trunk?

Ok.
Thanks,
Kyrill

> 
> 2020-04-07  Andre Vieira  
> 
>      * config/arm/arm_mve.h (vsubq_n): Merge with...
>      (vsubq): ... this.
>      (vmulq_n): Merge with...
>      (vmulq): ... this.
>      (__ARM_mve_typeid): Simplify scalar and constant detection.
> 
> 2020-04-07  Andre Vieira  
> 
>      * gcc.target/arm/mve/intrinsics/vmulq_n_f16.c: Fix test.
>      * gcc.target/arm/mve/intrinsics/vmulq_n_f32.c: Likewise.
>      * gcc.target/arm/mve/intrinsics/vmulq_n_s16.c: Likewise.
>      * gcc.target/arm/mve/intrinsics/vmulq_n_s32.c: Likewise.
>      * gcc.target/arm/mve/intrinsics/vmulq_n_s8.c: Likewise.
>      * gcc.target/arm/mve/intrinsics/vmulq_n_u16.c: Likewise.
>      * gcc.target/arm/mve/intrinsics/vmulq_n_u32.c: Likewise.
>      * gcc.target/arm/mve/intrinsics/vmulq_n_u8.c: Likewise.
> 
> On 02/04/2020 10:58, Kyrylo Tkachov wrote:
> >
> >> -Original Message-
> >> From: Andre Vieira (lists) 
> >> Sent: 02 April 2020 09:22
> >> To: gcc-patches@gcc.gnu.org
> >> Cc: Kyrylo Tkachov 
> >> Subject: [PATCH][GCC][Arm]: MVE: Fix polymorphism for scalars and 
> >> constants
> >>
> >> Hi,
> >>
> >> This patch merges some polymorphic functions that were incorrectly 
> >> separating scalar variants. It also simplifies the way we detect 
> >> scalars and constants in mve_typeid.
> >>
> >> Regression tested for arm-none-eabi.
> >>
> >> Is this OK for trunk?
> > Ok.
> > Thanks,
> > Kyrill
> >
> >> 2020-04-02  Andre Vieira  
> >>
> >>       * config/arm/arm_mve.h (vsubq_n): Merge with...
> >>       (vsubq): ... this.
> >>       (vmulq_n): Merge with...
> >>       (vmulq): ... this.
> >>       (__ARM_mve_typeid): Simplify scalar and constant detection.

Re: [PATCH][GCC][Arm]: MVE: Fix polymorphism for scalars and constants

2020-04-07 Thread Andre Vieira (lists)


Now with the zipped patch so it reaches the mailing list.

Sorry for that.

On 07/04/2020 09:57, Kyrylo Tkachov wrote:

-Original Message-
From: Andre Vieira (lists) 
Sent: 07 April 2020 09:57
To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH][GCC][Arm]: MVE: Fix polymorphism for scalars and
constants

Hi,

I rebased this patch and made some extra fixes.

This patch merges some polymorphic functions that were uncorrectly
separating scalar variants. It also simplifies the way we detect
scalars and constants in mve_typeid.

I also fixed some polymorphic intrinsics that were splitting of scalar cases.

Regression tested for arm-none-eabi.

Is this OK for trunk?

Ok.
Thanks,
Kyrill


2020-04-07  Andre Vieira  

      * config/arm/arm_mve.h (vsubq_n): Merge with...
      (vsubq): ... this.
      (vmulq_n): Merge with...
      (vmulq): ... this.
      (__ARM_mve_typeid): Simplify scalar and constant detection.

2020-04-07  Andre Vieira  

      * gcc.target/arm/mve/intrinsics/vmulq_n_f16.c: Fix test.
      * gcc.target/arm/mve/intrinsics/vmulq_n_f32.c: Likewise.
      * gcc.target/arm/mve/intrinsics/vmulq_n_s16.c: Likewise.
      * gcc.target/arm/mve/intrinsics/vmulq_n_s32.c: Likewise.
      * gcc.target/arm/mve/intrinsics/vmulq_n_s8.c: Likewise.
      * gcc.target/arm/mve/intrinsics/vmulq_n_u16.c: Likewise.
      * gcc.target/arm/mve/intrinsics/vmulq_n_u32.c: Likewise.
      * gcc.target/arm/mve/intrinsics/vmulq_n_u8.c: Likewise.

On 02/04/2020 10:58, Kyrylo Tkachov wrote:

-Original Message-
From: Andre Vieira (lists) 
Sent: 02 April 2020 09:22
To: gcc-patches@gcc.gnu.org
Cc: Kyrylo Tkachov 
Subject: [PATCH][GCC][Arm]: MVE: Fix polymorphism for scalars and
constants

Hi,

This patch merges some polymorphic functions that were incorrectly
separating scalar variants. It also simplifies the way we detect
scalars and constants in mve_typeid.

Regression tested for arm-none-eabi.

Is this OK for trunk?

Ok.
Thanks,
Kyrill


2020-04-02  Andre Vieira  

       * config/arm/arm_mve.h (vsubq_n): Merge with...
       (vsubq): ... this.
       (vmulq_n): Merge with...
       (vmulq): ... this.
       (__ARM_mve_typeid): Simplify scalar and constant detection.
<>

RE: [PATCH][GCC][Arm]: MVE: Do not use typeof for pointer parameters

2020-04-07 Thread Kyrylo Tkachov



> -Original Message-
> From: Gcc-patches  On Behalf Of Andre 
> Vieira (lists)
> Sent: 07 April 2020 10:02
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrill Tkachov 
> Subject: [PATCH][GCC][Arm]: MVE: Do not use typeof for pointer 
> parameters
> 
> Hi,
> 
> To make sure our inlining of _Generic doesn't go crazy we added an in 
> between declaration of the parameters used for _Generic selection.
> However, this will not work if the parameter being passed in is an array.
> Since none of our intrinsics return pointers we do not need to use 
> typeof here as we will never be able to nest intrinsics through this 
> parameter. I also removed the unnecessary const pointers in mve_typeid.
> 
> Regression tested on arm-none-eabi.
> 
> Is this OK for trunk?

Ok.
Thanks,
Kyrill

> 
> gcc/ChangeLog:
> 2020-04-07  Andre Vieira  
> 
>      * config/arm/arm_mve.h: Remove use of typeof for addr pointer 
> parameters and
>      remove const_ptr enums.

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Richard Biener via Gcc-patches

On Tue, Apr 7, 2020 at 10:26 AM Jonathan Wakely  wrote:
>
> On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell  wrote:
> >
> > On 4/6/20 4:34 AM, Martin Liška wrote:
> >
> > >
> > > May I please ping Jason, Nathan and Jonathan who can help us here?
> >
> > On IRC Martin clarified the question as essentially 'how do you pair up
> > operator new and operator delete calls?' (so you may delete them if the
> > object is not used).
> >
> > I am not sure you're permitted to remove those calls in general.  All I
> > can find is [expr.new]/12
> > 'An implementation is allowed to omit a call to a replaceable global
> > allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage
> > is instead provided by the implementation or provided by extending the
> > allocation of another new-expression.'
>
> At https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94295#c6 Richard Smith
> summarised the rules as "new-expressions, like std::allocator, may
> obtain storage by calling 'operator new', but it's unspecified how
> often it's called and with what arguments."
>
> So if our optimisation is removing the calls to base::operator new and
> base::operator delete, but not the B::operator new call, then it seems
> to be working at the wrong level. It should be eliding any calls to
> operator new and operator delete at the point of the new-expression
> and delete-expression, not leaving one call to operator new present
> and then removing another one (which leaves the call "partially
> removed").

Well, the question is how to identify "operator new and operator delete at the
point of the new-expression and delete-expression".  Currently we're
just matching up "is this a new/delete operator" and the dataflow of the
pointer.  In the PR it was suggested the actual called methods should have
the same DECL_CONTEXT.  Honza suggested the context should have the
"same" ODR type (or be global).

You make it sound it's much harder and the parser needs to build the
relation?  You also suggest the "validness" is only OK in the context
of std::allocator and based on the unspecifiedness of the number of
allocations from the standard library.  That would further suggest that
we need to mark the allocation/deallocation points somehow and _not_
base the optimization on the called new/delete "function" (maybe with
an exception of the global ::new and ::delete).

Richard.

Re: [PATCH] sra: Fix sra_modify_expr handling of partial writes (PR 94482)

2020-04-07 Thread Richard Biener

On Tue, 7 Apr 2020, Martin Jambor wrote:

> Hi,
> 
> when sra_modify_expr is invoked on an expression that modifies only
> part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
> of an assignment and the SRA replacement's type is not compatible with
> what is being replaced (0th operand of the B_F_R in the above
> example), it does not work properly, basically throwing away the partd
> of the expr that should have stayed intact.
> 
> This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
> binary image of the replacement (and so in a way serve as a
> VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
> REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR under
> the complex partial access expression, which is OK even in a LHS of an
> assignment (and other write contexts) because in those cases the
> replacement is not going to be a giple register anyway.
> 
> The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
> fragile because SRA prefers complex and vector types over anything else
> (and in between the two it decides based on TYPE_UID which in my testing
> today always preferred complex types) and even when GIMPLE is wrong I
> often still did not get failing runs, so I only run it at -O1 (which is
> the only level where the the test fails for me).
> 
> Bootstrapped and tested on x86_64-linux, bootstrap and testing on
> i686-linux and aarch64-linux underway.
> 
> OK for trunk (and subsequently for release branches) if it passes?

OK.

generate_subtree_copies contains similar code and the call in
sra_modify_expr suggests that an (outer?) bit-field-ref is possible
here, so is generate_subtree_copies affected as well?  I'm not sure
how subtree copy generation "works" when we already replaced sth
but maybe access->grp_to_be_replaced is never true when
access->first_child is not NULL?  But then we still pass down
the "stripped" orig_expr (op0 of BIT_FIELD_REF/REALPART_EXPR).

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 2020-04-06  Martin Jambor  
> 
>   PR tree-optimization/94482
>   * tree-sra.c (create_access_replacement): Dump new replacement with
>   TDF_UID.
>   (sra_modify_expr): Fix handling of cases when the original EXPR writes
>   to only part of the replacement.
> 
>   testsuite/
>   * gcc.dg/torture/pr94482.c: New test.
>   * gcc.dg/tree-ssa/pr94482-2.c: Likewise.
> ---
>  gcc/ChangeLog |  8 
>  gcc/testsuite/ChangeLog   |  6 +++
>  gcc/testsuite/gcc.dg/torture/pr94482.c| 34 +++
>  gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c | 50 +++
>  gcc/tree-sra.c| 17 ++--
>  5 files changed, 111 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr94482.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index e10fb251c16..36ddef64afd 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,11 @@
> +2020-04-06  Martin Jambor  
> +
> + PR tree-optimization/94482
> + * tree-sra.c (create_access_replacement): Dump new replacement with
> + TDF_UID.
> + (sra_modify_expr): Fix handling of cases when the original EXPR writes
> + to only part of the replacement.
> +
>  2020-04-05 Zachary Spytz  
>  
>   * extend.texi: Add free to list of ISO C90 functions that
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 88bab5d3d19..8b85e55afe8 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,9 @@
> +2020-04-06  Martin Jambor  
> +
> + PR tree-optimization/94482
> + * gcc.dg/torture/pr94482.c: New test.
> + * gcc.dg/tree-ssa/pr94482-2.c: Likewise.
> +
>  2020-04-05  Iain Sandoe  
>  
>   * g++.dg/coroutines/torture/co-await-14-template-traits.C: Rename...
> diff --git a/gcc/testsuite/gcc.dg/torture/pr94482.c 
> b/gcc/testsuite/gcc.dg/torture/pr94482.c
> new file mode 100644
> index 000..5bb19e745c2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr94482.c
> @@ -0,0 +1,34 @@
> +/* { dg-do run } */
> +
> +typedef unsigned V __attribute__ ((__vector_size__ (16)));
> +union U
> +{
> +  V j;
> +  unsigned long long i __attribute__ ((__vector_size__ (16)));
> +};
> +
> +static inline __attribute__((always_inline)) V
> +foo (unsigned long long a)
> +{
> +  union U z = { .j = (V) {} };
> +  for (unsigned long i = 0; i < 1; i++)
> +z.i[i] = a;
> +  return z.j;
> +}
> +
> +static inline __attribute__((always_inline)) V
> +bar (V a, unsigned long long i, int q)
> +{
> +  union U z = { .j = a };
> +  z.i[q] = i;
> +  return z.j;
> +}
> +
> +int
> +main ()
> +{
> +  union U z = { .j = bar (foo (1729), 2, 1) };
> +  if (z.i[0] != 1729)
> +__builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
> new file mode 100644
> index

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Jan Hubicka

> 
> Well, the question is how to identify "operator new and operator delete at the
> point of the new-expression and delete-expression".  Currently we're
> just matching up "is this a new/delete operator" and the dataflow of the
> pointer.  In the PR it was suggested the actual called methods should have
> the same DECL_CONTEXT.  Honza suggested the context should have the
> "same" ODR type (or be global).
I was just arguing that comparing pointers to types does not make much
sence in LTO where types can get unshared :)
Since the classes have ODR name at least this problem can be solved by
using ODR name compare.

However the testcase I sent abuses the fact that if you inherit the
class you can rewrite only new operator.  In the inerited class
DECL_CONTEXT of delete will still point to the oriignal class.  This
means that you can mix new/delete pair from the original and inherited
class.

Honza
> 
> You make it sound it's much harder and the parser needs to build the
> relation?  You also suggest the "validness" is only OK in the context
> of std::allocator and based on the unspecifiedness of the number of
> allocations from the standard library.  That would further suggest that
> we need to mark the allocation/deallocation points somehow and _not_
> base the optimization on the called new/delete "function" (maybe with
> an exception of the global ::new and ::delete).
> 
> Richard.

Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons

2020-04-07 Thread Richard Earnshaw (lists)

On 06/04/2020 12:19, Richard Sandiford wrote:
> "Richard Earnshaw (lists)"  writes:
>> On 03/04/2020 16:03, Richard Sandiford wrote:
>>> "Richard Earnshaw (lists)"  writes:
 On 03/04/2020 13:27, Richard Sandiford wrote:
> "Richard Earnshaw (lists)"  writes:
>> On 02/04/2020 19:53, Richard Henderson via Gcc-patches wrote:
>>> This is attacking case 3 of PR 94174.
>>>
>>> In v2, I unify the various subtract-with-borrow and add-with-carry
>>> patterns that also output flags with unspecs.  As suggested by
>>> Richard Sandiford during review of v1.  It does seem cleaner.
>>>
>>
>> Really?  I didn't need to use any unspecs for the Arm version of this.
>>
>> R.
>
> See https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543063.html
> (including quoted context) for how we got here.
>
> The same problem affects the existing aarch64 patterns like
> *usub3_carryinC.  Although that pattern avoids unspecs,
> the compare:CC doesn't seem to be correct.
>
> Richard
>

 But I don't think you can use ANY_EXTEND in these comparisons.  It
 doesn't describe what the instruction does, since the instruction does
 not really extend the values first.
>>>
>>> Yeah, that was the starting point in the thread above too.  And using
>>> zero_extend in the existing *usub3_carryinC pattern:
>>>
>>> (define_insn "*usub3_carryinC"
>>>   [(set (reg:CC CC_REGNUM)
>>> (compare:CC
>>>   (zero_extend:
>>> (match_operand:GPI 1 "register_operand" "r"))
>>>   (plus:
>>> (zero_extend:
>>>   (match_operand:GPI 2 "register_operand" "r"))
>>> (match_operand: 3 "aarch64_borrow_operation" ""
>>>(set (match_operand:GPI 0 "register_operand" "=r")
>>> (minus:GPI
>>>   (minus:GPI (match_dup 1) (match_dup 2))
>>>   (match_operand:GPI 4 "aarch64_borrow_operation" "")))]
>>>""
>>>"sbcs\\t%0, %1, %2"
>>>   [(set_attr "type" "adc_reg")]
>>> )
>>>
>>> looks wrong for the same reason.  But the main problem IMO isn't how the
>>> inputs to the compare:CC are represented, but that we're using compare:CC
>>> at all.  Using compare doesn't accurately model the effect of SBCS on NZCV
>>> for all inputs, so if we're going to use a compare here, it can't be :CC.
>>>
 I would really expect this patch series to be pretty much a dual of this
 series that I posted last year for Arm.

 https://gcc.gnu.org/pipermail/gcc-patches/2019-October/532180.html
>>>
>>> That series uses compares with modes like CC_V and CC_B, so I think
>>> you're saying that given the choice in the earlier thread between adding
>>> a new CC mode or using unspecs, you would have preferred a new CC mode,
>>> is that right?
>>>
>>
>> Yes.  It surprised me, when doing the aarch32 version, just how often
>> the mid-end parts of the compiler were able to reason about parts of the
>> parallel insn and optimize things accordingly (eg propagating the truth
>> of the comparison).  If you use an unspec that can never happen.
> 
> That could be changed though.  E.g. we could add something like a
> simplify_unspec target hook if this becomes a problem (either here
> or for other unspecs).  A fancy implementation could even use
> match.pd-style rules in the .md file.

I really don't like that.  It sounds like the top of a long slippery
slope.  What about all the other cases where the RTL is comprehended by
the mid-end?

> 
> The reason I'm not keen on using special modes for this case is that
> they'd describe one way in which the result can be used rather than
> describing what the instruction actually does.  The instruction really
> does set all four flags to useful values.  The "problem" is that they're
> not the values associated with a compare between two values, so representing
> them that way will always lose information.
> 

Yes, it's true that the rtl -> machine instruction transform is not 100%
reversible.  That's always been the case, but it's the price we pay for
a generic IL that describes instructions on multiple architectures.

R.

> Thanks,
> Richard
>

Re: libgcc patch committed: Avoid hooks in split-stack code

2020-04-07 Thread Richard Biener via Gcc-patches

On Fri, Apr 3, 2020 at 11:59 PM Ian Lance Taylor via Gcc-patches
 wrote:
>
> The split-stack code invokes mmap and munmap with a limited amount of
> stack space.  That is fine when the functions just make a system call,
> but it's not fine when programs use LD_PRELOAD or linker tricks to add
> hooks to mmap/munmap.  Those hooks may use too much stack space,
> leading to an obscure program crash.
>
> Avoid that at least on GNU/Linux by calling __mmap/__munmap instead.
>
> Bootstrapped and ran Go and split-stack tests on x86_64-pc-linux-gnu.
> Committed to mainline.

This causes references to GLIBC_PRIVATE exported symbols which is
highly undesirable for system integrators (no ABI stability is guaranteed
for those).  I opened the regression PR94513 for this.

Richard.

> Ian
>
> 2020-04-03  Ian Lance Taylor  
>
> * generic-morestack.c: On GNU/Linux use __mmap/__munmap rather
> than mmap/munmap, to avoid hooks.

[Arm] Implement CDE predicated intrinsics for MVE registers

2020-04-07 Thread Matthew Malcomson

These intrinsics are the predicated version of the intrinsics inroduced
in https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542725.html.

These are not yet public on developer.arm.com but we have reached
internal consensus on them.

The approach follows the same method as for the CDE intrinsics for MVE
registers, most notably using the same arm_resolve_overloaded_builtin
function with minor modifications.

The resolver hook has been moved from arm-builtins.c to arm-c.c so it
can access the c-common function build_function_call_vec.  This function
is needed to perform the same checks on arguments as a normal C or C++
function would perform.
It is fine to put this resolver in arm-c.c since it's only use is for
the ACLE functions, and these are only available in C/C++.
So that the resolver function has access to information it needs from
the builtins, we put two query functions into arm-builtins.c and use
them from arm-c.c.

We rely on the order that the builtins are defined in
gcc/config/arm/arm_cde_builtins.def, knowing that the predicated
versions come after the non-predicated versions.

The machine description patterns for these builtins are simpler than
those for the non-predicated versions, since the accumulator versions
*and* non-accumulator versions both need an input vector now.
The input vector is needed for the non-accumulator version to describe
the original values for those lanes that are not updated during the
merge operation.

We additionally need to introduce qualifiers for these new builtins,
which follow the same pattern as the non-predicated versions but with an
extra argument to describe the predicate.

Error message changes:
- We directly mention the builtin argument when complaining that an
  argument is not in the correct range.
  This more closely matches the C error messages.
- We ensure the resolver complains about *all* invalid arguments to a
  function instead of just the first one.
- The resolver error messages index arguments from 1 instead of 0 to
  match the arguments coming from the C/C++ frontend.

In order to allow the user to give an argument for the merging predicate
when they don't care what data is stored in the 'false' lanes, we also
move the __arm_vuninitializedq* intrinsics from arm_mve.h to
arm_mve_types.h which is shared with arm_cde.h.

We only move the fully type-specified `__arm_vuninitializedq*`
intrinsics and not the polymorphic versions, since moving the
polymorphic versions requires moving the _Generic framework as well as
just the intrinsics we're interested in.  This matches the approach taken
for the `__arm_vreinterpret*` functions in this include file.

This patch also contains a slight change in spacing of an existing
assembly instruction to be emitted.
This is just to help writing tests -- vmsr usually has a tab and a space
between the mnemonic and the first argument, but in one case it just has
a tab -- making all the same helps make test regexps simpler.

Testing Done:
Bootstrap and full regtest on arm-none-linux-gnueabihf
Full regtest on arm-none-eabi

All testing done with a local fix for the bugzilla PR below.
That bugzilla currently causes multiple ICE's on the tests added in
this patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94341

gcc/ChangeLog:

2020-04-07  Matthew Malcomson  

* config/arm/arm-builtins.c (CX_UNARY_UNONE_QUALIFIERS): New.
(CX_BINARY_UNONE_QUALIFIERS): New.
(CX_TERNARY_UNONE_QUALIFIERS): New.
(arm_resolve_overloaded_builtin): Move to arm-c.c.
(arm_expand_builtin_args): Update error message.
(enum resolver_ident): New.
(arm_describe_resolver): New.
(arm_cde_end_args): New.
* config/arm/arm-builtins.h: New file.
* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New.
(arm_resolve_cde_builtin): Moved from arm-builtins.c.
* config/arm/arm_cde.h (__arm_vcx1q_m, __arm_vcx1qa_m,
__arm_vcx2q_m, __arm_vcx2qa_m, __arm_vcx3q_m, __arm_vcx3qa_m):
New.
* config/arm/arm_cde_builtins.def (vcx1q_p_, vcx1qa_p_,
vcx2q_p_, vcx2qa_p_, vcx3q_p_, vcx3qa_p_): New builtin defs.
* config/arm/iterators.md (CDE_VCX): New int iterator.
(a) New int attribute.
* config/arm/mve.md (arm_vcx1q_p_v16qi, arm_vcx2q_p_v16qi,
arm_vcx3q_p_v16qi): New patterns.
* config/arm/vfp.md (thumb2_movhi_fp16): Extra space in assembly.

gcc/testsuite/ChangeLog:

2020-04-07  Matthew Malcomson  

* gcc.target/arm/acle/cde-errors.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-1.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-2.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-3.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-full-assembly.c: Add predicated
forms.
* gcc.target/arm/acle/cde-mve-tests.c: Add predicated forms.
* gcc.target/arm/acle/cde_v_1_err.c (test_imm_rang

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Richard Biener via Gcc-patches

On Tue, Apr 7, 2020 at 11:49 AM Jan Hubicka  wrote:
>
> >
> > Well, the question is how to identify "operator new and operator delete at 
> > the
> > point of the new-expression and delete-expression".  Currently we're
> > just matching up "is this a new/delete operator" and the dataflow of the
> > pointer.  In the PR it was suggested the actual called methods should have
> > the same DECL_CONTEXT.  Honza suggested the context should have the
> > "same" ODR type (or be global).
> I was just arguing that comparing pointers to types does not make much
> sence in LTO where types can get unshared :)
> Since the classes have ODR name at least this problem can be solved by
> using ODR name compare.

Sure.

> However the testcase I sent abuses the fact that if you inherit the
> class you can rewrite only new operator.  In the inerited class
> DECL_CONTEXT of delete will still point to the oriignal class.  This
> means that you can mix new/delete pair from the original and inherited
> class.

Yeah, but we're searching for a correctness fix not for an optimality one ;)

It sounds matching up the pairs in the middle-end is impossible and thus
the FE has to do this match-up (or refrain from marking new/delete when
a matchup according to to be defined methods is not going to be enough).
And maybe that tracking has to be done on a per call level rather than
on a called function level.

Richard.

> Honza
> >
> > You make it sound it's much harder and the parser needs to build the
> > relation?  You also suggest the "validness" is only OK in the context
> > of std::allocator and based on the unspecifiedness of the number of
> > allocations from the standard library.  That would further suggest that
> > we need to mark the allocation/deallocation points somehow and _not_
> > base the optimization on the called new/delete "function" (maybe with
> > an exception of the global ::new and ::delete).
> >
> > Richard.

Re: [testsuite][arm] Fix cmse-15.c expected output

2020-04-07 Thread Andre Vieira (lists)


On 06/04/2020 16:12, Christophe Lyon via Gcc-patches wrote:

Hi,

While checking Martin's fix for PR ipa/94445, he made me realize that
the cmse-15.c testcase still fails at -Os because ICF means that we
generate
nonsecure2:
 b   nonsecure0

which is OK, but does not match the currently expected
nonsecure2:
...
 bl  __gnu_cmse_nonsecure_call

(see https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543190.html)

The test has already different expectations for v8-M and v8.1-M.

I've decided to try to use check-function-bodies to account for the
different possibilities:
- v8-M vs v8.1-M via two different prefixes
- code generation variants (-0?) via multiple regexps

I've tested that the test now passes with --target-board=-march=armv8-m.main
and --target-board=-march=armv8.1-m.main.

I feel this a bit too much of a burden for the purpose, maybe there's
a better way of handling all these alternatives (in particular,
there's a lot of duplication since the expected code for the secure*
functions is the same for v8-M and v8.1-M).

OK?

Thanks,

Christophe

Hi Christophe,

This check-function-bodies functionality is pretty sweet, I assume the ( 
A | B ) checks for either of them?
If so that looks like a good improvement. Ideally we'd also check the 
clearing for the v8.1-M cases, but that wasn't there before either and 
they would need again splitting for -mfloat-abi=soft+softfp and 
-mfloat-abi=hard.



So yeah this LGTM but you need approval from a port/global maintainer.

Cheers,
Andre

Re: libgcc patch committed: Avoid hooks in split-stack code

2020-04-07 Thread Jakub Jelinek via Gcc-patches

On Tue, Apr 07, 2020 at 12:19:45PM +0200, Richard Biener via Gcc-patches wrote:
> On Fri, Apr 3, 2020 at 11:59 PM Ian Lance Taylor via Gcc-patches
>  wrote:
> >
> > The split-stack code invokes mmap and munmap with a limited amount of
> > stack space.  That is fine when the functions just make a system call,
> > but it's not fine when programs use LD_PRELOAD or linker tricks to add
> > hooks to mmap/munmap.  Those hooks may use too much stack space,
> > leading to an obscure program crash.
> >
> > Avoid that at least on GNU/Linux by calling __mmap/__munmap instead.
> >
> > Bootstrapped and ran Go and split-stack tests on x86_64-pc-linux-gnu.
> > Committed to mainline.
> 
> This causes references to GLIBC_PRIVATE exported symbols which is
> highly undesirable for system integrators (no ABI stability is guaranteed
> for those).  I opened the regression PR94513 for this.

Yeah, GLIBC_PRIVATE symbols may only be used by glibc itself, not by
anything else.
I'd suggest to use syscall function instead (though, that can be interposed
too).

Jakub

[PATCH][GCC][Arm]: MVE: Fix constant load pattern

2020-04-07 Thread Andre Vieira (lists)


Hi,

This patch fixes the constant load pattern for MVE, this was not 
accounting correctly for label + offset cases.


Added test that ICE'd before and removed the scan assemblers for the 
mve_vector* tests as they were too fragile.


Bootstrapped on arm-linux-gnueabihf and regression tested on arm-none-eabi.

Is this OK for trunk?

gcc/ChangeLog:
2020-04-07  Andre Vieira  

    * config/arm/arm.c (output_move_neon): Deal with label + offset 
cases.

    * config/arm/mve.md (*mve_mov): Handle const vectors.

gcc/testsuite/ChangeLog:
2020-04-07  Andre Vieira  

    * gcc.target/arm/mve/intrinsics/mve_load_from_array.c: New test.
    * gcc.target/arm/mve/intrinsics/mve_vector_float.c: Remove 
scan-assembler.

    * gcc.target/arm/mve/intrinsics/mve_vector_float1.c: Likewise.
    * gcc.target/arm/mve/intrinsics/mve_vector_int1.c: Likewise.
    * gcc.target/arm/mve/intrinsics/mve_vector_int2.c: Likewise.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
d5207e0d8f07f9be5265fc6d175c148c6cdd53cb..1af9d5cf145f6d01e364a1afd7ceb3df5da86c9a
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -20122,52 +20122,43 @@ output_move_neon (rtx *operands)
  break;
}
   /* Fall through.  */
-case LABEL_REF:
 case PLUS:
+  addr = XEXP (addr, 0);
+  /* Fall through.  */
+case LABEL_REF:
   {
int i;
int overlap = -1;
-   if (TARGET_HAVE_MVE && !BYTES_BIG_ENDIAN
-   && GET_CODE (addr) != LABEL_REF)
+   for (i = 0; i < nregs; i++)
  {
-   sprintf (buff, "v%srw.32\t%%q0, %%1", load ? "ld" : "st");
-   ops[0] = reg;
-   ops[1] = mem;
-   output_asm_insn (buff, ops);
- }
-   else
- {
-   for (i = 0; i < nregs; i++)
+   /* We're only using DImode here because it's a convenient
+  size.  */
+   ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * i);
+   ops[1] = adjust_address (mem, DImode, 8 * i);
+   if (reg_overlap_mentioned_p (ops[0], mem))
  {
-   /* We're only using DImode here because it's a convenient
-  size.  */
-   ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * i);
-   ops[1] = adjust_address (mem, DImode, 8 * i);
-   if (reg_overlap_mentioned_p (ops[0], mem))
- {
-   gcc_assert (overlap == -1);
-   overlap = i;
- }
-   else
- {
-   if (TARGET_HAVE_MVE && GET_CODE (addr) == LABEL_REF)
- sprintf (buff, "v%sr.64\t%%P0, %%1", load ? "ld" : "st");
-   else
- sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
-   output_asm_insn (buff, ops);
- }
+   gcc_assert (overlap == -1);
+   overlap = i;
  }
-   if (overlap != -1)
+   else
  {
-   ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * overlap);
-   ops[1] = adjust_address (mem, SImode, 8 * overlap);
if (TARGET_HAVE_MVE && GET_CODE (addr) == LABEL_REF)
- sprintf (buff, "v%sr.32\t%%P0, %%1", load ? "ld" : "st");
+ sprintf (buff, "v%sr.64\t%%P0, %%1", load ? "ld" : "st");
else
  sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
output_asm_insn (buff, ops);
  }
  }
+   if (overlap != -1)
+ {
+   ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * overlap);
+   ops[1] = adjust_address (mem, SImode, 8 * overlap);
+   if (TARGET_HAVE_MVE && GET_CODE (addr) == LABEL_REF)
+ sprintf (buff, "v%sr.32\t%%P0, %%1", load ? "ld" : "st");
+   else
+ sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
+   output_asm_insn (buff, ops);
+ }
 
 return "";
   }
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 
d1028f4542b4972b4080e46544c86d625d77383a..10abc3fae3709891346b63213afb1fe3754af41a
 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -695,9 +695,9 @@ (define_insn "*mve_mov"
 case 2:
   return "vmov\t%Q0, %R0, %e1  @ \;vmov\t%J0, %K0, %f1";
 case 4:
-  if ((TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (mode))
- || (MEM_P (operands[1])
- && GET_CODE (XEXP (operands[1], 0)) == LABEL_REF))
+  if (MEM_P (operands[1])
+ && (GET_CODE (XEXP (operands[1], 0)) == LABEL_REF
+ || GET_CODE (XEXP (operands[1], 0)) == CONST))
return output_move_neon (operands);
   else
return "vldrb.8 %q0, %E1";
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_load_from_array.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_load_from_array.c
new file mode 100644
index 
00

[PATCH][GCC][Arm]: MVE: Fix v[id]wdup's

2020-04-07 Thread Andre Vieira (lists)


Hi,

This patch fixes v[id]wdup intrinsics. They had two issues:
1) the predicated versions did not link the incoming inactive vector 
parameter to the output

2) The backend didn't enforce the wrap limit operand be in an odd register.

1) was fixed like we did for all other predicated intrinsics
2) requires a temporary hack where we pass the value in the top end of 
DImode operand. The proper fix would be to add a register CLASS but this 
interacted badly with other existing targets codegen.  We will look to 
fix this properly in GCC 11.


Regression tested on arm-none-eabi.

Is this OK for trunk?

gcc/ChangeLog:
2020-04-07  Andre Vieira  

    * config/arm/arm_mve.h: Fix v[id]wdup intrinsics.
    * config/arm/mve/md: Fix v[id]wdup patterns.

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
e31c2e8fdc4f500bf9408d05ad86e151397627f7..47eead71d9515b4103a5b66999a3f9357dc3c3be
 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -13585,29 +13585,33 @@ __extension__ extern __inline uint8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vdwdupq_m_n_u8 (uint8x16_t __inactive, uint32_t __a, uint32_t __b, const 
int __imm, mve_pred16_t __p)
 {
-  return __builtin_mve_vdwdupq_m_n_uv16qi (__inactive, __a, __b, __imm, __p);
+  uint64_t __c = ((uint64_t) __b) << 32;
+  return __builtin_mve_vdwdupq_m_n_uv16qi (__inactive, __a, __c, __imm, __p);
 }
 
 __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vdwdupq_m_n_u32 (uint32x4_t __inactive, uint32_t __a, uint32_t __b, 
const int __imm, mve_pred16_t __p)
 {
-  return __builtin_mve_vdwdupq_m_n_uv4si (__inactive, __a, __b, __imm, __p);
+  uint64_t __c = ((uint64_t) __b) << 32;
+  return __builtin_mve_vdwdupq_m_n_uv4si (__inactive, __a, __c, __imm, __p);
 }
 
 __extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vdwdupq_m_n_u16 (uint16x8_t __inactive, uint32_t __a, uint32_t __b, 
const int __imm, mve_pred16_t __p)
 {
-  return __builtin_mve_vdwdupq_m_n_uv8hi (__inactive, __a, __b, __imm, __p);
+  uint64_t __c = ((uint64_t) __b) << 32;
+  return __builtin_mve_vdwdupq_m_n_uv8hi (__inactive, __a, __c, __imm, __p);
 }
 
 __extension__ extern __inline uint8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vdwdupq_m_wb_u8 (uint8x16_t __inactive, uint32_t * __a, uint32_t __b, 
const int __imm, mve_pred16_t __p)
 {
-  uint8x16_t __res =  __builtin_mve_vdwdupq_m_n_uv16qi (__inactive, *__a, __b, 
__imm, __p);
-  *__a = __builtin_mve_vdwdupq_m_wb_uv16qi (__inactive, *__a, __b, __imm, __p);
+  uint64_t __c = ((uint64_t) __b) << 32;
+  uint8x16_t __res =  __builtin_mve_vdwdupq_m_n_uv16qi (__inactive, *__a, __c, 
__imm, __p);
+  *__a = __builtin_mve_vdwdupq_m_wb_uv16qi (__inactive, *__a, __c, __imm, __p);
   return __res;
 }
 
@@ -13615,8 +13619,9 @@ __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vdwdupq_m_wb_u32 (uint32x4_t __inactive, uint32_t * __a, uint32_t __b, 
const int __imm, mve_pred16_t __p)
 {
-  uint32x4_t __res =  __builtin_mve_vdwdupq_m_n_uv4si (__inactive, *__a, __b, 
__imm, __p);
-  *__a = __builtin_mve_vdwdupq_m_wb_uv4si (__inactive, *__a, __b, __imm, __p);
+  uint64_t __c = ((uint64_t) __b) << 32;
+  uint32x4_t __res =  __builtin_mve_vdwdupq_m_n_uv4si (__inactive, *__a, __c, 
__imm, __p);
+  *__a = __builtin_mve_vdwdupq_m_wb_uv4si (__inactive, *__a, __c, __imm, __p);
   return __res;
 }
 
@@ -13624,8 +13629,9 @@ __extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vdwdupq_m_wb_u16 (uint16x8_t __inactive, uint32_t * __a, uint32_t __b, 
const int __imm, mve_pred16_t __p)
 {
-  uint16x8_t __res =  __builtin_mve_vdwdupq_m_n_uv8hi (__inactive, *__a, __b, 
__imm, __p);
-  *__a = __builtin_mve_vdwdupq_m_wb_uv8hi (__inactive, *__a, __b, __imm, __p);
+  uint64_t __c = ((uint64_t) __b) << 32;
+  uint16x8_t __res =  __builtin_mve_vdwdupq_m_n_uv8hi (__inactive, *__a, __c, 
__imm, __p);
+  *__a = __builtin_mve_vdwdupq_m_wb_uv8hi (__inactive, *__a, __c, __imm, __p);
   return __res;
 }
 
@@ -13633,29 +13639,33 @@ __extension__ extern __inline uint8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vdwdupq_n_u8 (uint32_t __a, uint32_t __b, const int __imm)
 {
-  return __builtin_mve_vdwdupq_n_uv16qi (__a, __b, __imm);
+  uint64_t __c = ((uint64_t) __b) << 32;
+  return __builtin_mve_vdwdupq_n_uv16qi (__a, __c, __imm);
 }
 
 __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vdwdupq_n_u32 (uint32_t __a, uint32_t __b, const int __imm)
 {
-  return __builtin_mve_vdwdupq_n_uv4si (__a, __b, __imm);
+  uint64_t __c = ((uint64_t) __b) << 32;
+  return __builtin_mve_vdwdupq_n_uv4si (__a, __c, __imm);
 }
 
 __extension__ exter

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Martin Liška


On 4/7/20 12:22 PM, Richard Biener wrote:

On Tue, Apr 7, 2020 at 11:49 AM Jan Hubicka  wrote:




Well, the question is how to identify "operator new and operator delete at the
point of the new-expression and delete-expression".  Currently we're
just matching up "is this a new/delete operator" and the dataflow of the
pointer.  In the PR it was suggested the actual called methods should have
the same DECL_CONTEXT.  Honza suggested the context should have the
"same" ODR type (or be global).

I was just arguing that comparing pointers to types does not make much
sence in LTO where types can get unshared :)
Since the classes have ODR name at least this problem can be solved by
using ODR name compare.


Sure.


However the testcase I sent abuses the fact that if you inherit the
class you can rewrite only new operator.  In the inerited class
DECL_CONTEXT of delete will still point to the oriignal class.  This
means that you can mix new/delete pair from the original and inherited
class.


Yeah, but we're searching for a correctness fix not for an optimality one ;)

It sounds matching up the pairs in the middle-end is impossible and thus
the FE has to do this match-up (or refrain from marking new/delete when
a matchup according to to be defined methods is not going to be enough).
And maybe that tracking has to be done on a per call level rather than
on a called function level.


Based on what was said here, I see a proper matching implementation quite 
expensive
from implementation point of point. Moreover, the optimization itself has quite
low impact and so I suggest to only remove matching replaceable operators (1-12)
from [1], which are the top-level ones.

I'll come up with DECL_IS_REPLACEABLE_OPERATOR_DELETE_P and fix

#define DECL_IS_REPLACEABLE_OPERATOR_NEW_P(NODE) \
  (DECL_IS_OPERATOR_NEW_P (NODE) && DECL_IS_MALLOC (NODE))

which is not correct. It also matches

struct A
{
  __attribute__((noinline))
  static void* operator new(unsigned long sz)
  {
++count;
return ::operator new(sz);
  }

where DECL_IS_MALLOC is discovered by local IPA passes.

Hope it's viable solution?
Thanks,
Martin

[1] https://en.cppreference.com/w/cpp/memory/new/operator_delete



Richard.


Honza


You make it sound it's much harder and the parser needs to build the
relation?  You also suggest the "validness" is only OK in the context
of std::allocator and based on the unspecifiedness of the number of
allocations from the standard library.  That would further suggest that
we need to mark the allocation/deallocation points somehow and _not_
base the optimization on the called new/delete "function" (maybe with
an exception of the global ::new and ::delete).

Richard.

[PATCH][GCC][Arm]: MVE Don't use lsll for 32-bit shifts scalar

2020-04-07 Thread Andre Vieira (lists)


Hi,

After fixing the v[id]wdups using the "moving the wrap parameter" into 
the top-end of a DImode operand using a shift, I noticed we were using 
lsll for 32-bit shifts in scalars, where we don't need to, as we can 
simply do a move, which is much better if we don't need to use the 
bottom part.


We can solve this in a better way, but for now this will do.

Regression tested on arm-none-eabi.

Is this OK for trunk?

2020-04-07  Andre Vieira  

    * config/arm/arm.d (ashldi3): Don't use lsll for constant 32-bit
    shifts.

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
1a7ea0d701e5677965574d877d0fe4b2f5bc149f..6d5560398dae3d0ace0342b4907542d2a6865f70
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4422,7 +4422,8 @@ (define_expand "ashldi3"
 operands[2] = force_reg (SImode, operands[2]);
 
   /* Armv8.1-M Mainline double shifts are not expanded.  */
-  if (arm_reg_or_long_shift_imm (operands[2], GET_MODE (operands[2])))
+  if (arm_reg_or_long_shift_imm (operands[2], GET_MODE (operands[2]))
+ && (REG_P (operands[2]) || INTVAL(operands[2]) != 32))
 {
  if (!reg_overlap_mentioned_p(operands[0], operands[1]))
emit_insn (gen_movdi (operands[0], operands[1]));

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Martin Liška


On 4/6/20 2:45 PM, Nathan Sidwell wrote:

On 4/6/20 4:34 AM, Martin Liška wrote:



May I please ping Jason, Nathan and Jonathan who can help us here?


On IRC Martin clarified the question as essentially 'how do you pair up 
operator new and operator delete calls?' (so you may delete them if the object 
is not used).

I am not sure you're permitted to remove those calls in general.  All I can 
find is [expr.new]/12
'An implementation is allowed to omit a call to a replaceable global allocation 
function (17.6.2.1, 17.6.2.2). When it does so, the storage is instead provided 
by the implementation or provided by extending the allocation of another 
new-expression.'

But, I suspect the optimization is safe, in that either no one counts objects 
by their allocation, or if they do, they don't actually care that the 
std-conforming number of allocations happen.

The both operator new and operator delete are looked up in the same manner.  
The std does not require a 'matching pair' be found.  but it is extremely poor 
form for a class to declare exactly one of operator {new,delete}.


Hi.

Thank you for the information.



The following is well formed:

struct PoorForm {
   void *operator new (size_t s) {count++; return ::operator new (s)};
   static unsigned count;
};

Have you considered throwing ctors?

struct Obj {
   Obj (); // might throw
};

'obj = new Obj (); ... delete obj;' sequence expand to something like ...

// obj = new Obj ();
void *p = ::operator new (sizeof (Obj));
try {
   Obj::ctor(p);
}
catch (...) // cleanup code
{
   ::operator delete (p); // #1
   throw;
}

obj = (Obj*)p;

 user code

// delete obj;
Obj::dtor (obj);
::operator delete (obj); // #2

calls 1 & 2 matchup to the operator new call


Looking at the throwing ctors:

$ cat new-delete2.C
#include 

struct A
{
  __attribute__((always_inline)) A(int x)
  {
if (x == 123)
  throw x;
  }
};

int main(int argc, char **argv) {
  A *a = new A (argc);
  delete a;

  return 0;
}

$ g++-9 new-delete2.C -O2 -c -fdump-tree-optimized=/dev/stdout
...
   [local count: 1073741824]:
  _3 = operator new (1);
  if (argc_4(D) == 123)
goto ; [0.00%]
  else
goto ; [100.00%]

   [count: 0]:
  _8 = __cxa_allocate_exception (4);
  MEM[(int *)_8] = 123;
  __cxa_throw (_8, &_ZTIi, 0B);

   [local count: 1073741824]:
  operator delete (_3, 1);
  return 0;
...

As seen cddce can still optimize out
  _3 = operator new (1);
...
  operator delete (_3, 1);

Martin



Notice that para I quoted allows one to coalesce allocations using the global 
operator new/deletes.  The rules are pretty much as you can guess -- one 
lifetime must be entirely within the other.  If inner one's ctor throws, the 
exception path must destroy the outer.

does that help?

nathan

RE: [PATCH][GCC][Arm]: MVE: Fix constant load pattern

2020-04-07 Thread Kyrylo Tkachov



> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 07 April 2020 11:35
> To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov 
> Subject: [PATCH][GCC][Arm]: MVE: Fix constant load pattern
> 
> Hi,
> 
> This patch fixes the constant load pattern for MVE, this was not
> accounting correctly for label + offset cases.
> 
> Added test that ICE'd before and removed the scan assemblers for the
> mve_vector* tests as they were too fragile.
> 
> Bootstrapped on arm-linux-gnueabihf and regression tested on arm-none-
> eabi.
> 
> Is this OK for trunk?

This makes me a bit nervous as it touches the common output_move_neon code ☹ 
but it looks like it mostly shuffles things around.
Ok for trunk but please watch out for fallout.
Thanks,
Kyrill

> 
> gcc/ChangeLog:
> 2020-04-07  Andre Vieira  
> 
>      * config/arm/arm.c (output_move_neon): Deal with label + offset
> cases.
>      * config/arm/mve.md (*mve_mov): Handle const vectors.
> 
> gcc/testsuite/ChangeLog:
> 2020-04-07  Andre Vieira  
> 
>      * gcc.target/arm/mve/intrinsics/mve_load_from_array.c: New test.
>      * gcc.target/arm/mve/intrinsics/mve_vector_float.c: Remove
> scan-assembler.
>      * gcc.target/arm/mve/intrinsics/mve_vector_float1.c: Likewise.
>      * gcc.target/arm/mve/intrinsics/mve_vector_int1.c: Likewise.
>      * gcc.target/arm/mve/intrinsics/mve_vector_int2.c: Likewise.

Patch ping: Re: [PATCH] combine: Fix split_i2i3 ICE [PR94291]

2020-04-07 Thread Jakub Jelinek via Gcc-patches

Hi!

I'd like to ping this P1 PR fix:

On Tue, Mar 31, 2020 at 09:04:18AM +0200, Jakub Jelinek via Gcc-patches wrote:
> 2020-03-31  Jakub Jelinek  
> 
>   PR rtl-optimization/94291
>   * combine.c (try_combine): For split_i2i3, don't assume SET_DEST
>   must be a REG or SUBREG of REG; if it is not one of these, don't
>   update LOG_LINKs.
> 
>   * gcc.dg/pr94291.c: New test.

Jakub

Re: [testsuite][arm] Fix cmse-15.c expected output

2020-04-07 Thread Christophe Lyon via Gcc-patches

On Tue, 7 Apr 2020 at 12:31, Andre Vieira (lists)
 wrote:
>
> On 06/04/2020 16:12, Christophe Lyon via Gcc-patches wrote:
> > Hi,
> >
> > While checking Martin's fix for PR ipa/94445, he made me realize that
> > the cmse-15.c testcase still fails at -Os because ICF means that we
> > generate
> > nonsecure2:
> >  b   nonsecure0
> >
> > which is OK, but does not match the currently expected
> > nonsecure2:
> > ...
> >  bl  __gnu_cmse_nonsecure_call
> >
> > (see https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543190.html)
> >
> > The test has already different expectations for v8-M and v8.1-M.
> >
> > I've decided to try to use check-function-bodies to account for the
> > different possibilities:
> > - v8-M vs v8.1-M via two different prefixes
> > - code generation variants (-0?) via multiple regexps
> >
> > I've tested that the test now passes with --target-board=-march=armv8-m.main
> > and --target-board=-march=armv8.1-m.main.
> >
> > I feel this a bit too much of a burden for the purpose, maybe there's
> > a better way of handling all these alternatives (in particular,
> > there's a lot of duplication since the expected code for the secure*
> > functions is the same for v8-M and v8.1-M).
> >
> > OK?
> >
> > Thanks,
> >
> > Christophe
> Hi Christophe,
>
> This check-function-bodies functionality is pretty sweet, I assume the (
> A | B ) checks for either of them?

Yes.

> If so that looks like a good improvement. Ideally we'd also check the
> clearing for the v8.1-M cases, but that wasn't there before either and
> they would need again splitting for -mfloat-abi=soft+softfp and
> -mfloat-abi=hard.
>
Not sure what you mean?
The only nonsecure test with the (A|B) construct is:
+*Clear nonsecure2:
+*Clear ...
+*Clear (
+*Clear blxns r[0-3]
+*Clear |
+*Clear b nonsecure0
+*Clear )

So it does check the clearing (blxns), and 'b nonsecure0' is as valid
as the result of the test for nonsecure0.


>
> So yeah this LGTM but you need approval from a port/global maintainer.
>
Thanks

> Cheers,
> Andre

Re: [PATCH][GCC][Arm]: MVE: Fix v[id]wdup's

2020-04-07 Thread Christophe Lyon via Gcc-patches

On Tue, 7 Apr 2020 at 12:40, Andre Vieira (lists)
 wrote:
>
> Hi,
>
> This patch fixes v[id]wdup intrinsics. They had two issues:
> 1) the predicated versions did not link the incoming inactive vector
> parameter to the output
> 2) The backend didn't enforce the wrap limit operand be in an odd register.
>
> 1) was fixed like we did for all other predicated intrinsics
> 2) requires a temporary hack where we pass the value in the top end of
> DImode operand. The proper fix would be to add a register CLASS but this
> interacted badly with other existing targets codegen.  We will look to
> fix this properly in GCC 11.
>
> Regression tested on arm-none-eabi.
>

Hi Andre,

How did you find problem 1? I suspect you are using an internal
simulator since qemu does not support MVE yet?
And you probably have runtime tests to exhibit this failure?

Thanks,

Christophe

> Is this OK for trunk?
>
> gcc/ChangeLog:
> 2020-04-07  Andre Vieira  
>
>  * config/arm/arm_mve.h: Fix v[id]wdup intrinsics.
>  * config/arm/mve/md: Fix v[id]wdup patterns.
>

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Jonathan Wakely via Gcc-patches

On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:
> The both operator new and operator delete are looked up in the same
> manner.  The std does not require a 'matching pair' be found.  but it is
> extremely poor form for a class to declare exactly one of operator
> {new,delete}.

There are unfortunately several such example in the standard!

I wonder how much benefit we will really get from trying to make this
optimisation too general.

Just eliding (or coalescing) implicit calls to ::operator new(size_t)
and ::operator delete(void*, size_t) (and the [] and align_val_t forms
of those) probably covers 99% of real cases.

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Richard Biener via Gcc-patches

On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely  wrote:
>
> On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:
> > The both operator new and operator delete are looked up in the same
> > manner.  The std does not require a 'matching pair' be found.  but it is
> > extremely poor form for a class to declare exactly one of operator
> > {new,delete}.
>
> There are unfortunately several such example in the standard!
>
> I wonder how much benefit we will really get from trying to make this
> optimisation too general.
>
> Just eliding (or coalescing) implicit calls to ::operator new(size_t)
> and ::operator delete(void*, size_t) (and the [] and align_val_t forms
> of those) probably covers 99% of real cases.

IIRC the only reason to have the optimization was to fully elide
STL containers when possible.  That brings in allocators and
thus non ::new/::delete allocations.  Which then suggests that
our standard library implementation could annotate allocation
points somehow.

Richard.

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Jonathan Wakely via Gcc-patches

On Tue, 7 Apr 2020 at 10:29, Richard Biener  wrote:
>
> On Tue, Apr 7, 2020 at 10:26 AM Jonathan Wakely  wrote:
> >
> > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell  wrote:
> > >
> > > On 4/6/20 4:34 AM, Martin Liška wrote:
> > >
> > > >
> > > > May I please ping Jason, Nathan and Jonathan who can help us here?
> > >
> > > On IRC Martin clarified the question as essentially 'how do you pair up
> > > operator new and operator delete calls?' (so you may delete them if the
> > > object is not used).
> > >
> > > I am not sure you're permitted to remove those calls in general.  All I
> > > can find is [expr.new]/12
> > > 'An implementation is allowed to omit a call to a replaceable global
> > > allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage
> > > is instead provided by the implementation or provided by extending the
> > > allocation of another new-expression.'
> >
> > At https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94295#c6 Richard Smith
> > summarised the rules as "new-expressions, like std::allocator, may
> > obtain storage by calling 'operator new', but it's unspecified how
> > often it's called and with what arguments."
> >
> > So if our optimisation is removing the calls to base::operator new and
> > base::operator delete, but not the B::operator new call, then it seems
> > to be working at the wrong level. It should be eliding any calls to
> > operator new and operator delete at the point of the new-expression
> > and delete-expression, not leaving one call to operator new present
> > and then removing another one (which leaves the call "partially
> > removed").
>
> Well, the question is how to identify "operator new and operator delete at the
> point of the new-expression and delete-expression".  Currently we're
> just matching up "is this a new/delete operator" and the dataflow of the
> pointer.  In the PR it was suggested the actual called methods should have
> the same DECL_CONTEXT.  Honza suggested the context should have the
> "same" ODR type (or be global).
>
> You make it sound it's much harder and the parser needs to build the
> relation?  You also suggest the "validness" is only OK in the context
> of std::allocator and based on the unspecifiedness of the number of
> allocations from the standard library.

I don't think Richard's summary or my paraphrasing intends to say it
only applies to std::allocator.

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Jonathan Wakely via Gcc-patches

On Tue, 7 Apr 2020 at 12:40, Richard Biener  wrote:
>
> On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely  wrote:
> >
> > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:
> > > The both operator new and operator delete are looked up in the same
> > > manner.  The std does not require a 'matching pair' be found.  but it is
> > > extremely poor form for a class to declare exactly one of operator
> > > {new,delete}.
> >
> > There are unfortunately several such example in the standard!
> >
> > I wonder how much benefit we will really get from trying to make this
> > optimisation too general.
> >
> > Just eliding (or coalescing) implicit calls to ::operator new(size_t)
> > and ::operator delete(void*, size_t) (and the [] and align_val_t forms
> > of those) probably covers 99% of real cases.
>
> IIRC the only reason to have the optimization was to fully elide
> STL containers when possible.  That brings in allocators and
> thus non ::new/::delete allocations.

But the vast majority of containers are used with std::allocator, and
we control that.

Wouldn't it be simpler to add __builtin_operator_new and
__builtin_operator_delete like clang has, then make std::allocator use
those, and limit the optimizations to uses of those built-ins?

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Richard Biener via Gcc-patches

On Tue, Apr 7, 2020 at 1:46 PM Jonathan Wakely  wrote:
>
> On Tue, 7 Apr 2020 at 12:40, Richard Biener  
> wrote:
> >
> > On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely  
> > wrote:
> > >
> > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:
> > > > The both operator new and operator delete are looked up in the same
> > > > manner.  The std does not require a 'matching pair' be found.  but it is
> > > > extremely poor form for a class to declare exactly one of operator
> > > > {new,delete}.
> > >
> > > There are unfortunately several such example in the standard!
> > >
> > > I wonder how much benefit we will really get from trying to make this
> > > optimisation too general.
> > >
> > > Just eliding (or coalescing) implicit calls to ::operator new(size_t)
> > > and ::operator delete(void*, size_t) (and the [] and align_val_t forms
> > > of those) probably covers 99% of real cases.
> >
> > IIRC the only reason to have the optimization was to fully elide
> > STL containers when possible.  That brings in allocators and
> > thus non ::new/::delete allocations.
>
> But the vast majority of containers are used with std::allocator, and
> we control that.
>
> Wouldn't it be simpler to add __builtin_operator_new and
> __builtin_operator_delete like clang has, then make std::allocator use
> those, and limit the optimizations to uses of those built-ins?

Sure, that's a viable implementation strathegy.  Another might be

void delete (void *) __attribute__((matching_new(somewhere::new)));

and thus allow the user to relate a new/delete pair (here the FE would
do lookup for 'new' and record for example the mangled assembler name).

That is, we usually try to design a mechanism that's more broadly usable.
But yes, we match malloc/free so matching ::new/::delete by aliasing them
to __builtin_operator_new and __builtin_operator_delete is fair enough.

Not easily usable by other languages with custom allocation though
(fortran allocate/deallocate anyone? that's currently inlined to expose
underlying malloc/free calls for similar reasons)

Richard.

Re: [PATCH][GCC][Arm]: MVE: Fix constant load pattern

2020-04-07 Thread Andre Vieira (lists)

The diff looks weird, but this only removes the first if 
(TARGET_HAVE_MVE... ) block and updates the variable 'addr' which is 
only used in the consecutive(TARGET_HAVE_MVE ..) blocks. So it doesn't 
change NEON codegen.


It's unfortunate the diff looks so complicated :(

Cheers,
Andre

On 07/04/2020 11:52, Kyrylo Tkachov wrote:



-Original Message-
From: Andre Vieira (lists) 
Sent: 07 April 2020 11:35
To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov 
Subject: [PATCH][GCC][Arm]: MVE: Fix constant load pattern

Hi,

This patch fixes the constant load pattern for MVE, this was not
accounting correctly for label + offset cases.

Added test that ICE'd before and removed the scan assemblers for the
mve_vector* tests as they were too fragile.

Bootstrapped on arm-linux-gnueabihf and regression tested on arm-none-
eabi.

Is this OK for trunk?

This makes me a bit nervous as it touches the common output_move_neon code ☹ 
but it looks like it mostly shuffles things around.
Ok for trunk but please watch out for fallout.
Thanks,
Kyrill


gcc/ChangeLog:
2020-04-07  Andre Vieira  

      * config/arm/arm.c (output_move_neon): Deal with label + offset
cases.
      * config/arm/mve.md (*mve_mov): Handle const vectors.

gcc/testsuite/ChangeLog:
2020-04-07  Andre Vieira  

      * gcc.target/arm/mve/intrinsics/mve_load_from_array.c: New test.
      * gcc.target/arm/mve/intrinsics/mve_vector_float.c: Remove
scan-assembler.
      * gcc.target/arm/mve/intrinsics/mve_vector_float1.c: Likewise.
      * gcc.target/arm/mve/intrinsics/mve_vector_int1.c: Likewise.
      * gcc.target/arm/mve/intrinsics/mve_vector_int2.c: Likewise.

[PATCH] i386: Fix V{64QI,32HI}mode constant permutations [PR94509]

2020-04-07 Thread Jakub Jelinek via Gcc-patches

Hi!

The following testcases are miscompiled, because expand_vec_perm_pshufb
incorrectly thinks it can use vpshufb instruction for the permutations
when it can't.
The
  if (vmode == V32QImode)
{
  /* vpshufb only works intra lanes, it is not
 possible to shuffle bytes in between the lanes.  */
  for (i = 0; i < nelt; ++i)
if ((d->perm[i] ^ i) & (nelt / 2))
  return false;
}
intra-lane check which is correct has been copied and adjusted for 64-byte
modes into:
  if (vmode == V64QImode)
{
  /* vpshufb only works intra lanes, it is not
 possible to shuffle bytes in between the lanes.  */
  for (i = 0; i < nelt; ++i)
if ((d->perm[i] ^ i) & (nelt / 4))
  return false;
}
which is not correct, because 64-byte modes have 4 lanes rather than just
two and the above is only testing that the permutation grabs even lane elts
from even lanes and odd lane elts from odd lanes, but not that they are
from the same 256-bit half.

The following patch fixes it by using 3 * nelt / 4 instead of nelt / 4,
so we actually check the most significant 2 bits rather than just one.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
release branches?

2020-04-07  Jakub Jelinek  

PR target/94509
* config/i386/i386-expand.c (expand_vec_perm_pshufb): Fix the check
for inter-lane permutation for 64-byte modes.

* gcc.target/i386/avx512bw-pr94509-1.c: New test.
* gcc.target/i386/avx512bw-pr94509-2.c: New test.

--- gcc/config/i386/i386-expand.c.jj2020-04-07 08:27:13.770891167 +0200
+++ gcc/config/i386/i386-expand.c   2020-04-07 10:40:09.754089204 +0200
@@ -16781,7 +16781,7 @@ expand_vec_perm_pshufb (struct expand_ve
  /* vpshufb only works intra lanes, it is not
 possible to shuffle bytes in between the lanes.  */
  for (i = 0; i < nelt; ++i)
-   if ((d->perm[i] ^ i) & (nelt / 4))
+   if ((d->perm[i] ^ i) & (3 * nelt / 4))
  return false;
}
}
--- gcc/testsuite/gcc.target/i386/avx512bw-pr94509-1.c.jj   2020-04-07 
10:42:58.560566492 +0200
+++ gcc/testsuite/gcc.target/i386/avx512bw-pr94509-1.c  2020-04-07 
10:44:44.010990594 +0200
@@ -0,0 +1,30 @@
+/* PR target/94509 */
+/* { dg-do run { target avx512bw } } */
+/* { dg-options "-O2 -mavx512bw" } */
+
+#define AVX512BW
+#include "avx512f-helper.h"
+
+typedef unsigned short __attribute__ ((__vector_size__ (64))) V;
+
+__attribute__((noipa)) V
+foo (V x)
+{
+  return __builtin_shuffle (x, (V) { 0, 0, 0, 0, 0, 0, 0, 0,
+15, 15, 15, 15, 15, 15, 15, 15,
+0, 0, 0, 0, 0, 0, 0, 0,
+15, 15, 15, 15, 15, 15, 15, 15 });
+}
+
+static void
+TEST (void)
+{
+  V v = foo ((V) { 1, 2, 3, 4, 5, 6, 7, 8,
+  9, 10, 11, 12, 13, 14, 15, 16,
+  17, 18, 19, 20, 21, 22, 23, 24,
+  25, 26, 27, 28, 29, 30, 31, 32 });
+  unsigned int i;
+  for (i = 0; i < sizeof (v) / sizeof (v[0]); i++)
+if (v[i] != ((i & 8) ? 16 : 1))
+  abort ();
+}
--- gcc/testsuite/gcc.target/i386/avx512bw-pr94509-2.c.jj   2020-04-07 
10:43:01.739518984 +0200
+++ gcc/testsuite/gcc.target/i386/avx512bw-pr94509-2.c  2020-04-07 
10:44:49.811903904 +0200
@@ -0,0 +1,38 @@
+/* PR target/94509 */
+/* { dg-do run { target avx512bw } } */
+/* { dg-options "-O2 -mavx512bw" } */
+
+#define AVX512BW
+#include "avx512f-helper.h"
+
+typedef unsigned char __attribute__ ((__vector_size__ (64))) V;
+
+__attribute__((noipa)) V
+foo (V x)
+{
+  return __builtin_shuffle (x, (V) { 0, 1, 0, 1, 0, 1, 0, 1,
+0, 1, 0, 1, 0, 1, 0, 1,
+30, 31, 30, 31, 30, 31, 30, 31,
+30, 31, 30, 31, 30, 31, 30, 31,
+0, 1, 0, 1, 0, 1, 0, 1,
+0, 1, 0, 1, 0, 1, 0, 1,
+30, 31, 30, 31, 30, 31, 30, 31,
+30, 31, 30, 31, 30, 31, 30, 31 });
+}
+
+static void
+TEST (void)
+{
+  V v = foo ((V) { 1, 2, 3, 4, 5, 6, 7, 8,
+  9, 10, 11, 12, 13, 14, 15, 16,
+  17, 18, 19, 20, 21, 22, 23, 24,
+  25, 26, 27, 28, 29, 30, 31, 32,
+  33, 34, 35, 36, 37, 38, 39, 40,
+  41, 42, 43, 44, 45, 46, 47, 48,
+  49, 50, 51, 52, 53, 54, 55, 56,
+  57, 58, 59, 60, 61, 62, 63, 64 });
+  unsigned int i;
+  for (i = 0; i < sizeof (v) / sizeof (v[0]); i++)
+if (v[i] != ((i & 16) ? 31 : 1) + (i & 1))
+  abort ();
+}

Jakub

Re: [PATCH][Arm][2/4] Custom Datapath Extension intrinsics: instructions using FPU/MVE S/D registers

2020-04-07 Thread Dennis Zhang

Hi all,

This patch is updated to support DImode for vfp target as required by CDE.
Changelog is updated as following.

Is this ready for commit please?

Cheers
Dennis

gcc/ChangeLog:

2020-04-07  Dennis Zhang  
Matthew Malcomson 

* config/arm/arm-builtins.c (CX_IMM_QUALIFIERS): New macro.
(CX_UNARY_QUALIFIERS, CX_BINARY_QUALIFIERS): Likewise.
(CX_TERNARY_QUALIFIERS): Likewise.
(ARM_BUILTIN_CDE_PATTERN_START): Likewise.
(ARM_BUILTIN_CDE_PATTERN_END): Likewise.
(arm_init_acle_builtins): Initialize CDE builtins.
(arm_expand_acle_builtin): Check CDE constant operands.
* config/arm/arm.h (ARM_CDE_CONST_COPROC): New macro to set the range
of CDE constant operand.
* config/arm/arm.c (arm_hard_regno_mode_ok): Support DImode for
TARGET_VFP_BASE.
(ARM_VCDE_CONST_1, ARM_VCDE_CONST_2, ARM_VCDE_CONST_3): Likewise.
* config/arm/arm_cde.h (__arm_vcx1_u32): New macro of ACLE interface.
(__arm_vcx1a_u32, __arm_vcx2_u32, __arm_vcx2a_u32): Likewise.
(__arm_vcx3_u32, __arm_vcx3a_u32, __arm_vcx1d_u64): Likewise.
(__arm_vcx1da_u64, __arm_vcx2d_u64, __arm_vcx2da_u64): Likewise.
(__arm_vcx3d_u64, __arm_vcx3da_u64): Likewise.
* config/arm/arm_cde_builtins.def: New file.
* config/arm/iterators.md (V_reg): New attribute of SI.
* config/arm/predicates.md (const_int_coproc_operand): New.
(const_int_vcde1_operand, const_int_vcde2_operand): New.
(const_int_vcde3_operand): New.
* config/arm/unspecs.md (UNSPEC_VCDE, UNSPEC_VCDEA): New.
* config/arm/vfp.md (arm_vcx1): New entry.
(arm_vcx1a, arm_vcx2, arm_vcx2a): Likewise.
(arm_vcx3, arm_vcx3a): Likewise.

gcc/testsuite/ChangeLog:

2020-04-07  Dennis Zhang  

* gcc.target/arm/acle/cde_v_1.c: New test.
* gcc.target/arm/acle/cde_v_1_err.c: New test.
* gcc.target/arm/acle/cde_v_1_mve.c: New test.

> Hi all,
>
> This patch is updated as attached.
> It's rebased to the top. Is it ready for commit please?
>
> Cheers
> Dennis
>
> > Hi all,
> >
> > This patch is part of a series that adds support for the ARMv8.m Custom 
> > Datapath Extension (CDE).
> > It enables the ACLE intrinsics calling VCX1, VCX2, and VCX3 
> > instructions who work with FPU/MVE 32-bit/64-bit registers.
> >
> > This patch depends on the CDE feature patch: 
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541921.html
> > It also depends on the MVE framework patch: 
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540415.html
> > ISA has been announced at 
> > https://developer.arm.com/architectures/instruction-sets/custom-instructions
> >
> > Regtested and bootstrapped for arm-none-linux-gnueabi-armv8-m.main.
> >
> > Is it OK for commit please?
> >
> > Cheers
> > Dennis
> >

arm-m-cde-vcxsidi-final-20200407-rb12663.patch
Description: arm-m-cde-vcxsidi-final-20200407-rb12663.patch

Re: [PATCH] i386: Fix V{64QI, 32HI}mode constant permutations [PR94509]

2020-04-07 Thread Uros Bizjak via Gcc-patches

On Tue, Apr 7, 2020 at 2:29 PM Jakub Jelinek  wrote:
>
> Hi!
>
> The following testcases are miscompiled, because expand_vec_perm_pshufb
> incorrectly thinks it can use vpshufb instruction for the permutations
> when it can't.
> The
>   if (vmode == V32QImode)
> {
>   /* vpshufb only works intra lanes, it is not
>  possible to shuffle bytes in between the lanes.  */
>   for (i = 0; i < nelt; ++i)
> if ((d->perm[i] ^ i) & (nelt / 2))
>   return false;
> }
> intra-lane check which is correct has been copied and adjusted for 64-byte
> modes into:
>   if (vmode == V64QImode)
> {
>   /* vpshufb only works intra lanes, it is not
>  possible to shuffle bytes in between the lanes.  */
>   for (i = 0; i < nelt; ++i)
> if ((d->perm[i] ^ i) & (nelt / 4))
>   return false;
> }
> which is not correct, because 64-byte modes have 4 lanes rather than just
> two and the above is only testing that the permutation grabs even lane elts
> from even lanes and odd lane elts from odd lanes, but not that they are
> from the same 256-bit half.
>
> The following patch fixes it by using 3 * nelt / 4 instead of nelt / 4,
> so we actually check the most significant 2 bits rather than just one.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
> release branches?
>
> 2020-04-07  Jakub Jelinek  
>
> PR target/94509
> * config/i386/i386-expand.c (expand_vec_perm_pshufb): Fix the check
> for inter-lane permutation for 64-byte modes.
>
> * gcc.target/i386/avx512bw-pr94509-1.c: New test.
> * gcc.target/i386/avx512bw-pr94509-2.c: New test.

The patch is OK, and after reading your detailed explanation, even as
obvious everywhere.

Thanks,
Uros.


> --- gcc/config/i386/i386-expand.c.jj2020-04-07 08:27:13.770891167 +0200
> +++ gcc/config/i386/i386-expand.c   2020-04-07 10:40:09.754089204 +0200
> @@ -16781,7 +16781,7 @@ expand_vec_perm_pshufb (struct expand_ve
>   /* vpshufb only works intra lanes, it is not
>  possible to shuffle bytes in between the lanes.  */
>   for (i = 0; i < nelt; ++i)
> -   if ((d->perm[i] ^ i) & (nelt / 4))
> +   if ((d->perm[i] ^ i) & (3 * nelt / 4))
>   return false;
> }
> }
> --- gcc/testsuite/gcc.target/i386/avx512bw-pr94509-1.c.jj   2020-04-07 
> 10:42:58.560566492 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512bw-pr94509-1.c  2020-04-07 
> 10:44:44.010990594 +0200
> @@ -0,0 +1,30 @@
> +/* PR target/94509 */
> +/* { dg-do run { target avx512bw } } */
> +/* { dg-options "-O2 -mavx512bw" } */
> +
> +#define AVX512BW
> +#include "avx512f-helper.h"
> +
> +typedef unsigned short __attribute__ ((__vector_size__ (64))) V;
> +
> +__attribute__((noipa)) V
> +foo (V x)
> +{
> +  return __builtin_shuffle (x, (V) { 0, 0, 0, 0, 0, 0, 0, 0,
> +15, 15, 15, 15, 15, 15, 15, 15,
> +0, 0, 0, 0, 0, 0, 0, 0,
> +15, 15, 15, 15, 15, 15, 15, 15 });
> +}
> +
> +static void
> +TEST (void)
> +{
> +  V v = foo ((V) { 1, 2, 3, 4, 5, 6, 7, 8,
> +  9, 10, 11, 12, 13, 14, 15, 16,
> +  17, 18, 19, 20, 21, 22, 23, 24,
> +  25, 26, 27, 28, 29, 30, 31, 32 });
> +  unsigned int i;
> +  for (i = 0; i < sizeof (v) / sizeof (v[0]); i++)
> +if (v[i] != ((i & 8) ? 16 : 1))
> +  abort ();
> +}
> --- gcc/testsuite/gcc.target/i386/avx512bw-pr94509-2.c.jj   2020-04-07 
> 10:43:01.739518984 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512bw-pr94509-2.c  2020-04-07 
> 10:44:49.811903904 +0200
> @@ -0,0 +1,38 @@
> +/* PR target/94509 */
> +/* { dg-do run { target avx512bw } } */
> +/* { dg-options "-O2 -mavx512bw" } */
> +
> +#define AVX512BW
> +#include "avx512f-helper.h"
> +
> +typedef unsigned char __attribute__ ((__vector_size__ (64))) V;
> +
> +__attribute__((noipa)) V
> +foo (V x)
> +{
> +  return __builtin_shuffle (x, (V) { 0, 1, 0, 1, 0, 1, 0, 1,
> +0, 1, 0, 1, 0, 1, 0, 1,
> +30, 31, 30, 31, 30, 31, 30, 31,
> +30, 31, 30, 31, 30, 31, 30, 31,
> +0, 1, 0, 1, 0, 1, 0, 1,
> +0, 1, 0, 1, 0, 1, 0, 1,
> +30, 31, 30, 31, 30, 31, 30, 31,
> +30, 31, 30, 31, 30, 31, 30, 31 });
> +}
> +
> +static void
> +TEST (void)
> +{
> +  V v = foo ((V) { 1, 2, 3, 4, 5, 6, 7, 8,
> +  9, 10, 11, 12, 13, 14, 15, 16,
> +  17, 18, 19, 20, 21, 22, 23, 24,
> +  25, 26, 27, 28, 29, 30, 31, 32,
> +  33, 34, 35, 36, 37, 38, 39, 40,
> +

[committed] openmp: Fix parallel master error recovery [PR94512]

2020-04-07 Thread Jakub Jelinek via Gcc-patches

Hi!

We need to set OMP_PARALLEL_COMBINED only if the parsing of omp_master
succeeded, because otherwise there is no nested master construct in the
parallel.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk, queued for backporting to 9.x.

2020-04-07  Jakub Jelinek  

PR c++/94512
* c-parser.c (c_parser_omp_parallel): Set OMP_PARALLEL_COMBINED
if c_parser_omp_master succeeded.

* parser.c (cp_parser_omp_parallel): Set OMP_PARALLEL_COMBINED
if cp_parser_omp_master succeeded.

* g++.dg/gomp/pr94512.C: New test.

--- gcc/c/c-parser.c.jj 2020-03-23 19:44:48.271181302 +0100
+++ gcc/c/c-parser.c2020-04-07 12:28:19.619763021 +0200
@@ -18877,9 +18877,9 @@ c_parser_omp_parallel (location_t loc, c
  stmt = c_finish_omp_parallel (loc,
cclauses[C_OMP_CLAUSE_SPLIT_PARALLEL],
block);
- OMP_PARALLEL_COMBINED (stmt) = 1;
  if (ret == NULL)
return ret;
+ OMP_PARALLEL_COMBINED (stmt) = 1;
  return stmt;
}
   else if (strcmp (p, "loop") == 0)
--- gcc/cp/parser.c.jj  2020-04-04 09:14:29.911001109 +0200
+++ gcc/cp/parser.c 2020-04-07 12:27:21.977628266 +0200
@@ -39818,9 +39818,9 @@ cp_parser_omp_parallel (cp_parser *parse
  cp_parser_end_omp_structured_block (parser, save);
  stmt = finish_omp_parallel (cclauses[C_OMP_CLAUSE_SPLIT_PARALLEL],
  block);
- OMP_PARALLEL_COMBINED (stmt) = 1;
  if (ret == NULL_TREE)
return ret;
+ OMP_PARALLEL_COMBINED (stmt) = 1;
  return stmt;
}
   else if (strcmp (p, "loop") == 0)
--- gcc/testsuite/g++.dg/gomp/pr94512.C.jj  2020-04-07 12:41:31.250883365 
+0200
+++ gcc/testsuite/g++.dg/gomp/pr94512.C 2020-04-07 12:40:57.381391551 +0200
@@ -0,0 +1,18 @@
+// PR c++/94512
+
+void
+foo ();
+
+template 
+void
+bar ()
+{
+#pragma omp parallel master taskloop
+  foo ();  // { dg-error "for statement expected before" }
+}
+
+void
+baz ()
+{
+  bar<0> ();
+}

Jakub

(v4 update) Re: [PATCH, OpenACC, v3] Non-contiguous array support for OpenACC data clauses

2020-04-07 Thread Chung-Lin Tang


On 2019/11/26 10:49 PM, Chung-Lin Tang wrote:

Hi Thomas,
this is a reorg of the last non-contiguous arrays patch. You'll notice that:

(1) A large part of the code has been pulled into oacc-parallel.c, with most
of the data structure declarations in oacc-int.h.

(2) target.c only contains relatively little code from gomp_map_vars_internal
that processes what GOACC_parallel_keyed/data_start gives it.

(3) Instead of directly passed in the map pointer, the array descriptor
pointers are now passed to GOACC_parallel_keyed/data_start using varargs.
(I believe the adding of '...' to GOACC_data_start does not break any
compatiblity)

(4) Along the way, I've added a 'gomp_map_vars_openacc' for specializing our
uses, which should shave off quite some code through inlining.

The GOMP_MAP_NONCONTIG_ARRAY_P maps are still placed at the beginning of the
recieved map sequence in this patch. It should still be relatively easy to
use a GOACC_FLAG_* to do so if deemed better before committing.

Thanks,
Chung-Lin


Hi Thomas,
this is a rebased version, with some updates WRT the attach/detach changes and
some bug fixes, dubbed "v4". Plan to merge this version to the OG10 branch soon.

Thanks,
Chung-Lin




     PR other/76739

     gcc/c/
     * c-typeck.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
     parameter, adjust recursive call site, add cases for allowing
     pointer based multi-dimensional arrays for OpenACC.
     (handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
     handle non-contiguous case to create dynamic array map.

     gcc/cp/
     * semantics.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
     parameter, adjust recursive call site, add cases for allowing
     pointer based multi-dimensional arrays for OpenACC.
     (handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
     handle non-contiguous case to create dynamic array map.

     gcc/fortran/
     * f95-lang.c (DEF_FUNCTION_TYPE_VAR_5): New symbol.
     * types.def (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR): New type.

     gcc/
     * builtin-types.def (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR): New type.
     * omp-builtins.def (BUILT_IN_GOACC_DATA_START): Adjust function type
     to new BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR.
     * gimplify.c (gimplify_scan_omp_clauses): Skip gimplification of
     OMP_CLAUSE_SIZE of non-contiguous array maps (which is a TREE_LIST).
     * omp-expand.c (expand_omp_target): Add non-contiguous array descriptor
     pointers to variadic arguments.
     * omp-low.c (append_field_to_record_type): New function.
     (create_noncontig_array_descr_type): Likewise.
     (create_noncontig_array_descr_init_code): Likewise.
     (scan_sharing_clauses): For non-contiguous array map kinds, check for
     supported dimension structure, and install non-contiguous array
     variable into current omp_context.
     (reorder_noncontig_array_clauses): New function.
     (scan_omp_target): Call reorder_noncontig_array_clauses to place
     non-contiguous array map clauses at beginning of clause sequence.
     (lower_omp_target): Add handling for non-contiguous array map kinds,
     add all created non-contiguous array descriptors to
     gimple_omp_target_data_arg.

     gcc/testsuite/
 * c-c++-common/goacc/noncontig_array-1.c: New test.

 libgomp/
     * libgomp_g.h (GOACC_data_start): Add variadic '...' to declaration.
     * libgomp.h (gomp_map_vars_openacc): New function declaration.
     * oacc-int.h (struct goacc_ncarray_dim): New struct declaration.
     (struct goacc_ncarray_descr_type): Likewise.
     (struct goacc_ncarray): Likewise.
     (struct goacc_ncarray_info): Likewise.
 (goacc_noncontig_array_create_ptrblock): New function declaration.
     * oacc-parallel.c (goacc_noncontig_array_count_rows): New function.
     (goacc_noncontig_array_compute_sizes): Likewise.
     (goacc_noncontig_array_fill_rows_1): Likewise.
     (goacc_noncontig_array_fill_rows): Likewise.
     (goacc_process_noncontiguous_arrays): Likewise.
     (goacc_noncontig_array_create_ptrblock): Likewise.
 (GOACC_parallel_keyed): Use goacc_process_noncontiguous_arrays to
     handle non-contiguous array descriptors at end of varargs, adjust
     to use gomp_map_vars_openacc.
     (GOACC_data_start): Likewise. Adjust function type to accept varargs.
 * target.c (gomp_map_vars_internal): Add struct goacc_ncarray_info *
     nca_info parameter, add handling code for non-contiguous arrays.
     (gomp_map_vars_openacc): Add new function for specialization of
     gomp_map_vars_internal for OpenACC structured region usage.

     * testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c: New test.
     * testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c: New

Re: [PATCH][GCC][Arm]: MVE: Fix v[id]wdup's

2020-04-07 Thread Andre Vieira (lists)


On 07/04/2020 11:57, Christophe Lyon wrote:

On Tue, 7 Apr 2020 at 12:40, Andre Vieira (lists)
 wrote:

Hi,

This patch fixes v[id]wdup intrinsics. They had two issues:
1) the predicated versions did not link the incoming inactive vector
parameter to the output
2) The backend didn't enforce the wrap limit operand be in an odd register.

1) was fixed like we did for all other predicated intrinsics
2) requires a temporary hack where we pass the value in the top end of
DImode operand. The proper fix would be to add a register CLASS but this
interacted badly with other existing targets codegen.  We will look to
fix this properly in GCC 11.

Regression tested on arm-none-eabi.


Hi Andre,

How did you find problem 1? I suspect you are using an internal
simulator since qemu does not support MVE yet?
And you probably have runtime tests to exhibit this failure?

Hi Christophe,

I actually found 1) because I was fixing 2). Though yes, I am trying to 
complement testing using an internal simulator and running tests in 
Arm's CMSIS DSP Library (https://github.com/ARM-software/CMSIS_5) that 
use MVE.


Cheers,
Andre

Thanks,

Christophe


Is this OK for trunk?

gcc/ChangeLog:
2020-04-07  Andre Vieira  

  * config/arm/arm_mve.h: Fix v[id]wdup intrinsics.
  * config/arm/mve/md: Fix v[id]wdup patterns.

RE: [PATCH][GCC][Arm]: MVE: Fix v[id]wdup's

2020-04-07 Thread Kyrylo Tkachov

> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 07 April 2020 11:41
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH][GCC][Arm]: MVE: Fix v[id]wdup's
> 
> Hi,
> 
> This patch fixes v[id]wdup intrinsics. They had two issues:
> 1) the predicated versions did not link the incoming inactive vector
> parameter to the output
> 2) The backend didn't enforce the wrap limit operand be in an odd register.
> 
> 1) was fixed like we did for all other predicated intrinsics
> 2) requires a temporary hack where we pass the value in the top end of
> DImode operand. The proper fix would be to add a register CLASS but this
> interacted badly with other existing targets codegen.  We will look to
> fix this properly in GCC 11.
> 
> Regression tested on arm-none-eabi.
> 
> Is this OK for trunk?

Argh, not a fan of explicitly matching subregs as they usually break on 
big-endian, but we've disabled the intrinsics for big-endian for now ☹
So ok for trunk.
Thanks,
Kyrill

> 
> gcc/ChangeLog:
> 2020-04-07  Andre Vieira  
> 
>      * config/arm/arm_mve.h: Fix v[id]wdup intrinsics.
>      * config/arm/mve/md: Fix v[id]wdup patterns.

RE: [PATCH][GCC][Arm]: MVE Don't use lsll for 32-bit shifts scalar

2020-04-07 Thread Kyrylo Tkachov

> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 07 April 2020 11:43
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH][GCC][Arm]: MVE Don't use lsll for 32-bit shifts scalar
> 
> Hi,
> 
> After fixing the v[id]wdups using the "moving the wrap parameter" into
> the top-end of a DImode operand using a shift, I noticed we were using
> lsll for 32-bit shifts in scalars, where we don't need to, as we can
> simply do a move, which is much better if we don't need to use the
> bottom part.
> 
> We can solve this in a better way, but for now this will do.
> 
> Regression tested on arm-none-eabi.
> 
> Is this OK for trunk?

Ok.
Thanks,
Kyrill

> 
> 2020-04-07  Andre Vieira  
> 
>      * config/arm/arm.d (ashldi3): Don't use lsll for constant 32-bit
>      shifts.

RE: [PATCH][Arm][2/4] Custom Datapath Extension intrinsics: instructions using FPU/MVE S/D registers

2020-04-07 Thread Kyrylo Tkachov

Hi Dennis,

> -Original Message-
> From: Dennis Zhang 
> Sent: 07 April 2020 13:31
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; Richard Earnshaw ;
> Ramana Radhakrishnan ; Kyrylo
> Tkachov 
> Subject: Re: [PATCH][Arm][2/4] Custom Datapath Extension intrinsics:
> instructions using FPU/MVE S/D registers
> 
> Hi all,
> 
> This patch is updated to support DImode for vfp target as required by CDE.
> Changelog is updated as following.
> 
> Is this ready for commit please?

This is ok.
Has the first patch been updated and committed yet?
Thanks,
Kyrill

> 
> Cheers
> Dennis
> 
> gcc/ChangeLog:
> 
> 2020-04-07  Dennis Zhang  
> Matthew Malcomson 
> 
> * config/arm/arm-builtins.c (CX_IMM_QUALIFIERS): New macro.
> (CX_UNARY_QUALIFIERS, CX_BINARY_QUALIFIERS): Likewise.
> (CX_TERNARY_QUALIFIERS): Likewise.
> (ARM_BUILTIN_CDE_PATTERN_START): Likewise.
> (ARM_BUILTIN_CDE_PATTERN_END): Likewise.
> (arm_init_acle_builtins): Initialize CDE builtins.
> (arm_expand_acle_builtin): Check CDE constant operands.
> * config/arm/arm.h (ARM_CDE_CONST_COPROC): New macro to set the
> range
> of CDE constant operand.
> * config/arm/arm.c (arm_hard_regno_mode_ok): Support DImode for
> TARGET_VFP_BASE.
> (ARM_VCDE_CONST_1, ARM_VCDE_CONST_2, ARM_VCDE_CONST_3):
> Likewise.
> * config/arm/arm_cde.h (__arm_vcx1_u32): New macro of ACLE interface.
> (__arm_vcx1a_u32, __arm_vcx2_u32, __arm_vcx2a_u32): Likewise.
> (__arm_vcx3_u32, __arm_vcx3a_u32, __arm_vcx1d_u64): Likewise.
> (__arm_vcx1da_u64, __arm_vcx2d_u64, __arm_vcx2da_u64): Likewise.
> (__arm_vcx3d_u64, __arm_vcx3da_u64): Likewise.
> * config/arm/arm_cde_builtins.def: New file.
> * config/arm/iterators.md (V_reg): New attribute of SI.
> * config/arm/predicates.md (const_int_coproc_operand): New.
> (const_int_vcde1_operand, const_int_vcde2_operand): New.
> (const_int_vcde3_operand): New.
> * config/arm/unspecs.md (UNSPEC_VCDE, UNSPEC_VCDEA): New.
> * config/arm/vfp.md (arm_vcx1): New entry.
> (arm_vcx1a, arm_vcx2, arm_vcx2a): Likewise.
> (arm_vcx3, arm_vcx3a): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-04-07  Dennis Zhang  
> 
> * gcc.target/arm/acle/cde_v_1.c: New test.
> * gcc.target/arm/acle/cde_v_1_err.c: New test.
> * gcc.target/arm/acle/cde_v_1_mve.c: New test.
> 
> > Hi all,
> >
> > This patch is updated as attached.
> > It's rebased to the top. Is it ready for commit please?
> >
> > Cheers
> > Dennis
> >
> > > Hi all,
> > >
> > > This patch is part of a series that adds support for the ARMv8.m Custom
> Datapath Extension (CDE).
> > > It enables the ACLE intrinsics calling VCX1, VCX2, and VCX3
> instructions who work with FPU/MVE 32-bit/64-bit registers.
> > >
> > > This patch depends on the CDE feature patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541921.html
> > > It also depends on the MVE framework patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540415.html
> > > ISA has been announced at
> https://developer.arm.com/architectures/instruction-sets/custom-
> instructions
> > >
> > > Regtested and bootstrapped for arm-none-linux-gnueabi-armv8-m.main.
> > >
> > > Is it OK for commit please?
> > >
> > > Cheers
> > > Dennis
> > >

Re: [PATCH 2/2] RISC-V: Handle implied extension for -march parser.

2020-04-07 Thread Kito Cheng

Hi Jim:

> When I try this patch, I see an error:
>
> rohan:2132$ ./xgcc -B./ -O -march=rv64imafdc -mabi=lp64d  tmp.c
> /tmp/ccULN36f.s: Assembler messages:
> /tmp/ccULN36f.s:3: Fatal error:
> -march=rv64i2p0_m2p0_a2p0_f2p0_f2p0_d2p0_c2p0: ISA string is not in
> canonical order. `f'
> rohan:2133$
>
> Looks like you need to make sure that we don't add the implied F if it
> is already there.  Otherwise, this looks OK to me.

Thanks, I should add a testcase for that :)

>
> We don't have support for these implied extensions in binutils yet.
> If we are adding it to gcc, we should probably add it to binutils too.
> Maybe you can ask Nelson to work on that.

I also found another issue: assembler will complain when passing -march=rv64id,
 because it does not accept implied extensions yet...

$ riscv64-unknown-elf-gcc ~/hello.c -march=rv64id  -c
# GCC also invoke as with -march=rv64id
Assembler messages:
Fatal error: -march=rv64id: `d' extension requires `f' extension

However if we require assembler support that means we need to bump the
binutils version requirement again and might break backward compatibility,
so I guess we can transform -march= via spec functions?
I'll send a v2 patch with this approach.

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 567c23380fe..38644436f7c 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -46,7 +46,7 @@ along with GCC; see the file COPYING3.  If not see
   --with-tune is ignored if -mtune is specified.  */
#define OPTION_DEFAULT_SPECS \
  {"tune", "%{!mtune=*:-mtune=%(VALUE)}" }, \
-  {"arch", "%{!march=*:-march=%(VALUE)}" }, \
+  {"arch", "%{!march=*:%:riscv_expand_arch(%(VALUE))}" }, \
  {"abi", "%{!mabi=*:-mabi=%(VALUE)}" }, \

#ifdef IN_LIBGCC2


Thanks :)

>
> > +  /* TODO: Implied extension might use differet version.  */
>
> differet -> different
>
> Jim

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Nathan Sidwell


On 4/7/20 7:29 AM, Jonathan Wakely wrote:

On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:

The both operator new and operator delete are looked up in the same
manner.  The std does not require a 'matching pair' be found.  but it is
extremely poor form for a class to declare exactly one of operator
{new,delete}.


There are unfortunately several such example in the standard!


Pah!

I also realized one could write:

struct Derived : Base {
  void *operator new (size_t t) { ... }
  using Base::operator delete;
};

which is not such poor code, but the FE will generate calls that 
completely lose the usingness of the operator delete.


As RichardB comments, I think the FE needs to mark sets of new/delete 
calls for the middle end to safely optimize. but ...




I wonder how much benefit we will really get from trying to make this
optimisation too general.

Just eliding (or coalescing) implicit calls to ::operator new(size_t)
and ::operator delete(void*, size_t) (and the [] and align_val_t forms
of those) probably covers 99% of real cases.


I agree.  It's certainly a simpler problem and the major component of 
any win we'll get.


nathan

--
Nathan Sidwell

[PATCH][GCC][Arm]: MVE Fix immediate constraints on some vector instructions

2020-04-07 Thread Andre Vieira (lists)


Hi,

This patch fixes the immediate checks on vcvt and vqshr(u)n[bt] 
instrucitons.  It also removes the 'arm_mve_immediate_check' as the 
check was wrong and the error message is not much better than the 
constraint one, which albeit isn't great either.


Regression tested on arm-none-eabi.

Is this OK for trunk?

gcc/ChangeLog:
2020-04-07  Andre Vieira      (mve_vcvtq_n_to_f_*, mve_vcvtq_n_from_f_*, mve_vqshrnbq_n_*, 
mve_vqshrntq_n_*,
 mve_vqshrunbq_n_s*, mve_vqshruntq_n_s*, 
mve_vcvtq_m_n_from_f_*, mve_vcvtq_m_n_to_f_*,

 mve_vqshrnbq_m_n_*, mve_vqrshruntq_m_n_s*, mve_vqshrunbq_m_n_s*,
 mve_vqshruntq_m_n_s*): Fixed immediate constraints.

gcc/testsuite/ChangeLog:
2020-04-07  Andre Vieira  diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
1af9d5cf145f6d01e364a1afd7ceb3df5da86c9a..cd0a49cdb63690d794981a73e1e7e0d47f6d1987
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -32693,31 +32693,6 @@ arm_simd_check_vect_par_cnst_half_p (rtx op, 
machine_mode mode,
   return true;
 }
 
-/* To check op's immediate values matches the mode of the defined insn.  */
-bool
-arm_mve_immediate_check (rtx op, machine_mode mode, bool val)
-{
-  if (val)
-{
-  if (((GET_CODE (op) == CONST_INT) && (INTVAL (op) <= 7)
-  && (mode == E_V16QImode))
- || ((GET_CODE (op) == CONST_INT) && (INTVAL (op) <= 15)
-  && (mode == E_V8HImode))
- || ((GET_CODE (op) == CONST_INT) && (INTVAL (op) <= 31)
-  && (mode == E_V4SImode)))
-   return true;
-}
-  else
-{
-  if (((GET_CODE (op) == CONST_INT) && (INTVAL (op) <= 7)
-  && (mode == E_V8HImode))
- || ((GET_CODE (op) == CONST_INT) && (INTVAL (op) <= 15)
-  && (mode == E_V4SImode)))
-   return true;
-}
-  return false;
-}
-
 /* Can output mi_thunk for all cases except for non-zero vcall_offset
in Thumb1.  */
 static bool
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 
4a506cc3861534b4ddc30ba8f4f3c4ec28a8cc69..3c75f9ebc70d5765a59934b944955c757b6b2195
 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -401,8 +401,10 @@ (define_int_attr mode1 [(VCTP8Q "8") (VCTP16Q "16") 
(VCTP32Q "32")
(VCTP64Q "64") (VCTP8Q_M "8") (VCTP16Q_M "16")
(VCTP32Q_M "32") (VCTP64Q_M "64")])
 (define_mode_attr MVE_pred2 [(V16QI "mve_imm_8") (V8HI "mve_imm_16")
-(V4SI "mve_imm_32")])
-(define_mode_attr MVE_constraint2 [(V16QI "Rb") (V8HI "Rd") (V4SI "Rf")])
+(V4SI "mve_imm_32")
+(V8HF "mve_imm_16") (V4SF "mve_imm_32")])
+(define_mode_attr MVE_constraint2 [(V16QI "Rb") (V8HI "Rd") (V4SI "Rf")
+   (V8HF "Rd") (V4SF "Rf")])
 (define_mode_attr MVE_LANES [(V16QI "16") (V8HI "8") (V4SI "4")])
 (define_mode_attr MVE_constraint [ (V16QI "Ra") (V8HI "Rc") (V4SI "Re")])
 (define_mode_attr MVE_pred [ (V16QI "mve_imm_7") (V8HI "mve_imm_15")
@@ -1330,7 +1332,7 @@ (define_insn "mve_vcvtq_n_to_f_"
   [
(set (match_operand:MVE_0 0 "s_register_operand" "=w")
(unspec:MVE_0 [(match_operand: 1 "s_register_operand" "w")
-  (match_operand:SI 2 "mve_imm_16" "Rd")]
+  (match_operand:SI 2 "" "")]
 VCVTQ_N_TO_F))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
@@ -1389,7 +1391,7 @@ (define_insn "mve_vcvtq_n_from_f_"
   [
(set (match_operand:MVE_5 0 "s_register_operand" "=w")
(unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w")
-  (match_operand:SI 2 "mve_imm_16" "Rd")]
+  (match_operand:SI 2 "" "")]
 VCVTQ_N_FROM_F))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
@@ -5484,7 +5486,7 @@ (define_insn "mve_vqshrnbq_n_"
(set (match_operand: 0 "s_register_operand" "=w")
(unspec: [(match_operand: 1 
"s_register_operand" "0")
   (match_operand:MVE_5 2 "s_register_operand" "w")
-  (match_operand:SI 3 "" "")]
+  (match_operand:SI 3 "" "")]
 VQSHRNBQ_N))
   ]
   "TARGET_HAVE_MVE"
@@ -5500,7 +5502,7 @@ (define_insn "mve_vqshrntq_n_"
(set (match_operand: 0 "s_register_operand" "=w")
(unspec: [(match_operand: 1 
"s_register_operand" "0")
   (match_operand:MVE_5 2 "s_register_operand" "w")
-  (match_operand:SI 3 "mve_imm_8" "Rb")]
+  (match_operand:SI 3 "" "")]
 VQSHRNTQ_N))
   ]
   "TARGET_HAVE_MVE"
@@ -5516,7 +5518,7 @@ (define_insn "mve_vqshrunbq_n_s"
(set (match_operand: 0 "s_register_operand" "=w")
(unspec: [(match_operand: 1 
"s_register_operand" "0")
   (match_operand:MVE_5 2 "s_register_operand" "w")
-  (match_operand:SI 3 "immediate_operand" "i")]
+  (match_operand:SI 3 "" "")]
 VQSHRUNBQ_N_S))
   ]
   "TARGET_HAVE_MVE"
@@ -5532,7 +5534,

[PATCH][GCC][Arm]: MVE: Fix vec extracts to memory

2020-04-07 Thread Andre Vieira (lists)


Hi,

This patch fixes vec extracts to memory that can arise from code as seen 
in the testcase added. The patch fixes this by allowing mem operands in 
the set of mve_vec_extract patterns, which given the only '=r' 
constraint will lead to the scalar value being written to a register and 
then stored in memory using scalar store pattern.


Regression tested on arm-none-eabi.

Is this OK for trunk?

gcc/ChangeLog:
2020-04-07  Andre Vieira  

    * config/arm/mve.md (mve_vec_extract*): Allow memory operands 
in set.


gcc/testsuite/ChangeLog:
2020-04-07  Andre Vieira  

    * gcc.target/arm/mve/intrinsics/mve_vec_extracts_from_memory.c: 
New test.


diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 
3c75f9ebc70d5765a59934b944955c757b6b2195..c49c14c4240838ce086f424f58726e2e94cf190e
 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -10993,7 +10993,7 @@ (define_insn "mve_vld4q"
 ;; [vgetq_lane_u, vgetq_lane_s, vgetq_lane_f])
 ;;
 (define_insn "mve_vec_extract"
- [(set (match_operand: 0 "s_register_operand" "=r")
+ [(set (match_operand: 0 "nonimmediate_operand" "=r")
(vec_select:
 (match_operand:MVE_VLD_ST 1 "s_register_operand" "w")
 (parallel [(match_operand:SI 2 "immediate_operand" "i")])))]
@@ -11011,7 +11011,7 @@ (define_insn "mve_vec_extract"
  [(set_attr "type" "mve_move")])
 
 (define_insn "mve_vec_extractv2didi"
- [(set (match_operand:DI 0 "s_register_operand" "=r")
+ [(set (match_operand:DI 0 "nonimmediate_operand" "=r")
(vec_select:DI
 (match_operand:V2DI 1 "s_register_operand" "w")
 (parallel [(match_operand:SI 2 "immediate_operand" "i")])))]
@@ -11024,7 +11024,7 @@ (define_insn "mve_vec_extractv2didi"
   if (elt == 0)
return "vmov\t%Q0, %R0, %e1";
   else
-   return "vmov\t%J0, %K0, %f1";
+   return "vmov\t%Q0, %R0, %f1";
 }
  [(set_attr "type" "mve_move")])
 
diff --git 
a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vec_extracts_from_memory.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vec_extracts_from_memory.c
new file mode 100644
index 
..12f2f2d38d3c2e189a9c06f21fc63e2c23e2e721
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vec_extracts_from_memory.c
@@ -0,0 +1,40 @@
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O3" } */
+
+#include "arm_mve.h"
+
+uint8x16_t *vu8;
+int8x16_t *vs8;
+uint16x8_t *vu16;
+int16x8_t *vs16;
+uint32x4_t *vu32;
+int32x4_t *vs32;
+uint64x2_t *vu64;
+int64x2_t *vs64;
+float16x8_t *vf16;
+float32x4_t *vf32;
+uint8_t u8;
+uint16_t u16;
+uint32_t u32;
+uint64_t u64;
+int8_t s8;
+int16_t s16;
+int32_t s32;
+int64_t s64;
+float16_t f16;
+float32_t f32;
+
+void foo (void)
+{
+  u8 = vgetq_lane (*vu8, 1);
+  u16 = vgetq_lane (*vu16, 1);
+  u32 = vgetq_lane (*vu32, 1);
+  u64 = vgetq_lane (*vu64, 1);
+  s8 = vgetq_lane (*vs8, 1);
+  s16 = vgetq_lane (*vs16, 1);
+  s32 = vgetq_lane (*vs32, 1);
+  s64 = vgetq_lane (*vs64, 1);
+  f16 = vgetq_lane (*vf16, 1);
+  f32 = vgetq_lane (*vf32, 1);
+}

[PATCH][GCC][Arm]MVE: Fix -Wall testisms

2020-04-07 Thread Andre Vieira (lists)


Hi,

This patch fixes some testisms I found when testing using -Wall/-Werror.

Is this OK for trunk?

gcc/testsuite/ChangeLog:
2020-04-07  Andre Vieira  

    * gcc.target/arm/mve/intrinsics/vuninitializedq_float.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vuninitializedq_float1.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vuninitializedq_int.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c: Likewise.

diff --git 
a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float.c
index 
3b9c0a740976854e7189ab23a6a8b2764c9b888a..52bad05b6219621ada414dc74ab2deebdd1c93e3
 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float.c
@@ -4,11 +4,12 @@
 
 #include "arm_mve.h"
 
+float16x8_t fa;
+float32x4_t fb;
+
 void
 foo ()
 {
-  float16x8_t fa;
-  float32x4_t fb;
   fa = vuninitializedq_f16 ();
   fb = vuninitializedq_f32 ();
 }
diff --git 
a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float1.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float1.c
index 
0c94608af41fc30c65b959759704033a76bb879f..c6724a52074c6ce0361fdba66c4add831e8c13db
 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float1.c
@@ -4,13 +4,14 @@
 
 #include "arm_mve.h"
 
+float16x8_t fa, faa;
+float32x4_t fb, fbb;
+
 void
 foo ()
 {
-  float16x8_t fa, faa;
-  float32x4_t fb, fbb;
   fa = vuninitializedq (faa);
   fb = vuninitializedq (fbb);
 }
 
-/* { dg-final { scan-assembler-times "vstrb.8" } */
+/* { dg-final { scan-assembler-times "vstrb.8" 6 } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c
index 
9ae17e240083a66e7c20c16ae06b99463c213bf9..13a0109a9b5380cd83f48154df231081ddb8f08e
 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c
@@ -3,18 +3,18 @@
 /* { dg-additional-options "-O0" } */
 
 #include "arm_mve.h"
+int8x16_t a;
+int16x8_t b;
+int32x4_t c;
+int64x2_t d;
+uint8x16_t ua;
+uint16x8_t ub;
+uint32x4_t uc;
+uint64x2_t ud;
 
 void
 foo ()
 {
-  int8x16_t a;
-  int16x8_t b;
-  int32x4_t c;
-  int64x2_t d;
-  uint8x16_t ua;
-  uint16x8_t ub;
-  uint32x4_t uc;
-  uint64x2_t ud;
   a = vuninitializedq_s8 ();
   b = vuninitializedq_s16 ();
   c = vuninitializedq_s32 ();
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c
index 
e8c1f1019c95af6d871cda9c9142c346ff3b49ae..a321398709e65ee7daadfab9c6089116baccde83
 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c
@@ -4,17 +4,18 @@
 
 #include "arm_mve.h"
 
+int8x16_t a, aa;
+int16x8_t b, bb;
+int32x4_t c, cc;
+int64x2_t d, dd;
+uint8x16_t ua, uaa;
+uint16x8_t ub, ubb;
+uint32x4_t uc, ucc;
+uint64x2_t ud, udd;
+
 void
 foo ()
 {
-  int8x16_t a, aa;
-  int16x8_t b, bb;
-  int32x4_t c, cc;
-  int64x2_t d, dd;
-  uint8x16_t ua, uaa;
-  uint16x8_t ub, ubb;
-  uint32x4_t uc, ucc;
-  uint64x2_t ud, udd;
   a = vuninitializedq (aa);
   b = vuninitializedq (bb);
   c = vuninitializedq (cc);

Re: [PATCH] S/390: Fix layout of struct sigaction_t

2020-04-07 Thread Andreas Krebbel via Gcc-patches

On 07.04.20 09:59, Iain Buclaw wrote:
> On 01/04/2020 18:20, Stefan Liebler wrote:
>> On 4/1/20 12:54 PM, Iain Buclaw wrote:
>>> On 01/04/2020 08:28, Stefan Liebler wrote:
 ping

>>>
>>> Thanks, I'll send the patch upstream, as it's the same there.
>>>
>>> Looks OK to me.
>>>
>>> Regards
>>> Iain.
>>>
>>
>> Thanks for committing the patch upstream
>>
> 
> Hi Stefan,
> 
> They have been merged in: https://github.com/dlang/druntime/pull/3020
> 
> Do you want to commit this yourself?

I've committed the patch for Stefan. Thanks!

Andreas

[PATCH][GCC][Arm]: MVE: Fixes for pointers used in intrinsics for c++

2020-04-07 Thread Andre Vieira (lists)


Hi,

This patch fixes the passing of some pointers to builtins that expect 
slightly different types of pointers.  In C this didn't prove an issue, 
but when compiling for C++ gcc complains.


Regression tested on arm-none-eabi.

Is this OK for trunk?

2020-04-07  Andre Vieira  

    * config/arm/arm_mve.h: Cast some pointers to expected types.

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
44cb0d4a92f7f64d08c17722944d20bd6ea7048a..49c7fb95f17347d283c4df34a6875d686a3e3f09
 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -12897,56 +12897,56 @@ __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vstrdq_scatter_offset_p_s64 (int64_t * __base, uint64x2_t __offset, 
int64x2_t __value, mve_pred16_t __p)
 {
-  __builtin_mve_vstrdq_scatter_offset_p_sv2di (__base, __offset, __value, __p);
+  __builtin_mve_vstrdq_scatter_offset_p_sv2di ((__builtin_neon_di *) __base, 
__offset, __value, __p);
 }
 
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vstrdq_scatter_offset_p_u64 (uint64_t * __base, uint64x2_t __offset, 
uint64x2_t __value, mve_pred16_t __p)
 {
-  __builtin_mve_vstrdq_scatter_offset_p_uv2di (__base, __offset, __value, __p);
+  __builtin_mve_vstrdq_scatter_offset_p_uv2di ((__builtin_neon_di *) __base, 
__offset, __value, __p);
 }
 
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vstrdq_scatter_offset_s64 (int64_t * __base, uint64x2_t __offset, 
int64x2_t __value)
 {
-  __builtin_mve_vstrdq_scatter_offset_sv2di (__base, __offset, __value);
+  __builtin_mve_vstrdq_scatter_offset_sv2di ((__builtin_neon_di *) __base, 
__offset, __value);
 }
 
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vstrdq_scatter_offset_u64 (uint64_t * __base, uint64x2_t __offset, 
uint64x2_t __value)
 {
-  __builtin_mve_vstrdq_scatter_offset_uv2di (__base, __offset, __value);
+  __builtin_mve_vstrdq_scatter_offset_uv2di ((__builtin_neon_di *) __base, 
__offset, __value);
 }
 
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vstrdq_scatter_shifted_offset_p_s64 (int64_t * __base, uint64x2_t 
__offset, int64x2_t __value, mve_pred16_t __p)
 {
-  __builtin_mve_vstrdq_scatter_shifted_offset_p_sv2di (__base, __offset, 
__value, __p);
+  __builtin_mve_vstrdq_scatter_shifted_offset_p_sv2di ((__builtin_neon_di *) 
__base, __offset, __value, __p);
 }
 
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vstrdq_scatter_shifted_offset_p_u64 (uint64_t * __base, uint64x2_t 
__offset, uint64x2_t __value, mve_pred16_t __p)
 {
-  __builtin_mve_vstrdq_scatter_shifted_offset_p_uv2di (__base, __offset, 
__value, __p);
+  __builtin_mve_vstrdq_scatter_shifted_offset_p_uv2di ((__builtin_neon_di *) 
__base, __offset, __value, __p);
 }
 
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vstrdq_scatter_shifted_offset_s64 (int64_t * __base, uint64x2_t 
__offset, int64x2_t __value)
 {
-  __builtin_mve_vstrdq_scatter_shifted_offset_sv2di (__base, __offset, 
__value);
+  __builtin_mve_vstrdq_scatter_shifted_offset_sv2di ((__builtin_neon_di *) 
__base, __offset, __value);
 }
 
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vstrdq_scatter_shifted_offset_u64 (uint64_t * __base, uint64x2_t 
__offset, uint64x2_t __value)
 {
-  __builtin_mve_vstrdq_scatter_shifted_offset_uv2di (__base, __offset, 
__value);
+  __builtin_mve_vstrdq_scatter_shifted_offset_uv2di ((__builtin_neon_di *) 
__base, __offset, __value);
 }
 
 __extension__ extern __inline void
@@ -18968,14 +18968,14 @@ __extension__ extern __inline float16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrhq_gather_shifted_offset_f16 (float16_t const * __base, uint16x8_t 
__offset)
 {
-  return __builtin_mve_vldrhq_gather_shifted_offset_fv8hf (__base, __offset);
+  return __builtin_mve_vldrhq_gather_shifted_offset_fv8hf ((__builtin_neon_hi 
*) __base, __offset);
 }
 
 __extension__ extern __inline float16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrhq_gather_shifted_offset_z_f16 (float16_t const * __base, uint16x8_t 
__offset, mve_pred16_t __p)
 {
-  return __builtin_mve_vldrhq_gather_shifted_offset_z_fv8hf (__base, __offset, 
__p);
+  return __builtin_mve_vldrhq_gather_shifted_offset_z_fv8hf 
((__builtin_neon_hi *) __base, __offset, __p);
 }
 
 __extension__ extern __inline float32x4_t
@@ -19010,84 +19010,84 @@ __extension__ extern __inline float32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrwq_gather_shifted_offset_f32 (float32_t const * __base, uint32x4_t 
_

[PATCH][GCC][Arm]: MVE: Add C++ polymorphism and fix some more issues

2020-04-07 Thread Andre Vieira (lists)


Hi,

This patch adds C++ polymorphism for the MVE intrinsics, by using the 
native C++ polymorphic functions when C++ is used.


It also moves the PRESERVE name macro definitions to the right place so 
that the variants without the '__arm_' prefix are not available if we 
define the PRESERVE NAMESPACE macro.


This patch further fixes two testisms that were brought to light by C++ 
testing added in this patch.


Regression tested on arm-none-eabi.

Is this OK for trunk?

gcc/ChangeLog:
2020-04-07  Andre Vieira  

    * config/arm/arm_mve.h: Add C++ polymorphism and fix
    preserve MACROs.

gcc/testsuite/ChangeLog:
2020-04-07  Andre Vieira  

    * g++.target/arm/mve.exp: New.
    * gcc.target/arm/mve/intrinsics/vcmpneq_n_f16: Fix testism.
    * gcc.target/arm/mve/intrinsics/vcmpneq_n_f32: Likewise.

<>

RE: [PATCH][GCC][Arm]: MVE Fix immediate constraints on some vector instructions

2020-04-07 Thread Kyrylo Tkachov



> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 07 April 2020 15:12
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH][GCC][Arm]: MVE Fix immediate constraints on some vector
> instructions
> 
> Hi,
> 
> This patch fixes the immediate checks on vcvt and vqshr(u)n[bt]
> instrucitons.  It also removes the 'arm_mve_immediate_check' as the
> check was wrong and the error message is not much better than the
> constraint one, which albeit isn't great either.
> 
> Regression tested on arm-none-eabi.
> 
> Is this OK for trunk?

Ok.
Thanks,
Kyrill

> 
> gcc/ChangeLog:
> 2020-04-07  Andre Vieira   
>      * config/arm/arm.c (arm_mve_immediate_check): Removed.
>      * config/arm/mve.md (MVE_pred2, MVE_constraint2): Added FP types.
>      (mve_vcvtq_n_to_f_*, mve_vcvtq_n_from_f_*, mve_vqshrnbq_n_*,
> mve_vqshrntq_n_*,
>   mve_vqshrunbq_n_s*, mve_vqshruntq_n_s*,
> mve_vcvtq_m_n_from_f_*, mve_vcvtq_m_n_to_f_*,
>   mve_vqshrnbq_m_n_*, mve_vqrshruntq_m_n_s*,
> mve_vqshrunbq_m_n_s*,
>   mve_vqshruntq_m_n_s*): Fixed immediate constraints.
> 
> gcc/testsuite/ChangeLog:
> 2020-04-07  Andre Vieira   
>      * gcc.target/arm/mve/intrinsics/mve_immediates_1_n.c: New test.

Re: [PATCH] S/390: Fix PR91628

2020-04-07 Thread Stefan Liebler via Gcc-patches


On 4/1/20 6:20 PM, Stefan Liebler wrote:

On 4/1/20 12:50 PM, Iain Buclaw wrote:



On 01/04/2020 08:28, Stefan Liebler wrote:

ping



Hi Stefan,

As I've already said, I think that the name should be 
__ibmz_get_tls_offset to make clear that it is an internal function.


Other than that, looks good to me.

Iain.



Hi Iain,

Sorry. I've missed your comment in the bugzilla.
I've updated the name to __ibmz_get_this_offset.
Nothing else is changed in the attached patch.

Please commit the patch upstream.
Do you also close the bugzilla as soon as committed?

Regarding the mentioned musl-patch in your bugzilla comment:
Yes, the diff looks like not conflicting.

Thanks,
Stefan


Hi Iain,

Andreas has just committed the other patch "S/390: Fix layout of struct 
sigaction_t" to gcc after your pull-request was merged 
(https://github.com/dlang/druntime/pull/3020).


To me it seems that this patch is not simply pull-request-able to 
https://github.com/dlang/druntime.
As you've already mentioned "Other than that, looks good to me.", is 
this gcc patch okay to commit from your side? Then Andreas can also 
commit it and we can close the bugzilla.


Bye,
Stefan

RE: [PATCH][GCC][Arm]: MVE: Fix vec extracts to memory

2020-04-07 Thread Kyrylo Tkachov



> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 07 April 2020 15:14
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH][GCC][Arm]: MVE: Fix vec extracts to memory
> 
> Hi,
> 
> This patch fixes vec extracts to memory that can arise from code as seen
> in the testcase added. The patch fixes this by allowing mem operands in
> the set of mve_vec_extract patterns, which given the only '=r'
> constraint will lead to the scalar value being written to a register and
> then stored in memory using scalar store pattern.
> 
> Regression tested on arm-none-eabi.
> 
> Is this OK for trunk?

Ok.
Thanks,
Kyrill

> 
> gcc/ChangeLog:
> 2020-04-07  Andre Vieira  
> 
>      * config/arm/mve.md (mve_vec_extract*): Allow memory operands
> in set.
> 
> gcc/testsuite/ChangeLog:
> 2020-04-07  Andre Vieira  
> 
>      * gcc.target/arm/mve/intrinsics/mve_vec_extracts_from_memory.c:
> New test.

RE: [PATCH][GCC][Arm]: MVE: make sure we only use the Arm namespace variant of vuninitializedq

2020-04-07 Thread Kyrylo Tkachov



> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 07 April 2020 15:15
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH][GCC][Arm]: MVE: make sure we only use the Arm
> namespace variant of vuninitializedq
> 
> Hi,
> 
> This patch replaces all uses of 'vuninitializedq_*' by the same function
> but under the __arm_ namespace. In case we define the PRESERVE MACRO
> the
> variant without the '__arm_' prefix will not be available.
> 
> Regression tested arm-none-eabi.
> 
> Is this OK for trunk?

Ok.
Thanks,
Kyrill

> 
> 2020-04-07  Andre Vieira  
> 
>      * config/arm/arm_mve.h: Replace all uses of vuninitializedq_* with
>      the same with '__arm_' prefix.

RE: [PATCH][GCC][Arm]MVE: Fix -Wall testisms

2020-04-07 Thread Kyrylo Tkachov



> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 07 April 2020 15:16
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH][GCC][Arm]MVE: Fix -Wall testisms
> 
> Hi,
> 
> This patch fixes some testisms I found when testing using -Wall/-Werror.
> 
> Is this OK for trunk?

Ok, sorry for missing those in review.
Kyrill

> 
> gcc/testsuite/ChangeLog:
> 2020-04-07  Andre Vieira  
> 
>      * gcc.target/arm/mve/intrinsics/vuninitializedq_float.c: Likewise.
>      * gcc.target/arm/mve/intrinsics/vuninitializedq_float1.c: Likewise.
>      * gcc.target/arm/mve/intrinsics/vuninitializedq_int.c: Likewise.
>      * gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c: Likewise.

RE: [PATCH][GCC][Arm]: MVE: Fixes for pointers used in intrinsics for c++

2020-04-07 Thread Kyrylo Tkachov



> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 07 April 2020 15:18
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH][GCC][Arm]: MVE: Fixes for pointers used in intrinsics for c++
> 
> Hi,
> 
> This patch fixes the passing of some pointers to builtins that expect
> slightly different types of pointers.  In C this didn't prove an issue,
> but when compiling for C++ gcc complains.
> 
> Regression tested on arm-none-eabi.
> 
> Is this OK for trunk?

Ok.
Thanks,
Kyrill

> 
> 2020-04-07  Andre Vieira  
> 
>      * config/arm/arm_mve.h: Cast some pointers to expected types.

RE: [PATCH][GCC][Arm]: MVE: Add C++ polymorphism and fix some more issues

2020-04-07 Thread Kyrylo Tkachov



> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 07 April 2020 15:19
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH][GCC][Arm]: MVE: Add C++ polymorphism and fix some
> more issues
> 
> Hi,
> 
> This patch adds C++ polymorphism for the MVE intrinsics, by using the
> native C++ polymorphic functions when C++ is used.
> 
> It also moves the PRESERVE name macro definitions to the right place so
> that the variants without the '__arm_' prefix are not available if we
> define the PRESERVE NAMESPACE macro.
> 
> This patch further fixes two testisms that were brought to light by C++
> testing added in this patch.
> 
> Regression tested on arm-none-eabi.
> 
> Is this OK for trunk?

Ok.
Thanks,
Kyrill

> 
> gcc/ChangeLog:
> 2020-04-07  Andre Vieira  
> 
>      * config/arm/arm_mve.h: Add C++ polymorphism and fix
>      preserve MACROs.
> 
> gcc/testsuite/ChangeLog:
> 2020-04-07  Andre Vieira  
> 
>      * g++.target/arm/mve.exp: New.
>      * gcc.target/arm/mve/intrinsics/vcmpneq_n_f16: Fix testism.
>      * gcc.target/arm/mve/intrinsics/vcmpneq_n_f32: Likewise.

[PATCH] Allow new/delete operator deletion only for replaceable.

2020-04-07 Thread Martin Liška


Hi.

The patch allows DCE to remove only replaceable operators new and delete.
That's achieved by proper mark up of all these operators in C++ FE.
The patch also brings all tests we've collected so far for the PR.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
>From 815495bbd52abe0da173c8f88d1de8dcb6095ba7 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 7 Apr 2020 16:23:27 +0200
Subject: [PATCH] Allow new/delete operator deletion only for replaceable.

gcc/ChangeLog:

2020-04-07  Martin Liska  

	PR c++/94314
	* gimple.c (gimple_call_operator_delete_p): Rename to...
	(gimple_call_replaceable_operator_delete_p): ... this.
	Use DECL_IS_REPLACEABLE_OPERATOR_DELETE_P.
	* gimple.h (gimple_call_operator_delete_p): Rename to ...
	(gimple_call_replaceable_operator_delete_p): ... this.
	* tree-core.h (tree_function_decl): Add replaceable_operator
	flag.
	* tree-ssa-dce.c (mark_all_reaching_defs_necessary_1):
	Use DECL_IS_REPLACEABLE_OPERATOR_DELETE_P.
	(propagate_necessity): Use gimple_call_replaceable_operator_delete_p.
	(eliminate_unnecessary_stmts): Likewise.
	* tree-streamer-in.c (unpack_ts_function_decl_value_fields):
	Pack DECL_IS_REPLACEABLE_OPERATOR.
	* tree-streamer-out.c (pack_ts_function_decl_value_fields):
	Unpack the field here.
	* tree.h (DECL_IS_REPLACEABLE_OPERATOR): New.
	(DECL_IS_REPLACEABLE_OPERATOR_NEW_P): New.
	(DECL_IS_REPLACEABLE_OPERATOR_DELETE_P): New.

gcc/cp/ChangeLog:

2020-04-07  Martin Liska  

	PR c++/94314
	* decl.c (duplicate_decls): Duplicate also DECL_IS_REPLACEABLE_OPERATOR.
	(cxx_init_decl_processing): Mark replaceable all implicitly defined
	operators.

gcc/lto/ChangeLog:

2020-04-07  Martin Liska  

	PR c++/94314
	* lto-common.c (compare_tree_sccs_1): Compare also
	DECL_IS_REPLACEABLE_OPERATOR.

gcc/testsuite/ChangeLog:

2020-04-07  Martin Liska  

	PR c++/94314
	* g++.dg/pr94314-2.C: New test.
	* g++.dg/pr94314-3.C: New test.
	* g++.dg/pr94314.C: New test.
---
 gcc/cp/decl.c| 14 ++
 gcc/gimple.c |  6 +--
 gcc/gimple.h |  2 +-
 gcc/lto/lto-common.c |  1 +
 gcc/testsuite/g++.dg/pr94314-2.C | 26 ++
 gcc/testsuite/g++.dg/pr94314-3.C | 55 +
 gcc/testsuite/g++.dg/pr94314.C   | 85 
 gcc/tree-core.h  |  3 +-
 gcc/tree-ssa-dce.c   |  8 +--
 gcc/tree-streamer-in.c   |  1 +
 gcc/tree-streamer-out.c  |  1 +
 gcc/tree.h   | 10 +++-
 12 files changed, 202 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr94314-2.C
 create mode 100644 gcc/testsuite/g++.dg/pr94314-3.C
 create mode 100644 gcc/testsuite/g++.dg/pr94314.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index a127734af69..a731702bdd2 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -2368,6 +2368,8 @@ duplicate_decls (tree newdecl, tree olddecl, bool newdecl_is_friend)
 	DECL_SET_IS_OPERATOR_NEW (newdecl, true);
 	  DECL_LOOPING_CONST_OR_PURE_P (newdecl)
 	|= DECL_LOOPING_CONST_OR_PURE_P (olddecl);
+	  DECL_IS_REPLACEABLE_OPERATOR (newdecl)
+	|= DECL_IS_REPLACEABLE_OPERATOR (olddecl);
 
 	  if (merge_attr)
 	merge_attribute_bits (newdecl, olddecl);
@@ -4438,13 +4440,17 @@ cxx_init_decl_processing (void)
 tree opnew = push_cp_library_fn (NEW_EXPR, newtype, 0);
 DECL_IS_MALLOC (opnew) = 1;
 DECL_SET_IS_OPERATOR_NEW (opnew, true);
+DECL_IS_REPLACEABLE_OPERATOR (opnew) = 1;
 opnew = push_cp_library_fn (VEC_NEW_EXPR, newtype, 0);
 DECL_IS_MALLOC (opnew) = 1;
 DECL_SET_IS_OPERATOR_NEW (opnew, true);
+DECL_IS_REPLACEABLE_OPERATOR (opnew) = 1;
 tree opdel = push_cp_library_fn (DELETE_EXPR, deltype, ECF_NOTHROW);
 DECL_SET_IS_OPERATOR_DELETE (opdel, true);
+DECL_IS_REPLACEABLE_OPERATOR (opdel) = 1;
 opdel = push_cp_library_fn (VEC_DELETE_EXPR, deltype, ECF_NOTHROW);
 DECL_SET_IS_OPERATOR_DELETE (opdel, true);
+DECL_IS_REPLACEABLE_OPERATOR (opdel) = 1;
 if (flag_sized_deallocation)
   {
 	/* Also push the sized deallocation variants:
@@ -4458,8 +4464,10 @@ cxx_init_decl_processing (void)
 	deltype = build_exception_variant (deltype, empty_except_spec);
 	opdel = push_cp_library_fn (DELETE_EXPR, deltype, ECF_NOTHROW);
 	DECL_SET_IS_OPERATOR_DELETE (opdel, true);
+	DECL_IS_REPLACEABLE_OPERATOR (opdel) = 1;
 	opdel = push_cp_library_fn (VEC_DELETE_EXPR, deltype, ECF_NOTHROW);
 	DECL_SET_IS_OPERATOR_DELETE (opdel, true);
+	DECL_IS_REPLACEABLE_OPERATOR (opdel) = 1;
   }
 
 if (aligned_new_threshold)
@@ -4478,9 +4486,11 @@ cxx_init_decl_processing (void)
 	opnew = push_cp_library_fn (NEW_EXPR, newtype, 0);
 	DECL_IS_MALLOC (opnew) = 1;
 	DECL_SET_IS_OPERATOR_NEW (opnew, true);
+	DECL_IS_REPLACEABLE_OPERATOR (opnew) = 1;
 	opnew = push_cp_library_fn (VEC_NEW_EXPR, newtype, 0);
 	DECL_IS_MALLOC (opnew) = 1;
 	DECL_SET_IS_OPERATOR_NEW (opnew, true);
+	DECL_IS_REPLACEABLE_OPE

Re: [PATCH] Check DECL_CONTEXT of new/delete operators.

2020-04-07 Thread Jonathan Wakely via Gcc-patches

On Tue, 7 Apr 2020 at 12:57, Richard Biener  wrote:
>
> On Tue, Apr 7, 2020 at 1:46 PM Jonathan Wakely  wrote:
> >
> > On Tue, 7 Apr 2020 at 12:40, Richard Biener  
> > wrote:
> > >
> > > On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely  
> > > wrote:
> > > >
> > > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:
> > > > > The both operator new and operator delete are looked up in the same
> > > > > manner.  The std does not require a 'matching pair' be found.  but it 
> > > > > is
> > > > > extremely poor form for a class to declare exactly one of operator
> > > > > {new,delete}.
> > > >
> > > > There are unfortunately several such example in the standard!
> > > >
> > > > I wonder how much benefit we will really get from trying to make this
> > > > optimisation too general.
> > > >
> > > > Just eliding (or coalescing) implicit calls to ::operator new(size_t)
> > > > and ::operator delete(void*, size_t) (and the [] and align_val_t forms
> > > > of those) probably covers 99% of real cases.
> > >
> > > IIRC the only reason to have the optimization was to fully elide
> > > STL containers when possible.  That brings in allocators and
> > > thus non ::new/::delete allocations.
> >
> > But the vast majority of containers are used with std::allocator, and
> > we control that.
> >
> > Wouldn't it be simpler to add __builtin_operator_new and
> > __builtin_operator_delete like clang has, then make std::allocator use
> > those, and limit the optimizations to uses of those built-ins?
>
> Sure, that's a viable implementation strathegy.  Another might be
>
> void delete (void *) __attribute__((matching_new(somewhere::new)));
>
> and thus allow the user to relate a new/delete pair (here the FE would
> do lookup for 'new' and record for example the mangled assembler name).

Does that attribute tell us anything useful?

Given a pointer obtained from new and passed to delete, can't we
assume they are a matching pair? If not, the behaviour would be
undefined anyway.

> That is, we usually try to design a mechanism that's more broadly usable.
> But yes, we match malloc/free so matching ::new/::delete by aliasing them
> to __builtin_operator_new and __builtin_operator_delete is fair enough.

Would it make sense to annotate the actual calls to the
allocation/deallocation functions, instead of the declarations of
those functions?

According to Richard Smith, we don't want to optimise away 'operator
delete(operator new(1), 1)' because that's an explicit call, and the
user might have replaced those functions and be relying on the side
effects. But we can optimise away 'delete new char' and we can
optimise away 
std::allocator().deallocate(std::allocator().allocate(1),
1). So what matters is not whether the functions being called match up
(they have to anyway) or which functions are being called. What
matters is whether they are called implicitly (by a new-expression or
by std::allocator).

So when the compiler expands 'new T' into a call to operator new
followed by constructing a T (plus exception handling) the call to
operator new would be marked as optimisable by the FE, and likewise
when the compiler expands 'delete p' into a destructor followed by
calling operator delete, the call to operator delete would be marked
as optimisable. If a pointer is allocated by a call marked optimisable
and deallocated by a call marked optimisable, it can be optimised away
(or coalesced with another optimisation).

Then for std::allocator we just want to be able to mark the explicit
calls to operator new and operator delete as "optimisable", as the FE
does for the implicit calls it generates. So if we want a general
purpose utility, it would be an attribute to mark /calls/ as
optimisable, not functions.

Adding __builtin_operator_new would just be a different syntax for
"call operator new but mark it optimisable".

N.B. I am not a compiler dev and might be talking nonsense :-)

[committed] coroutines: ensure placeholder var is properly declared.

2020-04-07 Thread Iain Sandoe

Hi,

An omission spotted while working on open PRs.
tested on x86_64-linux/darwin, applied to master as obvious.
thanks
Iain

In cases that we need to extended the lifetime of a temporary captured
by reference, we make a replacement var for the temporary.  This will
be then used to define a coroutine frame entry (so that the var created
is elided by a later phase).  However, we should ensure that the var
is correctly declared anyway.

gcc/cp/ChangeLog:

2020-04-07  Iain Sandoe  

* coroutines.cc (maybe_promote_captured_temps): Ensure that
reference capture placeholder vars are properly declared.

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 38a23a91f8b..983fa650b55 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -2832,7 +2832,9 @@ maybe_promote_captured_temps (tree *stmt, void *d)
sloc = DECL_SOURCE_LOCATION (orig_temp);
  DECL_SOURCE_LOCATION (newvar) = sloc;
  DECL_CHAIN (newvar) = varlist;
- varlist = newvar;
+ varlist = newvar; /* Chain it onto the list for the bind expr.  */
+ /* Declare and initialze it in the new bind scope.  */
+ add_decl_expr (newvar);
  tree stmt
= build2_loc (sloc, INIT_EXPR, var_type, newvar, to_replace);
  stmt = coro_build_cvt_void_expr_stmt (stmt, sloc);

Re: Fix an ICE found in PR93686

2020-04-07 Thread Fritz Reese via Gcc-patches

On Mon, Apr 6, 2020 at 5:47 PM Steve Kargl
 wrote:
[...]
> BTW, if you haven't committed the degree trig functions,
> then I think you should to get the fixes in for 10.1.

Roger that.

---
Fritz

Re: [PING PATCH coroutines] Do not strip cleanup_point when promote temporaries out of current stmt

2020-04-07 Thread Iain Sandoe

Bin.Cheng  wrote:

> On Mon, Mar 16, 2020 at 5:34 PM Bin.Cheng  wrote:
>> On Wed, Mar 11, 2020 at 5:07 PM Iain Sandoe  wrote:
>>> Bin.Cheng  wrote:
>>> 
 On Thu, Mar 5, 2020 at 10:18 PM Iain Sandoe  wrote:

>>> If it helps, I could push a branch to users/iains/ on the FSF git server 
>>> with this sequence.
>> Sorry for being slow replying.  This is weird, were you testing against 
>> trunk?

1/ @Jun Ma

  - I think your observation is correct, we should enusre that cleanups are 
applied where needed
   to any temps that remain after we’ve those captured by reference.

  However, I would prefer that we do this as required, rather than assuming it 
will be needed so
  I have an updated patch below.

2/ @ Bin / Jun Ma

  - The problem is that the testcase seems to be fragile, perhaps it was a 
c-reduced one?

So… I do not propose to add this test-case at this point (perhaps we could 
construct one that actually
tests the required behaviour - that cleanups are still  run correctly for temps 
that are not promoted by
capture)?

Anyway to avoid further delay, I think we should apply the patch below (I have 
other fixes on top of this
for open PRs)

OK for master?
thanks
Iain

coroutines: Add cleanups, where required, to statements with captured 
references.

When we promote captured temporaries to local variables, we also
remove their initializers from the relevant call expression.  This
means that we should recompute the need for a cleanup expression
once the set of temporaries that remains becomes known.

gcc/cp/ChangeLog:

2020-04-07  Iain Sandoe  
Jun Ma  

* coroutines.cc (maybe_promote_captured_temps): Add a
cleanup expression, if needed, to any call from which
we promoted temporaries captured by reference.

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 983fa650b55..936be06c336 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -2798,11 +2798,13 @@ maybe_promote_captured_temps (tree *stmt, void *d)
   location_t sloc = EXPR_LOCATION (*stmt);
   tree aw_bind
= build3_loc (sloc, BIND_EXPR, void_type_node, NULL, NULL, NULL);
-  tree aw_statement_current;
-  if (TREE_CODE (*stmt) == CLEANUP_POINT_EXPR)
-   aw_statement_current = TREE_OPERAND (*stmt, 0);
-  else
-   aw_statement_current = *stmt;
+
+  /* Any cleanup point expression might no longer be necessary, since we
+are removing one or more temporaries.  */
+  tree aw_statement_current = *stmt;
+  if (TREE_CODE (aw_statement_current) == CLEANUP_POINT_EXPR)
+   aw_statement_current = TREE_OPERAND (aw_statement_current, 0);
+
   /* Collected the scope vars we need move the temps to regular. */
   tree aw_bind_body = push_stmt_list ();
   tree varlist = NULL_TREE;
@@ -2843,8 +2845,12 @@ maybe_promote_captured_temps (tree *stmt, void *d)
  /* Replace all instances of that temp in the original expr.  */
  cp_walk_tree (&aw_statement_current, replace_proxy, &pr, NULL);
}
-  /* What's left should be the original statement with any temporaries
-broken out.  */
+
+  /* What's left should be the original statement with any co_await
+captured temporaries broken out.  Other temporaries might remain
+so see if we need to wrap the revised statement in a cleanup.  */
+  aw_statement_current =
+   maybe_cleanup_point_expr_void (aw_statement_current);
   add_stmt (aw_statement_current);
   BIND_EXPR_BODY (aw_bind) = pop_stmt_list (aw_bind_body);
   awpts->captured_temps.empty ();

Re: [PATCH] sra: Fix sra_modify_expr handling of partial writes (PR 94482)

2020-04-07 Thread Martin Jambor

Hi,

On Tue, Apr 07 2020, Richard Biener wrote:
> On Tue, 7 Apr 2020, Martin Jambor wrote:
>
>> Hi,
>> 
>> when sra_modify_expr is invoked on an expression that modifies only
>> part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
>> of an assignment and the SRA replacement's type is not compatible with
>> what is being replaced (0th operand of the B_F_R in the above
>> example), it does not work properly, basically throwing away the partd
>> of the expr that should have stayed intact.
>> 
>> This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
>> binary image of the replacement (and so in a way serve as a
>> VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
>> REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR under
>> the complex partial access expression, which is OK even in a LHS of an
>> assignment (and other write contexts) because in those cases the
>> replacement is not going to be a giple register anyway.
>> 
>> The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
>> fragile because SRA prefers complex and vector types over anything else
>> (and in between the two it decides based on TYPE_UID which in my testing
>> today always preferred complex types) and even when GIMPLE is wrong I
>> often still did not get failing runs, so I only run it at -O1 (which is
>> the only level where the the test fails for me).
>> 
>> Bootstrapped and tested on x86_64-linux, bootstrap and testing on
>> i686-linux and aarch64-linux underway.
>> 
>> OK for trunk (and subsequently for release branches) if it passes?
>
> OK.

I must have been already half asleep when writing that it passed
testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
even on x86_64, because fwprop happily combines

  u$ci_10 = MEM[(union U *)&u];

and

  f1_6 = IMAGPART_EXPR (u$ci_10)>;

into

  f1_6 = IMAGPART_EXPR ;

and the gimple verifier of course does not like that.  I'll have a look
at that tomorrow.

>
> generate_subtree_copies contains similar code and the call in
> sra_modify_expr suggests that an (outer?) bit-field-ref is possible
> here, so is generate_subtree_copies affected as well?  I'm not sure
> how subtree copy generation "works" when we already replaced sth

Because they were just horribly problematic, SRA does not create
replacements for any scalar access that has a scalar ancestor or any
children in the access tree.  So the generate_subtree_copies call is
only invoked when the expr itself is not replaced (because there are
children), even after SRA the expression will still load/store the
original bit of the original aggregate.  The task of
generate_subtree_copies is to write all children replacements back to
the aggregate before it is read or re-load the replacements after a
write to the aggregate.  So no data should be forgotten here.

Martin

Re: [PATCH] S/390: Fix PR91628

2020-04-07 Thread Iain Buclaw via Gcc-patches




On 07/04/2020 16:33, Stefan Liebler wrote:
> On 4/1/20 6:20 PM, Stefan Liebler wrote:
>> On 4/1/20 12:50 PM, Iain Buclaw wrote:
>>>
>>>
>>> On 01/04/2020 08:28, Stefan Liebler wrote:
 ping

>>>
>>> Hi Stefan,
>>>
>>> As I've already said, I think that the name should be __ibmz_get_tls_offset 
>>> to make clear that it is an internal function.
>>>
>>> Other than that, looks good to me.
>>>
>>> Iain.
>>>
>>
>> Hi Iain,
>>
>> Sorry. I've missed your comment in the bugzilla.
>> I've updated the name to __ibmz_get_this_offset.
>> Nothing else is changed in the attached patch.
>>
>> Please commit the patch upstream.
>> Do you also close the bugzilla as soon as committed?
>>
>> Regarding the mentioned musl-patch in your bugzilla comment:
>> Yes, the diff looks like not conflicting.
>>
>> Thanks,
>> Stefan
> 
> Hi Iain,
> 
> Andreas has just committed the other patch "S/390: Fix layout of struct 
> sigaction_t" to gcc after your pull-request was merged 
> (https://github.com/dlang/druntime/pull/3020).
> 
> To me it seems that this patch is not simply pull-request-able to 
> https://github.com/dlang/druntime.

Correct, all sources touched are local to gcc only.

> As you've already mentioned "Other than that, looks good to me.", is this gcc 
> patch okay to commit from your side? Then Andreas can also commit it and we 
> can close the bugzilla.
> 

No problems with it being committed.

Regards
Iain.

Re: [PATCH] sra: Fix sra_modify_expr handling of partial writes (PR 94482)

2020-04-07 Thread Richard Biener

On April 7, 2020 6:25:26 PM GMT+02:00, Martin Jambor  wrote:
>Hi,
>
>On Tue, Apr 07 2020, Richard Biener wrote:
>> On Tue, 7 Apr 2020, Martin Jambor wrote:
>>
>>> Hi,
>>> 
>>> when sra_modify_expr is invoked on an expression that modifies only
>>> part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
>>> of an assignment and the SRA replacement's type is not compatible
>with
>>> what is being replaced (0th operand of the B_F_R in the above
>>> example), it does not work properly, basically throwing away the
>partd
>>> of the expr that should have stayed intact.
>>> 
>>> This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
>>> binary image of the replacement (and so in a way serve as a
>>> VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
>>> REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR
>under
>>> the complex partial access expression, which is OK even in a LHS of
>an
>>> assignment (and other write contexts) because in those cases the
>>> replacement is not going to be a giple register anyway.
>>> 
>>> The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
>>> fragile because SRA prefers complex and vector types over anything
>else
>>> (and in between the two it decides based on TYPE_UID which in my
>testing
>>> today always preferred complex types) and even when GIMPLE is wrong
>I
>>> often still did not get failing runs, so I only run it at -O1 (which
>is
>>> the only level where the the test fails for me).
>>> 
>>> Bootstrapped and tested on x86_64-linux, bootstrap and testing on
>>> i686-linux and aarch64-linux underway.
>>> 
>>> OK for trunk (and subsequently for release branches) if it passes?
>>
>> OK.
>
>I must have been already half asleep when writing that it passed
>testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
>even on x86_64, because fwprop happily combines
>
>  u$ci_10 = MEM[(union U *)&u];
>
>and
>
>  f1_6 = IMAGPART_EXPR (u$ci_10)>;
>
>into
>
>  f1_6 = IMAGPART_EXPR ;
>
>and the gimple verifier of course does not like that.  I'll have a look
>at that tomorrow.

Ah, that might be a genuine fwprop bug. 

>>
>> generate_subtree_copies contains similar code and the call in
>> sra_modify_expr suggests that an (outer?) bit-field-ref is possible
>> here, so is generate_subtree_copies affected as well?  I'm not sure
>> how subtree copy generation "works" when we already replaced sth
>
>Because they were just horribly problematic, SRA does not create
>replacements for any scalar access that has a scalar ancestor or any
>children in the access tree.  So the generate_subtree_copies call is
>only invoked when the expr itself is not replaced (because there are
>children), even after SRA the expression will still load/store the
>original bit of the original aggregate.  The task of
>generate_subtree_copies is to write all children replacements back to
>the aggregate before it is read or re-load the replacements after a
>write to the aggregate.  So no data should be forgotten here.
>
>Martin

[GCC 9 backport][AArch64] PR target/94518: Fix memmodel index in aarch64_store_exclusive_pair

2020-04-07 Thread Kyrylo Tkachov

Hi all,

A straight backport of

[AArch64] Fix memmodel index in aarch64_store_exclusive_pair

Found via an rtx checking failure.

2019-09-23  Richard Sandiford  

gcc/
* config/aarch64/atomics.md (aarch64_store_exclusive_pair): Fix
memmodel index.

Bootstrapped and tested on aarch64-none-linux-gnu.
Committing to the GCC 9 branch.
Thanks,
Kyrill


tmp.patch
Description: tmp.patch

Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons

2020-04-07 Thread Richard Sandiford

"Richard Earnshaw (lists)"  writes:
> On 06/04/2020 12:19, Richard Sandiford wrote:
>> "Richard Earnshaw (lists)"  writes:
>>> On 03/04/2020 16:03, Richard Sandiford wrote:
 "Richard Earnshaw (lists)"  writes:
> On 03/04/2020 13:27, Richard Sandiford wrote:
>> "Richard Earnshaw (lists)"  writes:
>>> On 02/04/2020 19:53, Richard Henderson via Gcc-patches wrote:
 This is attacking case 3 of PR 94174.

 In v2, I unify the various subtract-with-borrow and add-with-carry
 patterns that also output flags with unspecs.  As suggested by
 Richard Sandiford during review of v1.  It does seem cleaner.

>>>
>>> Really?  I didn't need to use any unspecs for the Arm version of this.
>>>
>>> R.
>>
>> See https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543063.html
>> (including quoted context) for how we got here.
>>
>> The same problem affects the existing aarch64 patterns like
>> *usub3_carryinC.  Although that pattern avoids unspecs,
>> the compare:CC doesn't seem to be correct.
>>
>> Richard
>>
>
> But I don't think you can use ANY_EXTEND in these comparisons.  It
> doesn't describe what the instruction does, since the instruction does
> not really extend the values first.

 Yeah, that was the starting point in the thread above too.  And using
 zero_extend in the existing *usub3_carryinC pattern:

 (define_insn "*usub3_carryinC"
   [(set (reg:CC CC_REGNUM)
(compare:CC
  (zero_extend:
(match_operand:GPI 1 "register_operand" "r"))
  (plus:
(zero_extend:
  (match_operand:GPI 2 "register_operand" "r"))
(match_operand: 3 "aarch64_borrow_operation" ""
(set (match_operand:GPI 0 "register_operand" "=r")
(minus:GPI
  (minus:GPI (match_dup 1) (match_dup 2))
  (match_operand:GPI 4 "aarch64_borrow_operation" "")))]
""
"sbcs\\t%0, %1, %2"
   [(set_attr "type" "adc_reg")]
 )

 looks wrong for the same reason.  But the main problem IMO isn't how the
 inputs to the compare:CC are represented, but that we're using compare:CC
 at all.  Using compare doesn't accurately model the effect of SBCS on NZCV
 for all inputs, so if we're going to use a compare here, it can't be :CC.

> I would really expect this patch series to be pretty much a dual of this
> series that I posted last year for Arm.
>
> https://gcc.gnu.org/pipermail/gcc-patches/2019-October/532180.html

 That series uses compares with modes like CC_V and CC_B, so I think
 you're saying that given the choice in the earlier thread between adding
 a new CC mode or using unspecs, you would have preferred a new CC mode,
 is that right?

>>>
>>> Yes.  It surprised me, when doing the aarch32 version, just how often
>>> the mid-end parts of the compiler were able to reason about parts of the
>>> parallel insn and optimize things accordingly (eg propagating the truth
>>> of the comparison).  If you use an unspec that can never happen.
>> 
>> That could be changed though.  E.g. we could add something like a
>> simplify_unspec target hook if this becomes a problem (either here
>> or for other unspecs).  A fancy implementation could even use
>> match.pd-style rules in the .md file.
>
> I really don't like that.  It sounds like the top of a long slippery
> slope.  What about all the other cases where the RTL is comprehended by
> the mid-end?

Is it really so bad though?  It's similar to how frontends can define
and fold their own trees, and how targets can define and fold their own
built-in functions.

The CC modes are also quite heavily macro/hook-based.

>> The reason I'm not keen on using special modes for this case is that
>> they'd describe one way in which the result can be used rather than
>> describing what the instruction actually does.  The instruction really
>> does set all four flags to useful values.  The "problem" is that they're
>> not the values associated with a compare between two values, so representing
>> them that way will always lose information.
>> 
>
> Yes, it's true that the rtl -> machine instruction transform is not 100%
> reversible.  That's always been the case, but it's the price we pay for
> a generic IL that describes instructions on multiple architectures.

It's not really reversibility that I'm after (at least not for its
own sake).

If we had a three-input compare_cc rtx_code that described a comparison
involving a carry input, we'd certainly be using it here, because that's
what the instruction does.  Given that we don't have the rtx_code, three
obvious choices are:

(1) Add it.

(2) Continue to represent what the instruction does using an unspec.

(3) Don't try to represent the "three-input compare_cc" operation and
instead describe a two-input comparison that only yields a valid
result for a

[committed] libstdc++: Restore ability to use in C++14 (PR 94520)

2020-04-07 Thread Jonathan Wakely via Gcc-patches

This C++17 header is supported in C++14 as a GNU extension, but stopped
working last year because I made it depend on an internal helper which
is only defined for C++17 and up.

PR libstdc++/94520
* include/std/charconv (__integer_to_chars_result_type)
(__integer_from_chars_result_type): Use __or_ instead of __or_v_ to
allow use in C++14.
* testsuite/20_util/from_chars/1.cc: Run test as C++14 and replace
use of std::string_view with std::string.
* testsuite/20_util/from_chars/2.cc: Likewise.
* testsuite/20_util/to_chars/1.cc: Likewise.
* testsuite/20_util/to_chars/2.cc: Likewise.

Tested powerpc64le-linux, committed to master.


commit c104e8f1b67a75ea82c62f1fd2aac69c09127562
Author: Jonathan Wakely 
Date:   Tue Apr 7 17:18:21 2020 +0100

libstdc++: Restore ability to use  in C++14 (PR 94520)

This C++17 header is supported in C++14 as a GNU extension, but stopped
working last year because I made it depend on an internal helper which
is only defined for C++17 and up.

PR libstdc++/94520
* include/std/charconv (__integer_to_chars_result_type)
(__integer_from_chars_result_type): Use __or_ instead of __or_v_ to
allow use in C++14.
* testsuite/20_util/from_chars/1.cc: Run test as C++14 and replace
use of std::string_view with std::string.
* testsuite/20_util/from_chars/2.cc: Likewise.
* testsuite/20_util/to_chars/1.cc: Likewise.
* testsuite/20_util/to_chars/2.cc: Likewise.

diff --git a/libstdc++-v3/include/std/charconv 
b/libstdc++-v3/include/std/charconv
index 35f8efc7e35..8c9ce9d280e 100644
--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -68,9 +68,9 @@ namespace __detail
 {
   template
 using __integer_to_chars_result_type
-  = enable_if_t<__or_v<__is_signed_integer<_Tp>,
-  __is_unsigned_integer<_Tp>,
-  is_same>>,
+  = enable_if_t<__or_<__is_signed_integer<_Tp>,
+ __is_unsigned_integer<_Tp>,
+ is_same>>::value,
to_chars_result>;
 
   // Pick an unsigned type of suitable size. This is used to reduce the
@@ -564,9 +564,9 @@ namespace __detail
 
   template
 using __integer_from_chars_result_type
-  = enable_if_t<__or_v<__is_signed_integer<_Tp>,
-  __is_unsigned_integer<_Tp>,
-  is_same>>,
+  = enable_if_t<__or_<__is_signed_integer<_Tp>,
+ __is_unsigned_integer<_Tp>,
+ is_same>>::value,
from_chars_result>;
 
 } // namespace __detail
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/1.cc
index 6ce95590793..916025bc7c6 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/1.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/1.cc
@@ -15,21 +15,23 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-std=gnu++17" }
-// { dg-do run { target c++17 } }
+//  is supported in C++14 as a GNU extension
+// { dg-do run { target c++14 } }
 
 #include 
-#include 
+#include 
 
 template
 bool
-check_from_chars(I expected, std::string_view s, int base = 0, char term = 
'\0')
+check_from_chars(I expected, std::string s, int base = 0, char term = '\0')
 {
+  const char* begin = s.data();
+  const char* end = s.data() + s.length();
   I val;
   std::from_chars_result r = base == 0
-? std::from_chars(s.begin(), s.end(), val)
-: std::from_chars(s.begin(), s.end(), val, base);
-  return r.ec == std::errc{} && (r.ptr == s.end() || *r.ptr == term) && val == 
expected;
+? std::from_chars(begin, end, val)
+: std::from_chars(begin, end, val, base);
+  return r.ec == std::errc{} && (r.ptr == end || *r.ptr == term) && val == 
expected;
 }
 
 #include 
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/2.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/2.cc
index caff17e66b2..902092fd423 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/2.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/2.cc
@@ -15,11 +15,11 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-std=gnu++17" }
-// { dg-do run { target c++17 } }
+//  is supported in C++14 as a GNU extension
+// { dg-do run { target c++14 } }
 
 #include 
-#include 
+#include 
 #include 
 
 // Test std::from_chars error handling.
@@ -29,45 +29,45 @@ test01()
 {
   std::from_chars_result r;
   int i = 999;
-  std::string_view s;
+  std::string s;
 
   s = "";
-  r = std::from_chars(s.begin(), s.end(), i);
+  r = std::from_chars(s.data(), s.data() + s.length(), i);
   VERIFY( r.ec == std::errc::invalid_argument );
-  VERIFY( r.ptr == s.begin() );
+  VERIFY( r.ptr == s.data() );
   VERIFY( i == 999 );

Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons

2020-04-07 Thread Richard Henderson via Gcc-patches

On 4/7/20 9:32 AM, Richard Sandiford wrote:
> It's not really reversibility that I'm after (at least not for its
> own sake).
> 
> If we had a three-input compare_cc rtx_code that described a comparison
> involving a carry input, we'd certainly be using it here, because that's
> what the instruction does.  Given that we don't have the rtx_code, three
> obvious choices are:
> 
> (1) Add it.
> 
> (2) Continue to represent what the instruction does using an unspec.
> 
> (3) Don't try to represent the "three-input compare_cc" operation and
> instead describe a two-input comparison that only yields a valid
> result for a subset of tests.
> 
> (1) seems like the best technical solution but would probably be
> a lot of work.  I guess the reason I like (2) is that it stays
> closest to (1).

Indeed, the biggest problem that I'm having with copying the arm solution to
aarch64 is the special cases of the constants.

The first problem is that (any_extend:M1 (match_operand:M2)) is invalid rtl for
a constant, so you can't share the same define_insn to handle both register and
immediate input.

The second problem is how unpredictable the canonical rtl of an expression can
be after constant folding.  Which again requires more and more define_insns.
Even the Arm target gets this wrong.  In particular,

> (define_insn "cmpsi3_carryin_out"
>   [(set (reg: CC_REGNUM)
> (compare:
>  (SE:DI (match_operand:SI 1 "s_register_operand" "0,r"))
>  (plus:DI (match_operand:DI 3 "arm_borrow_operation" "")
>   (SE:DI (match_operand:SI 2 "s_register_operand" "l,r")
>(clobber (match_scratch:SI 0 "=l,r"))]

is non-canonical according to combine.  It will only attempt the ordering

  (compare
(plus ...)
(sign_extend ...))

I have no idea why combine is attempting to reverse the sense of the comparison
here.  I can only presume it would also reverse the sense of the branch on
which the comparison is made, had the pattern matched.

This second problem is partially worked around by fwprop, in that it will try
to simply replace the operand without folding if that is recognizable.  Thus
cases like

  (compare (const_int 0) (plus ...))

can be produced from fwprop but not combine.  Which works well enough to not
bother with the CC_RSBmode that the arm target uses.

The third problem is the really quite complicated code that goes into
SELECT_CC_MODE.  This really should not be as difficult as it is, and is the
sort of thing for which we built recog.

Related to that is the insn costing, which also ought to use something akin to
recog.  We have all of the information there: if the insn is recognizable, the
type/length attributes can be used to provide a good value.

r~

[PATCH] middle-end/94479 - fix gimplification of address

2020-04-07 Thread Richard Biener



When gimplifying an address operand we may expose an indirect
ref via DECL_VALUE_EXPR for example.  This is dealt with in the
code already but it fails to consider that INDIRECT_REFs get
gimplified to MEM_REFs.

Fixed which makes the ICE observed on x86_64-netbsd go away.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2020-04-07  Richard Biener  

PR middle-end/94479
* gimplify.c (gimplify_addr_expr): Also consider generated
MEM_REFs.

* gcc.dg/torture/pr94479.c: New testcase.
---
 gcc/gimplify.c |  4 +++-
 gcc/testsuite/gcc.dg/torture/pr94479.c | 12 
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr94479.c

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 787435c38cd..8cdfae26510 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6181,7 +6181,9 @@ gimplify_addr_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p)
 
   /* For various reasons, the gimplification of the expression
 may have made a new INDIRECT_REF.  */
-  if (TREE_CODE (op0) == INDIRECT_REF)
+  if (TREE_CODE (op0) == INDIRECT_REF
+ || (TREE_CODE (op0) == MEM_REF
+ && integer_zerop (TREE_OPERAND (op0, 1
goto do_indirect_ref;
 
   mark_addressable (TREE_OPERAND (expr, 0));
diff --git a/gcc/testsuite/gcc.dg/torture/pr94479.c 
b/gcc/testsuite/gcc.dg/torture/pr94479.c
new file mode 100644
index 000..53285bb4f38
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr94479.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-stack-check "specific" } */
+/* { dg-additional-options "-fstack-check -w" } */
+
+int a;
+struct b {
+char c;
+void *d;  
+};
+struct b e() {
+struct b f[] = {{}, "", f, a};
+}
-- 
2.25.1

Re: [PATCH] c++: Fix ICE in tsubst_default_argument [PR92010]

2020-04-07 Thread Patrick Palka via Gcc-patches

On Mon, 6 Apr 2020, Jason Merrill wrote:
> On 4/6/20 11:45 AM, Patrick Palka wrote:
> > On Wed, 1 Apr 2020, Jason Merrill wrote:
> > 
> > > On 4/1/20 6:29 PM, Jason Merrill wrote:
> > > > On 3/31/20 3:50 PM, Patrick Palka wrote:
> > > > > On Tue, 31 Mar 2020, Jason Merrill wrote:
> > > > > 
> > > > > > On 3/30/20 6:46 PM, Patrick Palka wrote:
> > > > > > > On Mon, 30 Mar 2020, Jason Merrill wrote:
> > > > > > > > On 3/30/20 3:58 PM, Patrick Palka wrote:
> > > > > > > > > On Thu, 26 Mar 2020, Jason Merrill wrote:
> > > > > > > > > 
> > > > > > > > > > On 3/22/20 9:21 PM, Patrick Palka wrote:
> > > > > > > > > > > This patch relaxes an assertion in tsubst_default_argument
> > > > > > > > > > > that
> > > > > > > > > > > exposes
> > > > > > > > > > > a
> > > > > > > > > > > latent
> > > > > > > > > > > bug in how we substitute an array type into a cv-qualified
> > > > > > > > > > > wildcard
> > > > > > > > > > > function
> > > > > > > > > > > parameter type.  Concretely, the latent bug is that given
> > > > > > > > > > > the
> > > > > > > > > > > function
> > > > > > > > > > > template
> > > > > > > > > > > 
> > > > > > > > > > >    template void foo(const T t);
> > > > > > > > > > > 
> > > > > > > > > > > one would expect the type of foo to be void(const
> > > > > > > > > > > int*), but
> > > > > > > > > > > we
> > > > > > > > > > > (seemingly prematurely) strip function parameter types of
> > > > > > > > > > > their
> > > > > > > > > > > top-level
> > > > > > > > > > > cv-qualifiers when building the function's TYPE_ARG_TYPES,
> > > > > > > > > > > and
> > > > > > > > > > > instead
> > > > > > > > > > > end
> > > > > > > > > > > up
> > > > > > > > > > > obtaining void(int*) as the type of foo after
> > > > > > > > > > > substitution
> > > > > > > > > > > and
> > > > > > > > > > > decaying.
> > > > > > > > > > > 
> > > > > > > > > > > We still however correctly substitute into and decay the
> > > > > > > > > > > formal
> > > > > > > > > > > parameter
> > > > > > > > > > > type,
> > > > > > > > > > > obtaining const int* as the type of t after substitution. 
> > > > > > > > > > > But
> > > > > > > > > > > this
> > > > > > > > > > > then
> > > > > > > > > > > leads
> > > > > > > > > > > to us tripping over the assert in tsubst_default_argument
> > > > > > > > > > > that
> > > > > > > > > > > verifies
> > > > > > > > > > > the
> > > > > > > > > > > formal parameter type and the function type are
> > > > > > > > > > > consistent.
> > > > > > > > > > > 
> > > > > > > > > > > Assuming it's too late at this stage to fix the
> > > > > > > > > > > substitution
> > > > > > > > > > > bug, we
> > > > > > > > > > > can
> > > > > > > > > > > still
> > > > > > > > > > > relax the assertion like so.  Tested on
> > > > > > > > > > > x86_64-pc-linux-gnu,
> > > > > > > > > > > does
> > > > > > > > > > > this
> > > > > > > > > > > look
> > > > > > > > > > > OK?
> > > > > > > > > > 
> > > > > > > > > > This is core issues 1001/1322, which have not been resolved.
> > > > > > > > > > Clang
> > > > > > > > > > does
> > > > > > > > > > the
> > > > > > > > > > substitution the way you suggest; EDG rejects the testcase
> > > > > > > > > > because the
> > > > > > > > > > two
> > > > > > > > > > substitutions produce different results.  I think it would
> > > > > > > > > > make
> > > > > > > > > > sense
> > > > > > > > > > to
> > > > > > > > > > follow the EDG behavior until this issue is actually
> > > > > > > > > > resolved.
> > > > > > > > > 
> > > > > > > > > Here is what I have so far towards that end.  When
> > > > > > > > > substituting
> > > > > > > > > into the
> > > > > > > > > PARM_DECLs of a function decl, we now additionally check if
> > > > > > > > > the
> > > > > > > > > aforementioned Core issues are relevant and issue a (fatal)
> > > > > > > > > diagnostic
> > > > > > > > > if so.  This patch checks this in tsubst_decl 
> > > > > > > > > rather
> > > > > > > > > than in tsubst_function_decl for efficiency reasons, so that
> > > > > > > > > we
> > > > > > > > > don't
> > > > > > > > > have to perform another traversal over the DECL_ARGUMENTS /
> > > > > > > > > TYPE_ARG_TYPES just to implement this check.
> > > > > > > > 
> > > > > > > > Hmm, this seems like writing more complicated code for a very
> > > > > > > > marginal
> > > > > > > > optimization; how many function templates have so many
> > > > > > > > parameters
> > > > > > > > that
> > > > > > > > walking
> > > > > > > > over them once to compare types will have any effect on compile
> > > > > > > > time?
> > > > > > > 
> > > > > > > Good point... though I just tried implementing this check in
> > > > > > > tsubst_function_decl, and it seems it might be just as complicated
> > > > > > > to
> > > > > > > implement it there instead, at least if we want to handle function
> > > > > > > parameter packs correctly.
> > > > > > > 
> > > > > > > If we were to implement this check in tsubst_function_decl, then
> > > > > > > since
> > > > > > > we have access to the in

Re: libgcc patch committed: Avoid hooks in split-stack code

2020-04-07 Thread Ian Lance Taylor via Gcc-patches

On Tue, Apr 7, 2020 at 3:33 AM Jakub Jelinek  wrote:
>
> On Tue, Apr 07, 2020 at 12:19:45PM +0200, Richard Biener via Gcc-patches 
> wrote:
> > On Fri, Apr 3, 2020 at 11:59 PM Ian Lance Taylor via Gcc-patches
> >  wrote:
> > >
> > > The split-stack code invokes mmap and munmap with a limited amount of
> > > stack space.  That is fine when the functions just make a system call,
> > > but it's not fine when programs use LD_PRELOAD or linker tricks to add
> > > hooks to mmap/munmap.  Those hooks may use too much stack space,
> > > leading to an obscure program crash.
> > >
> > > Avoid that at least on GNU/Linux by calling __mmap/__munmap instead.
> > >
> > > Bootstrapped and ran Go and split-stack tests on x86_64-pc-linux-gnu.
> > > Committed to mainline.
> >
> > This causes references to GLIBC_PRIVATE exported symbols which is
> > highly undesirable for system integrators (no ABI stability is guaranteed
> > for those).  I opened the regression PR94513 for this.
>
> Yeah, GLIBC_PRIVATE symbols may only be used by glibc itself, not by
> anything else.
> I'd suggest to use syscall function instead (though, that can be interposed
> too).

Sorry about that.  OK, let's try syscall.  I committed this patch.
Bootstrapped and tested on x86_64-pc-linux-gnu, did a small amount of
testing on s390x (but I don't have full access to an s390x system).

Ian

2020-04-07  Ian Lance Taylor  

PR libgcc/94513
* generic-morestack.c: Give up trying to use __mmap/__munmap, use
syscall instead.
diff --git a/libgcc/generic-morestack.c b/libgcc/generic-morestack.c
index fa2062e2bb3..35764a848f6 100644
--- a/libgcc/generic-morestack.c
+++ b/libgcc/generic-morestack.c
@@ -56,17 +56,56 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 /* Some systems use LD_PRELOAD or similar tricks to add hooks to
mmap/munmap.  That breaks this code, because when we call mmap
there is enough stack space for the system call but there is not,
-   in general, enough stack space to run a hook.  At least when using
-   glibc on GNU/Linux we can avoid the problem by calling __mmap and
-   __munmap.  */
+   in general, enough stack space to run a hook.  Try to avoid the
+   problem by calling syscall directly.  We only do this on GNU/Linux
+   for now, but it should be easy to add support for more systems with
+   testing.  */
 
-#if defined(__gnu_linux__) && (__GLIBC__ > 2 || (__GLIBC__ == 2 && 
__GLIBC_MINOR__ >= 26))
+#if defined(__gnu_linux__)
 
-extern void *__mmap (void *, size_t, int, int, int, off_t);
-extern int __munmap (void *, size_t);
+#include 
 
-#define mmap __mmap
-#define munmap __munmap
+#if defined(SYS_mmap) || defined(SYS_mmap2)
+
+#ifdef SYS_mmap2
+#define MORESTACK_MMAP SYS_mmap2
+#define MORESTACK_ADJUST_OFFSET(x) ((x) / 4096ULL)
+#else
+#define MORESTACK_MMAP SYS_mmap
+#define MORESTACK_ADJUST_OFFSET(x) (x)
+#endif
+
+static void *
+morestack_mmap (void *addr, size_t length, int prot, int flags, int fd,
+   off_t offset)
+{
+  offset = MORESTACK_ADJUST_OFFSET (offset);
+
+#ifdef __s390__
+  long args[6] = { (long) addr, (long) length, (long) prot, (long) flags,
+  (long) fd, (long) offset };
+  return (void *) syscall (MORESTACK_MMAP, args);
+#else
+  return (void *) syscall (MORESTACK_MMAP, addr, length, prot, flags, fd,
+  offset);
+#endif
+}
+
+#define mmap morestack_mmap
+
+#endif /* defined(SYS_MMAP) || defined(SYS_mmap2) */
+
+#if defined(SYS_munmap)
+
+static int
+morestack_munmap (void * addr, size_t length)
+{
+  return (int) syscall (SYS_munmap, addr, length);
+}
+
+#define munmap morestack_munmap
+
+#endif /* defined(SYS_munmap) */
 
 #endif /* defined(__gnu_linux__) */

[PATCH] reject scalar array initialization with nullptr [PR94510]

2020-04-07 Thread Martin Sebor via Gcc-patches


Among the numerous regressions introduced by the change committed
to GCC 9 to allow string literals as template arguments is a failure
to recognize the C++ nullptr and GCC's __null constants as pointers.
For one, I didn't realize that nullptr, being a null pointer constant,
doesn't have a pointer type, and two, I didn't think of __null (which
is a special integer constant that NULL sometimes expands to).

The attached patch adjusts the special handling of trailing zero
initializers in reshape_init_array_1 to recognize both kinds of
constants and avoid treating them as zeros of the array integer
element type.  This restores the expected diagnostics when either
constant is used in the initializer list.

Martin
PR c++/94510 - nullptr_t implicitly cast to zero twice in std::array

gcc/cp/ChangeLog:

	PR c++/94510
	* decl.c (reshape_init_array_1): Exclude mismatches with all kinds
	of pointers.

gcc/testsuite/ChangeLog:

	PR c++/94510
	* g++.dg/init/array57.C: New test.
	* g++.dg/init/array58.C: New test.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index a127734af69..692c8ed73f4 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6041,9 +6041,14 @@ reshape_init_array_1 (tree elt_type, tree max_index, reshape_iter *d,
 	TREE_CONSTANT (new_init) = false;
 
   /* Pointers initialized to strings must be treated as non-zero
-	 even if the string is empty.  */
+	 even if the string is empty.  Handle all kinds of pointers,
+	 including std::nullptr and GCC's __nullptr, neither of which
+	 has a pointer type.  */
   tree init_type = TREE_TYPE (elt_init);
-  if (POINTER_TYPE_P (elt_type) != POINTER_TYPE_P (init_type)
+  bool init_is_ptr = (POINTER_TYPE_P (init_type)
+			  || NULLPTR_TYPE_P (init_type)
+			  || null_node_p (elt_init));
+  if (POINTER_TYPE_P (elt_type) != init_is_ptr
 	  || !type_initializer_zero_p (elt_type, elt_init))
 	last_nonzero = index;
 
diff --git a/gcc/testsuite/g++.dg/init/array57.C b/gcc/testsuite/g++.dg/init/array57.C
new file mode 100644
index 000..fdd7e76eb18
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/array57.C
@@ -0,0 +1,15 @@
+/* PR c++/94510 - nullptr_t implicitly cast to zero twice in std::array
+   { dg-do compile } */
+
+int ia1[2] = { (void*)0 };  // { dg-error "invalid conversion from 'void\\\*'" }
+int ia2[2] = { (void*)0, 0 };   // { dg-error "invalid conversion from 'void\\\*'" }
+int ia3[] = { (void*)0, 0 };// { dg-error "invalid conversion from 'void\\\*'" }
+
+int ia4[2] = { __null };// { dg-warning "\\\[-Wconversion-null" }
+int ia5[2] = { __null, 0 }; // { dg-warning "\\\[-Wconversion-null" }
+int ia6[] = { __null, 0 };  // { dg-warning "\\\[-Wconversion-null" }
+
+
+const char ca1[2] = { (char*)0, 0 };// { dg-error "invalid conversion from 'char\\\*'" }
+
+const char ca2[2] = { __null, 0 };  // { dg-warning "\\\[-Wconversion-null" }
diff --git a/gcc/testsuite/g++.dg/init/array58.C b/gcc/testsuite/g++.dg/init/array58.C
new file mode 100644
index 000..655e08fa600
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/array58.C
@@ -0,0 +1,21 @@
+/* PR c++/94510 - nullptr_t implicitly cast to zero twice in std::array
+   { dg-do compile { target c++11 } } */
+
+namespace std {
+typedef __typeof__ (nullptr) nullptr_t;
+}
+
+int ia1[2] = { nullptr }; // { dg-error "cannot convert 'std::nullptr_t' to 'int'" }
+int ia2[2] = { nullptr, 0 };  // { dg-error "cannot convert 'std::nullptr_t' to 'int'" }
+int ia3[] = { nullptr, 0 };   // { dg-error "cannot convert 'std::nullptr_t' to 'int'" }
+
+int ia4[2] = { (std::nullptr_t)0 };  // { dg-error "cannot convert 'std::nullptr_t' to 'int'" }
+int ia5[2] = { (std::nullptr_t)0, 0 };   // { dg-error "cannot convert 'std::nullptr_t' to 'int'" }
+int ia6[] = { (std::nullptr_t)0, 0 };// { dg-error "cannot convert 'std::nullptr_t' to 'int'" }
+
+
+const char ca1[2] = { nullptr, 0 };   // { dg-error "cannot convert 'std::nullptr_t' to 'const char'" }
+
+const char ca2[2] = { (char*)nullptr, 0 };// { dg-error "invalid conversion from 'char\\\*' to 'char'" }
+
+const char ca3[2] = { std::nullptr_t () };// { dg-error "cannot convert 'std::nullptr_t'" }

Re: [PATCH] sra: Fix sra_modify_expr handling of partial writes (PR 94482)

2020-04-07 Thread Jeff Law via Gcc-patches

On Tue, 2020-04-07 at 18:25 +0200, Martin Jambor wrote:
> Hi,
> 
> On Tue, Apr 07 2020, Richard Biener wrote:
> > On Tue, 7 Apr 2020, Martin Jambor wrote:
> > 
> > > Hi,
> > > 
> > > when sra_modify_expr is invoked on an expression that modifies only
> > > part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
> > > of an assignment and the SRA replacement's type is not compatible with
> > > what is being replaced (0th operand of the B_F_R in the above
> > > example), it does not work properly, basically throwing away the partd
> > > of the expr that should have stayed intact.
> > > 
> > > This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
> > > binary image of the replacement (and so in a way serve as a
> > > VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
> > > REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR under
> > > the complex partial access expression, which is OK even in a LHS of an
> > > assignment (and other write contexts) because in those cases the
> > > replacement is not going to be a giple register anyway.
> > > 
> > > The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
> > > fragile because SRA prefers complex and vector types over anything else
> > > (and in between the two it decides based on TYPE_UID which in my testing
> > > today always preferred complex types) and even when GIMPLE is wrong I
> > > often still did not get failing runs, so I only run it at -O1 (which is
> > > the only level where the the test fails for me).
> > > 
> > > Bootstrapped and tested on x86_64-linux, bootstrap and testing on
> > > i686-linux and aarch64-linux underway.
> > > 
> > > OK for trunk (and subsequently for release branches) if it passes?
> > 
> > OK.
> 
> I must have been already half asleep when writing that it passed
> testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
> even on x86_64, because fwprop happily combines
Yea, my tester complained about that on msp430-elf as well.  There's other 
recent
failures on msp430, ft32 and h8300 that I'm digging into now.
> 

Jeff

Re: [PATCH] S/390: Fix PR91628

2020-04-07 Thread Andreas Krebbel via Gcc-patches

On 07.04.20 18:26, Iain Buclaw wrote:
> 
> 
> On 07/04/2020 16:33, Stefan Liebler wrote:
>> On 4/1/20 6:20 PM, Stefan Liebler wrote:
>>> On 4/1/20 12:50 PM, Iain Buclaw wrote:


 On 01/04/2020 08:28, Stefan Liebler wrote:
> ping
>

 Hi Stefan,

 As I've already said, I think that the name should be 
 __ibmz_get_tls_offset to make clear that it is an internal function.

 Other than that, looks good to me.

 Iain.

>>>
>>> Hi Iain,
>>>
>>> Sorry. I've missed your comment in the bugzilla.
>>> I've updated the name to __ibmz_get_this_offset.
>>> Nothing else is changed in the attached patch.
>>>
>>> Please commit the patch upstream.
>>> Do you also close the bugzilla as soon as committed?
>>>
>>> Regarding the mentioned musl-patch in your bugzilla comment:
>>> Yes, the diff looks like not conflicting.
>>>
>>> Thanks,
>>> Stefan
>>
>> Hi Iain,
>>
>> Andreas has just committed the other patch "S/390: Fix layout of struct 
>> sigaction_t" to gcc after your pull-request was merged 
>> (https://github.com/dlang/druntime/pull/3020).
>>
>> To me it seems that this patch is not simply pull-request-able to 
>> https://github.com/dlang/druntime.
> 
> Correct, all sources touched are local to gcc only.
> 
>> As you've already mentioned "Other than that, looks good to me.", is this 
>> gcc patch okay to commit from your side? Then Andreas can also commit it and 
>> we can close the bugzilla.
>>
> 
> No problems with it being committed.

Committed to mainline. Thanks!

Andreas

> 
> Regards
> Iain.
>

[PATCH] c++: ICE with defaulted comparison operator [PR94478]

2020-04-07 Thread Marek Polacek via Gcc-patches

Here we ICE because early_check_defaulted_comparison passed a null
ctx to same_type_p.  The attached test is ill-formed according to
[class.compare.default]/1, so fixed by detecting this case early.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/94478 - ICE with defaulted comparison operator
* method.c (early_check_defaulted_comparison): Give an error when the
context is null.

* g++.dg/cpp2a/spaceship-err4.C: New test.
---
 gcc/cp/method.c | 11 +++
 gcc/testsuite/g++.dg/cpp2a/spaceship-err4.C |  7 +++
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/spaceship-err4.C

diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 41b9ff86bdd..9a21bfc1f66 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -1102,6 +1102,17 @@ early_check_defaulted_comparison (tree fn)
   return false;
 }
 
+  if (!ctx)
+{
+  if (DECL_OVERLOADED_OPERATOR_IS (fn, SPACESHIP_EXPR))
+   error_at (loc, "three-way comparison operator can only be defaulted "
+ "in a class definition");
+  else
+   error_at (loc, "equality comparison operator can only be defaulted "
+ "in a class definition");
+  return false;
+}
+
   if (!DECL_OVERLOADED_OPERATOR_IS (fn, SPACESHIP_EXPR)
   && !same_type_p (TREE_TYPE (TREE_TYPE (fn)), boolean_type_node))
 {
diff --git a/gcc/testsuite/g++.dg/cpp2a/spaceship-err4.C 
b/gcc/testsuite/g++.dg/cpp2a/spaceship-err4.C
new file mode 100644
index 000..00f90ced255
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/spaceship-err4.C
@@ -0,0 +1,7 @@
+// PR c++/94478 - ICE with defaulted comparison operator.
+// { dg-do compile { target c++2a } }
+
+struct B {};
+bool operator!=(const B&, const B&) = default; // { dg-error "equality 
comparison operator can only be defaulted in a class definition" }
+bool operator==(const B&, const B&) = default; // { dg-error "equality 
comparison operator can only be defaulted in a class definition" }
+bool operator<=>(const B&, const B&) = default; // { dg-error "three-way 
comparison operator can only be defaulted in a class definition" }

base-commit: 3d947f1f27188e3a61ba7f42399b1c348469fe13
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA

Re: [PATCH] combine: Fix split_i2i3 ICE [PR94291]

2020-04-07 Thread Segher Boessenkool

Hi!

(I have no idea why this PR is a P1).

On Tue, Mar 31, 2020 at 09:04:18AM +0200, Jakub Jelinek wrote:
> The following testcase ICEs on armv7hl-linux-gnueabi.

Also on default arm-linux-gnueabi.  It needs -Og.

> try_combine is called on:
> (gdb) p debug_rtx (i3)
> (insn 20 12 22 2 (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
> (const_int -4 [0xfffc])) [1 x+0 S4 A32])
> (reg:SI 125)) "pr94291.c":7:8 241 {*arm_movsi_insn}
>  (expr_list:REG_DEAD (reg:SI 125)
> (nil)))
> (gdb) p debug_rtx (i2)
> (insn 12 7 20 2 (parallel [
> (set (reg:CC 100 cc)
> (compare:CC (reg:SI 121 [  ])
> (const_int 0 [0])))
> (set (reg:SI 125)
> (reg:SI 121 [  ]))
> ]) "pr94291.c":7:8 248 {*movsi_compare0}
>  (expr_list:REG_UNUSED (reg:CC 100 cc)
> (nil)))
> and tries to recognize cc = r121 cmp 0; [sfp-4] = r121 parallel,
> but that isn't recognized,

Trying 12 -> 20:
   12: {cc:CC=cmp(r118:SI,0);r122:SI=r118:SI;}
  REG_UNUSED cc:CC
   20: [sfp:SI-0x4]=r122:SI
  REG_DEAD r122:SI
Failed to match this instruction:
(parallel [
(set (reg:CC 100 cc)
(compare:CC (reg:SI 118 [  ])
(const_int 0 [0])))
(set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
(const_int -4 [0xfffc])) [1 x+0 S4 A32])
(reg:SI 118 [  ]))
])
(twice)

> so it splits it into two: split_i2i3

(That variable does not *control* the splitting: it registers that we
did split a parallel into newi2pat and newpat).

Successfully matched this instruction:
(set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
(const_int -4 [0xfffc])) [1 x+0 S4 A32])
(reg:SI 118 [  ]))
Successfully matched this instruction:
(set (reg:CC 100 cc)
(compare:CC (reg:SI 118 [  ])
(const_int 0 [0])))
allowing combination of insns 12 and 20
original costs 4 + 4 = 8
replacement costs 4 + 4 = 8

> [sfp-4] = r121 followed by cc = r121 cmp 0 which is recognized, but
> ICEs because the code below insist that the SET_DEST of newi2pat
> (or first set in PARALLEL thereof) must be a REG or SUBREG of REG,
> but it is a MEM in this case.  I don't see any condition that would
> guarantee that, perhaps for the swap_i2i3 case it was somehow guaranteed.

I don't see how it was guaranteed for that case, either.

> As the code just wants to update LOG_LINKS and LOG_LINKS are only for
> registers, not for MEM or anything else, the patch just doesn't update those
> if it isn't a REG or SUBREG of REG.

That is correct.

> 2020-03-31  Jakub Jelinek  
> 
>   PR rtl-optimization/94291

Please mention PR rtl-optimization/84169 as well.

>   * combine.c (try_combine): For split_i2i3, don't assume SET_DEST
>   must be a REG or SUBREG of REG; if it is not one of these, don't
>   update LOG_LINKs.
> 
>   * gcc.dg/pr94291.c: New test.

Okay for trunk, and for backports as well (after soaking it for a bit).
Thanks!


Segher

Re: Merge from master to gccgo branch

2020-04-07 Thread Ian Lance Taylor via Gcc-patches

On Mon, Apr 6, 2020 at 5:51 PM Ian Lance Taylor  wrote:
>
> I merged master revision 52fa80f853c0b0f623ea9e4c7198e324ce44ff30 to
> the gccgo branch.

Another merge from master to gccgo branch, of revision
50c7853216e8511971c55b51d7fe29173db4749b.

Ian

Re: [PATCH] sra: Fix sra_modify_expr handling of partial writes (PR 94482)

2020-04-07 Thread Martin Jambor

Hi Jeff,

On Tue, Apr 07 2020, Jeff Law wrote:
> On Tue, 2020-04-07 at 18:25 +0200, Martin Jambor wrote:
>> Hi,
>> 
>> On Tue, Apr 07 2020, Richard Biener wrote:
>> > On Tue, 7 Apr 2020, Martin Jambor wrote:
>> > 
>> > > Hi,
>> > > 
>> > > when sra_modify_expr is invoked on an expression that modifies only
>> > > part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
>> > > of an assignment and the SRA replacement's type is not compatible with
>> > > what is being replaced (0th operand of the B_F_R in the above
>> > > example), it does not work properly, basically throwing away the partd
>> > > of the expr that should have stayed intact.
>> > > 
>> > > This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
>> > > binary image of the replacement (and so in a way serve as a
>> > > VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
>> > > REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR under
>> > > the complex partial access expression, which is OK even in a LHS of an
>> > > assignment (and other write contexts) because in those cases the
>> > > replacement is not going to be a giple register anyway.
>> > > 
>> > > The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
>> > > fragile because SRA prefers complex and vector types over anything else
>> > > (and in between the two it decides based on TYPE_UID which in my testing
>> > > today always preferred complex types) and even when GIMPLE is wrong I
>> > > often still did not get failing runs, so I only run it at -O1 (which is
>> > > the only level where the the test fails for me).
>> > > 
>> > > Bootstrapped and tested on x86_64-linux, bootstrap and testing on
>> > > i686-linux and aarch64-linux underway.
>> > > 
>> > > OK for trunk (and subsequently for release branches) if it passes?
>> > 
>> > OK.
>> 
>> I must have been already half asleep when writing that it passed
>> testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
>> even on x86_64, because fwprop happily combines
> Yea, my tester complained about that on msp430-elf as well.  There's other 
> recent
> failures on msp430, ft32 and h8300 that I'm digging into now.
>> 

I did not commit the patch.  I waited to look at test results from other
platforms and saw the failure there and so did not proceed.  So unless
you tested it independently, it's not caused by my patch.

Martin

Re: [PATCH] sra: Fix sra_modify_expr handling of partial writes (PR 94482)

2020-04-07 Thread Jeff Law via Gcc-patches

On Tue, 2020-04-07 at 21:31 +0200, Martin Jambor wrote:
> Hi Jeff,
> 
> On Tue, Apr 07 2020, Jeff Law wrote:
> > On Tue, 2020-04-07 at 18:25 +0200, Martin Jambor wrote:
> > > Hi,
> > > 
> > > On Tue, Apr 07 2020, Richard Biener wrote:
> > > > On Tue, 7 Apr 2020, Martin Jambor wrote:
> > > > 
> > > > > Hi,
> > > > > 
> > > > > when sra_modify_expr is invoked on an expression that modifies only
> > > > > part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
> > > > > of an assignment and the SRA replacement's type is not compatible with
> > > > > what is being replaced (0th operand of the B_F_R in the above
> > > > > example), it does not work properly, basically throwing away the partd
> > > > > of the expr that should have stayed intact.
> > > > > 
> > > > > This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
> > > > > binary image of the replacement (and so in a way serve as a
> > > > > VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
> > > > > REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR under
> > > > > the complex partial access expression, which is OK even in a LHS of an
> > > > > assignment (and other write contexts) because in those cases the
> > > > > replacement is not going to be a giple register anyway.
> > > > > 
> > > > > The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
> > > > > fragile because SRA prefers complex and vector types over anything 
> > > > > else
> > > > > (and in between the two it decides based on TYPE_UID which in my
> > > > > testing
> > > > > today always preferred complex types) and even when GIMPLE is wrong I
> > > > > often still did not get failing runs, so I only run it at -O1 (which 
> > > > > is
> > > > > the only level where the the test fails for me).
> > > > > 
> > > > > Bootstrapped and tested on x86_64-linux, bootstrap and testing on
> > > > > i686-linux and aarch64-linux underway.
> > > > > 
> > > > > OK for trunk (and subsequently for release branches) if it passes?
> > > > 
> > > > OK.
> > > 
> > > I must have been already half asleep when writing that it passed
> > > testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
> > > even on x86_64, because fwprop happily combines
> > Yea, my tester complained about that on msp430-elf as well.  There's other
> > recent
> > failures on msp430, ft32 and h8300 that I'm digging into now.
> 
> I did not commit the patch.  I waited to look at test results from other
> platforms and saw the failure there and so did not proceed.  So unless
> you tested it independently, it's not caused by my patch.
ohhh, interesting, no I did not test that independently...  I'll put it back in
the queue of things to investigate.

jeff
>

Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons

2020-04-07 Thread Segher Boessenkool

Hi!

On Tue, Apr 07, 2020 at 10:52:10AM +0100, Richard Earnshaw (lists) wrote:
> On 06/04/2020 12:19, Richard Sandiford wrote:
> > "Richard Earnshaw (lists)"  writes:
> >> Yes.  It surprised me, when doing the aarch32 version, just how often
> >> the mid-end parts of the compiler were able to reason about parts of the
> >> parallel insn and optimize things accordingly (eg propagating the truth
> >> of the comparison).  If you use an unspec that can never happen.
> > 
> > That could be changed though.  E.g. we could add something like a
> > simplify_unspec target hook if this becomes a problem (either here
> > or for other unspecs).  A fancy implementation could even use
> > match.pd-style rules in the .md file.
> 
> I really don't like that.  It sounds like the top of a long slippery
> slope.  What about all the other cases where the RTL is comprehended by
> the mid-end?

Same here.  And the interesting transforms are not done in simplify-rtx
anyway, but in combine: simplify-rtx should only ever make a *simplified*
representation, while combine makes a (single!) choice that works out the
best in practice.

You do need a few separate patterns (for reg, 0, pos, -1, other neg, for
example), but then everything automatically works.  The canonical way to
write these insns is different for different constants, that is all.

Segher

Re: [PATCH] reject scalar array initialization with nullptr [PR94510]

2020-04-07 Thread Marek Polacek via Gcc-patches

On Tue, Apr 07, 2020 at 12:50:48PM -0600, Martin Sebor via Gcc-patches wrote:
> Among the numerous regressions introduced by the change committed
> to GCC 9 to allow string literals as template arguments is a failure
> to recognize the C++ nullptr and GCC's __null constants as pointers.
> For one, I didn't realize that nullptr, being a null pointer constant,
> doesn't have a pointer type, and two, I didn't think of __null (which
> is a special integer constant that NULL sometimes expands to).
> 
> The attached patch adjusts the special handling of trailing zero
> initializers in reshape_init_array_1 to recognize both kinds of
> constants and avoid treating them as zeros of the array integer
> element type.  This restores the expected diagnostics when either
> constant is used in the initializer list.
> 
> Martin

> PR c++/94510 - nullptr_t implicitly cast to zero twice in std::array
> 
> gcc/cp/ChangeLog:
> 
>   PR c++/94510
>   * decl.c (reshape_init_array_1): Exclude mismatches with all kinds
>   of pointers.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c++/94510
>   * g++.dg/init/array57.C: New test.
>   * g++.dg/init/array58.C: New test.
> 
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index a127734af69..692c8ed73f4 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -6041,9 +6041,14 @@ reshape_init_array_1 (tree elt_type, tree max_index, 
> reshape_iter *d,
>   TREE_CONSTANT (new_init) = false;
>  
>/* Pointers initialized to strings must be treated as non-zero
> -  even if the string is empty.  */
> +  even if the string is empty.  Handle all kinds of pointers,
> +  including std::nullptr and GCC's __nullptr, neither of which
> +  has a pointer type.  */
>tree init_type = TREE_TYPE (elt_init);
> -  if (POINTER_TYPE_P (elt_type) != POINTER_TYPE_P (init_type)
> +  bool init_is_ptr = (POINTER_TYPE_P (init_type)
> +   || NULLPTR_TYPE_P (init_type)
> +   || null_node_p (elt_init));
> +  if (POINTER_TYPE_P (elt_type) != init_is_ptr
> || !type_initializer_zero_p (elt_type, elt_init))
>   last_nonzero = index;

It looks like this still won't handle e.g. pointers to member functions,
e.g.

struct S { };
int arr[3] = { (void (S::*) ()) 0, 0, 0 };

would still be accepted.  You could use TYPE_PTR_OR_PTRMEM_P instead of
POINTER_TYPE_P to catch this case.

Marek

Backports to 9.4

2020-04-07 Thread Jakub Jelinek via Gcc-patches

Hi!

I've backported following 21 commits from trunk to 9.4,
bootstrapped/regtested on x86_64-linux and i686-linux and committed to 9
branch.

Jakub
>From 980a7a0be5a114e285c49ab05ac70881e4f27fc3 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek 
Date: Tue, 17 Mar 2020 21:21:16 +0100
Subject: [PATCH] c++: Fix parsing of invalid enum specifiers [PR90995]

The testcase shows some accepts-invalid (the ones without alignas) and
ice-on-invalid-code (the ones with alignas) cases.
If the enum doesn't have an underlying type and is not a definition,
the caller retries to parse it as elaborated type specifier.
E.g. for enum struct S s it will then pedwarn that elaborated type specifier
shouldn't have the struct/class keywords.
The problem is if the enum specifier is not followed by { when it has
underlying type.  In that case we have already called
cp_parser_parse_definitely to end the tentative parsing started at the
beginning of cp_parser_enum_specifier.  But the
cp_parser_error (parser, "expected %<;%> or %<{%>");
doesn't emit any error because the whole function is called from yet another
tentative parse and the caller starts parsing the elaborated type
specifier where the cp_parser_enum_specifier stopped (i.e. after the
underlying type token(s)).  The ultimate caller than commits the tentative
parsing (and even if it wouldn't, it wouldn't know what kind of error
to report).  I think after seeing enum {,struct,class} : type not being
followed by { or ;, there is no reason not to report it right away, as it
can't be valid C++, which is what the patch does.  Not sure if we shouldn't
also return error_mark_node instead of NULL_TREE, so that the caller doesn't
try to parse it as elaborated type specifier (the patch doesn't do that
right now).

Furthermore, while reading the code, I've noticed that
parser->colon_corrects_to_scope_p is saved and set to false at the start
of the function, but not restored back in some cases.  Don't have a testcase
where this would be a problem, but it just seems wrong.  Either we can in
the two spots replace return NULL_TREE; with { type = NULL_TREE; goto out; }
or we could perhaps abuse warning_sentinel or create a special class with
dtor to clean the flag up.

And lastly, I've fixed some formatting issues in the function while reading
it.

2020-03-17  Jakub Jelinek  

PR c++/90995
* parser.c (cp_parser_enum_specifier): Use temp_override for
parser->colon_corrects_to_scope_p, replace goto out with return.
If scoped enum or enum with underlying type is not followed by
{ or ;, call cp_parser_commit_to_tentative_parse before calling
cp_parser_error and make sure to return error_mark_node instead of
NULL_TREE.  Formatting fixes.

* g++.dg/cpp0x/enum40.C: New test.
---
 gcc/cp/ChangeLog| 13 
 gcc/cp/parser.c | 52 -
 gcc/testsuite/ChangeLog |  8 +
 gcc/testsuite/g++.dg/cpp0x/enum40.C | 26 +++
 4 files changed, 68 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/enum40.C

diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index 2e247069b96..8afe6aca339 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,16 @@
+2020-04-07  Jakub Jelinek  
+
+   Backported from mainline
+   2020-03-17  Jakub Jelinek  
+
+   PR c++/90995
+   * parser.c (cp_parser_enum_specifier): Use temp_override for
+   parser->colon_corrects_to_scope_p, replace goto out with return.
+   If scoped enum or enum with underlying type is not followed by
+   { or ;, call cp_parser_commit_to_tentative_parse before calling
+   cp_parser_error and make sure to return error_mark_node instead of
+   NULL_TREE.  Formatting fixes.
+
 2020-04-05  Marek Polacek  

2020-02-06  Marek Polacek  
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index e1c02d7b718..0219920be8f 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -18705,9 +18705,7 @@ cp_parser_enum_specifier (cp_parser* parser)
   bool is_unnamed = false;
   tree underlying_type = NULL_TREE;
   cp_token *type_start_token = NULL;
-  bool saved_colon_corrects_to_scope_p = parser->colon_corrects_to_scope_p;
-
-  parser->colon_corrects_to_scope_p = false;
+  temp_override cleanup (parser->colon_corrects_to_scope_p, false);

   /* Parse tentatively so that we can back up if we don't find a
  enum-specifier.  */
@@ -18747,24 +18745,24 @@ cp_parser_enum_specifier (cp_parser* parser)

   push_deferring_access_checks (dk_no_check);
   nested_name_specifier
-  = cp_parser_nested_name_specifier_opt (parser,
-/*typename_keyword_p=*/true,
-/*check_dependency_p=*/false,
-/*type_p=*/false,
-/*is_declaration=*/false);
+= cp_parser_nested_name_specifier_opt (pa

Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons

2020-04-07 Thread Segher Boessenkool

On Mon, Apr 06, 2020 at 12:19:42PM +0100, Richard Sandiford wrote:
> The reason I'm not keen on using special modes for this case is that
> they'd describe one way in which the result can be used rather than
> describing what the instruction actually does.  The instruction really
> does set all four flags to useful values.  The "problem" is that they're
> not the values associated with a compare between two values, so representing
> them that way will always lose information.

CC modes describe how the flags are set, not how the flags are used.
You cannot easily describe the V bit setting with a compare (it needs
a mode bigger than the register), is that what you mean?


Segher

Re: [PATCH] c++: Fix ICE in tsubst_default_argument [PR92010]

2020-04-07 Thread Patrick Palka via Gcc-patches

On Tue, 7 Apr 2020, Patrick Palka wrote:
> On Mon, 6 Apr 2020, Jason Merrill wrote:
> > On 4/6/20 11:45 AM, Patrick Palka wrote:
> > > On Wed, 1 Apr 2020, Jason Merrill wrote:
> > > 
> > > > On 4/1/20 6:29 PM, Jason Merrill wrote:
> > > > > On 3/31/20 3:50 PM, Patrick Palka wrote:
> > > > > > On Tue, 31 Mar 2020, Jason Merrill wrote:
> > > > > > 
> > > > > > > On 3/30/20 6:46 PM, Patrick Palka wrote:
> > > > > > > > On Mon, 30 Mar 2020, Jason Merrill wrote:
> > > > > > > > > On 3/30/20 3:58 PM, Patrick Palka wrote:
> > > > > > > > > > On Thu, 26 Mar 2020, Jason Merrill wrote:
> > > > > > > > > > 
> > > > > > > > > > > On 3/22/20 9:21 PM, Patrick Palka wrote:
> > > > > > > > > > > > This patch relaxes an assertion in 
> > > > > > > > > > > > tsubst_default_argument
> > > > > > > > > > > > that
> > > > > > > > > > > > exposes
> > > > > > > > > > > > a
> > > > > > > > > > > > latent
> > > > > > > > > > > > bug in how we substitute an array type into a 
> > > > > > > > > > > > cv-qualified
> > > > > > > > > > > > wildcard
> > > > > > > > > > > > function
> > > > > > > > > > > > parameter type.  Concretely, the latent bug is that 
> > > > > > > > > > > > given
> > > > > > > > > > > > the
> > > > > > > > > > > > function
> > > > > > > > > > > > template
> > > > > > > > > > > > 
> > > > > > > > > > > >    template void foo(const T t);
> > > > > > > > > > > > 
> > > > > > > > > > > > one would expect the type of foo to be void(const
> > > > > > > > > > > > int*), but
> > > > > > > > > > > > we
> > > > > > > > > > > > (seemingly prematurely) strip function parameter types 
> > > > > > > > > > > > of
> > > > > > > > > > > > their
> > > > > > > > > > > > top-level
> > > > > > > > > > > > cv-qualifiers when building the function's 
> > > > > > > > > > > > TYPE_ARG_TYPES,
> > > > > > > > > > > > and
> > > > > > > > > > > > instead
> > > > > > > > > > > > end
> > > > > > > > > > > > up
> > > > > > > > > > > > obtaining void(int*) as the type of foo after
> > > > > > > > > > > > substitution
> > > > > > > > > > > > and
> > > > > > > > > > > > decaying.
> > > > > > > > > > > > 
> > > > > > > > > > > > We still however correctly substitute into and decay the
> > > > > > > > > > > > formal
> > > > > > > > > > > > parameter
> > > > > > > > > > > > type,
> > > > > > > > > > > > obtaining const int* as the type of t after 
> > > > > > > > > > > > substitution. 
> > > > > > > > > > > > But
> > > > > > > > > > > > this
> > > > > > > > > > > > then
> > > > > > > > > > > > leads
> > > > > > > > > > > > to us tripping over the assert in 
> > > > > > > > > > > > tsubst_default_argument
> > > > > > > > > > > > that
> > > > > > > > > > > > verifies
> > > > > > > > > > > > the
> > > > > > > > > > > > formal parameter type and the function type are
> > > > > > > > > > > > consistent.
> > > > > > > > > > > > 
> > > > > > > > > > > > Assuming it's too late at this stage to fix the
> > > > > > > > > > > > substitution
> > > > > > > > > > > > bug, we
> > > > > > > > > > > > can
> > > > > > > > > > > > still
> > > > > > > > > > > > relax the assertion like so.  Tested on
> > > > > > > > > > > > x86_64-pc-linux-gnu,
> > > > > > > > > > > > does
> > > > > > > > > > > > this
> > > > > > > > > > > > look
> > > > > > > > > > > > OK?
> > > > > > > > > > > 
> > > > > > > > > > > This is core issues 1001/1322, which have not been 
> > > > > > > > > > > resolved.
> > > > > > > > > > > Clang
> > > > > > > > > > > does
> > > > > > > > > > > the
> > > > > > > > > > > substitution the way you suggest; EDG rejects the testcase
> > > > > > > > > > > because the
> > > > > > > > > > > two
> > > > > > > > > > > substitutions produce different results.  I think it would
> > > > > > > > > > > make
> > > > > > > > > > > sense
> > > > > > > > > > > to
> > > > > > > > > > > follow the EDG behavior until this issue is actually
> > > > > > > > > > > resolved.
> > > > > > > > > > 
> > > > > > > > > > Here is what I have so far towards that end.  When
> > > > > > > > > > substituting
> > > > > > > > > > into the
> > > > > > > > > > PARM_DECLs of a function decl, we now additionally check if
> > > > > > > > > > the
> > > > > > > > > > aforementioned Core issues are relevant and issue a (fatal)
> > > > > > > > > > diagnostic
> > > > > > > > > > if so.  This patch checks this in tsubst_decl  > > > > > > > > > PARM_DECL>
> > > > > > > > > > rather
> > > > > > > > > > than in tsubst_function_decl for efficiency reasons, so that
> > > > > > > > > > we
> > > > > > > > > > don't
> > > > > > > > > > have to perform another traversal over the DECL_ARGUMENTS /
> > > > > > > > > > TYPE_ARG_TYPES just to implement this check.
> > > > > > > > > 
> > > > > > > > > Hmm, this seems like writing more complicated code for a very
> > > > > > > > > marginal
> > > > > > > > > optimization; how many function templates have so many
> > > > > > > > > parameters
> > > > > > > > > that
> > > > > > > > > walking
> > > > > > > > > over them once to compare types will have any effect on 
> > >

[RFC, Instruction Scheduler, Stage1] New hook/code to perform fusion of dependent instructions

2020-04-07 Thread Pat Haugen via Gcc-patches

The Power processor has the ability to fuse certain pairs of dependent
instructions to improve their performance if they appear back-to-back in
the instruction stream. In looking at the current support for
instruction fusion in GCC I saw the following 2 options.

1) TARGET_SCHED_MACRO_FUSION target hooks: Only looks at existing
back-to-back instructions and will ensure the scheduler keeps them together.

2) -fsched-fusion/TARGET_SCHED_FUSION_PRIORITY: Runs as a separate
scheduling pass before peephole2. Operates independently on a single
insn. Used by ARM backend to assign higher priorities to base/disp loads
and stores so that the scheduling pass will schedule loads/stores to
adjacent memory back-to-back. Later these insns will be transformed into
load/store pair insns.

Neither of these work for Power's purpose because they don't deal with
fusion of dependent insns that may not already be back-to-back. The
TARGET_SCHED_REORDER[2] hooks also don't work since the dependent insn
more than likely gets queued for N cycles so wouldn't be on the ready
list for the reorder hooks to process. We want the ability for the
scheduler to schedule dependent insn pairs back-to-back when possible
(i.e. other dependencies of both insns have been satisfied).

I have coded up a proof of concept that implements our needs via a new
target hook. The hook is passed a pair of dependent insns and returns if
they are a fusion candidate. It is called while removing the forward
dependencies of the just scheduled insn. If a dependent insn becomes
available to schedule and it's a fusion candidate with the just
scheduled insn, then the new code moves it to the ready list (if
necessary) and marks it as SCHED_GROUP (piggy-backing on the existing
code used by TARGET_SCHED_MACRO_FUSION) to make sure the fusion
candidate will be scheduled next. Following is the scheduling part of
the diff. Does this sound like a feasible approach? I welcome any
comments/discussion.

Thanks,
Pat


diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 80687fb5359..7a62136d497 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -4152,6 +4152,39 @@ schedule_insn (rtx_insn *insn)
  && SCHED_GROUP_P (next)
  && advance < effective_cost)
advance = effective_cost;
+
+ /* If all dependencies of this insn have now been resolved and the
+just scheduled insn was not part of a SCHED_GROUP, check if
+this dependent insn can be fused with the just scheduled insn.  */
+ else if (effective_cost >= 0 && !SCHED_GROUP_P (insn)
+  && targetm.sched.dep_fusion
+  && targetm.sched.dep_fusion (insn, next))
+   {
+ /* Move to ready list if necessary.  */
+ if (effective_cost > 0)
+   {
+ queue_remove (next);
+ ready_add (&ready, next, true);
+   }
+
+ /* Mark as sched_group.  */
+ SCHED_GROUP_P (next) = 1;
+
+ /* Fix insn_tick.  */
+ INSN_TICK (next) = INSN_TICK (insn);
+
+ /* Dump some debug output for success.  */
+ if (sched_verbose >= 5)
+   {
+ fprintf (sched_dump, ";;\t\tFusing dependent insns: ");
+ fprintf (sched_dump, "%4d %-30s --> ", INSN_UID (insn),
+  str_pattern_slim (PATTERN (insn)));
+ fprintf (sched_dump, "%4d %-30s\n", INSN_UID (next),
+  str_pattern_slim (PATTERN (next)));
+   }
+   }
}
   else
/* Check always has only one forward dependence (to the first insn in

Re: [PATCH] reject scalar array initialization with nullptr [PR94510]

2020-04-07 Thread Martin Sebor via Gcc-patches


On 4/7/20 1:50 PM, Marek Polacek wrote:

On Tue, Apr 07, 2020 at 12:50:48PM -0600, Martin Sebor via Gcc-patches wrote:

Among the numerous regressions introduced by the change committed
to GCC 9 to allow string literals as template arguments is a failure
to recognize the C++ nullptr and GCC's __null constants as pointers.
For one, I didn't realize that nullptr, being a null pointer constant,
doesn't have a pointer type, and two, I didn't think of __null (which
is a special integer constant that NULL sometimes expands to).

The attached patch adjusts the special handling of trailing zero
initializers in reshape_init_array_1 to recognize both kinds of
constants and avoid treating them as zeros of the array integer
element type.  This restores the expected diagnostics when either
constant is used in the initializer list.

Martin



PR c++/94510 - nullptr_t implicitly cast to zero twice in std::array

gcc/cp/ChangeLog:

PR c++/94510
* decl.c (reshape_init_array_1): Exclude mismatches with all kinds
of pointers.

gcc/testsuite/ChangeLog:

PR c++/94510
* g++.dg/init/array57.C: New test.
* g++.dg/init/array58.C: New test.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index a127734af69..692c8ed73f4 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6041,9 +6041,14 @@ reshape_init_array_1 (tree elt_type, tree max_index, 
reshape_iter *d,
TREE_CONSTANT (new_init) = false;
  
/* Pointers initialized to strings must be treated as non-zero

-even if the string is empty.  */
+even if the string is empty.  Handle all kinds of pointers,
+including std::nullptr and GCC's __nullptr, neither of which
+has a pointer type.  */
tree init_type = TREE_TYPE (elt_init);
-  if (POINTER_TYPE_P (elt_type) != POINTER_TYPE_P (init_type)
+  bool init_is_ptr = (POINTER_TYPE_P (init_type)
+ || NULLPTR_TYPE_P (init_type)
+ || null_node_p (elt_init));
+  if (POINTER_TYPE_P (elt_type) != init_is_ptr
  || !type_initializer_zero_p (elt_type, elt_init))
last_nonzero = index;


It looks like this still won't handle e.g. pointers to member functions,
e.g.

struct S { };
int arr[3] = { (void (S::*) ()) 0, 0, 0 };

would still be accepted.  You could use TYPE_PTR_OR_PTRMEM_P instead of
POINTER_TYPE_P to catch this case.


Good catch!  That doesn't fail because unlike null data member pointers
which are represented as -1, member function pointers are represented
as a zero.

I had looked for an API that would answer the question: "is this
expression a pointer?" without having to think of all the different
kinds of them but all I could find was null_node_p().  Is this a rare,
isolated case that having an API like that wouldn't be worth having
or should I add one like in the attached update?

Martin
PR c++/94510 - nullptr_t implicitly cast to zero twice in std::array

gcc/cp/ChangeLog:

	PR c++/94510
	* decl.c (reshape_init_array_1): Exclude mismatches with all kinds
	of pointers.
	* gcc/cp/cp-tree.h (null_pointer_constant_p): New function.

gcc/testsuite/ChangeLog:

	PR c++/94510
	* g++.dg/init/array57.C: New test.
	* g++.dg/init/array58.C: New test.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 63aaf615926..9ec6e3883c8 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8022,6 +8022,22 @@ null_node_p (const_tree expr)
   return expr == null_node;
 }
 
+/* Returns true if EXPR is a null pointer constant of any type.  */
+
+inline bool
+null_pointer_constant_p (tree expr)
+{
+  STRIP_ANY_LOCATION_WRAPPER (expr);
+  if (expr == null_node)
+return true;
+  tree type = TREE_TYPE (expr);
+  if (NULLPTR_TYPE_P (type))
+return true;
+  if (POINTER_TYPE_P (type))
+return integer_zerop (expr);
+  return null_member_pointer_value_p (expr);
+}
+
 /* True iff T is a variable template declaration. */
 inline bool
 variable_template_p (tree t)
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index a127734af69..5cf5b601d29 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6041,9 +6041,13 @@ reshape_init_array_1 (tree elt_type, tree max_index, reshape_iter *d,
 	TREE_CONSTANT (new_init) = false;
 
   /* Pointers initialized to strings must be treated as non-zero
-	 even if the string is empty.  */
+	 even if the string is empty.  Handle all kinds of pointers,
+	 including std::nullptr and GCC's __nullptr, neither of which
+	 has a pointer type.  */
   tree init_type = TREE_TYPE (elt_init);
-  if (POINTER_TYPE_P (elt_type) != POINTER_TYPE_P (init_type)
+  bool init_is_ptr = (POINTER_TYPE_P (init_type)
+			  || null_pointer_constant_p (elt_init));
+  if (POINTER_TYPE_P (elt_type) != init_is_ptr
 	  || !type_initializer_zero_p (elt_type, elt_init))
 	last_nonzero = index;
 
diff --git a/gcc/testsuite/g++.dg/init/array57.C b/gcc/testsuite/g++.dg/init/array57.C
new file mode 100644
index 000..70e86445c07
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ini

[pushed] c++: ICE on invalid concept placeholder [PR94481].

2020-04-07 Thread Jason Merrill via Gcc-patches

Here the 'decltype' is missing '(auto)', so open_paren was NULL, and trying
to get its location is a SEGV.  Using matching_parens avoids that problem.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-04-07  Jason Merrill  

PR c++/94481
* parser.c (cp_parser_placeholder_type_specifier): Use
matching_parens.
---
 gcc/cp/parser.c| 12 +---
 gcc/testsuite/g++.dg/cpp2a/concepts-placeholder2.C |  9 +
 2 files changed, 14 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-placeholder2.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index fbcdc9bb5fc..2c33ca4dd16 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -18367,7 +18367,7 @@ cp_parser_placeholder_type_specifier (cp_parser 
*parser, location_t loc,
 
   /* As per the standard, require auto or decltype(auto), except in some
  cases (template parameter lists, -fconcepts-ts enabled).  */
-  cp_token *placeholder = NULL, *open_paren = NULL, *close_paren = NULL;
+  cp_token *placeholder = NULL, *close_paren = NULL;
   if (cxx_dialect >= cxx2a)
 {
   if (cp_lexer_next_token_is_keyword (parser->lexer, RID_AUTO))
@@ -18375,12 +18375,10 @@ cp_parser_placeholder_type_specifier (cp_parser 
*parser, location_t loc,
   else if (cp_lexer_next_token_is_keyword (parser->lexer, RID_DECLTYPE))
{
  placeholder = cp_lexer_consume_token (parser->lexer);
- open_paren = cp_parser_require (parser, CPP_OPEN_PAREN,
- RT_OPEN_PAREN);
+ matching_parens parens;
+ parens.require_open (parser);
  cp_parser_require_keyword (parser, RID_AUTO, RT_AUTO);
-  close_paren = cp_parser_require (parser, CPP_CLOSE_PAREN,
-  RT_CLOSE_PAREN,
-  open_paren->location);
+ close_paren = parens.require_close (parser);
}
 }
 
@@ -18429,7 +18427,7 @@ cp_parser_placeholder_type_specifier (cp_parser 
*parser, location_t loc,
  results in an invented template parameter.  */
   if (parser->auto_is_implicit_function_template_parm_p)
 {
-  if (placeholder && token_is_decltype (placeholder))
+  if (close_paren)
{
  location_t loc = make_location (placeholder->location,
  placeholder->location,
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-placeholder2.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-placeholder2.C
new file mode 100644
index 000..b04354cf8d0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-placeholder2.C
@@ -0,0 +1,9 @@
+// PR c++/94481
+// { dg-do compile { target c++2a } }
+
+template
+concept C = true;
+
+void foo() {
+  C decltype c = 1;// { dg-error "" }
+}

base-commit: c23c899aedf11069e992eed7358802b262d62f98
-- 
2.18.1

Re: [PATCH][PPC64] [PR88877]

2020-04-07 Thread Segher Boessenkool

On Mon, Apr 06, 2020 at 11:46:17AM +0530, kamlesh kumar wrote:
> Segher,
> Please provide your suggestion/thought on the fix.

Why me?

I cannot approve almost all of this patch.  It is huge.  It is stage 1
so patches like this should not be submitted at all now (you can ask for
people to look at it, etc., but you didn't).

Segher

Re: [PATCH] rs6000: Save/restore r31 if frame_pointer_needed is true

2020-04-07 Thread Segher Boessenkool

Hi!

On Mon, Mar 30, 2020 at 12:18:59PM +0800, luoxhu wrote:
> >>>If df_regs_ever_live_p(31) is false there is no hard frame pointer
> >>>anywhere in the program.  How can it be used then?
> >>
> >>There is a piece of code emit move instruction to r31 even 
> >>df_regs_ever_live_p(31) is false
> >>in pro_and_epilogue.
> >
> >Can you point out where (in rs6000-logue.c ot similar)?  We should fix
> >*that*.

> >frame_pointer_needed is often true when the backend can figure out we do
> >not actually need it.

> >(so just that "if" clause changes), it'll all be fine.  Could you test
> >that please?
> 
> Tried with below patch, it still fails at same place, I guess this is not 
> enough.
> The instruction in 100dc034 change r31 from 0x105fdd70 to 0x7fffbbf0 
> and didn't
> restore it before return. 

So where was *that* insn generated, then?


Segher

Re: [PATCH] c++: Fix ICE in tsubst_default_argument [PR92010]

2020-04-07 Thread Jason Merrill via Gcc-patches


On 4/7/20 1:40 PM, Patrick Palka wrote:

On Mon, 6 Apr 2020, Jason Merrill wrote:

On 4/6/20 11:45 AM, Patrick Palka wrote:

On Wed, 1 Apr 2020, Jason Merrill wrote:


On 4/1/20 6:29 PM, Jason Merrill wrote:

On 3/31/20 3:50 PM, Patrick Palka wrote:

On Tue, 31 Mar 2020, Jason Merrill wrote:


On 3/30/20 6:46 PM, Patrick Palka wrote:

On Mon, 30 Mar 2020, Jason Merrill wrote:

On 3/30/20 3:58 PM, Patrick Palka wrote:

On Thu, 26 Mar 2020, Jason Merrill wrote:


On 3/22/20 9:21 PM, Patrick Palka wrote:

This patch relaxes an assertion in tsubst_default_argument
that
exposes
a
latent
bug in how we substitute an array type into a cv-qualified
wildcard
function
parameter type.  Concretely, the latent bug is that given
the
function
template

    template void foo(const T t);

one would expect the type of foo to be void(const
int*), but
we
(seemingly prematurely) strip function parameter types of
their
top-level
cv-qualifiers when building the function's TYPE_ARG_TYPES,
and
instead
end
up
obtaining void(int*) as the type of foo after
substitution
and
decaying.

We still however correctly substitute into and decay the
formal
parameter
type,
obtaining const int* as the type of t after substitution.
But
this
then
leads
to us tripping over the assert in tsubst_default_argument
that
verifies
the
formal parameter type and the function type are
consistent.

Assuming it's too late at this stage to fix the
substitution
bug, we
can
still
relax the assertion like so.  Tested on
x86_64-pc-linux-gnu,
does
this
look
OK?


This is core issues 1001/1322, which have not been resolved.
Clang
does
the
substitution the way you suggest; EDG rejects the testcase
because the
two
substitutions produce different results.  I think it would
make
sense
to
follow the EDG behavior until this issue is actually
resolved.


Here is what I have so far towards that end.  When
substituting
into the
PARM_DECLs of a function decl, we now additionally check if
the
aforementioned Core issues are relevant and issue a (fatal)
diagnostic
if so.  This patch checks this in tsubst_decl 
rather
than in tsubst_function_decl for efficiency reasons, so that
we
don't
have to perform another traversal over the DECL_ARGUMENTS /
TYPE_ARG_TYPES just to implement this check.


Hmm, this seems like writing more complicated code for a very
marginal
optimization; how many function templates have so many
parameters
that
walking
over them once to compare types will have any effect on compile
time?


Good point... though I just tried implementing this check in
tsubst_function_decl, and it seems it might be just as complicated
to
implement it there instead, at least if we want to handle function
parameter packs correctly.

If we were to implement this check in tsubst_function_decl, then
since
we have access to the instantiated function, it would presumably
suffice
to compare its substituted DECL_ARGUMENTS with its substituted
TYPE_ARG_TYPES to see if they're consistent.  Doing so would
certainly
catch the original testcase, i.e.

      template
    void foo(const T);
      int main() { foo(0); }

because the DECL_ARGUMENTS of foo would be {const int*} and
its
TYPE_ARG_TYPES would be {int*}.  But apparently it doesn't catch
the
corresponding testcase that uses a function parameter pack, i.e.

      template
    void foo(const Ts...);
      int main() { foo(0); }

because it turns out we don't strip top-level cv-qualifiers from
function parameter packs from TYPE_ARG_TYPES at declaration time,
as
we
do with regular function parameters.  So in this second testcase
both
DECL_ARGUMENTS and TYPE_ARG_TYPES of foo would be {const
int*},
and yet we would (presumably) want to reject this instantiation
too.

So it seems comparing TYPE_ARG_TYPES and DECL_ARGUMENTS from
tsubst_function_decl would not suffice, and we would still need to
do
a
variant of the trick that's done in this patch, i.e. substitute
into
each dependent parameter type stripped of its top-level
cv-qualifiers,
to see if these cv-qualifiers make a material difference in the
resulting function type.  Or maybe there's yet another way to
detect
this?


I think let's go ahead with comparing TYPE_ARG_TYPES and
DECL_ARGUMENTS;
the
problem comes when they disagree.  If we're handling pack expansions
wrong,
that's a separate issue.


Hm, comparing TYPE_ARG_TYPES and DECL_ARGUMENTS for compatibility
seems
to be exposing a latent bug with how we handle lambdas that appear in
function parameter types.  Take g++.dg/cpp2a/lambda-uneval3.C for
example:

   template  void spam(decltype([]{}) (*s)[sizeof(T)]) {}
   int main() { spam(nullptr); }

According to tsubst_function_decl in current trunk, the type of the
function paremeter 's' of spam according to its TYPE_ARG_TYPES
is
   struct ._anon_4[1] *
and according to its DECL_ARGUMENTS the type of 's' is
   struct ._anon_5[1] *

The disagreement happens because we call tsubst_lambda_expr twice
during
substitution and thereby generate two distinct lambda type

Re: [PATCH] reject scalar array initialization with nullptr [PR94510]

2020-04-07 Thread Marek Polacek via Gcc-patches

On Tue, Apr 07, 2020 at 02:46:52PM -0600, Martin Sebor wrote:
> On 4/7/20 1:50 PM, Marek Polacek wrote:
> > On Tue, Apr 07, 2020 at 12:50:48PM -0600, Martin Sebor via Gcc-patches 
> > wrote:
> > > Among the numerous regressions introduced by the change committed
> > > to GCC 9 to allow string literals as template arguments is a failure
> > > to recognize the C++ nullptr and GCC's __null constants as pointers.
> > > For one, I didn't realize that nullptr, being a null pointer constant,
> > > doesn't have a pointer type, and two, I didn't think of __null (which
> > > is a special integer constant that NULL sometimes expands to).
> > > 
> > > The attached patch adjusts the special handling of trailing zero
> > > initializers in reshape_init_array_1 to recognize both kinds of
> > > constants and avoid treating them as zeros of the array integer
> > > element type.  This restores the expected diagnostics when either
> > > constant is used in the initializer list.
> > > 
> > > Martin
> > 
> > > PR c++/94510 - nullptr_t implicitly cast to zero twice in std::array
> > > 
> > > gcc/cp/ChangeLog:
> > > 
> > >   PR c++/94510
> > >   * decl.c (reshape_init_array_1): Exclude mismatches with all kinds
> > >   of pointers.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >   PR c++/94510
> > >   * g++.dg/init/array57.C: New test.
> > >   * g++.dg/init/array58.C: New test.
> > > 
> > > diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> > > index a127734af69..692c8ed73f4 100644
> > > --- a/gcc/cp/decl.c
> > > +++ b/gcc/cp/decl.c
> > > @@ -6041,9 +6041,14 @@ reshape_init_array_1 (tree elt_type, tree 
> > > max_index, reshape_iter *d,
> > >   TREE_CONSTANT (new_init) = false;
> > > /* Pointers initialized to strings must be treated as non-zero
> > > -  even if the string is empty.  */
> > > +  even if the string is empty.  Handle all kinds of pointers,
> > > +  including std::nullptr and GCC's __nullptr, neither of which
> > > +  has a pointer type.  */
> > > tree init_type = TREE_TYPE (elt_init);
> > > -  if (POINTER_TYPE_P (elt_type) != POINTER_TYPE_P (init_type)
> > > +  bool init_is_ptr = (POINTER_TYPE_P (init_type)
> > > +   || NULLPTR_TYPE_P (init_type)
> > > +   || null_node_p (elt_init));
> > > +  if (POINTER_TYPE_P (elt_type) != init_is_ptr
> > > || !type_initializer_zero_p (elt_type, elt_init))
> > >   last_nonzero = index;
> > 
> > It looks like this still won't handle e.g. pointers to member functions,
> > e.g.
> > 
> > struct S { };
> > int arr[3] = { (void (S::*) ()) 0, 0, 0 };
> > 
> > would still be accepted.  You could use TYPE_PTR_OR_PTRMEM_P instead of
> > POINTER_TYPE_P to catch this case.
> 
> Good catch!  That doesn't fail because unlike null data member pointers
> which are represented as -1, member function pointers are represented
> as a zero.
> 
> I had looked for an API that would answer the question: "is this
> expression a pointer?" without having to think of all the different
> kinds of them but all I could find was null_node_p().  Is this a rare,
> isolated case that having an API like that wouldn't be worth having
> or should I add one like in the attached update?
> 
> Martin

> PR c++/94510 - nullptr_t implicitly cast to zero twice in std::array
> 
> gcc/cp/ChangeLog:
> 
>   PR c++/94510
>   * decl.c (reshape_init_array_1): Exclude mismatches with all kinds
>   of pointers.
>   * gcc/cp/cp-tree.h (null_pointer_constant_p): New function.

(Drop the gcc/cp/.)

> +/* Returns true if EXPR is a null pointer constant of any type.  */
> +
> +inline bool
> +null_pointer_constant_p (tree expr)
> +{
> +  STRIP_ANY_LOCATION_WRAPPER (expr);
> +  if (expr == null_node)
> +return true;
> +  tree type = TREE_TYPE (expr);
> +  if (NULLPTR_TYPE_P (type))
> +return true;
> +  if (POINTER_TYPE_P (type))
> +return integer_zerop (expr);
> +  return null_member_pointer_value_p (expr);
> +}
> +

We already have a null_ptr_cst_p so it would be sort of confusing to have
this as well.  But are you really interested in whether it's a null pointer,
not just a pointer?

Marek

Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons

2020-04-07 Thread Richard Henderson

On 4/7/20 1:27 PM, Segher Boessenkool wrote:
> On Mon, Apr 06, 2020 at 12:19:42PM +0100, Richard Sandiford wrote:
>> The reason I'm not keen on using special modes for this case is that
>> they'd describe one way in which the result can be used rather than
>> describing what the instruction actually does.  The instruction really
>> does set all four flags to useful values.  The "problem" is that they're
>> not the values associated with a compare between two values, so representing
>> them that way will always lose information.
> 
> CC modes describe how the flags are set, not how the flags are used.
> You cannot easily describe the V bit setting with a compare (it needs
> a mode bigger than the register), is that what you mean?

I think that is a good deal of the effort.

I wonder if it would be helpful to have

  (uoverflow_plus x y carry)
  (soverflow_plus x y carry)

etc.

(define_insn "uaddsi3_cout"
  [(set (reg:CC_C CC_REGNUM)
(uoverflow_plus:CC_C
  (match_operand:SI 1 "register_operand")
  (match_operand:SI 2 "plus_operand")
  (const_int 0)))
(set (match_operand:SI 0 "register_operand")
 (plus:SI (match_dup 1) (match_dup 2)))]
  ...
)

(define_insn "uaddsi4_cin_cout"
  [(set (reg:CC_C CC_REGNUM)
(uoverflow_plus:CC_C
  (match_operand:SI 1 "register_operand")
  (match_operand:SI 2 "reg_or_zero_operand")
  (match_operand:SI 3 "carry_operand")))
(set (match_operand:SI 0 "register_operand")
 (plus:SI
   (plus:SI (match_dup 3) (match_dup 1))
   (match_dup 2)))]
  ...
)

(define_insn "usubsi4_cin_cout"
  [(set (reg:CC_C CC_REGNUM)
(uoverflow_plus:CC_C
  (match_operand:SI 1 "register_operand")
  (not:SI (match_operand:SI 2 "reg_or_zero_operand"))
  (match_operand:SI 3 "carry_operand")))
(set (match_operand:SI 0 "register_operand")
 (minus:SI
   (minus:SI (match_dup 1) (match_dup 2))
   (match_operand:SI 4 "borrow_operand")))]
  ...
)

This does have the advantage of avoiding the extensions, so that constants can
be retained in the original mode.

Though of course if we go this way, there will be incentive to add
overflow codes for all __builtin_*_overflow_p.

r~

Re: [PATCH] rs6000: Don't split constant oprator when add, move to temp register for future optimization

2020-04-07 Thread Segher Boessenkool

Hi!

On Sat, Apr 04, 2020 at 10:21:00AM +1030, Alan Modra wrote:
> I have a series of small patches already, the most significant so far
> being the one that adds the following comment to rs6000_rtx_costs:
> 
>   "Calls from places like optabs.c:avoid_expensive_constant will come
>here with OUTER_CODE set to an operation such as AND with X being a
>CONST_INT or other CONSTANT_P type.  This will be compared against
>set_src_cost, where we'll come here with OUTER_CODE as SET and X
>the same constant.
> 
>Calls from places like default_noce_conversion_profitable_p will
>come here via seq_cost and pass the pattern of a SET insn in X.

That uses set_rtx_cost, which still doesn't use insn_cost, yes.
set_rtx_cost and set_src_cost are the two biggest things that still
need rtx_cost.

>The SET isn't handled here so *TOTAL will remain as
>COSTS_N_INSNS(1) multiplied by the number of words being set.
>Presuming the insn is valid and set_dest a reg, rs6000_rtx_costs
>will next see the set_src.  Continuing the example of an AND, this
>might be an AND of two other regs.  This AND should cost zero here
>since the insn cost has already been counted.  The overall cost
>value should be comparable to rs6000_insn_cost."

The biggest problem in converting it is that sometimes there is no valid
code yet, and constructing it isn't easy either, but some code still
wants to know the cost :-/

> It pays to figure out what you need to do before doing anything.  :-)

Yes :-)

> Those two paragraphs will result in quite a few changes.  The first
> one means that, for example, the rs6000_is_valid_and_mask test belongs
> under case CONST_INT as
> || (outer_code == AND
> && rs6000_is_valid_and_mask (x, mode))
> not under case AND.

Does it?  Isn't the rtx_cost logic inside-out enough already?

> The second paragraph says we are costing practically all operations
> too highly.

Another huge advantage of insn_cost.

> I'm still in analysis mode, worried about whether gcc generates deep
> rtl expressions to pass to rtx_cost.

Yes :-(

> I have a vague recollection of
> seeing that years ago, but it looks like most places with anything
> complex generate a sequence of insns and cost the sequence.

Sure, and anything not a single_set will use insn_cost, not set_rtx_cost,
and sanity prevails.

> If we do
> have expressions like (set (reg1) (and (plus (reg2) cst1) cst2)) then
> rs6000_rtx_cost should recognise that AND as costing an extra insn.

Well, what does such an expression *mean*, if there is no instruction
like that?  How should we estimate the cost if we do not know what insns
it will generate?


Segher

Re: [committed] cselib: Fix endless cselib loop on (plus:P (reg) (const_int 0))

2020-04-07 Thread Joseph Myers

This introduces an ICE building glibc for m68k (and the same ICE appears 
for microblaze, though I haven't bisected there).  See bug 94526.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [committed] cselib: Fix endless cselib loop on (plus:P (reg) (const_int 0))

2020-04-07 Thread Jeff Law via Gcc-patches

On Tue, 2020-04-07 at 22:35 +, Joseph Myers wrote:
> This introduces an ICE building glibc for m68k (and the same ICE appears 
> for microblaze, though I haven't bisected there).  See bug 94526.
Yea, I've already forwarded Jakub the microblaze testcase.

jeff

[PATCH] postreload: Fix autoinc handling in reload_cse_move2add [PR94516]

2020-04-07 Thread Jakub Jelinek via Gcc-patches

Hi!

The following testcase shows two separate issues caused by the cselib
changes.
One is that through the cselib sp tracking improvements on
... r12 = rsp; rsp -= 8; push cst1; push cst2; push cst3; call
rsp += 32; rsp -= 8; push cst4; push cst5; push cst6; call
rsp += 32; rsp -= 8; push cst7; push cst8; push cst9; call
rsp += 32
reload_cse_simplify_set decides to optimize the rsp += 32 insns
into rsp = r12 because cselib figures that the r12 register holds the right
value.  From the pure cost perspective that seems like a win and on its own
at least for -Os that would be beneficial, except that there are those
rsp -= 8 stack adjustments after it, where rsp += 32; rsp -= 8; is optimized
into rsp += 24; by the csa pass, but rsp = r12; rsp -= 8 can't.  Dunno
what to do about this part, the PR has a hack in a comment.

Anyway, the following patch fixes the other part, which isn't a missed
optimization, but a wrong-code issue.  The problem is that the pushes of
constant are on x86 represented through PRE_MODIFY and while
move2add_note_store has some code to handle {PRE,POST}_{INC,DEC} without
REG_INC note, it doesn't handle {PRE,POST}_MODIFY (that would be enough
to fix this testcase).  But additionally it looks misplaced, because
move2add_note_store is only called on the rtxes that are stored into,
while RTX_AUTOINC can happen not just in those, but anywhere else in the
instruction (e.g. pop insn can have autoinc in the SET_SRC MEM).
REG_INC note seems to be required for any autoinc except for stack pointer
autoinc which doesn't have those notes, so this patch just handles
the sp autoinc after the REG_INC note handling loop.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-04-07  Jakub Jelinek  

PR rtl-optimization/94516
* postreload.c: Include rtl-iter.h.
(reload_cse_move2add): Handle SP autoinc here by FOR_EACH_SUBRTX_VAR
looking for all MEMs with RTX_AUTOINC operand.
(move2add_note_store): Remove {PRE,POST}_{INC,DEC} handling.

* gcc.dg/torture/pr94516.c: New test.

--- gcc/postreload.c.jj 2020-01-23 20:07:55.041981751 +0100
+++ gcc/postreload.c2020-04-07 16:13:56.812293400 +0200
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.
 #include "tree-pass.h"
 #include "dbgcnt.h"
 #include "function-abi.h"
+#include "rtl-iter.h"
 
 static int reload_cse_noop_set_p (rtx);
 static bool reload_cse_simplify (rtx_insn *, rtx);
@@ -2090,6 +2091,21 @@ reload_cse_move2add (rtx_insn *first)
}
}
}
+
+  /* There are no REG_INC notes for SP autoinc.  */
+  subrtx_var_iterator::array_type array;
+  FOR_EACH_SUBRTX_VAR (iter, array, PATTERN (insn), NONCONST)
+   {
+ rtx mem = *iter;
+ if (mem
+ && MEM_P (mem)
+ && GET_RTX_CLASS (GET_CODE (XEXP (mem, 0))) == RTX_AUTOINC)
+   {
+ if (XEXP (XEXP (mem, 0), 0) == stack_pointer_rtx)
+   reg_mode[STACK_POINTER_REGNUM] = VOIDmode;
+   }
+   }
+
   note_stores (insn, move2add_note_store, insn);
 
   /* If INSN is a conditional branch, we try to extract an
@@ -2144,17 +2160,6 @@ move2add_note_store (rtx dst, const_rtx
   unsigned int regno = 0;
   scalar_int_mode mode;
 
-  /* Some targets do argument pushes without adding REG_INC notes.  */
-
-  if (MEM_P (dst))
-{
-  dst = XEXP (dst, 0);
-  if (GET_CODE (dst) == PRE_INC || GET_CODE (dst) == POST_INC
- || GET_CODE (dst) == PRE_DEC || GET_CODE (dst) == POST_DEC)
-   reg_mode[REGNO (XEXP (dst, 0))] = VOIDmode;
-  return;
-}
-
   if (GET_CODE (dst) == SUBREG)
 regno = subreg_regno (dst);
   else if (REG_P (dst))
--- gcc/testsuite/gcc.dg/torture/pr94516.c.jj   2020-04-07 16:50:25.531448655 
+0200
+++ gcc/testsuite/gcc.dg/torture/pr94516.c  2020-04-07 16:52:57.703196275 
+0200
@@ -0,0 +1,31 @@
+/* PR rtl-optimization/94516 */
+/* { dg-do run } */
+/* { dg-additional-options "-fpie" { target pie } } */
+
+struct S { unsigned char *a; unsigned int b; };
+typedef int V __attribute__((vector_size (sizeof (int) * 4)));
+
+__attribute__((noipa)) void
+foo (const char *a, const char *b, const char *c, const struct S *d, int e, 
int f, int g, int h, int i)
+{
+  V v = { 1, 2, 3, 4 };
+  asm volatile ("" : : "g" (&v) : "memory");
+  v += (V) { 5, 6, 7, 8 };
+  asm volatile ("" : : "g" (&v) : "memory");
+}
+
+__attribute__((noipa)) void
+bar (void)
+{
+  const struct S s = { "foobarbaz", 9 };
+  foo ("foo", (const char *) 0, "corge", &s, 0, 1, 0, -12, -31);
+  foo ("bar", "quux", "qux", &s, 0, 0, 9, 0, 0);
+  foo ("baz", (const char *) 0, "qux", &s, 1, 0, 0, -12, -32);
+}
+
+int
+main ()
+{
+  bar ();
+  return 0;
+}

Jakub

Re: [PATCH, V4] PowerPC Turn on -mpcrel by default for -mcpu=future

2020-04-07 Thread Segher Boessenkool

Hi!

On Mon, Apr 06, 2020 at 12:52:04PM -0400, Michael Meissner wrote:
> Commit message:
> Enable -mpcrel for -mcpu=future if it is allowed by the ABI.

You can just put this in the mail subject.  No dot at the end, but start
with a capital, an put a subsystem label on it:

Subject: rs6000: Enable -mpcrel for -mcpu=future if it is allowed by the ABI

(As you see, it is a bit long already).

> 2020-04-06  Michael Meissner  
> 
>   * config/rs6000/linux64.h (PCREL_SUPPORTED_BY_OS): Enable
>   prefixed PC-relative addressing if the ABI supports it.
>   * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Do not
>   set OPTION_MASK_PREFIXED here.
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Enable
>   OPTION_MASK_PREFIXED and OPTION_MASK_PCREL on -mcpu=future by
>   default if the current ABI allows the options.

The changelog goes at the *end* of the commit message.  Oh, there was
nothing more?

> --- /tmp/XzRKno_rs6000-cpus.def   2020-04-03 17:15:05.068676928 -0400
> +++ gcc/config/rs6000/rs6000-cpus.def 2020-04-03 17:00:50.115550614 -0400
> @@ -75,10 +75,14 @@
>| OPTION_MASK_P8_VECTOR\
>| OPTION_MASK_P9_VECTOR)
>  
> -/* Support for a future processor's features.  Do not enable -mpcrel until it
> -   is fully functional.  */
> +/* Support for a future processor's features.  Do not set 
> OPTION_MASK_PREFIXED
> +   or OPTION_MASK_PCREL here.  Those options are enabled in the function
> +   rs6000_option_override if the ABI supports them.  */

I am still not happy with that, and it should be fixed.  But, let's do
that later then.

The ISA mask should say what features are supported.  If we choose not
to use some we should do that elsewhere, in option_override for example.
Expressing only 95% of the features in the ISA masks is the worst of
both worlds.

> +/* Flags that need to be turned off if -mno-future.  */
> +#define OTHER_FUTURE_MASKS   (OPTION_MASK_PCREL  \
>| OPTION_MASK_PREFIXED)
>  
>  /* Flags that need to be turned off if -mno-future.  */

And right after this is already a define for OTHER_FUTURE_MASKS.  Did
you mismerge this?  Please correct and repost.

> +  /* Enable -mprefixed by default on 64-bit 'future' systems.  */
> +  if (TARGET_FUTURE && TARGET_POWERPC64
> +  && (rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED) == 0)
> +rs6000_isa_flags |= OPTION_MASK_PREFIXED;

I don't understand why only for 64 bit?

> +#ifdef PCREL_SUPPORTED_BY_OS
> +  /* If the ABI has support for PC-relative relocations, enable it by
> + default.  */
> +  if (PCREL_SUPPORTED_BY_OS
> +  && (rs6000_isa_flags_explicit & OPTION_MASK_PCREL) == 0)
> +rs6000_isa_flags |= OPTION_MASK_PCREL;
> +#endif

No #ifdef at uses please, unless there is a reason (there is none here
afaics).

The #ifndef you had before for the default definition was fine.  The
#undef you had for a non-default definition was not: it hides problems,
it never helps.

Segher

[committed] Fix H8 testsuite failures after recent cselib changes

2020-04-07 Thread Jeff Law via Gcc-patches

Whee, more fallout from the cselib work.  This is clearly another backend bug.

The H8 port has this peephole:
;; Turn
;;
;;   mov.l er7,er0
;;   add.l #10,er0  (takes 8 bytes)
;;
;; into
;;
;;   sub.l er0,er0
;;   add.b #10,r0l
;;   add.l er7,er0  (takes 6 bytes)

(define_peephole2
  [(set (match_operand:SI 0 "register_operand" "")
(match_operand:SI 1 "register_operand" ""))
   (set (match_dup 0)
(plus:SI (match_dup 0)
 (match_operand:SI 2 "const_int_operand" "")))]
  "(TARGET_H8300H || TARGET_H8300S)
&& REG_P (operands[0]) && REG_P (operands[1])
&& REGNO (operands[0]) != REGNO (operands[1])
&& !satisfies_constraint_L (operands[2])
&& !satisfies_constraint_N (operands[2])
&& ((INTVAL (operands[2]) & 0xff) == INTVAL (operands[2])
|| (INTVAL (operands[2]) & 0xff00) == INTVAL (operands[2])
|| INTVAL (operands[2]) == 0x
|| INTVAL (operands[2]) == 0xfffe)"
  [(set (match_dup 0)
(match_dup 2))
   (set (match_dup 0)
(plus:SI (match_dup 0)
 (match_dup 1)))]
  "")

As the comment indicates this saves a couple bytes of space.  But boy it's a bad
idea if operand0 is the stack pointer.  Aside from ICE related to ARG_SIZE 
notes,
imagine what happens if you take an interrupt after zero-ing $sp, but before the
final add.l.  Nothing good.

The fix is pretty simple, verify operands[0] is not the stack pointer.  I looked
briefly at the other peepholes, but none of the others seem to be fundamentally
broken when a destination is the stack pointer.


The ICE showed up in a good variety of existing tests.  So there's no new test.

Committing to the trunk.

Jeff

commit 14162197fd4cf0e3a848d32bd9385876e1b1f249
Author: Jeff Law 
Date:   Tue Apr 7 17:55:00 2020 -0600

Fix a variety of testsuite failures on the H8 after recent cselib changes

PR rtl-optimization/92264
* config/h8300/h8300.md (mov;add peephole2): Avoid applying when
the destination is the stack pointer.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 12803e90b0a..6f2dcfb766c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2020-04-07  Jeff Law  
+
+   PR rtl-optimization/92264
+   * config/h8300/h8300.md (mov;add peephole2): Avoid applying when
+   the destination is the stack pointer.
+
 2020-04-07  Jakub Jelinek  
 
PR rtl-optimization/94291
diff --git a/gcc/config/h8300/h8300.md b/gcc/config/h8300/h8300.md
index bcc78a4ce4d..fdd2d8b02d7 100644
--- a/gcc/config/h8300/h8300.md
+++ b/gcc/config/h8300/h8300.md
@@ -4299,6 +4299,7 @@
(plus:SI (match_dup 0)
 (match_operand:SI 2 "const_int_operand" "")))]
   "(TARGET_H8300H || TARGET_H8300S)
+&& operands[0] != stack_pointer_rtx
 && REG_P (operands[0]) && REG_P (operands[1])
 && REGNO (operands[0]) != REGNO (operands[1])
 && !satisfies_constraint_L (operands[2])

Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons

2020-04-07 Thread Segher Boessenkool

On Tue, Apr 07, 2020 at 02:43:36PM -0700, Richard Henderson wrote:
> On 4/7/20 1:27 PM, Segher Boessenkool wrote:
> > On Mon, Apr 06, 2020 at 12:19:42PM +0100, Richard Sandiford wrote:
> >> The reason I'm not keen on using special modes for this case is that
> >> they'd describe one way in which the result can be used rather than
> >> describing what the instruction actually does.  The instruction really
> >> does set all four flags to useful values.  The "problem" is that they're
> >> not the values associated with a compare between two values, so 
> >> representing
> >> them that way will always lose information.
> > 
> > CC modes describe how the flags are set, not how the flags are used.
> > You cannot easily describe the V bit setting with a compare (it needs
> > a mode bigger than the register), is that what you mean?
> 
> I think that is a good deal of the effort.

So you need to have different CC modes for different ways of setting
(combinations) of the four NZVC bits, there isn't really any way around
that, that is how CC modes work.

> I wonder if it would be helpful to have
> 
>   (uoverflow_plus x y carry)
>   (soverflow_plus x y carry)
> 
> etc.

Those have three operands, which is nasty to express.

On rs6000 we have the carry bit as a separate register (it really is
only one bit, XER[CA], but in GCC we model it as a separate register).
We handle it as a fixed register (there is only one, and saving and
restoring it is relatively expensive, so this worked out the best).
Still, in the patterns (for insns like "adde") that take both a carry
input and have it as output, the expression for the carry output but
already the one for the GPR output become so unwieldy that nothing
can properly work with it.  So, in the end, I have all such insns that
take a carry input just clobber their carry output.  This works great!

Expressing the carry setting for insns that do not take a carry in is
much easier.  You get somewhat different patterns for various
immediate inputs, but that is all.

[ snip ]

> This does have the advantage of avoiding the extensions, so that constants can
> be retained in the original mode.

But it won't ever survive simplification; or, it will be in the way of
simplification.

A simple test is looking what you get if you do

long long f(long long x) { return x + 0x1; }

(on a 32-bit config).

> Though of course if we go this way, there will be incentive to add
> overflow codes for all __builtin_*_overflow_p.

Yeah, eww.  And where will it stop?  What muladd insns should we have
special RTL codes for, for the high part?

Segher

1 2 >

1 - 100 of 108 matches

Mail list logo