Re: [PATCH] Hashtable PR96088

2021-05-20 Thread François Dumont via Gcc-patches

On 20/05/21 6:44 pm, Jonathan Wakely wrote:

On 06/05/21 22:03 +0200, François Dumont via Libstdc++ wrote:

Hi

    Addressing your feedback on backtrace in debug mode is going to
take me some time, so here is another patch.


    Compared to the latest submission, I've added a _Hash_arg_t partial
specialization for std::hash<>. It is not strictly necessary for the
moment, but it will be once we eventually remove its nested
argument_type. I also wonder whether it is easier for the compiler to
handle, though I'm not sure about that.


The std::hash specializations in libstdc++ define argument_type, but
I'm already working on one that doesn't (for std::stacktrace).

And std::hash can be specialized by users,
and is not required to provide argument_type.

So it's already not valid to assume that std::hash::argument_type
exists.


Yes, I know that the plan is to get rid of argument_type. But as long as
it is there we can still use it, although I didn't realize that you were
already in the process of removing it.






@@ -850,9 +852,56 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
iterator
_M_emplace(const_iterator, false_type __uks, _Args&&... __args);

+  template
+    std::pair
+    _M_insert_unique(_Kt&&, _Arg&&, const _NodeGenerator&);
+
+  // Detect nested argument_type.
+  template>
+    struct _Hash_arg_t
+    { typedef _Kt argument_type; };
+
+  // std::hash
+  template
+    struct _Hash_arg_t<_Kt, std::hash<_Arg>>
+    { typedef _Arg argument_type; };
+
+  // Nested argument_type.
+  template
+    struct _Hash_arg_t<_Kt, _Ht,
+  __void_t>
+    { typedef typename _Ht::argument_type argument_type; };
+
+  // Function pointer.
+  template
+    struct _Hash_arg_t<_Kt, std::size_t(*)(const _Arg&)>
+    { typedef _Arg argument_type; };
+
+  template::argument_type>
+    static typename conditional<
+  __is_nothrow_convertible<_Kt, _ArgType>::value, _Kt&&, 
key_type>::type


Please use __conditional_t<...> here instead of
typename conditional<...>::type.

The purpose of the _Hash_arg_t type is to determine whether invoking
the hash function with _Kt&& can throw, right?


No, the purpose of _Hash_arg_t is to find out the argument type of the
_Hash functor. With this info I can check whether invoking that functor
is going to instantiate a temporary using a throwing operation. If so, I
create a temporary at _Hashtable code level and move it to its final
storage place when needed.


The tricky part is that _Hash can accept different argument types; for
the moment I just do not create a temporary in this case.
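
The detection described above can be sketched in standalone form. This is only an illustration under assumptions: hash_arg, hash_arg_t, LegacyHash and TransparentHash are hypothetical names that do not appear in the patch, and the primary template falls back to the key type as the _Hash_arg_t primary template appears to do. The std::hash specialization is left out to keep the sketch small.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Primary template: nothing is known about the functor, so fall back
// to the key type itself.
template<typename Kt, typename Hash, typename = void>
struct hash_arg { using type = Kt; };

// Functor exposing a nested argument_type.
template<typename Kt, typename Hash>
struct hash_arg<Kt, Hash, std::void_t<typename Hash::argument_type>>
{ using type = typename Hash::argument_type; };

// Plain function pointer taking its argument by const reference.
template<typename Kt, typename Arg>
struct hash_arg<Kt, std::size_t (*)(const Arg&), void>
{ using type = Arg; };

template<typename Kt, typename Hash>
using hash_arg_t = typename hash_arg<Kt, Hash>::type;

// Hypothetical functors used only to exercise the trait.
struct LegacyHash
{
  using argument_type = std::string;
  std::size_t operator()(const std::string& s) const noexcept
  { return s.size(); }
};

struct TransparentHash
{
  template<typename T>
    std::size_t operator()(const T&) const noexcept
    { return 0; }
};

// A nested argument_type wins over the key type.
static_assert(std::is_same_v<hash_arg_t<const char*, LegacyHash>,
                             std::string>);
// No nested type: the key type is used, so no temporary is implied.
static_assert(std::is_same_v<hash_arg_t<const char*, TransparentHash>,
                             const char*>);
// Function pointer: the parameter type is extracted.
static_assert(std::is_same_v<hash_arg_t<int, std::size_t (*)(const long&)>,
                             long>);
```

With the argument type known, a check such as std::is_nothrow_convertible on the key and that type can decide whether a temporary needs to be built early.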




And if it can throw, you force a conversion early, and if it can't,
you don't do the conversion.

Can't you use __is_nothrow_invocable<_Hash&, _Kt> for that, instead of
this fragile approach?


I think I already tried that, but I'll check.

I fear that __is_nothrow_invocable<_Hash&, _Kt> only tells whether the
chosen operator()(const _Arg&) is noexcept-qualified, not whether the
conversion from _Kt to _Arg is noexcept.
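
This concern can be checked with a small standalone test. The sketch below uses hypothetical types (ThrowingKey, NothrowKey, KeyHash) that are not part of the patch, and the public std::is_nothrow_invocable_v rather than the internal __is_nothrow_invocable. It assumes the standard wording under which a constructor invoked to build a function argument counts as part of the call expression, so the trait should report false when the conversion can throw, even though operator() itself is noexcept. Treat it as something to verify on the compiler at hand rather than as a given.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical key types: one converts from const char* with a
// potentially-throwing constructor, the other with a noexcept one.
struct ThrowingKey { ThrowingKey(const char*) {} };
struct NothrowKey  { NothrowKey(const char*) noexcept {} };

// Hypothetical hash functor whose call operator is itself noexcept.
template<typename Key>
struct KeyHash
{
  std::size_t operator()(const Key&) const noexcept { return 0; }
};

// The call operator is noexcept in both cases; only the implicit
// conversion of the argument differs.
inline constexpr bool throwing_conversion_is_nothrow
  = std::is_nothrow_invocable_v<KeyHash<ThrowingKey>&, const char*>;
inline constexpr bool nothrow_conversion_is_nothrow
  = std::is_nothrow_invocable_v<KeyHash<NothrowKey>&, const char*>;

static_assert(!throwing_conversion_is_nothrow);
static_assert(nothrow_conversion_is_nothrow);
```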





Re: [PATCH] Add 3 target hooks for memset

2021-05-20 Thread Bernd Edlinger
On 5/20/21 10:49 PM, H.J. Lu wrote:
> On Wed, May 19, 2021 at 5:55 AM H.J. Lu  wrote:
>>
>> On Wed, May 19, 2021 at 2:25 AM Richard Biener
>>  wrote:
>>>
>>> On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:

 Add TARGET_READ_MEMSET_VALUE and TARGET_GEN_MEMSET_VALUE to support
 target instructions to duplicate QImode value to TImode/OImode/XImode
 value for memset.
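
For background, the generic code that these hooks replace widens the memset byte by multiplying it with a constant whose bytes are all 1 (the coefficient built with memset (p, 1, size) and passed to expand_mult in builtin_memset_gen_str). A minimal standalone sketch of that arithmetic for a 64-bit value follows; broadcast_byte is a hypothetical name, not a GCC function, and a target hook would presumably emit a dedicated broadcast instruction instead.

```cpp
#include <cstdint>

// Replicate one byte into every byte of a 64-bit word by multiplying
// with 0x0101010101010101.  No carries occur because each partial
// product fits in its own byte.
constexpr std::uint64_t
broadcast_byte (std::uint8_t c)
{
  return static_cast<std::uint64_t> (c) * UINT64_C (0x0101010101010101);
}

static_assert (broadcast_byte (0xAB) == UINT64_C (0xABABABABABABABAB));
static_assert (broadcast_byte (0) == 0);
```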

 PR middle-end/90773
 * builtins.c (builtin_memset_read_str): Call
 targetm.read_memset_value.
 (builtin_memset_gen_str): Call targetm.gen_memset_value.
 * target.def (read_memset_value): New hook.
 (gen_memset_value): Likewise.
 * targhooks.c: Include "builtins.h".
 (default_read_memset_value): New function.
 (default_gen_memset_value): Likewise.
 * targhooks.h (default_read_memset_value): New prototype.
 (default_gen_memset_value): Likewise.
 * doc/tm.texi.in: Add TARGET_READ_MEMSET_VALUE and
 TARGET_GEN_MEMSET_VALUE hooks.
 * doc/tm.texi: Regenerated.
 ---
  gcc/builtins.c | 47 --
  gcc/doc/tm.texi| 16 +
  gcc/doc/tm.texi.in |  4 
  gcc/target.def | 20 +
  gcc/targhooks.c| 56 ++
  gcc/targhooks.h|  4 
  6 files changed, 104 insertions(+), 43 deletions(-)

 diff --git a/gcc/builtins.c b/gcc/builtins.c
 index e1b284846b1..f78a36478ef 100644
 --- a/gcc/builtins.c
 +++ b/gcc/builtins.c
 @@ -6584,24 +6584,11 @@ expand_builtin_strncpy (tree exp, rtx target)
 previous iteration.  */

  rtx
 -builtin_memset_read_str (void *data, void *prevp,
 +builtin_memset_read_str (void *data, void *prev,
  HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
  scalar_int_mode mode)
  {
 -  by_pieces_prev *prev = (by_pieces_prev *) prevp;
 -  if (prev != nullptr && prev->data != nullptr)
 -{
 -  /* Use the previous data in the same mode.  */
 -  if (prev->mode == mode)
 -   return prev->data;
 -}
 -
 -  const char *c = (const char *) data;
 -  char *p = XALLOCAVEC (char, GET_MODE_SIZE (mode));
 -
 -  memset (p, *c, GET_MODE_SIZE (mode));
 -
 -  return c_readstr (p, mode);
 +  return targetm.read_memset_value ((const char *) data, prev, mode);
  }

  /* Callback routine for store_by_pieces.  Return the RTL of a register
 @@ -6611,37 +6598,11 @@ builtin_memset_read_str (void *data, void *prevp,
 nullptr, it has the RTL info from the previous iteration.  */

  static rtx
 -builtin_memset_gen_str (void *data, void *prevp,
 +builtin_memset_gen_str (void *data, void *prev,
 HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
 scalar_int_mode mode)
  {
 -  rtx target, coeff;
 -  size_t size;
 -  char *p;
 -
 -  by_pieces_prev *prev = (by_pieces_prev *) prevp;
 -  if (prev != nullptr && prev->data != nullptr)
 -{
 -  /* Use the previous data in the same mode.  */
 -  if (prev->mode == mode)
 -   return prev->data;
 -
 -  target = simplify_gen_subreg (mode, prev->data, prev->mode, 0);
 -  if (target != nullptr)
 -   return target;
 -}
 -
 -  size = GET_MODE_SIZE (mode);
 -  if (size == 1)
 -return (rtx) data;
 -
 -  p = XALLOCAVEC (char, size);
 -  memset (p, 1, size);
 -  coeff = c_readstr (p, mode);
 -
 -  target = convert_to_mode (mode, (rtx) data, 1);
 -  target = expand_mult (mode, target, coeff, NULL_RTX, 1);
 -  return force_reg (mode, target);
 +  return targetm.gen_memset_value ((rtx) data, prev, mode);
  }

  /* Expand expression EXP, which is a call to the memset builtin.  Return
 diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
 index 85ea9395560..51385044e76 100644
 --- a/gcc/doc/tm.texi
 +++ b/gcc/doc/tm.texi
 @@ -11868,6 +11868,22 @@ This function prepares to emit a conditional 
 comparison within a sequence
   @var{bit_code} is @code{AND} or @code{IOR}, which is the op on the 
 compares.
  @end deftypefn

 +@deftypefn {Target Hook} rtx TARGET_READ_MEMSET_VALUE (const char 
 *@var{c}, void *@var{prev}, scalar_int_mode @var{mode})
 +This function returns the RTL of a constant integer corresponding to
 +target reading @code{GET_MODE_SIZE (@var{mode})} bytes from the string
 +constant @var{str}.  If @var{prev} is not @samp{nullptr}, it contains
 +the RTL information from the previous iteration.
 +@end deftypefn
 +
 +@deftypefn {Target Hook} rtx TARGET_GEN_MEMSET_VALUE (rtx @var{data}, 
 void *@var{prev}, scalar_int_mode 

Re: [PATCH] constructor: Elide expand_constructor when can move by pieces is true

2021-05-20 Thread Bernd Edlinger
On 5/20/21 4:03 PM, H.J. Lu wrote:
> On Thu, May 20, 2021 at 12:51 AM Richard Biener
>  wrote:
>>
>> On Wed, May 19, 2021 at 3:22 PM H.J. Lu  wrote:
>>>
>>> On Wed, May 19, 2021 at 2:33 AM Richard Biener
>>>  wrote:

 On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:
>
> When expanding a constant constructor, don't call expand_constructor if
> it is more efficient to load the data from the memory via move by pieces.
>
> gcc/
>
> PR middle-end/90773
> * expr.c (expand_expr_real_1): Don't call expand_constructor if
> it is more efficient to load the data from the memory.
>
> gcc/testsuite/
>
> PR middle-end/90773
> * gcc.target/i386/pr90773-24.c: New test.
> * gcc.target/i386/pr90773-25.c: Likewise.
> ---
>  gcc/expr.c | 10 ++
>  gcc/testsuite/gcc.target/i386/pr90773-24.c | 22 ++
>  gcc/testsuite/gcc.target/i386/pr90773-25.c | 20 
>  3 files changed, 52 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-24.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-25.c
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index d09ee42e262..80e01ea1cbe 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -10886,6 +10886,16 @@ expand_expr_real_1 (tree exp, rtx target, 
> machine_mode tmode,
> unsigned HOST_WIDE_INT ix;
> tree field, value;
>
> +   /* Check if it is more efficient to load the data from
> +  the memory directly.  FIXME: How many stores do we
> +  need here if not moved by pieces?  */
> +   unsigned HOST_WIDE_INT bytes
> + = tree_to_uhwi (TYPE_SIZE_UNIT (type));

 that's prone to fail - it could be a VLA.
>>>
>>> What do you mean by fail?  Is it ICE or missed optimization?
>>> Do you have a testcase?
>>>

> +   if ((bytes / UNITS_PER_WORD) > 2
> +   && MOVE_MAX_PIECES > UNITS_PER_WORD
> +   && can_move_by_pieces (bytes, TYPE_ALIGN (type)))
> + goto normal_inner_ref;
> +

 It looks like you're concerned about aggregate copies but this also handles
 non-aggregates (which on GIMPLE might already be optimized of course).
>>>
>>> Here I check if we copy more than 2 words and we can move more than
>>> a word in a single instruction.
>>>
 Also you say "if it's cheaper" but I see no cost considerations.  How do
 we generally handle immed const vs. load from constant pool costs?
>>>
>>> This trades 2 (update to 8) stores with one load plus one store.  Is there
>>> a way to check which one is faster?
>>
>> I'm not sure - it depends on whether the target can do stores from immediates
>> at all or what restrictions apply, what the immediate value actually is
>> (zero or all-ones should be way cheaper than sth arbitrary) and how the
>> pressure on the load unit is.  can_move_by_pieces (bytes, TYPE_ALIGN (type))
>> also does not guarantee it will actually move pieces larger than 
>> UNITS_PER_WORD,
>> that might depend on alignment.  There's by_pieces_ninsns that might provide
>> some hint here.
>>
>> I'm sure it works well for x86.
>>
>> I wonder if the existing code is in the appropriate place and we
>> shouldn't instead
>> handle this somewhere upthread where we ask to copy 'exp' into some other
>> memory location.  For your testcase that's expand_assignment but I can
>> imagine passing array[0] by value to a function resulting in similar copying.
>> Testing that shows we get
>>
>> pushq   array+56(%rip)
>> .cfi_def_cfa_offset 24
>> pushq   array+48(%rip)
>> .cfi_def_cfa_offset 32
>> pushq   array+40(%rip)
>> .cfi_def_cfa_offset 40
>> pushq   array+32(%rip)
>> .cfi_def_cfa_offset 48
>> pushq   array+24(%rip)
>> .cfi_def_cfa_offset 56
>> pushq   array+16(%rip)
>> .cfi_def_cfa_offset 64
>> pushq   array+8(%rip)
>> .cfi_def_cfa_offset 72
>> pushq   array(%rip)
>> .cfi_def_cfa_offset 80
>> call    bar
>>
>> for that.  We do have the by-pieces infrastructure to generally do this kind 
>> of
>> copying but in both of these cases we do not seem to use it.  I also wonder
>> if the by-pieces infrastructure can pick up constant initializers 
>> automagically
>> (we could native_encode the initializer part and feed the by-pieces
>> infrastructure with an array of bytes).  For example, there might be byte
>> parts that are easy to immediate-store and difficult ones, where we could
>> decide on a case-by-case basis whether to load+store or immediate-store them.
> 
> I opened:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704
> 
>> For example if I change your testcase to have the array[] 

[committed] libstdc++: Implement LWG 3490 change to drop_while_view::begin()

2021-05-20 Thread Patrick Palka via Gcc-patches
Tested on x86_64-pc-linux-gnu, committed to trunk as obvious.

libstdc++-v3/ChangeLog:

PR libstdc++/100606
* include/std/ranges (drop_while_view::begin): Assert the
precondition added by LWG 3490.
---
 libstdc++-v3/include/std/ranges | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 767a65c5822..76add252ca6 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -2190,6 +2190,7 @@ namespace views::__adaptor
if (_M_cached_begin._M_has_value())
  return _M_cached_begin._M_get(_M_base);
 
+   __glibcxx_assert(_M_pred.has_value());
auto __it = __detail::find_if_not(ranges::begin(_M_base),
  ranges::end(_M_base),
  std::cref(*_M_pred));
-- 
2.32.0.rc0
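
The asserted precondition is that the view's predicate holds a value when begin() is called. A simplified standalone sketch of the idea follows. It assumes a hypothetical first_not_matching helper over a std::array with a predicate stored in a std::optional. It is not the libstdc++ implementation, which works on arbitrary ranges and caches the result.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

using int_pred = bool (*)(int);

constexpr bool is_negative (int x) { return x < 0; }

// Return the index of the first element for which the predicate is
// false.  Calling this without a stored predicate is a precondition
// violation, which is asserted before the predicate is dereferenced.
template<std::size_t N>
constexpr std::size_t
first_not_matching (const std::array<int, N>& base,
                    const std::optional<int_pred>& pred)
{
  assert (pred.has_value ());
  std::size_t i = 0;
  while (i < N && (*pred) (base[i]))
    ++i;
  return i;
}

// Leading negative values are skipped; the scan stops at index 2.
static_assert (first_not_matching (std::array<int, 4>{-2, -1, 3, -4},
                                   std::optional<int_pred>{is_negative})
               == 2);
```

Without the assertion, dereferencing an empty optional would be undefined behaviour rather than a diagnosable failure.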



Re: [i386] [PATCH] Fix ICE when lhs is NULL [PR target/100660]

2021-05-20 Thread Hongtao Liu via Gcc-patches
On Thu, May 20, 2021 at 4:30 PM Richard Biener
 wrote:
>
> On Thu, May 20, 2021 at 10:15 AM Hongtao Liu  wrote:
> >
> > On Thu, May 20, 2021 at 4:06 PM Richard Biener
> >  wrote:
> > >
> > > On Thu, May 20, 2021 at 8:54 AM Hongtao Liu  wrote:
> > > >
> > > > Hi:
> > > >   In folding target-specific builtin, when lhs is NULL, create a
> > > > temporary variable for it.
> > > >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
> > >
> > > I would suggest to drop the stmt or leave it unfolded instead.
> > Will -O0 be able to optimize the builtin away?
> >  Since I've deleted the corresponding expander, there would be an
> > error if the builtin goes directly to pass_expand.
>
> In that case replace it with a NOP, that should be safe then.
>
Updated patch.
> Richard.
>
> > > Note dropping would mean replacing it with a GIMPLE_NOP
> > > (gimple_build_nop ()).  But creating a new unused LHS certainly
> > > works as well.
> > >
> > > Jakub, any preference?
> > >
> > > Richard.
> > >
> > > > gcc/ChangeLog:
> > > > PR target/100660
> > > > * config/i386/i386.c (ix86_gimple_fold_builtin): Create a tmp
> > > > variable for lhs when it doesn't exist.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > > PR target/100660
> > > > * gcc.target/i386/pr100660.c: New test.
> > > >
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Hongtao
> >
> >
> >
> > --
> > BR,
> > Hongtao



-- 
BR,
Hongtao


0001-Fix-ICE-when-lhs-is-NULL.patch_v2
Description: Binary data


Re: [PATCH] i386: Optimize vpblendvb on inverted mask register to vpblendvb on swapping the order of operand 1 and operand 2. [PR target/99908]

2021-05-20 Thread Hongtao Liu via Gcc-patches
On Thu, May 13, 2021 at 8:43 AM Hongtao Liu  wrote:
>
> On Wed, May 12, 2021 at 8:38 PM Uros Bizjak  wrote:
> >
> > On Wed, May 12, 2021 at 1:42 PM Hongtao Liu  wrote:
> > >
> > > On Wed, May 12, 2021 at 4:36 PM Uros Bizjak  wrote:
> > > >
> > > > On Tue, Apr 27, 2021 at 1:05 PM Hongtao Liu via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > Hi:
> > > > >   As described in the subject line, this patch is about to do the
> > > > > below transformation.
> > > > >
> > > > > -   vpcmpeqd%ymm3, %ymm3, %ymm3
> > > > > -   vpandn  %ymm3, %ymm2, %ymm2
> > > > > -   vpblendvb   %ymm2, %ymm1, %ymm0, %ymm0
> > > > > +   vpblendvb   %ymm2, %ymm0, %ymm1, %ymm0
> > > > >
> > > > > >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > > PR target/99908
> > > > > * config/i386/sse.md (_pblendvb): Add
> > > > > splitters for pblendvb of NOT mask register.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > > PR target/99908
> > > > > * gcc.target/i386/avx2-pr99908.c: New test.
> > > > > * gcc.target/i386/sse4_1-pr99908.c: New test.
> > >
> > > Thanks for the review.
> >
> > OTOH, have you considered ix86_fold_builtin or
> > ix86_gimple_fold_builtin? These functions are implemented as builtins,
> > so perhaps the transformation can be more efficiently implemented by
> > calling these two target functions.
> Good idea, I'll try that.
I find it's not that good to fold andn into two GIMPLE statements, which
don't always get combined back into andn in RTL, so some optimizations are
lost.  But folding blendv seems to be clearly good.
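
The identity behind the blendv transformation can be sketched with a byte-wise model. This is an illustration only: blend_byte is a hypothetical helper that models one lane of pblendvb (the most significant bit of the mask byte selects the second source), not code from the patch.

```cpp
#include <cstdint>

// Byte-wise model of one lane of pblendvb: the most significant bit of
// the mask byte selects the second source, otherwise the first.
constexpr std::uint8_t
blend_byte (std::uint8_t src1, std::uint8_t src2, std::uint8_t mask)
{
  return (mask & 0x80) ? src2 : src1;
}

// Check that inverting the mask is equivalent to swapping the sources
// for every possible mask byte.
constexpr bool
inverted_mask_equals_swapped_sources (std::uint8_t a, std::uint8_t b)
{
  for (int m = 0; m < 256; ++m)
    {
      std::uint8_t mask = static_cast<std::uint8_t> (m);
      std::uint8_t inverted = static_cast<std::uint8_t> (~m);
      if (blend_byte (a, b, inverted) != blend_byte (b, a, mask))
        return false;
    }
  return true;
}

static_assert (inverted_mask_equals_swapped_sources (0x11, 0x22));
```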
> >
> > Uros.
>
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao


Re: [PATCH] Fix vec-splati-runnable.c test.

2021-05-20 Thread Michael Meissner via Gcc-patches
On Thu, May 20, 2021 at 02:31:59PM -0500, will schmidt wrote:
> On Tue, 2021-05-18 at 16:49 -0400, Michael Meissner wrote:
> > [PATCH] Fix vec-splati-runnable.c test.
> > 
> 
> hi,
> 
> 
> > I noticed that the vec-splati-runnable.c did not have an abort after one
> > of the tests.  If the test was run with optimization, the optimizer could
> > delete some of the tests and throw off the count.
> > 
> 
> 
> > I have bootstrapped this on LE power9 and BE power8 systems.  There were no
> > regressions in the tests.  Can I check this into the trunk?
> > 
> > I do not expect to back port this to GCC 11 unless we will be back porting 
> > the
> > future patches that add support for the XXSPLTIW, XXSPLTIDP, and XXSPLTI32DX
> > instructions.
> > 
> > gcc/testsuite/
> > 2021-05-18  Michael Meissner  
> > 
> > * gcc.target/powerpc/vec-splati-runnable.c: Run test with -O2
> > optimization.  Do not check what XXSPLTIDP generates if the value
> > is undefined.
> > ---
> >  .../gcc.target/powerpc/vec-splati-runnable.c  | 29 ++-
> >  1 file changed, 9 insertions(+), 20 deletions(-)
> > 
> > diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c 
> > b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
> > index e84ce77a21d..a135279b1d7 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
> > @@ -1,7 +1,7 @@
> >  /* { dg-do run { target { power10_hw } } } */
> >  /* { dg-do link { target { ! power10_hw } } } */
> >  /* { dg-require-effective-target power10_ok } */
> > -/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
> > +/* { dg-options "-mdejagnu-cpu=power10 -save-temps -O2" } */
> >  #include 
> > 
> >  #define DEBUG 0
> > @@ -12,6 +12,8 @@
> > 
> >  extern void abort (void);
> > 
> > +volatile vector double vresult_d_undefined;
> > +
> >  int
> >  main (int argc, char *argv [])
> >  {
> > @@ -85,25 +87,12 @@ main (int argc, char *argv [])
> >  #endif
> >}
> > 
> > -  /* This test will generate a "note" to the user that the argument
> > - is subnormal.  It is not an error, but results are not defined.  */
> > -  vresult_d = (vector double) { 2.0, 3.0 };
> > -  expected_vresult_d = (vector double) { 6.6E-42f, 6.6E-42f };
> > -
> > -  vresult_d = vec_splatid (6.6E-42f);
> > -
> > -  /* Although the instruction says the results are not defined, it does 
> > seem
> > - to work, at least on Mambo.  But no guarentees!  */
> > -  if (!vec_all_eq (vresult_d,  expected_vresult_d)) {
> > -#if DEBUG
> > -printf("ERROR, vec_splati (6.6E-42f)\n");
> > -for(i = 0; i < 2; i++)
> > -  printf(" vresult_d[%i] = %e, expected_vresult_d[%i] = %e\n",
> > -i, vresult_d[i], i, expected_vresult_d[i]);
> > -#else
> > -;
> > -#endif
> > -  }
> > +  /* This test will generate a "note" to the user that the argument is
> > + subnormal.  It is not an error, but results are not defined.  Because 
> > this
> > + is undefined, we cannot check that any value is correct.  Just store 
> > it in
> 
> as in undefined-behavior..?

It is undefined what value is put into the registers if you specify a denormal
value.

> 
> > + a volatile variable so the XXSPLTIDP instruction gets generated and 
> > the
> > + warning message printed. */
> > +  vresult_d_undefined = vec_splatid (6.6E-42f);
> 
> 
> This does not look like it adds an abort() call as I would have
> expected per the patch description. 

Originally the code did add an abort test.  Then I discovered the hardware
generates something different than what the simulator showed.  So I had to
remove the test for equality of what is loaded.  Sorry about that.

> 
> So this looks like it still calls vec_splatid(), but instead assigns
> result to a variable name vresult_d_undefined.   Also removes some
> DEBUG code, which is fine.  So just the vec_all_eq() call is removed?  
> I'm not certain I see how that will change the results, just the -O2
> optimization makes the difference?
> I may be missing something...

The original code was run at -O0.  If you enabled optimization, the optimizer
would delete setting the value because it did nothing.

  /* This test will generate a "note" to the user that the argument
 is subnormal.  It is not an error, but results are not defined.  */
  vresult_d = (vector double) { 2.0, 3.0 };
  expected_vresult_d = (vector double) { 6.6E-42f, 6.6E-42f };

  vresult_d = vec_splatid (6.6E-42f);

  /* Although the instruction says the results are not defined, it does 
seem
 to work, at least on Mambo.  But no guarentees!  */
  if (!vec_all_eq (vresult_d,  expected_vresult_d)) {
#if DEBUG
printf("ERROR, vec_splati (6.6E-42f)\n");
for(i = 0; i < 2; i++)
  printf(" vresult_d[%i] = %e, expected_vresult_d[%i] = %e\n",
 i, vresult_d[i], i, expected_vresult_d[i]);

Re: [PATCH 2/2] Fix xxeval predicates.

2021-05-20 Thread Michael Meissner via Gcc-patches
On Thu, May 20, 2021 at 02:31:08PM -0500, will schmidt wrote:
> On Tue, 2021-05-18 at 16:47 -0400, Michael Meissner wrote:
> > [PATCH 2/2] Fix xxeval predicates.
> > 
> > In doing the patch to move the XX* built-in functions from altivec.md to
> > vsx.md, I noticed that the xxeval built-in function used the
> > altivec_register_operand predicate.  Since it takes vsx registers, this
> > might force the register allocate to issue a move when it could use a
> > traditional floating point register.  This patch fixes that.
> 
> allocator ?

Thanks.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH 1/2] Move xx* builtins to vsx.md.

2021-05-20 Thread Michael Meissner via Gcc-patches
On Thu, May 20, 2021 at 02:30:24PM -0500, will schmidt wrote:
> > +;; XXPERMX built-in function support
> > +(define_expand "xxpermx"
> > +  [(set (match_operand:V2DI 0 "register_operand" "+wa")
> > +   (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "wa")
> > + (match_operand:V2DI 2 "register_operand" "wa")
> > + (match_operand:V16QI 3 "register_operand" "wa")
> > + (match_operand:QI 4 "u8bit_cint_operand" "n")]
> > +UNSPEC_XXPERMX))]
> > +  "TARGET_POWER10"
> > +{
> > +  if (BYTES_BIG_ENDIAN)
> > +emit_insn (gen_xxpermx_inst (operands[0], operands[1],
> > +operands[2], operands[3],
> > +operands[4]));
> > +  else
> > +{
> > +  /* Reverse value of byte element indexes by XORing with 0xFF.
> > +Reverse the 32-byte section identifier match by subracting bits [0:2]
> > +of elemet from 7.  */
> 
> 
> element   (typo also existed in original).

Yep.  I just copied the text verbatim from altivec.md.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] Allow __ibm128 on older PowerPC systems.

2021-05-20 Thread Michael Meissner via Gcc-patches
On Thu, May 20, 2021 at 02:29:02PM -0500, will schmidt wrote:
> On Tue, 2021-05-18 at 16:36 -0400, Michael Meissner wrote:
> > [PATCH] Allow __ibm128 on older PowerPC systems.
> > 
> 
> Hi,
> 
> 
> > On January 8th, 2018, I added code to ibm-ldouble.c to use the built-in
> > function __builtin_pack_ibm128 if long double is IEEE 128-bit and continue 
> > to
> > use __builtin_pack_longdouble if long double is IBM extended double.  This 
> > code
> > was needed because __builtin_pack_ibm128 is not available unless the 
> > __ibm128
> > keyword is availabe.  In the current code, __ibm128 is only enabled if we 
> > have
> > support for both IBM and IEEE 128-bit long double.
> 
> "available."
> 
> May be worth re-sifting the description to drop the history not
> directly applicable to what this patch is doing.

When I commit the patch, I try to eliminate the history section.  But for
active patches, I hope that it helps to state what is different from the
previous patches.

> > gcc/
> > 2021-05-18  Michael Meissner  
> > 
> > * config/rs6000/rs6000-builtin.def (BU_IBM128_2): Rename
> > RS6000_BTM_IBM128 from RS6000_BTM_FLOAT128.
> 
> > * config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Update
> > error message for __ibm128 built-in functions.
> > (rs6000_init_builtins): Create the __ibm128 keyword on older
> > systems where long double uses the IBM extended double format,
> > even if they don't support IEEE 128-bit floating point.
> 
> Could drop 'older', ok.

Well yes, but the reality is all of the server systems that support VSX (and
hence IEEE 128-bit) are newer than the systems that don't (power6, power5,
etc.).

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.

2021-05-20 Thread Michael Meissner via Gcc-patches
On Thu, May 20, 2021 at 02:27:06PM -0500, will schmidt wrote:
> > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> > index fdaf12aeda0..ef1ebaaee05 100644
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -15706,8 +15706,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx 
> > op_true, rtx op_false,
> >return 1;
> >  }
> > 
> > -/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum 
> > or
> > -   minimum with "C" semantics.
> > +/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
> > +   maximum or minimum with "C" semantics.
> > 
> > Unless you use -ffast-math, you can't use these instructions to replace
> > conditions that implicitly reverse the condition because the comparison
> > @@ -15783,6 +15783,7 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx 
> > true_cond, rtx false_cond)
> >enum rtx_code code = GET_CODE (op);
> >rtx op0 = XEXP (op, 0);
> >rtx op1 = XEXP (op, 1);
> > +  machine_mode compare_mode = GET_MODE (op0);
> >machine_mode result_mode = GET_MODE (dest);
> >rtx compare_rtx;
> >rtx cmove_rtx;
> > @@ -15791,6 +15792,35 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx 
> > true_cond, rtx false_cond)
> >if (!can_create_pseudo_p ())
> >  return 0;
> > 
> > +  /* We allow the comparison to be either SFmode/DFmode and the true/false
> > + condition to be either SFmode/DFmode.  I.e. we allow:
> > +
> > +   float a, b;
> > +   double c, d, r;
> > +
> > +   r = (a == b) ? c : d;
> > +
> > +and:
> > +
> > +   double a, b;
> > +   float c, d, r;
> > +
> > +   r = (a == b) ? c : d;
> 
> 
> This new comment does not seem to align with the comments in the
> description, which state "But you can't do ..." 

Yes, the comment is perhaps a little unclear.

> > +
> > +but we don't allow intermixing the IEEE 128-bit floating point types 
> > with
> > +the 32/64-bit scalar types.
> > +
> > +It gets too messy where SFmode/DFmode can use any register and 
> > TFmode/KFmode
> > +can only use Altivec registers.  In addition, we would need to do a 
> > XXPERMDI
> > +if we compare SFmode/DFmode and move TFmode/KFmode.  */
> > +
> > +  if (compare_mode == result_mode
> > +  || (compare_mode == SFmode && result_mode == DFmode)
> > +  || (compare_mode == DFmode && result_mode == SFmode))
> > +;
> > +  else
> > +return false;
> 
> Interesting if/else block.  May want to reverse the logic. I defer if
> this way is notably simpler than inverting it.

I originally tried inverting it, and it just got messy.
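
For comparison, the two forms of the check can be written side by side. This is a sketch with a hypothetical Mode enum standing in for the machine modes; it is not taken from rs6000.c and only shows that the inverted condition rejects exactly the combinations that the listed form does not allow.

```cpp
// Hypothetical stand-ins for the machine modes involved.
enum class Mode { SF, DF, KF, TF };

// The form used in the patch: list the allowed combinations and
// reject everything else.
constexpr bool
allowed_as_written (Mode compare_mode, Mode result_mode)
{
  if (compare_mode == result_mode
      || (compare_mode == Mode::SF && result_mode == Mode::DF)
      || (compare_mode == Mode::DF && result_mode == Mode::SF))
    return true;
  else
    return false;
}

// The inverted form (De Morgan): reject when the modes differ and are
// not the SF/DF pair in either order.
constexpr bool
rejected_when_inverted (Mode compare_mode, Mode result_mode)
{
  return (compare_mode != result_mode
          && !(compare_mode == Mode::SF && result_mode == Mode::DF)
          && !(compare_mode == Mode::DF && result_mode == Mode::SF));
}

// The two predicates must disagree on every pair of modes, since one
// says "allowed" and the other says "rejected".
constexpr bool
forms_agree ()
{
  constexpr Mode modes[] = { Mode::SF, Mode::DF, Mode::KF, Mode::TF };
  for (Mode c : modes)
    for (Mode r : modes)
      if (allowed_as_written (c, r) == rejected_when_inverted (c, r))
        return false;
  return true;
}

static_assert (forms_agree ());
```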

> > +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> > @@ -0,0 +1,15 @@
> > +/* { dg-require-effective-target ppc_float128_hw } */
> > +/* { dg-require-effective-target power10_ok } */
> > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> > +
> > +#ifndef TYPE
> > +#define TYPE _Float128
> > +#endif
> > +
> > +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not 
> > a
> > +   call.  */
> 
> s/"if/then/else"/"minmax"/  ?

Thanks.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.

2021-05-20 Thread Michael Meissner via Gcc-patches
On Thu, May 20, 2021 at 02:25:58PM -0500, will schmidt wrote:
> I'd throw the ternary term in there, easier to search for later. 
> s/?: operations/ternary (?:) operations /

Thanks.

> So, presumably the float128-minmax-2.c test adds/replaces the power10
> code gen tests that were removed or disabled from float128-minmax.c. 

Yes.

> Probably fine..  It's good to exercise the pragma target stuff, though
> I wonder if it would be better to just specify -mcpu=power9 in the
> options since we are already specifying (redundant?) -mpower9-vector. 
> 
> I see similar changes in a later patch, probably OK there since those
> tests do not appear to be specifying -mcpu=foo options that are already
> pointed at power9 features...

I think we really want a better solution than #pragma, since some systems (AIX
if memory serves) might not support #pragma to change code generation models,
because they don't have the assembler/linker support for it.

Basically for code generation tests, I see the following cases:

1) Test code targeting precisely power8 (or power9, power10, etc.).  Hopefully
these are rare.

2) Test code targeting at least power8.  But as these tests show, a lot
of the code won't generate the appropriate instructions on power10.  This is
what we have now.  It relies on undocumented switches like -mpower9-vector to
add the necessary support.

3) Test code targeting at least power8 but going up to power9 at the maximum.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH] libgccjit: Add function to set the initial value of a global variable [PR96089]

2021-05-20 Thread Antoni Boucher via Gcc-patches
Hi.

I made this patch to set an arbitrary value to a global variable.

This patch suffers from the same issue as inline assembly
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100380), i.e. it
segfaults if the `rvalue` is created after the global variable.
It seems to be a design issue, so I'm not sure what the fix would be.
Having it fixed would allow me to test this new function much more and
see whether anything is missing (e.g. it might require a way to create
a constant struct).
See the link above for an explanation of this issue.

Thanks for the review.
From 0a5fd7f759e1bd7becc993f01bdcf84ff8fc5fd5 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Sat, 15 May 2021 10:54:36 -0400
Subject: [PATCH] Add function to set the initial value of a global variable
 [PR96089]

2021-05-20  Antoni Boucher  

gcc/jit/
PR target/96089
* docs/topics/compatibility.rst (LIBGCCJIT_ABI_19): New ABI
tag.
* docs/topics/expressions.rst: Add documentation for the
function gcc_jit_global_set_initializer_value.
* jit-playback.c: New function (new_global_with_value).
* jit-playback.h: New function (new_global_with_value).
* jit-recording.c: Add support for setting a value to a
global variable.
* jit-recording.h: New function (set_initializer_value) and
new field m_initializer_value.
* libgccjit.c: New macro RETURN_IF_FAIL_PRINTF5 and new
function (gcc_jit_global_set_initializer_value).
* libgccjit.h: New function (gcc_jit_global_set_initializer_value).
* libgccjit.map (LIBGCCJIT_ABI_19): New ABI tag.

gcc/testsuite/
PR target/96089
* jit.dg/test-global-set-initializer.c: Add test for the new
function (gcc_jit_global_set_initializer_value).
---
 gcc/jit/docs/topics/compatibility.rst |  9 
 gcc/jit/docs/topics/expressions.rst   | 14 ++
 gcc/jit/jit-playback.c| 18 
 gcc/jit/jit-playback.h|  7 +++
 gcc/jit/jit-recording.c   | 34 ---
 gcc/jit/jit-recording.h   |  8 
 gcc/jit/libgccjit.c   | 43 +++
 gcc/jit/libgccjit.h   | 13 ++
 gcc/jit/libgccjit.map | 14 ++
 .../jit.dg/test-global-set-initializer.c  | 15 +++
 10 files changed, 169 insertions(+), 6 deletions(-)

diff --git a/gcc/jit/docs/topics/compatibility.rst b/gcc/jit/docs/topics/compatibility.rst
index 239b6aa1a92..666eb3a1c51 100644
--- a/gcc/jit/docs/topics/compatibility.rst
+++ b/gcc/jit/docs/topics/compatibility.rst
@@ -243,3 +243,12 @@ embedding assembler instructions:
   * :func:`gcc_jit_extended_asm_add_input_operand`
   * :func:`gcc_jit_extended_asm_add_clobber`
   * :func:`gcc_jit_context_add_top_level_asm`
+
+.. _LIBGCCJIT_ABI_19:
+
+``LIBGCCJIT_ABI_19``
+--------------------
+``LIBGCCJIT_ABI_19`` covers the addition of an API entrypoint to set the value
+of a global variable:
+
+  * :func:`gcc_jit_global_set_initializer_value`
diff --git a/gcc/jit/docs/topics/expressions.rst b/gcc/jit/docs/topics/expressions.rst
index 396259ef07e..f638cb68fdd 100644
--- a/gcc/jit/docs/topics/expressions.rst
+++ b/gcc/jit/docs/topics/expressions.rst
@@ -603,6 +603,20 @@ Global variables
 
   #ifdef LIBGCCJIT_HAVE_gcc_jit_global_set_initializer
 
+.. function:: void
+  gcc_jit_global_set_initializer_value (gcc_jit_lvalue *global,\
+gcc_jit_rvalue *value)
+
+   Set an initializer for ``global`` using the specified value.
+   ``global`` must be the same type as ``value``.
+
+   This entrypoint was added in :ref:`LIBGCCJIT_ABI_19`; you can test for
+   its presence using
+
+   .. code-block:: c
+
+  #ifdef LIBGCCJIT_HAVE_gcc_jit_global_set_initializer_value
+
 Working with pointers, structs and unions
 -----------------------------------------
 
diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c
index c6136301243..d86701a8ae6 100644
--- a/gcc/jit/jit-playback.c
+++ b/gcc/jit/jit-playback.c
@@ -664,6 +664,24 @@ new_global_initialized (location *loc,
   return global_finalize_lvalue (inner);
 }
 
+playback::lvalue *
+playback::context::
+new_global_with_value (location *loc,
+		   enum gcc_jit_global_kind kind,
+		   type *type,
+		   playback::rvalue *value,
+		   const char *name)
+{
+  tree inner = global_new_decl (loc, kind, type, name);
+
+  tree inner_type = type->as_tree ();
+  tree initial = value->as_tree ();
+  gcc_assert (TREE_CONSTANT (initial));
+  DECL_INITIAL (inner) = initial;
+
+  return global_finalize_lvalue (inner);
+}
+
 /* Implementation of the various
   gcc::jit::playback::context::new_rvalue_from_const 
methods.
diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h
index 

Re: [PATCH,V3 2/2] dwarf: new dwarf_debuginfo_p predicate

2021-05-20 Thread Indu Bhagat via Gcc-patches

On 5/20/21 2:40 AM, Richard Biener wrote:

On Thu, May 13, 2021 at 12:52 AM Indu Bhagat via Gcc-patches
 wrote:


[Changes from V2]
   - Tested build (make all-gcc) of cross compiler for target triplets
 containing c6x/mips/powerpc and darwin/cygwin.
[End of changes from V2]

This patch introduces a dwarf_debuginfo_p predicate that abstracts and
replaces complex checks on write_symbols.
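
To make the intent concrete, here is a minimal, self-contained sketch of such a predicate. It assumes the bitmask representation of write_symbols introduced by patch 1/2; the bit values and the stand-alone write_symbols variable below are illustrative assumptions, not the definitions from flag-types.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only.  */
#define NO_DEBUG      0u
#define DBX_DEBUG     (1u << 0)
#define DWARF2_DEBUG  (1u << 1)
#define VMS_DEBUG     (1u << 2)
#define VMS_AND_DWARF2_DEBUG (VMS_DEBUG | DWARF2_DEBUG)

/* Stand-in for the compiler's global option variable.  */
static uint32_t write_symbols = NO_DEBUG;

/* Return true when DWARF is among the requested debug formats.
   One bit test covers both the plain DWARF2 case and the combined
   VMS+DWARF2 case that callers used to spell out by hand.  */
static bool
dwarf_debuginfo_p (void)
{
  return (write_symbols & DWARF2_DEBUG) != 0;
}
```

With this shape, a caller that previously compared write_symbols against each DWARF-bearing enumerator only needs the single call.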


OK.

Thanks,
Richard.



Committed.
Thanks,
Indu


gcc/c-family/ChangeLog:

 * c-lex.c (init_c_lex): Use dwarf_debuginfo_p.

gcc/ChangeLog:

 * config/c6x/c6x.c (c6x_output_file_unwind): Use dwarf_debuginfo_p.
 * config/darwin.c (darwin_override_options): Likewise.
 * config/i386/cygming.h (DBX_REGISTER_NUMBER): Likewise.
 * config/i386/darwin.h (DBX_REGISTER_NUMBER): Likewise.
 (DWARF2_FRAME_REG_OUT): Likewise.
 * config/mips/mips.c (mips_output_filename): Likewise.
 * config/rs6000/rs6000.c (rs6000_xcoff_declare_function_name):
 Likewise.
 (rs6000_dbx_register_number): Likewise.
 * dbxout.c: Include flags.h.
 * dwarf2cfi.c (cfi_label_required_p): Likewise.
 (dwarf2out_do_frame): Likewise.
 * except.c: Include flags.h.
 * final.c (dwarf2_debug_info_emitted_p): Likewise.
 (final_scan_insn_1): Likewise.
 * flags.h (dwarf_debuginfo_p): New function declaration.
 * opts.c (dwarf_debuginfo_p): New function definition.
 * targhooks.c (default_debug_unwind_info): Use dwarf_debuginfo_p.
 * toplev.c (process_options): Likewise.
---
  gcc/c-family/c-lex.c       |  4 ++--
  gcc/config/c6x/c6x.c       |  4 ++--
  gcc/config/darwin.c        |  3 ++-
  gcc/config/i386/cygming.h  |  2 +-
  gcc/config/i386/darwin.h   |  4 ++--
  gcc/config/mips/mips.c     |  3 ++-
  gcc/config/rs6000/rs6000.c |  4 ++--
  gcc/dbxout.c               |  1 +
  gcc/dwarf2cfi.c            |  9 -
  gcc/except.c               |  1 +
  gcc/final.c                | 15 ++-
  gcc/flags.h                |  4
  gcc/opts.c                 |  8
  gcc/targhooks.c            |  2 +-
  gcc/toplev.c               |  6 ++
  15 files changed, 40 insertions(+), 30 deletions(-)

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 1c66ecd..c44e7a1 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -27,6 +27,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "stor-layout.h"
  #include "c-pragma.h"
  #include "debug.h"
+#include "flags.h"
  #include "file-prefix-map.h" /* remap_macro_filename()  */
  #include "langhooks.h"
  #include "attribs.h"
@@ -87,8 +88,7 @@ init_c_lex (void)

/* Set the debug callbacks if we can use them.  */
if ((debug_info_level == DINFO_LEVEL_VERBOSE
-   && (write_symbols == DWARF2_DEBUG
-  || write_symbols == VMS_AND_DWARF2_DEBUG))
+   && dwarf_debuginfo_p ())
|| flag_dump_go_spec != NULL)
  {
cb->define = cb_define;
diff --git a/gcc/config/c6x/c6x.c b/gcc/config/c6x/c6x.c
index f9ad1e5..e2011f0 100644
--- a/gcc/config/c6x/c6x.c
+++ b/gcc/config/c6x/c6x.c
@@ -59,6 +59,7 @@
  #include "regrename.h"
  #include "dumpfile.h"
  #include "builtins.h"
+#include "flags.h"

  /* This file should be included last.  */
  #include "target-def.h"
@@ -439,8 +440,7 @@ c6x_output_file_unwind (FILE * f)
  {
if (flag_unwind_tables || flag_exceptions)
 {
- if (write_symbols == DWARF2_DEBUG
- || write_symbols == VMS_AND_DWARF2_DEBUG)
+ if (dwarf_debuginfo_p ())
 asm_fprintf (f, "\t.cfi_sections .debug_frame, .c6xabi.exidx\n");
   else
 asm_fprintf (f, "\t.cfi_sections .c6xabi.exidx\n");
diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c
index 5d17391..026c1fb 100644
--- a/gcc/config/darwin.c
+++ b/gcc/config/darwin.c
@@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "lto-section-names.h"
  #include "intl.h"
  #include "optabs.h"
+#include "flags.h"

  /* Fix and Continue.

@@ -3348,7 +3349,7 @@ darwin_override_options (void)
&& generating_for_darwin_version >= 9
&& (flag_gtoggle ? (debug_info_level == DINFO_LEVEL_NONE)
: (debug_info_level >= DINFO_LEVEL_NORMAL))
-  && write_symbols == DWARF2_DEBUG)
+  && dwarf_debuginfo_p ())
  flag_var_tracking_uninit = flag_var_tracking;

/* Final check on PIC options; for Darwin these are not dependent on the PIE
diff --git a/gcc/config/i386/cygming.h b/gcc/config/i386/cygming.h
index cfbca34..ac458cd 100644
--- a/gcc/config/i386/cygming.h
+++ b/gcc/config/i386/cygming.h
@@ -82,7 +82,7 @@ along with GCC; see the file COPYING3.  If not see
  #undef DBX_REGISTER_NUMBER
  #define DBX_REGISTER_NUMBER(n) \
(TARGET_64BIT ? dbx64_register_map[n]\
-   : (write_symbols == DWARF2_DEBUG\
+   : (dwarf_debuginfo_p () 

Re: [PATCH,V3 1/2] opts: change write_symbols to support bitmasks

2021-05-20 Thread Indu Bhagat via Gcc-patches

On 5/20/21 2:40 AM, Richard Biener wrote:

On Thu, May 13, 2021 at 12:53 AM Indu Bhagat via Gcc-patches
 wrote:


[No changes from V2]

To support multiple debug formats, we need to move away from explicit
enumeration of each individual combination of debug formats.
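
As a rough illustration of why a bitmask scales better than one enumerator per combination, the sketch below gives each format one bit and derives the count and a printable name list from any mask. The FMT_* constants, the name strings and both helper functions are hypothetical stand-ins chosen for this example; they are not the debug_set_count/debug_set_names implementations from the patch.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One bit per debug format (hypothetical values).  */
enum
{
  FMT_DBX    = 1u << 0,
  FMT_DWARF2 = 1u << 1,
  FMT_XCOFF  = 1u << 2,
  FMT_VMS    = 1u << 3
};

/* Name of the format stored at each bit position (illustrative).  */
static const char *const fmt_names[] = { "stabs", "dwarf-2", "xcoff", "vms" };
#define N_FORMATS (sizeof fmt_names / sizeof fmt_names[0])

/* Count how many formats are present in MASK.  */
static unsigned
format_count (uint32_t mask)
{
  unsigned n = 0;
  for (size_t i = 0; i < N_FORMATS; i++)
    if (mask & (1u << i))
      n++;
  return n;
}

/* Write a space-separated list of the formats in MASK into BUF,
   which must have room for LEN > 0 bytes.  Returns BUF.  */
static const char *
format_names (uint32_t mask, char *buf, size_t len)
{
  buf[0] = '\0';
  for (size_t i = 0; i < N_FORMATS; i++)
    if (mask & (1u << i))
      {
        if (buf[0] != '\0')
          strncat (buf, " ", len - strlen (buf) - 1);
        strncat (buf, fmt_names[i], len - strlen (buf) - 1);
      }
  return buf;
}
```

Adding a new format under this scheme means adding one bit and one name; no new enumerator is needed for each pairing with an existing format.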


OK.

Thanks,
Richard.


Committed.
Thanks,
Indu


gcc/c-family/ChangeLog:

 * c-opts.c (c_common_post_options): Adjust access to debug_type_names.
 * c-pch.c (struct c_pch_validity): Use type uint32_t.
 (pch_init): Renamed member.
 (c_common_valid_pch): Adjust access to debug_type_names.

gcc/ChangeLog:

 * common.opt: Change type to support bitmasks.
 * flag-types.h (enum debug_info_type): Rename enumerator constants.
 (NO_DEBUG): New bitmask.
 (DBX_DEBUG): Likewise.
 (DWARF2_DEBUG): Likewise.
 (XCOFF_DEBUG): Likewise.
 (VMS_DEBUG): Likewise.
 (VMS_AND_DWARF2_DEBUG): Likewise.
 * flags.h (debug_set_to_format): New function declaration.
 (debug_set_count): Likewise.
 (debug_set_names): Likewise.
 * opts.c (debug_type_masks): Array of bitmasks for debug formats.
 (debug_set_to_format): New function definition.
 (debug_set_count): Likewise.
 (debug_set_names): Likewise.
 (set_debug_level): Update access to debug_type_names.
 * toplev.c: Likewise.

gcc/objc/ChangeLog:

 * objc-act.c (synth_module_prologue): Use uint32_t instead of enum
 debug_info_type.

gcc/testsuite/ChangeLog:

 * gcc.dg/pch/valid-1.c: Adjust diagnostic message in testcase.
 * lib/dg-pch.exp: Adjust diagnostic message.
---
  gcc/c-family/c-opts.c              |   7 ++-
  gcc/c-family/c-pch.c               |  12 ++--
  gcc/common.opt                     |   2 +-
  gcc/flag-types.h                   |  29 +++---
  gcc/flags.h                        |  17 +-
  gcc/objc/objc-act.c                |   2 +-
  gcc/opts.c                         | 109 +
  gcc/testsuite/gcc.dg/pch/valid-1.c |   2 +-
  gcc/testsuite/lib/dg-pch.exp       |   4 +-
  gcc/toplev.c                       |   9 ++-
  10 files changed, 157 insertions(+), 36 deletions(-)

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 89e05a4..60b5802 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -1112,9 +1112,10 @@ c_common_post_options (const char **pfilename)
   /* Only -g0 and -gdwarf* are supported with PCH, for other
  debug formats we warn here and refuse to load any PCH files.  */
   if (write_symbols != NO_DEBUG && write_symbols != DWARF2_DEBUG)
-   warning (OPT_Wdeprecated,
-"the %qs debug format cannot be used with "
-"pre-compiled headers", debug_type_names[write_symbols]);
+ warning (OPT_Wdeprecated,
+  "the %qs debug info cannot be used with "
+  "pre-compiled headers",
+  debug_set_names (write_symbols & ~DWARF2_DEBUG));
 }
else if (write_symbols != NO_DEBUG && write_symbols != DWARF2_DEBUG)
 c_common_no_more_pch ();
diff --git a/gcc/c-family/c-pch.c b/gcc/c-family/c-pch.c
index fd94c37..8f0f760 100644
--- a/gcc/c-family/c-pch.c
+++ b/gcc/c-family/c-pch.c
@@ -52,7 +52,7 @@ enum {

  struct c_pch_validity
  {
-  unsigned char debug_info_type;
+  uint32_t pch_write_symbols;
signed char match[MATCH_SIZE];
void (*pch_init) (void);
size_t target_data_length;
@@ -108,7 +108,7 @@ pch_init (void)
pch_outfile = f;

memset (&v, '\0', sizeof (v));
-  v.debug_info_type = write_symbols;
+  v.pch_write_symbols = write_symbols;
{
  size_t i;
  for (i = 0; i < MATCH_SIZE; i++)
@@ -252,13 +252,13 @@ c_common_valid_pch (cpp_reader *pfile, const char *name, 
int fd)
/* The allowable debug info combinations are that either the PCH file
   was built with the same as is being used now, or the PCH file was
   built for some kind of debug info but now none is in use.  */
-  if (v.debug_info_type != write_symbols
+  if (v.pch_write_symbols != write_symbols
&& write_symbols != NO_DEBUG)
  {
cpp_warning (pfile, CPP_W_INVALID_PCH,
-  "%s: created with -g%s, but used with -g%s", name,
-  debug_type_names[v.debug_info_type],
-  debug_type_names[write_symbols]);
+  "%s: created with '%s' debug info, but used with '%s'", name,
+  debug_set_names (v.pch_write_symbols),
+  debug_set_names (write_symbols));
return 2;
  }

diff --git a/gcc/common.opt b/gcc/common.opt
index a75b44e..ffb968d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -109,7 +109,7 @@ bool exit_after_options
  ; flag-types.h for the definitions of the different possible types of
  ; debugging information.
  Variable
-enum debug_info_type write_symbols = NO_DEBUG

Re: [PATCH 05/57] rs6000: Add file support and functions for diagnostic support

2021-05-20 Thread Segher Boessenkool
Hi!

On Tue, Apr 27, 2021 at 10:32:40AM -0500, Bill Schmidt via Gcc-patches wrote:
>   * config/rs6000/rs6000-gen-builtins.c (bif_file): New filescope
>   variable.

What makes it interesting that this var has file scope?  Did you mean to
say it has internal linkage ("is static")?  I would just leave that off
completely, anyway (just "New." or "New variable.")

>   (LINELEN): New defined constant.

It isn't a constant, it's a macro.  "New macro."?

> +#define LINELEN 1024
> +static char linebuf[LINELEN];

You never get anywhere close to 1024 I suppose?

> +/* Pointer to a diagnostic function.  */
> +void (*diag) (const char *, ...) __attribute__ ((format (printf, 1, 2)))
> +  = NULL;

This isn't portable: you cannot assign NULL to a pointer to function.
But it will work on all POSIX machines, and those are the only hosts we
support.  Still icky :-)

If you just leave off the initialisation, it will be initialised to 0,
which is exactly the same problem of course, just less explicit.

This pointer is not static btw?  Should it be?
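
For reference, a small sketch of the variant hinted at here: the hook gets internal linkage and no explicit initializer, relying on the rule that objects with static storage duration start out zero-initialized. The diag_stderr function and the n_diags counter are hypothetical helpers added only so the example can be exercised; the format attribute is a GCC/Clang extension.

```c
#include <stdarg.h>
#include <stdio.h>

/* File-local diagnostic hook.  With static storage duration and no
   initializer it starts out as a null pointer.  */
static void (*diag) (const char *, ...)
  __attribute__ ((format (printf, 1, 2)));

/* Number of diagnostics emitted so far (illustrative only).  */
static int n_diags;

/* One possible hook: print to stderr and count the call.  */
static void __attribute__ ((format (printf, 1, 2)))
diag_stderr (const char *fmt, ...)
{
  va_list ap;
  va_start (ap, fmt);
  vfprintf (stderr, fmt, ap);
  va_end (ap);
  n_diags++;
}
```

A caller would install the hook with `diag = diag_stderr;` before the first use and can test `diag` against a null pointer to see whether one has been set.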


Okay for trunk with a little changelog massaging.  Thanks!


Segher


Re: [PATCH 04/57] rs6000: Add initial input files

2021-05-20 Thread Segher Boessenkool
On Tue, Apr 27, 2021 at 10:32:39AM -0500, Bill Schmidt via Gcc-patches wrote:
> This patch adds a tiny subset of the built-in and overload descriptions.

>   * config/rs6000/rs6000-builtin-new.def: New.

You'll have to rename this to not have "-new" in the name later, I hope
you realise :-)

Okay for trunk.  Thanks!


Segher


Re: [PATCH 03/57] rs6000: Initial create of rs6000-gen-builtins.c

2021-05-20 Thread Segher Boessenkool
Hi!

On Tue, Apr 27, 2021 at 10:32:38AM -0500, Bill Schmidt via Gcc-patches wrote:
> gcc/
>   * config/rs6000/rs6000-gen-builtins.c: New.

> +[altivec]
> +  const vsc __builtin_altivec_abs_v16qi (vsc);
> +ABS_V16QI absv16qi2 {}
> +  const vss __builtin_altivec_abs_v8hi (vss);
> +ABS_V8HI absv8hi2 {}
> +
> +   Here "vsc" and "vss" are shorthand for "vector signed char" and
> +   "vector signed short" to shorten line lengths and improve readability.

I like :-)


Okay for trunk.  Thanks!


Segher


Re: [PATCH 02/57] Support scanning of build-time GC roots in gengtype

2021-05-20 Thread Segher Boessenkool
On Tue, May 11, 2021 at 11:01:22AM -0500, Bill Schmidt wrote:
> Hi!  I'd like to ping this specific patch from the series, which is the 
> only one remaining that affects common code.  I confess that I don't 
> know whom to ask for a review for gengtype; I didn't get any good ideas 
> from MAINTAINERS.  If you know of a good reviewer candidate, please CC 
> them.

Richard is listed as the "gen* on machine desc" maintainer, that might
be the closest to this.  cc:ed.


Segher


Re: [PATCH 02/57] Support scanning of build-time GC roots in gengtype

2021-05-20 Thread Segher Boessenkool
Hi!

On Tue, Apr 27, 2021 at 10:32:37AM -0500, Bill Schmidt via Gcc-patches wrote:
> --- a/gcc/gengtype-state.c
> +++ b/gcc/gengtype-state.c
> @@ -1269,7 +1269,7 @@ state_writer::write_state_files_list (void)
>int i = 0;
>/* Write the list of files with their lang_bitmap.  */
>begin_s_expr ("fileslist");
> -  fprintf (state_file, "%d", (int) num_gt_files);
> +  fprintf (state_file, "%d %d", (int) num_gt_files, (int) num_build_headers);

Please use %zd instead, and don't cast?  We require a moderately new
host compiler nowadays :-)

>for (i = 0; i < (int) num_gt_files; i++)

For this one you can make i itself a size_t.  Remember: all explicit
casts are evil (and just some are useful).

Alternatively you can make num_gt_files (and your new num_build_headers)
itself an int: it's not like we could have 2G of those anyway.
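
A cast-free version could look roughly like the sketch below. The function names and parameters are made up for illustration. Since the counts here are size_t, the sketch uses %zu, the C99 conversion for size_t; %zd is the one for the corresponding signed type.

```c
#include <stddef.h>
#include <stdio.h>

/* Format two counts as "N M" without casting them to int.
   Returns what snprintf returns (the length that was needed).  */
static int
format_counts (char *buf, size_t len, size_t num_files, size_t num_headers)
{
  return snprintf (buf, len, "%zu %zu", num_files, num_headers);
}

/* Walk an array with a size_t index, so the loop condition needs
   no cast either.  Returns the number of non-null entries.  */
static size_t
count_nonnull (const char *const *files, size_t n)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (files[i] != NULL)
      count++;
  return count;
}
```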

> --- a/gcc/gengtype.h
> +++ b/gcc/gengtype.h
> @@ -55,6 +55,11 @@ struct fileloc
>  extern const input_file** gt_files;
>  extern size_t num_gt_files;
>  
> +/* Table of headers to be included in gtype-desc.c that are generated
> +   during the build.  These are identified as "./.h".  */
> +extern const char** build_headers;

extern const char **build_headers;

(I hope someone who can approve this will review it.)


Segher


Re: [PATCH 00/57] Replace the Power target-specific built-in machinery

2021-05-20 Thread Segher Boessenkool
Hi!

Just a few small things about this -- I'll reply to more of it later.

On Tue, Apr 27, 2021 at 10:32:35AM -0500, Bill Schmidt via Gcc-patches wrote:
> The design of the target-specific built-in function support in the
> Power back end has not stood the test of time.  The machinery is
> grossly inefficient, confusing, and arcane; and adding new built-in
> functions is inefficient and error-prone.

You are too nice to it.  People have had to work with it over the
years, there is some pent-up anger :-)

> Because of the scope of the changes, it's important to be able to
> verify that the new system makes only intended changes to the
> functions that are supported.  Therefore this patch set adds a new
> mechanism, and (in the final patch) enables it instead of the existing
> support, but does not yet remove the old support.  That will happen in
> a follow-up patch once we're comfortable with the new system.

Is there some (semi-)automatic way to compare the results of the old
and new systems?

> Patch 0057: Fix one last late-breaking change
> 
>   Keeping the code up-to-date with upstream has been fun.  When I
>   rebased to create the patch set, I found one new issue where a
>   small change had been made to the overload handling for the
>   vec_insert builtins.  This patch reflects that change into the
>   new handling.  My version of git is having trouble with
>   interactive rebasing, so it was easier to just add the extra patch.

What breaks by keeping this fix after the other patches?

> I deliberately implemented all the old built-ins exactly as previously
> defined, wherever possible, despite an overwhelming desire to pitch
> out a bunch of them that have already been considered deprecated for
> ages.  I found that it was too difficult to both implement a new
> system and remove deprecated things at the same time, and in general
> it seems like a dangerous thing to do.  Better to do this in stages if
> we're going to do it at all.

Independent fixes you can put before the meat of the series.  This often
is the best way to do it, since then you don't have to duplicate the
weird / buggy / whatever behaviour of the old system.

But things that aren't simple fixes, that need deprecation periods and
everything...  no no no, you want this done *this* decade!

> Unfortunately a lot of deprecated things
> still appear all over our own test suite, and I'm afraid we can assume
> they appear in user code all over the place as well.

Pretty much the only old features you can remove are features that have
been broken for many years.  You can break something on purpose to see
if anyone still uses it, too, but :-)

> What I've done instead is to make very clear which interfaces are
> considered deprecated in the input files themselves.  Over time,
> perhaps we can start to remove some of these, but in reality I think
> we're going to just continue to be stuck with them.

It is extremely useful to have it clearly documented which interfaces
should be considered deprecated, even if we will never remove them.  It
is useful in the documentation, but it is even more useful in the code,
for ourselves!

> (3) A number of built-ins used "long" for DImode types, which would
> break these for 32-bit.  I changed those arguments and return values
> to "long long" to avoid such problems, when those built-ins were not
> restricted to 64-bit mode already.  There aren't many such cases.

You can do this for 64-bit-only builtins as well -- the actual argument
type is never visible (to the user), and everything becomes modes early.


Segher


[PATCH] PR fortran/100551 - [11/12 Regression] Passing return value to class(*) dummy argument

2021-05-20 Thread Harald Anlauf via Gcc-patches
The fix for PR93924/5 has caused a regression for code such as given
in the present PR.  This can be remedied by adjusting the check that
decides when to invoke the implicit conversion of the actual argument
to an unlimited polymorphic procedure argument.

Regtested on x86_64-pc-linux-gnu.

OK for mainline and backport to 11-branch?

Thanks,
Harald


Fortran: fix passing return value to class(*) dummy argument

gcc/fortran/ChangeLog:

PR fortran/100551
* trans-expr.c (gfc_conv_procedure_call): Adjust check for
implicit conversion of actual argument to an unlimited polymorphic
procedure argument.

gcc/testsuite/ChangeLog:

PR fortran/100551
* gfortran.dg/pr100551.f90: New test.

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index cce18d094a6..3432cd4fdfd 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -5826,7 +5826,9 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  _array);
 	}
   else if (UNLIMITED_POLY (fsym) && e->ts.type != BT_CLASS
-	   && gfc_expr_attr (e).flavor != FL_PROCEDURE)
+	   && e->ts.type != BT_PROCEDURE
+	   && (gfc_expr_attr (e).flavor != FL_PROCEDURE
+		   || gfc_expr_attr (e).proc != PROC_UNKNOWN))
 	{
 	  /* The intrinsic type needs to be converted to a temporary
 	 CLASS object for the unlimited polymorphic formal.  */
diff --git a/gcc/testsuite/gfortran.dg/pr100551.f90 b/gcc/testsuite/gfortran.dg/pr100551.f90
new file mode 100644
index 000..f82f505e734
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr100551.f90
@@ -0,0 +1,30 @@
+! { dg-do run }
+! PR fortran/100551 - Passing return value to class(*) dummy argument
+
+program p
+  implicit none
+  integer :: result
+  result = 1
+  result = test ((result)) ! works
+  if (result /= 1) stop 1
+  result = test (int (result)) ! issue 1
+! write(*,*) result
+  if (result /= 1) stop 2
+  result = test (f   (result)) ! issue 2
+! write(*,*) result
+  if (result /= 2) stop 3
+contains
+  integer function test(x)
+class(*), intent(in) :: x
+select type (x)
+type is (integer)
+   test = x
+class default
+   test = -1
+end select
+  end function test
+  integer function f(x)
+integer, intent(in) :: x
+f = 2*x
+  end function f
+end program


[pushed] c++: designators in single-element init lists

2021-05-20 Thread Jason Merrill via Gcc-patches
While looking at PR100489, it occurred to me that places that currently
use an initializer-list with a single element to initialize an object of the
same type shouldn't do that if the element has a designator.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* call.c (reference_binding): Check for designator.
(implicit_conversion_1, build_special_member_call): Likewise.
* decl.c (reshape_init_r): Likewise.
* pt.c (do_class_deduction): Likewise.
* typeck2.c (digest_init_r): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/desig19.C: New test.
---
 gcc/cp/call.c|  5 -
 gcc/cp/decl.c|  2 ++
 gcc/cp/pt.c  |  3 ++-
 gcc/cp/typeck2.c |  1 +
 gcc/testsuite/g++.dg/cpp2a/desig19.C | 33 
 5 files changed, 42 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/desig19.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 4a59b97c110..016ae32a272 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -1731,7 +1731,8 @@ reference_binding (tree rto, tree rfrom, tree expr, bool 
c_cast_p, int flags,
 because A[] and A[2] are reference-related.  But we don't do it
 because grok_reference_init has deduced the array size (to 1), and
 A[1] and A[2] aren't reference-related.  */
-  if (CONSTRUCTOR_NELTS (expr) == 1)
+  if (CONSTRUCTOR_NELTS (expr) == 1
+ && !CONSTRUCTOR_IS_DESIGNATED_INIT (expr))
{
  tree elt = CONSTRUCTOR_ELT (expr, 0)->value;
  if (error_operand_p (elt))
@@ -2095,6 +2096,7 @@ implicit_conversion_1 (tree to, tree from, tree expr, 
bool c_cast_p,
{
  if (BRACE_ENCLOSED_INITIALIZER_P (expr)
  && CONSTRUCTOR_NELTS (expr) == 1
+ && !CONSTRUCTOR_IS_DESIGNATED_INIT (expr)
  && !is_list_ctor (cand->fn))
{
  /* "If C is not an initializer-list constructor and the
@@ -10198,6 +10200,7 @@ build_special_member_call (tree instance, tree name, 
vec<tree, va_gc> **args,
 
   if (BRACE_ENCLOSED_INITIALIZER_P (arg)
  && !TYPE_HAS_LIST_CTOR (class_type)
+ && !CONSTRUCTOR_IS_DESIGNATED_INIT (arg)
  && CONSTRUCTOR_NELTS (arg) == 1)
arg = CONSTRUCTOR_ELT (arg, 0)->value;
 
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 8a54e569041..13556e3ded1 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6650,6 +6650,8 @@ reshape_init_r (tree type, reshape_iter *d, tree 
first_initializer_p,
  initialized from that element."  Even if T is an aggregate.  */
   if (cxx_dialect >= cxx11 && (CLASS_TYPE_P (type) || VECTOR_TYPE_P (type))
   && first_initializer_p
+  /* But not if it's a designated init.  */
+  && !d->cur->index
   && d->end - d->cur == 1
   && reference_related_p (type, TREE_TYPE (init)))
 {
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index cbd2f3dc338..1deb359c011 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -29326,7 +29326,8 @@ do_class_deduction (tree ptype, tree tmpl, tree init,
 {
   list_init_p = true;
   try_list_ctor = TYPE_HAS_LIST_CTOR (type);
-  if (try_list_ctor && CONSTRUCTOR_NELTS (init) == 1)
+  if (try_list_ctor && CONSTRUCTOR_NELTS (init) == 1
+ && !CONSTRUCTOR_IS_DESIGNATED_INIT (init))
{
  /* As an exception, the first phase in 16.3.1.7 (considering the
 initializer list as a single argument) is omitted if the
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 5a7219dec65..6679e247816 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1183,6 +1183,7 @@ digest_init_r (tree type, tree init, int nested, int 
flags,
  the object is initialized from that element."  */
   if (cxx_dialect >= cxx11
   && BRACE_ENCLOSED_INITIALIZER_P (stripped_init)
+  && !CONSTRUCTOR_IS_DESIGNATED_INIT (stripped_init)
   && CONSTRUCTOR_NELTS (stripped_init) == 1
   && ((CLASS_TYPE_P (type) && !CLASSTYPE_NON_AGGREGATE (type))
  || VECTOR_TYPE_P (type)))
diff --git a/gcc/testsuite/g++.dg/cpp2a/desig19.C 
b/gcc/testsuite/g++.dg/cpp2a/desig19.C
new file mode 100644
index 000..3321da85802
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/desig19.C
@@ -0,0 +1,33 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+struct A
+{
+  int i;
+  constexpr operator int() { return 42; }
+};
+
+#define SA(X) static_assert ((X),#X)
+constexpr A a1 { A() };
+SA(a1.i == 0);
+constexpr A a2 { i: A() };
+SA(a2.i == 42);
+#if __cpp_constexpr >= 201304L
+constexpr int f3 () { A const &r { A() }; return r.i; }
+SA(f3() == 0);
+constexpr int f4 () { A const &r { i: A() }; return r.i; }
+SA(f4() == 42);
+constexpr int f5 () { A ar[1]{{ A() }}; return ar[0].i; }
+SA(f5() == 0);
+constexpr int f5a () { A ar[1]{{ i: A() }}; return ar[0].i; }
+SA(f5a() == 42);
+#if __cpp_constexpr >= 201907L
+constexpr int f6 () { A* p = new A{A()}; int i = p->i; delete p; 

[pushed] c++: designated init with anonymous union [PR100489]

2021-05-20 Thread Jason Merrill via Gcc-patches
My patch for PR98463 added an assert that tripped on this testcase, because
we ended up with a U CONSTRUCTOR with an initializer for a, which is not a
member of U.  We need to wrap the a initializer in another CONSTRUCTOR for
the anonymous union.

There was already support for this in process_init_constructor_record, but
not in process_init_constructor_union.  But since this is about brace
elision, it really belongs under reshape_init rather than digest_init, so
this patch moves the handling to reshape_init_class, which also handles
unions.
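
The wrapping described above has a close analogue in plain C11, which may help to picture it. The sketch below is C, not the C++ front end behaviour fixed by this patch, and struct S is a made-up type: a designator that names a member of an anonymous union is meant to behave as if the initializer were wrapped in an extra pair of braces for that union.

```c
/* A struct with an anonymous union member (C11).  */
struct S
{
  union
  {
    int a;
    float f;
  };
  int c;
};

/* The designator .a names a member of the anonymous union directly.  */
static const struct S s_designated = { .a = 1, .c = 2 };

/* The same initialization with the braces for the anonymous union
   written out explicitly.  */
static const struct S s_braced = { { .a = 1 }, .c = 2 };
```

Both objects should end up with a == 1 and c == 2; the second form spells out the nesting that the first one leaves implicit.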

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/100489

gcc/cp/ChangeLog:

* decl.c (reshape_init_class): Handle designator for
member of anonymous aggregate here.
* typeck2.c (process_init_constructor_record): Not here.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/desig18.C: New test.
---
 gcc/cp/decl.c| 33 
 gcc/cp/typeck2.c | 26 --
 gcc/testsuite/g++.dg/cpp2a/desig18.C | 17 ++
 3 files changed, 46 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/desig18.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 28052df9f45..8a54e569041 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6418,10 +6418,9 @@ reshape_init_class (tree type, reshape_iter *d, bool 
first_initializer_p,
  /* We already reshaped this.  */
  if (field != d->cur->index)
{
- tree id = DECL_NAME (d->cur->index);
- gcc_assert (id);
- gcc_checking_assert (d->cur->index
-  == get_class_binding (type, id));
+ if (tree id = DECL_NAME (d->cur->index))
+   gcc_checking_assert (d->cur->index
+== get_class_binding (type, id));
  field = d->cur->index;
}
}
@@ -6442,6 +6441,32 @@ reshape_init_class (tree type, reshape_iter *d, bool 
first_initializer_p,
   d->cur->index);
  return error_mark_node;
}
+
+ /* If the element is an anonymous union object and the initializer
+list is a designated-initializer-list, the anonymous union object
+is initialized by the designated-initializer-list { D }, where D
+is the designated-initializer-clause naming a member of the
+anonymous union object.  */
+ tree ictx = DECL_CONTEXT (field);
+ if (!same_type_ignoring_top_level_qualifiers_p (ictx, type))
+   {
+ gcc_assert (ANON_AGGR_TYPE_P (ictx));
+ /* Find the anon aggr that is a direct member of TYPE.  */
+ while (true)
+   {
+ tree cctx = TYPE_CONTEXT (ictx);
+ if (same_type_ignoring_top_level_qualifiers_p (cctx, type))
+   break;
+ ictx = cctx;
+   }
+ /* And then the TYPE member with that anon aggr type.  */
+ tree aafield = TYPE_FIELDS (type);
+ for (; aafield; aafield = TREE_CHAIN (aafield))
+   if (TREE_TYPE (aafield) == ictx)
+ break;
+ gcc_assert (aafield);
+ field = aafield;
+   }
}
 
   /* If we processed all the member of the class, we are done.  */
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index ce3016c780d..5a7219dec65 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1517,19 +1517,6 @@ process_init_constructor_record (tree type, tree init, 
int nested, int flags,
  || identifier_p (ce->index));
  if (ce->index == field || ce->index == DECL_NAME (field))
next = ce->value;
- else if (ANON_AGGR_TYPE_P (fldtype)
-  && search_anon_aggr (fldtype,
-   TREE_CODE (ce->index) == FIELD_DECL
-   ? DECL_NAME (ce->index)
-   : ce->index))
-   /* If the element is an anonymous union object and the
-  initializer list is a designated-initializer-list, the
-  anonymous union object is initialized by the
-  designated-initializer-list { D }, where D is the
-  designated-initializer-clause naming a member of the
-  anonymous union object.  */
-   next = build_constructor_single (init_list_type_node,
-ce->index, ce->value);
  else
{
  ce = NULL;
@@ -1675,19 +1662,6 @@ process_init_constructor_record (tree type, tree init, 
int nested, int flags,
 
  if (ce->index == field || ce->index == DECL_NAME (field))
break;
- if 

[PATCH] Add 3 target hooks for memset

2021-05-20 Thread H.J. Lu via Gcc-patches
On Wed, May 19, 2021 at 5:55 AM H.J. Lu  wrote:
>
> On Wed, May 19, 2021 at 2:25 AM Richard Biener
>  wrote:
> >
> > On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:
> > >
> > > Add TARGET_READ_MEMSET_VALUE and TARGET_GEN_MEMSET_VALUE to support
> > > target instructions to duplicate QImode value to TImode/OImode/XImode
> > > value for memset.
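
For readers unfamiliar with what "duplicating" a byte means here, the following sketch shows the idea on a 64-bit integer instead of an RTL value. It mirrors the generic code removed from builtins.c further down, which builds a coefficient of all-0x01 bytes and multiplies by it. The two function names are hypothetical and nothing here is target-hook code.

```c
#include <stdint.h>
#include <string.h>

/* Replicate BYTE into every byte of a 64-bit word by multiplying
   with a coefficient whose bytes are all 0x01.  */
static uint64_t
broadcast_byte_mul (uint8_t byte)
{
  return (uint64_t) byte * UINT64_C (0x0101010101010101);
}

/* The same result obtained by filling the word with memset, which
   corresponds to the constant-value path.  */
static uint64_t
broadcast_byte_memset (uint8_t byte)
{
  uint64_t word;
  memset (&word, byte, sizeof word);
  return word;
}
```

A target with a native broadcast instruction can presumably produce the wide value directly, which is the motivation given for making this step a hook.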
> > >
> > > PR middle-end/90773
> > > * builtins.c (builtin_memset_read_str): Call
> > > targetm.read_memset_value.
> > > (builtin_memset_gen_str): Call targetm.gen_memset_value.
> > > * target.def (read_memset_value): New hook.
> > > (gen_memset_value): Likewise.
> > > * targhooks.c: Include "builtins.h".
> > > (default_read_memset_value): New function.
> > > (default_gen_memset_value): Likewise.
> > > * targhooks.h (default_read_memset_value): New prototype.
> > > (default_gen_memset_value): Likewise.
> > > * doc/tm.texi.in: Add TARGET_READ_MEMSET_VALUE and
> > > TARGET_GEN_MEMSET_VALUE hooks.
> > > * doc/tm.texi: Regenerated.
> > > ---
> > >  gcc/builtins.c | 47 --
> > >  gcc/doc/tm.texi| 16 +
> > >  gcc/doc/tm.texi.in |  4 
> > >  gcc/target.def | 20 +
> > >  gcc/targhooks.c| 56 ++
> > >  gcc/targhooks.h|  4 
> > >  6 files changed, 104 insertions(+), 43 deletions(-)
> > >
> > > diff --git a/gcc/builtins.c b/gcc/builtins.c
> > > index e1b284846b1..f78a36478ef 100644
> > > --- a/gcc/builtins.c
> > > +++ b/gcc/builtins.c
> > > @@ -6584,24 +6584,11 @@ expand_builtin_strncpy (tree exp, rtx target)
> > > previous iteration.  */
> > >
> > >  rtx
> > > -builtin_memset_read_str (void *data, void *prevp,
> > > +builtin_memset_read_str (void *data, void *prev,
> > >  HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
> > >  scalar_int_mode mode)
> > >  {
> > > -  by_pieces_prev *prev = (by_pieces_prev *) prevp;
> > > -  if (prev != nullptr && prev->data != nullptr)
> > > -{
> > > -  /* Use the previous data in the same mode.  */
> > > -  if (prev->mode == mode)
> > > -   return prev->data;
> > > -}
> > > -
> > > -  const char *c = (const char *) data;
> > > -  char *p = XALLOCAVEC (char, GET_MODE_SIZE (mode));
> > > -
> > > -  memset (p, *c, GET_MODE_SIZE (mode));
> > > -
> > > -  return c_readstr (p, mode);
> > > +  return targetm.read_memset_value ((const char *) data, prev, mode);
> > >  }
> > >
> > >  /* Callback routine for store_by_pieces.  Return the RTL of a register
> > > @@ -6611,37 +6598,11 @@ builtin_memset_read_str (void *data, void *prevp,
> > > nullptr, it has the RTL info from the previous iteration.  */
> > >
> > >  static rtx
> > > -builtin_memset_gen_str (void *data, void *prevp,
> > > +builtin_memset_gen_str (void *data, void *prev,
> > > HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
> > > scalar_int_mode mode)
> > >  {
> > > -  rtx target, coeff;
> > > -  size_t size;
> > > -  char *p;
> > > -
> > > -  by_pieces_prev *prev = (by_pieces_prev *) prevp;
> > > -  if (prev != nullptr && prev->data != nullptr)
> > > -{
> > > -  /* Use the previous data in the same mode.  */
> > > -  if (prev->mode == mode)
> > > -   return prev->data;
> > > -
> > > -  target = simplify_gen_subreg (mode, prev->data, prev->mode, 0);
> > > -  if (target != nullptr)
> > > -   return target;
> > > -}
> > > -
> > > -  size = GET_MODE_SIZE (mode);
> > > -  if (size == 1)
> > > -return (rtx) data;
> > > -
> > > -  p = XALLOCAVEC (char, size);
> > > -  memset (p, 1, size);
> > > -  coeff = c_readstr (p, mode);
> > > -
> > > -  target = convert_to_mode (mode, (rtx) data, 1);
> > > -  target = expand_mult (mode, target, coeff, NULL_RTX, 1);
> > > -  return force_reg (mode, target);
> > > +  return targetm.gen_memset_value ((rtx) data, prev, mode);
> > >  }
> > >
> > >  /* Expand expression EXP, which is a call to the memset builtin.  Return
> > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> > > index 85ea9395560..51385044e76 100644
> > > --- a/gcc/doc/tm.texi
> > > +++ b/gcc/doc/tm.texi
> > > @@ -11868,6 +11868,22 @@ This function prepares to emit a conditional 
> > > comparison within a sequence
> > >   @var{bit_code} is @code{AND} or @code{IOR}, which is the op on the 
> > > compares.
> > >  @end deftypefn
> > >
> > > +@deftypefn {Target Hook} rtx TARGET_READ_MEMSET_VALUE (const char 
> > > *@var{c}, void *@var{prev}, scalar_int_mode @var{mode})
> > > +This function returns the RTL of a constant integer corresponding to
> > > > +target reading @code{GET_MODE_SIZE (@var{mode})} bytes from the string
> > > > +constant @var{str}.  If @var{prev} is not @samp{nullptr}, it contains
> > > > +the RTL information from the previous iteration.
> > > +@end deftypefn
> > > +
> > > +@deftypefn {Target Hook} rtx 

[committed] libstdc++: Do not use static_assert without message in C++11

2021-05-20 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* include/bits/random.tcc (__representable_as_double)
(__p1_representable_as_double): Add "" to static asserts.

Tested powerpc64le-linux. Committed to trunk.

commit 64ba45c76e831914764b70207d69a06f800b43a4
Author: Jonathan Wakely 
Date:   Thu May 20 21:12:15 2021

libstdc++: Do not use static_assert without message in C++11

libstdc++-v3/ChangeLog:

* include/bits/random.tcc (__representable_as_double)
(__p1_representable_as_double): Add "" to static asserts.

diff --git a/libstdc++-v3/include/bits/random.tcc 
b/libstdc++-v3/include/bits/random.tcc
index bf4397045ef..1357e181874 100644
--- a/libstdc++-v3/include/bits/random.tcc
+++ b/libstdc++-v3/include/bits/random.tcc
@@ -811,8 +811,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr bool
   __representable_as_double(_Tp __x) noexcept
   {
-   static_assert(numeric_limits<_Tp>::is_integer);
-   static_assert(!numeric_limits<_Tp>::is_signed);
+   static_assert(numeric_limits<_Tp>::is_integer, "");
+   static_assert(!numeric_limits<_Tp>::is_signed, "");
// All integers <= 2^53 are representable.
return (__x <= (1ull << __DBL_MANT_DIG__))
  // Between 2^53 and 2^54 only even numbers are representable.
@@ -824,8 +824,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr bool
   __p1_representable_as_double(_Tp __x) noexcept
   {
-   static_assert(numeric_limits<_Tp>::is_integer);
-   static_assert(!numeric_limits<_Tp>::is_signed);
+   static_assert(numeric_limits<_Tp>::is_integer, "");
+   static_assert(!numeric_limits<_Tp>::is_signed, "");
return numeric_limits<_Tp>::digits < __DBL_MANT_DIG__
  || (bool(__x + 1u) // return false if x+1 wraps around to zero
  && __detail::__representable_as_double(__x + 1u));
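
As an illustration of why the empty string is needed: in C++11 static_assert requires a message argument, and the single-argument form only arrived in C++17. The sketch below is a standalone approximation of the first helper, written to compile as C++11. The name representable_as_double (without the library's reserved-name spelling) is hypothetical, and the halving step after the first comparison is an assumption, since the hunk above is cut off at that point.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for the library helper, C++11-compatible.
template<typename T>
constexpr bool
representable_as_double(T x) noexcept
{
  // The "" message keeps these valid in C++11.
  static_assert(std::numeric_limits<T>::is_integer, "");
  static_assert(!std::numeric_limits<T>::is_signed, "");
  // All integers <= 2^53 are representable.
  return (x <= (1ull << std::numeric_limits<double>::digits))
    // Assumed continuation: a larger value is representable only if it
    // is even and half of it is representable.
    || (!(x & 1) && representable_as_double(static_cast<T>(x >> 1)));
}
```

The cast back to T keeps the recursion on the same unsigned type, so the static assertions are not tripped by integer promotion for narrow types.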


[committed] libstdc++: Use __builtin_unreachable for constexpr assertions [PR 100676]

2021-05-20 Thread Jonathan Wakely via Gcc-patches
The current implementation of compile-time precondition checks causes
compilation to fail by calling a non-constexpr function declared at
block scope. This breaks the CUDA compiler, which wraps some libstdc++
headers in a pragma that declares everything as a __host__ __device__
function, but others are not wrapped and so everything is a __host__
function. The local declaration thus gets redeclared as two different
types of function, which doesn't work.

Just use __builtin_unreachable to make constant evaluation fail, instead
of the local function declaration. Also simplify the assertion macros,
which has the side effect of giving simpler compilation errors when
using Clang.
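
A rough sketch of the technique, for anyone unfamiliar with it: during constant evaluation a call to __builtin_unreachable is not a constant expression, so reaching it makes the compiler reject the evaluation. The macro and struct names below are hypothetical and are not the ones in c++config; the sketch assumes GCC or Clang with __builtin_is_constant_evaluated available, and C++14 or later.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro, not the libstdc++ one: fail constant evaluation
// when the condition is false, and do nothing at run time.
#define SKETCH_CONSTEXPR_ASSERT(cond)                              \
  do {                                                             \
    if (__builtin_is_constant_evaluated() && !bool(cond))          \
      __builtin_unreachable();                                     \
  } while (false)

// Hypothetical minimal view type with a checked accessor.
struct char_view
{
  const char* data;
  std::size_t size;

  constexpr char front() const
  {
    SKETCH_CONSTEXPR_ASSERT(size > 0);
    return data[0];
  }
};
```

With this shape no function is declared at block scope, which is the part that reportedly confused the CUDA compiler. Calling front() on an empty char_view inside a static_assert would be expected to fail to compile rather than evaluate.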

libstdc++-v3/ChangeLog:

PR libstdc++/100676
* include/bits/c++config (__glibcxx_assert_1): Rename to ...
(__glibcxx_constexpr_assert): ... this.
(__glibcxx_assert_impl): Use __glibcxx_constexpr_assert.
(__glibcxx_assert): Define as either __glibcxx_constexpr_assert
or __glibcxx_assert_impl.
(__glibcxx_assert_2): Remove.
* include/debug/macros.h (_GLIBCXX_DEBUG_VERIFY_AT_F): Use
__glibcxx_constexpr_assert instead of __glibcxx_assert_1.
* 
testsuite/21_strings/basic_string_view/element_access/char/back_constexpr_neg.cc:
Adjust expected error.
* 
testsuite/21_strings/basic_string_view/element_access/char/constexpr_neg.cc:
Likewise.
* 
testsuite/21_strings/basic_string_view/element_access/char/front_constexpr_neg.cc:
Likewise.
Likewise.
* 
testsuite/21_strings/basic_string_view/element_access/wchar_t/back_constexpr_neg.cc:
Likewise.
* 
testsuite/21_strings/basic_string_view/element_access/wchar_t/constexpr_neg.cc:
Likewise.
* 
testsuite/21_strings/basic_string_view/element_access/wchar_t/front_constexpr_neg.cc:
Likewise.
* testsuite/23_containers/span/back_neg.cc: Likewise.
* testsuite/23_containers/span/front_neg.cc: Likewise.
* testsuite/23_containers/span/index_op_neg.cc: Likewise.

Tested powerpc64le-linux. Committed to trunk.

commit 6b42b5a8a207de5e021a2916281f46bcd60b20d2
Author: Jonathan Wakely 
Date:   Thu May 20 16:39:06 2021

libstdc++: Use __builtin_unreachable for constexpr assertions [PR 100676]

The current implementation of compile-time precondition checks causes
compilation to fail by calling a non-constexpr function declared at
block scope. This breaks the CUDA compiler, which wraps some libstdc++
headers in a pragma that declares everything as a __host__ __device__
function, but others are not wrapped and so everything is a __host__
function. The local declaration thus gets redeclared as two different
types of function, which doesn't work.

Just use __builtin_unreachable to make constant evaluation fail, instead
of the local function declaration. Also simplify the assertion macros,
which has the side effect of giving simpler compilation errors when
using Clang.

libstdc++-v3/ChangeLog:

PR libstdc++/100676
* include/bits/c++config (__glibcxx_assert_1): Rename to ...
(__glibcxx_constexpr_assert): ... this.
(__glibcxx_assert_impl): Use __glibcxx_constexpr_assert.
(__glibcxx_assert): Define as either __glibcxx_constexpr_assert
or __glibcxx_assert_impl.
(__glibcxx_assert_2): Remove.
* include/debug/macros.h (_GLIBCXX_DEBUG_VERIFY_AT_F): Use
__glibcxx_constexpr_assert instead of __glibcxx_assert_1.
* 
testsuite/21_strings/basic_string_view/element_access/char/back_constexpr_neg.cc:
Adjust expected error.
* 
testsuite/21_strings/basic_string_view/element_access/char/constexpr_neg.cc:
Likewise.
* 
testsuite/21_strings/basic_string_view/element_access/char/front_constexpr_neg.cc:
Likewise.
Likewise.
* 
testsuite/21_strings/basic_string_view/element_access/wchar_t/back_constexpr_neg.cc:
Likewise.
* 
testsuite/21_strings/basic_string_view/element_access/wchar_t/constexpr_neg.cc:
Likewise.
* 
testsuite/21_strings/basic_string_view/element_access/wchar_t/front_constexpr_neg.cc:
Likewise.
* testsuite/23_containers/span/back_neg.cc: Likewise.
* testsuite/23_containers/span/front_neg.cc: Likewise.
* testsuite/23_containers/span/index_op_neg.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/c++config 
b/libstdc++-v3/include/bits/c++config
index 72ec91949de..9314117aed8 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -487,6 +487,16 @@ namespace std
 # define _GLIBCXX_EXTERN_TEMPLATE -1
 #endif
 
+
+#if __has_builtin(__builtin_is_constant_evaluated)
+# define __glibcxx_constexpr_assert(cond) \
+  if 

Re: [PATCH] libgccjit: Add support for types used by atomic builtins [PR96066] [PR96067]

2021-05-20 Thread David Malcolm via Gcc-patches
On Mon, 2021-05-17 at 21:02 -0400, Antoni Boucher via Jit wrote:
> Hello.
> This patch fixes the issue with using atomic builtins in libgccjit.
> Thanks for reviewing it.

[...snip...]
 
> diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c
> index 117ff70114c..de876ff9fa6 100644
> --- a/gcc/jit/jit-recording.c
> +++ b/gcc/jit/jit-recording.c
> @@ -2598,8 +2598,18 @@ recording::memento_of_get_pointer::accepts_writes_from 
> (type *rtype)
>  return false;
>  
>/* It's OK to assign to a (const T *) from a (T *).  */
> -  return m_other_type->unqualified ()
> -->accepts_writes_from (rtype_points_to);
> +  if (m_other_type->unqualified ()
> +->accepts_writes_from (rtype_points_to)) {
> +  return true;
> +  }
> +
> +  /* It's OK to assign to a (volatile const T *) from a (volatile const T 
> *). */
> +  if (m_other_type->unqualified ()->unqualified ()
> +->accepts_writes_from (rtype_points_to->unqualified ())) {
> +  return true;
> +  }

Presumably you need this to get the atomic builtins working?

If I'm reading the above correctly, the new test doesn't distinguish
between the 3 different kinds of qualifiers (aligned, volatile, and
const), it merely tries to strip some of them off.

It's not valid to e.g. assign to an (aligned T *) from a (const T *).

Maybe we need an internal enum to discriminate between different
subclasses of decorated_type?
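
One possible shape for such a check, as a sketch only: track const and volatile as a bitmask and accept a write when the source's qualifiers are a subset of the destination's. Every name below is hypothetical and none of it is libgccjit code. Treating alignment as something that must match exactly is also an assumption made here for illustration.

```cpp
#include <cassert>

// Hypothetical qualifier flags; not part of libgccjit.
enum qualifier : unsigned
{
  QUAL_NONE     = 0,
  QUAL_CONST    = 1u << 0,
  QUAL_VOLATILE = 1u << 1
};

// Hypothetical description of what a pointer points to.
struct pointee_type
{
  int base_id;         // identifies the unqualified type T
  unsigned cv_quals;   // bitwise OR of qualifier flags
  unsigned alignment;  // 0 means default alignment
};

// Can a pointer to SRC be assigned to a pointer to DST?
bool pointer_accepts_writes_from(const pointee_type &dst,
                                 const pointee_type &src)
{
  if (dst.base_id != src.base_id)
    return false;
  // Assumption: alignment is not a droppable qualifier.
  if (dst.alignment != src.alignment)
    return false;
  // Qualifiers may be added but never removed.
  return (src.cv_quals & ~dst.cv_quals) == 0;
}
```

Under these assumptions (const T *) accepts (T *), (volatile const T *) accepts itself, and (aligned T *) rejects (const T *).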


> +
> +  return false;
>  }
>  
>  /* Implementation of pure virtual hook recording::memento::replay_into
> diff --git a/gcc/testsuite/jit.dg/all-non-failing-tests.h 
> b/gcc/testsuite/jit.dg/all-non-failing-tests.h
> index 4202eb7798b..dfc6596358c 100644
> --- a/gcc/testsuite/jit.dg/all-non-failing-tests.h
> +++ b/gcc/testsuite/jit.dg/all-non-failing-tests.h
> @@ -181,6 +181,13 @@
>  #undef create_code
>  #undef verify_code
>  
> +/* test-builtin-types.c */
> +#define create_code create_code_builtin_types
> +#define verify_code verify_code_builtin_types
> +#include "test-builtin-types.c"
> +#undef create_code
> +#undef verify_code
> +
>  /* test-hello-world.c */
>  #define create_code create_code_hello_world
>  #define verify_code verify_code_hello_world

As with various other patches, this needs to also add a new entry to
the "testcases" array making use of the new
{create|verify}_code_builtin_types functions.

[...snip...]

Hope this is constructive
Dave



Re: [PATCH] libgccjit: Add support for TLS variable [PR95415]

2021-05-20 Thread David Malcolm via Gcc-patches
On Tue, 2021-05-18 at 20:43 -0400, Antoni Boucher via Gcc-patches
wrote:
> Hello.
> This patch adds support for TLS variables.
> One thing to fix before we merge it is the libgccjit.map file which
> contains LIBGCCJIT_ABI_16 instead of LIBGCCJIT_ABI_17.
> LIBGCCJIT_ABI_16 was added in one of my other patches.
> Thanks for the review.

> diff --git a/gcc/jit/docs/topics/compatibility.rst 
> b/gcc/jit/docs/topics/compatibility.rst
> index 239b6aa1a92..d10bc1df080 100644
> --- a/gcc/jit/docs/topics/compatibility.rst
> +++ b/gcc/jit/docs/topics/compatibility.rst
> @@ -243,3 +243,12 @@ embedding assembler instructions:
>* :func:`gcc_jit_extended_asm_add_input_operand`
>* :func:`gcc_jit_extended_asm_add_clobber`
>* :func:`gcc_jit_context_add_top_level_asm`
> +
> +.. _LIBGCCJIT_ABI_17:
> +
> +``LIBGCCJIT_ABI_17``
> +--------------------
> +``LIBGCCJIT_ABI_17`` covers the addition of an API entrypoint to set the
> +thread-local storage model of a variable:
> +
> +  * :func:`gcc_jit_lvalue_set_tls_model`

Sorry about the delay in reviewing patches.

Is there a summary somewhere of the various outstanding patches and
their associated ABI versions?  Are there dependencies between the
patches?

> diff --git a/gcc/jit/docs/topics/expressions.rst
> b/gcc/jit/docs/topics/expressions.rst
> index 396259ef07e..68defd6a311 100644
> --- a/gcc/jit/docs/topics/expressions.rst
> +++ b/gcc/jit/docs/topics/expressions.rst
> @@ -539,6 +539,34 @@ where the rvalue is computed by reading from the storage 
> area.
>  
> in C.
>  
> +.. function:: void\
> +  gcc_jit_lvalue_set_tls_model (gcc_jit_lvalue *lvalue,\
> +enum gcc_jit_tls_model model)
> +
> +   Make a variable a thread-local variable.
> +
> +   The "model" parameter determines the thread-local storage model of the 
> "lvalue":
> +
> +   .. type:: enum gcc_jit_tls_model
> +
> +   .. c:macro:: GCC_JIT_TLS_MODEL_GLOBAL_DYNAMIC
> +
> +   .. c:macro:: GCC_JIT_TLS_MODEL_LOCAL_DYNAMIC
> +
> +   .. c:macro:: GCC_JIT_TLS_MODEL_INITIAL_EXEC
> +
> +   .. c:macro:: GCC_JIT_TLS_MODEL_LOCAL_EXEC
> +
> +   .. c:macro:: GCC_JIT_TLS_MODEL_DEFAULT
> +
> +   This is analogous to:
> +
> +   .. code-block:: c
> +
> + _Thread_local int foo;
> +
> +   in C.

This comment needs the usual "This entrypoint was added in" text to
state which API version it was added in.

I confess to being a bit hazy on the different TLS models, and it's
unclear to me what all the different enum values do.  Is this
equivalent to the various values for __attribute__((tls_model(VALUE)))
?  This attribute is documented in
https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html,
though sadly that document doesn't seem to have a good anchor for that
attribute.

https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html currently links to
https://www.akkadia.org/drepper/tls.pdf "for a detailed explanation of
the four thread-local storage addressing models, and how the runtime is
expected to function."
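
For reference, this is what the attribute form looks like at the source level, as a sketch. The four strings are the values GCC documents for tls_model. Whether the proposed enum values map one-to-one onto them is exactly the open question, so the variable names and the pairing suggested here are assumptions.

```cpp
#include <cassert>

// One thread-local variable per documented tls_model value.
// Variable names are illustrative only.
static thread_local int tls_gd
  __attribute__((tls_model("global-dynamic"))) = 1;
static thread_local int tls_ld
  __attribute__((tls_model("local-dynamic"))) = 2;
static thread_local int tls_ie
  __attribute__((tls_model("initial-exec"))) = 3;
static thread_local int tls_le
  __attribute__((tls_model("local-exec"))) = 4;

// The model affects how the address is computed, not the value seen
// by the current thread.
int sum_tls()
{
  return tls_gd + tls_ld + tls_ie + tls_le;
}
```

The model only changes the addressing sequence the compiler emits, so from the program's point of view all four variables behave the same.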

One thing that should be clarified: does GCC_JIT_TLS_MODEL_DEFAULT mean
(a) thread-local storage, using a default model, or
(b) non-thread-local storage i.e. normal storage.

?

Reading the docs I thought it meant (a), but when I looked in more
detail at the implementation it looks like it means (b); is it meant
to?  This needs clarifying.

Are you using all of these enum values in your code?  Is this something
you need to expose for the rustc backend?


>  Global variables
>  
>  
> diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h
> index 825a3e172e9..654a9c472d4 100644
> --- a/gcc/jit/jit-playback.h
> +++ b/gcc/jit/jit-playback.h
> @@ -650,6 +650,8 @@ public:
>  
>  private:
>context *m_ctxt;
> +
> +protected:
>tree m_inner;
>  };

As noted in another review, I don't think you need to make this
protected...

>  
> @@ -670,6 +672,12 @@ public:
>rvalue *
>get_address (location *loc);
>  
> +  void
> +  set_tls_model (enum tls_model tls_model)
> +  {
> +set_decl_tls_model (m_inner, tls_model);
> +  }

...as I think you can use "as_tree ()" to get at m_inner here.


> diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c
> index 117ff70114c..64f3ae2d8f9 100644
> --- a/gcc/jit/jit-recording.c
> +++ b/gcc/jit/jit-recording.c
> @@ -3713,6 +3713,12 @@ recording::lvalue::get_address (recording::location 
> *loc)
>return result;
>  }
>  
> +void
> +recording::lvalue::set_tls_model (enum gcc_jit_tls_model model)
> +{
> +m_tls_model = model;
> +}
> +
>  /* The implementation of class gcc::jit::recording::param.  */
>  
>  /* Implementation of pure virtual hook recording::memento::replay_into
> @@ -4539,6 +4545,15 @@ recording::block::dump_edges_to_dot (pretty_printer 
> *pp)
>  #  pragma GCC diagnostic pop
>  #endif
>  
> +namespace recording {
> +static const enum tls_model tls_models[] = {
> +  TLS_MODEL_GLOBAL_DYNAMIC, /* GCC_JIT_TLS_MODEL_GLOBAL_DYNAMIC */
> +  TLS_MODEL_LOCAL_DYNAMIC, /* 

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-20 Thread Jonathan Wakely via Gcc-patches

On 19/05/21 23:52 +0100, Jonathan Wakely wrote:

On 19/05/21 16:08 -0400, Jason Merrill wrote:

On 5/19/21 4:05 PM, Jonathan Wakely wrote:

Oh, also we have https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93769
which points out a problem with the current wording. Not a very
important one, but still ...

While I'm touching all 38(?) places that say "only available with
-std=c++NN or -std=gnu++NN", I could change them to say something like
"only available since C++NN". Should I bother?

Clang's equivalent warnings say "are a C++11 feature" e.g.

ext.C:1:17: warning: inline namespaces are a C++11 feature 
[-Wc++11-inline-namespace]


(They have a specific warning for each feature, with
-Wc++11-extensions to control them all at once.)


The clang wording seems more accurate, as that PR points out.


OK, that requires touching a number of error_at and inform calls as
well as the pedwarns, so I'll address that separately in a later
patch.


Here's a WIP patch that rewords all those diagnostics. This doesn't
include the necessary testsuite changes, and I don't know when I'll
have time to do the rest of it. But it's a start. Does this look like
the right approach?
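
To make the reworded diagnostics concrete, here is a small sketch using four of the features the patch touches. It is written for C++17 or later; compiled with an older -std= option, each marked line would be expected to draw the corresponding diagnostic. All names are illustrative.

```cpp
#include <cassert>
#include <vector>

// "inline variables are a C++17 feature"
inline int call_count = 0;

// "deduced return type is a C++14 feature"
auto twice(int x) { ++call_count; return 2 * x; }

// "trailing return type is a C++11 feature"
auto thrice(int x) -> int { ++call_count; return 3 * x; }

// "extended initializer lists are a C++11 feature"
std::vector<int> first_three() { return {1, 2, 3}; }
```

The comments quote the proposed wording from the WIP patch; the exact text in a released compiler may differ.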



diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 7c32f09cf0e..b073883ad14 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -11166,8 +11166,8 @@ mark_inline_variable (tree decl, location_t loc)
   inlinep = false;
 }
   else if (cxx_dialect < cxx17)
-pedwarn (loc, OPT_Wc__17_extensions, "inline variables are only available "
-	 "with %<-std=c++17%> or %<-std=gnu++17%>");
+pedwarn (loc, OPT_Wc__17_extensions,
+	 "inline variables are a C++17 feature");
   if (inlinep)
 {
   retrofit_lang_decl (decl);
@@ -12006,9 +12006,9 @@ grokdeclarator (const cp_declarator *declarator,
 	{
 	  gcc_rich_location richloc (declspecs->locations[ds_virtual]);
 	  richloc.add_range (declspecs->locations[ds_constexpr]);
-	  pedwarn (&richloc, OPT_Wc__20_extensions, "member %qD can be "
-		   "declared both %<virtual%> and %<constexpr%> only in "
-		   "%<-std=c++20%> or %<-std=gnu++20%>", dname);
+	  pedwarn (&richloc, OPT_Wc__20_extensions, "declaring member %qD as "
+		   "both %<virtual%> and %<constexpr%> is a C++20 feature",
+		   dname);
 	}
 }
   friendp = decl_spec_seq_has_spec_p (declspecs, ds_friend);
@@ -12097,8 +12097,8 @@ grokdeclarator (const cp_declarator *declarator,
 		  "binding declaration cannot be %qs", "consteval");
   if (thread_p && cxx_dialect < cxx20)
 	pedwarn (declspecs->locations[ds_thread], OPT_Wc__20_extensions,
-		 "structured binding declaration can be %qs only in "
-		 "%<-std=c++20%> or %<-std=gnu++20%>",
+		 "structured binding declarations using %qs are a C++20 "
+		 "feature",
 		 declspecs->gnu_thread_keyword_p
 		 ? "__thread" : "thread_local");
   if (concept_p)
@@ -12119,8 +12119,9 @@ grokdeclarator (const cp_declarator *declarator,
 	case sc_static:
 	  if (cxx_dialect < cxx20)
 	pedwarn (loc, OPT_Wc__20_extensions,
-		 "structured binding declaration can be %qs only in "
-		 "%<-std=c++20%> or %<-std=gnu++20%>", "static");
+		 "structured binding declarations using %qs are a C++20 "
+		 "feature",
+		 "static");
 	  break;
 	case sc_extern:
 	  error_at (loc, "structured binding declaration cannot be %qs",
@@ -12416,8 +12417,7 @@ grokdeclarator (const cp_declarator *declarator,
   "%<auto%> type specifier without "
   "trailing return type", name);
 			inform (typespec_loc,
-"deduced return type only available "
-"with %<-std=c++14%> or %<-std=gnu++14%>");
+"deduced return type is a C++14 feature");
 		  }
 		else if (virtualp)
 		  {
@@ -12489,8 +12489,7 @@ grokdeclarator (const cp_declarator *declarator,
 		  /* Not using maybe_warn_cpp0x because this should
 		 always be an error.  */
 		  error_at (typespec_loc,
-			"trailing return type only available "
-			"with %<-std=c++11%> or %<-std=gnu++11%>");
+			"trailing return type is a C++11 feature");
 		else
 		  error_at (typespec_loc, "%qs function with trailing "
 			"return type not declared with %<auto%> "
@@ -13584,8 +13583,7 @@ grokdeclarator (const cp_declarator *declarator,
 		if (constexpr_p && cxx_dialect < cxx20)
 		  {
 		error_at (declspecs->locations[ds_constexpr],
-			  "%<constexpr%> destructors only available"
-			  " with %<-std=c++20%> or %<-std=gnu++20%>");
+			  "%<constexpr%> destructors are a C++20 feature");
 		return error_mark_node;
 		  }
 		if (consteval_p)
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 3d5eebd4bcd..4c7cb1725c7 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -4408,79 +4408,64 @@ maybe_warn_cpp0x (cpp0x_warn_str str)
   {
   case CPP0X_INITIALIZER_LISTS:
 	pedwarn (input_location, OPT_Wc__11_extensions,
-		 "extended initializer lists "
-		 "only available with %<-std=c++11%> or %<-std=gnu++11%>");
+		 "extended initializer lists are a C++11 feature");
 	break;
   case CPP0X_EXPLICIT_CONVERSION:
 	pedwarn (input_location, OPT_Wc__11_extensions,
-		 "explicit conversion 

Re: [PATCH] libgccjit: Add support for setting the link section of global variables [PR100688]

2021-05-20 Thread David Malcolm via Gcc-patches
On Thu, 2021-05-20 at 15:29 -0400, David Malcolm wrote:
> On Wed, 2021-05-19 at 20:32 -0400, Antoni Boucher via Jit wrote:
> > Hello.
> > This patch adds support to set the link section of global
> > variables.
> > I used the ABI 18 because I submitted other patches up to 17.
> > Thanks for the review.
> 
> I didn't see this email until now, and put the review in bugzilla
> instead; sorry.
> 
> Here's a copy-and-paste of what I put in bugzilla:
> 

[...snip...]

> 
> > diff --git a/gcc/testsuite/jit.dg/all-non-failing-tests.h
> > b/gcc/testsuite/jit.dg/all-non-failing-tests.h
> > index 4202eb7798b..7e3b59dee0d 100644
> > --- a/gcc/testsuite/jit.dg/all-non-failing-tests.h
> > +++ b/gcc/testsuite/jit.dg/all-non-failing-tests.h
> > @@ -181,6 +181,13 @@
> >  #undef create_code
> >  #undef verify_code
> >  
> > +/* test-link-section.c */
> > +#define create_code create_code_link_section
> > +#define verify_code verify_code_link_section
> > +#include "test-link-section.c"
> > +#undef create_code
> > +#undef verify_code
> > +
> >  /* test-hello-world.c */
> >  #define create_code create_code_hello_world
> >  #define verify_code verify_code_hello_world

Something else I just noticed when looking at another of your patches:
you also need to add a new entry to the "testcases" array to use the
new {create|verify}_code_link_section functions.

> 
[...snip...]

Dave



Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-20 Thread Jonathan Wakely via Gcc-patches

On 20/05/21 12:34 -0400, Jason Merrill wrote:

On 5/20/21 8:56 AM, Jonathan Wakely wrote:

On 19/05/21 16:05 -0400, Jason Merrill wrote:

On 5/19/21 3:55 PM, Jonathan Wakely wrote:

On 19/05/21 13:26 -0400, Jason Merrill wrote:

On 5/19/21 12:46 PM, Jonathan Wakely wrote:

On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.



  if (omitted_parms_loc && lambda_specs.any_specifiers_p)
    {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,


You probably want to change


 else if (cxx_dialect < cxx23)
   omitted_parms_loc = cp_lexer_peek_token (parser->lexer)->location;


To use warn_about_dialect_p.


Ah yes.

And just above that there's another pedwarn about a C++14 feature
being used:


 /* Default arguments shall not be specified in the
 parameter-declaration-clause of a lambda-declarator.  */
 if (cxx_dialect < cxx14)
for (tree t = param_list; t; t = TREE_CHAIN (t))
  if (TREE_PURPOSE (t) && DECL_P (TREE_VALUE (t)))
    pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
 "default argument specified for lambda parameter");


I didn't notice that one initially. That should also use
warn_about_dialect_p and OPT_Wc__14_extensions.


Indeed.


Should I change the message to say "init capture" rather than
"default argument"?


No, this is about e.g. [](int = 42){}
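
Spelled out slightly more fully, the construct in question is a default argument on a lambda parameter, which is valid from C++14 onwards. A minimal sketch, with a hypothetical wrapper function so that it can be called:

```cpp
#include <cassert>

// Hypothetical wrapper around a lambda whose parameter has a default
// argument; under -std=c++11 the default argument is what GCC
// pedwarns about.
int call_lambda_with_default()
{
  auto f = [](int x = 42) { return x; };
  return f();  // uses the default argument
}

int call_lambda_with_value(int v)
{
  auto f = [](int x = 42) { return x; };
  return f(v);  // overrides the default argument
}
```

This is unrelated to init-captures, which is why the existing message text stays as it is.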


OK, this is a simpler version of the patch, with docs now, but without
the new warn_about_cxx_dialect_p function (which isn't needed) and
with no changes to any actual warning text (I'll do that separately,
if at all).

I also caught a few more pedwarn cases that I missed previously.

Tested powerpc64le-linux. OK for trunk?


OK.  Do we also want, say, -Wno-std-extensions to turn them all off at once?


That seems to make sense, but on the other hand Clang seems to manage
without. I wonder how useful it would be in practice. I expect that
people will usually only be using features from one standard newer
than they compile with, e.g. maybe using variadic templates in C++03,
or generic lambdas in C++11. I doubt many people are using C++20
features while compiling with -std=gnu++11, so I'm not sure how often
anybody would need to specify multiple -Wno-c++NN-extensions options.

And if they are using C++20 features with -std=gnu++11, maybe we
shouldn't be encouraging that by making it easier :-)




Re: [PATCH 2/2] Fix tests when running on power10, PR testsuite/100166

2021-05-20 Thread will schmidt via Gcc-patches
On Tue, 2021-05-18 at 16:59 -0400, Michael Meissner wrote:
> [PATCH 2/2] Fix tests when running on power10, PR testsuite/100166
> 
Hi,


> This patch updates the various tests in the testsuite to adjust the test
> if power10 code generation is used.
> 
> Some tests would not generate the expected instructions because power10
> provides new instructions that the compiler now generates.  These tests are
> adjusted to use '#pragma GCC target ("cpu=power9")', or the new instructions
> were added to the regex.

ok

> 
> One test was checking for 64-bit TOC calls, and it was adjusted to also allow
> PC-relative calls.
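
To show what the widened pattern in pr56727-2.c accepts, here is a sketch that applies the same regular expression to sample assembler text. The testsuite matches with Tcl regular expressions, so using std::regex here is an approximation that assumes the two engines agree on this simple pattern. The helper name and the sample strings are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if ASM_TEXT contains either a TOC-style
// call to f ("bl f" followed by a nop on the next line) or a
// PC-relative call ("bl f@notoc").
bool has_call_to_f(const std::string &asm_text)
{
  static const std::regex pattern("(bl f\n\\s*nop)|(bl f@notoc)");
  return std::regex_search(asm_text, pattern);
}
```

The first alternative is the original 64-bit TOC form and the second is the form a power10 default is expected to produce.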
> 
> I have bootstrapped this on LE power9 and BE power8 systems.  There were no
> regressions in the tests.  Can I check this into the trunk?
> 
> I would like to back port these patches to GCC 11 after a cooling off period.
> Is that ok?
> 
> gcc/testsuite/
> 2021-05-18  Michael Meissner  
> 
>   PR testsuite/100166
>   * gcc.dg/pr56727-2.c: Add support for PC-relative calls.
>   * gcc.target/powerpc/fold-vec-div-longlong.c:
>   * gcc.target/powerpc/fold-vec-mult-longlong.c: Disable power10
>   code generation.
>   * gcc.target/powerpc/ppc-eq0-1.c: Add support for the setbc
>   instruction.
>   * gcc.target/powerpc/ppc-ne0-1.c: Disable power10 code
>   generation.
> ---
>  gcc/testsuite/gcc.dg/pr56727-2.c  | 2 +-
>  gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c  | 7 +++
>  gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c | 7 +++
>  gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c  | 2 +-
>  gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c  | 8 
>  5 files changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/pr56727-2.c 
> b/gcc/testsuite/gcc.dg/pr56727-2.c
> index c54369ed25e..77fdf4bc350 100644
> --- a/gcc/testsuite/gcc.dg/pr56727-2.c
> +++ b/gcc/testsuite/gcc.dg/pr56727-2.c
> @@ -18,4 +18,4 @@ void h ()
> 
>  /* { dg-final { scan-assembler "@(PLT|plt)" { target i?86-*-* x86_64-*-* } } 
> } */
>  /* { dg-final { scan-assembler "@(PLT|plt)" { target { powerpc*-*-linux* && 
> ilp32 } } } } */
> -/* { dg-final { scan-assembler "bl f\n\\s*nop" { target { powerpc*-*-linux* 
> && lp64 } } } } */
> +/* { dg-final { scan-assembler "(bl f\n\\s*nop)|(bl f@notoc)" { target { 
> powerpc*-*-linux* && lp64 } } } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c 
> b/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
> index 312e984d3cc..1d20b7ff100 100644
> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-div-longlong.c
> @@ -6,6 +6,13 @@
>  /* { dg-require-effective-target lp64 } */
>  /* { dg-options "-mvsx -O2" } */
> 
> +/* If the compiler was configured to automatically generate power10 support 
> with
> +   --with-cpu=power10, turn it off.  Otherwise, it will generate VDIVSD and
> +   VDIVUD instructions.  */
> +#ifdef _ARCH_PWR10
> +#pragma GCC target ("cpu=power9")
> +#endif
> +
>  #include <altivec.h>
> 
>  vector signed long long
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c 
> b/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
> index 38dba9f5023..7510dc5c7a7 100644
> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-mult-longlong.c
> @@ -6,6 +6,13 @@
>  /* { dg-options "-maltivec -mvsx -mpower8-vector" } */
>  /* { dg-additional-options "-maix64" { target powerpc-ibm-aix* } } */
> 
> +/* If the compiler was configured to automatically generate power10 support 
> with
> +   --with-cpu=power10, turn it off.  Otherwise, it will generate VMULLD
> +   instructions.  */
> +#ifdef _ARCH_PWR10
> +#pragma GCC target ("cpu=power9")
> +#endif
> +
>  #include <altivec.h>
> 
>  vector signed long long
> diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c 
> b/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c
> index 496a6e340c0..2ddf03117ab 100644
> --- a/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c
> @@ -7,4 +7,4 @@ int foo(int x)
>return x == 0;
>  }
> 
> -/* { dg-final { scan-assembler "cntlzw|isel" } } */
> +/* { dg-final { scan-assembler "cntlzw|isel|setbc" } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c 
> b/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c
> index 63c4b6087df..bf777979833 100644
> --- a/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c
> @@ -2,6 +2,14 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -mno-isel" } */
> 
> +/* If the compiler was configured to automatically generate power10 support 
> with
> +   --with-cpu=power10, turn it off.  Otherwise, it will generate a SETBCR
> +   instruction instead of ADDIC/SUBFE.  */
> +
> +#ifdef _ARCH_PWR10
> +#pragma GCC target ("cpu=power9")
> +#endif
> +
>  /* { dg-final { scan-assembler-times "addic" 4 } } */
>  

Re: [PATCH 1/2] Deal with prefixed loads/stores in tests, PR testsuite/100166

2021-05-20 Thread will schmidt via Gcc-patches
On Tue, 2021-05-18 at 16:57 -0400, Michael Meissner wrote:
> [PATCH 1/2] Deal with prefixed loads/stores in tests, PR testsuite/100166
> 

Hi,

> This patch updates the various tests in the testsuite to treat plxv
> and pstxv as being vector loads/stores.  This shows up if you run the
> testsuite with a compiler configured with the option: --with-cpu=power10.
> 
> I have bootstrapped this on LE power9 and BE power8 systems.  There were no
> regressions in the tests.  Can I check this into the trunk?
> 
> I would like to back port these patches to GCC 11 after a cooling off period.
> Is that ok?
> 
> gcc/testsuite/
> 2021-05-18  Michael Meissner  
> 
>   PR testsuite/100166
>   * gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c:
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c:
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c:
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c:
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c:
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c:
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c:
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c:
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c:
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c:
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c:
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c:
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c:
>   * gcc.target/powerpc/fold-vec-load-vec_xl-char.c:
>   * gcc.target/powerpc/fold-vec-load-vec_xl-double.c:
>   * gcc.target/powerpc/fold-vec-load-vec_xl-float.c:
>   * gcc.target/powerpc/fold-vec-load-vec_xl-int.c:
>   * gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c:
>   * gcc.target/powerpc/fold-vec-load-vec_xl-short.c:
>   * gcc.target/powerpc/fold-vec-splat-floatdouble.c:
>   * gcc.target/powerpc/fold-vec-splat-longlong.c:
>   * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c:
>   * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c:
>   * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c:
>   * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c:
>   * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-longlong.c:
>   * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c:
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c:
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c:
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c:
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c:
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c:
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c:
>   * gcc.target/powerpc/fold-vec-store-vec_xst-char.c:
>   * gcc.target/powerpc/fold-vec-store-vec_xst-double.c:
>   * gcc.target/powerpc/fold-vec-store-vec_xst-float.c:
>   * gcc.target/powerpc/fold-vec-store-vec_xst-int.c:
>   * gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c:
>   * gcc.target/powerpc/fold-vec-store-vec_xst-short.c:
>   * gcc.target/powerpc/lvsl-lvsr.c:
>   * gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c:
>   Update insn counts to account for power10 prefixed loads and
>   stores.
> ---
>  .../vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c   | 2 +-
>  .../gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c | 2 +-
>  .../powerpc/fold-vec-load-builtin_vec_xl-double.c  | 2 +-
>  .../powerpc/fold-vec-load-builtin_vec_xl-float.c   | 2 +-
>  .../gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c  | 2 +-
>  .../powerpc/fold-vec-load-builtin_vec_xl-longlong.c| 2 +-
>  .../powerpc/fold-vec-load-builtin_vec_xl-short.c   | 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c | 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c   | 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c| 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c  | 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c | 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c| 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_xl-char.c | 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_xl-double.c   | 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_xl-float.c| 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_xl-int.c  | 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c | 2 +-
>  .../gcc.target/powerpc/fold-vec-load-vec_xl-short.c| 2 +-
>  .../gcc.target/powerpc/fold-vec-splat-floatdouble.c| 7 ---
>  gcc/testsuite/gcc.target/powerpc/fold-vec-splat-longlong.c | 2 +-
>  .../powerpc/fold-vec-store-builtin_vec_xst-char.c  | 2 +-
>  .../powerpc/fold-vec-store-builtin_vec_xst-double.c| 2 +-
>  

Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-20 Thread Jonathan Wakely via Gcc-patches

On 20/05/21 11:25 -0600, Martin Sebor wrote:

On 5/20/21 6:56 AM, Jonathan Wakely wrote:

On 19/05/21 16:05 -0400, Jason Merrill wrote:

On 5/19/21 3:55 PM, Jonathan Wakely wrote:

On 19/05/21 13:26 -0400, Jason Merrill wrote:

On 5/19/21 12:46 PM, Jonathan Wakely wrote:

On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.



  if (omitted_parms_loc && lambda_specs.any_specifiers_p)
    {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,


You probably want to change


 else if (cxx_dialect < cxx23)
   omitted_parms_loc = cp_lexer_peek_token (parser->lexer)->location;


To use warn_about_dialect_p.


Ah yes.

And just above that there's another pedwarn about a C++14 feature
being used:


 /* Default arguments shall not be specified in the
 parameter-declaration-clause of a lambda-declarator.  */
 if (cxx_dialect < cxx14)
for (tree t = param_list; t; t = TREE_CHAIN (t))
  if (TREE_PURPOSE (t) && DECL_P (TREE_VALUE (t)))
    pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
 "default argument specified for lambda parameter");


I didn't notice that one initially. That should also use
warn_about_dialect_p and OPT_Wc__14_extensions.


Indeed.


Should I change the message to say "init capture" rather than
"default argument"?


No, this is about e.g. [](int = 42){}
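
For anyone following along, here is a minimal sketch of the construct in question: a default argument on a lambda parameter, which is only valid from C++14 on. The helper names (call_lambda_with_default, call_lambda_with_argument) are made up for illustration; going by the code quoted above, I would expect the pedwarn when this is compiled with -std=c++11 -Wpedantic.

```cpp
#include <cassert>

// Hypothetical helper: calls a lambda whose parameter has a default
// argument.  Default arguments in a lambda's parameter list are a
// C++14 feature.
inline int call_lambda_with_default ()
{
  auto f = [] (int x = 42) { return x; };
  return f ();   // no argument passed, so the default is used
}

// An explicit argument overrides the default, as with ordinary functions.
inline int call_lambda_with_argument (int value)
{
  auto f = [] (int x = 42) { return x; };
  return f (value);
}
```

Note that this is a default argument, not an init-capture, which is why the existing message text is the right one.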


OK, this is a simpler version of the patch, with docs now, but without
the new warn_about_cxx_dialect_p function (which isn't needed) and
with no changes to any actual warning text (I'll do that separately,
if at all).

I also caught a few more pedwarn cases that I missed previously.

Tested powerpc64le-linux. OK for trunk?


This looks good to me, and the change is simpler overall.  Just one
minor thing (sorry if that seems nit-picky): in the last sentence
in the documentation, does "this option" refer to the -Wc++11 form
or to the negative? (The latter is the one that's going to be
mentioned in the entry.)


It refers to the form documented, the -Wno- one.

To give a specific example, warnings about variadic templates in C++98
are enabled by default, and disabled by -Wno-c++11-extensions. But
warnings about inline namespaces in C++98 are *not* enabled by
default, only if you use -Wpedantic. -Wno-c++11-extensions still
silences them when you do use -Wpedantic, it's just not needed to
suppress them by default.
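
A small sketch of the two constructs from that example, in case it helps. The names (count_args, lib, v1, version) are invented for illustration. Per the description above, compiling this with -std=c++98 should diagnose the variadic template by default and the inline namespace only with -Wpedantic, and -Wno-c++11-extensions should silence both.

```cpp
#include <cassert>
#include <cstddef>

// C++11 construct that is diagnosed by default in C++98 mode:
// a variadic template.
template <typename... Args>
inline std::size_t count_args (Args...)
{
  return sizeof... (Args);
}

// C++11 construct that is diagnosed in C++98 mode only when -Wpedantic
// is used: an inline namespace.
namespace lib
{
  inline namespace v1
  {
    inline int version () { return 1; }
  }
}
```

With a C++11 or later -std the code is simply valid and neither warning applies.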



If what the sentence is trying to say is that warnings for some C++11
constructs are controlled only by -Wpedantic then I'd suggest to


No, not _only_ by -Wpedantic. They depend on both -Wpedantic and
-Wc++11-extensions. Some warnings are not emitted by default, only
when -Wpedantic is used. When -Wpedantic _is_ used, you can use
-Wno-c++11-extensions to suppress them again (without suppressing all
the other warnings that -Wpedantic enables).


rephrase it to make that part clearer (or drop it altogether since
it sounds like it describes a limitation/problem that we might want
to work toward fixing).


I think the doc text is accurate, but it seems it could be clearer so
that it's both accurate and easy to understand. I'm open to
suggestions.

At one point I did include the specific examples given above, so it
read something like this:


-Wno-c++11-extensions (C++ and Objective-C++ only)

Do not warn about C++11 constructs in code being compiled using an
older C++ standard, e.g., disable warnings about using variadic
templates in C++98 code.  Even without this option, some C++11
constructs will only be diagnosed if @option{-Wpedantic} is used,
e.g., by default there are no warnings about inline namespaces in
C++98 code, only when -Wpedantic is used.  The -Wno-c++11-extensions
option disables those warnings when -Wpedantic is used.




Re: [PATCH] Fix vec-splati-runnable.c test.

2021-05-20 Thread will schmidt via Gcc-patches
On Tue, 2021-05-18 at 16:49 -0400, Michael Meissner wrote:
> [PATCH] Fix vec-splati-runnable.c test.
> 

hi,


> I noticed that the vec-splati-runnable.c did not have an abort after one
> of the tests.  If the test was run with optimization, the optimizer could
> delete some of the tests and throw off the count.
> 


> I have bootstrapped this on LE power9 and BE power8 systems.  There were no
> regressions in the tests.  Can I check this into the trunk?
> 
> I do not expect to back port this to GCC 11 unless we will be back porting the
> future patches that add support for the XXSPLTIW, XXSPLTIDP, and XXSPLTI32DX
> instructions.
> 
> gcc/testsuite/
> 2021-05-18  Michael Meissner  
> 
>   * gcc.target/powerpc/vec-splati-runnable.c: Run test with -O2
>   optimization.  Do not check what XXSPLTIDP generates if the value
>   is undefined.
> ---
>  .../gcc.target/powerpc/vec-splati-runnable.c  | 29 ++-
>  1 file changed, 9 insertions(+), 20 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
> index e84ce77a21d..a135279b1d7 100644
> --- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
> +++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
> @@ -1,7 +1,7 @@
>  /* { dg-do run { target { power10_hw } } } */
>  /* { dg-do link { target { ! power10_hw } } } */
>  /* { dg-require-effective-target power10_ok } */
> -/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
> +/* { dg-options "-mdejagnu-cpu=power10 -save-temps -O2" } */
>  #include 
> 
>  #define DEBUG 0
> @@ -12,6 +12,8 @@
> 
>  extern void abort (void);
> 
> +volatile vector double vresult_d_undefined;
> +
>  int
>  main (int argc, char *argv [])
>  {
> @@ -85,25 +87,12 @@ main (int argc, char *argv [])
>  #endif
>}
> 
> -  /* This test will generate a "note" to the user that the argument
> - is subnormal.  It is not an error, but results are not defined.  */
> -  vresult_d = (vector double) { 2.0, 3.0 };
> -  expected_vresult_d = (vector double) { 6.6E-42f, 6.6E-42f };
> -
> -  vresult_d = vec_splatid (6.6E-42f);
> -
> -  /* Although the instruction says the results are not defined, it does seem
> - to work, at least on Mambo.  But no guarentees!  */
> -  if (!vec_all_eq (vresult_d,  expected_vresult_d)) {
> -#if DEBUG
> -printf("ERROR, vec_splati (6.6E-42f)\n");
> -for(i = 0; i < 2; i++)
> -  printf(" vresult_d[%i] = %e, expected_vresult_d[%i] = %e\n",
> -  i, vresult_d[i], i, expected_vresult_d[i]);
> -#else
> -;
> -#endif
> -  }
> +  /* This test will generate a "note" to the user that the argument is
> + subnormal.  It is not an error, but results are not defined.  Because this
> + is undefined, we cannot check that any value is correct.  Just store it in

as in undefined-behavior..?

> + a volatile variable so the XXSPLTIDP instruction gets generated and the
> + warning message printed. */
> +  vresult_d_undefined = vec_splatid (6.6E-42f);


This does not look like it adds an abort() call, as I would have
expected from the patch description.

It looks like it still calls vec_splatid(), but now assigns the
result to a variable named vresult_d_undefined.  It also removes some
DEBUG code, which is fine.  So only the vec_all_eq() check is removed?
I'm not certain I see how that will change the results; is it just the
-O2 optimization that makes the difference?
I may be missing something...
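
For reference, my understanding of the volatile part, as a target-independent sketch (result_sink and compute_and_keep are made-up names, and plain double stands in for the vector type): a store to a volatile object is an observable side effect, so the compiler has to emit it even at -O2. Which instruction produces the stored value is still up to the compiler.

```cpp
#include <cassert>

// Hypothetical sink: a volatile object, so every store to it is an
// observable side effect that the compiler must keep.
volatile double result_sink;

// The product is otherwise unused.  Because the destination is volatile,
// the store cannot be deleted as dead code, even when optimizing.
inline void compute_and_keep (double x)
{
  result_sink = x * 2.0;
}
```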


Thanks,
-Will

> 
>/* Vector splat immediate */
>vsrc_a_int = (vector int) { 2, 3, 4, 5 };
> -- 
> 2.31.1
> 



Re: [PATCH 2/2] Fix xxeval predicates.

2021-05-20 Thread will schmidt via Gcc-patches
On Tue, 2021-05-18 at 16:47 -0400, Michael Meissner wrote:
> [PATCH 2/2] Fix xxeval predicates.
> 
> In doing the patch to move the XX* built-in functions from altivec.md to
> vsx.md, I noticed that the xxeval built-in function used the
> altivec_register_operand predicate.  Since it takes vsx registers, this
> might force the register allocate to issue a move when it could use a
> traditional floating point register.  This patch fixes that.

allocator ?

> 
> gcc/
> 2021-05-18  Michael Meissner  
> 
>   * config/rs6000/vsx.md (xxeval): Use register_predicate instead of
>   altivec_register_predicate.
> ---
>  gcc/config/rs6000/vsx.md | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index a859038d399..15a8c0e22d8 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -6410,9 +6410,9 @@ (define_insn "xxpermx_inst"
>  ;; XXEVAL built-in function support
>  (define_insn "xxeval"
>[(set (match_operand:V2DI 0 "register_operand" "=wa")
> - (unspec:V2DI [(match_operand:V2DI 1 "altivec_register_operand" "wa")
> -   (match_operand:V2DI 2 "altivec_register_operand" "wa")
> -   (match_operand:V2DI 3 "altivec_register_operand" "wa")
> + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "wa")
> +   (match_operand:V2DI 2 "register_operand" "wa")
> +   (match_operand:V2DI 3 "register_operand" "wa")
> (match_operand:QI 4 "u8bit_cint_operand" "n")]
>UNSPEC_XXEVAL))]
> "TARGET_POWER10"
> -- 


ok
Thanks,
-Will

> 2.31.1
> 



Re: [PATCH 1/2] Move xx* builtins to vsx.md.

2021-05-20 Thread will schmidt via Gcc-patches
On Tue, 2021-05-18 at 16:46 -0400, Michael Meissner wrote:
> [PATCH 1/2] Move xx* builtins to vsx.md.
> 

Hi,


> I noticed that the xx built-in functions (xxspltiw, xxspltidp, xxsplti32dx,
> xxeval, xxblend, and xxpermx) were all defined in altivec.md.  However, since
> the XX instructions can take both traditional floating point and Altivec
> registers, these built-in functions should be in vsx.md.
> 
> This patch just moves the insns from altivec.md to vsx.md.
> 
> I also moved the VM3 mode iterator and VM3_char mode attribute from altivec.md
> to vsx.md, since the only use of these were for the XXBLEND insns.
> 
> I have bootstrapped this on LE power9 and BE power8 systems.  There were no
> regressions in the tests.  Can I check this into the trunk?
> 
> I do not expect to back port this to GCC 11 unless we will be back porting the
> future patches that add support for the XXSPLTIW, XXSPLTIDP, and XXSPLTI32DX
> instructions.
> 
> gcc/
> 2021-05-18  Michael Meissner  
> 
>   * config/rs6000/altivec.md (UNSPEC_XXEVAL): Move to vsx.md.
>   (UNSPEC_XXSPLTIW): Move to vsx.md.
>   (UNSPEC_XXSPLTID): Move to vsx.md.
>   (UNSPEC_XXSPLTI32DX): Move to vsx.md.
>   (UNSPEC_XXBLEND): Move to vsx.md.
>   (UNSPEC_XXPERMX): Move to vsx.md.
>   (VM3): Move to vsx.md.
>   (VM3_char): Move to vsx.md.
>   (xxspltiw_v4si): Move to vsx.md.
>   (xxspltiw_v4sf): Move to vsx.md.
>   (xxspltiw_v4sf_inst): Move to vsx.md.
>   (xxspltidp_v2df): Move to vsx.md.
>   (xxspltidp_v2df_inst): Move to vsx.md.
>   (xxsplti32dx_v4si_inst): Move to vsx.md.
>   (xxsplti32dx_v4sf): Move to vsx.md.
>   (xxsplti32dx_v4sf_inst): Move to vsx.md.
>   (xxblend_): Move to vsx.md.
>   (xxpermx): Move to vsx.md.
>   (xxpermx_inst): Move to vsx.md.
>   * config/rs6000/vsx.md (UNSPEC_XXEVAL): Move from altivec.md.
>   (UNSPEC_XXSPLTIW): Move from altivec.md.
>   (UNSPEC_XXSPLTID): Move from altivec.md.
>   (UNSPEC_XXSPLTI32DX): Move from altivec.md.
>   (UNSPEC_XXBLEND): Move from altivec.md.
>   (UNSPEC_XXPERMX): Move from altivec.md.
>   (VM3): Move from altivec.md.
>   (VM3_char): Move from altivec.md.
>   (xxspltiw_v4si): Move from altivec.md.
>   (xxspltiw_v4sf): Move from altivec.md.
>   (xxspltiw_v4sf_inst): Move from altivec.md.
>   (xxspltidp_v2df): Move from altivec.md.
>   (xxspltidp_v2df_inst): Move from altivec.md.
>   (xxsplti32dx_v4si_inst): Move from altivec.md.
>   (xxsplti32dx_v4sf): Move from altivec.md.
>   (xxsplti32dx_v4sf_inst): Move from altivec.md.
>   (xxblend_): Move from altivec.md.
>   (xxpermx): Move from altivec.md.
>   (xxpermx_inst): Move from altivec.md.
> ---
>  gcc/config/rs6000/altivec.md | 196 -
>  gcc/config/rs6000/vsx.md | 204 +++
>  2 files changed, 204 insertions(+), 196 deletions(-)
> 
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 1351dafbc41..8a9f55c561b 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -171,16 +171,10 @@ (define_c_enum "unspec"
> UNSPEC_VPEXTD
> UNSPEC_VCLRLB
> UNSPEC_VCLRRB
> -   UNSPEC_XXEVAL
> UNSPEC_VSTRIR
> UNSPEC_VSTRIL
> UNSPEC_SLDB
> UNSPEC_SRDB
> -   UNSPEC_XXSPLTIW
> -   UNSPEC_XXSPLTID
> -   UNSPEC_XXSPLTI32DX
> -   UNSPEC_XXBLEND
> -   UNSPEC_XXPERMX
>  ])
> 
>  (define_c_enum "unspecv"
> @@ -221,21 +215,6 @@ (define_mode_iterator VM2 [V4SI
>  (KF "FLOAT128_VECTOR_P (KFmode)")
>  (TF "FLOAT128_VECTOR_P (TFmode)")])
> 
> -;; Like VM2, just do char, short, int, long, float and double
> -(define_mode_iterator VM3 [V4SI
> -V8HI
> -V16QI
> -V4SF
> -V2DF
> -V2DI])
> -
> -(define_mode_attr VM3_char [(V2DI "d")
> -(V4SI "w")
> -(V8HI "h")
> -(V16QI "b")
> -(V2DF  "d")
> -(V4SF  "w")])
> -
>  ;; Map the Vector convert single precision to double precision for integer
>  ;; versus floating point
>  (define_mode_attr VS_sxwsp [(V4SI "sxw") (V4SF "sp")])
> @@ -820,169 +799,6 @@ (define_insn "vsdb_"
>"vsdbi %0,%1,%2,%3"
>[(set_attr "type" "vecsimple")])
> 
> -(define_insn "xxspltiw_v4si"
> -  [(set (match_operand:V4SI 0 "register_operand" "=wa")
> - (unspec:V4SI [(match_operand:SI 1 "s32bit_cint_operand" "n")]
> -  UNSPEC_XXSPLTIW))]
> - "TARGET_POWER10"
> - "xxspltiw %x0,%1"
> - [(set_attr "type" "vecsimple")
> -  (set_attr "prefixed" "yes")])
> -
> -(define_expand "xxspltiw_v4sf"
> -  [(set (match_operand:V4SF 0 "register_operand" "=wa")
> - (unspec:V4SF [(match_operand:SF 1 "const_double_operand" "n")]
> -  UNSPEC_XXSPLTIW))]
> - 

Re: [PATCH] Change rs6000_const_f32_to_i32 return type.

2021-05-20 Thread will schmidt via Gcc-patches
On Tue, 2021-05-18 at 16:39 -0400, Michael Meissner wrote:
> [PATCH] Change rs6000_const_f32_to_i32 return type.
> 
> The function rs6000_const_f32_to_i32 called REAL_VALUE_TO_TARGET_SINGLE
> with a long long type and returns it.  This patch changes the type to long
> which is the proper type for REAL_VALUE_TO_TARGET_SINGLE.

ok

That seems consistent with the tm.texi blurb: 
For @code{REAL_VALUE_TO_TARGET_SINGLE} and
@code{REAL_VALUE_TO_TARGET_DECIMAL32}, this variable should be
a simple @code{long int}. 
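
As an aside, a host-side sketch of what "convert an SFmode constant to its integer bit pattern" means. This is not how REAL_VALUE_TO_TARGET_SINGLE works internally (that operates on GCC's own real_value representation); float_to_bit_pattern is a hypothetical helper, and it assumes the host float is IEEE 754 binary32.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: return the 32-bit pattern of VALUE in a long.
// A long is at least 32 bits wide, so the pattern always fits (on hosts
// where long is exactly 32 bits, patterns with the sign bit set come out
// as negative values).
inline long float_to_bit_pattern (float value)
{
  static_assert (sizeof (float) == sizeof (std::uint32_t),
                 "sketch assumes a 32-bit float");
  std::uint32_t bits;
  std::memcpy (&bits, &value, sizeof bits);   // well-defined type punning
  return static_cast<long> (bits);
}
```

The point is only that the result is a 32-bit quantity, so long is wide enough and long long is not needed.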

> 
> I have done bootstraps on little endian power9 and big endian power8 systems.
> Can I check this into the trunk?
> 
> This does not need to go into GCC 11, unless some of the other patches that use
> this function are also back ported.
> 
> gcc/
> 2021-05-18  Michael Meissner  
> 
>   * config/rs6000/rs6000-protos.h (rs6000_const_f32_to_i32): Change
>   return type to long.
>   * config/rs6000/rs6000.c (rs6000_const_f32_to_i32): Change return
>   type to long.
> ---
>  gcc/config/rs6000/rs6000-protos.h | 2 +-
>  gcc/config/rs6000/rs6000.c| 6 --
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
> index bef727e0a64..c407034d58c 100644
> --- a/gcc/config/rs6000/rs6000-protos.h
> +++ b/gcc/config/rs6000/rs6000-protos.h
> @@ -282,7 +282,7 @@ extern void rs6000_asm_output_dwarf_pcrel (FILE *file, int size,
>  const char *label);
>  extern void rs6000_asm_output_dwarf_datarel (FILE *file, int size,
>const char *label);
> -extern long long rs6000_const_f32_to_i32 (rtx operand);
> +extern long rs6000_const_f32_to_i32 (rtx operand);
> 
>  /* Declare functions in rs6000-c.c */

ok

> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 86f53297cb9..ef1ebaaee05 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -27937,10 +27937,12 @@ rs6000_invalid_conversion (const_tree fromtype, const_tree totype)
>return NULL;
>  }
> 
> -long long
> +/* Convert a SFmode constant to the integer bit pattern.  */
> +
> +long
>  rs6000_const_f32_to_i32 (rtx operand)
>  {
> -  long long value;
> +  long value;
>const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (operand);

ok

Thanks
-Will

> 
>gcc_assert (GET_MODE (operand) == SFmode);
> -- 
> 2.31.1
> 



Re: [PATCH] libgccjit: Add support for setting the link section of global variables [PR100688]

2021-05-20 Thread David Malcolm via Gcc-patches
On Wed, 2021-05-19 at 20:32 -0400, Antoni Boucher via Jit wrote:
> Hello.
> This patch adds support to set the link section of global variables.
> I used the ABI 18 because I submitted other patches up to 17.
> Thanks for the review.

I didn't see this email until now, and put the review in bugzilla
instead; sorry.

Here's a copy-and-paste of what I put in bugzilla:


Thanks for the patch; I like the idea; various nits below:

> diff --git a/gcc/jit/docs/topics/expressions.rst b/gcc/jit/docs/topics/expressions.rst
> index 396259ef07e..b39f6c02527 100644
> --- a/gcc/jit/docs/topics/expressions.rst
> +++ b/gcc/jit/docs/topics/expressions.rst
> @@ -539,6 +539,18 @@ where the rvalue is computed by reading from the storage area.
>  
> in C.
>  
> +.. function:: void
> +  gcc_jit_lvalue_set_link_section (gcc_jit_lvalue *lvalue,
> +   const char *name)
> +
> +   Set the link section of a variable; analogous to:
> +
> +   .. code-block:: c
> +
> + int variable __attribute__((section(".section")));
> +
> +   in C.

Please rename param "name" to "section_name".  Your implementation
requires that it be non-NULL (rather than having NULL unset the
section), so please specify that it must be non-NULL in the docs.

Please add the usual "This entrypoint was added in" text to state which
API version it was added in.
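
For the docs, it may help to keep in mind what the C analogue does. A minimal sketch, assuming GCC or Clang on an ELF target; the section name ".jit_demo_data" and the identifiers are made up for this example, and other object formats have different rules for section names.

```cpp
#include <cassert>

// GNU extension: place a variable in a named section of the object file.
// ".jit_demo_data" is a hypothetical section name used only for this sketch.
int demo_variable __attribute__ ((section (".jit_demo_data"))) = 42;

// The variable is read like any other; only its placement changes.
inline int read_demo_variable ()
{
  return demo_variable;
}
```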

> +
>  Global variables
>  
>  
> diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h
> index 825a3e172e9..8b0f65e87e8 100644
> --- a/gcc/jit/jit-playback.h
> +++ b/gcc/jit/jit-playback.h
> @@ -650,6 +650,8 @@ public:
>  
>  private:
>context *m_ctxt;
> +
> +protected:
>tree m_inner;
>  };

I think you only use this...

>  
> @@ -670,6 +672,12 @@ public:
>rvalue *
>get_address (location *loc);
>  
> +  void
> +  set_link_section (const char* name)
> +  {
> +set_decl_section_name (m_inner, name);
> +  }

...here, and you can get at rvalue::m_inner using as_tree (), so I
don't think we need to make m_inner protected.

> diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c
> index 117ff70114c..d54f878cc6b 100644
> --- a/gcc/jit/jit-recording.c
> +++ b/gcc/jit/jit-recording.c
> @@ -3713,6 +3713,11 @@ recording::lvalue::get_address (recording::location *loc)
>return result;
>  }
>  
> +void recording::lvalue::set_link_section (const char *name)
> +{
> +  m_link_section = new_string (name);
> +}
> +
>  /* The implementation of class gcc::jit::recording::param.  */
>  
>  /* Implementation of pure virtual hook recording::memento::replay_into
> @@ -4547,8 +4552,7 @@ recording::block::dump_edges_to_dot (pretty_printer *pp)
>  void
>  recording::global::replay_into (replayer *r)
>  {
> -  set_playback_obj (
> -m_initializer
> +  playback::lvalue *global = m_initializer
>  ? r->new_global_initialized (playback_location (r, m_loc),
>m_kind,
>m_type->playback_type (),
> @@ -4560,7 +4564,12 @@ recording::global::replay_into (replayer *r)
>  : r->new_global (playback_location (r, m_loc),
>m_kind,
>m_type->playback_type (),
> -  playback_string (m_name)));
> +  playback_string (m_name));
> +  if (m_link_section != NULL)
> +  {
> +global->set_link_section(m_link_section->c_str());
> +  }

Coding convention nits: don't use {} when it's just one statement
(which I think is a bad convention, but it is the project's
convention).
Missing spaces between function name and open-paren in both calls here.
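
To illustrate the two nits, a self-contained sketch of the shape I mean. demo_global and apply_link_section are hypothetical stand-ins, not the libgccjit classes; only the formatting of the "if" and of the calls is the point.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the playback object; not the libgccjit class.
struct demo_global
{
  std::string section;
  void set_link_section (const char *section_name) { section = section_name; }
};

// GNU style: no braces around a single statement, and a space between the
// function name and the opening parenthesis of each call.
inline void
apply_link_section (demo_global *global, const std::string *link_section)
{
  if (link_section != NULL)
    global->set_link_section (link_section->c_str ());
}
```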


> +  set_playback_obj (global);
>  }
>  

[...snip]

> diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
> index 03fa1160cf0..0691fac579d 100644
> --- a/gcc/jit/jit-recording.h
> +++ b/gcc/jit/jit-recording.h
> @@ -1105,7 +1105,8 @@ public:
>lvalue (context *ctxt,
> location *loc,
> type *type_)
> -: rvalue (ctxt, loc, type_)
> +: rvalue (ctxt, loc, type_),
> +  m_link_section(NULL)
>  {}
>  
>playback::lvalue *
> @@ -1127,6 +1128,10 @@ public:
>const char *access_as_rvalue (reproducer ) OVERRIDE;
>virtual const char *access_as_lvalue (reproducer );
>virtual bool is_global () const { return false; }
> +  void set_link_section (const char *name);
> +
> +protected:
> +  string *m_link_section;
>  };

Can it be private, rather than protected?


> diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
> index 7fa948007ad..8cfa48aae24 100644
> --- a/gcc/jit/libgccjit.c
> +++ b/gcc/jit/libgccjit.c
> @@ -1953,6 +1953,18 @@ gcc_jit_lvalue_get_address (gcc_jit_lvalue *lvalue,
>return (gcc_jit_rvalue *)lvalue->get_address (loc);
>  }
>  
> +/* Public entrypoint.  See description in libgccjit.h.
> +
> +   After error-checking, the real work is done by the
> +   gcc::jit::recording::lvalue::set_section method in jit-recording.c.  */
   ^^^

Re: [PATCH] Allow __ibm128 on older PowerPC systems.

2021-05-20 Thread will schmidt via Gcc-patches
On Tue, 2021-05-18 at 16:36 -0400, Michael Meissner wrote:
> [PATCH] Allow __ibm128 on older PowerPC systems.
> 

Hi,


> On January 8th, 2018, I added code to ibm-ldouble.c to use the built-in
> function __builtin_pack_ibm128 if long double is IEEE 128-bit and continue to
> use __builtin_pack_longdouble if long double is IBM extended double.  This code
> was needed because __builtin_pack_ibm128 is not available unless the __ibm128
> keyword is availabe.  In the current code, __ibm128 is only enabled if we have
> support for both IBM and IEEE 128-bit long double.

"available."

It may be worth trimming the description to drop the history that is
not directly applicable to what this patch is doing.

> 
> Segher suggested that instead I should make __ibm128, __builtin_pack_ibm128,
> and __builtin_unpack_ibm128 available on older systems that don't support IEEE
> 128-bit floating point but do support the IBM extended double floating point.
> 
> This patch changes the code so that __ibm128 is now exported if either
> long double uses the IBM extended double format, or IEEE 128-bit floating
> point is available.
> 
> I changed the internal built-in types from float128 to ibm128, since the
> only built-in functions that use this are __builtin_pack_ibm128 and
> __builtin_unpack_ibm128, and the new name matches the function.
> 
> In addition, this patch changes the function within libgcc that handles
> IBM long double to use the __builtin_pack_ibm128 function.

ok

> 
> I have done bootstrap builds with this patch on the following 3 systems:
> 1)power9 running LE Linux using --with-cpu=power9
> 2)power8 running BE Linux using --with-cpu=power8, testing both
>   32/64-bit.
> 3)power10 prototype running LE Linux using --with-cpu=power10.
> 
> There were no regressions to the tests, and the new test added passed.  Can I
> check these patches into trunk branch for GCC 12?
> 
> At the moment, I'm not sure this should be backported to GCC 11.  But I can
> easily do the back port after a stabilizing period.
> 
> gcc/
> 2021-05-18  Michael Meissner  
> 
>   * config/rs6000/rs6000-builtin.def (BU_IBM128_2): Rename
>   RS6000_BTM_IBM128 from RS6000_BTM_FLOAT128.

>   * config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Update
>   error message for __ibm128 built-in functions.
>   (rs6000_init_builtins): Create the __ibm128 keyword on older
>   systems where long double uses the IBM extended double format,
>   even if they don't support IEEE 128-bit floating point.

Could drop 'older', ok.

>   * config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Rename
>   RS6000_BTM_IBM128 from RS6000_BTM_FLOAT128.
>   (rs6000_builtin_mask_names): Rename RS6000_BTM_IBM128 from
>   RS6000_BTM_FLOAT128.
>   * config/rs6000/rs6000.h (TARGET_IBM128): New macro.
>   (RS6000_BTM_IBM128): Rename from RS6000_BTM_FLOAT128.
>   (RS6000_BTM_COMMON): Rename RS6000_BTM_IBM128 from
>   RS6000_BTM_FLOAT128.
ok
> 
> libgcc/
> 2021-05-18  Michael Meissner  
> 
>   * config/rs6000/ibm-ldouble.c (pack_ldouble): Use
>   __builtin_pack_ibm128 instead of __builtin_pack_longdouble.
> ---
>  gcc/config/rs6000/rs6000-builtin.def |  5 ++---
>  gcc/config/rs6000/rs6000-call.c  | 14 ++
>  gcc/config/rs6000/rs6000.c   |  4 ++--
>  gcc/config/rs6000/rs6000.h   | 12 +---
>  libgcc/config/rs6000/ibm-ldouble.c   |  4 ++--
>  5 files changed, 25 insertions(+), 14 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
> index 609bebdfd74..6d82ed224fb 100644
> --- a/gcc/config/rs6000/rs6000-builtin.def
> +++ b/gcc/config/rs6000/rs6000-builtin.def
> @@ -796,13 +796,12 @@
>| RS6000_BTC_BINARY),  \
>   CODE_FOR_ ## ICODE) /* ICODE */
> 
> -/* 128-bit __ibm128 floating point builtins (use -mfloat128 to indicate that
> -   __ibm128 is available).  */
> +/* 128-bit __ibm128 floating point builtins.  */
>  #define BU_IBM128_2(ENUM, NAME, ATTR, ICODE) \
>RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM,   /* ENUM */  \
>   "__builtin_" NAME,  /* NAME */  \
>   (RS6000_BTM_HARD_FLOAT  /* MASK */  \
> -  | RS6000_BTM_FLOAT128),\
> +  | RS6000_BTM_IBM128),  \
>   (RS6000_BTC_ ## ATTR/* ATTR */  \
>| RS6000_BTC_BINARY),  \
>   CODE_FOR_ ## ICODE) /* ICODE */
ok


> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index c4332a61862..7bdc4eeca5f 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -11540,8 +11540,8 @@ 

Re: [PATCH] Fix long double tests when default long double is not IBM.

2021-05-20 Thread will schmidt via Gcc-patches
On Tue, 2021-05-18 at 16:32 -0400, Michael Meissner wrote:
> [PATCH] Fix long double tests when default long double is not IBM.
> 

Hi,


> This patch adds 3 more selections to target-supports.exp to see if we can force
> the compiler to use a particular long double format (IEEE 128-bit, IBM extended
> double, 64-bit), and the library support will track the changes for the long
> double.  This is needed because two of the tests in the test suite use long
> double, and they are actually testing IBM extended double.
> 
> This patch also forces the two tests that explicitly require long double
> to use the IBM double-double encoding to explicitly run the test.  This
> requires GLIBC 2.32 or greater in order to do the switch.
> 
> I have run tests on a little endian power9 system with 3 compilers.  There were
> no regressions with these patches, and the two tests in the following patches
> now work if the default long double is not IBM 128-bit:
> 
> * One compiler used the default IBM 128-bit format;
> * One compiler used the IEEE 128-bit format; (and)
> * One compiler used 64-bit long doubles.
> 
> I have also tested compilers on a big endian power8 system with a compiler
> defaulting to power8 code generation and another with the default cpu
> set.  There were no regressions.
> 
> Can I check this patch into the master branch?
> 
> I have done bootstrap builds with this patch on the following 4 systems:
> 1)power9 running LE Linux using --with-cpu=power9 with long double == IBM
> 2)power9 running LE Linux using --with-cpu=power9 with long double == IEEE
> 3)power8 running BE Linux using --with-cpu=power8, testing both
>   32/64-bit.
> 4)power10 prototype running LE Linux using --with-cpu=power10.
> 
> There were no regressions to the tests, and the two test cases that previously
> failed when I ran the compiler defaulting to long double using IEEE 128-bit now
> passed.  Can I check these patches into trunk branch for GCC 12?
> 
> I would like to check these patches into GCC 11 after a cooling off period, but
> I can also not do the backport if desired.
> 
> gcc/testsuite/
> 2021-05-18  Michael Meissner  
> 
>   PR target/70117
>   * gcc.target/powerpc/pr70117.c: Force the long double type to use
>   the IBM 128-bit format.
>   * c-c++-common/dfp/convert-bfp-11.c: Force using IBM 128-bit long
>   double.  Remove check for 64-bit long double.
>   * lib/target-supports.exp
>   (add_options_for_ppc_long_double_override_ibm128): New function.
>   (check_effective_target_ppc_long_double_override_ibm128): New
>   function.
>   (add_options_for_ppc_long_double_override_ieee128): New function.
>   (check_effective_target_ppc_long_double_override_ieee128): New
>   function.
>   (add_options_for_ppc_long_double_override_64bit): New function.
>   (check_effective_target_ppc_long_double_override_64bit): New
>   function.

ok.

> ---
>  .../c-c++-common/dfp/convert-bfp-11.c |  18 +--
>  gcc/testsuite/gcc.target/powerpc/pr70117.c|   6 +-
>  gcc/testsuite/lib/target-supports.exp | 107 ++
>  3 files changed, 121 insertions(+), 10 deletions(-)
> 
> diff --git a/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c b/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
> index 95c433d2c24..35da07d1fa4 100644
> --- a/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
> +++ b/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
> @@ -1,9 +1,14 @@
> -/* { dg-skip-if "" { ! "powerpc*-*-linux*" } } */
> +/* { dg-require-effective-target dfp } */
> +/* { dg-require-effective-target ppc_long_double_override_ibm128 } */
> +/* { dg-add-options ppc_long_double_override_ibm128 } */
> 
> -/* Test decimal float conversions to and from IBM 128-bit long double. 
> -   Checks are skipped at runtime if long double is not 128 bits.
> -   Don't force 128-bit long doubles because runtime support depends
> -   on glibc.  */
> +/* We force the long double type to be IBM 128-bit because the CONVERT_TO_PINF
> +   tests will fail if we use IEEE 128-bit floating point.  This is due to IEEE
> +   128-bit having a larger exponent range than IBM 128-bit extended double.  So
> +   tests that would generate an infinity with IBM 128-bit will generate a
> +   normal number with IEEE 128-bit.  */

ok
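
For anyone reading along without a PowerPC box, the effect described in
that comment can be reproduced with float and double.  This is only an
analogy (float stands in for the format with the smaller exponent range,
double for the one with the larger range), it assumes IEEE 754 float and
double, and the helper names are made up for the sketch:

```cpp
#include <cmath>
#include <limits>

// Analogy only: float plays the role of the format with the smaller exponent
// range, double the role of the format with the larger one.

// Doubling the largest finite float overflows to infinity when the result is
// kept in float.
inline bool overflows_in_narrow_format ()
{
  volatile float big = std::numeric_limits<float>::max ();
  volatile float doubled = big * 2.0f;
  return std::isinf (doubled);
}

// The same starting value doubled in double stays an ordinary finite number.
inline bool overflows_in_wide_format ()
{
  volatile double big = std::numeric_limits<float>::max ();
  volatile double doubled = big * 2.0;
  return std::isinf (doubled);
}
```

A test that expects an infinity is only meaningful for the first helper,
which is the same reason the CONVERT_TO_PINF tests need the format pinned.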

> +
> +/* Test decimal float conversions to and from IBM 128-bit long double.   */
> 
>  #include "convert.h"
> 
> @@ -36,9 +41,6 @@ CONVERT_TO_PINF (312, tf, sd, 1.6e+308L, d32)
>  int
>  main ()
>  {
> -  if (sizeof (long double) != 16)
> -return 0;
> -
>convert_101 ();
>convert_102 ();
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> index 3bbd2c595e0..8a5fad1dee0 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr70117.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> @@ -1,5 +1,7 @@
> -/* { dg-do run { target 

Re: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.

2021-05-20 Thread will schmidt via Gcc-patches
On Tue, 2021-05-18 at 16:28 -0400, Michael Meissner wrote:
> [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
> 

Hi,


> This patch adds the support for power10 IEEE 128-bit floating point conditional
> move and for automatically generating min/max.
> 
> In this patch, I simplified things compared to previous patches.  Instead of
> allowing any four of the modes to be used for the conditional move comparison
> and the move itself could use different modes, I restricted the conditional
> move to just the same mode.  I.e. you can do:

ok.

> 
> _Float128 a, b, c, d, e, r;
> 
> r = (a == b) ? c : d;
> 
> But you can't do:
> 
> _Float128 c, d, r;
> double a, b;
> 
> r = (a == b) ? c : d;
> 
> or:
> 
> _Float128 a, b;
> double c, d, r;
> 
> r = (a == b) ? c : d;
> 
> This eliminates a lot of the complexity of the code, because you don't have to
> worry about the sizes being different, and the IEEE 128-bit types being
> restricted to Altivec registers, while the SF/DF modes can use any VSX
> register.
> 
> I did not modify the existing support that allowed conditional moves where
> SFmode operands are compared and DFmode operands are moved (and vice versa).
> 
> I modified the test cases that I added to reflect this change.  I have also
> fixed the test for not equal to use '!=' instead of '=='.
> 
> I have done bootstrap builds with this patch on the following 3 systems:
> 1)power9 running LE Linux using --with-cpu=power9
> 2)power8 running BE Linux using --with-cpu=power8, testing both
>   32/64-bit.
> 3)power10 prototype running LE Linux using --with-cpu=power10.
> 
> There were no regressions to the tests, and the new test added passed.  Can I
> check these patches into trunk branch for GCC 12?
> 
> I would like to check these patches into GCC 11 after a cooling off period, but
> I can also not do the backport if desired.
> 
> gcc/
> 2021-05-18 Michael Meissner  
> 
> * config/rs6000/rs6000.c (rs6000_maybe_emit_fp_cmove): Add IEEE
>   128-bit floating point conditional move support.
>   (have_compare_and_set_mask): Add IEEE 128-bit floating point
>   types.
>   * config/rs6000/rs6000.md (movcc, IEEE128 iterator): New insn.
>   (movcc_p10, IEEE128 iterator): New insn.
>   (movcc_invert_p10, IEEE128 iterator): New insn.
>   (fpmask, IEEE128 iterator): New insn.
>   (xxsel, IEEE128 iterator): New insn.
> 
> gcc/testsuite/
> 2021-05-18  Michael Meissner  
> 
> * gcc.target/powerpc/float128-cmove.c: New test.
> * gcc.target/powerpc/float128-minmax-3.c: New test.

ok


> ---
>  gcc/config/rs6000/rs6000.c|  38 ++-
>  gcc/config/rs6000/rs6000.md   | 106 ++
>  .../gcc.target/powerpc/float128-cmove.c   |  58 ++
>  .../gcc.target/powerpc/float128-minmax-3.c|  15 +++
>  4 files changed, 215 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index fdaf12aeda0..ef1ebaaee05 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -15706,8 +15706,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
>return 1;
>  }
> 
> -/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
> -   minimum with "C" semantics.
> +/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
> +   maximum or minimum with "C" semantics.
> 
> Unless you use -ffast-math, you can't use these instructions to replace
> conditions that implicitly reverse the condition because the comparison
> @@ -15783,6 +15783,7 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>enum rtx_code code = GET_CODE (op);
>rtx op0 = XEXP (op, 0);
>rtx op1 = XEXP (op, 1);
> +  machine_mode compare_mode = GET_MODE (op0);
>machine_mode result_mode = GET_MODE (dest);
>rtx compare_rtx;
>rtx cmove_rtx;
> @@ -15791,6 +15792,35 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>if (!can_create_pseudo_p ())
>  return 0;
> 
> +  /* We allow the comparison to be either SFmode/DFmode and the true/false
> + condition to be either SFmode/DFmode.  I.e. we allow:
> +
> + float a, b;
> + double c, d, r;
> +
> + r = (a == b) ? c : d;
> +
> +and:
> +
> + double a, b;
> + float c, d, r;
> +
> + r = (a == b) ? c : d;


This new comment does not seem to align with the comments in the
description, which state "But you can't do ..."


> +
> +but we don't allow intermixing the IEEE 128-bit floating point types with
> +the 32/64-bit scalar types.
> +
> +It gets too messy where SFmode/DFmode can use any register and 
> 

Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.

2021-05-20 Thread will schmidt via Gcc-patches
On Tue, 2021-05-18 at 16:26 -0400, Michael Meissner wrote:
> [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
> 

Hi,


> This patch adds the support for the IEEE 128-bit floating point C minimum and
> maximum instructions.  The next patch will add the support for using the
> compare and set mask instruction to implement conditional moves.
> 
> This patch does not try to re-use the code used for SF/DF min/max
> support.  It defines a separate insn for the IEEE 128-bit support.  It
> uses the code iterator  to simplify adding both operations.
> 
> GCC will not convert ?: operations into using min/max instructions provided in

I'd throw the ternary term in there, easier to search for later. 
s/?: operations/ternary (?:) operations /

> this patch unless the user uses -Ofast or similar switches due to issues with
> NaNs.  The next patch that adds conditional move instructions will enable the
> ?: conversion in many cases.
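
For the archives, the NaN issue mentioned above can be seen in plain C++
with double.  This is a minimal sketch, not code from the patch, and
ternary_min is a name invented for the example:

```cpp
#include <cmath>
#include <limits>

// Minimum written as a ternary.  Every ordered comparison with a NaN is
// false, so the result depends on which operand holds the NaN.
inline double ternary_min (double a, double b)
{
  return (a < b) ? a : b;
}
```

With nan = std::numeric_limits<double>::quiet_NaN (), ternary_min (nan, 1.0)
yields 1.0 while ternary_min (1.0, nan) yields a NaN, whereas
std::fmin returns 1.0 for either operand order.  That asymmetry is why the
ternary form cannot be swapped freely for a min/max operation unless the
user has opted out of NaN handling.
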
> 
> I have done bootstrap builds with this patch on the following 3 systems:
> 1)power9 running LE Linux using --with-cpu=power9
> 2)power8 running BE Linux using --with-cpu=power8, testing both
>   32/64-bit.
> 3)power10 prototype running LE Linux using --with-cpu=power10.
> 
> There were no regressions to the tests, and the new test added passed.  Can I
> check these patches into trunk branch for GCC 12?
> 
> I would like to check these patches into GCC 11 after a cooling off period, but
> I can also not do the backport if desired.
> 
> gcc/
> 2021-05-18  Michael Meissner  
> 
>   * config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
>   3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
>   instructions.
>   * config/rs6000/rs6000.md (s3, IEEE128 iterator):
>   New insns.

ok

> 
> gcc/testsuite/
> 2021-05-18  Michael Meissner  
> 
>   * gcc.target/powerpc/float128-minmax-2.c: New test.
>   * gcc.target/powerpc/float128-minmax.c: Turn off power10 code
>   generation.

So, presumably the float128-minmax-2.c test adds/replaces the power10
code gen tests that were removed or disabled from float128-minmax.c. 



> ---
>  gcc/config/rs6000/rs6000.c|  3 ++-
>  gcc/config/rs6000/rs6000.md   | 11 +++
>  .../gcc.target/powerpc/float128-minmax-2.c| 15 +++
>  .../gcc.target/powerpc/float128-minmax.c  |  7 +++
>  4 files changed, 35 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 0d05956..fdaf12aeda0 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -16111,7 +16111,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
>/* VSX/altivec have direct min/max insns.  */
>if ((code == SMAX || code == SMIN)
>&& (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
> -   || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
> +   || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
> +   || (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))))
>  {
>emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
>return;
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 0bfeb24d9e8..3a1bc1f8547 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -5196,6 +5196,17 @@ (define_insn "*s3_vsx"
>  }
>[(set_attr "type" "fp")])
> 
> +;; Min/max for ISA 3.1 IEEE 128-bit floating point
> +(define_insn "s3"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> + (fp_minmax:IEEE128
> +  (match_operand:IEEE128 1 "altivec_register_operand" "v")
> +  (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
> +  "TARGET_POWER10"
> +  "xscqp %0,%1,%2"
> +  [(set_attr "type" "vecfloat")
> +   (set_attr "size" "128")])
> +
>  ;; The conditional move instructions allow us to perform max and min operations
>  ;; even when we don't have the appropriate max/min instruction using the FSEL
>  ;; instruction.

ok


> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> new file mode 100644
> index 000..c71ba08c9f8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
> +
> +#ifndef TYPE
> +#define TYPE _Float128
> +#endif
> +
> +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
> +   call.  */
> +TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
> +TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
> +
> +/* { dg-final { scan-assembler {\mxsmaxcqp\M} 

Re: [PATCH] libgccjit: Add support for sized integer types, including 128-bit integers [PR95325]

2021-05-20 Thread David Malcolm via Gcc-patches
On Tue, 2021-05-18 at 14:53 +0200, Jakub Jelinek via Jit wrote:
> On Tue, May 18, 2021 at 08:23:56AM -0400, Antoni Boucher via Gcc-patches wrote:
> > Hello.
> > This patch adds support for sized integer types.
> > Maybe it should check whether the size of a byte for the current
> > platform is 8 bits and do other checks so that they're only available
> > when it makes sense.
> > What do you think?
> 
> Not a review, just a comment.  The 128-bit integral types are available
> only on some targets, the test e.g. the C/C++ FE do for those is
> targetm.scalar_mode_supported_p (TImode)
> and so even libgccjit shouldn't provide those types unconditionally.
> Similarly for the tests (though it could be guarded with e.g
> #ifdef __SIZEOF_INT128__
> in that case).
> Also, while currently all in tree targets have BITS_PER_UNIT 8 and
> therefore QImode is 8-bit, HImode 16-bit, SImode 32-bit and DImode 64-bit,
> in the past and maybe in the future there can be targets that could have
> e.g. 16-bit or 32-bit QImode and then there wouldn't be any uint8_t/int8_t
> and int16_t would be intQImode_type_node etc.
>   uint16_type_node = make_or_reuse_type (16, 1);
>   uint32_type_node = make_or_reuse_type (32, 1);
>   uint64_type_node = make_or_reuse_type (64, 1);
>   if (targetm.scalar_mode_supported_p (TImode))
>     uint128_type_node = make_or_reuse_type (128, 1);
> are always with the given precisions, perhaps jit should use
> signed_type_for (uint16_type_node) etc.?

I seem to have mislaid Antoni's original email (sorry), so I'll reply
to Jakub's.
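
On the testsuite side, the guard Jakub mentions could look roughly like
the following.  It is a sketch in C++ rather than the jit.dg test itself;
mul_high and has_int128 are names made up for the example, and the
fallback branch only exists so the snippet builds on targets without a
128-bit type:

```cpp
#include <cstdint>

// True when the compiler advertises a 128-bit integer type on this target.
#ifdef __SIZEOF_INT128__
constexpr bool has_int128 = true;
#else
constexpr bool has_int128 = false;
#endif

// High 64 bits of the 128-bit product a * b.
inline std::uint64_t mul_high (std::uint64_t a, std::uint64_t b)
{
#ifdef __SIZEOF_INT128__
  // Only touch the 128-bit type when the target provides it.
  return static_cast<std::uint64_t>
    ((static_cast<unsigned __int128> (a) * b) >> 64);
#else
  // Portable fallback built from 32-bit halves.
  const std::uint64_t a_lo = a & 0xffffffffu, a_hi = a >> 32;
  const std::uint64_t b_lo = b & 0xffffffffu, b_hi = b >> 32;
  const std::uint64_t lo_lo = a_lo * b_lo;
  const std::uint64_t hi_lo = a_hi * b_lo;
  const std::uint64_t lo_hi = a_lo * b_hi;
  const std::uint64_t hi_hi = a_hi * b_hi;
  const std::uint64_t cross = (lo_lo >> 32) + (hi_lo & 0xffffffffu) + lo_hi;
  return hi_hi + (hi_lo >> 32) + (cross >> 32);
#endif
}
```

Both branches give the same answer, so a test written this way does not
depend on the 128-bit type being present.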

> 2021-05-18  Antoni Boucher  
> 
> gcc/jit/
> PR target/95325
> * jit-playback.c: Add support for the sized integer types.
> * jit-recording.c: Add support for the sized integer types.
> * libgccjit.h (GCC_JIT_TYPE_UINT8_T, GCC_JIT_TYPE_UINT16_T,
> GCC_JIT_TYPE_UINT32_T, GCC_JIT_TYPE_UINT64_T,
> GCC_JIT_TYPE_UINT128_T, GCC_JIT_TYPE_INT8_T, GCC_JIT_TYPE_INT16_T,
> GCC_JIT_TYPE_INT32_T, GCC_JIT_TYPE_INT64_T, GCC_JIT_TYPE_INT128_T):
> New enum variants for gcc_jit_types.
> gcc/testsuite/
> PR target/95325
> * jit.dg/test-types.c: Add tests for sized integer types.

First a high-level question, why not use (or extend)
gcc_jit_context_get_int_type instead?

Do we really need to extend enum gcc_jit_types?  Is this a quality-of-
life thing for users of the library?

That said, recording::context::get_int_type is currently a bit of a
hack, and could probably be improved by using the new enum values
the patch adds.

IIRC, libgccjit.c does type-checking by comparing recording::type
pointer values; does this patch give us multiple equivalent types that
ought to compare as equal?

If a user gets a type via GCC_JIT_TYPE_INT and gets "another" type via
GCC_JIT_TYPE_INT32_T and they happen to be the same on the current
target, should libgccjit complain if you use "int" when you meant
"int32_t", or accept it?
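
To make the question concrete, here is a small sketch of the two notions
of "same type" involved.  The struct and function names are hypothetical
and are not libgccjit's recording::type or any of its API; they only
contrast a pointer comparison with a comparison by size and signedness:

```cpp
#include <cstddef>

// Hypothetical stand-in for a recorded type; NOT libgccjit's recording::type.
struct type_desc
{
  std::size_t size_in_bytes;
  bool is_signed;
};

// Identity check: true only when both pointers name the very same object.
inline bool same_object (const type_desc *a, const type_desc *b)
{
  return a == b;
}

// Structural check: true when size and signedness agree.
inline bool same_layout (const type_desc &a, const type_desc &b)
{
  return a.size_in_bytes == b.size_in_bytes && a.is_signed == b.is_signed;
}
```

Two descriptors created separately for "int" and "int32_t" fail
same_object even on a target where same_layout holds, which is the
situation I'm asking about.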

Various comments inline below...

> diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c
> index c6136301243..40630aa1ab8 100644
> --- a/gcc/jit/jit-playback.c
> +++ b/gcc/jit/jit-playback.c
> @@ -193,6 +193,27 @@ get_tree_node_for_type (enum gcc_jit_types type_)
>  case GCC_JIT_TYPE_UNSIGNED_INT:
>return unsigned_type_node;
>  
> +case GCC_JIT_TYPE_UINT8_T:
> +  return unsigned_intQI_type_node;
> +case GCC_JIT_TYPE_UINT16_T:
> +  return uint16_type_node;
> +case GCC_JIT_TYPE_UINT32_T:
> +  return uint32_type_node;
> +case GCC_JIT_TYPE_UINT64_T:
> +  return uint64_type_node;
> +case GCC_JIT_TYPE_UINT128_T:
> +  return uint128_type_node;
> +case GCC_JIT_TYPE_INT8_T:
> +  return intQI_type_node;
> +case GCC_JIT_TYPE_INT16_T:
> +  return intHI_type_node;
> +case GCC_JIT_TYPE_INT32_T:
> +  return intSI_type_node;
> +case GCC_JIT_TYPE_INT64_T:
> +  return intDI_type_node;
> +case GCC_JIT_TYPE_INT128_T:
> +  return intTI_type_node;
> +

Jakub has already commented that 128 bit types might not be available.

Ideally we'd report that they're not available as soon as the user
tries to use them, in gcc_jit_context_get_type, but unfortunately it
looks like the test requires us to use targetm.scalar_mode_supported_p,
and that requires us to hold the jit mutex and thus be at playback
time.

So I think get_tree_node_for_type should take a context, and add an
error on the context if there's a failure, returning NULL. 
playback::context::get_type is the only caller currently and has
handling for an unrecognized value, so I think that logic needs to be
moved to get_tree_node_for_type so that the user can distinguish
between unrecognized types versus types that are unsupported on this
target.
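
Something along these lines is what I have in mind.  Treat it as a rough
sketch: every name in it is a stand-in invented for this email, the bool
field replaces the real target query, and strings replace tree nodes:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these names are libgccjit's real API.
enum sketch_type { SKETCH_INT32, SKETCH_INT128 };

struct sketch_context
{
  bool target_has_128bit;           // stands in for the target hook query
  std::vector<std::string> errors;  // errors recorded on the context

  void add_error (const std::string &msg) { errors.push_back (msg); }
};

// Returns a name standing in for a tree node, or nullptr after recording an
// error on the context.  The two failure modes get distinct messages.
inline const char *
sketch_node_for_type (sketch_context &ctxt, int type_)
{
  switch (type_)
    {
    case SKETCH_INT32:
      return "int32";

    case SKETCH_INT128:
      if (!ctxt.target_has_128bit)
	{
	  ctxt.add_error ("128-bit integer type is not supported on this target");
	  return nullptr;
	}
      return "int128";

    default:
      ctxt.add_error ("unrecognized type value: " + std::to_string (type_));
      return nullptr;
    }
}
```

The caller then only has to check for a null result, and the user sees
which of the two problems was hit.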


> diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c
> index 117ff70114c..b67ae8bfb78 100644
> --- a/gcc/jit/jit-recording.c
> +++ b/gcc/jit/jit-recording.c
> 

Re: [PATCH] i386: Avoid integer logic insns for 32bit and 64bit vector modes [PR100701]

2021-05-20 Thread Richard Biener via Gcc-patches
On May 20, 2021 6:52:17 PM GMT+02:00, Uros Bizjak via Gcc-patches 
 wrote:
>Integer logic instructions clobber flags, do not use them for
>32bit and 64bit vector modes.

We could add a CC clobber before reload and a splitter afterwards into one of 
the two variants? Not sure if worth the trouble of course. 

Richard. 

>2021-05-20  Uroš Bizjak  
>
>gcc/
>PR target/100701
>* config/i386/i386.md (isa): Remove x64_bmi.
>(enabled): Remove x64_bmi.
>* config/i386/mmx.md (mmx_andnot3):
>Remove general register alternative.
>(*andnot3): Ditto.
>(*mmx_3): Ditto.
>(*3): Ditto.
>
>gcc/testsuite/
>
>PR target/100701
>* gcc.target/i386/pr100701.c: New test.
>
>Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
>Pushed to master.
>
>Uros.



Re: [RFC] ldist: Recognize rawmemchr loop patterns

2021-05-20 Thread Stefan Schulze Frielinghaus via Gcc-patches
On Thu, May 20, 2021 at 11:24:57AM +0200, Richard Biener wrote:
> On Fri, May 7, 2021 at 2:32 PM Stefan Schulze Frielinghaus
>  wrote:
> >
> > On Wed, May 05, 2021 at 11:36:41AM +0200, Richard Biener wrote:
> > > On Tue, Mar 16, 2021 at 6:13 PM Stefan Schulze Frielinghaus
> > >  wrote:
> > > >
> > > > [snip]
> > > >
> > > > Please find attached a new version of the patch.  A major change compared to
> > > > the previous patch is that I created a separate pass which hopefully makes
> > > > reviewing also easier since it is almost self-contained.  After realizing that
> > > > detecting loops which mimic the behavior of rawmemchr/strlen functions does not
> > > > really fit into the topic of loop distribution, I created a separate pass.
> > >
> > > It's true that these reduction-like patterns are more difficult than
> > > the existing
> > > memcpy/memset cases.
> > >
> > > > >  Due
> > > > > to this I was also able to play around a bit and schedule the pass at different
> > > > > times.  Currently it is scheduled right before loop distribution where loop
> > > > > header copying already took place which leads to the following effect.
> > >
> > > In fact I'd schedule it after loop distribution so there's the chance that loop
> > > distribution can expose a loop that fits the new pattern.
> > >
> > > >  Running
> > > > this setup over
> > > >
> > > > char *t (char *p)
> > > > {
> > > >   for (; *p; ++p);
> > > >   return p;
> > > > }
> > > >
> > > > the new pass transforms
> > > >
> > > > char * t (char * p)
> > > > {
> > > >   char _1;
> > > >   char _7;
> > > >
> > > >[local count: 118111600]:
> > > >   _7 = *p_3(D);
> > > >   if (_7 != 0)
> > > > goto ; [89.00%]
> > > >   else
> > > > goto ; [11.00%]
> > > >
> > > >[local count: 105119324]:
> > > >
> > > >[local count: 955630225]:
> > > >   # p_8 = PHI 
> > > >   p_6 = p_8 + 1;
> > > >   _1 = *p_6;
> > > >   if (_1 != 0)
> > > > goto ; [89.00%]
> > > >   else
> > > > goto ; [11.00%]
> > > >
> > > >[local count: 105119324]:
> > > >   # p_2 = PHI 
> > > >   goto ; [100.00%]
> > > >
> > > >[local count: 850510901]:
> > > >   goto ; [100.00%]
> > > >
> > > >[local count: 12992276]:
> > > >
> > > >[local count: 118111600]:
> > > >   # p_9 = PHI 
> > > >   return p_9;
> > > >
> > > > }
> > > >
> > > > into
> > > >
> > > > char * t (char * p)
> > > > {
> > > >   char * _5;
> > > >   char _7;
> > > >
> > > >[local count: 118111600]:
> > > >   _7 = *p_3(D);
> > > >   if (_7 != 0)
> > > > goto ; [89.00%]
> > > >   else
> > > > goto ; [11.00%]
> > > >
> > > >[local count: 105119324]:
> > > >   _5 = p_3(D) + 1;
> > > >   p_10 = .RAWMEMCHR (_5, 0);
> > > >
> > > >[local count: 118111600]:
> > > >   # p_9 = PHI 
> > > >   return p_9;
> > > >
> > > > }
> > > >
> > > > which is fine so far.  However, I haven't made up my mind so far whether it is
> > > > worthwhile to spend more time in order to also eliminate the "first unrolling"
> > > > of the loop.
> > >
> > > Might be a phiopt transform ;)  Might apply to quite some set of
> > > builtins.  I wonder how the strlen case looks like though.
> > >
> > > > I gave it a shot by scheduling the pass prior to pass copy header
> > > > and ended up with:
> > > >
> > > > char * t (char * p)
> > > > {
> > > >[local count: 118111600]:
> > > >   p_5 = .RAWMEMCHR (p_3(D), 0);
> > > >   return p_5;
> > > >
> > > > }
> > > >
> > > > which seems optimal to me.  The downside of this is that I have to initialize
> > > > scalar evolution analysis which might be undesired that early.
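
As a side note for readers of the archive, the source-level equivalence
behind the transformation can be sketched as follows.  rawmemchr is a
GNU extension, so the sketch uses strlen as a portable stand-in for
searching for the terminating zero; t_loop and t_library are names chosen
for this example only:

```cpp
#include <cstring>

// The loop from the test case above: advance until the terminating zero.
inline char *t_loop (char *p)
{
  for (; *p; ++p)
    ;
  return p;
}

// Portable stand-in for a search for the zero byte starting at p.
inline char *t_library (char *p)
{
  return p + std::strlen (p);
}
```

Both functions return a pointer to the terminating zero of the string
that p points to.
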
> > > >
> > > > All this brings me to the question where do you see this piece of code running?
> > > > If in a separate pass when would you schedule it?  If in an existing pass,
> > > > which one would you choose?
> > >
> > > I think it still fits loop distribution.  If you manage to detect it
> > > with your pass
> > > standalone then you should be able to detect it in loop distribution.
> >
> > If a loop is distributed only because one of the partitions matches a
> > rawmemchr/strlen-like loop pattern, then we have at least two partitions
> > which walk over the same memory region.  Since a rawmemchr/strlen-like
> > loop has no body (neglecting expression-3 of a for-loop where just an
> > increment happens) it is governed by the memory accesses in the loop
> > condition.  Therefore, in such a case loop distribution would result in
> > performance degradation.  This is why I think that it does not fit
> > conceptually into ldist pass.  However, since I make use of a couple of
> > helper functions from ldist pass, it may still fit technically.
> >
> > Since currently all ldist optimizations operate over loops where niters
> > is known and for rawmemchr/strlen-like loops this is not the case, it is
> > not possible that those optimizations expose a loop which is suitable
> > for 

Re: [PATCH] libstdc++: Fix access issue in iota_view::_Sentinel [PR100690]

2021-05-20 Thread Jonathan Wakely via Gcc-patches
On Thu, 20 May 2021, 19:35 Patrick Palka via Libstdc++, <
libstd...@gcc.gnu.org> wrote:

> Tested on x86_64-pc-linux-gnu, does this look OK for trunk/11/10?
>


Yes/yes/yes

Thanks.
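
For the archives, the access rule behind the fix can be shown outside of
the library.  This is a reduced sketch with invented class names, and it
assumes the situation is the usual one where the iterator befriends the
sentinel class: a member of the befriended class may read the iterator's
private data, but a hidden friend of the sentinel is not itself a friend
of the iterator, so it has to go through the member function.

```cpp
class sketch_iterator
{
  int value_;
  friend class sketch_sentinel;

public:
  explicit sketch_iterator (int v) : value_ (v) { }
};

class sketch_sentinel
{
  int bound_;

  // A member of the befriended class may read sketch_iterator's private data.
  int distance_from (const sketch_iterator &it) const
  { return bound_ - it.value_; }

public:
  explicit sketch_sentinel (int b) : bound_ (b) { }

  // Hidden friends of sketch_sentinel are not friends of sketch_iterator, so
  // they call the member function instead of touching it.value_ directly.
  friend int operator- (const sketch_iterator &it, const sketch_sentinel &s)
  { return -s.distance_from (it); }

  friend int operator- (const sketch_sentinel &s, const sketch_iterator &it)
  { return s.distance_from (it); }
};
```

With these definitions sketch_sentinel (5) - sketch_iterator (2) is 3 and
the reversed subtraction is -3, mirroring the two VERIFY lines in the new
test.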

>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/100690
> * include/std/ranges (iota_view::_Sentinel::_M_distance_from):
> Split out into this member function from ...
> (iota_view::_Sentinel::operator-): ... here, for sake of access
> checking.
> * testsuite/std/ranges/iota/iota_view.cc:
> ---
>  libstdc++-v3/include/std/ranges |  8 ++--
>  libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc | 11 +++
>  2 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index 704f924557c..76add252ca6 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -499,6 +499,10 @@ namespace ranges
> _M_equal(const _Iterator& __x) const
> { return __x._M_value == _M_bound; }
>
> +   constexpr auto
> +   _M_distance_from(const _Iterator& __x) const
> +   { return _M_bound - __x._M_value; }
> +
> _Bound _M_bound = _Bound();
>
>public:
> @@ -515,12 +519,12 @@ namespace ranges
> friend constexpr iter_difference_t<_Winc>
> operator-(const _Iterator& __x, const _Sentinel& __y)
>   requires sized_sentinel_for<_Bound, _Winc>
> -   { return __x._M_value - __y._M_bound; }
> +   { return -__y._M_distance_from(__x); }
>
> friend constexpr iter_difference_t<_Winc>
> operator-(const _Sentinel& __x, const _Iterator& __y)
>   requires sized_sentinel_for<_Bound, _Winc>
> -   { return -(__y - __x); }
> +   { return __x._M_distance_from(__y); }
>};
>
>_Winc _M_value = _Winc();
> diff --git a/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc b/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc
> index be8695120ad..362ef1f7f78 100644
> --- a/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc
> @@ -80,6 +80,16 @@ test04()
>  // Verify we optimize away the 'bound' data member of an unbounded iota_view.
>  static_assert(sizeof(std::ranges::iota_view) == 1);
>
> +void
> +test05()
> +{
> +  // PR libstdc++/100690
> +  int x[] = {42, 42, 42};
> +  auto r = std::views::iota(std::ranges::begin(x), std::ranges::cbegin(x) + 3);
> +  VERIFY( r.end() - r.begin() == 3 );
> +  VERIFY( r.begin() - r.end() == -3 );
> +}
> +
>  int
>  main()
>  {
> @@ -87,4 +97,5 @@ main()
>test02();
>test03();
>test04();
> +  test05();
>  }
> --
> 2.32.0.rc0
>
>


Re: [committed] testuite: Check pthread for omp module testing

2021-05-20 Thread Bernd Edlinger
On 5/19/21 4:58 PM, Kito Cheng wrote:
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/modules/omp-1_a.C: Check pthread is available.
>   * g++.dg/modules/omp-1_b.C: Ditto.
>   * g++.dg/modules/omp-1_c.C: Ditto.
>   * g++.dg/modules/omp-2_a.C: Ditto.
>   * g++.dg/modules/omp-2_b.C: Ditto.
> ---
>  gcc/testsuite/g++.dg/modules/omp-1_a.C | 1 +
>  gcc/testsuite/g++.dg/modules/omp-1_b.C | 1 +
>  gcc/testsuite/g++.dg/modules/omp-1_c.C | 1 +
>  gcc/testsuite/g++.dg/modules/omp-2_a.C | 1 +
>  gcc/testsuite/g++.dg/modules/omp-2_b.C | 1 +
>  5 files changed, 5 insertions(+)
> 
> diff --git a/gcc/testsuite/g++.dg/modules/omp-1_a.C b/gcc/testsuite/g++.dg/modules/omp-1_a.C
> index 722720a0e93..94e1171f03c 100644
> --- a/gcc/testsuite/g++.dg/modules/omp-1_a.C
> +++ b/gcc/testsuite/g++.dg/modules/omp-1_a.C
> @@ -1,4 +1,5 @@
>  // { dg-additional-options "-fmodules-ts -fopenmp" }
> +// { dg-require-effective-target pthread }
>  
>  export module foo;
>  // { dg-module-cmi foo }
> diff --git a/gcc/testsuite/g++.dg/modules/omp-1_b.C b/gcc/testsuite/g++.dg/modules/omp-1_b.C
> index f3f5d92e517..09d97e4ac4e 100644
> --- a/gcc/testsuite/g++.dg/modules/omp-1_b.C
> +++ b/gcc/testsuite/g++.dg/modules/omp-1_b.C
> @@ -1,4 +1,5 @@
>  // { dg-additional-options "-fmodules-ts -fopenmp" }
> +// { dg-require-effective-target pthread }
>  
>  import foo;
>  
> diff --git a/gcc/testsuite/g++.dg/modules/omp-1_c.C b/gcc/testsuite/g++.dg/modules/omp-1_c.C
> index f30f6115277..599a5a5d34f 100644
> --- a/gcc/testsuite/g++.dg/modules/omp-1_c.C
> +++ b/gcc/testsuite/g++.dg/modules/omp-1_c.C
> @@ -1,4 +1,5 @@
>  // { dg-additional-options "-fmodules-ts" }
> +// { dg-require-effective-target pthread }
>  
>  import foo;
>  
> diff --git a/gcc/testsuite/g++.dg/modules/omp-2_a.C b/gcc/testsuite/g++.dg/modules/omp-2_a.C
> index d2291b6bbe0..b0d4bbc6e8a 100644
> --- a/gcc/testsuite/g++.dg/modules/omp-2_a.C
> +++ b/gcc/testsuite/g++.dg/modules/omp-2_a.C
> @@ -1,4 +1,5 @@
>  // { dg-additional-options "-fmodules-ts -fopenmp" }
> +// { dg-require-effective-target pthread }
>  
>  export module foo;
>  // { dg-module-cmi foo }
> diff --git a/gcc/testsuite/g++.dg/modules/omp-2_b.C b/gcc/testsuite/g++.dg/modules/omp-2_b.C
> index 39f34c70275..aeee4d1561a 100644
> --- a/gcc/testsuite/g++.dg/modules/omp-2_b.C
> +++ b/gcc/testsuite/g++.dg/modules/omp-2_b.C
> @@ -1,4 +1,5 @@
>  // { dg-additional-options "-fmodules-ts" }
> +// { dg-require-effective-target pthread }
>  
>  import foo;
>  
> 

Hi,

this patch causes a couple of test failures.

FAIL: g++.dg/modules/omp-1_c.C -std=c++17  dg-regexp 6 not found: "In module imported at [^\\n]*omp-1_c.C:3:1:\\nfoo: error: module contains OpenMP, use '-fopenmp' to enable\\n"
FAIL: g++.dg/modules/omp-1_c.C -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/omp-1_c.C -std=c++2a  dg-regexp 6 not found: "In module imported at [^\\n]*omp-1_c.C:3:1:\\nfoo: error: module contains OpenMP, use '-fopenmp' to enable\\n"
FAIL: g++.dg/modules/omp-1_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/omp-1_c.C -std=c++2b  dg-regexp 6 not found: "In module imported at [^\\n]*omp-1_c.C:3:1:\\nfoo: error: module contains OpenMP, use '-fopenmp' to enable\\n"
FAIL: g++.dg/modules/omp-1_c.C -std=c++2b (test for excess errors)

That's because the line number in the pattern match changes from 3 to 4.

I've adjusted this test with the following patch
tested on x86_64-pc-linux-gnu and committed as obvious:



Regards
Bernd.
From 4f4a2f199baf46d35492edadc16f30f32920c4df Mon Sep 17 00:00:00 2001
From: Bernd Edlinger 
Date: Thu, 20 May 2021 20:19:43 +0200
Subject: [PATCH] Fix a test failure in g++.dg/modules/omp-1_c.C

Adjust the line number due to previous commit,
which added a line for dg-require-effective-target.

2021-05-20  Bernd Edlinger  

	* g++.dg/modules/omp-1_c.C: Fix testcase.
---
 gcc/testsuite/g++.dg/modules/omp-1_c.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/modules/omp-1_c.C b/gcc/testsuite/g++.dg/modules/omp-1_c.C
index 599a5a5..71a24f6 100644
--- a/gcc/testsuite/g++.dg/modules/omp-1_c.C
+++ b/gcc/testsuite/g++.dg/modules/omp-1_c.C
@@ -3,7 +3,7 @@
 
 import foo;
 
-// { dg-regexp "In module imported at \[^\n]*omp-1_c.C:3:1:\nfoo: error: module contains OpenMP, use '-fopenmp' to enable\n" }
+// { dg-regexp "In module imported at \[^\n]*omp-1_c.C:4:1:\nfoo: error: module contains OpenMP, use '-fopenmp' to enable\n" }
 // { dg-prune-output "failed to read" }
 // { dg-prune-output "fatal error:" }
 // { dg-prune-output "compilation terminated" }
-- 
1.9.1



[PATCH] libstdc++: Fix access issue in iota_view::_Sentinel [PR100690]

2021-05-20 Thread Patrick Palka via Gcc-patches
Tested on x86_64-pc-linux-gnu, does this look OK for trunk/11/10?

libstdc++-v3/ChangeLog:

PR libstdc++/100690
* include/std/ranges (iota_view::_Sentinel::_M_distance_from):
Split out into this member function from ...
(iota_view::_Sentinel::operator-): ... here, for sake of access
checking.
* testsuite/std/ranges/iota/iota_view.cc:
---
 libstdc++-v3/include/std/ranges |  8 ++--
 libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc | 11 +++
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 704f924557c..76add252ca6 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -499,6 +499,10 @@ namespace ranges
_M_equal(const _Iterator& __x) const
{ return __x._M_value == _M_bound; }
 
+   constexpr auto
+   _M_distance_from(const _Iterator& __x) const
+   { return _M_bound - __x._M_value; }
+
_Bound _M_bound = _Bound();
 
   public:
@@ -515,12 +519,12 @@ namespace ranges
friend constexpr iter_difference_t<_Winc>
operator-(const _Iterator& __x, const _Sentinel& __y)
  requires sized_sentinel_for<_Bound, _Winc>
-   { return __x._M_value - __y._M_bound; }
+   { return -__y._M_distance_from(__x); }
 
friend constexpr iter_difference_t<_Winc>
operator-(const _Sentinel& __x, const _Iterator& __y)
  requires sized_sentinel_for<_Bound, _Winc>
-   { return -(__y - __x); }
+   { return __x._M_distance_from(__y); }
   };
 
   _Winc _M_value = _Winc();
diff --git a/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc b/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc
index be8695120ad..362ef1f7f78 100644
--- a/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc
+++ b/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc
@@ -80,6 +80,16 @@ test04()
 // Verify we optimize away the 'bound' data member of an unbounded iota_view.
 static_assert(sizeof(std::ranges::iota_view) == 1);
 
+void
+test05()
+{
+  // PR libstdc++/100690
+  int x[] = {42, 42, 42};
+  auto r = std::views::iota(std::ranges::begin(x), std::ranges::cbegin(x) + 3);
+  VERIFY( r.end() - r.begin() == 3 );
+  VERIFY( r.begin() - r.end() == -3 );
+}
+
 int
 main()
 {
@@ -87,4 +97,5 @@ main()
   test02();
   test03();
   test04();
+  test05();
 }
-- 
2.32.0.rc0
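
For readers following the access-checking argument, here is a minimal sketch of the pattern the patch relies on. It is not the libstdc++ code: the class names `Iterator` and `Sentinel` and the member `distance_from` are hypothetical stand-ins. The point it illustrates is that a friend function defined inside `Sentinel` does not inherit the friendship that `Iterator` grants to `Sentinel`, so the private read has to happen inside a real member function of `Sentinel`.

```cpp
#include <cstddef>

class Sentinel;

class Iterator
{
public:
  constexpr explicit Iterator(int v) : value_(v) { }

private:
  int value_;
  // Friendship covers member functions of Sentinel, but not the friend
  // functions that Sentinel itself declares.
  friend class Sentinel;
};

class Sentinel
{
public:
  constexpr explicit Sentinel(int b) : bound_(b) { }

  // Hidden friends: they may use Sentinel's private members, so they
  // forward to distance_from instead of touching Iterator::value_.
  friend constexpr std::ptrdiff_t
  operator-(const Iterator& x, const Sentinel& y)
  { return -y.distance_from(x); }

  friend constexpr std::ptrdiff_t
  operator-(const Sentinel& x, const Iterator& y)
  { return x.distance_from(y); }

private:
  // A member of Sentinel, so Iterator's friend declaration applies here.
  constexpr std::ptrdiff_t
  distance_from(const Iterator& x) const
  { return bound_ - x.value_; }

  int bound_;
};

static_assert(Sentinel{5} - Iterator{2} == 3);
static_assert(Iterator{2} - Sentinel{5} == -3);
```

Writing `x.value_ - y.bound_` directly in the first `operator-` would be an access violation under the language rules, which is the shape of the problem described in the ChangeLog above.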



Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-20 Thread Martin Sebor via Gcc-patches

On 5/20/21 6:56 AM, Jonathan Wakely wrote:

On 19/05/21 16:05 -0400, Jason Merrill wrote:

On 5/19/21 3:55 PM, Jonathan Wakely wrote:

On 19/05/21 13:26 -0400, Jason Merrill wrote:

On 5/19/21 12:46 PM, Jonathan Wakely wrote:

On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.



  if (omitted_parms_loc && lambda_specs.any_specifiers_p)
    {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,


You probably want to change


 else if (cxx_dialect < cxx23)
   omitted_parms_loc = cp_lexer_peek_token (parser->lexer)->location;


To use warn_about_dialect_p.


Ah yes.

And just above that there's another pedwarn about a C++14 feature
being used:


 /* Default arguments shall not be specified in the
 parameter-declaration-clause of a lambda-declarator.  */
 if (cxx_dialect < cxx14)
for (tree t = param_list; t; t = TREE_CHAIN (t))
  if (TREE_PURPOSE (t) && DECL_P (TREE_VALUE (t)))
    pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
 "default argument specified for lambda parameter");


I didn't notice that one initially. That should also use
warn_about_dialect_p and OPT_Wc__14_extensions.


Indeed.


Should I change the message to say "init capture" rather than
"default argument"?


No, this is about e.g. [](int = 42){}


OK, this is a simpler version of the patch, with docs now, but without
the new warn_about_cxx_dialect_p function (which isn't needed) and
with no changes to any actual warning text (I'll do that separately,
if at all).

I also caught a few more pedwarn cases that I missed previously.

Tested powerpc64le-linux. OK for trunk?


This looks good to me, and the change is simpler overall.  Just one
minor thing (sorry if that seems nit-picky): in the last sentence
in the documentation, does "this option" refer to the -Wc++11 form
or to the negative? (The latter is the one that's going to be
mentioned in the entry.)

If what the sentence is trying to say is that warnings for some C++11
constructs are controlled only by -Wpedantic, then I'd suggest
rephrasing it to make that part clearer (or dropping it altogether, since
it sounds like it describes a limitation/problem that we might want
to work toward fixing).

@@ -8154,6 +8156,41 @@ and ISO C++ 2017.  This warning is enabled by 
@option{-Wall}.

 Warn about C++ constructs whose meaning differs between ISO C++ 2017
 and ISO C++ 2020.  This warning is enabled by @option{-Wall}.

+@item -Wno-c++11-extensions @r{(C++ and Objective-C++ only)}
+@opindex Wc++11-extensions
+@opindex Wno-c++11-extensions
+Do not warn about C++11 constructs in code being compiled using
+an older C++ standard.  Even without this option, some C++11 constructs
+will only be diagnosed if @option{-Wpedantic} is used.

Martin


[PATCH] i386: Avoid integer logic insns for 32bit and 64bit vector modes [PR100701]

2021-05-20 Thread Uros Bizjak via Gcc-patches
Integer logic instructions clobber flags, so do not use them for
32bit and 64bit vector modes.

2021-05-20  Uroš Bizjak  

gcc/
PR target/100701
* config/i386/i386.md (isa): Remove x64_bmi.
(enabled): Remove x64_bmi.
* config/i386/mmx.md (mmx_andnot3):
Remove general register alternative.
(*andnot3): Ditto.
(*mmx_3): Ditto.
(*3): Ditto.

gcc/testsuite/

PR target/100701
* gcc.target/i386/pr100701.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 2fc8fae30f3..960ecbd327a 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -815,7 +815,7 @@ (define_attr "use_carry" "0,1" (const_string "0"))
 (define_attr "movu" "0,1" (const_string "0"))
 
 ;; Used to control the "enabled" attribute on a per-instruction basis.
-(define_attr "isa" "base,x64,nox64,x64_bmi,x64_sse2,x64_sse4,x64_sse4_noavx,
+(define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx,
x64_avx,x64_avx512bw,x64_avx512dq,
sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx,
avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f,
@@ -831,8 +831,6 @@ (define_attr "mmx_isa" "base,native,sse,sse_noavx,avx"
 (define_attr "enabled" ""
   (cond [(eq_attr "isa" "x64") (symbol_ref "TARGET_64BIT")
 (eq_attr "isa" "nox64") (symbol_ref "!TARGET_64BIT")
-(eq_attr "isa" "x64_bmi")
-  (symbol_ref "TARGET_64BIT && TARGET_BMI")
 (eq_attr "isa" "x64_sse2")
   (symbol_ref "TARGET_64BIT && TARGET_SSE2")
 (eq_attr "isa" "x64_sse4")
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 948ba479c32..baeed04d8c9 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -2055,40 +2055,34 @@ (define_expand "one_cmpl2"
   "operands[2] = force_reg (mode, CONSTM1_RTX (mode));")
 
 (define_insn "mmx_andnot3"
-  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,r,x,x,v")
+  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,x,v")
(and:MMXMODEI
- (not:MMXMODEI (match_operand:MMXMODEI 1 "register_operand"
-   "0,r,0,x,v"))
- (match_operand:MMXMODEI 2 "register_mmxmem_operand"
-   "ym,r,x,x,v")))]
+ (not:MMXMODEI (match_operand:MMXMODEI 1 "register_operand" "0,0,x,v"))
+ (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,x,v")))]
   "TARGET_MMX || TARGET_MMX_WITH_SSE"
   "@
pandn\t{%2, %0|%0, %2}
-   andn\t{%2, %1, %0|%0, %1, %2}
pandn\t{%2, %0|%0, %2}
vpandn\t{%2, %1, %0|%0, %1, %2}
vpandnd\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "*,x64_bmi,sse2_noavx,avx,avx512vl")
-   (set_attr "mmx_isa" "native,*,*,*,*")
-   (set_attr "type" "mmxadd,bitmanip,sselog,sselog,sselog")
-   (set_attr "btver2_decode" "*,direct,*,*,*")
-   (set_attr "mode" "DI,DI,TI,TI,TI")])
+  [(set_attr "isa" "*,sse2_noavx,avx,avx512vl")
+   (set_attr "mmx_isa" "native,*,*,*")
+   (set_attr "type" "mmxadd,sselog,sselog,sselog")
+   (set_attr "mode" "DI,TI,TI,TI")])
 
 (define_insn "*andnot3"
-  [(set (match_operand:VI_32 0 "register_operand" "=r,x,x,v")
+  [(set (match_operand:VI_32 0 "register_operand" "=x,x,v")
(and:VI_32
- (not:VI_32 (match_operand:VI_32 1 "register_operand" "r,0,x,v"))
- (match_operand:VI_32 2 "register_operand" "r,x,x,v")))]
+ (not:VI_32 (match_operand:VI_32 1 "register_operand" "0,x,v"))
+ (match_operand:VI_32 2 "register_operand" "x,x,v")))]
   "TARGET_SSE2"
   "@
-   andn\t{%2, %1, %0|%0, %1, %2}
pandn\t{%2, %0|%0, %2}
vpandn\t{%2, %1, %0|%0, %1, %2}
vpandnd\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "bmi,noavx,avx,avx512vl")
-   (set_attr "type" "bitmanip,sselog,sselog,sselog")
-   (set_attr "btver2_decode" "direct,*,*,*")
-   (set_attr "mode" "SI,TI,TI,TI")])
+  [(set_attr "isa" "noavx,avx,avx512vl")
+   (set_attr "type" "sselog")
+   (set_attr "mode" "TI")])
 
 (define_expand "mmx_3"
   [(set (match_operand:MMXMODEI 0 "register_operand")
@@ -2107,22 +2101,21 @@ (define_expand "3"
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
 (define_insn "*mmx_3"
-  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,r,x,x,v")
+  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,x,v")
 (any_logic:MMXMODEI
- (match_operand:MMXMODEI 1 "register_mmxmem_operand" "%0,0,0,x,v")
- (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,r,x,x,v")))]
+ (match_operand:MMXMODEI 1 "register_mmxmem_operand" "%0,0,x,v")
+ (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,x,v")))]
   "(TARGET_MMX || TARGET_MMX_WITH_SSE)
&& ix86_binary_operator_ok (, mode, operands)"
   "@
p\t{%2, %0|%0, %2}
-   \t{%2, %0|%0, %2}
p\t{%2, %0|%0, %2}
vp\t{%2, %1, %0|%0, %1, %2}
vpd\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "*,x64,sse2_noavx,avx,avx512vl")
-   (set_attr 

Re: [PATCH] Hashtable PR96088

2021-05-20 Thread Jonathan Wakely via Gcc-patches

On 06/05/21 22:03 +0200, François Dumont via Libstdc++ wrote:

Hi

    Considering your feedback on backtrace in debug mode is going to 
take me some time so here is another one.


    Compared to latest submission I've added a _Hash_arg_t partial 
specialization for std::hash<>. It is not strictly necessary for the 
moment but when we will eventually remove its nested argument_type it 
will be. I also wonder if it is not easier to handle for the compiler, 
not sure about that thought.


The std::hash specializations in libstdc++ define argument_type, but
I'm already working on one that doesn't (for std::stacktrace).

And std::hash can be specialized by users,
and is not required to provide argument_type.

So it's already not valid to assume that std::hash::argument_type
exists.


@@ -850,9 +852,56 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
iterator
_M_emplace(const_iterator, false_type __uks, _Args&&... __args);

+  template
+   std::pair
+   _M_insert_unique(_Kt&&, _Arg&&, const _NodeGenerator&);
+
+  // Detect nested argument_type.
+  template>
+   struct _Hash_arg_t
+   { typedef _Kt argument_type; };
+
+  // std::hash
+  template
+   struct _Hash_arg_t<_Kt, std::hash<_Arg>>
+   { typedef _Arg argument_type; };
+
+  // Nested argument_type.
+  template
+   struct _Hash_arg_t<_Kt, _Ht,
+ __void_t>
+   { typedef typename _Ht::argument_type argument_type; };
+
+  // Function pointer.
+  template
+   struct _Hash_arg_t<_Kt, std::size_t(*)(const _Arg&)>
+   { typedef _Arg argument_type; };
+
+  template::argument_type>
+   static typename conditional<
+ __is_nothrow_convertible<_Kt, _ArgType>::value, _Kt&&, key_type>::type


Please use __conditional_t<...> here instead of
typename conditional<...>::type.

The purpose of the _Hash_arg_t type is to determine whether invoking
the hash function with _Kt&& can throw, right?

And if it can throw, you force a conversion early, and if it can't,
you don't do the conversion.

Can't you use __is_nothrow_invocable<_Hash&, _Kt> for that, instead of
this fragile approach?
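
A rough sketch of what the suggested check could look like, written against the public C++17 trait `std::is_nothrow_invocable_v` rather than the internal `__is_nothrow_invocable`. The hashers and the alias name `hash_arg_t` are hypothetical and exist only for this illustration; the actual patch may be structured differently.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical hashers, for illustration only.
struct IntHash
{
  std::size_t operator()(int i) const noexcept
  { return static_cast<std::size_t>(i); }
};

struct StringHash
{
  std::size_t operator()(const std::string& s) const noexcept
  { return s.size(); }
};

struct ThrowingHash
{
  std::size_t operator()(int i) const   // deliberately not noexcept
  { return static_cast<std::size_t>(i); }
};

// If hashing a Kt cannot throw, forward it untouched as Kt&&;
// otherwise convert to the key type up front.
template<typename Hash, typename Key, typename Kt>
  using hash_arg_t
    = std::conditional_t<std::is_nothrow_invocable_v<Hash&, Kt>,
                         Kt&&, Key>;

// Exact match and noexcept: no early conversion.
static_assert(std::is_same_v<hash_arg_t<IntHash, int, int>, int&&>);

// The call needs a std::string temporary built from const char*, and that
// constructor may throw, so the whole call expression is not nothrow.
static_assert(std::is_same_v<hash_arg_t<StringHash, std::string, const char*>,
                             std::string>);

// An rvalue std::string binds directly to const std::string&.
static_assert(std::is_same_v<hash_arg_t<StringHash, std::string, std::string>,
                             std::string&&>);

// The hasher itself is not noexcept: convert early.
static_assert(std::is_same_v<hash_arg_t<ThrowingHash, int, int>, int>);
```

Because the trait inspects the whole call expression, including implicit conversions of the argument, no knowledge of a nested `argument_type` is needed.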




Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-20 Thread Jason Merrill via Gcc-patches

On 5/20/21 8:56 AM, Jonathan Wakely wrote:

On 19/05/21 16:05 -0400, Jason Merrill wrote:

On 5/19/21 3:55 PM, Jonathan Wakely wrote:

On 19/05/21 13:26 -0400, Jason Merrill wrote:

On 5/19/21 12:46 PM, Jonathan Wakely wrote:

On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.



  if (omitted_parms_loc && lambda_specs.any_specifiers_p)
    {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,


You probably want to change


 else if (cxx_dialect < cxx23)
   omitted_parms_loc = cp_lexer_peek_token (parser->lexer)->location;


To use warn_about_dialect_p.


Ah yes.

And just above that there's another pedwarn about a C++14 feature
being used:


 /* Default arguments shall not be specified in the
 parameter-declaration-clause of a lambda-declarator.  */
 if (cxx_dialect < cxx14)
for (tree t = param_list; t; t = TREE_CHAIN (t))
  if (TREE_PURPOSE (t) && DECL_P (TREE_VALUE (t)))
    pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
 "default argument specified for lambda parameter");


I didn't notice that one initially. That should also use
warn_about_dialect_p and OPT_Wc__14_extensions.


Indeed.


Should I change the message to say "init capture" rather than
"default argument"?


No, this is about e.g. [](int = 42){}


OK, this is a simpler version of the patch, with docs now, but without
the new warn_about_cxx_dialect_p function (which isn't needed) and
with no changes to any actual warning text (I'll do that separately,
if at all).

I also caught a few more pedwarn cases that I missed previously.

Tested powerpc64le-linux. OK for trunk?


OK.  Do we also want, say, -Wno-std-extensions to turn them all off at once?

Jason
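
For reference, the construct discussed in the quoted exchange ("this is about e.g. [](int = 42){}") is a default argument on a lambda parameter, which is a C++14 feature. A small sketch follows; the name `with_default` is made up, and the example is written as C++17 so that the lambda can be checked at compile time.

```cpp
// A lambda whose parameter has a default argument.  This is valid from
// C++14 on; in C++11 mode it is the case that the quoted pedwarn
// ("default argument specified for lambda parameter") is about.
constexpr auto with_default = [](int x = 42) { return x; };

static_assert(with_default() == 42);   // default argument used
static_assert(with_default(7) == 7);   // explicit argument wins
```

This differs from an init-capture such as `[x = 42]{}`, which is why the wording of the diagnostic was left alone in the thread.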



Re: [PATCH] libstdc++: Implement missing P0896 changes to reverse_view [PR100639]

2021-05-20 Thread Jonathan Wakely via Gcc-patches

On 18/05/21 15:52 -0400, Patrick Palka via Libstdc++ wrote:

This implements the P0896 changes to reverse_view's member types
value_type, difference_type and reference in C++20 mode, which fixes
problems taking the reverse_iterator of an iterator with a non-integral
difference_type (such as iota_view).

Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps
10/11?


Yes for all.



[PATCH] libstdc++: Support range adaptors with defaultable arguments

2021-05-20 Thread Patrick Palka via Gcc-patches
This adds support for defining range adaptors with defaultable arguments.
No such range adaptors have yet been standardized, but range-v3 has a
couple, e.g. 'unique' and 'sample' (which are approximately implemented
in the added testcase), and it would be good to preemptively support
such adaptors.

In order to make 'unique | unique' (where 'unique' is an adaptor that
takes a single defaultable extra argument) unambiguously mean
composition instead of the partial application 'unique(unique)', we need
to additionally constrain the first operand in the first overload of
_RangeAdaptorClosure::operator| as per [range.adaptor.object]/1, which
says R | C is equivalent to C(R) only if R models viewable_range.
However, for our purposes checking range instead of viewable_range
suffices and is cheaper to check.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk/11?  Existing
adaptors aren't affected by this change.
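
The arity check described above can be sketched outside the library roughly as follows. This is a simplified stand-in, not the patch itself: the names `partial_app_arity_ok`, `Take`, `Unique` and the member `arity` are hypothetical, and `std::is_integral_v` is used in place of the `integral` concept so that the sketch also builds as C++17.

```cpp
#include <type_traits>
#include <utility>

// True if an adaptor can be partially applied with nargs extra arguments,
// i.e. with every argument except the range itself.
template<typename Adaptor>
  constexpr bool
  partial_app_arity_ok(int nargs)
  {
    if constexpr (std::is_integral_v<decltype(Adaptor::arity)>)
      // Fixed arity: the range plus nargs must match exactly.
      return 1 + nargs == Adaptor::arity;
    else
      {
        // Defaultable arguments: arity is a (minimum, maximum) pair.
        auto [min, max] = Adaptor::arity;
        return 1 + nargs >= min && 1 + nargs <= max;
      }
  }

// Hypothetical adaptor taking a range and exactly one extra argument.
struct Take
{ static constexpr int arity = 2; };

// Hypothetical adaptor taking a range and one defaultable extra argument.
struct Unique
{ static constexpr std::pair<int, int> arity{1, 2}; };

static_assert(partial_app_arity_ok<Take>(1));
static_assert(!partial_app_arity_ok<Take>(0));
static_assert(partial_app_arity_ok<Unique>(0));
static_assert(partial_app_arity_ok<Unique>(1));
static_assert(!partial_app_arity_ok<Unique>(2));
```

With a range of arities, both `Unique()` and `Unique(arg)` count as viable partial applications, which is why the extra constraint on the first operand of `operator|` is needed to keep `unique | unique` meaning composition.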

libstdc++-v3/ChangeLog:

* include/std/ranges (__adaptor_partial_app_arity_ok): Define.
(__adaptor_partial_app_viable): Use it.
(_RangeAdaptorClosure::operator|): In the first overload, swap
order of template parameters _Self and _Range.  Add a
range<_Range> constraint to this overload.
(_RangeAdaptor): Document that _S_arity can also be defined
as a pair consisting of the minimum and maximum arity.
* testsuite/std/ranges/adaptors/detail/user_defined.cc: New test.
---
 libstdc++-v3/include/std/ranges   | 27 +--
 .../ranges/adaptors/detail/user_defined.cc| 81 +++
 2 files changed, 102 insertions(+), 6 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/std/ranges/adaptors/detail/user_defined.cc

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 48100e9d7f2..8f691ee41f6 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -742,12 +742,25 @@ namespace views::__adaptor
 concept __adaptor_invocable
   = requires { std::declval<_Adaptor>()(declval<_Args>()...); };
 
+  template
+constexpr bool
+__adaptor_partial_app_arity_ok(int __nargs)
+{
+  if constexpr (integral)
+   return 1 + __nargs == _Adaptor::_S_arity;
+  else
+   {
+ auto [__min, __max] = _Adaptor::_S_arity;
+ return 1 + __nargs >= __min && 1 + __nargs <= __max;
+   }
+}
+
   // True if the range adaptor non-closure _Adaptor can be partially applied
   // with _Args.
   template
-concept __adaptor_partial_app_viable = (_Adaptor::_S_arity > 1)
-  && (sizeof...(_Args) == _Adaptor::_S_arity - 1)
-  && (constructible_from, _Args> && ...);
+concept __adaptor_partial_app_viable
+  = __adaptor_partial_app_arity_ok<_Adaptor>(sizeof...(_Args))
+   && (constructible_from, _Args> && ...);
 
   template
 struct _Partial;
@@ -759,7 +772,7 @@ namespace views::__adaptor
   struct _RangeAdaptorClosure
   {
 // range | adaptor is equivalent to adaptor(range).
-template
+template
   requires derived_from, _RangeAdaptorClosure>
&& __adaptor_invocable<_Self, _Range>
   friend constexpr auto
@@ -778,8 +791,10 @@ namespace views::__adaptor
 
   // The base class of every range adaptor non-closure.
   //
-  // The static data member _Derived::_S_arity must contain the total number of
-  // arguments that the adaptor takes, and the class _Derived must introduce
+  // The static data member _Derived::_S_arity must either be an integer
+  // denoting the total arity of the adaptor, or a tuple consisting of the
+  // minimum and maximum arity of the adaptor (if e.g. the adaptor has
+  // defaultable arguments).  The class _Derived must also introduce
   // _RangeAdaptor::operator() into the class scope via a using-declaration.
   template
 struct _RangeAdaptor
diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/detail/user_defined.cc b/libstdc++-v3/testsuite/std/ranges/adaptors/detail/user_defined.cc
new file mode 100644
index 000..94dabb50db8
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/ranges/adaptors/detail/user_defined.cc
@@ -0,0 +1,81 @@
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do compile { 

Re: [PATCH] Try LTO partial linking. (Was: Speed of compiling gimple-match.c)

2021-05-20 Thread Jan Hubicka
> On Thu, May 20, 2021 at 3:16 PM Richard Biener
>  wrote:
> >
> > On Thu, May 20, 2021 at 3:06 PM Martin Liška  wrote:
> > >
> > > On 5/20/21 2:54 PM, Richard Biener wrote:
> > > > So why did you go from applying this per-file to multiple files?
> > >
> > > When I did per-file for {gimple,generic}-match.c I hit the following 
> > > issue with lto.priv symbols:
> > >
> > > /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> > > error: libbackend.a(generic-match.o): multiple definition of 
> > > 'wi::to_wide(tree_node const*) [clone .part.0] [clone .lto_priv.0]'
> > > /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> > > libbackend.a(gimple-match.o): previous definition here
> > > /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> > > error: libbackend.a(generic-match.o): multiple definition of 
> > > 'TYPE_VECTOR_SUBPARTS(tree_node const*) [clone .part.0] [clone 
> > > .lto_priv.0]'
> > > /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> > > libbackend.a(gimple-match.o): previous definition here
> > > /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> > > error: libbackend.a(generic-match.o): multiple definition of 
> > > 'vec::operator[](unsigned int) [clone 
> > > .part.0] [clone .lto_priv.0]'
> > >
> > > > Any idea what I was doing wrong?
> >
> > Nothing in particular I think - you're just hitting the issue that LTO
> > produces new symbols and that those
> > can obviously clash.  Giuliano hit the very same issue.  When not
> > doing partial links those internal
> > symbols pose no problem, but with -r -flinker-output=nolto-rel and
> > re-linking the produced objects
> > they obviously do.  ELF has no solution for this though, but I think
> > we could strip those from the
> > partially linked object - if WPA would give us a list of objects the
> > link step could postprocess
> > the object with objcopy or maybe a custom linker script could do the
> > trick as well.
> 
> Oh, and the "best" solution would be to avoid involving the linker
> when doing -r -flinker-output=nolto-rel but instead have the assembler
> produce the single object from the multiple LTRANS assembly snippets
> which could then use local labels instead of symbols for these.

Quick solution is to also modify partitioner to use the local symbol
names when doing incremental linking (those mixing in source code and
random seeds) to avoid clashes.

Honza
> 
> > So your workaround is to only ever have a single LTO produced object
> > file participating in the
> > final links ;)
> >
> > Richard.
> >
> > >
> > > Martin


[PATCH, OpenMP 5.0] Improve OpenMP target support for C++ (includes PR92120 v3)

2021-05-20 Thread Chung-Lin Tang

Hi Jakub,
the attached patch is a combination of the below patches already pushed to 
devel/omp/gcc-10,
some are kind of transient bug fixes, but listing all for completeness:

aadfc984: [PATCH] Target mapping C++ members inside member functions
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562467.html

36a1ebdb: [PATCH] OpenMP 5.0: map this[:1] in C++ non-static member functions 
(PR 92120)
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558975.html

bf8605f1: [PATCH] Enable gimplify GOMP_MAP_STRUCT handling of (COMPONENT_REF 
(INDIRECT_REF ...)) map clauses.
https://gcc.gnu.org/pipermail/gcc-patches/2021-February/564976.html

da047f63: [PATCH] Fix regression of array members in OpenMP map clauses.
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566086.html

4e714eaa: [PATCH] Fix template case of non-static member access inside member 
functions
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566592.html

2ed80263: [PATCH] Lambda capturing of pointers and references in target 
directives
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566935.html

08caada8: Arrow operator handling for C front-end in OpenMP map clauses
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566419.html

To summarize, this patch set is an improvement for OpenMP target support for 
C++,
including for inside non-static members, lambda objects, and struct member 
deref access expressions.
The corresponding modifications for the C front-end are also included.

This patch supersedes the prior versions of my PR92120 patch (implicit C++
map(this[:1])), so I am dubbing this "v3" of the patch for that PR.

Prior versions of the PR92120 patch were implemented by recording uses of 'this'
in the parser, and then using the recorded uses during "finish" to create the
implicit maps.

When working on supporting lambda objects, this required using a tree-walk
style processing of the OMP_TARGET body, so it only made sense to merge the
entire 'this' processing together with it. As a result, a large part of the
parser changes were dropped, with the main processing in semantics.c now.

Other parser changes to support '->' in map clauses are also with this patch.

Tested without regressions on x86_64-linux with nvptx offloading, okay for 
trunk?

Thanks,
Chung-Lin

2021-05-20  Chung-Lin Tang  

gcc/cp/
* cp-tree.h (finish_omp_target): New declaration.
(finish_omp_target_clauses): Likewise.
* parser.c (cp_parser_omp_clause_map): Adjust call to
cp_parser_omp_var_list_no_open to set 'allow_deref' argument to true.
(cp_parser_omp_target): Factor out code, adjust into calls to new
function finish_omp_target.
* pt.c (tsubst_expr): Add call to finish_omp_target_clauses for
OMP_TARGET case.
* semantics.c (handle_omp_array_sections_1): Add handling to create
'this->member' from 'member' FIELD_DECL.
(handle_omp_array_sections): Likewise.
(finish_omp_clauses): Likewise. Adjust to allow 'this[]' in OpenMP
map clauses. Handle 'A->member' case in map clauses.
(struct omp_target_walk_data): New struct for walking over
target-directive tree body.
(finish_omp_target_clauses_r): New function for tree walk.
(finish_omp_target_clauses): New function.
(finish_omp_target): New function.

gcc/c/
* c-parser.c (c_parser_omp_clause_map): Set 'allow_deref' argument in
call to c_parser_omp_variable_list to 'true'.
* c-typeck.c (handle_omp_array_sections_1): Add strip of MEM_REF in
array base handling.
(c_finish_omp_clauses): Handle 'A->member' case in map clauses.

gcc/
* gimplify.c ("tree-hash-traits.h"): Add include.
(gimplify_scan_omp_clauses): Change struct_map_to_clause to type
hash_map *. Adjust struct map handling to handle
cases of *A and A->B expressions. Under !DECL_P case of
GOMP_CLAUSE_MAP handling, add STRIP_NOPS for indir_p case, add to
struct_deref_set for map(*ptr_to_struct) cases. Add MEM_REF case when
handling component_ref_p case. Add unshare_expr and gimplification
when created GOMP_MAP_STRUCT is not a DECL. Add code to add
firstprivate pointer for *pointer-to-struct case.
(gimplify_adjust_omp_clauses): Move GOMP_MAP_STRUCT removal code for
exit data directives code to earlier position.
* omp-low.c (lower_omp_target):
Handle GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds.
* tree-pretty-print.c (dump_omp_clause): Likewise.

gcc/testsuite/
* gcc.dg/gomp/target-3.c: New testcase.
* g++.dg/gomp/target-3.C: New testcase.
* g++.dg/gomp/target-lambda-1.C: New testcase.
* g++.dg/gomp/target-this-1.C: New testcase.
* g++.dg/gomp/target-this-2.C: New testcase.
* g++.dg/gomp/target-this-3.C: New testcase.
* 

Re: [PATCH] Fortran/OpenMP: Add support for 'close' in map clause

2021-05-20 Thread Jakub Jelinek via Gcc-patches
On Thu, May 20, 2021 at 04:11:10PM +0200, Marcel Vollweiler wrote:
> --- a/gcc/fortran/openmp.c
> +++ b/gcc/fortran/openmp.c
> @@ -1710,27 +1710,62 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
> omp_mask mask,
> && gfc_match ("map ( ") == MATCH_YES)
>   {
> locus old_loc2 = gfc_current_locus;
> -   bool always = false;
> +   int always_modifier = 0;
> +   int close_modifier = 0;
> +   locus second_always_locus;
> +   locus second_close_locus;

I'm afraid this will or might lead to -Wmaybe-uninitialized errors.
Just initialize those two to = old_loc2; or so.

Ok for trunk with that change.

Jakub



Re: [PATCH] Fortran/OpenMP: Add support for 'close' in map clause

2021-05-20 Thread Marcel Vollweiler

Hi Jakub,

On 20.05.2021 at 10:57, Jakub Jelinek wrote:

On Thu, May 20, 2021 at 10:47:52AM +0200, Marcel Vollweiler wrote:

--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1710,10 +1710,21 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
   && gfc_match ("map ( ") == MATCH_YES)
 {
   locus old_loc2 = gfc_current_locus;
-  bool always = false;
+
+  int always = 0;
+  int close = 0;


The vertical space should be after the 3 variable declarations
rather than in between 1 and 2.


Changed.




+  for (;;)
+{
+  if (gfc_match ("always ") == MATCH_YES)
+always++;
+  else if (gfc_match ("close ") == MATCH_YES)
+close++;
+  else
+break;
+  gfc_match (", ");
+}
+
   gfc_omp_map_op map_op = OMP_MAP_TOFROM;
-  if (gfc_match ("always , ") == MATCH_YES)
-always = true;
   if (gfc_match ("alloc : ") == MATCH_YES)
 map_op = OMP_MAP_ALLOC;
   else if (gfc_match ("tofrom : ") == MATCH_YES)
@@ -1726,11 +1737,24 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
 map_op = OMP_MAP_RELEASE;
   else if (gfc_match ("delete : ") == MATCH_YES)
 map_op = OMP_MAP_DELETE;
-  else if (always)
+  else
 {
   gfc_current_locus = old_loc2;
-  always = false;
+  always = 0;
+  close = 0;
 }
+
+  if (always > 1)
+{
+  gfc_error ("too many %<always%> modifiers at %C");
+  break;
+}
+  if (close > 1)
+{
+  gfc_error ("too many %<close%> modifiers at %C");
+  break;


I think it would be nice to show the locus of the second always or close
modifier.  Could the loop above remember that locus when always++ == 1
(or ++always == 2) and similarly for close and use it when printing the
error?


Good point. I changed the loop and the error messages accordingly.
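
The idea of remembering where the second occurrence of a modifier was seen can be sketched independently of the gfortran matcher. The code below is only an illustration under simplifying assumptions: it scans a plain string, and `ModifierScan`, `match` and `scan_modifiers` are hypothetical names, not gfortran functions. The position fields are initialized up front, in the spirit of the -Wmaybe-uninitialized remark elsewhere in this thread.

```cpp
#include <cstddef>
#include <string_view>

struct ModifierScan
{
  int always_count = 0;
  int close_count = 0;
  // Initialized so that they are never read uninitialized.
  std::size_t second_always_pos = 0;
  std::size_t second_close_pos = 0;
  std::size_t end = 0;   // where scanning stopped
};

// Skip blanks, try to match word at pos; on success advance pos past the
// word and any trailing blanks.
constexpr bool
match(std::string_view text, std::size_t& pos, std::string_view word)
{
  std::size_t p = pos;
  while (p < text.size() && text[p] == ' ')
    ++p;
  if (text.substr(p, word.size()) != word)
    return false;
  p += word.size();
  while (p < text.size() && text[p] == ' ')
    ++p;
  pos = p;
  return true;
}

constexpr ModifierScan
scan_modifiers(std::string_view text)
{
  ModifierScan r{};
  std::size_t pos = 0;
  for (;;)
    {
      const std::size_t start = pos;   // position before this modifier
      if (match(text, pos, "always"))
        {
          if (r.always_count++ == 1)
            r.second_always_pos = start;
        }
      else if (match(text, pos, "close"))
        {
          if (r.close_count++ == 1)
            r.second_close_pos = start;
        }
      else
        break;
      match(text, pos, ",");           // optional separator
    }
  r.end = pos;
  return r;
}

constexpr ModifierScan demo = scan_modifiers("always, close, always tofrom: x");
static_assert(demo.always_count == 2);
static_assert(demo.close_count == 1);
static_assert(demo.second_always_pos == 15);
```

A caller would then report "too many modifiers" at `second_always_pos` or `second_close_pos` whenever the corresponding count exceeds one.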


And similarly to the C/C++ patch, better use always_modifier and
close_modifier as the names of the variables, as close is a function and
could be defined as macro.


Changed.



  Jakub



Thanks!

Marcel
Fortran/OpenMP: Add support for 'close' in map clause

gcc/fortran/ChangeLog: 

* openmp.c (gfc_match_omp_clauses): Support map-type-modifier 'close'.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/map-6.f90: New test.
* gfortran.dg/gomp/map-7.f90: New test.
* gfortran.dg/gomp/map-8.f90: New test.

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 7eeabff..f8d198e 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1710,27 +1710,62 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
  && gfc_match ("map ( ") == MATCH_YES)
{
  locus old_loc2 = gfc_current_locus;
- bool always = false;
+ int always_modifier = 0;
+ int close_modifier = 0;
+ locus second_always_locus;
+ locus second_close_locus;
+
+ for (;;)
+   {
+ locus current_locus = gfc_current_locus;
+ if (gfc_match ("always ") == MATCH_YES)
+   {
+ if (always_modifier++ == 1)
+   second_always_locus = current_locus;
+   }
+ else if (gfc_match ("close ") == MATCH_YES)
+   {
+ if (close_modifier++ == 1)
+   second_close_locus = current_locus;
+   }
+ else
+   break;
+ gfc_match (", ");
+   }
+
  gfc_omp_map_op map_op = OMP_MAP_TOFROM;
- if (gfc_match ("always , ") == MATCH_YES)
-   always = true;
  if (gfc_match ("alloc : ") == MATCH_YES)
map_op = OMP_MAP_ALLOC;
  else if (gfc_match ("tofrom : ") == MATCH_YES)
-   map_op = always ? OMP_MAP_ALWAYS_TOFROM : OMP_MAP_TOFROM;
+   map_op = always_modifier ? OMP_MAP_ALWAYS_TOFROM : OMP_MAP_TOFROM;
  else if (gfc_match ("to : ") == MATCH_YES)
-   map_op = always ? OMP_MAP_ALWAYS_TO : OMP_MAP_TO;
+   map_op = always_modifier ? OMP_MAP_ALWAYS_TO : OMP_MAP_TO;
  else if (gfc_match ("from : ") == MATCH_YES)
-   map_op = always ? OMP_MAP_ALWAYS_FROM : OMP_MAP_FROM;
+   map_op = always_modifier ? OMP_MAP_ALWAYS_FROM : OMP_MAP_FROM;
  else if (gfc_match ("release : ") == MATCH_YES)
map_op = OMP_MAP_RELEASE;

[PATCH] constructor: Elide expand_constructor when can move by pieces is true

2021-05-20 Thread H.J. Lu via Gcc-patches
On Thu, May 20, 2021 at 12:51 AM Richard Biener
 wrote:
>
> On Wed, May 19, 2021 at 3:22 PM H.J. Lu  wrote:
> >
> > On Wed, May 19, 2021 at 2:33 AM Richard Biener
> >  wrote:
> > >
> > > On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:
> > > >
> > > > When expanding a constant constructor, don't call expand_constructor if
> > > > it is more efficient to load the data from the memory via move by 
> > > > pieces.
> > > >
> > > > gcc/
> > > >
> > > > PR middle-end/90773
> > > > * expr.c (expand_expr_real_1): Don't call expand_constructor if
> > > > it is more efficient to load the data from the memory.
> > > >
> > > > gcc/testsuite/
> > > >
> > > > PR middle-end/90773
> > > > * gcc.target/i386/pr90773-24.c: New test.
> > > > * gcc.target/i386/pr90773-25.c: Likewise.
> > > > ---
> > > >  gcc/expr.c | 10 ++
> > > >  gcc/testsuite/gcc.target/i386/pr90773-24.c | 22 ++
> > > >  gcc/testsuite/gcc.target/i386/pr90773-25.c | 20 
> > > >  3 files changed, 52 insertions(+)
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-24.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-25.c
> > > >
> > > > diff --git a/gcc/expr.c b/gcc/expr.c
> > > > index d09ee42e262..80e01ea1cbe 100644
> > > > --- a/gcc/expr.c
> > > > +++ b/gcc/expr.c
> > > > @@ -10886,6 +10886,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> > > > unsigned HOST_WIDE_INT ix;
> > > > tree field, value;
> > > >
> > > > +   /* Check if it is more efficient to load the data from
> > > > +  the memory directly.  FIXME: How many stores do we
> > > > +  need here if not moved by pieces?  */
> > > > +   unsigned HOST_WIDE_INT bytes
> > > > + = tree_to_uhwi (TYPE_SIZE_UNIT (type));
> > >
> > > that's prone to fail - it could be a VLA.
> >
> > What do you mean by fail?  Is it ICE or missed optimization?
> > Do you have a testcase?
> >
> > >
> > > > +   if ((bytes / UNITS_PER_WORD) > 2
> > > > +   && MOVE_MAX_PIECES > UNITS_PER_WORD
> > > > +   && can_move_by_pieces (bytes, TYPE_ALIGN (type)))
> > > > + goto normal_inner_ref;
> > > > +
> > >
> > > It looks like you're concerned about aggregate copies but this also handles
> > > non-aggregates (which on GIMPLE might already be optimized of course).
> >
> > Here I check if we copy more than 2 words and we can move more than
> > a word in a single instruction.
> >
> > > Also you say "if it's cheaper" but I see no cost considerations.  How do
> > > we generally handle immed const vs. load from constant pool costs?
> >
> > This trades 2 (up to 8) stores with one load plus one store.  Is there
> > a way to check which one is faster?
>
> I'm not sure - it depends on whether the target can do stores from immediates
> at all or what restrictions apply, what the immediate value actually is
> (zero or all-ones should be way cheaper than sth arbitrary) and how the
> pressure on the load unit is.  can_move_by_pieces (bytes, TYPE_ALIGN (type))
> also does not guarantee it will actually move pieces larger than UNITS_PER_WORD,
> that might depend on alignment.  There's by_pieces_ninsns that might provide
> some hint here.
>
> I'm sure it works well for x86.
>
> I wonder if the existing code is in the appropriate place and we
> shouldn't instead
> handle this somewhere upthread where we ask to copy 'exp' into some other
> memory location.  For your testcase that's expand_assignment but I can
> imagine passing array[0] by value to a function resulting in similar copying.
> Testing that shows we get
>
> pushq   array+56(%rip)
> .cfi_def_cfa_offset 24
> pushq   array+48(%rip)
> .cfi_def_cfa_offset 32
> pushq   array+40(%rip)
> .cfi_def_cfa_offset 40
> pushq   array+32(%rip)
> .cfi_def_cfa_offset 48
> pushq   array+24(%rip)
> .cfi_def_cfa_offset 56
> pushq   array+16(%rip)
> .cfi_def_cfa_offset 64
> pushq   array+8(%rip)
> .cfi_def_cfa_offset 72
> pushq   array(%rip)
> .cfi_def_cfa_offset 80
> call    bar
>
> for that.  We do have the by-pieces infrastructure to generally do this kind of
> copying but in both of these cases we do not seem to use it.  I also wonder
> if the by-pieces infrastructure can pick up constant initializers automagically
> (we could native_encode the initializer part and feed the by-pieces
> infrastructure with an array of bytes).  There for example might be easy to
> immediate-store byte parts and difficult ones where we could decide on a
> case-by-case basis whether to load+store or immediate-store them.

I opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704

> For example if I change your testcase to have 
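
For readers following the thread, the condition in the quoted hunk can be restated as a standalone predicate. This is only a sketch: `TargetParams`, `prefer_load_from_memory` and the boolean standing in for the result of `can_move_by_pieces` are hypothetical names, not GCC interfaces. It deliberately contains no cost model, which is the gap Richard points out above.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the target macros UNITS_PER_WORD and
// MOVE_MAX_PIECES used in the patch; real values are target-specific.
struct TargetParams {
  unsigned units_per_word;
  unsigned move_max_pieces;
};

// Mirrors the shape of the condition in the hunk: skip expand_constructor
// when the object spans more than two words, the target can move more than
// one word per instruction, and a by-pieces move is possible at all.
// `by_pieces_ok` stands in for can_move_by_pieces (bytes, align).
bool prefer_load_from_memory(std::uint64_t bytes, const TargetParams& target,
                             bool by_pieces_ok) {
  return (bytes / target.units_per_word) > 2
         && target.move_max_pieces > target.units_per_word
         && by_pieces_ok;
}
```

With 8-byte words and 16-byte moves, a 64-byte constant takes the memory path while a 16-byte one does not; whether that is actually faster is not something this predicate can tell.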

Re: [PATCH] Simplify option handling for -fsanitize-coverage

2021-05-20 Thread Martin Liška

On 5/20/21 3:26 PM, Bernhard Reutner-Fischer wrote:

On 20 May 2021 12:43:17 CEST, "Martin Liška"  wrote:


  /* Given ARG, an unrecognized sanitizer option, return the best
 matching sanitizer option, or NULL if there isn't one.
 OPTS is array of candidate sanitizer options.
-   CODE is OPT_fsanitize_, OPT_fsanitize_recover_ or
-   OPT_fsanitize_coverage_.
+   CODE is OPT_fsanitize_or OPT_fsanitize_recover_.


Shouldn't there be a space before "or" in OPT_fsanitize_or ?


Yes, please.


thanks,


Thanks,
Martin


Re: [PATCH] collect2: avoid failure when passing text files in an AIX archive

2021-05-20 Thread David Edelsohn via Gcc-patches
This should not use warning_at because there is no location.  There is
a separate "warning" diagnostic that should be used.

I will test with a correct implementation.

Thanks, David

On Thu, May 20, 2021 at 4:13 AM CHIGOT, CLEMENT  wrote:
>
> AIX ld allows archives to contain text files by simply ignoring
> them and printing a warning.
> This patch enables the same behavior for collect2.
>
> gcc/ChangeLog:
> 2021-05-20  Clément Chigot  
>
> * collect2.c (scan_prog_file): Skip non-COFF files instead
> of raising an error.
>
>
> Maybe the warning message can be changed as it can say that
> an archive is not a COFF file instead of the file inside the archive.
> But I don't know how to handle localization files under "po/" afterwards,
> and it was already the case before.
>
> Please apply if accepted.
>
> Clément Chigot
> ATOS Bull SAS
> 1 rue de Provence - 38432 Échirolles - France
>


Re: [PATCH] Simplify option handling for -fsanitize-coverage

2021-05-20 Thread Bernhard Reutner-Fischer via Gcc-patches
On 20 May 2021 12:43:17 CEST, "Martin Liška"  wrote:

>  /* Given ARG, an unrecognized sanitizer option, return the best
> matching sanitizer option, or NULL if there isn't one.
> OPTS is array of candidate sanitizer options.
>-   CODE is OPT_fsanitize_, OPT_fsanitize_recover_ or
>-   OPT_fsanitize_coverage_.
>+   CODE is OPT_fsanitize_or OPT_fsanitize_recover_.

Shouldn't there be a space before "or" in OPT_fsanitize_or ?
thanks,


Re: [PATCH] Try LTO partial linking. (Was: Speed of compiling gimple-match.c)

2021-05-20 Thread Richard Biener via Gcc-patches
On Thu, May 20, 2021 at 3:16 PM Richard Biener
 wrote:
>
> On Thu, May 20, 2021 at 3:06 PM Martin Liška  wrote:
> >
> > On 5/20/21 2:54 PM, Richard Biener wrote:
> > > So why did you go from applying this per-file to multiple files?
> >
> > When I did per-file for {gimple,generic}-match.c I hit the following issue 
> > with lto.priv symbols:
> >
> > /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> > error: libbackend.a(generic-match.o): multiple definition of 
> > 'wi::to_wide(tree_node const*) [clone .part.0] [clone .lto_priv.0]'
> > /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> > libbackend.a(gimple-match.o): previous definition here
> > /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> > error: libbackend.a(generic-match.o): multiple definition of 
> > 'TYPE_VECTOR_SUBPARTS(tree_node const*) [clone .part.0] [clone .lto_priv.0]'
> > /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> > libbackend.a(gimple-match.o): previous definition here
> > /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> > error: libbackend.a(generic-match.o): multiple definition of 
> > 'vec::operator[](unsigned int) [clone 
> > .part.0] [clone .lto_priv.0]'
> >
> > Any idea what I was doing wrong?
>
> Nothing in particular I think - you're just hitting the issue that LTO
> produces new symbols and that those
> can obviously clash.  Giuliano hit the very same issue.  When not
> doing partial links those internal
> symbols pose no problem, but with -r -flinker-output=nolto-rel and
> re-linking the produced objects
> they obviously do.  ELF has no solution for this though, but I think
> we could strip those from the
> partially linked object - if WPA would give us a list of objects the
> link step could postprocess
> the object with objcopy or maybe a custom linker script could do the
> trick as well.

Oh, and the "best" solution would be to avoid involving the linker
when doing -r -flinker-output=nolto-rel but instead have the assembler
produce the single object from the multiple LTRANS assembly snippets
which could then use local labels instead of symbols for these.

> So your workaround is to only ever have a single LTO produced object
> file participating in the
> final links ;)
>
> Richard.
>
> >
> > Martin


Re: [PATCH] Try LTO partial linking. (Was: Speed of compiling gimple-match.c)

2021-05-20 Thread Richard Biener via Gcc-patches
On Thu, May 20, 2021 at 3:06 PM Martin Liška  wrote:
>
> On 5/20/21 2:54 PM, Richard Biener wrote:
> > So why did you go from applying this per-file to multiple files?
>
> When I did per-file for {gimple,generic}-match.c I hit the following issue 
> with lto.priv symbols:
>
> /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> error: libbackend.a(generic-match.o): multiple definition of 
> 'wi::to_wide(tree_node const*) [clone .part.0] [clone .lto_priv.0]'
> /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> libbackend.a(gimple-match.o): previous definition here
> /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> error: libbackend.a(generic-match.o): multiple definition of 
> 'TYPE_VECTOR_SUBPARTS(tree_node const*) [clone .part.0] [clone .lto_priv.0]'
> /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> libbackend.a(gimple-match.o): previous definition here
> /usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
> error: libbackend.a(generic-match.o): multiple definition of 
> 'vec::operator[](unsigned int) [clone 
> .part.0] [clone .lto_priv.0]'
>
> Any idea what I was doing wrong?

Nothing in particular I think - you're just hitting the issue that LTO
produces new symbols and that those
can obviously clash.  Giuliano hit the very same issue.  When not
doing partial links those internal
symbols pose no problem, but with -r -flinker-output=nolto-rel and
re-linking the produced objects
they obviously do.  ELF has no solution for this though, but I think
we could strip those from the
partially linked object - if WPA would give us a list of objects the
link step could postprocess
the object with objcopy or maybe a custom linker script could do the
trick as well.

So your workaround is to only ever have a single LTO produced object
file participating in the
final links ;)

Richard.

>
> Martin


Re: [PATCH, rs6000] Remove mode promotion of SSA variables

2021-05-20 Thread Segher Boessenkool
Hi!

On Thu, May 20, 2021 at 04:29:07PM +0800, HAO CHEN GUI wrote:
> On 19/5/2021 9:20 PM, Segher Boessenkool wrote:
> >On Wed, May 19, 2021 at 04:36:00PM +0800, HAO CHEN GUI wrote:
> >>-/* Define this macro if it is advisable to hold scalars in registers
> >>-   in a wider mode than that declared by the program.  In such cases,
> >>-   the value is constrained to be within the bounds of the declared
> >>-   type, but kept valid in the wider mode.  The signedness of the
> >>-   extension may differ from that of the type.  */
> >>-
> >>-#define PROMOTE_MODE(MODE,UNSIGNEDP,TYPE)  \
> >>-  if (GET_MODE_CLASS (MODE) == MODE_INT\
> >>-  && GET_MODE_SIZE (MODE) < (TARGET_32BIT ? 4 : 8)) \
> >>-(MODE) = TARGET_32BIT ? SImode : DImode;
> >>-
> >And this part needs some more words in the commit message :-)

[ ... ]

> >Also, how about something like
> >
> >#define PROMOTE_MODE(MODE,UNSIGNEDP,TYPE)\
> >   if (GET_MODE_CLASS (MODE) == MODE_INT \
> >   && GET_MODE_SIZE (MODE) < 4)  \
> > (MODE) = SImode;
> >
> >(that is, promoting modes smaller than SImode to SImode).  How does that
> >compare?
> 
> It hits an ICE when assigning a function return value to a variable. The 
> mode of the variable is promoted to SImode, but the mode of the return value 
> is promoted to DImode. The current GCC logic is that either the mode of 
> set_dest is unchanged or the mode of set_dest matches the mode of the 
> function return value.

So that sounds like a bug in the generic code?

> By the way, I tested the patch on ppc64 with the -m32 option. SPECint 
> shows a small improvement (0.4%). So it's better not to do any mode promotions.

Interesting!  Do you have an example that shows improved code for that,
just like that "SI->DI promotion gets a superfluous extension" testcase
you sent?


Segher
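
To make the two promotion rules discussed above easier to compare, here is a small model of them. It is a sketch under stated assumptions: the `Mode` enum, `mode_size`, `promote_to_word` and `promote_to_si` are hypothetical simplifications of GCC's machine modes and of the `PROMOTE_MODE` macro, not the real target hooks.

```cpp
// Hypothetical, simplified stand-ins for GCC's integer machine modes.
enum class Mode { QI, HI, SI, DI };

// Size in bytes of each modelled mode.
unsigned mode_size(Mode m) {
  switch (m) {
    case Mode::QI: return 1;
    case Mode::HI: return 2;
    case Mode::SI: return 4;
    case Mode::DI: return 8;
  }
  return 0;
}

// The rule being removed for pseudos (and kept for function arguments and
// return values): widen anything narrower than a word to word mode.
Mode promote_to_word(Mode m, bool target_32bit) {
  const unsigned word_bytes = target_32bit ? 4 : 8;
  if (mode_size(m) < word_bytes)
    return target_32bit ? Mode::SI : Mode::DI;
  return m;
}

// The alternative floated in this thread: widen only modes narrower than
// SImode, and only up to SImode.
Mode promote_to_si(Mode m) {
  return mode_size(m) < 4 ? Mode::SI : m;
}
```

On a 64-bit target the two rules disagree for narrow modes: a halfword variable would become SImode under the alternative, while a halfword return value still becomes DImode. That mismatch is the situation reported to hit the ICE.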


Re: [PATCH] Try LTO partial linking. (Was: Speed of compiling gimple-match.c)

2021-05-20 Thread Martin Liška

On 5/20/21 2:54 PM, Richard Biener wrote:

So why did you go from applying this per-file to multiple files?


When I did per-file for {gimple,generic}-match.c I hit the following issue with 
lto.priv symbols:

/usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
error: libbackend.a(generic-match.o): multiple definition of 
'wi::to_wide(tree_node const*) [clone .part.0] [clone .lto_priv.0]'
/usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
libbackend.a(gimple-match.o): previous definition here
/usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
error: libbackend.a(generic-match.o): multiple definition of 
'TYPE_VECTOR_SUBPARTS(tree_node const*) [clone .part.0] [clone .lto_priv.0]'
/usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: 
libbackend.a(gimple-match.o): previous definition here
/usr/lib64/gcc/x86_64-suse-linux/10/../../../../x86_64-suse-linux/bin/ld: error: 
libbackend.a(generic-match.o): multiple definition of 'vec::operator[](unsigned int) [clone .part.0] [clone .lto_priv.0]'

Any idea what I was doing wrong?

Martin


Re: [PATCH] testsuite: Use libstdc++ macro to check for pthread_cond_clockwait [PR 100655]

2021-05-20 Thread Jonathan Wakely via Gcc-patches

On 20/05/21 13:50 +0100, Jonathan Wakely wrote:

Also add dg-shouldfail to ignore failures due to ulimit.

gcc/testsuite/ChangeLog:

PR testsuite/100655
* g++.dg/tsan/pthread_cond_clockwait.C: Use libstdc++ macro to
check for availability of pthread_cond_clockwait. Add
dg-shouldfail.

Tested x86_64-linux (glibc 2.23) and powerpc64le-linux (glibc 2.17)



Jakub pointed out on IRC that the shouldfail here is wrong.

Patch withdrawn (and I'm not working on this any more).



Re: [PATCH 1/2, rs6000] Remove mode promotion for pseudos

2021-05-20 Thread Segher Boessenkool
[ Please send attachments as plain text, not as base64 ]

On Thu, May 20, 2021 at 05:49:50PM +0800, HAO CHEN GUI wrote:
>   * config/rs6000/rs6000-call.c (rs6000_promote_function_mode):
>   Replace PROMOTE_MODE macro with its content.

> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index f5676255387..dca139b2ecf 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -6646,7 +6646,9 @@ rs6000_promote_function_mode (const_tree type ATTRIBUTE_UNUSED,
>                                int *punsignedp ATTRIBUTE_UNUSED,
>                                const_tree, int for_return ATTRIBUTE_UNUSED)
>  {
> -  PROMOTE_MODE (mode, *punsignedp, type);
> +  if (GET_MODE_CLASS (mode) == MODE_INT
> +      && GET_MODE_SIZE (mode) < (TARGET_32BIT ? 4 : 8))
> +    mode = TARGET_32BIT ? SImode : DImode;
> 
>    return mode;
>  }

This is fine (of course).  Okay for trunk.  Thanks!


Segher


Re: [PATCH] c++: Add new warning options for C++ language mismatches

2021-05-20 Thread Jonathan Wakely via Gcc-patches

On 19/05/21 16:05 -0400, Jason Merrill wrote:

On 5/19/21 3:55 PM, Jonathan Wakely wrote:

On 19/05/21 13:26 -0400, Jason Merrill wrote:

On 5/19/21 12:46 PM, Jonathan Wakely wrote:

On 19/05/21 17:39 +0100, Jonathan Wakely wrote:

Jakub pointed out I'd forgotten the spaces before the opening parens
for function calls. The attached patch should fix all those, with no
other changes.

Tested x86_64-linux. OK for trunk?


Jakub also pointed out we already have some similar diagnostics for
C++23, but I missed them as they say "only optional with" not "only
available with".

I'm testing the incremental change in the attached patch which also
adds -Wc++23-extensions, and I'll resend the full patch after that
finishes.



  if (omitted_parms_loc && lambda_specs.any_specifiers_p)
    {
-  pedwarn (omitted_parms_loc, 0,
+  pedwarn (omitted_parms_loc, OPT_Wc__23_extensions,


You probably want to change


 else if (cxx_dialect < cxx23)
   omitted_parms_loc = cp_lexer_peek_token (parser->lexer)->location;


To use warn_about_dialect_p.


Ah yes.

And just above that there's another pedwarn about a C++14 feature
being used:


 /* Default arguments shall not be specified in the
 parameter-declaration-clause of a lambda-declarator.  */
 if (cxx_dialect < cxx14)
for (tree t = param_list; t; t = TREE_CHAIN (t))
  if (TREE_PURPOSE (t) && DECL_P (TREE_VALUE (t)))
    pedwarn (DECL_SOURCE_LOCATION (TREE_VALUE (t)), OPT_Wpedantic,
 "default argument specified for lambda parameter");


I didn't notice that one initially. That should also use
warn_about_dialect_p and OPT_Wc__14_extensions.


Indeed.


Should I change the message to say "init capture" rather than
"default argument"?


No, this is about e.g. [](int = 42){}


OK, this is a simpler version of the patch, with docs now, but without
the new warn_about_cxx_dialect_p function (which isn't needed) and
with no changes to any actual warning text (I'll do that separately,
if at all).

I also caught a few more pedwarn cases that I missed previously.

Tested powerpc64le-linux. OK for trunk?



commit 01daea63635cff592f3fdbe7be2b704843b9193c
Author: Jonathan Wakely 
Date:   Wed May 19 21:35:58 2021

c++: Add new warning options for C++ language mismatches

This adds new warning flags, enabled by default: -Wc++11-extensions,
-Wc++14-extensions, -Wc++17-extensions, -Wc++20-extensions, and
-Wc++23-extensions. The names of the flags are copied from Clang, which
already has similar options.

No new diagnostics are added, but the new OPT_Wxxx variables are used to
control existing pedwarns about occurrences of new C++ constructs in code
using an old C++ standard dialect. This allows several existing warnings
that cannot currently be disabled to be controlled by the appropriate
-Wno-xxx flag. For example, it will now be possible to disable warnings
about using variadic templates in C++98 code, by using the new
-Wno-c++11-extensions option. This will allow libstdc++ headers to
disable those warnings unconditionally by using diagnostic pragmas, so
that they are not emitted even if -Wsystem-headers is used.

Some of the affected diagnostics are currently only given when
-Wpedantic is used. Now that we have a more specific warning flag, we
could consider making them not depend on -Wpedantic, and only on the new
flag. This patch does not do that, as it intends to make no changes to
what is accepted/rejected by default. The only effect should be that
the new option is shown when -fdiagnostics-show-option is active, and
that some warnings can be disabled by using the new flags (and for the
warnings that previously only depended on -Wpedantic, it will now be
possible to disable just those warnings while still using -Wpedantic for
its other benefits).
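
As an illustration of the diagnostic-pragma use case mentioned above, the following sketch suppresses the C++17-extension warning around a single declaration. The names `g_next_id` and `next_id` are made up for the example, and it assumes a compiler that knows `-Wc++17-extensions` (GCC with this patch applied, or Clang); a compiler without the option may warn about the unknown flag instead.

```cpp
// Suppress only the "C++17 extension" diagnostics for this region, leaving
// -Wpedantic and every other warning untouched.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++17-extensions"

// Inline variables are a C++17 feature; in older dialects they are accepted
// as an extension with a warning, which the pragma above silences.
inline int g_next_id = 0;

#pragma GCC diagnostic pop

// Hypothetical helper that uses the inline variable.
int next_id() { return ++g_next_id; }
```

The same effect for a whole translation unit would come from passing -Wno-c++17-extensions on the command line.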

gcc/c-family/ChangeLog:

* c.opt (Wc++11-extensions, Wc++14-extensions)
(Wc++17-extensions, Wc++20-extensions, Wc++23-extensions): New
options.

gcc/cp/ChangeLog:

* call.c (maybe_warn_array_conv): Use new warning option.
* decl.c (mark_inline_variable, grokdeclarator): Likewise.
* error.c (maybe_warn_cpp0x): Likewise.
* parser.c (cp_parser_primary_expression)
(cp_parser_unqualified_id)
(cp_parser_pseudo_destructor_name)
(cp_parser_lambda_introducer)
(cp_parser_lambda_declarator_opt)
(cp_parser_selection_statement)
(cp_parser_init_statement)
(cp_parser_decomposition_declaration)
(cp_parser_function_specifier_opt)
(cp_parser_static_assert)
(cp_parser_namespace_definition)
(cp_parser_using_declaration)
(cp_parser_asm_definition)
(cp_parser_ctor_initializer_opt_and_function_body)
(cp_parser_initializer_list)

Re: [PATCH] Try LTO partial linking. (Was: Speed of compiling gimple-match.c)

2021-05-20 Thread Richard Biener via Gcc-patches
On Thu, May 20, 2021 at 2:34 PM Martin Liška  wrote:
>
> Hello.
>
> I've got a patch candidate that leverages partial linking for a couple of 
> selected object files.
>
> I'm sending make all-host -jX results for my machine:
>
> before: 3m18s (user 32m52s)
> https://gist.githubusercontent.com/marxin/223890df4d8d8e490b6b2918b77dacad/raw/1dd5eae5001295ba0230a689f7edc67284c9b742/gcc-all-host.svg
>
> after: 2m57s (user 35m)
> https://gist.githubusercontent.com/marxin/223890df4d8d8e490b6b2918b77dacad/raw/d659b2187cf622167841efbbe6bc93cb33855fa9/gcc-all-host-partial-lto.svg
>
> One can utilize it with:
> make -j16 all-host PARTIAL_LTO=1
>
> @Segher, Andrew: Can you please measure time improvement for your slow 
> bootstrap?
> One can also tweak --param=lto-partitions=16 param value.
>
> Thoughts?

You're LTO linking multiple objects here - that's almost as if you
were doing this
for the whole of libbackend.a ... so $(OBJS)_CFLAGS += -flto and in the
libbackend.a rule do a similar partial link trick.

That gets you half of a LTO bootstrap then.

So why did you go from applying this per-file to multiple files?  Does $(LINKER)
have a proper rule to pick up a jobserver?

When upstreaming in any form you probably have to gate it on bootstrap-lto
being not active.

Richard.

> Thanks,
> Martin


[PATCH] testsuite: Use libstdc++ macro to check for pthread_cond_clockwait [PR 100655]

2021-05-20 Thread Jonathan Wakely via Gcc-patches
Also add dg-shouldfail to ignore failures due to ulimit.

gcc/testsuite/ChangeLog:

PR testsuite/100655
* g++.dg/tsan/pthread_cond_clockwait.C: Use libstdc++ macro to
check for availability of pthread_cond_clockwait. Add
dg-shouldfail.

Tested x86_64-linux (glibc 2.23) and powerpc64le-linux (glibc 2.17)

OK for trunk?


commit 86f5ac3b64a98280df6f0ea407eea8cde7c2edbd
Author: Jonathan Wakely 
Date:   Thu May 20 12:40:07 2021

testsuite: Use libstdc++ macro to check for pthread_cond_clockwait [PR 100655]

Also add dg-shouldfail to ignore failures due to ulimit.

gcc/testsuite/ChangeLog:

PR testsuite/100655
* g++.dg/tsan/pthread_cond_clockwait.C: Use libstdc++ macro to
check for availability of pthread_cond_clockwait. Add
dg-shouldfail.

diff --git a/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C b/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C
index 82d6a5c8329..c6c621bea51 100644
--- a/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C
+++ b/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C
@@ -1,9 +1,14 @@
 // Test pthread_cond_clockwait not generating false positives with tsan
+/* { dg-shouldfail "tsan" } */
 // { dg-do run { target { { *-*-linux* *-*-gnu* *-*-uclinux* } && pthread } } }
 // { dg-options "-fsanitize=thread -lpthread" }
 
 #include 
 
+// Include this to get the libstdc++ _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
+// macro that indicates pthread_cond_clockwait is available.
+#include 
+
 pthread_cond_t cv;
 pthread_mutex_t mtx;
 
@@ -23,7 +28,9 @@ int main() {
 struct timespec ts;
 clock_gettime(CLOCK_MONOTONIC, &ts);
 ts.tv_sec += 10;
+#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
 pthread_cond_clockwait(&cv, &mtx, CLOCK_MONOTONIC, &ts);
+#endif
 pthread_mutex_unlock(&mtx);
 
 pthread_join(tid, NULL);


committed: Fix PR libstdc++/100361

2021-05-20 Thread Joern Wolfgang Rennecke
commit 66c5f24788652a49b528f14e23e8121ad0935ace (trunk)
commit 5f772bd9847cdbf6a7a6d856de87cb65472d56f4 (releases/gcc11)

As approved by Jonathan Wakely in the comments to PR libstdc++/100361.
Bootstrapped and regression tested on x86_64-pc-linux-gnu.
2021-05-18  Joern Rennecke  

libstdc++: Disable floating_to_chars.cc on 16 bit targets

This patch conditionally disables the compilation of floating_to_chars.cc
on 16 bit targets, thus fixing a build failure for these targets as
the POW10_SPLIT_2 array exceeds the maximum object size.
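
The size argument can be checked with a few lines of arithmetic. This is a sketch: the table dimensions are taken from the comment added to floating_to_chars.cc in this patch, while the 16-bit limit of 2^16 - 1 bytes is an assumption about the most generous case (the real per-object limit on such a target may be lower), and the identifiers are invented for the example.

```cpp
#include <cstdint>

// Dimensions from the patch comment: uint64_t POW10_SPLIT_2[3133][3].
constexpr std::uint64_t kTableBytes =
    3133ULL * 3ULL * sizeof(std::uint64_t);

// Assumed upper bound on an object's size when size_t is 16 bits wide.
constexpr std::uint64_t kMax16BitObject = 0xFFFFULL;

// True if the table could be a valid object under that bound.
constexpr bool table_fits_16bit() { return kTableBytes <= kMax16BitObject; }

static_assert(!table_fits_16bit(),
              "the table is larger than any object a 16-bit size_t allows");
```

The table needs 75192 bytes, which is above 65535, so the `__SIZE_WIDTH__ >= 32` guard excludes exactly the targets where the array cannot exist.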

libstdc++-v3/
PR libstdc++/100361
* include/std/charconv (to_chars): Hide the overloads for
floating-point types for 16 bit targets.
* src/c++17/floating_to_chars.cc: Don't compile for 16 bit targets.
* testsuite/20_util/to_chars/double.cc: Run this test only on
size32plus targets.
* testsuite/20_util/to_chars/float.cc: Likewise.
* testsuite/20_util/to_chars/long_double.cc: Likewise.

diff --git a/libstdc++-v3/include/std/charconv b/libstdc++-v3/include/std/charconv
index 193702e677a..ac9c34d4601 100644
--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -703,7 +703,8 @@ namespace __detail
 chars_format __fmt = chars_format::general) noexcept;
 #endif
 
-#if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64
+#if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64 \
+&& __SIZE_WIDTH__ >= 32
   // Floating-point std::to_chars
 
   // Overloads for float.
diff --git a/libstdc++-v3/src/c++17/floating_to_chars.cc b/libstdc++-v3/src/c++17/floating_to_chars.cc
index 1a0abb9e80f..44f547a77b4 100644
--- a/libstdc++-v3/src/c++17/floating_to_chars.cc
+++ b/libstdc++-v3/src/c++17/floating_to_chars.cc
@@ -50,7 +50,9 @@ extern "C" int __sprintfieee128(char*, const char*, ...);
 
 // This implementation crucially assumes float/double have the
 // IEEE binary32/binary64 formats.
-#if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64
+#if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64 \
+/* And it also assumes that uint64_t POW10_SPLIT_2[3133][3] is valid.  */\
+&& __SIZE_WIDTH__ >= 32
 
 // Determine the binary format of 'long double'.
 
diff --git a/libstdc++-v3/testsuite/20_util/to_chars/double.cc b/libstdc++-v3/testsuite/20_util/to_chars/double.cc
index bb6f74424ed..64e62213044 100644
--- a/libstdc++-v3/testsuite/20_util/to_chars/double.cc
+++ b/libstdc++-v3/testsuite/20_util/to_chars/double.cc
@@ -33,6 +33,7 @@
 
 // { dg-do run { target c++17 } }
 // { dg-require-effective-target ieee-floats }
+// { dg-require-effective-target size32plus }
 
 #include 
 
diff --git a/libstdc++-v3/testsuite/20_util/to_chars/float.cc b/libstdc++-v3/testsuite/20_util/to_chars/float.cc
index 0c8dd4f66df..73b9081d4ff 100644
--- a/libstdc++-v3/testsuite/20_util/to_chars/float.cc
+++ b/libstdc++-v3/testsuite/20_util/to_chars/float.cc
@@ -33,6 +33,7 @@
 
 // { dg-do run { target c++17 } }
 // { dg-require-effective-target ieee-floats }
+// { dg-require-effective-target size32plus }
 
 #include 
 
diff --git a/libstdc++-v3/testsuite/20_util/to_chars/long_double.cc b/libstdc++-v3/testsuite/20_util/to_chars/long_double.cc
index 8cf45ad5e94..447e5368811 100644
--- a/libstdc++-v3/testsuite/20_util/to_chars/long_double.cc
+++ b/libstdc++-v3/testsuite/20_util/to_chars/long_double.cc
@@ -35,6 +35,7 @@
 // { dg-xfail-run-if "Non-conforming printf (see PR98384)" { *-*-solaris* *-*-darwin* } }
 
 // { dg-require-effective-target ieee-floats }
+// { dg-require-effective-target size32plus }
 
 #include 
 


[PATCH] Try LTO partial linking. (Was: Speed of compiling gimple-match.c)

2021-05-20 Thread Martin Liška

Hello.

I've got a patch candidate that leverages partial linking for a couple of 
selected object files.

I'm sending make all-host -jX results for my machine:

before: 3m18s (user 32m52s)
https://gist.githubusercontent.com/marxin/223890df4d8d8e490b6b2918b77dacad/raw/1dd5eae5001295ba0230a689f7edc67284c9b742/gcc-all-host.svg

after: 2m57s (user 35m)
https://gist.githubusercontent.com/marxin/223890df4d8d8e490b6b2918b77dacad/raw/d659b2187cf622167841efbbe6bc93cb33855fa9/gcc-all-host-partial-lto.svg

One can utilize it with:
make -j16 all-host PARTIAL_LTO=1

@Segher, Andrew: Can you please measure time improvement for your slow 
bootstrap?
One can also tweak --param=lto-partitions=16 param value.

Thoughts?
Thanks,
Martin
From 85228e612610c0e4b0324f6bebc84ef7c0211c4a Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 20 May 2021 14:29:35 +0200
Subject: [PATCH] Try LTO partial linking.

---
 gcc/Makefile.in | 30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 1164554e6d6..f76bcea66f5 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -220,7 +220,9 @@ libgcov-util.o-warn = -Wno-error
 libgcov-driver-tool.o-warn = -Wno-error
 libgcov-merge-tool.o-warn = -Wno-error
 gimple-match.o-warn = -Wno-unused
+gimple-match-lto.o-warn = -Wno-unused
 generic-match.o-warn = -Wno-unused
+generic-match-lto.o-warn = -Wno-unused
 dfp.o-warn = -Wno-strict-aliasing
 
 # All warnings have to be shut off in stage1 if the compiler used then
@@ -1282,12 +1284,10 @@ ANALYZER_OBJS = \
 # will build them sooner, because they are large and otherwise tend to be
 # the last objects to finish building.
 OBJS = \
-	gimple-match.o \
-	generic-match.o \
+	common-base.a \
 	insn-attrtab.o \
 	insn-automata.o \
 	insn-dfatab.o \
-	insn-emit.o \
 	insn-extract.o \
 	insn-latencytab.o \
 	insn-modes.o \
@@ -1295,7 +1295,6 @@ OBJS = \
 	insn-output.o \
 	insn-peep.o \
 	insn-preds.o \
-	insn-recog.o \
 	insn-enums.o \
 	ggc-page.o \
 	adjust-alignment.o \
@@ -2627,6 +2626,29 @@ s-match: build/genmatch$(build_exeext) $(srcdir)/match.pd cfn-operators.pd
 		generic-match.c
 	$(STAMP) s-match
 
+ifdef PARTIAL_LTO
+LTO_LINKER_FLAGS = -flto=auto --param=lto-partitions=16 -flinker-output=nolto-rel -r
+LTO_FLAGS = -flto
+
+gimple-match-lto.o: gimple-match.c $(TARGET_H)
+	$(COMPILE) $< $(LTO_FLAGS)
+generic-match-lto.o: generic-match.c $(TARGET_H)
+	$(COMPILE) $< $(LTO_FLAGS)
+insn-recog-lto.o: insn-recog.c
+	$(COMPILE) $< $(LTO_FLAGS)
+insn-emit-lto.o: insn-emit.c
+	$(COMPILE) $< $(LTO_FLAGS)
+
+common-base.a: gimple-match-lto.o generic-match-lto.o insn-recog-lto.o insn-emit-lto.o
+	-rm -rf $@
+	$(LINKER) $^ $(LTO_LINKER_FLAGS) -o common-base.o
+	$(AR) $(AR_FLAGS)T $@ common-base.o
+else
+common-base.a: gimple-match.o generic-match.o insn-recog.o insn-emit.o
+	-rm -rf $@
+	$(AR) $(AR_FLAGS)T $@ $^
+endif
+
 GTFILES = $(CPPLIB_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(host_xm_file_list) \
   $(tm_file_list) $(HASHTAB_H) $(SPLAY_TREE_H) $(srcdir)/bitmap.h \
-- 
2.31.1



Re: [PATCH] tree-optimization: Improve spaceship_replacement [PR94589]

2021-05-20 Thread Richard Biener
On Thu, 20 May 2021, Jakub Jelinek wrote:

> On Wed, May 19, 2021 at 01:30:31PM -0400, Jason Merrill via Gcc-patches wrote:
> > Here, when genericizing lexicographical_compare_three_way, we haven't yet
> > walked the operands, so (a == a) still sees ADDR_EXPR , but this is after
> > we've changed the type of a to REFERENCE_TYPE.  When we try to fold (a == a)
> > by constexpr evaluation, the constexpr code doesn't understand trying to
> > take the address of a reference, and we end up crashing.
> > 
> > Fixed by avoiding constexpr evaluation in genericize_spaceship, by using
> > fold_build2 instead of build_new_op on scalar operands.  Class operands
> > should have been expanded during parsing.
> 
> Unfortunately this slightly changed the IL and spaceship_replacement no
> longer pattern matches it.
> 
> Here are 3 improvements that make it match:
> 
> 1) as mentioned in the comment above spaceship_replacement, for
>strong_ordering, we are pattern matching something like:
>x == y ? 0 : x < y ? -1 : 1;
>and for partial_ordering
>x == y ? 0 : x < y ? -1 : x > y ? 1 : 2;
>but given the == comparison done first and the other comparisons only
>if == was false, we actually don't care if the other comparisons
>are < vs. <= (or > vs. >=), provided the operands of the comparison
>are the same; we know == is false when doing those and < vs. <= or
>> vs. >= have the same behavior for NaNs too
> 2) when y is an integral constant, we should treat x < 5 equivalently
>to x <= 4 etc.
> 3) the code punted if cond2_phi_edge wasn't a EDGE_TRUE_VALUE edge, but
>as the new IL shows, that isn't really needed; given 1) that
>> and >= are equivalent in the code, any of swapping the comparison
>operands, changing L[TE]_EXPR to G[TE]_EXPR or vice versa or
>swapping the EDGE_TRUE_VALUE / EDGE_FALSE_VALUE bits on the edges
>reverses one of the two comparisons
> 
> Ok for trunk if it passes bootstrap/regtest?

OK.

Richard.
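
The equivalences in points 1) and 2) of the quoted description can be checked outside the compiler. The sketch below is plain C++ with made-up function names, not the GIMPLE pattern matcher: it writes each form out by hand and compares the results over a small set of inputs, including NaN for the partial_ordering shape.

```cpp
#include <limits>

// strong_ordering shape, with < and with <= after the == test.
int order_lt(int x, int y) { return x == y ? 0 : x < y ? -1 : 1; }
int order_le(int x, int y) { return x == y ? 0 : x <= y ? -1 : 1; }

// partial_ordering shape; 2 stands for "unordered" as in the description.
int partial_strict(double x, double y) {
  return x == y ? 0 : x < y ? -1 : x > y ? 1 : 2;
}
int partial_loose(double x, double y) {
  return x == y ? 0 : x <= y ? -1 : x >= y ? 1 : 2;
}

// Integral constant operand: x < 5 versus x <= 4 after testing x == 5.
int against5_lt(int x) { return x == 5 ? 0 : x < 5 ? -1 : 1; }
int against5_le(int x) { return x == 5 ? 0 : x <= 4 ? -1 : 1; }

// Returns true if every pair of forms agrees on all sampled inputs.
bool forms_agree() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  const double samples[] = {-1.0, 0.0, 2.5, nan};
  for (double a : samples)
    for (double b : samples)
      if (partial_strict(a, b) != partial_loose(a, b))
        return false;
  for (int x = -10; x <= 10; ++x) {
    if (against5_lt(x) != against5_le(x))
      return false;
    for (int y = -10; y <= 10; ++y)
      if (order_lt(x, y) != order_le(x, y))
        return false;
  }
  return true;
}
```

Because equality is tested first, the strict and non-strict comparisons can only differ where the operands are equal, and that case has already been handled; with a NaN operand every comparison is false in both forms, so both yield 2.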

> 2021-05-20  Jakub Jelinek  
> 
>   PR tree-optimization/94589
>   * tree-ssa-phiopt.c (spaceship_replacement): For integral rhs1 and
>   rhs2, treat x <= 4 equivalently to x < 5 etc.  In cmp1 and cmp2 (if
>   not the same as cmp3) treat <= the same as < and >= the same as >.
>   Don't require that cond2_phi_edge is true edge, instead take
>   false/true edges into account based on cmp1/cmp2 comparison kinds.
> 
> --- gcc/tree-ssa-phiopt.c.jj  2021-05-18 10:08:43.858591341 +0200
> +++ gcc/tree-ssa-phiopt.c 2021-05-20 13:09:28.046448559 +0200
> @@ -1988,8 +1988,16 @@ spaceship_replacement (basic_block cond_
>  
>gcond *cond1 = as_a <gcond *> (last_stmt (cond_bb));
>enum tree_code cmp1 = gimple_cond_code (cond1);
> -  if (cmp1 != LT_EXPR && cmp1 != GT_EXPR)
> -return false;
> +  switch (cmp1)
> +{
> +case LT_EXPR:
> +case LE_EXPR:
> +case GT_EXPR:
> +case GE_EXPR:
> +  break;
> +default:
> +  return false;
> +}
>tree lhs1 = gimple_cond_lhs (cond1);
>tree rhs1 = gimple_cond_rhs (cond1);
>/* The optimization may be unsafe due to NaNs.  */
> @@ -2029,7 +2037,42 @@ spaceship_replacement (basic_block cond_
>if (lhs2 == lhs1)
>  {
>if (!operand_equal_p (rhs2, rhs1, 0))
> - return false;
> + {
> +   if ((cmp2 == EQ_EXPR || cmp2 == NE_EXPR)
> +   && TREE_CODE (rhs1) == INTEGER_CST
> +   && TREE_CODE (rhs2) == INTEGER_CST)
> + {
> +   /* For integers, we can have cond2 x == 5
> +  and cond1 x < 5, x <= 4, x <= 5, x < 6,
> +  x > 5, x >= 6, x >= 5 or x > 4.  */
> +   if (tree_int_cst_lt (rhs1, rhs2))
> + {
> +   if (wi::ne_p (wi::to_wide (rhs1) + 1, wi::to_wide (rhs2)))
> + return false;
> +   if (cmp1 == LE_EXPR)
> + cmp1 = LT_EXPR;
> +   else if (cmp1 == GT_EXPR)
> + cmp1 = GE_EXPR;
> +   else
> + return false;
> + }
> +   else
> + {
> +   gcc_checking_assert (tree_int_cst_lt (rhs2, rhs1));
> +   if (wi::ne_p (wi::to_wide (rhs2) + 1, wi::to_wide (rhs1)))
> + return false;
> +   if (cmp1 == LT_EXPR)
> + cmp1 = LE_EXPR;
> +   else if (cmp1 == GE_EXPR)
> + cmp1 = GT_EXPR;
> +   else
> + return false;
> + }
> +   rhs1 = rhs2;
> + }
> +   else
> + return false;
> + }
>  }
>else if (lhs2 == rhs1)
>  {
> @@ -2061,20 +2104,30 @@ spaceship_replacement (basic_block cond_
>  || absu_hwi (tree_to_shwi (arg0)) != 1
>  || wi::to_widest (arg0) == wi::to_widest (arg1))
>   return false;
> -  if (cmp2 != LT_EXPR && cmp2 != GT_EXPR)
> - return false;
> +  switch (cmp2)
> + {
> + case LT_EXPR:
> + case LE_EXPR:
> + case GT_EXPR:
> 

[PATCH] tree-optimization: Improve spaceship_replacement [PR94589]

2021-05-20 Thread Jakub Jelinek via Gcc-patches
On Wed, May 19, 2021 at 01:30:31PM -0400, Jason Merrill via Gcc-patches wrote:
> Here, when genericizing lexicographical_compare_three_way, we haven't yet
> walked the operands, so (a == a) still sees ADDR_EXPR , but this is after
> we've changed the type of a to REFERENCE_TYPE.  When we try to fold (a == a)
> by constexpr evaluation, the constexpr code doesn't understand trying to
> take the address of a reference, and we end up crashing.
> 
> Fixed by avoiding constexpr evaluation in genericize_spaceship, by using
> fold_build2 instead of build_new_op on scalar operands.  Class operands
> should have been expanded during parsing.

Unfortunately this slightly changed the IL and spaceship_replacement no
longer pattern matches it.

Here are 3 improvements that make it match:

1) as mentioned in the comment above spaceship_replacement, for
   strong_ordering, we are pattern matching something like:
   x == y ? 0 : x < y ? -1 : 1;
   and for partial_ordering
   x == y ? 0 : x < y ? -1 : x > y ? 1 : 2;
   but given the == comparison done first and the other comparisons only
   if == was false, we actually don't care if the other comparisons
   are < vs. <= (or > vs. >=), provided the operands of the comparison
   are the same; we know == is false when doing those and < vs. <= or
   > vs. >= have the same behavior for NaNs too
2) when y is an integral constant, we should treat x < 5 equivalently
   to x <= 4 etc.
3) the code punted if cond2_phi_edge wasn't a EDGE_TRUE_VALUE edge, but
   as the new IL shows, that isn't really needed; given 1) that
   > and >= are equivalent in the code, any of swapping the comparison
   operands, changing L[TE]_EXPR to G[TE]_EXPR or vice versa or
   swapping the EDGE_TRUE_VALUE / EDGE_FALSE_VALUE bits on the edges
   reverses one of the two comparisons
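
Points 1) and 2) above can be checked with a small stand-alone sketch. This is
only an illustration at the source level, not the GIMPLE that
spaceship_replacement actually matches; the function names are made up for the
example.

```cpp
#include <cassert>

// Reference form mentioned in the comment above spaceship_replacement
// for strong_ordering.
static int cmp_lt(int x, int y) { return x == y ? 0 : x < y ? -1 : 1; }

// Same chain with <= in place of <.  Equality has already been handled,
// so the second comparison cannot tell the two apart.
static int cmp_le(int x, int y) { return x == y ? 0 : x <= y ? -1 : 1; }

// Point 2: against the constant 5, x <= 4 is the same test as x < 5
// for integers.
static int cmp5_lt(int x) { return x == 5 ? 0 : x < 5 ? -1 : 1; }
static int cmp5_le4(int x) { return x == 5 ? 0 : x <= 4 ? -1 : 1; }

// Returns true when all variants agree on a small range of inputs.
static bool variants_agree() {
  for (int x = -8; x <= 8; ++x) {
    if (cmp5_lt(x) != cmp5_le4(x))
      return false;
    for (int y = -8; y <= 8; ++y)
      if (cmp_lt(x, y) != cmp_le(x, y))
        return false;
  }
  return true;
}
```

Calling variants_agree() should return true, which is the equivalence the
pattern matcher relies on once the == test has been done first.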

Ok for trunk if it passes bootstrap/regtest?

2021-05-20  Jakub Jelinek  

PR tree-optimization/94589
* tree-ssa-phiopt.c (spaceship_replacement): For integral rhs1 and
rhs2, treat x <= 4 equivalently to x < 5 etc.  In cmp1 and cmp2 (if
not the same as cmp3) treat <= the same as < and >= the same as >.
Don't require that cond2_phi_edge is true edge, instead take
false/true edges into account based on cmp1/cmp2 comparison kinds.

--- gcc/tree-ssa-phiopt.c.jj	2021-05-18 10:08:43.858591341 +0200
+++ gcc/tree-ssa-phiopt.c	2021-05-20 13:09:28.046448559 +0200
@@ -1988,8 +1988,16 @@ spaceship_replacement (basic_block cond_
 
   gcond *cond1 = as_a <gcond *> (last_stmt (cond_bb));
   enum tree_code cmp1 = gimple_cond_code (cond1);
-  if (cmp1 != LT_EXPR && cmp1 != GT_EXPR)
-return false;
+  switch (cmp1)
+{
+case LT_EXPR:
+case LE_EXPR:
+case GT_EXPR:
+case GE_EXPR:
+  break;
+default:
+  return false;
+}
   tree lhs1 = gimple_cond_lhs (cond1);
   tree rhs1 = gimple_cond_rhs (cond1);
   /* The optimization may be unsafe due to NaNs.  */
@@ -2029,7 +2037,42 @@ spaceship_replacement (basic_block cond_
   if (lhs2 == lhs1)
 {
   if (!operand_equal_p (rhs2, rhs1, 0))
-   return false;
+   {
+ if ((cmp2 == EQ_EXPR || cmp2 == NE_EXPR)
+ && TREE_CODE (rhs1) == INTEGER_CST
+ && TREE_CODE (rhs2) == INTEGER_CST)
+   {
+ /* For integers, we can have cond2 x == 5
+and cond1 x < 5, x <= 4, x <= 5, x < 6,
+x > 5, x >= 6, x >= 5 or x > 4.  */
+ if (tree_int_cst_lt (rhs1, rhs2))
+   {
+ if (wi::ne_p (wi::to_wide (rhs1) + 1, wi::to_wide (rhs2)))
+   return false;
+ if (cmp1 == LE_EXPR)
+   cmp1 = LT_EXPR;
+ else if (cmp1 == GT_EXPR)
+   cmp1 = GE_EXPR;
+ else
+   return false;
+   }
+ else
+   {
+ gcc_checking_assert (tree_int_cst_lt (rhs2, rhs1));
+ if (wi::ne_p (wi::to_wide (rhs2) + 1, wi::to_wide (rhs1)))
+   return false;
+ if (cmp1 == LT_EXPR)
+   cmp1 = LE_EXPR;
+ else if (cmp1 == GE_EXPR)
+   cmp1 = GT_EXPR;
+ else
+   return false;
+   }
+ rhs1 = rhs2;
+   }
+ else
+   return false;
+   }
 }
   else if (lhs2 == rhs1)
 {
@@ -2061,20 +2104,30 @@ spaceship_replacement (basic_block cond_
   || absu_hwi (tree_to_shwi (arg0)) != 1
   || wi::to_widest (arg0) == wi::to_widest (arg1))
return false;
-  if (cmp2 != LT_EXPR && cmp2 != GT_EXPR)
-   return false;
+  switch (cmp2)
+   {
+   case LT_EXPR:
+   case LE_EXPR:
+   case GT_EXPR:
+   case GE_EXPR:
+ break;
+   default:
+ return false;
+   }
   /* if (x < y) goto phi_bb; else fallthru;
 if (x > y) goto phi_bb; else fallthru;
 

Re: [PATCH][libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end

2021-05-20 Thread Tom de Vries
On 5/20/21 11:52 AM, Thomas Schwinge wrote:
> Hi Tom!
> 
> First, thanks for looking into this PR99555!
> 
> 
> I can't comment on the OpenMP/nvptx changes, so just the following:
> 
> On 2021-04-23T18:48:01+0200, Tom de Vries  wrote:
>> --- a/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
>> +++ b/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
>> @@ -1,6 +1,5 @@
>>  ! { dg-do run }
>>
>> -! { dg-additional-sources on_device_arch.c }
>>! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is 
>> valid for Fortran but not for C" }
> 
> Please remove the 'dg-prune-output', too.  ;-)
> 

Ack, updated patch.

> Your changes leave
> 'libgomp/testsuite/lib/libgomp.exp:check_effective_target_offload_device_nvptx',
> 'libgomp/testsuite/libgomp.c-c++-common/on_device_arch.h',
> 'libgomp/testsuite/libgomp.fortran/on_device_arch.c' unused.  Should we
> keep those for a potential future use (given that they've been tested to
> work) or remove (as now unused, danger of bit-rot)?

I vote to leave them in, they look useful, and I think the danger of
bit-rot is less than the danger of not knowing/remembering that they
once were there and having to start from scratch.

Thanks,
- Tom


Re: [PATCH][libsanitizer]: Guard cyclades inclusion in sanitizer

2021-05-20 Thread Florian Weimer via Gcc-patches
* Tamar Christina via Gcc-patches:

> Hi All,
>
> libsanitizer: Guard cyclades inclusion in sanitizer
>
> The Linux kernel has removed the interface to cyclades from
> the latest kernel headers[1] due to them being orphaned for the
> past 13 years.

Nit: The commit subject doesn't match the patch because it removes the
functionality unconditionally (which is fine).
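
For contrast, a guard in the sense of the subject line might look roughly like
the sketch below. This is not what the patch does (it removes the cyclades
support outright); HAVE_LINUX_CYCLADES_H and kHaveCyclades are hypothetical
names used only for this illustration.

```cpp
#include <cassert>

// Sketch only: detect the header at preprocessing time instead of
// including it unconditionally.
#if defined(__has_include)
#  if __has_include(<linux/cyclades.h>)
#    define HAVE_LINUX_CYCLADES_H 1
#  endif
#endif
#ifndef HAVE_LINUX_CYCLADES_H
#  define HAVE_LINUX_CYCLADES_H 0
#endif

#if HAVE_LINUX_CYCLADES_H
#  include <linux/cyclades.h>
// Code that needs the cyclades definitions would go here.
static const bool kHaveCyclades = true;
#else
// Header is gone (newer kernel headers): compile the code out.
static const bool kHaveCyclades = false;
#endif
```

With kernel headers that no longer ship the file, the second branch is taken
and the build no longer depends on it.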

Thanks,
Florian



Re: [PATCH][libsanitizer]: Guard cyclades inclusion in sanitizer

2021-05-20 Thread Jakub Jelinek via Gcc-patches
On Thu, May 20, 2021 at 11:49:03AM +0100, Tamar Christina wrote:
> libsanitizer: Guard cyclades inclusion in sanitizer
> 
> The Linux kernel has removed the interface to cyclades from
> the latest kernel headers[1] due to them being orphaned for the
> past 13 years.
> 
> libsanitizer uses this header when compiling against glibc, but
> glibc itself doesn't seem to have any references to cyclades.
> 
> Furthermore, it seems that the driver is broken in the kernel and
> the firmware doesn't seem to be available anymore.
> 
> As such since this is breaking the build of libsanitizer (and so the
> GCC bootstrap[2]) I propose to remove this.
> 
> [1] https://lkml.org/lkml/2021/3/2/153
> [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100379
> 
> (cherry picked from commit f7c5351552387bd43f6ca3631016d7f0dfe0f135)
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> libsanitizer/ChangeLog:
> 
>   PR sanitizer/100379
>   * sanitizer_common/sanitizer_common_interceptors_ioctl.inc: Remove 
> cyclades.
>   * sanitizer_common/sanitizer_platform_limits_posix.cpp: Likewise.
>   * sanitizer_common/sanitizer_platform_limits_posix.h: Likewise.

The *.inc line is too long.  And furthermore it should read:
PR sanitizer/100379
* sanitizer_common/sanitizer_common_interceptors_ioctl.inc: Cherry-pick
llvm-project revision f7c5351552387bd43f6ca3631016d7f0dfe0f135.
* sanitizer_common/sanitizer_platform_limits_posix.cpp: Likewise.
* sanitizer_common/sanitizer_platform_limits_posix.h: Likewise.

Ok for trunk and release branches with that change.
Thanks.

Jakub



Re: [PATCH] Add no_sanitize_coverage attribute.

2021-05-20 Thread Marco Elver via Gcc-patches
On Thu, 20 May 2021 at 11:08, Martin Liška  wrote:
> On 5/20/21 10:45 AM, Marco Elver wrote:
> > On Thu, 20 May 2021 at 10:33, Martin Liška  wrote:
> >> Hello.
> >>
> >> The patch implements one missing attribute which can be used for 
> >> per-function
> >> disabling of coverage sanitization.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >
> > Thanks for implementing this so quickly. One thing I just have to
> > double check, is if this works with always_inline (not instrumented if
> > inlined into no_sanitize),
>
> No, in this case it's instrumented (if caller has not set the attribute).

This one I'm not sure of, because in too many places, we assume
always_inline behaves like a macro (see below).

> > and inline (not inlined if inline fn is
> > no_sanitize_coverage).
>
> No again, it's not blocking inlining.

I think that's fine, as long as the inlined code is still instrumented
according to the attribute (which I think is the case based on what
you say above).

I think this came up with other no_sanitize [1] based on what I had
written to you last year [2].

[1] https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547618.html
[2] https://lore.kernel.org/lkml/canpmjnnrz5ovkb6pe7k6gjfogbht_zhypkng9ad+kjndzk7...@mail.gmail.com/

> > Just like the other no_sanitize* do. The test doesn't mention them.
>
> It's aligned with similar issue that was discussed some time ago:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722#c9

That's fair, but keep in mind all -fsanitize are not for use in
production, unlike stack protector. no_sanitize_coverage needs to be
aligned with the behavior of other existing no_sanitize*. Of course,
whether to inline or not I don't mind, as long as no_sanitize means
the code is never instrumented as discussed in [2]. AFAIK, it just so
happens that to achieve that for the other sanitizers, the easiest
option was to not inline "inline no_sanitize*" functions.

But always_inline needs to behave in line with other no_sanitize*,
i.e. if inlined into no_sanitize_coverage, never instrument:
https://lore.kernel.org/lkml/CANpmjNMTsY_8241bS7=xafqvzhflrvekv_um4aduwe_kh3r...@mail.gmail.com/
This is also what Clang will do for no_sanitize_coverage.
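
To make the cases concrete, here is a minimal sketch of the situation being
discussed. NO_SANCOV is a hypothetical helper macro that expands to the
attribute only where the compiler reports support for it; the function names
are invented for the example.

```cpp
#include <cassert>

// Hypothetical helper macro: use the attribute only if it is known.
#if defined(__has_attribute)
#  if __has_attribute(no_sanitize_coverage)
#    define NO_SANCOV __attribute__((no_sanitize_coverage))
#  endif
#endif
#ifndef NO_SANCOV
#  define NO_SANCOV
#endif

// Helper that is always inlined into its callers.
static inline __attribute__((always_inline)) int add_one(int x) {
  return x + 1;
}

// Caller that opts out of coverage instrumentation.  The expectation
// stated above is that the inlined body of add_one is not instrumented here.
NO_SANCOV int uninstrumented(int x) { return add_one(x); }

// Ordinary caller: the inlined body of add_one is instrumented as usual
// when building with -fsanitize-coverage=trace-pc.
int instrumented(int x) { return add_one(x); }
```

Both callers compute the same value; the only intended difference is whether
coverage callbacks are emitted, which would have to be checked in the generated
code rather than at run time.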

Thanks,
-- Marco

> Martin
>
> >
> > Thanks,
> > -- Marco
> >
> >> Ready to be installed?
> >> Thanks,
> >> Martin
> >>
> >> gcc/ChangeLog:
> >>
> >>  * asan.h (sanitize_coverage_p): New function.
> >>  * doc/extend.texi: Document it.
> >>  * fold-const.c (fold_range_test): Use sanitize_flags_p
> >>  instead of flag_sanitize_coverage.
> >>  (fold_truth_andor): Likewise.
> >>  * sancov.c: Likewise.
> >>  * tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise.
> >>
> >> gcc/c-family/ChangeLog:
> >>
> >>  * c-attribs.c (handle_no_sanitize_coverage_attribute): New.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>  * gcc.dg/sancov/attribute.c: New test.
> >> ---
> >>gcc/asan.h  | 10 ++
> >>gcc/c-family/c-attribs.c| 20 
> >>gcc/doc/extend.texi |  6 ++
> >>gcc/fold-const.c|  4 ++--
> >>gcc/sancov.c|  4 ++--
> >>gcc/testsuite/gcc.dg/sancov/attribute.c | 15 +++
> >>gcc/tree-ssa-ifcombine.c|  4 +++-
> >>7 files changed, 58 insertions(+), 5 deletions(-)
> >>create mode 100644 gcc/testsuite/gcc.dg/sancov/attribute.c
> >>
> >> diff --git a/gcc/asan.h b/gcc/asan.h
> >> index f110f1db563..8c0b2baf170 100644
> >> --- a/gcc/asan.h
> >> +++ b/gcc/asan.h
> >> @@ -249,4 +249,14 @@ sanitize_flags_p (unsigned int flag, const_tree fn = 
> >> current_function_decl)
> >>  return result_flags;
> >>}
> >>
> >> +/* Return true when coverage sanitization should happen for FN function.  */
> >> +
> >> +static inline bool
> >> +sanitize_coverage_p (const_tree fn = current_function_decl)
> >> +{
> >> +  return (flag_sanitize_coverage
> >> + && lookup_attribute ("no_sanitize_coverage",
> >> +  DECL_ATTRIBUTES (fn)) == NULL_TREE);
> >> +}
> >> +
> >>#endif /* TREE_ASAN */
> >> diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
> >> index ccf9e4ccf0b..671b27c3200 100644
> >> --- a/gcc/c-family/c-attribs.c
> >> +++ b/gcc/c-family/c-attribs.c
> >> @@ -62,6 +62,8 @@ static tree handle_no_address_safety_analysis_attribute 
> >> (tree *, tree, tree,
> >>   int, bool *);
> >>static tree handle_no_sanitize_undefined_attribute (tree *, tree, tree, 
> >> int,
> >>  bool *);
> >> +static tree handle_no_sanitize_coverage_attribute (tree *, tree, tree, 
> >> int,
> >> +  bool *);
> >>static tree handle_asan_odr_indicator_attribute (tree *, tree, tree, 
> >> 

[PATCH][libsanitizer]: Guard cyclades inclusion in sanitizer

2021-05-20 Thread Tamar Christina via Gcc-patches
Hi All,

libsanitizer: Guard cyclades inclusion in sanitizer

The Linux kernel has removed the interface to cyclades from
the latest kernel headers[1] due to them being orphaned for the
past 13 years.

libsanitizer uses this header when compiling against glibc, but
glibc itself doesn't seem to have any references to cyclades.

Furthermore, it seems that the driver is broken in the kernel and
the firmware doesn't seem to be available anymore.

As such since this is breaking the build of libsanitizer (and so the
GCC bootstrap[2]) I propose to remove this.

[1] https://lkml.org/lkml/2021/3/2/153
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100379

(cherry picked from commit f7c5351552387bd43f6ca3631016d7f0dfe0f135)

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

libsanitizer/ChangeLog:

PR sanitizer/100379
* sanitizer_common/sanitizer_common_interceptors_ioctl.inc: Remove 
cyclades.
* sanitizer_common/sanitizer_platform_limits_posix.cpp: Likewise.
* sanitizer_common/sanitizer_platform_limits_posix.h: Likewise.

--- inline copy of patch -- 
diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors_ioctl.inc b/libsanitizer/sanitizer_common/sanitizer_common_interceptors_ioctl.inc
index 7f181258eab52b11688d6857ddfe09acc98ba986..b7da659875574ed3ca457780799d5b4c556d2f68 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors_ioctl.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors_ioctl.inc
@@ -370,15 +370,6 @@ static void ioctl_table_fill() {
 
 #if SANITIZER_GLIBC
   // _(SIOCDEVPLIP, WRITE, struct_ifreq_sz); // the same as EQL_ENSLAVE
-  _(CYGETDEFTHRESH, WRITE, sizeof(int));
-  _(CYGETDEFTIMEOUT, WRITE, sizeof(int));
-  _(CYGETMON, WRITE, struct_cyclades_monitor_sz);
-  _(CYGETTHRESH, WRITE, sizeof(int));
-  _(CYGETTIMEOUT, WRITE, sizeof(int));
-  _(CYSETDEFTHRESH, NONE, 0);
-  _(CYSETDEFTIMEOUT, NONE, 0);
-  _(CYSETTHRESH, NONE, 0);
-  _(CYSETTIMEOUT, NONE, 0);
   _(EQL_EMANCIPATE, WRITE, struct_ifreq_sz);
   _(EQL_ENSLAVE, WRITE, struct_ifreq_sz);
   _(EQL_GETMASTRCFG, WRITE, struct_ifreq_sz);
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
index ad358eef8b77e0f0098af41f30ce3e7084ee4552..cba41ba54943d80da7d557ce52e1a3ee36e3946a 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
@@ -983,7 +983,6 @@ extern unsigned struct_vt_mode_sz;
 
 #if SANITIZER_LINUX && !SANITIZER_ANDROID
 extern unsigned struct_ax25_parms_struct_sz;
-extern unsigned struct_cyclades_monitor_sz;
 extern unsigned struct_input_keymap_entry_sz;
 extern unsigned struct_ipx_config_data_sz;
 extern unsigned struct_kbdiacrs_sz;
@@ -1328,15 +1327,6 @@ extern unsigned IOCTL_VT_WAITACTIVE;
 #endif  // SANITIZER_LINUX
 
 #if SANITIZER_LINUX && !SANITIZER_ANDROID
-extern unsigned IOCTL_CYGETDEFTHRESH;
-extern unsigned IOCTL_CYGETDEFTIMEOUT;
-extern unsigned IOCTL_CYGETMON;
-extern unsigned IOCTL_CYGETTHRESH;
-extern unsigned IOCTL_CYGETTIMEOUT;
-extern unsigned IOCTL_CYSETDEFTHRESH;
-extern unsigned IOCTL_CYSETDEFTIMEOUT;
-extern unsigned IOCTL_CYSETTHRESH;
-extern unsigned IOCTL_CYSETTIMEOUT;
 extern unsigned IOCTL_EQL_EMANCIPATE;
 extern unsigned IOCTL_EQL_ENSLAVE;
 extern unsigned IOCTL_EQL_GETMASTRCFG;
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp
index 35a690cba5c834a6952ebc0d633195155b151447..6e5c330b98eff32e3fd49b4ff95f72a50230c538 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp
@@ -143,7 +143,6 @@ typedef struct user_fpregs elf_fpregset_t;
 # include 
 #endif
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -460,7 +459,6 @@ unsigned struct_ElfW_Phdr_sz = sizeof(Elf_Phdr);
 
 #if SANITIZER_GLIBC
   unsigned struct_ax25_parms_struct_sz = sizeof(struct ax25_parms_struct);
-  unsigned struct_cyclades_monitor_sz = sizeof(struct cyclades_monitor);
 #if EV_VERSION > (0x01)
   unsigned struct_input_keymap_entry_sz = sizeof(struct input_keymap_entry);
 #else
@@ -824,15 +822,6 @@ unsigned struct_ElfW_Phdr_sz = sizeof(Elf_Phdr);
 #endif // SANITIZER_LINUX
 
 #if SANITIZER_LINUX && !SANITIZER_ANDROID
-  unsigned IOCTL_CYGETDEFTHRESH = CYGETDEFTHRESH;
-  unsigned IOCTL_CYGETDEFTIMEOUT = CYGETDEFTIMEOUT;
-  unsigned IOCTL_CYGETMON = CYGETMON;
-  unsigned IOCTL_CYGETTHRESH = CYGETTHRESH;
-  unsigned IOCTL_CYGETTIMEOUT = CYGETTIMEOUT;
-  unsigned IOCTL_CYSETDEFTHRESH = CYSETDEFTHRESH;
-  unsigned IOCTL_CYSETDEFTIMEOUT = CYSETDEFTIMEOUT;
-  unsigned IOCTL_CYSETTHRESH = CYSETTHRESH;
-  unsigned IOCTL_CYSETTIMEOUT = CYSETTIMEOUT;
   unsigned IOCTL_EQL_EMANCIPATE = EQL_EMANCIPATE;
   unsigned IOCTL_EQL_ENSLAVE = 

Re: [PATCH v2] vect: Replace hardcoded weight factor with param

2021-05-20 Thread Kewen.Lin via Gcc-patches
on 2021/5/20 6:25 PM, Richard Biener wrote:
> On Thu, May 20, 2021 at 12:09 PM Kewen.Lin  wrote:
>>
>> on 2021/5/20 5:30 PM, Christophe Lyon wrote:
>>> On Thu, 20 May 2021 at 10:52, Kewen.Lin via Gcc-patches
>>>  wrote:
>>>>
>>>> on 2021/5/19 6:01 PM, Richard Biener wrote:
>>>>> On Wed, May 19, 2021 at 11:47 AM Kewen.Lin  wrote:
>>>>>>
>>>>>> Hi Richi,
>>>>>>
>>>>>> on 2021/5/19 4:15 PM, Richard Biener wrote:
>>>>>>> On Wed, May 19, 2021 at 8:20 AM Kewen.Lin  wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> This patch is to replace the current hardcoded weight factor 50
>>>>>>>> for those statements in an inner loop relative to the loop being
>>>>>>>> vectorized with a specific parameter vect-inner-loop-weight-factor.
>>>>>>>>
>>>>>>>> The motivation behind this change is: if targets want to have one
>>>>>>>> unique function to gather some information in each add_stmt_cost
>>>>>>>> call, no matter that it's put before or after the cost tweaking
>>>>>>>> part for inner loop, it may have the need to adjust (expand or
>>>>>>>> shrink) the gathered data as the factor.  Now the factor is
>>>>>>>> hardcoded, it's not easily maintained.  Since it's possible that
>>>>>>>> targets have their own decisions on this costing like the others,
>>>>>>>> I used parameter instead of one unique macro here.
>>>>>>>>
>>>>>>>> Testing is ongoing, is it ok for trunk if everything goes well?
>>>>>>>
>>>>>>> Certainly an improvement.  I suppose we might want to put
>>>>>>> the factor into vinfo->inner_loop_cost_factor.  That way
>>>>>>> we could adjust it easily in common code in the vectorizer
>>>>>>> when we for example have (non-guessed) profile data.
>>>>>>>
>>>>>>> "weight_factor" is kind-of double-speak and I'm missing 'cost' ...
>>>>>>> so, bike-shedding to vect_inner_loop_cost_factor?
>>>>>>>
>>>>>>> Just suggestions - as said, the patch is an improvement already.
>>>>>>>
>>>>>>
>>>>>> Thanks for your nice suggestions!  I've updated the patch accordingly
>>>>>> and attached it.  Does it look better to you?
>>>>>
>>>>> Minor nit:
>>>>>
>>>>> +@item vect-inner-loop-cost-factor
>>>>> +The factor which loop vectorizer uses to over weight those statements in
>>>>> +an inner loop relative to the loop being vectorized.
>>>>> +
>>>>>
>>>>> the default value should be documented here, not..
>>>>>
>>>>> +-param=vect-inner-loop-cost-factor=
>>>>> +Common Joined UInteger Var(param_vect_inner_loop_cost_factor)
>>>>> Init(50) IntegerRange(1, 99) Param Optimization
>>>>> +Indicates the factor which loop vectorizer uses to over weight those
>>>>> statements in an inner loop relative to the loop being vectorized.
>>>>> The default value is 50.
>>>>> +
>>>>>
>>>>> here (based on statistical analysis of existing cases).  Also the
>>>>> params.opt docs
>>>>> should be the "brief" one - but for simplicity simply make both docs
>>>>> identical
>>>>> (apart from the default value doc).  I suggest
>>>>>
>>>>> "The factor which the loop vectorizer applies to the cost of statements
>>>>> in an inner loop relative to the loop being vectorized."
>>>>>
>>>>
>>>> Thanks for catching this and the suggestion!
>>>>
>>>> Bootstrapped/regtested on powerpc64le-linux-gnu P9, x86_64-redhat-linux
>>>> and aarch64-linux-gnu.
>>>>
>>>
>>> This breaks the build for arm targets:
>>> /tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:
>>> In function 'unsigned int arm_add_stmt_cost(vec_info*, void*, int,
>>> vect_cost_for_stmt, _stmt_v
>>> ec_info*, tree, int, vect_cost_model_location)':
>>> /tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:12230:4:
>>> error: 'loop_vec_info' was not declared in this scope
>>> loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
>>> ^
>>> /tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:12230:18:
>>> error: expected ';' before 'loop_vinfo'
>>> loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
>>>
>>> Can you fix it?
>>>
>>
>> Oops!  Deeply sorry for that and thanks for the testing!
>>
>> I just found that unlike the other targets arm.c doesn't include
>> "tree-vectorizer.h".  The issue should be fixed with the below patch:
>>
>> gcc/ChangeLog:
>>
>> * config/arm/arm.c: Include head files tree-vectorizer.h and
>> cfgloop.h.
>>
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index caf4e56b9fe..6ed34fbf627 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -69,6 +69,8 @@
>>  #include "gimplify.h"
>>  #include "gimple.h"
>>  #include "selftest.h"
>> +#include "cfgloop.h"
>> +#include "tree-vectorizer.h"
>>
>>  /* This file should be included last.  */
>>  #include "target-def.h"
>>
>>
>> Is it counted as an obvious patch?
> 
> Please check if you can build a cc1 cross to arm, then yes.
> 

Thanks for the prompt review!

Yes, it worked to build a cross cc1.  I did a trivial adjustment
to align with the existing include order like other ports by putting
cfgloop.h just after cfghooks.h as below.  Will commit 

[PATCH] Simplify option handling for -fsanitize-coverage

2021-05-20 Thread Martin Liška

This patch simplifies handling of -fsanitize-coverage= by describing its
values as a proper Enum in common.opt, which also improves option completion.
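
As a rough picture of what an Enum-style option amounts to, the sketch below
maps the argument string to a value and rejects anything else. It is a
stand-alone illustration, not GCC's option machinery: lookup_sanitize_coverage
is a hypothetical function and the flag values are assumed for the example.

```cpp
#include <cassert>
#include <cstring>

// Assumed flag values for illustration; the real ones live in GCC's sources.
enum { SANITIZE_COV_TRACE_PC = 1 << 0, SANITIZE_COV_TRACE_CMP = 1 << 1 };

// Hypothetical stand-in for an Enum lookup: map the argument of
// -fsanitize-coverage= to its value, or return -1 if it is unknown.
static int lookup_sanitize_coverage(const char *arg) {
  static const struct { const char *name; int value; } table[] = {
    { "trace-pc", SANITIZE_COV_TRACE_PC },
    { "trace-cmp", SANITIZE_COV_TRACE_CMP },
  };
  for (const auto &entry : table)
    if (std::strcmp(arg, entry.name) == 0)
      return entry.value;
  return -1;
}
```

Because the set of accepted strings is declared in one table, a misspelled
value can be rejected (and a suggestion offered) without a hand-written parser
for this option.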

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* common.opt: Use proper Enum values.
* opts.c (COVERAGE_SANITIZER_OPT): Remove.
(parse_sanitizer_options): Handle only sanitizer_opts.
(common_handle_option): Just assign value.

gcc/testsuite/ChangeLog:

* gcc.dg/spellcheck-options-23.c: New test.
---
 gcc/common.opt   | 11 +-
 gcc/opts.c   | 41 +---
 gcc/testsuite/gcc.dg/spellcheck-options-23.c |  5 +++
 3 files changed, 25 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/spellcheck-options-23.c

diff --git a/gcc/common.opt b/gcc/common.opt
index a75b44ee47e..de92e3e37aa 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1037,9 +1037,18 @@ Common Driver Joined
 Select what to sanitize.
 
 fsanitize-coverage=
-Common Joined
+Common Joined RejectNegative Enum(sanitize_coverage)
 Select type of coverage sanitization.
 
+Enum
+Name(sanitize_coverage) Type(int)
+
+EnumValue
+Enum(sanitize_coverage) String(trace-pc) Value(SANITIZE_COV_TRACE_PC)
+
+EnumValue
+Enum(sanitize_coverage) String(trace-cmp) Value(SANITIZE_COV_TRACE_CMP)
+
 fasan-shadow-offset=
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -fasan-shadow-offset=Use custom shadow memory offset.
diff --git a/gcc/opts.c b/gcc/opts.c
index fe6fddbf095..282da84f286 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1817,17 +1817,6 @@ const struct sanitizer_opts_s sanitizer_opts[] =
   { NULL, 0U, 0UL, false }
 };
 
-/* -f{,no-}sanitize-coverage= suboptions.  */
-const struct sanitizer_opts_s coverage_sanitizer_opts[] =
-{
-#define COVERAGE_SANITIZER_OPT(name, flags) \
-{ #name, flags, sizeof #name - 1, true }
-  COVERAGE_SANITIZER_OPT (trace-pc, SANITIZE_COV_TRACE_PC),
-  COVERAGE_SANITIZER_OPT (trace-cmp, SANITIZE_COV_TRACE_CMP),
-#undef COVERAGE_SANITIZER_OPT
-  { NULL, 0U, 0UL, false }
-};
-
 /* -fzero-call-used-regs= suboptions.  */
 const struct zero_call_used_regs_opts_s zero_call_used_regs_opts[] =
 {
@@ -1878,8 +1867,7 @@ struct edit_distance_traits
 /* Given ARG, an unrecognized sanitizer option, return the best
matching sanitizer option, or NULL if there isn't one.
OPTS is array of candidate sanitizer options.
-   CODE is OPT_fsanitize_, OPT_fsanitize_recover_ or
-   OPT_fsanitize_coverage_.
+   CODE is OPT_fsanitize_ or OPT_fsanitize_recover_.
VALUE is non-zero for the regular form of the option, zero
for the "no-" form (e.g. "-fno-sanitize-recover=").  */
 
@@ -1919,12 +1907,6 @@ parse_sanitizer_options (const char *p, location_t loc, int scode,
 {
   enum opt_code code = (enum opt_code) scode;
 
-  const struct sanitizer_opts_s *opts;
-  if (code == OPT_fsanitize_coverage_)
-opts = coverage_sanitizer_opts;
-  else
-opts = sanitizer_opts;
-
   while (*p != 0)
 {
   size_t len, i;
@@ -1942,11 +1924,12 @@ parse_sanitizer_options (const char *p, location_t loc, int scode,
}
 
   /* Check to see if the string matches an option class name.  */
-  for (i = 0; opts[i].name != NULL; ++i)
-   if (len == opts[i].len && memcmp (p, opts[i].name, len) == 0)
+  for (i = 0; sanitizer_opts[i].name != NULL; ++i)
+   if (len == sanitizer_opts[i].len
+   && memcmp (p, sanitizer_opts[i].name, len) == 0)
  {
/* Handle both -fsanitize and -fno-sanitize cases.  */
-   if (value && opts[i].flag == ~0U)
+   if (value && sanitizer_opts[i].flag == ~0U)
  {
if (code == OPT_fsanitize_)
  {
@@ -1963,14 +1946,14 @@ parse_sanitizer_options (const char *p, location_t loc, int scode,
   -fsanitize-recover=return if -fsanitize-recover=undefined
   is selected.  */
if (code == OPT_fsanitize_recover_
-   && opts[i].flag == SANITIZE_UNDEFINED)
+   && sanitizer_opts[i].flag == SANITIZE_UNDEFINED)
  flags |= (SANITIZE_UNDEFINED
& ~(SANITIZE_UNREACHABLE | SANITIZE_RETURN));
else
- flags |= opts[i].flag;
+ flags |= sanitizer_opts[i].flag;
  }
else
- flags &= ~opts[i].flag;
+ flags &= ~sanitizer_opts[i].flag;
found = true;
break;
  }
@@ -1979,13 +1962,11 @@ parse_sanitizer_options (const char *p, location_t loc, int scode,
{
  const char *hint
= get_closest_sanitizer_option (string_fragment (p, len),
-   opts, code, value);
+   sanitizer_opts, code, value);
 
 	  const char *suffix;

  if (code == OPT_fsanitize_recover_)
suffix = "-recover";

Re: [PATCH v2] vect: Replace hardcoded weight factor with param

2021-05-20 Thread Richard Biener via Gcc-patches
On Thu, May 20, 2021 at 12:09 PM Kewen.Lin  wrote:
>
> on 2021/5/20 5:30 PM, Christophe Lyon wrote:
> > On Thu, 20 May 2021 at 10:52, Kewen.Lin via Gcc-patches
> >  wrote:
> >>
> >> on 2021/5/19 6:01 PM, Richard Biener wrote:
> >>> On Wed, May 19, 2021 at 11:47 AM Kewen.Lin  wrote:
> >>>>
> >>>> Hi Richi,
> >>>>
> >>>> on 2021/5/19 4:15 PM, Richard Biener wrote:
> >>>>> On Wed, May 19, 2021 at 8:20 AM Kewen.Lin  wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> This patch is to replace the current hardcoded weight factor 50
> >>>>>> for those statements in an inner loop relative to the loop being
> >>>>>> vectorized with a specific parameter vect-inner-loop-weight-factor.
> >>>>>>
> >>>>>> The motivation behind this change is: if targets want to have one
> >>>>>> unique function to gather some information in each add_stmt_cost
> >>>>>> call, no matter that it's put before or after the cost tweaking
> >>>>>> part for inner loop, it may have the need to adjust (expand or
> >>>>>> shrink) the gathered data as the factor.  Now the factor is
> >>>>>> hardcoded, it's not easily maintained.  Since it's possible that
> >>>>>> targets have their own decisions on this costing like the others,
> >>>>>> I used parameter instead of one unique macro here.
> >>>>>>
> >>>>>> Testing is ongoing, is it ok for trunk if everything goes well?
> >>>>>
> >>>>> Certainly an improvement.  I suppose we might want to put
> >>>>> the factor into vinfo->inner_loop_cost_factor.  That way
> >>>>> we could adjust it easily in common code in the vectorizer
> >>>>> when we for example have (non-guessed) profile data.
> >>>>>
> >>>>> "weight_factor" is kind-of double-speak and I'm missing 'cost' ...
> >>>>> so, bike-shedding to vect_inner_loop_cost_factor?
> >>>>>
> >>>>> Just suggestions - as said, the patch is an improvement already.
> >>>>>
> >>>>
> >>>> Thanks for your nice suggestions!  I've updated the patch accordingly
> >>>> and attached it.  Does it look better to you?
> >>>
> >>> Minor nit:
> >>>
> >>> +@item vect-inner-loop-cost-factor
> >>> +The factor which loop vectorizer uses to over weight those statements in
> >>> +an inner loop relative to the loop being vectorized.
> >>> +
> >>>
> >>> the default value should be documented here, not..
> >>>
> >>> +-param=vect-inner-loop-cost-factor=
> >>> +Common Joined UInteger Var(param_vect_inner_loop_cost_factor)
> >>> Init(50) IntegerRange(1, 99) Param Optimization
> >>> +Indicates the factor which loop vectorizer uses to over weight those
> >>> statements in an inner loop relative to the loop being vectorized.
> >>> The default value is 50.
> >>> +
> >>>
> >>> here (based on statistical analysis of existing cases).  Also the
> >>> params.opt docs
> >>> should be the "brief" one - but for simplicity simply make both docs 
> >>> identical
> >>> (apart from the default value doc).  I suggest
> >>>
> >>> "The factor which the loop vectorizer applies to the cost of statements
> >>> in an inner loop relative to the loop being vectorized."
> >>>
> >>
> >> Thanks for catching this and the suggestion!
> >>
> >> Bootstrapped/regtested on powerpc64le-linux-gnu P9, x86_64-redhat-linux
> >> and aarch64-linux-gnu.
> >>
> >
> > This breaks the build for arm targets:
> > /tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c: In function 'unsigned int arm_add_stmt_cost(vec_info*, void*, int, vect_cost_for_stmt, _stmt_vec_info*, tree, int, vect_cost_model_location)':
> > /tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:12230:4: error: 'loop_vec_info' was not declared in this scope
> > loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> > ^
> > /tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:12230:18: error: expected ';' before 'loop_vinfo'
> > loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> >
> > Can you fix it?
> >
>
> Oops!  Deeply sorry for that and thanks for the testing!
>
> I just found that unlike the other targets arm.c doesn't include
> "tree-vectorizer.h".  The issue should be fixed with the below patch:
>
> gcc/ChangeLog:
>
> * config/arm/arm.c: Include header files tree-vectorizer.h and
> cfgloop.h.
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index caf4e56b9fe..6ed34fbf627 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -69,6 +69,8 @@
>  #include "gimplify.h"
>  #include "gimple.h"
>  #include "selftest.h"
> +#include "cfgloop.h"
> +#include "tree-vectorizer.h"
>
>  /* This file should be included last.  */
>  #include "target-def.h"
>
>
> Does it count as an obvious patch?

Please check if you can build a cc1 cross to arm, then yes.

> BR,
> Kewen


Re: [RFC] Using main loop's updated IV as base_address for epilogue vectorization

2021-05-20 Thread Richard Biener
On Mon, 17 May 2021, Andre Vieira (lists) wrote:

> Hi,
> 
> So this is my second attempt at finding a way to improve how we generate the
> vector IV's and teach the vectorizer to share them between main loop and
> epilogues. On IRC we discussed my idea to use the loop's control_iv, but that
> was a terrible idea and I quickly threw it in the bin. The main problem, that
> for some reason I failed to see, was that the control_iv increases by 's' and
> the datarefs by 's' * NELEMENTS, where 's' is usually 1 and NELEMENTS is the
> number of elements we handle per iteration. That means the epilogue loops
> would have to start from the last loop's IV * the last loop's NELEMENTS, and
> that would just cause a mess.
> 
> Instead I started to think about creating IV's for the datarefs and what I
> thought worked best was to create these in scalar before peeling. That way the
> peeling mechanism takes care of the duplication of these for the vector and
> scalar epilogues and it also takes care of adding phi-nodes for the
> skip_vector paths.

How does this work for if-converted loops where we use the 
non-if-converted scalar loop for (scalar) peeling but the
if-converted scalar loop for vectorized epilogues?  I suppose
you're only adjusting the if-converted copy.
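
To make the IV discussion above concrete, here is a hand-written C sketch.
It is not vectorizer output and not code from the patch: the main loop
advances one data pointer by NELEMENTS per iteration, and the scalar
epilogue continues from that updated pointer instead of recomputing
base + i.  The names sum_shared_iv and NELEMENTS (fixed to 4 here) are
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the number of elements that one
   vector iteration handles.  */
#define NELEMENTS 4

/* Sum N ints.  The main loop mimics a vectorized body: it consumes
   NELEMENTS elements per iteration and bumps the data pointer P.
   The epilogue then reuses P as its base address.  */
long
sum_shared_iv (const int *x, size_t n)
{
  assert (x != NULL);

  const int *p = x;        /* data-reference IV */
  const int *end = x + n;
  long sum = 0;

  /* "Vector" main loop: the pointer steps by NELEMENTS.  */
  while ((size_t) (end - p) >= NELEMENTS)
    {
      for (int lane = 0; lane < NELEMENTS; lane++)
        sum += p[lane];
      p += NELEMENTS;
    }

  /* Scalar epilogue: starts from the main loop's updated P.  */
  while (p < end)
    sum += *p++;

  return sum;
}
```

Calling sum_shared_iv on a 10-element array runs the main loop twice and
leaves two elements for the epilogue, which picks up exactly where the
main loop's pointer stopped.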

> These new IV's have two functions:
> 1) 'vect_create_data_ref_ptr' can use them to:
>  a) if it's the main loop: replace the values of the 'initial' value of the
> main loop's IV and the initial values in the skip_vector phi-nodes
>  b) Update the skip_vector phi-nodes argument for the non-skip path with
> the updated vector ptr.

b) means the prologue IV will not be dead there so we actually need
to compute it?  I suppose IVOPTs could be taught to replace an
IV with its final value (based on some other IV) when it's unused?
Or does it already magically do good?

> 2) They are used for the scalar epilogue ensuring they share the same
> datareference ptr.
> 
> There are still a variety of 'hacky' elements here and a lot of testing to be
> done, but I hope to be able to clean them away. One of the main issues I had
> was that I had to skip a couple of checks and things for the added phi-nodes
> and update statements as these do not have stmt_vec_info representation. 
> Though I'm not sure adding this representation at their creation was much
> cleaner... It is something I could play around with but I thought this was a
> good moment to ask you for input. For instance, maybe we could do this
> transformation before analysis?
> 
> Also be aware that because I create an IV for each dataref this leads to
> regressions with SVE codegen for instance. NEON is able to use the post-index
> addressing mode to increase each dr IV at access time, but SVE can't do this. 
> For this I don't know if maybe we could try to be smart and create shared
> IV's. So rather than make them based on the actual vector ptr, use a shared
> sizetype IV that can be shared among dr IV's with the same step. Or maybe this
> is something for IVOPTs?

Certainly IVOPTs could decide to use the newly created IVs in the
scalar loops for the DRs therein as well.  But since IVOPTs only
considers a single loop at a time it will probably not pay too
much attention and is only influenced by the out-of-loop uses of
the final values of the IVs.

My gut feeling tells me that whatever we do we'll have to look
into improving IVOPTs to consider multiple loops.

> Let me know what ya think!

Now as to the patch itself.  We shouldn't amend struct data_reference,
try to use dr_vec_info instead.

--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -4941,11 +4941,14 @@ vect_create_data_ref_ptr (vec_info *vinfo, 
stmt_vec_info stmt_info,
}

   standard_iv_increment_position (loop, &incr_gsi, &insert_after);
+  gphi *update_phi
+   = as_a <gphi *> (SSA_NAME_DEF_STMT (*dr->iv_bases->get (loop)));


and avoid this by having some explicit representation of IVs.

iv_bases is a map that exists for each DR and maps the loop to the
IV def it seems.  I'd have added the map to loop_vinfo, mapping
the (vect-)DR to the IV [def].

That probably makes the convenience of transforming the scalar
loop before peeling go away, but I wonder whether that's a good
idea anyway.

@@ -9484,6 +9480,51 @@ vect_transform_loop (loop_vec_info loop_vinfo, 
gimple *loop_vectorized_call)
   tree advance;
   drs_init_vec orig_drs_init;

+  if (LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) == NULL)
+{
+  struct data_reference *dr;
+  unsigned int i;
+  gimple_seq stmts;
+  gimple_stmt_iterator gsi = gsi_for_stmt (get_loop_exit_condition 
(loop));
+
+  FOR_EACH_VEC_ELT (LOOP_VINFO_DATAREFS (loop_vinfo), i, dr) {
+ tree _base = dr->iv_bases->get_or_insert (loop);

there's a comment missing - does this try to replace the
IVs used by the scalar DRs in the non-if-converted loop by
the new artificial IVs?

I notice that even for

void foo (int * __restrict x, int *y, int n1, int n2)
{
  int i;
  for (i = 0; i < 

Re: [PATCH v2] vect: Replace hardcoded weight factor with param

2021-05-20 Thread Kewen.Lin via Gcc-patches
on 2021/5/20 5:30 PM, Christophe Lyon wrote:
> On Thu, 20 May 2021 at 10:52, Kewen.Lin via Gcc-patches
>  wrote:
>>
>> on 2021/5/19 6:01 PM, Richard Biener wrote:
>>> On Wed, May 19, 2021 at 11:47 AM Kewen.Lin  wrote:

>>>> Hi Richi,
>>>>
>>>> on 2021/5/19 4:15 PM, Richard Biener wrote:
>>>>> On Wed, May 19, 2021 at 8:20 AM Kewen.Lin  wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This patch is to replace the current hardcoded weight factor 50
>>>>>> for those statements in an inner loop relative to the loop being
>>>>>> vectorized with a specific parameter vect-inner-loop-weight-factor.
>>>>>>
>>>>>> The motivation behind this change is: if targets want to have one
>>>>>> unique function to gather some information in each add_stmt_cost
>>>>>> call, no matter that it's put before or after the cost tweaking
>>>>>> part for inner loop, it may have the need to adjust (expand or
>>>>>> shrink) the gathered data as the factor.  Now the factor is
>>>>>> hardcoded, it's not easily maintained.  Since it's possible that
>>>>>> targets have their own decisions on this costing like the others,
>>>>>> I used parameter instead of one unique macro here.
>>>>>>
>>>>>> Testing is ongoing, is it ok for trunk if everything goes well?
>>>>>
>>>>> Certainly an improvement.  I suppose we might want to put
>>>>> the factor into vinfo->inner_loop_cost_factor.  That way
>>>>> we could adjust it easily in common code in the vectorizer
>>>>> when we for example have (non-guessed) profile data.
>>>>>
>>>>> "weight_factor" is kind-of double-speak and I'm missing 'cost' ...
>>>>> so, bike-shedding to vect_inner_loop_cost_factor?
>>>>>
>>>>> Just suggestions - as said, the patch is an improvement already.
>>>>>
>>>>
>>>> Thanks for your nice suggestions!  I've updated the patch accordingly
>>>> and attached it.  Does it look better to you?
>>>
>>> Minor nit:
>>>
>>> +@item vect-inner-loop-cost-factor
>>> +The factor which loop vectorizer uses to over weight those statements in
>>> +an inner loop relative to the loop being vectorized.
>>> +
>>>
>>> the default value should be documented here, not..
>>>
>>> +-param=vect-inner-loop-cost-factor=
>>> +Common Joined UInteger Var(param_vect_inner_loop_cost_factor)
>>> Init(50) IntegerRange(1, 99) Param Optimization
>>> +Indicates the factor which loop vectorizer uses to over weight those
>>> statements in an inner loop relative to the loop being vectorized.
>>> The default value is 50.
>>> +
>>>
>>> here (based on statistical analysis of existing cases).  Also the
>>> params.opt docs
>>> should be the "brief" one - but for simplicity simply make both docs 
>>> identical
>>> (apart from the default value doc).  I suggest
>>>
>>> "The factor which the loop vectorizer applies to the cost of statements
>>> in an inner loop relative to the loop being vectorized."
>>>
>>
>> Thanks for catching this and the suggestion!
>>
>> Bootstrapped/regtested on powerpc64le-linux-gnu P9, x86_64-redhat-linux
>> and aarch64-linux-gnu.
>>
> 
> This breaks the build for arm targets:
> /tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c: In function 'unsigned int arm_add_stmt_cost(vec_info*, void*, int, vect_cost_for_stmt, _stmt_vec_info*, tree, int, vect_cost_model_location)':
> /tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:12230:4: error: 'loop_vec_info' was not declared in this scope
> loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> ^
> /tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:12230:18: error: expected ';' before 'loop_vinfo'
> loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> 
> Can you fix it?
> 

Oops!  Deeply sorry for that and thanks for the testing!

I just found that unlike the other targets arm.c doesn't include
"tree-vectorizer.h".  The issue should be fixed with the below patch:

gcc/ChangeLog:

* config/arm/arm.c: Include header files tree-vectorizer.h and
cfgloop.h.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index caf4e56b9fe..6ed34fbf627 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -69,6 +69,8 @@
 #include "gimplify.h"
 #include "gimple.h"
 #include "selftest.h"
+#include "cfgloop.h"
+#include "tree-vectorizer.h"

 /* This file should be included last.  */
 #include "target-def.h"


Does it count as an obvious patch?

BR,
Kewen
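
For readers skimming the thread, the effect of the factor can be sketched
in a few lines of C.  This is an illustration only, not the code from the
patch: scaled_stmt_cost and its parameters are made-up names, and the only
facts taken from the thread are the default of 50 and the 1..99 range
quoted from the params.opt entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: cost of COUNT copies of a statement whose unit
   cost is STMT_COST.  Statements in an inner loop (relative to the loop
   being vectorized) are weighted by INNER_LOOP_COST_FACTOR.  */
unsigned
scaled_stmt_cost (unsigned count, unsigned stmt_cost,
                  bool in_inner_loop, unsigned inner_loop_cost_factor)
{
  /* The quoted params.opt entry restricts the factor to 1..99.  */
  assert (inner_loop_cost_factor >= 1 && inner_loop_cost_factor <= 99);

  if (in_inner_loop)
    count *= inner_loop_cost_factor;

  return count * stmt_cost;
}
```

With the default factor of 50, two inner-loop statements of unit cost 3
are costed at 300 instead of 6; a target that gathers its own per-statement
data would have to scale that data by the same factor to stay consistent.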


Re: [PATCH][libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end

2021-05-20 Thread Thomas Schwinge
Hi Tom!

First, thanks for looking into this PR99555!


I can't comment on the OpenMP/nvptx changes, so just the following:

On 2021-04-23T18:48:01+0200, Tom de Vries  wrote:
> --- a/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
> +++ b/libgomp/testsuite/libgomp.fortran/task-detach-6.f90
> @@ -1,6 +1,5 @@
>  ! { dg-do run }
>
> -! { dg-additional-sources on_device_arch.c }
>  ! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }

Please remove the 'dg-prune-output', too.  ;-)


Your changes leave
'libgomp/testsuite/lib/libgomp.exp:check_effective_target_offload_device_nvptx',
'libgomp/testsuite/libgomp.c-c++-common/on_device_arch.h',
'libgomp/testsuite/libgomp.fortran/on_device_arch.c' unused.  Should we
keep those for a potential future use (given that they've been tested to
work) or remove (as now unused, danger of bit-rot)?


Regards
 Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf


[PATCH 1/2, rs6000] Remove mode promotion for pseudos

2021-05-20 Thread HAO CHEN GUI via Gcc-patches

Hi,

   This patch is preparation for patch 2, which removes mode promotion for
pseudos.


   The attachments are the patch diff and change log file.

    Bootstrapped and tested on powerpc64le-linux and powerpc64-linux 
(with both m32 and m64) with no regressions. Is this okay for trunk? Any 
recommendations? Thanks a lot.


* config/rs6000/rs6000-call.c (rs6000_promote_function_mode):
Replace PROMOTE_MODE macro with its content.
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index f5676255387..dca139b2ecf 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -6646,7 +6646,9 @@ rs6000_promote_function_mode (const_tree type 
ATTRIBUTE_UNUSED,
                               int *punsignedp ATTRIBUTE_UNUSED,
                               const_tree, int for_return ATTRIBUTE_UNUSED)
 {
-  PROMOTE_MODE (mode, *punsignedp, type);
+  if (GET_MODE_CLASS (mode) == MODE_INT
+      && GET_MODE_SIZE (mode) < (TARGET_32BIT ? 4 : 8))
+    mode = TARGET_32BIT ? SImode : DImode;

   return mode;
 }
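
The promotion rule that the hunk above writes out inline can be restated
as a small standalone C sketch.  This is not GCC code: the enum, the
function name promoted_int_size and the use of byte sizes in place of
machine modes are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for GET_MODE_CLASS results.  */
enum mode_class_sketch { CLASS_INT, CLASS_FLOAT };

/* Return the size in bytes that a value of class CLS and size SIZE is
   promoted to: integer values narrower than a word are widened to the
   word size (4 bytes on a 32-bit target, 8 otherwise); everything else
   is left alone.  */
unsigned
promoted_int_size (enum mode_class_sketch cls, unsigned size,
                   bool target_32bit)
{
  unsigned word = target_32bit ? 4 : 8;

  assert (size > 0);

  if (cls == CLASS_INT && size < word)
    return word;
  return size;
}
```

A 2-byte integer thus becomes 8 bytes wide on a 64-bit target and 4 bytes
wide on a 32-bit one, while a 4-byte float is never touched.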


[PATCH 2/2, rs6000] Remove mode promotion for pseudos

2021-05-20 Thread HAO CHEN GUI via Gcc-patches

Hi,

   The patch removes mode promotion for pseudos on rs6000 target.

   The attachments are the patch diff and change log file.

    Bootstrapped and tested on powerpc64le-linux and powerpc64-linux 
(with both m32 and m64) with no regressions. Is this okay for trunk? Any 
recommendations? Thanks a lot.


rs6000 has instructions that can do almost everything 32 bit
at least as efficiently as the corresponding 64 bit things. The
mode promotion can be deferred until a wide mode is necessary.
So it helps a lot to not promote modes for pseudos. A SPECint run
shows that the overall performance improvement (by geomean) is
more than 2% with this patch.
testsuite/gcc.target/powerpc/not-promote-mode.c illustrates how
the patch eliminates the redundant extensions and enables further
optimization by disabling mode promotion for pseudos.

* config/rs6000/rs6000.h (PROMOTE_MODE): Remove.
* testsuite/gcc.target/powerpc/not-promote-mode.c: New.
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 233a92baf3c..f6485fcacf5 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -666,17 +666,6 @@ extern unsigned char rs6000_recip_bits[];
 
 /* Target machine storage layout.  */
 
-/* Define this macro if it is advisable to hold scalars in registers
-   in a wider mode than that declared by the program.  In such cases,
-   the value is constrained to be within the bounds of the declared
-   type, but kept valid in the wider mode.  The signedness of the
-   extension may differ from that of the type.  */
-
-#define PROMOTE_MODE(MODE,UNSIGNEDP,TYPE)  \
-  if (GET_MODE_CLASS (MODE) == MODE_INT\
-  && GET_MODE_SIZE (MODE) < (TARGET_32BIT ? 4 : 8)) \
-(MODE) = TARGET_32BIT ? SImode : DImode;
-
 /* Define this if most significant bit is lowest numbered
in instructions that operate on numbered bit-fields.  */
 /* That is true on RS/6000.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/not-promote-mode.c 
b/gcc/testsuite/gcc.target/powerpc/not-promote-mode.c
new file mode 100644
index 000..4d29ebe8b87
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/not-promote-mode.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-options "-O2" } */
+
+extern void bar ();
+
+void foo ()
+{
+  int i;
+  for (i = 0; i < 1; i++)
+bar ();
+}
+
+/* { dg-final { scan-assembler-not   {\mrldicl\M} } } */
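
As a rough illustration of why promoted pseudos cost extra extensions,
here is a C sketch of the invariant that the removed macro's comment
describes (the value is kept valid in the wider mode).  It is my own
example, not taken from the patch; the function names are hypothetical,
and it assumes the extension in question is a zero extension of the kind
rldicl can perform by clearing the upper 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* A and B are 32-bit unsigned values held zero-extended in 64-bit
   "registers".  To keep the result valid in the wide form, the sum
   has to be re-extended after the 64-bit add.  */
uint64_t
add_promoted (uint64_t a, uint64_t b)
{
  assert (a <= UINT32_MAX && b <= UINT32_MAX);

  uint64_t wide = a + b;
  return wide & UINT32_MAX;   /* the extra extension step */
}

/* The same operation when the value simply stays 32 bits wide:
   wrap-around is implicit and no separate extension is needed.  */
uint32_t
add_narrow (uint32_t a, uint32_t b)
{
  return a + b;
}
```

Both functions agree on every input, for example 0xFFFFFFFF + 1 gives 0
in each case; the difference is only the additional masking step in the
promoted version.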


Re: [PATCH v2] forwprop: Support vec perm fed by CTOR and CTOR/CST [PR99398]

2021-05-20 Thread Richard Biener
On Thu, 13 May 2021, Kewen.Lin wrote:

> Hi Richi,
> 
> Thanks for the review!
> 
> on 2021/5/11 9:26 PM, Richard Biener wrote:
> > On Fri, 7 May 2021, Kewen.Lin wrote:
> > 
> >> Hi, 
> >>
> >> This patch teaches forwprop to optimize some cases where the
> >> permuted operands of a vector permutation come from two CTORs of
> >> the same type, or from one CTOR and one VECTOR CST.  It aggressively
> >> treats VIEW_CONVERT_EXPRs as trivial copies and transforms the vector
> >> permutation into a vector CTOR.
> >>
> >> Bootstrapped/regtested on powerpc64le-linux-gnu P9, powerpc64-linux-gnu P8,
> >> x86_64-redhat-linux and aarch64-linux-gnu.
> >>
> >> Is it ok for trunk?
> > 
> > Can you please avoid the changes to get_prop_source_stmt and
> > can_propagate_from?  
> 
> Sure!  Do you mean that we need to keep those functions as pure as
> possible?  I meant to reuse the single use check in the call.

Yeah, I'd like to get rid of them eventually ...

> > It should work to add a single match
> > of a V_C_E after the get_prop_source_stmt call.  Ideally
> > we'd have
> > 
> >   /* Shuffle of a constructor.  */
> >   else if (code == CONSTRUCTOR || code == VECTOR_CST)
> > {
> > ...
> > }
> >   else if (code == VIEW_CONVERT_EXPR)
> > {
> >op1 must also be a V_C_E or VECTOR_CST here
> > }
> > 
> > but then I fear we have no canonicalization of the VECTOR_CST
> > to 2nd VEC_PERM operand.  But then moving the op1 gathering
> > out of the if (code == CONSTRUCTOR || code == VECTOR_CST)
> > case (doesn't need an else) might still make such refactoring
> > possible as first matching
> > 
> >   if (code == VIEW_CONVERT_EXPR || code2 == VIEW_CONVERT_EXPR)
> >{
> > ...
> >   }
> >   else if (code == CONSTRUCTOR || code == VECTOR_CST)
> > ...
> > 
> 
> 
> The attached patch v2 uses that structure, following the above
> advice, and the (code == CONSTRUCTOR || code == VECTOR_CST) part
> can be shared with the VIEW_CONVERT_EXPR handling as below:
> 
>   op0 gathering (leave V_C_E in code if it's met)  
> 
>   else if (code == CONSTRUCTOR || code == VECTOR_CST || code == VIEW_CONVERT_EXPR)
> {
>op1 gathering (leave V_C_E in code2)
>
>if (code == VIEW_CONVERT_EXPR || code2 == VIEW_CONVERT_EXPR)
>  do the tricks on arg0/arg1/op2
> 
>the previous handlings on CONSTRUCTOR/VECTOR_CST
> }
> 
> Also updated "shrinked" to "shrunk" as Segher pointed out.  :-)
> 
> Does it look better now?

Yes.  The forwprop changes are OK - I'd still like Richard to
review the vec-perm-indices change.

Thanks, and sorry for the delay,
Richard.

> Bootstrapped/regtested on powerpc64le-linux-gnu P9 btw.
> 
> BR,
> Kewen
> -
> gcc/ChangeLog:
> 
>   PR tree-optimization/99398
>   * tree-ssa-forwprop.c (simplify_permutation): Optimize some cases
>   where the fed operands are CTOR/CST and propagated through
>   VIEW_CONVERT_EXPR.  Call vec_perm_indices::new_shrunk_vector.
>   * vec-perm-indices.c (vec_perm_indices::new_shrunk_vector): New
>   function.
>   * vec-perm-indices.h (vec_perm_indices::new_shrunk_vector): New
>   declare.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/99398
>   * gcc.target/powerpc/vec-perm-ctor-run.c: New test.
>   * gcc.target/powerpc/vec-perm-ctor.c: New test.
>   * gcc.target/powerpc/vec-perm-ctor.h: New test.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
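
For readers unfamiliar with the transformation discussed in this thread,
a plain C model of a two-input vector permutation may help.  It is only a
sketch under my own assumptions: vec_perm and NELTS are hypothetical
names, arrays stand in for vectors, and nothing here is taken from the
forwprop patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed vector length for the sketch.  */
#define NELTS 4

/* Model of a two-input permutation: element I of OUT is picked from the
   concatenation of A and B by selector SEL[I], reduced modulo
   2 * NELTS.  */
void
vec_perm (const int *a, const int *b, const unsigned *sel, int *out)
{
  assert (a != NULL && b != NULL && sel != NULL && out != NULL);

  for (size_t i = 0; i < NELTS; i++)
    {
      unsigned idx = sel[i] % (2 * NELTS);
      out[i] = idx < NELTS ? a[idx] : b[idx - NELTS];
    }
}
```

When both inputs are built from known elements, say {1, 2, 3, 4} and
{10, 20, 30, 40} with selector {0, 5, 2, 7}, every output element can be
named up front, so the permutation can be written directly as the
constructor {1, 20, 3, 40}.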


Re: [PATCH,V3 1/2] opts: change write_symbols to support bitmasks

2021-05-20 Thread Richard Biener via Gcc-patches
On Thu, May 13, 2021 at 12:53 AM Indu Bhagat via Gcc-patches
 wrote:
>
> [No changes from V2]
>
> To support multiple debug formats, we need to move away from explicit
> enumeration of each individual combination of debug formats.

OK.

Thanks,
Richard.
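
As a small illustration of what moving from one enumerator per
combination to bitmasks buys, here is a C sketch.  The macro names follow
the ChangeLog below, but the bit positions, ALL_DEBUG_BITS and
count_formats are assumptions made for this example and need not match
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit assignments are hypothetical; only the names come from the
   ChangeLog.  */
#define NO_DEBUG             0U
#define DBX_DEBUG            (1U << 0)
#define DWARF2_DEBUG         (1U << 1)
#define XCOFF_DEBUG          (1U << 2)
#define VMS_DEBUG            (1U << 3)
/* A combination is just the OR of its parts.  */
#define VMS_AND_DWARF2_DEBUG (VMS_DEBUG | DWARF2_DEBUG)

#define ALL_DEBUG_BITS \
  (DBX_DEBUG | DWARF2_DEBUG | XCOFF_DEBUG | VMS_DEBUG)

/* Hypothetical helper: number of debug formats enabled in SET.  */
unsigned
count_formats (uint32_t set)
{
  assert ((set & ~ALL_DEBUG_BITS) == 0);

  unsigned n = 0;
  for (; set != 0; set &= set - 1)   /* clear the lowest set bit */
    n++;
  return n;
}
```

With this layout a new pairing of formats needs no new enumerator, and a
test such as (write_symbols & DWARF2_DEBUG) works for any combination
that includes DWARF.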

> gcc/c-family/ChangeLog:
>
> * c-opts.c (c_common_post_options): Adjust access to debug_type_names.
> * c-pch.c (struct c_pch_validity): Use type uint32_t.
> (pch_init): Renamed member.
> (c_common_valid_pch): Adjust access to debug_type_names.
>
> gcc/ChangeLog:
>
> * common.opt: Change type to support bitmasks.
> * flag-types.h (enum debug_info_type): Rename enumerator constants.
> (NO_DEBUG): New bitmask.
> (DBX_DEBUG): Likewise.
> (DWARF2_DEBUG): Likewise.
> (XCOFF_DEBUG): Likewise.
> (VMS_DEBUG): Likewise.
> (VMS_AND_DWARF2_DEBUG): Likewise.
> * flags.h (debug_set_to_format): New function declaration.
> (debug_set_count): Likewise.
> (debug_set_names): Likewise.
> * opts.c (debug_type_masks): Array of bitmasks for debug formats.
> (debug_set_to_format): New function definition.
> (debug_set_count): Likewise.
> (debug_set_names): Likewise.
> (set_debug_level): Update access to debug_type_names.
> * toplev.c: Likewise.
>
> gcc/objc/ChangeLog:
>
> * objc-act.c (synth_module_prologue): Use uint32_t instead of enum
> debug_info_type.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/pch/valid-1.c: Adjust diagnostic message in testcase.
> * lib/dg-pch.exp: Adjust diagnostic message.
> ---
>  gcc/c-family/c-opts.c              |   7 ++-
>  gcc/c-family/c-pch.c               |  12 ++--
>  gcc/common.opt                     |   2 +-
>  gcc/flag-types.h                   |  29 +++---
>  gcc/flags.h                        |  17 +-
>  gcc/objc/objc-act.c                |   2 +-
>  gcc/opts.c                         | 109 +
>  gcc/testsuite/gcc.dg/pch/valid-1.c |   2 +-
>  gcc/testsuite/lib/dg-pch.exp       |   4 +-
>  gcc/toplev.c                       |   9 ++-
>  10 files changed, 157 insertions(+), 36 deletions(-)
>
> diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
> index 89e05a4..60b5802 100644
> --- a/gcc/c-family/c-opts.c
> +++ b/gcc/c-family/c-opts.c
> @@ -1112,9 +1112,10 @@ c_common_post_options (const char **pfilename)
>   /* Only -g0 and -gdwarf* are supported with PCH, for other
>  debug formats we warn here and refuse to load any PCH files.  */
>   if (write_symbols != NO_DEBUG && write_symbols != DWARF2_DEBUG)
> -   warning (OPT_Wdeprecated,
> -"the %qs debug format cannot be used with "
> -"pre-compiled headers", debug_type_names[write_symbols]);
> + warning (OPT_Wdeprecated,
> +  "the %qs debug info cannot be used with "
> +  "pre-compiled headers",
> +  debug_set_names (write_symbols & ~DWARF2_DEBUG));
> }
>else if (write_symbols != NO_DEBUG && write_symbols != DWARF2_DEBUG)
> c_common_no_more_pch ();
> diff --git a/gcc/c-family/c-pch.c b/gcc/c-family/c-pch.c
> index fd94c37..8f0f760 100644
> --- a/gcc/c-family/c-pch.c
> +++ b/gcc/c-family/c-pch.c
> @@ -52,7 +52,7 @@ enum {
>
>  struct c_pch_validity
>  {
> -  unsigned char debug_info_type;
> +  uint32_t pch_write_symbols;
>signed char match[MATCH_SIZE];
>void (*pch_init) (void);
>size_t target_data_length;
> @@ -108,7 +108,7 @@ pch_init (void)
>pch_outfile = f;
>
>memset (&v, '\0', sizeof (v));
> -  v.debug_info_type = write_symbols;
> +  v.pch_write_symbols = write_symbols;
>{
>  size_t i;
>  for (i = 0; i < MATCH_SIZE; i++)
> @@ -252,13 +252,13 @@ c_common_valid_pch (cpp_reader *pfile, const char 
> *name, int fd)
>/* The allowable debug info combinations are that either the PCH file
>   was built with the same as is being used now, or the PCH file was
>   built for some kind of debug info but now none is in use.  */
> -  if (v.debug_info_type != write_symbols
> +  if (v.pch_write_symbols != write_symbols
>&& write_symbols != NO_DEBUG)
>  {
>cpp_warning (pfile, CPP_W_INVALID_PCH,
> -  "%s: created with -g%s, but used with -g%s", name,
> -  debug_type_names[v.debug_info_type],
> -  debug_type_names[write_symbols]);
> +  "%s: created with '%s' debug info, but used with '%s'", name,
> +  debug_set_names (v.pch_write_symbols),
> +  debug_set_names (write_symbols));
>return 2;
>  }
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index a75b44e..ffb968d 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -109,7 +109,7 @@ bool exit_after_options
>  ; flag-types.h for the definitions of the different 

Re: [PATCH,V3 2/2] dwarf: new dwarf_debuginfo_p predicate

2021-05-20 Thread Richard Biener via Gcc-patches
On Thu, May 13, 2021 at 12:52 AM Indu Bhagat via Gcc-patches
 wrote:
>
> [Changes from V2]
>   - Tested build (make all-gcc) of cross compiler for target triplets
> containing c6x/mips/powerpc and darwin/cygwin.
> [End of changes from V2]
>
> This patch introduces a dwarf_debuginfo_p predicate that abstracts and
> replaces complex checks on write_symbols.

OK.

Thanks,
Richard.
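
To show the kind of check the new predicate hides, here is a C sketch
that compares the open-coded test removed in the hunks below with a
single bit test.  It is an assumption-laden illustration, not the
definition from opts.c: the bit values and the function names
old_style_check, dwarf_enabled_sketch and check_agreement are all made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for the sketch.  */
#define NO_DEBUG             0U
#define DWARF2_DEBUG         (1U << 1)
#define VMS_DEBUG            (1U << 3)
#define VMS_AND_DWARF2_DEBUG (VMS_DEBUG | DWARF2_DEBUG)

/* The open-coded form that callers used to spell out.  */
bool
old_style_check (uint32_t write_symbols)
{
  return (write_symbols == DWARF2_DEBUG
          || write_symbols == VMS_AND_DWARF2_DEBUG);
}

/* A predicate in the spirit of dwarf_debuginfo_p: one bit test.  */
bool
dwarf_enabled_sketch (uint32_t write_symbols)
{
  return (write_symbols & DWARF2_DEBUG) != 0;
}

/* The two forms agree on the values the old checks named.  */
void
check_agreement (void)
{
  assert (old_style_check (NO_DEBUG) == dwarf_enabled_sketch (NO_DEBUG));
  assert (old_style_check (DWARF2_DEBUG)
          == dwarf_enabled_sketch (DWARF2_DEBUG));
  assert (old_style_check (VMS_AND_DWARF2_DEBUG)
          == dwarf_enabled_sketch (VMS_AND_DWARF2_DEBUG));
}
```

The benefit for callers is that a future combination containing DWARF
needs no change at each call site, only (at most) in the predicate.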


> gcc/c-family/ChangeLog:
>
> * c-lex.c (init_c_lex): Use dwarf_debuginfo_p.
>
> gcc/ChangeLog:
>
> * config/c6x/c6x.c (c6x_output_file_unwind): Use dwarf_debuginfo_p.
> * config/darwin.c (darwin_override_options): Likewise.
> * config/i386/cygming.h (DBX_REGISTER_NUMBER): Likewise.
> * config/i386/darwin.h (DBX_REGISTER_NUMBER): Likewise.
> (DWARF2_FRAME_REG_OUT): Likewise.
> * config/mips/mips.c (mips_output_filename): Likewise.
> * config/rs6000/rs6000.c (rs6000_xcoff_declare_function_name):
> Likewise.
> (rs6000_dbx_register_number): Likewise.
> * dbxout.c: Include flags.h.
> * dwarf2cfi.c (cfi_label_required_p): Likewise.
> (dwarf2out_do_frame): Likewise.
> * except.c: Include flags.h.
> * final.c (dwarf2_debug_info_emitted_p): Likewise.
> (final_scan_insn_1): Likewise.
> * flags.h (dwarf_debuginfo_p): New function declaration.
> * opts.c (dwarf_debuginfo_p): New function definition.
> * targhooks.c (default_debug_unwind_info): Use dwarf_debuginfo_p.
> * toplev.c (process_options): Likewise.
> ---
>  gcc/c-family/c-lex.c       |  4 ++--
>  gcc/config/c6x/c6x.c       |  4 ++--
>  gcc/config/darwin.c        |  3 ++-
>  gcc/config/i386/cygming.h  |  2 +-
>  gcc/config/i386/darwin.h   |  4 ++--
>  gcc/config/mips/mips.c     |  3 ++-
>  gcc/config/rs6000/rs6000.c |  4 ++--
>  gcc/dbxout.c               |  1 +
>  gcc/dwarf2cfi.c            |  9 -
>  gcc/except.c               |  1 +
>  gcc/final.c                | 15 ++-
>  gcc/flags.h                |  4
>  gcc/opts.c                 |  8
>  gcc/targhooks.c            |  2 +-
>  gcc/toplev.c               |  6 ++
>  15 files changed, 40 insertions(+), 30 deletions(-)
>
> diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
> index 1c66ecd..c44e7a1 100644
> --- a/gcc/c-family/c-lex.c
> +++ b/gcc/c-family/c-lex.c
> @@ -27,6 +27,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "stor-layout.h"
>  #include "c-pragma.h"
>  #include "debug.h"
> +#include "flags.h"
>  #include "file-prefix-map.h" /* remap_macro_filename()  */
>  #include "langhooks.h"
>  #include "attribs.h"
> @@ -87,8 +88,7 @@ init_c_lex (void)
>
>/* Set the debug callbacks if we can use them.  */
>if ((debug_info_level == DINFO_LEVEL_VERBOSE
> -   && (write_symbols == DWARF2_DEBUG
> -  || write_symbols == VMS_AND_DWARF2_DEBUG))
> +   && dwarf_debuginfo_p ())
>|| flag_dump_go_spec != NULL)
>  {
>cb->define = cb_define;
> diff --git a/gcc/config/c6x/c6x.c b/gcc/config/c6x/c6x.c
> index f9ad1e5..e2011f0 100644
> --- a/gcc/config/c6x/c6x.c
> +++ b/gcc/config/c6x/c6x.c
> @@ -59,6 +59,7 @@
>  #include "regrename.h"
>  #include "dumpfile.h"
>  #include "builtins.h"
> +#include "flags.h"
>
>  /* This file should be included last.  */
>  #include "target-def.h"
> @@ -439,8 +440,7 @@ c6x_output_file_unwind (FILE * f)
>  {
>if (flag_unwind_tables || flag_exceptions)
> {
> - if (write_symbols == DWARF2_DEBUG
> - || write_symbols == VMS_AND_DWARF2_DEBUG)
> + if (dwarf_debuginfo_p ())
> asm_fprintf (f, "\t.cfi_sections .debug_frame, .c6xabi.exidx\n");
>   else
> asm_fprintf (f, "\t.cfi_sections .c6xabi.exidx\n");
> diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c
> index 5d17391..026c1fb 100644
> --- a/gcc/config/darwin.c
> +++ b/gcc/config/darwin.c
> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "lto-section-names.h"
>  #include "intl.h"
>  #include "optabs.h"
> +#include "flags.h"
>
>  /* Fix and Continue.
>
> @@ -3348,7 +3349,7 @@ darwin_override_options (void)
>&& generating_for_darwin_version >= 9
>&& (flag_gtoggle ? (debug_info_level == DINFO_LEVEL_NONE)
>: (debug_info_level >= DINFO_LEVEL_NORMAL))
> -  && write_symbols == DWARF2_DEBUG)
> +  && dwarf_debuginfo_p ())
>  flag_var_tracking_uninit = flag_var_tracking;
>
>/* Final check on PCI options; for Darwin these are not dependent on the 
> PIE
> diff --git a/gcc/config/i386/cygming.h b/gcc/config/i386/cygming.h
> index cfbca34..ac458cd 100644
> --- a/gcc/config/i386/cygming.h
> +++ b/gcc/config/i386/cygming.h
> @@ -82,7 +82,7 @@ along with GCC; see the file COPYING3.  If not see
>  #undef DBX_REGISTER_NUMBER
>  #define DBX_REGISTER_NUMBER(n) \
>(TARGET_64BIT ? dbx64_register_map[n]\
> -   : 

Re: [PATCH 2/2] ipa-sra: Improve debug info for removed parameters (PR 93385)

2021-05-20 Thread Richard Biener via Gcc-patches
On Mon, May 10, 2021 at 8:43 PM Martin Jambor  wrote:
>
> On Mon, May 10 2021, Richard Biener wrote:
> > On Tue, Apr 27, 2021 at 5:26 PM Martin Jambor  wrote:
> >>
> >> Hi,
> >>
> >> Whereas the previous patch fixed issues with code left behind after
> >> IPA-SRA removed a parameter but only reset all affected debug bind
> >> statements, this one updates them with expressions which can allow the
> >> debugger to print the removed value - see the added test-case.
> >>
> >> Even though I originally did not want to create DEBUG_EXPR_DECLs for
> >> intermediate values, I ended up doing so, because otherwise the code
> >> started creating statements like
> >>
> >># DEBUG __aD.198693 => [(const struct _Alloc_nodeD.171110 
> >> *)D#195]._M_tD.184726->_M_implD.171154
> >>
> >> which not only is a bit scary but also gimple-fold ICEs on
> >> it. Therefore I decided they are probably quite necessary and have
> >> them.
> >>
> >> The patch simply notes each removed SSA name present in a debug
> >> statement and then works from it backwards, looking if it can
> >> reconstruct the expression it represents (which can fail if a
> >> non-degenerate PHI node is in the way).  If it can, it populates two
> >> hash maps with those expressions so that 1) removed assignments are
> >> replaced with a debug bind defining a new intermediate debug_decl_expr
> >> and 2) existing debug binds that refer to SSA names that are being
> >> removed now refer to corresponding debug_decl_exprs.
> >
> > Isn't this what insert_debug_temp_for_var_def already does when you
> > remove a stmt and if you take care to do that back-to-front?  So with
> > IPA SRA removing a parameter you'd "only" need to make sure to
> > set up a debug stmt for the parameter itself and that be picked up
> > for the (uninitialized) default-def you map to?
> >
>
> But there is no removal, the dead statements creating dead SSAs are not
> even copied when tree-inline.c does its thing, such SSAs are actually
> mapped to error_mark_node.

OK.  Still it's a bit odd what you do - I'd have expected that we can
somehow create the debug stmt from the original stmt and then
"copy"/remap that instead of the original stmt (which we DCE).

It looks like you didn't yet install the DCE patch so I'd have to dig it up
to remember how the DCE is wired in, but basically I'd have expected
remap_gimple_stmt for a DCEd stmt to go the same way if it were
a debug-bind but of course instead of copying a debug bind, create one
from scratch, pushing it to id->debug_stmts so it gets re-mapped later.

Removed SSA defs have to be set up as mapping to a new debug temp,
so we do have to scan the original DCEd stmt for defs.

> The code is heavily inspired by what removal does but (IIRC I hope) it
> is also much simpler because IPA-SRA can only remove limited classes of
> scalars.

Maybe some code can be split out and shared - I realize that
insert_debug_temp_for_var_def does stuff that's not appropriate
in the inlining context.  Or just add another parameter so we can fend
off that code ...

Sorry for the delay,
Richard.

>
> Martin
>
>
>
> >> If a removed parameter is passed to another function, the debugging
> >> information still cannot describe its value there - see the xfailed
> >> test in the testcase.  I sort of know what needs to be done but the
> >> handling of debug information for removed parameters is LTO unfriendly
> >> in general and so needs a bit more work.
> >>
> >> Bootstrapped and tested on x86_64-linux, i686-linux and aarch64-linux.
> >> Also LTO-bootstrapped and LTO-profiledbootstrapped on x86_64-linux.
> >>
> >> OK for trunk?
> >>
> >> Thanks,
> >>
> >> Martin
> >>
> >>
> >> gcc/ChangeLog:
> >>
> >> 2021-03-29  Martin Jambor  
> >>
> >> PR ipa/93385
> >> * ipa-param-manipulation.h (class ipa_param_body_adjustments): New
> >> members remap_with_debug_expressions, m_dead_ssa_debug_equiv,
> >> m_dead_stmt_debug_equiv and prepare_debug_expressions.  Added
> >> parameter to mark_dead_statements.
> >> * ipa-param-manipulation.c: Include tree-phinodes.h and 
> >> cfgexpand.h.
> >> (ipa_param_body_adjustments::mark_dead_statements): New parameter
> >> debugstack, push into it all SSA names used in debug statements,
> >> produce m_dead_ssa_debug_equiv mapping for the removed param.
> >> (replace_with_mapped_expr): New function.
> >> (ipa_param_body_adjustments::remap_with_debug_expressions): 
> >> Likewise.
> >> (ipa_param_body_adjustments::prepare_debug_expressions): Likewise.
> >> (ipa_param_body_adjustments::common_initialization): Gather and
> >> process SSA which will be removed but are in debug statements. 
> >> Simplify.
> >> (ipa_param_body_adjustments::ipa_param_body_adjustments): 
> >> Initialize
> >> new members.
> >> * tree-inline.c (remap_gimple_stmt): Create a debug bind when 
> >> possible
> >> when avoiding a copy of an 

Re: [PATCH v2] vect: Replace hardcoded weight factor with param

2021-05-20 Thread Christophe Lyon via Gcc-patches
On Thu, 20 May 2021 at 10:52, Kewen.Lin via Gcc-patches
 wrote:
>
> on 2021/5/19 6:01 PM, Richard Biener wrote:
> > On Wed, May 19, 2021 at 11:47 AM Kewen.Lin  wrote:
> >>
> >> Hi Richi,
> >>
> >> on 2021/5/19 4:15 PM, Richard Biener wrote:
> >>> On Wed, May 19, 2021 at 8:20 AM Kewen.Lin  wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> This patch is to replace the current hardcoded weight factor 50
> >>>> for those statements in an inner loop relative to the loop being
> >>>> vectorized with a specific parameter vect-inner-loop-weight-factor.
> >>>>
> >>>> The motivation behind this change is: if targets want to have one
> >>>> unique function to gather some information in each add_stmt_cost
> >>>> call, no matter that it's put before or after the cost tweaking
> >>>> part for inner loop, it may have the need to adjust (expand or
> >>>> shrink) the gathered data as the factor.  Now the factor is
> >>>> hardcoded, it's not easily maintained.  Since it's possible that
> >>>> targets have their own decisions on this costing like the others,
> >>>> I used parameter instead of one unique macro here.
> >>>>
> >>>> Testing is ongoing, is it ok for trunk if everything goes well?
> >>>
> >>> Certainly an improvement.  I suppose we might want to put
> >>> the factor into vinfo->inner_loop_cost_factor.  That way
> >>> we could adjust it easily in common code in the vectorizer
> >>> when we for example have (non-guessed) profile data.
> >>>
> >>> "weight_factor" is kind-of double-speak and I'm missing 'cost' ...
> >>> so, bike-shedding to vect_inner_loop_cost_factor?
> >>>
> >>> Just suggestions - as said, the patch is an improvement already.
> >>>
> >>
> >> Thanks for your nice suggestions!  I've updated the patch accordingly
> >> and attached it.  Does it look better to you?
> >
> > Minor nit:
> >
> > +@item vect-inner-loop-cost-factor
> > +The factor which loop vectorizer uses to over weight those statements in
> > +an inner loop relative to the loop being vectorized.
> > +
> >
> > the default value should be documented here, not..
> >
> > +-param=vect-inner-loop-cost-factor=
> > +Common Joined UInteger Var(param_vect_inner_loop_cost_factor)
> > Init(50) IntegerRange(1, 99) Param Optimization
> > +Indicates the factor which loop vectorizer uses to over weight those
> > statements in an inner loop relative to the loop being vectorized.
> > The default value is 50.
> > +
> >
> > here (based on statistical analysis of existing cases).  Also the
> > params.opt docs
> > should be the "brief" one - but for simplicity simply make both docs 
> > identical
> > (apart from the default value doc).  I suggest
> >
> > "The factor which the loop vectorizer applies to the cost of statements
> > in an inner loop relative to the loop being vectorized."
> >
>
> Thanks for catching this and the suggestion!
>
> Bootstrapped/regtested on powerpc64le-linux-gnu P9, x86_64-redhat-linux
> and aarch64-linux-gnu.
>

This breaks the build for arm targets:
/tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:
In function 'unsigned int arm_add_stmt_cost(vec_info*, void*, int,
vect_cost_for_stmt, _stmt_vec_info*, tree, int, vect_cost_model_location)':
/tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:12230:4:
error: 'loop_vec_info' was not declared in this scope
loop_vec_info loop_vinfo = dyn_cast (vinfo);
^
/tmp/158661_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:12230:18:
error: expected ';' before 'loop_vinfo'
loop_vec_info loop_vinfo = dyn_cast (vinfo);

Can you fix it?

Thanks,

Christophe


> Committed in r12-939 as the suggested wordings.


>
> BR,
> Kewen
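
For readers following the thread, the effect of the factor can be sketched in a few lines of C. This is only an illustration under stated assumptions: weighted_stmt_cost and its arguments are hypothetical names, not the vectorizer's or any target's actual add_stmt_cost interface. It shows the idea discussed above, namely that statements in an inner loop have their cost multiplied by a tunable factor (default 50) rather than by a hardcoded constant.

```c
/* Hypothetical sketch, not GCC code: scale the cost of a statement
   that sits in an inner loop relative to the loop being vectorized.  */
unsigned
weighted_stmt_cost (unsigned count, unsigned stmt_cost,
                    int in_inner_loop, unsigned inner_loop_cost_factor)
{
  unsigned cost = count * stmt_cost;

  /* Previously the multiplier was a literal 50; with the patch it
     would come from the vect-inner-loop-cost-factor parameter.  */
  if (in_inner_loop)
    cost *= inner_loop_cost_factor;

  return cost;
}
```

With the default factor, two statements of unit cost 3 would count as 6 in the outer loop body and as 300 inside an inner loop.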


Re: [RFC] ldist: Recognize rawmemchr loop patterns

2021-05-20 Thread Richard Biener via Gcc-patches
On Fri, May 7, 2021 at 2:32 PM Stefan Schulze Frielinghaus
 wrote:
>
> On Wed, May 05, 2021 at 11:36:41AM +0200, Richard Biener wrote:
> > On Tue, Mar 16, 2021 at 6:13 PM Stefan Schulze Frielinghaus
> >  wrote:
> > >
> > > [snip]
> > >
> > > Please find attached a new version of the patch.  A major change compared 
> > > to
> > > the previous patch is that I created a separate pass which hopefully makes
> > > reviewing also easier since it is almost self-contained.  After realizing 
> > > that
> > > detecting loops which mimic the behavior of rawmemchr/strlen functions 
> > > does not
> > > really fit into the topic of loop distribution, I created a separate pass.
> >
> > It's true that these reduction-like patterns are more difficult than
> > the existing
> > memcpy/memset cases.
> >
> > >  Due
> > > to this I was also able to play around a bit and schedule the pass at 
> > > different
> > > times.  Currently it is scheduled right before loop distribution where 
> > > loop
> > > header copying already took place which leads to the following effect.
> >
> > In fact I'd schedule it after loop distribution so there's the chance that 
> > loop
> > distribution can expose a loop that fits the new pattern.
> >
> > >  Running
> > > this setup over
> > >
> > > char *t (char *p)
> > > {
> > >   for (; *p; ++p);
> > >   return p;
> > > }
> > >
> > > the new pass transforms
> > >
> > > char * t (char * p)
> > > {
> > >   char _1;
> > >   char _7;
> > >
> > >[local count: 118111600]:
> > >   _7 = *p_3(D);
> > >   if (_7 != 0)
> > > goto ; [89.00%]
> > >   else
> > > goto ; [11.00%]
> > >
> > >[local count: 105119324]:
> > >
> > >[local count: 955630225]:
> > >   # p_8 = PHI 
> > >   p_6 = p_8 + 1;
> > >   _1 = *p_6;
> > >   if (_1 != 0)
> > > goto ; [89.00%]
> > >   else
> > > goto ; [11.00%]
> > >
> > >[local count: 105119324]:
> > >   # p_2 = PHI 
> > >   goto ; [100.00%]
> > >
> > >[local count: 850510901]:
> > >   goto ; [100.00%]
> > >
> > >[local count: 12992276]:
> > >
> > >[local count: 118111600]:
> > >   # p_9 = PHI 
> > >   return p_9;
> > >
> > > }
> > >
> > > into
> > >
> > > char * t (char * p)
> > > {
> > >   char * _5;
> > >   char _7;
> > >
> > >[local count: 118111600]:
> > >   _7 = *p_3(D);
> > >   if (_7 != 0)
> > > goto ; [89.00%]
> > >   else
> > > goto ; [11.00%]
> > >
> > >[local count: 105119324]:
> > >   _5 = p_3(D) + 1;
> > >   p_10 = .RAWMEMCHR (_5, 0);
> > >
> > >[local count: 118111600]:
> > >   # p_9 = PHI 
> > >   return p_9;
> > >
> > > }
> > >
> > > which is fine so far.  However, I haven't made up my mind so far whether 
> > > it is
> > > worthwhile to spend more time in order to also eliminate the "first 
> > > unrolling"
> > > of the loop.
> >
> > Might be a phiopt transform ;)  Might apply to quite some set of
> > > builtins.  I wonder what the strlen case looks like though.
> >
> > > > I gave it a shot by scheduling the pass prior to loop header copying
> > > and ended up with:
> > >
> > > char * t (char * p)
> > > {
> > >[local count: 118111600]:
> > >   p_5 = .RAWMEMCHR (p_3(D), 0);
> > >   return p_5;
> > >
> > > }
> > >
> > > which seems optimal to me.  The downside of this is that I have to 
> > > initialize
> > > scalar evolution analysis which might be undesired that early.
> > >
> > > > All this brings me to the question: where do you see this piece of code 
> > > > running?
> > > If in a separate pass when would you schedule it?  If in an existing pass,
> > > which one would you choose?
> >
> > I think it still fits loop distribution.  If you manage to detect it
> > with your pass
> > standalone then you should be able to detect it in loop distribution.
>
> If a loop is distributed only because one of the partitions matches a
> rawmemchr/strlen-like loop pattern, then we have at least two partitions
> which walk over the same memory region.  Since a rawmemchr/strlen-like
> loop has no body (neglecting expression-3 of a for-loop where just an
> increment happens) it is governed by the memory accesses in the loop
> condition.  Therefore, in such a case loop distribution would result in
> performance degradation.  This is why I think that it does not fit
> conceptually into ldist pass.  However, since I make use of a couple of
> helper functions from ldist pass, it may still fit technically.
>
> Since currently all ldist optimizations operate over loops where niters
> is known and for rawmemchr/strlen-like loops this is not the case, it is
> not possible that those optimizations expose a loop which is suitable
> for rawmemchr/strlen optimization.

True - though that seems to be an unnecessary restriction.

>  Therefore, what do you think about
> scheduling rawmemchr/strlen optimization right between those
> if-statements of function loop_distribution::execute?
>
>if (nb_generated_loops + nb_generated_calls > 0)
>  {
>changed = true;
>if (dump_enabled_p ())
>  dump_printf_loc 
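
As a reading aid for the GIMPLE dumps quoted above, here is a minimal C sketch of the equivalence the proposed pass relies on. It is an illustration, not code from the patch: rawmemchr_sketch is a hypothetical portable stand-in for the GNU rawmemchr function, and scan_peeled mirrors the "first unrolling" shape that remains when the pass runs after loop header copying.

```c
#include <stddef.h>

/* The loop from the thread: advance until the terminating NUL.  */
char *
scan_loop (char *p)
{
  for (; *p; ++p)
    ;
  return p;
}

/* Hypothetical stand-in for rawmemchr: return the first byte equal to C
   at or after S, with no length bound.  */
void *
rawmemchr_sketch (const void *s, int c)
{
  const unsigned char *p = s;

  while (*p != (unsigned char) c)
    ++p;
  return (void *) p;
}

/* Shape obtained when the pass runs before loop header copying.  */
char *
scan_call (char *p)
{
  return rawmemchr_sketch (p, 0);
}

/* Shape obtained after loop header copying: the first test is peeled
   and the call starts at p + 1.  */
char *
scan_peeled (char *p)
{
  if (*p != 0)
    p = rawmemchr_sketch (p + 1, 0);
  return p;
}
```

All three functions return a pointer to the terminating NUL of the string they are given; the open question in the thread is only where in the pass pipeline the loop form should be recognized.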

[PATCH] i386: Add mult-high and shift patterns for 4-byte vectors [PR100637]

2021-05-20 Thread Uros Bizjak via Gcc-patches
2021-05-20  Uroš Bizjak  

gcc/
PR target/100637
* config/i386/mmx.md (Yv_Yw): Revert adding V4QI and V2HI modes.
(*3): Use Yw instead of  constraint.
(mulv4hi3_highpart): New expander.
(*mulv2hi3_highpart): New insn pattern.
(mulv2hi3_highpart): New expander.
(*v2hi3): New insn pattern.
(v2hi3): New expander.
* config/i386/sse.md (smulhrsv2hi3): New expander.
(*smulhrsv2hi3): New insn pattern.

gcc/testsuite/

PR target/100637
* gcc.target/i386/pr100637-1w.c (shl, ashr, lshr): New tests.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index d8479782e90..948ba479c32 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -78,8 +78,7 @@ (define_mode_attr mmxintvecmodelower
   [(V2SF "v2si") (V2SI "v2si") (V4HI "v4hi") (V8QI "v8qi")])
 
 (define_mode_attr Yv_Yw
-  [(V8QI "Yw") (V4QI "Yw") (V4HI "Yw") (V2HI "Yw")
-   (V2SI "Yv") (V1DI "Yv") (V2SF "Yv")])
+  [(V8QI "Yw") (V4HI "Yw") (V2SI "Yv") (V1DI "Yv") (V2SF "Yv")])
 
 ;
 ;;
@@ -1367,10 +1366,10 @@ (define_expand "3"
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
 (define_insn "*3"
-  [(set (match_operand:VI_32 0 "register_operand" "=x,")
+  [(set (match_operand:VI_32 0 "register_operand" "=x,Yw")
 (plusminus:VI_32
- (match_operand:VI_32 1 "register_operand" "0,")
- (match_operand:VI_32 2 "register_operand" "x,")))]
+ (match_operand:VI_32 1 "register_operand" "0,Yw")
+ (match_operand:VI_32 2 "register_operand" "x,Yw")))]
   "TARGET_SSE2
&& ix86_binary_operator_ok (, mode, operands)"
   "@
@@ -1523,6 +1522,51 @@ (define_insn "*mmx_umulv4hi3_highpart"
(set_attr "type" "mmxmul,ssemul,ssemul")
(set_attr "mode" "DI,TI,TI")])
 
+(define_expand "mulv4hi3_highpart"
+  [(set (match_operand:V4HI 0 "register_operand")
+   (truncate:V4HI
+ (lshiftrt:V4SI
+   (mult:V4SI
+ (any_extend:V4SI
+   (match_operand:V4HI 1 "register_operand"))
+ (any_extend:V4SI
+   (match_operand:V4HI 2 "register_operand")))
+   (const_int 16))))]
+  "TARGET_MMX_WITH_SSE"
+  "ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);")
+
+(define_insn "*mulv2hi3_highpart"
+  [(set (match_operand:V2HI 0 "register_operand" "=x,Yw")
+   (truncate:V2HI
+ (lshiftrt:V2SI
+   (mult:V2SI
+ (any_extend:V2SI
+   (match_operand:V2HI 1 "register_operand" "%0,Yw"))
+ (any_extend:V2SI
+   (match_operand:V2HI 2 "register_operand" "x,Yw")))
+   (const_int 16))))]
+  "TARGET_SSE2
+   && ix86_binary_operator_ok (MULT, V2HImode, operands)"
+  "@
+   pmulhw\t{%2, %0|%0, %2}
+   vpmulhw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemul")
+   (set_attr "mode" "TI")])
+
+(define_expand "mulv2hi3_highpart"
+  [(set (match_operand:V2HI 0 "register_operand")
+   (truncate:V2HI
+ (lshiftrt:V2SI
+   (mult:V2SI
+ (any_extend:V2SI
+   (match_operand:V2HI 1 "register_operand"))
+ (any_extend:V2SI
+   (match_operand:V2HI 2 "register_operand")))
+   (const_int 16))))]
+  "TARGET_SSE2"
+  "ix86_fixup_binary_operands_no_copy (MULT, V2HImode, operands);")
+
 (define_expand "mmx_pmaddwd"
   [(set (match_operand:V2SI 0 "register_operand")
 (plus:V2SI
@@ -1817,6 +1861,30 @@ (define_expand "3"
  (match_operand:DI 2 "nonmemory_operand")))]
   "TARGET_MMX_WITH_SSE")
 
+(define_insn "*v2hi3"
+  [(set (match_operand:V2HI 0 "register_operand" "=x,Yw")
+(any_shift:V2HI
+ (match_operand:V2HI 1 "register_operand" "0,Yw")
+ (match_operand:DI 2 "nonmemory_operand" "xN,YwN")))]
+  "TARGET_SSE2"
+  "@
+   pw\t{%2, %0|%0, %2}
+   vpw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sseishft")
+   (set (attr "length_immediate")
+ (if_then_else (match_operand 2 "const_int_operand")
+   (const_string "1")
+   (const_string "0")))
+   (set_attr "mode" "TI")])
+
+(define_expand "v2hi3"
+  [(set (match_operand:V2HI 0 "register_operand")
+(any_shift:V2HI
+ (match_operand:V2HI 1 "register_operand")
+ (match_operand:DI 2 "nonmemory_operand")))]
+  "TARGET_SSE2")
+
 ;
 ;;
 ;; Parallel integral comparisons
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index a4503ddcb73..0f1108f0db1 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -17239,6 +17239,51 @@ (define_insn "*ssse3_pmulhrswv4hi3"
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
(set_attr "mode" "DI,TI,TI")])
 
+(define_expand "smulhrsv2hi3"
+  [(set (match_operand:V2HI 0 "register_operand")
+   
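
For reference, the arithmetic described by the highpart patterns above (extend both 16-bit operands, multiply, shift right by 16, truncate) can be written out in scalar C. This is a sketch under stated assumptions: mulhi_s16, mulhi_u16 and mulhi_v2hi are hypothetical helper names, not part of the patch, and the result is returned as a raw 16-bit pattern to avoid implementation-defined conversions.

```c
#include <stdint.h>

/* Signed variant: sign-extend, multiply, keep bits 16..31.  */
uint16_t
mulhi_s16 (int16_t a, int16_t b)
{
  int32_t wide = (int32_t) a * (int32_t) b;
  /* Logical shift on the unsigned bit pattern, then truncate.  */
  return (uint16_t) ((uint32_t) wide >> 16);
}

/* Unsigned variant: zero-extend, multiply, keep bits 16..31.  */
uint16_t
mulhi_u16 (uint16_t a, uint16_t b)
{
  uint32_t wide = (uint32_t) a * (uint32_t) b;
  return (uint16_t) (wide >> 16);
}

/* Two-lane version corresponding to a V2HI operand pair.  */
void
mulhi_v2hi (const int16_t a[2], const int16_t b[2], uint16_t out[2])
{
  for (int i = 0; i < 2; i++)
    out[i] = mulhi_s16 (a[i], b[i]);
}
```

The new 4-byte vector patterns let the compiler perform this operation on both lanes with a single SSE2 multiply-high instruction instead of two scalar multiplies.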

Re: [PATCH] Add no_sanitize_coverage attribute.

2021-05-20 Thread Martin Liška

On 5/20/21 10:45 AM, Marco Elver wrote:

On Thu, 20 May 2021 at 10:33, Martin Liška  wrote:

Hello.

The patch implements one missing attribute which can be used for per-function
disabling of coverage sanitization.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.


Thanks for implementing this so quickly. One thing I just have to
double check, is if this works with always_inline (not instrumented if
inlined into no_sanitize),


No, in this case it's instrumented (if caller has not set the attribute).


and inline (not inlined if inline fn is
no_sanitize_coverage).


No again, it's not blocking inlining.


Just like the other no_sanitize* do. The test doesn't mention them.


It's aligned with a similar issue that was discussed some time ago:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722#c9

Martin



Thanks,
-- Marco


Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

 * asan.h (sanitize_coverage_p): New function.
 * doc/extend.texi: Document it.
 * fold-const.c (fold_range_test): Use sanitize_flags_p
 instead of flag_sanitize_coverage.
 (fold_truth_andor): Likewise.
 * sancov.c: Likewise.
 * tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise.

gcc/c-family/ChangeLog:

 * c-attribs.c (handle_no_sanitize_coverage_attribute): New.

gcc/testsuite/ChangeLog:

 * gcc.dg/sancov/attribute.c: New test.
---
   gcc/asan.h                              | 10 ++
   gcc/c-family/c-attribs.c                | 20 
   gcc/doc/extend.texi                     |  6 ++
   gcc/fold-const.c                        |  4 ++--
   gcc/sancov.c                            |  4 ++--
   gcc/testsuite/gcc.dg/sancov/attribute.c | 15 +++
   gcc/tree-ssa-ifcombine.c                |  4 +++-
   7 files changed, 58 insertions(+), 5 deletions(-)
   create mode 100644 gcc/testsuite/gcc.dg/sancov/attribute.c

diff --git a/gcc/asan.h b/gcc/asan.h
index f110f1db563..8c0b2baf170 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -249,4 +249,14 @@ sanitize_flags_p (unsigned int flag, const_tree fn = 
current_function_decl)
 return result_flags;
   }

+/* Return true when coverage sanitization should happen for FN function.  */
+
+static inline bool
+sanitize_coverage_p (const_tree fn = current_function_decl)
+{
+  return (flag_sanitize_coverage
+ && lookup_attribute ("no_sanitize_coverage",
+  DECL_ATTRIBUTES (fn)) == NULL_TREE);
+}
+
   #endif /* TREE_ASAN */
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index ccf9e4ccf0b..671b27c3200 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -62,6 +62,8 @@ static tree handle_no_address_safety_analysis_attribute (tree 
*, tree, tree,
  int, bool *);
   static tree handle_no_sanitize_undefined_attribute (tree *, tree, tree, int,
 bool *);
+static tree handle_no_sanitize_coverage_attribute (tree *, tree, tree, int,
+  bool *);
   static tree handle_asan_odr_indicator_attribute (tree *, tree, tree, int,
  bool *);
   static tree handle_stack_protect_attribute (tree *, tree, tree, int, bool *);
@@ -449,6 +451,8 @@ const struct attribute_spec c_common_attribute_table[] =
   handle_no_sanitize_thread_attribute, NULL },
 { "no_sanitize_undefined",  0, 0, true, false, false, false,
   handle_no_sanitize_undefined_attribute, NULL },
+  { "no_sanitize_coverage",   0, 0, true, false, false, false,
+ handle_no_sanitize_coverage_attribute, NULL },
 { "asan odr indicator", 0, 0, true, false, false, false,
   handle_asan_odr_indicator_attribute, NULL },
 { "warning",  1, 1, true,  false, false, false,
@@ -1211,6 +1215,22 @@ handle_no_sanitize_undefined_attribute (tree *node, tree 
name, tree, int,
 return NULL_TREE;
   }

+/* Handle a "no_sanitize_coverage" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_no_sanitize_coverage_attribute (tree *node, tree name, tree, int,
+  bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+{
+  warning (OPT_Wattributes, "%qE attribute ignored", name);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
   /* Handle an "asan odr indicator" attribute; arguments as in
  struct attribute_spec.handler.  */

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 826804e6149..3ddeb0dee3a 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3415,6 +3415,12 @@ The @code{no_sanitize_undefined} attribute on functions 
is used
   to inform the compiler that it should not check for undefined behavior
   in 
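
A short usage sketch of the attribute under discussion, assuming a compiler that accepts no_sanitize_coverage; on compilers that do not, the guard below makes the macro expand to nothing. NO_SANCOV and both function names are hypothetical and chosen for illustration only.

```c
/* Use the attribute only when the compiler reports support for it.  */
#if defined(__has_attribute)
# if __has_attribute(no_sanitize_coverage)
#  define NO_SANCOV __attribute__((no_sanitize_coverage))
# endif
#endif
#ifndef NO_SANCOV
# define NO_SANCOV
#endif

/* Not instrumented when built with -fsanitize-coverage=trace-pc.  */
NO_SANCOV int
uninstrumented_add (int a, int b)
{
  return a + b;
}

/* Instrumented as usual.  Per the replies above, the attribute does not
   block inlining, so a copy of the callee inlined here would be
   instrumented along with this function.  */
int
instrumented_caller (int x)
{
  return uninstrumented_add (x, 1);
}
```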

Re: [PATCH] go/100537 - Bootstrap-O3 and bootstrap-debug fail

2021-05-20 Thread guojiufu via Gcc-patches

On 2021-05-18 14:58, Richard Biener wrote:

On Mon, 17 May 2021, Ian Lance Taylor wrote:


On Mon, May 17, 2021 at 1:17 AM Richard Biener via Gcc-patches
 wrote:
>
> On Fri, May 14, 2021 at 11:19 AM guojiufu via Gcc-patches
>  wrote:
> >
> > On 2021-05-14 15:39, guojiufu via Gcc-patches wrote:
> > > On 2021-05-14 15:15, Richard Biener wrote:
> > >> On May 14, 2021 4:52:56 AM GMT+02:00, Jiufu Guo
> > >>  wrote:
> > >>> As discussed in the PR, Richard mentioned the method to
> > >>> figure out which VAR was not set TREE_ADDRESSABLE, and
> > >>> then cause this failure.  It is address_expression which
> > >>> build addr_expr (build_fold_addr_expr_loc), but not set
> > >>> TREE_ADDRESSABLE.
> > >>>
> > >>> I drafted this patch with reference the comments from Richard
> > >>> in this PR, while I'm not quite sure if more thing need to do.
> > >>> So, please have review, thanks!
> > >>>
> > >>> Bootstrap and regtest pass on ppc64le. Is this ok for trunk?
> > >>
> > >> I suggest using mark_addressable unless we're sure expr is always an
> > >> entity where TREE_ADDRESSABLE has the desired meaning.
> >
> > Thanks, Richard!
> > You point out the root concern, I'm not sure ;)
> >
> > Looking at the code of "mark_addressable" and the code around
> > tree-ssa.c:1013,
> > VAR_P, PARM_DECL, and RESULT_DECL are checked before accessing
> > TREE_ADDRESSABLE.
> > So, just wondering if these entities need to be marked as
> > TREE_ADDRESSABLE?
> >
> > diff --git a/gcc/go/go-gcc.cc b/gcc/go/go-gcc.cc
> > index 5d9dbb5d068..85d324a92cc 100644
> > --- a/gcc/go/go-gcc.cc
> > +++ b/gcc/go/go-gcc.cc
> > @@ -1680,6 +1680,11 @@ Gcc_backend::address_expression(Bexpression*
> > bexpr, Location location)
> > if (expr == error_mark_node)
> >   return this->error_expression();
> >
> > +  if (VAR_P(expr)
> > +      || TREE_CODE(expr) == PARM_DECL
> > +      || TREE_CODE(expr) == RESULT_DECL)
> > +    TREE_ADDRESSABLE (expr) = 1;
> > +
>
> The root concern is that mark_addressable does
>
>   while (handled_component_p (x))
> x = TREE_OPERAND (x, 0);
>
> and I do not know the constraints on 'expr' as passed to
> Gcc_backend::address_expression.
>
> I think we need input from Ian here.  Most FEs have their own
> *_mark_addressable function where they also emit diagnostics (guess
> this is handled in the actual Go frontend).
> Since Gcc_backend does lowering to GENERIC using a middle-end is probably OK.

I doubt I understand all the issues here.

In general the Go frontend only takes the addresses of VAR_DECLs or
PARM_DECLs.  It doesn't bother to set TREE_ADDRESSABLE for global
variables for which TREE_STATIC or DECL_EXTERNAL is true.  For local
variables it sets TREE_ADDRESSABLE based on the is_address_taken
parameter to Gcc_backend::local_variable, and similarly for PARM_DECLs
and Gcc_backend::parameter_variable.

The name in the bug report is for a string initializer, which should
be TREE_STATIC == 1 and TREE_PUBLIC == 0.  Perhaps the fix is simply
to set TREE_ADDRESSABLE in Gcc_backend::immutable_struct and
Gcc_backend::implicit_variable.  I can't see how it would hurt to set
TREE_ADDRESSABLE unnecessarily for a TREE_STATIC variable.

But, again, I doubt I understand all the issues here.


GENERIC requires TREE_ADDRESSABLE to be set on all address-taken
VAR_DECLs, PARM_DECLs and RESULT_DECLs - the gimplifier is the
first to require this for correctness.  Setting TREE_ADDRESSABLE
when the address is not taken is harmless and at most results in
missed optimizations (on most entities we are able to clear the
flag later).

We're currently quite forgiving with this though (still the
gimplifier can generate wrong-code).  The trigger of the current
failure removed one "forgiveness", I do plan to remove a few more.

guojiufu's patch works for me but as said I'm not sure if there's
a better place to set TREE_ADDRESSABLE for entities that have
their address taken - definitely catching the places where
you build an ADDR_EXPR are the most obvious ones.

Richard.


I tested the patch below, as Ian suggested; bootstrap passes.


diff --git a/gcc/go/go-gcc.cc b/gcc/go/go-gcc.cc
index 5d9dbb5d068..529f657598a 100644
--- a/gcc/go/go-gcc.cc
+++ b/gcc/go/go-gcc.cc
@@ -2943,6 +2943,7 @@ Gcc_backend::implicit_variable(const std::string& 
name,

   TREE_STATIC(decl) = 1;
   TREE_USED(decl) = 1;
   DECL_ARTIFICIAL(decl) = 1;
+  TREE_ADDRESSABLE(decl) = 1;
   if (is_common)
 {
   DECL_COMMON(decl) = 1;
@@ -3053,6 +3054,7 @@ Gcc_backend::immutable_struct(const std::string& 
name,

   TREE_READONLY(decl) = 1;
   TREE_CONSTANT(decl) = 1;
   DECL_ARTIFICIAL(decl) = 1;
+  TREE_ADDRESSABLE(decl) = 1;
   if (!is_hidden)
 TREE_PUBLIC(decl) = 1;
   if (! asm_name.empty())

I also hacked up a patch for OPERATOR_AND, which I think may be
acceptable.


BR.
Jiufu Guo.

Bootstrap failure go [PR100537]

In general the Go frontend only takes 

Re: [PATCH] Fortran/OpenMP: Add support for 'close' in map clause

2021-05-20 Thread Jakub Jelinek via Gcc-patches
On Thu, May 20, 2021 at 10:47:52AM +0200, Marcel Vollweiler wrote:
> --- a/gcc/fortran/openmp.c
> +++ b/gcc/fortran/openmp.c
> @@ -1710,10 +1710,21 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
> omp_mask mask,
> && gfc_match ("map ( ") == MATCH_YES)
>   {
> locus old_loc2 = gfc_current_locus;
> -   bool always = false;
> +
> +   int always = 0;
> +   int close = 0;

The vertical space should be after the 3 variable declarations
rather than in between 1 and 2.

> +   for (;;)
> + {
> +   if (gfc_match ("always ") == MATCH_YES)
> + always++;
> +   else if (gfc_match ("close ") == MATCH_YES)
> + close++;
> +   else
> + break;
> +   gfc_match (", ");
> + }
> +
> gfc_omp_map_op map_op = OMP_MAP_TOFROM;
> -   if (gfc_match ("always , ") == MATCH_YES)
> - always = true;
> if (gfc_match ("alloc : ") == MATCH_YES)
>   map_op = OMP_MAP_ALLOC;
> else if (gfc_match ("tofrom : ") == MATCH_YES)
> @@ -1726,11 +1737,24 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
> omp_mask mask,
>   map_op = OMP_MAP_RELEASE;
> else if (gfc_match ("delete : ") == MATCH_YES)
>   map_op = OMP_MAP_DELETE;
> -   else if (always)
> +   else
>   {
> gfc_current_locus = old_loc2;
> -   always = false;
> +   always = 0;
> +   close = 0;
>   }
> +
> +   if (always > 1)
> + {
> +   gfc_error ("too many %<always%> modifiers at %C");
> +   break;
> + }
> +   if (close > 1)
> + {
> +   gfc_error ("too many %<close%> modifiers at %C");
> +   break;

I think it would be nice to show the locus of the second always or close
modifier.  Could the loop above remember that locus when always++ == 1
(or ++always == 2) and similarly for close and use it when printing the
error?
And similarly to the C/C++ patch, better use always_modifier and
close_modifier as the names of the variables, as close is a function and
could be defined as macro.

Jakub
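
The suggestion above can be sketched as a small standalone C scanner. Everything here is hypothetical and for illustration only (struct map_modifiers, skip_word, scan_map_modifiers); it does not use gfortran's matcher or locus types. It counts the modifiers in a loop, as the patch does, and additionally remembers where the second occurrence of each modifier starts, so a diagnostic could point at it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; not part of gfortran.  */
struct map_modifiers
{
  int always_modifier;        /* number of "always" modifiers seen */
  int close_modifier;         /* number of "close" modifiers seen */
  const char *second_always;  /* start of the second "always", or NULL */
  const char *second_close;   /* start of the second "close", or NULL */
  const char *rest;           /* first character after the modifiers */
};

/* If S starts with WORD, return the position after WORD, an optional
   comma and surrounding blanks; otherwise return NULL.  */
static const char *
skip_word (const char *s, const char *word)
{
  size_t len = strlen (word);

  if (strncmp (s, word, len) != 0)
    return NULL;
  s += len;
  while (*s == ' ')
    s++;
  if (*s == ',')
    s++;
  while (*s == ' ')
    s++;
  return s;
}

struct map_modifiers
scan_map_modifiers (const char *s)
{
  struct map_modifiers m = { 0, 0, NULL, NULL, s };

  for (;;)
    {
      const char *next;

      if ((next = skip_word (m.rest, "always")) != NULL)
        {
          /* Remember where the duplicate starts for the error message.  */
          if (++m.always_modifier == 2)
            m.second_always = m.rest;
        }
      else if ((next = skip_word (m.rest, "close")) != NULL)
        {
          if (++m.close_modifier == 2)
            m.second_close = m.rest;
        }
      else
        break;
      m.rest = next;
    }
  return m;
}
```

A caller would report "too many" at second_always or second_close when the corresponding count exceeds one, which is the position the review asks to show.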


