Re: [PATCH] Simplify (view_convert ~a) < 0 to (view_convert a) >= 0 [PR middle-end/100738]

2021-06-03 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse  wrote:
>
> On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:
>
> > Hi:
> >  This patch simplifies (view_convert:type ~a) < 0 to
> > (view_convert:type a) >= 0 when type is a signed integer type, and
> > similarly (view_convert:type ~a) >= 0 to (view_convert:type a) < 0.
> >  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> >  Ok for the trunk?
> >
> > gcc/ChangeLog:
> >
> >PR middle-end/100738
> >* match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
> >(view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
> >simplification.
>
> We already have
>
> /* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
> (for cmp (simple_comparison)
>   scmp (swapped_simple_comparison)
>   (simplify
>(cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
>(if (single_use (@2)
> && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
> (scmp @0 (bit_not @1)
>
> Would it make sense to try and generalize it a bit, say with
>
> (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)
>
> (scmp (view_convert:XXX @0) (bit_not @1))
>
Thanks for your advice, it looks great.
Could I use *view_convert1?* instead of *nop_convert1?* here? The
original case is a view_convert, and nop_convert would fail to
simplify it.
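
For reference, the identity behind the fold can be checked outside the compiler. This is only a sketch: view_as_signed is a made-up stand-in for a VIEW_CONVERT_EXPR between same-sized integer types, and the function names are not from the patch. The last two functions mirror the existing "~X op C" rule for the "<" case.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for VIEW_CONVERT_EXPR between same-sized integer
// types: reinterpret the bits without changing them.
static std::int32_t view_as_signed(std::uint32_t u)
{
    std::int32_t s;
    std::memcpy(&s, &u, sizeof s);
    return s;
}

// Before the fold: (view_convert:int ~a) < 0
bool sign_test_before(std::uint32_t a) { return view_as_signed(~a) < 0; }

// After the fold: (view_convert:int a) >= 0
bool sign_test_after(std::uint32_t a) { return view_as_signed(a) >= 0; }

// Existing rule on same-typed operands: ~x < c  <=>  x > ~c.
// With c == 0 this gives x > -1, i.e. x >= 0.
bool existing_rule_before(std::int32_t x, std::int32_t c) { return ~x < c; }
bool existing_rule_after(std::int32_t x, std::int32_t c) { return x > ~c; }
```

Inverting every bit flips the sign bit, so testing the sign of ~a is the same as testing the opposite sign of a, which is all the new pattern relies on.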
> (I still believe that it is a bad idea that SSA_NAMEs are strongly typed,
> encoding the type in operations would be more convenient, but I think the
> time for that choice has long gone)
>
> --
> Marc Glisse



-- 
BR,
Hongtao


[Bug tree-optimization/13563] if-conversion not agressive enough

2021-06-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=13563

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |56223

--- Comment #7 from Andrew Pinski  ---
I will be solving this one, and the patch will depend on the patch set that
fixes PR 56223 (well, it depends on part of that patch set).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56223
[Bug 56223] Integer ABS is not recognized for more complicated pattern

[Bug tree-optimization/13563] if-conversion not agressive enough

2021-06-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=13563

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #6 from Andrew Pinski  ---
Mine.

[Bug tree-optimization/56223] Integer ABS is not recognized for more complicated pattern

2021-06-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56223

--- Comment #6 from Andrew Pinski  ---
Created attachment 50925
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50925&action=edit
Patch which starts to fix this

This is step 1 in fixing this bug; there are two or three more steps/patches.

[Bug tree-optimization/56223] Integer ABS is not recognized for more complicated pattern

2021-06-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56223

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #5 from Andrew Pinski  ---
I have a start on this, though it still needs fixing.  For now it only
handles the following extra case:
int f(int a, int b, int c)
{
  int d0, d1, d;
  d1 = __builtin_abs(b);
  if (a) {
    d0 = __builtin_abs(c);
    d = d0;
  } else {
    d = d1;
  }
  return d;
}

I also noticed that factor_out_conditional_conversion has a similar issue where
the cast is inside both the if and the else part.
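
To make the target shape concrete, here is a small sketch of the case from this comment next to a factored form. The factored form is only my reading of what the optimization is aiming for (ABS applied once to the selected value); it is not taken from the attached patch, and the function names are made up.

```cpp
#include <cassert>
#include <cstdlib>

// Shape from the comment: one abs is computed before the branch, the
// other inside the taken arm.
int abs_split(int a, int b, int c)
{
    int d1 = std::abs(b);
    int d;
    if (a)
        d = std::abs(c);
    else
        d = d1;
    return d;
}

// Assumed target shape: select the operand first, then apply abs once.
int abs_factored(int a, int b, int c)
{
    return std::abs(a ? c : b);
}
```

Both functions return the same value for every input whose absolute value is representable, which is what makes the factoring legal.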

[Bug c++/100899] internal compiler error: in retrieve_specialization, at cp/pt.c:1240

2021-06-03 Thread hans.p.erickson at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100899

Hans Erickson  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hans.p.erickson at gmail dot com

--- Comment #1 from Hans Erickson  ---
Created attachment 50924
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50924&action=edit
Source code for regression testing (if useful)

[Bug c++/100899] New: internal compiler error: in retrieve_specialization, at cp/pt.c:1240

2021-06-03 Thread hans.p.erickson at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100899

Bug ID: 100899
   Summary: internal compiler error: in retrieve_specialization,
at cp/pt.c:1240
   Product: gcc
   Version: 10.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hans.p.erickson at gmail dot com
  Target Milestone: ---

Created attachment 50923
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50923&action=edit
Preprocessed output

Some template code that I am working on causes an internal compiler error with
the gcc 10 series of compilers. Based on some experimentation with the Compiler
Explorer site at godbolt.org it appears that this affects 10.1, 10.2, and 10.3,
but doesn't affect version 9 compilers or version 11 compilers. The link to the
code at godbolt.org is https://godbolt.org/z/svWW5Tq4r . The pre-processor
output that I've attached is from 10.2.

The attached output is from a compiler with the following properties:
  gcc version 10.2.0 (GCC)
  Target: x86_64-pc-linux-gnu
  Configured with: ./configure --prefix=/home/hans/.local --enable-multilib

[Bug c++/100102] [9/10/11/12 Regression] ICE in tsubst, at cp/pt.c:15310

2021-06-03 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

--- Comment #19 from Patrick Palka  ---
Candidate patch that addresses the ICE in the original testcase and reductions:
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571899.html

[PATCH] c++: tsubst_function_decl and excess arg levels [PR100102]

2021-06-03 Thread Patrick Palka via Gcc-patches
Here, when instantiating the dependent alias template
duration::__is_harmonic with args={{T,U},{int}}, we find ourselves
substituting the function decl _S_gcd.  Since we have more arg levels
than _S_gcd has parm levels, an old special case in tsubst_function_decl
causes us to unwantedly reduce args to its innermost level, yielding
args={int}, which leads to a nonsensical substitution into the decl's
context and an eventual crash.
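
As a toy illustration of why that reduction is harmful, the sketch below models template argument levels as nested vectors. This is not GCC's TREE_VEC representation; Level, Args, example_args and innermost_levels are made-up names, with innermost_levels playing the role of get_innermost_template_args.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Level = std::vector<std::string>;   // one level of template arguments
using Args  = std::vector<Level>;         // outermost level first

// The argument set from the description above: args={{T,U},{int}}.
Args example_args()
{
    return Args{Level{"T", "U"}, Level{"int"}};
}

// Toy analogue of get_innermost_template_args: keep only the last N levels.
Args innermost_levels(const Args &args, std::size_t n)
{
    assert(n <= args.size());
    return Args(args.end() - static_cast<std::ptrdiff_t>(n), args.end());
}
```

Reducing example_args() to one level leaves only {int}, so the {T,U} level needed to substitute into the enclosing class context is gone.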

The comment for this special case refers to three examples for which we
ought to see more arg levels than parm levels here, but none of the
examples actually demonstrate this.  In the first example, when
defining S::f(U) parms_depth is 2 and args_depth is 1, and
later when instantiating say S::f both depths are 2.  In the
second example, when substituting the template friend declaration
parms_depth is 2 and args_depth is 1, and later when instantiating f
both depths are 1.  Finally, the third example is invalid since we can't
specialize a member template of an unspecialized class template like
that.

Given that this reduction code seems no longer relevant for its
documented purpose and that it causes problems as in the PR, this patch
just removes it.  Note that as far as bootstrap/regtest is concerned,
this code is dead; the below two tests would be the first to trigger the
removed code.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps backports?  Also tested on various other libraries,
e.g. range-v3 and cmcstl2.

PR c++/100102

gcc/cp/ChangeLog:

* pt.c (tsubst_function_decl): Remove old code for reducing
args when it has excess levels.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-72.C: New test.
* g++.dg/cpp0x/alias-decl-72a.C: New test.
---
 gcc/cp/pt.c | 39 -
 gcc/testsuite/g++.dg/cpp0x/alias-decl-72.C  |  9 +
 gcc/testsuite/g++.dg/cpp0x/alias-decl-72a.C |  9 +
 3 files changed, 18 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-72.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-72a.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 3cac073ed50..a6acdf864d1 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13909,45 +13909,6 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t complain,
  if (tree spec = retrieve_specialization (gen_tmpl, argvec, hash))
return spec;
}
-
-  /* We can see more levels of arguments than parameters if
-there was a specialization of a member template, like
-this:
-
-template  struct S { template  void f(); }
-template <> template  void S::f(U);
-
-Here, we'll be substituting into the specialization,
-because that's where we can find the code we actually
-want to generate, but we'll have enough arguments for
-the most general template.
-
-We also deal with the peculiar case:
-
-template  struct S {
-  template  friend void f();
-};
-template  void f() {}
-template S;
-template void f();
-
-Here, the ARGS for the instantiation of will be {int,
-double}.  But, we only need as many ARGS as there are
-levels of template parameters in CODE_PATTERN.  We are
-careful not to get fooled into reducing the ARGS in
-situations like:
-
-template  struct S { template  void f(U); }
-template  template <> void S::f(int) {}
-
-which we can spot because the pattern will be a
-specialization in this case.  */
-  int args_depth = TMPL_ARGS_DEPTH (args);
-  int parms_depth =
-   TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (DECL_TI_TEMPLATE (t)));
-
-  if (args_depth > parms_depth && !DECL_TEMPLATE_SPECIALIZATION (t))
-   args = get_innermost_template_args (args, parms_depth);
 }
   else
 {
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-72.C b/gcc/testsuite/g++.dg/cpp0x/alias-decl-72.C
new file mode 100644
index 000..8009756dcba
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-72.C
@@ -0,0 +1,9 @@
+// PR c++/100102
+// { dg-do compile { target c++11 } }
+
+template struct ratio;
+template struct duration {
+  static constexpr int _S_gcd();
+  template using __is_harmonic = ratio<_S_gcd>;
+  using type = __is_harmonic;
+};
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-72a.C b/gcc/testsuite/g++.dg/cpp0x/alias-decl-72a.C
new file mode 100644
index 000..a4443e18f9d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-72a.C
@@ -0,0 +1,9 @@
+// PR c++/100102
+// { dg-do compile { target c++11 } }
+
+template struct ratio;
+template struct duration {
+  static constexpr int _S_gcd();
+  template using __is_harmonic = ratio<(duration::_S_gcd)()>;
+  using type = __is_harmonic;
+};
-- 
2.32.0.rc2



[Bug c/100898] New: ICE with -O2: in gimple_call_arg_ptr, at gimple.h:3264

2021-06-03 Thread cnsun at uwaterloo dot ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100898

Bug ID: 100898
   Summary: ICE with -O2: in gimple_call_arg_ptr, at gimple.h:3264
   Product: gcc
   Version: tree-ssa
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: cnsun at uwaterloo dot ca
  Target Milestone: ---

$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/scratch/software/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /tmp/tmp.6GNjkccaVE-gcc-builder/gcc/configure
--enable-languages=c,c++,lto --enable-checking-yes --enable-multiarch
--prefix=/scratch/software/gcc-trunk --disable-bootstrap
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.0.0 20210603 (experimental) [master revision
:fe23f36c0:9663c744e2d0942f14eafa725a1bd9f766f02a16] (GCC)

$ cat mutant.c
a;
fn1() {
  for (; a;)
return bar(__builtin_va_arg_pack());
}
main() { fn1(); }

$ gcc-trunk -O2 mutant.c
mutant.c:1:1: warning: data definition has no type or storage class
1 | a;
  | ^
mutant.c:1:1: warning: type defaults to ‘int’ in declaration of ‘a’
[-Wimplicit-int]
mutant.c:2:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
2 | fn1() {
  | ^~~
mutant.c: In function ‘fn1’:
mutant.c:4:12: warning: implicit declaration of function ‘bar’
[-Wimplicit-function-declaration]
4 | return bar(__builtin_va_arg_pack());
  |^~~
mutant.c: At top level:
mutant.c:6:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
6 | main() { fn1(); }
  | ^~~~
during IPA pass: inline
In function ‘fn1’:
cc1: internal compiler error: in gimple_call_arg_ptr, at gimple.h:3264
0x758b10 gimple_call_arg_ptr
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/gimple.h:3264
0x758b10 gimple_call_arg_ptr
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/gimple.h:3262
0x758b10 copy_bb
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/tree-inline.c:2085
0xf93522 copy_cfg_body
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/tree-inline.c:3022
0xf93522 copy_body
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/tree-inline.c:3278
0xf96a56 expand_call_inline
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/tree-inline.c:5105
0xf98109 gimple_expand_calls_inline
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/tree-inline.c:5300
0xf98109 optimize_inline_calls(tree_node*)
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/tree-inline.c:5473
0xcbf76b inline_transform(cgraph_node*)
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/ipa-inline-transform.c:790
0xe19e04 execute_one_ipa_transform_pass
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/passes.c:2290
0xe19e04 execute_all_ipa_transforms(bool)
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/passes.c:2337
0xa874a9 cgraph_node::expand()
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/cgraphunit.c:1821
0xa888cf expand_all_functions
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/cgraphunit.c:1992
0xa888cf symbol_table::compile()
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/cgraphunit.c:2356
0xa8b7cb symbol_table::compile()
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/cgraphunit.c:2269
0xa8b7cb symbol_table::finalize_compilation_unit()
/tmp/tmp.6GNjkccaVE-gcc-builder/gcc/gcc/cgraphunit.c:2537
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

Re: [RFC/PATCH 00/11] Fix up some unexpected empty split conditions

2021-06-03 Thread Kewen.Lin via Gcc-patches
on 2021/6/3 4:05 PM, Richard Sandiford wrote:
> "Kewen.Lin"  writes:
>> Hi Richi/Richard/Jeff/Segher,
>>
>> Thanks for the comments!
>>
>> on 2021/6/3 7:52 AM, Segher Boessenkool wrote:
>>> On Wed, Jun 02, 2021 at 06:32:13PM +0100, Richard Sandiford wrote:
>>>> Richard Biener writes:
>>>>> So what Richard suggests would be to disallow split conditions
>>>>> that do not start with "&& ", it's probably easy to do that as well
>>>>> and look for build fails.  That should catch all cases to look at.
>>>>
>>>> Yeah.  As a strawman proposal, how about:
>>>>
>>>> - add a new "define_independent_insn_and_split" that has the
>>>>   current semantics of define_insn_and_split.  This should be
>>>>   mechanical.
>>>
>>> I'd rather not have that -- we can just write separate define_insn and
>>> define_split in that case.
>>>
>>
>> Not sure if someone would argue that he/she would like to go with one shared
>> pattern as before, to avoid any possible differences between two separated
>> patterns and have good maintainability (like only editing one place) and
>> slightly better efficiency.
> 
> Right.  Plus it creates less make-work.  If we didn't have it, someone
> would need to split the define_insn_and_splits that don't currently
> use "&&", then later someone might decide that the missing "&&" was a
> mistake and need to put them together again (or just revert the patch
> and edit from there, I guess).
> 
> Plus define_independent_insn_and_split would act as a flag for something
> that might be suspect.  If we split them then the define_split condition
> will seem to have been chosen deliberately in isolation.
> 
>>> How many such cases *are* there?  There are no users exposed to this,
>>> and when the split condition is required to start with "&&" (instead of
>>> getting that implied) it is not a silent change ever, either.
>>>
>>
>> If I read the proposal right, the explicit "&&" is only required when going
>> to find all potential problematic places for final implied "&&" change.
>> But one explicit "&&" does offer good readability.
> 
> I don't know.  "&& 1" looks kind of weird to me.
> 
> One thing I'd been wondering about a while ago was whether we should key
> the split part of define_insn_and_splits off the insn code, instead of
> repeating the pattern match and insn C condition.  That would make the
> split apply only to the associated define_insns, whereas at the moment
> they also apply to any earlier (less general) define_insn that happens
> to match the pattern and the C conditions.  It would also reduce the
> complexity of the autogenerated define_split logic.
> 
> I don't know whether that's a good idea or not.  But having an explicit
> "&&" implies that the generator shouldn't do that, and that it should
> retest the insn condition from scratch.
> 
>>>> - find the define_insn_and_splits that are missing the "&&", and where
>>>>   missing the "&&" might make a difference.  Change them to
>>>>   define_independent_insn_and_splits.
>>>>
>>>>   Like Richard says, this can be done by temporarily disallowing
>>>>   define_insn_and_splits that have no "&&".
>>>
>>> If we make that change permanently, that is all steps we ever need!
>>>
>>
>> So the question is whether we need to demand an explicit "&&".
>> Richard's proposal answers "no", which aligns with Richi's earlier
>> auto-filling advice.  I think it would result in fewer changes since
>> those places without explicit "&&" are mostly unintentional, and all the
>> work is done by the implied "&&".  Its downside seems to be worse
>> readability: new readers may take it as two separate conditions at first
>> glance, but I guess if we emphasize this change in the documentation it
>> would be fine?  Or emit a warning if an explicit "&&" is missing?
> 
> IMO the natural way to read it is that the split C condition gives the
> conditions under which the instruction should be split.  I think that's
> why forgetting the "&&" is such a common mistake.  (I know I've done it
> plenty of times.)
> 
> IMO requiring the "&&" is baking in an alternative, less intuitive,
> interpretation.
> 

Thanks for the explanation.  I was thinking that people may have got used
to the leading "&&" in split conditions, so it is easy for them to read.
But I agree that the natural reading is better served by not requiring it.
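
For anyone following along, the current behaviour under discussion can be modelled in a few lines. This is a simplified sketch, not the gensupport code: effective_split_cond is a made-up name, and the string joining only approximates what join_c_conditions produces.

```cpp
#include <cassert>
#include <string>

// Simplified model of how the split condition of a define_insn_and_split
// is formed today: a leading "&&" means "insn condition AND the rest",
// anything else is used as the split condition on its own.
std::string effective_split_cond(const std::string &insn_cond,
                                 const std::string &split_cond)
{
    if (split_cond.compare(0, 2, "&&") != 0)
        return split_cond;                 // insn condition is NOT implied

    std::string rest = split_cond.substr(2);
    rest.erase(0, rest.find_first_not_of(" \t"));   // drop leading blanks
    if (insn_cond.empty())
        return rest;
    if (rest.empty())
        return insn_cond;
    return "(" + insn_cond + ") && (" + rest + ")";
}
```

With an insn condition of TARGET_FOO (a placeholder), "&& 1" yields "(TARGET_FOO) && (1)", while an empty split condition yields an empty, always-true condition. That second case is the one the 01/11 patch wants to flag.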

BR,
Kewen


Re: [PATCH 04/11] cris: Update unexpected empty split condition

2021-06-03 Thread Kewen.Lin via Gcc-patches
on 2021/6/4 12:12 AM, Hans-Peter Nilsson wrote:
>> From: Kewen.Lin 
>> Date: Thu, 3 Jun 2021 07:45:57 +0200
> 
>> on 2021/6/2 Hans-Peter Nilsson wrote:
>>>> From: Kewen Lin
>>>> Date: Wed, 2 Jun 2021 07:04:54 +0200
>>>
>>>> gcc/ChangeLog:
>>>>
>>>> * config/cris/cris.md (*addi_reload): Fix empty split condition.
> 
>>>> -  ""
>>>> +  "&& 1"
> 
>>> Ok, thanks, if only for all-round consistency.
>>>
>>> In preparation for a warning for an empty condition?  I'm
>>> usually all for .md-warnings, but I'm not sure about the
>>> benefit of that one, though.  Those "&& 1" look...hackish.
>>
>> Thanks!  Yeah, the 01/11 patch aims to raise one error message
>> for the define_insn_and_split whose split condition is empty
>> while insn condition isn't.  In most cases, when we write one
>> define_insn_and_split we want the splitting only to take effect
>> while we see the define_insn matching happen (insn cond holds),
>> but if we leave the split condition empty, the splitting will
>> be done always, it could result in some unexpected consequence.
>> Mostly this is unintentional.
> 
> It certainly was in the patch above!
> 
>>  The error message is there to keep
>> people from doing that unintentionally.
>>
>> As you may have seen from the discussion under the 00/11 thread,
>> we will probably end up with some other solution, so I will hold
>> the changes for the ports, sorry for wasting your time and the
>> other port maintainers'.
> 
> No worries: I certainly don't consider it wasted and I'd
> prefer to have the patch above committed sooner than the
> conclusion of that discussion.  (If you don't get to it,
> I'll do it, after a round of testing.)
> 

Thanks for your help on testing!

> If you're considering further target patches to adjust for
> eventually changed semantics in the define_insn_and_split
> split-condition, then whatever trivial patch to cris.md that
> gets the effect of the one you sent is preapproved.
> 

OK, thanks again!

BR,
Kewen


Re: [RFC/PATCH 00/11] Fix up some unexpected empty split conditions

2021-06-03 Thread Kewen.Lin via Gcc-patches
Hi Segher,

on 2021/6/3 5:18 PM, Segher Boessenkool wrote:
> On Thu, Jun 03, 2021 at 03:00:44AM -0500, Segher Boessenkool wrote:
>> On Thu, Jun 03, 2021 at 01:22:38PM +0800, Kewen.Lin wrote:
>> The whole point of requiring the split condition to start with && is so
>> it will become harder to mess things up (it will make the gen* code a
>> tiny little bit simpler as well).  And there is no transition period or
>> anything like that needed either.  Just the bunch that will break will
>> need fixing.  So let's find out how many of those there are :-)
>>

Finding the ones that need fixing seems to be the critical part.  It's
not hard to add an explicit "&&" to those that don't have it now, but
even after further bootstrapping and regression testing I'm still not
confident that the adjustments are safe enough, since the testing
coverage could be limited.  It may need more effort to revisit them,
and/or testing with more coverage, plus port maintainers' reviews.

In order to find an example that needs more fixing, for rs6000/i386/
aarch64 I fixed all define_insn_and_splits whose insn condition isn't
empty (from gensupport's view, since iterators can add more to it) while
the split condition doesn't start with "&&", and skipped those whose
insn conditions are the same as their split conditions.  Unfortunately
(or fortunately :-\) all of them bootstrapped and passed regression testing.
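
The selection rule I used can be summarised as a small predicate. This is a sketch only: needs_explicit_and is a made-up name, and the conditions are treated as plain strings after iterator expansion.

```cpp
#include <cassert>
#include <string>

// Returns true for the define_insn_and_splits that were adjusted:
// non-empty insn condition, split condition without a leading "&&",
// and the two conditions not simply identical.
bool needs_explicit_and(const std::string &insn_cond,
                        const std::string &split_cond)
{
    if (insn_cond.empty())
        return false;                      // nothing would be implied anyway
    if (split_cond.compare(0, 2, "&&") == 0)
        return false;                      // already explicit
    if (split_cond == insn_cond)
        return false;                      // old-style "repeat the insn cond"
    return true;
}
```

An empty split condition combined with a non-empty insn condition counts as needing the fix, which matches the cases the 01/11 patch complains about.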

The related diffs are attached; they are based on r12-0.

>>>> How many such cases *are* there?  There are no users exposed to this,
>>>> and when the split condition is required to start with "&&" (instead of
>>>> getting that implied) it is not a silent change ever, either.
>>>
>>> If I read the proposal right, the explicit "&&" is only required when going
>>> to find all potential problematic places for final implied "&&" change.
>>> But one explicit "&&" does offer good readability.
>>
>> My proposal is to always require && (or maybe identical insn and split
>> conditions should be allowed as well, if people still use that -- that
>> is how we wrote "&& 1" before that existed).
> 
> I prototyped this.  There are very many errors.  Iterators often modify
> the insn condition (for one iteration of it), but that does not work if
> the split condition does not start with "&&"!
> 
> See attached prototype.
> 
> 

Thanks for the prototype!

BR,
Kewen

> Segher
> 
> = = =
> 
> diff --git a/gcc/gensupport.c b/gcc/gensupport.c
> index 2cb760ffb90f..05d46fd3775c 100644
> --- a/gcc/gensupport.c
> +++ b/gcc/gensupport.c
> @@ -590,7 +590,6 @@ process_rtx (rtx desc, file_location loc)
>  case DEFINE_INSN_AND_SPLIT:
>  case DEFINE_INSN_AND_REWRITE:
>{
> - const char *split_cond;
>   rtx split;
>   rtvec attr;
>   int i;
> @@ -611,15 +610,20 @@ process_rtx (rtx desc, file_location loc)
>  
>   /* If the split condition starts with "&&", append it to the
>  insn condition to create the new split condition.  */
> - split_cond = XSTR (desc, 4);
> - if (split_cond[0] == '&' && split_cond[1] == '&')
> + const char *insn_cond = XSTR (desc, 2);
> + const char *split_cond = XSTR (desc, 4);
> + if (!strncmp (split_cond, "&&", 2))
> {
>   rtx_reader_ptr->copy_md_ptr_loc (split_cond + 2, split_cond);
> - split_cond = rtx_reader_ptr->join_c_conditions (XSTR (desc, 2),
> + split_cond = rtx_reader_ptr->join_c_conditions (insn_cond,
>   split_cond + 2);
> +   } else if (insn_cond[0]) {
> + if (GET_CODE (desc) == DEFINE_INSN_AND_REWRITE)
> +   error_at (loc, "the rewrite condition must start with `&&'");
> + else
> +   error_at (loc, "the split condition must start with `&&' [%s]", insn_cond);
> }
> - else if (GET_CODE (desc) == DEFINE_INSN_AND_REWRITE)
> -   error_at (loc, "the rewrite condition must start with `&&'");
> +
>   XSTR (split, 1) = split_cond;
>   if (GET_CODE (desc) == DEFINE_INSN_AND_REWRITE)
> XVEC (split, 2) = gen_rewrite_sequence (XVEC (desc, 1));
> 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index abfd845..86869d9 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1283,7 +1283,7 @@ (define_insn_and_split "*movsi_aarch64"
fmov\\t%w0, %s1
fmov\\t%s0, %s1
* return aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);"
-  "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), SImode)
+  "&& CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), SImode)
 && REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
[(const_int 0)]
"{
@@ -1319,7 +1319,7 @@ (define_insn_and_split "*movdi_aarch64"
fmov\\t%x0, %d1
fmov\\t%d0, %d1
* return aarch64_output_scalar_simd_mov_immediate (operands[1], DImode);"
-   "(CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), DImode))
+   "&& (CONST_INT_P (operands[1]) && !aarch64_move_imm 

Re: [PATCH 2/2] Fix _mm256_zeroupper by representing the instructions as call_insns in which the call has a special vzeroupper ABI.

2021-06-03 Thread Hongtao Liu via Gcc-patches
Ping

This is the split-out backend patch, a follow-up to
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571545.html

On Thu, Jun 3, 2021 at 2:55 PM liuhongt via Gcc-patches
 wrote:
>
> When __builtin_ia32_vzeroupper is called explicitly, the corresponding
> vzeroupper pattern does not carry any CLOBBERS or SETs before LRA,
> which leads to incorrect optimization in pass_reload. In order to
> solve this problem, this patch represents the instructions as call_insns
> in which the call has a special vzeroupper ABI.
>
> gcc/ChangeLog:
>
> PR target/82735
> * config/i386/i386-expand.c (ix86_expand_builtin): Remove
> assignment of cfun->machine->has_explicit_vzeroupper.
> * config/i386/i386-features.c
> (ix86_add_reg_usage_to_vzerouppers): Delete.
> (ix86_add_reg_usage_to_vzeroupper): Ditto.
> (rest_of_handle_insert_vzeroupper): Remove
> ix86_add_reg_usage_to_vzerouppers, add df_analyze at the end
> of the function.
> (gate): Remove cfun->machine->has_explicit_vzeroupper.
> * config/i386/i386-protos.h (ix86_expand_avx_vzeroupper):
> Declared.
> * config/i386/i386.c (ix86_insn_callee_abi): New function.
> (ix86_initialize_callee_abi): Ditto.
> (ix86_expand_avx_vzeroupper): Ditto.
> (ix86_hard_regno_call_part_clobbered): Adjust for vzeroupper
> ABI.
> (TARGET_INSN_CALLEE_ABI): Define as ix86_insn_callee_abi.
> (ix86_emit_mode_set): Call ix86_expand_avx_vzeroupper
> directly.
> * config/i386/i386.h (struct GTY(()) machine_function): Delete
> has_explicit_vzeroupper.
> * config/i386/i386.md (enum unspec): New member
> UNSPEC_CALLEE_ABI.
> (I386_DEFAULT,I386_VZEROUPPER,I386_UNKNOWN): New
> define_constants for insn callee abi index.
> * config/i386/predicates.md (vzeroupper_pattern): Adjust.
> * config/i386/sse.md (UNSPECV_VZEROUPPER): Deleted.
> (avx_vzeroupper): Call ix86_expand_avx_vzeroupper.
> (*avx_vzeroupper): Rename to ..
> (avx_vzeroupper_callee_abi): .. this, and adjust pattern as
> call_insn which has a special vzeroupper ABI.
> (*avx_vzeroupper_1): Deleted.
>
> gcc/testsuite/ChangeLog:
>
> PR target/82735
> * gcc.target/i386/pr82735-1.c: New test.
> * gcc.target/i386/pr82735-2.c: New test.
> * gcc.target/i386/pr82735-3.c: New test.
> * gcc.target/i386/pr82735-4.c: New test.
> * gcc.target/i386/pr82735-5.c: New test.
> ---
>  gcc/config/i386/i386-expand.c |  4 -
>  gcc/config/i386/i386-features.c   | 99 +++
>  gcc/config/i386/i386-protos.h |  1 +
>  gcc/config/i386/i386.c| 55 -
>  gcc/config/i386/i386.h|  4 -
>  gcc/config/i386/i386.md   | 10 +++
>  gcc/config/i386/predicates.md |  5 +-
>  gcc/config/i386/sse.md| 59 --
>  gcc/testsuite/gcc.target/i386/pr82735-1.c | 29 +++
>  gcc/testsuite/gcc.target/i386/pr82735-2.c | 22 +
>  gcc/testsuite/gcc.target/i386/pr82735-3.c |  5 ++
>  gcc/testsuite/gcc.target/i386/pr82735-4.c | 48 +++
>  gcc/testsuite/gcc.target/i386/pr82735-5.c | 54 +
>  13 files changed, 252 insertions(+), 143 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr82735-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr82735-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr82735-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr82735-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr82735-5.c
>
> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> index 9f3d41955a2..d25d59aa4e7 100644
> --- a/gcc/config/i386/i386-expand.c
> +++ b/gcc/config/i386/i386-expand.c
> @@ -13282,10 +13282,6 @@ rdseed_step:
>
>return 0;
>
> -case IX86_BUILTIN_VZEROUPPER:
> -  cfun->machine->has_explicit_vzeroupper = true;
> -  break;
> -
>  default:
>break;
>  }
> diff --git a/gcc/config/i386/i386-features.c b/gcc/config/i386/i386-features.c
> index 77783a154b6..a25769ae478 100644
> --- a/gcc/config/i386/i386-features.c
> +++ b/gcc/config/i386/i386-features.c
> @@ -1768,92 +1768,22 @@ convert_scalars_to_vector (bool timode_p)
>return 0;
>  }
>
> -/* Modify the vzeroupper pattern in INSN so that it describes the effect
> -   that the instruction has on the SSE registers.  LIVE_REGS are the set
> -   of registers that are live across the instruction.
> -
> -   For a live register R we use:
> -
> - (set (reg:V2DF R) (reg:V2DF R))
> -
> -   which preserves the low 128 bits but clobbers the upper bits.  */
> -
> -static void
> -ix86_add_reg_usage_to_vzeroupper (rtx_insn *insn, bitmap live_regs)
> -{
> -  rtx pattern = PATTERN (insn);
> -  unsigned int nregs = TARGET_64BIT ? 16 : 8;
> -  unsigned int npats = nregs;
> -  

Re: [PATCH 1/2] CALL_INSN may not be a real function call.

2021-06-03 Thread Hongtao Liu via Gcc-patches
Ping,
This is the split-out middle-end patch, a follow-up to
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571544.html

On Thu, Jun 3, 2021 at 2:54 PM liuhongt via Gcc-patches
 wrote:
>
> Use the "used" flag on a CALL_INSN to indicate that it is a fake call.
> A fake call won't have its own function stack.
>
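
As a rough model of what the fake-call flag changes for leaf-function detection (compare the final.c hunk in the quoted patch), consider the sketch below. The Insn struct and its fields are made up for illustration; the real leaf_function_p walks the RTL insn chain and also looks inside SEQUENCEs, which this toy version ignores.

```cpp
#include <cassert>
#include <vector>

// Made-up stand-in for an RTL insn; only the three properties that the
// patched check looks at are modelled.
struct Insn {
    bool is_call;          // CALL_P
    bool is_sibling_call;  // SIBLING_CALL_P
    bool is_fake_call;     // FAKE_CALL_P (the new flag)
};

// A function stays a leaf unless it contains a call that is neither a
// sibling call nor a fake call.
bool leaf_function_p(const std::vector<Insn> &insns)
{
    for (const Insn &insn : insns)
        if (insn.is_call && !insn.is_sibling_call && !insn.is_fake_call)
            return false;
    return true;
}
```

Under this model a function whose only call insn is a fake call (such as vzeroupper represented as a call_insn) is still treated as a leaf.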
> gcc/ChangeLog
>
> PR target/82735
> * df-scan.c (df_get_call_refs): When call_insn is a fake call,
> it won't use stack pointer reg.
> * final.c (leaf_function_p): When call_insn is a fake call, it
> won't affect caller as a leaf function.
> * reg-stack.c (callee_clobbers_any_stack_reg): New.
> (subst_stack_regs): When call_insn doesn't clobber any stack
> reg, don't clear the arguments.
> * rtl.c (shallow_copy_rtx): Don't clear flag used when orig is
> a insn.
> * shrink-wrap.c (requires_stack_frame_p): No need for stack
> frame for a fake call.
> * rtl.h (FAKE_CALL_P): New macro.
> ---
>  gcc/df-scan.c |  3 ++-
>  gcc/final.c   |  3 ++-
>  gcc/reg-stack.c   | 18 +-
>  gcc/rtl.c |  6 --
>  gcc/rtl.h |  5 +
>  gcc/shrink-wrap.c |  2 +-
>  6 files changed, 31 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/df-scan.c b/gcc/df-scan.c
> index 6691c3e8357..1268536b3f0 100644
> --- a/gcc/df-scan.c
> +++ b/gcc/df-scan.c
> @@ -3090,7 +3090,8 @@ df_get_call_refs (class df_collection_rec *collection_rec,
>
>for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
>  {
> -  if (i == STACK_POINTER_REGNUM)
> +  if (i == STACK_POINTER_REGNUM
> + && !FAKE_CALL_P (insn_info->insn))
> /* The stack ptr is used (honorarily) by a CALL insn.  */
> df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
>NULL, bb, insn_info, DF_REF_REG_USE,
> diff --git a/gcc/final.c b/gcc/final.c
> index e0a70fcd830..817f7722cb2 100644
> --- a/gcc/final.c
> +++ b/gcc/final.c
> @@ -4109,7 +4109,8 @@ leaf_function_p (void)
>for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
>  {
>if (CALL_P (insn)
> - && ! SIBLING_CALL_P (insn))
> + && ! SIBLING_CALL_P (insn)
> + && ! FAKE_CALL_P (insn))
> return 0;
>if (NONJUMP_INSN_P (insn)
>   && GET_CODE (PATTERN (insn)) == SEQUENCE
> diff --git a/gcc/reg-stack.c b/gcc/reg-stack.c
> index 25210f0c17f..1d9ea035cf4 100644
> --- a/gcc/reg-stack.c
> +++ b/gcc/reg-stack.c
> @@ -174,6 +174,7 @@
>  #include "reload.h"
>  #include "tree-pass.h"
>  #include "rtl-iter.h"
> +#include "function-abi.h"
>
>  #ifdef STACK_REGS
>
> @@ -2368,6 +2369,18 @@ subst_asm_stack_regs (rtx_insn *insn, stack_ptr regstack)
> }
>}
>  }
> +
> +/* Return true if a function call is allowed to alter some or all bits
> +   of any stack reg.  */
> +static bool
> +callee_clobbers_any_stack_reg (const function_abi & callee_abi)
> +{
> +  for (unsigned regno = FIRST_STACK_REG; regno <= LAST_STACK_REG; regno++)
> +if (callee_abi.clobbers_at_least_part_of_reg_p (regno))
> +  return true;
> +  return false;
> +}
> +
>
>  /* Substitute stack hard reg numbers for stack virtual registers in
> INSN.  Non-stack register numbers are not changed.  REGSTACK is the
> @@ -2382,7 +2395,10 @@ subst_stack_regs (rtx_insn *insn, stack_ptr regstack)
>bool control_flow_insn_deleted = false;
>int i;
>
> -  if (CALL_P (insn))
> +  /* If the target of the call doesn't clobber any stack registers,
> + Don't clear the arguments.  */
> +  if (CALL_P (insn)
> +  && callee_clobbers_any_stack_reg (insn_callee_abi (insn)))
>  {
>int top = regstack->top;
>
> diff --git a/gcc/rtl.c b/gcc/rtl.c
> index b0ba1ff684c..aaee882f5ca 100644
> --- a/gcc/rtl.c
> +++ b/gcc/rtl.c
> @@ -395,8 +395,10 @@ shallow_copy_rtx (const_rtx orig MEM_STAT_DECL)
>  case SCRATCH:
>break;
>  default:
> -  /* For all other RTXes clear the used flag on the copy.  */
> -  RTX_FLAG (copy, used) = 0;
> +  /* For all other RTXes clear the used flag on the copy.
> +CALL_INSN use "used" flag to indicate it's a fake call.  */
> +  if (!INSN_P (orig))
> +   RTX_FLAG (copy, used) = 0;
>break;
>  }
>return copy;
> diff --git a/gcc/rtl.h b/gcc/rtl.h
> index 35178b5bfac..5ed0d6dd6fa 100644
> --- a/gcc/rtl.h
> +++ b/gcc/rtl.h
> @@ -839,6 +839,11 @@ struct GTY(()) rtvec_def {
>  /* Predicate yielding nonzero iff X is a call insn.  */
>  #define CALL_P(X) (GET_CODE (X) == CALL_INSN)
>
> +/* 1 if RTX is a call_insn for a fake call.
> +   CALL_INSN use "used" flag to indicate it's a fake call.  */
> +#define FAKE_CALL_P(RTX)\
> +  (RTL_FLAG_CHECK1 ("FAKE_CALL_P", (RTX), CALL_INSN)->used)
> +
>  /* Predicate yielding nonzero iff X is an insn that cannot jump.  */
>  #define NONJUMP_INSN_P(X) (GET_CODE (X) == INSN)
>
> diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c
> 
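
For readers following the leaf_function_p hunk above, here is a minimal sketch of the idea in plain C. It is only a toy model: struct toy_insn and toy_leaf_function_p are hypothetical names invented for illustration, not GCC's real rtx layout or API. The point is that a call insn flagged as fake no longer disqualifies the function from being a leaf.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an RTL insn.  The field names are illustrative
   assumptions and do not mirror GCC's real rtx representation.  */
struct toy_insn {
  bool is_call;      /* models CALL_P */
  bool is_sibcall;   /* models SIBLING_CALL_P */
  bool is_fake_call; /* models the proposed FAKE_CALL_P */
};

/* Return true if no insn is a real (non-sibling, non-fake) call,
   following the shape of the patched loop in leaf_function_p.  */
bool toy_leaf_function_p(const struct toy_insn *insns, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (insns[i].is_call && !insns[i].is_sibcall && !insns[i].is_fake_call)
      return false;
  return true;
}
```

With this model, a function whose only call insn is marked fake still counts as a leaf, while one ordinary call is enough to make it a non-leaf.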

RE: [PATCH] Canonicalize (vec_duplicate (not A)) to (not (vec_duplicate A)).

2021-06-03 Thread Liu, Hongtao via Gcc-patches



>-Original Message-
>From: Segher Boessenkool 
>Sent: Friday, June 4, 2021 4:00 AM
>To: Liu, Hongtao 
>Cc: Richard Biener ; GCC Patches patc...@gcc.gnu.org>
>Subject: Re: [PATCH] Canonicalize (vec_duplicate (not A)) to (not
>(vec_duplicate A)).
>
>On Thu, Jun 03, 2021 at 11:03:43AM +, Liu, Hongtao wrote:
>> >A very typical example is how UMIN is optimised:
>> >
>> >   case UMIN:
>> >  if (trueop1 == CONST0_RTX (mode) && ! side_effects_p (op0))
>> >return op1;
>> >  if (rtx_equal_p (trueop0, trueop1) && ! side_effects_p (op0))
>> >return op0;
>> >  tem = simplify_associative_operation (code, mode, op0, op1);
>> >  if (tem)
>> >return tem;
>> >  break;
>> >
>> >(the stuff using "tem").
>> >
>> >Hongtao, can we do something similar here?  Does that work well?
>> >Please try it out :-)
>>
>> In simplify_rtx, no simplification occurs, there is just the difference
>> between (vec_duplicate (not REG)) and (not (vec_duplicate (REG))). So here
>> tem will only be 0.
>
>simplify-rtx is used by combine.  When you do and+not+splat for example my
>suggestion should kick in.  Try it out, don't just dismiss it?
>
Forgive my obtuseness, do you mean trying the following change? If so, nothing
will "kick in": temp will be 0, because there is no simplification here, just
the difference between (vec_duplicate (not REG)) and
(not (vec_duplicate (REG))). Or maybe you mean something else?

@@ -1708,6 +1708,17 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
 #endif
   break;

+  /* Canonicalize (vec_duplicate (not A)) to (not (vec_duplicate A)).  */
+case VEC_DUPLICATE:
+  if (GET_CODE (op) == NOT)
+   {
+ rtx vec_dup = gen_rtx_VEC_DUPLICATE (mode, XEXP (op, 0));
+ temp = simplify_unary_operation (NOT, mode, vec_dup, GET_MODE (op));
+ if (temp)
+   return temp;
+   }
+  break;
+
>> Basically we don't know it's a simplification until combine successfully
>> splits the 3->2 instructions (not + broadcast + and to andnot + broadcast),
>> but it's pretty awkward to do this in combine.
>
>But you need to do this *before* it is split.  That is the whole point.
>
>> Considering that andnot exists for many backends, I think a canonicalization
>> is needed here.
>
>Please do note that that is not as easy as you may think: you need to make
>sure nothing ever creates non-canonical code.
>
>> Maybe we can add insn canonicalization for transforming (and
>> (vec_duplicate (not A)) B) to (and (not (vec_duplicate A)) B) instead of
>> (vec_duplicate (not A)) to (not (vec_duplicate A))?
>
>I don't understand what this means?
I mean let's give andnot a last shot in the AND case, like below:

@@ -3702,6 +3702,16 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
   tem = simplify_associative_operation (code, mode, op0, op1);
   if (tem)
return tem;
+
+  if (GET_CODE (op0) == VEC_DUPLICATE
+ && GET_CODE (XEXP (op0, 0)) == NOT)
+   {
+ rtx vec_dup = gen_rtx_VEC_DUPLICATE (GET_MODE (op0),
+  XEXP (XEXP (op0, 0), 0));
+ return simplify_gen_binary (AND, mode,
+ gen_rtx_NOT (mode, vec_dup),
+ op1);
+   }
   break;
>
>
>Segher
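
To make the equivalence under discussion concrete, here is a small sketch in plain C. It is an illustration only: vectors are modelled as 4-element arrays, and splat, and_dup_not, andnot_dup and forms_agree are hypothetical helper names, not anything in GCC. It checks lane by lane that (and (vec_duplicate (not A)) B) and (and (not (vec_duplicate A)) B) compute the same value, which is why the second form can be matched by an andnot pattern.

```c
#include <stdbool.h>
#include <stdint.h>

#define LANES 4

/* Model vec_duplicate: broadcast one scalar into every lane.  */
static void splat(uint32_t x, uint32_t out[LANES])
{
  for (int i = 0; i < LANES; i++)
    out[i] = x;
}

/* (and (vec_duplicate (not a)) b): NOT applied before the broadcast.  */
void and_dup_not(uint32_t a, const uint32_t b[LANES], uint32_t out[LANES])
{
  uint32_t v[LANES];
  splat(~a, v);
  for (int i = 0; i < LANES; i++)
    out[i] = v[i] & b[i];
}

/* (and (not (vec_duplicate a)) b): NOT applied after the broadcast,
   which is the andnot-friendly shape.  */
void andnot_dup(uint32_t a, const uint32_t b[LANES], uint32_t out[LANES])
{
  uint32_t v[LANES];
  splat(a, v);
  for (int i = 0; i < LANES; i++)
    out[i] = ~v[i] & b[i];
}

/* Return true if both forms produce identical lanes.  */
bool forms_agree(uint32_t a, const uint32_t b[LANES])
{
  uint32_t x[LANES], y[LANES];
  and_dup_not(a, b, x);
  andnot_dup(a, b, y);
  for (int i = 0; i < LANES; i++)
    if (x[i] != y[i])
      return false;
  return true;
}
```

The sketch says nothing about where in the compiler the rewrite should live; it only shows that the two RTL shapes are value-equivalent.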


Re: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-03 Thread Xionghu Luo via Gcc-patches



On 2021/6/4 04:16, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jun 03, 2021 at 08:46:46AM +0800, Xionghu Luo wrote:
>> On 2021/6/3 06:20, Segher Boessenkool wrote:
>>> On Wed, Jun 02, 2021 at 03:19:32AM -0500, Xionghu Luo wrote:
>>>> On P8LE, extra rot64+rot64 load or store instructions are generated
>>>> in float128 to vector __int128 conversion.
>>>>
>>>> This patch teaches pass swaps to also handle such patterns to remove
>>>> extra swap instructions.
>>>
>>> Did you check if this is already handled by simplify-rtx if the mode had
>>> been TImode (not V1TImode)?  If not, why do you not handle it there?
>>
>> I tried to do it in combine or peephole, the later pass split2
>> or split3 will still split it to rotate + rotate again as we have split
>> after reload, and this pattern is quite P8LE specific, so put it in pass
>> swap.  The simplify-rtx could simplify
>> r124:KF#0=r123:KF#0<-<0x40<-<0x40 to r124:KF#0=r123:KF#0 for register
>> operations already.
> 
> What mode are those subregs?  Abbreviated RTL printouts are very lossy.
> Assuming those are TImode (please check), then yes, that is what I
> asked, thanks.

typedef union
{
  __float128 vf1;
  vector __int128 vi128;
  __int128 i128;
} VF_128;


VF_128 vu;

int foo3 ()
{
  __float128 f128 = {3.1415926535897932384626433832795028841971693993751058Q};
  vu.vf1 = f128;
  vector __int128 ret = vu.vi128;
  return ret[0];
}

This case exercises that optimization; the subregs are also V1TImode:

Trying 8 -> 9:
8: r122:KF#0=r123:KF#0<-<0x40
  REG_DEAD r123:KF
9: r124:KF#0=r122:KF#0<-<0x40
  REG_DEAD r122:KF
Successfully matched this instruction:
(set (subreg:V1TI (reg:KF 124) 0)
(rotate:V1TI (rotate:V1TI (subreg:V1TI (reg:KF 123) 0)
(const_int 64 [0x40]))
(const_int 64 [0x40])))
allowing combination of insns 8 and 9
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i3 9: r124:KF#0=r123:KF#0<-<0x40<-<0x40
  REG_DEAD r123:KF
deferring rescan insn with uid = 9.


After checking, it was actually optimized by this pattern from vsx.md
in the split1 pass, not by simplify-rtx.


(define_insn_and_split "*vsx_le_undo_permute_<mode>"
  [(set (match_operand:VSX_TI 0 "vsx_register_operand" "=wa,wa")
(rotate:VSX_TI
 (rotate:VSX_TI
  (match_operand:VSX_TI 1 "vsx_register_operand" "0,wa")
  (const_int 64))
 (const_int 64)))]
  "!BYTES_BIG_ENDIAN && TARGET_VSX"
  "@
   #
   xxlor %x0,%x1"
  "&& 1"
  [(set (match_dup 0) (match_dup 1))]
{
  if (reload_completed && REGNO (operands[0]) == REGNO (operands[1]))
{
  emit_note (NOTE_INSN_DELETED);
  DONE;
}
}
  [(set_attr "length" "0,4")
   (set_attr "type" "veclogical")])
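
As a quick sanity check of why the undo-permute pattern above can reduce to a plain move, here is a small C sketch. It models a 128-bit value as two 64-bit halves; struct u128, rot64 and u128_equal are names made up for this illustration. Rotating by 64 swaps the doublewords, so doing it twice gives back the original value.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 128-bit value split into two doublewords.  */
struct u128 {
  uint64_t hi;
  uint64_t lo;
};

/* Rotating a 128-bit value by 64 bits swaps its two doublewords.  */
struct u128 rot64(struct u128 x)
{
  struct u128 r = { x.lo, x.hi };
  return r;
}

/* Compare two 128-bit values half by half.  */
bool u128_equal(struct u128 a, struct u128 b)
{
  return a.hi == b.hi && a.lo == b.lo;
}
```

For any x, rot64(rot64(x)) equals x, while a single rot64 changes the value whenever the two halves differ.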


> 
>> ;; The post-reload split requires that we re-permute the source
>> ;; register in case it is still live.
>> (define_split
>>[(set (match_operand:VSX_LE_128 0 "memory_operand")
>>  (match_operand:VSX_LE_128 1 "vsx_register_operand"))]
>>"!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR
>>    && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
>>   [(const_int 0)]
>> {
>>   rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
>>   rs6000_emit_le_vsx_permute (operands[0], operands[1], <MODE>mode);
>>   rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
>>DONE;
>> })
> 
> Yes, that needs improvement itself.
> 
> The thing to realise is that TImode is optimised by generic code just
> fine (as all scalar integer modes are), but V1TImode is not.  We have
> that mode because we really needed to not put TImode in vector registers
> so much on older cpus, but that balance may have changed by now.  Worth
> experimenting with, we now can do pretty much all normal operations in
> vector registers!
>
We have two issues stated in PR100085: one is __float128 to vector __int128
(V1TImode), the other is __float128 to __int128 (TImode).  The first one can
be solved by this patch (via the swap optimization pass).  To cover the TImode
case, we should also generate rotate permute instructions in gen_movti for
P8LE, as gen_movkf does in vector.md (the change below is copied exactly from
"mov<mode>"...)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 84820d3b5cb..f35c235a39e 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7385,7 +7385,22 @@ (define_expand "mov<mode>"
(match_operand:INT 1 "any_operand"))]
   ""
 {
-  rs6000_emit_move (operands[0], operands[1], <MODE>mode);
+  /* When generating load/store instructions to/from VSX registers on
+ pre-power9 hardware in little endian mode, we need to emit register
+ permute instructions to byte swap the contents, since the VSX load/store
+ instructions do not include a byte swap as part of their operation.
+ Altivec loads and stores have no such problem, so we skip them below.  */
+  if (!BYTES_BIG_ENDIAN
+  && VECTOR_MEM_VSX_P (<MODE>mode)
+  

[Bug c++/100102] [9/10/11/12 Regression] ICE in tsubst, at cp/pt.c:15310

2021-06-03 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

Patrick Palka  changed:

   What|Removed |Added

 CC||mu11 at yahoo dot com

--- Comment #18 from Patrick Palka  ---
*** Bug 100599 has been marked as a duplicate of this bug. ***

[Bug c++/100599] ICE in tree check: accessed elt 2 of ‘tree_vec’ with 1 elts in tsubst, at cp/pt.c:15649

2021-06-03 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100599

Patrick Palka  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED
 CC||ppalka at gcc dot gnu.org

--- Comment #2 from Patrick Palka  ---
Looks like this is essentially a dup of PR100102

*** This bug has been marked as a duplicate of bug 100102 ***

[Bug tree-optimization/56223] Integer ABS is not recognized for more complicated pattern

2021-06-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56223

--- Comment #4 from Andrew Pinski  ---
Note factor_out_conditional_conversion is really a special case of comment #3.

[Bug tree-optimization/56223] Integer ABS is not recognized for more complicated pattern

2021-06-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56223

--- Comment #3 from Andrew Pinski  ---
There are ways of handling this case.  I think the second one is better than
the first.  The first is mentioned here.
The second would take:


if (a) {
  foo();
  b = d + e;
} else {
  goo();
  b = d - e;
}
and replace it with:
if (a) {
  foo();
  t = e;
} else {
  goo();
  t = -e;
}
b = d + t;
And then because in the original case foo and goo were non-existent, PHI-OPT
would replace the if statement with:
t = abs(e);
b = d + t;
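
The rewrite described above can be sketched in plain C as follows. This is an illustration rather than what the pass actually emits; branchy, factored and with_abs are hypothetical names, and the abs form assumes the condition is equivalent to e >= 0 (and that e is not INT_MIN), which is an assumption about the original testcase.

```c
#include <stdlib.h>

/* Original shape: add in one arm, subtract in the other.  */
int branchy(int a, int d, int e)
{
  int b;
  if (a)
    b = d + e;
  else
    b = d - e;
  return b;
}

/* After factoring the common "d +" out of both arms, only the
   operand t depends on the condition.  */
int factored(int a, int d, int e)
{
  int t = a ? e : -e;
  return d + t;
}

/* Special case: if the condition is (e >= 0), then t is abs (e).  */
int with_abs(int d, int e)
{
  return d + abs(e);
}
```

For any condition a, branchy and factored agree; with_abs matches them only when a is the test e >= 0.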

Re: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-03 Thread Xionghu Luo via Gcc-patches
Hi,


On 2021/6/3 21:09, Bill Schmidt wrote:
> On 6/2/21 7:46 PM, Xionghu Luo wrote:
>> Hi,
>>
>> On 2021/6/3 06:20, Segher Boessenkool wrote:
>>> On Wed, Jun 02, 2021 at 03:19:32AM -0500, Xionghu Luo wrote:
>>>> On P8LE, extra rot64+rot64 load or store instructions are generated
>>>> in float128 to vector __int128 conversion.
>>>>
>>>> This patch teaches pass swaps to also handle such patterns to remove
>>>> extra swap instructions.
>>> Did you check if this is already handled by simplify-rtx if the mode had
>>> been TImode (not V1TImode)?  If not, why do you not handle it there?
>> I tried to do it in combine or peephole, the later pass split2
>> or split3 will still split it to rotate + rotate again as we have split
>> after reload, and this pattern is quite P8LE specific, so put it in pass
>> swap.  The simplify-rtx could simplify
>> r124:KF#0=r123:KF#0<-<0x40<-<0x40 to r124:KF#0=r123:KF#0 for register
>> operations already.
>>
>>
>> vsx.md:
>>
>> ;; The post-reload split requires that we re-permute the source
>> ;; register in case it is still live.
>> (define_split
>>    [(set (match_operand:VSX_LE_128 0 "memory_operand")
>>  (match_operand:VSX_LE_128 1 "vsx_register_operand"))]
>>    "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR
>>     && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
>>    [(const_int 0)]
>> {
>>    rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
>>    rs6000_emit_le_vsx_permute (operands[0], operands[1], <MODE>mode);
>>    rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
>>    DONE;
>> })
> 
> Note also that swap optimization can handle more general cases than 
> simplify-rtx.  In my view it's best to have it covered in both places.
> 

But this pattern applies after reload, much later than the swap optimization
pass, so it can't remove the swap operations as expected.  Below is an example
that matches the above pattern in pass split2; it may not be entirely
appropriate, as there is a function call between the load and the store.

extern vector __int128 foo1 (__float128 a);

int foo2 ()
{
  __binary128 f128 = {3.1415926535897932384626433832795028841971693993751058Q};
  vector __int128 ret = foo1 (f128);
  return ret[0];
}


295r.split (*see insn 35, 36, 37*):

...
Splitting with gen_split_558 (vsx.md:1079)
...

(insn 33 12 34 2 (set (reg/f:DI 9 %r9 [121])
(high:DI (unspec:DI [
(symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
(reg:DI 2 %r2)
] UNSPEC_TOCREL))) "pr100085.c":279:25 715 {*largetoc_high}
 (nil))
(insn 34 33 6 2 (set (reg/f:DI 9 %r9 [121])
(lo_sum:DI (reg/f:DI 9 %r9 [121])
(unspec:DI [
(symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
(reg:DI 2 %r2)
] UNSPEC_TOCREL))) "pr100085.c":279:25 717 {*largetoc_low}
 (expr_list:REG_EQUAL (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
(nil)))
(insn 6 34 8 2 (set (reg:V1TI 66 %v2 [123])
(rotate:V1TI (mem/c:V1TI (reg/f:DI 9 %r9 [121]) [1 f128+0 S16 A128])
(const_int 64 [0x40]))) "pr100085.c":279:25 1113 
{*vsx_le_permute_v1ti}
 (nil))
(insn 8 6 9 2 (set (reg:V1TI 66 %v2)
(rotate:V1TI (reg:V1TI 66 %v2 [123])
(const_int 64 [0x40]))) "pr100085.c":279:25 1113 
{*vsx_le_permute_v1ti}
 (nil))
(call_insn 9 8 32 2 (parallel [
(set (reg:V1TI 66 %v2)
(call (mem:SI (symbol_ref:DI ("foo1") [flags 0x41]  
) [0 foo
1 S4 A8])
(const_int 0 [0])))
(use (const_int 0 [0]))
(clobber (reg:DI 96 lr))
]) "pr100085.c":279:25 735 {*call_value_nonlocal_aixdi}
 (expr_list:REG_CALL_DECL (symbol_ref:DI ("foo1") [flags 0x41]  
)
(nil))
(expr_list (use (reg:DI 2 %r2))
(expr_list:KF (use (reg:KF 66 %v2))
(nil
(insn 32 9 35 2 (set (reg:DI 9 %r9 [138])
(plus:DI (reg/f:DI 1 %r1)
(const_int 32 [0x20]))) "pr100085.c":279:25 66 {*adddi3}
 (nil))
(insn 35 32 36 2 (set (reg:V1TI 66 %v2)
(rotate:V1TI (reg:V1TI 66 %v2)
(const_int 64 [0x40]))) "pr100085.c":279:25 1113 
{*vsx_le_permute_v1ti}
 (nil))
(insn 36 35 37 2 (set (mem/c:V1TI (reg:DI 9 %r9 [138]) [2 %sfp+32 S16 A128])
(rotate:V1TI (reg:V1TI 66 %v2)
(const_int 64 [0x40]))) "pr100085.c":279:25 1113 
{*vsx_le_permute_v1ti}
 (nil))
(insn 37 36 28 2 (set (reg:V1TI 66 %v2)
(rotate:V1TI (reg:V1TI 66 %v2)
(const_int 64 [0x40]))) "pr100085.c":279:25 1113 
{*vsx_le_permute_v1ti}
 (nil))
(insn 28 37 17 2 (set (reg:DI 3 %r3 [133])
(mem/c:DI (plus:DI (reg/f:DI 1 %r1)
(const_int 32 [0x20])) [2 %sfp+32 S8 A128])) 
"pr100085.c":279:25 636 {*movdi_internal64}
 (nil))
(insn 17 28 18 2 (set (reg/i:DI 3 %r3)
(sign_extend:DI (reg:SI 3 %r3 [129]))) "pr100085.c":281:1 31 
{extendsidi2}
 (nil))
(insn 18 17 30 2 (use (reg/i:DI 3 %r3)) 

Re: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-03 Thread Xionghu Luo via Gcc-patches



On 2021/6/4 04:31, Segher Boessenkool wrote:
> On Thu, Jun 03, 2021 at 02:49:15PM +0800, Xionghu Luo wrote:
>> If the rotate is removed in simplify-rtx like below:
>>
>> +++ b/gcc/simplify-rtx.c
>> @@ -3830,10 +3830,16 @@ simplify_context::simplify_binary_operation_1 
>> (rtx_code code,
>>   case ROTATE:
>> if (trueop1 == CONST0_RTX (mode))
>>  return op0;
>> +
>> +  if (GET_CODE (trueop0) == ROTATE && trueop1 == GEN_INT (64)
>> + && CONST_INT_P (XEXP (trueop0, 1))
>> + && INTVAL (XEXP (trueop0, 1)) == 64)
>> +   return XEXP (trueop0, 0);
> 
> (The hardcoded 64 need improving -- but this is just a proof of concept
> I'll assume :-) )
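
One way the hardcoded 64 could be generalized, sketched in plain C rather than RTL: two rotates in the same direction by A and B compose to a rotate by (A + B) mod WIDTH, so the pair is a no-op exactly when that sum is a multiple of the width. rotate_pair_is_noop and rotl32 are hypothetical helpers for illustration and not part of simplify-rtx.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if rotating by A and then by B (same direction) on a
   WIDTH-bit value gives back the original value for every input.  */
bool rotate_pair_is_noop(unsigned a, unsigned b, unsigned width)
{
  return width != 0 && (a % width + b % width) % width == 0;
}

/* Reference 32-bit rotate-left, used only to cross-check the rule.  */
uint32_t rotl32(uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}
```

The V1TI case in the thread corresponds to rotate_pair_is_noop (64, 64, 128); in real RTL the width would presumably come from the mode's precision instead of a literal.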
> 
>> Combine still fails to merge the two instructions:
>>
>> Trying 6 -> 7:
>>  6: r120:KF#0=r125:KF#0<-<0x40
>>REG_DEAD r125:KF
>>  7: [sfp:DI+r123:DI]=r120:KF#0<-<0x40
>>REG_DEAD r120:KF
>> Successfully matched this instruction:
>> (set (mem/c:V1TI (plus:DI (reg/f:DI 110 sfp)
>>  (reg:DI 123)) [1  S16 A128])
>>  (subreg:V1TI (reg:KF 125) 0))
>> rejecting combination of insns 6 and 7
>> original costs 4 + 4 = 8
>> replacement cost 12
> 
> So what instructions were these?  Why did the store cost 4 but the new
> one costs 12?

For this case of __float128 to vector __int128:

typedef union
{
  __float128 vf1;
  vector __int128 vi128;
  __int128 i128;
} VF_128;

vector __int128
foo1 (__float128 f128)
{
  VF_128 vunion;

  vunion.vf1 = f128;
  return vunion.vi128;
}

Without this patch, the RTL in combine is:

(insn 6 3 17 2 (set (subreg:V1TI (reg:KF 120 [ f128 ]) 0)
(rotate:V1TI (subreg:V1TI (reg:KF 125) 0)
(const_int 64 [0x40]))) "pr100085.c":258:14 1113 
{*vsx_le_permute_v1ti}
 (expr_list:REG_DEAD (reg:KF 125)
(nil)))
(insn 17 6 7 2 (set (reg:DI 123)
(const_int 32 [0x20])) "pr100085.c":258:14 636 {*movdi_internal64}
 (nil))
(insn 7 17 19 2 (set (mem/c:V1TI (plus:DI (reg/f:DI 110 sfp)
(reg:DI 123)) [1  S16 A128])
(rotate:V1TI (subreg:V1TI (reg:KF 120 [ f128 ]) 0)
(const_int 64 [0x40]))) "pr100085.c":258:14 1113 
{*vsx_le_permute_v1ti}
 (expr_list:REG_DEAD (reg:KF 120 [ f128 ])
(nil)))
(note 19 7 14 2 NOTE_INSN_DELETED)
(insn 14 19 15 2 (set (reg/i:V1TI 66 %v2)
(mem/c:V1TI (plus:DI (reg/f:DI 110 sfp)
(reg:DI 123)) [1  S16 A128])) "pr100085.c":260:1 1119 
{*vsx_le_perm_load_v1ti}
 (expr_list:REG_DEAD (reg:DI 123)
(nil)))
(insn 15 14 0 2 (use (reg/i:V1TI 66 %v2)) "pr100085.c":260:1 -1
 (nil))

insn 6 and insn 7 are two vsx_le_permute_v1ti instructions, each with cost 4.
(The two instructions are VSX and LE specific, as Bill said; the swap pass
tries to remove them when that is legal.)  If the rotates are removed in
simplify-rtx.c (simplify_context::simplify_binary_operation_1) as in my last
reply, combine will try to merge them into vsx_le_perm_store_v1ti, whose insn
cost is 12, and hits "rejecting combination".  They are all V1TI mode.
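
The accept/reject decision visible in the dumps can be modelled with a very small C sketch. This is a simplification under the assumption that combine only compares the summed original cost with the replacement cost; the real check in combine has more cases, and toy_combine_accepts is a made-up name.

```c
#include <stdbool.h>

/* Simplified model of combine's cost gate: accept the combination only
   if the replacement is no more expensive than the insns it replaces.  */
bool toy_combine_accepts(int cost_i2, int cost_i3, int replacement_cost)
{
  return replacement_cost <= cost_i2 + cost_i3;
}
```

With the numbers from the dumps, 4 + 4 against a replacement cost of 12 is rejected, while lowering the store's cost to 8 (or 4, as in the earlier register-only case) lets the combination through.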


> 
>> By hacking the vsx_le_perm_store_v1ti INSN_COST from 12 to 8,
> 
> It should be the same cost as the other store!

vsx_le_permute_v1ti's cost is defined to 4 in vsx.md:

;; Little endian word swapping for 128-bit types that are either scalars or the
;; special V1TI container class, which it is not appropriate to use vec_select
;; for the type.
(define_insn "*vsx_le_permute_<mode>"
  [(set (match_operand:VSX_TI 0 "nonimmediate_operand" "=wa,wa,Z,&r,&r,Q")
(rotate:VSX_TI
 (match_operand:VSX_TI 1 "input_operand" "wa,Z,wa,r,Q,r")
 (const_int 64)))]
  "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
  "@
   xxpermdi %x0,%x1,%x1,2
   lxvd2x %x0,%y1
   stxvd2x %x1,%y0
   mr %0,%L1\;mr %L0,%1
   ld%U1%X1 %0,%L1\;ld%U1%X1 %L0,%1
   std%U0%X0 %L1,%0\;std%U0%X0 %1,%L0"
  [(set_attr "length" "*,*,*,8,8,8")
   (set_attr "type" "vecperm,vecload,vecstore,*,load,store")])

> 
>> it could merge the instructions:
>>
>>  21: r125:KF=%v2:KF
>>REG_DEAD %v2:KF
>>  2: NOTE_INSN_DELETED
>>  3: NOTE_INSN_FUNCTION_BEG
>>  6: NOTE_INSN_DELETED
>> 17: r123:DI=0x20
>>  7: [sfp:DI+r123:DI]=r125:KF#0
>>REG_DEAD r125:KF
>> 19: NOTE_INSN_DELETED
>> 14: %v2:V1TI=[sfp:DI+r123:DI]
>>REG_DEAD r123:DI
>> 15: use %v2:V1TI
>>
>> The following split1 pass will still split it, since there is no dse pass in
>> between to remove the memory operations on the stack.  Removing the rotates
>> in the swap pass doesn't face this problem, since it runs before dse with no
>> split pass between them:
> 
> Sure, but none of that is the point.  I asked if we did this for TImode
> properly, and maybe we do, but:
> 
>> 22: r126:V1TI=r125:KF#0<-<0x40
>> 23: [sfp:DI+r123:DI]=r126:V1TI<-<0x40
> 
> ... this is V1TI mode.
> 
> 
> Segher
> 

-- 
Thanks,
Xionghu


[Bug target/80636] AVX / AVX512 register-zeroing should always use AVX 128b, not ymm or zmm

2021-06-03 Thread peter at cordes dot ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80636

Peter Cordes  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Peter Cordes  ---
This seems to be fixed for ZMM vectors in GCC8. 
https://gcc.godbolt.org/z/7351be1v4

Seems to have never been a problem for __m256, at least not for 
__m256 zero256(){ return _mm256_setzero_ps(); }
IDK what I was looking at when I originally reported; maybe just clang which
*did* used to prefer YMM-zeroing.

Some later comments suggested movdqa vs. pxor zeroing choices (and mov vs. xor
for integer), but the bug title is just AVX / AVX-512 xor-zeroing, and that
seems to be fixed.  So I think this should be closed.

Re: [PATCH] RISC-V: Enable riscv attributes by default for all riscv targets.

2021-06-03 Thread Nelson Chu
On Fri, Jun 4, 2021 at 5:20 AM Palmer Dabbelt  wrote:
>
> On Thu, 03 Jun 2021 13:55:40 PDT (-0700), Jim Wilson wrote:
> > These were only enabled for embedded elf originally because that was
> > the safe option, and linux had no obvious use for them.  But now that
> > we have new extensions coming like V that affect process state and ABIs,
> > the attributes are expected to be useful for linux, and may be required
> > by the psABI.  clang already emits them for all riscv targets.
> >
> > Tested with a patched open embedded build and boot, and a native
> > toolchain build.
> >
> > Committed.
>
> Works for me.
>
> Nelson: I'm assuming this is why the .riscv.attributes is in
> binutils-all now?  That seems fine, we should just call it out as the
> reason explicitly by splitting that out into its own commit.

Yes, this patch will make .riscv.attributes always be generated, both
for elf and linux toolchains.
Got it, I will split this one into a separate commit, thanks.

FYI, this update resolves the ABI problems for the linux toolchain that I
ran into before,
https://sourceware.org/pipermail/binutils/2020-November/114016.html
Now the linux toolchain also generates the elf architecture attributes, so
binutils will try to choose the correct ABI according to the
attributes if the user compiles assembly code without setting -mabi=.

Thanks
Nelson


> > Jim
> >
> >   gcc/
> >   * config.gcc (riscv*-*-*): If --with-riscv-attribute not used,
> >   turn TARGET_RISCV_ATTRIBUTES on for all riscv targets.
> > ---
> >  gcc/config.gcc | 9 +
> >  1 file changed, 1 insertion(+), 8 deletions(-)
> >
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index 92fad8e20ca..6833a6c13d9 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -4605,14 +4605,7 @@ case "${target}" in
> >   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
> >   ;;
> >   ""|default)
> > - case "${target}" in
> > - riscv*-*-elf*)
> > - tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=1"
> > - ;;
> > - *)
> > - tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
> > - ;;
> > - esac
> > + tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=1"
> >   ;;
> >   *)
> >   echo "--with-riscv-attribute=${with_riscv_attribute} is not supported.  The argument must begin with yes, no or default." 1>&2


RE: [PATCH] [i386] Fix ICE of insn does not satisfy its constraints.

2021-06-03 Thread Liu, Hongtao via Gcc-patches


>-Original Message-
>From: Jakub Jelinek 
>Sent: Thursday, June 3, 2021 9:49 PM
>To: Liu, Hongtao 
>Cc: gcc-patches@gcc.gnu.org
>Subject: Re: [PATCH] [i386] Fix ICE of insn does not satisfy its constraints.
>
>On Thu, Jun 03, 2021 at 05:07:26PM +0800, liuhongt via Gcc-patches wrote:
>> @@ -18163,10 +18163,10 @@ (define_expand "v16qiv16si2"
>>"TARGET_AVX512F")
>>
>>  (define_insn "avx2_v8qiv8si2"
>> -  [(set (match_operand:V8SI 0 "register_operand" "=v")
>> +  [(set (match_operand:V8SI 0 "register_operand" "=Yv")
>>  (any_extend:V8SI
>>(vec_select:V8QI
>> -(match_operand:V16QI 1 "register_operand" "v")
>> +(match_operand:V16QI 1 "register_operand" "Yv")
>>  (parallel [(const_int 0) (const_int 1)
>> (const_int 2) (const_int 3)
>> (const_int 4) (const_int 5)
>
>Why do you need this change (and similarly other v -> Yv changes)?
>I mean, ix86_hard_regno_mode_ok for TARGET_AVX512F
>&& !TARGET_AVX512VL should return false for the 16-byte and 32-byte vector
>modes.
>
>The reason to use Yv is typically where the match_operand has 64-byte vector
>mode or scalar mode, yet it needs an AVX512VL instruction.
>
>The changes to use Yw look ok, that is for the cases where the insn requires
>both AVX512VL and AVX512BW, while ix86_hard_regno_mode_ok ensures
>the xmm16+ regs won't be used for the 16/32-byte vectors when AVX512VL is
>not on, it doesn't ensure that AVX512BW will be enabled.
Thanks for the review.
Yes, you're right, AVX512VL parts are already guaranteed by 
ix86_hard_regno_mode_ok.

Here is the updated patch.
>
>   Jakub



0001-i386-Fix-ICE-of-insn-does-not-satisfy-its-constraint_v2.patch
Description: 0001-i386-Fix-ICE-of-insn-does-not-satisfy-its-constraint_v2.patch


[Bug target/70387] -fnon-call-exceptions has no effect

2021-06-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70387

Andrew Pinski  changed:

   What|Removed |Added

  Component|c++ |target
 Target|i586-pc-msdosdjgpp  |djgpp
   Severity|normal  |enhancement
 Status|WAITING |NEW

--- Comment #6 from Andrew Pinski  ---
(In reply to jwjagersma from comment #4)
> Created attachment 38096 [details]
> Test case 2
> 
> Generic test case, which doesn't require djgpp or a DOS machine. (Assuming
> throwing from inline asm is similar enough)
> 
> compile with:
> "g++ -std=gnu++14 -fnon-call-exceptions throw_from_asm.cpp"

Yes, GCC adds no unwind info, and that is hard to do for an inline-asm since GCC
has no information on what the inline-asm could do. So that part is more of a
documentation issue than anything else.

As for the other one, throwing from an interrupt in DOS requires
MD_FALLBACK_FRAME_STATE_FOR to be defined, which is not done for djgpp and
would need to be custom for your interrupt handler. I don't know the best way
forward for this, except to say there is not much to be done here unless
someone steps up and adds interrupt handler support to djgpp and then adds
support for throwing through the interrupt handler too.

[Bug testsuite/28123] gcc.dg/cpp/_Pragma3.c is sensitive to timestamps when using from git

2021-06-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28123

Andrew Pinski  changed:

   What|Removed |Added

 Status|REOPENED|NEW
   Last reconfirmed|2006-06-21 15:02:45 |2021-6-3
Summary|gcc.dg/cpp/_Pragma3.c is|gcc.dg/cpp/_Pragma3.c is
   |sensitive to timestamps |sensitive to timestamps
   ||when using from git

--- Comment #7 from Andrew Pinski  ---
The released tar file will always be correct:
  # Run gcc_update on them to set up the timestamps nicely, and (re)write
  # the LAST_UPDATED file containing the git tag/revision used.
  changedir "gcc-${RELEASE}"
  contrib/gcc_update --touch
  echo "Obtained from git: ${GITBRANCH} revision ${GITREV}" > LAST_UPDATED


So this only matters when building from git.

[Bug c++/100102] [9/10/11/12 Regression] ICE in tsubst, at cp/pt.c:15310

2021-06-03 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

Patrick Palka  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org

--- Comment #17 from Patrick Palka  ---
Reduced valid reproducer:

template struct ratio;
template struct duration {
  static constexpr int _S_gcd();
  template using __is_harmonic = ratio<_S_gcd>;
  using type = __is_harmonic;
};

Re: [wwwdocs] lists: Correct procmail recipe

2021-06-03 Thread Hans-Peter Nilsson
On Wed, 2 Jun 2021, Gerald Pfeifer wrote:

> On Tue, 1 Jun 2021, Segher Boessenkool wrote:
> > We haven't had Sender: for a while now.
>
> "a while now" was about four(?) hours when you sent that yesterday. :-)
>
> I know since I still had been using that and was looking for all my
> missing gcc-related mails yesterday afternoon.  Thanks for spotting
> this and this change!

JFTR, nitpickish: I still get a "Sender:" field (obviously
generated at gcc.gnu.org), just not matching that format.

brgds, H-P


Re: [PATCH] rtl: constm64_rtx..const64_rtx

2021-06-03 Thread Segher Boessenkool
On Thu, Jun 03, 2021 at 01:32:54PM -0600, Jeff Law wrote:
> On 6/2/2021 4:43 PM, Segher Boessenkool wrote:
> >We have had const0_rtx etc. since forever, this patch just increases the
> >range (to those values that have had guaranteed unique RTXes for
> >decades as well).
> Yea, but often what you really want is CONST0_RTX (mode) instead of
> const0_rtx.   It's easily goof'd and often the cause of minor missed
> optimizations.

I'd say "sometimes" instead of "often"...  almost always you really do
want a scalar integer.  Also, although CONST0_RTX makes sense for pretty
much all modes, CONST64_RTX will not (and those are not currently unique
RTX anyway).


Segher


Re: [Patch] Fortran/OpenMP: Add omp loop [PR99928]

2021-06-03 Thread Jakub Jelinek via Gcc-patches
On Thu, Jun 03, 2021 at 05:07:22PM +0200, Jakub Jelinek via Gcc-patches wrote:
> I think best would be just to remove that part of the testcase
> in the loop patch and handle the !$omp with !$acc continuations
> and vice versa in a separate change.  That seems to be a preexisting
> bug not really related to whether we support loop or not.
> fatal_error is certainly not ideal, but I can understand fixing
> it otherwise might be hard.
> Wonder if we just shouldn't treat the incorrect continuations
> as if they were simple comments.

Perhaps it could use gfc_error_now for the mixing of OMP and ACC line
continuations and then act as if the incorrect ones were comments
(like for -fopenmp -fno-openacc or -fopenacc -fno-openmp)?

Jakub



[PATCH] [og11] OpenMP/OpenACC: Move array_ref/indirect_ref handling code out of extract_base_bit_offset

2021-06-03 Thread Julian Brown
At Richard Biener's suggestion, this patch undoes the following patch:

  https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571712.html

and moves the stripping of ARRAY_REFS/INDIRECT_REFS out of
extract_base_bit_offset and back into the (two) call sites of the
function. The difference between the two ways of looking through these
nodes comes down to (I think) what processing has been done on the
clause in question already: in the case where BASE_REF is non-NULL,
we are processing an OMP_CLAUSE_DECL for the first time. Conversely,
when BASE_REF is NULL, we are processing a node from the sorted list
that is being constructed after a GOMP_MAP_STRUCT node.

In practice, this appears to have no effect on test results (and I
haven't come up with a new test where it makes a difference), though
I will fold this version into the next iteration of these patches sent
upstream in order to avoid potentially introducing a bug.
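
As a rough illustration of the caller-side stripping described above, here is
a minimal sketch on a toy tree.  The Code enum, Node struct and strip_refs
helper are all hypothetical stand-ins, not GCC's real tree API; the sketch
only mirrors the shape of the loop the patch adds in front of the first
extract_base_bit_offset call.

```cpp
#include <cassert>

// Toy stand-ins for GCC tree codes; these are NOT the real GCC types.
enum class Code { ArrayRef, IndirectRef, ComponentRef, Decl };

struct Node {
  Code code;
  const Node *operand;  // plays the role of operand 0; nullptr for a leaf
};

// Peel every ARRAY_REF layer, then at most one INDIRECT_REF, before the
// node would be handed to the base/offset extraction routine.
const Node *strip_refs(const Node *n) {
  while (n->code == Code::ArrayRef)
    n = n->operand;
  if (n->code == Code::IndirectRef)
    n = n->operand;
  return n;
}
```

With a chain such as ArrayRef -> ArrayRef -> ComponentRef -> Decl, strip_refs
stops at the ComponentRef node, which is what the callee then sees.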

Tested with offloading to NVPTX. I will apply to the og11 branch after
the weekend.

2021-06-03  Julian Brown  

gcc/
* gimplify.c (extract_base_bit_offset): Don't look through ARRAY_REFs or
INDIRECT_REFs here.
(build_struct_group): Reinstate previous behaviour for handling
ARRAY_REFs/INDIRECT_REFs.
---
 gcc/gimplify.c | 46 ++
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index c6ebef8e41c..1742a2cb564 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -8526,20 +8526,6 @@ extract_base_bit_offset (tree base, tree *base_ind, tree *base_ref,
   if (base_ref)
 *base_ref = NULL_TREE;
 
-  if (TREE_CODE (base) == ARRAY_REF)
-{
-  while (TREE_CODE (base) == ARRAY_REF)
-   base = TREE_OPERAND (base, 0);
-  if (TREE_CODE (base) != COMPONENT_REF
- || TREE_CODE (TREE_TYPE (base)) != ARRAY_TYPE)
-   return NULL_TREE;
-}
-  else if (TREE_CODE (base) == INDIRECT_REF
-  && TREE_CODE (TREE_OPERAND (base, 0)) == COMPONENT_REF
-  && (TREE_CODE (TREE_TYPE (TREE_OPERAND (base, 0)))
-  == REFERENCE_TYPE))
-base = TREE_OPERAND (base, 0);
-
   base = get_inner_reference (base, , , , ,
  , , );
 
@@ -9116,11 +9102,17 @@ build_struct_group (struct gimplify_omp_ctx *ctx,
   poly_offset_int coffset;
   poly_int64 cbitpos;
   tree base_ind, base_ref, tree_coffset;
+  tree ocd = OMP_CLAUSE_DECL (c);
   bool openmp = !(region_type & ORT_ACC);
 
-  tree base = extract_base_bit_offset (OMP_CLAUSE_DECL (c), _ind,
-  _ref, , ,
-  _coffset, openmp);
+  while (TREE_CODE (ocd) == ARRAY_REF)
+ocd = TREE_OPERAND (ocd, 0);
+
+  if (TREE_CODE (ocd) == INDIRECT_REF)
+ocd = TREE_OPERAND (ocd, 0);
+
+  tree base = extract_base_bit_offset (ocd, _ind, _ref, ,
+  , _coffset, openmp);
 
   bool do_map_struct = (base == decl && !tree_coffset);
 
@@ -9347,9 +9339,23 @@ build_struct_group (struct gimplify_omp_ctx *ctx,
poly_offset_int offset;
poly_int64 bitpos;
tree tree_offset;
-   tree base = extract_base_bit_offset (sc_decl, NULL, NULL,
-, ,
-_offset, openmp);
+
+   if (TREE_CODE (sc_decl) == ARRAY_REF)
+ {
+   while (TREE_CODE (sc_decl) == ARRAY_REF)
+ sc_decl = TREE_OPERAND (sc_decl, 0);
+   if (TREE_CODE (sc_decl) != COMPONENT_REF
+   || TREE_CODE (TREE_TYPE (sc_decl)) != ARRAY_TYPE)
+ break;
+ }
+   else if (TREE_CODE (sc_decl) == INDIRECT_REF
+&& TREE_CODE (TREE_OPERAND (sc_decl, 0)) == COMPONENT_REF
+&& (TREE_CODE (TREE_TYPE (TREE_OPERAND (sc_decl, 0)))
+== REFERENCE_TYPE))
+ sc_decl = TREE_OPERAND (sc_decl, 0);
+
+   tree base = extract_base_bit_offset (sc_decl, NULL, NULL, ,
+, _offset, openmp);
if (!base || !operand_equal_p (base, decl, 0))
  break;
if (scp)
-- 
2.29.2



Re: [PATCH] x86: Convert CONST_WIDE_INT to broadcast in move expanders

2021-06-03 Thread H.J. Lu via Gcc-patches
On Thu, Jun 3, 2021 at 12:39 AM Uros Bizjak  wrote:
>
> On Thu, Jun 3, 2021 at 5:49 AM H.J. Lu  wrote:
> >
> > Update move expanders to convert the CONST_WIDE_INT operand to vector
> > broadcast from a byte with AVX2.  Add ix86_gen_scratch_sse_rtx to
> > return a scratch SSE register which won't increase stack alignment
> > requirement and blocks transformation by the combine pass.
>
> Using fixed scratch reg is just too hackish for my taste. The

It was recommended to use a hard register for things like this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569945.html

> expansion is OK to emit some optimized sequence, but the approach to
> use fixed reg to bypass stack alignment functionality and combine is
> not.
>
> Perhaps a new insn pattern should be introduced, e.g.
>
> (define_insn_and_split ""
> >[(set (match_operand:V 0 "memory_operand" "=m,m")
> (vec_duplicate:V
> >   (match_operand:S 2 "reg_or_0_operand" "r,C")))
> (clobber (match_scratch:V 1 "=x"))]
>
> and split it at some appropriate point.

I will give it a try.

Thanks.

> Uros.
>
> >
> > A small benchmark:
> >
> > https://gitlab.com/x86-benchmarks/microbenchmark/-/tree/memset/broadcast
> >
> > shows that broadcast is a little bit faster on Intel Core i7-8559U:
> >
> > $ make
> > gcc -g -I. -O2   -c -o test.o test.c
> > gcc -g   -c -o memory.o memory.S
> > gcc -g   -c -o broadcast.o broadcast.S
> > gcc -o test test.o memory.o broadcast.o
> > ./test
> > memory   : 99333
> > broadcast: 97208
> > $
> >
> > broadcast is also smaller:
> >
> > $ size memory.o broadcast.o
> >    text    data     bss     dec     hex filename
> >     132       0       0     132      84 memory.o
> >     122       0       0     122      7a broadcast.o
> > $
> >
> > gcc/
> >
> > PR target/100865
> > * config/i386/i386-expand.c (ix86_expand_vector_init_duplicate):
> > New prototype.
> > (ix86_byte_broadcast): New function.
> > (ix86_convert_const_wide_int_to_broadcast): Likewise.
> > (ix86_expand_move): Try ix86_convert_const_wide_int_to_broadcast
> > if mode size is 16 bytes or bigger.
> > (ix86_expand_vector_move): Try
> > ix86_convert_const_wide_int_to_broadcast.
> > * config/i386/i386-protos.h (ix86_gen_scratch_sse_rtx): New
> > prototype.
> > * config/i386/i386.c (ix86_minimum_incoming_stack_boundary): Add
> > an argument to ignore stack_alignment_estimated.  It is passed
> > as false by default.
> > (ix86_gen_scratch_sse_rtx): New function.
> >
> > gcc/testsuite/
> >
> > PR target/100865
> > * gcc.target/i386/pr100865-1.c: New test.
> > * gcc.target/i386/pr100865-2.c: Likewise.
> > * gcc.target/i386/pr100865-3.c: Likewise.
> > * gcc.target/i386/pr100865-4.c: Likewise.
> > * gcc.target/i386/pr100865-5.c: Likewise.
> > ---
> >  gcc/config/i386/i386-expand.c  | 103 ++---
> >  gcc/config/i386/i386-protos.h  |   2 +
> >  gcc/config/i386/i386.c |  50 +-
> >  gcc/testsuite/gcc.target/i386/pr100865-1.c |  13 +++
> >  gcc/testsuite/gcc.target/i386/pr100865-2.c |  14 +++
> >  gcc/testsuite/gcc.target/i386/pr100865-3.c |  15 +++
> >  gcc/testsuite/gcc.target/i386/pr100865-4.c |  16 
> >  gcc/testsuite/gcc.target/i386/pr100865-5.c |  17 
> >  8 files changed, 215 insertions(+), 15 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr100865-5.c
> >
> > diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> > index 4185f58eed5..658adafa269 100644
> > --- a/gcc/config/i386/i386-expand.c
> > +++ b/gcc/config/i386/i386-expand.c
> > @@ -93,6 +93,9 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "i386-builtins.h"
> >  #include "i386-expand.h"
> >
> > +static bool ix86_expand_vector_init_duplicate (bool, machine_mode, rtx,
> > +  rtx);
> > +
> >  /* Split one or more double-mode RTL references into pairs of half-mode
> > references.  The RTL can be REG, offsettable MEM, integer constant, or
> > CONST_DOUBLE.  "operands" is a pointer to an array of double-mode RTLs 
> > to
> > @@ -190,6 +193,65 @@ ix86_expand_clear (rtx dest)
> >emit_insn (tmp);
> >  }
> >
> > +/* Return a byte value which V can be broadcasted from.  Otherwise,
> > +   return INT_MAX.  */
> > +
> > +static int
> > +ix86_byte_broadcast (HOST_WIDE_INT v)
> > +{
> > +  wide_int val = wi::uhwi (v, HOST_BITS_PER_WIDE_INT);
> > +  int byte_broadcast = wi::extract_uhwi (val, 0, BITS_PER_UNIT);
> > +  for (unsigned int i = BITS_PER_UNIT;
> > +   i < HOST_BITS_PER_WIDE_INT;
> > +   i += 
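
The quoted patch is cut off at this point in the archive.  As a hedged sketch
of the idea its comment describes (return the byte a value can be broadcast
from, otherwise INT_MAX), the check can be written on a plain 64-bit integer.
The function name byte_broadcast and the fixed 64-bit width are assumptions
made for illustration; the real code works on HOST_WIDE_INT through wide_int.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Return the byte value that v is a repetition of, or INT_MAX if the
// eight bytes of v are not all identical.
int byte_broadcast(std::uint64_t v) {
  const unsigned byte = static_cast<unsigned>(v & 0xff);
  for (unsigned shift = 8; shift < 64; shift += 8)
    if (((v >> shift) & 0xff) != byte)
      return INT_MAX;
  return static_cast<int>(byte);
}
```

A value such as 0x0101010101010101 qualifies (byte 1), while any value with
one differing byte does not.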

gcc-9-20210603 is now available

2021-06-03 Thread GCC Administrator via Gcc
Snapshot gcc-9-20210603 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/9-20210603/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 9 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-9 
revision 70a087ccb19002e71e104809a697fac62c83c7af

You'll find:

 gcc-9-20210603.tar.xz    Complete GCC

  SHA256=d720003d44807a57b85fc06270fb193fbfc10eb9068f4a9c89eaa7661c7912ff
  SHA1=b1363c1caa2a5a1ec195e494ca840837fbce0566

Diffs from 9-20210527 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: [PATCH 04/11] cris: Update unexpected empty split condition

2021-06-03 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> CC: "gcc-patches@gcc.gnu.org" 
> Date: Thu, 3 Jun 2021 18:12:25 +0200

> I'd
> prefer to have the patch above committed sooner than the
> conclusion of that discussion.  (If you don't get to it,
> I'll do it, after a round of testing.)

Done; no regressions.

brgds, H-P


Re: [RFC/PATCH 00/11] Fix up some unexpected empty split conditions

2021-06-03 Thread Segher Boessenkool
On Thu, Jun 03, 2021 at 11:11:53AM -0600, Jeff Law wrote:
> On 6/3/2021 2:00 AM, Segher Boessenkool wrote:
> >The whole point of requiring the split condition to start with && is so
> >it will become harder to mess things up (it will make the gen* code a
> >tiny little bit simpler as well).  And there is no transition period or
> >anything like that needed either.  Just the bunch that will break will
> >need fixing.  So let's find out how many of those there are :-)
> Exactly.   While these empty conditions or those not starting with "&&" 
> are technically valid, they're all suspicious from a port correctness 
> standpoint, particularly if the main condition is non-empty.

And note that this is also the case if you wrote the insn condition as
an empty string, but you used some iterator with a condition.  I found
many of these in rs6000.  This will need to be fixed before we do
anything else.

> Having made that mistake when converting the H8 away from CC0, I can say 
> fairly confidently that if we had this in place a year ago that those 
> mistakes would likely have been avoided.  Thankfully the H8 isn't a 
> heavily used port and has limped along until I stumbled over the issue a 
> week or so ago while polishing some improvements to the port.

I've noticed unexpected splits in rs6000 only a few times over the
years. That doesn't mean more didn't happen, they just didn't cause
obvious enough problems :-)
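
To make the "&&" rule under discussion concrete, here is a small model of how
the effective split condition of a define_insn_and_split is formed.  This is
only a sketch: effective_split_condition is a hypothetical helper, and the
real gen* code joins C conditions differently in detail.  The behaviour it
models is the one discussed in this thread: a split condition starting with
"&&" is ANDed with the insn condition, anything else (including "") is used
on its own.

```cpp
#include <cassert>
#include <string>

// Model of the condition that ends up guarding the split.
std::string effective_split_condition(std::string insn_cond,
                                      const std::string &split_cond) {
  if (insn_cond.empty())
    insn_cond = "1";  // an empty insn condition means "always"
  if (split_cond.compare(0, 2, "&&") == 0) {
    // Skip the "&&" and any blanks after it.
    const std::string::size_type pos = split_cond.find_first_not_of(" \t", 2);
    const std::string rest =
        pos == std::string::npos ? "1" : split_cond.substr(pos);
    return "(" + insn_cond + ") && (" + rest + ")";
  }
  // No leading "&&": the insn condition is not taken into account at all.
  return split_cond;
}
```

In this model an insn condition "TARGET_FOO" with split condition "" yields
"", i.e. a split that is always enabled, which is the suspicious case the
series is trying to flush out.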


Segher


Re: Test results for gccrs on Debian unstable aarch64

2021-06-03 Thread Mark Wielaard
Hi,

On Thu, Jun 03, 2021 at 09:05:22PM +0200, Mark Wielaard wrote:
> Currently I have only enabled the arches that I know are fully green
> (x86_64, arm64 and ppc64le). I should add ppc64[be] because that
> should also be fully green.

fedora-ppc64 is enabled now too:
https://builder.wildebeest.org/buildbot/#/builders/61

It currently has no builds because there have been no new commits,
but I verified by hand that it is green.

Cheers,

Mark

-- 
Gcc-rust mailing list
Gcc-rust@gcc.gnu.org
https://gcc.gnu.org/mailman/listinfo/gcc-rust


Re: [RFC/PATCH 00/11] Fix up some unexpected empty split conditions

2021-06-03 Thread Jakub Jelinek via Gcc-patches
On Thu, Jun 03, 2021 at 04:25:44PM -0500, Segher Boessenkool wrote:
> If we could just start all over we could do it perfectly (but see
> second-system syndrome, heh).  But we cannot.  IMO we should especially
> avoid everything that uses new semantics for old syntax.

Agreed, that would be a nightmare for backporting.

Jakub



Re: [RFC/PATCH 00/11] Fix up some unexpected empty split conditions

2021-06-03 Thread Segher Boessenkool
On Thu, Jun 03, 2021 at 11:25:49AM +0100, Richard Sandiford wrote:
> We shouldn't just add "&&" to all define_insn_and_splits that currently
> lack them.

My previous post shows that this *already* is required.

> IMO it's not reasonable to ask Kewen to do that for all ports.  So the
> process I suggested was a way of mechanically changing existing ports in
> a way that would not require input from target maintainers, or extensive
> testing on affected targets.

I fear it will end up as Yet Another unfinished transition this way :-(

> >> I don't know.  "&& 1" looks kind of weird to me.
> >
> > We have it in rs6000.md since 2004.  sparc has had it since 2002.  i386
> > has had it since 2001.  arm still does not have it :-)
> 
> Sure, this syntax goes back 20 years.  I don't think that answers the
> point though.  The question was whether a split condition "&& 1" is
> more readable than a condition "", given that "" means "always" in other
> contexts.  Given the choice, IMO "" is more readable and "&& 1" looks
> weird/inconsistent.

In most ports "&& 1" is all over the place and well-known already.  Sure
we could change to "" meaning always, but that is not what it currently
means!

If we could just start all over we could do it perfectly (but see
second-system syndrome, heh).  But we cannot.  IMO we should especially
avoid everything that uses new semantics for old syntax.


Segher


Re: [PATCH] RISC-V: Enable riscv attributes by default for all riscv targets.

2021-06-03 Thread Palmer Dabbelt

On Thu, 03 Jun 2021 13:55:40 PDT (-0700), Jim Wilson wrote:

> These were only enabled for embedded elf originally because that was
> the safe option, and linux had no obvious use for them.  But now that
> we have new extensions coming like V that affect process state and ABIs,
> the attributes are expected to be useful for linux, and may be required
> by the psABI.  clang already emits them for all riscv targets.
>
> Tested with a patched open embedded build and boot, and a native
> toolchain build.
>
> Committed.


Works for me.

Nelson: I'm assuming this is why the .riscv.attributes is in 
binutils-all now?  That seems fine, we should just call it out as the 
reason explicitly by splitting that out into its own commit.




> Jim
>
> gcc/
> * config.gcc (riscv*-*-*): If --with-riscv-attribute not used,
> turn TARGET_RISCV_ATTRIBUTES on for all riscv targets.
> ---
>  gcc/config.gcc | 9 +
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 92fad8e20ca..6833a6c13d9 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -4605,14 +4605,7 @@ case "${target}" in
> tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
> ;;
> ""|default)
> -   case "${target}" in
> -   riscv*-*-elf*)
> -   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=1"
> -   ;;
> -   *)
> -   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
> -   ;;
> -   esac
> +   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=1"
> ;;
> *)
> echo "--with-riscv-attribute=${with_riscv_attribute} is not supported.  The argument must begin with yes, no or default." 1>&2


[PATCH] Use PRIx64 to print 64bit hex values in legacy_hash

2021-06-03 Thread Mark Wielaard
* gcc/rust/backend/rust-compile.cc (legacy_hash): lo and hi are uint64_t;
use PRIx64 to print them as hex values instead of %lx, which is target
specific.
---
 gcc/rust/backend/rust-compile.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/rust/backend/rust-compile.cc b/gcc/rust/backend/rust-compile.cc
index 480afc8b72b..ea21ad3f92b 100644
--- a/gcc/rust/backend/rust-compile.cc
+++ b/gcc/rust/backend/rust-compile.cc
@@ -442,7 +442,7 @@ legacy_hash (const std::string )
 
   char hex[16 + 1];
   memset (hex, 0, sizeof hex);
-  snprintf (hex, sizeof hex, "%08lx%08lx", lo, hi);
+  snprintf (hex, sizeof hex, "%08" PRIx64 "%08" PRIx64, lo, hi);
 
   return "h" + std::string (hex, sizeof (hex) - 1);
 }
-- 
2.32.0.rc0
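
For readers unfamiliar with the format macros, here is a minimal standalone
sketch of the same technique.  The helper name hex_pair is made up for this
illustration, and unlike the patch it pads each value to 16 digits and sizes
the buffer for two full 64-bit values; that choice is an assumption of the
sketch, not something the patch does.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Format two 64-bit values as fixed-width lowercase hex.  PRIx64 expands
// to the correct length modifier for uint64_t on the target ("lx", "llx",
// ...), so the format string does not depend on how wide long is.
std::string hex_pair(std::uint64_t lo, std::uint64_t hi) {
  char buf[2 * 16 + 1];
  std::snprintf(buf, sizeof buf, "%016" PRIx64 "%016" PRIx64, lo, hi);
  return std::string(buf);
}
```

The string literal pieces around PRIx64 are concatenated by the compiler, so
the space between the closing quote and the macro name is required in C++11
and later.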

-- 
Gcc-rust mailing list
Gcc-rust@gcc.gnu.org
https://gcc.gnu.org/mailman/listinfo/gcc-rust


[PATCH] RISC-V: Enable riscv attributes by default for all riscv targets.

2021-06-03 Thread Jim Wilson
These were only enabled for embedded elf originally because that was
the safe option, and linux had no obvious use for them.  But now that
we have new extensions coming like V that affect process state and ABIs,
the attributes are expected to be useful for linux, and may be required
by the psABI.  clang already emits them for all riscv targets.

Tested with a patched open embedded build and boot, and a native
toolchain build.

Committed.

Jim

gcc/
* config.gcc (riscv*-*-*): If --with-riscv-attribute not used,
turn TARGET_RISCV_ATTRIBUTES on for all riscv targets.
---
 gcc/config.gcc | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 92fad8e20ca..6833a6c13d9 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4605,14 +4605,7 @@ case "${target}" in
tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
;;
""|default)
-   case "${target}" in
-   riscv*-*-elf*)
-   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=1"
-   ;;
-   *)
-   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
-   ;;
-   esac
+   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=1"
;;
*)
echo "--with-riscv-attribute=${with_riscv_attribute} is not supported.  The argument must begin with yes, no or default." 1>&2
-- 
2.25.1



Re: [PATCH] c++: top-level cv-quals on type of NTTP [PR100893]

2021-06-03 Thread Jason Merrill via Gcc-patches

On 6/3/21 2:40 PM, Patrick Palka wrote:

On Thu, 3 Jun 2021, Patrick Palka wrote:


Here, we're rejecting the specialization of g with T=A, F= in the
first testcase due to a spurious constness mismatch between the type of
the template argument  and the substituted type of F (the substituted
type has a top-level const).  Note that this mismatch doesn't occur with
object pointers because in that case a call to
perform_qualification_conversions from convert_nontype_argument
implicitly adds a top-level const to the argument (via a cast) to match.

This however seems to be a manifestation of a more general conformance
issue -- that we're not dropping top-level cv-quals after substituting
into the type of an NTTP as per [temp.param]/6 (we only do so at parse
time in process_template_parm).  This patch makes convert_template_argument
drop top-level cv-quals accordingly.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/100893

gcc/cp/ChangeLog:

* pt.c (convert_template_argument): Strip top-level cv-quals
on the substituted type of a non-type template parameter.

gcc/testsuite/ChangeLog:

* g++.dg/template/param4.C: New test.
* g++.dg/template/param5.C: New test.
---
  gcc/cp/pt.c|  7 ++-
  gcc/testsuite/g++.dg/template/param4.C | 10 ++
  gcc/testsuite/g++.dg/template/param5.C |  7 +++
  3 files changed, 23 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/template/param4.C
  create mode 100644 gcc/testsuite/g++.dg/template/param5.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7211bdc5bbc..66cc88a331f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8494,7 +8494,12 @@ convert_template_argument (tree parm,
return error_mark_node;
}
else
-   t = tsubst (t, args, complain, in_decl);
+   {
+ t = tsubst (t, args, complain, in_decl);
+ /* Ignore top-level qualifiers on the substituted type of this
+non-type template parameter, as per [temp.param]/6.  */
+ t = cv_unqualified (t);
+   }


Err, shortly after posting I realized we could get top-level cv-quals not
only after substitution, but also after deduction of a decltype(auto)
NTTP (as in nontype-auto19.C below), so the call to cv_unqualified would
need to happen after do_auto_deduction too, like so.  Testing in
progress.

-- >8 --

Subject: [PATCH] c++: top-level cv-quals on type of NTTP [PR100893]

Here, we're rejecting the specialization of g with T=A, F= in
param4.C below due to a spurious constness mismatch between the type of
the template argument  and the substituted type of F (the substituted
type has a top-level const).  Note that this mismatch doesn't occur with
object pointers because in that case a call to
perform_qualification_conversions from convert_nontype_argument
implicitly adds a top-level const to the argument (via a cast) to match.

This however seems to be a manifestation of a more general conformance
issue -- that we're not dropping top-level cv-quals after substituting
into the type of an NTTP as per [temp.param]/6 (we only do so at parse
time in process_template_parm).  So this patch makes
convert_template_argument drop top-level cv-quals accordingly.
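
As background for the [temp.param]/6 rule cited above, here is a small sketch
of the non-dependent case, where the qualifier is already dropped when the
template parameter is parsed.  The names S and nttp_type_is_unqualified are
invented for this illustration; it does not reproduce the PR's dependent
case, where the const only appears after substitution.

```cpp
#include <cassert>
#include <type_traits>

// The parameter is written with a top-level const ...
template <const int N>
struct S {
  // ... but the qualifier is ignored when determining the parameter's
  // type, so decltype(N) names plain int.
  using type = decltype(N);
};

static_assert(std::is_same<S<3>::type, int>::value,
              "top-level const on an NTTP type is ignored");

constexpr bool nttp_type_is_unqualified() {
  return std::is_same<S<3>::type, int>::value;
}
```

The patch is about getting the same result when the const-qualified type is
only produced by substituting template arguments or by deduction.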

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/100893

gcc/cp/ChangeLog:

* pt.c (convert_template_argument): Strip top-level cv-quals
on the substituted type of a non-type template parameter.

gcc/testsuite/ChangeLog:

* g++.dg/template/param4.C: New test.
* g++.dg/template/param5.C: New test.
* g++.dg/cpp1z/nontype-auto19.C: New test.
* g++.dg/cpp2a/concepts-decltype.C: Don't expect that the
deduced type of a decltype(auto) NTTP has top-level cv-quals.
---
  gcc/cp/pt.c|  4 
  gcc/testsuite/g++.dg/cpp1z/nontype-auto19.C|  8 
  gcc/testsuite/g++.dg/cpp2a/concepts-decltype.C |  2 +-
  gcc/testsuite/g++.dg/template/param4.C | 10 ++
  gcc/testsuite/g++.dg/template/param5.C |  7 +++
  5 files changed, 30 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/nontype-auto19.C
  create mode 100644 gcc/testsuite/g++.dg/template/param4.C
  create mode 100644 gcc/testsuite/g++.dg/template/param5.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7211bdc5bbc..3cac073ed50 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8496,6 +8496,10 @@ convert_template_argument (tree parm,
else
t = tsubst (t, args, complain, in_decl);
  
+  /* Drop top-level cv-qualifiers on the substituted/deduced type of

+this non-type template parameter, as per [temp.param]/6.  */
+  t = cv_unqualified (t);


Might move this after...


if (invalid_nontype_parm_type_p (t, complain))
return error_mark_node;


...this test.  OK either way.


diff --git 

Re: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-03 Thread Bill Schmidt via Gcc-patches



On 6/3/21 3:19 PM, Segher Boessenkool wrote:

On Thu, Jun 03, 2021 at 08:09:36AM -0500, Bill Schmidt wrote:

Note also that swap optimization can handle more general cases than
simplify-rtx.

Do you have examples?  That should be fixed (unless it is something
Power-specific?)
It is Power-specific.  This optimization looks at entire webs of 
computation to determine whether the computation can be done in the 
"wrong" lanes without problems, and removes all the extra swaps at 
once.  That is beyond what simplify-rtx is capable of.  It takes full 
dataflow analysis to see what can legally be done.  As a side effect, 
p8swaps takes care of some cases that simplify-rtx would catch at that 
particular point in compilation, but not all.

  In my view it's best to have it covered in both places.

Oh certainly, and we need that p8swaps pass on at least p8 anyway (but
perhaps we can allow TImode in vector regs on later cpus).


Yes, I agree that should potentially be revisited, but we still have the 
ABI requirement to keep them in GPRs for parameter passing, so there are 
limits on what we can do.


Bill




Segher


Re: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-03 Thread Segher Boessenkool
On Thu, Jun 03, 2021 at 02:49:15PM +0800, Xionghu Luo wrote:
> If I remove the rotate in simplify-rtx like below:
> 
> +++ b/gcc/simplify-rtx.c
> @@ -3830,10 +3830,16 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
>  case ROTATE:
>if (trueop1 == CONST0_RTX (mode))
> return op0;
> +
> +  if (GET_CODE (trueop0) == ROTATE && trueop1 == GEN_INT (64)
> + && CONST_INT_P (XEXP (trueop0, 1))
> + && INTVAL (XEXP (trueop0, 1)) == 64)
> +   return XEXP (trueop0, 0);

(The hardcoded 64 need improving -- but this is just a proof of concept
I'll assume :-) )
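
The identity behind that proof of concept (rotating a 128-bit value by 64
bits twice gives the value back) can be checked with a small sketch.  The
U128 struct and rotl128 function are assumptions made for illustration; they
model the arithmetic only and have nothing to do with GCC's RTL.

```cpp
#include <cassert>
#include <cstdint>

// A 128-bit value held as two 64-bit halves.
struct U128 {
  std::uint64_t lo;
  std::uint64_t hi;
};

// Rotate left by n bits (taken modulo 128).
U128 rotl128(U128 v, unsigned n) {
  n %= 128;
  if (n >= 64) {
    // A rotate by 64 is just a swap of the two halves.
    const std::uint64_t t = v.lo;
    v.lo = v.hi;
    v.hi = t;
    n -= 64;
  }
  if (n == 0)
    return v;
  return U128{(v.lo << n) | (v.hi >> (64 - n)),
              (v.hi << n) | (v.lo >> (64 - n))};
}
```

More generally the two rotate amounts only need to sum to a multiple of the
mode's bit size, which is one way the hardcoded 64 could be relaxed.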

> Combine still fails to merge the two instructions:
> 
> Trying 6 -> 7:
> 6: r120:KF#0=r125:KF#0<-<0x40
>   REG_DEAD r125:KF
> 7: [sfp:DI+r123:DI]=r120:KF#0<-<0x40
>   REG_DEAD r120:KF
> Successfully matched this instruction:
> (set (mem/c:V1TI (plus:DI (reg/f:DI 110 sfp)
> (reg:DI 123)) [1  S16 A128])
> (subreg:V1TI (reg:KF 125) 0))
> rejecting combination of insns 6 and 7
> original costs 4 + 4 = 8
> replacement cost 12

So what instructions were these?  Why did the store cost 4 but the new
one costs 12?

> By hacking the vsx_le_perm_store_v1ti INSN_COST from 12 to 8,

It should be the same cost as the other store!

> it could merge the instructions:
> 
> 21: r125:KF=%v2:KF
>   REG_DEAD %v2:KF
> 2: NOTE_INSN_DELETED
> 3: NOTE_INSN_FUNCTION_BEG
> 6: NOTE_INSN_DELETED
>17: r123:DI=0x20
> 7: [sfp:DI+r123:DI]=r125:KF#0
>   REG_DEAD r125:KF
>19: NOTE_INSN_DELETED
>14: %v2:V1TI=[sfp:DI+r123:DI]
>   REG_DEAD r123:DI
>15: use %v2:V1TI
> 
> The following split1 pass will then still split it, since there is no dse
> pass in between to remove the memory operations on the stack.  Removing the
> rotate in the swaps pass doesn't have this problem, since it runs before dse
> with no split pass between them:

Sure, but none of that is the point.  I asked if we did this for TImode
properly, and maybe we do, but:

>22: r126:V1TI=r125:KF#0<-<0x40
>23: [sfp:DI+r123:DI]=r126:V1TI<-<0x40

... this is V1TI mode.


Segher


Re: [PATCH] c++: Fix up attribute handling in methods in templates [PR100872]

2021-06-03 Thread Jason Merrill via Gcc-patches

On 6/3/21 5:00 AM, Jakub Jelinek wrote:

Hi!

The following testcase FAILs because a dependent (late) attribute is never
tsubsted.  While the testcase is OpenMP, I think it is a generic C++ FE problem
that could affect any other dependent attribute.

apply_late_template_attributes documents that it relies on
   /* save_template_attributes puts the dependent attributes at the beginning of
  the list; find the non-dependent ones.  */
The "operator binding" attributes that are sometimes added are added to the
head of DECL_ATTRIBUTES list though and because it doesn't have
ATTR_IS_DEPENDENT set it violates this requirement.

The following patch fixes it by adding that attribute after all
ATTR_IS_DEPENDENT attributes.  I'm not 100% sure if DECL_ATTRIBUTES can't be
shared by multiple functions (e.g. the cdtor clones), but the code uses
later remove_attribute which could break that too.  In any case it passed
bootstrap/regtest on x86_64-linux and i686-linux.
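
The list manipulation described here (skip the leading run of dependent
attributes, then splice the new node in) is the usual pointer-to-pointer
idiom.  A minimal sketch on a toy list follows; the Attr struct and
insert_after_dependent are hypothetical and stand in for TREE_LIST chains and
ATTR_IS_DEPENDENT.

```cpp
#include <cassert>

// Toy attribute node; "dependent" stands in for ATTR_IS_DEPENDENT.
struct Attr {
  const char *name;
  bool dependent;
  Attr *next;
};

// Insert node after the leading dependent attributes, so the invariant
// "dependent attributes come first" keeps holding.
void insert_after_dependent(Attr **head, Attr *node) {
  Attr **ap = head;
  while (*ap && (*ap)->dependent)
    ap = &(*ap)->next;
  node->next = *ap;
  *ap = node;
}
```

Walking with a pointer to the link rather than a pointer to the node means
the empty-list and insert-at-head cases need no special handling.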


OK.


Other option would be to copy_list the ATTR_IS_DEPENDENT portion of the
DECL_ATTRIBUTES list if we need to do this, that would be the same as this
patch but replace that *ap = op_attr; at the end with
   *ap = NULL_TREE;
   DECL_ATTRIBUTES (cfn) = chainon (copy_list (DECL_ATTRIBUTES (cfn)),
   op_attr);
Or perhaps set ATTR_IS_DEPENDENT on the "operator bindings" attribute,
though it would need to be studied what would it try to do with the
attribute during tsubst.

2021-06-03  Jakub Jelinek  

PR c++/100872
* name-lookup.c (maybe_save_operator_binding): Add op_attr after all
ATTR_IS_DEPENDENT attributes in the DECL_ATTRIBUTES list rather than
to the start.

* g++.dg/gomp/declare-simd-8.C: New test.

--- gcc/cp/name-lookup.c.jj 2021-05-11 09:06:24.281997782 +0200
+++ gcc/cp/name-lookup.c2021-06-02 15:50:52.042521824 +0200
@@ -9136,9 +9136,12 @@ maybe_save_operator_binding (tree e)
tree op_attr = lookup_attribute (op_bind_attrname, attributes);
if (!op_attr)
  {
+  tree *ap = &DECL_ATTRIBUTES (cfn);
+  while (*ap && ATTR_IS_DEPENDENT (*ap))
+   ap = &TREE_CHAIN (*ap);
op_attr = tree_cons (get_identifier (op_bind_attrname),
-  NULL_TREE, attributes);
-  DECL_ATTRIBUTES (cfn) = op_attr;
+  NULL_TREE, *ap);
+  *ap = op_attr;
  }
  
tree op_bind = purpose_member (fnname, TREE_VALUE (op_attr));

--- gcc/testsuite/g++.dg/gomp/declare-simd-8.C.jj   2021-06-02 16:02:32.792681922 +0200
+++ gcc/testsuite/g++.dg/gomp/declare-simd-8.C  2021-06-02 16:02:09.849004442 +0200
@@ -0,0 +1,15 @@
+// PR c++/100872
+
+template 
+struct S {
+  #pragma omp declare simd aligned(a : N * 2) aligned(b) linear(ref(b): N)
+  float foo (float *a, T *) { return *a + *b; }
+};
+
+S<16, float> s;
+
+float
+bar (float *a, float *p)
+{
+  return s.foo (a, p);
+}

Jakub





[Bug target/100841] xtensa-linux: dwarf2cfi.c:291:12: error: comparison of integer expressions of different signedness: 'const unsigned int' and 'int'

2021-06-03 Thread jbglaw--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100841

Jan-Benedict Glaw  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Jan-Benedict Glaw  ---
Fixed by Jakub Jelinek.

[Bug target/100841] xtensa-linux: dwarf2cfi.c:291:12: error: comparison of integer expressions of different signedness: 'const unsigned int' and 'int'

2021-06-03 Thread jbglaw--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100841

--- Comment #4 from Jan-Benedict Glaw  ---
Builds now. Thanks a lot!

PING [PATCH] PR fortran/99839 - [9/10/11/12 Regression] ICE in inline_matmul_assign, at fortran/frontend-passes.c:4234

2021-06-03 Thread Harald Anlauf via Gcc-patches
*PING*

> Sent: Thursday, 27 May 2021 at 22:20
> From: "Harald Anlauf" 
> To: "fortran" , "gcc-patches" 
> Subject: [PATCH] PR fortran/99839 - [9/10/11/12 Regression] ICE in inline_matmul_assign, at fortran/frontend-passes.c:4234
>
> Dear Fortranners,
>
> frontend optimization tries to inline matmul, but then it also needs
> to take care of the assignment to the result array.  If that one is
> not of canonical type, we currently get an ICE.  The straightforward
> solution is to simply punt in those cases and avoid inlining.
>
> Regtested on x86_64-pc-linux-gnu.
>
> OK for mainline?  Backport to affected branches?
>
> Thanks,
> Harald
>
>
> Fortran - ICE in inline_matmul_assign
>
> Restrict inlining of matmul to those cases where assignment to the
> result array does not need special treatment.
>
> gcc/fortran/ChangeLog:
>
>   PR fortran/99839
>   * frontend-passes.c (inline_matmul_assign): Do not inline matmul
>   if the assignment to the resulting array is not of canonical
>   type (real/integer/complex/logical).
>
> gcc/testsuite/ChangeLog:
>
>   PR fortran/99839
>   * gfortran.dg/inline_matmul_25.f90: New test.
>
>


Re: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-03 Thread Segher Boessenkool
On Thu, Jun 03, 2021 at 08:09:36AM -0500, Bill Schmidt wrote:
> Note also that swap optimization can handle more general cases than 
> simplify-rtx.

Do you have examples?  That should be fixed (unless it is something
Power-specific?)

> In my view it's best to have it covered in both places.

Oh certainly, and we need that p8swaps pass on at least p8 anyway (but
perhaps we can allow TImode in vector regs on later cpus).


Segher


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-03 Thread Qing Zhao via Gcc-patches
Hi, Richard,


On May 26, 2021, at 6:18 AM, Richard Biener <rguent...@suse.de> wrote:

On Wed, 12 May 2021, Qing Zhao wrote:

Hi,

This is the 3rd version of the patch for the new security feature for GCC.

Please take a look and let me know your comments and suggestions.


+/* Returns true when the given TYPE has padding inside it.
+   return false otherwise.  */
+bool
+type_has_padding (tree type)
+{
+  switch (TREE_CODE (type))
+{
+case RECORD_TYPE:
+  {

btw, there's __builtin_clear_padding and a whole machinery around
it in gimple-fold.c, I'm sure that parts could be re-used if they
are necessary in the end.

To address the above suggestion:

My study shows that the call to __builtin_clear_padding is expanded during
the gimplification phase, and that no __builtin_clear_padding expansion
happens during the RTL expansion phase.
However, for -ftrivial-auto-var-init, padding initialization should be done
in both the gimplification phase and the RTL expansion phase.
Since __builtin_clear_padding might not be suitable for RTL expansion,
reusing __builtin_clear_padding might not work.
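
For readers following along, here is a minimal C sketch of what "padding inside a type" means in this discussion. It is only an illustration and not code from the patch: struct sample and the three helper names are made up, and the amount of padding depends on the target ABI.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example type.  On most ABIs the compiler places padding
   bytes between 'tag' and 'value' so that 'value' is suitably aligned.  */
struct sample
{
  char tag;
  long value;
};

/* Bytes of struct sample that belong to no member (ABI dependent).  */
static size_t
sample_padding_bytes (void)
{
  return sizeof (struct sample) - (sizeof (char) + sizeof (long));
}

/* Clear the whole object representation, padding included.  Assigning
   the members one by one would leave the padding bytes unspecified.  */
static void
sample_init_zero (struct sample *s)
{
  memset (s, 0, sizeof *s);
}

/* Count the nonzero bytes in an object representation.  */
static size_t
count_nonzero_bytes (const void *obj, size_t size)
{
  const unsigned char *p = obj;
  size_t n = 0;
  for (size_t i = 0; i < size; i++)
    n += (p[i] != 0);
  return n;
}
```

After sample_init_zero, count_nonzero_bytes over the object returns 0, which is the property a padding-aware zero initialization has to provide.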

Let me know if you have any more comments on this.

Thanks.

Qing


Re: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-03 Thread Segher Boessenkool
Hi!

On Thu, Jun 03, 2021 at 08:46:46AM +0800, Xionghu Luo wrote:
> On 2021/6/3 06:20, Segher Boessenkool wrote:
> > On Wed, Jun 02, 2021 at 03:19:32AM -0500, Xionghu Luo wrote:
> >> On P8LE, extra rot64+rot64 load or store instructions are generated
> >> in float128 to vector __int128 conversion.
> >>
> >> This patch teaches pass swaps to also handle such patterns to remove
> >> extra swap instructions.
> > 
> > Did you check if this is already handled by simplify-rtx if the mode had
> > been TImode (not V1TImode)?  If not, why do you not handle it there?
> 
> I tried to do it in combine or peephole, but the later split2 or split3
> pass will still split it to rotate + rotate again, as we have a split
> after reload, and this pattern is quite P8LE specific, so I put it in pass
> swap.  The simplify-rtx could simplify 
> r124:KF#0=r123:KF#0<-<0x40<-<0x40 to r124:KF#0=r123:KF#0 for register
> operations already.

What mode are those subregs?  Abbreviated RTL printouts are very lossy.
Assuming those are TImode (please check), then yes, that is what I
asked, thanks.

> ;; The post-reload split requires that we re-permute the source
> ;; register in case it is still live.
> (define_split
>   [(set (match_operand:VSX_LE_128 0 "memory_operand")
> (match_operand:VSX_LE_128 1 "vsx_register_operand"))]
>   "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR
>&& !altivec_indexed_or_indirect_operand (operands[0], mode)"
>   [(const_int 0)]
> {
>   rs6000_emit_le_vsx_permute (operands[1], operands[1], mode);
>   rs6000_emit_le_vsx_permute (operands[0], operands[1], mode);
>   rs6000_emit_le_vsx_permute (operands[1], operands[1], mode);
>   DONE;
> })

Yes, that needs improvement itself.

The thing to realise is that TImode is optimised by generic code just
fine (as all scalar integer modes are), but V1TImode is not.  We have
that mode because we really needed to not put TImode in vector registers
so much on older cpus, but that balance may have changed by now.  Worth
experimenting with, we now can do pretty much all normal operations in
vector registers!
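
As a small aside, the reason a rot64 + rot64 pair can be dropped is easy to see in plain C. The sketch below models a 128-bit value as two doublewords; struct u128 and rot64 are names invented for this illustration, not anything from rs6000.

```c
#include <stdint.h>

/* A 128-bit value modeled as two 64-bit doublewords (illustrative type).  */
struct u128
{
  uint64_t hi;
  uint64_t lo;
};

/* Rotating a 128-bit value by 64 bits swaps its two doublewords.  */
static struct u128
rot64 (struct u128 x)
{
  struct u128 r = { x.lo, x.hi };
  return r;
}
```

Applying rot64 twice gives back the original value, which is the simplification generic code already performs for the register case quoted earlier in this message.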


Segher


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-06-03 Thread Qing Zhao via Gcc-patches
Hi, Richard,

For the following, I need more clarification:



+/* Expand the IFN_DEFERRED_INIT function according to its second
+   argument.  */
+static void
+expand_DEFERRED_INIT (internal_fn, gcall *stmt)
+{
+  tree var = gimple_call_lhs (stmt);
+  tree init = NULL_TREE;
+  enum auto_init_type init_type
+= (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1));
+
+  switch (init_type)
+{
+default:
+  gcc_unreachable ();
+case AUTO_INIT_PATTERN:
+  init = build_pattern_cst_for_auto_init (TREE_TYPE (var));
+  expand_assignment (var, init, false);
+  break;
+case AUTO_INIT_ZERO:
+  init = build_zero_cst (TREE_TYPE (var));
+  expand_assignment (var, init, false);
+  break;
+}

I think actually building build_pattern_cst_for_auto_init can generate
massive garbage and for big auto vars code size is also a concern and
ideally on x86 you'd produce rep movq.  So I don't think going
via expand_assignment is good.  Instead you possibly want to lower
.DEFERRED_INIT to MEMs following expand_builtin_memset and
eventually enhance that to allow storing pieces larger than a byte.
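
To make the "storing pieces larger than a byte" idea concrete, here is a rough C sketch of a memset-like fill that repeats a multi-byte pattern. This is not how expand_builtin_memset works internally; fill_pattern64 is a hypothetical helper, and the pattern value a caller passes is up to them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill LEN bytes at DST by repeating the 8-byte PATTERN, storing one
   doubleword at a time and finishing with a partial tail store.  */
static void
fill_pattern64 (void *dst, uint64_t pattern, size_t len)
{
  unsigned char *p = dst;

  while (len >= sizeof pattern)
    {
      memcpy (p, &pattern, sizeof pattern);
      p += sizeof pattern;
      len -= sizeof pattern;
    }

  /* Tail: fewer than 8 bytes remain (possibly none).  */
  memcpy (p, &pattern, len);
}
```

No constant of the full object size is ever built here, only stores of a fixed small piece, which is the garbage and code size concern raised above.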


I will lower .DEFERRED_INIT to MEMs following expand_builtin_memset for 
“AUTO_INIT_PATTERN”.
My question is:
Do I need to do the same for “AUTO_INIT_ZERO”?

Thanks.

Qing



[Bug c++/100897] New: Symmetric transfer does not prevent stack-overflow for C++20 coroutines

2021-06-03 Thread l.v.merzljak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100897

Bug ID: 100897
   Summary: Symmetric transfer does not prevent stack-overflow for
C++20 coroutines
   Product: gcc
   Version: 11.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: l.v.merzljak at gmail dot com
  Target Milestone: ---

Although the following code uses symmetric transfer, it crashes due to a
stack-overflow. The crash is also reproducible when using the task<> type of
the cppcoro library. The crash does not occur when using clang.

```
// main.cc
#include <coroutine>
#include <exception>

class Task {
 public:
  struct promise_type {
Task get_return_object() { return Handle::from_promise(*this); }

struct FinalAwaitable {
  bool await_ready() const noexcept { return false; }

  // Use symmetric transfer. Resuming coro.promise().m_continuation should
  // not require extra stack space
  std::coroutine_handle<> await_suspend(
  std::coroutine_handle<promise_type> coro) noexcept {
if (coro.promise().m_continuation) {
  return coro.promise().m_continuation;
} else {
  // The top-level task started from within main() does not have a
  // continuation. This will give control back to the main function.
  return std::noop_coroutine();
}
  }

  void await_resume() noexcept {}
};

std::suspend_always initial_suspend() noexcept { return {}; }

FinalAwaitable final_suspend() noexcept { return {}; }

void unhandled_exception() noexcept { std::terminate(); }

void set_continuation(std::coroutine_handle<> continuation) noexcept {
  m_continuation = continuation;
}

void return_void() noexcept {}

   private:
std::coroutine_handle<> m_continuation;
  };

  using Handle = std::coroutine_handle<promise_type>;

  Task(Handle coroutine) : m_coroutine(coroutine) {}

  ~Task() {
if (m_coroutine) {
  m_coroutine.destroy();
}
  }

  void start() noexcept { m_coroutine.resume(); }

  auto operator co_await() const noexcept { return Awaitable{m_coroutine}; }

 private:
  struct Awaitable {
Handle m_coroutine;

Awaitable(Handle coroutine) noexcept : m_coroutine(coroutine) {}

bool await_ready() const noexcept { return false; }

// Use symmetric transfer. Resuming m_coroutine should not require extra
// stack space
std::coroutine_handle<> await_suspend(
std::coroutine_handle<> awaitingCoroutine) noexcept {
  m_coroutine.promise().set_continuation(awaitingCoroutine);
  return m_coroutine;
}

void await_resume() {}
  };

  Handle m_coroutine;
};

Task inner() { co_return; }

Task outer() {
  // Use large number of iterations to trigger stack-overflow
  for (int i = 0; i != 5000; ++i) {
co_await inner();
  }
}

int main() {
  auto task = outer();
  task.start();
}
```

I compile the code with `g++-11 main.cc -std=c++20 -O3 -fsanitize=address`.

Here is the output:
```
$ ./a.out
AddressSanitizer:DEADLYSIGNAL
=
==21002==ERROR: AddressSanitizer: stack-overflow on address 0x7fffc666dff8 (pc
0x7f6ec2dfa16d bp 0x7fffc666e870 sp 0x7fffc666e000 T0)
#0 0x7f6ec2dfa16d in __sanitizer::BufferedStackTrace::UnwindImpl(unsigned
long, unsigned long, void*, bool, unsigned int)
../../../../src/libsanitizer/asan/asan_stack.cpp:57
#1 0x7f6ec2df00eb in __sanitizer::BufferedStackTrace::Unwind(unsigned long,
unsigned long, void*, bool, unsigned int)
../../../../src/libsanitizer/sanitizer_common/sanitizer_stacktrace.h:122
#2 0x7f6ec2df00eb in operator delete(void*)
../../../../src/libsanitizer/asan/asan_new_delete.cpp:160
#3 0x560193552e57 in _Z5innerv.destroy(inner()::_Z5innerv.frame*)
(/home/leonard/Desktop/hiwi/async_io_uring/stack-overflow/a.out+0x1e57)
#4 0x560193553b30 in _Z5outerv.actor(outer()::_Z5outerv.frame*)
(/home/leonard/Desktop/hiwi/async_io_uring/stack-overflow/a.out+0x2b30)
#5 0x560193552bbb in _Z5innerv.actor(inner()::_Z5innerv.frame*)
(/home/leonard/Desktop/hiwi/async_io_uring/stack-overflow/a.out+0x1bbb)
...
```

Re: [PATCH] Canonicalize (vec_duplicate (not A)) to (not (vec_duplicate A)).

2021-06-03 Thread Segher Boessenkool
On Thu, Jun 03, 2021 at 11:03:43AM +, Liu, Hongtao wrote:
> >A very typical example is how UMIN is optimised:
> >
> >   case UMIN:
> >  if (trueop1 == CONST0_RTX (mode) && ! side_effects_p (op0))
> > return op1;
> >  if (rtx_equal_p (trueop0, trueop1) && ! side_effects_p (op0))
> > return op0;
> >  tem = simplify_associative_operation (code, mode, op0, op1);
> >  if (tem)
> > return tem;
> >  break;
> >
> >(the stuff using "tem").
> >
> >Hongtao, can we do something similar here?  Does that work well?  Please try
> >it out :-)
> 
> In simplify_rtx, no simplification occurs, there is just the difference between
>  (vec_duplicate (not REG)) and (not (vec_duplicate (REG))). So here tem will 
> only be 0.

simplify-rtx is used by combine.  When you do and+not+splat for example
my suggestion should kick in.  Try it out, don't just dismiss it?

> Basically we don't know it's a simplification until combine successfully splits
> the 3->2 instructions (not + broadcast + and to andnot + broadcast), but it's
> pretty awkward to do this in combine.

But you need to do this *before* it is split.  That is the whole point.

> Considering andnot exists for many backends, I think a canonicalization is
> needed here.

Please do note that that is not as easy as you may think: you need to
make sure nothing ever creates non-canonical code.

> Maybe we can add insn canonicalization for transforming (and (vect_duplicate 
> (not A)) B) to 
> (and (not (duplicate (not A)) B) instead of (vec_duplicate (not A)) to (not 
> (vec_duplicate A))?

I don't understand what this means?


Segher


[Bug target/100896] New: --enable-initfini-array should be enabled for cross compiler to Linux

2021-06-03 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100896

Bug ID: 100896
   Summary: --enable-initfini-array should be enabled for cross
compiler to Linux
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
  Target Milestone: ---

--enable-initfini-array should be enabled for cross compiler to
all *-*-linux* and *-*-gnu* targets.

Re: [PATCH] rtl: constm64_rtx..const64_rtx

2021-06-03 Thread Jeff Law via Gcc-patches




On 6/2/2021 4:43 PM, Segher Boessenkool wrote:

Hi!

On Wed, Jun 02, 2021 at 06:07:28PM +0100, Richard Sandiford wrote:

Segher Boessenkool  writes:

Since times immemorial there has been const_int_rtx for all values from
-64 to 64, but only constm1_rtx..const2_rtx have been available for
convenient use.  Change this, so that we can use all values in
{-64,...,64} in RTL easily.  This matters, because then we can just say
   if (XEXP (x, 1) == const16_rtx)
and things like that, since all const_int in that range are unique.  We
already do for -1, 0, 1, 2, but we could for everything.

No strong objection, but personally I'd rather not add something
that is very specific to VOIDmode CONST_INTs.  I realise it's very
unlikely that we'll ever be able to give CONST_INTs their proper mode
(no-one has the kind of time needed to do that), but I don't think we
should make the switch actively harder either.

How does this make that harder?

Having no mode for CONST_INTs makes some things significantly *easier*
btw.  Well you know that, that is what makes any conversion away from
this so much harder :-)

We have had const0_rtx etc. since forever, this patch just increases the
range (to those values that have had guaranteed unique RTXes since
decades as well).
Yea, but often what you really want is CONST0_RTX (mode) instead of 
const0_rtx.   It's easily goof'd and often the cause of minor missed 
optimizations.
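
For anyone unfamiliar with why pointer comparison against const16_rtx is valid at all, the following toy C model may help. It is deliberately not GCC code: struct small_const, get_small_const and is_sixteen are invented names, and the model only covers the preallocated range of -64 to 64 mentioned above (it simply returns NULL outside that range).

```c
#include <stddef.h>

#define MAX_SAVED 64

/* Toy stand-in for a shared constant object.  */
struct small_const
{
  long value;
};

/* One object per value in [-MAX_SAVED, MAX_SAVED], created once.  */
static struct small_const saved[2 * MAX_SAVED + 1];
static int saved_ready;

/* Return the unique object for V, or NULL when V is outside the
   preallocated range (this toy does not model larger constants).  */
static const struct small_const *
get_small_const (long v)
{
  if (v < -MAX_SAVED || v > MAX_SAVED)
    return NULL;

  if (!saved_ready)
    {
      for (long i = -MAX_SAVED; i <= MAX_SAVED; i++)
        saved[i + MAX_SAVED].value = i;
      saved_ready = 1;
    }

  return &saved[v + MAX_SAVED];
}

/* Because each value has exactly one object, comparing pointers is
   equivalent to comparing values.  */
static int
is_sixteen (const struct small_const *x)
{
  return x == get_small_const (16);
}
```

The mode question raised in this reply is outside what the toy shows: the objects here carry no mode at all, which mirrors the VOIDmode point under discussion.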


jeff


[Bug target/43892] PowerPC suboptimal "add with carry" optimization

2021-06-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43892

--- Comment #35 from Segher Boessenkool  ---
You get something like

.L5:
lwzu 9,4(10)
addc 8,3,9
adde 3,9,3
bdnz .L5

Re: Test results for gccrs on Debian unstable aarch64

2021-06-03 Thread Mark Wielaard
On Thu, Jun 03, 2021 at 12:58:46PM +0100, Philip Herron wrote:
> I just had a thought it would be nice if we could keep a matrix of
> different platforms gccrs has been tested on, and they could have states of:
> 
> 1. Build Failure
> 2. Test Failures link to log
> 3. Tests pass, no unexpected results

Note that the buildbot could provide that (for the arches it supports).
If you look at:
https://builder.wildebeest.org/buildbot/#/builders?tags=gccrust
you can see which arches are green.
And for each build it publishes the rust.sum and rust.log files
(under the make check link)

Currently I have only enabled the arches that I know are fully green
(x86_64, arm64 and ppc64le). I should add ppc64[be] because that
should also be fully green. I am not sure about enabling
others (arm32, i386 and s390x). It seems a bit pointless to run builds
we know are currently broken.

Cheers,

Mark
-- 
Gcc-rust mailing list
Gcc-rust@gcc.gnu.org
https://gcc.gnu.org/mailman/listinfo/gcc-rust


[PATCH] libgcc libiberty: optimize and modernize standard string and memory functions

2021-06-03 Thread Seija K. via Gcc-patches
This patch optimizes and simplifies many of the standard string functions.

Since C99, some of the standard string functions have been changed to use
the restrict modifier.

diff --git a/libgcc/memcmp.c b/libgcc/memcmp.c
index 2348afe1d27f7..74195cf6baf13 100644
--- a/libgcc/memcmp.c
+++ b/libgcc/memcmp.c
@@ -7,10 +7,11 @@ memcmp (const void *str1, const void *str2, size_t count)
   const unsigned char *s1 = str1;
   const unsigned char *s2 = str2;

-  while (count-- > 0)
+  while (count--)
 {
-  if (*s1++ != *s2++)
- return s1[-1] < s2[-1] ? -1 : 1;
+  if (*s1 != *s2)
+ return *s1 < *s2 ? -1 : 1;
+  s1++, s2++;
 }
   return 0;
 }
diff --git a/libgcc/memcpy.c b/libgcc/memcpy.c
index 58b1e405627aa..616df78fd2969 100644
--- a/libgcc/memcpy.c
+++ b/libgcc/memcpy.c
@@ -2,7 +2,7 @@
 #include 

 void *
-memcpy (void *dest, const void *src, size_t len)
+memcpy (void * restrict dest, const void * restrict src, size_t len)
 {
   char *d = dest;
   const char *s = src;
diff --git a/libgcc/memset.c b/libgcc/memset.c
index 3e7025ee39443..b3b27cd63e12d 100644
--- a/libgcc/memset.c
+++ b/libgcc/memset.c
@@ -5,7 +5,7 @@ void *
 memset (void *dest, int val, size_t len)
 {
   unsigned char *ptr = dest;
-  while (len-- > 0)
-*ptr++ = val;
+  while (len--)
+*ptr++ = (unsigned char)val;
   return dest;
 }
diff --git a/libiberty/memchr.c b/libiberty/memchr.c
index 7448ab9e71c32..6f03e9c281108 100644
--- a/libiberty/memchr.c
+++ b/libiberty/memchr.c
@@ -23,7 +23,7 @@ memchr (register const PTR src_void, int c, size_t length)
 {
   const unsigned char *src = (const unsigned char *)src_void;

-  while (length-- > 0)
+  while (length--)
   {
 if (*src == c)
  return (PTR)src;
diff --git a/libiberty/memcmp.c b/libiberty/memcmp.c
index 37db60f38267a..f41b35a758cc4 100644
--- a/libiberty/memcmp.c
+++ b/libiberty/memcmp.c
@@ -27,8 +27,9 @@ memcmp (const PTR str1, const PTR str2, size_t count)

   while (count-- > 0)
 {
-  if (*s1++ != *s2++)
- return s1[-1] < s2[-1] ? -1 : 1;
+  if (*s1 != *s2)
+ return *s1 < *s2 ? -1 : 1;
+  s1++, s2++;
 }
   return 0;
 }
diff --git a/libiberty/memcpy.c b/libiberty/memcpy.c
index 7f67d0bd1f26c..d388ae7f3506b 100644
--- a/libiberty/memcpy.c
+++ b/libiberty/memcpy.c
@@ -19,7 +19,7 @@ Copies @var{length} bytes from memory region
@var{in} to region
 void bcopy (const void*, void*, size_t);

 PTR
-memcpy (PTR out, const PTR in, size_t length)
+memcpy (PTR restrict out, const PTR restrict in, size_t length)
 {
 bcopy(in, out, length);
 return out;
diff --git a/libiberty/mempcpy.c b/libiberty/mempcpy.c
index f4c624d4a3227..ac56eeaee0d5e 100644
--- a/libiberty/mempcpy.c
+++ b/libiberty/mempcpy.c
@@ -33,10 +33,10 @@ Copies @var{length} bytes from memory region
@var{in} to region
 #include 
 #include 

-extern PTR memcpy (PTR, const PTR, size_t);
+extern PTR memcpy (PTR restrict, const PTR restrict, size_t);

 PTR
-mempcpy (PTR dst, const PTR src, size_t len)
+mempcpy (PTR restrict dst, const PTR restrict src, size_t len)
 {
   return (char *) memcpy (dst, src, len) + len;
 }
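
For reference, the two ideas in the patch above can be tried outside the tree with the standalone C sketch below. The names byte_compare and byte_copy are made up so they do not clash with the C library, and the code is an approximation of the patched loops rather than a copy of the libgcc or libiberty sources.

```c
#include <stddef.h>

/* Compare N bytes as unsigned char values.  The pointers are advanced
   only after a successful comparison, so no s1[-1] lookback is needed.  */
int
byte_compare (const void *a, const void *b, size_t n)
{
  const unsigned char *s1 = a;
  const unsigned char *s2 = b;

  for (; n; n--, s1++, s2++)
    if (*s1 != *s2)
      return *s1 < *s2 ? -1 : 1;
  return 0;
}

/* Copy N bytes.  The restrict qualifiers (C99) state that DST and SRC
   do not overlap, matching the standard memcpy prototype.  */
void *
byte_copy (void *restrict dst, const void *restrict src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  while (n--)
    *d++ = *s++;
  return dst;
}
```

Since size_t is unsigned, "while (n--)" and "while (n-- > 0)" behave identically, so that part of the change is purely cosmetic.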


Re: [PATCH] c++: top-level cv-quals on type of NTTP [PR100893]

2021-06-03 Thread Patrick Palka via Gcc-patches
On Thu, 3 Jun 2021, Patrick Palka wrote:

> Here, we're rejecting the specialization of g with T=A, F= in the
> first testcase due to a spurious constness mismatch between the type of
> the template argument  and the substituted type of F (the substituted
> type has a top-level const).  Note that this mismatch doesn't occur with
> object pointers because in that case a call to
> perform_qualification_conversions from convert_nontype_argument
> implicitly adds a top-level const to the argument (via a cast) to match.
> 
> This however seems to be a manifestation of a more general conformance
> issue -- that we're not dropping top-level cv-quals after substituting
> into the type of an NTTP as per [temp.param]/6 (we only do so at parse
> time in process_template_parm).  This patch makes convert_template_argument
> drop top-level cv-quals accordingly.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?
> 
>   PR c++/100893
> 
> gcc/cp/ChangeLog:
> 
>   * pt.c (convert_template_argument): Strip top-level cv-quals
>   on the substituted type of a non-type template parameter.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/template/param4.C: New test.
>   * g++.dg/template/param5.C: New test.
> ---
>  gcc/cp/pt.c|  7 ++-
>  gcc/testsuite/g++.dg/template/param4.C | 10 ++
>  gcc/testsuite/g++.dg/template/param5.C |  7 +++
>  3 files changed, 23 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/template/param4.C
>  create mode 100644 gcc/testsuite/g++.dg/template/param5.C
> 
> diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> index 7211bdc5bbc..66cc88a331f 100644
> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -8494,7 +8494,12 @@ convert_template_argument (tree parm,
>   return error_mark_node;
>   }
>else
> - t = tsubst (t, args, complain, in_decl);
> + {
> +   t = tsubst (t, args, complain, in_decl);
> +   /* Ignore top-level qualifiers on the substituted type of this
> +  non-type template parameter, as per [temp.param]/6.  */
> +   t = cv_unqualified (t);
> + }

Err, shortly after posting I realized we could get top-level cv-quals not
only after substitution, but also after deduction of a decltype(auto)
NTTP (as in nontype-auto19.C below), so the call to cv_unqualified would
need to happen after do_auto_deduction too, like so.  Testing in
progress.

-- >8 --

Subject: [PATCH] c++: top-level cv-quals on type of NTTP [PR100893]

Here, we're rejecting the specialization of g with T=A, F= in
param4.C below due to a spurious constness mismatch between the type of
the template argument  and the substituted type of F (the substituted
type has a top-level const).  Note that this mismatch doesn't occur with
object pointers because in that case a call to
perform_qualification_conversions from convert_nontype_argument
implicitly adds a top-level const to the argument (via a cast) to match.

This however seems to be a manifestation of a more general conformance
issue -- that we're not dropping top-level cv-quals after substituting
into the type of an NTTP as per [temp.param]/6 (we only do so at parse
time in process_template_parm).  So this patch makes
convert_template_argument drop top-level cv-quals accordingly.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/100893

gcc/cp/ChangeLog:

* pt.c (convert_template_argument): Strip top-level cv-quals
on the substituted type of a non-type template parameter.

gcc/testsuite/ChangeLog:

* g++.dg/template/param4.C: New test.
* g++.dg/template/param5.C: New test.
* g++.dg/cpp1z/nontype-auto19.C: New test.
* g++.dg/cpp2a/concepts-decltype.C: Don't expect that the
deduced type of a decltype(auto) NTTP has top-level cv-quals.
---
 gcc/cp/pt.c|  4 
 gcc/testsuite/g++.dg/cpp1z/nontype-auto19.C|  8 
 gcc/testsuite/g++.dg/cpp2a/concepts-decltype.C |  2 +-
 gcc/testsuite/g++.dg/template/param4.C | 10 ++
 gcc/testsuite/g++.dg/template/param5.C |  7 +++
 5 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/nontype-auto19.C
 create mode 100644 gcc/testsuite/g++.dg/template/param4.C
 create mode 100644 gcc/testsuite/g++.dg/template/param5.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7211bdc5bbc..3cac073ed50 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8496,6 +8496,10 @@ convert_template_argument (tree parm,
   else
t = tsubst (t, args, complain, in_decl);
 
+  /* Drop top-level cv-qualifiers on the substituted/deduced type of
+this non-type template parameter, as per [temp.param]/6.  */
+  t = cv_unqualified (t);
+
   if (invalid_nontype_parm_type_p (t, complain))
return error_mark_node;
 
diff --git a/gcc/testsuite/g++.dg/cpp1z/nontype-auto19.C 

[PATCH] i386: Add insert and extract patterns for 4-byte vectors [PR100637]

2021-06-03 Thread Uros Bizjak via Gcc-patches
The patch introduces insert and extract patterns for 4-byte vectors.
It effectively only emits PINSR and PEXTR instructions when available,
otherwise falls back to generic code that emulates these instructions
via inserts, extracts, logic operations and shifts in integer registers.

Please note that generic fallback produces better code than the current
approach of constructing new vector in memory (due to store forwarding stall)
so also enable QImode 8-byte vector inserts only with TARGET_SSE4_1.

2021-06-03  Uroš Bizjak  

gcc/
PR target/100637
* config/i386/i386-expand.c (ix86_expand_vector_set):
Handle V2HI and V4QI modes.
(ix86_expand_vector_extract): Ditto.
* config/i386/mmx.md (*pinsrw): New insn pattern.
(*pinsrb): Ditto.
(*pextrw): Ditto.
(*pextrw_zext): Ditto.
(*pextrb): Ditto.
(*pextrb_zext): Ditto.
(vec_setv2hi): New expander.
(vec_extractv2hihi): Ditto.
(vec_setv4qi): Ditto.
(vec_extractv4qiqi): Ditto.

(vec_setv8qi): Enable only for TARGET_SSE4_1.
(vec_extractv8qiqi): Ditto.

gcc/testsuite/

PR target/100637
* gcc.target/i386/vperm-v2hi.c: New test.
* gcc.target/i386/vperm-v4qi.c: Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 4185f58eed5..eb7cdb0c14f 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -14968,6 +14968,7 @@ ix86_expand_vector_set (bool mmx_ok, rtx target, rtx 
val, int elt)
   return;
 
 case E_V8HImode:
+case E_V2HImode:
   use_vec_merge = TARGET_SSE2;
   break;
 case E_V4HImode:
@@ -14975,6 +14976,7 @@ ix86_expand_vector_set (bool mmx_ok, rtx target, rtx 
val, int elt)
   break;
 
 case E_V16QImode:
+case E_V4QImode:
   use_vec_merge = TARGET_SSE4_1;
   break;
 
@@ -15274,6 +15276,7 @@ ix86_expand_vector_extract (bool mmx_ok, rtx target, 
rtx vec, int elt)
   break;
 
 case E_V8HImode:
+case E_V2HImode:
   use_vec_extr = TARGET_SSE2;
   break;
 case E_V4HImode:
@@ -15294,6 +15297,9 @@ ix86_expand_vector_extract (bool mmx_ok, rtx target, 
rtx vec, int elt)
  return;
}
   break;
+case E_V4QImode:
+  use_vec_extr = TARGET_SSE4_1;
+  break;
 
 case E_V8SFmode:
   if (TARGET_AVX)
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index f39e062ddfc..914e5e91e90 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -3092,7 +3092,7 @@ (define_expand "vec_setv8qi"
   [(match_operand:V8QI 0 "register_operand")
(match_operand:QI 1 "register_operand")
(match_operand 2 "const_int_operand")]
-  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
 {
   ix86_expand_vector_set (TARGET_MMX_WITH_SSE, operands[0], operands[1],
  INTVAL (operands[2]));
@@ -3103,7 +3103,7 @@ (define_expand "vec_extractv8qiqi"
   [(match_operand:QI 0 "register_operand")
(match_operand:V8QI 1 "register_operand")
(match_operand 2 "const_int_operand")]
-  "TARGET_MMX || TARGET_MMX_WITH_SSE"
+  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
 {
   ix86_expand_vector_extract (TARGET_MMX_WITH_SSE, operands[0],
  operands[1], INTVAL (operands[2]));
@@ -3120,6 +3120,178 @@ (define_expand "vec_initv8qiqi"
   DONE;
 })
 
+(define_insn "*pinsrw"
+  [(set (match_operand:V2HI 0 "register_operand" "=x,YW")
+(vec_merge:V2HI
+  (vec_duplicate:V2HI
+(match_operand:HI 2 "nonimmediate_operand" "rm,rm"))
+ (match_operand:V2HI 1 "register_operand" "0,YW")
+  (match_operand:SI 3 "const_int_operand")))]
+  "TARGET_SSE2
+   && ((unsigned) exact_log2 (INTVAL (operands[3]))
+   < GET_MODE_NUNITS (V2HImode))"
+{
+  operands[3] = GEN_INT (exact_log2 (INTVAL (operands[3])));
+  switch (which_alternative)
+{
+case 1:
+  if (MEM_P (operands[2]))
+   return "vpinsrw\t{%3, %2, %1, %0|%0, %1, %2, %3}";
+  else
+   return "vpinsrw\t{%3, %k2, %1, %0|%0, %1, %k2, %3}";
+case 0:
+  if (MEM_P (operands[2]))
+   return "pinsrw\t{%3, %2, %0|%0, %2, %3}";
+  else
+   return "pinsrw\t{%3, %k2, %0|%0, %k2, %3}";
+default:
+  gcc_unreachable ();
+}
+}
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sselog")
+   (set_attr "length_immediate" "1")
+   (set_attr "mode" "TI")])
+
+(define_insn "*pinsrb"
+  [(set (match_operand:V4QI 0 "register_operand" "=x,YW")
+(vec_merge:V4QI
+  (vec_duplicate:V4QI
+(match_operand:QI 2 "nonimmediate_operand" "rm,rm"))
+ (match_operand:V4QI 1 "register_operand" "0,YW")
+  (match_operand:SI 3 "const_int_operand")))]
+  "TARGET_SSE4_1
+   && ((unsigned) exact_log2 (INTVAL (operands[3]))
+   < GET_MODE_NUNITS (V4QImode))"
+{
+  operands[3] = GEN_INT (exact_log2 (INTVAL (operands[3])));
+  switch (which_alternative)
+{
+

[Bug target/100637] [i386] Vectorize 4-byte vectors

2021-06-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100637

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:5883e567564c5b3caecba0c13e8a360a14cdc846

commit r12-1197-g5883e567564c5b3caecba0c13e8a360a14cdc846
Author: Uros Bizjak 
Date:   Thu Jun 3 20:05:31 2021 +0200

i386: Add insert and extract patterns for 4-byte vectors [PR100637]

The patch introduces insert and extract patterns for 4-byte vectors.
It effectively only emits PINSR and PEXTR instructions when available,
otherwise falls back to generic code that emulates these instructions
via inserts, extracts, logic operations and shifts in integer registers.

Please note that generic fallback produces better code than the current
approach of constructing new vector in memory (due to store forwarding
stall)
so also enable QImode 8-byte vector inserts only with TARGET_SSE4_1.

2021-06-03  Uroš Bizjak  

gcc/
PR target/100637
* config/i386/i386-expand.c (ix86_expand_vector_set):
Handle V2HI and V4QI modes.
(ix86_expand_vector_extract): Ditto.
* config/i386/mmx.md (*pinsrw): New insn pattern.
(*pinsrb): Ditto.
(*pextrw): Ditto.
(*pextrw_zext): Ditto.
(*pextrb): Ditto.
(*pextrb_zext): Ditto.
(vec_setv2hi): New expander.
(vec_extractv2hihi): Ditto.
(vec_setv4qi): Ditto.
(vec_extractv4qiqi): Ditto.

(vec_setv8qi): Enable only for TARGET_SSE4_1.
(vec_extractv8qiqi): Ditto.

gcc/testsuite/

PR target/100637
* gcc.target/i386/vperm-v2hi.c: New test.
* gcc.target/i386/vperm-v4qi.c: Ditto.
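
As an illustration of the generic fallback described in the commit message (logic operations and shifts in integer registers), here is a small C sketch for a 4-byte vector of two 16-bit elements. It is not the code GCC generates; v2hi_insert and v2hi_extract are hypothetical helpers, and the sketch assumes element 0 lives in the low 16 bits of the 32-bit integer.

```c
#include <stdint.h>

/* Replace element IDX (0 or 1) of a two-element 16-bit vector held in a
   32-bit integer: mask out the old element, then OR in the new one.  */
static uint32_t
v2hi_insert (uint32_t vec, uint16_t val, unsigned idx)
{
  unsigned shift = (idx & 1) * 16;
  uint32_t mask = (uint32_t) 0xffff << shift;

  return (vec & ~mask) | ((uint32_t) val << shift);
}

/* Read element IDX (0 or 1): shift it down and truncate to 16 bits.  */
static uint16_t
v2hi_extract (uint32_t vec, unsigned idx)
{
  return (uint16_t) (vec >> ((idx & 1) * 16));
}
```

Everything stays in an integer register, so there is no store followed by a wider load, which is where the store forwarding stall mentioned above comes from.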

[PATCH] c++: top-level cv-quals on type of NTTP [PR100893]

2021-06-03 Thread Patrick Palka via Gcc-patches
Here, we're rejecting the specialization of g with T=A, F= in the
first testcase due to a spurious constness mismatch between the type of
the template argument  and the substituted type of F (the substituted
type has a top-level const).  Note that this mismatch doesn't occur with
object pointers because in that case a call to
perform_qualification_conversions from convert_nontype_argument
implicitly adds a top-level const to the argument (via a cast) to match.

This however seems to be a manifestation of a more general conformance
issue -- that we're not dropping top-level cv-quals after substituting
into the type of an NTTP as per [temp.param]/6 (we only do so at parse
time in process_template_parm).  This patch makes convert_template_argument
drop top-level cv-quals accordingly.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/100893

gcc/cp/ChangeLog:

* pt.c (convert_template_argument): Strip top-level cv-quals
on the substituted type of a non-type template parameter.

gcc/testsuite/ChangeLog:

* g++.dg/template/param4.C: New test.
* g++.dg/template/param5.C: New test.
---
 gcc/cp/pt.c|  7 ++-
 gcc/testsuite/g++.dg/template/param4.C | 10 ++
 gcc/testsuite/g++.dg/template/param5.C |  7 +++
 3 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/template/param4.C
 create mode 100644 gcc/testsuite/g++.dg/template/param5.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7211bdc5bbc..66cc88a331f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8494,7 +8494,12 @@ convert_template_argument (tree parm,
return error_mark_node;
}
   else
-   t = tsubst (t, args, complain, in_decl);
+   {
+ t = tsubst (t, args, complain, in_decl);
+ /* Ignore top-level qualifiers on the substituted type of this
+non-type template parameter, as per [temp.param]/6.  */
+ t = cv_unqualified (t);
+   }
 
   if (invalid_nontype_parm_type_p (t, complain))
return error_mark_node;
diff --git a/gcc/testsuite/g++.dg/template/param4.C 
b/gcc/testsuite/g++.dg/template/param4.C
new file mode 100644
index 000..d55fecec2d9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/param4.C
@@ -0,0 +1,10 @@
+// PR c++/100893
+
+template void g() { }
+
+struct A { typedef void (* const type)(); };
+void f();
+template void g();
+
+struct B { typedef void (B::* const type)(); void f(); };
+template void g();
diff --git a/gcc/testsuite/g++.dg/template/param5.C 
b/gcc/testsuite/g++.dg/template/param5.C
new file mode 100644
index 000..b8c92f4c217
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/param5.C
@@ -0,0 +1,7 @@
+// Verify top-level cv-qualifiers are dropped when determining the substituted
+// type of a non-type-template parameter, as per [temp.param]/6.
+// { dg-do compile { target c++11 } }
+
+template decltype(V)& f();
+using type = decltype(f());
+using type = int&;
-- 
2.32.0.rc2



Re: [PATCH 2/2] rs6000: Add test for _mm_minpos_epu16

2021-06-03 Thread Paul A. Clarke via Gcc-patches
On Wed, Jun 02, 2021 at 08:50:56PM -0500, Segher Boessenkool wrote:
> On Wed, Jun 02, 2021 at 05:13:16PM -0500, Paul A. Clarke wrote:
> > +  for (i = 0; i < NUM; i++)
> > +src.s[i] = i * i - 68 * i + 1200;
> 
> Could you do tests with some identical elements as well?  Because that
> is where I think it fails on BE currently.

Let me re-do the test case a bit more to provide a better set of
input data, rather than this computational attempt which misses a
bunch of interesting cases.

I'll send a v2 in a bit.

PC


Re: [PATCH 1/2] rs6000: Add support for _mm_minpos_epu16

2021-06-03 Thread Paul A. Clarke via Gcc-patches
On Wed, Jun 02, 2021 at 07:27:35PM -0500, Segher Boessenkool wrote:
> On Wed, Jun 02, 2021 at 05:13:15PM -0500, Paul A. Clarke wrote:
> > Add a naive implementation of the subject x86 intrinsic to
> > ease porting.
> 
> > +/* Return horizontal packed word minimum and its index in bits [15:0]
> > +   and bits [18:16] respectively.  */
> > +extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
> > __artificial__))
> > +_mm_minpos_epu16 (__m128i __A)
> > +{
> > +  union __u
> > +{
> > +  __m128i __m;
> > +  __v8hu __uh;
> > +};
> > +  union __u __u = { .__m = __A }, __r = { .__m = {0} };
> > +  unsigned short __ridx = 0;
> > +  unsigned short __rmin = __u.__uh[__ridx];
> > +  for (unsigned long __i = __ridx+1;
> 
> (spaces around the "+"?)

ok

> 
> > +   __i < sizeof (__u.__uh) / sizeof (__u.__uh[0]);
> 
> You should either use a macro for that, or just write "8" :-)

ok. (There should be a standard thing for this operation.)

> > +   __i++)
> > +{
> > +  if (__u.__uh[__i] < __rmin)
> > +{
> > +  __rmin = __u.__uh[__i];
> > +  __ridx = __i;
> > +}
> > +}
> > +  __r.__uh[0] = __rmin;
> > +  __r.__uh[1] = __ridx;
> > +  return __r.__m;
> > +}
> 
> This does not compute the index correctly for big endian (it needs to
> walk from right to left for that).  The construction of the return value
> looks wrong as well.
> 
> Okay for trunk with that fixed.  Thanks!

I'm not seeing the issue here. The values are numbered by element order,
and the results are in the "first" (minimum value) and "second" (index of
first encountered minimum value in element order) elements of the result.

PC
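
For reference, the "element order" semantics described above can be modelled in scalar C. This is only a sketch and not part of the patch: the names minpos_result and minpos_ref are hypothetical, and it deliberately says nothing about how the elements map onto a vector register on big-endian targets, which is the open question in this thread. It does show the tie-breaking rule a test with identical elements would exercise: the strict '<' comparison keeps the lowest index among equal minima.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for illustration: the minimum value and the
   index of its first occurrence in element order.  */
typedef struct
{
  uint16_t min;
  uint16_t idx;
} minpos_result;

/* Scalar model of a horizontal unsigned 16-bit minimum over 8 elements.
   Not the rs6000 implementation; just the semantics discussed above.  */
static minpos_result
minpos_ref (const uint16_t v[8])
{
  minpos_result r = { v[0], 0 };
  for (size_t i = 1; i < 8; i++)
    if (v[i] < r.min)	/* strict '<': ties keep the earlier index */
      {
	r.min = v[i];
	r.idx = (uint16_t) i;
      }
  return r;
}
```

With the input { 5, 3, 9, 3, 7, 3, 8, 6 } this model reports the minimum 3 at index 1, not 3 or 5, so an input with repeated minima distinguishes a left-to-right walk from a right-to-left one.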


Update to GCC copyright assignment policy

2021-06-03 Thread Christopher Dimech via Gcc



> Sent: Friday, June 04, 2021 at 4:50 AM
> From: "Daniel Pono Takamori" 
> To: gcc@gcc.gnu.org
> Subject: Re: Update to GCC copyright assignment policy
>
> I'm joining this list just briefly to give some feedback and input on this
> thread on behalf of Software Freedom Conservancy, since we were mentioned
> multiple times in this thread.  I suspect any conversation about how
> Conservancy and GCC might work together should be off-list or another list,
> and I have suggestions on that below.

Software Freedom Conservancy cannot dictate what gets discussed here.  Naturally,
people, including the GCC Steering Committee, could discuss with the Software
Freedom Conservancy any matters they wish.  But, as you could have deduced,
we allow comments on any aspect ourselves, and have allowed absolute freedom of
speech that could very easily have hurt people's sentiments and emotions.

The law protects a broad variety of honest assessments and discussions.

> > > On 2021-06-01 07:28, Mark Wielaard wrote:
> > > > If we no longer want the FSF to be the legal guardian and copyright
> > > > holder for GCC could we please find another legal entity that performs
> > > > that role and helps us as a project with copyleft compliance?
>
> > On Tue, Jun 01, 2021 at 12:58:12PM -0700, Thomas Rodgers wrote:
> > > Personally, this would have been my preference.
>
> On Wed, Jun 2, 2021 at 4:18 AM Mark Wielaard  wrote:
> > the Conservancy is happy to share their knowledge and discuss policy issues
> > with the GCC community if we decide we want their input.
>
> Jason Merrill replied:
> >> This seems to me a complement rather than an alternative; some Linux
> >> developers use the Conservancy copyleft services while contributing under
> >> the DCO, and some GCC developers could do the same.
>
> Jason, we agree completely that anything Conservancy might offer is a
> complement rather than a replacement for any structure that the GCC community
> already has or might want to build.  For example, the Copyleft Compliance
> project that Mark mentioned 
> is primarily designed for projects (e.g., BusyBox, Debian, Linux, Samba) that
> have diversely-held copyright.  We provide logistical and coordination
> support for individuals who hold copyright (and help them figure out how to
> keep their own copyrights) and we also accept copyright assignment from those
> who prefer assignment.  (As a reminder, Conservancy is not a law firm and we
> do not provide legal services and advice.)

It is important that people understand this: Software Freedom Conservancy does
not provide any legal advice.  The FSF, on the other hand, has a robust Copyright
and Compliance framework where one can report violations.

Furthermore, the FSF can still help bring about compliance even when the
copyright lies elsewhere.

> Also, note that both these models of copyright (assigning to a single entity,
> or having diversely held copyright among both entities and individuals) are
> compatible with the DCO in our experience.  The DCO is an assent mechanism
> for licensing, and is orthogonal to the question of who holds the copyright.
>
> We would be glad to talk off-list with any GCC developers who have already
> decided to keep their own copyright about joining an enforcement coalition at
> Conservancy.
>
> The final note that Conservancy would like to share on-list is that through
> our ContractPatch initiative , we've been
> encouraging individuals to assure that their employment contract does permit
> them to keep their own copyrights.  There are many reasons and advantages
> for individuals rather than their employers to take control of copylefted
> copyrights.  We'd also be glad to discuss those policy benefits with anyone
> who is interested off-list.

The FSF has been at the forefront regarding the Disclaimer of Copyright aspect
from employers.  What we can say is that things will become much more difficult
to manage if things are related without a real understanding of the
implications.

> If you'd like to discuss any of these topics further with Conservancy, may I
> suggest the Contract Patch mailing list at:
> 
> We definitely don't want to see the GCC mailing list derailed into
> discussing this possibly off-topic issue.
>
> -Pono from Software Freedom Conservancy

It is well known that the Software Freedom Law Center has always sought to
resolve licensing disputes amicably.  On the other hand, the Software Freedom
Conservancy takes a much harder line against noncompliance with licensing terms.

On August 26, 2016, Linus Torvalds stated that he found such type of lawyering
a nasty festering disease, and the SFC is spreading that disease.

I agree with Torvalds following your arguing on what is discussed here.

https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003580.html


Re: [PATCH] MAINTAINERS: create DCO section; add myself to it

2021-06-03 Thread Jason Merrill via Gcc-patches

On 6/2/21 12:33 PM, Koning, Paul wrote:




On Jun 2, 2021, at 11:03 AM, Jason Merrill via Gcc-patches 
 wrote:

On 6/1/21 3:22 PM, Richard Biener via Gcc wrote:

On June 1, 2021 7:30:54 PM GMT+02:00, David Malcolm via Gcc  
wrote:

...


The MAINTAINERS file doesn't seem to have such a "DCO list"
yet; does the following patch look like what you had in mind?

ChangeLog

* MAINTAINERS: Create DCO section; add myself to it.
---
MAINTAINERS | 12 
1 file changed, 12 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index db25583b37b..1148e0915cf 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -685,3 +685,15 @@ Josef Zlomek   

James Dennett   
Christian Ehrhardt  
Dara Hazeghi
+
+
+DCO
+===
+
+Developers with commit access may add their name to the following list
+to certify the DCO (https://developercertificate.org/) for all

There should be a verbatim copy of the DCO in this file or the repository.


It's on the website now, at gcc.gnu.org/dco.html , and I've added the section 
to MAINTAINERS.  It's not clear to me that it needs to be in the source tree as 
well, since it's project contribution policy rather than license.


I'm wondering about change control of this document.  The GPL has a version 
number and references to use the version number.  The DCO seems to have a 
version number, but the DCO section in the MAINTAINERS file does not give it.  
I would think that a certification should call out which DCO it uses, whether 
in a one-off (in a patch) or in the MAINTAINERS DCO list.


I've added the version to the MAINTAINERS list, thanks.  It is not 
customary to mention the version in a Signed-off-by tag.


Jason



Re: [PATCH 2/2, rs6000] Remove mode promotion for pseudos

2021-06-03 Thread Segher Boessenkool
Hi!

On Thu, May 20, 2021 at 05:49:49PM +0800, HAO CHEN GUI wrote:
>   rs6000 has instructions that can do almost everything 32 bit
>   at least as efficiently as corresponding 64 bit things. The
>   mode promotion can be deferred to when a wide mode is necessary.
>   So it helps a lot not to promote mode for pseudos. SPECint test
>   shows that the overall performance improvement (by geomean) is
>   more than 2% with this patch.
>   testsuite/gcc.target/powerpc/not-promote-mode.c illustrates how
>   the patch eliminates the redundant extensions and does further
>   optimization by disabling mode promotion for pseudos.

I'd still like to see if (and why) this works better than explicitly
promoting QImode and HImode here.  But that can be done later.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/not-promote-mode.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */

Just

/* { dg-do compile { target lp64 } } */

because the rest is already implied by this being in gcc.target/powerpc .

The patch is okay for trunk.  Thank you very much for finding this huge
performance gain!


Segher


[Bug middle-end/61577] [4.9.0 Regression] can't compile on hp-ux v3 ia64

2021-06-03 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577

--- Comment #226 from dave.anglin at bell dot net ---
John, would you please post your full patch set for ia64-hpux?  This will help
others.

Re: GCC documentation: porting to Sphinx

2021-06-03 Thread Joseph Myers
On Thu, 3 Jun 2021, Martin Liška wrote:

> On 6/2/21 6:44 PM, Joseph Myers wrote:
> > On Wed, 2 Jun 2021, Joel Sherrill wrote:
> > 
> > > For RTEMS, we switched from texinfo to Sphinx and the dependency
> > > on Python3 for Sphinx has caused a bit of hassle. Is this going to be
> > > an issue for GCC?
> > 
> > What Sphinx (and, thus, Python) versions does the GCC manual build work
> > with?
> 
> I've just tried version 1.7.6 which we use for libgccjit and it's fine:
> https://gcc.gnu.org/onlinedocs/jit/
> 
> About Python version: I'm not planning supporting Python2, it's dead 10 years
> already.

There should be appropriate configure checks to avoid building manuals 
with too-old versions (i.e. disable the info/man manual build/install when 
Sphinx, or the Python version it's using, is too old or missing, not fail 
configure).

Actually this code is depending on Python 3.6 or later because of the use 
of an f-string in baseconf.py (without that f-string, it works with older 
versions, even 2.7).  Formally 3.5 and older are no longer supported 
upstream, but certainly still present in some maintained long-term-support 
distribution versions.

> I would recommend testing the build. You can simply clone:
> https://github.com/marxin/texi2rst-generated
> 
> and simply run 'make html' or 'make latexpdf'. Basic dependencies are
> mentioned here:
> https://github.com/marxin/texi2rst-generated#requirements

It appears "make html" works (with lots of WARNINGs) with Sphinx 1.6.1 but 
fails with 1.4 ("Theme error: unsupported theme option 
'prev_next_buttons_location' given").

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [RFC/PATCH 00/11] Fix up some unexpected empty split conditions

2021-06-03 Thread Jeff Law via Gcc-patches




On 6/3/2021 2:00 AM, Segher Boessenkool wrote:

Hi!

On Thu, Jun 03, 2021 at 01:22:38PM +0800, Kewen.Lin wrote:

on 2021/6/3 上午7:52, Segher Boessenkool wrote:

- add a new "define_independent_insn_and_split" that has the
   current semantics of define_insn_and_split.  This should be
   mechanical.

I'd rather not have that -- we can just write separate define_insn and
define_split in that case.

Not sure if someone would argue that he/she would like to go with one shared
pattern as before, to avoid any possible differences between two separated
patterns and have good maintainability (like only editing one place) and
slightly better efficiency.

You only would do this if you have a different insn condition and split
condition, which is a very important thing to know, it doesn't hurt to
draw attention to it.  The efficiency is exactly the same btw,
define_insn_and_split is just syntactic sugar.

The whole point of requiring the split condition to start with && is so
it will become harder to mess things up (it will make the gen* code a
tiny little bit simpler as well).  And there is no transition period or
anything like that needed either.  Just the bunch that will break will
need fixing.  So let's find out how many of those there are :-)

Exactly.  While these empty conditions or those not starting with "&&"
are technically valid, they're all suspicious from a port correctness
standpoint, particularly if the main condition is non-empty.


Having made that mistake when converting the H8 away from CC0, I can say
fairly confidently that if we had had this in place a year ago, those
mistakes would likely have been avoided.  Thankfully the H8 isn't a
heavily used port and has limped along until I stumbled over the issue a
week or so ago while polishing some improvements to the port.

Jeff



Re: Update to GCC copyright assignment policy

2021-06-03 Thread Giacomo Tesio
Hi Daniel,

On Thu, 3 Jun 2021 12:50:44 -0400 Daniel Pono Takamori wrote:

> We definitely don't want to see the GCC mailing list derailed into
> discussing this possibly off-topic issue.

To be fair, THIS is the correct mailing list to discuss these
topics, so much so that such a major policy change should have been
proposed and discussed here well before its adoption.

At least, according to https://gcc.gnu.org/lists.html

```
gcc is a high volume list for general development discussions about
GCC. Anything relevant to the development or testing of GCC and not
covered by other mailing lists is suitable for discussion here.

[...]

All major decisions and changes, like abandoning ports or front ends,
should be announced and discussed here. [...]
```

So I think Conservancy could (and should) share its own insights here.


Giacomo


Re: Mailing list reconfiguration: VERP Sender: header affected

2021-06-03 Thread Theodore Papadopoulo

On 6/3/21 5:10 PM, D. Hugh Redelmeier wrote:

| From: Martin Liška 

| Which we recommend in the section Filtering here:
| https://gcc.gnu.org/lists.html

Thanks for the useful information.

That document suggests:
  * ^List-Id: .*<.*.gcc.gnu.org>$

Surely this should be:
  * ^List-Id: .*<.*.gcc\.gnu\.org>$


Or even:

* ^List-Id: .*<.*\.gcc\.gnu\.org>$



[Bug c++/100102] [9/10/11/12 Regression] ICE in tsubst, at cp/pt.c:15310

2021-06-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

--- Comment #16 from Jonathan Wakely  ---
This is an old, latent g++ bug, but it now makes it impossible to use cuda with
GCC (e.g. on Fedora) because of my libstdc++ changes. I will see if I can make
another libstdc++ change to avoid the ICE without having to revert the change
completely, as it was done to fix a non-conformance bug.

[Bug c++/100102] [9/10/11/12 Regression] ICE in tsubst, at cp/pt.c:15310

2021-06-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

--- Comment #15 from Jonathan Wakely  ---
(In reply to Erik Schnetter from comment #6)
> I looked for the string "GCC" in the user header files, but could not find
> any place where things would differ between GCC 10.2 and 10.3. I assume
> there could be a difference in GCC-provided header files (the error message
> mentions "chrono" and "gcd"), or it could be that nvcc examines the GCC
> version and produces different code.

The difference is r10-9609-g14f8307cf4261efd5ee4475b3c7f7c42c48557d6 in the
 header, which is in 10.3 and not 10.2

[Bug c++/100102] [9/10/11/12 Regression] ICE in tsubst, at cp/pt.c:15310

2021-06-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

--- Comment #14 from Jonathan Wakely  ---
cvised reproducer from PR 100448

template  using __bool_constant struct intmax_t;
template  struct ratio {
  template  struct duration {
static intmax_t _S_gcd();
template 
using __is_harmonic =
__bool_constant::den>;
class _Period2 __is_harmonic<_Period2>

[Bug c++/100102] [9/10/11/12 Regression] ICE in tsubst, at cp/pt.c:15310

2021-06-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

Jonathan Wakely  changed:

   What|Removed |Added

 CC||alexander.grund@tu-dresden.
   ||de

--- Comment #13 from Jonathan Wakely  ---
*** Bug 100448 has been marked as a duplicate of this bug. ***

[Bug c++/100448] internal compiler error: Segmentation fault

2021-06-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100448

Jonathan Wakely  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from Jonathan Wakely  ---
dup of PR 100102 then

*** This bug has been marked as a duplicate of bug 100102 ***

[Bug c++/100895] New: gcc accepts invalid template argument in partial template specialization

2021-06-03 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100895

Bug ID: 100895
   Summary: gcc accepts invalid template argument in partial
template specialization
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

template 
using type = int;

template 
constexpr bool value = true;

template 
struct S {}; 

template 
struct S>> {};

S s;

https://godbolt.org/z/GebosdTrM

Re: Update to GCC copyright assignment policy

2021-06-03 Thread Daniel Pono Takamori
I'm joining this list just briefly to give some feedback and input on this
thread on behalf of Software Freedom Conservancy, since we were mentioned
multiple times in this thread.  I suspect any conversation about how
Conservancy and GCC might work together should be off-list or another list,
and I have suggestions on that below.

> > On 2021-06-01 07:28, Mark Wielaard wrote:
> > > If we no longer want the FSF to be the legal guardian and copyright
> > > holder for GCC could we please find another legal entity that performs
> > > that role and helps us as a project with copyleft compliance?

> On Tue, Jun 01, 2021 at 12:58:12PM -0700, Thomas Rodgers wrote:
> > Personally, this would have been my preference.

On Wed, Jun 2, 2021 at 4:18 AM Mark Wielaard  wrote:
> the Conservancy is happy to share their knowledge and discuss policy issues
> with the GCC community if we decide we want their input.

Jason Merrill replied:
>> This seems to me a complement rather than an alternative; some Linux
>> developers use the Conservancy copyleft services while contributing under
>> the DCO, and some GCC developers could do the same.

Jason, we agree completely that anything Conservancy might offer is a
complement rather than a replacement for any structure that the GCC community
already has or might want to build.  For example, the Copyleft Compliance
project that Mark mentioned 
is primarily designed for projects (e.g., BusyBox, Debian, Linux, Samba) that
have diversely-held copyright.  We provide logistical and coordination
support for individuals who hold copyright (and help them figure out how to
keep their own copyrights) and we also accept copyright assignment from those
who prefer assignment.  (As a reminder, Conservancy is not a law firm and we
do not provide legal services and advice.)

Also, note that both these models of copyright (assigning to a single entity,
or having diversely held copyright among both entities and individuals) are
compatible with the DCO in our experience.  The DCO is an assent mechanism
for licensing, and is orthogonal to the question of who holds the copyright.

We would be glad to talk off-list with any GCC developers who have already
decided to keep their own copyright about joining an enforcement coalition at
Conservancy.

The final note that Conservancy would like to share on-list is that through
our ContractPatch initiative , we've been
encouraging individuals to assure that their employment contract does permit
them to keep their own copyrights.  There are many reasons and advantages
for individuals rather than their employers to take control of copylefted
copyrights.  We'd also be glad to discuss those policy benefits with anyone
who is interested off-list.

If you'd like to discuss any of these topics further with Conservancy, may I
suggest the Contract Patch mailing list at: 

We definitely don't want to see the GCC mailing list derailed into
discussing this possibly off-topic issue.

-Pono from Software Freedom Conservancy


[Bug c++/100102] [9/10/11/12 Regression] ICE in tsubst, at cp/pt.c:15310

2021-06-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

Jonathan Wakely  changed:

   What|Removed |Added

 CC||bowie.owens at gmail dot com

--- Comment #12 from Jonathan Wakely  ---
*** Bug 100277 has been marked as a duplicate of this bug. ***

[Bug c++/100277] ICE on cuda host code

2021-06-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100277

Jonathan Wakely  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from Jonathan Wakely  ---
Looks like a dup

*** This bug has been marked as a duplicate of bug 100102 ***

[PATCH] libgcc: Fix _Unwind_Backtrace() for SEH

2021-06-03 Thread Seija K. via Gcc-patches
Forgot to assign to gcc_context.cfa and gcc_context.ra.  Note that this fix can
be backported to earlier versions of GCC as well.

diff --git a/libgcc/unwind-seh.c b/libgcc/unwind-seh.c
index 8c6aade9a3b39..d40d16702a9e1 100644
--- a/libgcc/unwind-seh.c
+++ b/libgcc/unwind-seh.c
@@ -466,6 +466,9 @@ _Unwind_Backtrace(_Unwind_Trace_Fn trace,
_context.disp->HandlerData,
_context.disp->EstablisherFrame, NULL);

+  gcc_context.cfa = ms_context.Rsp;
+  gcc_context.ra = ms_context.Rip;
+
   /* Call trace function.  */
   if (trace (_context, trace_argument) != _URC_NO_REASON)
return _URC_FATAL_PHASE1_ERROR;
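
For context, a trace callback normally reads the frame's values through the accessor functions declared in <unwind.h>, so a context whose cfa and ra fields were never filled in would presumably hand back stale values to the caller. The sketch below is a minimal user of _Unwind_Backtrace, not a test for the SEH path specifically; trace_state, record_frame and count_frames are hypothetical names.

```c
#include <stdint.h>
#include <unwind.h>

/* Hypothetical bookkeeping for illustration: how many frames were
   visited, plus the last instruction pointer and CFA reported.  */
struct trace_state
{
  int frames;
  uintptr_t last_ip;
  uintptr_t last_cfa;
};

/* Trace callback: record what the unwinder reports for this frame.  */
static _Unwind_Reason_Code
record_frame (struct _Unwind_Context *ctx, void *arg)
{
  struct trace_state *st = arg;
  st->last_ip = (uintptr_t) _Unwind_GetIP (ctx);
  st->last_cfa = (uintptr_t) _Unwind_GetCFA (ctx);
  st->frames++;
  return _URC_NO_REASON;	/* keep walking */
}

/* Walk the current call stack and return the number of frames seen.  */
static int
count_frames (void)
{
  struct trace_state st = { 0, 0, 0 };
  _Unwind_Backtrace (record_frame, &st);
  return st.frames;
}
```

Printing last_ip and last_cfa per frame from such a callback would be one way to check by hand that the values change from frame to frame after the fix.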


[RFC][ivopts] Generate better code for IVs with uses outside the loop (was Re: [RFC] Implementing detection of saturation and rounding arithmetic)

2021-06-03 Thread Andre Vieira (lists) via Gcc-patches

Streams got crossed there and used the wrong subject ...

On 03/06/2021 17:34, Andre Vieira (lists) via Gcc-patches wrote:

Hi,

This RFC is motivated by the IV sharing RFC in 
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569502.html and the 
need to have the IVOPTS pass be able to clean up IVs shared between 
multiple loops. When creating a similar problem with C code I noticed 
IVOPTs treated IVs with uses outside the loop differently. This 
didn't even require multiple loops; take for instance the following 
example using SVE intrinsics:


#include <arm_sve.h>
#include <limits.h>
extern void use (char *);
void bar (char * __restrict__ a, char * __restrict__ b,
          char * __restrict__ c, unsigned n)
{
  svbool_t all_true = svptrue_b8 ();
  unsigned i = 0;
  if (n < (UINT_MAX - svcntb() - 1))
    {
      for (; i < n; i += svcntb())
        {
          svuint8_t va = svld1 (all_true, (uint8_t*)a);
          svuint8_t vb = svld1 (all_true, (uint8_t*)b);
          svst1 (all_true, (uint8_t *)c, svadd_z (all_true, va, vb));
          a += svcntb();
          b += svcntb();
          c += svcntb();
        }
    }
  use (a);
}

IVOPTs tends to generate a shared IV for SVE memory accesses, as we 
don't have a post-increment for SVE load/stores. If we had not 
included 'use (a);' in this example, IVOPTs would have replaced the 
IVs for a, b and c with a single one (also used for the 
loop control). See:


   [local count: 955630225]:
  # ivtmp.7_8 = PHI 
  va_14 = MEM  [(unsigned char *)a_10(D) + ivtmp.7_8 * 1];
  vb_15 = MEM  [(unsigned char *)b_11(D) + ivtmp.7_8 * 1];
  _2 = svadd_u8_z ({ -1, ... }, va_14, vb_15);
  MEM <__SVUint8_t> [(unsigned char *)c_12(D) + ivtmp.7_8 * 1] = _2;
  ivtmp.7_25 = ivtmp.7_8 + POLY_INT_CST [16, 16];
  i_23 = (unsigned int) ivtmp.7_25;
  if (n_9(D) > i_23)
    goto ; [89.00%]
  else
    goto ; [11.00%]

However, due to the 'use (a);' it will create two IVs: one for 
loop control, b and c, and one for a. See:


  [local count: 955630225]:
  # a_28 = PHI 
  # ivtmp.7_25 = PHI 
  va_15 = MEM  [(unsigned char *)a_28];
  vb_16 = MEM  [(unsigned char *)b_12(D) + ivtmp.7_25 * 1];
  _2 = svadd_u8_z ({ -1, ... }, va_15, vb_16);
  MEM <__SVUint8_t> [(unsigned char *)c_13(D) + ivtmp.7_25 * 1] = _2;
  a_18 = a_28 + POLY_INT_CST [16, 16];
  ivtmp.7_24 = ivtmp.7_25 + POLY_INT_CST [16, 16];
  i_8 = (unsigned int) ivtmp.7_24;
  if (n_10(D) > i_8)
    goto ; [89.00%]
  else
    goto ; [11.00%]

With the first patch attached in this RFC 'no_cost.patch', I tell 
IVOPTs to not cost uses outside of the loop. This makes IVOPTs 
generate a single IV, but unfortunately it decides to create the 
variable for the use inside the loop and it also seems to use the 
pre-increment value of the shared-IV and add the [16,16] to it. See:


   [local count: 955630225]:
  # ivtmp.7_25 = PHI 
  va_15 = MEM  [(unsigned char *)a_11(D) + ivtmp.7_25 * 1];
  vb_16 = MEM  [(unsigned char *)b_12(D) + ivtmp.7_25 * 1];
  _2 = svadd_u8_z ({ -1, ... }, va_15, vb_16);
  MEM <__SVUint8_t> [(unsigned char *)c_13(D) + ivtmp.7_25 * 1] = _2;
  _8 = (unsigned long) a_11(D);
  _7 = _8 + ivtmp.7_25;
  _6 = _7 + POLY_INT_CST [16, 16];
  a_18 = (char * restrict) _6;
  ivtmp.7_24 = ivtmp.7_25 + POLY_INT_CST [16, 16];
  i_5 = (unsigned int) ivtmp.7_24;
  if (n_10(D) > i_5)
    goto ; [89.00%]
  else
    goto ; [11.00%]

With the patch 'var_after.patch' I make get_computation_aff_1 use 
'cand->var_after' for outside uses thus using the post-increment var 
of the candidate IV. This means I have to insert it in a different 
place and make sure to delete the old use->stmt. I'm sure there is a 
better way to do this using IVOPTs current framework, but I didn't 
find one yet. See the result:


  [local count: 955630225]:
  # ivtmp.7_25 = PHI 
  va_15 = MEM  [(unsigned char *)a_11(D) + ivtmp.7_25 * 1];
  vb_16 = MEM  [(unsigned char *)b_12(D) + ivtmp.7_25 * 1];
  _2 = svadd_u8_z ({ -1, ... }, va_15, vb_16);
  MEM <__SVUint8_t> [(unsigned char *)c_13(D) + ivtmp.7_25 * 1] = _2;
  ivtmp.7_24 = ivtmp.7_25 + POLY_INT_CST [16, 16];
  _8 = (unsigned long) a_11(D);
  _7 = _8 + ivtmp.7_24;
  a_18 = (char * restrict) _7;
  i_6 = (unsigned int) ivtmp.7_24;
  if (n_10(D) > i_6)
    goto ; [89.00%]
  else
    goto ; [11.00%]


This is still not optimal as we are still doing the update inside the 
loop and there is absolutely no need for that. I found that running 
sink would solve it and it seems someone has added a second sink pass, 
so that saves me a third patch :) see after sink2:


   [local count: 955630225]:
  # ivtmp.7_25 = PHI 
  va_15 = MEM  [(unsigned char *)a_11(D) + ivtmp.7_25 * 1];
  vb_16 = MEM  [(unsigned char *)b_12(D) + ivtmp.7_25 * 1];
  _2 = svadd_u8_z ({ -1, ... }, va_15, vb_16);
  MEM <__SVUint8_t> [(unsigned char *)c_13(D) + ivtmp.7_25 * 1] = _2;
  ivtmp.7_24 = ivtmp.7_25 + POLY_INT_CST [16, 16];
  i_6 = (unsigned int) ivtmp.7_24;
  if (i_6 < n_10(D))
    goto ; [89.00%]
  else
    goto ; [11.00%]
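
The end state after sink2 can be illustrated with a scalar analogue. This is only a sketch under stated assumptions: STEP is a fixed 16 standing in for svcntb(), the loop bodies are left empty where the vector loads and stores would go, and both function names are hypothetical. The first function keeps a separate pointer IV because the pointer is used after the loop; the second keeps one shared IV and recomputes the final pointer once, outside the loop.

```c
#include <stddef.h>

/* Assumption for illustration: a fixed step standing in for svcntb().  */
enum { STEP = 16 };

/* Two IVs: 'i' for loop control and 'a' itself, kept only because its
   final value is needed after the loop.  */
static const char *
final_ptr_separate_iv (const char *a, unsigned n)
{
  for (unsigned i = 0; i < n; i += STEP)
    a += STEP;		/* per-iteration pointer update */
  return a;
}

/* One shared IV: memory accesses would be indexed as a[iv], and the
   final pointer is derived from the post-increment IV after the loop.  */
static const char *
final_ptr_shared_iv (const char *a, unsigned n)
{
  size_t iv = 0;
  for (; (unsigned) iv < n; iv += STEP)
    {
      /* loads and stores indexed by iv would go here */
    }
  return a + iv;	/* single update, sunk out of the loop */
}
```

Both functions return the same pointer for any n for which the separate-IV loop terminates; the second form simply has no pointer arithmetic left in the loop body.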

[Bug libstdc++/100770] Incorrect if constexpr statement in ranges::unique_copy

2021-06-03 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100770

--- Comment #2 from Patrick Palka  ---
Fixed on trunk so far by r12-1195:

Author: Patrick Palka 
Date:   Thu, 3 Jun 2021 12:30:29 -0400

libstdc++: Avoid hard error in ranges::unique_copy [PR100770]

Here, in the constexpr if condition within ranges::unique_copy, when
input_iterator<_Out> isn't satisfied we must avoid substituting into
iter_value_t<_Out> because the latter isn't necessarily well-formed
then.  To that end, this patch factors out the condition into a concept
and uses it throughout.

This patch also makes the definition of our testsuite
output_iterator_wrapper more minimal by setting its value_type, pointer
and reference member types to void.  This means our existing tests for
unique_copy already exercise the fix for this bug, so we don't need
to add another test.  The only other fallout of this testsuite iterator
change appears in std/ranges/range.cc, where the use of range_value_t
on a test_output_range is now ill-formed.

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__detail::__can_reread_output):
Factor out this concept from ...
(__unique_copy_fn::operator()): ... here.  Use the concept
throughout.
* testsuite/std/ranges/range.cc: Remove now ill-formed use
of range_value_t on an output_range.
* testsuite/util/testsuite_iterators.h (output_iterator_wrapper):
Define value_type, pointer and reference member types to void.

[Bug libstdc++/100894] New: The std::common_reference implementation seems to be wrong

2021-06-03 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100894

Bug ID: 100894
   Summary: The std::common_reference implementation seems to be
wrong
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

In [meta#trans.other-6.3.1], the standard specifies "If T1 and T2 are reference
types and COMMON-REF(T1, T2) is well-formed, then the member typedef type
denotes that type", where COMMON-REF is defined in [meta#trans.other-3.5]: "If
A and B are both lvalue reference types, COMMON-REF(A, B) is COND-RES(COPYCV(X,
Y) &, COPYCV(Y, X) &) if that type exists and is a reference type."

libstdc++ does not check that COMMON-REF(A, B) must be a reference type, which
will lead to incorrect determination of the common reference type according to
bullet 1 in the following cases.

https://godbolt.org/z/7Mc7jjesK

#include 

struct A {};
struct B { B(A); };
static_assert(
  std::same_as<
  std::common_reference_t, B>);

[RFC] Implementing detection of saturation and rounding arithmetic

2021-06-03 Thread Andre Vieira (lists) via Gcc-patches

Hi,

This RFC is motivated by the IV sharing RFC in 
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569502.html and the 
need to have the IVOPTS pass be able to clean up IVs shared between 
multiple loops. When creating a similar problem with C code I noticed 
IVOPTs treated IVs with uses outside the loop differently. This didn't 
even require multiple loops; take for instance the following example 
using SVE intrinsics:


#include 
#include 
extern void use (char *);
void bar (char * __restrict__ a, char * __restrict__ b,
          char * __restrict__ c, unsigned n)
{
  svbool_t all_true = svptrue_b8 ();
  unsigned i = 0;
  if (n < (UINT_MAX - svcntb() - 1))
    {
      for (; i < n; i += svcntb())
        {
          svuint8_t va = svld1 (all_true, (uint8_t*)a);
          svuint8_t vb = svld1 (all_true, (uint8_t*)b);
          svst1 (all_true, (uint8_t *)c, svadd_z (all_true, va,vb));
          a += svcntb();
          b += svcntb();
          c += svcntb();
        }
    }
  use (a);
}

IVOPTs tends to generate a shared IV for SVE memory accesses, as we 
don't have a post-increment for SVE loads/stores. If we had not included 
'use (a);' in this example, IVOPTs would have replaced the IVs for a, b 
and c with a single one (also used for the loop control). See:


   [local count: 955630225]:
  # ivtmp.7_8 = PHI 
  va_14 = MEM  [(unsigned char *)a_10(D) + ivtmp.7_8 * 1];
  vb_15 = MEM  [(unsigned char *)b_11(D) + ivtmp.7_8 * 1];
  _2 = svadd_u8_z ({ -1, ... }, va_14, vb_15);
  MEM <__SVUint8_t> [(unsigned char *)c_12(D) + ivtmp.7_8 * 1] = _2;
  ivtmp.7_25 = ivtmp.7_8 + POLY_INT_CST [16, 16];
  i_23 = (unsigned int) ivtmp.7_25;
  if (n_9(D) > i_23)
    goto ; [89.00%]
  else
    goto ; [11.00%]

However, due to the 'use (a);' it will create two IVs: one for 
loop control, b and c, and one for a. See:


  [local count: 955630225]:
  # a_28 = PHI 
  # ivtmp.7_25 = PHI 
  va_15 = MEM  [(unsigned char *)a_28];
  vb_16 = MEM  [(unsigned char *)b_12(D) + ivtmp.7_25 * 1];
  _2 = svadd_u8_z ({ -1, ... }, va_15, vb_16);
  MEM <__SVUint8_t> [(unsigned char *)c_13(D) + ivtmp.7_25 * 1] = _2;
  a_18 = a_28 + POLY_INT_CST [16, 16];
  ivtmp.7_24 = ivtmp.7_25 + POLY_INT_CST [16, 16];
  i_8 = (unsigned int) ivtmp.7_24;
  if (n_10(D) > i_8)
    goto ; [89.00%]
  else
    goto ; [11.00%]

With the first patch attached in this RFC 'no_cost.patch', I tell IVOPTs 
to not cost uses outside of the loop. This makes IVOPTs generate a 
single IV, but unfortunately it decides to create the variable for the 
use inside the loop and it also seems to use the pre-increment value of 
the shared-IV and add the [16,16] to it. See:


   [local count: 955630225]:
  # ivtmp.7_25 = PHI 
  va_15 = MEM  [(unsigned char *)a_11(D) + ivtmp.7_25 * 1];
  vb_16 = MEM  [(unsigned char *)b_12(D) + ivtmp.7_25 * 1];
  _2 = svadd_u8_z ({ -1, ... }, va_15, vb_16);
  MEM <__SVUint8_t> [(unsigned char *)c_13(D) + ivtmp.7_25 * 1] = _2;
  _8 = (unsigned long) a_11(D);
  _7 = _8 + ivtmp.7_25;
  _6 = _7 + POLY_INT_CST [16, 16];
  a_18 = (char * restrict) _6;
  ivtmp.7_24 = ivtmp.7_25 + POLY_INT_CST [16, 16];
  i_5 = (unsigned int) ivtmp.7_24;
  if (n_10(D) > i_5)
    goto ; [89.00%]
  else
    goto ; [11.00%]

With the patch 'var_after.patch' I make get_computation_aff_1 use 
'cand->var_after' for outside uses thus using the post-increment var of 
the candidate IV. This means I have to insert it in a different place 
and make sure to delete the old use->stmt. I'm sure there is a better 
way to do this using IVOPTs current framework, but I didn't find one 
yet. See the result:


  [local count: 955630225]:
  # ivtmp.7_25 = PHI 
  va_15 = MEM  [(unsigned char *)a_11(D) + ivtmp.7_25 * 1];
  vb_16 = MEM  [(unsigned char *)b_12(D) + ivtmp.7_25 * 1];
  _2 = svadd_u8_z ({ -1, ... }, va_15, vb_16);
  MEM <__SVUint8_t> [(unsigned char *)c_13(D) + ivtmp.7_25 * 1] = _2;
  ivtmp.7_24 = ivtmp.7_25 + POLY_INT_CST [16, 16];
  _8 = (unsigned long) a_11(D);
  _7 = _8 + ivtmp.7_24;
  a_18 = (char * restrict) _7;
  i_6 = (unsigned int) ivtmp.7_24;
  if (n_10(D) > i_6)
    goto ; [89.00%]
  else
    goto ; [11.00%]


This is still not optimal as we are still doing the update inside the 
loop and there is absolutely no need for that. I found that running sink 
would solve it and it seems someone has added a second sink pass, so 
that saves me a third patch :) see after sink2:


   [local count: 955630225]:
  # ivtmp.7_25 = PHI 
  va_15 = MEM  [(unsigned char *)a_11(D) + ivtmp.7_25 * 1];
  vb_16 = MEM  [(unsigned char *)b_12(D) + ivtmp.7_25 * 1];
  _2 = svadd_u8_z ({ -1, ... }, va_15, vb_16);
  MEM <__SVUint8_t> [(unsigned char *)c_13(D) + ivtmp.7_25 * 1] = _2;
  ivtmp.7_24 = ivtmp.7_25 + POLY_INT_CST [16, 16];
  i_6 = (unsigned int) ivtmp.7_24;
  if (i_6 < n_10(D))
    goto ; [89.00%]
  else
    goto ; [11.00%]

   [local count: 105119324]:
  _8 = (unsigned long) a_11(D);
  _7 = _8 + ivtmp.7_24;
  a_18 = (char * restrict) _7;
  goto ; 
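
For readers who do not have SVE at hand, the before/after shape can be
sketched in scalar C++. This is only an analogue under stated assumptions:
the function names are made up, 'step' stands in for svcntb(), and each
iteration handles 'step' bytes with a plain inner loop instead of a
predicated vector operation. The first function mirrors the source form with
one IV per pointer; the second mirrors the dump after sink2, with a single
shared offset and the final value of 'a' computed once after the loop.

```cpp
#include <cstddef>
#include <cstdint>

// Source form: a, b and c are each bumped inside the loop.
// Assumes step > 0 and buffers padded to a multiple of step.
const std::uint8_t *bump_pointers(const std::uint8_t *a, const std::uint8_t *b,
                                  std::uint8_t *c, unsigned n, unsigned step)
{
  for (unsigned i = 0; i < n; i += step)
    {
      for (unsigned k = 0; k < step; ++k)
        c[k] = static_cast<std::uint8_t>(a[k] + b[k]);
      a += step;
      b += step;
      c += step;
    }
  return a;  // the value 'use (a)' would observe
}

// Shape after sharing the IV and sinking the outside use: one offset
// drives all three accesses and the loop control.
const std::uint8_t *shared_iv(const std::uint8_t *a, const std::uint8_t *b,
                              std::uint8_t *c, unsigned n, unsigned step)
{
  std::size_t ivtmp = 0;  // the single shared IV
  unsigned i = 0;
  while (i < n)
    {
      for (unsigned k = 0; k < step; ++k)
        c[ivtmp + k] = static_cast<std::uint8_t>(a[ivtmp + k] + b[ivtmp + k]);
      ivtmp += step;
      i = static_cast<unsigned>(ivtmp);
    }
  return a + ivtmp;  // computed once, outside the loop
}
```

Both functions write the same bytes to c and return the same pointer; only
the number of values kept live across the loop differs.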

[PATCH] PR libstdc++/98842: Fixed Constraints on operator<=>(optional, U)

2021-06-03 Thread Seija K. via Gcc-patches
The original operator was underconstrained. _Up needs to fulfill
compare_three_way_result,
as mentioned in this bug report
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98842

diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional
index 8b9e038e6e510..9e61c1b2cbfbd 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -1234,7 +1234,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { return !__rhs || __lhs >= *__rhs; }

 #ifdef __cpp_lib_three_way_comparison
-  template
+  template _Up>
 constexpr compare_three_way_result_t<_Tp, _Up>
 operator<=>(const optional<_Tp>& __x, const _Up& __v)
 { return bool(__x) ? *__x <=> __v : strong_ordering::less; }


Fwd: [PUSHED] Skip out on processing __builtin_clz when varying.

2021-06-03 Thread Aldy Hernandez via Gcc-patches
Ping*2

-- Forwarded message -
From: Aldy Hernandez 
Date: Thu, May 13, 2021, 20:02
Subject: Re: [PUSHED] Skip out on processing __builtin_clz when varying.
To: Jakub Jelinek 
Cc: GCC patches 




On 5/12/21 5:08 PM, Jakub Jelinek wrote:
> On Wed, May 12, 2021 at 05:01:00PM -0400, Aldy Hernandez via Gcc-patches wrote:
>>
>>  PR c/100521
>>  * gimple-range.cc (range_of_builtin_call): Skip out on
>>processing __builtin_clz when varying.
>> ---
>>   gcc/gimple-range.cc | 2 +-
>>   gcc/testsuite/gcc.dg/pr100521.c | 8 
>>   2 files changed, 9 insertions(+), 1 deletion(-)
>>   create mode 100644 gcc/testsuite/gcc.dg/pr100521.c
>>
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pr100521.c
>> @@ -0,0 +1,8 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +int
>> +__builtin_clz (int a)
>
> Is this intentional?  People shouldn't be redefining builtins...

Ughhh.  I don't think that's intentional.  For that matter, neither the
current nor the old code is designed to deal with this, especially in this
case when the builtin is being redefined with incompatible arguments.  That
is, the above "builtin" has a signed integer as an argument, whereas the
original builtin had an unsigned one.

In looking at the original vr-values code, I think this could use a
cleanup.  First, ranges from range_of_expr are always numeric so we
should adjust.  Also, the checks for non-zero were assuming the argument
was unsigned, which in the above redefinition it clearly is not.  I've
cleaned this up, so that it works either way, though perhaps we should
_also_ bail on non-builtins.  I don't know...this is before my time.

BTW, I've removed the following annoying idiom:

- int newmini = prec - 1 - wi::floor_log2 (r.upper_bound ());
- if (newmini == prec)

This is really a check for r.upper_bound() == 0, as floor_log2(0)
returns -1.  It's confusing.
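
To spell out the arithmetic, here is a standalone sketch. It does not use
the wide_int API; floor_log2 below is a hypothetical 32-bit stand-in that
follows the convention stated above (-1 for zero), and clz_from_bound is a
made-up helper for the expression "prec - 1 - floor_log2 (bound)".

```cpp
#include <cstdint>

// Hypothetical stand-in for wi::floor_log2 on a 32-bit value: the index
// of the highest set bit, or -1 when x is zero.
int floor_log2(std::uint32_t x)
{
  int log = -1;
  while (x != 0)
    {
      x >>= 1;
      ++log;
    }
  return log;
}

// Leading-zero count implied by a range bound at precision prec.
// For bound == 0 this is prec - 1 - (-1) == prec, which is why the old
// "newmini == prec" test was really a test for an upper bound of zero.
int clz_from_bound(std::uint32_t bound, int prec)
{
  return prec - 1 - floor_log2(bound);
}
```

With prec of 32 and an argument range of [4, 100], the lower bound gives
the result maximum, 32 - 1 - 2 = 29, and the upper bound gives the result
minimum, 32 - 1 - 6 = 25.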

How does this look?  For reference, the original code where this all
came from is 82b6d25d289195.

Thanks for pointing this out.
Aldy
From f8a958e8028ed129558f9ad7ccf423c834d377bd Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Thu, 13 May 2021 13:47:41 -0400
Subject: [PATCH] Cleanup clz and ctz code in range_of_builtin_call.

gcc/ChangeLog:

	* gimple-range.cc (range_of_builtin_call): Cleanup clz and ctz
	code.
---
 gcc/gimple-range.cc | 43 ---
 1 file changed, 20 insertions(+), 23 deletions(-)

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 5b288d8e6a7..b33ba1c8099 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -736,33 +736,29 @@ range_of_builtin_call (range_query , irange , gcall *call)
 	}
 
   query.range_of_expr (r, arg, call);
-  // From clz of minimum we can compute result maximum.
-  if (r.constant_p () && !r.varying_p ())
+  if (!r.undefined_p ())
 	{
-	  int newmaxi = prec - 1 - wi::floor_log2 (r.lower_bound ());
-	  // Argument is unsigned, so do nothing if it is [0, ...] range.
-	  if (newmaxi != prec)
+	  // From clz of minimum we can compute result maximum.
+	  if (wi::gt_p (r.lower_bound (), 0, TYPE_SIGN (r.type (
+	{
+	  maxi = prec - 1 - wi::floor_log2 (r.lower_bound ());
+	  if (mini == -2)
+		mini = 0;
+	}
+	  else if (!range_includes_zero_p ())
 	{
 	  mini = 0;
-	  maxi = newmaxi;
+	  maxi = prec - 1;
 	}
-	}
-  else if (!range_includes_zero_p ())
-	{
-	  maxi = prec - 1;
-	  mini = 0;
-	}
-  if (mini == -2)
-	break;
-  // From clz of maximum we can compute result minimum.
-  if (r.constant_p ())
-	{
-	  int newmini = prec - 1 - wi::floor_log2 (r.upper_bound ());
-	  if (newmini == prec)
+	  if (mini == -2)
+	break;
+	  // From clz of maximum we can compute result minimum.
+	  wide_int max = r.upper_bound ();
+	  int newmini = prec - 1 - wi::floor_log2 (max);
+	  if (max == 0)
 	{
-	  // Argument range is [0, 0].  If CLZ_DEFINED_VALUE_AT_ZERO
-	  // is 2 with VALUE of prec, return [prec, prec], otherwise
-	  // ignore the range.
+	  // If CLZ_DEFINED_VALUE_AT_ZERO is 2 with VALUE of prec,
+	  // return [prec, prec], otherwise ignore the range.
 	  if (maxi == prec)
 		mini = prec;
 	}
@@ -803,7 +799,8 @@ range_of_builtin_call (range_query , irange , gcall *call)
   query.range_of_expr (r, arg, call);
   if (!r.undefined_p ())
 	{
-	  if (r.lower_bound () != 0)
+	  // If arg is non-zero, then use [0, prec - 1].
+	  if (!range_includes_zero_p ())
 	{
 	  mini = 0;
 	  maxi = prec - 1;
-- 
2.31.1



[Bug fortran/86694] gfortran rejects character parameter binding label

2021-06-03 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86694

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kargl at gcc dot gnu.org

--- Comment #3 from kargl at gcc dot gnu.org ---
Adding my text from the duplicate bug report.

Interesting bug.  gfortran tries to reduce the constant
expression to a constant during the parse phase.  This
reduction occurs too early and needs to be moved to
the resolution phase.  In particular, decl.c:8052-8123
need to change/move to someplace in resolve.c where the
host namespace can be resolved to accommodate the import
statement.  Good Luck and happy hacking.

[Bug fortran/86694] gfortran rejects character parameter binding label

2021-06-03 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86694

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||ehlert at thch dot uni-bonn.de

--- Comment #2 from kargl at gcc dot gnu.org ---
*** Bug 100870 has been marked as a duplicate of this bug. ***

[Bug fortran/100870] Constant expression for bind(C) name in interface body not importable

2021-06-03 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100870

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||kargl at gcc dot gnu.org
 Resolution|--- |DUPLICATE

--- Comment #1 from kargl at gcc dot gnu.org ---
Interesting bug.  gfortran tries to reduce the constant
expression to a constant during the parse phase.  This
reduction occurs too early and needs to be moved to
the resolution phase.  In particular, decl.c:8052-8123
need to change/move to someplace in resolve.c where the
host namespace can be resolved to accommodate the import
statement.  Good Luck and happy hacking.

*** This bug has been marked as a duplicate of bug 86694 ***

[Bug c++/100893] Template argument conversion fails for dependant constant function pointer template parameters

2021-06-03 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100893

Patrick Palka  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2021-06-03
 CC||ppalka at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Patrick Palka  ---
Confirmed, this never worked it seems.

Re: [PATCH 04/11] cris: Update unexpected empty split condition

2021-06-03 Thread Hans-Peter Nilsson via Gcc-patches
> From: Kewen.Lin 
> Date: Thu, 3 Jun 2021 07:45:57 +0200

> on 2021/6/2 Hans-Peter Nilsson wrote:
> >> From: Kewen Lin 
> >> Date: Wed, 2 Jun 2021 07:04:54 +0200
> > 
> >> gcc/ChangeLog:
> >>
> >>* config/cris/cris.md (*addi_reload): Fix empty split condition.

> >> -  ""
> >> +  "&& 1"

> > Ok, thanks, if only for all-round consistency.
> > 
> > In preparation for a warning for an empty condition?  I'm
> > usually all for .md-warnings, but I'm not sure about the
> > benefit of that one, though.  Those "&& 1" look...hackish.
> 
> Thanks!  Yeah, the 01/11 patch aims to raise one error message
> for the define_insn_and_split whose split condition is empty
> while insn condition isn't.  In most cases, when we write one
> define_insn_and_split we want the splitting only to take effect
> while we see the define_insn matching happen (insn cond holds),
> but if we leave the split condition empty, the splitting will
> be done always, it could result in some unexpected consequence.
> Mostly this is unintentional.

It certainly was in the patch above!
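
For anyone following along, the difference between "" and "&& 1" can be
modelled in a few lines of C++. This is a sketch only, not the generator
code: effective_split_condition is a made-up name, and it assumes the usual
convention that a split condition starting with "&&" is appended to the insn
condition, while any other split condition is used on its own.

```cpp
#include <string>

// Hypothetical model of the condition that ends up guarding the split
// part of a define_insn_and_split.
std::string effective_split_condition(const std::string &insn_cond,
                                      const std::string &split_cond)
{
  // No leading "&&": the split condition stands alone, so an empty one
  // means the split is always allowed, whatever the insn condition says.
  if (split_cond.compare(0, 2, "&&") != 0)
    return split_cond;

  // Leading "&&": drop it, trim spaces, and join with the insn condition.
  std::string rest = split_cond.substr(2);
  rest.erase(0, rest.find_first_not_of(' '));
  if (insn_cond.empty())
    return rest;
  return "(" + insn_cond + ") && (" + rest + ")";
}
```

So with an insn condition of "TARGET_X" (a placeholder name), "" yields an
empty, always-true split condition, while "&& 1" yields
"(TARGET_X) && (1)".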

>  The error message is to avoid
> people to make it unintentionally.
> 
> As you may have seen from the discussion under the 00/11 thread,
> we will probably end up with some other solution, so I will hold
> the changes for the ports, sorry for wasting your time and the
> other port maintainers'.

No worries: I certainly don't consider it wasted and I'd
prefer to have the patch above committed sooner than the
conclusion of that discussion.  (If you don't get to it,
I'll do it, after a round of testing.)

If you're considering further target patches to adjust for
eventually changed semantics in the define_insn_and_split
split-condition, then whatever trivial patch to cris.md that
gets the effect of the one you sent is preapproved.

Again, thanks.

brgds, H-P


Update to GCC copyright assignment policy

2021-06-03 Thread Christopher Dimech via Gcc



> Sent: Friday, June 04, 2021 at 2:45 AM
> From: "Giacomo Tesio" 
> To: "Jakub Jelinek" 
> Cc: gcc@gcc.gnu.org
> Subject: Re: Update to GCC copyright assignment policy
>
> On Thu, 3 Jun 2021 16:14:15 +0200 Jakub Jelinek wrote:
>
> > Because it makes no sense
>
> A change in the copyright policies and ownership of a project is usually
> seen as a very big change, so much so that the project usually changes its
> whole name, not just its major version.
>
> > doing a GCC release is lots of work and GCC has a
> > roughly yearly release cadence for a reason.
>
> Actually a year of delay on such a policy change would be very welcome.
>
> I would have really appreciated it if the GCC SC had announced such a change
> for the upcoming GCC 12 while sticking to the old policy in GCC 11.
>
> > You can always cherry-pick any changes assigned to FSF from trunk to
> > 11.1 on your own
>
> Sure, I can.
>
> But most users usually download tarballs.
>
> Having the first non-FSF-copyrighted version in a new version would be
> very appreciated by many organizations around the world that prefer
> to have as few legal dependencies as possible.
>
> That's why it's a major change for people downstream!
>
> Giacomo

It all depends on whether the maintainer wants it included.  Has
nothing to do with legal dependencies.  Suppose a person gives
you a free software license, but at a later time changes the
license.  You would still be able to use the code, and distribute
modified copies.  What has to happen is for developers to ask
their employers to issue them with a "Disclaimer of Copyright
Statement".

The major problem is still linked to the reality that many
business administrators have a grasping attitude towards
software, science, and knowledge in general, seeing any activity or
knowledge only as opportunities for unjust income, not as
opportunities to contribute to human knowledge.

Workers today have no rights in the new digital world.

- Christopher Dimech
General Administrator - Naiad Informatics - Gnu Project (Geocomputation)

Society has become too quick to pass judgement and declare someone
Persona Non-Grata, the most extreme form of censure a country can
bestow.

In a new era of destructive authoritarianism, I support Richard
Stallman.  Times of great crisis are also times of great
opportunity.  I call upon you to make this struggle yours as well !

https://stallmansupport.org/
https://www.fsf.org/ https://www.gnu.org/





Re: [PATCH] wwwdocs: Do not rewrite the page titles

2021-06-03 Thread Jonathan Wakely via Gcc-patches

On 03/06/21 16:50 +0100, Jonathan Wakely wrote:

Ping.

Is this OK now?


On 18/04/21 23:45 +0100, Jonathan Wakely wrote:

Remove GNU and FSF attribution from HTML page titles.


I don't see why we should have to "comply with the GNU style" if we're
truly an independent project run by the GCC developers and aided by
the steering committee.

OK for wwwdocs?


An alternative change would be to just drop the mention of the FSF,
since they don't fund GCC or provide hosting etc., as in the attached
patch.

And as I pointed out previously, none of these sites refer to the FSF
in their page titles:

https://www.gnu.org/software/gdb/
https://www.gnu.org/software/libc/
https://www.gnu.org/software/guile/
https://www.gnu.org/software/emacs/
https://guix.gnu.org/

(Some don't refer to the GNU project either.)


diff --git a/htdocs/style.mhtml b/htdocs/style.mhtml
index 677fd05a..1f7139eb 100644
--- a/htdocs/style.mhtml
+++ b/htdocs/style.mhtml
@@ -32,12 +32,12 @@
  
 
 
-;;; Redefine the  tag to comply with the GNU style.
+;;; Redefine the  tag to mention the GNU project.
 
 
 
 %body
-- GNU Project - Free Software Foundation (FSF)
+- GNU Project
 
 
 ;;; Redefine the  tag, adding navigation and a standard footer.


[c-family] Fix duplicate name issues in output of -fdump-ada-spec

2021-06-03 Thread Eric Botcazou
The namespace rules are different in the C family of languages and in Ada, and a
few further adjustments are needed in -fdump-ada-spec because of them.

Tested on x86-64/Linux, applied on the mainline.


2021-06-03  Eric Botcazou  

c-family/
* c-ada-spec.c (dump_ada_enum_type): Dump a prefix for constants.
(htable_t): New typedef.
(overloaded_names): Use it.
(add_name): New function.
(init_overloaded_names): Use add_name to populate the table and add
special cases for sigaction and stat.
(overloaded_name_p): Rename into...
(overloading_index): ...this.  Do not initialize overloaded_names table
here.  Return the index or zero.
(dump_ada_declaration): Minor tweaks.  Do not skip overloaded functions
but add an overloading suffix instead.
(dump_ada_specs): Initialize overloaded_names tables here.
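
The change from overloaded_name_p to overloading_index can be pictured with
a small standalone model. This is only a sketch: it uses a
std::unordered_map keyed by string instead of GCC's hash_table keyed by
identifier, the class name is made up, and only a few of the seeded names
are listed. What it keeps from the patch is the counting: ordinary entries
start at 0, sigaction and stat start at 1, and each lookup returns the
incremented count, or 0 for a name that is not in the table.

```cpp
#include <initializer_list>
#include <string>
#include <unordered_map>

// Hypothetical model of the overloaded_names table.
class overloaded_names_model
{
public:
  overloaded_names_model()
  {
    // A subset of the names seeded by init_overloaded_names.
    for (const char *name : {"memchr", "strchr", "strstr"})
      add_name(name, 0);
    // Treated as already overloaded once by the struct of the same name.
    add_name("sigaction", 1);
    add_name("stat", 1);
  }

  // Add an overloaded NAME with N occurrences.
  void add_name(const std::string &name, unsigned n) { table_[name] = n; }

  // New behaviour: return the running occurrence count, or 0 if NAME is
  // not in the table.
  unsigned overloading_index(const std::string &name)
  {
    auto it = table_.find(name);
    return it != table_.end() ? ++it->second : 0;
  }

private:
  std::unordered_map<std::string, unsigned> table_;
};
```

In this model the first memchr seen gets index 1 and the second gets 2,
whereas the old predicate would only have reported "true" for the second
one; the first stat function already gets index 2.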

-- 
Eric Botcazou

diff --git a/gcc/c-family/c-ada-spec.c b/gcc/c-family/c-ada-spec.c
index 29eb0b01a91..ef0c74c3f08 100644
--- a/gcc/c-family/c-ada-spec.c
+++ b/gcc/c-family/c-ada-spec.c
@@ -2003,7 +2003,15 @@ dump_ada_enum_type (pretty_printer *buffer, tree node, tree type, tree parent,
 	  pp_semicolon (buffer);
 	  newline_and_indent (buffer, spc);
 
+	  if (TYPE_NAME (node))
+	dump_ada_node (buffer, node, NULL_TREE, spc, false, true);
+	  else if (type)
+	dump_ada_node (buffer, type, NULL_TREE, spc, false, true);
+	  else
+	dump_anonymous_type_name (buffer, node, parent);
+	  pp_underscore (buffer);
 	  pp_ada_tree_identifier (buffer, TREE_PURPOSE (value), node, false);
+
 	  pp_string (buffer, " : constant ");
 
 	  if (TYPE_NAME (node))
@@ -2628,11 +2636,31 @@ struct overloaded_name_hasher : delete_ptr_hash
 { return a->name == b->name; }
 };
 
-static hash_table *overloaded_names;
+typedef hash_table htable_t;
+
+static htable_t *overloaded_names;
+
+/* Add an overloaded NAME with N occurrences to TABLE.  */
+
+static void
+add_name (const char *name, unsigned int n, htable_t *table)
+{
+  struct overloaded_name_hash in, *h, **slot;
+  tree id = get_identifier (name);
+  hashval_t hash = htab_hash_pointer (id);
+  in.hash = hash;
+  in.name = id;
+  slot = table->find_slot_with_hash (, hash, INSERT);
+  h = new overloaded_name_hash;
+  h->hash = hash;
+  h->name = id;
+  h->n = n;
+  *slot = h;
+}
 
 /* Initialize the table with the problematic overloaded names.  */
 
-static hash_table *
+static htable_t *
 init_overloaded_names (void)
 {
   static const char *names[] =
@@ -2640,41 +2668,31 @@ init_overloaded_names (void)
   { "memchr", "rawmemchr", "memrchr", "strchr", "strrchr", "strchrnul",
 "strpbrk", "strstr", "strcasestr", "index", "rindex", "basename" };
 
-  hash_table *table
-= new hash_table (64);
+  htable_t *table = new htable_t (64);
 
   for (unsigned int i = 0; i < ARRAY_SIZE (names); i++)
-{
-  struct overloaded_name_hash in, *h, **slot;
-  tree id = get_identifier (names[i]);
-  hashval_t hash = htab_hash_pointer (id);
-  in.hash = hash;
-  in.name = id;
-  slot = table->find_slot_with_hash (, hash, INSERT);
-  h = new overloaded_name_hash;
-  h->hash = hash;
-  h->name = id;
-  h->n = 0;
-  *slot = h;
-}
+add_name (names[i], 0, table);
+
+  /* Consider that sigaction() is overloaded by struct sigaction for QNX.  */
+  add_name ("sigaction", 1, table);
+
+  /* Consider that stat() is overloaded by struct stat for QNX.  */
+  add_name ("stat", 1, table);
 
   return table;
 }
 
-/* Return whether NAME cannot be supported as overloaded name.  */
+/* Return the overloading index of NAME or 0 if NAME is not overloaded.  */
 
-static bool
-overloaded_name_p (tree name)
+static unsigned int
+overloading_index (tree name)
 {
-  if (!overloaded_names)
-overloaded_names = init_overloaded_names ();
-
   struct overloaded_name_hash in, *h;
   hashval_t hash = htab_hash_pointer (name);
   in.hash = hash;
   in.name = name;
   h = overloaded_names->find_with_hash (, hash);
-  return h && ++h->n > 1;
+  return h ? ++h->n : 0;
 }
 
 /* Dump in BUFFER constructor spec corresponding to T for TYPE.  */
@@ -2798,14 +2816,17 @@ dump_ada_declaration (pretty_printer *buffer, tree t, tree type, int spc)
 	}
 
   /* Skip unnamed or anonymous structs/unions/enum types.  */
-  if (!orig && !decl_name && !name
+  if (!orig
 	  && (RECORD_OR_UNION_TYPE_P (TREE_TYPE (t))
-	  || TREE_CODE (TREE_TYPE (t)) == ENUMERAL_TYPE))
+	  || TREE_CODE (TREE_TYPE (t)) == ENUMERAL_TYPE)
+	  && !decl_name
+	  && !name)
 	return 0;
 
-	/* Skip anonymous enum types (duplicates of real types).  */
+  /* Skip duplicates of structs/unions/enum types built in C++.  */
   if (!orig
-	  && TREE_CODE (TREE_TYPE (t)) == ENUMERAL_TYPE
+	  && (RECORD_OR_UNION_TYPE_P (TREE_TYPE (t))
+	  || TREE_CODE (TREE_TYPE (t)) == ENUMERAL_TYPE)
 	  && decl_name
 	  && (*IDENTIFIER_POINTER (decl_name) == '.'
 	  || *IDENTIFIER_POINTER (decl_name) == 
