Re: [PATCH] Ensure `-lmsvcrt` precede `-lkernel32`

2020-05-29 Thread JonY via Gcc-patches
On 5/29/20 2:04 PM, Liu Hao via Gcc-patches wrote:
> On 2020/5/29 22:01, Liu Hao wrote:
>> This is necessary as libmsvcrt.a is not a pure import library, but
>> also contains some functions that invoke others in KERNEL32.DLL.
>>
>>  * config/i386/mingw32.h: Insert -lkernel32 after -lmsvcrt
>> ---
>>  gcc/config/i386/mingw32.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/config/i386/mingw32.h b/gcc/config/i386/mingw32.h
>> index 1bbabfe8bed..321c30e41cc 100644
>> --- a/gcc/config/i386/mingw32.h
>> +++ b/gcc/config/i386/mingw32.h
>> @@ -165,7 +165,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #define REAL_LIBGCC_SPEC \
>>"%{mthreads:-lmingwthrd} -lmingw32 \
>> " SHARED_LIBGCC_SPEC " \
>> -   -lmoldname -lmingwex -lmsvcrt"
>> +   -lmoldname -lmingwex -lmsvcrt -lkernel32"
>>   #undef STARTFILE_SPEC
>>  #define STARTFILE_SPEC "%{shared|mdll:dllcrt2%O%s} \
>>
> 
> This patch originates from this discussion on #mingw-w64 on OFTC:
> 
> ```
> [20:09:50]  there is suddenly an unexpected call to
> `IsDBCSLeadByteEx()` in winpthreads. Not sure why it gets involved.
> [20:13:12] * tchan (~tc...@c-98-220-238-152.hsd1.il.comcast.net) has joined
> [20:22:28]  diff'ing the import tables the previous working
> binary and now broken binary reveals that the old symbol to `printf` is
> gone. seems the mingw-w64 ones is called, which references
> `IsDBCSLeadByteEx()` and `WideCharToMultiByte()`.
> [20:27:19]  both of those should be provided by -lkernel32 right?
> [20:27:36] * Dejan has quit (Quit: Leaving)
> [20:34:09]  probably, but I doubt whether it should behave
> this way.  when perform cross-compilation the CRT is not available when
> building winpthreads.
> [20:34:37]  presumably it should always call the MS one.
> [20:34:45]  I'm pretty sure you'd first build the crt, then
> libraries like winpthreads - the other way around doesn't work
> [20:35:16]  :|  let me make a test program.
> [20:38:38]  can't reproduce it myself.
> [20:41:06]  there may be something wrong with the OP'
> [20:41:18]  's configuration.  Normally kernel32 is a default lib.
> [20:42:39]  I still think winpthreads should be built with
> `CPPFLAGS='-D__USE_MINGW_ANSI_STDIO=0'`. I built a local package and
> there is no reference to DBCS or wide char functions.
> [20:49:21]   reproduced now:
> https://paste.ubuntu.com/p/HwNk8WqgkD/
> [20:49:23]  Title: Ubuntu Pastebin (at paste.ubuntu.com)
> [20:52:45]  strange:  -lmingw32 -lgcc -lgcc_eh -lmoldname
> -lmingwex -lmsvcrt -lpthread -lmcfgthread -ladvapi32 -lshell32 -luser32
> -lkernel32 -lmingw32 -lgcc -lgcc_eh -lmoldname -lmingwex -lmsvcrt
> [20:53:10]  oh.
> [20:53:22]  the mingw-w64 `printf` is defined in mingwex I guess?
> [20:53:22]  pthread pulls in objects from msvcrt, which then needs
> things from kernel32, but there's no more kernel32 after msvcrt
> [20:54:04]  in 211af1e7d4d188dbefacea7af8b83d32b3edb48c I moved
> mbrtowc and wcrtomb from mingwex to msvcrt
> [20:54:37]  (but that wouldn't make a difference wrt this, as
> there's no kernel32 after the first mingwex after -lpthread either)
> [20:55:10]  I think it would be good with yet another -lkernel32
> after -lmsvcrt
> [20:55:27]  after all, that's the way they are layered anyway - the
> crt runs on top of kernel32
> [20:56:32]  and we want to have the freedom to have object file
> implementations in libmsvcrt.a
> [20:58:21]  some of these -l things are hard-coded in GCC
> default specs.
> [20:58:48]  I only found `-ladvapi32 -lshell32 -luser32
> -lkernel32`. The list ends there.
> [20:59:11]  not sure how those additional libraries were added.
> [21:00:04]  lld is nice in this aspect, that it doesn't need static
> libraries ordered like this; for each undefined, it searches the list of
> static libraries from the start
> [21:01:07]  LD is dumb. :(
> [21:02:53]  I thought MSVCRT was only an import library. It
> seems more complicated.
> [21:04:18]  it has (almost) always been more than that - there's
> been some stub functions that call GetProcAddress() and try to
> conditionally load functions if available
> [21:04:53]  and especially with ucrt, we want to move quite a bit
> of things from libmingwex.a to libmsvcrt-os.a, for things where we can
> and should use the ucrt equivalent instead of statically linking in our own
> [21:05:32]  GetProcAddress() requires a successive -lkernel32 too.
> [21:05:45]  indeed
> [21:06:56]  so this suddenly becomes a GCC issue in its
> default specs:  `-lkernel32` is required after `-lmsvcrt`.
> [21:07:33]  yes, pretty much. clang has got the same structure as well
> [21:07:44]  (which matters for cases when using clang on top of ld.bfd)
> [21:08:48]  looks like it's REAL_LIBGCC_SPEC in
> gcc/config/i386/mingw32.h that needs to be updated
> ```
> 
> 
> 

Thanks, pushed to master.
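
For readers following the link-order argument in the log above, here is a minimal C++ sketch of a left-to-right archive scan. It is a toy model, not ld's actual algorithm: the types `Member` and `Library`, the functions `link_in_order` and `unresolved_symbols`, and the tiny symbol tables are all hypothetical and were chosen only to mirror the libraries and symbols named in the discussion.

```cpp
#include <set>
#include <string>
#include <vector>

// One object file inside a static library: the symbols it defines and the
// symbols it references.
struct Member {
  std::set<std::string> defines;
  std::set<std::string> needs;
};

struct Library {
  std::string name;
  std::vector<Member> members;
};

// Toy model of a left-to-right scan: each library is visited once, in
// command-line order, and a member is pulled in only if it defines a symbol
// that is undefined at that moment.  Earlier libraries are never revisited.
// Returns the symbols that are still undefined at the end.
std::set<std::string> link_in_order(std::set<std::string> undefined,
                                    const std::vector<Library> &libs) {
  std::set<std::string> defined;
  for (const Library &lib : libs) {
    bool pulled = true;
    while (pulled) {  // rescan this one library until nothing new is pulled
      pulled = false;
      for (const Member &m : lib.members) {
        bool wanted = false;
        for (const std::string &d : m.defines)
          if (undefined.count(d)) wanted = true;
        if (!wanted) continue;
        for (const std::string &d : m.defines) {
          undefined.erase(d);
          defined.insert(d);
        }
        for (const std::string &n : m.needs)
          if (!defined.count(n)) undefined.insert(n);
        pulled = true;
      }
    }
  }
  return undefined;
}

// Hypothetical, heavily simplified symbol tables for three of the libraries
// named in the log; the real libraries contain far more.
std::set<std::string> unresolved_symbols(bool trailing_kernel32) {
  Library pthread{"pthread", {Member{{"pthread_create"}, {"mbrtowc"}}}};
  Library kernel32{"kernel32", {Member{{"IsDBCSLeadByteEx"}, {}}}};
  Library msvcrt{"msvcrt", {Member{{"mbrtowc"}, {"IsDBCSLeadByteEx"}}}};
  std::vector<Library> line{pthread, kernel32, msvcrt};
  if (trailing_kernel32) line.push_back(kernel32);
  return link_in_order({"pthread_create"}, line);
}
```

In this model `unresolved_symbols(false)` leaves `IsDBCSLeadByteEx` undefined, because kernel32 has already been scanned by the time the msvcrt object is pulled in, while `unresolved_symbols(true)` resolves everything. That is the effect of appending -lkernel32 after -lmsvcrt.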





Re: [PATCH] Port libgccjit to Windows.

2020-05-29 Thread JonY via Gcc-patches
On 5/28/20 8:46 PM, David Malcolm via Gcc-patches wrote:
>>> I was able to successfully bootstrap and regression test with
>>> your patch on x86_64-pc-linux-gnu.  I also verified that the
>>> result of "make install" was not affected for my configuration.
>> 
>> Great.
>> 
>>> I've pushed your patch to master as 
>>> c83027f32d9cca84959c7d6a1e519a0129731501.
>>> 
>>> Thanks again for the patch Dave
>> 
>> Thanks to you for all the good feedback.
>> 
>> Nico.
> 

Hello,

A bit of a late review, some minor points:

1. Using .so on Windows for DLLs is fine.
2. The DLL name on Windows should use LIBGCCJIT_SONAME rather than
LIBGCCJIT_LINKER_NAME, so applications would load libgccjit.so.0 instead
of libgccjit.so directly. The linker command output needs to be 
LIBGCCJIT_SONAME.
3. Ideally I would prefer .cc too, though I see other C++ files are also
written as .c.





Re: [PATCH] c++: premature requires-expression folding [PR95020]

2020-05-29 Thread Patrick Palka via Gcc-patches
On Wed, 13 May 2020, Jason Merrill wrote:

> On 5/11/20 6:43 PM, Patrick Palka wrote:
> > In the testcase below we're prematurely folding away the
> > requires-expression to 'true' after substituting in the function's
> > template arguments, but before substituting in the lambda's deduced
> > template arguments.
> > 
> > This happens because during the first tsubst_requires_expr,
> > processing_template_decl is 1 but 'args' is just {void} and therefore
> > non-dependent, so we end up folding away the requires-expression to
> > boolean_true_node before we could substitute in the lambda's template
> > arguments and determine that '*v' is ill-formed.
> > 
> > This patch removes the uses_template_parms check when deciding in
> > tsubst_requires_expr whether to keep around a new requires-expression.
> > Regardless of whether the template arguments are dependent, there still
> > might be more template parameters to later substitute in -- as in the
> > testcase below -- and even if not, tsubst_expr doesn't perform full
> > semantic processing unless !processing_template_decl, so it seems we
> > should wait until then to fold away the requires-expression.
> > 
> > Passes 'make check-c++', does this look OK to commit after a full
> > bootstrap/regtest?
> 
> OK.

Would the same patch be OK to backport to the GCC 10 branch?

> 
> > gcc/cp/ChangeLog:
> > 
> > PR c++/95020
> > * constraint.c (tsubst_requires_expr): Produce a new
> > requires-expression when processing_template_decl, even if
> > template arguments are not dependent.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/95020
> > * g++/cpp2a/concepts-lambda7.C: New test.
> > ---
> >   gcc/cp/constraint.cc  |  4 +---
> >   gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C | 14 ++
> >   2 files changed, 15 insertions(+), 3 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C
> > 
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index 4ad17f3b7d8..8ee347cae60 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -2173,9 +2173,7 @@ tsubst_requires_expr (tree t, tree args,
> > if (reqs == error_mark_node)
> >   return boolean_false_node;
> >   -  /* In certain cases, produce a new requires-expression.
> > - Otherwise the value of the expression is true.  */
> > -  if (processing_template_decl && uses_template_parms (args))
> > +  if (processing_template_decl)
> >   return finish_requires_expr (cp_expr_location (t), parms, reqs);
> >   return boolean_true_node;
> > diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C
> > b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C
> > new file mode 100644
> > index 000..50746b777a3
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C
> > @@ -0,0 +1,14 @@
> > +// PR c++/95020
> > +// { dg-do compile { target c++2a } }
> > +
> > +template
> > +void foo() {
> > +  auto t = [](auto v) {
> > +static_assert(requires { *v; }); // { dg-error "static assertion
> > failed" }
> > +  };
> > +  t(0);
> > +}
> > +
> > +void bar() {
> > +  foo();
> > +}
> > 
> 
> 
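
To make the expected behaviour concrete, here is a small sketch of the same shape as the testcase: a requires-expression inside a generic lambda inside a function template. It is not the committed test; the name `deref_is_valid` and the extra `Arg` parameter are assumptions added so that both outcomes can be checked, and the block is guarded on the standard `__cpp_concepts` feature-test macro so that it is skipped where concepts are unavailable.

```cpp
#if defined(__cpp_concepts) && __cpp_concepts >= 201907L

// T plays the role of the outer template argument (void in the report);
// Arg is a hypothetical addition so the lambda can be called with
// different argument types.
template <typename T, typename Arg>
constexpr bool deref_is_valid(Arg arg) {
  // The requires-expression depends on the lambda's own deduced parameter
  // type, so it can only be evaluated once the lambda is called.
  auto t = [](auto v) { return requires { *v; }; };
  return t(arg);
}

// An int cannot be dereferenced, so the requirement is not satisfied.
static_assert(!deref_is_valid<void>(0), "*v is ill-formed for int");
// A pointer can be; the operand is unevaluated, so nullptr is harmless.
static_assert(deref_is_valid<void>(static_cast<int *>(nullptr)),
              "*v is well-formed for int*");

#endif
```

A compiler that folds the requires-expression to true as soon as the outer arguments are substituted would report true in both cases, which is the premature folding described above.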



Re: [PATCH] c++: P0848R3 and member function templates [PR95181]

2020-05-29 Thread Patrick Palka via Gcc-patches
On Fri, 29 May 2020, Jason Merrill wrote:

> On 5/22/20 10:56 AM, Patrick Palka wrote:
> > On Fri, 22 May 2020, Patrick Palka wrote:
> > 
> > > When comparing two special member function templates to see if one hides
> > > the other (as per P0848R3), we need to check satisfaction which we can't
> > > do on templates.  So this patch makes add_method skip the eligibility
> > > test on member function templates and just lets them coexist.
> > 
> > It just occurred to me that this problem isn't limited to member function
> > templates.  Consider this valid testcase which we currently reject:
> > 
> >  template struct g {
> >g() requires B && false;
> >g() requires B;
> >  };
> > 
> >  g b; // error
> > 
> > During add_method, we check satisfaction of both default constructors,
> > and since their constraints are dependent, constraints_satisfied_p
> > returns true for both sets of constraints.  We then see that
> > 'B && false' is more constrained than 'B' and therefore discard the second
> > constructor.  Since we discarded the second default constructor at
> > definition time, the instantiation g has no eligible default
> > constructor.
> > 
> > I am not sure what to do from here...
> 
> I looked at this and it seems enough to let the functions coexist without
> trying to compare their constraints if processing_template_decl; we'll handle
> the hiding properly when the class template is instantiated.
> 
> So this is what I'm committing:

Aha, that makes sense now, thanks!  Somehow I convinced myself that
during class template instantiation we don't call add_method, so I
instead tried to remove that block of code from add_method altogether.
Needless to say that didn't work very well..
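
For reference, the class-template case can be checked with a sketch like the following. The template header is an assumption (a `bool` non-type parameter named `B`, which is what the constraints imply), and the names `g_true_ok` and `g_false_ok` are made up for this illustration. The block is guarded on the `__cpp_concepts` feature-test macro, in the same spirit as the `{ target concepts }` selector in the testsuite.

```cpp
#include <type_traits>

#if defined(__cpp_concepts) && __cpp_concepts >= 201907L

// Assumption: B is a bool non-type template parameter.
template <bool B>
struct g {
  g() requires B && false;  // can never be satisfied
  g() requires B;           // satisfied exactly when B is true
};

// Satisfaction is checked per specialization, i.e. only once B is known.
constexpr bool g_true_ok = std::is_default_constructible_v<g<true>>;
constexpr bool g_false_ok = std::is_default_constructible_v<g<false>>;

static_assert(g_true_ok, "g<true> should use the `requires B` constructor");
static_assert(!g_false_ok, "g<false> has no satisfied default constructor");

#endif
```

The constructors are only declared, which is enough here because the type trait is evaluated in an unevaluated context.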



Re: [PATCH, committed, part2] PR fortran/95090 - ICE: identifier overflow

2020-05-29 Thread Jakub Jelinek via Gcc-patches
On Sat, May 30, 2020 at 12:20:19AM +0200, Harald Anlauf wrote:
> > Sent: Friday, 29 May 2020 at 23:57
> > From: "H.J. Lu"
> 
> > This breaks bootstrap:
> > 
> > https://gcc.gnu.org/pipermail/gcc-regression/2020-May/072642.html
> > 
> > ../../src-master/gcc/fortran/class.c:487:13: error: ‘char*
> > strncpy(char*, const char*, size_t)’ specified bound 67 equals
> > destination size [-Werror=stringop-truncation]
> >   487 | strncpy (dt_name, gfc_dt_upper_string (derived->name),
> > sizeof (dt_name));
> 
> what is the right way to deal with that?
> 
> I haven't seen any use of strlcpy in the gcc sources.  This would do
> the right thing and would fit here.
> 
> So should one use the clumsy way:
> 
>strncpy(buf, str, buflen - 1);
>if (buflen > 0)
>buf[buflen - 1]= '\0';
> 
> Or is there something more convenient that keeps gcc happy?

Depends on what exactly you want to achieve, but strncpy is pretty much
always the wrong answer.
Do you really want to clear the rest of buf when the gfc_dt_upper_string
result is much shorter?  That part of strncpy's behavior is only seldom
useful.  The other problem is that it doesn't zero terminate on truncation.
I'd suggest e.g. remembering the gfc_dt_upper_string result in a temporary,
computing len = strnlen (res, sizeof (dt_name) - 1);
and then you can just memcpy and zero terminate with buf[len] = '\0';
Or do you ever want to truncate?

Jakub
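
The strnlen plus memcpy approach suggested above could look roughly like this. It is a sketch only: the helper name `copy_truncating` and its signature are assumptions for illustration, not code from the GCC sources, and `strnlen` is a POSIX function rather than ISO C, so `<string.h>` is included for it.

```cpp
#include <cstddef>
#include <string.h>  // strnlen (POSIX), memcpy

// Copy SRC into DST, which has room for DST_SIZE bytes, truncating if
// necessary and always NUL-terminating when DST_SIZE is nonzero.  Unlike
// strncpy, the rest of the buffer is left untouched.  Returns the number
// of characters copied, not counting the terminator.
std::size_t copy_truncating(char *dst, std::size_t dst_size, const char *src) {
  if (dst_size == 0)
    return 0;  // no room even for the terminator
  std::size_t len = strnlen(src, dst_size - 1);
  memcpy(dst, src, len);
  dst[len] = '\0';
  return len;
}
```

With a four-byte buffer, copying "abcdef" stores "abc" and returns 3, and copying "hi" stores "hi" and returns 2.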



[PATCH] rs6000: Prefer VSX insns over VMX ones (part 1: perm and mrg)

2020-05-29 Thread Segher Boessenkool
There are various VSX insns that do the same job as (older) AltiVec
insns, just with a wider range of possible registers.  Many patterns
for such insns have the "v" alternative before the "wa" alternative,
which makes the output less readable than possible (since vs32 is v0,
and most insns before or after this insn will be VSX as well).

This changes the define_insns for the mrg and perm machine instructions
to prefer the VSX form.  No behaviour change.  Only one testcase needed
a little adjustment as well.

Tested on powerpc64-linux {-m32,-m64}.  Applying to trunk.


Segher


2020-05-29  Segher Boessenkool  

* config/rs6000/altivec.md (altivec_vmrghw_direct): Prefer VSX form.
(altivec_vmrglw_direct): Ditto.
(altivec_vperm__direct): Ditto.
(altivec_vperm_v8hiv16qi): Ditto.
(*altivec_vperm__uns_internal): Ditto.
(*altivec_vpermr__internal): Ditto.
(vperm_v8hiv4si): Ditto.
(vperm_v16qiv8hi): Ditto.

testsuite/
* gcc.target/powerpc/vsx-vector-6.p9.c: Allow xxperm as perm as well.

---
 gcc/config/rs6000/altivec.md   | 104 ++---
 gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p9.c |   2 +-
 2 files changed, 53 insertions(+), 53 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 792ca4f..159f24e 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1216,14 +1216,14 @@ (define_insn "*altivec_vmrghw_internal"
   [(set_attr "type" "vecperm")])
 
 (define_insn "altivec_vmrghw_direct"
-  [(set (match_operand:V4SI 0 "register_operand" "=v,wa")
-   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v,wa")
- (match_operand:V4SI 2 "register_operand" "v,wa")]
+  [(set (match_operand:V4SI 0 "register_operand" "=wa,v")
+   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "wa,v")
+ (match_operand:V4SI 2 "register_operand" "wa,v")]
 UNSPEC_VMRGH_DIRECT))]
   "TARGET_ALTIVEC"
   "@
-   vmrghw %0,%1,%2
-   xxmrghw %x0,%x1,%x2"
+   xxmrghw %x0,%x1,%x2
+   vmrghw %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
 (define_insn "*altivec_vmrghsf"
@@ -1364,14 +1364,14 @@ (define_insn "*altivec_vmrglw_internal"
   [(set_attr "type" "vecperm")])
 
 (define_insn "altivec_vmrglw_direct"
-  [(set (match_operand:V4SI 0 "register_operand" "=v,wa")
-   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v,wa")
- (match_operand:V4SI 2 "register_operand" "v,wa")]
+  [(set (match_operand:V4SI 0 "register_operand" "=wa,v")
+   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "wa,v")
+ (match_operand:V4SI 2 "register_operand" "wa,v")]
 UNSPEC_VMRGL_DIRECT))]
   "TARGET_ALTIVEC"
   "@
-   vmrglw %0,%1,%2
-   xxmrglw %x0,%x1,%x2"
+   xxmrglw %x0,%x1,%x2
+   vmrglw %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
 (define_insn "*altivec_vmrglsf"
@@ -2193,30 +2193,30 @@ (define_expand "altivec_vperm_"
 
 ;; Slightly prefer vperm, since the target does not overlap the source
 (define_insn "altivec_vperm__direct"
-  [(set (match_operand:VM 0 "register_operand" "=v,?wa")
-   (unspec:VM [(match_operand:VM 1 "register_operand" "v,wa")
-   (match_operand:VM 2 "register_operand" "v,0")
-   (match_operand:V16QI 3 "register_operand" "v,wa")]
+  [(set (match_operand:VM 0 "register_operand" "=?wa,v")
+   (unspec:VM [(match_operand:VM 1 "register_operand" "wa,v")
+   (match_operand:VM 2 "register_operand" "0,v")
+   (match_operand:V16QI 3 "register_operand" "wa,v")]
   UNSPEC_VPERM))]
   "TARGET_ALTIVEC"
   "@
-   vperm %0,%1,%2,%3
-   xxperm %x0,%x1,%x3"
+   xxperm %x0,%x1,%x3
+   vperm %0,%1,%2,%3"
   [(set_attr "type" "vecperm")
-   (set_attr "isa" "*,p9v")])
+   (set_attr "isa" "p9v,*")])
 
 (define_insn "altivec_vperm_v8hiv16qi"
-  [(set (match_operand:V16QI 0 "register_operand" "=v,?wa")
-   (unspec:V16QI [(match_operand:V8HI 1 "register_operand" "v,wa")
-  (match_operand:V8HI 2 "register_operand" "v,0")
-  (match_operand:V16QI 3 "register_operand" "v,wa")]
+  [(set (match_operand:V16QI 0 "register_operand" "=?wa,v")
+   (unspec:V16QI [(match_operand:V8HI 1 "register_operand" "wa,v")
+  (match_operand:V8HI 2 "register_operand" "0,v")
+  (match_operand:V16QI 3 "register_operand" "wa,v")]
   UNSPEC_VPERM))]
   "TARGET_ALTIVEC"
   "@
-   vperm %0,%1,%2,%3
-   xxperm %x0,%x1,%x3"
+   xxperm %x0,%x1,%x3
+   vperm %0,%1,%2,%3"
   [(set_attr "type" "vecperm")
-   (set_attr "isa" "*,p9v")])
+   (set_attr "isa" "p9v,*")])
 
 (define_expand "altivec_vperm__uns"
   [(set (match_operand:VM 0 "register_operand")
@@ -2234,17 +2234,17 @@ (define_expand "altivec_vperm__uns"
 })
 
 (define_insn "*altivec_vperm__uns_internal"
-  [(set (match_operand:VM 0 

Re: [PATCH, committed, part2] PR fortran/95090 - ICE: identifier overflow

2020-05-29 Thread H.J. Lu via Gcc-patches
On Fri, May 29, 2020 at 3:20 PM Harald Anlauf  wrote:
>
> Hi H.J.,
>
> > Sent: Friday, 29 May 2020 at 23:57
> > From: "H.J. Lu"
>
> > This breaks bootstrap:
> >
> > https://gcc.gnu.org/pipermail/gcc-regression/2020-May/072642.html
> >
> > ../../src-master/gcc/fortran/class.c:487:13: error: ‘char*
> > strncpy(char*, const char*, size_t)’ specified bound 67 equals
> > destination size [-Werror=stringop-truncation]
> >   487 | strncpy (dt_name, gfc_dt_upper_string (derived->name),
> > sizeof (dt_name));
>
> what is the right way to deal with that?
>
> I haven't seen any use of strlcpy in the gcc sources.  This would do
> the right thing and would fit here.
>
> So should one use the clumsy way:
>
>strncpy(buf, str, buflen - 1);
>if (buflen > 0)
>buf[buflen - 1]= '\0';
>
> Or is there something more convenient that keeps gcc happy?

dt_name[sizeof (dt_name) - 1] = '\0';
strncpy (dt_name, gfc_dt_upper_string (derived->name), sizeof (dt_name) - 1);


-- 
H.J.


Re: [PATCH, committed, part2] PR fortran/95090 - ICE: identifier overflow

2020-05-29 Thread Harald Anlauf
Hi H.J.,

> Sent: Friday, 29 May 2020 at 23:57
> From: "H.J. Lu"

> This breaks bootstrap:
> 
> https://gcc.gnu.org/pipermail/gcc-regression/2020-May/072642.html
> 
> ../../src-master/gcc/fortran/class.c:487:13: error: ‘char*
> strncpy(char*, const char*, size_t)’ specified bound 67 equals
> destination size [-Werror=stringop-truncation]
>   487 | strncpy (dt_name, gfc_dt_upper_string (derived->name),
> sizeof (dt_name));

what is the right way to deal with that?

I haven't seen any use of strlcpy in the gcc sources.  This would do
the right thing and would fit here.

So should one use the clumsy way:

   strncpy(buf, str, buflen - 1);
   if (buflen > 0)
   buf[buflen - 1]= '\0';

Or is there something more convenient that keeps gcc happy?

Harald



Re: [PATCH] c++: P0848R3 and member function templates [PR95181]

2020-05-29 Thread Jason Merrill via Gcc-patches

On 5/22/20 10:56 AM, Patrick Palka wrote:

On Fri, 22 May 2020, Patrick Palka wrote:


When comparing two special member function templates to see if one hides
the other (as per P0848R3), we need to check satisfaction which we can't
do on templates.  So this patch makes add_method skip the eligibility
test on member function templates and just lets them coexist.


It just occurred to me that this problem isn't limited to member function
templates.  Consider this valid testcase which we currently reject:

 template struct g {
   g() requires B && false;
   g() requires B;
 };

 g b; // error

During add_method, we check satisfaction of both default constructors,
and since their constraints are dependent, constraints_satisfied_p
returns true for both sets of constraints.  We then see that
'B && false' is more constrained than 'B' and therefore discard the second
constructor.  Since we discarded the second default constructor at
definition time, the instantiation g has no eligible default
constructor.

I am not sure what to do from here...


I looked at this and it seems enough to let the functions coexist 
without trying to compare their constraints if processing_template_decl; 
we'll handle the hiding properly when the class template is instantiated.


So this is what I'm committing:
commit c591bb815e359a6f5c5ef5a9c9f9a7a646a99ea0
Author: Patrick Palka 
Date:   Fri May 22 10:28:19 2020 -0400

c++: P0848R3 and member function templates [PR95181]

When comparing two special member function templates to see if one hides
the other (as per P0848R3), we need to check satisfaction which we can't
do on templates.  So this patch makes add_method skip the eligibility
test on member function templates and just lets them coexist.

gcc/cp/ChangeLog:

PR c++/95181
* class.c (add_method): Let special member function templates
coexist if they are not equivalently constrained, or in a class
template.

gcc/testsuite/ChangeLog:

PR c++/95181
* g++.dg/concepts/pr95181.C: New test.
* g++.dg/concepts/pr95181-2.C: New test.

Co-authored-by: Jason Merrill 

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index ca492cdbd40..c818826a108 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -1081,12 +1081,19 @@ add_method (tree type, tree method, bool via_using)
 	{
   if (!equivalently_constrained (fn, method))
 	{
+	  if (processing_template_decl)
+		/* We can't check satisfaction in dependent context, wait until
+		   the class is instantiated.  */
+		continue;
+
 	  special_function_kind sfk = special_memfn_p (method);
 
-	  if (sfk == sfk_none || DECL_INHERITED_CTOR (fn))
-		/* Non-special member functions coexist if they are not
-		   equivalently constrained.  A member function is not hidden
-		   by an inherited constructor.  */
+	  if (sfk == sfk_none
+		  || DECL_INHERITED_CTOR (fn)
+		  || TREE_CODE (fn) == TEMPLATE_DECL)
+		/* Member function templates and non-special member functions
+		   coexist if they are not equivalently constrained.  A member
+		   function is not hidden by an inherited constructor.  */
 		continue;
 
 	  /* P0848: For special member functions, deleted, unsatisfied, or
diff --git a/gcc/testsuite/g++.dg/concepts/pr95181-2.C b/gcc/testsuite/g++.dg/concepts/pr95181-2.C
new file mode 100644
index 000..6d67350e58f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/pr95181-2.C
@@ -0,0 +1,8 @@
+// { dg-do compile { target concepts } }
+
+template struct g {
+  g() requires B && false;
+  g() requires B;
+};
+
+g b; // error
diff --git a/gcc/testsuite/g++.dg/concepts/pr95181.C b/gcc/testsuite/g++.dg/concepts/pr95181.C
new file mode 100644
index 000..0185c86b438
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/pr95181.C
@@ -0,0 +1,9 @@
+// PR c++/95181
+// { dg-do compile { target concepts } }
+
+template  struct f {
+  template  f();
+  template  requires false f();
+};
+
+f a;


[pushed] c++: Template template parameter in constraint [PR95371]

2020-05-29 Thread Jason Merrill via Gcc-patches
any_template_parm_r was assuming that the DECL_TEMPLATE_RESULT of a template
will have a suitable TEMPLATE_INFO from which we can look at the generic
arguments for that template.  But that wasn't true for a template template
parameter; this patch makes it so.

Tested x86_64-pc-linux-gnu, applying to trunk and 10.

gcc/cp/ChangeLog:

PR c++/95371
* pt.c (process_template_parm): Set DECL_TEMPLATE_INFO
on the DECL_TEMPLATE_RESULT.

gcc/testsuite/ChangeLog:

PR c++/95371
* g++.dg/cpp2a/concepts-ttp1.C: New test.
---
 gcc/cp/pt.c| 11 ++-
 gcc/testsuite/g++.dg/cpp2a/concepts-ttp1.C | 16 
 2 files changed, 26 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ttp1.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 90dafff3aa7..df647af7b46 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -4575,7 +4575,16 @@ process_template_parm (tree list, location_t parm_loc, 
tree parm,
  /* This is for distinguishing between real templates and template
 template parameters */
  TREE_TYPE (parm) = t;
- TREE_TYPE (DECL_TEMPLATE_RESULT (parm)) = t;
+
+ /* any_template_parm_r expects to be able to get the targs of a
+DECL_TEMPLATE_RESULT.  */
+ tree result = DECL_TEMPLATE_RESULT (parm);
+ TREE_TYPE (result) = t;
+ tree args = template_parms_to_args (DECL_TEMPLATE_PARMS (parm));
+ tree tinfo = build_template_info (parm, args);
+ retrofit_lang_decl (result);
+ DECL_TEMPLATE_INFO (result) = tinfo;
+
  decl = parm;
}
   else
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ttp1.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp1.C
new file mode 100644
index 000..3f6eb35cf61
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp1.C
@@ -0,0 +1,16 @@
+// PR c++/95371
+// { dg-do compile { target c++20 } }
+
+template 
+struct configuration  {
+template  typename query_t>
+static constexpr bool exists() { return true; }
+
+template  typename query_t>
+void remove() requires(exists());
+};
+
+int main() {
+configuration<> cfg{};
+cfg.remove();
+}

base-commit: 33e23881aae0549572cc23a2520c5094a2ffede9
-- 
2.18.1



Re: [PATCH] rs6000: PR target/95347 Correctly identify stfs if prefixed

2020-05-29 Thread Segher Boessenkool
Hi!

Re: [PATCH] rs6000: PR target/95347 Correctly identify stfs if prefixed
Please put the PR id at the end of the subject (it is the least
important information).  You can also shorten it to "PR95347" -- total
subject length ideally is maybe 50 chars, so something like
"rtl-optimization" would be extremely long, for no good reason.

On Fri, May 29, 2020 at 04:31:09PM -0500, Aaron Sawdey via Gcc-patches wrote:
> Because reg_to_non_prefixed() only looks at the register being used, it
> doesn't get the right answer for stfs, which leads to us not seeing
> that it has a PCREL symbol ref.  This patch works around this by
> introducing a helper function that inspects the insn to see if it is in
> fact a stfs. Then if we use NON_PREFIXED_DEFAULT, address_to_insn_form()
> can see that it has the PCREL symbol ref.

> +/* Helper function to see if we're potentially looking at stfs that
> +   could be pstfs.  */

"That could be pstfs" is only confusing here, I think?  It has nothing
to do with this function itself.

"Return true if INSN is a "movsi_from_sf" to memory"?

> +static bool
> +is_stfs_insn (rtx_insn *insn)
> +{
> +  rtx pattern=PATTERN (insn);

Spaces on both sides of binary operators (like "=").

> +  if (GET_CODE (pattern) != PARALLEL)
> +return false;
> +
> +  /* This should be a parallel with exactly one set and one clobber.  */

You could simplify this: it has to be a parallel of exactly two things,
the first a SET, the second a CLOBBER?

> +  int i;
> +  rtx set=NULL, clobber=NULL;
> +  for (i = 0; i < XVECLEN (pattern, 0); i++)

  rtx set = NULL;
  rtx clobber = NULL;
  for (int i = 0; i < XVECLEN (pattern, 0); i++)

(Declarations with initialiser go on a line by their own; "i" doesn't
need declaring before the loop).

> +  /* All we care is that the destination of the SET is a mem:SI,
> + the source should be an UNSPEC_SI_FROM_SF, and the clobber
> + should be a scratch:V4SF.  */
> +
> +  rtx dest = XEXP (set, 0);

rtx dest = SET_DEST (set);

> +  rtx src = XEXP (set, 1);

rtx src = SET_SRC (set);

> +  rtx scratch = XEXP (clobber, 0);

rtx scratch = SET_DEST (clobber);

> @@ -25119,8 +25171,14 @@ prefixed_store_p (rtx_insn *insn)
>  return false;
>  
>machine_mode mem_mode = GET_MODE (mem);
> +  rtx addr = XEXP (mem, 0);
>enum non_prefixed_form non_prefixed = reg_to_non_prefixed (reg, mem_mode);
> -  return address_is_prefixed (XEXP (mem, 0), mem_mode, non_prefixed);
> +  /* Need to make sure we aren't looking at a stfs which doesn't
> + looking like the other things that we are looking for.  */

s/looking/look/ I guess?

> +  if (non_prefixed == NON_PREFIXED_X && is_stfs_insn (insn))
> +return address_is_prefixed (addr, mem_mode, NON_PREFIXED_DEFAULT);
> +  else
> +return address_is_prefixed (addr, mem_mode, non_prefixed);

Rest looks fine :-)  Okay for trunk with the nits fixed and the
suggestions looked at.  Also okay for 10, if wanted there?

Thanks!


Segher
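
The simplification suggested above for the PARALLEL check (exactly two elements, the first a SET and the second a CLOBBER) can be shown on a stand-in data structure. The `Code` enum, the `Node` struct and `is_set_then_clobber` below are hypothetical and are not GCC's rtx types or accessors; the sketch only shows how the shape test replaces the loop.

```cpp
#include <vector>

// Stand-in types only; these are NOT GCC's rtx structures.
enum class Code { Parallel, Set, Clobber, Other };

struct Node {
  Code code;
  std::vector<Node> elems;  // children; used here only for Parallel
};

// True if NODE is a parallel of exactly two elements: a set, then a clobber.
bool is_set_then_clobber(const Node &node) {
  return node.code == Code::Parallel
         && node.elems.size() == 2
         && node.elems[0].code == Code::Set
         && node.elems[1].code == Code::Clobber;
}
```

Compared with the loop in the patch, this form also rejects a parallel that lists the clobber before the set, which matches the "first a SET, second a CLOBBER" wording.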


Re: [PATCH, committed, part2] PR fortran/95090 - ICE: identifier overflow

2020-05-29 Thread H.J. Lu via Gcc-patches
On Fri, May 29, 2020 at 1:39 PM Harald Anlauf  wrote:
>
> The initial patch for this PR had some fallout which, for an unknown reason,
> showed up only on i686 and not on x86_64.  With initial guidance from
> Manfred Schwarb, three further locations exhibiting buffer overruns were
> identified in a gdb session and fixed.
>
> Committed as 'obvious' to master.
>
> Thanks,
> Harald
>
>
> PR fortran/95090 - ICE: identifier overflow
>
> The initial fix for this PR uncovered several latent issues with further
> too small string buffers which showed up only when testing on i686.
> Provide sufficiently large temporaries.
>
> 2020-05-29  Harald Anlauf  
>
> gcc/fortran/
> PR fortran/95090
> * class.c (get_unique_type_string): Enlarge temporary for
> name-mangling.  Use strncpy to prevent buffer overrun.
> (get_unique_hashed_string): Enlarge temporary.
> (gfc_hash_value): Enlarge temporary for name-mangling.

This breaks bootstrap:

https://gcc.gnu.org/pipermail/gcc-regression/2020-May/072642.html

../../src-master/gcc/fortran/class.c:487:13: error: ‘char*
strncpy(char*, const char*, size_t)’ specified bound 67 equals
destination size [-Werror=stringop-truncation]
  487 | strncpy (dt_name, gfc_dt_upper_string (derived->name),
sizeof (dt_name));

-- 
H.J.


[PATCH] rs6000: PR target/95347 Correctly identify stfs if prefixed

2020-05-29 Thread Aaron Sawdey via Gcc-patches
Because reg_to_non_prefixed() only looks at the register being used, it
doesn't get the right answer for stfs, which leads to us not seeing
that it has a PCREL symbol ref.  This patch works around this by
introducing a helper function that inspects the insn to see if it is in
fact a stfs. Then if we use NON_PREFIXED_DEFAULT, address_to_insn_form()
can see that it has the PCREL symbol ref.

OK for trunk if regstrap on ppc64le passes?

Thanks,
   Aaron

2020-05-29  Aaron Sawdey  

PR target/95347
* config/rs6000/rs6000.c (prefixed_store_p): Add special case
for stfs.
(is_stfs_insn): New helper function.
---
 gcc/config/rs6000/rs6000.c | 60 +-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8435bc15d72..d58fca4 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24980,6 +24980,58 @@ address_to_insn_form (rtx addr,
   return INSN_FORM_BAD;
 }
 
+/* Helper function to see if we're potentially looking at stfs that
+   could be pstfs.  */
+
+static bool
+is_stfs_insn (rtx_insn *insn)
+{
+  rtx pattern=PATTERN (insn);
+  if (GET_CODE (pattern) != PARALLEL)
+return false;
+
+  /* This should be a parallel with exactly one set and one clobber.  */
+  int i;
+  rtx set=NULL, clobber=NULL;
+  for (i = 0; i < XVECLEN (pattern, 0); i++)
+{
+  rtx elt = XVECEXP (pattern, 0, i);
+  if (GET_CODE (elt) == SET)
+   {
+ if (set)
+   return false;
+ set = elt;
+   }
+  else if (GET_CODE (elt) == CLOBBER)
+   {
+ if (clobber)
+   return false;
+ clobber = elt;
+   }
+  else
+   return false;
+}
+
+  /* All we care is that the destination of the SET is a mem:SI,
+ the source should be an UNSPEC_SI_FROM_SF, and the clobber
+ should be a scratch:V4SF.  */
+
+  rtx dest = XEXP (set, 0);
+  rtx src = XEXP (set, 1);
+  rtx scratch = XEXP (clobber, 0);
+
+  if (GET_CODE (src) != UNSPEC || XINT (src, 1) != UNSPEC_SI_FROM_SF)
+return false;
+
+  if (GET_CODE (dest) != MEM || GET_MODE (dest) != SImode)
+return false;
+
+  if (GET_CODE (scratch) != SCRATCH || GET_MODE (scratch) != V4SFmode)
+return false;
+
+  return true;
+}
+
 /* Helper function to take a REG and a MODE and turn it into the non-prefixed
instruction format (D/DS/DQ) used for offset memory.  */
 
@@ -25119,8 +25171,14 @@ prefixed_store_p (rtx_insn *insn)
 return false;
 
   machine_mode mem_mode = GET_MODE (mem);
+  rtx addr = XEXP (mem, 0);
   enum non_prefixed_form non_prefixed = reg_to_non_prefixed (reg, mem_mode);
-  return address_is_prefixed (XEXP (mem, 0), mem_mode, non_prefixed);
+  /* Need to make sure we aren't looking at a stfs, which doesn't
+ look like the other things that we are looking for.  */
+  if (non_prefixed == NON_PREFIXED_X && is_stfs_insn (insn))
+return address_is_prefixed (addr, mem_mode, NON_PREFIXED_DEFAULT);
+  else
+return address_is_prefixed (addr, mem_mode, non_prefixed);
 }
 
 /* Whether a load immediate or add instruction is a prefixed instruction.  This
-- 
2.17.1



Re: [PATCH 2/5] Add function tree_code_in_cst.

2020-05-29 Thread Jan Hubicka
> 
> ---
>  gcc/tree.h | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/gcc/tree.h b/gcc/tree.h
> index bd0c51b2a18..86a4542f58b 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -6156,6 +6156,17 @@ int_bit_position (const_tree field)
> + wi::to_offset (DECL_FIELD_BIT_OFFSET (field))).to_shwi ();
>  }
> 
> +/* Determine if tree code is a constant */
> +inline bool
> +tree_code_is_cst (tree op)
> +{
> +  int code = TREE_CODE (op);
> +  if (code == INTEGER_CST || code == REAL_CST || code == COMPLEX_CST
> +  || code == VECTOR_CST)
> +return true;
> +  return false;

We have is_gimple_ip_invariant, which I think should suit your purpose:
it returns true if the tree is a constant, and it also accepts things like
addresses of (global) variables, functions and labels.
> +}
> +
>  /* Return true if it makes sense to consider alias set for a type T.  */
> 
>  inline bool
> -- 
> 2.18.1
> 


Re: [PATCH 1/5] Add stringify_ipa_ref_use function.

2020-05-29 Thread Jan Hubicka
Hello,
> 
> 
> ---
>  gcc/ipa-ref.c | 22 ++
>  gcc/ipa-ref.h |  3 +++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/gcc/ipa-ref.c b/gcc/ipa-ref.c
> index 241828ee973..76459e9cc3d 100644
> --- a/gcc/ipa-ref.c
> +++ b/gcc/ipa-ref.c
> @@ -103,3 +103,25 @@ ipa_ref::referred_ref_list (void)
>  {
>return >ref_list;
>  }
> +
> +const char *
> +stringify_ipa_ref_use (const ipa_ref_use use)
> +{
> +  switch (use)
> +{
> +case IPA_REF_LOAD:
> +  return "IPA_REF_LOAD";
> +  break;
> +case IPA_REF_STORE:
> +  return "IPA_REF_STORE";
> +  break;
> +case IPA_REF_ADDR:
> +  return "IPA_REF_ADDR";
> +  break;
> +case IPA_REF_ALIAS:
> +  return "IPA_REF_ALIAS";
> +  break;
> +default:
> +  return "";
> +}
There is ipa_ref_use_name, which also turns an ipa_ref use into a string.
I guess you can use the same mechanism if this is used for debug output?

Honza
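
For illustration, the table-lookup approach suggested above could look
roughly like the sketch below.  It is a standalone sketch, not GCC code:
the enum, the table and the strings are hypothetical stand-ins (all
prefixed with sketch_), and I have not checked them against the actual
contents of ipa_ref_use_name.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for GCC's ipa_ref_use; the enumerator order
// must match the order of the strings in the table below.
enum sketch_ref_use
{
  SKETCH_REF_LOAD,
  SKETCH_REF_STORE,
  SKETCH_REF_ADDR,
  SKETCH_REF_ALIAS,
  SKETCH_REF_COUNT  // number of real enumerators, not a use kind itself
};

// One string per enumerator, indexed directly by the enum value.
// The strings are assumptions made for this sketch.
static const char *const sketch_ref_use_name[SKETCH_REF_COUNT] = {
  "load", "store", "addr", "alias"
};

// Return a printable name for USE, or "" for an out-of-range value.
inline const char *
sketch_ref_use_to_string (sketch_ref_use use)
{
  return use < SKETCH_REF_COUNT ? sketch_ref_use_name[use] : "";
}
```

Compared with a switch, the table keeps the names in one place, at the
cost of having to keep it in sync with the enum.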


[PATCH, part2] PR fortran/95373 - [9/10/11 Regression] ICE in build_reference_type, at tree.c:7942

2020-05-29 Thread Harald Anlauf
The initial attempt to fix this PR unfortunately produced a regression
in the testsuite that was overlooked.  The real fix is to apply this
check in the appropriate place.

Regtested on x86_64-pc-linux-gnu.  Really.

OK for master and backports?

Thanks,
Harald


PR fortran/95373 - ICE in build_reference_type, at tree.c:7942

The use of KIND, LEN, RE, and IM inquiry references for applicable intrinsic
types is valid only for sufficiently new Fortran standards.  Add appropriate
checks in the appropriate place.

2020-05-28  Harald Anlauf  

gcc/fortran/
PR fortran/95373
* primary.c (is_inquiry_ref): Move validity check of inquiry
references against selected Fortran standard from here...
(gfc_match_varspec) ...to here.

gcc/testsuite/
PR fortran/95373
* gfortran.dg/pr95373_1.f90: Adjust error messages.
* gfortran.dg/pr95373_2.f90: Adjust error message.
diff --git a/gcc/fortran/primary.c b/gcc/fortran/primary.c
index 67105cc9ab1..7c221c8d209 100644
--- a/gcc/fortran/primary.c
+++ b/gcc/fortran/primary.c
@@ -1998,28 +1998,6 @@ is_inquiry_ref (const char *name, gfc_ref **ref)
   else
 return false;

-  switch (type)
-{
-case INQUIRY_RE:
-case INQUIRY_IM:
-  if (!gfc_notify_std (GFC_STD_F2008, "RE or IM part_ref at %C"))
-	return false;
-  break;
-
-case INQUIRY_KIND:
-  if (!gfc_notify_std (GFC_STD_F2003, "KIND part_ref at %C"))
-	return false;
-  break;
-
-case INQUIRY_LEN:
-  if (!gfc_notify_std (GFC_STD_F2003, "LEN part_ref at %C"))
-	return false;
-  break;
-
-default:
-  gcc_unreachable ();
-}
-
   if (ref)
 {
   *ref = gfc_get_ref ();
@@ -2267,6 +2245,27 @@ gfc_match_varspec (gfc_expr *primary, int equiv_flag, bool sub_flag,
 	{
 	  if (tmp)
 		{
+		  switch (tmp->u.i)
+		{
+		case INQUIRY_RE:
+		case INQUIRY_IM:
+		  if (!gfc_notify_std (GFC_STD_F2008,
+	   "RE or IM part_ref at %C"))
+			return MATCH_ERROR;
+		  break;
+
+		case INQUIRY_KIND:
+		  if (!gfc_notify_std (GFC_STD_F2003,
+	   "KIND part_ref at %C"))
+			return MATCH_ERROR;
+		  break;
+
+		case INQUIRY_LEN:
+		  if (!gfc_notify_std (GFC_STD_F2003, "LEN part_ref at %C"))
+			return MATCH_ERROR;
+		  break;
+		}
+
 		  if ((tmp->u.i == INQUIRY_RE || tmp->u.i == INQUIRY_IM)
 		  && primary->ts.type != BT_COMPLEX)
 		{
diff --git a/gcc/testsuite/gfortran.dg/pr95373_1.f90 b/gcc/testsuite/gfortran.dg/pr95373_1.f90
index f39b6a72346..59a9e7a81e0 100644
--- a/gcc/testsuite/gfortran.dg/pr95373_1.f90
+++ b/gcc/testsuite/gfortran.dg/pr95373_1.f90
@@ -4,12 +4,12 @@

 subroutine s (x)
   complex, parameter :: z = 3
-  real(z% kind)  :: x   ! { dg-error "nonderived-type variable" }
+  real(z% kind)  :: x   ! { dg-error "Fortran 2003: KIND part_ref" }
   type t
  real:: kind
  logical :: re
   end type t
   type(t) :: b
   print *, b% kind, b% re
-  print *, z% re! { dg-error "nonderived-type variable" }
+  print *, z% re! { dg-error "Fortran 2008: RE or IM part_ref" }
 end
diff --git a/gcc/testsuite/gfortran.dg/pr95373_2.f90 b/gcc/testsuite/gfortran.dg/pr95373_2.f90
index 2a654b43faa..b0f3da0a20d 100644
--- a/gcc/testsuite/gfortran.dg/pr95373_2.f90
+++ b/gcc/testsuite/gfortran.dg/pr95373_2.f90
@@ -11,5 +11,5 @@ subroutine s (x)
   end type t
   type(t) :: b
   print *, b% kind, b% re
-  print *, z% re! { dg-error "nonderived-type variable" }
+  print *, z% re! { dg-error "Fortran 2008: RE or IM part_ref" }
 end


[PATCH] expr: Fix fallout from optimize store_expr from STRING_CST [PR95052]

2020-05-29 Thread Jakub Jelinek via Gcc-patches
On Fri, May 15, 2020 at 10:32:00AM -0600, Jeff Law via Gcc-patches wrote:
> > I wasn't sure if it wouldn't be safer to add some bool flag set to true by
> > the new code and then add gcc_assert in all the other paths, like following
> > incremental patch.  I believe none of the asserts can trigger right now,
> > but the code is still adjusting what it plans to use as source before 
> > actually
> > only copying fewer bytes from it, so if somebody changes it later...
> > 
> > Thoughts on that?
> Can't hurt, and debugging the assert tripping is likely a hell of a lot easier
> than debugging the resultant incorrect code.   So if it passes, then I'd say 
> go
> for it.

Testing passed, so I've committed it with those asserts (and thankfully I
added them!), but it apparently broke the Linux kernel build on arm.

The problem is that if the STRING_CST is very short while the full object
has BLKmode, the short string could very well have
QImode/HImode/SImode/DImode.  In that case it doesn't take the path that
copies the string and then clears the remaining space, but other paths,
where it ICEs because of those asserts; without them it would just emit
wrong-code.

The following patch fixes it by enforcing BLKmode for the string MEM, even
if it is short, so that we copy it and memset the rest.

Ok for trunk if it passes bootstrap/regtest?

2020-05-29  Jakub Jelinek  

PR middle-end/95052
* expr.c (store_expr): For shortened_string_cst, ensure temp has
BLKmode.

* gcc.dg/pr95052.c: New test.

--- gcc/expr.c.jj   2020-05-29 10:42:26.0 +0200
+++ gcc/expr.c  2020-05-29 21:49:11.421646101 +0200
@@ -5779,6 +5779,11 @@ store_expr (tree exp, rtx target, int ca
   (call_param_p
? EXPAND_STACK_PARM : EXPAND_NORMAL),
   _rtl, false);
+  if (shortened_string_cst)
+   {
+ gcc_assert (MEM_P (temp));
+ temp = change_address (temp, BLKmode, NULL_RTX);
+   }
 }
 
   /* If TEMP is a VOIDmode constant and the mode of the type of EXP is not
--- gcc/testsuite/gcc.dg/pr95052.c.jj   2020-05-29 21:56:15.139426809 +0200
+++ gcc/testsuite/gcc.dg/pr95052.c  2020-05-29 21:55:51.919767620 +0200
@@ -0,0 +1,12 @@
+/* PR middle-end/95052 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fconserve-stack" } */
+
+void bar (char *);
+
+void
+foo (void)
+{
+  char buf[70] = "";
+  bar (buf);
+}


Jakub
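
As a side note, the source-level behaviour that the "copy the string and
memset the rest" path has to preserve can be checked with a small
standalone program.  This is only a sketch of the language-level
guarantee (every byte after the literal must be zero); it does not
exercise store_expr directly, and the function name is made up for the
example.

```cpp
#include <cassert>
#include <cstddef>

// Initialize a 70-byte array from a 3-byte string literal ("ab" plus
// the terminating NUL) and verify that the remaining bytes are zero,
// as aggregate initialization from a shorter literal requires.
inline bool
sketch_tail_is_zeroed ()
{
  char buf[70] = "ab";
  if (buf[0] != 'a' || buf[1] != 'b')
    return false;
  for (std::size_t i = 2; i < sizeof buf; ++i)
    if (buf[i] != '\0')
      return false;
  return true;
}
```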



Re: [PATCH] Avoid nested save_CFLAGS and save_LDFLAGS

2020-05-29 Thread Jakub Jelinek via Gcc-patches
On Fri, May 29, 2020 at 12:52:09PM -0700, H.J. Lu via Gcc-patches wrote:
> Avoid nested save_CFLAGS and save_LDFLAGS by replacing save_CFLAGS and
> save_LDFLAGS with cet_save_CFLAGS and cet_save_LDFLAGS in cet.m4.

Ok, thanks.

Jakub



[PATCH] Avoid nested save_CFLAGS and save_LDFLAGS

2020-05-29 Thread H.J. Lu via Gcc-patches
Avoid nested save_CFLAGS and save_LDFLAGS by replacing save_CFLAGS and
save_LDFLAGS with cet_save_CFLAGS and cet_save_LDFLAGS in cet.m4.

config/

PR bootstrap/95413
* cet.m4: Replace save_CFLAGS and save_LDFLAGS with
cet_save_CFLAGS and cet_save_LDFLAGS.

gcc/

PR bootstrap/95413
* configure: Regenerated.

libatomic/

PR bootstrap/95413
* configure: Regenerated.

libbacktrace/

PR bootstrap/95413
* configure: Regenerated.

libcc1/

PR bootstrap/95413
* configure: Regenerated.

libcpp/

PR bootstrap/95413
* configure: Regenerated.

libdecnumber/

PR bootstrap/95413
* configure: Regenerated.

libgcc/

PR bootstrap/95413
* configure: Regenerated.

libgfortran/

PR bootstrap/95413
* configure: Regenerated.

libgomp/

PR bootstrap/95413
* configure: Regenerated.

libiberty/

PR bootstrap/95413
* configure: Regenerated.

libitm/

PR bootstrap/95413
* configure: Regenerated.

libobjc/

PR bootstrap/95413
* configure: Regenerated.

libphobos/

PR bootstrap/95413
* configure: Regenerated.

libquadmath/

PR bootstrap/95413
* configure: Regenerated.

libsanitizer/

PR bootstrap/95413
* configure: Regenerated.

libssp/

PR bootstrap/95413
* configure: Regenerated.

libstdc++-v3/

PR bootstrap/95413
* configure: Regenerated.

libvtv/

PR bootstrap/95413
* configure: Regenerated.

lto-plugin/

PR bootstrap/95413
* configure: Regenerated.

zlib/

PR bootstrap/95413
* configure: Regenerated.
---
 config/cet.m4  | 17 +
 gcc/configure  | 12 ++--
 libatomic/configure|  5 +++--
 libbacktrace/configure | 17 +
 libcc1/configure   | 12 ++--
 libcpp/configure   | 12 ++--
 libdecnumber/configure | 12 ++--
 libgcc/configure   |  5 +++--
 libgfortran/configure  |  9 +
 libgomp/configure  |  4 ++--
 libiberty/configure| 12 ++--
 libitm/configure   |  5 +++--
 libobjc/configure  |  9 +
 libphobos/configure|  9 +
 libquadmath/configure  |  5 +++--
 libsanitizer/configure |  5 +++--
 libssp/configure   |  9 +
 libstdc++-v3/configure |  5 +++--
 libvtv/configure   |  5 +++--
 lto-plugin/configure   | 12 ++--
 zlib/configure |  9 +
 21 files changed, 102 insertions(+), 88 deletions(-)

diff --git a/config/cet.m4 b/config/cet.m4
index 2bb2c8a95ac..911fbd46475 100644
--- a/config/cet.m4
+++ b/config/cet.m4
@@ -7,13 +7,14 @@ GCC_ENABLE(cet, auto, ,[enable Intel CET in target libraries],
   permit yes|no|auto)
 AC_MSG_CHECKING([for CET support])
 
+# NB: Avoid nested save_CFLAGS and save_LDFLAGS.
 case "$host" in
   i[[34567]]86-*-linux* | x86_64-*-linux*)
 case "$enable_cet" in
   auto)
# Check if target supports multi-byte NOPs
# and if assembler supports CET insn.
-   save_CFLAGS="$CFLAGS"
+   cet_save_CFLAGS="$CFLAGS"
CFLAGS="$CFLAGS -fcf-protection"
AC_COMPILE_IFELSE(
 [AC_LANG_PROGRAM(
@@ -27,7 +28,7 @@ asm ("setssbsy");
  ])],
 [enable_cet=yes],
 [enable_cet=no])
-   CFLAGS="$save_CFLAGS"
+   CFLAGS="$cet_save_CFLAGS"
;;
   yes)
# Check if assembler supports CET.
@@ -64,7 +65,7 @@ AC_MSG_CHECKING([for CET support])
 case "$host" in
   i[[34567]]86-*-linux* | x86_64-*-linux*)
 may_have_cet=yes
-save_CFLAGS="$CFLAGS"
+cet_save_CFLAGS="$CFLAGS"
 CFLAGS="$CFLAGS -fcf-protection"
 case "$enable_cet" in
   auto)
@@ -93,7 +94,7 @@ asm ("setssbsy");
 [AC_MSG_ERROR([assembler with CET support is required for --enable-cet])])
;;
 esac
-CFLAGS="$save_CFLAGS"
+CFLAGS="$cet_save_CFLAGS"
 ;;
   *)
 may_have_cet=no
@@ -101,9 +102,9 @@ asm ("setssbsy");
 ;;
 esac
 
-save_CFLAGS="$CFLAGS"
+cet_save_CFLAGS="$CFLAGS"
 CFLAGS="$CFLAGS -fcf-protection=none"
-save_LDFLAGS="$LDFLAGS"
+cet_save_LDFLAGS="$LDFLAGS"
 LDFLAGS="$LDFLAGS -Wl,-z,ibt,-z,shstk"
 if test x$may_have_cet = xyes; then
   # Check whether -fcf-protection=none -Wl,-z,ibt,-z,shstk work.
@@ -159,6 +160,6 @@ if test x$enable_cet = xyes; then
 else
   AC_MSG_RESULT([no])
 fi
-CFLAGS="$save_CFLAGS"
-LDFLAGS="$save_LDFLAGS"
+CFLAGS="$cet_save_CFLAGS"
+LDFLAGS="$cet_save_LDFLAGS"
 ])
diff --git a/gcc/configure b/gcc/configure
index 4531d50eb0f..46850710424 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -31747,7 +31747,7 @@ $as_echo_n "checking for CET support... " >&6; }
 case "$host" in
   i[34567]86-*-linux* | x86_64-*-linux*)
 may_have_cet=yes
-save_CFLAGS="$CFLAGS"
+cet_save_CFLAGS="$CFLAGS"
 CFLAGS="$CFLAGS -fcf-protection"
 case "$enable_cet" in
   auto)
@@ 

[PATCH, committed, part2] PR fortran/95090 - ICE: identifier overflow

2020-05-29 Thread Harald Anlauf
The initial patch for this PR had some fallout which, for unknown reasons,
showed up only on i686 but not on x86_64.  With initial guidance from
Manfred Schwarb, three further locations exhibiting buffer overruns could
be identified in a gdb session and were fixed.

Committed as 'obvious' to master.

Thanks,
Harald


PR fortran/95090 - ICE: identifier overflow

The initial fix for this PR uncovered several latent issues with other
string buffers that were too small, which showed up only when testing on
i686.  Provide sufficiently large temporaries.

2020-05-29  Harald Anlauf  

gcc/fortran/
PR fortran/95090
* class.c (get_unique_type_string): Enlarge temporary for
name-mangling.  Use strncpy to prevent buffer overrun.
(get_unique_hashed_string): Enlarge temporary.
(gfc_hash_value): Enlarge temporary for name-mangling.
diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index 9aa3eb7282c..db395624a16 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -479,11 +479,12 @@ gfc_class_initializer (gfc_typespec *ts, gfc_expr *init_expr)
 static void
 get_unique_type_string (char *string, gfc_symbol *derived)
 {
-  char dt_name[GFC_MAX_SYMBOL_LEN+1];
+  /* Provide sufficient space to hold "Pdtsymbol".  */
+  char dt_name[GFC_MAX_SYMBOL_LEN+4];
   if (derived->attr.unlimited_polymorphic)
 strcpy (dt_name, "STAR");
   else
-strcpy (dt_name, gfc_dt_upper_string (derived->name));
+strncpy (dt_name, gfc_dt_upper_string (derived->name), sizeof (dt_name));
   if (derived->attr.unlimited_polymorphic)
 sprintf (string, "_%s", dt_name);
   else if (derived->module)
@@ -501,7 +502,8 @@ get_unique_type_string (char *string, gfc_symbol *derived)
 static void
 get_unique_hashed_string (char *string, gfc_symbol *derived)
 {
-  char tmp[2*GFC_MAX_SYMBOL_LEN+2];
+  /* Provide sufficient space to hold "symbol_Pdtsymbol".  */
+  char tmp[2*GFC_MAX_SYMBOL_LEN+5];
   get_unique_type_string (&tmp[0], derived);
   /* If string is too long, use hash value in hex representation (allow for
  extra decoration, cf. gfc_build_class_symbol & gfc_find_derived_vtab).
@@ -523,7 +525,8 @@ unsigned int
 gfc_hash_value (gfc_symbol *sym)
 {
   unsigned int hash = 0;
-  char c[2*(GFC_MAX_SYMBOL_LEN+1)];
+  /* Provide sufficient space to hold "symbol_Pdtsymbol".  */
+  char c[2*GFC_MAX_SYMBOL_LEN+5];
   int i, len;

   get_unique_type_string (&c[0], sym);
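
One general point about strncpy that is relevant to bounded copies like
the one in the patch: when the source is at least as long as the size
argument, strncpy does not write a terminating NUL.  With the enlarged
buffers this should not matter here, as far as I can tell, but a helper
that always terminates is easy to sketch.  The helper below is
hypothetical and not part of gfortran.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not GCC code: copy SRC into DST, which has room
// for CAP bytes (CAP > 0), truncating if necessary and always leaving
// DST NUL-terminated.
inline void
sketch_bounded_copy (char *dst, std::size_t cap, const char *src)
{
  // Writes at most cap - 1 bytes; pads with NULs if SRC is shorter.
  std::strncpy (dst, src, cap - 1);
  // strncpy leaves no terminator when SRC has cap - 1 or more
  // characters, so add one explicitly.
  dst[cap - 1] = '\0';
}
```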


Re: [PATCH] c++: Make braced-init-list as template arg work with aggr init [PR95369]

2020-05-29 Thread Jason Merrill via Gcc-patches

On 5/28/20 7:23 PM, Marek Polacek wrote:

Barry pointed out to me that our braced-init-list as a template-argument
extension doesn't work as expected when we aggregate-initialize.  Thus
fixed by calling digest_init in convert_nontype_argument so that we can
actually convert the CONSTRUCTOR.

I don't think we can call digest_init any earlier, and it needs to
happen before the call to build_converted_constant_expr.


Or we could fix build_converted_constant_expr to allow it; aggregate 
list-initialization is a user-defined conversion sequence, so I think it 
should be allowed as part of a converted constant expression.


Jason


Barry also noticed that we allow designated initializers for
non-aggregate types in the template-argument argument context, i.e. this

   struct S {
 unsigned a;
 unsigned b;
 constexpr S(unsigned _a, unsigned _b) noexcept: a{_a}, b{_b} { }
   };

   template struct X { };

   void f()
   {
 X<{.a = 1, .b = 2}> x;
   }

probably should not compile.  But I'm not too sure about it, and don't
know how I would fix it anyway, so I'm not dealing with it in this
patch.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/95369
* pt.c (convert_nontype_argument): In C++20, reshape and digest
a braced-init-list if the type is an aggregate.

gcc/testsuite/ChangeLog:

PR c++/95369
* g++.dg/cpp2a/nontype-class38.C: New test.
---
  gcc/cp/pt.c  | 13 +
  gcc/testsuite/g++.dg/cpp2a/nontype-class38.C | 30 
  2 files changed, 43 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class38.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 90dafff3aa7..adb7593f77d 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -7133,6 +7133,19 @@ convert_nontype_argument (tree type, tree expr, tsubst_flags_t complain)
return error_mark_node;
  }
  
+  /* For a { } template argument, like in X<{ 1, 2 }>, we need to digest
+ here so that build_converted_constant_expr below is able to convert
+ it to TYPE.  */
+  if (cxx_dialect >= cxx20
+  && BRACE_ENCLOSED_INITIALIZER_P (expr)
+  && CP_AGGREGATE_TYPE_P (type))
+{
+  expr = reshape_init (type, expr, complain);
+  expr = digest_init (type, expr, complain);
+  if (expr == error_mark_node)
+   return error_mark_node;
+}
+
/* If we are in a template, EXPR may be non-dependent, but still
   have a syntactic, rather than semantic, form.  For example, EXPR
   might be a SCOPE_REF, rather than the VAR_DECL to which the
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class38.C b/gcc/testsuite/g++.dg/cpp2a/nontype-class38.C
new file mode 100644
index 000..5b440fd1c9e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class38.C
@@ -0,0 +1,30 @@
+// PR c++/95369
+// { dg-do compile { target c++20 } }
+
+struct S {
+  int a;
+  int b;
+};
+
+struct W {
+  int i;
+  S s;
+};
+
+template 
+void fnc()
+{
+}
+
+template struct X { };
+template struct Y { };
+
+void f()
+{
+  fnc<{ .a = 10, .b = 20 }>();
+  fnc<{ 10, 20 }>();
+  X<{ .a = 1, .b = 2 }> x;
+  X<{ 1, 2 }> x2;
+  // Brace elision is likely to be allowed.
+  Y<{ 1, 2, 3 }> x3;
+}

base-commit: 3d8d5ddb539a5254c7ef83414377f4c74c7701d4



diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index a51ebb5d9e3..d38e23a7e9a 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -4348,7 +4348,7 @@ build_converted_constant_expr_internal (tree type, tree expr,
  and where the reference binding (if any) binds directly.  */
 
   for (conversion *c = conv;
-   conv && c->kind != ck_identity;
+   c && c->kind != ck_identity;
c = next_conversion (c))
 {
   switch (c->kind)
@@ -4356,6 +4356,8 @@ build_converted_constant_expr_internal (tree type, tree expr,
 	  /* A conversion function is OK.  If it isn't constexpr, we'll
 	 complain later that the argument isn't constant.  */
 	case ck_user:
+	  /* List-initialization is OK.  */
+	case ck_aggr:
 	  /* The lvalue-to-rvalue conversion is OK.  */
 	case ck_rvalue:
 	  /* Array-to-pointer and function-to-pointer.  */


Re: [PATCH] c++: satisfaction value of type typedef to bool [PR95386]

2020-05-29 Thread Jason Merrill via Gcc-patches

On 5/29/20 1:40 PM, Patrick Palka wrote:

On Fri, 29 May 2020, Jason Merrill wrote:


On 5/29/20 11:59 AM, Patrick Palka wrote:

In the testcase below, the satisfaction value of fn1's constraint
is INTEGER_CST '1' of type BOOLEAN_TYPE value_type, which is a typedef
to the standard boolean_type_node.  But satisfaction_value expects to
see exactly boolean_true_node or integer_one_node, which this value is
neither, causing us to trip over the assert therein.

This patch relaxes satisfaction_value to accept any INTEGER_CST which
satisfies integer_zerop or integer_onep.  (It seems we could get away
with accepting only INTEGER_CSTs of type BOOLEAN_TYPE, but that wouldn't
be a proper relaxation of what the subroutine currently accepts and
would therefore be more risky to backport.)


I think for GCC 11 I'd prefer to restrict it to BOOLEAN_TYPE.  This patch is
OK for GCC 10.


Sounds good.  Would the following be OK for GCC 11 after a full
bootstrap and regtest?



I opted to mirror satisfy_atom and instead check
same_type_p (..., boolean_type_node).


OK.


-- >8 --

Subject: [PATCH] c++: satisfaction value of type typedef to bool [PR95386]

In the testcase below, the satisfaction value of fn1's constraint
is INTEGER_CST '1' of type BOOLEAN_TYPE value_type, which is a typedef
to the standard boolean_type_node.  But satisfaction_value expects to
see exactly boolean_true_node or integer_one_node, which this value is
neither, causing us to trip over the assert therein.

This patch changes satisfaction_value to accept INTEGER_CST of any
boolean type.

gcc/cp/ChangeLog:

PR c++/95386
* constraint.cc (satisfaction_value): Accept INTEGER_CST of any
boolean type.

gcc/testsuite/ChangeLog:

PR c++/95386
* g++.dg/concepts/pr95386.C: New test.
---
  gcc/cp/constraint.cc| 14 +++---
  gcc/testsuite/g++.dg/concepts/pr95386.C | 11 +++
  2 files changed, 18 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/concepts/pr95386.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index eb72bfe5936..92ff283013e 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2490,15 +2490,15 @@ satisfy_disjunction (tree t, tree args, subst_info info)
  tree
  satisfaction_value (tree t)
  {
-  if (t == error_mark_node)
+  if (t == error_mark_node || t == boolean_true_node || t == boolean_false_node)
  return t;
-  if (t == boolean_true_node || t == integer_one_node)
-return boolean_true_node;
-  if (t == boolean_false_node || t == integer_zero_node)
-return boolean_false_node;
  
-  /* Anything else should be invalid.  */
-  gcc_assert (false);
+  gcc_assert (TREE_CODE (t) == INTEGER_CST
+ && same_type_p (TREE_TYPE (t), boolean_type_node));
+  if (integer_zerop (t))
+return boolean_false_node;
+  else
+return boolean_true_node;
  }
  
  /* Build a new template argument list with template arguments corresponding

diff --git a/gcc/testsuite/g++.dg/concepts/pr95386.C b/gcc/testsuite/g++.dg/concepts/pr95386.C
new file mode 100644
index 000..3c683e5693c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/pr95386.C
@@ -0,0 +1,11 @@
+// PR c++/95386
+// { dg-do compile { target concepts } }
+
+template  struct blah {
+ typedef bool value_type;
+ constexpr operator value_type() { return false; }
+};
+
+template  void fn1(T) requires (!blah());
+
+void fn2() { fn1(0); }





Re: [PATCH] Prefer simple case changes in spelling suggestions

2020-05-29 Thread Pip Cet via Gcc-patches
On Fri, May 29, 2020 at 6:02 PM Tom Tromey  wrote:
> This patch changes gcc's spell checker to prefer simple case changes
> when possible.
>
> I tested this using the self-tests.  A new self-test is also included.

I think that's great, but it should be mentioned in the comment that
the distance function used is not the Damerau-Levenshtein distance,
and not a metric.

(No triangle inequality. For example, the distance between "aB" and
"ba" is 4, but the distance between "aB" and "ab" is 1 and that
between "ab" and "ba" is 2, unless I missed something very clever in
your code.)

IIRC, minimum string alignment does not satisfy the triangle
inequality anyway, so test_metric_conditions should probably not
pretend to test it...
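
To make the arithmetic above easy to check, here is a small standalone
sketch of a restricted edit distance (optimal string alignment) with a
cheaper case-only substitution.  The costs are assumptions chosen to
reproduce the figures quoted above (2 for insert, delete, substitute and
transpose; 1 for a case-only change); they are not copied from the
patch, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Assumed costs, picked to match the numbers in the message above.
constexpr int sketch_base_cost = 2;  // insert, delete, substitute, transpose
constexpr int sketch_case_cost = 1;  // change only the case of one letter

// Optimal-string-alignment distance between S and T with the costs above.
inline int
sketch_distance (const std::string &s, const std::string &t)
{
  const std::size_t n = s.size (), m = t.size ();
  std::vector<std::vector<int>> d (n + 1, std::vector<int> (m + 1, 0));
  for (std::size_t i = 0; i <= n; ++i)
    d[i][0] = static_cast<int> (i) * sketch_base_cost;
  for (std::size_t j = 0; j <= m; ++j)
    d[0][j] = static_cast<int> (j) * sketch_base_cost;

  for (std::size_t i = 1; i <= n; ++i)
    for (std::size_t j = 1; j <= m; ++j)
      {
        const unsigned char a = s[i - 1], b = t[j - 1];
        int subst = sketch_base_cost;
        if (a == b)
          subst = 0;
        else if (std::tolower (a) == std::tolower (b))
          subst = sketch_case_cost;

        int best = std::min ({ d[i - 1][j] + sketch_base_cost,     // delete
                               d[i][j - 1] + sketch_base_cost,     // insert
                               d[i - 1][j - 1] + subst });         // substitute
        // Transposition of two adjacent characters (exact match only).
        if (i > 1 && j > 1 && s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1])
          best = std::min (best, d[i - 2][j - 2] + sketch_base_cost);
        d[i][j] = best;
      }
  return d[n][m];
}
```

With these assumed costs, "aB" to "ba" costs more than going via "ab",
which is the triangle-inequality violation described above.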


Re: [PATCH] c++: satisfaction value of type typedef to bool [PR95386]

2020-05-29 Thread Patrick Palka via Gcc-patches
On Fri, 29 May 2020, Jason Merrill wrote:

> On 5/29/20 11:59 AM, Patrick Palka wrote:
> > In the testcase below, the satisfaction value of fn1's constraint
> > is INTEGER_CST '1' of type BOOLEAN_TYPE value_type, which is a typedef
> > to the standard boolean_type_node.  But satisfaction_value expects to
> > see exactly boolean_true_node or integer_one_node, which this value is
> > neither, causing us to trip over the assert therein.
> > 
> > This patch relaxes satisfaction_value to accept any INTEGER_CST which
> > satisfies integer_zerop or integer_onep.  (It seems we could get away
> > with accepting only INTEGER_CSTs of type BOOLEAN_TYPE, but that wouldn't
> > be a proper relaxation of what the subroutine currently accepts and
> > would therefore be more risky to backport.)
> 
> I think for GCC 11 I'd prefer to restrict it to BOOLEAN_TYPE.  This patch is
> OK for GCC 10.

Sounds good.  Would the following be OK for GCC 11 after a full
bootstrap and regtest?

I opted to mirror satisfy_atom and instead check
same_type_p (..., boolean_type_node).

-- >8 --

Subject: [PATCH] c++: satisfaction value of type typedef to bool [PR95386]

In the testcase below, the satisfaction value of fn1's constraint
is INTEGER_CST '1' of type BOOLEAN_TYPE value_type, which is a typedef
to the standard boolean_type_node.  But satisfaction_value expects to
see exactly boolean_true_node or integer_one_node, which this value is
neither, causing us to trip over the assert therein.

This patch changes satisfaction_value to accept INTEGER_CST of any
boolean type.

gcc/cp/ChangeLog:

PR c++/95386
* constraint.cc (satisfaction_value): Accept INTEGER_CST of any
boolean type.

gcc/testsuite/ChangeLog:

PR c++/95386
* g++.dg/concepts/pr95386.C: New test.
---
 gcc/cp/constraint.cc| 14 +++---
 gcc/testsuite/g++.dg/concepts/pr95386.C | 11 +++
 2 files changed, 18 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/concepts/pr95386.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index eb72bfe5936..92ff283013e 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2490,15 +2490,15 @@ satisfy_disjunction (tree t, tree args, subst_info info)
 tree
 satisfaction_value (tree t)
 {
-  if (t == error_mark_node)
+  if (t == error_mark_node || t == boolean_true_node || t == boolean_false_node)
 return t;
-  if (t == boolean_true_node || t == integer_one_node)
-return boolean_true_node;
-  if (t == boolean_false_node || t == integer_zero_node)
-return boolean_false_node;
 
-  /* Anything else should be invalid.  */
-  gcc_assert (false);
+  gcc_assert (TREE_CODE (t) == INTEGER_CST
+ && same_type_p (TREE_TYPE (t), boolean_type_node));
+  if (integer_zerop (t))
+return boolean_false_node;
+  else
+return boolean_true_node;
 }
 
 /* Build a new template argument list with template arguments corresponding
diff --git a/gcc/testsuite/g++.dg/concepts/pr95386.C b/gcc/testsuite/g++.dg/concepts/pr95386.C
new file mode 100644
index 000..3c683e5693c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/pr95386.C
@@ -0,0 +1,11 @@
+// PR c++/95386
+// { dg-do compile { target concepts } }
+
+template  struct blah {
+ typedef bool value_type;
+ constexpr operator value_type() { return false; }
+};
+
+template  void fn1(T) requires (!blah());
+
+void fn2() { fn1(0); }
-- 
2.27.0.rc1.5.gae92ac8ae3


> 
> > Passes 'make check-c++', does this look OK to commit to master and to
> > the GCC 10 branch after a full bootstrap and regtest?
> > 
> > gcc/cp/ChangeLog:
> > 
> > PR c++/95386
> > * constraint.cc (satisfaction_value): Relax to accept any
> > INTEGER_CST that satisfies integer_zerop or integer_onep.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/95386
> > * g++.dg/concepts/pr95386.C: New test.
> > ---
> >   gcc/cp/constraint.cc|  7 ---
> >   gcc/testsuite/g++.dg/concepts/pr95386.C | 11 +++
> >   2 files changed, 15 insertions(+), 3 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/concepts/pr95386.C
> > 
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index eb72bfe5936..5a247cfb738 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -2490,11 +2490,12 @@ satisfy_disjunction (tree t, tree args, subst_info
> > info)
> >   tree
> >   satisfaction_value (tree t)
> >   {
> > -  if (t == error_mark_node)
> > +  if (t == error_mark_node || t == boolean_true_node || t ==
> > boolean_false_node)
> >   return t;
> > -  if (t == boolean_true_node || t == integer_one_node)
> > +  gcc_assert (TREE_CODE (t) == INTEGER_CST);
> > +  if (integer_onep (t))
> >   return boolean_true_node;
> > -  if (t == boolean_false_node || t == integer_zero_node)
> > +  if (integer_zerop (t))
> >   return boolean_false_node;
> >   /* Anything else should be invalid.  */
> > diff --git a/gcc/testsuite/g++.dg/concepts/pr95386.C
> 

Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-29 Thread Segher Boessenkool
On Fri, May 29, 2020 at 06:26:55PM +0100, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> > Most patterns *do* FAIL on some target.  We cannot rewind time.
> 
> Sure.  But the point is that FAILing isn't “explicitly allowed” for vcond*.
> In fact it's the opposite.

It has FAILed on rs6000 since 2004.

> If we ignore the docs and look at what the status quo actually is --
> which I agree seems safest for GCC :-) -- then patterns are allowed to
> FAIL if target-independent code provides an expand-time fallback for
> the FAILing case.  But that isn't true for vcond either.

That is a bug in the callers then :-)

> expand_vec_cond_expr does:
> 
>   icode = get_vcond_icode (mode, cmp_op_mode, unsignedp);
>   if (icode == CODE_FOR_nothing)
> ...
> 
>   comparison = vector_compare_rtx (VOIDmode, tcode, op0a, op0b, unsignedp,
>  icode, 4);
>   rtx_op1 = expand_normal (op1);
>   rtx_op2 = expand_normal (op2);
> 
>   create_output_operand (&ops[0], target, mode);
>   create_input_operand (&ops[1], rtx_op1, mode);
>   create_input_operand (&ops[2], rtx_op2, mode);
>   create_fixed_operand (&ops[3], comparison);
>   create_fixed_operand (&ops[4], XEXP (comparison, 0));
>   create_fixed_operand (&ops[5], XEXP (comparison, 1));
>   expand_insn (icode, 6, ops);
>   return ops[0].value;
> 
> which ICEs if the expander FAILs.
> 
> So whether you go from the docs or from what's actually implemented,
> vcond* isn't currently allowed to FAIL.  All Richard's gcc_unreachable
> suggestion would do is change where the ICE happens.


>   icode = get_vcond_icode (mode, cmp_op_mode, unsignedp);
>   if (icode == CODE_FOR_nothing)
> ...

Of course it is allowed to FAIL, based on this code.  That is: the RTL
pattern is allowed to FAIL.  Whatever optabs do, I never understood :-)

Is this vec_cmp that is used by the fallback?  That will never FAIL
for us (if it is enabled at all, natch, same as for any other target).


Segher


[PATCH 4/5] Export cgraph_externally_visible_p.

2020-05-29 Thread Erick Ochoa




---
 gcc/ipa-ref.h| 2 +-
 gcc/ipa-visibility.c | 5 ++---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/ipa-ref.h b/gcc/ipa-ref.h
index 95e29605548..9ff26e2693c 100644
--- a/gcc/ipa-ref.h
+++ b/gcc/ipa-ref.h
@@ -139,5 +139,5 @@ public:

 const char *
 stringify_ipa_ref_use (const ipa_ref_use use);
-
+bool cgraph_externally_visible_p (struct cgraph_node *, bool);
 #endif /* GCC_IPA_REF_H */
diff --git a/gcc/ipa-visibility.c b/gcc/ipa-visibility.c
index 7c854f471e8..8397cc9d61d 100644
--- a/gcc/ipa-visibility.c
+++ b/gcc/ipa-visibility.c
@@ -186,9 +186,8 @@ comdat_can_be_unshared_p (symtab_node *node)

 /* Return true when function NODE should be considered externally visible.  */

-static bool
-cgraph_externally_visible_p (struct cgraph_node *node,
-bool whole_program)
+bool
+cgraph_externally_visible_p (struct cgraph_node *node, bool whole_program)
 {
   while (node->transparent_alias && node->definition)
 node = node->get_alias_target ();
--
2.18.1



[PATCH 5/5] Adds ipa-initcall-cp pass.

2020-05-29 Thread Erick Ochoa




This pass is a variant of constant propagation where global
primitive constants with a single write are propagated to multiple
read statements.

ChangeLog:

2020-05-20  Erick Ochoa 

gcc/Makefile.in: Adds new pass
gcc/passes.def: Same
gcc/tree-pass.h: Same
gcc/common.opt: Same
gcc/cgraph.h: Adds new field to cgraph_node
gcc/cgraph.c: Same
gcc/ipa-cp.c: May skip ipa-cp for a function
if initcall-cp has constants to propagate
in that same function
gcc/ipa-initcall-cp.c: New file
---
 gcc/Makefile.in   |1 +
 gcc/cgraph.h  |5 +-
 gcc/common.opt|8 +
 gcc/ipa-cp.c  |2 +-
 gcc/ipa-initcall-cp.c | 1199 +
 gcc/passes.def|1 +
 gcc/tree-pass.h   |1 +
 7 files changed, 1215 insertions(+), 2 deletions(-)
 create mode 100644 gcc/ipa-initcall-cp.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 543b477ff18..b94fb633d78 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1401,6 +1401,7 @@ OBJS = \
internal-fn.o \
ipa-cp.o \
ipa-sra.o \
+   ipa-initcall-cp.o \
ipa-devirt.o \
ipa-fnsummary.o \
ipa-polymorphic-call.o \
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 5ddeb65269b..532b4671609 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -937,7 +937,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node

   split_part (false), indirect_call_target (false), local (false),
   versionable (false), can_change_signature (false),
   redefined_extern_inline (false), tm_may_enter_irr (false),
-  ipcp_clone (false), m_uid (uid), m_summary_id (-1)
+  ipcp_clone (false), skip_ipa_cp (false), m_uid (uid), m_summary_id (-1)

   {}

   /* Remove the node from cgraph and all inline clones inlined into it.
@@ -1539,6 +1539,9 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node

   unsigned tm_may_enter_irr : 1;
   /* True if this was a clone created by ipa-cp.  */
   unsigned ipcp_clone : 1;
+  /* True if ipa-initcall-cp and therefore we need to skip ipa-cp for cgraph
+   * node.  */
+  unsigned skip_ipa_cp : 1;

 private:
   /* Unique id of the node.  */
diff --git a/gcc/common.opt b/gcc/common.opt
index d33383b523c..b2d8d1b0958 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3408,4 +3408,12 @@ fipa-ra
 Common Report Var(flag_ipa_ra) Optimization
 Use caller save register across calls if possible.

+fipa-initcall-cp-dryrun
+Common Report Var(flag_initcall_cp_dryrun)
+Run analysis for propagating constants across initialization functions.
+
+fipa-initcall-cp
+Common Report Var(flag_initcall_cp) Optimization
+Run transformation for propagating constants across initialization functions.

+
 ; This comment is to ensure we retain the blank line above.
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index c64e9104a94..1036cb0684e 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -1188,7 +1188,7 @@ initialize_node_lattices (struct cgraph_node *node)

   gcc_checking_assert (node->has_gimple_body_p ());

-  if (!ipa_get_param_count (info))
+  if (!ipa_get_param_count (info) || node->skip_ipa_cp)
 disable = true;
   else if (node->local)
 {
diff --git a/gcc/ipa-initcall-cp.c b/gcc/ipa-initcall-cp.c
new file mode 100644
index 000..02f70b7a908
--- /dev/null
+++ b/gcc/ipa-initcall-cp.c
@@ -0,0 +1,1199 @@
+/* Initcall constant propagation pass.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   Contributed by Erick Ochoa 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "tree-eh.h"
+#include "gimple.h"
+#include "gimple-expr.h"
+#include "gimple-iterator.h"
+#include "predict.h"
+#include "alloc-pool.h"
+#include "tree-pass.h"
+#include "cgraph.h"
+#include "diagnostic.h"
+#include "fold-const.h"
+#include "gimple-fold.h"
+#include "symbol-summary.h"
+#include "tree-vrp.h"
+#include "ipa-prop.h"
+#include "tree-pretty-print.h"
+#include "tree-inline.h"
+#include "ipa-fnsummary.h"
+#include "ipa-utils.h"
+#include "tree-ssa-ccp.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "gimple-pretty-print.h"
+#include "ssa.h"
+
+#define INITCALL_CP_SUFFIX "initcall.cp"
+
+typedef vec ipa_ref_vec;
+
+/* log to dump file 

[PATCH 3/5] Add function path_exists

2020-05-29 Thread Erick Ochoa




---
 gcc/ipa-utils.c | 33 +
 gcc/ipa-utils.h |  4 +++-
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-utils.c b/gcc/ipa-utils.c
index 23e7f714306..bd3e79bd2e2 100644
--- a/gcc/ipa-utils.c
+++ b/gcc/ipa-utils.c
@@ -781,3 +781,36 @@ recursive_call_p (tree func, tree dest)
   return false;
   return true;
 }
+
+static bool
+path_exists_dfs (hash_set<const cgraph_node *> *visited,
+const cgraph_node *current, const cgraph_node *target)
+{
+  if (current == target)
+return true;
+
+  visited->add (current);
+
+  cgraph_edge *cs;
+  for (cs = current->callees; cs; cs = cs->next_callee)
+{
+  cgraph_node *callee = cs->callee->function_symbol ();
+  if (visited->contains (callee))
+   continue;
+  if (!path_exists_dfs (visited, callee, target))
+   continue;
+
+  return true;
+}
+  return false;
+}
+
+/* Determine if there's a path between two cgraph_nodes */
+bool
+path_exists (const cgraph_node *from, const cgraph_node *to)
+{
+  hash_set<const cgraph_node *> visited;
+  bool found_path = path_exists_dfs (&visited, from, to);
+  visited.empty ();
+  return found_path;
+}
diff --git a/gcc/ipa-utils.h b/gcc/ipa-utils.h
index 98edc383461..1706deaee0a 100644
--- a/gcc/ipa-utils.h
+++ b/gcc/ipa-utils.h
@@ -263,4 +263,6 @@ get_odr_name_for_type (tree type)
   return IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (type_name));
 }

-#endif  /* GCC_IPA_UTILS_H  */
+extern bool
+path_exists (const cgraph_node *from, const cgraph_node *to);
+#endif /* GCC_IPA_UTILS_H  */
--
2.18.1



[PATCH 2/5] Add function tree_code_is_cst.

2020-05-29 Thread Erick Ochoa



---
 gcc/tree.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/gcc/tree.h b/gcc/tree.h
index bd0c51b2a18..86a4542f58b 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -6156,6 +6156,17 @@ int_bit_position (const_tree field)
  + wi::to_offset (DECL_FIELD_BIT_OFFSET (field))).to_shwi ();
 }

+/* Determine if tree code is a constant */
+inline bool
+tree_code_is_cst (tree op)
+{
+  int code = TREE_CODE (op);
+  if (code == INTEGER_CST || code == REAL_CST || code == COMPLEX_CST
+  || code == VECTOR_CST)
+return true;
+  return false;
+}
+
 /* Return true if it makes sense to consider alias set for a type T.  */

 inline bool
--
2.18.1



[PATCH 1/5] Add stringify_ipa_ref_use function.

2020-05-29 Thread Erick Ochoa




---
 gcc/ipa-ref.c | 22 ++
 gcc/ipa-ref.h |  3 +++
 2 files changed, 25 insertions(+)

diff --git a/gcc/ipa-ref.c b/gcc/ipa-ref.c
index 241828ee973..76459e9cc3d 100644
--- a/gcc/ipa-ref.c
+++ b/gcc/ipa-ref.c
@@ -103,3 +103,25 @@ ipa_ref::referred_ref_list (void)
 {
   return &referred->ref_list;
 }
+
+const char *
+stringify_ipa_ref_use (const ipa_ref_use use)
+{
+  switch (use)
+{
+case IPA_REF_LOAD:
+  return "IPA_REF_LOAD";
+  break;
+case IPA_REF_STORE:
+  return "IPA_REF_STORE";
+  break;
+case IPA_REF_ADDR:
+  return "IPA_REF_ADDR";
+  break;
+case IPA_REF_ALIAS:
+  return "IPA_REF_ALIAS";
+  break;
+default:
+  return "";
+}
+}
diff --git a/gcc/ipa-ref.h b/gcc/ipa-ref.h
index 1de5bd34b82..95e29605548 100644
--- a/gcc/ipa-ref.h
+++ b/gcc/ipa-ref.h
@@ -137,4 +137,7 @@ public:
   vec<ipa_ref_ptr>  GTY((skip)) referring;
 };

+const char *
+stringify_ipa_ref_use (const ipa_ref_use use);
+
 #endif /* GCC_IPA_REF_H */
--
2.18.1



Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-29 Thread Segher Boessenkool
On Fri, May 29, 2020 at 06:05:14PM +0100, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> > On Fri, May 29, 2020 at 02:43:12PM +0200, Richard Biener wrote:
> >> Segher - do you actually know this code to guess why the patterns are 
> >> defensive?
> >
> > Yes.
> 
> In that case, can you give a specific example in which the patterns do
> actually fail?

That is a very different question.  (And this is shifting the burden of
proof again.)

> I think Richard's point is that even the current compiler will ICE if
> the vcond* patterns fail.  All Martin's patch did was expose that via
> the extra static checking we get for directly-mapped internal fns.

How will they ICE?

> If you want us to fix that by providing a fallback, we need to know what
> the fallback should do.

Just whatever vcond* is documented to do, of course ;-)

> E.g. the obvious thing would be to emit the
> embedded comparison separately and then emit bitwise operations to
> implement the select.  But in the powerpc case, it's actually the
> comparison that's the potential problem, so that expansion would just
> kick the can further down the road.
> 
> So which vector comparisons don't powerpc support, and what should the
> fallback vcond* expansion for them be?

It depends on which set of vector registers is in use, and on the ISA
version as well, what the hardware can do.  What the backend can do --
well, it is allowed to FAIL these patterns, and it sometimes does.
That's the whole point isn't it?

vec_cmp* won't FAIL.  I don't know if there is a portable variant of
this?


Segher


[patch 0/5] ipa-initcall-cp

2020-05-29 Thread Erick Ochoa

Hello everyone,

I wanted to highlight this ticket on bugzilla [0]. It is a missed 
optimization that I worked on. It involves propagating constants across 
function calls at link-time. I am relatively new to GCC and this would 
be my first significant contribution. I have developed a prototype of 
the solution that seems to work well enough to compile and run CPU2017 
intrate benchmarks correctly. I am now in the process of running the 
full suite. Feedback would be appreciated.


Here's an overview of how it works:

ipa-initcall-cp collects all variables with static lifetime that contain 
only a single constant write. Then, for each read of the variable, we 
determine (statically) whether the write occurs before the read. In order 
to do this, we use both the dominators graph and the static call graph. If 
the write occurs before all reads, then we can safely substitute each read 
with the constant being written to the variable. ipa-initcall-cp works 
across function calls, so the read and write do not need to occur in the 
same function.
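
As an illustration only, here is a minimal sketch of the kind of source
pattern the overview describes. The names (buffer_size, init_config,
bytes_needed, run) are made up for this example and are not taken from the
patch or from the bugzilla ticket. It shows a static-lifetime variable with
a single constant write in one function and a read in another function that
runs afterwards; whether the pass actually substitutes the constant depends
on its analysis.

```c
#include <assert.h>

/* Hypothetical variable with static lifetime and exactly one constant
   write in the whole program.  */
static int buffer_size;

/* The only write to buffer_size.  */
void
init_config (void)
{
  buffer_size = 4096;
}

/* A read located in a different function.  If the write is known to
   happen before this read, the load of buffer_size could in principle
   be replaced by the constant 4096.  */
int
bytes_needed (int count)
{
  assert (count >= 0);
  return count * buffer_size;
}

/* Caller that fixes the order: the write happens first, then the read.  */
int
run (int count)
{
  init_config ();
  return bytes_needed (count);
}
```

Here the ordering is established through the call graph (run calls
init_config before bytes_needed), which is the situation where the read and
the write are not in the same function.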


There are some issues which still need to be addressed. In particular, 
ipa-initcall-cp is at the moment just a prototype, and as such it will 
stream in functions and modify them during WPA. I would like to fix 
this; however, I am not sure how to properly use clones when modifying 
the function body.


I have tested this with my patch applied to GCC version 10.1. If you 
have any questions or comments, let me know.


[0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92538


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-29 Thread Richard Sandiford
Segher Boessenkool  writes:
> On Fri, May 29, 2020 at 05:57:13PM +0100, Richard Sandiford wrote:
>> Segher Boessenkool  writes:
>> > On Fri, May 29, 2020 at 02:17:00PM +0200, Richard Biener wrote:
>> >> Now it looks like that those verification also simply checks optab
>> >> availability only but then this is just a preexisting issue (and we can
>> >> possibly build a testcase that FAILs RTL expansion for power...).
>> >> 
>> >> So given that this means the latent bug in the powerpc backend
>> >> should be fixed and we should use a direct internal function instead?
>> >
>> > I don't see what you consider a bug in the backend here?  The expansion
>> > FAILs, and it is explicitly allowed to do that.
>> 
>> Well, the docs say:
>> 
>>   …  For **certain** named patterns, it may invoke @code{FAIL} to tell the
>>   compiler to use an alternate way of performing that task.  …
>> 
>> (my emphasis).  Later on they say:
>> 
>>   @findex FAIL
>>   @item FAIL
>>   …
>> 
>>   Failure is currently supported only for binary (addition, multiplication,
>>   shifting, etc.) and bit-field (@code{extv}, @code{extzv}, and @code{insv})
>>   operations.
>> 
>> which explicitly says that vcond* isn't allowed to fail.
>> 
>> OK, so that list looks out of date.  But still. :-)
>> 
>> We now explicitly say that some patterns aren't allowed to FAIL,
>> which I guess gives the (implicit) impression that all the others can.
>> But that wasn't the intention.  The lines were just added for emphasis.
>> (AFAIK 7f9844caf1ebd513 was the first patch to do this.)
>
> Most patterns *do* FAIL on some target.  We cannot rewind time.

Sure.  But the point is that FAILing isn't “explicitly allowed” for vcond*.
In fact it's the opposite.

If we ignore the docs and look at what the status quo actually is --
which I agree seems safest for GCC :-) -- then patterns are allowed to
FAIL if target-independent code provides an expand-time fallback for
the FAILing case.  But that isn't true for vcond either.
expand_vec_cond_expr does:

  icode = get_vcond_icode (mode, cmp_op_mode, unsignedp);
  if (icode == CODE_FOR_nothing)
...

  comparison = vector_compare_rtx (VOIDmode, tcode, op0a, op0b, unsignedp,
   icode, 4);
  rtx_op1 = expand_normal (op1);
  rtx_op2 = expand_normal (op2);

  create_output_operand (&ops[0], target, mode);
  create_input_operand (&ops[1], rtx_op1, mode);
  create_input_operand (&ops[2], rtx_op2, mode);
  create_fixed_operand (&ops[3], comparison);
  create_fixed_operand (&ops[4], XEXP (comparison, 0));
  create_fixed_operand (&ops[5], XEXP (comparison, 1));
  expand_insn (icode, 6, ops);
  return ops[0].value;

which ICEs if the expander FAILs.

So whether you go from the docs or from what's actually implemented,
vcond* isn't currently allowed to FAIL.  All Richard's gcc_unreachable
suggestion would do is change where the ICE happens.

Richard


Re: [PATCH] c++: Reject some further reinterpret casts in constexpr [PR82304, PR95307]

2020-05-29 Thread Jason Merrill via Gcc-patches

On 5/29/20 6:25 AM, Jakub Jelinek wrote:

Hi!

cxx_eval_outermost_constant_expr had a check for reinterpret_casts from
pointers (well, it checked from ADDR_EXPRs) to integral type, but that
only caught such cases at the toplevel of expressions.
As the comment said, it should be done even inside of the expressions,
but at the point of the writing e.g. pointer differences used to be a
problem.  We now have POINTER_DIFF_EXPR, so this is no longer an issue.

Had to do it just for CONVERT_EXPR, because the FE emits NOP_EXPR casts
from pointers to integrals in various spots, e.g. for the PMR & 1 tests,
though on NOP_EXPR we have the REINTERPRET_CAST_P bit that we do check,
while on CONVERT_EXPR we don't.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
PR92411 is not fixed by this change though.

2020-05-29  Jakub Jelinek  

PR c++/82304
PR c++/95307
* constexpr.c (cxx_eval_constant_expression): Diagnose CONVERT_EXPR
conversions from pointer types to arithmetic types here...
(cxx_eval_outermost_constant_expr): ... instead of here.

* g++.dg/template/pr79650.C: Expect different diagnostics and expect
it on all lines that do pointer to integer casts.
* g++.dg/cpp1y/constexpr-shift1.C: Expect different diagnostics.
* g++.dg/cpp1y/constexpr-82304.C: New test.
* g++.dg/cpp0x/constexpr-95307.C: New test.

--- gcc/cp/constexpr.c.jj   2020-05-28 23:12:19.715303826 +0200
+++ gcc/cp/constexpr.c  2020-05-29 12:02:06.161656532 +0200
@@ -6194,6 +6194,18 @@ cxx_eval_constant_expression (const cons
if (VOID_TYPE_P (type))
  return void_node;
  
+	if (TREE_CODE (t) == CONVERT_EXPR

+   && ARITHMETIC_TYPE_P (type)
+   && INDIRECT_TYPE_P (TREE_TYPE (op)))
+ {
+   if (!ctx->quiet)
+ error ("conversion from pointer type %qT "
+"to arithmetic type %qT in a constant expression",
+TREE_TYPE (op), type);
+   *non_constant_p = true;
+   return t;
+ }
+
if (TREE_CODE (op) == PTRMEM_CST && !TYPE_PTRMEM_P (type))
  op = cplus_expand_constant (op);
  
@@ -6795,19 +6807,6 @@ cxx_eval_outermost_constant_expr (tree t

non_constant_p = true;
  }
  
-  /* Technically we should check this for all subexpressions, but that

- runs into problems with our internal representation of pointer
- subtraction and the 5.19 rules are still in flux.  */
-  if (CONVERT_EXPR_CODE_P (TREE_CODE (r))
-  && ARITHMETIC_TYPE_P (TREE_TYPE (r))
-  && TREE_CODE (TREE_OPERAND (r, 0)) == ADDR_EXPR)
-{
-  if (!allow_non_constant)
-   error ("conversion from pointer type %qT "
-  "to arithmetic type %qT in a constant expression",
-  TREE_TYPE (TREE_OPERAND (r, 0)), TREE_TYPE (r));
-  non_constant_p = true;
-}
  
if (!non_constant_p && overflow_p)

  non_constant_p = true;
--- gcc/testsuite/g++.dg/template/pr79650.C.jj  2020-01-12 11:54:37.249400796 +0100
+++ gcc/testsuite/g++.dg/template/pr79650.C 2020-05-29 12:02:06.180656252 +0200
@@ -11,10 +11,10 @@ foo ()
static int a, b;
  lab1:
  lab2:
-  A<(intptr_t)& - (__INTPTR_TYPE__)&> c;   // { dg-error "not a 
constant integer" }
-  A<(intptr_t)& - (__INTPTR_TYPE__)&> d;
-  A<(intptr_t) - (intptr_t)> e;  // { dg-error "is not a 
constant expression" }
-  A<(intptr_t) - (intptr_t)> f;
-  A<(intptr_t)sizeof(a) + (intptr_t)> g;   // { dg-error "not a constant 
integer" }
+  A<(intptr_t)& - (__INTPTR_TYPE__)&> c;   // { dg-error "conversion 
from pointer type" }
+  A<(intptr_t)& - (__INTPTR_TYPE__)&> d;   // { dg-error "conversion 
from pointer type" }
+  A<(intptr_t) - (intptr_t)> e;  // { dg-error "conversion 
from pointer type" }
+  A<(intptr_t) - (intptr_t)> f;  // { dg-error "conversion 
from pointer type" }
+  A<(intptr_t)sizeof(a) + (intptr_t)> g;   // { dg-error "conversion from 
pointer type" }
A<(intptr_t)> h;// { dg-error 
"conversion from pointer type" }
  }
--- gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C.jj    2020-01-12 11:54:37.115402818 +0100
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C   2020-05-29 12:02:06.180656252 +0200
@@ -3,7 +3,8 @@
  constexpr int p = 1;
  constexpr __PTRDIFF_TYPE__ bar (int a)
  {
-  return ((__PTRDIFF_TYPE__) ) << a; // { dg-error "is not a constant 
expression" }
+  return ((__PTRDIFF_TYPE__) ) << a;
  }
  constexpr __PTRDIFF_TYPE__ r = bar (2); // { dg-message "in .constexpr. expansion 
of" }
+   // { dg-error "conversion from pointer" 
"" { target *-*-* } .-1 }
  constexpr __PTRDIFF_TYPE__ s = bar (0); // { dg-error "conversion from 
pointer" }


This is a diagnostic quality regression, moving the error message away 
from the line where the actual problem is.


Maybe use 

Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-29 Thread Segher Boessenkool
On Fri, May 29, 2020 at 05:57:13PM +0100, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> > On Fri, May 29, 2020 at 02:17:00PM +0200, Richard Biener wrote:
> >> Now it looks like that those verification also simply checks optab
> >> availability only but then this is just a preexisting issue (and we can
> >> possibly build a testcase that FAILs RTL expansion for power...).
> >> 
> >> So given that this means the latent bug in the powerpc backend
> >> should be fixed and we should use a direct internal function instead?
> >
> > I don't see what you consider a bug in the backend here?  The expansion
> > FAILs, and it is explicitly allowed to do that.
> 
> Well, the docs say:
> 
>   …  For **certain** named patterns, it may invoke @code{FAIL} to tell the
>   compiler to use an alternate way of performing that task.  …
> 
> (my emphasis).  Later on they say:
> 
>   @findex FAIL
>   @item FAIL
>   …
> 
>   Failure is currently supported only for binary (addition, multiplication,
>   shifting, etc.) and bit-field (@code{extv}, @code{extzv}, and @code{insv})
>   operations.
> 
> which explicitly says that vcond* isn't allowed to fail.
> 
> OK, so that list looks out of date.  But still. :-)
> 
> We now explicitly say that some patterns aren't allowed to FAIL,
> which I guess gives the (implicit) impression that all the others can.
> But that wasn't the intention.  The lines were just added for emphasis.
> (AFAIK 7f9844caf1ebd513 was the first patch to do this.)

Most patterns *do* FAIL on some target.  We cannot rewind time.


Segher


PING**(5./7.) [patch, fortran] Fix memory leaks for finalized types

2020-05-29 Thread Thomas Koenig via Gcc-patches

Am 24.05.20 um 20:55 schrieb Thomas Koenig via Fortran:

Hello world,

this patch fixes an 8/9/10/11 regression, where finalized types
were not finalized (and deallocated), which led to memory
leaks.


Hi,

OK for trunk? The patch is simple enough (and the regression bad enough)
that I'll commit on Sunday unless there are any objections.

Regards

Thomas


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-29 Thread Richard Sandiford
Segher Boessenkool  writes:
> On Fri, May 29, 2020 at 02:43:12PM +0200, Richard Biener wrote:
>> So I tried to understand the circumstances the rs6000 patterns FAIL
>> but FAILed ;)  It looks like some outs of rs6000_emit_vector_cond_expr
>> are unwarranted and the following should work:
>> 
>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>> index 8435bc15d72..5503215a00a 100644
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -14638,8 +14638,7 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
>
> (Different function, btw)
>
>> rtx mask2;
>> 
>> rev_code = reverse_condition_maybe_unordered (rcode);
>> -   if (rev_code == UNKNOWN)
>> - return NULL_RTX;
>> +   gcc_assert (rev_code != UNKNOWN);
>
> reverse_condition_maybe_unordered is documented as possibly returning
> UNKNOWN.  The current implementation doesn't, sure.  But fix that first?
>
> rs6000_emit_vector_compare can fail for several other reasons, too --
> including when rs6000_emit_vector_compare_inner fails.
>
>> @@ -14737,8 +14736,7 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx
>> op_true, rtx op_false,
>>rtx cond2;
>>bool invert_move = false;
>> 
>> -  if (VECTOR_UNIT_NONE_P (dest_mode))
>> -return 0;
>> +  gcc_assert (VECTOR_UNIT_NONE_P (dest_mode));
>
> Why can this condition never be true?  (Missing a ! btw)
>
> It needs a big comment if you want to make wide assumptions like that,
> in any case.  Pretty much *all* (non-trivial) asserts need an explanation.
>
> (And perhaps VECTOR_UNIT_ALTIVEC_OR_VSX_P is better).
>
>>   /* Get the vector mask for the given relational operations.  */
>>   mask = rs6000_emit_vector_compare (rcode, cc_op0, cc_op1, mask_mode);
>> 
>>   if (!mask)
>> return 0;
>> 
>> fail but that function recurses heavily - from reading
>> rs6000_emit_vector_compare_inner
>> it looks like power can do a lot of compares but floating-point LT which
>> reverse_condition_maybe_unordered would turn into UNGE which is not
>> handled either.
>> But then rs6000_emit_vector_compare just tries GT for that anyway (not UNGE) 
>> so
>> it is actually be handled (but should not?).
>> 
>> So I bet the expansion of the patterns cannot fail at the moment.  Thus I'd
>> replace the FAIL with a gcc_unreachable () and see if we have test
>> coverage for those
>> FAILs.
>
> I am not comfortable with that at all.
>
>> Segher - do you actually know this code to guess why the patterns are 
>> defensive?
>
> Yes.

In that case, can you give a specific example in which the patterns do
actually fail?

I think Richard's point is that even the current compiler will ICE if
the vcond* patterns fail.  All Martin's patch did was expose that via
the extra static checking we get for directly-mapped internal fns.
If you want us to fix that by providing a fallback, we need to know what
the fallback should do.  E.g. the obvious thing would be to emit the
embedded comparison separately and then emit bitwise operations to
implement the select.  But in the powerpc case, it's actually the
comparison that's the potential problem, so that expansion would just
kick the can further down the road.
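
For concreteness, the "obvious" fallback mentioned above (compare first,
then select with bitwise operations) can be sketched in plain C on arrays
standing in for vector lanes. This is only an illustration of the idea,
not GCC expander code; the function names compare_gt and select_bitwise
and the 32-bit lane width are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lane-wise signed a[i] > b[i].  Each mask lane is all-ones when the
   comparison is true and all-zeros when it is false.  */
void
compare_gt (const int32_t *a, const int32_t *b, uint32_t *mask, size_t n)
{
  for (size_t i = 0; i < n; i++)
    mask[i] = a[i] > b[i] ? UINT32_MAX : 0;
}

/* Lane-wise select using only AND, OR and NOT:
   result = (mask & if_true) | (~mask & if_false).  */
void
select_bitwise (const uint32_t *mask, const uint32_t *if_true,
                const uint32_t *if_false, uint32_t *result, size_t n)
{
  assert (n == 0 || (mask && if_true && if_false && result));
  for (size_t i = 0; i < n; i++)
    result[i] = (mask[i] & if_true[i]) | (~mask[i] & if_false[i]);
}
```

Note that the select step needs nothing beyond bitwise operations; all of
the target-specific difficulty sits in the comparison step, which is the
point made above.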

So which vector comparisons don't powerpc support, and what should the
fallback vcond* expansion for them be?

Richard


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-29 Thread Richard Sandiford
Segher Boessenkool  writes:
> On Fri, May 29, 2020 at 02:17:00PM +0200, Richard Biener wrote:
>> Now it looks like that those verification also simply checks optab
>> availability only but then this is just a preexisting issue (and we can
>> possibly build a testcase that FAILs RTL expansion for power...).
>> 
>> So given that this means the latent bug in the powerpc backend
>> should be fixed and we should use a direct internal function instead?
>
> I don't see what you consider a bug in the backend here?  The expansion
> FAILs, and it is explicitly allowed to do that.

Well, the docs say:

  …  For **certain** named patterns, it may invoke @code{FAIL} to tell the
  compiler to use an alternate way of performing that task.  …

(my emphasis).  Later on they say:

  @findex FAIL
  @item FAIL
  …

  Failure is currently supported only for binary (addition, multiplication,
  shifting, etc.) and bit-field (@code{extv}, @code{extzv}, and @code{insv})
  operations.

which explicitly says that vcond* isn't allowed to fail.

OK, so that list looks out of date.  But still. :-)

We now explicitly say that some patterns aren't allowed to FAIL,
which I guess gives the (implicit) impression that all the others can.
But that wasn't the intention.  The lines were just added for emphasis.
(AFAIK 7f9844caf1ebd513 was the first patch to do this.)

Richard


[PATCH] Prefer simple case changes in spelling suggestions

2020-05-29 Thread Tom Tromey
I got this error message when editing gcc and recompiling:

../../gcc/gcc/ada/gcc-interface/decl.c:7714:39: error: 
‘DWARF_GNAT_ENCODINGS_all’ was not declared in this scope; did you mean 
‘DWARF_GNAT_ENCODINGS_GDB’?
 7714 | = debug_info && gnat_encodings == DWARF_GNAT_ENCODINGS_all;
  |   ^~~~
  |   DWARF_GNAT_ENCODINGS_GDB

This suggestion could be improved -- what happened here is that I
failed to upper-case the word, and DWARF_GNAT_ENCODINGS_ALL was the
correct spelling.

This patch changes gcc's spell checker to prefer simple case changes
when possible.

I tested this using the self-tests.  A new self-test is also included.
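
To illustrate the idea outside of GCC, here is a small standalone sketch
of an edit distance where a case-only substitution costs less than any
other edit.  It borrows the CASE_COST and BASE_COST values from the patch
below, but it is a simplified stand-in rather than the patched
get_edit_distance: it uses the C library's tolower instead of TOLOWER,
leaves out transpositions, and the function name weighted_edit_distance is
made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Costs as in the patch: a case change is cheaper than any other edit.  */
enum { CASE_COST = 1, BASE_COST = 2 };

static int
min3 (int a, int b, int c)
{
  int m = a < b ? a : b;
  return m < c ? m : c;
}

/* Return the weighted edit distance between S and T (insertions,
   deletions and substitutions only), or -1 if allocation fails.  */
int
weighted_edit_distance (const char *s, const char *t)
{
  assert (s && t);
  size_t len_s = strlen (s);
  size_t len_t = strlen (t);
  int *prev = malloc ((len_s + 1) * sizeof *prev);
  int *next = malloc ((len_s + 1) * sizeof *next);
  if (!prev || !next)
    {
      free (prev);
      free (next);
      return -1;
    }

  /* First row: empty target, so delete every character of the source.  */
  for (size_t j = 0; j <= len_s; j++)
    prev[j] = (int) j * BASE_COST;

  for (size_t i = 0; i < len_t; i++)
    {
      /* First column: empty source, so insert i + 1 characters.  */
      next[0] = (int) (i + 1) * BASE_COST;
      for (size_t j = 0; j < len_s; j++)
        {
          int cost;
          if (s[j] == t[i])
            cost = 0;
          else if (tolower ((unsigned char) s[j])
                   == tolower ((unsigned char) t[i]))
            cost = CASE_COST;
          else
            cost = BASE_COST;
          next[j + 1] = min3 (next[j] + BASE_COST,      /* deletion */
                              prev[j + 1] + BASE_COST,  /* insertion */
                              prev[j] + cost);          /* substitution */
        }
      int *tmp = prev;
      prev = next;
      next = tmp;
    }

  int result = prev[len_s];
  free (prev);
  free (next);
  return result;
}
```

With these costs, DWARF_GNAT_ENCODINGS_all is at distance 3 from
DWARF_GNAT_ENCODINGS_ALL (three case changes) but at distance 6 from
DWARF_GNAT_ENCODINGS_GDB (three ordinary substitutions), so the case-only
candidate wins; with unit costs both would be at distance 3.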

gcc/ChangeLog:

* spellcheck.c (CASE_COST): New define.
(BASE_COST): New define.
(get_edit_distance): Recognize case changes.
(get_edit_distance_cutoff): Update.
(test_edit_distances): Update.
(get_old_cutoff): Update.
(test_find_closest_string): Add case sensitivity test.
---
 gcc/spellcheck.c | 114 ++-
 1 file changed, 74 insertions(+), 40 deletions(-)

diff --git a/gcc/spellcheck.c b/gcc/spellcheck.c
index 7891260a258..9002617453f 100644
--- a/gcc/spellcheck.c
+++ b/gcc/spellcheck.c
@@ -25,6 +25,12 @@ along with GCC; see the file COPYING3.  If not see
 #include "spellcheck.h"
 #include "selftest.h"
 
+/* Cost of a case transformation.  */
+#define CASE_COST 1
+
+/* Cost of another kind of edit.  */
+#define BASE_COST 2
+
 /* Get the edit distance between the two strings: the minimal
number of edits that are needed to change one string into another,
where edits can be one-character insertions, removals, or substitutions,
@@ -47,9 +53,9 @@ get_edit_distance (const char *s, int len_s,
 }
 
   if (len_s == 0)
-return len_t;
+return BASE_COST * len_t;
   if (len_t == 0)
-return len_s;
+return BASE_COST * len_s;
 
   /* We effectively build a matrix where each (i, j) contains the
  distance between the prefix strings s[0:j] and t[0:i].
@@ -67,7 +73,7 @@ get_edit_distance (const char *s, int len_s,
   /* The first row is for the case of an empty target string, which
  we can reach by deleting every character in the source string.  */
   for (int i = 0; i < len_s + 1; i++)
-v_one_ago[i] = i;
+v_one_ago[i] = i * BASE_COST;
 
   /* Build successive rows.  */
   for (int i = 0; i < len_t; i++)
@@ -83,21 +89,28 @@ get_edit_distance (const char *s, int len_s,
   /* The initial column is for the case of an empty source string; we
 can reach prefixes of the target string of length i
 by inserting i characters.  */
-  v_next[0] = i + 1;
+  v_next[0] = (i + 1) * BASE_COST;
 
   /* Build the rest of the row by considering neighbors to
 the north, west and northwest.  */
   for (int j = 0; j < len_s; j++)
{
- edit_distance_t cost = (s[j] == t[i] ? 0 : 1);
- edit_distance_t deletion = v_next[j] + 1;
- edit_distance_t insertion= v_one_ago[j + 1] + 1;
+ edit_distance_t cost;
+
+ if (s[j] == t[i])
+   cost = 0;
+ else if (TOLOWER (s[j]) == TOLOWER (t[i]))
+   cost = CASE_COST;
+ else
+   cost = BASE_COST;
+ edit_distance_t deletion = v_next[j] + BASE_COST;
+ edit_distance_t insertion= v_one_ago[j + 1] + BASE_COST;
  edit_distance_t substitution = v_one_ago[j] + cost;
  edit_distance_t cheapest = MIN (deletion, insertion);
  cheapest = MIN (cheapest, substitution);
  if (i > 0 && j > 0 && s[j] == t[i - 1] && s[j - 1] == t[i])
{
- edit_distance_t transposition = v_two_ago[j - 1] + 1;
+ edit_distance_t transposition = v_two_ago[j - 1] + BASE_COST;
  cheapest = MIN (cheapest, transposition);
}
  v_next[j + 1] = cheapest;
@@ -185,11 +198,11 @@ get_edit_distance_cutoff (size_t goal_len, size_t candidate_len)
   /* If the lengths are close, then round down.  */
   if (max_length - min_length <= 1)
 /* ...but allow an edit distance of at least 1.  */
-return MAX (max_length / 3, 1);
+return BASE_COST * MAX (max_length / 3, 1);
 
   /* Otherwise, round up (thus giving a little extra leeway to some cases
  involving insertions/deletions).  */
-  return (max_length + 2) / 3;
+  return BASE_COST * (max_length + 2) / 3;
 }
 
 #if CHECKING_P
@@ -228,47 +241,50 @@ test_get_edit_distance_both_ways (const char *a, const char *b,
 static void
 test_edit_distances ()
 {
-  test_get_edit_distance_both_ways ("", "nonempty", strlen ("nonempty"));
-  test_get_edit_distance_both_ways ("saturday", "sunday", 3);
-  test_get_edit_distance_both_ways ("foo", "m_foo", 2);
-  test_get_edit_distance_both_ways ("hello_world", "HelloWorld", 3);
+  test_get_edit_distance_both_ways ("", "nonempty",
+

Re: [PATCH, committed] [9/10/11 Regression] PR fortran/95104 - Segfault on a legal WAIT statement

2020-05-29 Thread Thomas Koenig via Gcc-patches

Am 28.05.20 um 21:58 schrieb Harald Anlauf:

The initial commit for this PR uncovered a latent issue with unit locking
in the Fortran run-time library.  Add check for valid unit.


This only came up because Solaris, unlike Linux, links the pthreads
library by default, so it will not be found in a normal regression
test.

Is there a magic incantation which would let the gfortran testsuite
cycle through -pthread at least once, so this kind of thing can
be found earlier?  Of course, this would have to be restricted
to those platforms which actually support pthreads...

Regards

Thomas


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-29 Thread Segher Boessenkool
On Fri, May 29, 2020 at 02:43:12PM +0200, Richard Biener wrote:
> So I tried to understand the circumstances the rs6000 patterns FAIL
> but FAILed ;)  It looks like some outs of rs6000_emit_vector_cond_expr
> are unwarranted and the following should work:
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 8435bc15d72..5503215a00a 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -14638,8 +14638,7 @@ rs6000_emit_vector_compare (enum rtx_code rcode,

(Different function, btw)

> rtx mask2;
> 
> rev_code = reverse_condition_maybe_unordered (rcode);
> -   if (rev_code == UNKNOWN)
> - return NULL_RTX;
> +   gcc_assert (rev_code != UNKNOWN);

reverse_condition_maybe_unordered is documented as possibly returning
UNKNOWN.  The current implementation doesn't, sure.  But fix that first?

rs6000_emit_vector_compare can fail for several other reasons, too --
including when rs6000_emit_vector_compare_inner fails.

> @@ -14737,8 +14736,7 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx
> op_true, rtx op_false,
>rtx cond2;
>bool invert_move = false;
> 
> -  if (VECTOR_UNIT_NONE_P (dest_mode))
> -return 0;
> +  gcc_assert (VECTOR_UNIT_NONE_P (dest_mode));

Why can this condition never be true?  (Missing a ! btw)

It needs a big comment if you want to make wide assumptions like that,
in any case.  Pretty much *all* (non-trivial) asserts need an explanation.

(And perhaps VECTOR_UNIT_ALTIVEC_OR_VSX_P is better).

>   /* Get the vector mask for the given relational operations.  */
>   mask = rs6000_emit_vector_compare (rcode, cc_op0, cc_op1, mask_mode);
> 
>   if (!mask)
> return 0;
> 
> fail but that function recurses heavily - from reading
> rs6000_emit_vector_compare_inner
> it looks like power can do a lot of compares but floating-point LT which
> reverse_condition_maybe_unordered would turn into UNGE which is not
> handled either.
> But then rs6000_emit_vector_compare just tries GT for that anyway (not UNGE)
> so it is actually handled (but should not?).
> 
> So I bet the expansion of the patterns cannot fail at the moment.  Thus I'd
> replace the FAIL with a gcc_unreachable () and see if we have test
> coverage for those
> FAILs.

I am not comfortable with that at all.

> Segher - do you actually know this code to guess why the patterns are
> defensive?

Yes.


If you want to change the documented semantics of widely used functions,
please propose that?


Segher


Re: [PATCH] c++: satisfaction value of type typedef to bool [PR95386]

2020-05-29 Thread Jason Merrill via Gcc-patches

On 5/29/20 11:59 AM, Patrick Palka wrote:

In the testcase below, the satisfaction value of fn1's constraint
is INTEGER_CST '1' of type BOOLEAN_TYPE value_type, which is a typedef
to the standard boolean_type_node.  But satisfaction_value expects to
see exactly boolean_true_node or integer_one_node, which this value is
neither, causing us to trip over the assert therein.

This patch relaxes satisfaction_value to accept any INTEGER_CST which
satisfies integer_zerop or integer_onep.  (It seems we could get away
with accepting only INTEGER_CSTs of type BOOLEAN_TYPE, but that wouldn't
be a proper relaxation of what the subroutine currently accepts and
would therefore be more risky to backport.)


I think for GCC 11 I'd prefer to restrict it to BOOLEAN_TYPE.  This 
patch is OK for GCC 10.



Passes 'make check-c++', does this look OK to commit to master and to
the GCC 10 branch after a full bootstrap and regtest?

gcc/cp/ChangeLog:

PR c++/95386
* constraint.cc (satisfaction_value): Relax to accept any
INTEGER_CST that satisfies integer_zerop or integer_onep.

gcc/testsuite/ChangeLog:

PR c++/95386
* g++.dg/concepts/pr95386.C: New test.
---
  gcc/cp/constraint.cc|  7 ---
  gcc/testsuite/g++.dg/concepts/pr95386.C | 11 +++
  2 files changed, 15 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/concepts/pr95386.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index eb72bfe5936..5a247cfb738 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2490,11 +2490,12 @@ satisfy_disjunction (tree t, tree args, subst_info info)
  tree
  satisfaction_value (tree t)
  {
-  if (t == error_mark_node)
+  if (t == error_mark_node || t == boolean_true_node || t == 
boolean_false_node)
  return t;
-  if (t == boolean_true_node || t == integer_one_node)
+  gcc_assert (TREE_CODE (t) == INTEGER_CST);
+  if (integer_onep (t))
  return boolean_true_node;
-  if (t == boolean_false_node || t == integer_zero_node)
+  if (integer_zerop (t))
  return boolean_false_node;
  
/* Anything else should be invalid.  */

diff --git a/gcc/testsuite/g++.dg/concepts/pr95386.C 
b/gcc/testsuite/g++.dg/concepts/pr95386.C
new file mode 100644
index 000..3c683e5693c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/pr95386.C
@@ -0,0 +1,11 @@
+// PR c++/95386
+// { dg-do compile { target concepts } }
+
+template  struct blah {
+ typedef bool value_type;
+ constexpr operator value_type() { return false; }
+};
+
+template  void fn1(T) requires (!blah());
+
+void fn2() { fn1(0); }





Re: [PATCH] libgfortran: Export forgotten _gfortran_{,m,s}findloc{0,1}_c10 [PR95390]

2020-05-29 Thread Thomas Koenig via Gcc-patches

Hi Jakub,


Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk and 10.2?


OK. Thanks a lot!

Regards

Thomas


Re: [PATCH v2] c++: Fix bogus -Wparentheses warning [PR95344]

2020-05-29 Thread Jason Merrill via Gcc-patches

On 5/28/20 7:11 PM, Marek Polacek wrote:

On Thu, May 28, 2020 at 05:01:51PM -0400, Jason Merrill wrote:

On 5/26/20 8:25 PM, Marek Polacek wrote:

Since r267272, which added location wrappers, cp_fold loses
TREE_NO_WARNING on a MODIFY_EXPR that finish_parenthesized_expr set, and
that results in a bogus -Wparentheses warning.

I.e., previously we had "b = 1" but now we have "VIEW_CONVERT_EXPR(b) = 1"
and cp_fold_maybe_rvalue folds away the location wrapper and so we do
2718 x = fold_build2_loc (loc, code, TREE_TYPE (x), op0, op1);
in cp_fold and the flag is lost.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10/9?

PR c++/95344
* cp-gimplify.c (cp_fold) : Set TREE_NO_WARNING.

* c-c++-common/Wparentheses-2.c: New test.
---
   gcc/cp/cp-gimplify.c|  5 -
   gcc/testsuite/c-c++-common/Wparentheses-2.c | 18 ++
   2 files changed, 22 insertions(+), 1 deletion(-)
   create mode 100644 gcc/testsuite/c-c++-common/Wparentheses-2.c

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 53d715dcd89..8b505dd878c 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -2745,7 +2745,10 @@ cp_fold (tree x)
x = org_x;
}
 if (code == MODIFY_EXPR && TREE_CODE (x) == MODIFY_EXPR)
-   TREE_THIS_VOLATILE (x) = TREE_THIS_VOLATILE (org_x);
+   {
+ TREE_THIS_VOLATILE (x) = TREE_THIS_VOLATILE (org_x);
+ TREE_NO_WARNING (x) = TREE_NO_WARNING (org_x);
+   }


I wonder if we want to copy these flags lower down for any EXPR_P (x) where
TREE_CODE (x) == code?


Sounds good; I don't think we want to lose those flags when folding in general,
not just for MODIFY_EXPR.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
Since r267272, which added location wrappers, cp_fold loses
TREE_NO_WARNING on a MODIFY_EXPR that finish_parenthesized_expr set, and
that results in a bogus -Wparentheses warning.

I.e., previously we had "b = 1" but now we have "VIEW_CONVERT_EXPR(b) = 1"
and cp_fold_maybe_rvalue folds away the location wrapper and so we do
2718 x = fold_build2_loc (loc, code, TREE_TYPE (x), op0, op1);
in cp_fold and the flag is lost.

PR c++/95344
* cp-gimplify.c (cp_fold) : Don't set
TREE_THIS_VOLATILE here.
(cp_fold): Set it here along with TREE_NO_WARNING.

* c-c++-common/Wparentheses-2.c: New test.
---
  gcc/cp/cp-gimplify.c|  8 ++--
  gcc/testsuite/c-c++-common/Wparentheses-2.c | 18 ++
  2 files changed, 24 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/c-c++-common/Wparentheses-2.c

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 53d715dcd89..d6723e44ec4 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -2744,8 +2744,6 @@ cp_fold (tree x)
  else
x = org_x;
}
-  if (code == MODIFY_EXPR && TREE_CODE (x) == MODIFY_EXPR)
-   TREE_THIS_VOLATILE (x) = TREE_THIS_VOLATILE (org_x);
  
break;
  
@@ -2994,6 +2992,12 @@ cp_fold (tree x)

return org_x;
  }
  
+  if (EXPR_P (x) && TREE_CODE (x) == code)

+{
+  TREE_THIS_VOLATILE (x) = TREE_THIS_VOLATILE (org_x);
+  TREE_NO_WARNING (x) = TREE_NO_WARNING (org_x);
+}
+
if (!c.evaluation_restricted_p ())
  {
fold_cache->put (org_x, x);
diff --git a/gcc/testsuite/c-c++-common/Wparentheses-2.c 
b/gcc/testsuite/c-c++-common/Wparentheses-2.c
new file mode 100644
index 000..1aa5d314ae7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wparentheses-2.c
@@ -0,0 +1,18 @@
+// PR c++/95344 - bogus -Wparentheses warning.
+// { dg-do compile }
+// { dg-options "-Wparentheses" }
+
+#ifndef __cplusplus
+# define bool _Bool
+# define true 1
+# define false 0
+#endif
+
+void
+f (int i)
+{
+  bool b = false;
+  if (i == 99 ? (b = true) : false) // { dg-bogus "suggest parentheses" }
+{
+}
+}

base-commit: 3d8d5ddb539a5254c7ef83414377f4c74c7701d4





[pushed] c++: vptr ubsan and derived class [PR95311].

2020-05-29 Thread Jason Merrill via Gcc-patches
We weren't able to find OBJ_TYPE_REF_OBJECT walking through
OBJ_TYPE_REF_EXPR because we had folded away the ADDR_EXPR.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/95311
PR c++/95221
* class.c (build_vfn_ref): Don't fold the INDIRECT_REF.

gcc/testsuite/ChangeLog:

PR c++/95311
* g++.dg/ubsan/vptr-16.C: New test.
---
 gcc/cp/class.c   |  8 ++--
 gcc/testsuite/g++.dg/ubsan/vptr-16.C | 14 ++
 2 files changed, 20 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ubsan/vptr-16.C

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index bab15524a60..ca492cdbd40 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -729,9 +729,13 @@ build_vtbl_ref (tree instance, tree idx)
 tree
 build_vfn_ref (tree instance_ptr, tree idx)
 {
-  tree aref;
+  tree obtype = TREE_TYPE (TREE_TYPE (instance_ptr));
 
-  aref = build_vtbl_ref (cp_build_fold_indirect_ref (instance_ptr), idx);
+  /* Leave the INDIRECT_REF unfolded so cp_ubsan_maybe_instrument_member_call
+ can find instance_ptr.  */
+  tree ind = build1 (INDIRECT_REF, obtype, instance_ptr);
+
+  tree aref = build_vtbl_ref (ind, idx);
 
   /* When using function descriptors, the address of the
  vtable entry is treated as a function pointer.  */
diff --git a/gcc/testsuite/g++.dg/ubsan/vptr-16.C 
b/gcc/testsuite/g++.dg/ubsan/vptr-16.C
new file mode 100644
index 000..a3db66e9140
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ubsan/vptr-16.C
@@ -0,0 +1,14 @@
+// PR c++/95311
+// { dg-additional-options -fsanitize=undefined }
+
+class a {
+  virtual long b() const;
+};
+class c : a {
+public:
+  long b() const;
+};
+class d : c {
+  long e();
+};
+long d::e() { b(); return 0; }

base-commit: 24663f1f6d709daf8913484914ed01af9f7a480a
-- 
2.18.1



[PATCH] c++: satisfaction value of type typedef to bool [PR95386]

2020-05-29 Thread Patrick Palka via Gcc-patches
In the testcase below, the satisfaction value of fn1's constraint
is INTEGER_CST '1' of type BOOLEAN_TYPE value_type, which is a typedef
to the standard boolean_type_node.  But satisfaction_value expects to
see exactly boolean_true_node or integer_one_node, which this value is
neither, causing us to trip over the assert therein.

This patch relaxes satisfaction_value to accept any INTEGER_CST which
satisfies integer_zerop or integer_onep.  (It seems we could get away
with accepting only INTEGER_CSTs of type BOOLEAN_TYPE, but that wouldn't
be a proper relaxation of what the subroutine currently accepts and
would therefore be more risky to backport.)

Passes 'make check-c++', does this look OK to commit to master and to
the GCC 10 branch after a full bootstrap and regtest?

gcc/cp/ChangeLog:

PR c++/95386
* constraint.cc (satisfaction_value): Relax to accept any
INTEGER_CST that satisfies integer_zerop or integer_onep.

gcc/testsuite/ChangeLog:

PR c++/95386
* g++.dg/concepts/pr95386.C: New test.
---
 gcc/cp/constraint.cc|  7 ---
 gcc/testsuite/g++.dg/concepts/pr95386.C | 11 +++
 2 files changed, 15 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/concepts/pr95386.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index eb72bfe5936..5a247cfb738 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2490,11 +2490,12 @@ satisfy_disjunction (tree t, tree args, subst_info info)
 tree
 satisfaction_value (tree t)
 {
-  if (t == error_mark_node)
+  if (t == error_mark_node || t == boolean_true_node || t == 
boolean_false_node)
 return t;
-  if (t == boolean_true_node || t == integer_one_node)
+  gcc_assert (TREE_CODE (t) == INTEGER_CST);
+  if (integer_onep (t))
 return boolean_true_node;
-  if (t == boolean_false_node || t == integer_zero_node)
+  if (integer_zerop (t))
 return boolean_false_node;
 
   /* Anything else should be invalid.  */
diff --git a/gcc/testsuite/g++.dg/concepts/pr95386.C 
b/gcc/testsuite/g++.dg/concepts/pr95386.C
new file mode 100644
index 000..3c683e5693c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/pr95386.C
@@ -0,0 +1,11 @@
+// PR c++/95386
+// { dg-do compile { target concepts } }
+
+template  struct blah {
+ typedef bool value_type;
+ constexpr operator value_type() { return false; }
+};
+
+template  void fn1(T) requires (!blah());
+
+void fn2() { fn1(0); }
-- 
2.27.0.rc1.5.gae92ac8ae3
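
The difference between the old identity checks and the relaxed value checks can be modelled outside GCC. The following Python sketch is only an analogy, not GCC code: `IntegerCst`, the four module-level constants, and both functions are hypothetical stand-ins for tree nodes and for the two versions of satisfaction_value.

```python
class IntegerCst:
    """Toy stand-in for an INTEGER_CST tree node."""
    def __init__(self, value, type_name):
        self.value = value
        self.type_name = type_name


# Hypothetical counterparts of the shared constant nodes.
BOOLEAN_TRUE = IntegerCst(1, "bool")
BOOLEAN_FALSE = IntegerCst(0, "bool")
INTEGER_ONE = IntegerCst(1, "int")
INTEGER_ZERO = IntegerCst(0, "int")


def satisfaction_value_old(t):
    """Identity comparison only: a fresh node with value 1 is rejected."""
    if t is BOOLEAN_TRUE or t is INTEGER_ONE:
        return BOOLEAN_TRUE
    if t is BOOLEAN_FALSE or t is INTEGER_ZERO:
        return BOOLEAN_FALSE
    raise AssertionError("unexpected satisfaction value")


def satisfaction_value_new(t):
    """Value comparison: any integer constant equal to 0 or 1 is accepted."""
    if t is BOOLEAN_TRUE or t is BOOLEAN_FALSE:
        return t
    if not isinstance(t, IntegerCst):
        raise AssertionError("expected an integer constant")
    if t.value == 1:
        return BOOLEAN_TRUE
    if t.value == 0:
        return BOOLEAN_FALSE
    raise AssertionError("unexpected satisfaction value")
```

A constant built as `IntegerCst(1, "value_type")` is a distinct object from every shared constant, so the old function raises while the new one returns `BOOLEAN_TRUE`.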



Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-29 Thread Segher Boessenkool
On Fri, May 29, 2020 at 02:17:00PM +0200, Richard Biener wrote:
> Now it looks like that those verification also simply checks optab
> availability only but then this is just a preexisting issue (and we can
> possibly build a testcase that FAILs RTL expansion for power...).
> 
> So given that this means the latent bug in the powerpc backend
> should be fixed and we should use a direct internal function instead?

I don't see what you consider a bug in the backend here?  The expansion
FAILs, and it is explicitly allowed to do that.

Not allowed to FAIL are:
-- The "lanes" things;
-- vec_duplicate, vec_series;
-- maskload, maskstore;
-- fmin, fmax;
-- madd and friends;
-- sqrt, rsqrt;
-- fmod, remainder;
-- scalb, ldexp;
-- sin, cos, tan, asin, acos, atan;
-- exp, expm1, exp10, exp2, log, log1p, log10, log2, logb;
-- significand, pow, atan2, floor, btrunc, round, ceil, nearbyint, rint;
-- copysign, xorsign;
-- ffs, clrsb, clz, ctz, popcount, parity.

All vcond* patterns are allowed to fail.

Maybe ours don't *need* to, but that doesn't change a thing.

In general, it is a Very Good Thing if patterns are allowed to fail: if
they are not allowed to fail, they have to duplicate all the code that
the generic expander should have, into every target that needs it.  It
also makes writing a (new) backend easier.
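
The argument for letting patterns fail can be sketched as a fallback chain. This is a toy Python model, not GCC's expander: `expand`, `generic_expand`, `vcond_pattern`, and `TARGET_PATTERNS` are hypothetical names, and returning None stands in for FAIL.

```python
def generic_expand(op, operands):
    """Stand-in for the generic expander: always produces some sequence."""
    return ["generic", op, *operands]


def expand(op, operands, target_patterns):
    """Try the target's pattern first; fall back if it is missing or FAILs."""
    pattern = target_patterns.get(op)
    if pattern is not None:
        insns = pattern(operands)
        if insns is not None:  # None models FAIL
            return insns
    return generic_expand(op, operands)


def vcond_pattern(operands):
    """A target pattern that handles only the easy case and FAILs otherwise."""
    if operands[0] == "gt":
        return ["target-vcond", *operands]
    return None


TARGET_PATTERNS = {"vcond": vcond_pattern}
```

In this model the target only has to encode the cases it handles well; everything else stays in one shared place.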


Segher


Re: [PATCH] Error for missing change description in git_commit.py.

2020-05-29 Thread Martin Liška

I've just pushed that to master.

Jakub: Can you please rsync the script to the server hook?

Thanks,
Martin


Re: [PATCH] diagnostics: Consistently add fixit hint for implicit builtin declaration

2020-05-29 Thread Martin Sebor via Gcc-patches

On 5/28/20 7:13 PM, Mark Wielaard wrote:

Hi Martin,

On Thu, May 28, 2020 at 06:21:39PM -0600, Martin Sebor wrote:

Although few tests bother with it, since you add an option for
the existing warning where there was none before, an even more
exhaustive test than the one you added would also verify the same
option can be used to suppress it (e.g., via #pragma GCC diagnostic
ignored).


OK. How about this variant with an extra
Wbuiltin-declaration-mismatch-ignore.c test?
It FAILS with (test for excess errors) before the patch.
It PASSes with the patch.


It looks good to me but I can't formally approve it.

Martin



Thanks,

Mark





PING [PATCH] warn on uninitialized accesses by function calls (PR 10138)

2020-05-29 Thread Martin Sebor via Gcc-patches

Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545856.html

On 5/15/20 5:31 PM, Martin Sebor wrote:

Besides better buffer overflow checking, the new GCC 10 attribute
access also provides an opportunity to detect other kinds of bugs,
including uninitialized accesses by user-defined functions.
The attached patch implements this enhancement.

In addition, the closely related PR 10138 requests that GCC warn when
passing the address of an uninitialized variable to a const-qualified
pointer function argument.  Const pointers almost always imply a read
access of the object, so the patch also enables the warning in these
cases.  (There are situations when a const pointer doesn't imply it
and the warning takes care not to trigger overly enthusiastically.)
Since pointers often point to allocated objects it seemed natural
(and was surprisingly easy) to also detect uninitialized reads from
those.

For optimum results I slightly enhanced the detection of the referenced
decls and allocations.  In the process, I also noticed and fixed a small
bug in the existing code.   This helps both find more uninitialized
variables and reduce the rate of false positives in existing warnings.

Besides the usual GCC bootstrap/regtest I validated the changes by
building a number of packages, including Binutils/GDB, Glibc, and
the Linux kernel.  It found a decent number of likely bugs (about
half a dozen by my count) but also triggered a few false positives.
One class of such problems was due to the kernel's function

   __check_object_size (const void*, unsigned, bool)

used to validate the sizes of objects without ever accessing them.
To accommodate this idiom the patch adds a new mode to attribute
access: none.

Martin




Re: [PATCH] Port libgccjit to Windows.

2020-05-29 Thread NightStrike via Gcc-patches
On Thu, May 28, 2020, 4:25 PM David Malcolm via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> On Wed, 2020-05-27 at 22:27 -0300, Nicolas Bértolo wrote:
> > > New C++ source files should have a .cc extension.
> > > I hope that at some point we'll rename all the existing .c ones
> > > accordingly.
> >
> > I just couldn't get Make to generate jit-w32.o from jit-w32.cc.
> > It looks for jit-w32.c.
> >
> > I had to leave it with the .c extension.
>
> Fair enough.
>

That's not a good reason to leave it like this. You should get a make
expert to help here.

> I was able to successfully bootstrap and regression test with your
> patch on x86_64-pc-linux-gnu.  I also verified that the result of "make
> install" was not affected for my configuration.
>
> I've pushed your patch to master as
> c83027f32d9cca84959c7d6a1e519a0129731501.
>
> (I had to do a little fixup of the ChangeLog entries to get them to
> work with the new hooks on our git repo)
>
> Thanks again for the patch
> Dave
>
> [1]
> https://docs.microsoft.com/en-us/previous-versions/windows/desktop/legacy/aa379560(v=vs.85)


I don't want to sound confrontational, but I don't think testing this on
Linux and having it reviewed by non-Windows experts is correct. At the very
least, a Windows maintainer (Jon, Kai) should review it for correctness.
I've cc'd them here.


[PATCH] bugzilla-close-candidate.py: Fix sorting of branches.

2020-05-29 Thread Martin Liška

Pushed to master.

maintainer-scripts/ChangeLog:

* bugzilla-close-candidate.py: Fix sorting of branches.
---
 maintainer-scripts/bugzilla-close-candidate.py | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/maintainer-scripts/bugzilla-close-candidate.py 
b/maintainer-scripts/bugzilla-close-candidate.py
index 9c95f2bf3eb..cdace9763e7 100755
--- a/maintainer-scripts/bugzilla-close-candidate.py
+++ b/maintainer-scripts/bugzilla-close-candidate.py
@@ -91,9 +91,10 @@ def search():
 if skip:
 continue
 
-branches = get_branches_by_comments(comments)

-if len(branches):
-branches_str = ','.join(sorted(list(branches)))
+branches = sorted(list(get_branches_by_comments(comments)),
+  key=lambda b: 999 if b is 'master' else int(b))
+if branches:
+branches_str = ','.join(branches)
 print('%-30s%-30s%-40s%-40s%-60s' % 
('https://gcc.gnu.org/PR%d' % id, branches_str, fail, work, b['summary']), 
flush=True)
 ids.append(id)
 
--

2.26.2
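
The sort key used in the patch can be tried in isolation. A minimal sketch, assuming every branch name is either 'master' or a release number given as a string; `branch_sort_key` is a hypothetical helper name. It compares with `==` because whether `is` succeeds on two equal strings depends on interning.

```python
def branch_sort_key(branch):
    """Order release branches numerically and put 'master' last."""
    return 999 if branch == 'master' else int(branch)


# A set, as a stand-in for what get_branches_by_comments might return.
branches = sorted({'10', '9', 'master', '8'}, key=branch_sort_key)
branches_str = ','.join(branches)
```

A plain `sorted()` on the same names would compare them as strings and place '10' before '8' and '9', which is the ordering problem the key avoids.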



Re: [PATCH] aarch64: Change the definition of Pmode [pr95182]

2020-05-29 Thread Richard Earnshaw
On 29/05/2020 13:28, duanbo (C) wrote:
> 
> 
>> -Original Message-
>> From: Andrew Pinski [mailto:pins...@gmail.com]
>> Sent: Monday, May 18, 2020 11:49 AM
>> To: duanbo (C) 
>> Cc: GCC Patches 
>> Subject: Re: [PATCH] aarch64: Change the definition of Pmode [pr95182]
>>
>> On Sun, May 17, 2020 at 8:23 PM duanbo (C)  wrote:
>>>
>>> Hi,
>>>
>>> This changes the definition of Pmode for aarch64 port.
>>> Unlike x86, S390 etc., Pmode is always set to DImode for aarch64 port even
>> under ILP32.
>>> Because of that definition,  machine mode of symbol_ref which is
>> supposed to be SImode becomes DImode under target ILP32.
>>> Definition of Pmode should depend on the current ABI, i.e., SImode for
>> ILP32 and DImode for LP64.
>>> Attached please find the proposed patch .
>>> Bootstrap and tested on aarch64 Linux platform. No new regression
>> witnessed.
>>> Any suggestion?
>>
>> THIS DOES NOT WORK correctly and will never work correctly.  When I was
>> originally writing AARCH64 ILP32 (back in 2013), I went this route first (as it
>> was the fastest way to get it working; I could not wait on ARM's
>> implementation at the time) but I had regressions.
>>
>> The place where it fails was something like:
>> int f(char *g, int t)
>> {
>>   return g[t];
>> }
>>
>> Which you pass -1 for t  as there would be no zero-extend any more.
>> I remember at least one testcase in the testsuite failing due to
>> implementation this way even; I don't remember which one as I did not write
>> it down and it was over 6 years ago.
>>
>> If there was an arch mode which would VAs to be truncated to 32bits, this
>> would be the correct way to implement this.
>>
>> The reason why the other ABIs/targets define Pmode as SImode is because
>> the underlying hardware will extend the VA correctly as Linux will set the
>> arch bit correctly (NOTE MIPS is an example where index'ed load/stores
>> which has a similar issue even on MIPS32 but that is a different story).
>>
>> Also there are other ABIs where Pmode != PTRmode (e.g. IA64-HPUX32).
>> x32 has an option which can select either way.  The ISA on x86_64 supports
>> both cases which is why it can be selected that way; this is unlike AARCH64
>> which cannot.
>>
>> Thanks,
>> Andrew Pinski
>>
>>>
>>> Thanks,
>>> Duanbo
> 
> Hi
> 
> I got your point. The hardware of AARCH64 determines the definition of Pmode. 
> But I didn't figure out the reason why the other ABIs / targets could define 
> Pmode as SImode. 

I don't know precisely, but I would imagine that on X86 (and maybe other
architectures) this works because the 64-bit ISA is simply a superset of
the 32-bit ISA.  So there are load and store operations that take a
32-bit base register for addresses and these can be used even when in
64-bit operational mode.

On AArch64 we don't have that option as all registers used for
addressing in the ISA are 64 bits in size.  Ergo, our only option is to
have Pmode=DImode.
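
Why the width of address arithmetic matters for the quoted `g[t]` example with `t = -1` can be checked with plain integer arithmetic. The sketch below is a Python model under stated assumptions, not a description of any particular instruction: the base address 0x1000 is arbitrary and the helper names are hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sign_extend32(x):
    """Interpret the low 32 bits of x as a signed value."""
    x &= MASK32
    return x - (1 << 32) if x & (1 << 31) else x


def zero_extend32(x):
    """Interpret the low 32 bits of x as an unsigned value."""
    return x & MASK32


base = 0x1000  # pointer g, assumed already zero-extended to 64 bits
t = -1         # the int index

# Index sign-extended, addition done in 64 bits: the intended address.
addr_sext = (base + sign_extend32(t)) & MASK64

# Index zero-extended, addition done in 64 bits: bit 32 ends up set, so
# the result lies outside the 32-bit address space.
addr_zext = (base + zero_extend32(t)) & MASK64

# Addition done in 32 bits: correct only if the result is truncated (or
# zero-extended) to 32 bits before it is used as a 64-bit address.
addr_32 = (base + t) & MASK32
```

Here `addr_sext` and `addr_32` are both 0xFFF, while `addr_zext` is 0x100000FFF.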

> There is not much information about ILP32 on the official website of ARM.
> It would be very helpful if you can provide some useful documents, especially 
> the different hardware implementation between X86 and AARCH64 in this issue.
> I really want to figure out.
> thanks.
> 
> Duanbo
> 

R.


[PATCH] Fix parsing of SVN commits in PRs.

2020-05-29 Thread Martin Liška

Tested and pushed to master.

maintainer-scripts/ChangeLog:

* bugzilla-close-candidate.py: Fix parsing of SVN revisions.
Fix skipping of PRs that contain Can be closed message.
---
 .../bugzilla-close-candidate.py   | 36 +++
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/maintainer-scripts/bugzilla-close-candidate.py 
b/maintainer-scripts/bugzilla-close-candidate.py
index dfd67ac1cbb..9c95f2bf3eb 100755
--- a/maintainer-scripts/bugzilla-close-candidate.py
+++ b/maintainer-scripts/bugzilla-close-candidate.py
@@ -37,23 +37,27 @@ def get_branches_by_comments(comments):
 for c in comments:
 text = c['text']
 lines = text.split('\n')
-for line in lines:
-if 'URL: https://gcc.gnu.org/viewcvs' in line:
-version = 'master'
+if 'URL: https://gcc.gnu.org/viewcvs' in text:
+version = 'master'
+for line in lines:
 if 'branches/gcc-' in line:
 parts = line.strip().split('/')
 parts = parts[1].split('-')
 assert len(parts) == 3
-versions.add(parts[1])
-versions.add(version)
-elif line.startswith('The ') and 'branch has been updated' in line:
-version = 'master'
-name = line.strip().split(' ')[1]
-if '/' in name:
-name = name.split('/')[1]
-assert '-' in name
-version = name.split('-')[1]
-versions.add(version)
+version = parts[1]
+break
+versions.add(version)
+else:
+for line in lines:
+if line.startswith('The ') and 'branch has been updated' in 
line:
+version = 'master'
+name = line.strip().split(' ')[1]
+if '/' in name:
+name = name.split('/')[1]
+assert '-' in name
+version = name.split('-')[1]
+versions.add(version)
+break
 return versions
 
 def get_bugs(query):

@@ -79,9 +83,13 @@ def search():
 keys = list(r['bugs'].keys())
 assert len(keys) == 1
 comments = r['bugs'][keys[0]]['comments']
+skip = False
 for c in comments:
 if closure_question in c['text']:
-continue
+skip = True
+break
+if skip:
+continue
 
 branches = get_branches_by_comments(comments)

 if len(branches):
--
2.26.2



Re: [PATCH] Ensure `-lmsvcrt` precede `-lkernel32`

2020-05-29 Thread Liu Hao via Gcc-patches
On 2020/5/29 22:01, Liu Hao wrote:
> This is necessary as libmsvcrt.a is not a pure import library, but
> also contains some functions that invoke others in KERNEL32.DLL.
> 
>   * config/i386/mingw32.h: Insert -lkernel32 after -lmsvcrt
> ---
>  gcc/config/i386/mingw32.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/i386/mingw32.h b/gcc/config/i386/mingw32.h
> index 1bbabfe8bed..321c30e41cc 100644
> --- a/gcc/config/i386/mingw32.h
> +++ b/gcc/config/i386/mingw32.h
> @@ -165,7 +165,7 @@ along with GCC; see the file COPYING3.  If not see
>  #define REAL_LIBGCC_SPEC \
>"%{mthreads:-lmingwthrd} -lmingw32 \
> " SHARED_LIBGCC_SPEC " \
> -   -lmoldname -lmingwex -lmsvcrt"
> +   -lmoldname -lmingwex -lmsvcrt -lkernel32"
>   #undef STARTFILE_SPEC
>  #define STARTFILE_SPEC "%{shared|mdll:dllcrt2%O%s} \
> 

This patch originates from this discussion on #mingw-w64 on OFTC:

```
[20:09:50]  there is suddenly an unexpected call to
`IsDBCSLeadByteEx()` in winpthreads. Not sure why it gets involved.
[20:13:12] * tchan (~tc...@c-98-220-238-152.hsd1.il.comcast.net) has joined
[20:22:28]  diff'ing the import tables the previous working
binary and now broken binary reveals that the old symbol to `printf` is
gone. seems the mingw-w64 ones is called, which references
`IsDBCSLeadByteEx()` and `WideCharToMultiByte()`.
[20:27:19]  both of those should be provided by -lkernel32 right?
[20:27:36] * Dejan has quit (Quit: Leaving)
[20:34:09]  probably, but I doubt whether it should behave
this way.  when perform cross-compilation the CRT is not available when
building winpthreads.
[20:34:37]  presumably it should always call the MS one.
[20:34:45]  I'm pretty sure you'd first build the crt, then
libraries like winpthreads - the other way around doesn't work
[20:35:16]  :|  let me make a test program.
[20:38:38]  can't reproduce it myself.
[20:41:06]  there may be something wrong with the OP'
[20:41:18]  's configuration.  Normally kernel32 is a default lib.
[20:42:39]  I still think winpthreads should be built with
`CPPFLAGS='-D__USE_MINGW_ANSI_STDIO=0'`. I built a local package and
there is no reference to DBCS or wide char functions.
[20:49:21]   reproduced now:
https://paste.ubuntu.com/p/HwNk8WqgkD/
[20:49:23]  Title: Ubuntu Pastebin (at paste.ubuntu.com)
[20:52:45]  strange:  -lmingw32 -lgcc -lgcc_eh -lmoldname
-lmingwex -lmsvcrt -lpthread -lmcfgthread -ladvapi32 -lshell32 -luser32
-lkernel32 -lmingw32 -lgcc -lgcc_eh -lmoldname -lmingwex -lmsvcrt
[20:53:10]  oh.
[20:53:22]  the mingw-w64 `printf` is defined in mingwex I guess?
[20:53:22]  pthread pulls in objects from msvcrt, which then needs
things from kernel32, but there's no more kernel32 after msvcrt
[20:54:04]  in 211af1e7d4d188dbefacea7af8b83d32b3edb48c I moved
mbrtowc and wcrtomb from mingwex to msvcrt
[20:54:37]  (but that wouldn't make a difference wrt this, as
there's no kernel32 after the first mingwex after -lpthread either)
[20:55:10]  I think it would be good with yet another -lkernel32
after -lmsvcrt
[20:55:27]  after all, that's the way they are layered anyway - the
crt runs on top of kernel32
[20:56:32]  and we want to have the freedom to have object file
implementations in libmsvcrt.a
[20:58:21]  some of these -l things are hard-coded in GCC
default specs.
[20:58:48]  I only found `-ladvapi32 -lshell32 -luser32
-lkernel32`. The list ends there.
[20:59:11]  not sure how those additional libraries were added.
[21:00:04]  lld is nice in this aspect, that it doesn't need static
libraries ordered like this; for each undefined, it searches the list of
static libraries from the start
[21:01:07]  LD is dumb. :(
[21:02:53]  I thought MSVCRT was only an import library. It
seems more complicated.
[21:04:18]  it has (almost) always been more than that - there's
been some stub functions that call GetProcAddress() and try to
conditionally load functions if available
[21:04:53]  and especially with ucrt, we want to move quite a bit
of things from libmingwex.a to libmsvcrt-os.a, for things where we can
and should use the ucrt equivalent instead of statically linking in our own
[21:05:32]  GetProcAdress() requires a successive -lkernel32 too.
[21:05:45]  indeed
[21:06:56]  so this suddenly becomes a GCC issue in its
default specs:  `-lkernel32` is required after `-lmsvcrt`.
[21:07:33]  yes, pretty much. clang has got the same structure as well
[21:07:44]  (which matters for cases when using clang on top of ld.bfd)
[21:08:48]  looks like it's REAL_LIBGCC_SPEC in
gcc/config/i386/mingw32.h that needs to be updated
```
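
The ordering problem discussed in the log can be reproduced with a toy model of a linker that scans static libraries once, left to right. This is a simplified Python sketch, not how ld.bfd is implemented: the archive contents and the symbol chain are assumptions loosely based on the log, and `link` is a hypothetical helper.

```python
def link(initial_undefined, archives, order):
    """Scan archives once in the given order; return symbols left undefined.

    archives maps a library name to {symbol: set of symbols it references}.
    """
    undefined = set(initial_undefined)
    defined = set()
    for name in order:
        members = archives[name]
        progress = True
        # Keep pulling members from this archive while doing so resolves
        # something; earlier archives are never revisited.
        while progress:
            progress = False
            for sym in sorted(undefined & set(members)):
                defined.add(sym)
                undefined.discard(sym)
                undefined |= members[sym] - defined
                progress = True
    return undefined


# Assumed, simplified contents: each symbol maps to what it references.
ARCHIVES = {
    'pthread':  {'pthread_create': {'printf'}},
    'mingwex':  {'printf': {'mbrtowc'}},
    'msvcrt':   {'mbrtowc': {'IsDBCSLeadByteEx'}},
    'kernel32': {'IsDBCSLeadByteEx': set()},
}

OLD_ORDER = ['pthread', 'kernel32', 'mingwex', 'msvcrt']
NEW_ORDER = OLD_ORDER + ['kernel32']

unresolved_old = link({'pthread_create'}, ARCHIVES, OLD_ORDER)
unresolved_new = link({'pthread_create'}, ARCHIVES, NEW_ORDER)
```

With the old order the only kernel32 is scanned before anything needs it, so `IsDBCSLeadByteEx` stays unresolved; appending one more kernel32 after msvcrt leaves nothing unresolved.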



-- 
Best regards,
LH_Mouse





[PATCH] Ensure `-lmsvcrt` precede `-lkernel32`

2020-05-29 Thread Liu Hao via Gcc-patches
This is necessary as libmsvcrt.a is not a pure import library, but
also contains some functions that invoke others in KERNEL32.DLL.

* config/i386/mingw32.h: Insert -lkernel32 after -lmsvcrt
---
 gcc/config/i386/mingw32.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/i386/mingw32.h b/gcc/config/i386/mingw32.h
index 1bbabfe8bed..321c30e41cc 100644
--- a/gcc/config/i386/mingw32.h
+++ b/gcc/config/i386/mingw32.h
@@ -165,7 +165,7 @@ along with GCC; see the file COPYING3.  If not see
 #define REAL_LIBGCC_SPEC \
   "%{mthreads:-lmingwthrd} -lmingw32 \
" SHARED_LIBGCC_SPEC " \
-   -lmoldname -lmingwex -lmsvcrt"
+   -lmoldname -lmingwex -lmsvcrt -lkernel32"
  #undef STARTFILE_SPEC
 #define STARTFILE_SPEC "%{shared|mdll:dllcrt2%O%s} \
-- 
2.26.2






Re: [PATCH] aarch64: Change the definition of Pmode [pr95182]

2020-05-29 Thread H.J. Lu via Gcc-patches
On Fri, May 29, 2020 at 6:45 AM duanbo (C)  wrote:
>
>
>
> > -Original Message-
> > From: Andrew Pinski [mailto:pins...@gmail.com]
> > Sent: Monday, May 18, 2020 11:49 AM
> > To: duanbo (C) 
> > Cc: GCC Patches 
> > Subject: Re: [PATCH] aarch64: Change the definition of Pmode [pr95182]
> >
> > On Sun, May 17, 2020 at 8:23 PM duanbo (C)  wrote:
> > >
> > > Hi,
> > >
> > > This changes the definition of Pmode for aarch64 port.
> > > Unlike x86, S390 etc., Pmode is always set to DImode for aarch64 port even
> > under ILP32.
> > > Because of that definition,  machine mode of symbol_ref which is
> > supposed to be SImode becomes DImode under target ILP32.
> > > Definition of Pmode should depend on the current ABI, i.e., SImode for
> > ILP32 and DImode for LP64.
> > > Attached please find the proposed patch .
> > > Bootstrap and tested on aarch64 Linux platform. No new regression
> > witnessed.
> > > Any suggestion?
> >
> > THIS DOES NOT WORK correctly and will never work correctly.  When I was
> > originally writing AARCH64 ILP32 (back in 2013), I went this route first (as it
> > was the fastest way to get it working; I could not wait on ARM's
> > implementation at the time) but I had regressions.
> >
> > The place where it fails was something like:
> > int f(char *g, int t)
> > {
> >   return g[t];
> > }
> >
> > Which you pass -1 for t  as there would be no zero-extend any more.
> > I remember at least one testcase in the testsuite failing due to
> > implementation this way even; I don't remember which one as I did not write
> > it down and it was over 6 years ago.
> >
> > If there was an arch mode which would VAs to be truncated to 32bits, this
> > would be the correct way to implement this.
> >
> > The reason why the other ABIs/targets define Pmode as SImode is because
> > the underlying hardware will extend the VA correctly as Linux will set the
> > arch bit correctly (NOTE MIPS is an example where index'ed load/stores
> > which has a similar issue even on MIPS32 but that is a different story).
> >
> > Also there are other ABIs where Pmode != PTRmode (e.g. IA64-HPUX32).
> > x32 has an option which can select either way.  The ISA on x86_64 supports
> > both cases which is why it can be selected that way; this is unlike AARCH64
> > which cannot.
> >
> > Thanks,
> > Andrew Pinski
> >
> > >
> > > Thanks,
> > > Duanbo
>
> Hi
>
> I got your point. The hardware of AARCH64 determines the definition of Pmode.
> But I have not figured out why the other ABIs/targets can define Pmode as
> SImode.
> There is not much information about ILP32 on the official ARM website.
> It would be very helpful if you could point me to some documents, especially
> on how the hardware implementations of X86 and AARCH64 differ on this issue.
> I really want to figure this out.
> Thanks.

X32 has an option to use DImode for Pmode.  But it never worked 100%
correctly.  The problems are with signed/unsigned extensions for Pmode.


-- 
H.J.


RE: [PATCH] [aarch64] Fix PR94591: GCC generates invalid rev64 insns

2020-05-29 Thread Alex Coplan
> -Original Message-
> From: Richard Sandiford 
> Sent: 29 May 2020 11:59
> To: Alex Coplan 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft ;
> Kyrylo Tkachov 
> Subject: Re: [PATCH] [aarch64] Fix PR94591: GCC generates invalid rev64
> insns
> 
> Alex Coplan  writes:
> > On 19/05/2020 17:59, Richard Sandiford wrote:
> >> Alex Coplan  writes:
> >> > Hello,
> >> >
> >> > This patch fixes PR94591. The problem was the function
> aarch64_evpc_rev_local()
> >> > matching vector permutations that were not reversals. In particular,
> prior to
> >> > this patch, this function matched the identity permutation which led
> to
> >> > generating bogus REV64 insns which were rejected by the assembler.
> >> >
> >> > Testing:
> >> >  - New regression test which passes after applying the patch.
> >> >  - New test passes on an x64 -> aarch64-none-elf cross.
> >> >  - Bootstrap and regtest on aarch64-linux-gnu.
> >> >
> >> > OK to install?
> >> >
> >> > Thanks,
> >> > Alex
> >> >
> >> > ---
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> > 2020-05-19  Alex Coplan  
> >> >
> >> >  PR target/94591
> >> >  * config/aarch64/aarch64.c (aarch64_evpc_rev_local): Don't match
> >> >  identity permutation.
> >> >
> >> > gcc/testsuite/ChangeLog:
> >> >
> >> > 2020-05-19  Alex Coplan  
> >> >
> >> >  PR target/94591
> >> >  * gcc.c-torture/execute/pr94591.c: New test.
> >>
> >> OK, thanks.
> >>
> >> Richard
> >
> > I've just tested this patch on gcc-{8,9,10} release branches:
> > bootstraps+regtests on aarch64-linux-gnu came back clean.
> >
> > Since this was a regression introduced in GCC 8, is it OK to backport
> > the fix to those release branches now?
> 
> Yeah, OK for the branches too.

Installed on the branches.

Thanks,
Alex


[committed][GCC10] amdgcn: fix vcc clobber in vector load/store

2020-05-29 Thread Andrew Stubbs

I've committed this backport of two master commits, originally posted here:

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545771.html
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546800.html

These correct a wrong-code bug that could affect spilled vectors.

Andrew

amdgcn: fix vcc clobber in vector load/store

This switches the code that expands scalar addresses to vectors of addresses
from using VCC to using CC_SAVE_REG, for the lo-part to hi-part carry values.
These were fine in code expanded in earlier passes, but addresses expanded
late, such as for stack spills or reloads, could clobber live VCC values,
causing execution failures.

This is the first target-specific testcase for GCN, so the new .exp file is
included.

2020-05-28  Andrew Stubbs  

	Backport from master:

	2020-05-14  Andrew Stubbs  

	gcc/
	* config/gcn/gcn-valu.md (add3_zext_dup): Change to a
	define_expand, and rename the original to ...
	(add3_vcc_zext_dup): ... this, and add a custom VCC operand.
	(add3_zext_dup_exec): Likewise, with ...
	(add3_vcc_zext_dup_exec): ... this.
	(add3_zext_dup2): Likewise, with ...
	(add3_zext_dup_exec): ... this.
	(add3_zext_dup2_exec): Likewise, with ...
	(add3_zext_dup2): ... this.
	* config/gcn/gcn.c (gcn_expand_scalar_to_vector_address): Switch
	addv64di3_zext* calls to use addv64di3_vcc_zext*.

	gcc/testsuite/
	* testsuite/gcc.target/gcn/gcn.exp: New file.
	* testsuite/gcc.target/gcn/vcc-clobber.c: New file.

	2020-05-28  Andrew Stubbs  

	gcc/
	* config/gcn/gcn-valu.md (add3_vcc_zext_dup): Add early clobber.
	(add3_vcc_zext_dup_exec): Likewise.
	(add3_vcc_zext_dup2): Likewise.
	(add3_vcc_zext_dup2_exec): Likewise.

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index d3badb4059c..dd55c08dae4 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -1379,135 +1379,206 @@ (define_insn_and_split "add3_zext_exec"
   [(set_attr "type" "vmult")
(set_attr "length" "8")])
 
-(define_insn_and_split "add3_zext_dup"
-  [(set (match_operand:V_DI 0 "register_operand""= v,  v")
+(define_insn_and_split "add3_vcc_zext_dup"
+  [(set (match_operand:V_DI 0 "register_operand""=v,v")
 	(plus:V_DI
 	  (zero_extend:V_DI
 	(vec_duplicate:
-	  (match_operand:SI 1 "gcn_alu_operand" "BSv,ASv")))
-	  (match_operand:V_DI 2 "gcn_alu_operand"   "vDA,vDb")))
-   (clobber (reg:DI VCC_REG))]
+	  (match_operand:SI 1 "gcn_alu_operand" "   BSv,  ASv")))
+	  (match_operand:V_DI 2 "gcn_alu_operand"   "   vDA,  vDb")))
+   (set (match_operand:DI 3 "register_operand"	"=&SgcV,&SgcV")
+	(ltu:DI (plus:V_DI 
+		  (zero_extend:V_DI (vec_duplicate: (match_dup 1)))
+		  (match_dup 2))
+		(match_dup 1)))]
   ""
   "#"
   "gcn_can_split_p  (mode, operands[0])
&& gcn_can_split_p (mode, operands[2])"
   [(const_int 0)]
   {
-rtx vcc = gen_rtx_REG (DImode, VCC_REG);
 emit_insn (gen_add3_vcc_dup
 		(gcn_operand_part (mode, operands[0], 0),
 		 gcn_operand_part (DImode, operands[1], 0),
 		 gcn_operand_part (mode, operands[2], 0),
-		 vcc));
+		 operands[3]));
 emit_insn (gen_addc3
 		(gcn_operand_part (mode, operands[0], 1),
 		 gcn_operand_part (mode, operands[2], 1),
-		 const0_rtx, vcc, vcc));
+		 const0_rtx, operands[3], operands[3]));
 DONE;
   }
   [(set_attr "type" "vmult")
(set_attr "length" "8")])
 
-(define_insn_and_split "add3_zext_dup_exec"
-  [(set (match_operand:V_DI 0 "register_operand"		 "= v,  v")
+(define_expand "add3_zext_dup"
+  [(match_operand:V_DI 0 "register_operand")
+   (match_operand:SI 1 "gcn_alu_operand")
+   (match_operand:V_DI 2 "gcn_alu_operand")]
+  ""
+  {
+rtx vcc = gen_rtx_REG (DImode, VCC_REG);
+emit_insn (gen_add3_vcc_zext_dup (operands[0], operands[1],
+	operands[2], vcc));
+DONE;
+  })
+
+(define_insn_and_split "add3_vcc_zext_dup_exec"
+  [(set (match_operand:V_DI 0 "register_operand"	  "=v,v")
 	(vec_merge:V_DI
 	  (plus:V_DI
 	(zero_extend:V_DI
 	  (vec_duplicate:
-		(match_operand:SI 1 "gcn_alu_operand"		 "ASv,BSv")))
-	(match_operand:V_DI 2 "gcn_alu_operand"		 "vDb,vDA"))
-	  (match_operand:V_DI 3 "gcn_register_or_unspec_operand" " U0, U0")
-	  (match_operand:DI 4 "gcn_exec_reg_operand"		 "  e,  e")))
-   (clobber (reg:DI VCC_REG))]
+		(match_operand:SI 1 "gcn_alu_operand"	  "   ASv,  BSv")))
+	(match_operand:V_DI 2 "gcn_alu_operand"	  "   vDb,  vDA"))
+	  (match_operand:V_DI 4 "gcn_register_or_unspec_operand" " U0,   U0")
+	  (match_operand:DI 5 "gcn_exec_reg_operand"	  " e,e")))
+   (set (match_operand:DI 3 "register_operand"		  "=&SgcV,&SgcV")
+	(and:DI
+	  (ltu:DI (plus:V_DI 
+		(zero_extend:V_DI (vec_duplicate: (match_dup 1)))
+		(match_dup 2))
+		  (match_dup 1))
+	  (match_dup 5)))]
   ""
   "#"
   "gcn_can_split_p  (mode, operands[0])
&& gcn_can_split_p (mode, operands[2])
-   && gcn_can_split_p (mode, operands[3])"
+   && gcn_can_split_p (mode, operands[4])"
   [(const_int 0)]
   {
-rtx vcc = gen_rtx_REG (DImode, 

Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-29 Thread Richard Biener via Gcc-patches
On Fri, May 29, 2020 at 2:17 PM Richard Biener
 wrote:
>
> On Thu, May 28, 2020 at 5:28 PM Richard Sandiford
>  wrote:
> >
> > Martin Liška  writes:
> > > Hi.
> > >
> > > There's a new patch that adds normal internal functions for the 4
> > > VCOND* functions.
> > >
> > > The patch survives bootstrap and regression
> > > tests on x86_64-linux-gnu and ppc64le-linux-gnu.
> >
> > I think this has the same problem as the previous one.  What I meant
> > in yesterday's message is that:
> >
> >   expand_insn (icode, 6, ops);
> >
> > is simply not valid when icode is allowed to FAIL.  That's true in
> > any context, not just internal functions.  If icode does FAIL,
> > the expand_insn call will ICE:
> >
> >   if (!maybe_expand_insn (icode, nops, ops))
> > gcc_unreachable ();
> >
> > When using optabs you either:
> >
> > (a) declare that the md patterns aren't allowed to FAIL.  expand_insn
> > is for this case.
> >
> > (b) allow the md patterns to FAIL and provide a fallback when they do.
> > maybe_expand_insn is for this case.
> >
> > So if we keep IFN_VCOND, we need to use maybe_expand_insn and find some
> > way of implementing the IFN_VCOND when the pattern FAILs.
>
> But we should not have generated the pattern in that case - we actually verify
> we can expand at the time we do this "instruction selection".  This is in-line
> with other vectorizations where we also do not expect things to FAIL.
>
> See also the expanders that are removed in the patch.
>
> But adding a comment in the internal function expander to reflect this
> is probably good, also pointing to the verification routines (the
> preexisting expand_vec_cond_expr_p and expand_vec_cmp_expr_p
> routines).  Because of this pre-verification I suggested the direct
> internal function first, not being aware of the static cannot-FAIL logic.
>
> Now it looks like those verification routines also simply check optab
> availability only, but then this is just a preexisting issue (and we can
> possibly build a testcase that FAILs RTL expansion for power...).
>
> So does this mean the latent bug in the powerpc backend
> should be fixed and we should use a direct internal function instead?

So I tried to understand the circumstances under which the rs6000 patterns
FAIL but FAILed ;)  It looks like some early outs of
rs6000_emit_vector_cond_expr are unwarranted and the following should work:

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8435bc15d72..5503215a00a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -14638,8 +14638,7 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
rtx mask2;

rev_code = reverse_condition_maybe_unordered (rcode);
-   if (rev_code == UNKNOWN)
- return NULL_RTX;
+   gcc_assert (rev_code != UNKNOWN);

nor_code = optab_handler (one_cmpl_optab, dmode);
if (nor_code == CODE_FOR_nothing)
@@ -14737,8 +14736,7 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx
op_true, rtx op_false,
   rtx cond2;
   bool invert_move = false;

-  if (VECTOR_UNIT_NONE_P (dest_mode))
-return 0;
+  gcc_assert (!VECTOR_UNIT_NONE_P (dest_mode));

   gcc_assert (GET_MODE_SIZE (dest_mode) == GET_MODE_SIZE (mask_mode)
  && GET_MODE_NUNITS (dest_mode) == GET_MODE_NUNITS (mask_mode));
@@ -14756,8 +14754,7 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx
op_true, rtx op_false,
 e.g., A  = (B != C) ? D : E becomes A = (B == C) ? E : D.  */
   invert_move = true;
   rcode = reverse_condition_maybe_unordered (rcode);
-  if (rcode == UNKNOWN)
-   return 0;
+  gcc_assert (rcode != UNKNOWN);
   break;

 case GE:

which leaves the

  /* Get the vector mask for the given relational operations.  */
  mask = rs6000_emit_vector_compare (rcode, cc_op0, cc_op1, mask_mode);

  if (!mask)
return 0;

fail, but that function recurses heavily.  From reading
rs6000_emit_vector_compare_inner it looks like power can do a lot of
compares except floating-point LT, which
reverse_condition_maybe_unordered would turn into UNGE, which is not
handled either.  But then rs6000_emit_vector_compare just tries GT for
that anyway (not UNGE), so it is actually handled (but should not be?).

So I bet the expansion of the patterns cannot fail at the moment.  Thus I'd
replace the FAIL with a gcc_unreachable () and see if we have test
coverage for those FAILs.

Segher - do you know this code well enough to guess why the patterns are
defensive?

Thanks,
Richard.

> Thanks,
> Richard.
>
> > Thanks,
> > Richard


Re: [PATCH 5/7] vect: Support vector load/store with length in vectorizer

2020-05-29 Thread Segher Boessenkool
Hi!

On Fri, May 29, 2020 at 09:32:49AM +0100, Richard Sandiford wrote:
> There's nothing to stop us using masks and lengths in the same loop
> in future if we need to.  It would “just” be a case of setting up both
> the masks and the lengths in vect_set_loop_condition.  But the point is
> that doing that would be extra code, and there's no point writing that
> extra code until it's needed.

You won't ever get it right even, because you do not know exactly what
will be needed :-)

> If some future arch does support both mask-based and length-based
> approaches, I think that's even less reason to make a binary choice
> between them.  How we prioritise the length and mask approaches when
> both are available is something that we'll have to decide at the time.
> 
> If your concern is that the arch might support masked operations
> without wanting them to be used for loop control, we could test for
> that case by checking whether while_ult_optab is implemented.

Heh, sneaky.  But at least for now it will work fine, and it is local,
and not hard to change later.


Segher


Re: [PATCH 2/2] Provide diagnostic hints for missing C++ cinttypes string constants.

2020-05-29 Thread David Malcolm via Gcc-patches
On Fri, 2020-05-29 at 01:33 +0200, Mark Wielaard wrote:
> Hi,
> 
> On Mon, May 25, 2020 at 12:26:33PM -0400, Jason Merrill wrote:
> > On 5/23/20 8:30 PM, Mark Wielaard wrote:
> > > When reporting an error in cp_parser and we notice a string
> > > literal
> > > followed by an unknown name check whether there is a known
> > > standard
> > > header containing a string macro with the same name, then add a
> > > hint
> > > to the error message to include that header.
> > > 
> > > gcc/c-family/ChangeLog:
> > > 
> > >   * known-headers.cc
> > > (get_cp_stdlib_header_for_string_macro_name):
> > >   New function.
> > >   * known-headers.h (get_c_stdlib_header_for_string_macro_name):
> > 
> > Missing 'p'.
> > 
> > >   New function definition.
> > 
> > Declaration, not definition.
> > 
> > The C++ changes are OK with these ChangeLog corrections.
> 
> Thanks. David, are you OK with the diagnostic changes?

Yes.

> Who can we trick into reviewing the C frontend changes in the 1/2
> patch that this depends on?
> 
> Cheers,
> 
> Mark




RE: [PATCH] aarch64: Change the definition of Pmode [pr95182]

2020-05-29 Thread duanbo (C)


> -Original Message-
> From: Andrew Pinski [mailto:pins...@gmail.com]
> Sent: Monday, May 18, 2020 11:49 AM
> To: duanbo (C) 
> Cc: GCC Patches 
> Subject: Re: [PATCH] aarch64: Change the definition of Pmode [pr95182]
> 
> On Sun, May 17, 2020 at 8:23 PM duanbo (C)  wrote:
> >
> > Hi,
> >
> > This changes the definition of Pmode for aarch64 port.
> > Unlike x86, S390 etc., Pmode is always set to DImode for aarch64 port even
> under ILP32.
> > Because of that definition,  machine mode of symbol_ref which is
> supposed to be SImode becomes DImode under target ILP32.
> > Definition of Pmode should depend on the current ABI, i.e., SImode for
> ILP32 and DImode for LP64.
> > Attached please find the proposed patch .
> > Bootstrap and tested on aarch64 Linux platform. No new regression
> witnessed.
> > Any suggestion?
> 
> THIS DOES NOT WORK correctly and will never work correctly.  When I was
> originally writing AARCH64 ILP32 (back in 2013), I went this route first
> (as it was the fastest way to get it working; I could not wait on ARM's
> implementation at the time) but I had regressions.
> 
> The place where it fails was something like:
> int f(char *g, int t)
> {
>   return g[t];
> }
> 
> This fails when you pass -1 for t, as there would be no zero-extend any more.
> I remember at least one testcase in the testsuite failing due to
> implementing it this way; I don't remember which one as I did not write
> it down and it was over 6 years ago.
> 
> If there were an arch mode which caused VAs to be truncated to 32 bits, this
> would be the correct way to implement this.
> 
> The reason why the other ABIs/targets define Pmode as SImode is because
> the underlying hardware will extend the VA correctly as Linux will set the
> arch bit correctly (NOTE: MIPS is an example where indexed loads/stores
> have a similar issue even on MIPS32, but that is a different story).
> 
> Also there are other ABIs where Pmode != PTRmode (e.g. IA64-HPUX32).
> x32 has an option which can select either way.  The ISA on x86_64 supports
> both cases which is why it can be selected that way; this is unlike AARCH64
> which cannot.
> 
> Thanks,
> Andrew Pinski
> 
> >
> > Thanks,
> > Duanbo

Hi

I got your point. The hardware of AARCH64 determines the definition of Pmode.
But I have not figured out why the other ABIs/targets can define Pmode as
SImode.
There is not much information about ILP32 on the official ARM website.
It would be very helpful if you could point me to some documents, especially
on how the hardware implementations of X86 and AARCH64 differ on this issue.
I really want to figure this out.
Thanks.

Duanbo


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-05-29 Thread Richard Biener via Gcc-patches
On Thu, May 28, 2020 at 5:28 PM Richard Sandiford
 wrote:
>
> Martin Liška  writes:
> > Hi.
> >
> > There's a new patch that adds normal internal functions for the 4
> > VCOND* functions.
> >
> > The patch survives bootstrap and regression
> > tests on x86_64-linux-gnu and ppc64le-linux-gnu.
>
> I think this has the same problem as the previous one.  What I meant
> in yesterday's message is that:
>
>   expand_insn (icode, 6, ops);
>
> is simply not valid when icode is allowed to FAIL.  That's true in
> any context, not just internal functions.  If icode does FAIL,
> the expand_insn call will ICE:
>
>   if (!maybe_expand_insn (icode, nops, ops))
> gcc_unreachable ();
>
> When using optabs you either:
>
> (a) declare that the md patterns aren't allowed to FAIL.  expand_insn
> is for this case.
>
> (b) allow the md patterns to FAIL and provide a fallback when they do.
> maybe_expand_insn is for this case.
>
> So if we keep IFN_VCOND, we need to use maybe_expand_insn and find some
> way of implementing the IFN_VCOND when the pattern FAILs.

But we should not have generated the pattern in that case - we actually verify
we can expand at the time we do this "instruction selection".  This is in-line
with other vectorizations where we also do not expect things to FAIL.

See also the expanders that are removed in the patch.

But adding a comment in the internal function expander to reflect this
is probably good, also pointing to the verification routines (the
preexisting expand_vec_cond_expr_p and expand_vec_cmp_expr_p
routines).  Because of this pre-verification I suggested the direct
internal function first, not being aware of the static cannot-FAIL logic.

Now it looks like those verification routines also simply check optab
availability only, but then this is just a preexisting issue (and we can
possibly build a testcase that FAILs RTL expansion for power...).

So does this mean the latent bug in the powerpc backend
should be fixed and we should use a direct internal function instead?

Thanks,
Richard.

> Thanks,
> Richard


[PATCH] Error for missing change description in git_commit.py.

2020-05-29 Thread Martin Liška

Hello.

The change detects ChangeLog entries that are missing the description
of a change.

I tested it with:
git gcc-verify 51e10276d6792f67f1d88d90f299e7ac1b1f1f24..HEAD -n
and it passes for the last ~250 commits.

I'll install it if there are no objections.
Thanks,
Martin

contrib/ChangeLog:

* gcc-changelog/git_commit.py: Find empty change descriptions.
* gcc-changelog/test_email.py: New test.
* gcc-changelog/test_patches.txt: New patch that tests that.
---
 contrib/gcc-changelog/git_commit.py| 10 +
 contrib/gcc-changelog/test_email.py|  6 ++
 contrib/gcc-changelog/test_patches.txt | 28 ++
 3 files changed, 44 insertions(+)

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index 084e83c18cc..4f82b58f64b 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -274,6 +274,7 @@ class GitCommit:
 self.parse_lines(all_are_ignored)
 if self.changes:
 self.parse_changelog()
+self.check_for_empty_description()
 self.deduce_changelog_locations()
 if not self.errors:
 self.check_mentioned_files()
@@ -440,6 +441,15 @@ class GitCommit:
 else:
 last_entry.lines.append(line)
 
+    def check_for_empty_description(self):
+        for entry in self.changelog_entries:
+            for i, line in enumerate(entry.lines):
+                if (star_prefix_regex.match(line) and line.endswith(':') and
+                    (i == len(entry.lines) - 1
+                     or star_prefix_regex.match(entry.lines[i + 1]))):
+                    msg = 'missing description of a change'
+                    self.errors.append(Error(msg, line))
+
 def get_file_changelog_location(self, changelog_file):
 for file in self.modified_files:
 if file[0] == changelog_file:
diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py
index e73b3626473..158eb651367 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -312,3 +312,9 @@ class TestGccChangelog(unittest.TestCase):
 == 'Steven G. Kargl  ')
 assert (email.changelog_entries[0].author_lines[1][0]
 == 'Mark Eggleston  ')
+
+    def test_missing_change_description(self):
+        email = self.from_patch_glob('0001-Missing-change-description.patch')
+        assert len(email.errors) == 2
+        assert email.errors[0].message == 'missing description of a change'
+        assert email.errors[1].message == 'missing description of a change'
diff --git a/contrib/gcc-changelog/test_patches.txt b/contrib/gcc-changelog/test_patches.txt
index 76037c33f93..25311fbf300 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -2945,3 +2945,31 @@ index 000..fda10c1a88b
 +
 --
 2.26.2
+
+=== 0001-Missing-change-description.patch ===
+From 8ec655bd94615ba45adabae9b50df299edb74eda Mon Sep 17 00:00:00 2001
+From: Martin Liska 
+Date: Fri, 29 May 2020 13:42:57 +0200
+Subject: [PATCH] Test me.
+
+gcc/ChangeLog:
+
+   * ipa-icf-gimple.c (compare_gimple_asm):
+   * ipa-icf-gimple2.c (compare_gimple_asm): Good.
+   * ipa-icf-gimple3.c (compare_gimple_asm):
+---
+ contrib/gcc-changelog/git_commit.py | 10 ++
+ gcc/ipa-icf-gimple.c|  1 +
+ 2 files changed, 11 insertions(+)
+
+diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
+index 1cd5872c03d..6f95aedb3d3 100644
+--- a/gcc/ipa-icf-gimple.c
++++ b/gcc/ipa-icf-gimple.c
+@@ -850,3 +850,4 @@
+ }
+
+ } // ipa_icf_gimple namespace
++
+--
+2.26.2
--
2.26.2



Re: [PATCH] S/390: Emit vector alignment hints for z13

2020-05-29 Thread Andreas Krebbel via Gcc-patches
On 28.05.20 20:24, Stefan Schulze Frielinghaus wrote:
> Vector alignment hints are fully supported since z14.  On z13 alignment
> hints have no effect, however, instructions with alignment hints are
> still legal.  Thus, emit alignment hints also for z13 targets so that if
> the binary is actually run on a z14 or later it benefits from such
> hints.

More precisely, the alignment hints have no effect before z15.  They are
already described in the z14 architecture, and z13 hardware also accepts
them.

> 
> Note, this requires gas including commit f687f5f563 of the binutils
> repository.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.c (print_operand): Emit vector alignment
>   hints for z13.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/align-1.c: Change target architecture
>   to z13.
>   * gcc.target/s390/vector/align-2.c: Change target architecture
>   to z13.

Ok. Thanks!

Andreas

> ---
>  gcc/config/s390/s390.c | 7 ++-
>  gcc/testsuite/gcc.target/s390/vector/align-1.c | 2 +-
>  gcc/testsuite/gcc.target/s390/vector/align-2.c | 2 +-
>  3 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> index 4de3129f88e..b5fd5a2f3ed 100644
> --- a/gcc/config/s390/s390.c
> +++ b/gcc/config/s390/s390.c
> @@ -7854,7 +7854,12 @@ print_operand (FILE *file, rtx x, int code)
>  {
>  case 'A':
>  #ifdef HAVE_AS_VECTOR_LOADSTORE_ALIGNMENT_HINTS
> -  if (TARGET_Z14 && MEM_P (x))
> +  /* Vector alignment hints are fully supported since z14.  On z13
> +  alignment hints have no effect, however, instructions with alignment
> +  hints are still legal.  Thus, emit alignment hints also for z13
> +  targets so that if the binary is actually run on a z14 or later it
> +  benefits from such hints.  */
> +  if (TARGET_Z13 && MEM_P (x))
>   {
> if (MEM_ALIGN (x) >= 128)
>   fprintf (file, ",4");
> diff --git a/gcc/testsuite/gcc.target/s390/vector/align-1.c 
> b/gcc/testsuite/gcc.target/s390/vector/align-1.c
> index ccad22a..6997af2ddcd 100644
> --- a/gcc/testsuite/gcc.target/s390/vector/align-1.c
> +++ b/gcc/testsuite/gcc.target/s390/vector/align-1.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O3 -mzarch -march=z14" } */
> +/* { dg-options "-O3 -mzarch -march=z13" } */
>  
>  /* The user alignment ends up in DECL_ALIGN of the VAR_DECL and is
> currently ignored if it is smaller than the alignment of the type.
> diff --git a/gcc/testsuite/gcc.target/s390/vector/align-2.c 
> b/gcc/testsuite/gcc.target/s390/vector/align-2.c
> index e4e2fba6a58..00e09d3eadb 100644
> --- a/gcc/testsuite/gcc.target/s390/vector/align-2.c
> +++ b/gcc/testsuite/gcc.target/s390/vector/align-2.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O3 -mzarch -march=z14" } */
> +/* { dg-options "-O3 -mzarch -march=z13" } */
>  
>  /* The user alignment ends up in TYPE_ALIGN of the type of the
> VAR_DECL.  */
> 



[PATCH] hurd: libgcc unwinding support over signal trampolines

2020-05-29 Thread Samuel Thibault
Hello,

libgcc currently lacks support for unwinding over signal trampolines
on GNU/Hurd. The attached patch implements it.

Samuel

hurd: libgcc unwinding support over signal trampolines

* libgcc/config.host (md_unwind_header): Set to i386/gnu-unwind.h on
i[34567]86-*-gnu*.
* src/libgcc/config/i386/gnu-unwind.h: New file.

diff --git a/libgcc/config.host b/libgcc/config.host
index 2cd42097167..044b34d53cc 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -734,11 +734,17 @@ i[34567]86-*-linux*)
tm_file="${tm_file} i386/elf-lib.h"
md_unwind_header=i386/linux-unwind.h
;;
-i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-gnu* | i[34567]86-*-kopensolaris*-gnu)
+i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-kopensolaris*-gnu)
extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
tm_file="${tm_file} i386/elf-lib.h"
;;
+i[34567]86-*-gnu*)
+   extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
+   tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
+   tm_file="${tm_file} i386/elf-lib.h"
+   md_unwind_header=i386/gnu-unwind.h
+   ;;
 x86_64-*-linux*)
extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
diff --git a/src/libgcc/config/i386/gnu-unwind.h b/src/libgcc/config/i386/gnu-unwind.h
new file mode 100644
index 000..db47f0ac1d4
--- /dev/null
+++ b/src/libgcc/config/i386/gnu-unwind.h
@@ -0,0 +1,107 @@
+/* DWARF2 EH unwinding support for GNU Hurd: x86.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   Contributed by Samuel Thibault 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/* Do code reading to identify a signal frame, and set the frame
+   state data appropriately.  See unwind-dw2.c for the structs. */
+
+#ifndef inhibit_libc
+
+#include <signal.h>
+
+#define MD_FALLBACK_FRAME_STATE_FOR x86_gnu_fallback_frame_state
+
+static _Unwind_Reason_Code
+x86_gnu_fallback_frame_state
+(struct _Unwind_Context *context, _Unwind_FrameState *fs)
+{
+  struct handler_args {
+int signo;
+int sigcode;
+struct sigcontext *scp;
+  } *handler_args;
+  struct sigcontext *scp;
+  unsigned long usp;
+
+/*
+ * i386 sigtramp frame we are looking for follows.
+ * (see glibc/sysdeps/mach/hurd/i386/trampoline.c assembly)
+ *
+ * rpc_wait_trampoline:
+ *   0:  b8 e7 ff ff ff          mov    $-25,%eax     mach_msg_trap
+ *   5:  9a 00 00 00 00 07 00    lcall  $7,$0
+ *  12:  89 01                   movl   %eax, (%ecx)
+ *  14:  89 dc                   movl   %ebx, %esp    switch to signal stack
+ *
+ * trampoline:
+ *  16:  ff d2                   call   *%edx         call the handler function
+ * RA HERE
+ *  18:  83 c4 0c                addl   $12, %esp     pop its args
+ *  21:  c3                      ret                  return to sigreturn
+ *
+ * firewall:
+ *  22:  f4                      hlt
+ */
+
+  if (!(   *(unsigned int   *)(context->ra ) == 0xc30cc483
+&& *(unsigned char  *)(context->ra +  4) ==   0xf4
+
+&& *(unsigned int   *)(context->ra -  4) == 0xd2ffdc89
+&& *(unsigned int   *)(context->ra -  8) == 0x01890007
+&& *(unsigned int   *)(context->ra - 12) == 0x00000000
+&& *(unsigned int   *)(context->ra - 16) == 0x9affffff
+&& *(unsigned short *)(context->ra - 18) == 0xe7b8))
+return _URC_END_OF_STACK;
+
+  handler_args = context->cfa;
+  scp = handler_args->scp;
+  usp = scp->sc_uesp;
+
+  fs->regs.cfa_how = CFA_REG_OFFSET;
+  fs->regs.cfa_reg = 4;
+  fs->regs.cfa_offset = usp - (unsigned long) context->cfa;
+
+  fs->regs.reg[0].how = REG_SAVED_OFFSET;
+  fs->regs.reg[0].loc.offset = (unsigned long)&scp->sc_eax - usp;
+  fs->regs.reg[1].how = REG_SAVED_OFFSET;
+  fs->regs.reg[1].loc.offset = (unsigned long)&scp->sc_ecx - usp;
+  fs->regs.reg[2].how = REG_SAVED_OFFSET;
+  

[committed] amdgcn: Fix VCC early clobber

2020-05-29 Thread Andrew Stubbs
This patch fixes a bug in which the register allocator could place 
arbitrary data into the VCC (vector condition code) register, and then 
use it as input to an instruction that writes condition codes there.


This would be fine except that 64-bit integers are split into high-part 
and low-part operations, and each writes to the whole of VCC, creating 
an early-clobber situation for this specific input register.


Andrew

amdgcn: Fix VCC early clobber

gcc/ChangeLog:

2020-05-28  Andrew Stubbs  

	* config/gcn/gcn-valu.md (add3_vcc_zext_dup): Add early clobber.
	(add3_vcc_zext_dup_exec): Likewise.
	(add3_vcc_zext_dup2): Likewise.
	(add3_vcc_zext_dup2_exec): Likewise.

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index d31fe5063b9..6d7fecaa12c 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -1380,13 +1380,13 @@ (define_insn_and_split "add3_zext_exec"
(set_attr "length" "8")])
 
 (define_insn_and_split "add3_vcc_zext_dup"
-  [(set (match_operand:V_DI 0 "register_operand""=   v,   v")
+  [(set (match_operand:V_DI 0 "register_operand""=v,v")
 	(plus:V_DI
 	  (zero_extend:V_DI
 	(vec_duplicate:
-	  (match_operand:SI 1 "gcn_alu_operand" "  BSv, ASv")))
-	  (match_operand:V_DI 2 "gcn_alu_operand"   "  vDA, vDb")))
-   (set (match_operand:DI 3 "register_operand"	"=SgcV,SgcV")
+	  (match_operand:SI 1 "gcn_alu_operand" "   BSv,  ASv")))
+	  (match_operand:V_DI 2 "gcn_alu_operand"   "   vDA,  vDb")))
+   (set (match_operand:DI 3 "register_operand"	"=&SgcV,&SgcV")
 	(ltu:DI (plus:V_DI 
 		  (zero_extend:V_DI (vec_duplicate: (match_dup 1)))
 		  (match_dup 2))
@@ -1424,16 +1424,16 @@ (define_expand "add3_zext_dup"
   })
 
 (define_insn_and_split "add3_vcc_zext_dup_exec"
-  [(set (match_operand:V_DI 0 "register_operand"		"=   v,   v")
+  [(set (match_operand:V_DI 0 "register_operand"	  "=v,v")
 	(vec_merge:V_DI
 	  (plus:V_DI
 	(zero_extend:V_DI
 	  (vec_duplicate:
-		(match_operand:SI 1 "gcn_alu_operand"		"  ASv, BSv")))
-	(match_operand:V_DI 2 "gcn_alu_operand"		"  vDb, vDA"))
-	  (match_operand:V_DI 4 "gcn_register_or_unspec_operand" "  U0,  U0")
-	  (match_operand:DI 5 "gcn_exec_reg_operand"		"e,   e")))
-   (set (match_operand:DI 3 "register_operand"			"=SgcV,SgcV")
+		(match_operand:SI 1 "gcn_alu_operand"	  "   ASv,  BSv")))
+	(match_operand:V_DI 2 "gcn_alu_operand"	  "   vDb,  vDA"))
+	  (match_operand:V_DI 4 "gcn_register_or_unspec_operand" " U0,   U0")
+	  (match_operand:DI 5 "gcn_exec_reg_operand"	  " e,e")))
+   (set (match_operand:DI 3 "register_operand"		  "=&SgcV,&SgcV")
 	(and:DI
 	  (ltu:DI (plus:V_DI 
 		(zero_extend:V_DI (vec_duplicate: (match_dup 1)))
@@ -1481,11 +1481,11 @@ (define_expand "add3_zext_dup_exec"
   })
 
 (define_insn_and_split "add3_vcc_zext_dup2"
-  [(set (match_operand:V_DI 0 "register_operand"		"=   v")
+  [(set (match_operand:V_DI 0 "register_operand"		   "=v")
 	(plus:V_DI
 	  (zero_extend:V_DI (match_operand: 1 "gcn_alu_operand" " vA"))
 	  (vec_duplicate:V_DI (match_operand:DI 2 "gcn_alu_operand" " DbSv"
-   (set (match_operand:DI 3 "register_operand"			"=SgcV")
+   (set (match_operand:DI 3 "register_operand"			   "=&SgcV")
 	(ltu:DI (plus:V_DI 
 		  (zero_extend:V_DI (match_dup 1))
 		  (vec_duplicate:V_DI (match_dup 2)))
@@ -1523,14 +1523,14 @@ (define_expand "add3_zext_dup2"
   })
 
 (define_insn_and_split "add3_vcc_zext_dup2_exec"
-  [(set (match_operand:V_DI 0 "register_operand"		 "=   v")
+  [(set (match_operand:V_DI 0 "register_operand"		"=v")
 	(vec_merge:V_DI
 	  (plus:V_DI
 	(zero_extend:V_DI (match_operand: 1 "gcn_alu_operand" "vA"))
 	(vec_duplicate:V_DI (match_operand:DI 2 "gcn_alu_operand"  "BSv")))
-	  (match_operand:V_DI 4 "gcn_register_or_unspec_operand" "   U0")
-	  (match_operand:DI 5 "gcn_exec_reg_operand"		 "e")))
-   (set (match_operand:DI 3 "register_operand"			 "=SgcV")
+	  (match_operand:V_DI 4 "gcn_register_or_unspec_operand""U0")
+	  (match_operand:DI 5 "gcn_exec_reg_operand"		" e")))
+   (set (match_operand:DI 3 "register_operand"			"=&SgcV")
 	(and:DI
 	  (ltu:DI (plus:V_DI 
 		(zero_extend:V_DI (match_dup 1))


[PATCH] Port bugzilla-close-candidate script to git.

2020-05-29 Thread Martin Liška

Tested and pushed to master.

Martin

maintainer-scripts/ChangeLog:

* bugzilla-close-candidate.py: Support both SVN and GIT messages
in PRs. Remove need of usage of the bugzilla API key.
---
 .../bugzilla-close-candidate.py   | 50 +++
 1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/maintainer-scripts/bugzilla-close-candidate.py b/maintainer-scripts/bugzilla-close-candidate.py
index 26ee84474a0..dfd67ac1cbb 100755
--- a/maintainer-scripts/bugzilla-close-candidate.py
+++ b/maintainer-scripts/bugzilla-close-candidate.py
@@ -1,19 +1,19 @@
 #!/usr/bin/env python3
 
-# The script is used for finding PRs that have a SVN revision that
+# The script is used for finding PRs that have a GIT revision that
 # mentiones the PR and are not closed.  It's done by iterating all
-# comments of a PR and finding SVN commit entries.
+# comments of a PR and finding GIT commit entries.
 
 """

 Sample output of the script:
 Bugzilla URL page size: 50
 HINT: bugs with following comment are ignored: Can the bug be marked as resolved?
 
-Bug URL  SVN commits   known-to-fail   known-to-work   Bug summary
-https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88129   trunk   Two blockage insns are emited in the function epilogue
-https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88122   trunk   [9 Regression] g++ ICE: internal compiler error: Segmentation fault
-https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88084   trunk   basic_string_view::copy doesn't use Traits::copy
-https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88083   trunk   ICE in find_constant_pool_ref_1, at config/s390/s390.c:8231
+Bug URL  GIT commits   known-to-fail   known-to-work   Bug summary
+https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88129   master   Two blockage insns are emited in the function epilogue
+https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88122   master   [9 Regression] g++ ICE: internal compiler error: Segmentation fault
+https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88084   master   basic_string_view::copy doesn't use Traits::copy
+https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88083   master   ICE in find_constant_pool_ref_1, at config/s390/s390.c:8231
 ...
 Bugzilla lists:
 
https://gcc.gnu.org/bugzilla/buglist.cgi?bug_id=88129,88122,88084,88083,88074,88073,88071,88070,88051,88018,87985,87955,87926,87917,87913,87898,87895,87874,87871,87855,87853,87826,87824,87819,87818,87799,87793,87789,87788,87787,87754,87725,87674,87665,87649,87647,87645,87625,87611,87610,87598,87593,87582,87566,87556,87547,87544,87541,87537,87528
@@ -36,29 +36,38 @@ def get_branches_by_comments(comments):
     versions = set()
     for c in comments:
         text = c['text']
-        if 'URL: https://gcc.gnu.org/viewcvs' in text:
-            version = 'trunk'
-            for l in text.split('\n'):
-                if 'branches/gcc-' in l:
-                    parts = l.strip().split('/')
+        lines = text.split('\n')
+        for line in lines:
+            if 'URL: https://gcc.gnu.org/viewcvs' in line:
+                version = 'master'
+                if 'branches/gcc-' in line:
+                    parts = line.strip().split('/')
                     parts = parts[1].split('-')
                     assert len(parts) == 3
                     versions.add(parts[1])
-            versions.add(version)
+                versions.add(version)
+            elif line.startswith('The ') and 'branch has been updated' in line:
+                version = 'master'
+                name = line.strip().split(' ')[1]
+                if '/' in name:
+                    name = name.split('/')[1]
+                    assert '-' in name
+                    version = name.split('-')[1]
+                versions.add(version)
     return versions
 
-def get_bugs(api_key, query):
+def get_bugs(query):
     u = base_url + 'bug'
     r = requests.get(u, params = query)
     return r.json()['bugs']
 
-def 
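
As a rough illustration of the git-message handling added in the `get_branches_by_comments` hunk, the branch detection can be pulled out into a standalone function. This is a sketch, not code from the script: `branch_from_git_line` is a hypothetical name, the nesting of the `if '/' in name` block is an assumption about the original indentation, and the sample comment lines are made up.

```python
def branch_from_git_line(line):
    """Return the version named by a git notification line, or None."""
    if not (line.startswith('The ') and 'branch has been updated' in line):
        return None
    version = 'master'
    # The second word is the branch name, e.g. "master" or "releases/gcc-9".
    name = line.strip().split(' ')[1]
    if '/' in name:
        # Release branches look like "releases/gcc-N"; keep only N.
        name = name.split('/')[1]
        assert '-' in name
        version = name.split('-')[1]
    return version


# Made-up sample lines in the shape the hunk tests for.
samples = [
    'The master branch has been updated by Some Committer',
    'The releases/gcc-9 branch has been updated by Some Committer',
    'An unrelated comment line',
]
for s in samples:
    print(repr(s), '->', branch_from_git_line(s))
```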

[PATCH] tree-optimization/95272 - add SLP_TREE_REPRESENTATIVE

2020-05-29 Thread Richard Biener


This adds SLP_TREE_REPRESENTATIVE - a representative stmt-info that
is used by SLP analysis and code generation.  This avoids the need
for the hack in vect_slp_rearrange_stmts which previously avoided
re-arranging stmts that might not have been isomorphic because
of operand swapping.  It also plays nicely with future directions of SLP
and for the foreseeable future is easier than replicating more and
more info in the SLP node as long as non-SLP is in-tree.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2020-05-29  Richard Biener  

PR tree-optimization/95272
* tree-vectorizer.h (_slp_tree::representative): Add.
(SLP_TREE_REPRESENTATIVE): Likewise.
* tree-vect-loop.c (vectorizable_reduction): Adjust SLP
node gathering.
(vectorizable_live_operation): Use the representative to
attach the reduction info to.
* tree-vect-slp.c (_slp_tree::_slp_tree): Initialize
SLP_TREE_REPRESENTATIVE.
(vect_create_new_slp_node): Likewise.
(slp_copy_subtree): Copy it.
(vect_slp_rearrange_stmts): Re-arrange even COND_EXPR stmts.
(vect_slp_analyze_node_operations_1): Pass the representative
to vect_analyze_stmt.
(vect_schedule_slp_instance): Pass the representative to
vect_transform_stmt.

* gcc.dg/vect/pr95272.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr95272.c | 18 ++
 gcc/tree-vect-loop.c|  8 ++--
 gcc/tree-vect-slp.c | 15 +--
 gcc/tree-vectorizer.h   |  4 
 4 files changed, 33 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr95272.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr95272.c b/gcc/testsuite/gcc.dg/vect/pr95272.c
new file mode 100644
index 000..47698ff3e56
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr95272.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+
+enum { a = 5, b };
+typedef struct {
+  int c[b];
+} d;
+extern d e[];
+int f;
+int g[6];
+void h() {
+  int i;
+  for (; f; f++) {
+i = 0;
+for (; i < b; i++)
+  if (e[f].c[i])
+g[i] = e[f].c[i];
+  }
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 4f94b4baad9..3c5c0ea9ebc 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6192,9 +6192,9 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
 {
   slp_for_stmt_info = slp_node_instance->root;
   /* And then there's reduction chain with a conversion ...  */
-  if (SLP_TREE_SCALAR_STMTS (slp_for_stmt_info)[0] != stmt_info)
+  if (SLP_TREE_REPRESENTATIVE (slp_for_stmt_info) != stmt_info)
slp_for_stmt_info = SLP_TREE_CHILDREN (slp_for_stmt_info)[0];
-  gcc_assert (SLP_TREE_SCALAR_STMTS (slp_for_stmt_info)[0] == stmt_info);
+  gcc_assert (SLP_TREE_REPRESENTATIVE (slp_for_stmt_info) == stmt_info);
 }
   slp_tree *slp_op = XALLOCAVEC (slp_tree, op_type);
   for (i = 0; i < op_type; i++)
@@ -7952,6 +7952,10 @@ vectorizable_live_operation (loop_vec_info loop_vinfo,
 all involved stmts together.  */
  else if (slp_index != 0)
return true;
+ else
+   /* For SLP reductions the meta-info is attached to
+  the representative.  */
+   stmt_info = SLP_TREE_REPRESENTATIVE (slp_node);
}
   stmt_vec_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
   gcc_assert (reduc_info->is_reduc_info);
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 5976e91cf62..836defce990 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -61,6 +61,7 @@ _slp_tree::_slp_tree ()
   SLP_TREE_TWO_OPERATORS (this) = false;
   SLP_TREE_DEF_TYPE (this) = vect_uninitialized_def;
   SLP_TREE_VECTYPE (this) = NULL_TREE;
+  SLP_TREE_REPRESENTATIVE (this) = NULL;
   this->refcnt = 1;
   this->max_nunits = 1;
 }
@@ -132,6 +133,7 @@ vect_create_new_slp_node (vec scalar_stmts, 
unsigned nops)
   SLP_TREE_SCALAR_STMTS (node) = scalar_stmts;
   SLP_TREE_CHILDREN (node).create (nops);
   SLP_TREE_DEF_TYPE (node) = vect_internal_def;
+  SLP_TREE_REPRESENTATIVE (node) = scalar_stmts[0];
 
   unsigned i;
   stmt_vec_info stmt_info;
@@ -1741,6 +1743,7 @@ slp_copy_subtree (slp_tree node, hash_map )
   slp_tree copy = copy_ref;
   SLP_TREE_DEF_TYPE (copy) = SLP_TREE_DEF_TYPE (node);
   SLP_TREE_VECTYPE (copy) = SLP_TREE_VECTYPE (node);
+  SLP_TREE_REPRESENTATIVE (copy) = SLP_TREE_REPRESENTATIVE (node);
   copy->max_nunits = node->max_nunits;
   copy->refcnt = 0;
   if (SLP_TREE_SCALAR_STMTS (node).exists ())
@@ -1786,14 +1789,6 @@ vect_slp_rearrange_stmts (slp_tree node, unsigned int 
group_size,
   if (SLP_TREE_SCALAR_STMTS (node).exists ())
 {
   gcc_assert (group_size == SLP_TREE_SCALAR_STMTS (node).length ());
-  /* ???  Computation nodes are isomorphic and need no rearrangement.
-This is a quick hack to cover those where rearrangement breaks
-semantics because only 

Re: [PATCH] [aarch64] Fix PR94591: GCC generates invalid rev64 insns

2020-05-29 Thread Richard Sandiford
Alex Coplan  writes:
> On 19/05/2020 17:59, Richard Sandiford wrote:
>> Alex Coplan  writes:
>> > Hello,
>> >
>> > This patch fixes PR94591. The problem was the function 
>> > aarch64_evpc_rev_local()
>> > matching vector permutations that were not reversals. In particular, prior 
>> > to
>> > this patch, this function matched the identity permutation which led to
>> > generating bogus REV64 insns which were rejected by the assembler.
>> >
>> > Testing:
>> >  - New regression test which passes after applying the patch.
>> >  - New test passes on an x64 -> aarch64-none-elf cross.
>> >  - Bootstrap and regtest on aarch64-linux-gnu.
>> >
>> > OK to install?
>> >
>> > Thanks,
>> > Alex
>> >
>> > ---
>> >
>> > gcc/ChangeLog:
>> >
>> > 2020-05-19  Alex Coplan  
>> >
>> >PR target/94591
>> >* config/aarch64/aarch64.c (aarch64_evpc_rev_local): Don't match
>> >identity permutation.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> > 2020-05-19  Alex Coplan  
>> >
>> >PR target/94591
>> >* gcc.c-torture/execute/pr94591.c: New test.
>> 
>> OK, thanks.
>> 
>> Richard
>
> I've just tested this patch on gcc-{8,9,10} release branches:
> bootstraps+regtests on aarch64-linux-gnu came back clean.
>
> Since this was a regression introduced in GCC 8, is it OK to backport
> the fix to those release branches now?

Yeah, OK for the branches too.

Thanks,
Richard


[PATCH] tree-optimization/95356 - more vectorizable_shift massaging

2020-05-29 Thread Richard Biener
The previous fix clashed with the rewrite to emit SLP invariants
during the SLP walk.  Thus the following adjusts the SLP tree
hacking vectorizable_shift does appropriately.

Still resisting the attempt of a rewrite of vectorizable_shift ...

2020-05-29  Richard Biener  

PR tree-optimization/95356
* tree-vect-stmts.c (vectorizable_shift): Do in-place SLP
node hacking during analysis.
---
 gcc/tree-vect-stmts.c | 37 -
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 58fb93d740a..11780ddf89a 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -5696,7 +5696,7 @@ vectorizable_shift (vec_info *vinfo,
 
   if (!op1_vectype)
op1_vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op1),
-  slp_node);
+  slp_op1);
   incompatible_op1_vectype_p
= (op1_vectype == NULL_TREE
   || maybe_ne (TYPE_VECTOR_SUBPARTS (op1_vectype),
@@ -5704,8 +5704,8 @@ vectorizable_shift (vec_info *vinfo,
   || TYPE_MODE (op1_vectype) != TYPE_MODE (vectype));
   if (incompatible_op1_vectype_p
  && (!slp_node
- || SLP_TREE_DEF_TYPE
-  (SLP_TREE_CHILDREN (slp_node)[1]) != vect_constant_def))
+ || SLP_TREE_DEF_TYPE (slp_op1) != vect_constant_def
+ || slp_op1->refcnt != 1))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5808,6 +5808,21 @@ vectorizable_shift (vec_info *vinfo,
 "incompatible vector types for invariants\n");
  return false;
}
+  /* Now adjust the constant shift amount in place.  */
+  if (slp_node
+ && incompatible_op1_vectype_p
+ && dt[1] == vect_constant_def)
+   {
+ for (unsigned i = 0;
+  i < SLP_TREE_SCALAR_OPS (slp_op1).length (); ++i)
+   {
+ SLP_TREE_SCALAR_OPS (slp_op1)[i]
+   = fold_convert (TREE_TYPE (vectype),
+   SLP_TREE_SCALAR_OPS (slp_op1)[i]);
+ gcc_assert ((TREE_CODE (SLP_TREE_SCALAR_OPS (slp_op1)[i])
+  == INTEGER_CST));
+   }
+   }
   STMT_VINFO_TYPE (stmt_info) = shift_vec_info_type;
   DUMP_VECT_SCOPE ("vectorizable_shift");
   vect_model_simple_cost (vinfo, stmt_info, ncopies, dt,
@@ -5882,20 +5897,8 @@ vectorizable_shift (vec_info *vinfo,
vec_oprnds1.quick_push (vec_oprnd1);
}
  else if (dt[1] == vect_constant_def)
-   {
- /* Convert the scalar constant shift amounts in-place.  */
- slp_tree shift = SLP_TREE_CHILDREN (slp_node)[1];
- gcc_assert (SLP_TREE_DEF_TYPE (shift) == vect_constant_def);
- for (unsigned i = 0;
-  i < SLP_TREE_SCALAR_OPS (shift).length (); ++i)
-   {
- SLP_TREE_SCALAR_OPS (shift)[i]
- = fold_convert (TREE_TYPE (vectype),
- SLP_TREE_SCALAR_OPS (shift)[i]);
- gcc_assert ((TREE_CODE (SLP_TREE_SCALAR_OPS (shift)[i])
-  == INTEGER_CST));
-   }
-   }
+   /* The constant shift amount has been adjusted in place.  */
+   ;
  else
gcc_assert (TYPE_MODE (op1_vectype) == TYPE_MODE (vectype));
}
-- 
2.25.1


Re: Simplify tree streaming

2020-05-29 Thread Richard Biener via Gcc-patches
On Tue, May 26, 2020 at 10:44 AM Jan Hubicka  wrote:
>
> Hi,
> this patch cleans up tree streaming.  The code is prepared to stream nested
> trees, but we only handle flat trees.  As a result we have a quite heavy
> function to stream in/out a tree reference, which is used many times and shows
> up in the profile.
>
> This patch adds stream_write_tree_ref/stream_read_tree_ref which is used to
> stream references to trees that are required to exist in the cache or be
> indexable.
>
> The actual implementation is just a first cut.  I would like to make it more
> compact.  We used to stream a 2 byte tag (as UHWI) + a UHWI representing the
> index.  Now we stream one UHWI that represents the type of reference + index
> for references to the cache, but still two integers for references to the
> global stream.  This is because the abstraction is not very helpful here and
> I want to clean this up incrementally.
>
> I would also like to get rid of the ref_p parameters, which seem unnecessary
> for flat streams.
>
> This reduces around 7% of global stream, 3% when compressed.
> More reduction will happen once the format is sanitized a bit.
>
> from
> [WPA] read 4597161 unshared trees
> [WPA] read 2937414 mergeable SCCs of average size 1.364280
> [WPA] 8604617 tree bodies read in total
> [WPA] tree SCC table: size 524287, 247507 elements, collision ratio: 0.377468
> [WPA] tree SCC max chain length 2 (size 1)
> [WPA] Compared 2689907 SCCs, 184 collisions (0.68)
> [WPA] Merged 2689890 SCCs
> [WPA] Merged 3722677 tree bodies
> [WPA] Merged 632040 types
> ...
> [WPA] Compression: 88124141 input bytes, 234906430 uncompressed bytes (ratio: 2.665631)
> [WPA] Size of mmap'd section decls: 88124141 bytes
> ...
> [WPA] Compression: 113758813 input bytes, 316149514 uncompressed bytes (ratio: 2.779121)
> [WPA] Size of mmap'd section decls: 88124141 bytes
> [WPA] Size of mmap'd section function_body: 14485721 bytes
>
> to
>
> [WPA] read 4597174 unshared trees
> [WPA] read 2937413 mergeable SCCs of average size 1.364280
> [WPA] 8604629 tree bodies read in total
> [WPA] tree SCC table: size 524287, 247509 elements, collision ratio: 0.377458
> [WPA] tree SCC max chain length 2 (size 1)
> [WPA] Compared 2689904 SCCs, 183 collisions (0.68)
> [WPA] Merged 2689888 SCCs
> [WPA] Merged 3722675 tree bodies
> [WPA] Merged 632041 types
> 
> [WPA] Size of mmap'd section decls: 86177293 bytes
> [WPA] Compression: 86177293 input bytes, 217625095 uncompressed bytes (ratio: 2.525318)
> 
> [WPA] Compression: 111682269 input bytes, 297228756 uncompressed bytes (ratio: 2.661378)
> [WPA] Size of mmap'd section decls: 86177293 bytes
> [WPA] Size of mmap'd section function_body: 14349032 bytes
>
> Bootstrapped/regtested x86_64-linux, OK?
>
> gcc/ChangeLog:
>
> 2020-05-26  Jan Hubicka  
>
> * lto-streamer-in.c (streamer_read_chain): Move here from
> tree-streamer-in.c.
> (stream_read_tree_ref): New.
> (lto_input_tree_1): Simplify.
> * lto-streamer-out.c (stream_write_tree_ref): New.
> (lto_write_tree_1): Simplify.
> (lto_output_tree_1): Simplify.
> (DFS::DFS_write_tree): Simplify.
> (streamer_write_chain): Move here from tree-streamer-out.c.
> * lto-streamer.h (lto_output_tree_ref): Update prototype.
> (stream_read_tree_ref): Declare.
> (stream_write_tree_ref): Declare.
> * tree-streamer-in.c (streamer_read_chain): Update to use
> stream_read_tree_ref.
> (lto_input_ts_common_tree_pointers): Likewise.
> (lto_input_ts_vector_tree_pointers): Likewise.
> (lto_input_ts_poly_tree_pointers): Likewise.
> (lto_input_ts_complex_tree_pointers): Likewise.
> (lto_input_ts_decl_minimal_tree_pointers): Likewise.
> (lto_input_ts_decl_common_tree_pointers): Likewise.
> (lto_input_ts_decl_with_vis_tree_pointers): Likewise.
> (lto_input_ts_field_decl_tree_pointers): Likewise.
> (lto_input_ts_function_decl_tree_pointers): Likewise.
> (lto_input_ts_type_common_tree_pointers): Likewise.
> (lto_input_ts_type_non_common_tree_pointers): Likewise.
> (lto_input_ts_list_tree_pointers): Likewise.
> (lto_input_ts_vec_tree_pointers): Likewise.
> (lto_input_ts_exp_tree_pointers): Likewise.
> (lto_input_ts_block_tree_pointers): Likewise.
> (lto_input_ts_binfo_tree_pointers): Likewise.
> (lto_input_ts_constructor_tree_pointers): Likewise.
> (lto_input_ts_omp_clause_tree_pointers): Likewise.
> * tree-streamer-out.c (streamer_write_chain): Update to use
> stream_write_tree_ref.
> (write_ts_common_tree_pointers): Likewise.
> (write_ts_vector_tree_pointers): Likewise.
> (write_ts_poly_tree_pointers): Likewise.
> (write_ts_complex_tree_pointers): Likewise.
> (write_ts_decl_minimal_tree_pointers): Likewise.
> (write_ts_decl_common_tree_pointers): Likewise.
> 
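
The size reductions quoted in the message above can be recomputed from the two sets of "[WPA] Compression" lines. The sketch below only redoes that arithmetic; `reduction_percent` is a helper name made up for the illustration, and the labels on the four results are an interpretation of the log lines (the first line of each pair shares its byte count with the "section decls" line, and what the second covers is not stated in the log).

```python
def reduction_percent(before, after):
    """Relative size reduction from before to after, in percent."""
    return 100.0 * (before - after) / before


# First "Compression" line of each run (same byte count as section decls).
decls_uncompressed = reduction_percent(234906430, 217625095)
decls_compressed = reduction_percent(88124141, 86177293)

# Second "Compression" line of each run.
second_uncompressed = reduction_percent(316149514, 297228756)
second_compressed = reduction_percent(113758813, 111682269)

print(f"decls:  {decls_uncompressed:.2f}% uncompressed, "
      f"{decls_compressed:.2f}% compressed")
print(f"second: {second_uncompressed:.2f}% uncompressed, "
      f"{second_compressed:.2f}% compressed")
```

For the decls section this gives a bit over 7% uncompressed and a bit over 2% compressed.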

[PATCH] c++: Reject some further reinterpret casts in constexpr [PR82304, PR95307]

2020-05-29 Thread Jakub Jelinek via Gcc-patches
Hi!

cxx_eval_outermost_constant_expr had a check for reinterpret_casts from
pointers (well, it checked from ADDR_EXPRs) to integral type, but that
only caught such cases at the toplevel of expressions.
As the comment said, it should be done even inside of the expressions,
but at the point of the writing e.g. pointer differences used to be a
problem.  We now have POINTER_DIFF_EXPR, so this is no longer an issue.

Had to do it just for CONVERT_EXPR, because the FE emits NOP_EXPR casts
from pointers to integrals in various spots, e.g. for the PMR & 1 tests,
though on NOP_EXPR we have the REINTERPRET_CAST_P bit that we do check,
while on CONVERT_EXPR we don't.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
PR92411 is not fixed by this change though.
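
The difference between checking only the outermost expression and checking every subexpression can be sketched on a toy expression tree. This Python model is an illustration only: `Node`, `toplevel_only` and `any_subexpression` are hypothetical names, the two-valued `type` field stands in for real tree types, and none of it is GCC code.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    code: str                      # e.g. "CONVERT_EXPR", "ADDR_EXPR"
    type: str                      # "pointer" or "integer" in this toy model
    operands: list = field(default_factory=list)


def toplevel_only(expr):
    """Old behaviour: look at the outermost node and its ADDR_EXPR operand."""
    return bool(expr.code == "CONVERT_EXPR"
                and expr.type == "integer"
                and expr.operands
                and expr.operands[0].code == "ADDR_EXPR")


def any_subexpression(expr):
    """New behaviour: flag a pointer-to-integer conversion at any depth."""
    if (expr.code == "CONVERT_EXPR"
            and expr.type == "integer"
            and expr.operands
            and expr.operands[0].type == "pointer"):
        return True
    return any(any_subexpression(op) for op in expr.operands)


# Models ((integer) &p) << a: the cast is hidden below a shift.
cast = Node("CONVERT_EXPR", "integer", [Node("ADDR_EXPR", "pointer")])
shift = Node("LSHIFT_EXPR", "integer", [cast, Node("PARM_DECL", "integer")])

print(toplevel_only(cast), toplevel_only(shift))          # the shift hides it
print(any_subexpression(cast), any_subexpression(shift))  # both are flagged
```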

2020-05-29  Jakub Jelinek  

PR c++/82304
PR c++/95307
* constexpr.c (cxx_eval_constant_expression): Diagnose CONVERT_EXPR
conversions from pointer types to arithmetic types here...
(cxx_eval_outermost_constant_expr): ... instead of here.

* g++.dg/template/pr79650.C: Expect different diagnostics and expect
it on all lines that do pointer to integer casts.
* g++.dg/cpp1y/constexpr-shift1.C: Expect different diagnostics.
* g++.dg/cpp1y/constexpr-82304.C: New test.
* g++.dg/cpp0x/constexpr-95307.C: New test.

--- gcc/cp/constexpr.c.jj   2020-05-28 23:12:19.715303826 +0200
+++ gcc/cp/constexpr.c  2020-05-29 12:02:06.161656532 +0200
@@ -6194,6 +6194,18 @@ cxx_eval_constant_expression (const cons
if (VOID_TYPE_P (type))
  return void_node;
 
+   if (TREE_CODE (t) == CONVERT_EXPR
+   && ARITHMETIC_TYPE_P (type)
+   && INDIRECT_TYPE_P (TREE_TYPE (op)))
+ {
+   if (!ctx->quiet)
+ error ("conversion from pointer type %qT "
+"to arithmetic type %qT in a constant expression",
+TREE_TYPE (op), type);
+   *non_constant_p = true;
+   return t;
+ }
+
if (TREE_CODE (op) == PTRMEM_CST && !TYPE_PTRMEM_P (type))
  op = cplus_expand_constant (op);
 
@@ -6795,19 +6807,6 @@ cxx_eval_outermost_constant_expr (tree t
   non_constant_p = true;
 }
 
-  /* Technically we should check this for all subexpressions, but that
- runs into problems with our internal representation of pointer
- subtraction and the 5.19 rules are still in flux.  */
-  if (CONVERT_EXPR_CODE_P (TREE_CODE (r))
-  && ARITHMETIC_TYPE_P (TREE_TYPE (r))
-  && TREE_CODE (TREE_OPERAND (r, 0)) == ADDR_EXPR)
-{
-  if (!allow_non_constant)
-   error ("conversion from pointer type %qT "
-  "to arithmetic type %qT in a constant expression",
-  TREE_TYPE (TREE_OPERAND (r, 0)), TREE_TYPE (r));
-  non_constant_p = true;
-}
 
   if (!non_constant_p && overflow_p)
 non_constant_p = true;
--- gcc/testsuite/g++.dg/template/pr79650.C.jj  2020-01-12 11:54:37.249400796 
+0100
+++ gcc/testsuite/g++.dg/template/pr79650.C 2020-05-29 12:02:06.180656252 
+0200
@@ -11,10 +11,10 @@ foo ()
   static int a, b;
 lab1:
 lab2:
-  A<(intptr_t)& - (__INTPTR_TYPE__)&> c; // { dg-error "not a 
constant integer" }
-  A<(intptr_t)& - (__INTPTR_TYPE__)&> d;
-  A<(intptr_t) - (intptr_t)> e;// { dg-error "is not a 
constant expression" }
-  A<(intptr_t) - (intptr_t)> f;
-  A<(intptr_t)sizeof(a) + (intptr_t)> g; // { dg-error "not a 
constant integer" }
+  A<(intptr_t)& - (__INTPTR_TYPE__)&> c; // { dg-error 
"conversion from pointer type" }
+  A<(intptr_t)& - (__INTPTR_TYPE__)&> d; // { dg-error 
"conversion from pointer type" }
+  A<(intptr_t) - (intptr_t)> e;// { dg-error 
"conversion from pointer type" }
+  A<(intptr_t) - (intptr_t)> f;// { dg-error 
"conversion from pointer type" }
+  A<(intptr_t)sizeof(a) + (intptr_t)> g; // { dg-error 
"conversion from pointer type" }
   A<(intptr_t)> h;   // { dg-error 
"conversion from pointer type" }
 }
--- gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C.jj2020-01-12 
11:54:37.115402818 +0100
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C   2020-05-29 
12:02:06.180656252 +0200
@@ -3,7 +3,8 @@
 constexpr int p = 1;
 constexpr __PTRDIFF_TYPE__ bar (int a)
 {
-  return ((__PTRDIFF_TYPE__) ) << a; // { dg-error "is not a constant 
expression" }
+  return ((__PTRDIFF_TYPE__) ) << a;
 }
 constexpr __PTRDIFF_TYPE__ r = bar (2); // { dg-message "in .constexpr. 
expansion of" }
+   // { dg-error "conversion from pointer" 
"" { target *-*-* } .-1 }
 constexpr __PTRDIFF_TYPE__ s = bar (0); // { dg-error "conversion from 
pointer" }
--- gcc/testsuite/g++.dg/cpp1y/constexpr-82304.C.jj 2020-05-29 
12:04:58.077131497 +0200
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-82304.C2020-05-29 
12:10:08.171576940 +0200
@@ -0,0 +1,14 @@
+// PR 

Fix streamer desynchronization caused by streamer debugging patch

2020-05-29 Thread Jan Hubicka
Hi,
it turns out I lost one hunk in the patch disabling extra streaming,
which causes the streamer to go out of sync when a non-trivial SCC
containing the node being streamed appears in the local stream (which
seems quite rare, since it does not happen during bootstrap).

I am regtesting on x86_64-linux the following and will commit afterwards.

Honza

gcc/ChangeLog:

2020-05-29  Jan Hubicka  

PR lto/95362
* lto-streamer-out.c (lto_output_tree):

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 288e3c0f4c6..dfd32ece4bd 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1791,8 +1791,9 @@ lto_output_tree (struct output_block *ob, tree expr,
}
  streamer_write_record_start (ob, LTO_tree_pickle_reference);
  streamer_write_uhwi (ob, ix);
- streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
-  lto_tree_code_to_tag (TREE_CODE (expr)));
+ if (streamer_debugging)
+   streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
+lto_tree_code_to_tag (TREE_CODE (expr)));
}
   in_dfs_walk = false;
   lto_stats.num_pickle_refs_output++;
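
Why an unconditional write desynchronizes the stream can be shown with a toy writer/reader pair. This is a sketch under stated assumptions, not the LTO streamer: the stream is a Python list, `write_refs` and `read_refs` are hypothetical names, and the `always_write_tag` flag reproduces the missing hunk by writing the tag even when debugging is off.

```python
def write_refs(indices, debugging, always_write_tag=False):
    """Emit one record per index; the tag is an optional debugging field."""
    stream = []
    for ix in indices:
        stream.append("pickle_reference")
        stream.append(ix)
        if debugging or always_write_tag:
            stream.append("tag")
    return stream


def read_refs(stream, count, debugging):
    """Read count records; expects the tag only when debugging is on."""
    pos, result = 0, []
    for _ in range(count):
        if stream[pos] != "pickle_reference":
            raise ValueError(f"out of sync at position {pos}: {stream[pos]!r}")
        result.append(stream[pos + 1])
        pos += 2
        if debugging:
            pos += 1  # skip the tag the writer added
    return result


# Writer and reader agree: both skip the tag when debugging is off.
print(read_refs(write_refs([3, 7], debugging=False), 2, debugging=False))

# Writer always emits the tag, reader does not expect it: the second
# record starts one slot too early.
try:
    read_refs(write_refs([3, 7], False, always_write_tag=True), 2, False)
except ValueError as err:
    print(err)
```

The first record still reads correctly in the mismatched case; the failure only shows up at the next record.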


Re: [PATCH] PR 95079 Improve unordered_map insert_or_assign and try_emplace

2020-05-29 Thread Jonathan Wakely via Gcc-patches

On 29/05/20 10:18 +0200, François Dumont via Libstdc++ wrote:
I added a try_emplace at the underlying _Hashtable level which I use 
in both insert_or_assign and try_emplace.


I am not making any use of the hint for the moment. I'll review this 
once my other hashtable patches are being validated.


            PR libstdc++/95079
            * include/bits/hashtable_policy.h 
(_Insert_base<>::try_emplace): New.
            * include/bits/unordered_map.h 
(unordered_map<>::try_emplace): Adapt.

            (unordered_map<>::insert_or_assign): Adapt.

Tested under Linux x86_64.

Ok to commit ?

François




diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index ef120134914..5c6c81bcf21 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -848,6 +848,28 @@ namespace __detail
return __h._M_insert(__hint, __v, __node_gen, __unique_keys());
  }

+  template 


Our coding conventions say no space in "template <" here.


Otherwise looks great, thanks.

OK for master. Please be aware of the new ChangeLog policy I forwarded
to the mailing list the other day.




[PATCH] Remove references to SVN in libsanitizer.

2020-05-29 Thread Martin Liška

Simple documentation update based on usage of GIT by both
LLVM and GCC.

libsanitizer/ChangeLog:

* HOWTO_MERGE: Do not mention non-existent argument.
* README.gcc: Update LLVM repository location.
---
 libsanitizer/HOWTO_MERGE |  3 +--
 libsanitizer/README.gcc  | 16 
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/libsanitizer/HOWTO_MERGE b/libsanitizer/HOWTO_MERGE
index a47a26a4a74..73cce4f6f7d 100644
--- a/libsanitizer/HOWTO_MERGE
+++ b/libsanitizer/HOWTO_MERGE
@@ -3,8 +3,7 @@ track various ABI changes and GCC-specific patches carefully.  
Here is a
 general list of actions required to perform the merge:
 
 * Checkout recent GCC tree.

-* Run merge.sh script from libsanitizer directory.  The script accepts one
-  argument that is control version system (svn or git).
+* Run merge.sh script from libsanitizer directory.
 * Modify Makefile.am files into 
asan/tsan/lsan/ubsan/sanitizer_common/interception
   directories if needed.  In particular, you may need to add new source files
   and remove old ones in source files list, add new flags to {C, CXX}FLAGS if
diff --git a/libsanitizer/README.gcc b/libsanitizer/README.gcc
index ec491ba4bf8..fdb0ec5ba09 100644
--- a/libsanitizer/README.gcc
+++ b/libsanitizer/README.gcc
@@ -3,14 +3,14 @@ projects initially developed by Google Inc.
 
 Both tools consist of a compiler module and a run-time library.

 The sources of the run-time library for these projects are hosted at
-https://llvm.org/svn/llvm-project/compiler-rt in the following directories:
-  include/sanitizer
-  lib/sanitizer_common
-  lib/interception
-  lib/asan
-  lib/tsan
-  lib/lsan
-  lib/ubsan
+https://github.com/llvm/llvm-project in the following directories:
+  compiler-rt/include/sanitizer
+  compiler-rt/lib/sanitizer_common
+  compiler-rt/lib/interception
+  compiler-rt/lib/asan
+  compiler-rt/lib/tsan
+  compiler-rt/lib/lsan
+  compiler-rt/lib/ubsan
 
 Trivial and urgent fixes (portability, build fixes, etc.) may go directly to the

 GCC tree.  All non-trivial changes, functionality improvements, etc. should go
--
2.26.2



Re: [PATCH PR95332] gcov-tool: Flexible endian adjustment for merging coverage data

2020-05-29 Thread dongjianqiang (A)
Thank you.

Regards,
Dong JianQiang

> On 5/28/20 8:19 AM, Martin Liška wrote:
> > On 5/28/20 4:07 AM, dongjianqiang (A) wrote:
> >> Thanks for reviewing this. Could you please help install this patch? I am
> >> not a gcc committer.
> >
> > I've just done that.
> >
> > For the next time, please add ChangeLog entries to the git commit
> > message. We no longer modify ChangeLog files; these are generated
> > automatically from commit messages.
> > Martin
> 
> I've tested the patch on active branches and I'm going to install it.
> 
> Martin


RE: [PATCH] extend cselim to check non-trapping for more references (PR tree-optimizaton/89430)

2020-05-29 Thread Hao Liu OS via Gcc-patches
Hi Richard,

Thanks for your comments. It's a good idea to simplify the code and remove
get_inner_reference. I've updated the patch accordingly. I also simplified the
code to ignore other loads, which do not help in checking whether a store can
trap.

About tests:
1.  All previously XFAIL tests (gcc.dg/tree-ssa/pr89430-*) for pr89430 now
pass with this patch.
2.  ssa-pre-17.c fails because cselim optimizes the conditional store, so
"-fno-tree-cselim" is added. That case is added as a new test case for pr89430.
3.  Other test cases (including the regression test for pr94734) in the gcc
testsuite are not affected by this patch, according to gcc "make check".
4.  Some other benchmarks were also tested for correctness and performance.
The performance regression mentioned in pr89430 is fixed with this patch.
 
Review, please.

gcc/ChangeLog:

PR tree-optimization/89430
* tree-ssa-phiopt.c (cond_store_replacement): Extend non-trap checking to
support ARRAY_REFs and COMPONENT_REFs.  Support a special case: if there is
a dominating load of local variable without address escape, a store is not
trapped (as local stack is always writable).  Other loads are ignored for
simplicity, as they don't help to check if a store can be trapped.

gcc/testsuite/ChangeLog:

PR tree-optimization/89430
* gcc.dg/tree-ssa/pr89430-1.c: Remove xfail.
* gcc.dg/tree-ssa/pr89430-2.c: Remove xfail.
* gcc.dg/tree-ssa/pr89430-5.c: Remove xfail.
* gcc.dg/tree-ssa/pr89430-6.c: Remove xfail.
* gcc.dg/tree-ssa/pr89430-7-comp-ref.c: New test.
* gcc.dg/tree-ssa/ssa-pre-17.c: Add -fno-tree-cselim.

---
 gcc/testsuite/gcc.dg/tree-ssa/pr89430-1.c |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr89430-2.c |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr89430-5.c |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr89430-6.c |   2 +-
 .../gcc.dg/tree-ssa/pr89430-7-comp-ref.c  |  17 +++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-17.c|   2 +-
 gcc/tree-ssa-phiopt.c | 140 +-
 7 files changed, 91 insertions(+), 76 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr89430-7-comp-ref.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr89430-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr89430-1.c
index ce242ba569b..8ee1850ac63 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr89430-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr89430-1.c
@@ -9,4 +9,4 @@ unsigned test(unsigned k, unsigned b) {
 return a[0]+a[1];
 }
 
-/* { dg-final { scan-tree-dump "Conditional store replacement" "cselim" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "Conditional store replacement" "cselim" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr89430-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr89430-2.c
index 90ae36bfce2..9b96875ac7a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr89430-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr89430-2.c
@@ -11,4 +11,4 @@ unsigned test(unsigned k, unsigned b) {
 return a[0]+a[1];
 }
 
-/* { dg-final { scan-tree-dump "Conditional store replacement" "cselim" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "Conditional store replacement" "cselim" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr89430-5.c b/gcc/testsuite/gcc.dg/tree-ssa/pr89430-5.c
index c633cbe947d..b2d04119381 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr89430-5.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr89430-5.c
@@ -13,4 +13,4 @@ int test(int b, int k) {
 return a.data[0] + a.data[1];
 }
 
-/* { dg-final { scan-tree-dump "Conditional store replacement" "cselim" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "Conditional store replacement" "cselim" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr89430-6.c b/gcc/testsuite/gcc.dg/tree-ssa/pr89430-6.c
index 7cad563128d..8d3c4f7cc6a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr89430-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr89430-6.c
@@ -16,4 +16,4 @@ int test(int b, int k) {
 return a.data[0].x + a.data[1].x;
 }
 
-/* { dg-final { scan-tree-dump "Conditional store replacement" "cselim" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "Conditional store replacement" "cselim" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr89430-7-comp-ref.c b/gcc/testsuite/gcc.dg/tree-ssa/pr89430-7-comp-ref.c
new file mode 100644
index 000..c35a2afc70b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr89430-7-comp-ref.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cselim-details" } */
+
+typedef union {
+  int i;
+  float f;
+} U;
+
+int foo(U *u, int b, int i)
+{
+  u->i = 0;
+  if (b)
+u->i = i;
+  return u->i;
+}
+
+/* { dg-final { scan-tree-dump "Conditional store replacement" "cselim" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-17.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-17.c
index 09313716598..a06f339f0bb 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-17.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-17.c
@@ -1,5 +1,5 @@
 /* { dg-do 

Re: [PATCH PR95332] gcov-tool: Flexible endian adjustment for merging coverage data

2020-05-29 Thread Martin Liška

On 5/28/20 8:19 AM, Martin Liška wrote:
> On 5/28/20 4:07 AM, dongjianqiang (A) wrote:
>> Thanks for reviewing this. Could you please help install this patch? I am
>> not a GCC committer.
>
> I've just done that.
>
> For next time, please add ChangeLog entries to the git commit message. We
> no longer modify ChangeLog files; they are generated automatically from
> commit messages.
>
> Martin

I've tested the patch on active branches and I'm going to install it.

Martin


[PATCH] Fix various limitations of git-backport.py.

2020-05-29 Thread Martin Liška

I've just tested the script and I'm going to install the patch
to all active branches.

contrib/ChangeLog:

* git-backport.py: The script did 'git co HEAD~' when
there was no modified ChangeLog file in a successful
git cherry pick.
Run cherry-pick --continue without editor.
---
 contrib/git-backport.py | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/contrib/git-backport.py b/contrib/git-backport.py
index 6a115c34d40..3a9413dcd27 100755
--- a/contrib/git-backport.py
+++ b/contrib/git-backport.py
@@ -30,9 +30,13 @@ if __name__ == '__main__':
 
     r = subprocess.run('git cherry-pick -x %s' % args.revision, shell=True)
     if r.returncode == 0:
-        cmd = 'git show --name-only --pretty="" -- "*ChangeLog" |' \
-              'xargs git checkout HEAD~'
-        subprocess.check_output(cmd, shell=True)
+        cmd = 'git show --name-only --pretty="" -- "*ChangeLog"'
+        changelogs = subprocess.check_output(cmd, shell=True, encoding='utf8')
+        changelogs = changelogs.strip()
+        if changelogs:
+            for changelog in changelogs.split('\n'):
+                subprocess.check_output('git checkout HEAD~ %s' % changelog,
+                                        shell=True)
         subprocess.check_output('git commit --amend --no-edit', shell=True)
     else:
         # 1) remove all ChangeLog files from conflicts
@@ -55,6 +59,7 @@ if __name__ == '__main__':
 
         # try to continue
         if len(conflicts) == len(changelogs):
-            subprocess.check_output('git cherry-pick --continue', shell=True)
+            cmd = 'git -c core.editor=true cherry-pick --continue'
+            subprocess.check_output(cmd, shell=True)
         else:
             print('Please resolve all remaining file conflicts.')
--
2.26.2
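
For illustration, the ChangeLog handling from the first hunk can be sketched on its own. This is only a minimal sketch, not the script itself: `checkout_commands` is a hypothetical helper that builds the command strings instead of running them, which makes the empty-output case easy to see.

```python
def checkout_commands(show_output):
    """Return one 'git checkout HEAD~ <file>' command per listed ChangeLog.

    show_output is the text printed by
    'git show --name-only --pretty="" -- "*ChangeLog"'.
    """
    changelogs = show_output.strip()
    if not changelogs:
        # No ChangeLog was modified, so there is nothing to check out.
        return []
    return ['git checkout HEAD~ %s' % path for path in changelogs.split('\n')]


print(checkout_commands(''))
print(checkout_commands('gcc/ChangeLog\ngcc/testsuite/ChangeLog\n'))
```

With no modified ChangeLog the list is empty. The old `xargs` pipeline, at least with GNU xargs (which runs the command once on empty input unless `-r` is given), ended up running a bare `git checkout HEAD~` in that case.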



[PATCH] tree-optimization/95403 - guard vect_init_vector_1 against NULL stmt_info

2020-05-29 Thread Richard Biener


Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

2020-05-29  Richard Biener  

PR tree-optimization/95403
* tree-vect-stmts.c (vect_init_vector_1): Guard against NULL
stmt_vinfo.

* gfortran.dg/vect/pr95403.f: New testcase.
---
 gcc/testsuite/gfortran.dg/vect/pr95403.f | 16 
 gcc/tree-vect-stmts.c|  2 +-
 2 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/vect/pr95403.f

diff --git a/gcc/testsuite/gfortran.dg/vect/pr95403.f b/gcc/testsuite/gfortran.dg/vect/pr95403.f
new file mode 100644
index 000..248958b524c
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/vect/pr95403.f
@@ -0,0 +1,16 @@
+! { dg-do compile }
+  subroutine deuldlag(xi,et,ze,xlag,xeul,xj,xs)
+  real*8 shp(3,20),xs(3,3),xlag(3,20),xeul(3,20)
+  do i=1,3
+do j=1,3
+enddo
+  enddo
+  do i=1,3
+do j=1,3
+  xs(i,j)=0.d0
+  do k=1,20
+xs(i,j)=xs(i,j)+xeul(i,k)*shp(j,k)
+  enddo
+enddo
+  enddo
+  end
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 58fb93d740a..a5f1b52d498 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1330,7 +1330,7 @@ vect_init_vector_1 (vec_info *vinfo, stmt_vec_info stmt_vinfo, gimple *new_stmt,
  basic_block new_bb;
  edge pe;
 
- if (nested_in_vect_loop_p (loop, stmt_vinfo))
+ if (stmt_vinfo && nested_in_vect_loop_p (loop, stmt_vinfo))
loop = loop->inner;
 
  pe = loop_preheader_edge (loop);
-- 
2.26.2


[PATCH] RISC-V: align RISC-V software division with hardware specification in case of division by zero

2020-05-29 Thread MOSER Virginie via Gcc-patches
The assembly code in libgcc/config/riscv/div.S does not handle division by 
zero as specified in the riscv-spec v2.2, chapter 6.2, for signed division:

"The quotient of division by zero has all bits set, i.e. 2^XLEN − 1 for unsigned 
division or −1 for signed division."

When a negative number is divided by zero in the __divdi3 function, the result 
is 1 instead of -1.

Since there is a RISC-V-specific implementation of software division, and 
even though the C language leaves the result of division by zero undefined, it 
would be worth aligning the software division with the hardware behaviour so 
that the compliance tests pass regardless of whether -mno-div or -mdiv is used. 
This is especially true as the patch is simple and adds neither code size nor 
execution time.

The patch proposes that, when the dividend is negative, a zero divisor is 
treated as negative so that the result of the unsigned division is not 
negated. This consists of replacing a "branch if greater than or equal to zero 
(bgez)" with a "branch if greater than zero (bgtz)".
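
To make the effect concrete, here is a small Python model of the sign handling on the negative-dividend path. It is only a sketch, not the assembly: it assumes XLEN = 64 and an unsigned divide that returns all bits set for a zero divisor (as in the quoted specification), and the names `udiv`, `div_negative_dividend`, `divdi3_old` and `divdi3_new` are hypothetical.

```python
XLEN = 64
MASK = (1 << XLEN) - 1


def to_signed(x):
    """Interpret an XLEN-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << XLEN) if x >> (XLEN - 1) else x


def udiv(a, b):
    """Unsigned XLEN-bit division; a zero divisor yields all bits set."""
    a &= MASK
    b &= MASK
    return MASK if b == 0 else a // b


def div_negative_dividend(a0, a1, branch_taken):
    """Model of the path entered when the dividend a0 is negative.

    branch_taken(a1) stands for the bgez/bgtz test on the divisor.
    """
    a0 = -a0
    if branch_taken(a1):
        # Compute udiv(-a0, a1), then negate the result.
        return to_signed(-udiv(a0, a1))
    # Otherwise compute udiv(-a0, -a1) and return it unchanged.
    return to_signed(udiv(a0, -a1))


def divdi3_old(a0, a1):
    return div_negative_dividend(a0, a1, lambda d: d >= 0)  # bgez


def divdi3_new(a0, a1):
    return div_negative_dividend(a0, a1, lambda d: d > 0)  # bgtz


print(divdi3_old(-7, 0))   # 1
print(divdi3_new(-7, 0))   # -1
print(divdi3_new(-7, 2))   # -3
print(divdi3_new(-7, -2))  # 3
```

In this model only the zero-divisor case changes; non-zero divisors take the same branch under both tests.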

Virginie

---

ChangeLog

* libgcc/config/riscv/div.S: Change handling of signed division by zero in case 
of negative dividend to align with RISC-V specification for hardware division.



diff --git a/libgcc/config/riscv/div.S b/libgcc/config/riscv/div.S
index 922a4338042..57f5856e11d 100644
--- a/libgcc/config/riscv/div.S
+++ b/libgcc/config/riscv/div.S
@@ -107,7 +107,8 @@ FUNC_END (__umoddi3)
   /* Handle negative arguments to __divdi3.  */
.L10:
   neg   a0, a0
-  bgez  a1, .L12  /* Compute __udivdi3(-a0, a1), then negate the result.  */
+  bgtz  a1, .L12  /* Compute __udivdi3(-a0, a1), then negate the result.  */
+  /* Zero is handled as a negative so that the result will not be inverted */
   neg   a1, a1
   j __udivdi3 /* Compute __udivdi3(-a0, -a1).  */


[committed] openmp: One omp_resolve_declare_variant followup [PR95315]

2020-05-29 Thread Jakub Jelinek via Gcc-patches
Hi!

As noticed by Arseny, I got the condition for when to add the removal hook
wrong.  Fixed thusly, bootstrapped/regtested on x86_64-linux and
i686-linux, committed to trunk.
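
The inverted condition is easy to reproduce in a small model. The sketch below is purely illustrative: `Symtab`, `remove_hook` and `resolve` are hypothetical stand-ins, and the holder is kept in a dict rather than in a function-local static.

```python
class Symtab:
    """Hypothetical stand-in for the symbol table's hook registry."""

    def __init__(self):
        self.removal_hooks = []

    def add_removal_hook(self, hook):
        self.removal_hooks.append(hook)
        return hook  # acts as the non-None holder


def remove_hook(node):
    """Placeholder for the real removal callback."""


def resolve(symtab, state, should_register):
    """Register the hook when should_register(holder) is true."""
    if should_register(state.get('holder')):
        state['holder'] = symtab.add_removal_hook(remove_hook)


old_symtab, old_state = Symtab(), {}
new_symtab, new_state = Symtab(), {}
for _ in range(3):
    # Old test: "if (holder)" is never true, the holder starts out empty.
    resolve(old_symtab, old_state, lambda holder: holder is not None)
    # Fixed test: "if (!holder)" registers on the first call only.
    resolve(new_symtab, new_state, lambda holder: holder is None)

print(len(old_symtab.removal_hooks), len(new_symtab.removal_hooks))
```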

2020-05-28  Jakub Jelinek  

PR middle-end/95315
* omp-general.c (omp_resolve_declare_variant): Fix up addition of
declare variant cgraph node removal callback.

* gcc.dg/gomp/pr95315-2.c: New test.

--- gcc/omp-general.c.jj    2020-05-27 10:25:35.855578064 +0200
+++ gcc/omp-general.c   2020-05-28 19:39:38.575186925 +0200
@@ -1851,7 +1851,7 @@ omp_resolve_declare_variant (tree base)
}
 
   static struct cgraph_node_hook_list *node_removal_hook_holder;
-  if (node_removal_hook_holder)
+  if (!node_removal_hook_holder)
node_removal_hook_holder
  = symtab->add_cgraph_removal_hook (omp_declare_variant_remove_hook,
 NULL);
--- gcc/testsuite/gcc.dg/gomp/pr95315-2.c.jj    2020-05-28 19:41:26.765630849 +0200
+++ gcc/testsuite/gcc.dg/gomp/pr95315-2.c   2020-05-28 19:32:34.511286132 +0200
@@ -0,0 +1,46 @@
+/* PR middle-end/95315 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp --param ggc-min-heapsize=0" } */
+
+typedef float __v4sf __attribute__((vector_size (16)));
+typedef int __v4si __attribute__((vector_size (16)));
+typedef float __v8sf __attribute__((vector_size (32)));
+typedef int __v8si __attribute__((vector_size (32)));
+__v4si f1 (__v4sf, __v4sf, float *);
+__v8si f2 (__v8sf, __v8sf, float *);
+__v4si f3 (__v4si, int, __v4si);
+
+#pragma omp declare variant (f1) match (construct={parallel,for,simd(simdlen(4),notinbranch,uniform(z),aligned(z:4 * sizeof (*z)))})
+#pragma omp declare variant (f2) match (construct={for,simd(uniform(z),simdlen(8),notinbranch)})
+int f4 (float x, float y, float *z);
+
+#pragma omp declare variant (f3) match (construct={simd(simdlen(4),inbranch,linear(y:1))})
+int f5 (int x, int y);
+
+static inline __attribute__((always_inline)) int
+ret_false (void)
+{
+  return 0;
+}
+
+void
+test (int *x, float *y, float *z, float *w)
+{
+  #pragma omp parallel
+  #pragma omp for simd aligned (w:4 * sizeof (float))
+  for (int i = 0; i < 1024; i++)
+if (ret_false ())
+  x[i] = f4 (y[i], z[i], w);
+  #pragma omp parallel for simd aligned (w:4 * sizeof (float)) simdlen(4)
+  for (int i = 1024; i < 2048; i++)
+if (ret_false ())
+  x[i] = f4 (y[i], z[i], w);
+  #pragma omp simd aligned (w:4 * sizeof (float))
+  for (int i = 2048; i < 4096; i++)
+if (ret_false ())
+  x[i] = f4 (y[i], z[i], w);
+  #pragma omp simd
+  for (int i = 4096; i < 8192; i++)
+if (x[i] > 10 && ret_false ())
+  x[i] = f5 (x[i], i);
+}


Jakub



[PATCH] libgfortran: Export forgotten _gfortran_{,m,s}findloc{0,1}_c10 [PR95390]

2020-05-29 Thread Jakub Jelinek via Gcc-patches
Hi!

I have noticed we don't export these 6 symbols and thus the testcase
below fails to link.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk and 10.2?

2020-05-28  Jakub Jelinek  

PR libfortran/95390
* gfortran.dg/findloc_8.f90: New test.

* Makefile.am (i_findloc0_c): Add findloc0_c10.c.
(i_findloc1_c): Add findloc1_c10.c.
* gfortran.map (GFORTRAN_10.2): New symbol version, export
_gfortran_{,m,s}findloc{0,1}_c10 symbols.
* Makefile.in: Regenerated.
* generated/findloc0_c10.c: Generated.
* generated/findloc1_c10.c: Generated.

--- gcc/testsuite/gfortran.dg/findloc_8.f90.jj  2020-05-28 16:52:19.170772844 +0200
+++ gcc/testsuite/gfortran.dg/findloc_8.f90 2020-05-28 17:33:46.252731963 +0200
@@ -0,0 +1,29 @@
+! PR libfortran/95390
+! { dg-do run { target fortran_real_10 } }
+
+  complex(kind=10) :: a(6), b, d(2,2)
+  logical :: m(6), n, o(2,2)
+  integer :: c(1), e(2)
+  a = (/ 1., 2., 17., 2., 2., 6. /)
+  b = 17.
+  c = findloc (a, b)
+  if (c(1) /= 3) stop 1
+  m = (/ .true., .false., .true., .true., .true., .true. /)
+  n = .true.
+  b = 2.
+  c = findloc (a, b, m)
+  if (c(1) /= 4) stop 2
+  c = findloc (a, b, n)
+  if (c(1) /= 2) stop 3
+  d = reshape((/ 1., 2., 2., 3. /), (/ 2, 2 /))
+  e = findloc (d, b, 1)
+  if (e(1) /= 2 .or. e(2) /= 1) stop 4
+  o = reshape((/ .true., .false., .true., .true. /), (/ 2, 2 /))
+  e = findloc (d, b, 1, o)
+  if (e(1) /= 0 .or. e(2) /= 1) stop 5
+  e = findloc (d, b, 1, n)
+  if (e(1) /= 2 .or. e(2) /= 1) stop 6
+  n = .false.
+  e = findloc (d, b, 1, n)
+  if (e(1) /= 0 .or. e(2) /= 0) stop 7
+end
--- libgfortran/Makefile.am.jj  2020-04-07 21:29:25.939086424 +0200
+++ libgfortran/Makefile.am 2020-05-28 17:34:38.314981779 +0200
@@ -283,6 +283,7 @@ $(srcdir)/generated/findloc0_r10.c \
 $(srcdir)/generated/findloc0_r16.c \
 $(srcdir)/generated/findloc0_c4.c \
 $(srcdir)/generated/findloc0_c8.c \
+$(srcdir)/generated/findloc0_c10.c \
 $(srcdir)/generated/findloc0_c16.c
 
 i_findloc0s_c= \
@@ -301,6 +302,7 @@ $(srcdir)/generated/findloc1_r10.c \
 $(srcdir)/generated/findloc1_r16.c \
 $(srcdir)/generated/findloc1_c4.c \
 $(srcdir)/generated/findloc1_c8.c \
+$(srcdir)/generated/findloc1_c10.c \
 $(srcdir)/generated/findloc1_c16.c
 
 i_findloc1s_c= \
--- libgfortran/gfortran.map.jj 2020-04-07 21:29:25.942086379 +0200
+++ libgfortran/gfortran.map    2020-05-28 17:38:07.734964170 +0200
@@ -1619,3 +1619,13 @@ GFORTRAN_10 {
   _gfortran_tand_r10;
   _gfortran_tand_r16;
 } GFORTRAN_9.2;
+
+GFORTRAN_10.2 {
+  global:
+  _gfortran_findloc0_c10;
+  _gfortran_mfindloc0_c10;
+  _gfortran_sfindloc0_c10;
+  _gfortran_findloc1_c10;
+  _gfortran_mfindloc1_c10;
+  _gfortran_sfindloc1_c10;
+} GFORTRAN_10;
--- libgfortran/Makefile.in.jj  2020-04-07 21:29:25.942086379 +0200
+++ libgfortran/Makefile.in 2020-05-28 17:36:48.530105462 +0200
@@ -373,12 +373,12 @@ am__objects_46 = minval1_s1.lo minval1_s
 am__objects_47 = findloc0_i1.lo findloc0_i2.lo findloc0_i4.lo \
findloc0_i8.lo findloc0_i16.lo findloc0_r4.lo findloc0_r8.lo \
findloc0_r10.lo findloc0_r16.lo findloc0_c4.lo findloc0_c8.lo \
-   findloc0_c16.lo
+   findloc0_c10.lo findloc0_c16.lo
 am__objects_48 = findloc0_s1.lo findloc0_s4.lo
 am__objects_49 = findloc1_i1.lo findloc1_i2.lo findloc1_i4.lo \
findloc1_i8.lo findloc1_i16.lo findloc1_r4.lo findloc1_r8.lo \
findloc1_r10.lo findloc1_r16.lo findloc1_c4.lo findloc1_c8.lo \
-   findloc1_c16.lo
+   findloc1_c10.lo findloc1_c16.lo
 am__objects_50 = findloc1_s1.lo findloc1_s4.lo
 am__objects_51 = findloc2_s1.lo findloc2_s4.lo
 am__objects_52 = ISO_Fortran_binding.lo
@@ -844,6 +844,7 @@ $(srcdir)/generated/findloc0_r10.c \
 $(srcdir)/generated/findloc0_r16.c \
 $(srcdir)/generated/findloc0_c4.c \
 $(srcdir)/generated/findloc0_c8.c \
+$(srcdir)/generated/findloc0_c10.c \
 $(srcdir)/generated/findloc0_c16.c
 
 i_findloc0s_c = \
@@ -862,6 +863,7 @@ $(srcdir)/generated/findloc1_r10.c \
 $(srcdir)/generated/findloc1_r16.c \
 $(srcdir)/generated/findloc1_c4.c \
 $(srcdir)/generated/findloc1_c8.c \
+$(srcdir)/generated/findloc1_c10.c \
 $(srcdir)/generated/findloc1_c16.c
 
 i_findloc1s_c = \
@@ -1822,6 +1824,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/extends_type_of.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fbuf.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/file_pos.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/findloc0_c10.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/findloc0_c16.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/findloc0_c4.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/findloc0_c8.Plo@am__quote@
@@ -1836,6 +1839,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/findloc0_r8.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ 

Re: [PATCH 5/7] vect: Support vector load/store with length in vectorizer

2020-05-29 Thread Richard Sandiford
"Kewen.Lin"  writes:
> on 2020/5/27 6:02 PM, Richard Sandiford wrote:
>> "Kewen.Lin"  writes:
>>> Hi Richard,
>>>
>>> Thanks for your comments!
>>>
>>> on 2020/5/26 8:49, Richard Sandiford wrote:
>>>> "Kewen.Lin"  writes:
>>>>> @@ -626,6 +645,12 @@ public:
>>>>>    /* True if have decided to use a fully-masked loop.  */
>>>>>    bool fully_masked_p;
>>>>>  
>>>>> +  /* Records whether we still have the option of using a length access loop.  */
>>>>> +  bool can_with_length_p;
>>>>> +
>>>>> +  /* True if have decided to use length access for the loop fully.  */
>>>>> +  bool fully_with_length_p;
>>>>
>>>> Rather than duplicate the flags like this, I think we should have
>>>> three bits of information:
>>>>
>>>> (1) Can the loop operate on partial vectors?  Starts off optimistically
>>>> assuming "yes", gets set to "no" when we find a counter-example.
>>>>
>>>> (2) If we do decide to use partial vectors, will we need loop masks?
>>>>
>>>> (3) If we do decide to use partial vectors, will we need lengths?
>>>>
>>>> Vectorisation using partial vectors succeeds if (1) && ((2) != (3))
>>>>
>>>> LOOP_VINFO_CAN_FULLY_MASK_P currently tracks (1) and
>>>> LOOP_VINFO_MASKS currently tracks (2).  In pathological cases it's
>>>> already possible to have (1) && !(2), see r9-6240 for an example.
>>>>
>>>> With the new support, LOOP_VINFO_LENS tracks (3).
>>>>
>>>> So I don't think we need the can_with_length_p.  What is now
>>>> LOOP_VINFO_CAN_FULLY_MASK_P can continue to track (1) for both
>>>> approaches, with the final choice of approach only being made
>>>> at the end.  Maybe it would be worth renaming it to something
>>>> more generic though, now that we have two approaches to partial
>>>> vectorisation.
>>>
>>> I like this idea!  I could be wrong, but I'm afraid that we
>>> cannot have one common flag shared by both approaches, since
>>> the check criteria could differ between them.  A counter-example
>>> for length could be acceptable for masking: for instance, length
>>> can only allow CONTIGUOUS related modes, but masking can support
>>> more.  When we see an acceptable VMAT_LOAD_STORE_LANES, we leave
>>> LOOP_VINFO_CAN_FULLY_MASK_P true; should the later length check
>>> turn it to false?  I guess not.  Assuming it stays true, then
>>> LOOP_VINFO_CAN_FULLY_MASK_P will mean partial vectorization only
>>> for masking, not for both.  We can probably clear LOOP_VINFO_LENS
>>> when the length check fails, but then we only know the vec is empty,
>>> not whether we are unable to do partial vectorization with length;
>>> when we see LOOP_VINFO_CAN_FULLY_MASK_P true, we could still
>>> record length into it if possible.
>> 
>> Let's call the flag in (1) CAN_USE_PARTIAL_VECTORS_P rather than
>> CAN_FULLY_MASK_P to (try to) avoid any confusion from the current name.
>> 
>> What I meant is that each vectorizable_* routine has the responsibility
>> of finding a way of coping with partial vectorisation, or setting
>> CAN_USE_PARTIAL_VECTORS_P to false if it can't.
>> 
>> vectorizable_load chooses the VMAT first, and then decides based on that
>> whether partial vectorisation is supported.  There's no influence in
>> the other direction (partial vectorisation doesn't determine the VMAT).
>> 
>> So once it has chosen a VMAT, vectorizable_load needs to try to find a way
>> of handling the operation with partial vectorisation.  Currently the only
>> way of doing that for VMAT_LOAD_STORE_LANES is using masks.  So at the
>> moment there are two possible outcomes:
>> 
>> - The target supports the necessary IFN_MASK_LOAD_LANES function.
>>   If so, we can use partial vectorisation for the statement, so we
>>   leave CAN_USE_PARTIAL_VECTORS_P true and record the necessary masks
>>   in LOOP_VINFO_MASKS.
>> 
>> - The target doesn't support the necessary IFN_MASK_LOAD_LANES function.
>>   If so, we can't use partial vectorisation, so we clear
>>   CAN_USE_PARTIAL_VECTORS_P.
>> 
>> That's how things work at the moment.  It would work in the same way
>> for lengths if we ever supported IFN_LEN_LOAD_LANES: we'd check whether
>> IFN_LEN_LOAD_LANES is available and record the length in LOOP_VINFO_LENS
>> if so.  If partial vectorisation isn't supported (via masks or lengths),
>> we'd continue to clear CAN_USE_PARTIAL_VECTORS_P.
>> 
>> But equally, if we never add support for IFN_LEN_LOAD_LANES, the current
>> code continues to work with length-based approaches.  We'll continue to
>> clear CAN_USE_PARTIAL_VECTORS_P for VMAT_LOAD_STORE_LANES when the
>> target provides no IFN_MASK_LOAD_LANES function.
>> 
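
The bookkeeping described above can be modelled in a few lines. This is only an illustrative sketch in Python, not vectorizer code: `LoopInfo` and `analyze_load_store_lanes` are hypothetical names, the length branch stands for an IFN_LEN_LOAD_LANES that does not exist at the moment, and preferring masks when both are available is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LoopInfo:
    # (1) Starts optimistic; cleared on the first counter-example.
    can_use_partial_vectors: bool = True
    # (2) Masks recorded by statements that need loop masks.
    masks: list = field(default_factory=list)
    # (3) Lengths recorded by statements that need lengths.
    lens: list = field(default_factory=list)

    def partial_vectors_succeed(self):
        # (1) && ((2) != (3)): exactly one of the two approaches is needed.
        return (self.can_use_partial_vectors
                and bool(self.masks) != bool(self.lens))


def analyze_load_store_lanes(info, has_mask_load_lanes,
                             has_len_load_lanes=False):
    """Model one VMAT_LOAD_STORE_LANES access after the VMAT is chosen."""
    if has_mask_load_lanes:
        info.masks.append("mask")
    elif has_len_load_lanes:
        # Hypothetical: only if IFN_LEN_LOAD_LANES were ever supported.
        info.lens.append("len")
    else:
        info.can_use_partial_vectors = False


masked = LoopInfo()
analyze_load_store_lanes(masked, has_mask_load_lanes=True)
print(masked.partial_vectors_succeed())

unsupported = LoopInfo()
analyze_load_store_lanes(unsupported, has_mask_load_lanes=False)
print(unsupported.partial_vectors_succeed())
```
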
>
> Thanks a lot for your detailed explanation!  This proposal looks good
> based on the current implementation of both masking and length.  I may
> be overthinking this, but I have a small concern, described below, for
> the case where some target supports both masking and length in future,
> for example if ppc adds masking support like SVE.
>
> I assumed that you meant each vectorizable_* routine should record the
> objs for any available partial 

[PATCH] PR 95079 Improve unordered_map insert_or_assign and try_emplace

2020-05-29 Thread François Dumont via Gcc-patches
I added a try_emplace at the underlying _Hashtable level which I use in 
both insert_or_assign and try_emplace.


I am not making any use of the hint for the moment. I'll revisit this 
once my other hashtable patches have been validated.


    PR libstdc++/95079
    * include/bits/hashtable_policy.h (_Insert_base<>::try_emplace): New.
    * include/bits/unordered_map.h (unordered_map<>::try_emplace): Adapt.
    (unordered_map<>::insert_or_assign): Adapt.

Tested under Linux x86_64.

OK to commit?

François

diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index ef120134914..5c6c81bcf21 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -848,6 +848,28 @@ namespace __detail
 	return __h._M_insert(__hint, __v, __node_gen, __unique_keys());
   }
 
+  template 
+	std::pair
+	try_emplace(const_iterator, _KType&& __k, _Args&&... __args)
+	{
+	  __hashtable& __h = _M_conjure_hashtable();
+	  auto __code = __h._M_hash_code(__k);
+	  std::size_t __bkt = __h._M_bucket_index(__k, __code);
+	  if (__node_type* __node = __h._M_find_node(__bkt, __k, __code))
+	return { iterator(__node), false };
+
+	  typename __hashtable::_Scoped_node __node {
+	&__h,
+	std::piecewise_construct,
+	std::forward_as_tuple(std::forward<_KType>(__k)),
+	std::forward_as_tuple(std::forward<_Args>(__args)...)
+	};
+	  auto __it
+	= __h._M_insert_unique_node(__k, __bkt, __code, __node._M_node);
+	  __node._M_node = nullptr;
+	  return { __it, true };
+	}
+
   void
   insert(initializer_list __l)
   { this->insert(__l.begin(), __l.end()); }
diff --git a/libstdc++-v3/include/bits/unordered_map.h b/libstdc++-v3/include/bits/unordered_map.h
index 0071d62e462..33f632ddb79 100644
--- a/libstdc++-v3/include/bits/unordered_map.h
+++ b/libstdc++-v3/include/bits/unordered_map.h
@@ -469,17 +469,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	pair
 	try_emplace(const key_type& __k, _Args&&... __args)
 	{
-  iterator __i = find(__k);
-  if (__i == end())
-{
-  __i = emplace(std::piecewise_construct,
-std::forward_as_tuple(__k),
-std::forward_as_tuple(
-  std::forward<_Args>(__args)...))
-.first;
-  return {__i, true};
-}
-  return {__i, false};
+	  return _M_h.try_emplace(cend(), __k, std::forward<_Args>(__args)...);
 	}
 
   // move-capable overload
@@ -487,17 +477,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	pair
 	try_emplace(key_type&& __k, _Args&&... __args)
 	{
-  iterator __i = find(__k);
-  if (__i == end())
-{
-  __i = emplace(std::piecewise_construct,
-std::forward_as_tuple(std::move(__k)),
-std::forward_as_tuple(
-  std::forward<_Args>(__args)...))
-.first;
-  return {__i, true};
-}
-  return {__i, false};
+	  return _M_h.try_emplace(cend(), std::move(__k),
+  std::forward<_Args>(__args)...);
 	}
 
   /**
@@ -533,13 +514,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	try_emplace(const_iterator __hint, const key_type& __k,
 		_Args&&... __args)
 	{
-  iterator __i = find(__k);
-  if (__i == end())
-__i = emplace_hint(__hint, std::piecewise_construct,
-   std::forward_as_tuple(__k),
-   std::forward_as_tuple(
- std::forward<_Args>(__args)...));
-  return __i;
+	  return _M_h.try_emplace(__hint, __k,
+  std::forward<_Args>(__args)...).first;
 	}
 
   // move-capable overload
@@ -547,13 +523,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	iterator
 	try_emplace(const_iterator __hint, key_type&& __k, _Args&&... __args)
 	{
-  iterator __i = find(__k);
-  if (__i == end())
-__i = emplace_hint(__hint, std::piecewise_construct,
-   std::forward_as_tuple(std::move(__k)),
-   std::forward_as_tuple(
- std::forward<_Args>(__args)...));
-  return __i;
+	  return _M_h.try_emplace(__hint, std::move(__k),
+  std::forward<_Args>(__args)...).first;
 	}
 #endif // C++17
 
@@ -681,17 +652,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	pair
 	insert_or_assign(const key_type& __k, _Obj&& __obj)
 	{
-  iterator __i = find(__k);
-  if (__i == end())
-{
-  __i = emplace(std::piecewise_construct,
-std::forward_as_tuple(__k),
-std::forward_as_tuple(std::forward<_Obj>(__obj)))
-.first;
-  return {__i, true};
-}
-  (*__i).second = 

[PATCH] tree-optimization/95393 - fold MIN/MAX_EXPR generated by phiopt

2020-05-29 Thread Richard Biener
This makes sure to fold generated stmts so they do not survive
until RTL expansion and cause awkward code generation.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

2020-05-29  Richard Biener 

PR tree-optimization/95393
* tree-ssa-phiopt.c (minmax_replacement): Use gimple_build
to build the min/max expression so we simplify cases like
MAX(0, s) immediately.

* gcc.dg/tree-ssa/phi-opt-21.c: New testcase.
* g++.dg/vect/slp-pr87105.cc: Adjust.
---
 gcc/testsuite/g++.dg/vect/slp-pr87105.cc   |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-21.c | 15 +
 gcc/tree-ssa-phiopt.c  | 25 +++---
 3 files changed, 29 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-21.c

diff --git a/gcc/testsuite/g++.dg/vect/slp-pr87105.cc b/gcc/testsuite/g++.dg/vect/slp-pr87105.cc
index 5518f319be3..d07b1cd46b7 100644
--- a/gcc/testsuite/g++.dg/vect/slp-pr87105.cc
+++ b/gcc/testsuite/g++.dg/vect/slp-pr87105.cc
@@ -102,4 +102,4 @@ void quadBoundingBoxA(const Point bez[3], Box& bBox) noexcept {
 // { dg-final { scan-tree-dump-times "basic block part vectorized" 1 "slp2" { xfail { { ! vect_element_align } && { ! vect_hw_misalign } } } } }
 // It's a bit awkward to detect that all stores were vectorized but the
 // following more or less does the trick
-// { dg-final { scan-tree-dump "vect_iftmp\[^\r\m\]* = MIN" "slp2" { xfail { { ! vect_element_align } && { ! vect_hw_misalign } } } } }
+// { dg-final { scan-tree-dump "vect_\[^\r\m\]* = MIN" "slp2" { xfail { { ! vect_element_align } && { ! vect_hw_misalign } } } } }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-21.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-21.c
index 000..9f3d5695728
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-21.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-phiopt4-details" } */
+
+int f(unsigned s)
+{
+  int i;
+  for (i = 0; i < s; ++i)
+;
+
+  return i;
+}
+
+/* { dg-final { scan-tree-dump "converted to straightline code" "phiopt4" } } */
+/* Make sure we fold the detected MAX.  */
+/* { dg-final { scan-tree-dump-not "MAX" "phiopt4" } } */
diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
index b1e0dce93d8..bb97dcf63b4 100644
--- a/gcc/tree-ssa-phiopt.c
+++ b/gcc/tree-ssa-phiopt.c
@@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-inline.h"
 #include "case-cfn-macros.h"
 #include "tree-eh.h"
+#include "gimple-fold.h"
 
 static unsigned int tree_ssa_phiopt_worker (bool, bool, bool);
 static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
@@ -1364,7 +1365,6 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
 {
   tree result, type, rhs;
   gcond *cond;
-  gassign *new_stmt;
   edge true_edge, false_edge;
   enum tree_code cmp, minmax, ass_code;
   tree smaller, alt_smaller, larger, alt_larger, arg_true, arg_false;
@@ -1688,19 +1688,20 @@ minmax_replacement (basic_block cond_bb, basic_block middle_bb,
   gsi_move_before (&gsi_from, &gsi);
 }
 
-  /* Create an SSA var to hold the min/max result.  If we're the only
- things setting the target PHI, then we  can clone the PHI
- variable.  Otherwise we must create a new one.  */
-  result = PHI_RESULT (phi);
-  if (EDGE_COUNT (gimple_bb (phi)->preds) == 2)
-result = duplicate_ssa_name (result, NULL);
-  else
-result = make_ssa_name (TREE_TYPE (result));
-
   /* Emit the statement to compute min/max.  */
-  new_stmt = gimple_build_assign (result, minmax, arg0, arg1);
+  gimple_seq stmts = NULL;
+  tree phi_result = PHI_RESULT (phi);
+  result = gimple_build (&stmts, minmax, TREE_TYPE (phi_result), arg0, arg1);
+  /* Duplicate range info if we're the only things setting the target PHI.  */
+  if (!gimple_seq_empty_p (stmts)
+  && EDGE_COUNT (gimple_bb (phi)->preds) == 2
+  && !POINTER_TYPE_P (TREE_TYPE (phi_result))
+  && SSA_NAME_RANGE_INFO (phi_result))
+duplicate_ssa_name_range_info (result, SSA_NAME_RANGE_TYPE (phi_result),
+  SSA_NAME_RANGE_INFO (phi_result));
+
   gsi = gsi_last_bb (cond_bb);
-  gsi_insert_before (&gsi, new_stmt, GSI_NEW_STMT);
+  gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
 
   replace_phi_edge_with_variable (cond_bb, e1, phi, result);
 
-- 
2.26.2


Re: [PATCH]: aarch64: add support for unpacked EOR, ORR and AND

2020-05-29 Thread Richard Sandiford
Joe Ramsay  writes:
> This patch improves code generation for EOR, ORR and AND on unpacked vectors 
> with SVE. The following function:
> void f (unsigned int *x, unsigned short *y, unsigned short *z) {
>   for (int i = 0; i < 7; ++i)
> x[i] = (unsigned short) (y[i] & z[i]);
> }
>
> previously compiled to
> ptrue   p1.d, vl3
> ld1h    z0.d, p1/z, [x1, #1, mul vl]
> ptrue   p0.b, vl32
> st1h    z0.d, p0, [sp, #1, mul vl]
> ld1h    z0.d, p1/z, [x2, #1, mul vl]
> st1h    z0.d, p0, [sp]
> ldr     x3, [x2]
> ldp     x4, x2, [sp]
> ldr     x1, [x1]
> and     x1, x3, x1
> and     x2, x2, x4
> str     x2, [sp]
> ld1h    z0.d, p0/z, [sp]
> str     x1, [sp]
> uxth    z0.s, p0/m, z0.s
> st1w    z0.d, p1, [x0, #1, mul vl]
> ld1h    z0.d, p0/z, [sp]
> uxth    z0.s, p0/m, z0.s
> st1w    z0.d, p0, [x0]
> add     sp, sp, 16
> ret
>
> and now compiles to:
> ptrue   p0.s, vl7
> ptrue   p1.b, vl32
> ld1h    z1.s, p0/z, [x1]
> ld1h    z0.s, p0/z, [x2]
> add     z0.h, z0.h, z1.h
> uxth    z0.s, p1/m, z0.s
> st1w    z0.s, p0, [x0]
> ret

LGTM thanks.  Pushed to master.

Richard


[PATCH] git_commit: fix duplicite email address.

2020-05-29 Thread Martin Liška

This patch handles situations like the one seen
in 3ea6977d0f1813d982743a09660eec1760e981ec.

contrib/ChangeLog:

* gcc-changelog/git_commit.py: Properly
handle duplicate authors.
* gcc-changelog/test_email.py: New test.
* gcc-changelog/test_patches.txt: New patch.
---
 contrib/gcc-changelog/git_commit.py|  8 +++-
 contrib/gcc-changelog/test_email.py|  9 -
 contrib/gcc-changelog/test_patches.txt | 51 ++
 3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
index a24a251d8f3..084e83c18cc 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -232,6 +232,12 @@ class ChangeLogEntry:
     def is_empty(self):
         return not self.lines and self.prs == self.initial_prs
 
+    def contains_author(self, author):
+        for author_lines in self.author_lines:
+            if author_lines[0] == author:
+                return True
+        return False
+
 
 class GitCommit:
     def __init__(self, hexsha, date, author, body, modified_files,
@@ -408,7 +414,7 @@ class GitCommit:
                     self.changelog_entries.append(last_entry)
                     will_deduce = True
                 elif author_tuple:
-                    if author_tuple not in last_entry.author_lines:
+                    if not last_entry.contains_author(author_tuple[0]):
                         last_entry.author_lines.append(author_tuple)
                     continue
 
diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py
index 23372f082a0..e73b3626473 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -18,11 +18,11 @@
 
 import os
 import tempfile
-import unidiff
 import unittest
 
 from git_email import GitEmail
 
+import unidiff
 
 script_path = os.path.dirname(os.path.realpath(__file__))
 
@@ -305,3 +305,10 @@ class TestGccChangelog(unittest.TestCase):
         email = self.from_patch_glob(
             '0001-Ada-Add-support-for-XDR-streaming-in-the-default-run.patch')
         assert not email.errors
+
+    def test_duplicite_author_lines(self):
+        email = self.from_patch_glob('0001-Fortran-type-is-real-kind-1.patch')
+        assert (email.changelog_entries[0].author_lines[0][0]
+                == 'Steven G. Kargl  ')
+        assert (email.changelog_entries[0].author_lines[1][0]
+                == 'Mark Eggleston  ')
diff --git a/contrib/gcc-changelog/test_patches.txt 
b/contrib/gcc-changelog/test_patches.txt
index cc81fcd32b8..76037c33f93 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -2893,4 +2893,55 @@ index 9e0263b431d..37f3d030e3f 100644
 +
 --
 2.20.1
+=== 0001-Fortran-type-is-real-kind-1.patch ===
+From 3ea6977d0f1813d982743a09660eec1760e981ec Mon Sep 17 00:00:00 2001
+From: Mark Eggleston 
+Date: Wed, 1 Apr 2020 09:52:41 +0100
+Subject: [PATCH] Fortran  : "type is( real(kind(1.)) )" spurious syntax error
+ PR94397
+
+Based on a patch in the comments of the PR. That patch fixed this
+problem but caused the test cases for PR93484 to fail. It has been
+changed to reduce initialisation expressions if the expression is
+not EXPR_VARIABLE and not EXPR_CONSTANT.
+
+2020-05-28  Steven G. Kargl  
+   Mark Eggleston  
+
+gcc/fortran/
+
+   PR fortran/94397
+   * match.c (gfc_match_type_spec): New variable ok initialised
+   to true. Set ok with the return value of gfc_reduce_init_expr
+   called only if the expression is not EXPR_CONSTANT and is not
+   EXPR_VARIABLE. Add !ok to the check for type not being integer
+   or the rank being greater than zero.
+
+2020-05-28  Mark Eggleston  
+
+gcc/testsuite/
+
+   PR fortran/94397
+   * gfortran.dg/pr94397.F90: New test.
+---
+ gcc/fortran/match.c   |  5 -
+ gcc/testsuite/gfortran.dg/pr94397.F90 | 26 ++
+ 2 files changed, 30 insertions(+), 1 deletion(-)
+ create mode 100644 gcc/testsuite/gfortran.dg/pr94397.F90
 
+diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
+index 8ae34a94a95..82d2b5087e5 100644
+--- a/gcc/fortran/match.c
++++ b/gcc/fortran/match.c
+@@ -1 +1,2 @@
+
++
+diff --git a/gcc/testsuite/gfortran.dg/pr94397.F90 
b/gcc/testsuite/gfortran.dg/pr94397.F90
+new file mode 100644
+index 000..fda10c1a88b
+--- /dev/null
++++ b/gcc/testsuite/gfortran.dg/pr94397.F90
+@@ -0,0 +1 @@
++
+--
+2.26.2
--
2.26.2



RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

2020-05-29 Thread Yangfei (Felix)
Hi,

> -----Original Message-----
> From: Hongtao Liu [mailto:crazy...@gmail.com]
> Sent: Friday, May 29, 2020 11:24 AM
> To: H.J. Lu 
> Cc: Yangfei (Felix) ; gcc-patches@gcc.gnu.org;
> Uros Bizjak ; Jakub Jelinek ;
> Richard Sandiford 
> Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with
> fixed sve vector length
> 

Snip...

> > >
> > > This is due to define_subst magic.  The generators automatically
> > > create a vec_merge form of the instruction based on the information
> > > in the  attributes.
> > >
> > > AFAICT the rtl above is for the line-125 instruction, which looks ok.
> > > The problem is the line-126 instruction, since vcvtps2ph doesn't
> > > AIUI allow zero masking.
> > >
> 
> zero masking is not allowed for mem_operand here, but available for
> register_operand.
> there's something wrong in the pattern, we need to fix it.
> (define_insn "avx512f_vcvtps2ph512"

Thanks for confirming that :-)

> 
> > > The "mask" define_subst allows both zeroing and merging, so I guess
> > > this means that the pattern should either be using a different
> > > define_subst, or should be enforcing merging in some other way.
> > > Please could one of the x86 devs take a look?
> > >
> >
> > Hongtao, can you take a look?
> >
> > Thanks.
> >
> >
> > --
> > H.J.
> 
> BTW, I failed to build gcc after applying pr95254-v4.txt.
> 
> gcc configure:
> 
> Using built-in specs.
> COLLECT_GCC=./gcc/xgcc
> Target: x86_64-pc-linux-gnu
> Configured with: ../../gcc/gnu-toolchain/gcc/configure
> --enable-languages=c,c++,fortran --disable-bootstrap
> Thread model: posix
> Supported LTO compression algorithms: zlib
> gcc version 11.0.0 20200528 (experimental) (GCC)
> 
> host on x86_64 rel8.

Yes, I tried your configure options and reproduced the error.  Thanks for
pointing this out.
The patch passes bootstrap on x86_64 with the configure options below, so I
was surprised to see that it fails to build with yours.

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/yangfei/gcc-hacking/install-gcc/libexec/gcc/x86_64-pc-linux-gnu/11/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-git/configure 
--prefix=/home/yangfei/gcc-hacking/install-gcc 
--enable-languages=c,c++,objc,obj-c++,fortran,lto --enable-shared 
--enable-threads=posix --enable-checking=yes --disable-multilib 
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions 
--enable-gnu-unique-object --enable-linker-build-id 
--with-gcc-major-version-only --enable-plugin --enable-initfini-array 
--without-isl --disable-libmpx --enable-gnu-indirect-function
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20200526 (experimental) (GCC)

Felix


[PATCH v5 2/8] libstdc++ futex: Use FUTEX_CLOCK_REALTIME for wait

2020-05-29 Thread Mike Crowe via Gcc-patches
The futex system call supports waiting for an absolute time if
FUTEX_WAIT_BITSET is used rather than FUTEX_WAIT.  Doing so provides two
benefits:

1. The call to gettimeofday is not required in order to calculate a
   relative timeout.

2. If someone changes the system clock during the wait then the futex
   timeout will correctly expire earlier or later.  Currently that only
   happens if the clock is changed prior to the call to gettimeofday.

According to futex(2), support for FUTEX_CLOCK_REALTIME was added in the
v2.6.28 Linux kernel and FUTEX_WAIT_BITSET was added in v2.6.25.  To ensure
that the code still works correctly with earlier kernel versions, an ENOSYS
error from futex[1] results in the futex_clock_realtime_unavailable flag
being set.  This flag is used to avoid the unnecessary unsupported futex
call in the future and to fall back to the previous gettimeofday and
relative time implementation.

glibc applied an equivalent switch in pthread_cond_timedwait to use
FUTEX_CLOCK_REALTIME and FUTEX_WAIT_BITSET rather than FUTEX_WAIT for
glibc-2.10 back in 2009.  See
glibc:cbd8aeb836c8061c23a5e00419e0fb25a34abee7

The futex_clock_realtime_unavailable flag is accessed using
std::memory_order_relaxed to stop it becoming a bottleneck.  If the first
two calls to _M_futex_wait_until happen simultaneously then the only
consequence is that both will try to use FUTEX_CLOCK_REALTIME, both
risk discovering that it doesn't work and, if so, both set the flag.

[1] This is how glibc's nptl-init.c determines whether these flags are
supported.
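
The fallback scheme described above can be sketched in isolation.  This is
only an illustration, not the libstdc++ code: the sketch namespace, the
fast_wait and legacy_wait stand-ins and the two counters are all hypothetical,
and fast_wait simply pretends to be an old kernel that always reports ENOSYS.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>

namespace sketch
{
  // Hypothetical flag playing the role of futex_clock_realtime_unavailable.
  std::atomic<bool> fast_path_unavailable{false};

  // Hypothetical counters so that the behaviour can be observed.
  int fast_calls = 0;
  int legacy_calls = 0;

  // Stand-in for the FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME call on a
  // kernel that does not support it: it always fails with ENOSYS.
  int fast_wait()
  {
    ++fast_calls;
    errno = ENOSYS;
    return -1;
  }

  // Stand-in for the gettimeofday + FUTEX_WAIT path; pretends no timeout.
  bool legacy_wait()
  {
    ++legacy_calls;
    return true;
  }

  // Returns false iff a timeout occurred.
  bool wait_until()
  {
    if (!fast_path_unavailable.load(std::memory_order_relaxed))
      {
        if (fast_wait() != -1)
          return true;
        assert(errno == EINTR || errno == EAGAIN
               || errno == ETIMEDOUT || errno == ENOSYS);
        if (errno == ETIMEDOUT)
          return false;
        if (errno != ENOSYS)
          return true;
        // Remember the failure so that later calls skip the doomed attempt.
        fast_path_unavailable.store(true, std::memory_order_relaxed);
      }
    return legacy_wait();
  }
}
```

Calling sketch::wait_until() three times tries the fast path once and the
legacy path three times.  A relaxed store is enough here because the worst
outcome of a race is one extra failed attempt per racing thread.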

* libstdc++-v3/src/c++11/futex.cc: Add new constants for required
futex flags.  Add futex_clock_realtime_unavailable flag to store
result of trying to use FUTEX_CLOCK_REALTIME.
(__atomic_futex_unsigned_base::_M_futex_wait_until): Try to use
FUTEX_WAIT_BITSET with FUTEX_CLOCK_REALTIME and only fall back to
using gettimeofday and FUTEX_WAIT if that's not supported.
---
 libstdc++-v3/src/c++11/futex.cc | 37 ++-
 1 file changed, 37 insertions(+)

diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
index c9de11a..25b3e05 100644
--- a/libstdc++-v3/src/c++11/futex.cc
+++ b/libstdc++-v3/src/c++11/futex.cc
@@ -35,8 +35,16 @@
 
 // Constants for the wait/wake futex syscall operations
 const unsigned futex_wait_op = 0;
+const unsigned futex_wait_bitset_op = 9;
+const unsigned futex_clock_realtime_flag = 256;
+const unsigned futex_bitset_match_any = ~0;
 const unsigned futex_wake_op = 1;
 
+namespace
+{
+  std::atomic<bool> futex_clock_realtime_unavailable;
+}
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -58,6 +66,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 else
   {
+   if (!futex_clock_realtime_unavailable.load(std::memory_order_relaxed))
+ {
+   struct timespec rt;
+   rt.tv_sec = __s.count();
+   rt.tv_nsec = __ns.count();
+   if (syscall (SYS_futex, __addr,
+futex_wait_bitset_op | futex_clock_realtime_flag,
+__val, &rt, nullptr, futex_bitset_match_any) == -1)
+ {
+   __glibcxx_assert(errno == EINTR || errno == EAGAIN
+   || errno == ETIMEDOUT || errno == ENOSYS);
+   if (errno == ETIMEDOUT)
+ return false;
+   if (errno == ENOSYS)
+ {
+   futex_clock_realtime_unavailable.store(true,
+   std::memory_order_relaxed);
+   // Fall through to legacy implementation if the system
+   // call is unavailable.
+ }
+   else
+ return true;
+ }
+   else
+ return true;
+ }
+
+   // We only get to here if futex_clock_realtime_unavailable was
+   // true or has just been set to true.
struct timeval tv;
gettimeofday (&tv, NULL);
// Convert the absolute timeout value to a relative timeout
-- 
git-series 0.9.1


[PATCH v5 7/8] libstdc++ condition_variable: Avoid rounding errors on custom clocks

2020-05-29 Thread Mike Crowe via Gcc-patches
The fix for PR68519 in 83fd5e73b3c16296e0d7ba54f6c547e01c7eae7b only
applied to condition_variable::wait_for. This problem can also apply to
condition_variable::wait_until but only if the custom clock is using a
more recent epoch so that a small enough delta can be calculated. Let's
use the newly-added chrono::__detail::ceil to fix this and also make use
of that function to simplify the previous wait_for fixes.
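
To illustrate why rounding up matters, here is a minimal sketch of a ceil-style
conversion.  The sketch namespace and the ceil_to name are hypothetical
stand-ins written for C++11; this is not the chrono::__detail::ceil
implementation from the series.

```cpp
#include <cassert>
#include <chrono>

namespace sketch
{
  // Hypothetical stand-in for a ceil-style duration conversion: convert to
  // ToDur and add one tick if the conversion rounded the value down.
  template<typename ToDur, typename Rep, typename Period>
    constexpr ToDur
    ceil_to(const std::chrono::duration<Rep, Period>& d)
    {
      return std::chrono::duration_cast<ToDur>(d) < d
        ? std::chrono::duration_cast<ToDur>(d) + ToDur{1}
        : std::chrono::duration_cast<ToDur>(d);
    }

  // duration_cast truncates 1500us to 1ms, which would wake a waiter early.
  static_assert(std::chrono::duration_cast<std::chrono::milliseconds>(
                  std::chrono::microseconds(1500))
                == std::chrono::milliseconds(1), "truncates");

  // Rounding up gives 2ms, so the wait is never shorter than requested.
  static_assert(ceil_to<std::chrono::milliseconds>(
                  std::chrono::microseconds(1500))
                == std::chrono::milliseconds(2), "rounds up");
}
```

A timeout computed with the truncating conversion can expire before the
requested duration has passed, whereas the rounded-up one cannot.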

Also, simplify the existing test case for PR68519 a little and make its
variables local so we can add a new test case for the above
problem. Unfortunately, without the fix the new test would only have
failed once sufficient time had passed since the chrono::steady_clock
epoch anyway, but it's better than nothing.

* libstdc++-v3/include/std/condition_variable:
  (condition_variable::wait_until): Convert delta to
  steady_clock duration before adding to current steady_clock
  time to avoid rounding errors described in PR68519.
  (condition_variable::wait_for): Simplify calculation
  of absolute time by using chrono::__detail::ceil in both
  overloads.

* libstdc++-v3/testsuite/30_threads/condition_variable/members/68519.cc:
  (test_wait_for): Renamed from test01. Replace unassigned val
  variable with constant false. Reduce scope of mx and cv
  variables to just test_wait_for function.
  (test_wait_until): Add new test case.
---
 libstdc++-v3/include/std/condition_variable                           | 18 +-
 libstdc++-v3/testsuite/30_threads/condition_variable/members/68519.cc | 61 ++---
 2 files changed, 63 insertions(+), 16 deletions(-)

diff --git a/libstdc++-v3/include/std/condition_variable 
b/libstdc++-v3/include/std/condition_variable
index 2db9dff..0796ca9 100644
--- a/libstdc++-v3/include/std/condition_variable
+++ b/libstdc++-v3/include/std/condition_variable
@@ -133,10 +133,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if __cplusplus > 201703L
static_assert(chrono::is_clock_v<_Clock>);
 #endif
+   using __s_dur = typename __clock_t::duration;
const typename _Clock::time_point __c_entry = _Clock::now();
const __clock_t::time_point __s_entry = __clock_t::now();
const auto __delta = __atime - __c_entry;
-   const auto __s_atime = __s_entry + __delta;
+   const auto __s_atime = __s_entry +
+ chrono::__detail::ceil<__s_dur>(__delta);
 
if (__wait_until_impl(__lock, __s_atime) == cv_status::no_timeout)
  return cv_status::no_timeout;
@@ -166,10 +168,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   const chrono::duration<_Rep, _Period>& __rtime)
   {
using __dur = typename steady_clock::duration;
-   auto __reltime = chrono::duration_cast<__dur>(__rtime);
-   if (__reltime < __rtime)
- ++__reltime;
-   return wait_until(__lock, steady_clock::now() + __reltime);
+   return wait_until(__lock,
+ steady_clock::now() +
+ chrono::__detail::ceil<__dur>(__rtime));
   }
 
 template
@@ -179,10 +180,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _Predicate __p)
   {
using __dur = typename steady_clock::duration;
-   auto __reltime = chrono::duration_cast<__dur>(__rtime);
-   if (__reltime < __rtime)
- ++__reltime;
-   return wait_until(__lock, steady_clock::now() + __reltime,
+   return wait_until(__lock,
+ steady_clock::now() +
+ chrono::__detail::ceil<__dur>(__rtime),
  std::move(__p));
   }
 
diff --git 
a/libstdc++-v3/testsuite/30_threads/condition_variable/members/68519.cc 
b/libstdc++-v3/testsuite/30_threads/condition_variable/members/68519.cc
index 9a70713..739f74c 100644
--- a/libstdc++-v3/testsuite/30_threads/condition_variable/members/68519.cc
+++ b/libstdc++-v3/testsuite/30_threads/condition_variable/members/68519.cc
@@ -26,25 +26,72 @@
 
 // PR libstdc++/68519
 
-bool val = false;
-std::mutex mx;
-std::condition_variable cv;
-
 void
-test01()
+test_wait_for()
 {
+  std::mutex mx;
+  std::condition_variable cv;
+
   for (int i = 0; i < 3; ++i)
   {
 std::unique_lock l(mx);
 auto start = std::chrono::system_clock::now();
-cv.wait_for(l, std::chrono::duration(1), [] { return val; });
+cv.wait_for(l, std::chrono::duration(1), [] { return false; });
 auto t = std::chrono::system_clock::now();
 VERIFY( (t - start) >= std::chrono::seconds(1) );
   }
 }
 
+// In order to ensure that the delta calculated in the arbitrary clock overload
+// of condition_variable::wait_until fits accurately in a float, but the result
+// of adding it to steady_clock with a float duration does not, this clock
+// needs to use a more recent epoch.
+struct recent_epoch_float_clock
+{
+  using rep = std::chrono::duration::rep;
+  using period = 

[PATCH v5 6/8] libstdc++ atomic_futex: Avoid rounding errors in std::future::wait_* [PR91486]

2020-05-29 Thread Mike Crowe via Gcc-patches
Convert the specified duration to the target clock's duration type
before adding it to the current time in
__atomic_futex_unsigned::_M_load_when_equal_for and
_M_load_when_equal_until.  This removes the risk of the timeout being
rounded down to the current time resulting in there being no wait at
all when the duration type lacks sufficient precision to hold the
steady_clock current time.

Rather than using the style of fix from PR68519, let's expose the C++17
std::chrono::ceil function as std::chrono::__detail::ceil so that it can
be used in code compiled with earlier standards versions and simplify
the fix. This was suggested by John Salmon in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91486#c5 .

This problem has become considerably less likely to trigger since I
switched the __atomic_futex_unsigned::__clock_t reference clock from
system_clock to steady_clock and added the loop, but the consequences of
triggering it have changed too.

By my calculations it takes just over 194 days from the epoch for the
current time not to be representable in a float. This means that
system_clock is always subject to the problem (with the standard 1970
epoch) whereas steady_clock with float duration only runs out of
resolution once the machine has been running for that long (assuming the
Linux implementation of CLOCK_MONOTONIC).
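
One way to arrive at that figure, assuming the time is held as a whole number
of seconds in an IEEE single-precision float with a 24-bit significand, is
sketched below.  The sketch namespace and its names are hypothetical and exist
only for this illustration.

```cpp
#include <cassert>

namespace sketch
{
  // With a 24-bit significand, 2^24 is the first value beyond which not
  // every whole number of seconds is representable in a float.
  constexpr float limit_seconds = 16777216.0f; // 2^24

  // 2^24 seconds expressed in days: roughly 194.18.
  constexpr double limit_days = 16777216.0 / 86400.0;

  // Returns true if adding delta seconds to now yields a later time once the
  // result has been rounded back to float.
  inline bool advances(float now, float delta)
  {
    // volatile forces the sum to be stored as a float before comparing.
    volatile float sum = now + delta;
    return sum > now;
  }
}
```

Under those assumptions sketch::advances(sketch::limit_seconds, 1.0f) is
false: a one second timeout added to the current time rounds back to the
current time, so there is no wait at all.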

The recently-added loop in
__atomic_futex_unsigned::_M_load_when_equal_until turns this scenario
into a busy wait.

Unfortunately the combination of both of these things means that it's
not possible to write a test case for this occurring in
_M_load_when_equal_until as it stands.

* libstdc++-v3/include/std/chrono: (__detail::ceil) Move
  implementation of std::chrono::ceil into private namespace so
  that it's available to pre-C++17 code.

* libstdc++-v3/include/bits/atomic_futex.h:
  (__atomic_futex_unsigned::_M_load_when_equal_for,
  __atomic_futex_unsigned::_M_load_when_equal_until): Use
  __detail::ceil to convert delta to the reference clock
  duration type to avoid resolution problems

* libstdc++-v3/testsuite/30_threads/async/async.cc: (test_pr91486):
  New test for __atomic_futex_unsigned::_M_load_when_equal_for.
---
 libstdc++-v3/include/bits/atomic_futex.h |  6 +++--
 libstdc++-v3/include/std/chrono  | 19 +
 libstdc++-v3/testsuite/30_threads/async/async.cc | 15 +-
 3 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_futex.h 
b/libstdc++-v3/include/bits/atomic_futex.h
index 5f95ade..aa137a7 100644
--- a/libstdc++-v3/include/bits/atomic_futex.h
+++ b/libstdc++-v3/include/bits/atomic_futex.h
@@ -219,8 +219,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_load_when_equal_for(unsigned __val, memory_order __mo,
  const chrono::duration<_Rep, _Period>& __rtime)
   {
+   using __dur = typename __clock_t::duration;
return _M_load_when_equal_until(__val, __mo,
-   __clock_t::now() + __rtime);
+   __clock_t::now() + chrono::__detail::ceil<__dur>(__rtime));
   }
 
 // Returns false iff a timeout occurred.
@@ -233,7 +234,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
do {
  const __clock_t::time_point __s_entry = __clock_t::now();
  const auto __delta = __atime - __c_entry;
- const auto __s_atime = __s_entry + __delta;
+ const auto __s_atime = __s_entry +
+ chrono::__detail::ceil<_Duration>(__delta);
  if (_M_load_when_equal_until(__val, __mo, __s_atime))
return true;
  __c_entry = _Clock::now();
diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index 6d78f32..4257c7c 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -299,6 +299,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 #endif // C++20
 
+// We want to use ceil even when compiling for earlier standards versions
+namespace __detail
+{
+  template
+constexpr __enable_if_is_duration<_ToDur>
+ceil(const duration<_Rep, _Period>& __d)
+{
+ auto __to = chrono::duration_cast<_ToDur>(__d);
+ if (__to < __d)
+   return __to + _ToDur{1};
+ return __to;
+   }
+}
+
 #if __cplusplus >= 201703L
 # define __cpp_lib_chrono 201611
 
@@ -316,10 +330,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr __enable_if_is_duration<_ToDur>
   ceil(const duration<_Rep, _Period>& __d)
   {
-   auto __to = chrono::duration_cast<_ToDur>(__d);
-   if (__to < __d)
- return __to + _ToDur{1};
-   return __to;
+   return __detail::ceil<_ToDur>(__d);
   }
 
 template 
diff --git a/libstdc++-v3/testsuite/30_threads/async/async.cc 
b/libstdc++-v3/testsuite/30_threads/async/async.cc
index ee117f4..f697292 100644
--- a/libstdc++-v3/testsuite/30_threads/async/async.cc
+++ 

[PATCH v5 3/8] libstdc++ futex: Support waiting on std::chrono::steady_clock directly

2020-05-29 Thread Mike Crowe via Gcc-patches
The user-visible effect of this change is for std::future::wait_until to
use CLOCK_MONOTONIC when passed a timeout of std::chrono::steady_clock
type.  This makes it immune to any changes made to the system clock
CLOCK_REALTIME.

Add an overload of __atomic_futex_unsigned::_M_load_and_test_until_impl
that accepts a std::chrono::steady_clock time point, and correctly passes
this through to __atomic_futex_unsigned_base::_M_futex_wait_until_steady
which uses CLOCK_MONOTONIC for the timeout within the futex system call.
These functions are mostly just copies of the std::chrono::system_clock
versions with small tweaks.

Prior to this commit, a std::chrono::steady_clock timeout would be
converted via std::chrono::system_clock which risks reducing or increasing
the timeout if someone changes CLOCK_REALTIME whilst the wait is happening.
(The commit immediately prior to this one increases the window of
opportunity for that from a short period during the calculation of a
relative timeout, to the entire duration of the wait.)

FUTEX_WAIT_BITSET was added in kernel v2.6.25.  If futex reports ENOSYS to
indicate that this operation is not supported then the code falls back to
using clock_gettime(2) to calculate a relative time to wait for.

I believe that I've added this functionality in a way that it doesn't break
ABI compatibility, but that has made it more verbose and less type safe.  I
believe that it would be better to maintain the timeout as an instance of
the correct clock type all the way down to a single _M_futex_wait_until
function with an overload for each clock.  The current scheme of separating
out the seconds and nanoseconds early risks accidentally calling the wait
function for the wrong clock.  Unfortunately, doing this would break code
that compiled against the old header.

* libstdc++-v3/config/abi/pre/gnu.ver: Update for addition of
  __atomic_futex_unsigned_base::_M_futex_wait_until_steady.

* libstdc++-v3/include/bits/atomic_futex.h
  (__atomic_futex_unsigned_base): Add comments to clarify that
  _M_futex_wait_until and _M_load_and_test_until use CLOCK_REALTIME.
  Declare new _M_futex_wait_until_steady and
  _M_load_and_test_until_steady methods that use CLOCK_MONOTONIC.
  Add _M_load_and_test_until_impl and _M_load_when_equal_until
  overloads that accept a steady_clock time_point and use these new
  methods.

* libstdc++-v3/src/c++11/futex.cc: Include headers required for
  clock_gettime. Add futex_clock_monotonic_flag constant to tell
  futex to use CLOCK_MONOTONIC to match the existing
  futex_clock_realtime_flag.  Add futex_clock_monotonic_unavailable
  to store the result of trying to use CLOCK_MONOTONIC.
  (__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
  Add new variant of _M_futex_wait_until that uses CLOCK_MONOTONIC to
  support waiting using steady_clock.
---
 libstdc++-v3/config/abi/pre/gnu.ver  | 10 +--
 libstdc++-v3/include/bits/atomic_futex.h | 67 +++-
 libstdc++-v3/src/c++11/futex.cc  | 82 +-
 3 files changed, 154 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver 
b/libstdc++-v3/config/abi/pre/gnu.ver
index edf4485..3d734d7 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -1916,10 +1916,9 @@ GLIBCXX_3.4.21 {
 _ZNSt7codecvtID[is]c*;
 _ZT[ISV]St7codecvtID[is]c*E;
 
-extern "C++"
-{
-  std::__atomic_futex_unsigned_base*;
-};
+# std::__atomic_futex_unsigned_base members
+_ZNSt28__atomic_futex_unsigned_base19_M_futex_notify_all*;
+_ZNSt28__atomic_futex_unsigned_base19_M_futex_wait_until*;
 
 # codecvt_utf8 etc.
 _ZNKSt19__codecvt_utf8_base*;
@@ -2297,6 +2296,9 @@ GLIBCXX_3.4.28 {
 _ZNSt3pmr25monotonic_buffer_resourceD[0125]Ev;
 _ZT[ISV]NSt3pmr25monotonic_buffer_resourceE;
 
+# std::__atomic_futex_unsigned_base::_M_futex_wait_until_steady
+_ZNSt28__atomic_futex_unsigned_base26_M_futex_wait_until_steady*;
+
 } GLIBCXX_3.4.27;
 
 # Symbols in the support library (libsupc++) have their own tag.
diff --git a/libstdc++-v3/include/bits/atomic_futex.h 
b/libstdc++-v3/include/bits/atomic_futex.h
index 886fc63..507c5c9 100644
--- a/libstdc++-v3/include/bits/atomic_futex.h
+++ b/libstdc++-v3/include/bits/atomic_futex.h
@@ -52,11 +52,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if defined(_GLIBCXX_HAVE_LINUX_FUTEX) && ATOMIC_INT_LOCK_FREE > 1
   struct __atomic_futex_unsigned_base
   {
-// Returns false iff a timeout occurred.
+// __s and __ns are measured against CLOCK_REALTIME. Returns false
+// iff a timeout occurred.
 bool
 _M_futex_wait_until(unsigned *__addr, unsigned __val, bool __has_timeout,
chrono::seconds __s, chrono::nanoseconds __ns);
 
+// __s and __ns are measured against CLOCK_MONOTONIC. Returns
+// false iff a timeout occurred.
+bool
+

[PATCH v5 0/8] std::future::wait_* and std::condition_variable improvements

2020-05-29 Thread Mike Crowe via Gcc-patches
This series ensures that the std::future::wait_* functions use
std::chrono::steady_clock when required, introduces
std::chrono::__detail::ceil to make that easier to do, and then makes
use of that function to simplify and improve the fix for PR68519 in
std::condition_variable.

v1 of this series was originally posted back in September 2017 (see
https://gcc.gnu.org/ml/libstdc++/2017-09/msg00083.html )

v2 of this series was originally posted back in January 2018 (see
https://gcc.gnu.org/ml/libstdc++/2018-01/msg00035.html )

v3 of this series was originally posted back in August 2018 (see
https://gcc.gnu.org/ml/libstdc++/2018-08/msg1.html )

v4 of this series was originally posted back in October 2019 (see
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-10/msg01934.html )

Changes since v4:

* Expose std::chrono::ceil as std::chrono::__detail::ceil so that it
  can be used to fix PR91486 in std::future::wait_for (as suggested by
  John Salmon in PR91486.)

* Use std::chrono::__detail::ceil to simplify fix for PR68519 in
  std::condition_variable::wait_for.

* Also fix equivalent of PR68519 in
  std::condition_variable::wait_until and add test.

Changelog:
* libstdc++-v3/include/std/condition_variable:
  (condition_variable::wait_until): Convert delta to
  steady_clock duration before adding to current steady_clock
  time to avoid rounding errors described in PR68519.
  (condition_variable::wait_for): Simplify calculation
  of absolute time by using chrono::__detail::ceil in both
  overloads.

* libstdc++-v3/testsuite/30_threads/condition_variable/members/68519.cc:
  (test_wait_for): Renamed from test01. Replace unassigned val
  variable with constant false. Reduce scope of mx and cv
  variables to just test_wait_for function.
  (test_wait_until): Add new test case.

* libstdc++-v3/include/std/chrono: (__detail::ceil) Move
  implementation of std::chrono::ceil into private namespace so
  that it's available to pre-C++17 code.

* libstdc++-v3/include/bits/atomic_futex.h:
  (__atomic_futex_unsigned::_M_load_when_equal_for,
  __atomic_futex_unsigned::_M_load_when_equal_until): Use
  __detail::ceil to convert delta to the reference clock
  duration type to avoid resolution problems

* libstdc++-v3/testsuite/30_threads/async/async.cc: (test_pr91486):
  New test for __atomic_futex_unsigned::_M_load_when_equal_for.

* run test03 with steady_clock_copy, which behaves identically to
  std::chrono::steady_clock, but isn't std::chrono::steady_clock. This
  causes the overload of __atomic_futex_unsigned::_M_load_when_equal_until
  that takes an arbitrary clock to be called.

* invent test04 which uses a deliberately slow running clock in order to
  exercise the looping behaviour of
  __atomic_futex_unsigned::_M_load_when_equal_until described above.

* libstdc++-v3/include/bits/atomic_futex.h:
(__atomic_futex_unsigned) Add loop to _M_load_when_equal_until on
generic _Clock to check the timeout against _Clock again after
_M_load_when_equal_until returns indicating a timeout.

* libstdc++-v3/testsuite/30_threads/async/async.cc: Invent
slow_clock that runs at an eleventh of steady_clock's speed. Use it
to test the user-supplied-clock variant of
__atomic_futex_unsigned::_M_load_when_equal_until works generally
with test03 and loops correctly when the timeout time hasn't been
reached in test04.

* libstdc++-v3/include/bits/atomic_futex.h:
(__atomic_futex_unsigned): Change __clock_t typedef to use
steady_clock so that unknown clocks are synced to it rather than
system_clock. Change existing __clock_t overloads of
_M_load_and_test_until_impl and _M_load_when_equal_until to use
system_clock explicitly. Remove comment about DR 887 since these
changes address that problem as best as we are currently able.

* libstdc++-v3/config/abi/pre/gnu.ver: Update for addition of
  __atomic_futex_unsigned_base::_M_futex_wait_until_steady.

* libstdc++-v3/include/bits/atomic_futex.h
  (__atomic_futex_unsigned_base): Add comments to clarify that
  _M_futex_wait_until and _M_load_and_test_until use CLOCK_REALTIME.
  Declare new _M_futex_wait_until_steady and
  _M_load_and_test_until_steady methods that use CLOCK_MONOTONIC.
  Add _M_load_and_test_until_impl and _M_load_when_equal_until
  overloads that accept a steady_clock time_point and use these new
  methods.

* libstdc++-v3/src/c++11/futex.cc: Include headers required for
   

[PATCH v5 5/8] libstdc++ futex: Loop when waiting against arbitrary clock

2020-05-29 Thread Mike Crowe via Gcc-patches
If std::future::wait_until is passed a time point measured against a clock
that is neither std::chrono::steady_clock nor std::chrono::system_clock
then the generic implementation of
__atomic_futex_unsigned::_M_load_when_equal_until is called which
calculates the timeout based on __clock_t and calls the
_M_load_when_equal_until method for that clock to perform the actual wait.

There's no guarantee that __clock_t is running at the same speed as the
caller's clock, so if the underlying wait times out we need to check the
timeout against the caller's clock again before potentially looping.
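
The idea can be sketched outside libstdc++ as follows.  Everything in the
sketch namespace is hypothetical: slow_clock here runs at half the speed of
steady_clock (the test in this patch uses an eleventh), reference_wait_until
stands in for the reference-clock wait and never sees a wake-up, and the
passes counter exists only to make the looping observable.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

namespace sketch
{
  // Hypothetical caller clock that runs at half the speed of steady_clock.
  struct slow_clock
  {
    using rep = std::chrono::steady_clock::rep;
    using period = std::chrono::steady_clock::period;
    using duration = std::chrono::steady_clock::duration;
    using time_point = std::chrono::time_point<slow_clock, duration>;
    static constexpr bool is_steady = true;

    static time_point now()
    {
      const auto real = std::chrono::steady_clock::now().time_since_epoch();
      return time_point{real / 2};
    }
  };

  // Stand-in for the wait against the reference clock.  Nothing ever wakes
  // it early, so it sleeps until the deadline and reports a timeout.
  inline bool reference_wait_until(std::chrono::steady_clock::time_point t)
  {
    std::this_thread::sleep_until(t);
    return false;
  }

  // Returns false iff the timeout, as measured on Clock, has elapsed.
  template<typename Clock, typename Duration>
    bool wait_until(const std::chrono::time_point<Clock, Duration>& atime,
                    int& passes)
    {
      auto c_entry = Clock::now();
      do {
        ++passes;
        const auto s_entry = std::chrono::steady_clock::now();
        const auto delta = atime - c_entry;
        if (reference_wait_until(s_entry + delta))
          return true;
        // The reference clock says we timed out; ask the caller's clock.
        c_entry = Clock::now();
      } while (c_entry < atime);
      return false;
    }
}
```

Because slow_clock advances more slowly than the reference clock, a single
reference-clock wait usually returns before the caller's deadline, and the
loop goes round again with the remaining time.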

Also add two extra tests to the testsuite's async.cc:

* run test03 with steady_clock_copy, which behaves identically to
  std::chrono::steady_clock, but isn't std::chrono::steady_clock. This
  causes the overload of __atomic_futex_unsigned::_M_load_when_equal_until
  that takes an arbitrary clock to be called.

* invent test04 which uses a deliberately slow running clock in order to
  exercise the looping behaviour of
  __atomic_futex_unsigned::_M_load_when_equal_until described above.

* libstdc++-v3/include/bits/atomic_futex.h:
(__atomic_futex_unsigned) Add loop to _M_load_when_equal_until on
generic _Clock to check the timeout against _Clock again after
_M_load_when_equal_until returns indicating a timeout.

* libstdc++-v3/testsuite/30_threads/async/async.cc: Invent
slow_clock that runs at an eleventh of steady_clock's speed. Use it
to test the user-supplied-clock variant of
__atomic_futex_unsigned::_M_load_when_equal_until works generally
with test03 and loops correctly when the timeout time hasn't been
reached in test04.
---
 libstdc++-v3/include/bits/atomic_futex.h | 15 ++--
 libstdc++-v3/testsuite/30_threads/async/async.cc | 70 +-
 2 files changed, 80 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_futex.h 
b/libstdc++-v3/include/bits/atomic_futex.h
index 4375129..5f95ade 100644
--- a/libstdc++-v3/include/bits/atomic_futex.h
+++ b/libstdc++-v3/include/bits/atomic_futex.h
@@ -229,11 +229,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_load_when_equal_until(unsigned __val, memory_order __mo,
  const chrono::time_point<_Clock, _Duration>& __atime)
   {
-   const typename _Clock::time_point __c_entry = _Clock::now();
-   const __clock_t::time_point __s_entry = __clock_t::now();
-   const auto __delta = __atime - __c_entry;
-   const auto __s_atime = __s_entry + __delta;
-   return _M_load_when_equal_until(__val, __mo, __s_atime);
+   typename _Clock::time_point __c_entry = _Clock::now();
+   do {
+ const __clock_t::time_point __s_entry = __clock_t::now();
+ const auto __delta = __atime - __c_entry;
+ const auto __s_atime = __s_entry + __delta;
+ if (_M_load_when_equal_until(__val, __mo, __s_atime))
+   return true;
+ __c_entry = _Clock::now();
+   } while (__c_entry < __atime);
+   return false;
   }
 
 // Returns false iff a timeout occurred.
diff --git a/libstdc++-v3/testsuite/30_threads/async/async.cc b/libstdc++-v3/testsuite/30_threads/async/async.cc
index 84d94cf..ee117f4 100644
--- a/libstdc++-v3/testsuite/30_threads/async/async.cc
+++ b/libstdc++-v3/testsuite/30_threads/async/async.cc
@@ -63,6 +63,24 @@ void test02()
   VERIFY( status == std::future_status::ready );
 }
 
+// This clock behaves exactly the same as steady_clock, but it is not
+// steady_clock which means that the generic clock overload of
+// future::wait_until is used.
+struct steady_clock_copy
+{
+  using rep = std::chrono::steady_clock::rep;
+  using period = std::chrono::steady_clock::period;
+  using duration = std::chrono::steady_clock::duration;
+  using time_point = std::chrono::time_point<steady_clock_copy, duration>;
+  static constexpr bool is_steady = true;
+
+  static time_point now()
+  {
+const auto steady = std::chrono::steady_clock::now();
+return time_point{steady.time_since_epoch()};
+  }
+};
+
 // This test is prone to failures if run on a loaded machine where the
 // kernel decides not to schedule us for several seconds. It also
 // assumes that no-one will warp CLOCK whilst the test is
@@ -90,11 +108,63 @@ void test03()
   VERIFY( elapsed < std::chrono::seconds(5) );
 }
 
+// This clock is supposed to run at a tenth of normal speed, but we
+// don't have to worry about rounding errors causing us to wake up
+// slightly too early below if we actually run it at an eleventh of
+// normal speed. It is used to exercise the
+// __atomic_futex_unsigned::_M_load_when_equal_until overload that
+// takes an arbitrary clock.
+struct slow_clock
+{
+  using rep = std::chrono::steady_clock::rep;
+  using period = std::chrono::steady_clock::period;
+  using duration = std::chrono::steady_clock::duration;
+  using time_point = std::chrono::time_point<slow_clock, duration>;
+  static constexpr bool is_steady = true;
+
+  static time_point 

[PATCH v5 8/8] libstdc++: Extra async tests, not for merging

2020-05-29 Thread Mike Crowe via Gcc-patches
These tests show that changing the system clock has an effect on
std::future::wait_until when using std::chrono::system_clock but not when
using std::chrono::steady_clock.  Unfortunately these tests have a number
of downsides:

1. Nothing that is attempting to keep the clock set correctly (ntpd,
   systemd-timesyncd) can be running at the same time.

2. The test process requires the CAP_SYS_TIME capability (although, as
   written, it checks for being root).

3. Other processes running concurrently may misbehave when the clock darts
   back and forth.

4. They are slow to run.

As such, I don't think they are suitable for merging. I include them here
because I wanted to document how I had tested the changes in the previous
commits.
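
The arithmetic behind these two tests can be sketched with a small model.
This is a simplification for illustration only: modelled_clocks and the
two helper functions are hypothetical names, and real clocks are replaced
by plain second counters so that no privileges are needed.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical model: wall-clock time is steady time plus an offset that
// can be changed at any moment (as settimeofday does for the real clock).
struct modelled_clocks
{
  std::chrono::seconds steady{0};       // time elapsed on the steady clock
  std::chrono::seconds wall_offset{0};  // cumulative wall-clock adjustment

  std::chrono::seconds wall() const { return steady + wall_offset; }
};

// True once a deadline measured on the wall clock has been reached.
inline bool wall_deadline_passed(const modelled_clocks& c,
                                 std::chrono::seconds deadline)
{
  return c.wall() >= deadline;
}

// True once a deadline measured on the steady clock has been reached.
inline bool steady_deadline_passed(const modelled_clocks& c,
                                   std::chrono::seconds deadline)
{
  return c.steady >= deadline;
}
```

In this model, five seconds of real time plus a sixty-second forward jump
satisfies a wall-clock deadline of start + 60s, which is the situation
test06 sets up, while a steady-clock deadline of the same length is
unaffected, as in test05.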
---
 libstdc++-v3/testsuite/30_threads/async/async.cc | 70 +-
 1 file changed, 70 insertions(+)

diff --git a/libstdc++-v3/testsuite/30_threads/async/async.cc b/libstdc++-v3/testsuite/30_threads/async/async.cc
index f697292..8b44810 100644
--- a/libstdc++-v3/testsuite/30_threads/async/async.cc
+++ b/libstdc++-v3/testsuite/30_threads/async/async.cc
@@ -24,6 +24,7 @@
 
 #include 
 #include 
+#include 
 
 using namespace std;
 
@@ -172,6 +173,71 @@ void test_pr91486()
   VERIFY( elapsed_steady >= std::chrono::seconds(1) );
 }
 
+void perturb_system_clock(const std::chrono::seconds &seconds)
+{
+  struct timeval tv;
+  if (gettimeofday(&tv, NULL))
+    abort();
+
+  tv.tv_sec += seconds.count();
+  if (settimeofday(&tv, NULL))
+    abort();
+}
+
+// Ensure that advancing CLOCK_REALTIME doesn't make any difference
+// when we're waiting on std::chrono::steady_clock.
+void test05()
+{
+  auto const start = chrono::steady_clock::now();
+  future<void> f1 = async(launch::async, []() {
+  std::this_thread::sleep_for(std::chrono::seconds(10));
+});
+
+  perturb_system_clock(chrono::seconds(20));
+
+  std::future_status status;
+  status = f1.wait_for(std::chrono::seconds(4));
+  VERIFY( status == std::future_status::timeout );
+
+  status = f1.wait_until(start + std::chrono::seconds(6));
+  VERIFY( status == std::future_status::timeout );
+
+  status = f1.wait_until(start + std::chrono::seconds(12));
+  VERIFY( status == std::future_status::ready );
+
+  auto const elapsed = chrono::steady_clock::now() - start;
+  VERIFY( elapsed >= std::chrono::seconds(10) );
+  VERIFY( elapsed < std::chrono::seconds(15) );
+
+  perturb_system_clock(chrono::seconds(-20));
+}
+
+// Ensure that advancing CLOCK_REALTIME does make a difference when
+// we're waiting on std::chrono::system_clock.
+void test06()
+{
+  auto const start = chrono::system_clock::now();
+  auto const start_steady = chrono::steady_clock::now();
+
+  future<void> f1 = async(launch::async, []() {
+  std::this_thread::sleep_for(std::chrono::seconds(5));
+  perturb_system_clock(chrono::seconds(60));
+  std::this_thread::sleep_for(std::chrono::seconds(5));
+});
+  future_status status;
+  status = f1.wait_until(start + std::chrono::seconds(60));
+  VERIFY( status == std::future_status::timeout );
+
+  auto const elapsed_steady = chrono::steady_clock::now() - start_steady;
+  VERIFY( elapsed_steady >= std::chrono::seconds(5) );
+  VERIFY( elapsed_steady < std::chrono::seconds(10) );
+
+  status = f1.wait_until(start + std::chrono::seconds(75));
+  VERIFY( status == std::future_status::ready );
+
+  perturb_system_clock(chrono::seconds(-60));
+}
+
 int main()
 {
   test01();
@@ -181,5 +247,9 @@ int main()
   test03();
   test04();
   test_pr91486();
+  if (geteuid() == 0) {
+test05();
+test06();
+  }
   return 0;
 }
-- 
git-series 0.9.1


[PATCH v5 1/8] libstdc++: Improve async test

2020-05-29 Thread Mike Crowe via Gcc-patches
Add tests for waiting for the future using both std::chrono::steady_clock
and std::chrono::system_clock in preparation for dealing with those clocks
properly in futex.cc.

 * libstdc++-v3/testsuite/30_threads/async/async.cc (test02): Test
 steady_clock with std::future::wait_until.  (test03): Add new test
 templated on clock type waiting for future associated with async
 to resolve.  (main): Call test03 to test both system_clock and
 steady_clock.
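
The pattern added to test02 can be reduced to a sketch like the one
below. It is an illustration rather than the testsuite code: poll_with is
a hypothetical helper, and a std::promise replaces the async thread so
that the outcome does not depend on scheduling.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical helper: ask whether the future is ready, using a deadline
// of "now" expressed on the given clock.
template <typename Clock>
std::future_status poll_with(const std::future<void>& f)
{
  return f.wait_until(Clock::now());
}
```

Polling before and after the promise is satisfied, with both
std::chrono::system_clock and std::chrono::steady_clock as the template
argument, should give future_status::timeout and then
future_status::ready for each clock.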
---
 libstdc++-v3/testsuite/30_threads/async/async.cc | 33 +-
 1 file changed, 33 insertions(+)

diff --git a/libstdc++-v3/testsuite/30_threads/async/async.cc b/libstdc++-v3/testsuite/30_threads/async/async.cc
index 7fa9b03..84d94cf 100644
--- a/libstdc++-v3/testsuite/30_threads/async/async.cc
+++ b/libstdc++-v3/testsuite/30_threads/async/async.cc
@@ -51,17 +51,50 @@ void test02()
   VERIFY( status == std::future_status::timeout );
   status = f1.wait_until(std::chrono::system_clock::now());
   VERIFY( status == std::future_status::timeout );
+  status = f1.wait_until(std::chrono::steady_clock::now());
+  VERIFY( status == std::future_status::timeout );
   l.unlock();  // allow async thread to proceed
   f1.wait();   // wait for it to finish
   status = f1.wait_for(std::chrono::milliseconds(0));
   VERIFY( status == std::future_status::ready );
   status = f1.wait_until(std::chrono::system_clock::now());
   VERIFY( status == std::future_status::ready );
+  status = f1.wait_until(std::chrono::steady_clock::now());
+  VERIFY( status == std::future_status::ready );
+}
+
+// This test is prone to failures if run on a loaded machine where the
+// kernel decides not to schedule us for several seconds. It also
+// assumes that no-one will warp CLOCK whilst the test is
+// running when CLOCK is std::chrono::system_clock.
+template<typename CLOCK>
+void test03()
+{
+  auto const start = CLOCK::now();
+  future<void> f1 = async(launch::async, []() {
+  std::this_thread::sleep_for(std::chrono::seconds(2));
+});
+  std::future_status status;
+
+  status = f1.wait_for(std::chrono::milliseconds(500));
+  VERIFY( status == std::future_status::timeout );
+
+  status = f1.wait_until(start + std::chrono::seconds(1));
+  VERIFY( status == std::future_status::timeout );
+
+  status = f1.wait_until(start + std::chrono::seconds(5));
+  VERIFY( status == std::future_status::ready );
+
+  auto const elapsed = CLOCK::now() - start;
+  VERIFY( elapsed >= std::chrono::seconds(2) );
+  VERIFY( elapsed < std::chrono::seconds(5) );
 }
 
 int main()
 {
   test01();
   test02();
+  test03<std::chrono::system_clock>();
+  test03<std::chrono::steady_clock>();
   return 0;
 }
-- 
git-series 0.9.1


[PATCH v5 4/8] libstdc++ atomic_futex: Use std::chrono::steady_clock as reference clock

2020-05-29 Thread Mike Crowe via Gcc-patches
The user-visible effect of this change is that std::future::wait_for now
uses std::chrono::steady_clock to determine the timeout.  This makes it
immune to changes made to the system clock.  It also means that anyone
using their own clock types with std::future::wait_until will have the
timeout converted to std::chrono::steady_clock rather than
std::chrono::system_clock.

Now that use of both std::chrono::steady_clock and
std::chrono::system_clock is correctly supported for the wait timeout, I
believe that std::chrono::steady_clock is a better choice for the reference
clock that all other clocks are converted to since it is guaranteed to
advance steadily.  The previous behaviour of converting to
std::chrono::system_clock risks timeouts changing dramatically when the
system clock is changed.
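
The conversion to a reference clock can be sketched as a free function.
This is an illustration only: sync_to_reference and its RefClock
parameter are hypothetical and are not part of libstdc++, which performs
the equivalent steps inside __atomic_futex_unsigned.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: express a deadline given on Clock as a deadline on
// RefClock by sampling both clocks and carrying over the time remaining.
template <typename RefClock, typename Clock, typename Duration>
typename RefClock::time_point
sync_to_reference(const std::chrono::time_point<Clock, Duration>& atime)
{
  const auto clock_entry = Clock::now();
  const auto ref_entry = RefClock::now();
  const auto delta = atime - clock_entry;  // time remaining on Clock
  return std::chrono::time_point_cast<typename RefClock::duration>(
    ref_entry + delta);
}
```

Choosing std::chrono::steady_clock as RefClock means that a later
adjustment of the system clock cannot move the converted deadline;
choosing std::chrono::system_clock corresponds to the previous behaviour.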

* libstdc++-v3/include/bits/atomic_futex.h:
(__atomic_futex_unsigned): Change __clock_t typedef to use
steady_clock so that unknown clocks are synced to it rather than
system_clock. Change existing __clock_t overloads of
_M_load_and_test_until_impl and _M_load_when_equal_until to use
system_clock explicitly. Remove comment about DR 887 since these
changes address that problem as best as we are currently able.
---
 libstdc++-v3/include/bits/atomic_futex.h | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_futex.h b/libstdc++-v3/include/bits/atomic_futex.h
index 507c5c9..4375129 100644
--- a/libstdc++-v3/include/bits/atomic_futex.h
+++ b/libstdc++-v3/include/bits/atomic_futex.h
@@ -71,7 +71,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template <unsigned _Waiter_bit = 0x80000000>
   class __atomic_futex_unsigned : __atomic_futex_unsigned_base
   {
-typedef chrono::system_clock __clock_t;
+typedef chrono::steady_clock __clock_t;
 
 // This must be lock-free and at offset 0.
 atomic<unsigned> _M_data;
@@ -169,7 +169,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 unsigned
 _M_load_and_test_until_impl(unsigned __assumed, unsigned __operand,
bool __equal, memory_order __mo,
-   const chrono::time_point<__clock_t, _Dur>& __atime)
+   const chrono::time_point<chrono::system_clock, _Dur>& __atime)
 {
   auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
   auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
@@ -229,7 +229,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_load_when_equal_until(unsigned __val, memory_order __mo,
  const chrono::time_point<_Clock, _Duration>& __atime)
   {
-   // DR 887 - Sync unknown clock to known clock.
const typename _Clock::time_point __c_entry = _Clock::now();
const __clock_t::time_point __s_entry = __clock_t::now();
const auto __delta = __atime - __c_entry;
@@ -241,7 +240,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 template<typename _Duration>
 _GLIBCXX_ALWAYS_INLINE bool
 _M_load_when_equal_until(unsigned __val, memory_order __mo,
-   const chrono::time_point<__clock_t, _Duration>& __atime)
+   const chrono::time_point<chrono::system_clock, _Duration>& __atime)
 {
   unsigned __i = _M_load(__mo);
   if ((__i & ~_Waiter_bit) == __val)
-- 
git-series 0.9.1

