Re: [PATCH] enable -fweb and -frename-registers at -O3 for rs6000

2019-12-22 Thread Jiufu Guo
Segher Boessenkool  writes:

> Hi!
>
> On Fri, Dec 20, 2019 at 02:34:05PM +0800, Jiufu Guo wrote:
>> Previously, limited unrolling was enabled at O2 for powerpc in r278034.  At 
>> that
>> time, -fweb and -frename-registers were not enabled together with 
>> -funroll-loops
>> even for -O3.  After that, we notice there are some performance degradation 
>> on
>> SPEC2006fp which caused by without web and rnreg.
>
> And 2017 was fine on all tests.  Annoying :-(
>
>> This patch enable -fweb
>> and -frename-registers for -O3 to align original behavior before r278034.
>
> Okay.  Hopefully we can find a way to determine in what circumstances web
> and rnreg help instead of hurt, but until then, the old behaviour is
> certainly the safe choice.
>
>> --- a/gcc/testsuite/gcc.dg/torture/stackalign/builtin-return-1.c
>> +++ b/gcc/testsuite/gcc.dg/torture/stackalign/builtin-return-1.c
>> @@ -2,6 +2,7 @@
>>  /* Originator: Andrew Church  */
>>  /* { dg-do run } */
>>  /* { dg-require-effective-target untyped_assembly } */
>> +/* { dg-additional-options "-fno-rename-registers" { target { powerpc*-*-* 
>> } } } */
>
> What is this for?  What happens without it?
The reason of this fail is: -frename-registers does not work well with
__builtin_return/__builtin_apply which need to save and restore
registers which could not be renamed incorrectly.

When this case runs with -O3, with this patch, -frename-registers is
enabled. Originally, -frename-registers is enabled with -funroll-loops
instead pure -O3. This change cause this case fail at -O3.

>
> The rs6000/ parts are okay for trunk.  Thanks!
>
>
> Segher


Re: undefine OFFSET in testsuite/gcc.dg/vect/tree-vect.h

2019-12-22 Thread Olivier Hainque


> On 21 Dec 2019, at 02:39, Mike Stump  wrote:
> 
> On Dec 20, 2019, at 10:11 AM, Olivier Hainque  wrote:
>> 
>> The attached patch is a proposal for a basic solution
>> to an issue which might be an improper thing done by a
>> system header on VxWorks, but which is a big pain to fix
>> at this level and very simple to address super locally.
> 
>> Is this OK to commit ?
> 
> Ok, but I will comment it would be much better to fix includes the header to 
> be namespace clean, longer run.  I know why that probably won't work, but 
> I'll mention it anyway.  If fixincluding the header works better than I 
> expect, please do that fix instead.

Thanks for your reviews and feedback Mike. Always glad to receive
experts' comments :)

Yes, fixincludes is something we have considered and will reconsider
at some point. We have made progress for Vx6 but we're still not
ready for Vx7, where custom OS install are made for every project
(notion of VSB), so there are compromises to make on which we're
still not clear how to settle.

Besides, this particular fix would be pretty invasive as the
problematic definition is exposed by a common header (vxWorksCommon.h)
for use by potentially any other system header.

Olivier





Re: [patch] Let libstdc++ know that VxWorks has_nanosleep

2019-12-22 Thread Olivier Hainque
Hi Jonathan,

> On 19 Dec 2019, at 12:13, Jonathan Wakely  wrote:
> 
> 
> Is there a way to detect that more reliably? Should we replicate the
> test used later in the file, to detect whether the timers are really
> enabled for VxWorks?
> 
>  AC_MSG_CHECKING([for nanosleep])
...
> You're the port maintainer though, so if you think that's not needed
> then the patch is OK for trunk.

Thanks for your feedback Jonathan,

Unfortunately, the AC kind of check is actually less reliable
on VxWorks, in particular for the so called "kernel" mode where
linking a "module" actually performs only a partial link to be
downloaded on a target where the run-time loader takes care of
the rest.

The current assumption on the environment configuration is on
a component included by default and required for a number
of very common time related operations anyway, so always there
in practice in our experience.

The rare/unlikely users that would not have the library
configured in on the VxWorks end would just not access to the
c++ time related facilities, which is not a problem.

Overall, this really seems like a good setting for
--enable-libstdcxx-time auto.

Olivier



Re: [PATCH] Add OpenACC 2.6 `acc_get_property' support

2019-12-22 Thread Harwath, Frederik
Hi Thomas,

>> Is it ok to commit the patch to trunk?
> 
> OK, thanks.  And then some follow-up/clean-up next year, also including
> some of the open questions that I've snipped off here.

Right, thanks for the review! I have committed the patch as r279710 with a
minor change: I have disabled the new acc_get_property.{c,f90} tests for
the amdgcn offload target for now.

Best regards,
Frederik



Re: [PATCH] V11 patch #4 of 15, Update 'Q' constraint documentation.

2019-12-22 Thread Segher Boessenkool
On Fri, Dec 20, 2019 at 06:49:30PM -0500, Michael Meissner wrote:
> In doing V11 patch #3, I noticed that the documentation for the 'Q' was
> misleading.

It originally was used just for lswi/stswi, which can access up to the
first 32 bytes of storage pointed to by the register.  But yes, the
current comment is confusing.

>   * config/rs6000/constraints.md (Q constraint): Update
>   documentation.
>   * doc/md.tet (PowerPC constraints): Update 'Q' constraint
>   documentation.

"md.tet"?  That's an interesting typo :-)

>  (define_memory_constraint "Q"
> -  "Memory operand that is an offset from a register (it is usually better
> -to use @samp{m} or @samp{es} in @code{asm} statements)"
> +  "A memory operand whose address which uses a single register with no 
> offset."

Arm has

(define_memory_constraint "Q"
 "@internal
  An address that is a single base register."
 (and (match_code "mem")
  (match_test "REG_P (XEXP (op, 0))")))

which is more correct for us (the register cannot be r0!)

But it is not an address.

Maybe "A memory operand addressed by just a base register." ?

Okay for trunk like that.  Thanks!


Segher


Re: [PATCH] V11 patch #3 of 15, Use 'Q' constraint for variable vector extract from memory

2019-12-22 Thread Segher Boessenkool
Hi!

On Fri, Dec 20, 2019 at 06:47:28PM -0500, Michael Meissner wrote:
> Then I realized that eventaully we will want to generate an X-FORM (register +
> register) address, and it was just simpler to use the 'Q' constraint, and have
> the register allocator put the address into a register.

Yep, good call.

>   * config/rs6000/vsx.md (vsx_extract__var, VSX_D iterator):
>   Use 'Q' for memory constraints because we need to do an X-FORM
>   load with the variable index.
>   (vsx_extract_v4sf_var): Use 'Q' for memory constraints because we
>   need to do an X-FORM load with the variable index.

This comment is a headscratcher -- but you shouldn't say "why" in
changelogs at all, so that is an easy fix ;-)

>   (vsx_extract__var, VSX_EXTRACT_I iterator):Use 'Q' for

(missing space)

>   memory constraints because we need to do an X-FORM load with the
>   variable index.
>   (vsx_extract__mode_var): Use 'Q' for memory
>   constraints because we need to do an X-FORM load with the variable
>   index.

(and more)

> -;; Variable V2DI/V2DF extract
> +;; Variable V2DI/V2DF extract.  Use 'Q' for the memory because we will
> +;; ultimately have to convert the address into base + index.

Maybe just don't write anything at all, since it is hard to explain in a
few words?  It is clear that "Q" is not a usual constraint, anyway :-)

Okay for trunk like that.  Thanks!


Segher


Re: [PATCH] V11 patch #2 of 15, Use prefixed load for vector extract with large offset

2019-12-22 Thread Segher Boessenkool
Hi!

On Fri, Dec 20, 2019 at 06:38:32PM -0500, Michael Meissner wrote:
> --- gcc/config/rs6000/rs6000.c(revision 279553)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -6792,9 +6792,17 @@ rs6000_adjust_vec_address (rtx scalar_re
> HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
> rtx offset_rtx = GEN_INT (offset);
>  
> -   if (IN_RANGE (offset, -32768, 32767)
> +   /* 16-bit offset.  */
> +   if (SIGNED_INTEGER_16BIT_P (offset)
> && (scalar_size < 8 || (offset & 0x3) == 0))
>   new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);

We probably should have a macro for this, hrm.  The
reg_or_aligned_short_operand predicate is the closest we have right now.

> +   /* 34-bit offset if we have prefixed addresses.  */
> +   else if (TARGET_PREFIXED_ADDR && SIGNED_INTEGER_34BIT_P (offset))
> + new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);

This is cint34_operand.

And maybe we want both in one, for convenience?

(Something for the future of course, not this patch).

> +   /* Offset overflowed, move offset to the temporary (which will likely
> +  be split), and do X-FORM addressing.  */
> else
>   {

The comment should go here, instead (after the {).

> +   /* Make sure we don't overwrite the temporary if the element being
> +  extracted is variable, and we've put the offset into base_tmp
> +  previously.  */
> +   else if (rtx_equal_p (base_tmp, element_offset))
> + emit_insn (gen_add2_insn (base_tmp, op1));

Register equality (in the same mode, as we have here) is just "==".  Is
that what we need here, or should it be reg_mentioned_p?


This whole function is too complex (and it writes TARGET_POWERPC64 where
it needs TARGET_64BIT, for example).


The patch is okay for trunk (with the comment moved, and the rtx_equal_p
fixed).  Thanks!


Segher


[patch, fortran] Updated fix PR 92961, ICE on division by zero error in array bounds

2019-12-22 Thread Thomas Koenig

Hello world,

here is an update for the fix for PR 92961, which also takes care
of the second test case in the PR (included in the first one).

The patch itself should be clear enough - make sure that there
is a MATCH_ERROR on matching an array spec which contains 0/(0).
Rather than pass around information several calls deep, I chose
to use a global variable.

Regression-tested. OK for trunk?

(Only a few bugs to fix to be at least below 900 bugs at the end
of the year, by the way - we are at 389 submitted bugs vs. 461 closed,
which is not bad).

Regards

Thomas

2019-12-22  Thomas Koenig  

PR fortran/92961
* gfortran.h (gfc_seen_div0): Add declaration.
* arith.h (gfc_seen_div0): Add definition.
(eval_intrinsic): For integer division by zero, set gfc_seen_div0.
* decl.c (variable_decl):  If resolution resp. simplification
fails for array spec and a division of zero error has been
seen, return MATCH_ERROR.

2019-12-22  Thomas Koenig  

PR fortran/92961
* gfortran.dg/arith_divide_2.f90: New test.
Index: gfortran.h
===
--- gfortran.h	(Revision 279639)
+++ gfortran.h	(Arbeitskopie)
@@ -2995,6 +2995,8 @@ void gfc_arith_done_1 (void);
 arith gfc_check_integer_range (mpz_t p, int kind);
 bool gfc_check_character_range (gfc_char_t, int);
 
+extern bool gfc_seen_div0;
+
 /* trans-types.c */
 bool gfc_check_any_c_kind (gfc_typespec *);
 int gfc_validate_kind (bt, int, bool);
Index: arith.c
===
--- arith.c	(Revision 279639)
+++ arith.c	(Arbeitskopie)
@@ -32,6 +32,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "target-memory.h"
 #include "constructor.h"
 
+bool gfc_seen_div0;
+
 /* MPFR does not have a direct replacement for mpz_set_f() from GMP.
It's easily implemented with a few calls though.  */
 
@@ -1620,6 +1622,10 @@ eval_intrinsic (gfc_intrinsic_op op,
   gfc_error (gfc_arith_error (rc), >where);
   if (rc == ARITH_OVERFLOW)
 	goto done;
+
+  if (rc == ARITH_DIV0 && op2->ts.type == BT_INTEGER)
+	gfc_seen_div0 = true;
+
   return NULL;
 }
 
Index: decl.c
===
--- decl.c	(Revision 279639)
+++ decl.c	(Arbeitskopie)
@@ -2551,7 +2551,12 @@ variable_decl (int elem)
 	  for (int i = 0; i < as->rank; i++)
 	{
 	  e = gfc_copy_expr (as->lower[i]);
-	  gfc_resolve_expr (e);
+	  if (!gfc_resolve_expr (e) && gfc_seen_div0)
+		{
+		  m = MATCH_ERROR;
+		  goto cleanup;
+		}
+
 	  gfc_simplify_expr (e, 0);
 	  if (e && (e->expr_type != EXPR_CONSTANT))
 		{
@@ -2561,7 +2566,12 @@ variable_decl (int elem)
 	  gfc_free_expr (e);
 
 	  e = gfc_copy_expr (as->upper[i]);
-	  gfc_resolve_expr (e);
+	  if (!gfc_resolve_expr (e)  && gfc_seen_div0)
+		{
+		  m = MATCH_ERROR;
+		  goto cleanup;
+		}
+
 	  gfc_simplify_expr (e, 0);
 	  if (e && (e->expr_type != EXPR_CONSTANT))
 		{
@@ -2587,7 +2597,12 @@ variable_decl (int elem)
 	  if (e->expr_type != EXPR_CONSTANT)
 		{
 		  n = gfc_copy_expr (e);
-		  gfc_simplify_expr (n, 1);
+		  if (!gfc_simplify_expr (n, 1)  && gfc_seen_div0) 
+		{
+		  m = MATCH_ERROR;
+		  goto cleanup;
+		}
+
 		  if (n->expr_type == EXPR_CONSTANT)
 		gfc_replace_expr (e, n);
 		  else
@@ -2597,7 +2612,12 @@ variable_decl (int elem)
 	  if (e->expr_type != EXPR_CONSTANT)
 		{
 		  n = gfc_copy_expr (e);
-		  gfc_simplify_expr (n, 1);
+		  if (!gfc_simplify_expr (n, 1)  && gfc_seen_div0) 
+		{
+		  m = MATCH_ERROR;
+		  goto cleanup;
+		}
+		  
 		  if (n->expr_type == EXPR_CONSTANT)
 		gfc_replace_expr (e, n);
 		  else
@@ -2934,6 +2954,7 @@ variable_decl (int elem)
 
 cleanup:
   /* Free stuff up and return.  */
+  gfc_seen_div0 = false;
   gfc_free_expr (initializer);
   gfc_free_array_spec (as);
 
! { dg-do compile }
! This used to ICE. Original test case by Gerhard Steinmetz.
program p
   integer :: a((0)/0)! { dg-error "Division by zero" }
   integer :: b(0/(0))! { dg-error "Division by zero" }
   integer :: c((0)/(0))  ! { dg-error "Division by zero" }
   integer :: d(0/0)  ! { dg-error "Division by zero" }
   integer :: x = ubound(a,1) ! { dg-error "must be an array" }
end


Re: [PATCH] libstdcxx: Update ctype_base.h from NetBSD upstream

2019-12-22 Thread Kamil Rytarowski
On 22.12.2019 00:36, Gerald Pfeifer wrote:
> Hi Matthew,
> 
> On Mon, 4 Feb 2019, Matthew Bauer wrote:
>> The ctype_base.h file in libstdc++-v3 is out of date for NetBSD. They
>> have changed their ctype.h definition. It was updated in their intree
>> libstdc++-v3 but not in the GCC one. My understanding is this is a
>> straightforward rewrite. I've attached my own patch, but the file can
>> be obtained directly here:
>>
>> http://cvsweb.netbsd.org/bsdweb.cgi/src/external/gpl3/gcc/dist/libstdc%2b%2b-v3/config/os/bsd/netbsd/ctype_base.h
>>
>> With the attached patch, libstdc++-v3 can succesfully be built with
>> NetBSD headers (along with --disable-libcilkrts).
> 
> I noticed this has not been applied yet, nor seen a follow-up?, and also 
> noticed it went to the gcc-patches list, but not libstd...@gcc.gnu.org.
> 
> Let me re-address this to libstd...@gcc.gnu.org in the hope the 
> maintainers there will have a look.
> 
> Gerald
> 

We (in NetBSD) wait for this to be merged.



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] V11 patch #1 of 15, Fix bug in vec_extract

2019-12-22 Thread Segher Boessenkool
Hi!

On Fri, Dec 20, 2019 at 06:24:57PM -0500, Michael Meissner wrote:
> This patch fixes the bug pointed out in the V10 patch review that the code
> modified an input argument to vector extract with a variable element number.

Great, thanks.

> With this patch applied, the compiler will signal an error.  FWIW, I did build
> all of Spec 2017 and Spec 2006 with this patch applied, but not the others, 
> and
> we did not get an assertion failure.

Please document (at the start of rs6000_adjust_vec_address, maybe in the
function comment even) which arguments can be the same register and which
not, that kind of thing?  That makes it much simpler to check that all
callers are okay (and do that as well), but even more importantly it makes
it more likely that it won't come back to bite us later.

> --- gcc/config/rs6000/rs6000.c(revision 279549)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -6757,6 +6757,8 @@ rs6000_adjust_vec_address (rtx scalar_re
>  
>else
>   {
> +   /* If we are called from rs6000_split_vec_extract_var, base_tmp may
> +  be the same as element.  */

This comment isn't very useful here (it is confusing even, I'd say).
Move this comment up?  Make it part of what I propose above?

> @@ -6825,6 +6827,11 @@ rs6000_adjust_vec_address (rtx scalar_re
>  
> else
>   {
> +   /* Make sure base_tmp is not the same as element_offset.  This
> +  can happen if the element number is variable and the address
> +  is not a simple address.  Otherwise we lose the offset, and
> +  double the address.  */
> +   gcc_assert (!reg_mentioned_p (base_tmp, element_offset));

Otherwise we ICE, certainly after adding the assert ;-)

Asserts often do not need documentation at all.  If they do, usually
something unexpected is going on.  Rewriting things a bit can help.

The comment isn't very exact, btw...  "is not the same as"...  ithe assert
actually tests if the base_tmp is used in element_offset at all; the latter
can be a more complicated construct.

> +  /* Make sure base_tmp is not the same as element_offset.  This can 
> happen
> +  if the element number is variable and the address is not a simple
> +  address.  Otherwise we lose the offset, and double the address.  */
> +  gcc_assert (!reg_mentioned_p (base_tmp, element_offset));

Same here.

> @@ -6902,9 +6913,10 @@ rs6000_split_vec_extract_var (rtx dest,
>int num_elements = GET_MODE_NUNITS (mode);
>rtx num_ele_m1 = GEN_INT (num_elements - 1);
>  
> -  emit_insn (gen_anddi3 (element, element, num_ele_m1));
> +  /* Make sure the element number is in bounds.  */
>gcc_assert (REG_P (tmp_gpr));

How does that make sure the number is in bounds?


In general, do asserts as early as practical?


Segher